
Flow Scheduling for Software-Defined Data Centers Using Stream Mining
(Chinese title: 在軟體定義資料中心中使用串流探勘技術排程網路流量)

Advisor: 陳銘憲

Abstract


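To make effective use of the bandwidth that data centers provide, effective network traffic management is needed. Recent research has focused on detecting elephant flows in the network and finding the best paths for them in order to raise bandwidth utilization. Current elephant flow detection methods, however, all have limitations, including significant monitoring overhead or the need to modify hardware or end hosts. We propose FlowSeer, a fast, low-overhead system that uses stream mining to detect elephant flows and schedule traffic in real time. Our key observation is that the features contained in the first few packets of each flow allow us to train accurate stream mining models, which we then use to predict the traffic volume and duration of newly created flows. With these predictions, FlowSeer can dynamically schedule each flow in real time. An advantage of FlowSeer is that it lets the network controller and the switches predict cooperatively, so most prediction decisions can be made on the switches, reducing both prediction latency and the network overhead of notifying the controller. FlowSeer needs fewer than 100 flow table entries installed in each switch, so it can be implemented on existing switches. We evaluate our design in virtual networks and in a data-driven simulator. The results show that FlowSeer delivers several times the network throughput of Hedera and performs on par with Mahout, which requires end-host modification.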

Parallel Abstract


Traffic management is important for effectively utilizing the high bandwidth provided by datacenters. Recent work has focused on identifying elephant flows and rerouting them to improve network utilization. These approaches, however, require either significant monitoring overhead or hardware or end-host modifications. In this thesis, we propose FlowSeer, a fast, low-overhead elephant flow detection and scheduling system that uses data stream mining. Our key idea is that features from a flow's first few packets allow us to train streaming classification models that accurately and quickly predict the rate and duration of any newly initiated flow. With this predicted information, FlowSeer can adapt the routing policies of elephant flows to their demands and to dynamic network conditions. Another useful property of FlowSeer is that it enables the controller and the switches to perform cooperative prediction. Most decisions can be made locally by the switches, which reduces both detection latency and signaling overhead. FlowSeer requires fewer than 100 flow table entries at each switch to enable cooperative prediction, and hence can be implemented on off-the-shelf switches. Evaluation through both experiments in realistic virtual networks and trace-driven simulation shows that FlowSeer improves throughput severalfold over Hedera, which pulls flow statistics, and performs comparably to Mahout, which needs end-host modification.
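The idea of learning from a flow's first few packets can be illustrated with a minimal sketch. This is not FlowSeer's implementation: the abstract does not name the features or the model, so the two features below (mean packet size and mean inter-arrival gap), the online logistic regression, the synthetic traffic, and every function name are assumptions chosen only for illustration. The loop follows the usual test-then-train pattern of stream mining, where each flow is classified first and its true label is used for an update afterwards.

```python
import math
import random


class OnlineFlowClassifier:
    """Online logistic regression, updated one flow at a time (assumed model)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def is_elephant(self, x):
        return self.predict_proba(x) >= 0.5

    def learn(self, x, label):
        """One SGD step on the log loss; label is 1 (elephant) or 0 (mouse)."""
        err = self.predict_proba(x) - label
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def first_packet_features(sizes, arrival_times):
    """Hypothetical features from a flow's first few packets.

    sizes: packet sizes in bytes; arrival_times: seconds, ascending.
    Returns [mean size / 1500, mean inter-arrival gap in milliseconds].
    """
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    mean_gap_ms = 1000.0 * sum(gaps) / len(gaps) if gaps else 0.0
    return [sum(sizes) / len(sizes) / 1500.0, mean_gap_ms]


def synthetic_flow(rng):
    """Made-up traffic: elephants send large packets back to back."""
    elephant = rng.random() < 0.5
    n_packets = 5  # number of initial packets observed (assumption)
    size = 1400 if elephant else 200
    gap = 0.0001 if elephant else 0.002
    sizes = [size + rng.randint(-100, 100) for _ in range(n_packets)]
    times = [i * gap for i in range(n_packets)]
    return first_packet_features(sizes, times), int(elephant)


def prequential_run(n_flows=2000, seed=0):
    """Test-then-train: predict each flow first, then learn from its label."""
    rng = random.Random(seed)
    clf = OnlineFlowClassifier(n_features=2)
    correct = 0
    for _ in range(n_flows):
        x, y = synthetic_flow(rng)
        correct += int(clf.is_elephant(x) == bool(y))
        clf.learn(x, y)
    return clf, correct / n_flows


if __name__ == "__main__":
    clf, accuracy = prequential_run()
    print(f"prequential accuracy: {accuracy:.3f}")
```

In a deployment along the lines the abstract describes, a classifier of this kind would presumably be trained at the controller and its decision rules pushed to switches as flow table entries. That step is not shown here because the abstract gives no details about it.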

References


[1] OpenFlow Switch Specification, Version 1.0.0. https://www.opennetworking.org/sdn-resources/onf-specifications/openflow.
[2] The CAIDA UCSD Anonymized Internet Traces 2013 - equinix-chicago. http://www.caida.org/data/passive/passive_2013_dataset.xml.
[3] J. H. Ahn, N. Binkert, A. Davis, M. McLaren, and R. S. Schreiber. HyperX: Topology, Routing, and Packaging of Efficient Large-Scale Networks. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2009.
[4] M. Al-Fares, A. Loukissas, and A. Vahdat. A Scalable, Commodity Data Center Network Architecture. In ACM SIGCOMM, 2008.
[5] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat. Hedera: Dynamic Flow Scheduling for Data Center Networks. In USENIX NSDI, 2010.
