  • Thesis

G-Storm: GPU-Aware Scheduling in Storm

Advisor: 李哲榮

Abstract


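We are now moving into the era of the data economy, in which the ability to analyze large volumes of data effectively is the key to success. Many systems for processing massive data have been developed; among them, Storm is designed for processing data streams. By default, Storm schedules work using only a very simple round-robin policy. This policy performs well on homogeneous platforms but cannot use resources effectively in heterogeneous environments. In this thesis, we design and implement G-Storm, a new scheduling algorithm for Storm that lets Storm effectively evaluate and use GPU cards to accelerate computation. Our experiments show that G-Storm delivers 1.65 times the performance of Storm's default scheduler under lighter workloads, and a speedup of nearly 2.04 times under heavier workloads.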

Keywords

big data; stream processing; GPU; Storm

Parallel Abstract


We are shifting toward a data-driven economy, in which the ability to analyze huge amounts of data efficiently and in a timely manner is the key to success. Many systems for big data processing have been developed; Storm is one of them, and it targets stream data processing. By default, Storm provides only a very simple round-robin scheduling policy for assigning tasks. The default scheduler performs well on homogeneous platforms but does not work well in heterogeneous computing environments. In this thesis, we propose and implement a new Storm scheduling algorithm, named G-Storm, which lets Storm evaluate GPU capacity when scheduling and make more effective use of GPUs to improve overall performance. The experimental results show that, compared with Storm's default scheduler, G-Storm achieves a 1.65x speedup on lightly loaded topologies and a 2.04x speedup on heavily loaded ones.
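
To make the contrast between round-robin placement and capacity-aware placement concrete, the following Python sketch assigns unit-cost tasks to a small heterogeneous cluster under both policies. It is only an illustration under stated assumptions and is not the G-Storm algorithm, whose details are not given in this abstract: the node names, the capacity figures, the greedy least-relative-load rule, and the makespan metric are all hypothetical, and the numbers it produces are unrelated to the thesis's measured speedups.

```python
from itertools import cycle


def round_robin(tasks, nodes):
    """Assign tasks to nodes in turn, ignoring node capacity."""
    assignment = {name: [] for name in nodes}
    for task, name in zip(tasks, cycle(nodes)):
        assignment[name].append(task)
    return assignment


def capacity_aware(tasks, nodes):
    """Greedy rule (an assumption, not G-Storm): place each task on the node
    whose relative load, tasks / capacity, would be lowest after placement."""
    assignment = {name: [] for name in nodes}
    for task in tasks:
        best = min(nodes, key=lambda n: (len(assignment[n]) + 1) / nodes[n])
        assignment[best].append(task)
    return assignment


def makespan(assignment, nodes):
    """Time for the slowest node to finish, assuming unit-cost tasks."""
    return max(len(ts) / nodes[name] for name, ts in assignment.items())


# Hypothetical cluster: capacity = tasks processed per unit of time.
nodes = {"cpu-1": 1.0, "cpu-2": 1.0, "gpu-1": 4.0}
tasks = [f"task-{i}" for i in range(12)]

rr = round_robin(tasks, nodes)
ca = capacity_aware(tasks, nodes)

print("round-robin makespan:   ", makespan(rr, nodes))
print("capacity-aware makespan:", makespan(ca, nodes))
```

In this toy setup, round-robin gives every node the same number of tasks, so the slower CPU nodes determine the finishing time while the GPU node sits idle; the capacity-aware rule shifts more tasks to the GPU node and shortens the makespan. How G-Storm actually estimates GPU capacity and places Storm tasks is described in the thesis itself, not here.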

Parallel Keywords

big data; stream processing; GPU; Storm

