We are moving toward a data-driven economy, in which the ability to analyze huge amounts of data efficiently and in a timely manner is a key to success. Many systems for big data processing have been developed; Storm is one of them, designed specifically for stream data processing. By default, Storm provides only a very simple round-robin scheduling policy for assigning tasks. The default scheduler performs well on homogeneous platforms, but it cannot make effective use of resources in heterogeneous computing environments. In this thesis, we design and implement G-Storm, a new scheduling algorithm for Storm that allows Storm to evaluate GPU capacity when scheduling and to use GPUs more effectively to accelerate computation. Our experimental results show that, compared with Storm's default scheduler, G-Storm achieves a speedup of 1.65x on lightly loaded topologies and nearly 2.04x on heavily loaded topologies.
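To illustrate why round-robin assignment can underuse a heterogeneous cluster, the following is a minimal toy sketch in Python. It is not the G-Storm algorithm and does not use the Storm API: the node names, the relative capacity values, and the functions `round_robin`, `capacity_aware`, and `makespan` are all hypothetical, and the greedy rule shown is only one simple way a scheduler might take capacity into account.

```python
from itertools import cycle


def round_robin(tasks, nodes):
    """Assign tasks to nodes in turn, ignoring node capacity."""
    assignment = {name: [] for name in nodes}
    for task, name in zip(tasks, cycle(nodes)):
        assignment[name].append(task)
    return assignment


def capacity_aware(tasks, nodes):
    """Greedy sketch: place each task on the node that would finish it earliest.

    `nodes` maps a node name to an assumed relative capacity
    (tasks processed per unit time). These values are illustrative only.
    """
    assignment = {name: [] for name in nodes}
    load = {name: 0.0 for name in nodes}  # accumulated processing time per node
    for task in tasks:
        best = min(nodes, key=lambda n: load[n] + 1.0 / nodes[n])
        assignment[best].append(task)
        load[best] += 1.0 / nodes[best]
    return assignment


def makespan(assignment, nodes):
    """Time at which the slowest node finishes its assigned tasks."""
    return max(len(ts) / nodes[n] for n, ts in assignment.items())


if __name__ == "__main__":
    # Hypothetical cluster: the GPU node is assumed to be 4x faster.
    nodes = {"cpu-node": 1.0, "gpu-node": 4.0}
    tasks = list(range(10))
    print(makespan(round_robin(tasks, nodes), nodes))
    print(makespan(capacity_aware(tasks, nodes), nodes))
```

Under these assumed capacities, round-robin gives each node five tasks, so the slower node determines the finishing time, while the capacity-aware rule shifts most tasks to the faster node and finishes earlier. The figures from this toy model are unrelated to the experimental speedups reported above.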