A High-Performance Cloud Computing System Based on Predictive Task Migration

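Cloud computing has become very popular in recent years, and vendors from IBM and Microsoft to Amazon have all launched cloud services. Its rapid rise has also brought some problems: when data is stored in the cloud and the cloud is used to analyze and process large volumes of data, what should be done if an error occurs or the network connection is lost? This thesis addresses fault tolerance in cloud computing. It focuses on how to identify slow tasks on MapReduce nodes effectively and correctly, and on how to reschedule those tasks more efficiently once they have been identified, so that slow tasks do not delay the completion time of the whole job. Hadoop serves as the development and experimental environment; simulations compare Hadoop, LATE, and the proposed method and analyze their strengths and weaknesses.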

Keywords

Hadoop; LATE; slow task; speculative execution

Parallel Abstract

Cloud computing has gained popularity in recent years. Many well-known companies, such as IBM, Microsoft, and Amazon, provide services over the cloud. Failures in the cloud are inevitable, so making a cloud computing system fault-tolerant is very important. In this research, we try to identify truly slow tasks in Hadoop MapReduce jobs and migrate them to other compute nodes before failures occur. Specifically, we modify the LATE algorithm so that the MapReduce scheduler adapts to tasks with variable progress rates. We also study three rescheduling methods and compare their performance.
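For context, the baseline LATE heuristic estimates each running task's remaining time from its progress score and elapsed time, then speculatively re-executes the slow task expected to finish last. The following is a minimal Python sketch of that baseline idea only, not of the modified algorithm proposed in this work. It assumes a constant progress rate per task, which is the assumption this research relaxes. The names `TaskStatus`, `estimated_time_left`, `pick_speculation_candidate`, and the `slow_fraction` parameter are hypothetical and chosen for illustration; they are not Hadoop APIs.

```python
from dataclasses import dataclass


@dataclass
class TaskStatus:
    """Hypothetical snapshot of a running task (not a Hadoop class)."""
    task_id: str
    progress: float  # progress score in [0, 1]
    elapsed: float   # seconds since the task started


def estimated_time_left(task):
    """Remaining time, assuming the task keeps its average progress rate."""
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def pick_speculation_candidate(tasks, slow_fraction=0.25):
    """Return the slow task with the longest estimated time left, or None.

    A task counts as slow when its progress rate is at or below the
    rate found at the `slow_fraction` position of the sorted rates
    (an assumed, illustrative threshold).
    """
    # Ignore tasks that have not started reporting progress or are done.
    running = [t for t in tasks if t.elapsed > 0 and 0 < t.progress < 1]
    if not running:
        return None
    rates = sorted(t.progress / t.elapsed for t in running)
    threshold = rates[int(len(rates) * slow_fraction)]
    slow = [t for t in running if t.progress / t.elapsed <= threshold]
    return max(slow, key=estimated_time_left)
```

A scheduler loop could call `pick_speculation_candidate` with the current task snapshots whenever a node has a free slot. Because the estimate divides by the average progress rate, a task whose rate varies over its lifetime can be misjudged as slow or fast, which is the kind of case that motivates adapting the scheduler to variable progress rates.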

Parallel Keywords

Hadoop; LATE; Slow Task; Speculative Execution
