
A Study of a Predictive MapReduce Scheduling Mechanism for Cloud Computing Systems

MapReduce cloud computing system applied to predict Scheduling Mechanism

Advisor: 李維聰

Abstract


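Among the many cloud technologies, MapReduce is a processing mechanism introduced by Google for compute-intensive, high-volume data. Its two functions, Map and Reduce, let users process large amounts of data automatically with little effort. Hadoop turns the MapReduce architecture from a concept into a working product that users can adopt easily. At present, Hadoop is mostly applied to procedures of relatively low complexity and relatively high computational density, such as search (Sort) and data statistics. Many earlier studies aim to improve MapReduce efficiency. One of them targets the Reduce function and proposes the Dynamic Switch of Reduce Function (DSRF) algorithm, which reduces the idle time of the Reduce function. Once the system load grows beyond a certain level, however, this scheduling mechanism switches too frequently, which lowers system performance to the point where it can fall below that of the original Hadoop MapReduce. This thesis proposes a method that uses a slope formula to predict the maximum workload the system can handle at peak performance, so that servers can be added in advance and the efficiency loss caused by overload can be avoided.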

Keywords

Cloud

Parallel Abstract


Among the many cloud technologies, MapReduce is a mechanism provided by Google for handling data that requires heavy computation and large storage capacity. MapReduce offers two functions, Map and Reduce, that allow the user to have large amounts of data processed automatically. Hadoop implements this architecture, turning the concept into an actual product that users can readily work with. Currently, the vast majority of Hadoop applications are procedures of low complexity and high computational density, such as search (Sort) and statistics. Many previous studies aim at improving the efficiency of MapReduce. One of them addresses the Reduce function and proposes the Dynamic Switch of Reduce Function (DSRF) algorithm, which reduces the idle time of the Reduce function. However, when the system load rises above a certain level, this scheduling mechanism switches too frequently and degrades system performance, so that it may not even reach the performance of the original Hadoop MapReduce. This thesis proposes a slope-formula calculation that predicts the maximum workload at which the system still performs at its best. The number of servers can then be increased in advance, avoiding the loss of efficiency caused by excessive load.
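The abstract does not state the slope formula itself, so the following is only a minimal sketch of the general idea, under stated assumptions: throughput is sampled at increasing load levels, the slope between consecutive samples is computed by finite differences, and the load at which the slope falls to a threshold or below is taken as the predicted maximum workload. The function names, the `min_slope` threshold, and the sample measurements are all hypothetical and are not taken from the thesis.

```python
from typing import List, Optional, Sequence


def slopes(loads: Sequence[float], throughput: Sequence[float]) -> List[float]:
    """Finite-difference slope of throughput between consecutive load levels."""
    if len(loads) != len(throughput):
        raise ValueError("loads and throughput must have the same length")
    return [
        (throughput[i + 1] - throughput[i]) / (loads[i + 1] - loads[i])
        for i in range(len(loads) - 1)
    ]


def predict_max_load(
    loads: Sequence[float],
    throughput: Sequence[float],
    min_slope: float = 0.0,
) -> Optional[float]:
    """Return the first load level after which throughput stops improving.

    Assumed rule (not from the thesis): a segment slope at or below
    `min_slope` signals that the system has reached its maximum useful load.
    Returns None if no such segment exists in the samples.
    """
    for i, s in enumerate(slopes(loads, throughput)):
        if s <= min_slope:
            return loads[i]
    return None


# Made-up measurements: concurrent jobs vs. jobs finished per minute.
loads = [10, 20, 30, 40, 50]
throughput = [5.0, 9.0, 12.0, 12.5, 11.0]

limit = predict_max_load(loads, throughput, min_slope=0.1)
if limit is not None:
    print(f"Consider adding servers before the load exceeds {limit} jobs")
```

In this sketch a positive `min_slope` triggers the warning while throughput is still flattening, before it actually declines, which matches the abstract's goal of adding servers in advance.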

Parallel Keywords

MapReduce; DSRF

