In an era of rapidly expanding online information, cloud computing has become a major area of development in both academia and industry, and Hadoop is among its most prominent platforms. Hadoop is an open-source cloud computing platform whose two core components, the Hadoop Distributed File System (HDFS) and MapReduce, provide distributed storage and parallel computation, respectively. HDFS improves the reliability of cloud data, whereas MapReduce improves computing performance. Although Hadoop lowers the barrier to setting up a cloud computing environment, writing MapReduce programs remains difficult. The MapReduce framework comprises two stages, Map and Reduce. Large data sets are divided into smaller independent data segments, each of which is processed by one Mapper; the Mapper output is then passed to a Reducer, which processes it and produces the final output of the Hadoop job. The Map stage is mandatory, but the Reduce stage is optional: the Mapper output can itself serve as the final output. In this thesis, we classify MapReduce jobs into three models: no Reduce, one Reduce, and multiple Reduces. Each model suits the computational logic of different scenarios, so an application must be paired with an appropriate MapReduce model to demonstrate the performance and value of cloud computing. The primary contribution of this thesis is the use of practical cases to test and evaluate these three MapReduce models. In addition, we found that although Hadoop provides a web interface that displays platform information, the interface does not support management or troubleshooting; when a problem arises unexpectedly, an administrator must connect over SSH to resolve it. We therefore built a mobile Hadoop management system on Android mobile devices that enables system administrators to handle problems promptly and reduce losses.
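The three MapReduce models can be illustrated with a small in-memory simulation. The sketch below is not Hadoop code and is not taken from the thesis: the word-count task, the function names `mapper`, `reducer`, and `run_job`, and the CRC32-based partitioning are all assumptions chosen for illustration. In an actual Hadoop job, the number of reducers is typically set with `Job.setNumReduceTasks(n)`, where `n = 0` yields a map-only job.

```python
from collections import defaultdict
import zlib


def mapper(segment):
    """Hypothetical mapper: emit (word, 1) for each word in one data segment."""
    return [(word.lower(), 1) for word in segment.split()]


def reducer(pairs):
    """Hypothetical reducer: sum the values for each key in one partition."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(segments, num_reducers):
    """Simulate one job under the no-Reduce, one-Reduce, or multi-Reduce model."""
    # Map stage: every segment is processed independently by one mapper call.
    mapped = [mapper(segment) for segment in segments]

    if num_reducers == 0:
        # No Reduce: the mapper outputs are the final output, one list per segment.
        return mapped

    # Shuffle: route each key to a partition with a deterministic hash
    # (an assumption here), so all values for a key reach the same reducer.
    partitions = [[] for _ in range(num_reducers)]
    for output in mapped:
        for key, value in output:
            index = zlib.crc32(key.encode("utf-8")) % num_reducers
            partitions[index].append((key, value))

    # Reduce stage: one reducer call per partition, one result per reducer.
    return [reducer(partition) for partition in partitions]


segments = ["big data big", "data cloud"]
map_only = run_job(segments, 0)       # one list of (word, 1) pairs per segment
single = run_job(segments, 1)         # [{'big': 2, 'data': 2, 'cloud': 1}]
multiple = run_job(segments, 3)       # three partial results with disjoint keys
```

With one reducer, the job produces a single aggregated result; with several reducers, each one produces a partial result over its own disjoint set of keys, and the union of those results equals the single-reducer output.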