Computing the Geomorphologic Direct Runoff Module on a Hadoop Cluster Using the MapReduce Software Framework

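Owing to Taiwan's climate and terrain, heavy rain often causes river water levels to surge within a short time and can lead to severe disasters, which underscores the importance of a flood forecasting system in Taiwan. River routing is the most important part of a flood forecasting system; its purpose is to compute the hydrological data of a basin in order to judge whether the discharge exceeds the warning level. However, the river routing formulas are complex and the volume of basin data is large. Single-host approaches, whether handing the work to a mainframe or having a client connect to a server that does the processing, often take a long time, so forecasts are not timely enough. The programs in this study are developed on Hadoop, an open-source platform from the Apache Software Foundation. Hadoop provides a distributed environment for storing and processing large amounts of data, and offers developers MapReduce, a software framework designed specifically for large-scale data processing. Distributed computing pools computing resources to speed up the processing of large data volumes and reduce computation time. This study implements the river routing program with the MapReduce framework and runs it on a Hadoop cluster. Measurements across five scenarios show that, in the best case, the river routing computation speeds up by a factor of about 6, which improves the performance of the flood forecasting system and makes forecasts more timely.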

Keywords

Hadoop ； MapReduce ； Distributed Computing ； Large-Scale Data Processing

Parallel Abstract

Because of Taiwan's weather and landforms, heavy rain often causes a sudden rise in the runoff of some basins and can even lead to serious disasters. Flood information systems are therefore heavily relied upon in Taiwan, especially in typhoon season. Computing the runoff of a basin is the most important module of a flood information system, because it checks whether the runoff exceeds the warning level. However, this module is complicated and data-intensive, so it becomes the bottleneck when real-time information is needed while a typhoon is striking the basins. The applications in this thesis are developed on Apache Hadoop, open-source software that builds a distributed storage and computing environment and allows the distributed processing of large data sets across clusters of computers using the MapReduce programming model. We have developed the runoff computing module of a basin using the MapReduce framework on a Hadoop cluster. Speeding up the runoff computation increases the efficiency of the flood information system. Running our programs on an 18-node Hadoop cluster, we conclude that the execution of the runoff computation can be sped up by a factor of 6.

Parallel Keywords

Hadoop ； MapReduce ； Distributed Computing ； Processing for Large Data

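Cited By

徐思縯 (2017). A Study of the Benefits of Applying Apache Spark to Virtualization Technology [Master's thesis, Chung Yuan Christian University]. Airiti Library. https://doi.org/10.6840/cycu201700737

蔡秉辰 (2016). A Case Study of Intelligent Cloud Manufacturing [Master's thesis, Chaoyang University of Technology]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0078-1108201714032963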
