In recent years the volume of data has grown at a striking rate, and with the rise of the Internet of Things, big-data analysis has become an important source of profit for many business models. Under a cloud computing architecture, however, Hadoop consumes a large amount of network bandwidth when it runs MapReduce jobs, and the impact on network transmission quality becomes more severe as the number of nodes increases. This paper takes MapReduce as the basis for improvement and combines it with the data pre-processing capability of fog computing (Fog Computing) to propose a new operating model. Before most of the data is sent into Hadoop for MapReduce analysis, it is first aggregated and partially computed at the fog layer. Exploiting Fog Computing's advantage of running on devices close to the data source can substantially reduce the Result Set passed into Hadoop MapReduce, which conserves network resources and in turn improves overall execution efficiency. Applying this computing architecture to current cloud computing environments can benefit a wide range of big-data workloads.
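
The pre-aggregation idea described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes each fog node holds raw (key, value) sensor readings and that the analysis is a per-key mean. The function names fog_preaggregate and central_reduce and the sample data are hypothetical. Each fog node sends one partial aggregate per key instead of every raw reading, so fewer records cross the network to the central cluster.

```python
from collections import defaultdict


def fog_preaggregate(readings):
    """Hypothetical fog-node step: collapse raw (key, value) readings
    into one (count, total) partial aggregate per key."""
    partial = defaultdict(lambda: [0, 0.0])
    for key, value in readings:
        partial[key][0] += 1      # number of readings seen for this key
        partial[key][1] += value  # running sum for this key
    return {key: (count, total) for key, (count, total) in partial.items()}


def central_reduce(partials):
    """Hypothetical central step: merge the partial aggregates from all
    fog nodes and compute the mean per key."""
    merged = defaultdict(lambda: [0, 0.0])
    for partial in partials:
        for key, (count, total) in partial.items():
            merged[key][0] += count
            merged[key][1] += total
    return {key: total / count for key, (count, total) in merged.items()}


# Made-up readings held by two fog nodes (illustrative only).
fog_a = [("temp", 20.0), ("temp", 22.0), ("humidity", 40.0)]
fog_b = [("temp", 24.0), ("humidity", 50.0), ("humidity", 60.0)]

partials = [fog_preaggregate(fog_a), fog_preaggregate(fog_b)]
means = central_reduce(partials)

# Compare how many records would cross the network in each design.
raw_records = len(fog_a) + len(fog_b)
shipped_records = sum(len(p) for p in partials)

print(means)
print(raw_records, shipped_records)
```

Sending (count, total) pairs rather than local means keeps the merged result exact, because a mean of means is wrong when nodes hold different numbers of readings. The saving grows with the number of raw readings per key on each node. This toy example says nothing about the gains measured in a real Hadoop deployment.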