
Design and Construction of Hadoop Cluster on Docker Platform over Software-Defined Networks
(Original title: 在軟體定義網路下建構與設計Hadoop叢集於Docker平台)

Advisor: 賴槿峰

Abstract


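With the rapid growth of today's networks, modern society is generating big data at an unprecedented pace, as ubiquitous social and commercial activities continuously produce data of all kinds. Analyzing and processing such volumes of data usually requires a cloud computing platform. Hadoop is currently one of the most common cloud computing platforms in actual use in large-scale commercial environments; its powerful and complete infrastructure greatly reduces the time needed to develop a cloud architecture, and it can also be deployed quickly at scale.

Cloud computing is an Internet-based mode of computing in which shared hardware and software resources and information are provided on demand to computers and other devices. Providing shared resources on demand requires virtualization technology, which turns the hardware resources of a physical host into shared computing resources; its main purpose is to run multiple virtual hosts on a single host. When the system's computing capacity needs to be expanded, it is expanded in units of one virtual host.

The goal of this study is to use the characteristics of Docker container virtualization to achieve a better allocation of computing resources among the compute nodes of a Hadoop cluster. We therefore examine the relationship between the number of nodes on the same machine and the computing resources allocated to each container, as well as the effect of network traffic on Hadoop cluster computation. Finally, we observe the differences that result from placing the master node on machines with differing performance.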

Keywords

Docker; Virtualization Technology; Hadoop

Parallel Abstract


Owing to the rapid development of the Internet, modern society continuously generates vast amounts of data through social and commercial activities. Analyzing these data usually requires a cloud computing platform. Hadoop is one of the most common cloud computing platforms currently applied in large-scale business environments. Its powerful and complete infrastructure efficiently reduces cloud architecture development time. Cloud computing is an Internet-based computing mode in which hardware and software resources can be shared with other computers and devices on demand. This concept can only be implemented by utilizing virtualization technology, which converts the hardware resources of a physical host into shared computing resources. The main purpose of virtualization is to run multiple virtual hosts on a single machine, so that system computing power can be expanded one virtual host at a time. The purpose of our research is to exploit the features of Docker containers to allocate the computing resources of a Hadoop cluster more effectively. We therefore investigate the relationship between container configuration and the number of Hadoop cluster nodes. Next, we measure the impact of network traffic on Hadoop cluster nodes. Finally, we examine how placing the master node on machines of differing performance affects Hadoop cluster performance.

Parallel Keywords

Virtualization Technology; Hadoop; Docker
