
A General Method to Bridge NoSQL and MapReduce

Advisor: 鍾葉青

Abstract


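NoSQL databases and MapReduce have each established a place in the database community. Because they suit different applications, the two are rarely considered together. NoSQL databases are well suited to the cloud: they provide simple APIs that let users quickly retrieve the data they want from massive datasets. In some cases, however, querying massive data purely through the APIs a NoSQL database provides can take tens of minutes to hours. Borrowing the MapReduce framework to process the data in parallel can greatly shorten the processing time. In this thesis, we introduce a new system architecture that lets users use both the NoSQL APIs and MapReduce. We also present a general method for bridging NoSQL and MapReduce: by setting a threshold on table size, the system decides on its own, based on the data volume and the type of query, when to use the NoSQL APIs and when to use MapReduce, in order to obtain better query performance.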
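The threshold-based choice described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation. The names `QueryKind`, `Backend`, `Query`, and `choose_backend`, the two query categories, and the use of a row count as the measure of table size are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class QueryKind(Enum):
    # Hypothetical query categories; the abstract does not list the actual ones.
    POINT_LOOKUP = "point_lookup"  # fetch a few rows by key
    SCAN = "scan"                  # filter or aggregate over many rows


class Backend(Enum):
    NOSQL_API = "nosql_api"
    MAPREDUCE = "mapreduce"


@dataclass(frozen=True)
class Query:
    table: str
    kind: QueryKind


def choose_backend(query: Query, table_rows: dict, row_threshold: int) -> Backend:
    """Pick an execution path from the query type and the table size.

    Assumption: key lookups always go through the NoSQL API, and scans
    switch to MapReduce once the table reaches `row_threshold` rows.
    """
    if query.kind is QueryKind.POINT_LOOKUP:
        return Backend.NOSQL_API
    if table_rows[query.table] >= row_threshold:
        return Backend.MAPREDUCE
    return Backend.NOSQL_API


# Hypothetical table sizes (row counts) and threshold.
sizes = {"logs": 50_000_000, "users": 10_000}
big_scan = choose_backend(Query("logs", QueryKind.SCAN), sizes, 1_000_000)
small_scan = choose_backend(Query("users", QueryKind.SCAN), sizes, 1_000_000)
lookup = choose_backend(Query("logs", QueryKind.POINT_LOOKUP), sizes, 1_000_000)
```

In this sketch the threshold is a single tunable number. How the thesis sets or predicts it is not stated in the abstract.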

Keywords

Database, Cloud Computing, Big Data

Parallel Abstract


NoSQL and MapReduce have each taken a role in dealing with big data. People seldom combine the two because they suit different applications. NoSQL is designed for the cloud: users can call the APIs that a NoSQL store supplies to query specific data quickly from large amounts of data. In some cases, however, retrieving data with the NoSQL APIs alone takes minutes to hours. The MapReduce framework, on the other hand, can process data in parallel, which may significantly shorten the processing time. In this work, we introduce a system architecture that automatically chooses between the APIs and MapReduce to retrieve big data from NoSQL. We propose a general prediction method for this architecture that retains the advantages of both the APIs and MapReduce, and we demonstrate the feasibility of our design experimentally.

Parallel Keywords

Database Cloud Computing Big Data

