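In recent years, with the steady growth of the Internet, more and more cross-organization, cross-region projects have been proposed and carried out. An important part of such projects is a storage system for storing or exchanging data. This kind of storage system must be highly scalable, must integrate storage resources dynamically, and must let users retrieve the files they need quickly. Previous research took data locality into account, making it convenient for users to retrieve data, but neglected routing locality, so users must spend time finding a suitable file. In addition, the replication mechanisms in previous research rely on large amounts of global information that is hard to maintain and consumes resources. In this paper, we propose a distributed data storage system, Malugo. Malugo is a two-tier peer-to-peer system: the lower-layer network connects nearby peers into groups and serves data access for nearby users; the upper-layer network connects the different groups into one network and handles communication between groups. Files stored in Malugo are dynamically replicated to different groups to provide different levels of availability, without the help of global information. Dynamic load balancing is also considered in order to serve popular files. Experiments show that our system takes both data locality and routing locality into account, letting users find files quickly and download them efficiently, while also incurring lower network load and shorter file upload and replication times.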
With the widespread availability of the Internet, many large cross-organization collaboration projects have been proposed over the last decade. A fundamental requirement of these collaborations is a storage system for storing and exchanging data. Such a storage system must be highly scalable, aggregate storage resources dynamically, and deliver data to users efficiently. Previous work has taken data locality into account but has not considered routing locality. In addition, the replication strategies in related work usually rely on global information. In this paper, we propose a distributed storage system called Malugo, which is based on a distributed two-tier hierarchical peer-to-peer architecture. The bottom layer clusters neighboring peers in a local area into groups that provide services within their local region. The upper layer connects these local groups together with locality taken into consideration. Files stored in Malugo are adaptively replicated to different numbers of groups to provide different levels of availability, without the need for global information. Furthermore, load balancing among storage peers is considered in order to maintain a high download rate for popular files. The simulation results show that Malugo takes into account not only data locality, so that users can obtain data efficiently, but also routing locality, achieving efficient and stable routing as well as lower traffic overhead for both file insertion and overlay maintenance.