透過您的圖書館登入
IP:3.147.83.8
  • 學位論文

使用再造編碼及快取技術之低成本高可用性點對點儲存方法

A low cost and high availability P2P storage scheme with Regenerating Code and caching methods

指導教授 : 袁賢銘

摘要


高的資料可靠度是分散式儲存系統的重要特性之一。錯誤更正碼(erasure code)是一個常用的方法來產生冗餘(redundancy)的資料,不過這個方法需要原始檔案來產生冗餘的資料。而再造編碼(Regenerating Code)分散式地蒐集多個節點(peer)的資料來產生冗餘,則不需要維護一份原始的檔案。不過這樣做也有一些缺點,存取時無法只透過一個節點(peer)來取得資料,加上編碼時需要更多的節點(peer)同時提供資訊才能達到比傳統方法更好的效率,而這在點對點(Peer-to-Peer)的環境下是一個嚴苛的條件。   我們在不額外維護一份原始檔案的條件下,利用最近下載完的節點(peer)中的使用最近最少使用演算法的快取(LRU cache)中的資料來提高存取的效率以及降低編碼的成本,我們用模擬的方式,在不同快取大小以及不同節點可靠度(peer availability)之下,記錄最近存取檔案的節點的資訊。在實驗中,我們將一個檔案被分成7個區塊(block),並且發現這些被記錄的節點在快取大小為64個檔案區塊(block)的條件下,有83%以上的比例可以改善存取效率和編碼成本。

並列摘要


High data availability is one of the important properties in P2P storage system. To support the property, erasure coding is a common method to generate redundant data, but the technique requires an original file in the creating process. Regenerating Code solves the problem through collecting encoding information in a distributed way; so it does not need to save the replica. However, there are still some drawbacks in the design. When accessing a file, the peer cannot only communicate to one peer to get enough file blocks. Besides, in the encoding procedure, the scheme suggests that the number of connected peers should be large enough, larger than the number of decoding blocks, so that the maintenance cost can be lower than the traditional way. The suggestion is a harsh condition in P2P environment because there should be simultaneously more peers alive and keep the file blocks in the system.   On the condition that the system does not store a replica additionally, we utilize the data in the LRU cache of the peer last accessed a file to improve the access performance, and reduce the maintenance cost. Our scheme records the information of peers last accessed files to achieve the goal. Through the simulation, we experimented with different cache size under various peer availabilities. The experiments results showed that both the access performance and maintenance cost are improved at least 83% in the Regenerating Code scheme when the cache size is 64 file blocks, where a file is divided into 7 blocks.

參考文獻


[2] M. Mdard, R. Koetter, “An Algebraic Approach to Network Coding”, IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 11, NO. 5, OCTOBER 2003
[3] S.-Y. R. Li, R. W. Yeung and N. Cai, “Linear network coding,” IEEE Trans. Inform. Theory,” IT-49: 371-381, 2003.
[4] C. Fragouli, J.-Y. L. Boudec, and J. Widmer, “Network Coding: An Instant Primer,”
[5] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, “Network information flow,” IEEE Trans. Inform. Theory, vol. IT-46, pp. 1204–1216, 2000.
[6] X. Zhang, G. Neglia, J. Kurose, and D. Towsley, “On the benefits of random linear coding for unicast applications in disruption tolerant networks,” Second Workshop on Network Coding, Theory, and Applications (NETCOD), 2006

延伸閱讀