A Dynamic Metadata Partitioning Method for Large-Scale Distributed Storage Systems

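This thesis describes an efficient partitioning algorithm for tree-structured metadata. Metadata lookup is an important operation in file systems, peer-to-peer systems, and grid systems. Before a resource can be used, its associated metadata, such as the resource's location and access permissions, must first be read. Efficient retrieval of a resource's metadata is therefore important to the access efficiency of the whole system. To improve metadata query efficiency, multiple distributed metadata servers can be used. This thesis focuses on how to partition a hierarchical metadata structure so that the load is spread evenly across the metadata servers. We define the cost as the number of server transitions incurred during a query, and propose a dynamic programming method that partitions the metadata tree across several metadata servers so that the maximum cost over all metadata servers is minimized. We also propose optimization techniques that reduce the time complexity of the dynamic programming algorithm, and we consider the case in which the number of metadata queries changes dynamically. Extensive experimental results show that the optimized procedure is efficient in running time and effectively minimizes workload imbalance.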

Keywords

metadata partitioning; distributed systems; large-scale data storage; dynamic partitioning; load balancing

Parallel Abstract

This dissertation describes an efficient partitioning algorithm for metadata trees. Metadata querying is an important operation in file systems, peer-to-peer systems, and grid information systems. Before a resource can be used, its metadata, such as the resource's location and access permissions, must be obtained. Consequently, efficient retrieval of a resource's metadata is important to the overall access efficiency of a system. To improve the efficiency of metadata querying, we can use multiple distributed metadata servers. This dissertation focuses on how to partition a hierarchical metadata structure so that the query workload is distributed evenly among multiple metadata servers. We take the number of transitions from one server to another as the cost of a query, and propose a dynamic programming approach that partitions the metadata tree so that the maximum cost over all servers is minimized. In addition, we propose optimization techniques that reduce the time complexity of the dynamic programming procedure. We also consider the case where the metadata request pattern changes dynamically. The results of extensive experiments demonstrate that the optimized procedure is efficient in run time and effective in minimizing workload imbalance.
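The cost model mentioned in the abstract can be illustrated with a small sketch. This is not the dissertation's dynamic programming algorithm; it only evaluates a given partition. It assumes that a lookup walks the tree from the root to the target node, that each change of server along that path counts as one transition, and that a transition is charged to the server being entered. The function names, the example tree, the server assignment, and the query counts are all hypothetical.

```python
from collections import defaultdict

def root_path(node, parent):
    """Return the nodes on the path from the root down to `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return path

def count_transitions(node, parent, server_of):
    """Count server changes along the root-to-node lookup path."""
    path = root_path(node, parent)
    return sum(1 for a, b in zip(path, path[1:]) if server_of[a] != server_of[b])

def server_costs(parent, server_of, queries):
    """Sum query-weighted transitions, charged to the server being entered.

    Charging the entered server is an assumption made for illustration.
    """
    costs = defaultdict(int)
    for node, count in queries.items():
        path = root_path(node, parent)
        for a, b in zip(path, path[1:]):
            if server_of[a] != server_of[b]:
                costs[server_of[b]] += count
    return dict(costs)

# Hypothetical metadata tree: child -> parent (the root has parent None).
parent = {
    "/": None,
    "/home": "/",
    "/home/a": "/home",
    "/home/b": "/home",
    "/var": "/",
    "/var/log": "/var",
}
# Hypothetical assignment of tree nodes to metadata servers 0, 1, and 2.
server_of = {"/": 0, "/home": 0, "/home/a": 1, "/home/b": 0, "/var": 2, "/var/log": 2}
# Hypothetical number of queries issued for each target node.
queries = {"/home/a": 10, "/home/b": 5, "/var/log": 3}

costs = server_costs(parent, server_of, queries)
max_cost = max(costs.values())
print(costs, max_cost)
```

Under these assumptions, a partitioning method in the spirit of the abstract would search over assignments such as `server_of` for one that makes `max_cost` as small as possible; the sketch only scores a single candidate.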

Parallel Keywords

metadata partitioning; distributed file system; large-scale data storage; dynamic partitioning; load balancing
