
Accelerating Data Deduplication with Heterogeneous System Architecture

Advisor: 洪士灝

Abstract


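By removing redundant data chunks, deduplication allows data to be transmitted and stored more efficiently. In commercial environments, backup jobs must finish within a short time, so the deduplication process must not become a performance bottleneck that slows the system down. Unfortunately, on traditional general-purpose processor architectures, the computing power that deduplication requires is too costly for low-end storage devices. A discrete general-purpose GPU can indeed accelerate the deduplication process, but only two steps, chunking and fingerprint computation, can be accelerated efficiently. This approach also runs into two problems: first, transferring data between the device and the host over PCI Express (PCIe) introduces latency; second, when the deduplication program runs on the GPU, the amount of data it can process is limited by the memory installed on the device. These difficulties lower the system's execution efficiency and restrict the ways in which such applications can be accelerated. To address these limitations, we use the memory sharing provided by the heterogeneous system architecture (HSA) to improve the deduplication process. According to our experimental results, HSA has several advantages over the traditional discrete GPGPU architecture and delivers better computing performance per watt.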

Parallel Abstract


Deduplication techniques remove redundant data segments so that data can be transmitted or stored economically. In enterprise environments, where data copy and backup processes must complete within short time windows, data deduplication must not become a performance bottleneck that slows down the system. Unfortunately, the computational requirements of deduplication make it too expensive for low-end storage devices with traditional processor architectures. While discrete graphics processing units (GPUs) have been used to speed up the data deduplication process, only two operations in the process, i.e., chunking and fingerprinting, have been accelerated effectively in previous work. Moreover, such methods suffer from the latency of data movement between the host and the GPU over the PCIe bus, and the amount of data that the deduplication program running on the GPU can handle is limited by the local memory attached to the GPU. These issues prevent the system from achieving its best utilization and limit the applicability of the approach. To overcome these limitations, we redesign the data deduplication process around the shared memory feature provided by the emerging heterogeneous system architecture (HSA). Our experimental results indicate that HSA offers several advantages over traditional discrete GPU architectures and leads to better power efficiency.
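
The two operations named above, chunking and fingerprinting, can be illustrated with a minimal CPU-only sketch. It is not the implementation evaluated in this thesis: the fixed 4 KiB chunk size, the choice of SHA-256 as the fingerprint, and the names `deduplicate` and `restore` are all assumptions made for illustration, and no GPU or HSA offloading is shown.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep each distinct chunk once.

    Assumptions for illustration only: fixed-size chunking and SHA-256
    fingerprints. The abstract does not state which schemes the thesis uses.
    """
    store = {}   # fingerprint -> chunk bytes (unique chunks only)
    recipe = []  # ordered fingerprints needed to rebuild the input
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]          # chunking
        fingerprint = hashlib.sha256(chunk).hexdigest()   # fingerprinting
        if fingerprint not in store:                      # index lookup
            store[fingerprint] = chunk
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original byte stream from the chunk store and the recipe."""
    return b"".join(store[fp] for fp in recipe)


if __name__ == "__main__":
    # Four 4 KiB chunks, of which only two are distinct.
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    store, recipe = deduplicate(data)
    print(len(recipe), "chunks referenced,", len(store), "chunks stored")
    assert restore(store, recipe) == data
```

In this sketch the loop body is where the per-chunk work sits; the abstract identifies the chunking and fingerprinting steps as the parts that earlier GPU-based systems accelerated effectively.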

