Information technology is advancing rapidly. As network bandwidth and storage devices continue to evolve, the demand for digital data storage keeps rising, and data compression has become an important and unavoidable topic. Lossless compression guarantees that the original data can be recovered exactly, and striking a balance among coding efficiency, coding delay, and coding complexity is an important research direction.

This thesis rewrites the lossless Bzip2 compression algorithm for parallel processing. We first introduce the Nvidia CUDA parallel programming environment, which uses the GPU on a graphics card for computation beyond 3D graphics rendering, an approach generally known as GPGPU, so that the computing power of the graphics chip can be applied to compression coding. Because CUDA supports the C language, it lowers the barrier to developing GPU programs and makes a suitable experimental environment. In addition to reviewing the evolution of lossless compression coding, we describe the CUDA programming architecture and hardware.

To improve lossless coding, this thesis applies the Burrows-Wheeler Transformation and the Move-To-Front transformation before entropy coding. These transformations improve the lossless compression ratio, and our parallel program focuses on these two operations. Besides discussing how the work is distributed in the program, we compare and analyze the results of the GPU (CUDA) and CPU Bzip2 programs, and finally discuss the impact that parallelizing the compression algorithm has on the implementation of this system.
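To make the two transformations named in the abstract concrete, the following is a minimal CPU reference sketch in Python. It is not the CUDA implementation described in this thesis: it uses a naive sort of all rotations rather than any parallel scheme, and the function names (`bwt`, `inverse_bwt`, `mtf_encode`, `mtf_decode`) are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def bwt(data: bytes) -> Tuple[bytes, int]:
    """Naive Burrows-Wheeler transform (assumed reference version).

    Sorts all cyclic rotations of `data` and returns the last column
    together with the row index of the original string.
    """
    n = len(data)
    if n == 0:
        return b"", 0
    doubled = data + data  # rotation i is doubled[i:i + n]
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last = bytes(doubled[i + n - 1] for i in order)
    return last, order.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Invert bwt() given the last column and the original row index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by byte value maps each row of the
    # sorted matrix to the row holding its left rotation.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)


def mtf_encode(data: bytes) -> List[int]:
    """Move-To-Front: emit each byte's current index, then move it to the front."""
    table = list(range(256))
    out = []
    for b in data:
        idx = table.index(b)
        out.append(idx)
        table.insert(0, table.pop(idx))
    return out


def mtf_decode(indices: List[int]) -> bytes:
    """Invert mtf_encode() by replaying the same table updates."""
    table = list(range(256))
    out = bytearray()
    for idx in indices:
        b = table.pop(idx)
        out.append(b)
        table.insert(0, b)
    return bytes(out)
```

For the input `b"banana"`, `bwt` returns the last column `b"nnbaaa"` with original row index 3, and `mtf_encode` turns that column into `[110, 0, 99, 99, 0, 0]`. The transform groups equal bytes together, so Move-To-Front produces many small indices, which an entropy coder can represent compactly.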