  • Thesis

Multiview Encoder Parallelized Fast Search Realization on NVIDIA CUDA

Advisors: 楊士萱, 杭學鳴

Abstract


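With the rapid development of graphics chips, using them for non-graphics computation has gradually matured. Using the GPU to assist the CPU with general computation is known as general-purpose computing on graphics processing units (GPGPU). In 2007, NVIDIA introduced a new GPGPU processor architecture, Compute Unified Device Architecture (CUDA). CUDA makes NVIDIA's hardware-multithreaded GPUs programmable for processing large amounts of data in parallel. Our system uses an NVIDIA GTX-280, which has 240 computing cores, as the experimental platform for implementing parallel algorithms. In multiview video coding (MVC), the extension of H.264/AVC currently under development, the most time-consuming parts of the encoder are motion estimation (ME) and disparity estimation (DE). We propose a parallelizable fast algorithm, multithreaded one-dimensional search (MODS), which can be used for both ME and DE. We implement MODS for the encoder's integer-pixel ME and DE on the NVIDIA GTX-280 platform and obtain a speedup of about 89 times over the CPU version. Compared with the fast algorithm in the reference software, the CUDA-accelerated MODS is also up to 21 times faster on video coded with both ME and DE.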

Parallel Abstract


Owing to the rapid growth in the processing capability of graphics processing units (GPUs), using them for non-graphics computation has become increasingly popular. In 2007, NVIDIA announced a powerful GPU architecture called Compute Unified Device Architecture (CUDA), which provides massive data parallelism under the SIMD architecture constraint. We use an NVIDIA GTX-280 GPU system, which has 240 computing cores, as the platform for implementing a very complicated video coding scheme. The Multiview Video Coding (MVC) scheme, an extension of H.264/MPEG-4 Part 10 (AVC), is being developed by the international standards team formed jointly by the ITU-T Video Coding Experts Group and the ISO/IEC JTC 1 Moving Picture Experts Group (MPEG). It is an efficient video compression scheme; however, its computational complexity is very high. Two of its most time-consuming components are motion estimation (ME) and disparity estimation (DE). In this thesis, we propose a fast search algorithm called multithreaded one-dimensional search (MODS), which can be used for both ME and DE. We implement the integer-pel ME and DE processes with MODS on the GTX-280 platform. The speedup is about 89 times over the CPU-only configuration. Even when the fast search algorithm of the original JMVC is turned on, the MODS version on CUDA is still up to 21 times faster.
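
As a rough illustration of what an integer-pel one-dimensional search involves, the sketch below matches one block by a sum of absolute differences (SAD), first along the horizontal axis and then along the vertical axis. It is not the MODS algorithm from the thesis, whose details are not given in the abstract. The function names (sad, one_d_search), the block size, the search range, and the synthetic frames are all assumptions made for illustration, and the code runs on the CPU only.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block
    of `ref` displaced by the integer vector (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def one_d_search(cur, ref, bx, by, n, r):
    """Hypothetical two-pass 1-D search: horizontal pass at dy = 0, then
    a vertical pass at the best dx. Returns (dx, dy, cost).
    Assumes the zero-displacement block lies inside `ref`."""
    h, w = len(ref), len(ref[0])

    def in_bounds(dx, dy):
        return (0 <= bx + dx and bx + dx + n <= w and
                0 <= by + dy and by + dy + n <= h)

    best_dx, best_dy = 0, 0
    best = sad(cur, ref, bx, by, 0, 0, n)

    # Horizontal pass. Each candidate cost is independent of the others,
    # which is the property a multithreaded version could exploit.
    for dx in range(-r, r + 1):
        if in_bounds(dx, 0):
            cost = sad(cur, ref, bx, by, dx, 0, n)
            if cost < best:
                best, best_dx = cost, dx

    # Vertical pass at the horizontal winner.
    for dy in range(-r, r + 1):
        if in_bounds(best_dx, dy):
            cost = sad(cur, ref, bx, by, best_dx, dy, n)
            if cost < best:
                best, best_dy = cost, dy

    return best_dx, best_dy, best


# Synthetic frames: `cur` is `ref` shifted horizontally by 3 pixels,
# similar to a pure disparity between two rectified views.
W, H = 24, 12
ref = [[x * x + y for x in range(W)] for y in range(H)]
cur = [[(x + 3) ** 2 + y for x in range(W)] for y in range(H)]

result = one_d_search(cur, ref, bx=8, by=4, n=4, r=4)
print(result)
```

A two-pass search of this kind evaluates 2(2r + 1) candidates instead of the (2r + 1)^2 of a full search, at the price of possibly missing the best match when the true displacement has both a horizontal and a vertical component.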
