  • Thesis

多核心架構下純軟體MPEG-4 AVC/H.264影像編碼器的設計與實作

Design and Implementation of Software-Based MPEG-4 AVC/H.264 Encoder on Multi-Core Processors

Advisor: 吳家麟

Abstract


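MPEG-4 AVC/H.264 is currently the most advanced video coding technology. Compared with earlier techniques, MPEG-4 AVC/H.264 saves roughly 50% of storage space while delivering better visual quality. This better compression performance, however, comes at the cost of higher computational complexity. Execution-time profiling shows that for high-definition (HD) video, motion estimation accounts for more than 75% of the computation. In this thesis we therefore investigate how to reduce the complexity of each coding tool used in motion estimation. Motion estimation in MPEG-4 AVC/H.264 relies on several advanced coding tools, including multiple reference frames, variable block sizes, and quarter-pixel motion vectors. For multiple reference frames, we design a fast algorithm that exploits the correlation between adjacent frames, reducing the computation while retaining the benefit of this coding tool. For the other coding tools, we study and implement different algorithms that reduce computational complexity, and use them to understand how the compression results of different algorithms differ on HD video. With these methods, the share of motion estimation falls to 38% of the total encoding time and overall encoding is accelerated by a factor of 14, at the cost of only slight degradation in picture quality or compression ratio.

Driven by the demands of varied applications, multi-core processor architectures are being widely adopted. In embedded systems, asymmetric multi-core architectures are used to handle tasks of different natures, while symmetric multi-core processors are regarded as the mainstream of next-generation personal computers. Typical video encoders, however, are single-threaded and cannot exploit multi-core processors, so this thesis focuses on multi-core video compression techniques. Conventional multi-core compression processes different data concurrently, for example slices, but this approach generally suits symmetric processors. To support asymmetric processors, we analyze the video compression pipeline, examine the dependencies among its tasks, and design a function parallel scheme that executes different coding tools in parallel. We also combine this with the earlier data-parallel technique to propose a hybrid parallel scheme that improves performance on symmetric architectures. The two proposed schemes improve MPEG-4 AVC/H.264 encoding performance on multi-core architectures without increasing the use of system resources.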
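
To make the motion-estimation cost concrete, the following is a minimal sketch of integer-pixel block matching over several reference frames. It is not the fast algorithm of the thesis, which the abstract does not specify. The function names, the SAD cost, the full-search pattern, and the `good_enough` early-exit threshold are all assumptions chosen for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustive integer-pixel search; returns (best_sad, (dx, dy))."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= w
                    and 0 <= by + dy and by + dy + n <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


def multi_ref_search(cur, refs, bx, by, n=4, search_range=2, good_enough=0):
    """Search reference frames in list order (nearest first).

    Hypothetical early exit (an assumption, not the thesis's rule): stop as
    soon as the best SAD so far is <= good_enough.
    Returns (best_sad, ref_index, (dx, dy))."""
    best = None
    for idx, ref in enumerate(refs):
        cost, mv = full_search(cur, ref, bx, by, n, search_range)
        if best is None or cost < best[0]:
            best = (cost, idx, mv)
        if best[0] <= good_enough:
            break
    return best


def make_frame(shift_x, shift_y, size=8):
    """Synthetic test frame: a quadratic pattern sampled at a shifted origin."""
    return [[3 * (x + shift_x) ** 2 + 5 * (y + shift_y) ** 2
             + (x + shift_x) * (y + shift_y)
             for x in range(size)] for y in range(size)]


cur = make_frame(0, 0)
flat = [[0] * 8 for _ in range(8)]   # reference frame with no useful match
shifted = make_frame(-1, -2)         # same content, displaced by (1, 2)
result = multi_ref_search(cur, [flat, shifted], bx=2, by=2)
print(result)
```

Each additional reference frame repeats the whole search, which is why skipping frames that are unlikely to help can save a large share of the encoding time.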

Parallel Abstract


The latest video coding standard, MPEG-4 AVC/H.264, achieves better coding performance than prior codecs. Compared with MPEG-4, H.263, and MPEG-2, MPEG-4 AVC/H.264 saves about 37%, 48%, and 64% of the bit rate, respectively. This performance, however, comes from advanced coding tools that raise computational complexity, and the higher complexity limits the application scenarios. According to time profiling, temporal-domain prediction takes about 75% of the execution time on HD-resolution videos. Our work therefore accelerates temporal prediction with several efficient algorithms that have no serious side effects. First, we propose a fast algorithm for multiple reference frames based on correlations between temporally adjacent frames. In addition, several simplification methods for block matching, variable-block-size motion search, and fractional-pixel search are studied and implemented to reduce the computation load. With these algorithms applied, temporal prediction requires only 38% of the computation load. Moreover, compared with the H.264/AVC reference software, our encoder achieves about a 12X speedup without serious quality degradation or loss of compression ratio.

Multi-core processor architectures are becoming increasingly popular and are widely adopted in many areas. In embedded systems, asymmetric multi-core processors are used for tasks with different attributes, while symmetric multi-core processors are considered the next generation of CPUs for personal computers (PCs). However, common H.264 encoders are single-threaded and cannot take advantage of multi-core processors, so several parallel schemes have been proposed. The traditional methods process multiple data sets, such as slices, in parallel but are applicable only to symmetric architectures. In our work, we exploit the dependency relationships between coding tools and design a function parallel scheme for asymmetric architectures. Furthermore, we use a wavefront macroblock encoding order to avoid inter-dependencies between data sets and propose a hybrid parallel scheme applicable to both symmetric and asymmetric architectures. With the proposed schemes, the encoding process is further accelerated.
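
The wavefront macroblock order mentioned above can be sketched as follows. This is an illustration, not the thesis's scheduler. It assumes the usual H.264 neighbor dependency pattern, in which a macroblock needs its left, top-left, top, and top-right neighbors. The function name `wavefront_schedule` and the wave index `x + 2 * y` are choices made for this sketch.

```python
def wavefront_schedule(mb_cols, mb_rows):
    """Group macroblocks (x, y) into waves that can be encoded in order.

    Assumption: each macroblock depends on its left, top-left, top, and
    top-right neighbors. With wave index x + 2 * y, every such neighbor has
    a strictly smaller index, so macroblocks sharing an index are mutually
    independent and may be processed in parallel."""
    waves = {}
    for y in range(mb_rows):
        for x in range(mb_cols):
            waves.setdefault(x + 2 * y, []).append((x, y))
    return [waves[k] for k in sorted(waves)]


# Small example: a frame of 4 x 3 macroblocks.
waves = wavefront_schedule(4, 3)
for step, wave in enumerate(waves):
    print(step, wave)
```

Under the same assumption, a 1920x1088 frame (120 x 68 macroblocks) has at most 60 macroblocks in any single wave, which bounds the data-level parallelism available within one frame.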

