Integrated Fast Mode Decision and Structural Similarity Based R-D Optimization Algorithms for H.264 Encoders

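The success of H.264 standardization implies that the coding tools of the next-generation video coding standard will become more complex and require extensive computation as video moves toward higher resolution. To meet the real-time requirements of many consumer electronics and multimedia communication applications, it is therefore important to develop algorithms that improve the computational efficiency of these advanced coding tools. On the other hand, because video quality is ultimately judged by human visual perception, we firmly believe that the characteristics of human vision must be taken into account when designing the next-generation video coding system. This thesis consists of two parts: integrated fast mode decision, and rate-distortion (R-D) optimization based on structural similarity. In the first part, we propose a fast mode decision algorithm for each of three different levels of the H.264 mode decision hierarchy: variance-based macroblock mode decision, an improved filter-based prediction mode decision, and selective intra mode decision based on R-D characteristics. How to integrate them is also a focus of the study, and we further propose integrated fast algorithms for intra-frame coding and for inter-frame coding, respectively. The integrated algorithms greatly reduce computational complexity without causing noticeable loss in R-D performance, and the experimental results show the advantage of the proposed algorithms. In the second part, we formulate an R-D optimization framework based on structural similarity (SSIM) for the H.264 mode decision process, and propose a predictive Lagrange multiplier selection method suited to this framework. To meet the needs of different applications, prediction schemes of different computational complexity are presented and discussed. At the same image quality as measured by SSIM, the proposed method achieves a bit rate reduction of about 5% to 10%. Subjective visual evaluation shows that, under the same bit rate constraint, the proposed method preserves more detail and produces fewer blocking artifacts than a conventional MSE-optimized H.264 encoder, and thus yields better image quality.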
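As an illustration of the first of the three levels, variance-based macroblock mode decision can be sketched as follows. This is a minimal sketch under stated assumptions, not the algorithm of the thesis: the abstract does not give the decision rule, so the single threshold, its value, the function names, and the choice between only two intra partitions are all hypothetical.

```python
from statistics import pvariance


def block_variance(block):
    """Population variance of the luma samples in a 2-D block (list of rows)."""
    return pvariance([p for row in block for p in row])


def choose_intra_mb_mode(mb, threshold=100.0):
    """Pick which intra partition to evaluate for a 16x16 luma macroblock.

    Assumption: smooth blocks (low variance) are served well by one 16x16
    prediction, while textured blocks need 4x4 prediction. The threshold is
    a hypothetical tuning value, not a figure from the thesis.
    """
    if len(mb) != 16 or any(len(row) != 16 for row in mb):
        raise ValueError("expected a 16x16 macroblock")
    return "I16x16" if block_variance(mb) < threshold else "I4x4"


# A flat block and a synthetic textured block.
flat = [[128] * 16 for _ in range(16)]
textured = [[(x * 37 + y * 91) % 256 for x in range(16)] for y in range(16)]

print(choose_intra_mb_mode(flat))
print(choose_intra_mb_mode(textured))
```

The saving comes from skipping the full R-D evaluation of the partition that the variance test rules out; a real encoder would calibrate the threshold against the quantization parameter.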

Keywords

H.264; mode decision; rate-distortion optimization; structural similarity; Lagrange multiplier selection

Parallel Abstract

The success of H.264 standardization implies that the coding tools of the next-generation video coding standard, for example H.265, will become more complicated and require extensive computation for high-quality video. To satisfy the real-time requirements of many consumer electronics and multimedia communication applications, it is necessary to enhance the computational efficiency of such advanced coding tools. On the other hand, because video quality is ultimately judged by human eyes, we strongly believe that the characteristics of the human visual system must be taken into account in the design of the next-generation video coding system. Motivated by these requirements, this thesis develops algorithms for (1) integrated fast mode decision and (2) structural similarity based rate-distortion optimization. In the first part, three fast intra mode decision algorithms are proposed for different stages in the mode decision hierarchy of H.264: variance-based MB mode decision, improved filter-based prediction mode decision, and R-D characteristic based selective intra mode decision. Their integration is also investigated, and we propose integrated fast algorithms for intra-frame coding and inter-frame coding, respectively. The integrated algorithms achieve high complexity reduction without introducing noticeable R-D performance loss, and experimental results are provided to show the superiority of the proposed algorithms. In the second part, we develop a rate-distortion optimization framework based on structural similarity for the mode decision process in H.264, and propose a predictive Lagrangian multiplier selection method for this framework. To estimate the Lagrangian multiplier, approaches with different computational overhead are presented to meet the requirements of different target applications. The proposed method achieves about 5% to 10% bit rate reduction at the same quality as measured by the SSIM index. In subjective evaluation, the proposed method preserves more detail and introduces fewer blocking artifacts than the MSE-based H.264 encoder under the same bit rate constraint.
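The idea behind the second part, replacing MSE with a structural similarity measure in the Lagrangian cost, can be sketched as below. This is a minimal sketch under stated assumptions rather than the framework of the thesis: it takes the distortion to be `1 - SSIM`, computes SSIM over one window using the standard constants, and uses a fixed, hypothetical multiplier `lam` where the thesis proposes a predictive selection method that the abstract does not detail. All function names are illustrative.

```python
def ssim(x, y, peak=255.0):
    """Single-window SSIM between two equal-length lists of samples."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    c1 = (0.01 * peak) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * peak) ** 2  # stabilizer for the contrast/structure term
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    num = (2 * mx * my + c1) * (2 * cxy + c2)
    den = (mx * mx + my * my + c1) * (vx + vy + c2)
    return num / den


def rd_cost(orig, recon, bits, lam):
    """Lagrangian cost J = D + lam * R, with D assumed to be 1 - SSIM."""
    return (1.0 - ssim(orig, recon)) + lam * bits


def best_mode(orig, candidates, lam):
    """Return the mode with the lowest cost.

    candidates maps a mode name to (reconstructed samples, bits spent).
    """
    return min(
        candidates,
        key=lambda m: rd_cost(orig, candidates[m][0], candidates[m][1], lam),
    )


# Toy block: an exact but expensive mode versus a cheap, flat one.
orig = [10, 50, 90, 130, 170, 210, 250, 30]
mean = sum(orig) / len(orig)
candidates = {
    "exact": (list(orig), 200),
    "flat": ([mean] * len(orig), 20),
}

print(best_mode(orig, candidates, lam=0.0))
print(best_mode(orig, candidates, lam=0.01))
```

Because `1 - SSIM` is bounded between 0 and 2 while MSE is not, a multiplier tuned for MSE cannot be reused directly, which is why the framework needs its own multiplier selection.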

Parallel Keywords

H.264; mode decision; rate-distortion optimization; structural similarity; Lagrange multiplier selection
