Evaluation of Perceptual Video Coding

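To further improve compression efficiency and output quality, video coding research must find a feasible new direction. The distortion metric used in the encoder profoundly affects coding quality, yet traditional distortion metrics such as mean squared error do not match the characteristics of human visual perception. We therefore believe that perceptual video coding (specifically, adopting perceptual quality metrics in the coding process) is a sensible way to raise video coding quality to a new level. In this thesis, we adopt structural similarity as the distortion measure in the rate-distortion optimization framework; compared with JM, the H.264/AVC reference software, an average bitrate reduction of about 12% is achieved. This result confirms that adopting perceptual quality metrics in the coding process can help existing video coding systems achieve better compression efficiency. During the study, however, we also found that because quality metrics are designed without considering their integration with video coding, adopting different quality metrics in an encoder can run into many difficulties. We therefore propose the new concept of "codec-friendliness": how easily a perceptual quality metric can be applied within existing video coding algorithms. Finally, from the video coding perspective, we propose guidelines that can be consulted when developing future quality metrics.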

Keywords

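perceptual video coding; structural similarity; mode decision; rate-distortion optimization; human visual system; perceptual quality metric; codec-friendliness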

Parallel Abstract

There is a need to seek a feasible direction for video coding that can provide a significant quality improvement over existing video coding standards. In light of the well-known findings that the distortion metric for video quality has a profound impact on video coding performance and that traditional metrics such as mean squared error are poorly correlated with human perception, we identify perceptual video coding, more specifically, adopting perceptual quality metrics in video coding, as a sensible approach that has the potential to drive the performance of video coding to a significantly higher quality level. In this thesis, the structural similarity (SSIM) index is adopted as the distortion metric in the rate-distortion optimization framework, and an average bitrate reduction of about 12% is achieved over the JM reference software of H.264/AVC. This result shows that existing video coding systems can benefit from adopting perceptual quality metrics. However, we have found that the SSIM index and similar metrics cannot be easily adopted in video coding systems, because most perceptual quality metrics are developed without considering their integration with a video encoder. Here we introduce the concept of “codec-friendliness,” meaning how readily a perceptual quality metric can be incorporated into the coding process of standard video coding algorithms. We conclude the study by suggesting guidelines for the development of future quality metrics from the video coding perspective.
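As a rough illustration of what using SSIM as the distortion term in rate-distortion optimization can look like, the sketch below picks the coding mode that minimizes a Lagrangian cost J = D + λR with D = 1 − SSIM. This is a minimal sketch under stated assumptions, not the formulation used in the thesis: the cost form, the single-window SSIM computed from whole-block statistics, the function names `ssim_block` and `choose_mode`, and the mode labels are all hypothetical, and the constants follow the commonly used SSIM defaults (K1 = 0.01, K2 = 0.03).

```python
def ssim_block(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM of two blocks given as flat sequences of pixels.

    Assumption: statistics are taken over the whole block, not over the
    sliding Gaussian windows used by the full SSIM index.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("blocks must be non-empty and of equal size")
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den


def choose_mode(original, candidates, lam):
    """Return (mode, cost) minimizing J = (1 - SSIM) + lam * bits.

    candidates maps a hypothetical mode label to a pair
    (reconstructed block, bits needed to code the block in that mode).
    """
    best_mode, best_cost = None, float("inf")
    for mode, (recon, bits) in candidates.items():
        cost = (1.0 - ssim_block(original, recon)) + lam * bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Toy 8x8 block (flattened) with two made-up candidate reconstructions.
original = [4 * i for i in range(64)]
flat = [sum(original) / 64.0] * 64      # cheap mode that loses all structure
close = [p + 1 for p in original]       # costly mode that preserves structure
candidates = {"skip": (flat, 1), "intra": (close, 40)}
mode, cost = choose_mode(original, candidates, lam=0.001)
```

With λ = 0 the candidate with the highest SSIM always wins, while a sufficiently large λ favors the candidate that costs the fewest bits. Because 1 − SSIM lies on a very different scale from the sum of squared differences, a multiplier tuned for a squared-error cost cannot simply be reused, which is one practical facet of the codec-friendliness issue the abstract raises.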

Parallel Keywords

perceptual video coding; structural similarity; mode decision; rate-distortion optimization; human visual system; perceptual quality metric; codec-friendliness

