
Algorithm and Hardware Architecture of a Just-Noticeable-Difference-Based Perception Engine for Video Encoders

Advisor: Shao-Yi Chien


Abstract


Multimedia plays an important role in everyday life. Because of limits on computational complexity and transmission bandwidth, processing high-quality video data efficiently remains an important problem. The newest video compression standard, HEVC, offers compression ratios of tens to hundreds. The final receiver of video information is the human eye; however, traditional standards use only the Peak Signal-to-Noise Ratio (PSNR) as the quality index for the compressed video bit stream. PSNR does not account for the properties of the human visual system (HVS), so the bit allocation of a video bit stream is usually not optimized for human perception. Allocating bit rate effectively across different video content within a limited bandwidth is therefore important. With proper bit allocation, such as spending more bits on important areas of a frame and fewer bits on unimportant areas, the bit rate can be reduced; equivalently, at the same bit rate, the compressed video shows better perceptual quality than video compressed with the original standard. The key to such allocation is modeling human perception in the HVS. Many research efforts have been dedicated to modeling the characteristics of the human visual system, and the resulting models have been integrated into video coding frameworks in different ways. Among them, coding enhancements based on the Just-Noticeable Difference (JND) model have drawn much attention in recent years because of their significant gains. However, conventional HVS perception engines for video coding merely combine different HVS techniques through fusion algorithms that ignore the mutual effects between those techniques. In this work, we discuss the relations among these models and propose a new JND model for video codecs.
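As a concrete illustration of what a JND threshold captures, the classic pixel-domain luminance-adaptation model of Chou and Li [5] assigns each pixel a visibility threshold that is large on dark backgrounds, smallest near mid-gray, and rises again toward white; distortion below the threshold is assumed invisible to the eye, even though PSNR still penalizes it. A minimal sketch (the threshold shape follows [5] with its commonly quoted constants; this is only the luminance component, whereas the thesis's model also folds in attention and quantization distortion):

```python
def luminance_jnd(bg_lum):
    """Visibility threshold (in gray levels) for a pixel whose local
    background luminance is bg_lum in [0, 255], after Chou & Li [5].
    Dark backgrounds tolerate large errors; mid-gray tolerates the least.
    """
    if bg_lum <= 127:
        # Dark-to-mid region: threshold falls from 20 down to 3.
        return 17.0 * (1.0 - (bg_lum / 127.0) ** 0.5) + 3.0
    # Mid-to-bright region: threshold climbs linearly back up to 6.
    return (3.0 / 128.0) * (bg_lum - 127.0) + 3.0

print(luminance_jnd(0))    # 20.0 (dark background, errors well masked)
print(luminance_jnd(127))  # 3.0  (mid-gray, most sensitive)
print(luminance_jnd(255))  # 6.0  (bright background)
```

For example, a pixel error of 10 gray levels sits below the threshold on a black background but far above it on mid-gray; PSNR scores both errors identically, which is exactly the mismatch a JND-guided encoder exploits.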
To satisfy the real-time requirements of video encoding systems, we also propose a low-hardware-complexity HVS perception evaluation algorithm that improves the bit allocation of video encoders with negligible hardware overhead. The main goal of this thesis is to discuss the relations among the related techniques and to propose a new JND model for video codecs. The perception evaluation engine must analyze the content of the current video frame and determine the bit allocation for that data. The proposed JND model considers the visual attention model, the sensitivity models, and the quantization distortion simultaneously to obtain a perceptual-importance weight for each coding unit (CU) in the frame. Cooperating with the HEVC video encoding system, we further developed an algorithm and system architecture suitable for hardware implementation, which analyzes the video content and then determines the quantization parameter in the encoding system. To save system bandwidth, we employ CU-based processing as the basic unit of the processing flow, with parallel processing for each hardware block of the visual model. To ensure the compatibility of our perception model, we adopt [1] as our base platform. The proposed algorithm achieves better bit allocation for video coding systems by changing quantization parameters at the CU level. Simulations combining the proposed evaluation engine with the HEVC encoder in HM11.0, together with subjective experiments, show that the algorithm achieves about 14% bit-rate saving in the QP range of 27-37 without perceptual (visual) quality degradation. The proposed evaluation engine is implemented in TSMC 40 nm technology.
The core size is about 0.1 mm2, the power consumption is 7.39 mW, the engine is compatible with [1], and its hardware overhead relative to [1] is about 1%.
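The CU-level QP adjustment described above can be sketched as follows. This is an illustrative skeleton only: the fusion rule, the `alpha` blend, and the `max_delta` offset range are hypothetical placeholders standing in for the thesis's actual model, which derives the weight from its full JND computation.

```python
from dataclasses import dataclass

@dataclass
class CUFeatures:
    """Per-CU features assumed to come from the analysis stage (hypothetical)."""
    attention: float    # visual-attention weight in [0, 1] (1 = highly salient)
    sensitivity: float  # HVS-sensitivity weight in [0, 1] (1 = distortion easily seen)

def jnd_weight(cu, alpha=0.5):
    """Fuse attention and sensitivity into one importance weight in [0, 1].
    A plain linear blend; the thesis's model fuses its components jointly."""
    return alpha * cu.attention + (1.0 - alpha) * cu.sensitivity

def adjust_qp(base_qp, weight, max_delta=4):
    """Map an importance weight to a per-CU QP.
    Important CUs (weight near 1) get a lower QP (more bits); unimportant
    CUs get a higher QP (fewer bits). QP is clipped to HEVC's 0-51 range."""
    delta = round((0.5 - weight) * 2 * max_delta)  # in [-max_delta, +max_delta]
    return max(0, min(51, base_qp + delta))

# Example: a salient, sensitive CU vs. a flat background CU at base QP 32.
salient = CUFeatures(attention=0.9, sensitivity=0.8)
background = CUFeatures(attention=0.1, sensitivity=0.2)
print(adjust_qp(32, jnd_weight(salient)))     # 29: lower QP, more bits
print(adjust_qp(32, jnd_weight(background)))  # 35: higher QP, fewer bits
```

Shifting bits from the background CU to the salient CU in this way is what lets the encoder cut total bit rate without a perceptible quality loss.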

References


[3] J. G. Robson, “Spatial and temporal contrast sensitivity functions of the visual system,” J. Opt. Soc. Amer., vol. 56, pp. 1141–1142, 1966.
[4] S. Daly, “Engineering Observations from Spatiovelocity and Spatiotemporal Visual Models,” IS&T/SPIE Conference on Human Vision and Electronic Imaging, vol. 3299, Jan. 1998.
[5] C.-H. Chou and Y.-C. Li, “A Perceptually Tuned Subband Image Coder Based on the Measure of Just-Noticeable-Distortion Profile,” IEEE Trans. Circuits Syst. Video Technol., vol. 5, no. 6, pp. 467-476, Dec. 1995.
[6] C.-H. Chou and C.-W. Chen, “A Perceptually Optimized 3-D Subband Codec for Video Communication over Wireless Channels,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 2, pp. 143-156, Apr. 1996.
[8] J. Malo, J. Gutierrez, I. Epifanio, F. Ferri, and J. M. Artigas, “Perceptual Feedback in Multigrid Motion Estimation Using an Improved DCT Quantization,” IEEE Trans. Image Process., vol. 10, no. 10, pp. 1411-1427, Oct. 2001.
