  • Dissertation

應用於視訊傳輸上錯誤補償與感知視訊編碼系統之演算法與硬體架構研究

Algorithm and Hardware Architecture Design of Error Concealment and Perceptual Video Coding for Video Communication

Advisor: 簡韶逸 (Shao-Yi Chien)

Abstract


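Video transmission is challenging: network bandwidth, network heterogeneity, packet delay, and packet loss all complicate the design of a video transmission system. From the viewpoint of the video coding system, coding efficiency, error resilience, and scalability are the main challenges in today's computing and network environments. This dissertation focuses on the coding efficiency and error-resilience support of video coding systems. To increase coding efficiency, the proposed perception-oriented video coding system takes the characteristics of human visual perception into account. On the other hand, the error-resilient video decoding system includes error detection and error concealment, which mitigate the video packet loss caused by erroneous transmission. To reduce the latency of video packet transmission, corresponding hardware is developed for both the perceptual model and the error concealment algorithms to achieve real-time HDTV processing.

In the proposed perceptual video coding system, human perceptual characteristics are incorporated into the algorithms to increase coding efficiency. An efficient image or video coding system must remove not only spatial, temporal, and statistical redundancy but also the perceptual redundancy in images and video. The proposed perceptual model helps the video coding system allocate bits better by changing the quantization parameter on a macroblock basis. We adopt the structural similarity model, the visual attention model, the just-noticeable-difference model, and the contrast sensitivity function, and combine them appropriately to obtain a perceptual importance parameter for each macroblock. These models and algorithms are further modified to suit hardware implementation. For external memory bandwidth, a macroblock-based data reuse scheme is adopted to save up to 50% of the external memory bandwidth. In addition, a hardware architecture in which the models run in parallel and share on-chip memory reduces the hardware area cost. Subjective experiments show that the proposed model achieves 7-41% bit-rate savings for QP from 24 to 36 without visible degradation in video quality. For the hardware implementation of the perceptual model, a prototype chip is fabricated in TSMC 0.18 μm technology; at 100 MHz, its area is 3.3 × 3.3 mm², its power consumption is 83.9 mW, and it can process HDTV 720p at 30 frames per second.

To address packet loss and errors, this dissertation proposes an error-resilient video decoding system comprising error detection and error concealment. The proposed error detection scheme considers the spatial and temporal characteristics of the video signal. An adaptive threshold decision scheme is also developed to make the algorithm suitable for video sequences with different characteristics. Experiments show that the proposed method achieves a 0.5-2.4 dB PSNR improvement and is applicable to video decoded under different video coding standards.

We also propose efficient error concealment schemes to mitigate video packet loss. For consecutive frame losses caused by consecutive packet losses, we propose using the motion vector fields of the correctly decoded frames before and after the lost frames to predict the motion vector field of the frame being concealed. Experiments show that the proposed method yields better video quality than the original method, which predicts the motion vectors of the lost frame using only the motion vector field of the preceding frame. For non-consecutive frame losses, this dissertation proposes temporal and spatial error concealment schemes that operate on a macroblock basis. For spatial error concealment, the intra coding mode information in the video bitstream is used to choose between bilinear interpolation and directional interpolation. This method incurs an average video quality drop of only 0.08 dB PSNR but, compared with conventional methods, runs 40 times faster on a general-purpose processor, and it is also well suited to hardware implementation. For temporal error concealment, the decoded motion vectors of the blocks surrounding the block to be concealed are used as references for predicting its motion vector. To support real-time hardware processing, the proposed motion vector estimation scheme, which reuses data and computation results, incurs only a 0.18 dB PSNR quality drop compared with the conventional approach while reducing external memory bandwidth and computation by 96%. A prototype chip is fabricated in UMC 90 nm technology; at 125 MHz, its power consumption is 15.77 mW and it can process HDTV 1080p at 30 frames per second. Compared with previously proposed error concealment hardware, the proposed hardware achieves higher processing capability and a 1.81 dB PSNR improvement in video quality.

The main contributions of this dissertation fall into two directions. The first is a video coding system based on human perception, which improves coding efficiency. The second is an error-resilient video decoding system that addresses the problems caused by packet loss, for which error detection and error concealment algorithms and their hardware implementation are proposed. We sincerely hope that our research results can bring convenience and progress to people.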

Parallel Abstract


Video transmission is challenging. Issues such as bandwidth, heterogeneity, delay, and loss complicate the design of video transmission systems. From the viewpoint of the source coding layer, coding efficiency, error robustness, and scalability are the main challenges for video coding systems in today's computing and network environments. This dissertation focuses on coding efficiency and error-robustness support for video coding systems. To increase compression efficiency on the encoder side, a perception-aware video coding system that considers human perception is developed. On the decoder side, a robust video decoding system including error detection and error concealment is presented to alleviate the effects of erroneous channels. Moreover, the hardware architecture of the proposed error concealment algorithms is also addressed, because of the tight timing budget of real-time HDTV video processing.

In the proposed perception-aware video coding system, human perceptual considerations are incorporated into a traditional video coding system to increase coding efficiency. In image and video coding, an effective compression algorithm should remove not only spatial, temporal, and statistical redundancy but also perceptual redundancy from the pictures. The proposed perception model helps video coding systems achieve better bit allocation by changing quantization parameters at the macroblock level. We adopt the structural similarity model, visual attention models, the just-noticeable-distortion model, and the contrast sensitivity function, and combine them through a fusion algorithm to obtain a perceptual importance weight for each macroblock in a video frame. The algorithms of the model are further modified to suit hardware implementation. Macroblock-based processing with a data reuse scheme is used to save system bandwidth.
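The macroblock-level bit allocation described above can be illustrated with a minimal sketch. The abstract does not specify the fusion algorithm or how the importance weight maps to a quantization parameter, so the weighted average, the linear mapping, the `max_delta` parameter, and all function names below are assumptions made for illustration only. The only fixed fact used is the H.264/AVC QP range of 0 to 51.

```python
def fuse_importance(scores, weights):
    """Weighted average of per-model scores (hypothetical fusion rule).

    scores and weights are dicts keyed by model name; each score is
    assumed to be normalized to [0, 1] already.
    """
    total = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total


def macroblock_qp(base_qp, importance, max_delta=4, qp_min=0, qp_max=51):
    """Map an importance value in [0, 1] to a macroblock QP.

    Assumed linear mapping: importance 0.5 keeps the base QP, importance 1.0
    lowers it by max_delta (finer quantization), importance 0.0 raises it
    by max_delta (coarser quantization).
    """
    delta = round((0.5 - importance) * 2 * max_delta)
    # Clamp to the valid QP range of H.264/AVC.
    return max(qp_min, min(qp_max, base_qp + delta))


if __name__ == "__main__":
    scores = {"ssim": 0.2, "attention": 0.8, "jnd": 0.9, "csf": 0.7}
    weights = {"ssim": 1.0, "attention": 2.0, "jnd": 1.0, "csf": 1.0}
    importance = fuse_importance(scores, weights)
    print(macroblock_qp(30, importance))
```

Under this assumed mapping, perceptually important macroblocks receive a lower QP and therefore more bits, while less important ones absorb the savings.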
In addition, an architecture that processes the visual models in parallel while sharing on-chip memory and buffers is developed to reduce chip area. Subjective experiments show that the proposed model achieves about 7-41% bit-rate savings in the QP range of 24-36 without visual quality degradation. For the hardware implementation of the proposed evaluation engine, the chip is taped out in 0.18 μm technology. The chip size is about 3.3 × 3.3 mm², and the power consumption is 83.9 mW. The processing capability is HDTV 720p.

To handle the effects of erroneous channels, we propose a robust video decoding system that includes error detection and error concealment schemes for compressed video transmission. The proposed error detection scheme jointly considers spatial and temporal video characteristics. In addition, an adaptive threshold decision scheme makes the proposed algorithm suitable for video sequences with different characteristics. Simulation results show that the proposed technique achieves an image quality improvement of 0.5-2.4 dB. Furthermore, since the proposed method is applied to decoded frames, it can be used with any coding standard.

Moreover, this dissertation presents efficient error concealment algorithms for video bitstreams damaged during transmission over error-prone channels. An error concealment algorithm for successive frame losses in H.264/AVC bitstreams is developed. It estimates the motion field of a lost frame by forward or backward motion projection from a nearby frame that has a correct motion field. Experimental results demonstrate that the proposed algorithm obtains significant quality improvements, both objectively and subjectively. For non-successive frame losses, on the other hand, a spatial-temporal error concealment scheme is presented.
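The motion projection idea for successive frame losses can be sketched as follows. This is not the dissertation's algorithm, whose details the abstract does not give. It is a simplified forward projection under stated assumptions: motion is constant from frame to frame, each block's vector is expressed as a per-frame displacement in pixels, the first trajectory to reach a block wins, and uncovered blocks fall back to the co-located vector. The function name and the 16-pixel block size are illustrative choices.

```python
def project_motion_field(prev_mvs, block=16):
    """Project the motion field of a correctly decoded frame onto a lost frame.

    prev_mvs[r][c] is the assumed per-frame displacement (dx, dy), in pixels,
    of block (r, c) in the nearest frame with a correct motion field.
    """
    rows, cols = len(prev_mvs), len(prev_mvs[0])
    lost = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dx, dy = prev_mvs[r][c]
            # Block centre after moving one more frame along its trajectory.
            cx = c * block + block // 2 + dx
            cy = r * block + block // 2 + dy
            tr, tc = cy // block, cx // block
            # Keep the first vector that lands inside the frame on each block.
            if 0 <= tr < rows and 0 <= tc < cols and lost[tr][tc] is None:
                lost[tr][tc] = (dx, dy)
    # Blocks that no trajectory reaches reuse the co-located vector.
    return [[lost[r][c] if lost[r][c] is not None else prev_mvs[r][c]
             for c in range(cols)] for r in range(rows)]


if __name__ == "__main__":
    field = [[(16, 0), (0, 0)],
             [(0, 0), (0, 0)]]
    print(project_motion_field(field))
```

A backward projection would follow the same pattern, starting from a correctly decoded later frame and negating the displacements.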
For spatial error concealment, a mode selection algorithm that reuses the intra mode information embedded in the bitstream is developed to switch adaptively between bilinear and directional interpolation. It suffers only a 0.08 dB video quality drop on average, while the speedup measured on a general-purpose processor is up to 40 times compared with conventional methods. It is also more suitable for low-cost hardware design. For temporal error concealment, the decoded motion vectors of the blocks neighboring the corrupted macroblock are reused as hints to estimate the motion vector of the corrupted macroblock. Moreover, the hardware architecture design and chip implementation of the proposed error concealment algorithm are also presented. For low-cost hardware implementation, a scheme that reuses data and computational results in motion vector estimation is proposed; it reduces computation and memory bandwidth by 96% compared with conventional methods, with a 0.18 dB quality drop on average. Implemented in a UMC 90 nm 1P9M process, the proposed error concealment engine can process HDTV 1080p video at 30 frames per second, and its power consumption is 15.77 mW at an operating frequency of 125 MHz. Compared with the previous hardware design of an error concealment engine, the proposed design achieves higher processing capability and up to 1.81 dB gain in PSNR.

In brief, this dissertation contributes to digital video techniques in two directions. The coding efficiency of a video coding system is improved by combining the traditional video coding scheme with the proposed perception analysis model and hardware engine. The error robustness of a video decoding system is improved by the proposed error concealment algorithm and hardware engine. We sincerely hope that our research results contribute to the convenience of human life.
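As an illustration of the bilinear branch of spatial error concealment, the sketch below fills a lost block from the boundary pixels of its four neighbours, weighting each boundary inversely to its distance. This is a generic distance-weighted interpolation written under assumptions (all four neighbours are available, single-channel pixel values); the intra-mode-based selection and the directional interpolation mentioned in the abstract are not shown, and the function name is hypothetical.

```python
def bilinear_conceal(top, bottom, left, right):
    """Fill an N x N lost block from its four neighbouring boundaries.

    top / bottom: the N pixels directly above / below the block.
    left / right: the N pixels directly left / right of the block.
    """
    n = len(top)
    block = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # Distance from pixel (x, y) to each boundary row or column.
            d_top, d_bottom = y + 1, n - y
            d_left, d_right = x + 1, n - x
            # A boundary is weighted by the distance to the opposite side,
            # so the closer boundary contributes more.
            num = (top[x] * d_bottom + bottom[x] * d_top
                   + left[y] * d_right + right[y] * d_left)
            den = d_top + d_bottom + d_left + d_right
            block[y][x] = round(num / den)
    return block


if __name__ == "__main__":
    filled = bilinear_conceal([0] * 4, [100] * 4, [50] * 4, [50] * 4)
    for row in filled:
        print(row)
```

In this example the concealed block forms a smooth vertical gradient between the dark top boundary and the bright bottom boundary, which is the behaviour expected in regions without strong edges.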
