
應用於高效能視訊編碼與其視窗視訊延伸標準之先進像素估測技術

Advanced Prediction Techniques for HEVC and Its Screen Content Coding Extension

Advisor: Wen-Hsiao Peng (彭文孝)

Abstract


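In video compression, inter-frame and intra-frame prediction share a common problem: designing an effective coding representation for the blocks in a picture that strikes a good balance between prediction accuracy and signaling overhead. As video content grows more diverse, finding such a representation becomes increasingly challenging. Mainstream video now comes not only from cameras but also from the screens of digital devices, and the very different signal characteristics of the two mean that conventional prediction techniques can no longer meet the compression needs of all mainstream video. This dissertation therefore proposes new prediction techniques for two common types of mainstream video: camera-captured content and screen content. Specifically, on top of the High Efficiency Video Coding (HEVC) standard and its Screen Content Coding (SCC) extension, it proposes a low-overhead bi-prediction technique for camera-captured content that requires transmitting only a single motion vector (MV), and an intra line copy technique for screen content that uses lines as the basic prediction unit.

First, the low-overhead bi-prediction technique uses overlapped block-based motion compensation (OBMC) to combine the predictions of template matching prediction (TMP) and block-based motion compensation (BMC). Because the template MV can be derived at the decoder and need not be transmitted, the technique only has to transmit the MV for BMC. From the theoretical viewpoint of this dissertation, the template MV can be interpreted as a sample of the true motion field taken at the centroid of the template. This notion also explains why TMP performs better than SKIP but worse than BMC. Following this view, given the template MV, the motion-compensated prediction corresponding to the MV sampled by BMC must complement the TMP prediction where the latter is poor. Through the OBMC weighting, the combination of TMP and BMC forms a special non-rectangular geometry motion partitioning. In addition, weighing performance against complexity, the dissertation extends the technique to a multiple-template design, to multi-hypothesis prediction, and to the use of MVs neighboring the target block in place of the complex template matching. Experimental results confirm that low-overhead bi-prediction improves average compression efficiency by 1.9% on the HEVC HM-6.0 platform.

Besides inter prediction, the dissertation proposes an intra line copy technique designed for screen content. The technique divides the target block evenly into horizontal or vertical lines, and each line searches the already-coded region of the current picture for its best match. To reduce search time, the fast algorithm tests only displacement vectors (line vectors) with specific characteristics for each line: vectors with only a horizontal or a vertical component, the coded line vectors of neighboring lines, and the vectors of lines that have the same hash value. Moreover, to signal line vectors efficiently, each line vector is predictively coded with reference to the line vector of a neighboring line. According to the experimental results, intra line copy already provides a 3-4% gain in compression efficiency on the HEVC SCM-4.0 platform under a restricted search range. If it is allowed the same full-frame search range as intra block copy, the gain rises further to 4-7%. Compared with intra string copy, the proposed method not only achieves similar coding performance but also avoids its drawback of relying on sequential execution.

Finally, both techniques were submitted to JCT-VC meetings for discussion and went through multiple Core Experiments, each time showing very high compression performance. Ultimately, simplified versions of the two techniques were adopted into the HEVC and SCC standards, respectively.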
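The low-overhead bi-prediction idea can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the L-shaped template of width `t`, the exhaustive SAD search, and the linear weight ramp in `blend` are all assumptions made for illustration, since the abstract does not specify the actual OBMC window functions. The sketch only shows the structure: one MV is derived from the template on the decoder side, the other is signaled, and the two predictors are mixed with position-dependent weights.

```python
def sad_template(cur, ref, bx, by, n, t, dx, dy):
    """SAD over the L-shaped template (t pixels above and left of the
    n x n block at (bx, by)) against `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(by - t, by + n):
        for x in range(bx - t, bx + n):
            if y >= by and x >= bx:
                continue  # skip the block itself; only the template is known
            total += abs(cur[y][x] - ref[y + dy][x + dx])
    return total


def template_mv(cur, ref, bx, by, n, t, r):
    """Exhaustive search in [-r, r]^2; repeatable at the decoder, so the
    resulting MV needs no signaling."""
    best = None
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            cost = sad_template(cur, ref, bx, by, n, t, dx, dy)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


def fetch(ref, bx, by, n, mv):
    """Motion-compensated n x n predictor for integer MV (dx, dy)."""
    dx, dy = mv
    return [[ref[by + dy + y][bx + dx + x] for x in range(n)] for y in range(n)]


def blend(p_tmp, p_bmc, n):
    """Position-dependent mix. ASSUMED linear ramp: the TMP weight is
    largest next to the template (top-left) and decays toward the
    bottom-right, where the signaled BMC predictor takes over."""
    out = []
    for y in range(n):
        row = []
        for x in range(n):
            w = 1.0 - (x + y + 1) / (2 * n)
            row.append(w * p_tmp[y][x] + (1.0 - w) * p_bmc[y][x])
        out.append(row)
    return out


def low_overhead_bipred(cur, ref, bx, by, n, t, r, bmc_mv):
    """Only `bmc_mv` would be transmitted; the template MV is derived."""
    tmv = template_mv(cur, ref, bx, by, n, t, r)
    pred = blend(fetch(ref, bx, by, n, tmv), fetch(ref, bx, by, n, bmc_mv), n)
    return tmv, pred
```

With this weighting, the pixels near the top-left corner follow the template-derived motion and the far corner follows the signaled motion, which is one simple way to picture how the combination behaves like a non-rectangular partition of the block.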

Parallel Abstract


A common issue in video coding with inter and intra prediction is how to find a block representation that trades off prediction accuracy against signaling overhead efficiently. With ever-diversifying video content, this issue has become increasingly challenging. Mainstream video types, which include not just camera-captured content but also screen content, exhibit very different signal characteristics. Relying solely on conventional prediction schemes to handle both content types is not sufficient in a video codec. This dissertation provides a new prediction technique for each of them. Specifically, we propose a low-overhead bi-prediction scheme for camera-captured content and an intra line copy (ILC) scheme for screen content to deliver higher coding efficiency on top of the High Efficiency Video Coding (HEVC) standard and its Screen Content Coding extension.

First, the low-overhead bi-prediction uses overlapped block motion compensation (OBMC) to combine two motion-compensated predictors, found respectively by template matching prediction (TMP) and block-based motion compensation (BMC). As the template motion is derivable at the decoder and need not be signaled, our scheme requires transmitting the motion overhead for BMC only. By examining TMP from a theoretical perspective, we find that the template motion vector (MV) is close to the true pixel motion around the template centroid, which explains why TMP generally outperforms SKIP modes but is inferior to BMC. We then approach the problem of finding an MV for BMC that best complements the template MV from deterministic and statistical viewpoints. The result is a search criterion with OBMC window functions that form a particular type of geometry motion partitioning. The notion of this prediction mode is further extended to adaptive template design, multi-hypothesis prediction, and motion merging to trade off performance against complexity. Compared with HEVC HM-6.0, our scheme achieves an average BD-rate saving of 1.9%.

In addition to inter prediction, ILC addresses intra prediction specifically for screen content. As its name suggests, ILC forms the prediction of a block by decomposing it into horizontal or vertical lines of pixels and performing line-based prediction from previously coded pixels in the current frame. To cope with the large number of search operations, our fast search algorithm first searches along the horizontal and vertical directions, then checks line vector candidates from spatially and temporally causal neighborhoods, and finally evaluates reference lines that share the same hash value as the line being predicted. The resulting line vectors are further predicted adaptively to minimize their coding overhead. On top of HEVC SCM-4.0, ILC delivers an additional BD-rate saving of 3-4% with a restricted local search, or 4-7% when it is allowed to access the same full-frame search area as intra block copy. ILC also achieves coding performance comparable to intra string copy while avoiding the complications of sequential string processing.

The low-overhead bi-prediction and ILC were evaluated and cross-checked in several Core Experiments established by the JCT-VC committee, both showing very promising improvement in coding performance. A simplified form of each has been adopted in the HEVC standard.
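As a rough illustration of the line-based search, the sketch below predicts each horizontal line of a block from the already-coded area of the same frame. It is a simplified encoder-side sketch, not the dissertation's algorithm: the function names are hypothetical, the availability rule (a reference line must lie entirely above the block or entirely to its left) is a conservative assumption, temporal candidates and line vector coding are omitted, and a Python dictionary keyed on the exact pixel tuple stands in for the hash lookup.

```python
def available(frame, x, y, n, bx, by):
    """ASSUMED availability rule: the n-pixel reference line starting at
    (x, y) must be inside the frame and lie fully above the block or
    fully to its left."""
    if x < 0 or y < 0 or y >= len(frame) or x + n > len(frame[0]):
        return False
    return y < by or x + n <= bx


def line_sad(frame, bx, y, rx, ry, n):
    """SAD between the current line at (bx, y) and the reference at (rx, ry)."""
    return sum(abs(frame[y][bx + i] - frame[ry][rx + i]) for i in range(n))


def build_hash_table(frame, bx, by, n):
    """Map each available n-pixel line to its positions (exact-match
    lookup standing in for the hash-based candidate search)."""
    table = {}
    for y in range(min(len(frame), by + n)):
        for x in range(len(frame[0]) - n + 1):
            if available(frame, x, y, n, bx, by):
                table.setdefault(tuple(frame[y][x:x + n]), []).append((x, y))
    return table


def intra_line_copy(frame, bx, by, n, r):
    """Return one line vector and one predicted line per row of the block."""
    table = build_hash_table(frame, bx, by, n)
    vectors, prediction, prev = [], [], None
    for k in range(n):
        y = by + k
        cands = [] if prev is None else [prev]           # neighboring line's vector
        cands += [(-d, 0) for d in range(1, r + 1)]      # purely horizontal
        cands += [(0, -d) for d in range(1, r + 1)]      # purely vertical
        key = tuple(frame[y][bx:bx + n])
        cands += [(x - bx, ry - y) for x, ry in table.get(key, [])]  # same hash
        best = None
        for dx, dy in cands:
            rx, ry = bx + dx, y + dy
            if not available(frame, rx, ry, n, bx, by):
                continue
            cost = line_sad(frame, bx, y, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
        if best is None:
            raise ValueError("no available reference line")
        prev = best[1]
        vectors.append(prev)
        prediction.append(frame[y + prev[1]][bx + prev[0]:bx + prev[0] + n])
    return vectors, prediction
```

Because each line keeps its own vector and depends only on already-coded pixels under the assumed availability rule, the lines can be processed independently, which hints at why a line-based scheme sidesteps the sequential dependency of string-based copying.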

