運用先進動作預測與殘餘編碼技術之視訊壓縮應用

Video Compression with Advanced Motion Estimation and Residue Encoding Methods

Advisor: Jian-Jiun Ding (丁建均)

Abstract


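Image and video services are an important part of modern life. Whether for online video streaming (YouTube) or for video storage media (Blu-ray Disc), image and video compression techniques are required. With the rapid growth of image and video applications in recent years, demand for video services keeps increasing. For example, many recent films and games offer audiences a better visual experience, and Japan, the host country of the 2020 Olympic Games, has made 8K UHD (4320p), that is, a resolution of 7680×4320, the standard for live Olympic broadcasts. These examples suggest that people will enjoy a better visual experience in the future, but also that larger volumes of image and video data will have to be processed in less time. An image and video service that is more efficient in both data storage and computation time is therefore necessary.

Inter-frame predictive coding of video data is an important part of current video compression standards, and motion estimation is the most important function within it. The widely used video compression standards (MPEG-4, H.264/AVC, and HEVC) do not define how motion estimation is implemented, but for the sake of absolute accuracy most implementations use full search to find the best block match, which makes the computational load and the computation time very large. In this thesis we first propose an efficient search algorithm that provides matching accuracy close to the theoretical upper bound at very low search cost. It has two main features: an expanded region of support and the use of a decimation lattice to achieve low-cost search. Compared with previous methods, the proposed algorithm has higher matching accuracy and lower search cost.

Some motion estimation methods first apply a one-bit transform to convert video frames into one-bit frames whose pixels are either zero or one, and then use feature-based motion estimation to reduce computational complexity. In the second part of this thesis we introduce a weighted template matching method which, combined with the search method proposed in the first part, still finds accurate motion estimation results within a fast-search framework. Experimental comparisons with other feature-based motion estimation algorithms show that, on average, the proposed method has the highest matching accuracy and the lowest search cost.

Improving the efficiency of the entropy coder is also an important topic in image and video compression. The residual coding used in the H.264/AVC standard is context-adaptive variable-length coding together with exponential-Golomb coding. We make minor improvements to the method of selecting the coding table according to context, in order to improve coding performance. We also propose an improved arithmetic coding scheme that codes motion estimation residuals and motion vector residuals more efficiently.

In addition, video coding standards from H.263 onward (including the later H.264/AVC and HEVC) have all introduced the concept of sub-pixels to capture finer motion information. In current methods, however, the smallest unit of sub-pixel motion estimation and the corresponding interpolation filter must be defined in advance, which is inevitably limiting. To solve this problem, we propose a motion estimation method based on optical flow. Its advantage is that it dispenses with the interpolation filter and can perform motion estimation at a smallest unit defined by the user.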
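For orientation, the full-search baseline mentioned above can be sketched as follows. This is only a minimal illustration of exhaustive block matching, not the search algorithm proposed in the thesis. The sum of absolute differences (SAD) as the matching criterion, the function names `sad` and `full_search`, and the toy frames are all assumptions made for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range and return
    (best_dx, best_dy, best_sad). Candidate blocks that would fall
    outside the reference frame are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Toy frames (hypothetical data): every reference pixel has a unique value,
# and the current frame is the reference shifted by (dx, dy) = (-2, 1).
ref = [[16 * y + x for x in range(12)] for y in range(12)]
cur = [[ref[min(y + 1, 11)][max(x - 2, 0)] for x in range(12)] for y in range(12)]
print(full_search(cur, ref, 4, 4, 4, 3))  # -> (-2, 1, 0)
```

With a search range of ±R the loop evaluates up to (2R + 1)² candidate blocks per block, which is the cost that fast search methods try to reduce.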

Parallel Abstract


Image and video services play an essential role in modern life. Whether in online video streaming such as YouTube or in video storage media such as Blu-ray Disc, image and video compression techniques are indispensable. With the rapid growth of video applications, the demand for video services keeps increasing. For instance, many movies and online games aim to provide a better visual experience, and Japan, the host country of the 2020 Olympic Games, has adopted 8K UHD (4320p) as the standard resolution for live broadcasts. These trends suggest that people will enjoy a better visual experience in the future, but at the price of having to process larger images and video sequences in limited time. Applications that are efficient in both data storage and processing time are therefore needed.

Temporal predictive coding of video data is a crucial part of video compression standards, and motion estimation is its essential function. Widely adopted video coding standards such as MPEG-4, H.264/AVC, and HEVC do not define how motion estimation is implemented, so many methods obtain the best block match with the full search algorithm, at a very high computational cost. In this thesis, we first propose an efficient search algorithm that provides near-optimal motion estimation results at extremely low search cost. It has two key features: an expanded region of support and a decimation lattice that enables low-cost search. Comparisons with previous methods show that the proposed algorithm outperforms them in both matching accuracy and search cost.

Some efficient motion estimation methods first transform the image frames into binary planes with a one-bit transform and then apply feature-based motion estimation to reduce computational complexity. In the second part of the thesis, we propose a weighted block matching criterion and combine it with the proposed fast search algorithm to obtain matching results of higher accuracy. Experimental comparisons with other feature-based motion estimation algorithms show that the proposed algorithm has the highest matching accuracy and the lowest search cost on average.

Entropy coders with higher coding efficiency are another important subject in image and video compression. The H.264/AVC baseline profile uses context-adaptive variable-length coding (CAVLC) and exponential-Golomb coding as its entropy coders for residual coding. In this thesis, we keep the CAVLC architecture but refine some critical parts, notably the context-based selection of coding tables, to achieve higher coding performance. In addition, we propose an improved adaptive arithmetic coder that codes motion-compensation residuals and motion vector differences more efficiently.

Finally, video coding standards from H.263 onward (including the later H.264/AVC and HEVC) all use sub-pixel techniques to capture fine motion. To implement sub-pixel motion estimation, however, the motion estimation accuracy and the corresponding interpolation filter must be defined in advance, which limits flexibility. To overcome this limitation, we use optical flow to perform motion estimation. This brings two benefits: the interpolation filter can be discarded, and the sub-pixel accuracy can be adjusted flexibly between different scales.
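As background for the entropy coding part, order-0 exponential-Golomb coding, which the abstract names as one of the H.264/AVC entropy coders, can be sketched as below. This shows only the standard baseline code, not the improved coders proposed in the thesis. The function names are hypothetical, and bit strings are used instead of a real bitstream writer for readability.

```python
def exp_golomb_encode(code_num):
    """Order-0 exponential-Golomb code word for a non-negative integer,
    returned as a string of '0' and '1' characters."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    binary = bin(code_num + 1)[2:]            # binary form of code_num + 1
    return "0" * (len(binary) - 1) + binary   # zero prefix, then the binary form


def exp_golomb_decode(bits):
    """Decode one code word from the start of `bits`.
    Returns (code_num, number_of_bits_consumed)."""
    zeros = 0
    while bits[zeros] == "0":                 # count the leading zeros
        zeros += 1
    end = 2 * zeros + 1
    return int(bits[zeros:end], 2) - 1, end


def signed_to_code_num(value):
    """Map a signed value (for example a motion vector difference) to a
    non-negative code number: positive v -> 2v - 1, otherwise -2v."""
    return 2 * value - 1 if value > 0 else -2 * value


for v in (0, 1, -1, 2, -2):
    print(v, exp_golomb_encode(signed_to_code_num(v)))
```

Small magnitudes receive short code words (0 maps to the single bit `1`), which suits residual data that is concentrated around zero.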

