
Implementation and Analysis of a High-Efficiency H.265/HEVC Video Decoder on a DSP Platform

A study of DSP implementation of high efficient H.265/HEVC decoder based on OpenHEVC

Advisor: 王周珍
Co-advisor: 高榮揚 (Jung-Yang Kao)

Abstract


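To provide consumers with ultra-high-definition (UHD) video services, the JCT-VC finalized the High Efficiency Video Coding (HEVC) standard in 2013. ISO/IEC adopted HEVC as the MPEG-H video coding specification, and ITU-T adopted it as H.265. H.265/HEVC uses large, adaptively partitioned coding blocks (up to 64×64) to produce more flexible and more accurate video prediction signals. Although H.265/HEVC achieves nearly twice the coding efficiency of the widely deployed H.264/AVC standard, its system complexity is also much higher, so real-time decoding is hard to achieve.

To help commercialize the H.265/HEVC standard quickly, the JCT-VC provides the C++ software test platform HM10.0 [1] for H.265/HEVC codec research and performance comparison. In addition, to make H.265/HEVC decoding faster, the French laboratory IETR leads the open-source OpenHEVC project [2], which builds on the FFmpeg library libavcodec [3] and implements an optimized H.265/HEVC decoder in C, targeting real-time H.265/HEVC decoding. Because both HM10.0 and OpenHEVC are PC-based software platforms, applying them directly to DSP-based mobile video products reduces decoding speed sharply and rules out real-time decoding. To overcome this problem, this thesis proposes a DSP-based high-efficiency H.265/HEVC decoder. We use the single-core ADSP-BF548 development board from Analog Devices, Inc. (ADI) to implement and analyze the H.265/HEVC decoder from each platform. First, HM10.0 is embedded directly on the ADSP-BF548; because C++ is a high-level language and the code cannot be optimized directly on the DSP, decoding performance is poor. Second, OpenHEVC is embedded directly on the ADSP-BF548; because OpenHEVC is written in C and already optimized, its overall decoding performance is nearly 2.5 times that of HM10.0, but it still cannot decode in real time.

To further improve the performance of the H.265/HEVC video decoder on the DSP, this thesis proposes a flexible memory assignment architecture (FMAA) based on OpenHEVC. We first analyze the complexity of each decoder module. The main modules are entropy decoding (ED), inverse quantization and inverse cosine transform (IQ/IT), intra frame prediction (IFP), motion compensation (MC), and the loop picture filter (LPF); all remaining decoding work is grouped as the other (OT) module. The complexity analysis shows that MC accounts for about 30% of decoding time, IQ/IT about 26%, LPF about 16%, IFP about 2%, ED about 2%, and OT about 24%. Because the ADSP-BF548 has a multi-level memory hierarchy, the thesis uses FMAA to improve the memory placement of the OpenHEVC modules and thereby speed up the decoder on the ADSP-BF548 board. For function placement, the proposed FMAA moves compute-intensive functions, such as those of the MC and IQ/IT modules, from L3 memory to L1 and L2, which substantially reduces decoding time.

Experiments on Class C (832×480) sequences show that HM10.0 embedded directly on the ADSP-BF548 decodes at about 1.6 fps (frames per second) on average, OpenHEVC embedded directly decodes at about 4 fps, and the proposed FMAA decodes at about 9.6 fps, which is roughly 2.5 times the rate of directly embedded OpenHEVC and 6 times the rate of directly embedded HM10.0. These results show that the proposed FMAA greatly improves H.265/HEVC decoding performance on the DSP. On a quad-core DSP platform, our high-efficiency H.265/HEVC video decoder could exceed 30 fps and reach real-time video decoding.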
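The speedup figures in the abstract follow from the reported frame rates. The following is a minimal Python sketch of that arithmetic together with the quad-core projection. The linear-scaling model, the `efficiency` parameter, and all function names are illustrative assumptions; the abstract does not state how the multi-core estimate was obtained.

```python
# Average decoding rates on Class C (832x480) sequences, as reported in the abstract.
FPS = {"HM10.0": 1.6, "OpenHEVC": 4.0, "FMAA": 9.6}
REAL_TIME_FPS = 30.0  # real-time target named in the abstract


def speedup(fast: str, slow: str) -> float:
    """Ratio of two reported decoding rates."""
    return FPS[fast] / FPS[slow]


def projected_fps(cores: int, efficiency: float = 1.0) -> float:
    """Projected FMAA rate on a multi-core DSP.

    ASSUMPTION: throughput scales as cores * efficiency, where
    efficiency = 1.0 means ideal linear scaling. This model is not
    taken from the thesis.
    """
    return FPS["FMAA"] * cores * efficiency


def min_efficiency(cores: int) -> float:
    """Smallest parallel efficiency that still reaches the real-time target."""
    return REAL_TIME_FPS / (FPS["FMAA"] * cores)


if __name__ == "__main__":
    print(f"OpenHEVC vs HM10.0: {speedup('OpenHEVC', 'HM10.0'):.2f}x")
    print(f"FMAA vs OpenHEVC:   {speedup('FMAA', 'OpenHEVC'):.2f}x")
    print(f"FMAA vs HM10.0:     {speedup('FMAA', 'HM10.0'):.2f}x")
    print(f"4 cores, ideal:     {projected_fps(4):.1f} fps")
    print(f"4 cores need efficiency >= {min_efficiency(4):.2f}")
```

Under ideal scaling, four cores give 4 × 9.6 = 38.4 fps, so the 30 fps target leaves room for a parallel efficiency of about 0.78 in this model.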

Keywords

none

Parallel Abstract


With the rapid development of electronic technology, ultra-high-definition (UHD) resolutions of 4K2K (or 8K4K) will become the main video applications in the future. Therefore, the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), through their Joint Collaborative Team on Video Coding (JCT-VC), have developed the High Efficiency Video Coding (HEVC) video compression standard to satisfy UHD requirements. The first version of HEVC was approved as ITU-T H.265 and ISO/IEC MPEG-H in January 2013. H.265/HEVC achieves an average bit-rate reduction of 50% compared with H.264/AVC High Profile while maintaining the same subjective video quality. This is because HEVC adopts a quadtree-structured coding unit (CU) whose size ranges from 64×64 down to 8×8 pixels across four depth levels. H.265/HEVC achieves the highest coding efficiency but requires very high computational complexity, which makes real-time applications difficult. To speed commercialization of the H.265/HEVC standard, JCT-VC provides the HEVC test model (HM) reference software; the HM10.0 decoder is implemented in C++. A more efficient implementation of the H.265/HEVC decoder, written in C, is the OpenHEVC decoder, a project led by the Institute of Electronics and Telecommunications of Rennes (IETR) laboratory. However, decoding time increases dramatically when HM10.0 and OpenHEVC are applied directly to DSP-based mobile video devices, because HM and OpenHEVC are PC-based software rather than DSP-based firmware. To move toward real-time H.265/HEVC decoding, we embed a highly efficient DSP-based OpenHEVC decoder on the ADSP-BF548 processor. To realize an embedded fast H.265/HEVC decoder and player on the ADSP-BF548, we propose a flexible memory assignment architecture (FMAA) that controls the ADSP-BF548 memory efficiently.
The H.265/HEVC decoder consists mainly of the following modules: entropy decoding (ED), inverse quantization and inverse integer cosine transform (IQ/IT), intra frame prediction (IFP), motion compensation (MC), and the loop picture filter (LPF), which comprises sample adaptive offset and the de-blocking filter. The most time-consuming modules are MC and IQ/IT. To address real-time decoding, we study the memory assignment of the ADSP-BF548 in depth and make full use of the hardware structure of the DSP core. To reduce decoding time, the proposed FMAA moves the functions of the MC and IQ/IT modules from L3 memory to L1 and L2. Experimental results on Class C (832×480) sequences show that the proposed FMAA achieves an average speedup of about 2.5 times and 6 times over OpenHEVC and HM10.0 embedded directly on the ADSP-BF548, respectively. The proposed decoder cannot reach real-time decoding on the ADSP-BF548 itself, because that processor has only a single core. However, it can reach real-time video decoding if our method is embedded on a DSP platform with four or more cores.
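The module shares quoted in the abstract (MC about 30%, IQ/IT about 26%, LPF about 16%, IFP about 2%, ED about 2%, other about 24%) indicate why MC and IQ/IT are the first candidates for fast memory. A minimal sketch using Amdahl's law shows how much an overall gain depends on which modules are accelerated. The per-module speedup factors are hypothetical inputs, not measurements from the thesis, and the function names are illustrative.

```python
from typing import Dict, Iterable

# Approximate share of decoding time per module, from the abstract's
# complexity analysis.
SHARE: Dict[str, float] = {
    "MC": 0.30, "IQ/IT": 0.26, "LPF": 0.16,
    "IFP": 0.02, "ED": 0.02, "OT": 0.24,
}


def overall_speedup(local: Dict[str, float]) -> float:
    """Amdahl's law for several modules.

    `local` maps a module name to a HYPOTHETICAL local speedup factor;
    modules not listed keep a factor of 1.0 (unchanged).
    """
    remaining = sum(share / local.get(name, 1.0) for name, share in SHARE.items())
    return 1.0 / remaining


def upper_bound(modules: Iterable[str]) -> float:
    """Limit of the overall speedup if the given modules took zero time."""
    untouched = 1.0 - sum(SHARE[m] for m in modules)
    return 1.0 / untouched


if __name__ == "__main__":
    print(f"MC and IQ/IT each 2x faster: {overall_speedup({'MC': 2.0, 'IQ/IT': 2.0}):.2f}x")
    print(f"Bound, MC + IQ/IT:           {upper_bound(['MC', 'IQ/IT']):.2f}x")
    print(f"Bound, MC + IQ/IT + LPF:     {upper_bound(['MC', 'IQ/IT', 'LPF']):.2f}x")
```

In this simple model, accelerating only MC and IQ/IT is bounded at 1/0.44, about 2.27 times. The abstract does not say on which platform the shares were measured or whether FMAA also relocates other functions or data, so the bound is only indicative when set against the reported gain.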

Parallel Keywords

none

References


[1] “Reference software HM10.0,” https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/
[2] “Open source HEVC decoder (OpenHEVC),” https://github.com/OpenHEVC
[10] “ADSP-BF548 Data Sheets,” Analog Devices, Inc., http://www.analog.com/static/Import
[21] “Blackfin DSP Instruction Set Reference,” Analog Devices, Inc., http://smd.hu/DataAnal

Cited By


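葉博珽 (2017). 多核心即時視訊編碼器與DSP實現 [Multi-core real-time video encoder and its DSP implementation] [Master's thesis, I-Shou University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0074-0608201701213300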

Further Reading