
High Performance Transform Cores for the Applications of Video Compression (應用於影像壓縮之高效能轉換核心)

Advisor: 張慶元

Abstract


The discrete cosine transform (DCT) is a widely used transform engine for image and video compression. Visual media have recently moved to high-resolution specifications such as high-definition television (HDTV) and digital cinema, so a transform component with high accuracy and a high throughput rate is needed to meet future specifications. To reduce the manufacturing cost of the integrated circuit (IC), a low hardware cost is also required. A high-performance video transform engine with high accuracy, small area, and a high throughput rate is therefore desirable for very-large-scale integration (VLSI) designs.

In this study, a high-throughput DCT (HT-DCT) core is proposed that draws on an odd-even decomposition, adder-based distributed arithmetic (DA) scheme and an error-compensated adder tree (ECAT). Instead of the 12-bit DA-precision coefficient length commonly used in previous works, a 9-bit DA-precision coefficient length is chosen for the HT-DCT core, which still meets the peak signal-to-noise ratio (PSNR) requirements. The proposed HT-DCT core achieves a throughput rate of 1 G-pels/s with a gate count of 22.2 K while meeting the PSNR requirements outlined in previous works.

A low-cost DCT (LC-DCT) core is also proposed. It uses a spatial and time scheduling strategy, called the space-time scheduling (STS) strategy, that supports high image resolutions in real-time systems. The STS strategy comprises a choice of the DA-precision bit length, a hardware-sharing architecture that reduces the hardware cost, and a time scheduling strategy that arranges the computations of the different dimensions. The time scheduling strategy computes the first-dimensional (1st-D) and second-dimensional (2nd-D) transformations simultaneously in a single one-dimensional (1-D) DCT core, reaching a hardware utilization of 100%.
The measurement results show that the LC-DCT core has a latency of 84 clock cycles and a PSNR of 52 dB, and operates at 167 MHz with a gate count of 15.8 K.

Finally, a multi-path DCT (MP-DCT) core is proposed. It employs four computation paths to achieve a high throughput rate and is implemented with a single 1-D MP-DCT core and one transposed memory (TMEM) to reduce the area cost. The 1-D MP-DCT core computes the 1st-D and 2nd-D transformations simultaneously in four parallel streams, so the two-dimensional (2-D) MP-DCT core needs only this single 1-D core and one TMEM, achieving both a high throughput rate and a low area cost. The implementation results show that the proposed 2-D MP-DCT core achieves a throughput rate of 1 G-pels/s with an area of only 20 K gates.

In conclusion, as visual media advance rapidly, this dissertation aims to keep pace with high-resolution specifications and to meet future needs as far as possible. To that end, three circuits, HT-DCT, LC-DCT, and MP-DCT, are proposed to achieve high throughput rates in low-cost VLSI designs.
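Two ideas named in the abstract can be illustrated in software: the odd-even decomposition of an 8-point DCT, and the reuse of a single 1-D transform for both dimensions through a transpose buffer. The following is a minimal floating-point sketch under stated assumptions, not the hardware architecture of the dissertation. The block size of 8, the orthonormal scaling, and all function names are assumptions made for illustration; distributed arithmetic, the ECAT, and the cycle-level scheduling are not modelled.

```python
import math

N = 8  # assumed block size; the abstract does not state it


def coeff(k, n):
    """Orthonormal DCT-II coefficient for output index k and input index n."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_1d_direct(x):
    """Reference 1-D DCT: a full 8x8 matrix-vector product."""
    return [sum(coeff(k, n) * x[n] for n in range(N)) for k in range(N)]


def dct_1d_odd_even(x):
    """1-D DCT via odd-even decomposition.

    Because coeff(k, N-1-n) == (-1)**k * coeff(k, n), the even outputs depend
    only on the sums x[n] + x[N-1-n] and the odd outputs only on the
    differences x[n] - x[N-1-n]. Each output then needs 4 products, not 8.
    """
    half = N // 2
    sums = [x[n] + x[N - 1 - n] for n in range(half)]
    diffs = [x[n] - x[N - 1 - n] for n in range(half)]
    out = []
    for k in range(N):
        src = sums if k % 2 == 0 else diffs
        out.append(sum(coeff(k, n) * src[n] for n in range(half)))
    return out


def transpose(m):
    """Software stand-in for a transpose memory."""
    return [list(col) for col in zip(*m)]


def dct_2d(block, dct_1d=dct_1d_odd_even):
    """2-D DCT by row-column decomposition, reusing one 1-D routine."""
    first = [dct_1d(row) for row in block]    # 1st-D: transform every row
    tmem = transpose(first)                   # rows become columns
    second = [dct_1d(row) for row in tmem]    # 2nd-D: same 1-D routine
    return transpose(second)
```

Calling `dct_2d(block)` on an 8x8 list of lists returns the 2-D coefficients; passing `dct_1d=dct_1d_direct` gives a reference result for comparison. In this sketch the two dimensions run one after the other, whereas the abstract describes a schedule that interleaves them in one core.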

Parallel Abstract


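The discrete cosine transform (DCT) is a computational component widely used in image and video compression. With the establishment of high-resolution video specifications, high accuracy and a high throughput rate will be required in the future. A small-area design is also highly desirable to reduce the cost of the circuit. A high-performance video transform circuit with high accuracy, small area, and a high throughput rate is therefore needed.

In this study, a high-throughput DCT (HT-DCT) circuit is proposed that uses adder-based distributed arithmetic (DA) and an error-compensated adder tree (ECAT). In the design, a 9-bit DA-precision coefficient length is chosen in place of the 13-bit DA-precision coefficient length used previously, which still meets the peak signal-to-noise ratio (PSNR) requirement. The proposed HT-DCT circuit thus reaches a throughput rate of 1 G-pels/s with a circuit area of 22 K.

This study also proposes a low-cost DCT (LC-DCT) circuit that adopts a space-time scheduling (STS) strategy. STS mainly comprises the choice of the DA-precision coefficient length, a hardware-sharing design, and a time scheduling strategy. An effective choice of the DA-precision coefficient length, together with the hardware-sharing design, gives the LC-DCT a small area and low cost, while the proposed time scheduling strategy lets a single one-dimensional DCT circuit compute the first-dimension and second-dimension DCT operations simultaneously. Measurement results show that the LC-DCT circuit, with an area of 15.8 K and operating at 167 MHz, achieves high-accuracy computation with a PSNR of 52 dB.

Finally, this study combines the features of the HT-DCT and the LC-DCT in a multi-path DCT (MP-DCT) circuit. The circuit uses a single one-dimensional DCT circuit and one transposed memory to achieve a small area. Its one-dimensional DCT is built from computation units with four parallel computation paths and, like the one-dimensional DCT circuit in the LC-DCT, can compute the first-dimension and second-dimension DCTs simultaneously, which yields a small-area, high-throughput design. The proposed two-dimensional MP-DCT circuit reaches a throughput rate of 1 G-pels/s with a circuit area of 17.7 K.

In summary, in response to the rapid development of video media, this dissertation targets high-resolution specifications and aims to meet future needs. The proposed HT-DCT, LC-DCT, and MP-DCT circuits therefore achieve high-throughput, low-cost VLSI circuit designs.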
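The trade-off between coefficient word length and PSNR can be explored with a small numerical experiment. The sketch below rounds the DCT coefficients to a given number of fractional bits, applies the forward 2-D DCT with those rounded coefficients, inverts with floating-point coefficients, and reports the PSNR of the reconstruction. This measurement setup is an assumption, since the abstract does not say how PSNR is evaluated. The parameter `frac_bits` is a hypothetical stand-in that may not match the dissertation's definition of DA-precision, and the values it produces are not the dissertation's results.

```python
import math

N = 8  # assumed block size


def dct_matrix(frac_bits=None):
    """8x8 orthonormal DCT-II matrix, optionally rounded to frac_bits fractional bits."""
    rows = []
    for k in range(N):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
        row = [scale * math.cos((2 * n + 1) * k * math.pi / (2 * N)) for n in range(N)]
        if frac_bits is not None:
            step = 2 ** frac_bits
            row = [round(v * step) / step for v in row]  # fixed-point rounding
        rows.append(row)
    return rows


def matmul(a, b):
    """Plain 8x8 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def psnr_for_precision(block, frac_bits, peak=255.0):
    """PSNR (dB) after a rounded-coefficient forward DCT and an exact inverse DCT."""
    cq = dct_matrix(frac_bits)  # rounded coefficients for the forward transform
    c = dct_matrix()            # floating-point coefficients for the inverse
    y = matmul(matmul(cq, block), transpose(cq))  # forward: Cq * X * Cq^T
    rec = matmul(matmul(transpose(c), y), c)      # inverse: C^T * Y * C
    mse = sum((rec[i][j] - block[i][j]) ** 2
              for i in range(N) for j in range(N)) / (N * N)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

Evaluating `psnr_for_precision(block, b)` for several values of `b` on representative 8x8 pixel blocks shows how quickly the error shrinks as bits are added, which is the kind of comparison behind choosing a shorter coefficient length that still meets a PSNR target.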

Parallel Keywords

DCT, DA-based, space-time scheduling, multi-path DCT

