Architecture and Algorithm Design for Image Coding and Scalable Video Applications

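Sensors, display devices, communication systems, and computing engines have advanced rapidly with semiconductor process technology. Together with progress in image and video standards, this has made image and video applications a ubiquitous part of daily life. As multimedia applications develop, high-compression image and video standards and high picture quality become increasingly important, and high quality, high compression ratio, low computational cost, and low production cost have become key metrics for consumer electronics. Meeting these requirements usually takes good algorithm and hardware architecture design to trade off picture quality, computational resources, and throughput. Efficient algorithm and hardware architecture design techniques can therefore drive the evolution of new multimedia technologies and applications. This dissertation consists of two parts: a study of implementing a new image standard at the hardware architecture level, and a study of scalable video applications at the algorithm level.

The first part presents a system analysis of JPEG XR encoding and the hardware architecture of a single-chip encoder. We implemented a single-chip encoder that supports 4:4:4 lossless/lossy coding and processes 1920×1080p in real time. In this design, the timing schedule and pipelining between modules are designed efficiently, and on-chip memory is used to avoid heavy external memory access. To make full use of the silicon area, we propose a data repetition technique that resolves the problem of modules accessing different data ranges. For the entropy coding module, which has the highest computational load and the strongest data dependency, we analyze the dependencies between successive data and use sub-pipeline scheduling to increase execution speed and throughput. We also propose a multiple-throughput processing architecture to reduce the processing time of the entropy module. Through the proposed system scheduling, a high degree of parallelism, algorithmic optimization, and module pipelining, our JPEG XR encoder chip, with an area of 9.61 mm² in a 0.18 µm process, achieves a processing capability of 188 million pixels per second. This chip is the first single-chip JPEG XR encoder design reported in the literature.

In the second part, we propose algorithm designs that address different considerations in scalable video applications. Video transmission over heterogeneous networks meets a wide variety of application requirements, because user devices impose different constraints and different users have different usage habits. In this setting, conventional single-dimension scalability can no longer meet the requirements: video streams need the flexibility to combine several scalable dimensions, such as spatial, temporal, and quality, at the same time. The greater the flexibility, however, the harder it is to find the video stream that matches a user's subjective preferences. In this part we propose a selection method for multidimensional scalable video streams. It uses an objectivity-derived model as the selection model, which effectively matches the user's subjective preferences along each dimension in multidimensional video applications. We also propose a soft-decision method to overcome user uncertainty in multidimensional video applications. With the proposed method, the rate at which users' subjective preferences are matched rises from 75% to 94%, and because our algorithms operate on the compressed stream, they add little extra computation. In real-time scalable video streaming, handling transmission errors is also a very important problem. For this we propose an application-layer video header protection method that requires no change to existing standard architectures. The method can be applied to various image and video standards, and because it is designed at the application layer, it can also be deployed in various network transmission environments. The design also takes channel conditions into account and provides a way to reduce the number of protection bits used in channel coding. With the proposed algorithm, the quality of video streams over real transmission networks is noticeably better than with conventional transmission methods. In summary, the proposed techniques can be realized in many everyday applications and practical systems, and we sincerely hope that our research results contribute to multimedia applications in daily life.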
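
The selection problem described above can be illustrated with a minimal sketch. It is not the dissertation's objectivity-derived model or its soft-decision method: the operating points, bitrates, preference weights, and the weighted-sum utility are all hypothetical, chosen only to show how one combination of spatial, temporal, and quality levels might be picked under a bandwidth budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One hypothetical combination of scalable-stream layers."""
    spatial: float    # normalized spatial resolution level, 0..1
    temporal: float   # normalized frame-rate level, 0..1
    quality: float    # normalized quality (SNR) level, 0..1
    kbps: int         # bitrate needed to deliver this combination


def utility(point, weights):
    """Weighted sum of the three levels; a stand-in for a real preference model."""
    w_s, w_t, w_q = weights
    return w_s * point.spatial + w_t * point.temporal + w_q * point.quality


def select(points, weights, budget_kbps):
    """Return the affordable point with the highest utility, or None."""
    affordable = [p for p in points if p.kbps <= budget_kbps]
    if not affordable:
        return None
    return max(affordable, key=lambda p: utility(p, weights))


# Hypothetical operating points of one scalable stream.
POINTS = [
    OperatingPoint(spatial=0.5, temporal=0.5, quality=0.5, kbps=300),
    OperatingPoint(spatial=1.0, temporal=0.5, quality=0.5, kbps=600),
    OperatingPoint(spatial=0.5, temporal=1.0, quality=0.5, kbps=550),
    OperatingPoint(spatial=0.5, temporal=0.5, quality=1.0, kbps=580),
    OperatingPoint(spatial=1.0, temporal=1.0, quality=1.0, kbps=1500),
]

# Two hypothetical users with the same 600 kbps budget but different tastes.
motion_lover = select(POINTS, weights=(0.2, 0.6, 0.2), budget_kbps=600)
detail_lover = select(POINTS, weights=(0.6, 0.2, 0.2), budget_kbps=600)
print(motion_lover)
print(detail_lover)
```

With the same budget, the two hypothetical users end up with different streams: one keeps the full frame rate, the other keeps the full frame size. A real adaptor would replace the weighted sum with a model fitted to measured user preferences.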

Keywords

JPEG XR; scalable video; application-layer video header protection

Parallel Abstract

Multimedia applications have become increasingly popular thanks to rapid progress in image sensors, display devices, communication, VLSI manufacturing, computing engines, and image/video coding standards. Many advanced multimedia applications require image and video compression technology with a higher compression ratio and better visual quality. High quality, high compression ratios for digital images and video, and low computational cost are important factors in many areas of consumer electronics. These requirements usually involve computationally intensive algorithms that impose trade-offs among quality, computational resources, and throughput. Hence, research on hardware-oriented algorithms and VLSI architectures drives the progress of multimedia applications. This dissertation has two main purposes: to propose VLSI architectures for efficient implementation of image coding systems, and to provide algorithm designs for scalable video application systems that meet emerging requirements in real multimedia applications. In the first part, we describe the system analysis and architecture design of a JPEG XR encoder. We propose two chip implementations for JPEG XR image coding. First, a 4:4:4 lossless/lossy symbol-based JPEG XR encoder is implemented in 3.222 mm$^{2}$ with 90 nm CMOS technology, dissipating 95.7 mW at 0.9 V and 62.5 MHz. It can process 34.1 Msamples per second for lossless/lossy coding. The timing schedule and pipelining of the color conversion, pre-filter, PCT, and quantization modules are carefully designed. To avoid fetching coefficients from off-chip memory, an on-chip SRAM buffers the coefficients. A carefully arranged sub-pipeline timing schedule in the entropy encoding module increases the throughput about threefold. This design targets digital still camera (DSC) and digital photo frame applications. The other chip design is a channel-parallel JPEG XR encoder. 
A five-stage block-pipelined architecture with the proposed system scheduling supports real-time 4:4:4 full-HD (1920x1080p) lossless/lossy processing. We analyze the dependencies of the RLE and Flexbits modules and adopt a multi-symbol architecture to reduce the processing cycles. The design is implemented in 9.61 mm$^{2}$ with 0.18 um CMOS technology and runs at 81 MHz. The throughput of 187 Msamples/s is achieved through the proposed system scheduling, a high degree of parallelism, reduced memory access, and algorithmic optimization. The processing capability is roughly six times that of our first design. Our proposed architecture is the first single-chip JPEG XR encoder reported worldwide. The second part of this dissertation describes two algorithm designs for scalable video applications, which have emerged recently. When transmitting video over a heterogeneous network, the system must satisfy different constraints arising from the preferences and equipment of different users. Multiple video parameters, including spatial frame size, temporal frame rate, and visual quality resolution, are used to provide better scalability in scalable video applications. It is difficult, however, to find the relationship between the various video parameter settings and user preferences. In this part, we propose a multidimensional adaptation selection scheme that matches each user's preferences for the video parameters. This scheme characterizes the relationship among spatial, temporal, and SNR scalabilities according to the subjective preferences of each user. An objectivity-derived emulation scheme is used in the video adaptor to realize the selection of multidimensional adaptation. The proposed video adaptor therefore provides more appropriate adjustments of the video parameters for each user. Once the objectivity-derived scheme has been derived, the next important step is to fit the proposed model to each user. 
We propose a soft-decision optimization scheme to overcome user uncertainty, which has not been discussed in the existing literature. In addition, the proposed video adaptor identifies the key frames in a sequence to use the bandwidth more efficiently and achieve better subjective visual quality. The proposed method improves the average prediction accuracy from 75% to 94% over the whole available adaptation bandwidth of the test sequences. The experimental results show that the video adaptor provides high consistency between the quality of the adjusted video stream and the expectations of users. Because we analyze user preference from compressed-domain data, this scheme can be used in a video proxy or gateway without much computational overhead. To meet the urgent need for real-time video service over error-prone networks in scalable video transmission systems, protecting the video stream to preserve visual quality is also an important issue. We propose a way to protect the video header information in the application layer without modifying the standardized syntax. Because the syntax of the existing standard is not modified and the redundant bits can be embedded in the bitstream, this scheme can be used in combination with any video codec. Because our changes are confined to the application layer, the method can be applied in the video streaming systems in practical use today. We also consider the channel conditions of wireless transmission and propose a way to reduce the redundant bits used in channel coding. As a result, the bitstream can be transmitted over practical networks, such as mobile TV, in scalable video streaming applications, and the reconstructed picture quality is better than with the conventional transmission method. In brief, we believe that the technologies proposed in this dissertation can be realized in many practical systems. 
We sincerely hope that our research contributions help open a new era of digital multimedia life.
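
As a rough sanity check on the throughput figures quoted above, the following sketch relates the reported 187 Msamples/s to a full-HD 4:4:4 frame. The frame rate of 30 frames per second is an assumption made here for illustration; the abstract states only real-time full-HD processing. The function and variable names are likewise illustrative.

```python
# Back-of-the-envelope check of the throughput figures quoted in the abstract.
# ASSUMPTION: 30 frames per second; the abstract says only "real-time".

WIDTH, HEIGHT = 1920, 1080      # full-HD frame size (from the abstract)
CHANNELS = 3                    # 4:4:4 means three full-resolution channels
ASSUMED_FPS = 30                # assumption, not stated in the abstract


def required_samples_per_second(width, height, channels, fps):
    """Samples that must be processed each second for the given video format."""
    return width * height * channels * fps


required = required_samples_per_second(WIDTH, HEIGHT, CHANNELS, ASSUMED_FPS)

REPORTED_SECOND_CHIP = 187e6    # samples/s, second chip (from the abstract)
REPORTED_FIRST_CHIP = 34.1e6    # samples/s, first chip (from the abstract)
CLOCK_SECOND_CHIP = 81e6        # Hz, second chip (from the abstract)

print(f"required at {ASSUMED_FPS} fps: {required / 1e6:.1f} Msamples/s")
print(f"samples per clock cycle: {REPORTED_SECOND_CHIP / CLOCK_SECOND_CHIP:.2f}")
print(f"ratio to first chip: {REPORTED_SECOND_CHIP / REPORTED_FIRST_CHIP:.2f}")
```

Under the 30 fps assumption the requirement is about 186.6 Msamples/s, which is consistent with the reported 187 Msamples/s. The reported figure also implies a little over two samples per clock cycle at 81 MHz, and a ratio of about 5.5 to the first chip's 34.1 Msamples/s.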

Parallel Keywords

JPEG XR; Scalable Video Coding (SVC); APEC
