Content-Aware Spatial Scalability for Scalable Video Coding

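With the rapid advance of technology, more and more devices can play multimedia video content. To meet the needs of different devices for playing audio-visual streams, MPEG and VCEG defined a scalable video coding standard based on H.264/AVC. Its goal is for the encoded bitstream to provide temporal, spatial, and quality scalability, separately or in combination, so that users can extract a suitable sub-bitstream for network streaming according to network channel conditions and device capabilities. However, the spatial scalability of H.264/SVC supports only cropping or uniform scaling to generate lower-resolution video content, which may cause loss of picture information, object deformation, or failure to preserve the size of important objects. We therefore combine video retargeting with the spatial scalability of H.264/SVC so that, when lower-resolution video content is generated, the regions of interest to the human eye are preserved and only the less interesting regions are shrunk or discarded, thereby preserving the meaning the original picture is intended to convey. This thesis proposes a content-aware spatial scalability method for scalable video coding. We first use a mosaic-guided video retargeting technique to preserve the important content in the spatial base layer. In addition, we propose a low-overhead side-information coder and several non-homogeneous inter-layer predictors to reduce the bit-rate overhead of coding the spatial enhancement layer. Experimental results confirm that our method not only achieves better subjective quality in the low-resolution video content, but also increases the bit rate by only 4.17% to 4.98% on average.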

Keywords

Spatial Scalability; Video Adaptation; Video Retargeting; Inter-layer Prediction

Parallel Abstract

The scalable extension of H.264/AVC (SVC) supports only video cropping or uniform scaling to create lower-resolution versions of video content. However, these operations can cause information loss, deform important objects, or fail to preserve the size of important objects at the lower resolutions. We therefore combine video retargeting with the spatial scalability of H.264/SVC so that the lower-resolution content keeps the essential visual regions while condensing unimportant content. In this thesis, we propose content-aware spatial scalability for scalable video coding. First, we use a mosaic-guided video retargeting method to preserve the important content in the spatial base layer. Moreover, we propose a low-overhead side-information coder and several non-homogeneous inter-layer prediction coding tools to mitigate the bit-rate overhead in the spatial enhancement layer. The experimental results demonstrate that the proposed method preserves the subjective quality of important content in the lower-resolution sequence while incurring an average bit-rate overhead of only 4.17% to 4.98%.
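The contrast between uniform scaling and content-aware scaling can be illustrated with a small sketch. This is not the mosaic-guided retargeting method of the thesis; it is a simplified, hypothetical column-width allocation that assumes each source column carries a non-negative importance score. The function names `uniform_widths` and `content_aware_widths`, and the rule that no column may exceed its original width of one pixel, are assumptions made for illustration only.

```python
def uniform_widths(n_cols, target_width):
    """Uniform scaling: every source column shrinks by the same factor."""
    return [target_width / n_cols] * n_cols


def content_aware_widths(importance, target_width):
    """Hypothetical content-aware allocation of a reduced width.

    Each source column is 1 pixel wide. Width is shared in proportion to
    importance, but no column may grow beyond its original 1 pixel, so
    important columns keep their size and the rest absorb the shrinkage.
    """
    n_cols = len(importance)
    if not 0 < target_width <= n_cols:
        raise ValueError("target_width must be in (0, number of columns]")

    widths = [0.0] * n_cols
    free = list(range(n_cols))        # columns whose width is not yet fixed
    remaining = float(target_width)   # width still to be distributed

    while free:
        total = sum(importance[i] for i in free)
        if total <= 0:
            # No importance information left: share the rest evenly.
            for i in free:
                widths[i] = remaining / len(free)
            break
        shares = {i: remaining * importance[i] / total for i in free}
        capped = [i for i in free if shares[i] >= 1.0]
        if not capped:
            # Every proportional share fits below the 1-pixel cap.
            for i in free:
                widths[i] = shares[i]
            break
        # Fix the capped columns at full size and redistribute the rest.
        for i in capped:
            widths[i] = 1.0
        remaining -= len(capped)
        free = [i for i in free if shares[i] < 1.0]
    return widths


# Six columns reduced to a width of four; the two middle columns are important.
example = content_aware_widths([1, 1, 8, 8, 1, 1], 4)
# example == [0.5, 0.5, 1.0, 1.0, 0.5, 0.5]
```

In this toy example, uniform scaling would shrink every column to two thirds of its width, including the important ones, whereas the content-aware allocation keeps the two important columns at full size and halves the others. A non-uniform mapping of this kind is also why the enhancement layer would need side information and non-homogeneous inter-layer prediction, as the abstract describes.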

Parallel Keywords

H.264/SVC; Spatial Scalability; Video Adaptation; Video Retargeting; Inter-layer Prediction
