In recent years, stereoscopic video technology has shifted from large-scale entertainment venues, such as movie theaters, toward home and handheld entertainment devices. Stereo video carries a massive amount of data, so effective data reduction is an urgent problem. Texture video paired with depth video enables virtual view synthesis, replacing the transmission of texture video for additional views and thereby reducing the data volume. To remain compatible with existing hardware, researchers have developed frame-compatible stereo video coding, which downsamples the left- and right-eye images and packs them into a single frame for transmission; however, this degrades the viewing quality. Moreover, the increased image complexity within a frame-compatible frame also increases the coded data volume. Most existing work on frame-compatible stereo video focuses on preserving the picture quality of texture video and on packing formats; few techniques have been developed specifically for depth video. The role of depth video is to assist texture video in virtual view synthesis, so its primary requirement is accuracy. Applying the downsampling methods designed for texture video may reduce this accuracy and cause errors during view synthesis. Finding an appropriate downsampling technique for depth video is therefore the problem to be addressed. This thesis proposes an occlusion-aware asymmetric frame-compatible depth video coding technique. It analyzes the left- and right-eye depth images and classifies their content into information unique to the left eye, information unique to the right eye, and information shared by both eyes, thereby avoiding the bandwidth waste of transmitting the same information twice. In our approach, one view retains its full image integrity, while the other view transmits only the missing occlusion-region information; the two are packed asymmetrically into a single frame. Finally, during virtual view synthesis, a source-selection step identifies the most appropriate view as the data source, synthesizing a more accurate depth image that in turn assists the synthesis of the texture image. The experimental results report both objective quality measurements and subjective visual results. In terms of virtual view synthesis quality, the proposed method outperforms conventional frame-compatible stereo video coding at lower bandwidth requirements.
Stereo video has been favorably received in 3D entertainment, as it provides a more immersive experience for audiences than conventional video does. To expand stereo video application scenarios, depth video is used to render virtual views in conjunction with texture video. This not only enriches the viewing experience but also supports the next generation of TV, free-viewpoint TV (FTV). To reduce the stereo data volume while reusing the existing infrastructure and equipment for 2D video, a frame-compatible stereo video format is usually adopted. Existing frame-compatible stereo coding, however, considers only stereo texture video. In this work, we propose frame-compatible stereo depth coding. The depth map plays an important role in view synthesis, and its reliability is essential for rendering high-quality virtual views. Hence, the challenge in this work is how to perform downsampling while maintaining the precision of the depth video. We propose an asymmetric frame-compatible depth video coding scheme that accounts for the occlusion region. The content of both views is analyzed after warping the primary view to the secondary view. The occlusion region is defined as the missing region for which no corresponding information can be found in the primary view. The idea is to avoid sending the same information in both views. Thus, the downsized primary view and the (downsized) occlusion region of the secondary view are packed into a single frame, which can then be encoded by standard video codecs. The experimental results show that the proposed technique achieves better coding performance, both objectively and subjectively, than conventional frame-compatible technology. In the future, we will extend this technique to the three-view case, providing a wider viewing angle for audiences.
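The two core steps described above, locating the occlusion region by warping the primary view toward the secondary view and then packing the primary view with the occlusion-only residue, can be illustrated with a simplified sketch. This is not the thesis's exact pipeline: it assumes an integer horizontal disparity that is linearly proportional to the 8-bit depth value (the `disparity_scale` factor is hypothetical), a z-buffer test to resolve collisions, and a plain 2:1 side-by-side packing.

```python
import numpy as np

def occlusion_mask(primary_depth, disparity_scale=0.05):
    """Warp the primary-view depth map to the secondary view and mark
    pixels that receive no projected sample as the occlusion region.
    Simplification: disparity = depth value * disparity_scale (assumed)."""
    h, w = primary_depth.shape
    warped = np.full((h, w), -1, dtype=np.int64)  # -1 marks a hole
    for y in range(h):
        for x in range(w):
            d = int(round(primary_depth[y, x] * disparity_scale))
            xs = x - d  # horizontal shift toward the secondary view
            if 0 <= xs < w and warped[y, xs] < primary_depth[y, x]:
                warped[y, xs] = primary_depth[y, x]  # z-buffer: keep nearest
    return warped == -1  # True where the primary view has no information

def pack_asymmetric(primary_depth, secondary_depth, mask):
    """Side-by-side packing: the downsized primary view on the left and the
    downsized occlusion-only secondary data (other pixels zeroed) on the right."""
    left = primary_depth[:, ::2]                   # 2:1 horizontal downsampling
    right = np.where(mask, secondary_depth, 0)[:, ::2]
    return np.concatenate([left, right], axis=1)
```

For a foreground object in front of a flat background, the mask ends up covering exactly the columns the object vacates in the secondary view, which is the region the secondary view must supply itself; everything else is recoverable from the primary view at synthesis time.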