A Study on Multiview Synthesis from Monocular Image Sequences

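With advances in digitization and display technology, traditional television broadcasting is undergoing a profound change. As high-definition video broadcasting becomes mainstream, the next emerging change will be multiview video broadcasting. Although 3D stereoscopic display technology is maturing, digital content suited to 3D display remains scarce, and existing video material, including movies, news, and documentaries, was shot with monocular cameras. Converting monocular video into 3D stereoscopic video would therefore supply abundant content for 3D displays. This thesis addresses the conversion of monocular video into 3D video under the constraint that the objects in the scene are stationary. Our approach uses depth image-based rendering (DIBR) to synthesize the required stereoscopic views. View synthesis with DIBR, however, produces disocclusion regions, and removing the resulting distortion effectively is a highly challenging problem. In this study we propose an effective algorithm for removing the disocclusion regions in synthesized views. We first compute the disparity between adjacent images in the sequence and then use the disparity to obtain the horizontal displacement between frames. For a given frame of the sequence, we use DIBR to synthesize the required stereoscopic views on its left and right sides. The disocclusion regions produced by view synthesis are handled with a two-stage filling method. The first stage is inter-frame filling: if the synthesized view lies to the left of the original frame, we first fill the disocclusion regions using the adjacent frames located to the left of that original frame. The disocclusion regions that remain after the first stage are then processed by the second stage, intra-frame filling. Experimental results show that the proposed algorithm effectively removes the disocclusion distortion produced by view synthesis.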

Keywords

Depth estimation; inter-frame filling; intra-frame filling; occlusion regions; stereoscopic display

Parallel Abstract

Owing to advances in digital and display technologies, traditional television broadcasting is undergoing a profound change. As high-definition video becomes the new mainstream of digital broadcasting, the next emerging technology will be multiview video broadcasting. 3D display technology is gradually maturing; however, video content suitable for 3D displays is still rare. Since most existing video material, including movies, news, and documentaries, was captured by monocular cameras, technology capable of generating 3D video from monocular video would meet the demand of 3D displays. In this thesis, under the constraint that the objects in the scene are stationary, we develop techniques that convert a monocular video into a 3D stereoscopic video. We adopt the DIBR (depth image-based rendering) approach to generate multiview images. The major challenge of DIBR-based view synthesis is the disocclusion problem. We propose an effective algorithm to remove the disocclusion regions in the synthesized views. First, we estimate the disparity map for every pair of consecutive images and then use it to determine the horizontal displacement between consecutive frames. The disparity maps provide the depth information for synthesizing the multiview images, while the horizontal displacements provide the relative positions of the images. We use a two-stage inpainting scheme to remove the disocclusion areas. The first stage, inter-frame inpainting, fills the disocclusion regions using data from the neighboring frames of the reference image. The second stage, intra-frame inpainting, then fills the remaining disocclusion areas. Experimental results show that the proposed algorithm effectively removes the disocclusion artifacts in the synthesized views.
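The pipeline described above (DIBR warping, then inter-frame and intra-frame filling) can be illustrated on a single scanline. The following is only a minimal sketch under stated assumptions, not the thesis implementation: the function names, the toy pixel values, the `shift` parameter, and the rule "prefer the larger disparity when two pixels collide" are all hypothetical choices made for illustration. The intra-frame step here simply copies the nearest valid pixel on the background (smaller-disparity) side, which stands in for whatever inpainting method the thesis actually uses. The neighboring frame is assumed to have been taken half a baseline unit closer to the virtual viewpoint, so it is warped with a smaller shift.

```python
from typing import List, Optional, Tuple

Row = List[Optional[float]]  # None marks a hole (disoccluded pixel)


def warp_row(colors: Row, disparity: Row, shift: float) -> Tuple[Row, Row]:
    """Forward-warp one scanline horizontally by shift * disparity (DIBR-style)."""
    width = len(colors)
    out_c: Row = [None] * width
    out_d: Row = [None] * width
    for x in range(width):
        if colors[x] is None:
            continue
        tx = x + round(shift * disparity[x])
        # On collisions keep the pixel with the larger disparity (assumed nearer).
        if 0 <= tx < width and (out_d[tx] is None or disparity[x] > out_d[tx]):
            out_c[tx], out_d[tx] = colors[x], disparity[x]
    return out_c, out_d


def inter_frame_fill(view_c: Row, view_d: Row, cand_c: Row, cand_d: Row) -> None:
    """Stage 1: fill holes with pixels from a neighboring frame warped to the same view."""
    for x in range(len(view_c)):
        if view_c[x] is None and cand_c[x] is not None:
            view_c[x], view_d[x] = cand_c[x], cand_d[x]


def intra_frame_fill(view_c: Row, view_d: Row) -> None:
    """Stage 2 (stand-in): copy the nearest valid pixel from the background side."""
    width = len(view_c)
    for x in [i for i in range(width) if view_c[i] is None]:
        left = next((i for i in range(x - 1, -1, -1) if view_c[i] is not None), None)
        right = next((i for i in range(x + 1, width) if view_c[i] is not None), None)
        sides = [i for i in (left, right) if i is not None]
        if sides:
            src = min(sides, key=lambda i: view_d[i])  # smaller disparity = background
            view_c[x], view_d[x] = view_c[src], view_d[src]


# Toy data: background values 10..17 at disparity 0, foreground value 99 at disparity 2.
ref_c = [10, 11, 12, 99, 99, 15, 16, 17]   # reference frame
ref_d = [0, 0, 0, 2, 2, 0, 0, 0]
nbr_c = [10, 11, 12, 13, 99, 99, 16, 17]   # neighboring frame (hypothetical)
nbr_d = [0, 0, 0, 0, 2, 2, 0, 0]

view_c, view_d = warp_row(ref_c, ref_d, shift=1.0)
holes_after_warp = [x for x, c in enumerate(view_c) if c is None]

cand_c, cand_d = warp_row(nbr_c, nbr_d, shift=0.5)
inter_frame_fill(view_c, view_d, cand_c, cand_d)
holes_after_inter = [x for x, c in enumerate(view_c) if c is None]

intra_frame_fill(view_c, view_d)
print(holes_after_warp, holes_after_inter, view_c)
```

In this toy case the warp opens two holes behind the foreground object, the neighboring frame recovers one of them with true background data, and the remaining one is filled from within the frame. Running the inter-frame stage first matters because it supplies real scene content, whereas the intra-frame stage can only extrapolate.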

Parallel Keywords

Depth estimation; Inter-frame inpainting; Intra-frame inpainting; Disocclusion; 3D display

