
Depth Maps Estimation from a Single Video Sequence

Advisor: 郭天穎

Abstract


3D stereoscopic content offers viewers a more realistic visual experience, but most existing images and videos are stored as single-view 2D data, so an accurate 2D-to-3D conversion technique is needed, and depth estimation is the first step in reconstructing 3D content. Most prior work on single-view video depth estimation relies on the relationship between temporally adjacent frames, but this approach frequently runs into noise, variable-speed camera motion, textureless regions, and occlusion, all of which severely affect the depth estimation process.

This thesis estimates depth information from single-view video in order to reconstruct 3D video content. Reliable depth information is extracted from an initial depth and the reference depths built from it, and the depth is then refined jointly in the temporal and spatial domains. First, disparity between consecutive frames of the video is extracted with adaptive support-weight block matching, and the disparity is corrected for camera motion to resolve the global depth discontinuities caused by variable-speed camera movement. The corrected disparity is then converted into an initial depth, from which two reference depths are built: a propagation depth and an optical flow depth. These three depth cues are fused by voting to obtain reliable depth information. Finally, superpixel segmentation and temporal-spatial smoothing refine the depth, reducing the impact of textureless regions and noise, to produce the final estimate. Experiments show that the proposed depth estimation method needs neither the extra pre-processing nor the time-consuming iterative refinement commonly used in the literature, and yields depth information that is visually comfortable and temporally consistent.
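The first stage above, disparity extraction with adaptive support-weight block matching, follows the general idea of Yoon and Kweon's locally adaptive support weights: each pixel in the matching window votes on the cost in proportion to its color similarity and spatial proximity to the window center. The sketch below is a minimal illustration of that general technique, not the thesis implementation; the window size, the two γ parameters, and the winner-take-all search are assumptions.

```python
import numpy as np

def asw_cost(left, right, y, x, d, win=5, gamma_c=10.0, gamma_s=10.0):
    """Aggregated matching cost at (y, x) for disparity d, using adaptive
    support weights: color similarity times spatial proximity (a sketch)."""
    r = win // 2
    p = float(left[y, x])                       # center pixel intensity
    total_w, total_c = 0.0, 0.0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            qy, qx = y + dy, x + dx
            q = float(left[qy, qx])
            # support weight: similar color and nearby position weigh more
            w = np.exp(-abs(q - p) / gamma_c) * np.exp(-np.hypot(dy, dx) / gamma_s)
            # raw cost: absolute intensity difference at candidate disparity d
            c = abs(q - float(right[qy, qx - d]))
            total_w += w
            total_c += w * c
    return total_c / total_w

def disparity_at(left, right, y, x, max_d, win=5):
    """Winner-take-all: pick the disparity with the lowest aggregated cost."""
    costs = [asw_cost(left, right, y, x, d, win) for d in range(max_d + 1)]
    return int(np.argmin(costs))
```

In the thesis pipeline this per-pixel disparity would next be compensated for camera motion and converted into the initial depth; those steps are not shown here.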

Parallel Abstract


Stereoscopic content provides a more realistic visual experience. Since most existing images and videos are stored in 2D formats, an accurate 2D-to-3D video conversion technique is needed, and depth estimation is the essential step in that reconstruction. In the literature, depth estimation from a single-view video often exploits the relationship between frames, but such estimation can fail on videos with noise, variable-speed camera movement, textureless regions, or occlusion, all of which make the depth estimation process more difficult. The goal of this thesis is a robust depth estimation method for single-view video that handles these difficult situations. An estimated initial depth is used to establish reference depths, from which reliable depth information is obtained; the result is finally refined with a temporal-spatial filter. First, adaptive support-weight block matching extracts disparity information from consecutive frames; the disparity is compensated for camera motion and then transformed into the initial depth. Based on the initial depth, two further depths are established: the propagation depth and the optical flow depth. These three depths are fused by voting, and super-pixel segmentation together with a temporal-spatial smoothing filter then improves the noisy depth estimated in textureless image regions. Experiments show that the proposed method achieves visually pleasing, temporally consistent depth estimation without the additional pre-processing and time-consuming iterations required by other works in the literature.
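The abstract only outlines the fusion and refinement stages. One plausible pixel-wise voting rule, under the assumption that two depth cues agreeing within a tolerance outvote the third, is sketched below; the tolerance value, the median fallback, and the exponential temporal blend are all illustrative assumptions, not the thesis' actual rules.

```python
import numpy as np

def fuse_depths(d_init, d_prop, d_flow, tol=8.0):
    """Pixel-wise voting fusion of three depth hypotheses (a sketch).
    Any pair agreeing within `tol` forms a majority and its mean wins;
    with no consensus, fall back to the per-pixel median."""
    stack = np.stack([d_init, d_prop, d_flow]).astype(float)
    fused = np.median(stack, axis=0)                 # no-consensus fallback
    for a in range(3):
        for b in range(a + 1, 3):
            agree = np.abs(stack[a] - stack[b]) <= tol
            fused = np.where(agree, 0.5 * (stack[a] + stack[b]), fused)
    return fused

def temporal_smooth(prev_depth, cur_depth, alpha=0.7):
    """Blend the current depth with the previous frame's depth to
    reduce temporal flicker (one simple form of temporal smoothing)."""
    return alpha * cur_depth + (1 - alpha) * prev_depth
```

A full pipeline would apply `fuse_depths` per frame, then smooth each fused map against the previous frame's refined depth; the spatial (superpixel-guided) part of the refinement is not shown.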


