In this thesis, we propose a method for refining the depth maps of 3D video. When computing the depth maps, we consider not only spatial consistency but also keep the depth consistent over time. Our method can be applied to dynamic scenes and to cases where the camera rotates. Most related work on refining the depth maps of 3D video handles the depths of dynamic objects and static scenery differently, because the depth of a static object stays consistent over time while that of a dynamic object does not. Many studies use a binary map to separate static scenery from dynamic objects; we instead use a probability to describe how likely a pixel is to be dynamic. Compared with a binary map containing only 0 and 1, our approach is more flexible. In our method, we first compute a depth map for each time instant. Because temporal consistency is not considered in this computation, a static object may be assigned different depths at different times. To incorporate temporal information, we use optical flow to obtain correspondences between neighboring frames. If a pixel belongs to static scenery, we keep its depth consistent both temporally and spatially; if it belongs to a dynamic object, we emphasize spatial rather than temporal consistency of its depth. Finally, we apply iteratively reweighted least squares to obtain the optimal depth video. By minimizing the energy, the refined depth maps become more accurate and temporally consistent.
In this thesis, we propose a novel method to refine the depth maps computed from stereo video, enforcing both their spatial and temporal consistency. Most previous works on estimating depth maps from video recover depth differently depending on whether a pixel belongs to a dynamic or a static region, because a static region keeps the same depth over time while a dynamic region does not. Most methods employ a binary map to separate the two regions in an image. Our method instead employs a probabilistic framework that describes how likely each pixel is to be dynamic or static; compared to a binary map, the probability map is more flexible. In the proposed approach, we first estimate an initial disparity map for each frame. The initial disparity maps are temporally inconsistent because each is estimated by a stereo matching method from a single pair of stereo images. We then apply optical flow to estimate corresponding pixels between neighboring frames. If a pixel is static, we smooth its depth both spatially and temporally; if it is dynamic, we smooth its depth only spatially. In the final step, we optimize the depth maps by applying iteratively reweighted least squares (IRLS). Experiments on a number of different stereo videos show that the depth maps refined by the proposed algorithm are more accurate and temporally consistent.
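The refinement step described above can be sketched as a small IRLS update. This is a minimal illustration, not the thesis implementation: it assumes hypothetical 1D depth "scanlines" per frame, uses identity correspondence in place of the optical-flow links, and the weights `lam_s` and `lam_t` are made-up smoothness parameters. The robust IRLS weight `1/(|r| + eps)` is one common choice standing in for whatever robust penalty the full method uses.

```python
import numpy as np

def refine_depth(init, p_static, lam_s=1.0, lam_t=1.0, iters=10, eps=1e-3):
    """IRLS-style refinement of a depth video (illustrative sketch).

    init:     (T, N) initial per-frame depth (1D scanlines; identity
              correspondence stands in for optical flow)
    p_static: (T, N) probability that each pixel is static, in [0, 1]
    Returns a refined (T, N) depth array.
    """
    d = init.astype(float).copy()
    T, N = d.shape
    for _ in range(iters):
        new = d.copy()
        for t in range(T):
            for i in range(N):
                # Data term: stay close to the initial estimate (unit weight).
                num, den = init[t, i], 1.0
                # Spatial term: robust IRLS weights shrink across depth edges.
                for j in (i - 1, i + 1):
                    if 0 <= j < N:
                        w = lam_s / (abs(d[t, i] - d[t, j]) + eps)
                        num += w * d[t, j]
                        den += w
                # Temporal term: scaled by the static probability, so dynamic
                # pixels (p_static near 0) are smoothed only spatially.
                for s in (t - 1, t + 1):
                    if 0 <= s < T:
                        w = lam_t * p_static[t, i] / (abs(d[t, i] - d[s, i]) + eps)
                        num += w * d[s, i]
                        den += w
                new[t, i] = num / den
        d = new
    return d
```

For a fully static scene (`p_static` all ones), the temporal term pulls each pixel toward its depth in neighboring frames, so flicker in the initial per-frame estimates is suppressed; setting `p_static` to zero disables the temporal pull entirely, matching the spatial-only treatment of dynamic pixels.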