
Geometry-Aware Representation Learning for Unsupervised Monocular Depth Estimation

Advisor: 陳良基
Co-advisor: 王鈺強 (Yu-Chiang Wang)

Abstract


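Monocular depth estimation is an important cue for scene understanding. Although many supervised and unsupervised learning methods have been proposed and have made substantial progress on this task, most of them produce poor results at object boundaries and in fine details, even though depth in these regions matters a great deal in real-world applications. In this thesis, we propose a novel geometry-aware representation learning method that incorporates object geometry into monocular depth estimation by adding semantic segmentation information. Combined with a series of specially designed conditional discriminators that integrate object structure and visual appearance, the method improves unsupervised monocular depth estimation and alleviates the inaccuracy at object boundaries and in fine details seen in most previous methods. Quantitative and qualitative analyses on a public dataset show that our representation learning method is on par with other current state-of-the-art methods for unsupervised monocular depth estimation and achieves clear improvements on specific objects.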

Parallel Abstract


Monocular depth estimation plays an important role in scene understanding. While a number of supervised and unsupervised learning approaches for monocular depth estimation have been proposed, their promising quantitative performance does not necessarily reflect satisfactory depth outputs, because object boundaries are often inaccurate. In this paper, we propose a novel approach to geometry-aware representation learning, which takes object geometry into account with the aid of semantic scene understanding (i.e., semantic segmentation). With a series of unique combinatorial conditional discriminators deployed over visual appearance and geometric representations, improved unsupervised depth estimation can be achieved. Experiments on the KITTI dataset successfully verify that our model performs favorably against recent approaches.
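The point that aggregate scores can hide boundary errors can be illustrated with a toy calculation. The sketch below is not taken from the thesis: it assumes two commonly used depth metrics (mean absolute relative error and threshold accuracy with a threshold of 1.25), and the scanline, depth values, and boundary shift are all invented for illustration.

```python
# Illustrative sketch only: two common monocular-depth metrics on a toy
# 1-D "scanline". All depth values and the boundary shift are made up.

def abs_rel(pred, gt):
    """Mean absolute relative error: mean(|pred - gt| / gt)."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)

def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels with max(pred/gt, gt/pred) below the threshold."""
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    return hits / len(gt)

# Ground truth: a small near object (5 m) in front of a far background (20 m).
gt = [5.0] * 10 + [20.0] * 990

# Hypothetical prediction: exact everywhere, except that the object boundary
# is misplaced by two pixels (two object pixels receive background depth).
pred = [5.0] * 8 + [20.0] * 992

# Metrics averaged over the whole scanline.
overall_abs_rel = abs_rel(pred, gt)
overall_delta = delta_accuracy(pred, gt)

# The same error measure restricted to the ten object pixels.
object_abs_rel = abs_rel(pred[:10], gt[:10])

print(f"overall abs rel: {overall_abs_rel:.3f}")
print(f"overall delta<1.25: {overall_delta:.3f}")
print(f"object-only abs rel: {object_abs_rel:.3f}")
```

In this toy case the error averaged over the whole scanline is small because the object covers few pixels, while the error measured on the object alone is two orders of magnitude larger. This is one way to read the claim that strong aggregate numbers may coexist with poor object boundaries.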

