Transferring Abstract Scene Structure Information via Domain Adaptation to Improve Semantic Image Segmentation and Depth Estimation

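In this thesis, we address unsupervised domain adaptation for semantic segmentation. In this setting, we learn from a synthetic dataset with ground-truth annotations and transfer that knowledge to real-world images that have no annotations. We hypothesize that the structure of an image is the most informative and decisive factor for semantic segmentation and is unaffected by differences between datasets. We therefore propose the Domain Invariant Structure Extraction (DISE) framework, which decomposes an image into domain-invariant structure and domain-specific texture features. The framework further enables cross-domain image translation and label transfer, which improve the model's semantic segmentation performance. Extensive experiments verify the effectiveness of the proposed DISE model and show that it outperforms several state-of-the-art methods and has potential for other vision tasks.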

Keywords

deep learning; domain adaptation; semantic segmentation; depth estimation

Parallel Abstract

In this thesis, we tackle unsupervised domain adaptation for semantic segmentation, where we attempt to transfer knowledge learned from synthetic datasets with ground-truth labels to real-world images without any annotation. Hypothesizing that the structural content of an image is the most informative and decisive factor for semantic segmentation and can be readily shared across domains, we propose a Domain Invariant Structure Extraction (DISE) framework that disentangles images into domain-invariant structure and domain-specific texture representations. These representations further enable image translation across domains and label transfer, which improve segmentation performance. Extensive experiments verify the effectiveness of the proposed DISE model and demonstrate its superiority over several state-of-the-art approaches, as well as its potential for other vision tasks.
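
The disentangle-and-swap idea described above can be illustrated with a deliberately tiny Python sketch. It is not the DISE model: the abstract does not specify the networks, so the learned structure and texture encoders are replaced here by an assumed stand-in in which "structure" is the image normalized to zero mean and unit variance and "texture" is the pair of global intensity statistics. Images are flat lists of pixel intensities, and all function names (`encode_structure`, `encode_texture`, `decode`, `translate_with_labels`) are hypothetical.

```python
from statistics import fmean, pstdev


def encode_structure(image):
    """Toy 'domain-invariant structure': pixels normalized to zero mean, unit variance.

    Assumption: stands in for a learned structure encoder.
    """
    mu = fmean(image)
    sigma = pstdev(image) or 1.0  # guard against constant images
    return [(p - mu) / sigma for p in image]


def encode_texture(image):
    """Toy 'domain-specific texture': global mean and standard deviation.

    Assumption: stands in for a learned texture encoder.
    """
    return fmean(image), pstdev(image)


def decode(structure, texture):
    """Recombine a structure code with a texture code into an image."""
    mu, sigma = texture
    return [s * sigma + mu for s in structure]


def translate_with_labels(src_image, src_labels, tgt_image):
    """Render the source structure with the target texture.

    The per-pixel labels are carried over unchanged, because only the
    texture code was swapped and the structure code is the source's.
    """
    translated = decode(encode_structure(src_image), encode_texture(tgt_image))
    return translated, list(src_labels)


# Hypothetical data: a labeled synthetic image and an unlabeled real image.
synthetic = [0.0, 0.0, 1.0, 1.0]
synthetic_labels = [0, 0, 1, 1]
real = [0.2, 0.4, 0.4, 0.6]

styled, styled_labels = translate_with_labels(synthetic, synthetic_labels, real)
print(styled, styled_labels)
```

Because the translated image keeps the source structure, the source label map can supervise a segmenter on target-styled inputs. In the actual framework this role would be played by learned encoders and a decoder rather than fixed statistics.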

Parallel Keywords

deep learning; domain adaptation; semantic segmentation; depth estimation
