
Free-form 3D Scene Inpainting with Dual-stream GAN
(以雙流對抗式生成網路之任意形狀三維立體場景修復)

Advisor: 徐宏民

Abstract


Nowadays, the need for user editing in 3D scenes has increased rapidly with the development of AR and VR technology. However, existing 3D scene completion tasks (and their datasets) cannot meet this need, because the missing regions in their scenes are produced by sensor limitations or object occlusion. We therefore present a novel task, free-form 3D scene inpainting. Unlike scenes in previous 3D completion datasets, which preserve most of the main structure and hints of detailed shape around the missing regions, our proposed inpainting dataset, FF-Matterport, contains large and diverse missing regions formed by our free-form 3D mask generation algorithm, which mimics human drawing trajectories in 3D space. Moreover, prior 3D completion methods, which simply interpolate nearby geometry and color context, cannot perform well on this challenging yet practical task, so we propose a tailored dual-stream GAN. First, our dual-stream generator fuses geometry and color information to produce distinct semantic boundaries and resolve the interpolation issue of prior methods. To further enhance detail, our lightweight dual-stream discriminator regularizes the geometry and color edges of predicted scenes to be realistic and sharp. We conducted experiments on the proposed FF-Matterport dataset. Qualitative and quantitative results validate the superiority of our approach over existing scene completion methods and the efficacy of every proposed component.
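The abstract does not specify the free-form 3D mask generation algorithm itself; a minimal sketch of the general idea it describes — carving large, irregular missing regions by simulating wandering brush strokes through a voxel grid — might look like the following. The function name, parameter choices, and the random-walk stroke model are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def free_form_3d_mask(shape=(64, 64, 64), n_strokes=4, steps=40,
                      radius=3, max_turn=0.5, seed=None):
    """Carve a free-form missing region into a voxel grid by simulating
    random brush strokes, a rough analogue of a human drawing trajectory
    in 3D space. Returns a boolean array where True marks missing voxels.
    (Hypothetical sketch; not the thesis's actual algorithm.)"""
    rng = np.random.default_rng(seed)
    mask = np.zeros(shape, dtype=bool)
    grid = np.indices(shape)  # (3, D, H, W) voxel coordinates
    bounds = np.array(shape, dtype=float)
    for _ in range(n_strokes):
        pos = rng.uniform(radius, bounds - radius)      # stroke start point
        direction = rng.normal(size=3)
        direction /= np.linalg.norm(direction)
        for _ in range(steps):
            # mark every voxel within `radius` of the brush centre
            dist2 = sum((grid[i] - pos[i]) ** 2 for i in range(3))
            mask |= dist2 <= radius ** 2
            # perturb the heading to mimic a wandering hand
            direction += rng.normal(scale=max_turn, size=3)
            direction /= np.linalg.norm(direction)
            pos = np.clip(pos + direction, radius, bounds - radius)
    return mask
```

Such strokes remove both the main structure and the detail cues around a region at once, which is what distinguishes this setting from completion datasets whose holes come from occlusion or sensor range.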

