
類神經輻射場之可解釋隱變量於三維物件操控

Interpreting Latent Representation in Neural Radiance Fields for Manipulating Object Semantics

Advisor: 王鈺強

Abstract


Manipulating the deformation of 3D objects has long been an actively studied topic in 3D vision. In recent years, a new 3D representation, the neural radiance field (NeRF), has been proposed and has developed rapidly, achieving great success in scene modeling. Using this representation to synthesize objects or manipulate the modeled content has drawn increasing attention. In this thesis, we adopt a semantic-aware generative neural radiance field: by exploring and interpreting the latent representation learned by a generative NeRF trained for a specific object category, we are able to edit particular regions of objects in that category. On top of a pre-trained generative NeRF, we add a semantic segmenter that performs part segmentation within each object, so that the NeRF can simultaneously render the 2D image from a chosen viewpoint together with the corresponding semantic segmentation. The proposed framework manipulates the latent representation learned by the generative NeRF; beyond targeting a specific part for editing, the strength of the edit can also be varied. We evaluate the framework with different generative NeRFs and with datasets of different object categories, and the results successfully verify the effectiveness and practicality of the method.

Abstract (English)


Manipulating 3D objects has been among the active research topics in 3D vision. With the development and success of the neural radiance field (NeRF) for scene modeling, synthesizing and manipulating 3D objects using such a representation becomes desirable. In this thesis, we introduce a semantic-aware generative NeRF, which is able to interpret the latent representation learned by category-specific generative NeRFs and to edit particular part attributes. On top of a pre-trained generative NeRF, we deploy a semantic segmenter for performing part segmentation on the object category. This allows rendering of the 2D image together with prediction of the corresponding segmentation mask. Our proposed scheme learns to manipulate the resulting latent representation and is optimized to edit the object part of interest to varying degrees. We conduct experiments on various object categories across benchmark datasets, and the results successfully verify the effectiveness and practicality of our proposed model.
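The two mechanisms the abstract describes can be sketched in a few lines: (1) a NeRF that volume-renders per-part semantic logits alongside colour, so each pixel gets both an RGB value and a segmentation prediction, and (2) editing by moving the latent code along a learned direction, with a scalar controlling the degree of the part edit. The snippet below is a minimal NumPy sketch under those assumptions; the function names (`composite_ray`, `edit_latent`) and the plain additive latent edit are illustrative, not the thesis's actual implementation.

```python
import numpy as np

def composite_ray(sigmas, rgbs, sem_logits, deltas):
    """Volume-render colour and per-part semantic logits along one ray.

    sigmas:     (N,)  densities at N samples along the ray
    rgbs:       (N,3) predicted colours at each sample
    sem_logits: (N,K) per-part semantic logits at each sample (K parts)
    deltas:     (N,)  distances between adjacent samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)                  # opacity per sample
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))  # transmittance
    weights = alphas * trans                                 # compositing weights
    color = (weights[:, None] * rgbs).sum(axis=0)            # rendered pixel colour
    sem = (weights[:, None] * sem_logits).sum(axis=0)        # rendered semantic logits
    return color, sem, weights

def edit_latent(z, direction, alpha):
    """Move the latent code along an (assumed) interpretable direction.

    alpha controls the degree of the part edit; alpha = 0 leaves z unchanged.
    """
    d = direction / np.linalg.norm(direction)
    return z + alpha * d
```

Rendering the segmentation mask with the same compositing weights as the colour keeps the mask geometrically consistent with the image for any chosen viewpoint, which is what makes per-part editing verifiable in image space.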

