
Efficient Large-Scale Image-Based Modeling Using Divide and Conquer Strategy with Two-Layer Clustering

Advisor: 陳冠文

Abstract

Image-based 3D modeling has been widely developed, but its computational cost and memory requirements remain the main obstacles, especially when building a large-scale or even city-scale model with only a personal computer. In this work, we propose an efficient large-scale image-based modeling pipeline that uses a divide-and-conquer strategy based on both location and image similarity. It allows ordinary users to build their own large-scale models easily with a single PC. In addition, unlike previous methods, which require users to manually select or input thousands of images, our approach only requires users to capture multiple videos of the scene arbitrarily, which is easier and more practical for real applications. The main idea is to apply a divide-and-conquer strategy based on both location information (i.e., GPS) and image similarity (i.e., feature matching and epipolar geometry). We first divide the videos into multiple small groups of video clips by location clustering, and then divide these groups into still smaller clusters of images by image clustering. Finally, we propose a framework for combining the resulting small-scale models into a single large-scale scene model. The two-layer clustering substantially reduces the computational requirements, and the experimental results demonstrate its feasibility and accuracy. To the best of our knowledge, this is the first work to use two-layer clustering based on both location and image information for large-scale model construction.
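The two-layer idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithms, so everything below is an assumption made for illustration: frames stand in for video clips, single-linkage grouping (connected components) is used as a placeholder clustering rule in both layers, and the names `two_layer_clusters`, `match_count`, `radius_m`, and `min_matches` are hypothetical. The `match_count` callback is expected to return the number of geometrically verified feature matches between two frames; how it is computed is left open here.

```python
from collections import defaultdict
from math import asin, cos, radians, sin, sqrt


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def connected_components(items, linked):
    """Group items so that any two items with linked(x, y) == True end up together."""
    parent = list(range(len(items)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if linked(items[i], items[j]):
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, item in enumerate(items):
        groups[find(i)].append(item)
    return list(groups.values())


def two_layer_clusters(frames, match_count, radius_m=50.0, min_matches=30):
    """Hypothetical two-layer split.

    frames: list of dicts with an 'id' and a 'gps' (lat, lon) tuple.
    match_count(id_a, id_b): assumed callback returning verified feature matches.
    """
    # Layer 1: group frames whose GPS positions are chained within radius_m.
    by_location = connected_components(
        frames, lambda a, b: haversine_m(a["gps"], b["gps"]) <= radius_m
    )
    # Layer 2: inside each location group, split by image similarity.
    clusters = []
    for group in by_location:
        clusters.extend(
            connected_components(
                group, lambda a, b: match_count(a["id"], b["id"]) >= min_matches
            )
        )
    return clusters
```

Because the second layer only compares images inside one location group, the number of pairwise image comparisons drops from all pairs in the dataset to all pairs within each group, which is where the computational saving would come from. Single-linkage grouping does not bound cluster size, so a real pipeline would likely need a size cap or a different clustering rule.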

