Unsupervised Scene Segmentation Using Sparse Coding Context

This thesis presents an approach to image understanding through unsupervised scene segmentation. With the goal of image understanding in mind, we define "unsupervised scene segmentation" as the task of dividing a given image into semantically meaningful regions without using annotations or other human-labeled information. We investigate how well an algorithm can partition an image when the learning procedure involves only limited human input. Specifically, we are interested in developing an unsupervised segmentation algorithm that relies only on a contextual prior learned from a set of images. Our algorithm uses a small set of images whose scene structures are similar to that of the input image. We apply sparse coding to analyze the appearance of this set of images, which allows us to derive a prior on the context of the scene. Gaussian mixture models can then be constructed for different parts of the input image based on the sparse-coding contextual prior and combined in a Markov random field (MRF) based segmentation process. The experimental results show that our unsupervised segmentation algorithm is able to partition an image into semantic regions, such as buildings, roads, trees, and sky, without using human-annotated information. The resulting semantic regions can serve as pre-processed inputs for subsequent classification-based labeling algorithms, supporting automatic scene annotation and scene parsing.
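The last stage described above, appearance models combined with a contextual prior inside an MRF, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify the optimizer or the features, so the sketch assumes a single Gaussian per label (a one-component stand-in for a Gaussian mixture model), a hand-written per-pixel prior (a stand-in for the sparse-coding contextual prior), a Potts smoothness term, and iterated conditional modes as the minimizer. All function and parameter names (`segment_icm`, `prior`, `models`, `smooth`) are hypothetical.

```python
import math


def gaussian_nll(x, mean, var):
    """Negative log-likelihood of x under a 1-D Gaussian."""
    return 0.5 * math.log(2 * math.pi * var) + (x - mean) ** 2 / (2 * var)


def segment_icm(image, prior, models, smooth=1.0, sweeps=5):
    """Label pixels with iterated conditional modes on a 4-connected Potts MRF.

    image  : 2-D list of intensities
    prior  : prior[r][c][k] = assumed contextual prior of label k at (r, c)
    models : list of (mean, var) per label, a one-component stand-in for a GMM
    """
    rows, cols = len(image), len(image[0])
    n_labels = len(models)

    def unary(r, c, k):
        # Appearance cost plus the negative log of the contextual prior.
        mean, var = models[k]
        return gaussian_nll(image[r][c], mean, var) - math.log(prior[r][c][k] + 1e-9)

    # Initialise each pixel with the label that minimises the unary term alone.
    labels = [[min(range(n_labels), key=lambda k: unary(r, c, k))
               for c in range(cols)] for r in range(rows)]

    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbours = [labels[rr][cc]
                              for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                              if 0 <= rr < rows and 0 <= cc < cols]

                def cost(k):
                    # Potts term: pay `smooth` for every disagreeing neighbour.
                    return unary(r, c, k) + smooth * sum(k != n for n in neighbours)

                best = min(range(n_labels), key=cost)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels


# Toy example: label 0 is bright ("sky"), label 1 is dark ("road").
image = [[0.9, 0.8, 0.9],
         [0.9, 0.45, 0.8],
         [0.1, 0.2, 0.1]]
upper, lower = [0.7, 0.3], [0.3, 0.7]   # assumed prior: sky above, road below
prior = [[upper] * 3, [upper] * 3, [lower] * 3]
models = [(0.9, 0.04), (0.1, 0.04)]

print(segment_icm(image, prior, models, smooth=0.0))  # [[0, 0, 0], [0, 1, 0], [1, 1, 1]]
print(segment_icm(image, prior, models, smooth=1.0))  # [[0, 0, 0], [0, 0, 0], [1, 1, 1]]
```

With `smooth=0.0` the ambiguous centre pixel follows its appearance cost alone and is labeled as road; with `smooth=1.0` the pairwise term pulls it into the surrounding sky region. A stronger optimizer, such as a graph-cut method, could replace the ICM loop without changing the energy being minimized.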

Keywords

scene segmentation

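Parallel Abstract

This thesis presents an image understanding method based on unsupervised scene segmentation. With image understanding as the goal, we define "unsupervised scene segmentation" as dividing a given image into semantically meaningful objects without requiring annotations or human-provided information. We explore how well a learning algorithm can segment an image when given limited human-provided information; more specifically, our aim is to develop an unsupervised scene segmentation algorithm that relies only on structural probability information learned from a set of images. Our algorithm requires a small number of images whose structure is similar to that of the image to be segmented, and it uses sparse coding to analyze the visual structure of these similar images; the strength of sparse coding is that it lets us use these similar images to estimate the distribution of structural regions in the image to be segmented. Gaussian mixture models can then be applied to the different structural regions and subsequently combined with a Markov random field method to perform the segmentation. The experimental results show that our unsupervised image segmentation algorithm can successfully partition an image into semantic parts, such as buildings, roads, trees, and sky, without requiring prior human annotation. The semantic image regions produced by our algorithm can be very useful; for example, they can serve as input to subsequent classification-based labeling algorithms, ultimately achieving automatic scene annotation and scene parsing.

Parallel Keywords

scene segmentation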
