Unsupervised Scene Segmentation Using Sparse Coding Context

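Author (Chinese):	劉晏誠
Author (English):	Liu, Yen-Cheng
Thesis title (English):	Unsupervised Scene Segmentation Using Sparse Coding Context
Thesis title (Chinese):	使用稀疏表示法的非監督式畫面切割
Advisor (Chinese):	陳煥宗
Advisor (English):	Chen, Hwann-Tzong
Degree:	Master's
University:	National Tsing Hua University
Department:	Department of Computer Science
Student ID:	9762525
Year of publication (ROC calendar):	99 (2010)
Academic year of graduation (ROC calendar):	98
Language:	English
Number of pages:	26
Keywords (English):	scene segmentation
Keywords (Chinese):	場景切割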

This thesis approaches image understanding from the perspective of unsupervised scene segmentation.
With the goal of image understanding in mind, we define "unsupervised scene segmentation" as the task of dividing a given image into semantically meaningful regions without using annotations or other human-labeled information.
We investigate how well an algorithm can partition an image when human-involved learning procedures are kept to a minimum. Specifically, we are interested in developing an unsupervised segmentation algorithm that relies only on a contextual prior learned from a set of images. Our algorithm takes a small set of images whose scene structures are similar to that of the input image. We use sparse coding to analyze the appearance of this set of images; the effectiveness of sparse coding allows us to derive a prior on the context of the scene from the set. Gaussian mixture models can then be constructed for different parts of the input image based on this sparse-coding contextual prior, and combined into an MRF-based segmentation process.
In the experimental results, we show that our unsupervised segmentation algorithm is able to partition an image into semantic regions, such as buildings, roads, trees, and skies, without using human-annotated information. The semantic regions it generates can serve as pre-processed inputs for subsequent classification-based labeling algorithms, helping to achieve automatic scene annotation and scene parsing.
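As a rough illustration of the sparse coding step mentioned above, the sketch below encodes one feature vector as a sparse combination of dictionary atoms using ISTA (iterative soft-thresholding). This record does not include the thesis's implementation, so everything here is an assumption made for illustration: the solver choice, the function names `sparse_code_ista` and `objective`, the random dictionary `D`, and the synthetic signal `x`. In the thesis setting, the dictionary would presumably be learned from the set of similar images rather than drawn at random.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def objective(D, x, a, lam):
    """Lasso objective: 0.5 * ||x - D a||^2 + lam * ||a||_1."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))


def sparse_code_ista(D, x, lam=0.1, n_iter=200):
    """Approximately minimize the lasso objective over a with ISTA."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # of the quadratic term (squared largest singular value of D).
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a


# Hypothetical data: a random dictionary with unit-norm atoms and a
# signal built from three of those atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(128)
a_true[[3, 40, 77]] = [1.5, -2.0, 1.0]
x = D @ a_true

lam = 0.05
a_hat = sparse_code_ista(D, x, lam=lam, n_iter=500)
print("nonzero coefficients:", np.count_nonzero(a_hat))
print("objective:", objective(D, x, a_hat, lam))
```

With a step size of 1/L, ISTA never increases the objective, which makes it a convenient baseline even though faster solvers exist. How the resulting codes are turned into a contextual prior is described in Section 2.3 of the thesis and is not reproduced here.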

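This thesis proposes an approach to image understanding from the perspective of unsupervised scene segmentation. With image understanding as the goal, we define "unsupervised scene segmentation" as dividing a given image into semantically meaningful objects without requiring annotations or other human-provided information. We explore how well a learning algorithm can segment an image when only limited human-provided information is available. More precisely, our aim is to develop an unsupervised scene segmentation algorithm that relies only on structural probability information learned from a set of images. Our algorithm requires a small number of images whose structure is similar to that of the image to be segmented, and uses sparse coding to analyze the visual structure of these similar images. The strength of sparse coding is that it lets us use these similar images to estimate how structural regions are distributed in the image to be segmented. We can then model the different structural regions with Gaussian mixture models and combine them with a Markov random field method to perform the segmentation. The experimental results show that our unsupervised image segmentation algorithm can successfully cut an image into semantic parts, such as buildings, roads, trees, and sky, without requiring prior human annotation. The semantic image regions produced by our algorithm can be useful information; for example, they can serve as input to subsequent classification-based labeling algorithms, ultimately achieving automatic scene annotation and scene parsing.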

1 Introduction
2 Approach
2.1 Overview
2.2 Scene Structure Analysis
2.3 Estimating the Contextual Prior Using Sparse Coding
2.4 Markov-Random-Field Based Image Segmentation
2.5 Refining the Image Stack
3 Experiments
4 Conclusion


(The full text of this thesis has not been authorized for public access.)