Nonparametric Scene Parsing with Deep Convolutional Features and Dense Alignment

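This thesis examines two key factors that affect the accuracy of nonparametric scene parsing: (1) the quality of image retrieval and (2) the accuracy of label transfer. Because nonparametric methods transfer labels from retrieved images to the test image, the retrieved images must be “semantically similar” to the test image. Given a good retrieval set, label transfer must then be accurate at the pixel level. We improve on both points to raise the accuracy of nonparametric image labeling. We use deep convolutional features as visual descriptors and re-rank the retrieved images with semantic descriptors, which yields a higher-quality retrieval set. In addition, we incorporate dense spatial alignment into the Markov random field (MRF) model to improve the pixel-level accuracy of label transfer. Next, we use the initial labeling result as an expanded query to obtain a better retrieval set, and we perform a second round of label transfer on this updated set. Finally, we combine the label transfer results of both rounds and run the MRF model again to obtain a better labeling. We validate the method on the SIFT Flow and LMSun datasets; the results show that it outperforms previous nonparametric scene labeling methods on both.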

Keywords

scene parsing; object window; deep convolutional network; SIFT flow

Parallel Abstract

This thesis addresses two key issues that affect the performance of nonparametric scene parsing: (1) the semantic quality of image retrieval and (2) the accuracy of label transfer. First, because nonparametric methods annotate a query image by transferring labels from retrieved images, image retrieval should return a set of images that are “semantically similar” to the query. Second, given the retrieval set, semantic labels should be transferred with pixel-level accuracy. We improve scene parsing accuracy on both issues. We use deep convolutional features as visual descriptors to improve the semantic quality of the retrieved images. In addition, we incorporate dense alignment into the Markov random field (MRF) inference framework so that labels are transferred with pixel-level accuracy. Next, we use the derived semantic labels as queries to expand the retrieval set and then conduct a second round of label transfer. Finally, we combine the label transfer cues from both rounds in the MRF model to improve the labeling results. Experiments on the SIFT Flow and LMSun datasets show that the proposed approach improves on other nonparametric methods.
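To make the retrieve-then-transfer idea concrete, the following is a minimal sketch and not the thesis implementation. It assumes each image is already described by a fixed-length feature vector (for example, deep convolutional features) and that all label maps share the same size. The function names `retrieve` and `transfer_labels` are hypothetical, and the sketch deliberately omits the semantic re-ranking, dense alignment, and MRF inference described above, replacing them with a plain similarity-weighted per-pixel vote.

```python
import numpy as np


def retrieve(query_feat, db_feats, k):
    """Return indices and cosine similarities of the k nearest database images.

    Assumption: features are nonzero vectors of equal length.
    """
    q = query_feat / np.linalg.norm(query_feat)
    d = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    sims = d @ q                                  # cosine similarity per image
    order = np.argsort(-sims, kind="stable")[:k]  # most similar first
    return order, sims[order]


def transfer_labels(label_maps, weights, num_classes):
    """Similarity-weighted per-pixel vote over the retrieved label maps.

    Simplification: no spatial alignment and no MRF smoothing are applied.
    """
    h, w = label_maps[0].shape
    votes = np.zeros((num_classes, h, w))
    for label_map, weight in zip(label_maps, weights):
        for c in range(num_classes):
            votes[c] += weight * (label_map == c)
    return votes.argmax(axis=0)


if __name__ == "__main__":
    # Toy data: three database images with 2-D features and 2x2 label maps.
    db_feats = np.array([[1.0, 0.1], [0.0, 1.0], [0.9, 0.0]])
    db_labels = [
        np.array([[0, 0], [1, 1]]),
        np.array([[1, 1], [1, 1]]),
        np.array([[0, 1], [1, 1]]),
    ]
    idx, sims = retrieve(np.array([1.0, 0.0]), db_feats, k=2)
    predicted = transfer_labels([db_labels[i] for i in idx], sims, num_classes=2)
    print(idx, predicted)
```

In a pipeline of the kind the abstract describes, the per-pixel votes would act as a data term that an MRF then smooths, and the first-round labeling would be fed back as an expanded query for a second retrieval pass.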

Parallel Keywords

scene parsing; object window; deep convolutional network; SIFT flow
