

Collecting Shape Annotations from ImageNet by Crowdsourcing System and Study of Shape Recognition

Advisor: 劉震昌


Shape is one of the most important features for recognizing objects in an image. However, publicly available image datasets seldom contain high-quality shape annotations of the objects they depict. This lack has two causes: automatic object segmentation and edge detection cannot produce sufficiently precise shape contours, and manually annotating shape contours for a large number of images costs a great deal of time and money.

In this thesis, we propose several crowdsourcing applications for mobile devices built on ImageNet, a large-scale image database. The applications use gamification to increase users' motivation to contribute. Beyond the applications themselves, we design a shape-collection mechanism and corresponding APIs so that other mobile application developers can build further applications. The mechanism covers shape data search, shape data collection, language translation, and shape data verification.

We also present a preliminary study of shape recognition. We use TPS-RPM to recognize the drawn shapes and improve the recognition rate with size normalization, position normalization, tilt angle normalization, and flip normalization. In an experiment with 5 object categories and 6 samples per category, the recognition rate reaches 90%.
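The four normalization steps named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation and it does not perform TPS-RPM matching. It assumes each drawn shape is a list of 2-D points, and the function name `normalize_shape`, the RMS-based size measure, the principal-axis definition of tilt angle, and the third-moment flip rule are all illustrative assumptions.

```python
import math


def normalize_shape(points):
    """Return a 2-D point list normalized for position, size, tilt angle, and flip.

    Hypothetical helper for illustration; the thesis may define each step differently.
    """
    n = len(points)

    # Position: translate the centroid to the origin.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    pts = [(x - cx, y - cy) for x, y in points]

    # Size: scale so the root-mean-square distance to the centroid is 1.
    rms = math.sqrt(sum(x * x + y * y for x, y in pts) / n)
    if rms == 0.0:
        raise ValueError("all points coincide; size is undefined")
    pts = [(x / rms, y / rms) for x, y in pts]

    # Tilt angle: rotate by -theta so the principal axis lies on the x-axis.
    sxx = sum(x * x for x, _ in pts)
    syy = sum(y * y for _, y in pts)
    sxy = sum(x * y for x, y in pts)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    pts = [(c * x + s * y, -s * x + c * y) for x, y in pts]

    # Flip: axis alignment leaves a sign ambiguity on each axis.
    # Assumed rule: make the third moment along each axis non-negative,
    # so a shape and its mirror image map to the same point set.
    if sum(x ** 3 for x, _ in pts) < 0:
        pts = [(-x, y) for x, y in pts]
    if sum(y ** 3 for _, y in pts) < 0:
        pts = [(x, -y) for x, y in pts]
    return pts
```

Under these assumptions, a shape and a translated, scaled, rotated, or mirrored copy of it normalize to the same point set, so a subsequent matcher only has to handle the remaining non-rigid differences. The angle and flip rules become unstable for shapes that are nearly isotropic or nearly symmetric, where the second or third moments are close to zero.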


