  • Thesis

Rotation, Translation, and Scale Invariant Bag of Feature based on Feature Density

Advisor: 江振國

Abstract


Humans can easily recognize an object in an image regardless of its size, position, or orientation, even against a complicated background. In computer vision, however, image recognition with such invariance is hard to achieve. Spatial Pyramid Matching (SPM) performs excellently in computer vision applications, but it still has difficulty when the position of the object in the image changes. In recent years, researchers have sought robust representations, such as translation-invariant, rotation-invariant, and scale-invariant features. However, existing works each address only one of these three invariances; a robust representation that handles all three simultaneously is lacking. In this work, we aim to develop a robust feature that is translation, rotation, and scale invariant simultaneously. To this end, we propose a novel method named Block Based Integral Image, which searches for the densest region of features while constraining the region size to be similar to a predefined size, and from that region finds the approximate center of the object in the image. We then apply SPR, replacing the image center with the approximate object center, to handle translation and rotation invariance. After that, we use histogram equalization to adjust the captured representation for scale invariance. The adjusted representation is robust to translation, rotation, and scale changes simultaneously. Finally, we verify our system on an image classification task using different datasets. Experimental results show that our system can indeed handle translation, rotation, and scale invariance simultaneously and achieves higher accuracy than previous methods.
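The densest-region search described in the abstract can be illustrated with a generic summed-area table (integral image). The abstract does not give the details of Block Based Integral Image, so the following is only a minimal sketch under stated assumptions: the image is assumed to be divided into blocks, `grid[r][c]` is assumed to hold the number of local features in block (r, c), and the window size is fixed rather than merely constrained to be close to a predefined size. The function names `integral_image`, `densest_window`, and `approximate_center` are hypothetical and do not come from the thesis.

```python
def integral_image(grid):
    """Build a summed-area table with one extra zero row and column.

    sat[r][c] holds the sum of grid[0..r-1][0..c-1].
    """
    rows, cols = len(grid), len(grid[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            sat[r + 1][c + 1] = (
                grid[r][c] + sat[r][c + 1] + sat[r + 1][c] - sat[r][c]
            )
    return sat


def densest_window(grid, win_h, win_w):
    """Return (count, top, left) for the win_h x win_w window with most features.

    Assumption: a fixed window size stands in for the "predefined region size".
    Ties are resolved in favor of the first window found in scan order.
    """
    sat = integral_image(grid)
    rows, cols = len(grid), len(grid[0])
    best = None
    for top in range(rows - win_h + 1):
        for left in range(cols - win_w + 1):
            bottom, right = top + win_h, left + win_w
            # Four table lookups give the window sum in constant time.
            count = (
                sat[bottom][right] - sat[top][right]
                - sat[bottom][left] + sat[top][left]
            )
            if best is None or count > best[0]:
                best = (count, top, left)
    return best


def approximate_center(grid, win_h, win_w):
    """Return the (row, col) center of the densest window, in block units."""
    _, top, left = densest_window(grid, win_h, win_w)
    return (top + (win_h - 1) / 2, left + (win_w - 1) / 2)


# Hypothetical per-block feature counts for a 5 x 5 block grid.
feature_counts = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 3, 4, 0],
    [0, 1, 5, 2, 0],
    [0, 0, 0, 0, 1],
]
print(densest_window(feature_counts, 2, 2))
print(approximate_center(feature_counts, 2, 2))
```

In this sketch, each window sum costs four lookups regardless of window size, which is the usual reason for using an integral image in an exhaustive region search. The returned center is in block coordinates and would need to be scaled by the block size to obtain pixel coordinates.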
