Rotation, Translation, and Scale Invariant Bag of Feature based on Feature Density

Humans can easily recognize an object in an image regardless of its size, position, or orientation, even against a complicated background. In computer vision, however, recognition with this kind of invariance is hard to achieve. Spatial Pyramid Matching (SPM) performs very well in many computer vision applications, but it still has difficulty when the position of the object changes within the image. In recent years, researchers have sought more robust representations, such as translation-invariant, rotation-invariant, and scale-invariant features. Existing work, however, addresses only one of these three invariances at a time; a robust representation that handles all three simultaneously is still lacking. In this work, we aim to develop a robust feature that is translation, rotation, and scale invariant at the same time. To this end, we propose a novel method named Block Based Integral Image, which searches for the densest region of features while constraining the region size to be close to a predefined size, and from that region estimates the approximate center of the object in the image. We then apply SPR with the image center replaced by the approximate object center to handle translation and rotation invariance. Next, we use histogram equalization to adjust the captured representation for scale invariance. The adjusted representation is robust to translation, rotation, and scale changes simultaneously. Finally, we evaluate our system on image classification tasks over several datasets. Experimental results show that the system handles translation, rotation, and scale changes simultaneously and achieves higher accuracy than previous methods.
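The densest-region search described above can be illustrated with a small sketch. This is not the thesis implementation: it assumes keypoints are given as (x, y) pixel coordinates inside the image, counts them on a coarse block grid, and uses a summed-area table (integral image) to find the fixed-size window containing the most keypoints. The function name `densest_window`, the `block` size, and the fixed window shape are hypothetical choices made for illustration; the region-size constraint in the actual method may differ.

```python
import numpy as np

def densest_window(points, image_shape, win_shape, block=8):
    """Return (cx, cy, count) for the block-aligned window holding the most keypoints.

    points      : iterable of (x, y) pixel coordinates, assumed inside the image
    image_shape : (height, width) in pixels
    win_shape   : (height, width) of the search window in pixels (assumed fixed)
    block       : side length of one grid block in pixels (hypothetical parameter)
    """
    h, w = image_shape
    bh, bw = -(-h // block), -(-w // block)  # ceiling division: grid size in blocks

    # Count keypoints per block.
    counts = np.zeros((bh, bw), dtype=np.int64)
    for x, y in points:
        counts[int(y) // block, int(x) // block] += 1

    # Integral image with a zero first row and column.
    ii = np.zeros((bh + 1, bw + 1), dtype=np.int64)
    ii[1:, 1:] = counts.cumsum(axis=0).cumsum(axis=1)

    # Window size in blocks, clamped to the grid.
    wh = max(1, min(bh, win_shape[0] // block))
    ww = max(1, min(bw, win_shape[1] // block))

    # Keypoint count of every window position, from four lookups per position.
    sums = ii[wh:, ww:] - ii[:-wh, ww:] - ii[wh:, :-ww] + ii[:-wh, :-ww]
    r, c = np.unravel_index(np.argmax(sums), sums.shape)

    # The center of the best window approximates the object center.
    cx = (c + ww / 2) * block
    cy = (r + wh / 2) * block
    return cx, cy, int(sums[r, c])

# Example: a small cluster of keypoints plus one outlier in a 64x64 image.
pts = [(41, 17), (50, 18), (42, 28), (52, 30), (45, 20), (2, 2)]
cx, cy, n = densest_window(pts, (64, 64), (16, 16))
```

Because each window count comes from four table lookups, the cost of the search does not grow with the window size, which is the usual reason for using an integral image in this kind of scan.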

Keywords

Feature Density; Rotation, Translation, Scale Invariant; Image Representation

