  • Degree thesis

Image Retrieval under Large Intra-class Variance by Deep Metric Learning

Advisor: 易志偉

Abstract


Deep metric learning constructs a distance function between data points by learning embedding feature vectors, so that samples of the same class are pulled closer together and samples of different classes are pushed farther apart. It can be used in applications such as image retrieval. On datasets with significant intra-class variance, however, rigidly pulling together images that differ substantially may reduce accuracy. In many applications, handling intra-class variance properly is often the key to superior performance. In this research, we take image query as the target application and combine a multi-center network design with a loss function that evaluates intra-class variance. This improves metric learning under large intra-class variance, increases image retrieval accuracy, and addresses the problem of objects being seen from multiple viewing angles. We apply the approach to an intelligent checkout system and to vehicle re-identification; product identification accuracy at the intelligent checkout system reaches 97.7%. Finally, the multi-center metric learning architecture can be applied to existing image classification architectures to strengthen their handling of intra-class variance and improve identification accuracy.
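
The abstract does not spell out the network or the loss function, so the following is only a minimal sketch of the general multi-center idea, not the thesis's method. It assumes that each class keeps several centers in embedding space (for example, one per viewing angle), that the similarity of an embedding to a class is its cosine similarity to the closest center of that class, and that training uses softmax cross-entropy over these per-class similarities. All names (`class_similarity`, `multi_center_loss`, `retrieve`), the `scale` value, and the toy data are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def class_similarity(embedding, centers):
    """Assumption: a class is represented by its closest center."""
    return max(cosine(embedding, c) for c in centers)


def multi_center_loss(embedding, label, class_centers, scale=10.0):
    """Softmax cross-entropy over per-class (max over centers) similarities."""
    logits = {
        name: scale * class_similarity(embedding, centers)
        for name, centers in class_centers.items()
    }
    # Numerically stable log-sum-exp.
    m = max(logits.values())
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits.values()))
    return log_sum - logits[label]


def retrieve(query, gallery):
    """Return gallery labels ranked by cosine similarity to the query."""
    ranked = sorted(gallery, key=lambda item: cosine(query, item[1]), reverse=True)
    return [label for label, _ in ranked]


# Hypothetical toy data: a product whose front and back look very different.
multi = {
    "cereal_box": [[1.0, 0.0], [-1.0, 0.0]],  # two centers: front and back view
    "soda_can": [[0.0, 1.0]],
}
single = {
    "cereal_box": [[1.0, 0.0]],  # one center: front view only
    "soda_can": [[0.0, 1.0]],
}
back_view = [-0.9, 0.1]  # embedding of a back-view photo of the cereal box

loss_multi = multi_center_loss(back_view, "cereal_box", multi)
loss_single = multi_center_loss(back_view, "cereal_box", single)

gallery = [
    ("cereal_box", [0.95, 0.05]),  # front view
    ("soda_can", [0.1, 1.0]),
    ("cereal_box", [-1.0, 0.05]),  # back view
]
ranking = retrieve(back_view, gallery)
```

In this toy setting the back-view embedding incurs a much smaller loss with two centers than with one, because it is no longer forced toward the front-view center. The retrieval step simply ranks gallery images by similarity in the learned embedding space.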

