
利用單域類神經網路與Reptile元學習之深度視覺追蹤

Deep Visual Tracking using Single Domain Neural Network with Reptile Meta-Learning

Advisor: 蔡奇謚

Abstract


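The main purpose of visual tracking is to locate a specific target object, in the form of a bounding box, across a continuous image sequence. Although visual tracking has long played an important role in computer vision, it remains an extremely challenging problem, because it requires locating a specific object rather than a broader object category. This poses a unique challenge for deep-learning-based visual tracking algorithms that require online learning: although deep learning is known for its strong recognition capability, a deep learning algorithm overfits very easily when training data is scarce, and overall performance becomes very poor. This thesis builds on an existing deep learning tracking algorithm and adds a meta-learning algorithm, so that during the initialization of online tracking only a few updates and a small amount of training data are needed to perform well. Experimental results show that the algorithm achieves an average success rate of 66.4% on the OTB2015 dataset.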

Parallel Abstract


The goal of visual tracking is to locate a specific object, in the form of a bounding box, throughout a video or image sequence. Although visual tracking has been one of the main topics in computer vision for decades, it remains very challenging. It requires algorithms to recognize and locate objects at the instance level rather than the category level, which poses unique challenges for deep-learning-based trackers that require online learning during tracking. Although deep learning models can provide strong and robust feature representations, they overfit easily when given only a very small set of training data, which degrades overall tracking performance. To address this issue, the proposed algorithm adopts a first-order meta-learning technique so that, during initialization, the visual tracker needs only a few training examples and a few optimization steps to perform well. Experimental results show that it achieves a mean success rate of up to 66.4% on the OTB2015 dataset.
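
As a rough illustration of the first-order meta-learning idea named in the title (Reptile), the sketch below learns an initialization that adapts to a new task in a few gradient steps. It is not the thesis's tracker: the task family (toy 1-D line fitting), the function names `sgd_steps`, `sample_task`, and `reptile`, and all hyperparameters are assumptions chosen only to keep the example self-contained.

```python
import random

def sgd_steps(w, b, data, lr, steps):
    """Run a few gradient steps of least-squares fitting on (x, y) pairs."""
    n = len(data)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in data:
            err = (w * x + b) - y
            gw += 2.0 * err * x / n
            gb += 2.0 * err / n
        w -= lr * gw
        b -= lr * gb
    return w, b

def sample_task(rng):
    """Hypothetical task family: lines y = a*x + c with a near 3 and c near -1."""
    a = 3.0 + rng.uniform(-0.5, 0.5)
    c = -1.0 + rng.uniform(-0.5, 0.5)
    def draw(n):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        return [(x, a * x + c) for x in xs]
    return draw

def reptile(meta_iters=2000, inner_steps=5, inner_lr=0.1, meta_lr=0.1, seed=0):
    """Learn an initialization (w, b) that adapts quickly to tasks from sample_task."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(meta_iters):
        draw = sample_task(rng)
        # Inner loop: adapt a copy of the current initialization to one task.
        w_task, b_task = sgd_steps(w, b, draw(10), inner_lr, inner_steps)
        # Reptile meta-update: move the initialization toward the adapted
        # weights. No second-order gradients are needed.
        w += meta_lr * (w_task - w)
        b += meta_lr * (b_task - b)
    return w, b

if __name__ == "__main__":
    w0, b0 = reptile()
    print("meta-learned initialization:", w0, b0)
```

In this toy setting the learned initialization settles near the center of the task family, so a handful of inner steps on a few samples is enough to fit a new line. The abstract describes the same motivation for tracker initialization, with the network weights playing the role of `(w, b)`.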

