
One-Shot Object Detection with Co-Attention and Co-Excitation

Advisor: 陳煥宗

Abstract


This thesis presents a method for one-shot object detection based on co-attention and co-excitation. In everyday life, humans can detect and recognize objects with high accuracy from the visual information of only a few examples, but for deep learning models, achieving reliable object detection from so few samples remains a very difficult challenge. In this thesis we study the one-shot setting and use co-attention and co-excitation to strengthen the model's learning ability. Methodologically, we adopt Faster R-CNN as the base architecture: for every feature region of the target image, we measure its similarity against the query's features and enhance the feature regions that likely contain objects. Finally, we use the query's features to select the most informative channels, amplifying useful features and discarding useless ones, which makes the similarity judgments more reliable. Our results on one-shot object detection match the current state of the art, and we have open-sourced the experimental code for future research.
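The channel-selection step described above can be illustrated with a minimal NumPy sketch. This is a simplification, not the released code: a plain sigmoid gate stands in for the learned layers of the actual model, and all names here are illustrative.

```python
import numpy as np

def co_excite(target_feat, query_feat):
    """Reweight the target's feature channels with a query-derived gate.

    target_feat: (C, H, W) feature map of the target image
    query_feat:  (C, h, w) feature map of the query patch
    """
    # Squeeze: global-average-pool the query into one descriptor per channel.
    descriptor = query_feat.mean(axis=(1, 2))        # shape (C,)
    # Gate each channel in (0, 1); the real model learns this mapping instead.
    gate = 1.0 / (1.0 + np.exp(-descriptor))
    # Co-excitation: emphasize the channels the query responds to strongly.
    return target_feat * gate[:, None, None]

excited = co_excite(np.random.rand(256, 32, 32), np.random.rand(256, 8, 8))
```

Channels that are strongly activated by the query get a gate close to 1 and pass through the target feature map nearly unchanged, while weakly activated channels are suppressed before similarity is computed.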

Abstract (English)


This thesis aims to tackle the challenging problem of one-shot object detection. Given a query image patch whose class label is not included in the training data, the goal of the task is to detect all instances of the same class in a target image. To this end, we develop a novel co-attention and co-excitation (CoAE) framework that makes contributions in three key technical aspects. First, we propose to use the non-local operation to explore the co-attention embodied in each query-target pair and yield region proposals accounting for the one-shot situation. Second, we formulate a squeeze-and-co-excitation scheme that can adaptively emphasize correlated feature channels to help uncover relevant proposals and eventually the target objects. Third, we design a margin-based ranking loss for implicitly learning a metric to predict the similarity of a region proposal to the underlying query, no matter whether its class label is seen or unseen in training. The resulting model is therefore a two-stage detector that yields a strong baseline on both VOC and MS-COCO under the one-shot setting of detecting objects from both seen and never-seen classes.
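As a rough illustration of the co-attention step mentioned above, the following NumPy sketch computes non-local attention between flattened target and query features and adds the attended query context back to the target. This is an assumed simplification of the idea, not the authors' implementation; the function and variable names are our own.

```python
import numpy as np

def co_attend(target_feat, query_feat):
    """Augment target features with query context via non-local attention.

    target_feat: (N, C) flattened target features (N spatial positions)
    query_feat:  (M, C) flattened query features (M spatial positions)
    """
    affinity = target_feat @ query_feat.T          # (N, M) pairwise scores
    # Softmax over query positions, shifted for numerical stability.
    affinity -= affinity.max(axis=1, keepdims=True)
    weights = np.exp(affinity)
    weights /= weights.sum(axis=1, keepdims=True)
    context = weights @ query_feat                 # (N, C) attended query info
    # Residual connection, as in non-local blocks.
    return target_feat + context

out = co_attend(np.random.rand(100, 64), np.random.rand(16, 64))
```

Each target position thus aggregates the query features most similar to it, which is what lets the region proposal network attend to the one-shot query when generating proposals.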
