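Object detection has become a widely used technology today: it extracts the parts of interest from an image and identifies them. Achieving this requires preparing a large amount of data and annotations, where the annotations include the category and location region of each object in the image. However, object detection carries a very high annotation cost: the object positions and categories must be labeled in a large number of images, which takes a great deal of time and manpower. Object detection has therefore given rise to a learning setting in which object location information is not used for training and only the object categories in each image are used, known as weakly supervised object detection. Weakly supervised object detection addresses the annotation cost problem but sacrifices overall detection accuracy. This thesis observes that many existing datasets, such as PASCAL VOC, COCO, and ILSVRC, provide a large number of object categories with corresponding location information. We use these existing data as source data to train a class-agnostic object detection technique, and then transfer it to training on target data that has no location information, thereby improving the accuracy of weakly supervised object detection.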
Object detection is now a widely used technology: it localizes the regions of interest in an image and identifies their categories. Training a detector requires a large amount of data and labels, where each label gives an object's category and its location in the image. Labeling is therefore expensive, since annotating the position and category of every object in a large image collection takes considerable time and manpower. This cost has motivated weakly supervised object detection, in which object location information is not used for training and only the object categories present in each image are available. Weakly supervised object detection reduces the labeling cost but sacrifices overall detection accuracy. This thesis presents a method that improves weakly supervised detection performance by drawing on supervision from existing data. Many publicly available datasets, such as PASCAL VOC, COCO, and ILSVRC, provide a large number of object categories with corresponding location information. Using these existing datasets as source data, we design a category-independent framework that learns an objectness (foreground) predictor for all region proposals, and then transfer its results to weakly supervised training on target data that has no location information, thereby improving weakly supervised detection accuracy.
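To make the transfer idea concrete, the following is a minimal sketch of one way an objectness score could enter weakly supervised training. It is not the implementation used in this thesis. It assumes a two-stream multiple-instance aggregation in the style of WSDDN, with the class-agnostic objectness score of each region proposal added as a log-prior to the detection stream. The function names (`image_scores`, `image_level_loss`), the inputs, and the way the prior is injected are all hypothetical choices made for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def image_scores(cls_logits, det_logits, objectness, eps=1e-6):
    """Aggregate per-proposal logits into image-level class scores.

    cls_logits, det_logits: one list of per-class logits per proposal.
    objectness: one class-agnostic foreground score in [0, 1] per proposal,
        assumed here to come from a predictor trained on the source data.
    """
    num_props = len(cls_logits)
    num_classes = len(cls_logits[0])
    # Classification stream: softmax over classes for each proposal.
    cls_probs = [softmax(row) for row in cls_logits]
    scores = []
    for c in range(num_classes):
        # Detection stream: softmax over proposals for each class, biased by
        # the log objectness so background proposals receive less weight.
        biased = [det_logits[r][c] + math.log(objectness[r] + eps)
                  for r in range(num_props)]
        det_probs = softmax(biased)
        scores.append(sum(cls_probs[r][c] * det_probs[r]
                          for r in range(num_props)))
    return scores


def image_level_loss(scores, labels, eps=1e-6):
    """Binary cross-entropy against image-level labels (no box labels)."""
    loss = 0.0
    for score, label in zip(scores, labels):
        score = min(max(score, eps), 1.0 - eps)
        loss -= label * math.log(score) + (1 - label) * math.log(1.0 - score)
    return loss


# Three proposals, two classes. Proposals 0 and 2 look like class 0,
# proposal 1 looks like class 1.
cls_logits = [[4.0, 0.0], [0.0, 4.0], [4.0, 0.0]]
det_logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
uniform = image_scores(cls_logits, det_logits, [0.9, 0.9, 0.9])
# Marking proposal 2 as background reduces its contribution to class 0.
suppressed = image_scores(cls_logits, det_logits, [0.9, 0.9, 0.01])
loss = image_level_loss(suppressed, [1, 1])
```

In this sketch only image-level labels enter the loss. The location information from the source datasets influences training solely through the objectness scores, which matches the setting described above in which the target data has no location annotations.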