Title

用於監視攝影畫面的物體偵測演算法

Translated Titles

An Object Detection Algorithm for Video Surveillance

DOI

10.6342/NTU201801479

Authors

林柏維

Key Words

電腦視覺 ; 機器學習 ; 影像處理 ; 物體偵測 ; Computer Vision ; Machine Learning ; Image Processing ; Object Detection

PublicationName

Degree thesis, Graduate Institute of Civil Engineering, National Taiwan University

Volume or Term/Year and Month of Publication

2019

Academic Degree Category

Master's

Advisor

陳柏華

Content Language

English

Chinese Abstract

近年來物體偵測相關的演算法主要以準確度為主要的研究方向,著重在能夠更準確的偵測物體的位置,並且增加物體偵測的種類,但是準確度提高的同時往往也需要更強的運算效能,對於硬體設備的要求也相對提高,難以大量應用。另外,需要使用到物體偵測的影像中,有一大部分為固定鏡頭的監視影像,影格與影格之間僅有部分區域產生變化,不需要針對完整的畫面重新進行偵測。本研究之目的為提升針對固定鏡頭之物體偵測效率,並且分析不同準確度要求下,對於偵測效率的影響。偵測效率的提升分為兩個部分,一是減少非必要之感興趣區域 (Region of Interest, RoI) 數量,二是分析不同特徵描述子對於分類器運算時間以及準確度的影響。

對於減少 RoI 數量的部分,本研究首先採用高斯混合模型 (Gaussian Mixture Model, GMM) 進行前後景分離來移除影像中不變的區域,對於分離出來的黑白遮罩圖進行線性模糊移除噪音點,再利用角點偵測演算法 (Features from Accelerated Segment Test, FAST) 偵測不同區塊前景的邊緣,最後由鄰近的特徵點合併成 RoI。相較於傳統的 sliding window 產生出數十萬個 RoI,以及 Selective Search 產生出約 2000 個 RoI,本研究結果能將 RoI 減少為數十個的情況下依然能保有需要偵測的物體視窗,並且產生出來的視窗並沒有大小以及長寬比的限制。

在不同特徵描述子對於分類器運算時間以及準確度影響的部分,本研究採用 AdaBoost 作為分類器,測試了影像梯度、LUV 色彩空間以及不同切角數量之 HOG 特徵組合,並且用三個影像資料集作為測試樣本,在精確度由 0.986 下降為 0.9505 的情況下,運算時間下降為原先的 17%,並且提出不同精確度對應之運算時間的圖表。

在整體的偵測流程上,我們首先提供了不同階段在運算時間所佔的比例。在純粹使用 CPU 運算的情況下,若使用 OpenCL 進行加速可以達到平均 160 FPS,在沒有 OpenCL 的情況下也可以達到 60 FPS 的運算效率。本研究提出之結果可做為往後物體偵測研究中,需要滿足不同精確度以及不同運算時間之參考。

English Abstract

In recent years, object detection algorithms have mainly focused on improving detection accuracy: localizing objects more precisely and increasing the number of object classes that can be detected. However, higher accuracy usually comes at the cost of more complicated models that require more computing power; some of them can only run on high-end hardware, so they are not always practical for real-world deployment. Moreover, a large share of the video that requires object detection comes from cameras mounted at a fixed angle, where most pixels stay unchanged between frames, so there is no need to re-detect the entire image. The purpose of this study is to improve object detection efficiency for fixed-angle cameras and to analyze the trade-off between detection efficiency and accuracy. The detection efficiency is improved in two ways. One is reducing the number of unnecessary Regions of Interest (RoIs) passed to the image classifier. The other is reducing the classification time by analyzing how different image feature descriptors affect classification time and accuracy.

For reducing unnecessary RoIs, this research first applies a Gaussian Mixture Model (GMM) to subtract the background, then applies a linear resize to the foreground mask to suppress noise points. We then use Features from Accelerated Segment Test (FAST) to detect the edges of each foreground region as feature points, and finally merge nearby feature points into RoIs. Compared with the traditional sliding-window method, which can generate hundreds of thousands of RoIs, and Selective Search, which generates about 2,000 RoIs, our method generates fewer than a hundred RoIs while still retaining the windows of the objects to be detected. In addition, the window size and aspect ratio are not constrained in our approach.
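The last step of the pipeline above, turning scattered foreground feature points into a handful of RoIs, can be sketched as a simple proximity grouping. This is a minimal illustration, not the thesis implementation; the function name and the `max_gap` threshold are assumptions chosen for the example.

```python
def merge_points_into_rois(points, max_gap=20):
    """Greedily group 2-D points whose coordinate-wise distance to some
    member of a group is within max_gap pixels, then return one bounding
    box (x, y, w, h) per group."""
    groups = []
    for (px, py) in points:
        placed = False
        for g in groups:
            # join the first existing group containing a close-enough point
            if any(abs(px - qx) <= max_gap and abs(py - qy) <= max_gap
                   for (qx, qy) in g):
                g.append((px, py))
                placed = True
                break
        if not placed:
            groups.append([(px, py)])
    rois = []
    for g in groups:
        xs = [p[0] for p in g]
        ys = [p[1] for p in g]
        rois.append((min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)))
    return rois

# Two clusters of corner points collapse into two RoIs.
corners = [(10, 12), (14, 18), (220, 40), (225, 44), (12, 15)]
print(merge_points_into_rois(corners))  # → [(10, 12, 4, 6), (220, 40, 5, 4)]
```

Because each RoI is the tight bounding box of its point group, the resulting windows inherit whatever size and aspect ratio the foreground blob has, which is why no fixed window shape is imposed.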
In this study, AdaBoost was used as the classifier, with combinations of image gradients, the LUV color space, and HOG as the feature descriptors. We used three image datasets as testing data. The results show that classification time can be reduced to 17% of the original while precision only decreases from 98.6% to 95.05%. We also provide a table relating each classification time to its corresponding precision. For the whole detection process, we report the share of computing time taken by each processing stage. The proposed algorithm runs at an average of 160 FPS using only a CPU with OpenCL acceleration, and at 60 FPS without OpenCL. The results of this study can serve as a reference for object detection research that must satisfy a given precision or computing-time constraint.
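The descriptor-side trade-off studied here hinges on the number of HOG orientation bins: fewer bins yield a shorter feature vector and a cheaper classifier, at some cost in precision. The single-cell histogram below is a didactic sketch of that knob, not the thesis code; the function name and cell layout are assumptions.

```python
import math

def hog_cell_histogram(cell, n_bins=9):
    """Histogram of unsigned gradient orientations for one grayscale cell
    (a list of pixel rows), weighted by gradient magnitude. n_bins is the
    number of orientation bins over [0, 180) degrees."""
    hist = [0.0] * n_bins
    h, w = len(cell), len(cell[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # central differences
            gy = cell[y + 1][x] - cell[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned
            hist[int(ang / (180.0 / n_bins)) % n_bins] += mag
    return hist

# A vertical intensity edge: every gradient points at 0 degrees,
# so all the magnitude lands in bin 0.
cell = [[0, 0, 10, 10]] * 4
print(hog_cell_histogram(cell, n_bins=9))
```

Dropping `n_bins` from 9 to, say, 4 shortens every per-cell histogram and hence every weak learner's input, which is the kind of descriptor reduction behind the reported 17% classification time at 95.05% precision.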

Topic Category College of Engineering > Graduate Institute of Civil Engineering
Engineering > Civil and Architectural Engineering