Anomaly Detection in Surveillance Videos Using 3D Convolutional Deep Features

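In recent years, applying deep learning to computer vision has attracted considerable attention. However, compared with single-image problems such as object classification, applying unsupervised deep learning to anomaly detection in surveillance videos remains a difficult challenge, because abnormal events are unknown in advance and real-world scenes vary widely. This thesis proposes combining a 3D convolutional neural network (3D Convolutional Neural Network, C3D) with a one-class convolutional neural network (One-Class Convolutional Neural Network, OC-CNN) to perform anomaly detection for video surveillance systems. We train C3D on a public event dataset so that the regional features it learns are compact, which benefits the subsequent classifier training. At the same time, to keep the features from becoming so concentrated that they lose discriminability, we use UCF, a public dataset of human actions, to assist the training of the C3D network and improve the discriminative power of its features. Finally, we feed the C3D features of normal events into independent per-region classifiers, supplemented with pseudo-anomaly data generated from Gaussian noise, training each classifier independently and performing anomaly detection in each region. With the proposed neural network model, features with spatial and temporal characteristics can be learned from a training set of normal events; these hidden features capture regional characteristics and are applied to detecting anomalous regions in surveillance videos, and the OC-CNN classifiers then detect abnormal events never seen before. On widely used public datasets, the proposed method also performs well compared with deep learning techniques commonly used in the past.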
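The two training goals described above, compact features on normal events and discriminative features on an auxiliary labelled dataset, can be illustrated with a small joint objective. This is only a sketch under stated assumptions: the abstract does not give the loss formulas, so the batch-variance form of the compactness term, the cross-entropy form of the descriptiveness term, and the names `compactness_loss`, `descriptiveness_loss`, `joint_loss`, and `weight` are all hypothetical. Plain Python lists stand in for C3D feature vectors and classifier logits.

```python
import math


def compactness_loss(features):
    """Mean squared distance of each feature vector to the batch mean.

    Assumed form: smaller values mean the normal-event features cluster tightly.
    """
    n, dim = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(dim)]
    return sum(
        sum((f[j] - mean[j]) ** 2 for j in range(dim)) for f in features
    ) / n


def descriptiveness_loss(logits, labels):
    """Softmax cross-entropy averaged over a labelled auxiliary batch.

    Assumed form: keeps features discriminative on the auxiliary action classes.
    """
    total = 0.0
    for row, label in zip(logits, labels):
        top = max(row)  # subtract the max for numerical stability
        log_norm = top + math.log(sum(math.exp(v - top) for v in row))
        total += log_norm - row[label]
    return total / len(labels)


def joint_loss(normal_features, aux_logits, aux_labels, weight=0.1):
    """Weighted sum of both terms; `weight` is a hypothetical trade-off factor."""
    return descriptiveness_loss(aux_logits, aux_labels) + weight * compactness_loss(
        normal_features
    )
```

Minimizing only the compactness term would let every feature collapse to a single point, which is the loss of discriminability the abstract warns about; the second term counteracts that collapse.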

Keywords

anomaly detection; deep learning; 3D convolutional neural network; one-class classifier; unsupervised learning

Parallel Abstract

In recent years, the application of deep learning to computer vision has attracted considerable attention. However, compared with single-image object classification, applying unsupervised deep learning to surveillance videos remains challenging because real-world scenes vary and anomalies are unforeseen. In this thesis, we propose combining a 3D convolutional neural network (C3D) with a One-Class Convolutional Neural Network (OC-CNN) to perform anomaly detection in surveillance videos. We train the system on two different datasets so that its features are compact within a class (“intra-class”) yet separated between classes (“inter-class”). The former property is learned from a public training dataset of normal events, and the latter from UCF, a public dataset of human behavior. The C3D network is adopted as the baseline architecture for feature extraction, so that it learns regional features that are both compact and highly descriptive. To classify events as normal or abnormal, a classifier is trained independently for each region: normal features extracted from the C3D network serve as negative training samples, and pseudo-abnormal data generated from Gaussian noise serve as positive training samples. Each regional classifier then performs anomaly detection in its own region. With the proposed network, spatio-temporal hidden features can be learned from a training set containing only normal events. Furthermore, these hidden features capture regional characteristics and are applied to regional anomaly detection in surveillance videos by the OC-CNN classifiers.
The experiments show that the proposed method performs well on two widely used public datasets in comparison with deep learning techniques commonly used for video anomaly detection.
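The per-region training step, with normal features as negatives and Gaussian-noise pseudo-anomalies as positives, can be sketched as follows. This is a minimal illustration and not the thesis implementation: a logistic-regression classifier stands in for the OC-CNN classifier, synthetic vectors stand in for C3D features, and the names `train_region_classifier` and `anomaly_score` as well as the values of `noise_std`, `epochs`, and `lr` are assumptions chosen for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_region_classifier(normal_features, noise_std=0.1, epochs=200, lr=0.5, seed=0):
    """Fit one region's classifier: label 0 = normal, label 1 = pseudo-anomaly."""
    rng = random.Random(seed)
    dim = len(normal_features[0])
    # Pseudo-anomalies: zero-mean Gaussian noise drawn in feature space (assumed).
    pseudo = [[rng.gauss(0.0, noise_std) for _ in range(dim)] for _ in normal_features]
    samples = [(x, 0.0) for x in normal_features] + [(x, 1.0) for x in pseudo]
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in samples:
            # Gradient of the logistic loss with respect to the logit.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_b += err
            for j in range(dim):
                grad_w[j] += err * x[j]
        # Full-batch gradient descent step.
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b


def anomaly_score(x, w, b):
    """Estimated probability that feature vector x is anomalous."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

In a full system one such classifier would be trained for each spatial region, and a region would be flagged when its score exceeds a chosen threshold. Because no real abnormal samples are used, the classifier only learns to separate the normal features from the noise distribution, so how well it generalizes to real anomalies depends on how compact the normal features are.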

Parallel Keywords

anomaly detection; deep learning; 3D convolutional neural network; one-class classifier; unsupervised learning
