

Application of Deep Learning Technology to Construct Efficient Behavior Identification Models


Recognizing human behavior in images is a challenging and important task with a wide range of applications, such as abnormal-event detection in automatic surveillance systems, sports analysis, and video classification. This study optimizes and modifies the 3D ResNet-18 model and proposes a simpler modular architecture with fewer hyperparameters to tune. Experimental results on the KTH and UCF-101 datasets show that the proposed algorithm achieves Top-1 accuracies of 96.3% and 60.01%, respectively. Compared with the original 3D ResNet-18 model, the improved model extracts more effective features and recognizes human behavior more accurately.
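
The Top-1 accuracy figures reported above count a sample as correct only when the class with the highest score matches the ground-truth label. The following is a minimal sketch of that metric in plain Python. The function name `top1_accuracy` and the toy scores are illustrative assumptions and are not taken from the study's code or data.

```python
def top1_accuracy(scores, labels):
    """Return the fraction of samples whose highest-scoring class equals the label.

    scores: list of per-class score lists, one list per sample (hypothetical model output)
    labels: list of integer class indices, one per sample
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score; ties resolve to the lowest index.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy data for illustration only (three classes, four samples).
toy_scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.6, 0.3, 0.1],  # predicted class 0
    [0.2, 0.3, 0.5],  # predicted class 2
    [0.4, 0.5, 0.1],  # predicted class 1
]
toy_labels = [1, 0, 2, 0]
toy_accuracy = top1_accuracy(toy_scores, toy_labels)
```

In this toy case three of the four predictions match their labels. In a video setting the scores would typically come from a model evaluated per clip, which is an assumption here rather than a detail stated in the abstract.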


