
Human Body Geometry Extraction and Motion Analysis Based on Surveillance Camera Images

Advisor: 韓仁毓

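Abstract


Statistics show that the number of surveillance cameras has multiplied over the years, and more than 100 million cameras are now installed and in use worldwide. Recorded footage, however, still has to be reviewed by people before it can serve any monitoring purpose. This study collects indoor imagery from single-image surveillance cameras and uses accurate, fast deep learning models such as YOLOv4 and OpenPose to detect pedestrians quickly. A rigorous object-image relationship is established from photogrammetric principles, and the least squares method is applied to estimate each pedestrian's object-space coordinates together with precision indicators for later evaluation. Finally, a range of walking indicators, such as walking frequency, walking speed, and the pedestrian's geometric information, is extracted to build a feature vector specific to each pedestrian, and the similarity of feature vectors across camera images is analyzed to achieve cross-camera pedestrian tracking. The results demonstrate that deep learning can automatically and accurately detect the target to be tracked, taking 1 to 1.5 seconds per frame to locate pedestrians in the image. The pedestrian geometric information among the extracted walking indicators is combined, in a weighted analysis, with the precision indicators obtained from an error propagation model to improve the reliability of the results. The geometric information obtained in this study has an error within ±1 cm. Image changes within a single camera are also used to extract each pedestrian's walking frequency in the scene: a running pedestrian was successfully identified at 1.91 Hz and a slowly walking pedestrian at 0.92 Hz. These diverse pedestrian features strengthen the reliability of cross-image tracking. Future indoor site management can build on this study to extract and track pedestrian information. Extended applications include police tracing of a suspect's escape route and real-time detection of pedestrian accidents within a site, which increases the value of surveillance imagery for spatial management.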
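The abstract does not say how the walking frequency is recovered from the image sequence. As a minimal sketch only, assume a periodic 1-D signal is available per pedestrian (for example, the vertical image coordinate of an ankle keypoint over consecutive frames) and that the dominant frequency is read from a plain discrete Fourier transform. The function name `dominant_frequency`, the choice of signal, and the frame rate are all illustrative assumptions, not the author's method.

```python
import cmath
import math


def dominant_frequency(signal, fps):
    """Return the strongest non-zero frequency (Hz) in a uniformly sampled signal.

    Hypothetical helper: `signal` is any periodic per-frame measurement
    (assumed here), and `fps` is the camera frame rate in frames per second.
    """
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the constant (DC) offset

    best_bin, best_power = 0, -1.0
    # Scan DFT bins from the first harmonic up to the Nyquist bin.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power

    # Bin k corresponds to k * fps / n Hz.
    return best_bin * fps / n
```

The frequency resolution of this approach is `fps / n`, so a short track yields a coarse estimate; separating values such as the 1.91 Hz and 0.92 Hz reported above would need a track lasting several seconds.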

Parallel Abstract


According to statistics, around 1 billion surveillance cameras have been installed and are in use. However, video content still has to be reviewed by people before it can serve any monitoring purpose. This research collects indoor images from surveillance cameras and uses current accurate and fast deep learning models, such as YOLOv4 and OpenPose, to quickly detect people in the images. A rigorous object-image relationship is constructed from the collinearity condition of photogrammetry, and the least squares method is used to compute the geometric information of each pedestrian. Further information is extracted from the walking person, such as walking frequency and walking speed. Finally, a feature vector unique to each pedestrian is established and applied to track pedestrian trajectories across images. The results show that deep learning models can capture the image information automatically and accurately, and that the spatial information of pedestrians can be computed through the rigorous object-image relationship. Moreover, the pedestrian geometric information is combined with accuracy indicators, obtained through an error propagation model, in a weighted analysis that improves the reliability of the results. The geometric information obtained in this study has an error of ±1 cm. The walking frequency of each pedestrian in the scene is then extracted: a running person was successfully identified at a frequency of 1.91 Hz and a slowly walking person at 0.92 Hz. These diverse pedestrian features enhance the reliability of cross-image tracking.
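The weighted analysis mentioned above is not specified further in the abstract. One common way to combine repeated estimates with known precision is an inverse-variance weighted mean, sketched below under that assumption. The function name `inverse_variance_mean` and the sample numbers are hypothetical; the standard deviations stand in for the accuracy indicators that an error propagation model would supply.

```python
import math


def inverse_variance_mean(values, sigmas):
    """Combine estimates using weights 1 / sigma^2 (an assumed weighting scheme).

    `values` are per-frame estimates of one quantity (e.g. body height in cm);
    `sigmas` are their standard deviations. Returns (weighted mean, its sigma).
    """
    if len(values) != len(sigmas) or not values:
        raise ValueError("values and sigmas must be non-empty and equal in length")
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    # Standard deviation of the weighted mean for independent estimates.
    return mean, math.sqrt(1.0 / total)


# Illustrative numbers only: the more precise estimate dominates the result.
height, sigma = inverse_variance_mean([170.0, 174.0], [1.0, 2.0])
```

With these sample inputs the combined height is about 170.8, closer to the more precise estimate, and the combined standard deviation is smaller than either input.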

References


Bochkovskiy, A., Wang, C. Y., Liao, H. Y. M. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934.
Bureau of Police Research and Development. (2019). Data of police organizations. India. VSK Kaumudl.
Cao, Z., Simon, T., Wei, S. E., Sheikh, Y. (2017). Realtime multi-person 2D pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7291-7299).
Criminisi, A., Reid, I., Zisserman, A. (2000). Single view metrology. International Journal of Computer Vision, 40(2), 123-148.
He, K., Gkioxari, G., Dollár, P., Girshick, R. (2017). Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2961-2969).
