
Real-time People Detection and Tracking for Large Indoor Surveillance Using Multiple Top-view Depth Cameras

Advisor: Li-Chen Fu (傅立成)

Abstract


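In this thesis, we propose novel theory, hardware design, and algorithms for a large indoor surveillance system. Multiple top-view depth cameras are mounted on the ceiling, which largely resolves the occlusion, drastic illumination change, similar appearance, and severe deformation problems of traditional surveillance systems. We first use image stitching to integrate the information from the individual cameras together with their relative positions, which makes the system applicable to large indoor environments. We also propose a new background subtraction framework that corrects object distortion in the camera images and extracts the foreground objects in the indoor scene. Next, we propose a new people detection and tracking framework. People are detected in real time using graph-based segmentation, a head hemisphere model, and a human-shape map; in addition, a new image feature based on 3D shape similarity is used to track people with a sampling importance resampling (SIR) particle filter. This framework has three advantages: 1) the detector selects the correct human shapes in real time through a series of effective filtering stages; 2) the tracker and the detector compensate for each other's weaknesses, which makes the system both robust and real-time; 3) the tracker withstands severe deformation and height changes. Finally, the experimental results demonstrate the real-time efficiency and robustness of the system. We compare it with state-of-the-art algorithms for both people detection and visual tracking, and validate the effectiveness of the whole system in terms of detection accuracy and precision, tracking error, and target coverage rate.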

Parallel Abstract


In this thesis, we propose an indoor surveillance system that installs multiple vertical top-view depth cameras to detect and track people. The system leads to a novel framework that addresses traditional surveillance challenges such as severe occlusion, similar appearance, illumination change, and deformation. In the first part of the thesis, we analyze the geometric relation between cameras using image stitching and show that the system is well suited to large indoor surveillance scenes. We also propose a new background subtraction approach that corrects camera distortion and extracts the foreground objects in a cluttered environment. Second, we propose a new detection and tracking scheme. Several processes, including graph-based segmentation, a head hemisphere model, and a geodesic distance map, are cascaded to detect human shapes. Moreover, a new shape feature based on 3D diffusion distance is used to track people with an SIR particle filter. Our framework has three advantages: 1) the detector recognizes human shapes through a cascade of strong detectors; 2) the detector and the tracker compensate for each other's weaknesses to achieve robustness; 3) the tracker tolerates severe distortion and appearance change. Finally, experimental results demonstrate the real-time performance and robustness of the surveillance system. We compare our detection and tracking algorithms with state-of-the-art methods; the precision rate, location error, and occlusion rate with respect to ground truth show that our algorithms outperform the other methods. In summary, this thesis presents several novel solutions for detecting and tracking people and for making efficient use of high-dimensional depth data.
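
To make the tracking step more concrete, the following is a minimal sketch of one generic SIR (sampling importance resampling) particle filter applied to a 2D head position in the top-view image plane. It is not the thesis's implementation: the thesis weights particles with a shape feature based on 3D diffusion distance, whereas this sketch substitutes a plain Gaussian likelihood around a detected position. The function names (`sir_step`, `estimate`, `track`) and the parameters `motion_std` and `obs_std` are hypothetical and chosen only for illustration.

```python
import math
import random


def sir_step(particles, observation, rng, motion_std=0.5, obs_std=1.0):
    """One predict / weight / resample cycle of a SIR particle filter in 2D."""
    # Predict: diffuse each particle with a random-walk motion model (assumed).
    predicted = [
        (x + rng.gauss(0.0, motion_std), y + rng.gauss(0.0, motion_std))
        for x, y in particles
    ]
    # Weight: Gaussian likelihood of the observed position given each particle.
    # (Stand-in for the thesis's diffusion-distance shape similarity.)
    ox, oy = observation
    weights = [
        math.exp(-((x - ox) ** 2 + (y - oy) ** 2) / (2.0 * obs_std ** 2))
        for x, y in predicted
    ]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw particles with probability proportional to their weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Posterior mean of the (equally weighted) resampled particles."""
    n = len(particles)
    return (sum(x for x, _ in particles) / n, sum(y for _, y in particles) / n)


def track(observations, n_particles=500, seed=0):
    """Run the filter over a sequence of observed (x, y) positions."""
    rng = random.Random(seed)
    x0, y0 = observations[0]
    # Initialise particles around the first detection.
    particles = [
        (x0 + rng.gauss(0.0, 1.0), y0 + rng.gauss(0.0, 1.0))
        for _ in range(n_particles)
    ]
    path = []
    for obs in observations:
        particles = sir_step(particles, obs, rng)
        path.append(estimate(particles))
    return path
```

A call such as `track([(0.2 * t, 1.0) for t in range(20)])` returns one estimated position per frame. In a detector-plus-tracker setup like the one described above, the observation would come from the detector when it fires, and the likelihood would be replaced by the shape-based similarity score.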

