Hands Tracking with Self-occlusion Handling in Cluttered Environment

Advisor: Li-Chen Fu (傅立成)

Abstract


This thesis presents a method for tracking both hands with a single monocular camera, intended for human-machine interaction (HMI). To distinguish the user's face from his or her hands, the face is tracked as well. When the targets are farther apart than a distance threshold, each is tracked independently. When they are close enough to interfere with one another, their state vectors are merged into a higher-dimensional joint state and evaluated with dependent likelihood measurements. In the independent case, each tracker masks out the regions of the other trackers' most recent results, which reduces skin-color interference from the other targets. Hypotheses generated by the particle filter are verified with multiple cues: a combination of the locally discriminative color-weighted image and the back-projection image of the reference color model, the motion history image, and the gradient orientation feature. When the targets approach or overlap one another, a multiple importance sampling (MIS) particle filter generates hypotheses for the merged targets from skin blob reasoning and also estimates their depth order. These joint hypotheses are evaluated with further visual cues: the occluded face template, the gradient orientation of the hand shape, motion continuity, and the line equation of the forearm. Experimental results demonstrate the real-time efficiency and robustness of the system. We compare its tracking accuracy with that of the recently released OpenNI tracker, which relies on Kinect depth images, and its correctness rate with that of a state-of-the-art human pose estimation method.
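The independent-tracking case described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`predict`, `masked`, `fuse`, `step`, `should_merge`), the random-walk motion model, the circular mask, and the fusion of cues by multiplying their likelihoods are all assumptions made for illustration. The cue functions stand in for the color, motion, and gradient cues, and the MIS sampling and depth-order estimation used in the merged case are not covered.

```python
import math
import random


def predict(particles, sigma, rng):
    """Assumed random-walk motion model: jitter each (x, y) hypothesis."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in particles]


def masked(cue, other_targets, radius):
    """Wrap a cue so it returns 0 inside a circle around any other target.

    The circular region is a stand-in for the mask over the other
    trackers' most recent results.
    """
    def masked_cue(p):
        for ox, oy in other_targets:
            if math.hypot(p[0] - ox, p[1] - oy) < radius:
                return 0.0
        return cue(p)
    return masked_cue


def fuse(p, cues):
    """Combine cue likelihoods by product (assumes independent cues)."""
    w = 1.0
    for cue in cues:
        w *= cue(p)
    return w


def step(particles, cues, sigma, rng):
    """One predict / weight / estimate / resample cycle of a bootstrap filter."""
    moved = predict(particles, sigma, rng)
    weights = [fuse(p, cues) for p in moved]
    total = sum(weights)
    if total <= 0.0:
        # Every hypothesis was rejected: fall back to uniform weights.
        weights = [1.0] * len(moved)
        total = float(len(moved))
    estimate = (sum(w * p[0] for w, p in zip(weights, moved)) / total,
                sum(w * p[1] for w, p in zip(weights, moved)) / total)
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return resampled, estimate


def should_merge(a, b, threshold):
    """Switch to joint tracking when two targets are closer than threshold."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) < threshold
```

In use, each tracker would call `step` once per frame with its own cue list, wrapping each cue in `masked` with the latest estimates of the other trackers, and would hand over to a joint tracker whenever `should_merge` returns true for any pair of targets.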
