A Study of Computer Vision Techniques for Hand Gesture Tracking and Facial Expression Recognition

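Enabling a computer to understand a user's intentions and thoughts by observing the user's behavior and emotions is a fundamental and important problem. In this dissertation, we study hand gesture tracking, face and facial-feature tracking, and facial expression recognition. In hand gesture tracking, each finger joint has several degrees of freedom of motion, and the large number of joint degrees of freedom makes this a complex, high-dimensional tracking problem. To solve it effectively, we propose "appearance-guided particle filtering", which is based on a Bayesian probability propagation model. Appearance information about the hand is introduced into the dynamic system; guided by this appearance information and by the propagation of dynamic information, the motion state can be estimated accurately. In general, an object is composed of several local components, and these components usually have some geometric-structural relationship. We therefore also develop a method that uses the spatial coherence of an object's local features to improve visual tracking, and we apply it to tracking faces and facial features. By exploiting the correlation and cooperation among local features, we effectively reduce the effect of illumination and partial occlusion on tracking. Beyond face tracking, we have also successfully applied this method to other visual tracking problems. In facial expression recognition, unlike the common holistic or local face representations, we adopt a hybrid representation to describe facial features, so that global changes of the face and subtle local differences can be observed at the same time. Applying supervised manifold learning, we propose a fusion algorithm that integrates the manifolds of these different components and highlights the influence of each component on different expressions. Extensive experiments show that this method can effectively distinguish the various expressions.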

Keywords

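hand gesture tracking; particle filtering; local feature tracking; local component collaboration; facial expression recognition; supervised manifold learning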

Parallel Abstract

This dissertation addresses three topics that are important for understanding human intention: articulated hand tracking, face and facial-component tracking, and facial expression recognition. To capture complex hand motion in image sequences, we propose a model-based approach, called appearance-guided particle filtering, for tracking with many degrees of freedom. In addition to the initial state, our method assumes that some known attractors in the state space are available to guide the tracking. We then integrate both attractor and motion-transition information in a probability propagation framework. Experimental results show that our method performs better than methods that use only sequential motion-transition information or only appearance information. An object usually consists of several components. To handle tracking problems in which an object's components exhibit (strong) spatial coherence, we develop a part-based tracking method. Unlike existing methods that use the spatial-coherence relationship only for particle weight estimation, our method also applies the spatial relationship to state prediction, which considerably improves tracking performance. For facial expression recognition, we propose a hybrid-representation approach in a manifold-learning framework, which takes advantage of both holistic and local representations. We demonstrate the effectiveness of our method on the Cohn-Kanade database, where it achieves a high recognition rate.
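The idea of combining attractor and motion-transition information can be illustrated with a toy example. The sketch below is not the dissertation's algorithm: it is a generic one-dimensional particle filter, written under our own assumptions, whose proposal distribution mixes a random-walk motion model with samples drawn near a few known attractor states. The function names (`gauss`, `track`), the Gaussian observation model, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def gauss(x, mu, sigma):
    """Gaussian density N(x; mu, sigma^2), evaluated elementwise."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def track(observations, attractors, x0=0.0, n=500, alpha=0.2,
          sigma_motion=0.1, sigma_attr=0.1, sigma_obs=0.2, seed=0):
    """Toy 1-D particle filter with an attractor-guided proposal (illustrative only).

    Each step proposes a fraction `alpha` of the particles near known attractor
    states and the rest from a random-walk motion model, then corrects the
    weights by the ratio prior / proposal so that the filter stays consistent.
    """
    rng = np.random.default_rng(seed)
    attractors = np.asarray(attractors, dtype=float)
    particles = rng.normal(x0, sigma_motion, n)
    estimates = []
    for z in observations:
        # Motion-transition samples: random walk from a randomly chosen particle.
        moved = particles[rng.integers(0, n, n)] + rng.normal(0.0, sigma_motion, n)
        # Attractor samples: small perturbations of a randomly chosen attractor.
        near = rng.choice(attractors, n) + rng.normal(0.0, sigma_attr, n)
        proposed = np.where(rng.random(n) < alpha, near, moved)

        # Densities of each proposed state under the motion prior and the proposal.
        prior = gauss(proposed[:, None], particles[None, :], sigma_motion).mean(axis=1)
        attr = gauss(proposed[:, None], attractors[None, :], sigma_attr).mean(axis=1)
        proposal = (1.0 - alpha) * prior + alpha * attr

        # Importance weights: likelihood * prior / proposal (tiny offset avoids 0/0).
        weights = gauss(z, proposed, sigma_obs) * prior / proposal + 1e-300
        weights /= weights.sum()

        estimates.append(float(np.sum(weights * proposed)))
        # Multinomial resampling gives equally weighted particles for the next step.
        particles = proposed[rng.choice(n, n, p=weights)]
    return np.array(estimates)


if __name__ == "__main__":
    truth = np.linspace(0.0, 1.0, 21)
    noisy = truth + np.random.default_rng(1).normal(0.0, 0.05, truth.size)
    print(track(noisy, attractors=[0.0, 0.5, 1.0])[-1])
```

In a real hand tracker the state would be a high-dimensional joint-angle vector and the attractors would be poses with known appearance; the scalar state here only keeps the weighting step easy to read.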

Parallel Keywords

articulated hand tracking; particle filtering; tracking by parts; component collaboration; facial expression recognition; supervised manifold learning
