Hand Posture Recognition Based on Hidden Conditional Random Fields

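People communicate with many natural signs, and hand posture is one of them. Machines therefore need to recognize hand postures in order to simplify communication between people and machines. For hand posture recognition, we discuss two kinds of model: the bag-of-words model and the part-based model. In this thesis, we review part-based models in cognition and evaluate a specific computational part-based model: the Hidden Conditional Random Field (HCRF) proposed by Quattoni et al. Our experiments show that HCRFs can be applied successfully to an upright hand posture dataset. In an HCRF, no two nodes are assumed to be independent, so nodes may overlap. In addition, global relations among nodes can be incorporated into the HCRF to represent larger-scale dependencies in the data. Our experiments show that the global relation used by Quattoni et al. cannot be applied when in-plane rotation is present, because it changes under in-plane rotation. Hands, however, have many degrees of freedom, so hand postures frequently appear with in-plane rotation. We therefore propose representing the global relation among nodes by each node's distance to the image center. This representation does not change under in-plane rotation and can therefore be applied successfully in that setting.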
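The invariance argument can be illustrated with a minimal sketch. It assumes that the in-plane rotation is about the image center and that each node is summarized by a 2-D patch location. The function names, the coordinates, and the displacement-vector feature used for contrast are all hypothetical; in particular, the displacement vector only stands in for some rotation-sensitive feature and is not claimed to be the global feature used by Quattoni et al.

```python
import math

def rotate_about(point, center, angle_rad):
    """Rotate a 2-D point about `center` by `angle_rad` (an in-plane rotation)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (center[0] + c * dx - s * dy, center[1] + s * dx + c * dy)

def distance_to_center(point, center):
    """Global feature in the spirit of the proposal: distance to the image center."""
    return math.hypot(point[0] - center[0], point[1] - center[1])

def offset_from_center(point, center):
    """Hypothetical rotation-sensitive feature, for contrast: raw displacement vector."""
    return (point[0] - center[0], point[1] - center[1])

# Hypothetical image center and patch locations (assumptions for illustration).
center = (50.0, 50.0)
patches = [(60.0, 50.0), (50.0, 80.0), (35.0, 30.0)]

# Apply the same in-plane rotation to every patch location.
angle = math.radians(40)
rotated = [rotate_about(p, center, angle) for p in patches]

dist_before = [distance_to_center(p, center) for p in patches]
dist_after = [distance_to_center(p, center) for p in rotated]
offset_before = [offset_from_center(p, center) for p in patches]
offset_after = [offset_from_center(p, center) for p in rotated]

# Distances agree up to floating-point error; displacement vectors do not.
distances_match = all(math.isclose(a, b) for a, b in zip(dist_before, dist_after))
offsets_match = all(
    math.isclose(a[0], b[0]) and math.isclose(a[1], b[1])
    for a, b in zip(offset_before, offset_after)
)
print("distance to center preserved:", distances_match)
print("displacement vector preserved:", offsets_match)
```

If the rotation is not about the image center, for example when the hand is off-center in the frame, the distances would change as well, so the sketch only covers the centered case.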

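Keywords

object recognition; hand posture recognition; part-based model; graphical model; hidden conditional random fields; in-plane rotation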

Parallel Abstract

Hand posture is one of the natural signs people use to communicate, so machines need to recognize it. For hand posture recognition, we discuss two major kinds of object recognition model: the bag-of-words model and the part-based model. In this thesis, we review the part-based model in cognition and evaluate a specific computational part-based model proposed by Quattoni et al.: Hidden Conditional Random Fields (HCRFs). Our experiments show that HCRFs can be applied successfully to an upright hand posture dataset. In HCRFs, no two nodes are assumed to be independent, so nodes may overlap. Moreover, global relations among nodes can be incorporated into HCRFs to represent large-scale dependencies in the data. Our experiments show that the global feature used by Quattoni et al. is not invariant to in-plane rotation. Hands, however, have many degrees of freedom, so hand postures are frequently rotated. We therefore propose encoding the global relation among nodes by the distance to the image center, which is invariant to in-plane rotation.
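For readers unfamiliar with the model, the conditional form usually given for HCRFs can be written as below. The notation is an assumption, since the abstract defines no symbols: $\mathbf{x}$ denotes the observed local features (the nodes), $\mathbf{h}$ the hidden part assignments, $y$ the class label, and $\Psi$ a potential function with parameters $\theta$.

```latex
P(y \mid \mathbf{x}; \theta)
  = \sum_{\mathbf{h}} P(y, \mathbf{h} \mid \mathbf{x}; \theta)
  = \frac{\sum_{\mathbf{h}} \exp \Psi(y, \mathbf{h}, \mathbf{x}; \theta)}
         {\sum_{y'} \sum_{\mathbf{h}} \exp \Psi(y', \mathbf{h}, \mathbf{x}; \theta)}
```

Under this reading, the model conditions on the whole observation $\mathbf{x}$, so it places no independence assumption on the observed nodes, and global relations among nodes would enter as additional terms of $\Psi$.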

Parallel Keywords

object recognition; hand posture recognition; part-based model; graphical model; hidden conditional random fields; in-plane rotation
