Human-machine interfaces that take hand gestures as input have developed rapidly in recent years; common applications include robot control, TV remote control, and slide-show control. Gesture-based operation is not only highly intuitive but also gives users a refreshing experience. However, most gesture recognition systems can only be used in certain simple environments, in order to exclude interference from lighting and complex backgrounds, and this is the main reason gesture recognition has not become widespread in daily life. The goal of this thesis is to overcome the detection failures caused by lighting, the environment, and even the camera itself, and to propose a simple real-time hand gesture recognition system that users can operate intuitively, without any training step, to achieve control.

This thesis proposes a real-time hand gesture recognition system based on adaptive skin color detection. Adaptive skin color detection first applies face detection to the frames captured by the camera to obtain the face region, and then uses statistical analysis to build the user's personal skin color model, which is used to detect skin-colored regions in the frame. This skin color model is then combined with the face position to perform motion detection, and a simple direction detection method applied to the motion history image determines the moving direction of the user's hand region, yielding a reliable hand gesture recognition system. The gestures defined in this system are all common natural gestures, six in total: swiping the palm up, swiping down, swiping left, swiping right, making a fist, and waving the palm. In our experiments, five subjects performed the above gestures, 250 gestures per person, within a range of 2 meters; the accuracy reached 94.1%, and processing a single frame took only about 3.81 ms, demonstrating that the proposed method is indeed feasible.
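To make the adaptive skin color step concrete, the following is a minimal sketch of one way to build a personalized skin color model from a detected face, assuming OpenCV's Haar cascade face detector and a Gaussian-style threshold on the Cr/Cb channels; the cascade file, the color space, and the k = 2.5 tolerance are illustrative assumptions, not the exact parameters used in this thesis.

```python
import cv2
import numpy as np

# Illustrative sketch: detect the face, fit mean/std of its Cr/Cb pixels,
# then threshold the whole frame with that personalized model.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def build_skin_model(frame_bgr):
    """Return (mean, std) of Cr/Cb sampled from the detected face, or None."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    # Sample the inner face patch to avoid hair and background pixels.
    patch = frame_bgr[y + h // 4: y + 3 * h // 4, x + w // 4: x + 3 * w // 4]
    ycrcb = cv2.cvtColor(patch, cv2.COLOR_BGR2YCrCb)
    crcb = ycrcb[:, :, 1:3].reshape(-1, 2).astype(np.float32)
    return crcb.mean(axis=0), crcb.std(axis=0)

def skin_mask(frame_bgr, model, k=2.5):
    """Binary mask of pixels within k standard deviations of the face Cr/Cb."""
    mean, std = model
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, *np.clip(mean - k * std, 0, 255)], dtype=np.uint8)
    upper = np.array([255, *np.clip(mean + k * std, 0, 255)], dtype=np.uint8)
    return cv2.inRange(ycrcb, lower, upper)
```

Because the model is re-estimated from the user's own face, it adapts to the current lighting and camera response rather than relying on a fixed skin color range.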
In recent years, man-machine interfaces based on hand gesture recognition have developed rapidly. Common applications include robot control, TV remote control, and slide-show control. A gesture-based interface is both intuitive and user-friendly. However, due to the effects of lighting and complex backgrounds, most visual hand gesture recognition systems work only in restricted environments, which is why they are still not common in daily life. The purpose of this thesis is to develop a simple, real-time hand gesture recognition system that overcomes the effects of camera, lighting, and even environmental variations, so that users can interact with systems through intuitive hand gestures without any training.

An adaptive skin color detection method based on face detection and color distribution analysis is proposed to obtain a personalized skin color model, which is then used to detect other skin-colored regions, such as the hands, in subsequent frames. In addition, a simple hand-movement direction detection method based on the motion history image (MHI) is proposed, in which four groups of directional patterns are defined for measuring the movement directions. Six hand gestures are defined in our system: moving the hand up, moving down, moving left, moving right, making a fist, and waving the hand. These gestures can be bound to hot keys or events for interaction. Five persons were asked to perform 250 hand gestures each within two meters of the camera. Experimental results show an average accuracy of 94.1% and a processing time of only 3.81 ms per frame, demonstrating the reliability and robustness of the proposed system.
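As a companion to the MHI-based direction step, the sketch below shows one simple way to maintain a motion history image and read off a dominant movement direction by comparing the centroid of the newest motion with that of the older motion trail; the duration, the difference threshold, and the centroid test are illustrative assumptions that stand in for the four groups of directional patterns described above.

```python
import cv2
import numpy as np

# Illustrative sketch of a motion history image (MHI) and a coarse
# direction estimate; not the thesis' exact directional patterns.
MHI_DURATION = 0.5   # seconds a moving pixel stays "hot" in the MHI

def update_mhi(mhi, prev_gray, gray, timestamp, diff_thresh=30):
    """Stamp moved pixels with the current time and expire old entries."""
    motion = cv2.absdiff(gray, prev_gray) > diff_thresh
    mhi[motion] = timestamp
    mhi[mhi < timestamp - MHI_DURATION] = 0
    return mhi

def dominant_direction(mhi, timestamp, recent=0.1):
    """Compare centroids of the newest motion and the older motion trail."""
    new = np.argwhere(mhi >= timestamp - recent)
    old = np.argwhere((mhi > 0) & (mhi < timestamp - recent))
    if len(new) < 50 or len(old) < 50:
        return None                      # not enough motion to decide
    dy, dx = new.mean(axis=0) - old.mean(axis=0)
    if abs(dx) > abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"
```

In a full loop, the mask produced by the personalized skin color model would gate the frame difference, so that only skin-colored motion near the detected face contributes to the MHI and, in turn, to the recognized gesture.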