In recent years, the advent of the Kinect device has opened a new direction in the field of motion recognition, largely because Kinect can capture 3D skeleton information in real time. After surveying related work, we designed a system that lets users operate computer applications with gestures. In our implementation, a motion is treated as a sequence of poses, which we call key poses. Following related work, we apply a Support Vector Machine (SVM) to classify these key poses, and then use a tree data structure to determine which motion a given key-pose sequence belongs to. With this motion-sensing system we successfully operated several applications that could previously only be controlled by keyboard and mouse, and the experimental results show that our system achieves good response time and recognition accuracy.
In this thesis, we implement a system that provides a natural interface, allowing users to interact via upper-body motion with applications that are originally controlled by keyboard and mouse. The system reads 3-dimensional skeleton data computed by the Microsoft Kinect SDK from RGB-depth images, recognizes the gesture performed by the user, and then simulates keyboard and mouse events to generate the corresponding signals that control the application. In our implementation, a human gesture is modeled as a sequence of key poses. As in previous research, machine learning techniques such as SVMs and gesture trees are used to classify key poses and recognize gestures. We demonstrate our system with several applications, such as browsing photos in Picasa and playing the game Grand Theft Auto with Kinect; both applications are originally controlled by keyboard and mouse.
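The two-stage pipeline described above (per-frame key-pose classification with an SVM, followed by sequence matching against a gesture structure) can be sketched as follows. This is a minimal illustration with synthetic data, not the thesis implementation: the feature dimensions, pose labels, and gesture names are all hypothetical, and the gesture tree is simplified to a flat lookup of pose-ID sequences.

```python
# Hypothetical sketch of the two-stage pipeline: an SVM classifies each
# skeleton frame into a key pose, then the collapsed pose sequence is
# looked up in a gesture table (a stand-in for the thesis's gesture tree).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic "skeleton" features: 3 key poses, 90-dim vectors
# (e.g. 30 joints x 3 coordinates -- an assumed layout).
centers = rng.normal(size=(3, 90))
X = np.vstack([c + 0.05 * rng.normal(size=(50, 90)) for c in centers])
y = np.repeat([0, 1, 2], 50)

clf = SVC(kernel="rbf", gamma="scale").fit(X, y)

# Simplified "gesture tree": key-pose sequences mapped to gesture names.
gestures = {(0, 1): "swipe_right", (0, 2): "swipe_left"}

def recognize(frames, clf, gestures):
    """Classify each frame into a key pose, collapse consecutive
    duplicates, and look up the resulting sequence as a gesture."""
    poses = clf.predict(frames)
    collapsed = [int(p) for i, p in enumerate(poses)
                 if i == 0 or p != poses[i - 1]]
    return gestures.get(tuple(collapsed))

# A frame stream passing through pose 0 and then pose 1.
stream = np.vstack([centers[0] + 0.05 * rng.normal(size=(5, 90)),
                    centers[1] + 0.05 * rng.normal(size=(5, 90))])
print(recognize(stream, clf, gestures))  # -> swipe_right
```

In the real system the second stage is a tree over key poses rather than a flat dictionary, which lets partially matched sequences be tracked incrementally as frames arrive.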