Real-Time Hand Gesture Recognition Based on Computer Vision and Its Applications

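The main goal of this study is hand gesture recognition. Using a single camera as the image input device, we build a human-computer interaction (HCI) system and aim to apply the results to sign language recognition, giving deaf and mute people an effective channel for communicating with others. In digital image processing, the region of interest (here, the hand gesture region) is treated as the foreground and everything else as the background. In most cases an image that contains the foreground also has a complex background, so parts of the background are often misclassified as foreground during gesture recognition. This study therefore uses background subtraction to extract moving objects, converts them to the YCbCr color space, and applies specific threshold values to obtain the skin-color regions. Noise is removed with morphological processing and connected-component labeling. The drop in recognition rate caused by tilted gestures is another problem this study addresses. Elbow segmentation is then performed so that the resulting image contains only the hand gesture. Finally, an artificial neural network is used to train on and recognize the gestures. The results are examined in two parts, static-image and dynamic-image gesture recognition. Static-image gesture recognition reaches a success rate of 94.6% with a processing time of only 39 ms per image, which is fast enough for real-time recognition; dynamic-image gesture recognition has an overall recognition rate of about 89%, with a processing time of about 55 ms. For practical application, the system is implemented on the iPhone to demonstrate the results and carry out sign language recognition.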
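The first two stages named in the abstract, background subtraction followed by skin-color thresholding in YCbCr, can be illustrated with a minimal sketch. The abstract does not give the thresholds used, so the luma difference threshold (30) and the Cb/Cr skin ranges (77 to 127 and 133 to 173) below are assumed values commonly seen in the skin-detection literature, and the function names are hypothetical. The color conversion uses the standard full-range BT.601 weights.

```python
# Illustrative sketch only: all thresholds are assumptions, not values from the thesis.

def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601 weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_moving(pixel, bg_pixel, diff_threshold=30):
    """Background subtraction: the pixel's luma differs enough from the background."""
    y, _, _ = rgb_to_ycbcr(*pixel)
    y_bg, _, _ = rgb_to_ycbcr(*bg_pixel)
    return abs(y - y_bg) > diff_threshold


def is_skin(pixel, cb_range=(77, 127), cr_range=(133, 173)):
    """Skin test: Cb and Cr both fall inside the assumed skin-color ranges."""
    _, cb, cr = rgb_to_ycbcr(*pixel)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]


def skin_mask(frame, background):
    """Binary mask: 1 where a pixel is both moving and skin-colored, else 0.

    frame and background are equally sized 2D lists of (R, G, B) tuples.
    """
    return [
        [1 if is_moving(p, b) and is_skin(p) else 0 for p, b in zip(f_row, b_row)]
        for f_row, b_row in zip(frame, background)
    ]
```

Requiring both conditions is what suppresses skin-colored objects that belong to the static background: they pass the color test but fail the motion test.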

Keywords

Gesture recognition; human-computer interaction interface; backpropagation neural network

Parallel Abstract

We present a gesture recognition system for human-computer interaction (HCI). Our goal is to use the system to create an effective means of communication between deaf people and others. In image processing, the region of interest (ROI) is considered the foreground and everything else the background. In most situations, the main problem is that the foreground appears against a complicated background, so background regions are easily mistaken for foreground. To resolve this problem, we apply background subtraction to capture the user's gesture motion more precisely. The images are then transformed to the YCbCr color space and binarized with specified threshold values to locate the skin region. These steps may introduce noise, so we use morphological operations and connected-component labeling to remove it. We also eliminate problems that hinder recognition, such as slanted hand gestures, so that the foreground region is obtained successfully. Finally, we use an artificial neural network to recognize sign language in real time and deploy our method on a handheld device. Experimental results show that the static-image hand gesture system reaches an accuracy of 94.6% on average with a processing time of only 39 ms per frame, and that the dynamic-image hand gesture system reaches an accuracy of 89% on average with a processing time of only 55 ms per frame.
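The connected-component step used for noise removal can be sketched as below. This is an assumption-laden illustration rather than the thesis implementation: it assumes the hand is the largest foreground region in the binary mask, uses 4-connectivity, and the function name `largest_component` is hypothetical.

```python
from collections import deque


def largest_component(mask):
    """Return a copy of a binary mask that keeps only its largest 4-connected region.

    mask is a 2D list of 0/1 values. Smaller regions are treated as noise
    (an assumption made for this sketch) and are cleared to 0.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = []  # pixel coordinates of the largest region found so far

    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected region.
            region = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region

    cleaned = [[0] * cols for _ in range(rows)]
    for y, x in best:
        cleaned[y][x] = 1
    return cleaned
```

In a full pipeline this would typically run after morphological opening or closing, so that thin bridges between the hand and nearby noise are already broken before the regions are labeled.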

Parallel Keywords

Gesture recognition; Human-computer interaction; Backpropagation neural network

