
Classroom Hand-Raising Recognition Based on Histogram of Oriented Gradients

Histogram of Oriented Gradients based Arm Gesture Recognition Research

Advisor: 李忠謀

Abstract


In today's teaching environments, technology-enhanced e-classrooms are increasingly common, and teachers hope to use e-learning tools to understand how their students are learning. Among student actions, raising a hand is the easiest way to interact with the teacher in class, so this study presents a student hand-raising recognition system that lets teachers grasp the students' situation quickly and in real time. The system uses a small number of cameras to perform recognition with multiple people against complex backgrounds, with the aim of reducing the influence of environmental factors. To this end, the work is divided into two parts: person detection and segmentation in complex backgrounds, and hand-raising classification. For segmentation, k-means clustering is combined with motion information: skin-color regions are extracted and motion cues are added to recover complete person regions. For classification, Histogram of Oriented Gradients (HOG) edge features are used to distinguish raising the left hand, the normal state, and raising the right hand. The experiments also cross-compare recognition of different people in each scene and of images affected by lighting changes. When the model is trained on data from the same day, the average accuracy reaches 91%; with training data from a different day, it remains at about 70-80%.
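
The abstract only names the segmentation steps, so the sketch below illustrates the general idea: k-means clustering over pixel colors to pick out skin-like clusters, followed by a frame-difference motion mask that discards skin-colored background. This is a minimal Python/OpenCV sketch under assumed parameters (the YCrCb color space, cluster count, and thresholds are illustrative choices), not the implementation used in the thesis.

```python
import cv2
import numpy as np

def segment_people(frame, prev_frame, k=4, motion_thresh=25):
    """Rough person segmentation: k-means color clustering to pick
    skin-like clusters, then a frame-difference motion mask to drop
    skin-colored background. All parameter values are illustrative."""
    # Cluster pixel colors in YCrCb space, where skin tones are compact in Cr/Cb.
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    pixels = ycrcb.reshape(-1, 3).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(pixels, k, None, criteria, 3,
                                    cv2.KMEANS_PP_CENTERS)

    # Keep clusters whose center lies in a commonly used skin Cr/Cb range.
    skin_ids = [i for i, (_, cr, cb) in enumerate(centers)
                if 135 <= cr <= 180 and 85 <= cb <= 135]
    skin_mask = np.isin(labels.reshape(frame.shape[:2]), skin_ids)
    skin_mask = skin_mask.astype(np.uint8) * 255

    # Frame differencing gives a coarse motion mask.
    diff = cv2.absdiff(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY))
    _, motion_mask = cv2.threshold(diff, motion_thresh, 255, cv2.THRESH_BINARY)
    motion_mask = cv2.dilate(motion_mask, np.ones((15, 15), np.uint8))

    # Skin-colored pixels that also moved are treated as person pixels.
    return cv2.bitwise_and(skin_mask, motion_mask)
```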

Keywords

gesture recognition, person segmentation, k-means, HOG

English Abstract


In today's classroom environments, more and more teachers use electronic devices to help them understand what students think and how they behave. Among all gestures, raising a hand is the most common way for students to interact with the teacher. In this research, we propose a hand-raising recognition system to help teachers keep track of students' behavior. We use cameras in a complex background and monitor multiple people. To build a system that works across environments without retraining after the environment changes, we separate the system into two parts: person segmentation and gesture recognition. In the segmentation part, we use k-means clustering to extract skin color and then use motion to remove skin-like background. In the recognition part, we use a Histogram of Oriented Gradients (HOG) descriptor as the gesture feature and an SVM to classify it. In the experiments, we test three scenes to verify the method. When the same case is used for training and testing, the average accuracy is 91%; even when different days are used for training and testing, the accuracy still reaches about 80%.
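
For the classification stage, a companion sketch pairs a HOG descriptor with a linear SVM over the three gestures named in the abstract (raise left hand, normal, raise right hand). The use of scikit-image and scikit-learn, the detection window size, and the HOG parameters are assumptions made for illustration; the thesis does not report these settings here.

```python
import cv2
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

# Class names are assumed for illustration; the thesis distinguishes
# raising the left hand, the normal state, and raising the right hand.
CLASSES = ["raise_left", "normal", "raise_right"]

def hog_feature(person_roi, size=(64, 128)):
    """HOG descriptor of one segmented person region (window size and
    HOG parameters are common defaults, not the thesis' settings)."""
    gray = cv2.cvtColor(cv2.resize(person_roi, size), cv2.COLOR_BGR2GRAY)
    return hog(gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

def train_classifier(rois, labels):
    """Fit a linear SVM on HOG features; labels are indices into CLASSES."""
    features = np.array([hog_feature(r) for r in rois])
    clf = LinearSVC(C=1.0)
    clf.fit(features, labels)
    return clf

def predict_gesture(clf, person_roi):
    """Return the predicted gesture name for one person region."""
    return CLASSES[int(clf.predict([hog_feature(person_roi)])[0])]
```

A linear kernel is a common choice for high-dimensional HOG vectors; the abstract does not specify which SVM kernel the thesis actually uses.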
