
穿戴式互動相機:第一人稱視角社交互動視訊摘要之演算法與系統晶片設計

Wearable Social Camera: Algorithm and System Chip Design of Egocentric Video Summarization for Social Interaction

Advisor: 簡韶逸

Abstract

No data available.

Keywords

Video Summarization, Interaction Detection

Parallel Abstract


From industry to entertainment, wearable devices are becoming increasingly popular. As the videos recorded by wearable cameras grow longer, extracting the parts of first-person-view (egocentric) videos that interest the user has become more important, and this task spans computer vision problems at many levels. This work proposes a wearable social camera: an egocentric camera that summarizes, from the whole recording, all social interaction activities between other people and the camera wearer. The core technology of the wearable social camera is egocentric video summarization for social interaction. Unlike other works on second-person action/interaction recognition in egocentric videos, which focus on distinguishing different actions, this work seeks the features common to all interactions. These common features, named Interaction Features (IF), are proposed to be composed of three parts: physical information of the head, body language, and mouth expression. A Hidden Markov Model (HMM) is then employed to model the interaction sequences, and a summarized video is generated with a Hidden Markov Support Vector Machine (HM-SVM). Experimental results on a large life-log dataset show that the proposed system performs well for summarizing life-log videos. Finally, we design and implement the work as an ASIC (Application-Specific Integrated Circuit) architecture and realize the system, together with a face landmark regression pipeline, on a DE2-115 FPGA (Field-Programmable Gate Array).
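
For readers who want a concrete picture of the pipeline described in the abstract, the following Python sketch shows one minimal, hypothetical way to go from per-frame Interaction Features to summarized interaction segments. It is not the thesis implementation: the feature dimensions, the linear scoring weights, and the two-state Viterbi smoothing (used here as a simple stand-in for the HMM/HM-SVM sequence model) are all assumptions made only for illustration.

# Illustrative sketch (not the thesis implementation) of the summarization idea:
# per-frame Interaction Features (IF) -> per-frame interaction scores ->
# two-state temporal smoothing (stand-in for the HMM / HM-SVM sequence model)
# -> (start, end) interaction segments. All dimensions and weights are hypothetical.

import numpy as np

def interaction_feature(head_pose, body_language, mouth_expression):
    """Concatenate the three IF parts into one per-frame feature vector."""
    return np.concatenate([head_pose, body_language, mouth_expression])

def frame_scores(features, w, b):
    """Linear per-frame interaction score; a stand-in for HM-SVM emission scores."""
    return features @ w + b

def viterbi_two_state(scores, switch_penalty=2.0):
    """Two-state (0 = non-interaction, 1 = interaction) Viterbi smoothing.

    State 0 uses -score and state 1 uses +score as emission log-scores;
    switch_penalty discourages rapid toggling between the two states.
    """
    T = len(scores)
    emit = np.stack([-scores, scores], axis=1)          # shape (T, 2)
    dp = np.full((T, 2), -np.inf)
    back = np.zeros((T, 2), dtype=int)
    dp[0] = emit[0]
    for t in range(1, T):
        for s in (0, 1):
            stay = dp[t - 1, s]
            switch = dp[t - 1, 1 - s] - switch_penalty
            if stay >= switch:
                dp[t, s], back[t, s] = stay + emit[t, s], s
            else:
                dp[t, s], back[t, s] = switch + emit[t, s], 1 - s
    states = np.zeros(T, dtype=int)
    states[-1] = int(np.argmax(dp[-1]))
    for t in range(T - 1, 0, -1):
        states[t - 1] = back[t, states[t]]
    return states

def segments(states):
    """Group consecutive interaction frames (state 1) into (start, end) spans."""
    spans, start = [], None
    for t, s in enumerate(states):
        if s == 1 and start is None:
            start = t
        elif s == 0 and start is not None:
            spans.append((start, t))
            start = None
    if start is not None:
        spans.append((start, len(states)))
    return spans

# Toy usage with random features and weights.
rng = np.random.default_rng(0)
feats = np.stack([interaction_feature(rng.normal(size=3),   # head pose cues
                                      rng.normal(size=4),   # body language cues
                                      rng.normal(size=2))   # mouth expression cues
                  for _ in range(300)])
w, b = rng.normal(size=feats.shape[1]), 0.0
print(segments(viterbi_two_state(frame_scores(feats, w, b))))

In the actual system, the per-frame scores would come from the trained sequence model rather than random weights, and the recovered segment boundaries would determine which frames are kept in the summarized video.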

Parallel Keywords

Video Summary, Interaction Detection

