
Graduate student: 李俊億
Thesis title: 自動化演講錄製系統之虛擬攝影師
Automated Lecture Recording System - Virtual Cameraman
Advisor: 陳世旺
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2012
Academic year of graduation: 100
Language: Chinese
Number of pages: 81
Keywords (Chinese): 自動演講拍攝, AdaBoost演算法, 平均位移追蹤演算法, 高斯混合模型, 姿勢辨識
Keywords (English): automated lecture shooting system, AdaBoost algorithm, mean shift tracking algorithm, Gaussian mixture model, posture recognition
Thesis type: Academic thesis
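  •   This study aims to develop an automated lecture-recording system that uses a Kinect to analyze the speaker's behavior, drives a PTZ camera to imitate the shots a human camera operator would take, and relies on network transmission for communication between the cameraman and the director.
      To locate the speaker in the PTZ camera image, the system detects the speaker's face region with the AdaBoost algorithm and then follows that region with the mean shift tracking algorithm, so that the PTZ camera always knows where the speaker's face is. In addition, the Kinect depth image is combined with a Gaussian mixture model to recognize the speaker's hand postures, and the Kinect color image is used to detect whether the speaker is using a laser pointer or a pointing stick.
      To present the best possible picture to the audience, we define a set of camera-work rules. Once the system has combined all speaker-related information, the PTZ camera shoots automatically according to these rules and sends a message to the director. On receiving the message, the director applies its own shot-selection rules to decide whether to show the picture from the speaker's PTZ camera to the audience.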
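The face-tracking step described above can be illustrated with a minimal mean shift sketch. The full text of the thesis is not available here, so this is not the author's implementation: it assumes a flat kernel and a precomputed per-pixel weight map (for example, a color-histogram back-projection of the detected face region). The function name `mean_shift` and its parameters are hypothetical.

```python
def mean_shift(weights, window, max_iter=20):
    """Move a search window toward the local weighted centroid of `weights`.

    weights: 2D list of non-negative per-pixel scores (assumed input, e.g. a
             back-projection of the face color histogram).
    window:  (x, y, w, h) with (x, y) the top-left corner in pixel coordinates.
    Returns the converged window as (x, y, w, h).
    """
    rows, cols = len(weights), len(weights[0])
    x, y, w, h = window
    for _ in range(max_iter):
        total = sum_x = sum_y = 0.0
        # Accumulate the weighted centroid over the part of the window
        # that lies inside the image.
        for r in range(max(0, y), min(rows, y + h)):
            for c in range(max(0, x), min(cols, x + w)):
                wt = weights[r][c]
                total += wt
                sum_x += wt * c
                sum_y += wt * r
        if total == 0:
            break  # no evidence inside the window; keep the last position
        cx, cy = sum_x / total, sum_y / total
        # Re-center the window on the centroid.
        new_x = int(round(cx - (w - 1) / 2))
        new_y = int(round(cy - (h - 1) / 2))
        if (new_x, new_y) == (x, y):
            break  # the shift is zero, so the window has converged
        x, y = new_x, new_y
    return x, y, w, h
```

Called once per frame with the previous frame's window as the starting point, a routine of this kind keeps the window on the face as long as the face moves less than roughly one window width between frames.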

    We present a virtual cameraman system, a component of an automated lecture recording system. The proposed virtual cameraman system consists of a Kinect device and a PTZ camera, which correspond to the cameraman and his/her camera, respectively. The Kinect device in turn comprises a color camera and a depth sensor.
    To begin, the PTZ camera locates the speaker using the AdaBoost algorithm, and the speaker is then tracked by the mean shift algorithm. During tracking, the postures of the speaker are recognized from the data provided by the depth sensor together with a set of prebuilt posture models.
    The movement and posture of the speaker then determine the action of the PTZ camera according to a collection of predefined action rules. The output of the virtual cameraman system is the image sequence acquired by the PTZ camera.
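As a rough illustration of recognizing postures against prebuilt models, the sketch below scores a normalized hand-position feature against one Gaussian mixture per posture and picks the most likely one. It is a sketch under stated assumptions rather than the thesis's method: the two-dimensional feature (hand offset from the shoulder, scaled by arm length), the diagonal covariances, the posture names, every numeric value in `POSTURE_MODELS`, and the rejection threshold are all hypothetical.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_mixture(x, components):
    """Log-likelihood of x under a mixture of (weight, mean, var) components."""
    logs = [math.log(w) + log_gaussian(x, m, v) for w, m, v in components]
    top = max(logs)
    # Log-sum-exp keeps the sum stable when the densities are tiny.
    return top + math.log(sum(math.exp(l - top) for l in logs))

# Hypothetical posture models: feature = (dx, dy) of the hand relative to the
# shoulder, divided by arm length, with y pointing up.
VAR = (0.05, 0.05)
POSTURE_MODELS = {
    "pointing": [(0.5, (-0.9, 0.0), VAR), (0.5, (0.9, 0.0), VAR)],
    "hand_raised": [(1.0, (0.0, 0.9), VAR)],
    "hand_down": [(1.0, (0.0, -0.9), VAR)],
}

def classify_posture(x, models, min_log_likelihood=-20.0):
    """Return the best-matching posture name, or None if nothing fits well."""
    scores = {name: log_mixture(x, comps) for name, comps in models.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_log_likelihood else None
```

The rejection threshold lets the classifier report "no known posture" instead of forcing every frame into one of the modeled classes, which matters when the result feeds camera action rules.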

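    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Literature Review
      1.3 Camera Equipment
        1.3.1 Kinect
        1.3.2 PTZ Camera
      1.4 Thesis Organization
    Chapter 2 System Architecture
      2.1 System Architecture
        2.1.1 Camera Setup
        2.1.2 Camera Control
      2.2 System Workflow
    Chapter 3 Speaker Detection and Tracking
      3.1 Speaker Detection
        3.1.1 Haar Features
        3.1.2 Integral Image
        3.1.3 AdaBoost Algorithm
        3.1.4 Detection Results
      3.2 Face Tracking
        3.2.1 Mean Shift Algorithm
        3.2.2 Mean Shift Algorithm for Face Tracking
    Chapter 4 Hand Posture Recognition
      4.1 Building the Hand Posture Database
        4.1.1 Hand Joint Coordinate Extraction
        4.1.2 Coordinate Normalization
        4.1.3 Posture Model Construction
        4.1.4 Parameter Overview
        4.1.5 Parameter Update
      4.2 Hand Posture Recognition
      4.3 Hand Posture Types
      4.4 Camera Shooting Rules
    Chapter 5 Screen Localization and Visual Prop Detection
      5.1 Projection Screen Localization
      5.2 Laser Spot Detection
      5.3 Pointing Stick Detection
    Chapter 6 Experimental Results
      6.1 Camera Shooting Rules
      6.2 Experimental Clips
    Chapter 7 Conclusion
      7.1 Conclusion
      7.2 Future Work
    References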


    Full text not authorized for public release.