The goal of this work is to develop an automated lecture recording system that uses a Kinect to analyze the speaker's behavior, a PTZ camera to imitate the shots of a human cameraman, and network messaging to provide the communication between the cameraman and the director. To locate the speaker in the PTZ camera's view, the system detects the speaker's face with the AdaBoost algorithm. The detected region is then tracked with the mean shift algorithm, so that the PTZ camera continuously knows the position of the speaker's face. In addition, the Kinect's depth images, combined with Gaussian mixture models, are used to recognize the speaker's hand postures, and its color images are used to detect whether the speaker is holding a laser pointer or a pointing stick. To present the most suitable view to the audience, we define a set of camera action rules. After the system integrates all the information about the speaker, the PTZ camera automatically frames its shots according to these rules and sends a message to the director. Given this message, the director applies its own shot selection rules to decide whether the view from the speaker's PTZ camera should be shown to the audience.
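The mean shift tracking step described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the face detector has already produced an initial window and that each frame has been converted into a per-pixel weight image (e.g., a color-histogram back-projection of the detected face region); the function then repeatedly moves the window to the weighted centroid of the pixels it covers until it stops shifting.

```python
import numpy as np

def mean_shift(weights, window, max_iter=20):
    """Shift a tracking window toward the mode of a weight image.

    weights : 2-D array of per-pixel likelihoods (e.g., a histogram
              back-projection of the detected face region).
    window  : (x, y, w, h) current face window.
    Returns the converged (x, y, w, h) window.
    """
    x, y, w, h = window
    H, W = weights.shape
    for _ in range(max_iter):
        patch = weights[y:y + h, x:x + w]
        total = patch.sum()
        if total == 0:  # target lost: leave the window where it is
            break
        ys, xs = np.mgrid[y:y + h, x:x + w]
        cx = (xs * patch).sum() / total  # weighted centroid inside window
        cy = (ys * patch).sum() / total
        nx = int(round(np.clip(cx - w / 2, 0, W - w)))
        ny = int(round(np.clip(cy - h / 2, 0, H - h)))
        if nx == x and ny == y:          # converged
            break
        x, y = nx, ny
    return x, y, w, h

# Synthetic demo: a bright disk stands in for the face back-projection.
weights = np.zeros((200, 200), dtype=float)
yy, xx = np.mgrid[0:200, 0:200]
weights[(xx - 115) ** 2 + (yy - 115) ** 2 <= 25 ** 2] = 1.0

# Start the window off-center; mean shift pulls it onto the disk.
tracked = mean_shift(weights, (80, 80, 40, 40))
```

In the full system, the window returned for each frame would drive the PTZ camera's pan/tilt commands so the face stays framed.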
We present a virtual cameraman system, a component of an automated lecture recording system. The proposed system consists of a Kinect device and a PTZ camera, which play the roles of the cameraman and his or her camera, respectively. The Kinect device comprises a color camera and a depth sensor. First, the PTZ camera locates the speaker using the AdaBoost algorithm, and the speaker is then tracked by the mean shift algorithm. During tracking, the speaker's postures are recognized from the data provided by the depth sensor together with a set of prebuilt posture models. The speaker's movement and posture then determine the action of the PTZ camera according to a collection of predefined action rules. The output of the virtual cameraman system is the image sequence acquired by the PTZ camera.
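The mapping from the speaker's state to a camera action can be sketched as a small rule table. The state names and actions below are hypothetical placeholders, not the thesis's actual rule set; the point is only that the recognized movement and posture index into a table of predefined actions, with a safe default (a wide shot) when no rule matches.

```python
# Hypothetical (movement, posture) -> PTZ action rules; the actual
# rule set used by the system is defined in the thesis, not here.
ACTION_RULES = {
    ("still",   "pointing_at_screen"): "zoom_out_to_include_screen",
    ("still",   "neutral"):            "medium_close_up",
    ("walking", "neutral"):            "pan_to_follow",
    ("walking", "pointing_at_screen"): "wide_shot",
}

def select_action(movement, posture):
    """Map the speaker's recognized state to a PTZ camera action.

    Falls back to a wide shot for any unrecognized combination.
    """
    return ACTION_RULES.get((movement, posture), "wide_shot")
```

A lookup table keeps the cinematographic policy declarative, so rules can be tuned without touching the tracking or recognition code.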