
Video Summarization and Regions-of-Interest Extraction for Videos Taken with Wearable Glasses

Advisor: 王才沛

Abstract


The goal of this thesis is to condense and summarize the long, unedited, and unpostprocessed videos shot with the wearable glasses Google Glass, and to extract the important people or objects that appear in the important segments, so that after recording, users can conveniently retrieve the most important or most relevant parts. This thesis uses features with both temporal and spatial characteristics to analyze important segments and important gaze-target regions. A superpixel algorithm is used to reduce the time spent on manual labeling, and superpixel-segmented regions together with a convolutional neural network assist in analyzing the regions and objects the wearer attends to. Both approaches partition each frame into several parts, which are fed into a random forest algorithm to output their respective important attention regions; finally, ROC curves are used to evaluate the experiments and to compare the effectiveness of the two approaches in extracting important gaze targets.

Parallel Abstract (English)


The purpose of this thesis is to condense and streamline unedited, uncut video shot with wearable glasses, and to extract the important people and objects that appear in the important segments, so that users can quickly review a recording and easily recall its most important or most relevant parts. The features used in this thesis, covering both temporal and spatial characteristics, serve to identify important segments and to locate the regions of interest. A superpixel algorithm reduces the time required for manual labeling. We then compare how superpixel segmentation and the bounding boxes produced by a convolutional neural network affect the detection results: each method partitions the frames into regions, the regions are fed into a random forest algorithm to produce the predicted attention areas, and ROC curves are finally used to evaluate the experiments.
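As a rough illustration of the classification and evaluation stage described above, the sketch below trains a random forest on per-region features and scores held-out regions with the area under the ROC curve. The feature values and labels here are synthetic stand-ins (the thesis extracts real temporal and spatial features from superpixels or CNN bounding boxes), and the use of scikit-learn's `RandomForestClassifier` and `roc_auc_score` is an assumption for illustration, not a detail taken from the thesis.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in for per-region features (e.g. mean color, motion
# magnitude, distance from frame center); labels mark regions the
# wearer is assumed to have attended to.
n_regions = 400
X = rng.normal(size=(n_regions, 6))
y = (X[:, 0] + 0.5 * X[:, 1]
     + rng.normal(scale=0.5, size=n_regions) > 0).astype(int)

# Simple index split; in practice one would split by video so that
# regions from the same recording never straddle train and test.
X_train, X_test = X[:300], X[300:]
y_train, y_test = y[:300], y[300:]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Score each held-out region and evaluate with ROC AUC, mirroring the
# ROC-curve comparison between the two segmentation schemes.
scores = clf.predict_proba(X_test)[:, 1]
print(f"ROC AUC: {roc_auc_score(y_test, scores):.3f}")
```

The same evaluation would be run once with superpixel regions and once with CNN bounding boxes, and the two ROC curves compared.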

