
Using Spatial and Temporal Video Features for Optimal Training Set Selection

Advisor: 黃仲陵

Abstract


Most existing learning-based semantic analysis approaches aim to obtain the best possible semantic model, which typically requires a large training set to achieve good generalization capacity. However, the quality of the resulting semantic model depends on how the training set is selected rather than on the sheer size of the training data. In this thesis, we propose several sample selection schemes based on Gaussian and correlation measure models. By computing correlation values between samples, we choose the best samples for training a semantic model and use an SVM to classify each sample into a category. The thesis classifies video shots into six semantic categories: speech, landscape, cityscape, crowd, map, and unknown. The experimental results demonstrate that only half of the training samples are needed to achieve the same performance as using a larger training set.
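As a rough illustration of the selection idea described above, the following Python sketch greedily picks training samples that are spread out in feature space, using a Gaussian (RBF) similarity between samples as a redundancy measure. The function name `select_diverse_samples`, the bandwidth `sigma`, and the greedy farthest-first strategy are illustrative assumptions, not the exact selection scheme developed in the thesis.

```python
import numpy as np

def select_diverse_samples(features, n_select, sigma=1.0):
    """Greedily pick training samples that are spread out in feature space."""
    # Pairwise squared Euclidean distances between all samples.
    sq_dists = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    # Gaussian (RBF) similarity: values near 1 mean two samples are nearly redundant.
    sim = np.exp(-sq_dists / (2.0 * sigma ** 2))

    # Start from the most "central" sample (highest total similarity to the rest).
    selected = [int(np.argmax(sim.sum(axis=1)))]
    while len(selected) < n_select:
        # For every candidate, measure its strongest similarity to the chosen set,
        # then add the candidate whose strongest similarity is weakest.
        redundancy = sim[:, selected].max(axis=1)
        redundancy[selected] = np.inf  # never re-pick an already selected sample
        selected.append(int(np.argmin(redundancy)))
    return np.array(selected)
```

With `n_select` set to half the number of available samples, the returned indices give a reduced training set in the spirit of the experiments reported here.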

Keywords

training set, sample selection, Gaussian, correlation, support vector machine (SVM)

Parallel Abstract


With the spread of multimedia devices, the amount of casually captured video keeps growing, and researchers have therefore proposed methods for managing, classifying, and annotating video effectively. A semantic gap exists between high-level semantic concepts and low-level visual features. In the field of video annotation, most proposed semantic-model analyzers aim for broadly generalizable performance, which usually requires a very large database to train such a semantic model. However, the quality of such a model should depend on the training data that are selected rather than on the size of the database. Based on this idea, this thesis proposes several approaches to training-data selection, built on a Gaussian model and a correlation model. These two models are used to estimate the similarity between pairs of training samples so that samples spread as widely as possible in the feature space are chosen, yielding training data that are better or more semantically representative. A support vector machine (SVM) is then applied to train the semantic model and perform category classification. In the experiments, video shots are divided into six semantic categories: speech, landscape, cityscape, crowd, map, and unknown. The experiments show that a semantic model trained with only half of the original data achieves performance close to that of a model trained with all of the data, so good system performance is obtained with a smaller amount of data.
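The selected subset is then used to train an SVM classifier over the six shot categories mentioned above. The sketch below is a minimal, hypothetical illustration of that training-and-prediction step, assuming scikit-learn, an RBF kernel, and randomly generated placeholder features and labels; the actual features, kernel, and parameters used in the thesis are not specified here.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# The six semantic shot categories used in the thesis.
CATEGORIES = ["speech", "landscape", "cityscape", "crowd", "map", "unknown"]

def train_shot_classifier(features, labels):
    """Fit a multi-class SVM (RBF kernel) on the selected training subset."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    clf.fit(features, labels)
    return clf

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 64))                  # placeholder spatio-temporal shot features
    y = rng.integers(0, len(CATEGORIES), size=300)  # placeholder category labels
    half = rng.choice(len(X), size=len(X) // 2, replace=False)  # stands in for the selection step
    model = train_shot_classifier(X[half], y[half])
    print(CATEGORIES[model.predict(X[:1])[0]])
```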

Parallel Keywords

training set, sample selection, Gaussian, correlation, SVM

