
基於時空關聯性之運動影片事件與戰術語義分析

Spatial-temporal Correlation based Event and Tactics Semantics Analyses for Sports Videos

Advisor: 吳家麟

Abstract


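With the spread of digital cameras and recorders, the volume of multimedia video data has grown rapidly. Without proper organization or annotation, searching for a specific video clip is a time-consuming and laborious task, and as a result many videos are left untouched on storage devices after they are created and lose their original value. In recent years, automatic video semantic analysis techniques have been developed and applied to many types of video. Because sports videos are rich in highlights and highly reusable, this dissertation proposes a complete workflow and framework for sports video content analysis. With the help of domain knowledge, combined with the spatial and temporal correlation of media features, we bridge the semantic gap between low-level features and high-level semantics in video, extract the important event and tactics semantics from sports videos, and in turn provide more refined multimedia application systems. Different users care about different high-level semantics. We divide the audience of sports videos into three groups: general viewers who are interested in highlight events, beginners who want to learn from game videos how to respond in specific situations, and professional coaches and players who want to learn an opponent's habitual tactics from game videos. The proposed framework comprises three main modules: a media feature extraction module, an explicit event and tactics detection module based on high-level media features, and an implicit event and tactics retrieval module based on trajectory similarity. The media feature extraction module develops reliable object detectors to obtain mid- and high-level media features; semantic detection models built on these high-level features can find high-level semantics that are hard to discover from low-level media features alone. However, implicit events or tactics in a video may be overlooked because they are not defined in the event detection models. In sports videos, these implicit events or tactics are usually related to the movement of the players and the ball relative to the court. We therefore propose letting the user input object trajectories and searching for video clips that have similar object trajectories. Taking the analysis of broadcast tennis, billiards, and basketball videos as examples, we develop application systems for the different audience groups and demonstrate the feasibility and effectiveness of the proposed framework.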

Parallel Abstract


With the spread of digital cameras and recorders, the amount of multimedia video content is increasing rapidly. Without proper management and annotation, a viewer has to spend a lot of time searching exhaustively for a specific video clip. Consequently, videos are left in storage devices indefinitely and eventually lose their value. Recently, techniques for automatic video semantic analysis have been proposed for different types of videos. Among all kinds of videos, sports videos are rich in highlights that can be extracted for many applications. In this dissertation, we propose a comprehensive framework for sports video analysis that aims to bridge the gap between low-level features and high-level semantics in a video. Combining domain knowledge with the spatial-temporal correlation of media features, we extract important event and tactics semantics from sports videos. The audience of sports videos is highly diverse, and we roughly categorize it into three groups: general viewers who want exciting highlights, beginners who want to learn strategies or playing skills, and professionals who study the tactics of their opponents. The proposed framework extracts different kinds of semantics to meet the requirements of each group. Three main modules are studied in this dissertation: media feature extraction, explicit event/tactics detection based on high-level media features, and implicit semantic concept retrieval based on trajectory similarity. For the media feature extraction module, we develop several kinds of robust object detectors to generate mid- and high-level media features for concept detection and retrieval. By combining representative high-level media features with the given domain knowledge, the concept detection and retrieval modules are able to acquire spatial-temporal semantics that can hardly be discovered from low-level media features.
Moreover, we explore the possibility of extracting implicit events and tactics, which are usually ignored by conventional event detection models. In sports videos, these implicit concepts are related to the movements of the players and the ball. We design an interactive interface that lets the user input query trajectories, and trajectory similarity comparison is used to recommend the video clips with the best-matching object trajectories. We implemented the proposed framework on three types of sports videos (tennis, billiards, and basketball) to demonstrate its feasibility.
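
The trajectory-based retrieval described above can be illustrated with a minimal sketch. The abstract does not say which similarity measure the dissertation uses, so dynamic time warping (DTW) is swapped in here purely as an assumption. The names `dtw_distance` and `rank_clips`, the clip identifiers, and the representation of a trajectory as a list of (x, y) court coordinates are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # assumed (x, y) position in court coordinates


def dtw_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Dynamic time warping distance between two 2-D trajectories.

    DTW is an assumed stand-in; the source does not name its measure.
    """
    if not a or not b:
        raise ValueError("trajectories must be non-empty")
    inf = math.inf
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance only in a
                cost[i][j - 1],      # advance only in b
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[len(a)][len(b)]


def rank_clips(
    query: Sequence[Point],
    clips: Dict[str, Sequence[Point]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k clips whose trajectory is closest to the query."""
    scored = [(name, dtw_distance(query, traj)) for name, traj in clips.items()]
    scored.sort(key=lambda item: item[1])  # smaller distance = better match
    return scored[:top_k]


# Hypothetical data: a user-drawn query and two stored clip trajectories.
query = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
clips = {
    "clip_a": [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0)],
    "clip_b": [(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)],
}
ranking = rank_clips(query, clips)
```

In this sketch `clip_a` ranks first because DTW tolerates its repeated point, which is one reason a warping-based measure might suit trajectories sampled at different speeds.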

