Automatic Closed Caption Detection and Filtering in MPEG Videos for Video Structuring

Video structuring is the process of extracting the temporal structure of video sequences and is a crucial step in video content analysis, especially for sports videos. It involves detecting temporal boundaries, identifying meaningful segments of a video, and then building a compact representation of the video content. In this paper, we propose a novel mechanism that automatically parses sports videos in the compressed domain and constructs a concise table of video content from the superimposed closed captions and the semantic classes of video shots. First, shot boundaries are detected efficiently using GOP-based video segmentation. Color-based shot identification is then used to automatically identify meaningful shots. An efficient closed caption localization approach is proposed that first detects caption frames within the meaningful shots. These caption frames, rather than every frame, are then used as targets for detecting closed captions based on long-term consistency, without any size constraint. In addition, to automatically discriminate captions of interest, a novel tool, a font size detector, is proposed to recognize the font size of closed captions using compressed data in MPEG videos. Experimental results show the effectiveness and feasibility of the proposed mechanism.

Keywords

caption frame detection; closed caption detection; font size differentiation; video structuring; video segmentation

Cited by

Chen, H. T. (2009). 運動影片內容分析、理解與註釋之研究 [doctoral dissertation, National Chiao Tung University]. Airiti Library. https://doi.org/10.6842/NCTU.2009.00838
