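This paper proposes a caption detection and inpainting method for removing captions embedded in video. Although embedded captions can be erased manually with video or image editing tools, doing so takes enormous effort, so techniques for automatic caption detection and removal are necessary. Many automatic caption detection methods exist, but none offers a solution for identifying and detecting the blurred captions that appear during caption transitions, and detection is easily misled by complex video backgrounds. In this paper, we exploit the temporal correlation of captions to propose a mechanism for identifying and detecting caption transitions, and we use the inherent high contrast of captions to improve detection accuracy. To restore the image content in regions where captions have been removed, we combine an image inpainting method with motion vector information to propose a video inpainting algorithm that keeps the restored result consistent and continuous in both the spatial and temporal domains. Experiments on real broadcast programs show that the proposed caption detection and restoration method outperforms conventional methods from the literature.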
This paper proposes a video text detection and completion method for removing embedded captions from broadcast programs. Captions can be removed manually, frame by frame, with image editing tools, but doing so takes a considerable amount of time and effort. Many automatic text detection methods have been proposed to solve this problem, but none of the existing methods handles real scenarios in which captions are blurred during caption transitions or appear over complicated backgrounds. This work develops a real-time caption detection algorithm that exploits the temporal relation observed during caption transitions, and it improves the caption detection rate over complicated backgrounds by using the high-contrast property of captions in the spatial domain. To complete the detected caption region, we extend an exemplar-based image inpainting algorithm to video inpainting by incorporating motion vectors into the completion priority, so as to maintain spatial consistency and temporal continuity during playback. Experiments on real television broadcast video clips show that the proposed text detection and completion method outperforms conventional methods from the literature.
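As a rough illustration of how high contrast and temporal persistence might be combined for caption detection, the following is a minimal sketch and not the algorithm of this paper. The function names, the horizontal-difference contrast test, the thresholds (100 and 0.8), and the use of mask overlap as a hint of a caption transition are all assumptions introduced for the example.

```python
from typing import List

Frame = List[List[int]]   # grayscale image, values 0..255 (assumed input format)
Mask = List[List[bool]]


def contrast_mask(frame: Frame, threshold: int = 100) -> Mask:
    """Mark pixels on either side of a strong horizontal intensity step."""
    height, width = len(frame), len(frame[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width - 1):
            if abs(frame[y][x] - frame[y][x + 1]) >= threshold:
                mask[y][x] = True
                mask[y][x + 1] = True
    return mask


def persistent_mask(frames: List[Frame], threshold: int = 100,
                    min_fraction: float = 0.8) -> Mask:
    """Keep pixels that are high-contrast in at least min_fraction of frames.

    Static captions stay in place while the background moves, so edges
    that persist over time are more likely to belong to a caption.
    """
    masks = [contrast_mask(f, threshold) for f in frames]
    height, width = len(frames[0]), len(frames[0][0])
    needed = min_fraction * len(frames)
    return [[sum(m[y][x] for m in masks) >= needed for x in range(width)]
            for y in range(height)]


def mask_overlap(a: Mask, b: Mask) -> float:
    """Jaccard overlap of two masks; defined as 1.0 when both are empty."""
    inter = sum(p and q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    union = sum(p or q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return inter / union if union else 1.0
```

In this sketch, a low `mask_overlap` between the persistent masks of two consecutive frame windows could serve as a cue that the caption has changed. A practical detector would also need to handle the blurred intermediate frames of a transition, which this example does not attempt.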