Text Extraction on Video (影片中的文字擷取)

Advisor: 顏淑惠

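Abstract


To manage large collections of video files effectively, this thesis develops a method for extracting representative text from videos. First, text detection is run over the whole video: one check is made every x frames, from the beginning of the video to the end. Each detection round determines not only whether text is present but also compares the overlap and similarity of text between frames. For each text segment, the start frame, end frame, reference frames, and representative frame are recorded, and the location of the text region within the frame is marked. To make the detection result more accurate, text segments are then merged so that the final number of detected segments is as small as possible, with the aim of matching the actual number of text segments. After text detection, background information is first removed using geodesic dilation from mathematical morphology. Simple histogram equalization is then applied to enhance image contrast. Finally, text extraction is performed in preparation for later text recognition.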
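The segment bookkeeping described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the detector output is assumed to be one bounding box (or `None`) per sampled frame, "same text" is approximated by box overlap alone, and the names `TextSegment`, `build_segments`, `merge_segments`, `min_overlap`, and `max_gap` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (left, top, right, bottom), with right and bottom exclusive.
Box = Tuple[int, int, int, int]


@dataclass
class TextSegment:
    start_frame: int
    end_frame: int
    box: Box


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection area divided by the area of the smaller box."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    smaller = min((a[2] - a[0]) * (a[3] - a[1]),
                  (b[2] - b[0]) * (b[3] - b[1]))
    return (w * h) / smaller


def build_segments(detections: List[Optional[Box]], step: int,
                   min_overlap: float = 0.8) -> List[TextSegment]:
    """detections[i] is the box found at frame i * step, or None if no text."""
    segments: List[TextSegment] = []
    current: Optional[TextSegment] = None
    for i, box in enumerate(detections):
        frame = i * step
        if box is None:
            current = None  # text disappeared, so close the open segment
        elif current is not None and overlap_ratio(current.box, box) >= min_overlap:
            current.end_frame = frame  # same text continues
        else:
            current = TextSegment(frame, frame, box)  # new text appears
            segments.append(current)
    return segments


def merge_segments(segments: List[TextSegment], max_gap: int,
                   min_overlap: float = 0.8) -> List[TextSegment]:
    """Merge consecutive segments that are close in time and overlap in space."""
    merged: List[TextSegment] = []
    for seg in segments:
        last = merged[-1] if merged else None
        if (last is not None
                and seg.start_frame - last.end_frame <= max_gap
                and overlap_ratio(last.box, seg.box) >= min_overlap):
            last.end_frame = seg.end_frame
        else:
            merged.append(TextSegment(seg.start_frame, seg.end_frame, seg.box))
    return merged
```

With a sampling step of 5, a missed detection in the middle of a caption first splits it into two segments; `merge_segments` with `max_gap=10` then joins them again, which is the effect the merging step is meant to have on the segment count.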

Parallel Abstract


With the rapid growth of digital technology, videos now play an important role in daily life. Because of the huge amount of video data, efficient means of accessing and retrieving it are needed. Text in videos is a powerful cue for understanding video content, so we propose a method to extract text from videos. Text detection consists of two stages: detection over the whole video, and merging of video clips that contain the same text. First, in each round, text regions are roughly labeled by applying the Canny edge detection algorithm to 7 consecutive frames and taking the intersection of their edge pixels. To determine whether two frames contain the same text, region overlap and the black-white transition count (BWTC) are compared. For each text t, the video clip is recorded with its start and end frames, reference frames, and corresponding frame. Two consecutive clips are merged if they contain the same text. A text mask Mt is constructed from the reference frames of text t, and the text regions are then refined using the text masks. Before text extraction, the similarity of the refined text regions is compared again for possible further merging of video clips. Text extraction applies three steps to the corresponding frame of each text: background removal, contrast enhancement, and binarization. The background is removed by morphological reconstruction. To obtain better binarization results, contrast is enhanced by multi-stage histogram equalization. Finally, binarization is performed with a moving average algorithm. Experimental results demonstrate the effectiveness of the proposed method.
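As a rough illustration of the same-text test, the sketch below assumes that BWTC is the number of value changes between horizontally adjacent pixels in a binarized text region; the abstract does not define it further. The names `bwtc` and `same_text`, the precomputed `overlap` argument, and both thresholds are hypothetical choices made for this example.

```python
from typing import List

# Rows of 0 (black) and 1 (white) pixels.
Binary = List[List[int]]


def bwtc(region: Binary) -> int:
    """Count value changes between horizontally adjacent pixels, over all rows."""
    return sum(
        sum(1 for left, right in zip(row, row[1:]) if left != right)
        for row in region
    )


def same_text(region_a: Binary, region_b: Binary, overlap: float,
              min_overlap: float = 0.8, max_bwtc_diff: float = 0.2) -> bool:
    """Assumed rule: regions must overlap enough AND have similar BWTC."""
    if overlap < min_overlap:
        return False
    count_a, count_b = bwtc(region_a), bwtc(region_b)
    larger = max(count_a, count_b)
    if larger == 0:
        # Neither region has any transition; the counts are trivially equal.
        return True
    return abs(count_a - count_b) / larger <= max_bwtc_diff
```

Using the relative difference rather than the absolute difference keeps the threshold independent of region size, which seems a reasonable design choice when captions of different lengths are compared.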
