Automatic Videotext Extraction Using Neural Networks

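Detecting, recognizing, and extracting text in images is an important part of image correction, annotation, and analysis, with many practical applications including multimedia systems, digital data, and geographic information systems. In this paper, we propose a method for videotext detection and extraction whose design takes the characteristics of text into account: high brightness, homogeneity, geometric constraints, and specific stroke directions. The system has four main modules. First, an image edge detector finds the parts of the image with strong edges. Second, a text enhancer strengthens the edges that may belong to text. Third, a text block detector marks the text regions in the image. Finally, a neural network trained with the backpropagation learning algorithm extracts the text. We tested the system on several sets of images overlaid with text in different languages; the experimental results show that our approach gives quite satisfactory results across different contrasts, fonts, text colors, and levels of background complexity.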

Keywords

text detection; text recognition; text extraction

Parallel Abstract

Videotext detection, recognition, and extraction are key components of video retrieval, annotation, and analysis systems, with many practical applications such as multimedia systems, digital libraries, and video indexing. In this paper, we propose a videotext extraction method that takes the characteristics of text into consideration, including brightness, physical constraints, geometric restrictions, and specific stroke directions. The system consists of four major modules. First, we employ a color image edge operator to detect the strong edges in an image. Next, we adopt a text enhancement operator to enhance the edges of potential text. Then, a text range detection operator is used to identify blocks of text. Finally, we use a neural network with a backpropagation learning algorithm to extract the text. We have applied the proposed system to detect and extract text from a set of images embedded with text in different languages. Experimental results show that it is robust to variations in contrast, font size, font color, and background complexity.

Parallel Keywords

text detection; text extraction; text recognition
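
The first and third modules described in the abstract (edge detection, then text block detection) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain grayscale Sobel operator stands in for the paper's color edge operator, a row-wise edge-density test stands in for the text range detection operator, and the names `sobel_magnitude` and `text_rows` and the thresholds `edge_thresh` and `density` are hypothetical values chosen only for illustration. The text enhancement and neural network stages are omitted.

```python
def sobel_magnitude(img):
    """Sobel gradient magnitude (|gx| + |gy|) of a 2-D list of grey levels.

    Border pixels are left at 0. Stand-in for the paper's edge operator.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


def text_rows(edges, edge_thresh=200, density=0.3):
    """Return (first_row, last_row) bands where the fraction of strong-edge
    pixels per row is at least `density`. Both thresholds are assumptions."""
    w = len(edges[0])
    bands, start = [], None
    for y, row in enumerate(edges):
        strong = sum(1 for v in row if v >= edge_thresh)
        dense = strong / w >= density
        if dense and start is None:
            start = y                      # a candidate text band opens
        elif not dense and start is not None:
            bands.append((start, y - 1))   # the band closes on the previous row
            start = None
    if start is not None:
        bands.append((start, len(edges) - 1))
    return bands


# Synthetic frame: dark background with bright vertical "strokes" in rows 4-7.
frame = [[0] * 16 for _ in range(12)]
for y in range(4, 8):
    for x in range(16):
        if (x // 2) % 2 == 0:
            frame[y][x] = 255

print(text_rows(sobel_magnitude(frame)))  # [(3, 8)]
```

The reported band is one row wider than the stroke rows on each side because the 3x3 Sobel kernel also responds on the rows adjacent to the strokes. A real detector would add a column-wise pass and geometric checks before passing candidate blocks to a classifier.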


