A Preprocessing System for Optical Character Recognition of Camera-Captured Text Images

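With advances in technology, personal mobile devices now commonly come with a camera, giving phones photographic capability and letting users photograph objects of interest anywhere. If a mobile device could run an optical character recognition (OCR) system, it could offer real-time text translation. However, text images captured with mobile devices often pick up external interference that hampers recognition. Commercial OCR software already achieves recognition rates above 99% on accurately segmented single-character images, so the primary goal of this dissertation is the preprocessing stage: overcoming external factors and segmenting characters accurately. Before character recognition, text detection, text-line construction, and language identification must be performed. This dissertation constructs text lines with a bottom-up method, then analyzes each text line with k-means and least mean square error to find its typographical structure. To handle multilingual documents, we propose a language identifier that combines a feature selector with a language classifier. The identifier uses partial characters and radicals as its recognition units: the feature selector automatically picks shapelet features from the text image and passes them to the language classifier, which determines the language of the text image. For character segmentation, we propose a method that incorporates touched-character filtering and extracts periphery features from the text image; these features are passed to support vector machines to obtain confidence values. In our experiments, the overall character segmentation accuracy reached 94.90%, which demonstrates the feasibility of the proposed system.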
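The abstract names k-means and least mean square error as the tools for finding the typographical structure of a text line, but does not say how they are combined. The following is a minimal sketch under our own assumptions, not the dissertation's method: character bounding boxes are given as `(x_left, y_top, x_right, y_bottom)` with y growing downward, the bottom edges are split by 1-D k-means into a baseline group and a descender group, and a straight baseline `y = a*x + b` is fitted to the baseline group by least squares. The function names `kmeans_1d`, `fit_line`, and `estimate_baseline` are hypothetical.

```python
def kmeans_1d(values, k=2, iters=50):
    """Plain 1-D k-means; initial centers are spread evenly over the value range."""
    lo, hi = min(values), max(values)
    if k > 1:
        centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    else:
        centers = [sum(values) / len(values)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def fit_line(points):
    """Least-squares fit of y = a*x + b to a list of (x, y) points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("need at least two points with distinct x values")
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b


def estimate_baseline(boxes):
    """Assumed input: boxes as (x_left, y_top, x_right, y_bottom), y grows downward."""
    bottoms = [box[3] for box in boxes]
    labels, centers = kmeans_1d(bottoms, k=2)
    # Descenders reach lower in the image (larger y), so the cluster with the
    # smaller center is taken as the baseline group.
    base = min(range(2), key=lambda c: centers[c])
    points = [((box[0] + box[2]) / 2, box[3])
              for box, lab in zip(boxes, labels) if lab == base]
    return fit_line(points)
```

One limitation of this sketch: when a line contains no descenders, forcing two clusters may split the baseline group itself, so a real system would need a check on how far apart the two cluster centers are.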

Keywords

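character segmentation; language identification of partial characters; multilingual document analysis; curved text-line correction; text-line construction; text detection; camera-based OCR; touched character filter; typographical structure; shapelet feature; periphery feature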

Parallel Abstract

With the rapid development of camera-equipped mobile devices, "what you get is what you see" no longer seems out of reach. Combined with the proposed technique, a mobile device can serve as a translation tool by recognizing the text in a captured scene and translating it from one language to another. Images captured by cameras contain many more external or unwanted environmental effects than traditional optical character recognition (OCR) has to consider. In this dissertation, we aim to segment a text image captured by a mobile device into individual characters to facilitate later processing by an OCR kernel. Before character segmentation, text detection, text-line construction, and language identification must be performed. In our work, we construct text lines from text blocks using a bottom-up method. After text-line construction, the typographical structure is analyzed with the proposed k-means and least mean square error method. The extracted typographical structure features are then used in the later text-line completion, local binarization, and character segmentation tasks. To cope with multilingual documents, we propose a combined language identifier, consisting of a feature selector and a language identifier, that can identify the language of partial characters. Shapelet features extracted by the feature selector are used to identify the language of text blocks. A novel character segmentation method that integrates touched-character filters is applied to text images captured by cameras. In addition, periphery features are extracted from the segmented images of touched characters and fed to support vector machines (SVMs) to calculate confidence values. In our experiments, the accuracy of the proposed character segmentation system is 94.90%, which demonstrates the feasibility and effectiveness of the proposed method.
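The abstract does not define its periphery features. As an illustration only, the sketch below uses one common reading of the term in character recognition: for each row and column of a binarized glyph, the normalized distance from each image border to the first ink pixel. The function name `periphery_features`, the 0/1 list-of-rows input format, and the choice of one value per row and column side are all our assumptions, not details from the dissertation.

```python
def periphery_features(glyph):
    """Border-to-ink distances for a binary glyph.

    Assumed input: a list of equal-length rows holding 0 (background) or 1 (ink).
    Returns 2 values per row (from left, from right) followed by 2 values per
    column (from top, from bottom), each normalized to the range [0, 1].
    """
    height, width = len(glyph), len(glyph[0])

    def first_hit(cells):
        # Fraction of the scan line crossed before the first ink pixel;
        # 1.0 means the whole line is background.
        for i, value in enumerate(cells):
            if value:
                return i / len(cells)
        return 1.0

    features = []
    for row in glyph:
        features.append(first_hit(row))         # scanning from the left border
        features.append(first_hit(row[::-1]))   # scanning from the right border
    for x in range(width):
        column = [glyph[y][x] for y in range(height)]
        features.append(first_hit(column))        # scanning from the top border
        features.append(first_hit(column[::-1]))  # scanning from the bottom border
    return features
```

A vector like this has a length that depends on the glyph size, so in practice the glyph would be resized to a fixed size first; the resulting fixed-length vector is the kind of input an SVM could score to give the confidence values mentioned in the abstract.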

Parallel Keywords

periphery feature; shapelet feature; typographical structure; camera-based OCR; text detection; text-line construction; curved text-line correction; multilingual document analysis; language identification of partial characters; character segmentation; touched character filter

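Cited By

施秋安 (2015). A Study on Applying Image Processing and Optical Character Recognition to Information Extraction in Automated Production Systems [Master's thesis, Chung Yuan Christian University]. Airiti Library. https://doi.org/10.6840/CYCU.2015.00125

傅泓翊 (2012). A Video Subtitle Retrieval System: A Case Study of the NTU Literature Lecture Series Videos [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2012.00918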
