
以屬性字串比對法做線上草寫連體字辨識

On-line Cursive Script Recognition Using Attributed String Matching

Advisor: 蔡義泰

Abstract


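Character recognition has been studied for a long time, yet research on cursive script remains comparatively scarce, and several problems are still unresolved: segmenting connected script, backtracking to position the dots, crossbars, and slashes of delayed-written letters, and postprocessing to correct the matched letter combinations. In this thesis we propose a cursive script recognition system consisting of five parts: preprocessing, feature extraction, segmentation, matching, and postprocessing. Preprocessing comprises the usual simplification, smoothing, slant correction, and size normalization. In feature extraction, each unknown segment trajectory is represented simply by the defined shape features and the length of the corresponding trajectory. Segmentation handles the cutting of connected script by first cutting out all possible segments and then merging them, and it uses the spatial relationships among segments to position delayed strokes. Matching uses attributed string matching with dynamic programming, taking the string edit cost as the basis for each segment's recognition result, to obtain candidate letters for each segment. Postprocessing combines the candidate letters of the segments, uses a dictionary to filter out combinations that are not dictionary words, and takes the total matching cost of the remaining legal combinations as the basis for the system's recognition result. Finally, a series of experiments was conducted to verify the correctness and reliability of the proposed system.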
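To illustrate the matching step, here is a minimal Python sketch of attributed string matching by dynamic programming. The abstract does not give the thesis's primitive set or cost functions, so everything here is an assumption made for illustration: each primitive is a `(shape, length)` pair, a substitution costs a fixed shape penalty plus the length difference, and an insertion or deletion costs the primitive's length. The names `attributed_edit_distance` and `rank_candidates` are hypothetical.

```python
from typing import Dict, List, Tuple

# A primitive is (shape label, relative length). Both the label set and
# the cost functions below are illustrative assumptions, not the thesis's.
Primitive = Tuple[str, float]


def substitution_cost(a: Primitive, b: Primitive, shape_penalty: float = 1.0) -> float:
    """Cost of turning primitive a into b: shape mismatch plus length difference."""
    shape_term = 0.0 if a[0] == b[0] else shape_penalty
    return shape_term + abs(a[1] - b[1])


def indel_cost(p: Primitive) -> float:
    """Cost of inserting or deleting a primitive: its length (assumed)."""
    return p[1]


def attributed_edit_distance(x: List[Primitive], y: List[Primitive]) -> float:
    """Minimum total edit cost between two attributed strings."""
    m, n = len(x), len(y)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + indel_cost(x[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + indel_cost(y[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost(x[i - 1]),      # delete x[i-1]
                d[i][j - 1] + indel_cost(y[j - 1]),      # insert y[j-1]
                d[i - 1][j - 1] + substitution_cost(x[i - 1], y[j - 1]),
            )
    return d[m][n]


def rank_candidates(
    segment: List[Primitive], templates: Dict[str, List[Primitive]]
) -> List[Tuple[str, float]]:
    """Return (letter, cost) pairs sorted by increasing edit cost."""
    scored = [(letter, attributed_edit_distance(segment, ref))
              for letter, ref in templates.items()]
    return sorted(scored, key=lambda pair: pair[1])
```

Under these assumed costs, the lowest-cost templates play the role of the segment's candidate letters; a real system would tune the primitives and weights to its own feature definitions.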

Parallel Abstract


Character recognition has been studied for a long time, but cursive script recognition still has unsolved problems. For example, the segmentation of cursive script, the handling of delayed-written letters, and the postprocessing of recognition results all need more investigation. In this thesis, we propose a recognition system for cursive script. It includes five parts: preprocessing, segmentation, feature extraction, matching, and postprocessing. Preprocessing consists of several steps, including filtering, smoothing, deskewing, and size normalization. In feature extraction, we represent the unknown segmented patterns simply by their shape primitives and relative lengths. In segmentation, we solve the segmentation problem by finding all possible segments and combining them into larger segments, and we use the spatial relationships among the segments to solve the problem of delayed-written letters. In matching, attributed string matching based on a dynamic programming algorithm is applied. The candidates for each segment are chosen by the minimum distance between the input pattern and the reference patterns. Postprocessing combines all possible segment candidates, checks the combinations against a dictionary, and obtains the final recognition result as the combination with the minimum sum of segment matching distances. Extensive experiments on cursive script are conducted to examine how the proposed recognition system performs.
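The dictionary-based postprocessing can be sketched as follows. This is only an illustration under assumptions: each segment is taken to carry a short list of `(letter, cost)` candidates, every combination is enumerated exhaustively, and the function name `best_dictionary_word` and the sample data are hypothetical. The abstract does not say how the thesis enumerates or prunes combinations.

```python
from itertools import product
from typing import List, Optional, Set, Tuple

Candidate = Tuple[str, float]  # (letter, matching cost), assumed representation


def best_dictionary_word(
    segment_candidates: List[List[Candidate]], dictionary: Set[str]
) -> Optional[Tuple[str, float]]:
    """Pick the dictionary word with the smallest total matching cost.

    Returns None when no combination of candidates forms a dictionary word.
    """
    best: Optional[Tuple[str, float]] = None
    for combo in product(*segment_candidates):
        word = "".join(letter for letter, _ in combo)
        if word not in dictionary:
            continue  # filter out combinations that are not legal words
        total = sum(cost for _, cost in combo)
        if best is None or total < best[1]:
            best = (word, total)
    return best


# Hypothetical candidates for three segments.
candidates = [
    [("c", 1.0), ("e", 1.5)],
    [("a", 0.5), ("o", 2.0)],
    [("t", 1.0), ("l", 0.5)],
]
result = best_dictionary_word(candidates, {"cat", "eat", "cot"})
```

In this example "cal" has the lowest raw cost (2.0) but is rejected by the dictionary, so `result` is `("cat", 2.5)`. Exhaustive enumeration grows exponentially with the number of segments, so it is practical only for short candidate lists.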
