
Optical Character Recognition Using Global and Local Features with Two-Stage Classification for Industrial Applications

Advisor: 陳金聖

Abstract


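Optical character recognition (OCR) is widely used in industry. The recognition pipeline can be divided into four main blocks: image preprocessing, character segmentation, feature extraction, and classification. This study focuses on feature extraction and classification. It extracts global and local features from the character image and performs two-stage recognition with a hierarchical architecture that combines a modified k-means clustering method with a back-propagation neural network.

The method follows a staged classification strategy. In the first stage, geometric features are extracted to perform a preliminary clustering on the global information of the image, and the modified k-means method estimates the optimal number of clusters from the geometric feature values. In the second stage, higher-dimensional local features are used to recognize the test image with neural networks. The test image only needs to be searched against the networks of the candidate clusters indicated by the first-stage result, so filtering out dissimilar groups reduces the number of candidates compared in the second stage. The two-stage classifier achieves a higher character recognition rate than k-means clustering or the neural network used alone.

The study aims to overcome stroke-thickness variation within a single typeface. The first-stage geometric features are therefore the character aspect ratio, the skeleton area, and the geometric distance. In the second stage, the image is normalized in size and divided into several blocks, and the distribution of area across the blocks serves as the input feature for recognition. The training samples cover 36 characters: the English letters A to Z and the digits 0 to 9. Tests were run on two typefaces, Times New Roman and Arial, and on Gaussian-blurred Times New Roman character images, with recognition accuracies of 97.22%, 100%, and 90.28%, respectively.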
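The two simplest features named in the abstract, the aspect ratio (global) and the per-block area distribution after size normalization (local), can be sketched as below. This is a minimal illustration and not the thesis implementation: the function names, the 16x16 normalized size, the 4x4 block grid, and nearest-neighbour resizing are all assumptions, and the skeleton-area and geometric-distance features are omitted because the abstract does not define them.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = stroke pixel, 0 = background


def bounding_box(img: Image) -> Image:
    """Crop the image to the tightest box that contains all stroke pixels."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        raise ValueError("image contains no stroke pixels")
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def aspect_ratio(img: Image) -> float:
    """Global feature: width / height of the character's bounding box."""
    box = bounding_box(img)
    return len(box[0]) / len(box)


def resize_nearest(img: Image, size: int) -> Image:
    """Normalize to size x size using nearest-neighbour sampling (assumed)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def zone_densities(img: Image, size: int = 16, zones: int = 4) -> List[float]:
    """Local feature: fraction of stroke pixels in each block of a grid.

    `size` must be divisible by `zones`; both defaults are assumed values.
    """
    if size % zones != 0:
        raise ValueError("size must be divisible by zones")
    norm = resize_nearest(bounding_box(img), size)
    step = size // zones
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            total = sum(norm[r][c]
                        for r in range(zr * step, (zr + 1) * step)
                        for c in range(zc * step, (zc + 1) * step))
            feats.append(total / (step * step))
    return feats
```

With these defaults each character yields a 16-dimensional local feature vector, which is the kind of higher-dimensional input the second stage would consume; the grid size is a tuning choice, not a value taken from the thesis.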

Parallel Abstract


Optical character recognition (OCR) systems are widely used in industrial applications and have become one of the most important applications of pattern recognition and artificial intelligence. OCR contributes significantly to the advancement of automation processes and improves the interface between human and machine. This thesis presents a hierarchical OCR classification scheme that uses global and local features with modified k-means clustering and a neural network. The hierarchical classification strategy, which comprises a training phase and a recognition phase, is designed to recognize characters precisely. In the training phase, the global features of the sample characters are clustered with modified k-means to quickly group the characters into several clusters. The local features of the sample characters in each cluster are then fed into a Back-Propagation Neural Network (BPN) to learn the optimal weights. In the recognition phase, both global and local features are extracted from the test character. The character is then recognized in two steps: the trained cluster centers provide coarse recognition, and the neural network provides fine recognition. Experiments on three datasets yielded classification rates of 97.22%, 100%, and 90.28%, respectively.
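The coarse-then-fine recognition phase described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the thesis code: the class name `TwoStageClassifier` and its fields are hypothetical, the cluster centers are taken as given rather than learned with the modified k-means method, and a nearest-neighbour template match stands in for the per-cluster back-propagation network.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(p: Vector, q: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


class TwoStageClassifier:
    """Coarse stage: nearest cluster center on global features.
    Fine stage: classifier restricted to the chosen cluster.

    The fine stage here is 1-nearest-neighbour over stored templates,
    an assumed stand-in for the per-cluster neural network.
    """

    def __init__(self, centers: List[Vector],
                 templates: Dict[int, List[Tuple[Vector, str]]]) -> None:
        self.centers = centers      # one global-feature center per cluster
        self.templates = templates  # cluster index -> (local features, label)

    def coarse(self, global_feats: Vector) -> int:
        """Return the index of the closest cluster center."""
        return min(range(len(self.centers)),
                   key=lambda i: euclidean(self.centers[i], global_feats))

    def predict(self, global_feats: Vector, local_feats: Vector) -> str:
        """Pick a cluster first, then compare only against its members."""
        cluster = self.coarse(global_feats)
        _, label = min(self.templates[cluster],
                       key=lambda t: euclidean(t[0], local_feats))
        return label


# Toy data: two clusters separated by aspect ratio, two characters in each.
clf = TwoStageClassifier(
    centers=[[0.5], [1.0]],
    templates={
        0: [([1.0, 0.0], "I"), ([0.0, 1.0], "1")],
        1: [([1.0, 0.0], "O"), ([0.0, 1.0], "D")],
    },
)
narrow = clf.predict([0.55], [0.9, 0.1])
wide = clf.predict([0.95], [0.1, 0.8])
```

The point of the structure is that the fine stage compares a test character only against the members of one cluster, which is how the first stage reduces the number of candidates the second stage has to consider.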

