  • Thesis

Influence of Optical Aberrations on Optical Character Recognition System based on Neural Network

Advisor: 蘇國棟

Abstract


Optical character recognition (OCR) technology has matured in recent years and now plays a broad role in recognizing text in images. OCR is the process of analyzing an image of textual material to extract its text and layout information. In optics, an aberration is a deviation of the actual image from the theoretically ideal image; the aberrations considered here are spherical aberration, coma, astigmatism, field curvature, and distortion. In this thesis, we propose an optical simulation method based on ray tracing. We use the Zemax optical simulation software to design five optical aberration structures and evaluate how well an OCR system recognizes English letters imaged through them. After the aberrations have degraded the character images, we use Pytesseract, a neural-network-based image recognition program, to draw boxes around the characters and retrain on the boxed images, which improves the recognition rate. We then present the simulated recognition rates for all English letters in the OCR system. A reference design of a wide-angle lens is used to verify the effect on letter recognition, and we analyze and discuss how changing the design parameters affects text recognition by the OCR system. The neural-network-based OCR system can significantly reduce the complexity of the wide-angle lens design: the number of lenses is reduced from ten to six, and the aperture is increased from 0.62 mm to 0.71 mm. The recognition results show that repeated training on images with boxed characters improves the recognition accuracy for all English letters. Finally, we validate the proposed optical simulation method and examine how the five optical aberration structures affect the neural-network-based OCR system.
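The abstract reports recognition rates per English letter but does not say how they are computed. As a minimal sketch of one plausible definition, the snippet below compares a ground-truth string with an OCR output string position by position and reports, for each letter, the fraction of its occurrences that were recognized correctly. The function name `per_letter_recognition_rate`, the equal-length alignment assumption, and the sample strings are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def per_letter_recognition_rate(ground_truth, recognized):
    """Return {letter: fraction of occurrences recognized correctly}.

    Assumption (not from the thesis): the OCR output is aligned with the
    ground truth, one recognized character per expected character.
    """
    if len(ground_truth) != len(recognized):
        raise ValueError("sketch assumes one recognized character per expected character")
    total = defaultdict(int)
    correct = defaultdict(int)
    for expected, actual in zip(ground_truth, recognized):
        total[expected] += 1
        if expected == actual:
            correct[expected] += 1
    return {letter: correct[letter] / total[letter] for letter in total}


# Made-up strings for illustration only; not data from the thesis.
truth = "ABCABC"
ocr_output = "ABCA8C"  # the second "B" is misread as "8"
rates = per_letter_recognition_rate(truth, ocr_output)
print(rates)
```

In practice the recognized string could come from an OCR call such as `pytesseract.image_to_string` applied to an aberrated character image. When the OCR engine drops or inserts characters, position-wise comparison no longer works and an alignment step would be needed first.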
