  • Thesis

使用生成對抗網路完成七段顯示之數字字元辨識

Seven-segment Digits Recognition Based on Generative Adversarial Network

Advisor: 秦群立

Abstract


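Electronic measuring instruments with wireless transmission are now the trend, but because they cost far more than instruments without it, they have not been widely adopted. We therefore propose reading the displayed values from images captured with a mobile phone, as a low-cost alternative to wireless-enabled instruments. Most such instruments show seven-segment digits on an LCD, but recognition from photographs is limited by capture conditions: factors such as tilt, blur, and reflection affect the result. This thesis proposes a new seven-segment digit image recognition system based on a generative adversarial network (SD-GAN). The system uses multiple loss functions to overcome the difficulty of recognizing images taken from different shooting angles, and thereby improves the accuracy of seven-segment digit recognition. We evaluated the method with SSIM and PSNR, obtaining an SSIM of 0.96 and a PSNR of 23.8. The confusion-matrix accuracy was 99.51% in training and 99.55% in testing, and the recognition accuracy was 97.92% in training and 94.87% in testing; the system outputs an image in which the seven-segment digits are accurately drawn. The method not only overcomes the problems of text orientation and background reflection but also achieves a breakthrough in text recognition. Capturing images with a mobile phone makes data collection far more convenient, and combining it with a deep learning algorithm raises recognition accuracy and so eases the limitations of photographed images. The approach can replace costly wireless-enabled instruments and has great potential for data collection.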
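The abstract reports a PSNR value but does not say how it was computed. For reference, the following is a minimal sketch of the standard PSNR definition, assuming 8-bit grayscale images stored as nested lists of pixel values. The function name `psnr` and the `max_value` parameter are illustrative choices, not taken from the thesis.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio between two equal-sized grayscale images.

    Both images are lists of rows of pixel intensities. max_value is the
    largest possible intensity (255 is an assumption for 8-bit images).
    """
    if len(reference) != len(generated):
        raise ValueError("images must have the same number of rows")

    squared_error = 0.0
    pixel_count = 0
    for ref_row, gen_row in zip(reference, generated):
        if len(ref_row) != len(gen_row):
            raise ValueError("rows must have the same length")
        for ref_px, gen_px in zip(ref_row, gen_row):
            squared_error += (ref_px - gen_px) ** 2
            pixel_count += 1

    if pixel_count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / pixel_count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

If the reported 23.8 is in decibels on an 8-bit scale (an assumption, since the abstract gives no unit or range), it would correspond to a mean squared error of roughly 271 between the generated and reference images.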

Parallel Abstract


Electronic detectors with wireless transmission are now the trend, but they are not widely adopted because they are much more expensive than those without it. We therefore propose capturing images with a mobile phone to read the displayed value, avoiding the high cost of a wireless-enabled detector. Most electronic detectors use an LCD to display seven-segment digits, but recognition from captured images is limited by factors such as tilt, blur, and reflection, which easily affect the recognition result. In this paper, we propose seven-segment digit recognition based on a generative adversarial network (SD-GAN) with multiple loss functions, in order to solve the text rotation problem and improve the accuracy of seven-segment digit recognition. We evaluated the method with SSIM and PSNR: the SSIM is 0.96 and the PSNR is 23.8. The confusion-matrix accuracy is 99.51% in the training phase and 99.55% in the test phase; the recognition accuracy is 97.92% in the training phase and 94.87% in the test phase. The system produces an image in which the seven-segment digits are accurately drawn. It not only overcomes the problems of text direction and background reflection but also makes a breakthrough in text recognition. Capturing images with mobile phones greatly increases the convenience of collecting data, and the deep learning algorithm eases the limitations of image capture and increases recognition accuracy. The method therefore not only replaces the high-cost wireless transmission function of electronic detectors but also has great potential for data collection.
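The abstract quotes accuracy figures derived from a confusion matrix without describing the computation. As an illustration only, the sketch below shows the usual definition of overall accuracy (correct predictions on the diagonal divided by all predictions), assuming a square matrix whose rows are true classes and whose columns are predicted classes. The function name `accuracy_from_confusion` is hypothetical, and the thesis may define its metrics differently.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy from a square confusion matrix.

    matrix[i][j] is the number of samples of true class i that were
    predicted as class j (this row/column convention is an assumption).
    """
    size = len(matrix)
    if size == 0 or any(len(row) != size for row in matrix):
        raise ValueError("confusion matrix must be square and non-empty")

    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Diagonal entries count the samples whose prediction matched the label.
    correct = sum(matrix[i][i] for i in range(size))
    return correct / total
```

For digit recognition the matrix would typically be 10 by 10, one row and column per digit from 0 to 9.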

