
以深度學習拆解與辨識中文書法字之筆畫

Deep Learning Algorithm on the Segmentation and Recognition of Chinese Calligraphy

Advisors: 王偉彥, 許陳鑑

Abstract


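This thesis addresses an aspect of Chinese calligraphy that has received comparatively little attention: the strokes. Previous research on text has mostly concerned character recognition, for example Optical Character Recognition (OCR), whose main goal is to "recognize" the characters. This thesis instead understands characters through their strokes, then segments, recognizes, and reproduces them, and accordingly proposes a deep-learning-based system for stroke segmentation, stroke recognition, and real-time writing. The verification platform reads a character image through a webcam and then writes the character in real time with a parallel manipulator (Delta Robot). The stroke recognition system performs object recognition with deep learning, which has developed rapidly in recent years and has already proven its strong capability in image recognition: by learning the target objects from large datasets, it produces a well-fitted network model, which is then used to recognize the objects of interest. This thesis therefore adopts deep learning and modifies part of the neural network architecture to obtain better stroke recognition results. The system builds on YOLO (You Only Look Once) for its good detection speed and accuracy in real-time detection and localization. Once the recognition and localization results are available, the resulting object information is used for further object segmentation; image pre-processing then filters out interference and extracts the skeleton to obtain the coordinate points of each stroke object, which are finally passed to the parallel manipulator for writing and reconstruction of the character. In addition, because training a neural network requires a large amount of computation, both inference and training are accelerated with parallel computation on a GPU. This thesis treats character strokes as objects and uses deep learning to recognize and localize them, which yields the stroke type and coordinates at the same time. It also modifies the YOLO-based network architecture for stroke recognition, further improving recognition and localization accuracy while preserving the original recognition speed.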

Parallel Abstract


This thesis focuses on an aspect of Chinese calligraphy that is often overlooked: the strokes. Most research on characters concerns text recognition; Optical Character Recognition (OCR), for example, mainly aims to identify the characters themselves. This thesis proposes a deep learning system for the segmentation and recognition of Chinese calligraphy. It interprets characters through their stroke structure and then segments, identifies, and reconstructs them. As a demonstration, the system reads images of handwritten text with a webcam and then writes the Chinese characters with a Delta Robot. The system uses deep learning, which has developed rapidly in recent years, for object detection. Deep learning has proved its capability and potential in image recognition: it learns the target objects from datasets and generates a suitable network model, which is then used to identify those objects. This thesis therefore adopts deep learning, trains on a large amount of stroke data, and modifies part of the neural network structure to obtain better stroke recognition results. The system builds on YOLOv2 for its good speed and accuracy in real-time detection and localization. After the detection and localization results are obtained, the objects are segmented using that information. Image pre-processing is then applied to filter noise and extract the skeleton of each object. Finally, the coordinate points of every stroke object are obtained and the character is rewritten by the Delta Robot. In addition, because training and object detection are computationally heavy, a GPU is used for parallel computing to accelerate all neural network processing and training. This thesis treats text strokes as objects and uses deep learning to recognize and locate them, which yields both the stroke type and its coordinates. The architecture is based on YOLOv2 and is modified for stroke recognition, so that detection and localization accuracy improve while the original detection speed is maintained.
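The post-detection steps described in the abstract (cropping a detected stroke, extracting its skeleton, and producing coordinate points) can be sketched roughly as follows. This is only an illustrative sketch, not the thesis implementation: the abstract does not say which skeletonization algorithm is used, so Zhang-Suen thinning is assumed here, and the `(x, y, width, height)` box format, the binary list-of-lists image, and the function names `crop`, `zhang_suen`, and `stroke_points` are all hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]          # 1 = ink, 0 = background (assumed binarized input)
Box = Tuple[int, int, int, int]  # (x, y, width, height): hypothetical detector output


def crop(image: Image, box: Box) -> Image:
    """Cut the region of one detected stroke out of the full image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def zhang_suen(img: Image) -> Image:
    """Thin a binary image with Zhang-Suen thinning (an assumed choice)."""
    h, w = len(img), len(img[0])
    # Pad with a background border so every pixel has eight neighbours.
    g = [[0] * (w + 2)] + [[0] + list(r) + [0] for r in img] + [[0] * (w + 2)]
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, h + 1):
                for c in range(1, w + 1):
                    if g[r][c] == 0:
                        continue
                    # Neighbours P2..P9, clockwise starting from north.
                    p = [g[r - 1][c], g[r - 1][c + 1], g[r][c + 1], g[r + 1][c + 1],
                         g[r + 1][c], g[r + 1][c - 1], g[r][c - 1], g[r - 1][c - 1]]
                    b = sum(p)  # number of ink neighbours
                    # Number of 0 -> 1 transitions around the pixel.
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    if step == 0:
                        ok = p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0
                    else:
                        ok = p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((r, c))
            # Clear all flagged pixels at once, after the whole pass.
            for r, c in to_clear:
                g[r][c] = 0
            changed = changed or bool(to_clear)
    return [row[1:-1] for row in g[1:-1]]


def stroke_points(image: Image, box: Box) -> List[Tuple[int, int]]:
    """Return skeleton pixels of one stroke as (x, y) in full-image coordinates."""
    x0, y0, _, _ = box
    skeleton = zhang_suen(crop(image, box))
    return [(x0 + c, y0 + r)
            for r, row in enumerate(skeleton)
            for c, v in enumerate(row) if v]


# Toy example: a horizontal bar 3 pixels thick and 7 pixels long.
image = [[1 if 1 <= r <= 3 and 1 <= c <= 7 else 0 for c in range(9)]
         for r in range(5)]
points = stroke_points(image, (1, 1, 7, 3))
print(points)  # -> [(2, 2), (3, 2), (4, 2), (5, 2)]
```

In this toy case the thick bar collapses to a one-pixel line along its middle row. The points come out in row-major order; a real writing system would still need to order them along the stroke direction and convert pixel coordinates to robot coordinates, which the abstract does not detail.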

