適用於卷積神經網路之擴充識別機制之設計

Design of an Extended Recognition Mechanism for Convolutional Neural Networks

Advisor: 朱守禮

Abstract


In recent years, deep learning has become an important field of artificial intelligence. Among the many deep neural network architectures, the convolutional neural network (CNN) has shown particular promise in computer vision. To improve accuracy and increase the number of recognizable categories, CNN architectures have grown steadily deeper, and the training time and training data they require have grown substantially as well. Transfer learning is used to reduce the training time a CNN model needs. However, when the number of categories is expanded, transfer learning still requires modifying the CNN architecture and retraining the model. This study therefore proposes the Extended Learning method, which aims to let a CNN gain the ability to recognize new categories on the basis of its already-learned layer weights, without rebuilding the network architecture. Compared with transfer learning, extended learning does not require modifying the CNN architecture; compared with the general learning method, it reaches the same recognition ability with less total training data. This study also proposes a training mechanism that adjusts the training steps of extended learning to improve training effectiveness. Experiments show that, at a target accuracy of 0.75, extended learning reduces the total amount of training data by approximately 28% compared with the general method of training a CNN.
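The abstract does not say how extended learning avoids rebuilding the network, so the following is only an illustrative sketch of one way a classifier head could accept new categories without changing shape. It is not the thesis's mechanism. The class `ReservedSlotHead`, its `capacity` and `n_active` parameters, and the idea of reserving spare output slots are all assumptions introduced here for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ReservedSlotHead:
    """Hypothetical output layer allocated with spare class slots.

    Assumption: the layer is sized for `capacity` classes up front, but
    only the first `n_active` slots take part in prediction. Adding
    categories then activates more slots instead of resizing the layer.
    """

    def __init__(self, n_features, capacity, n_active, seed=0):
        rng = random.Random(seed)
        # One weight row per output slot; the shape never changes afterwards.
        self.weights = [
            [rng.gauss(0.0, 0.01) for _ in range(n_features)]
            for _ in range(capacity)
        ]
        self.biases = [0.0] * capacity
        self.n_active = n_active

    def extend(self, n_new):
        """Activate n_new reserved slots; existing weights are untouched."""
        if self.n_active + n_new > len(self.weights):
            raise ValueError("not enough reserved output slots")
        self.n_active += n_new

    def predict_proba(self, features):
        """Softmax over active slots; inactive slots get probability 0."""
        active_rows = self.weights[: self.n_active]
        logits = [
            sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(active_rows, self.biases)
        ]
        inactive = len(self.weights) - self.n_active
        return softmax(logits) + [0.0] * inactive


head = ReservedSlotHead(n_features=4, capacity=10, n_active=6)
before = head.predict_proba([1.0, 2.0, 3.0, 4.0])
head.extend(2)  # two new categories, same layer shape
after = head.predict_proba([1.0, 2.0, 3.0, 4.0])
```

By contrast, the transfer learning workflow described in the abstract would replace the output layer with one sized for the new category count, which is the architecture change that extended learning is said to avoid. The newly activated slots would still need training on data for the new categories.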

