
Investigate the Training of Deep Learning Network

Advisor: 陳大正

Abstract


The deep network most commonly used by deep learning algorithms in computer vision is the Convolutional Neural Network (CNN). A CNN is composed of one or more convolutional layers with fully-connected layers on top, and most of the deep networks used in this kind of deep learning extend earlier methods. In the past, limited hardware computing power and the difficulty of digitizing data made it hard to train large neural network models, and how to effectively train such large deep networks has rarely been discussed in the literature; this study therefore investigates how to train deep neural networks. In this study, we train the network with the data split into batches presented in turn, and with the whole dataset at once, and compare the results of these different training methods in order to identify the better approach and improve the learning performance of the deep learning network. The results show that training the deep network with the whole dataset at once outperforms training it batch by batch, and that training with more samples per class outperforms training with fewer samples per class. These results provide a better understanding of current deep learning techniques for practical research and applications.
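
To make the structure concrete, the following is a minimal sketch in Python (PyTorch) of a CNN of the kind described above: convolutional layers, pooling layers, and a fully-connected layer on top. The layer sizes, the single-channel 28x28 input, and the ten-class output are illustrative assumptions, not the network actually used in this thesis.

import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # one or more convolutional layers, each followed by pooling
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # fully-connected layer on top of the convolutional features
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)        # convolution and pooling
        x = torch.flatten(x, 1)     # flatten each sample to a vector
        return self.classifier(x)   # class scores

model = SmallCNN()
print(model(torch.randn(4, 1, 28, 28)).shape)  # -> torch.Size([4, 10])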

Parallel Abstract (English)


The deep learning network most often used in computer vision is the Convolutional Neural Network (CNN). The structure of a CNN includes convolutional layers, pooling layers, and fully-connected layers. Such deep learning networks are mostly extended from earlier neural network methods. In the past it was difficult to train large-scale neural network models because of limited hardware computing power and the difficulty of digitizing data. How to effectively train large deep networks has received little attention in the literature, so this study focuses on how to train deep neural networks effectively. In this study, we compare training with the dataset split into batches against training with the whole dataset at once. By comparing the results obtained from these different training methods, we aim to identify the better training approach and improve the learning performance of the deep network. The results of this study show that: (1) a CNN trained with the whole dataset at once performs better than one trained with the dataset split into separate batches; (2) training the CNN with more samples per category yields better classification accuracy. It is hoped that these results provide a better understanding of training CNNs for practical applications.
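
For illustration, the following is a minimal sketch in Python (PyTorch) of the two training regimes compared in this study: updating on the whole dataset at once versus splitting the data into batches presented in turn. The toy dataset, the small fully-connected model standing in for the CNN, and the hyperparameters are assumptions made for the sketch, not the thesis's actual experimental setup.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train(model, dataset, batch_size, epochs=5, lr=0.01):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:              # one pass over the training data
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)  # forward pass and loss
            loss.backward()              # gradients
            optimizer.step()             # parameter update
    return model

# Toy data: 1000 samples, 20 features, 3 classes (illustrative only).
X = torch.randn(1000, 20)
y = torch.randint(0, 3, (1000,))
data = TensorDataset(X, y)

def make_model():
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))

# (1) train with the whole dataset at once (batch size = dataset size)
whole = train(make_model(), data, batch_size=len(data))
# (2) train with the data split into smaller batches presented in turn
batched = train(make_model(), data, batch_size=64)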

