Design and Implementation of a Convolutional Neural Network Processor

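A convolutional neural network (CNN) is a machine learning model developed from the traditional neural network. In recent years, its high accuracy and smaller parameter count have led to wide use in intelligent systems and Internet of Things (IoT) applications. However, even a very simple CNN architecture involves a very large amount of computation, and its internal operations cause hardware resource utilization to fall as the architecture gets deeper. In addition, the architecture has to be adjusted for each specific application to meet its requirements. In this thesis, we therefore design and implement a CNN processor that flexibly supports different architectures, and the method proposed in the thesis for reusing computing units effectively improves hardware utilization and processing speed. The system is integrated on a Xilinx Virtex-7 series field-programmable gate array (FPGA) and achieves 4.799e+9 synapses/s and 3.96 nJ/synapse.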

Keywords

Convolutional Neural Network; Hardware Acceleration; Machine Learning

English Abstract

A convolutional neural network (CNN) is a machine learning model with higher accuracy and fewer parameters than the traditional neural network, and it is widely used in smart systems and IoT scenarios. However, the large amount of complex computation limits the processing speed, and some of the internal operations even reduce the utilization of the processing units. Moreover, different applications require different CNN models. In this dissertation, we therefore propose and design a flexible CNN processor with high hardware utilization that supports different CNN models efficiently. The system is integrated on a Xilinx Virtex-7 FPGA and achieves a throughput of 4.799e+9 synapses/s and an energy efficiency of 3.96 nJ/synapse.
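As a rough reading of the two reported figures, the sketch below multiplies throughput by energy per synapse to estimate the implied average power, which comes to about 19 W. It assumes that the energy figure is average power divided by throughput, which the abstract does not state. The function names and the example workload of 1e8 synaptic operations are hypothetical and are not taken from the thesis.

```python
# Figures quoted in the abstract.
THROUGHPUT_SYNAPSES_PER_S = 4.799e9  # synapses/s
ENERGY_PER_SYNAPSE_J = 3.96e-9       # 3.96 nJ/synapse


def implied_power_watts(throughput: float, energy_per_synapse: float) -> float:
    """Average power, assuming every synaptic operation costs the same energy."""
    return throughput * energy_per_synapse


def processing_time_s(synaptic_ops: float, throughput: float) -> float:
    """Time to process a workload at the sustained throughput."""
    return synaptic_ops / throughput


power_w = implied_power_watts(THROUGHPUT_SYNAPSES_PER_S, ENERGY_PER_SYNAPSE_J)

# Hypothetical workload: a network needing 1e8 synaptic operations per input.
HYPOTHETICAL_OPS = 1e8
latency_s = processing_time_s(HYPOTHETICAL_OPS, THROUGHPUT_SYNAPSES_PER_S)
energy_j = HYPOTHETICAL_OPS * ENERGY_PER_SYNAPSE_J

print(f"implied average power: {power_w:.2f} W")
print(f"hypothetical latency:  {latency_s * 1e3:.2f} ms")
print(f"hypothetical energy:   {energy_j:.3f} J")
```

These are estimates at peak sustained throughput. A real network would only approach them if the processor stays fully utilized, which is the problem the thesis sets out to address.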

English Keywords

Convolutional Neural Network; Hardware Acceleration; Machine Learning
