Architecture Design of a Convolutional Neural Network for Face Detection on an FPGA

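Convolutional neural networks (CNNs) achieve excellent accuracy in image recognition and object detection. However, their heavy computation and storage requirements and the large volume of data they transfer make them difficult to implement on mobile and embedded devices. In this thesis, several optimizations are applied to a CNN cascade for face detection in order to raise throughput under a power constraint. These improvements reduce the computation, storage, and bandwidth requirements at the same time. First, the first stage of the CNN cascade is converted into a fully convolutional network, which reduces the computation of the first stage by 83%. Second, an efficient quantization method is applied to the model parameters, replacing the original 32-bit floating-point representation with 2-bit fixed-point values and saving 93.75% of the parameter storage. Finally, a CNN accelerator is implemented on a field programmable gate array (FPGA). A systematic analysis identifies the architecture with the highest throughput that consumes the least bandwidth and the fewest FPGA resources. In addition, the quantization described above increases the computing capability of the final architecture.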
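The 93.75% figure follows directly from the word lengths: each parameter shrinks from 32 bits to 2 bits, so the saving is 1 - 2/32. The sketch below checks that arithmetic and shows one way four 2-bit codes could share a byte. The function names `storage_saving` and `pack_2bit` and the packing order are assumptions made for illustration; the abstract does not describe the storage layout used in the thesis.

```python
def storage_saving(bits_before: int, bits_after: int) -> float:
    """Fraction of parameter storage saved by shortening the word length."""
    return 1.0 - bits_after / bits_before


def pack_2bit(codes):
    """Pack 2-bit codes (integers 0..3) four per byte, lowest bits first.

    The packing order is a hypothetical choice, not taken from the thesis.
    """
    if len(codes) % 4 != 0:
        raise ValueError("number of codes must be a multiple of 4")
    packed = bytearray()
    for i in range(0, len(codes), 4):
        a, b, c, d = codes[i:i + 4]
        if not all(0 <= v <= 3 for v in (a, b, c, d)):
            raise ValueError("each code must fit in 2 bits")
        packed.append(a | (b << 2) | (c << 4) | (d << 6))
    return bytes(packed)


if __name__ == "__main__":
    print(f"saving: {storage_saving(32, 2):.2%}")
    # 16 parameters take 64 bytes as 32-bit floats and 4 bytes as packed 2-bit codes.
    print(len(pack_2bit([0, 1, 2, 3] * 4)), "bytes")
```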

Keywords

Face detection; convolutional neural networks; field programmable gate array; architecture design

Parallel Abstract

Convolutional neural networks (CNNs) provide powerful discriminative capability, especially in image recognition and object detection. However, their massive computation, storage, and memory access requirements make them hard to deploy on mobile or embedded systems. In this thesis, several optimizations based on a CNN cascade architecture for face detection are proposed to increase throughput while minimizing computation, storage, and bandwidth requirements under power constraints. First, the first net of the CNN cascade is converted to a fully convolutional network, which reduces the computation in the first stage by 83%. Second, an efficient retraining-based method is applied to quantize the model parameters, reducing their word length from 32-bit floating point to 2-bit fixed point and cutting the parameter memory size by 93.75%. Finally, a CNN accelerator is implemented on a Xilinx XC7020 FPGA board. We quantitatively analyze the computing throughput and required bandwidth using the roofline model, an analytical design scheme, to find the solution with the best performance and the lowest FPGA resource requirement. Furthermore, we show that the quantization increases the computing capability of the final architecture.
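In the roofline model, the attainable throughput of a design is the smaller of its peak compute rate and the product of its computation-to-communication (CTC) ratio and the available memory bandwidth. The sketch below illustrates how such a design search could look. The `Design` fields, the tie-breaking order, and every number are hypothetical; they are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    name: str
    peak_gops: float   # computational roof of this design (GOP/s)
    ctc_ratio: float   # operations per byte of off-chip traffic
    dsp_used: int      # stand-in for FPGA resource cost


def attainable_gops(design: Design, bandwidth_gbs: float) -> float:
    """Roofline bound: min(compute roof, CTC ratio x memory bandwidth)."""
    return min(design.peak_gops, design.ctc_ratio * bandwidth_gbs)


def best_design(candidates, bandwidth_gbs: float) -> Design:
    """Pick the fastest design; break ties by lower bandwidth need
    (higher CTC ratio), then by fewer resources."""
    return max(
        candidates,
        key=lambda d: (attainable_gops(d, bandwidth_gbs), d.ctc_ratio, -d.dsp_used),
    )


if __name__ == "__main__":
    # Hypothetical candidates, e.g. different loop-tiling or unrolling factors.
    candidates = [
        Design("A", peak_gops=60.0, ctc_ratio=5.0, dsp_used=200),    # bandwidth-bound
        Design("B", peak_gops=40.0, ctc_ratio=20.0, dsp_used=160),   # compute-bound
        Design("C", peak_gops=40.0, ctc_ratio=20.0, dsp_used=120),   # same speed, cheaper
    ]
    chosen = best_design(candidates, bandwidth_gbs=4.0)
    print(chosen.name, attainable_gops(chosen, 4.0))
```

With a bandwidth of 4 GB/s, design A is limited to 20 GOP/s by memory traffic despite its higher compute roof, so the search prefers the compute-bound designs and then the one that uses fewer resources.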

Parallel Keywords

Architecture Design; Neural Networks; Face Detection; FPGA Platforms

