
Efficient Execution via Dynamic Network Slimming

Advisor: 張添烜

Abstract


Convolutional neural networks achieve the best results on many computer vision tasks, but their heavy computation and large model size make them difficult to run on mobile and wearable devices with limited resources, so compressing models and accelerating their execution have become some of the most important topics today. Conventional compression and acceleration methods permanently and statically prune the unimportant parts and use the same architecture for every input. However, different classes are composed hierarchically from different lower-level features, and static pruning cannot execute a different optimal substructure for each input. This thesis proposes an algorithm that prunes dynamically according to the input. To dynamically skip less important channels, a small channel-importance prediction network predicts different channel importances for different inputs, where importance is represented by the absolute value of the scaling factor in the batch normalization layer. The prediction network provides one threshold for each batch normalization layer, and in each batch normalization layer the channels whose scaling factors are smaller than the threshold are skipped. While the channel-importance prediction network and the target network are trained jointly, we use the variance of the predictions to find a suitable pruning rate for samples of different difficulty, raise the pruning rate by dividing by the average of the variance, and finally use a remaining-epoch parameter to force the final pruning rate to match the target. Simulation results show that on the CIFAR-10 dataset this method accelerates the ResNet family by 2 to 5.49 times while losing less than 1% accuracy, and on the CIFAR-100 dataset, which has more classes, it achieves a 1.67× speedup with a 1.81% accuracy drop. On M-CifarNet, compared with a dynamic channel pruning method (3.93× speedup, 0.87% accuracy drop), it performs better, achieving a 4.29× speedup with only a 0.33% accuracy drop. While maintaining 90.5% accuracy, conventional static pruning (Network Slimming) achieves a 1.429× speedup, whereas our method achieves a 2× speedup; at the same time, the prediction network adds less than 1% overhead in model size and computation.
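
To make the skipping mechanism concrete, the following is a minimal PyTorch sketch of per-input channel skipping driven by a predicted per-layer threshold. The names (ThresholdPredictor, channel_mask), the predictor architecture, and the sigmoid output range are illustrative assumptions, not the thesis' actual implementation.

```python
# Minimal sketch (assumed names/architecture): a tiny predictor emits one
# threshold per BN layer for each input; channels whose |gamma| falls below
# that threshold are skipped (here, simply zeroed out).
import torch
import torch.nn as nn

class ThresholdPredictor(nn.Module):
    """Tiny CNN mapping an input image to one threshold per BN layer."""
    def __init__(self, num_bn_layers: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(8, num_bn_layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)
        # Sigmoid keeps the thresholds bounded; the actual output range used
        # in the thesis is not specified here, so this is an assumption.
        return torch.sigmoid(self.fc(h))

def channel_mask(bn: nn.BatchNorm2d, thr: torch.Tensor) -> torch.Tensor:
    """thr: (B,) per-sample threshold for this BN layer.
    Returns a (B, C, 1, 1) mask: 1 keeps a channel, 0 skips it."""
    keep = bn.weight.abs().unsqueeze(0) >= thr.unsqueeze(1)   # (B, C)
    return keep.float()[:, :, None, None]

# Usage for a single conv + BN stage:
x = torch.randn(4, 3, 32, 32)
predictor = ThresholdPredictor(num_bn_layers=1)
thr = predictor(x)                                 # (4, 1): one threshold per sample
conv, bn = nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16)
y = bn(conv(x)) * channel_mask(bn, thr[:, 0])      # masked channels contribute nothing
```

Note that the hard comparison against the threshold is not differentiable as written; jointly training the predictor with the target network, as described above, would require a differentiable relaxation that is not shown in this sketch.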

Abstract (English)


Convolutional neural networks (CNNs) achieve state-of-the-art results in computer vision. However, their heavy computation and large model size make them hard to execute on mobile and wearable devices with limited resources, so model compression and acceleration of model execution have become salient research topics. Conventional compression and acceleration methods remove the unimportant parts of a network at different granularities, such as weight pruning, filter pruning, and channel pruning. Nevertheless, these irreversible pruning methods cause permanent damage to the model structure, which makes a dynamic pruning method that adapts to the difficulty of the classification task comparatively advantageous. Prior neural network research suggests that different classes are hierarchically constructed from specific lower-level features, and for a particular category a large number of low-level features are useless; we follow this idea and skip unnecessary features to accelerate inference and reduce the model size. In theory, a dynamic acceleration method that executes a different substructure for each input image can also outperform static pruning. In this thesis, to measure the salience of features, we take the absolute value of the scaling factor (gamma) in each batch normalization layer as the importance of its channels. A tiny CNN predicts a threshold list for each input, providing one threshold per batch normalization layer; in each batch normalization layer, the channels whose gamma is smaller than this threshold are skipped. During training, we compute the expected pruning rate from the variance of the predicted outputs, which increases with the number of epochs; we further raise the expected pruning rate by dividing by the average of the variance, and we introduce a parameter, epoch_ratio, to force the expected pruning rate toward the target pruning rate. The threshold prediction network and the target model are trained jointly with stochastic gradient descent, which makes it possible to find the best substructure that meets the target pruning rate at inference time. Simulation results show that this approach accelerates ResNet [1] by 2 to 5.49 times on CIFAR-10 [2] while losing only 0.94% accuracy. ResNet38 on CIFAR-100 [2], which has more categories, also achieves a 1.67× speedup with a 1.81% accuracy drop. On M-CifarNet, compared with FBS [3] (3.93× speedup, 0.87% accuracy drop), our method performs better, with a 4.29× speedup and only a 0.33% accuracy drop. While maintaining 90.50% accuracy, conventional static pruning (Network Slimming) achieves a 1.429× speedup, whereas our approach achieves a 2× speedup. In addition, the threshold predictor adds less than 1% overhead in FLOPs and model size.
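
The training-time schedule described above combines three ingredients: the variance of the predicted outputs, its batch average, and the epoch_ratio parameter that pulls the rate toward the target. A hedged sketch of how these might be combined is given below; because the exact formula is not given in this excerpt, the specific combination (and the helper name expected_pruning_rate) is an assumption for illustration only.

```python
# Hedged sketch: how the expected pruning rate might be scheduled during
# training. Only the ingredients (prediction variance, its batch average,
# epoch_ratio, target rate) come from the abstract; the exact combination
# below is assumed.
import torch

def expected_pruning_rate(pred_thresholds: torch.Tensor,
                          epoch: int, total_epochs: int,
                          target_rate: float) -> torch.Tensor:
    """pred_thresholds: (B, L) thresholds predicted per sample and BN layer."""
    var = pred_thresholds.var(dim=1)                        # per-sample variance, (B,)
    # Dividing by the batch-average variance raises the rate for samples whose
    # predictions vary more than average (assumed interpretation).
    rate = (target_rate * var / (var.mean() + 1e-8)).clamp(max=1.0)
    # epoch_ratio pulls every sample's rate toward the target as training ends,
    # so the final expected pruning rate matches the target.
    epoch_ratio = epoch / total_epochs
    return (1.0 - epoch_ratio) * rate + epoch_ratio * target_rate

# Example: rates early vs. late in training for a batch of 4 samples.
preds = torch.rand(4, 20)
print(expected_pruning_rate(preds, epoch=5,   total_epochs=160, target_rate=0.5))
print(expected_pruning_rate(preds, epoch=155, total_epochs=160, target_rate=0.5))
```

In this sketch, early in training the rate is dominated by the per-sample variance term, while late in training the epoch_ratio term forces it toward target_rate.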

References


[1] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[2] A. Krizhevsky, V. Nair, and G. Hinton, "The CIFAR-10 and CIFAR-100 datasets," 2014.
[3] X. Gao, Y. Zhao, L. Dudziak, R. Mullins, and C. Z. Xu, "Dynamic channel pruning: Feature boosting and suppression," in International Conference on Learning Representations, 2019.
[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012.
[5] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in IEEE Conference on Computer Vision and Pattern Recognition, 2009.
