Integrated Deep and Ensemble Learning Algorithm

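Convolutional neural networks (CNNs) are now widely used to solve problems across computer vision. As accelerators with high computing performance have become available, trained models can be deployed on mobile phones, cameras, and other embedded devices with limited computing resources, where they play an important role in systems such as self-driving cars and robots. These applications demand not only high accuracy but also real-time feedback to the user as the external environment changes. However, as the variety and number of neural networks deployed on edge devices grow, computational complexity rises accordingly, and running inference on each single-function model separately is inefficient. If the parts of the individual models that can be computed jointly are identified and integrated, the computational power consumption of the edge device can be greatly reduced. In this thesis, we propose a complete algorithmic framework that integrates multiple single-function CNNs into one multi-task CNN. It draws on multi-task learning, transfer learning, and quantitative methods for deep feature representations. The result produced by the algorithm is a locally optimal solution: among the locally feasible combinations, it finds the one that, relative to the networks before integration, gives the largest percentage reduction in computation and memory usage with as small a percentage drop in accuracy as possible. With this framework, hardware developers can spend less time on integrating neural networks and focus more on practical implementation issues.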

Keywords

deep learning; ensemble learning; multi-task learning; transfer learning; deep feature representation

Parallel Abstract

The worldwide flourishing of Convolutional Neural Networks (CNNs) has enabled researchers to tackle many difficult tasks in computer vision. Moreover, as more accelerators with efficient algorithm designs have come into existence, we are able to run CNNs on devices with limited resources, such as smartphones, cameras, and other embedded systems. These applications also play vital roles in many robotics and self-driving car systems. They require not only high accuracy but also real-time operation to return instant feedback while interacting with users. However, as more CNNs with diverse functionalities are deployed on devices, the overall computational complexity grows, and operating those individual CNNs separately is inefficient. If we can reduce the millions of arithmetic operations executed by each single CNN by exploiting the synergy across multiple CNNs, a large amount of computation power can be saved. In this thesis, we propose a novel framework, called Integrated Deep and Ensemble Learning Algorithm (IDEA), to integrate multiple CNNs into a unified structure by exploiting the synergy across them. Our framework combines the spirit of multi-task learning, transfer learning, and the quantitative approach used in deep visual representations to resolve the problem. Note that the final solution obtained by our proposed framework is a sub-optimal one, with more efficient computation and a significant reduction in memory usage, but only a small drop in performance with respect to the original networks. With the aid of IDEA, hardware developers can concentrate more on implementation issues.
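To make the claimed saving concrete, the following is a minimal sketch, not the thesis's algorithm, of how much arithmetic is avoided when two CNNs compute an identical prefix of convolutional layers once instead of twice. The layer shapes, the names `ConvLayer`, `total_macs`, and `merged_macs`, and the choice of sharing the first two layers are all hypothetical; the sketch counts multiply-accumulate operations (MACs) only and says nothing about accuracy, which the framework must also weigh.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConvLayer:
    """Hypothetical description of one convolutional layer (no bias, square kernel)."""
    out_h: int   # output feature-map height
    out_w: int   # output feature-map width
    c_in: int    # input channels
    c_out: int   # output channels
    k: int       # kernel size (k x k)

    def macs(self) -> int:
        # Each output element needs c_in * k * k multiply-accumulates.
        return self.out_h * self.out_w * self.c_out * self.c_in * self.k * self.k


def total_macs(layers: Sequence[ConvLayer]) -> int:
    """Sum the MACs of a stack of layers."""
    return sum(layer.macs() for layer in layers)


def merged_macs(net_a: Sequence[ConvLayer],
                net_b: Sequence[ConvLayer],
                shared_depth: int) -> int:
    """MACs when the first `shared_depth` layers are computed once for both tasks."""
    if shared_depth > min(len(net_a), len(net_b)):
        raise ValueError("shared_depth exceeds the shorter network")
    if list(net_a[:shared_depth]) != list(net_b[:shared_depth]):
        raise ValueError("shared layers must have identical shapes")
    shared = total_macs(net_a[:shared_depth])
    return shared + total_macs(net_a[shared_depth:]) + total_macs(net_b[shared_depth:])


# Two hypothetical single-task networks that differ only in their last layer.
NET_A = [ConvLayer(32, 32, 3, 16, 3), ConvLayer(16, 16, 16, 32, 3), ConvLayer(8, 8, 32, 64, 3)]
NET_B = [ConvLayer(32, 32, 3, 16, 3), ConvLayer(16, 16, 16, 32, 3), ConvLayer(8, 8, 32, 128, 3)]

separate = total_macs(NET_A) + total_macs(NET_B)   # run both networks independently
merged = merged_macs(NET_A, NET_B, shared_depth=2)  # share the first two layers
saving = 1 - merged / separate                      # fraction of MACs avoided

if __name__ == "__main__":
    print(f"separate={separate} merged={merged} saving={saving:.1%}")
```

In this toy setting the saving equals the cost of the shared prefix counted once. In practice the shared layers would have to be retrained or fine-tuned so that both tasks keep acceptable accuracy, which is the trade-off the abstract describes.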

Parallel Keywords

deep learning; ensemble learning; multi-task learning; transfer learning; deep feature representation
