Learning Fast and Efficient Deep Learning for Regression Applications

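Over the past decade of deep learning's rapid growth, both hardware and software have advanced quickly in support of convolutional neural networks (CNNs). In large-scale applications, however, networks that pursue performance alone incur computational costs that are too large to be practical, which makes research on fast and efficient networks urgent. We begin by discussing the evolution from feature descriptors to neural networks, and in this thesis we focus on regression problems in deep learning. Regression has several facets, such as continuity, grouping or quantization, and distribution. Earlier methods have addressed these facets, but they did not combine them effectively with speed and efficiency. We discuss how, across different computer vision applications, the computational or storage cost can be reduced by a factor of one hundred or even one thousand while still maintaining good performance and training stability. We use facial age estimation and head pose estimation as the main tasks to demonstrate the robustness of our methods.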

Keywords

Convolutional neural networks; regression; efficient; compact; fast; age; head; angle

Parallel Abstract

Over the past decade, deep learning has grown rapidly, and both hardware and software have developed quickly to support convolutional neural networks (CNNs). However, in large-scale scenarios, purely performance-driven networks consume too much computational power, so it is crucial to study fast and efficient neural networks. We first review the progression from feature descriptors to CNNs, and then focus specifically on regression problems in this thesis. Regression has several aspects, such as continuity, grouping or quantization, and distribution. Previous research has also targeted these aspects, but failed to combine them into an efficient framework. We discuss different computer vision applications with 100× or even 1000× smaller computational cost or memory overhead while maintaining excellent performance and training stability. More specifically, we use facial age estimation and head pose estimation as concrete examples to show the robustness of the proposed methods.

Parallel Keywords

Convolutional neural networks; regression; efficient; compact; fast; age; head; pose

