
基於預訓練正規化之元學習

Improving Meta-Learning by Regularized Pre-training

Advisor: Hsuan-Tien Lin (林軒田)

Abstract


This thesis proposes a new way to improve meta-learning for few-shot learning. In recent years, meta-learning methods have performed remarkably well on few-shot learning tasks. Whereas traditional transfer learning easily overfits because of its large number of parameters, meta-learning generalizes better by simulating the test-time setting during training. In the past year or two, however, several studies have shown that carefully fine-tuned transfer learning can reach the same level of generalization. Moreover, some work has shown that the pre-trained parameters from ordinary transfer learning can serve as the initialization for meta-learning and yield even better results. We observe that these studies simply reuse the pre-trained weights as-is and still concentrate on adjusting the meta-learning architecture, overlooking the potential of improving the pre-training itself. By regularizing pre-training, we obtain faster convergence in meta-learning, along with better converged results on shallower networks. We therefore argue that, compared with meta-learning, it is the pre-training method that most needs the community's improvement efforts.
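Read as a formula, the recipe described above can be pictured as the following minimal sketch in our own notation (the trade-off weight \lambda and the regularizer R are assumptions for illustration; the thesis's exact objective is not reproduced here):

    \mathcal{L}_{\mathrm{pre}}(\theta) = \mathcal{L}_{\mathrm{CE}}(\theta) + \lambda\, R(\theta),

where \theta denotes the encoder parameters, \mathcal{L}_{\mathrm{CE}} is the standard cross-entropy pre-training loss, and R is a term chosen to match the downstream meta-learner (here, a Prototypical-Network-style one).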

Keywords

元學習 (meta-learning), 預訓練 (pre-training), 正規化 (regularization)

Parallel Abstract


This thesis presents a new dimension for improving meta-learning in the few-shot learning field. In recent years, meta-learning has become one of the most effective approaches to few-shot learning. Traditional transfer-learning-style algorithms easily overfit the meta-train dataset, leading to poor performance on the meta-test dataset. In the past two years, however, researchers have found that pre-training combined with sophisticated fine-tuning can match the performance of meta-learning approaches. Moreover, reusing the classifier weights pre-trained on the original dataset can further improve meta-learning. Pre-training is also more time-efficient than the episodic training used in meta-learning, which suffers from a costly sampling procedure. We find that recent studies focus mainly on the meta-learning stage, while the pre-training stage remains comparatively under-explored. We therefore provide a simple regularizer for pre-training, designed with the Prototypical Network used in the subsequent meta-learning stage in mind. It yields faster convergence and performance comparable to the classical Prototypical Network. This result implies that the classical meta-learning algorithm is already strong, and that part of the computational burden can be shifted to the pre-training stage.
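To make the idea above concrete, here is a minimal PyTorch sketch of what such a prototype-aware pre-training regularizer could look like. The function names, the reg_weight hyperparameter, and the class-balanced-batch assumption are ours for illustration; this is not the thesis's exact method.

    import torch
    import torch.nn.functional as F

    def prototype_regularizer(features, labels, num_classes):
        # Nearest-class-mean ("prototype") classifier on the current batch,
        # scored with negative squared Euclidean distance as in Prototypical
        # Networks. Assumes every class appears in the batch (e.g. via
        # class-balanced sampling); otherwise a class mean would be NaN.
        prototypes = torch.stack(
            [features[labels == c].mean(dim=0) for c in range(num_classes)]
        )  # shape: (num_classes, feat_dim)
        logits = -torch.cdist(features, prototypes).pow(2)
        return F.cross_entropy(logits, labels)

    def pretrain_step(encoder, classifier, images, labels, num_classes,
                      reg_weight=0.5):
        # One regularized pre-training step: ordinary cross-entropy on the
        # linear classifier head, plus the prototype term above weighted by
        # reg_weight (an illustrative hyperparameter, not from the thesis).
        features = encoder(images)  # shape: (batch, feat_dim)
        ce_loss = F.cross_entropy(classifier(features), labels)
        reg_loss = prototype_regularizer(features, labels, num_classes)
        return ce_loss + reg_weight * reg_loss

The design intent of such a term is that the pre-trained embedding already clusters each class around its mean, which is exactly the geometry the downstream Prototypical Network relies on.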

Parallel Keywords

Meta-Learning, Pre-training, Regularization
