This thesis proposes a new way to improve meta-learning for few-shot learning. In recent years, meta-learning methods have achieved strong performance on few-shot learning tasks. Compared with traditional transfer learning, which tends to overfit because of its large number of parameters, meta-learning generalizes better by simulating the test-time setting during training. In the past two years, however, several studies have shown that transfer learning with carefully designed fine-tuning can achieve comparable generalization. Moreover, some work has shown that the pre-trained parameters obtained from standard transfer learning can serve as the initialization for meta-learning and lead to even better results. Nevertheless, we find that these studies simply reuse the pre-trained weights as-is and still concentrate on adjusting the meta-learning architecture, overlooking the potential of improving pre-training itself. By regularizing the pre-training stage, we achieve faster convergence in meta-learning and better convergence results on shallower networks. We therefore argue that, compared with meta-learning, the pre-training stage is where improvement is more urgently needed.
This thesis presents a new direction for improving meta-learning in the few-shot learning field. In recent years, meta-learning has become one of the most effective approaches to few-shot learning tasks. Traditional transfer-learning-style algorithms tend to overfit the meta-training dataset and therefore perform poorly on the meta-test dataset. In the past two years, however, researchers have found that pre-training combined with carefully designed fine-tuning can match the performance of meta-learning approaches. Moreover, initializing meta-learning with the classifier weights pre-trained on the original dataset can further improve its performance. Pre-training is also more time-efficient than the episodic training used in meta-learning, which must repeatedly sample episodes. We observe that recent studies focus mainly on the meta-learning stage, while the pre-training stage has received far less attention. We propose a simple regularization applied during pre-training, tailored to the Prototypical Network used in the subsequent meta-learning stage. It leads to faster convergence and performance comparable to the classical Prototypical Network. This result suggests that the classical meta-learning algorithm is already strong enough and that part of the computational burden can be shifted to the pre-training stage.