Multilingual pre-trained language models (mPLMs) have proven effective at zero-shot cross-lingual transfer: they can be fine-tuned only on a task in a source language and then applied to the same task in a target language. For low-resource languages unseen during pre-training, however, relying on zero-shot cross-lingual transfer alone often yields sub-optimal results. A common remedy is to continue training the mPLM on the target language with a masked language modeling objective, but this is inefficient because all model parameters must be updated for language adaptation. In this paper, we propose a more efficient solution: soft-prompt tuning for language adaptation. Our experiments show that, with carefully designed prompts, soft-prompt tuning enables mPLMs to achieve effective zero-shot cross-lingual transfer to downstream tasks in previously unseen languages. Notably, prompt tuning outperforms continued-training baselines on two text classification benchmarks covering 18 low-resource languages, while using only 0.28% of the tuned parameters. These results indicate that soft-prompt tuning adapts mPLMs to previously unseen languages more effectively and more efficiently than traditional fine-tuning.
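To make the idea of soft-prompt tuning concrete, the following is a minimal sketch in plain Python and not the implementation used in this work. It assumes that a soft prompt is a small matrix of trainable vectors prepended to the token embeddings of a frozen model. The function names, the sizes, and the parameter counts are all hypothetical and chosen only for illustration.

```python
import random


def init_soft_prompt(prompt_length, hidden_size, seed=0):
    """Create a (prompt_length x hidden_size) matrix of trainable prompt vectors.

    Hypothetical helper: a real system would store these as framework tensors.
    """
    rng = random.Random(seed)
    return [
        [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
        for _ in range(prompt_length)
    ]


def prepend_soft_prompt(soft_prompt, token_embeddings):
    """Place the soft-prompt vectors in front of the (frozen) token embeddings."""
    return soft_prompt + token_embeddings


def trainable_fraction(prompt_length, hidden_size, frozen_params):
    """Share of all parameters that are updated when only the prompt is trained."""
    prompt_params = prompt_length * hidden_size
    return prompt_params / (prompt_params + frozen_params)


if __name__ == "__main__":
    # Illustrative sizes only; they do not correspond to any reported setup.
    prompt = init_soft_prompt(prompt_length=4, hidden_size=8)
    tokens = [[0.0] * 8 for _ in range(3)]  # stand-in for frozen embeddings
    combined = prepend_soft_prompt(prompt, tokens)
    print(len(combined), "input vectors, of which", len(prompt), "are trainable")
    print("trainable share:", trainable_fraction(4, 8, frozen_params=10_000))
```

Only the prompt matrix would receive gradient updates in such a setup, which is why the number of tuned parameters stays small relative to updating the whole model.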