
Regularized Teacher-Forcing Applied to Abstractive Summarization

R-TeaFor: Regularized Teacher-Forcing for Abstractive Summarization

Advisor: 鄭卜壬

Abstract


Teacher-forcing is a widely used technique for training natural language generation models: it improves training efficiency and makes training more stable. However, teacher-forcing leads to the well-known exposure bias problem, namely that the data distribution the model is trained on differs from the data distribution it encounters at inference time. Previous work typically modifies the training data distribution so that it more closely resembles the distribution of generated data. Such approaches, however, do not consider the pairwise relationship between the original data and the modified data. We therefore propose Regularized Teacher-Forcing, which exploits this pairwise relationship during training to strengthen regularization. Experimental results show that Regularized Teacher-Forcing outperforms previous approaches on common text summarization datasets and can be applied to different pre-trained models.
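For readers unfamiliar with the setup, the sketch below illustrates the contrast the abstract describes: under teacher-forcing the decoder is always conditioned on the gold prefix, whereas methods that target exposure bias feed it prefixes that are partly model-generated. This is a minimal illustration, not the thesis implementation; the `model(input_ids, decoder_input_ids) -> logits` interface, the greedy sampling, and the `mix_prob` rate are assumptions introduced here.

```python
import torch
import torch.nn.functional as F

def teacher_forced_loss(model, input_ids, target_ids, bos_id):
    """Teacher-forcing: the decoder is always conditioned on the gold prefix."""
    bos = torch.full_like(target_ids[:, :1], bos_id)
    decoder_in = torch.cat([bos, target_ids[:, :-1]], dim=1)   # shift right
    logits = model(input_ids, decoder_in)                      # [B, T, V]
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), target_ids.reshape(-1)
    )

def perturbed_decoder_inputs(model, input_ids, target_ids, bos_id, mix_prob=0.25):
    """Scheduled-sampling-style perturbation: with probability `mix_prob`,
    a gold token in the decoder prefix is replaced by the model's own greedy
    prediction, so training inputs resemble what the model sees at inference."""
    bos = torch.full_like(target_ids[:, :1], bos_id)
    decoder_in = torch.cat([bos, target_ids[:, :-1]], dim=1)
    with torch.no_grad():
        preds = model(input_ids, decoder_in).argmax(dim=-1)    # [B, T]
    # preds[:, t] predicts target_ids[:, t]; shifting right aligns it with
    # decoder_in[:, t + 1], which normally holds target_ids[:, t].
    shifted_preds = preds.roll(shifts=1, dims=1)
    mix = torch.rand(decoder_in.shape, device=decoder_in.device) < mix_prob
    mix[:, 0] = False                                           # keep BOS
    return torch.where(mix, shifted_preds, decoder_in)
```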

English Abstract


Teacher-forcing is widely used in training sequence generation models because it improves sampling efficiency and stabilizes training. However, it is known to suffer from the exposure bias problem. Previous works address exposure bias by modifying the training data to simulate the data distribution seen at inference time, but they do not consider the pairwise relationship between the original training data and the modified data. We propose Regularized Teacher-Forcing (R-TeaFor) to utilize this relationship for better regularization. Empirically, we show that R-TeaFor outperforms previous state-of-the-art summarization models, and that the results generalize to different pre-trained models.
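Building on the helpers in the previous sketch, the following shows one way a pairwise regularizer of the kind the abstract describes could be assembled: the gold-prefix pass and the perturbed pass are treated as a pair, and a symmetric KL term penalizes disagreement between their output distributions. The exact R-TeaFor objective is defined in the thesis; the loss weighting `alpha` and the KL form used here are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def paired_regularized_loss(model, input_ids, target_ids, bos_id, alpha=1.0):
    """Illustrative pairwise-consistency loss (assumed form, not the thesis loss)."""
    bos = torch.full_like(target_ids[:, :1], bos_id)
    decoder_gold = torch.cat([bos, target_ids[:, :-1]], dim=1)
    decoder_pert = perturbed_decoder_inputs(model, input_ids, target_ids, bos_id)

    logits_gold = model(input_ids, decoder_gold)        # [B, T, V]
    logits_pert = model(input_ids, decoder_pert)        # [B, T, V]

    # Cross-entropy on both members of the pair.
    targets = target_ids.reshape(-1)
    ce_gold = F.cross_entropy(logits_gold.reshape(-1, logits_gold.size(-1)), targets)
    ce_pert = F.cross_entropy(logits_pert.reshape(-1, logits_pert.size(-1)), targets)

    # Symmetric KL between the two passes couples the pair during training.
    log_p = F.log_softmax(logits_gold, dim=-1)
    log_q = F.log_softmax(logits_pert, dim=-1)
    kl = 0.5 * (
        F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
        + F.kl_div(log_p, log_q, log_target=True, reduction="batchmean")
    )
    return ce_gold + ce_pert + alpha * kl
```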

