
A Study on Exploring Semantically Similar Words with Pretrained Masked Language Models for Paraphrase Generation

Utilizing BERT to Explore Semantically Similar Words for Paraphrase Generation

Advisor: 許永真

Abstract


Paraphrase generation is an important natural language processing task: the generated sentence must convey the same meaning while using different wording. Given the strong natural language understanding of recent pretrained language models, this thesis proposes an effective method that leverages the BERT model to explore semantically similar words for use in paraphrase generation. Although word substitution with BERT is a common data augmentation technique, it brings only limited improvement on the paraphrasing task, so we do not simply substitute words sampled at random from BERT's output distribution. Inspired by related work, our method instead treats the distributions produced by BERT as latent vectors and integrates them into the attention mechanism of the Transformer decoder. Experimental results show that the proposed method achieves higher BLEU scores than the current state-of-the-art models, as well as better BLEU and ROUGE scores than the baseline model; further analysis reveals that the model generates paraphrases using word substitutions that never appear in the training data. By exploiting the knowledge BERT has learned, we can therefore generate paraphrases that perform better under quantitative metrics.
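To make the first ingredient concrete, the sketch below shows how a mask-prediction distribution can be read out of BERT and treated as a latent signal over semantically similar words. It is a minimal illustration assuming the Hugging Face transformers library; the function name bert_latent_distribution and the choice of bert-base-uncased are assumptions for exposition, not the thesis's released code.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def bert_latent_distribution(sentence: str, target_word: str) -> torch.Tensor:
    """Mask the first occurrence of `target_word` (assumed to be a single
    in-vocabulary token) and return BERT's output distribution over the
    vocabulary at that position."""
    inputs = tokenizer(sentence, return_tensors="pt")
    ids = inputs["input_ids"][0]
    pos = (ids == tokenizer.convert_tokens_to_ids(target_word)).nonzero()[0, 0]
    ids[pos] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = model(**inputs).logits          # (1, seq_len, vocab_size)
    return torch.softmax(logits[0, pos], dim=-1)

# The top candidates are plausible substitutes for the masked word.
dist = bert_latent_distribution("the quick brown fox jumps over the lazy dog", "quick")
print(tokenizer.convert_ids_to_tokens(dist.topk(5).indices.tolist()))
```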

Parallel Abstract


Paraphrasing is an important natural language processing task that generates texts conveying the same meaning with different words or syntactic structures. Owing to the strong language understanding of pretrained language models, this thesis proposes an effective method that utilizes pretrained BERT to explore semantically similar words for paraphrase generation. Although lexical substitution with BERT is a typical data augmentation method, its improvement on the paraphrase generation task is limited. Inspired by related work that uses latent variables when generating paraphrases, we propose a method that treats the output distribution of BERT's mask prediction as latent vectors and incorporates them into the Transformer decoder. We also introduce a customized attention module, which takes both the encoder output and the latent BERT vector as input in the decoder blocks. Our experimental results show that the proposed method outperforms state-of-the-art models in terms of BLEU and improves on the baseline Transformer seq2seq model in terms of BLEU and ROUGE. Analyzing the predicted sentences, we observe that our model applies substitution patterns unseen in the training pairs. By utilizing BERT, our model can explore semantically similar words and use them to generate paraphrases of higher quality as measured by quantitative metrics.
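One plausible reading of the customized attention module is a decoder block that attends first to the encoder memory and then, in a separate sub-layer, to the latent BERT vectors. The PyTorch sketch below illustrates this under assumed names (LatentBERTDecoderBlock, and BERT distributions pre-projected to d_model); the thesis's exact wiring may differ.

```python
import torch
import torch.nn as nn

class LatentBERTDecoderBlock(nn.Module):
    """Transformer decoder block with an extra cross-attention over
    latent vectors derived from BERT's mask-prediction distributions."""

    def __init__(self, d_model: int = 512, n_heads: int = 8, d_ff: int = 2048):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.enc_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.bert_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norms = nn.ModuleList([nn.LayerNorm(d_model) for _ in range(4)])

    def forward(self, x, enc_out, bert_latent, tgt_mask=None):
        # 1) Masked self-attention over the partially generated target.
        x = self.norms[0](x + self.self_attn(x, x, x, attn_mask=tgt_mask)[0])
        # 2) Standard cross-attention over the encoder output.
        x = self.norms[1](x + self.enc_attn(x, enc_out, enc_out)[0])
        # 3) Extra cross-attention over the latent BERT vectors
        #    (assumed already projected to d_model).
        x = self.norms[2](x + self.bert_attn(x, bert_latent, bert_latent)[0])
        # 4) Position-wise feed-forward network.
        return self.norms[3](x + self.ffn(x))

# Shape check with random tensors (batch=2, d_model=512):
block = LatentBERTDecoderBlock()
x = torch.randn(2, 10, 512)      # decoder input states
enc = torch.randn(2, 12, 512)    # encoder output (memory)
lat = torch.randn(2, 12, 512)    # latent BERT vectors
print(block(x, enc, lat).shape)  # torch.Size([2, 10, 512])
```

Keeping the BERT attention as a separate sub-layer, rather than concatenating the latent vectors with the encoder memory, lets the residual stream weight the two sources independently; this is one design choice consistent with the description above, not necessarily the thesis's.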

