
A Study on Exploring Semantically Similar Words with Pretrained Masked Language Models for Paraphrase Generation

Utilizing BERT to Explore Semantically Similar Words for Paraphrase Generation

Advisor: 許永真

Abstract


Paraphrase generation is an important natural language processing task: the generated sentence must convey the same meaning while using different wording. Given the strong natural language understanding of recent pretrained language models, this thesis proposes an effective method that leverages the BERT model to explore semantically similar words for use in paraphrase generation. Although word substitution with BERT is a common data augmentation technique, it brings only limited improvement on the paraphrasing task, so we do not simply substitute words sampled at random from BERT's output distribution. Inspired by related work, our method instead treats the distributions produced by BERT as latent vectors and integrates them into the attention mechanism of the Transformer decoder. Experimental results show that the proposed method achieves higher BLEU scores than the current state-of-the-art models, as well as better BLEU and ROUGE scores than the baseline model; further analysis reveals that the model generates paraphrases using word substitutions that never appear in the training data. By exploiting the knowledge BERT has learned, we can therefore generate paraphrases that perform better under quantitative metrics.
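To make the first ingredient concrete, the sketch below shows how a mask-prediction distribution can be read out of BERT and treated as a latent signal over semantically similar words. It is a minimal illustration assuming the Hugging Face transformers library; the function name bert_latent_distribution and the choice of bert-base-uncased are assumptions for exposition, not the thesis's released code.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def bert_latent_distribution(sentence: str, target_word: str) -> torch.Tensor:
    """Mask the first occurrence of `target_word` (assumed to be a single
    in-vocabulary token) and return BERT's output distribution over the
    vocabulary at that position."""
    inputs = tokenizer(sentence, return_tensors="pt")
    ids = inputs["input_ids"][0]
    pos = (ids == tokenizer.convert_tokens_to_ids(target_word)).nonzero()[0, 0]
    ids[pos] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = model(**inputs).logits          # (1, seq_len, vocab_size)
    return torch.softmax(logits[0, pos], dim=-1)

# The top candidates are plausible substitutes for the masked word.
dist = bert_latent_distribution("the quick brown fox jumps over the lazy dog", "quick")
print(tokenizer.convert_ids_to_tokens(dist.topk(5).indices.tolist()))
```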

Parallel Abstract


Paraphrasing is an important natural language processing task that generates texts conveying the same meaning with different words or syntactic structures. Owing to the strong language understanding of pretrained language models, this thesis proposes an effective method that utilizes pretrained BERT to explore semantically similar words for paraphrase generation. Although lexical substitution with BERT is a typical data augmentation method, its improvement on the paraphrase generation task is limited. Inspired by related work that uses latent variables when generating paraphrases, we propose a method that treats the output distribution of BERT's mask prediction as latent vectors and incorporates them into the Transformer decoder. We also introduce a customized attention module, which takes both the encoder output and the latent BERT vector as input in the decoder blocks. Our experimental results show that the proposed method outperforms state-of-the-art models in terms of BLEU and improves on the baseline Transformer seq2seq model in terms of BLEU and ROUGE. Analyzing the predicted sentences, we observe that our model applies substitution patterns unseen in the training pairs. By utilizing BERT, our model can explore semantically similar words and use them to generate paraphrases of higher quality as measured by quantitative metrics.
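One plausible reading of the customized attention module is a decoder block that attends first to the encoder memory and then, in a separate sub-layer, to the latent BERT vectors. The PyTorch sketch below illustrates this under assumed names (LatentBERTDecoderBlock, and BERT distributions pre-projected to d_model); the thesis's exact wiring may differ.

```python
import torch
import torch.nn as nn

class LatentBERTDecoderBlock(nn.Module):
    """Transformer decoder block with an extra cross-attention over
    latent vectors derived from BERT's mask-prediction distributions."""

    def __init__(self, d_model: int = 512, n_heads: int = 8, d_ff: int = 2048):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.enc_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.bert_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norms = nn.ModuleList([nn.LayerNorm(d_model) for _ in range(4)])

    def forward(self, x, enc_out, bert_latent, tgt_mask=None):
        # 1) Masked self-attention over the partially generated target.
        x = self.norms[0](x + self.self_attn(x, x, x, attn_mask=tgt_mask)[0])
        # 2) Standard cross-attention over the encoder output.
        x = self.norms[1](x + self.enc_attn(x, enc_out, enc_out)[0])
        # 3) Extra cross-attention over the latent BERT vectors
        #    (assumed already projected to d_model).
        x = self.norms[2](x + self.bert_attn(x, bert_latent, bert_latent)[0])
        # 4) Position-wise feed-forward network.
        return self.norms[3](x + self.ffn(x))

# Shape check with random tensors (batch=2, d_model=512):
block = LatentBERTDecoderBlock()
x = torch.randn(2, 10, 512)      # decoder input states
enc = torch.randn(2, 12, 512)    # encoder output (memory)
lat = torch.randn(2, 12, 512)    # latent BERT vectors
print(block(x, enc, lat).shape)  # torch.Size([2, 10, 512])
```

Keeping the BERT attention as a separate sub-layer, rather than concatenating the latent vectors with the encoder memory, lets the residual stream weight the two sources independently; this is one design choice consistent with the description above, not necessarily the thesis's.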

