
Strengthening Information Consistency and Key-Point Coverage in Abstractive Summarization

Boosting Factual Consistency and High Coverage in Unsupervised Abstractive Summarization

Advisor: 陳宜欣

Abstract


Abstractive summarization has gradually become the mainstream approach to summarization tasks thanks to the rapid growth of pre-trained models, but the problem of summaries that are inconsistent with the source document has also become more pronounced: a summary must stay faithful to the source and must not fabricate content. Building on prior work in unsupervised abstractive summarization, this thesis adds a factual-consistency scoring mechanism to strengthen the consistency between the summary and the source document. In addition, we propose a new keyword-extraction method that uses a dependency parser (dependency parsing) to find the keywords that receive the most modifiers; these keywords are then used to guide the information the unsupervised summarizer needs to cover. Evaluated with FEQA and ROUGE, the experimental results show significant improvements in both information consistency and key-point coverage.
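A minimal sketch of the idea of ranking keywords by how heavily they are modified in a dependency parse, assuming spaCy and its small English pipeline; the function name, part-of-speech filter, and sample text are illustrative assumptions, not the thesis's actual implementation.

```python
import spacy
from collections import Counter

# Assumes the small English pipeline is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def rank_keywords(text: str, top_k: int = 10) -> list[str]:
    """Rank candidate keywords by how many dependents modify them in the parse tree."""
    doc = nlp(text)
    modifier_counts = Counter()
    for token in doc:
        # Consider only content words as keyword candidates.
        if token.is_stop or token.is_punct or token.pos_ not in {"NOUN", "PROPN"}:
            continue
        # The number of children of a token approximates how heavily it is modified.
        modifier_counts[token.lemma_.lower()] += sum(1 for _ in token.children)
    return [word for word, _ in modifier_counts.most_common(top_k)]

if __name__ == "__main__":
    sample = ("The proposed unsupervised summarization model rewards summaries "
              "whose claims remain consistent with the original source document.")
    print(rank_keywords(sample, top_k=5))
```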

Parallel Abstract


Abstractive summarization has gradually gained importance because of the rapid growth of pre-trained language models. However, the models sometimes generate summaries containing information that is inconsistent with the original document. Presenting information that differs from the original document is a critical problem in summarization, which we label factual inconsistency. This research proposes an unsupervised abstractive summarization method based on reinforcement learning that improves both factual consistency and coverage. It includes a novel mechanism designed to maintain factual consistency between the generated summary and the original document, as well as a novel method for ranking keywords; these keywords support the model and track how much of the source information the summary covers. The results validate the approach and show that it outperforms existing methods.
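A minimal sketch of how coverage can be measured with ROUGE, assuming the `rouge-score` Python package; the FEQA factual-consistency metric is not reproduced here, and the reference and generated texts are illustrative only.

```python
# Requires: pip install rouge-score
from rouge_score import rouge_scorer

# Score a generated summary against a reference text (or, in an unsupervised
# setting, against key content extracted from the source document).
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "the model improves factual consistency and coverage of summaries"
generated = "the model improves the coverage and consistency of generated summaries"

scores = scorer.score(reference, generated)
for name, s in scores.items():
    # Each entry reports precision, recall, and F1 for that ROUGE variant.
    print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F1={s.fmeasure:.3f}")
```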

