Abstractive summarization has become the mainstream approach to summarization with the rapid growth of pre-trained models, but the problem of summaries being inconsistent with the source document has also become more pronounced: a summary must stay faithful to the source and should not fabricate content. Building on prior work in unsupervised abstractive summarization, this thesis adds a factual-consistency scoring mechanism to strengthen the consistency between the summary and the source document. We also propose a new keyword-extraction method that uses a dependency parser to find the most heavily modified keywords; these keywords guide the information that the unsupervised summary must cover. Evaluated with FEQA and ROUGE, our experiments show significant improvements in both factual consistency and coverage of key information.
Abstractive summarization has gained importance with the rapid growth of pre-trained language models. However, these models sometimes generate summaries containing information that is inconsistent with the original document; presenting information that diverges from the source is a critical problem in summarization, which we call factual inconsistency. This research proposes an unsupervised abstractive summarization method that uses reinforcement learning to improve factual consistency and coverage. It includes a novel mechanism for maintaining factual consistency between the generated summary and the original document, as well as a novel keyword-ranking method; the keywords guide the model and track how much of the source information the summary covers. Evaluated with FEQA and ROUGE, our method significantly outperforms existing methods in both factual consistency and coverage.
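The keyword-ranking idea described above can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: it assumes a dependency parse is available as a list of `(token, head_index)` pairs (a toy stand-in for the output of a real dependency parser), and ranks tokens by how many dependents (modifiers) attach to them.

```python
# Minimal sketch: rank tokens as keywords by counting how many
# dependents (modifiers) each token has in a dependency parse.
# The (token, head_index) representation is a simplified stand-in
# for a real parser's output; all names here are illustrative.
from collections import Counter

def rank_keywords(parse, top_k=3):
    """parse: list of (token, head_index) pairs; head_index == -1 marks the root.
    Returns the top_k tokens sorted by number of dependents, most-modified first."""
    counts = Counter(head for _, head in parse if head >= 0)
    ranked = sorted(range(len(parse)), key=lambda i: counts[i], reverse=True)
    return [parse[i][0] for i in ranked[:top_k]]

# "the quick brown fox jumps": "fox" is modified by "the", "quick", "brown",
# so it ranks first; "jumps" (the root, with one dependent) ranks second.
toy_parse = [("the", 3), ("quick", 3), ("brown", 3), ("fox", 4), ("jumps", -1)]
print(rank_keywords(toy_parse, top_k=2))  # → ['fox', 'jumps']
```

In a full system, the parse would come from an off-the-shelf dependency parser, and the most-modified keywords would then be fed to the summarizer as content that the generated summary is expected to cover.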