
Automatic Text Summarization Using Relationship between Words and Sentences (運用字詞與語句關係自動萃取文件摘要之研究)

Advisor: 林熙禎

Abstract


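This study uses NGD (Normalized Google Distance) to build two summarization methods: a sentence-feature method based on a word-relationship network, and a graph-based method based on a sentence-cohesion network. Because computing NGD requires only the words contained in the document itself and Google search result counts, the methods do not depend on domain-specific datasets or word-association dictionaries. The results of the two methods are then combined through an unsupervised preference-voting scheme, producing a final summary that reflects the consensus of both. Evaluated with ROUGE, the proposed sentence-feature method, which scores sentences using the word-relationship network, outperforms TF-IDF scoring based on word statistics. The sentence-cohesion network method and the overall rank-score combination method perform only slightly worse than one DUC 2002 system that combined summaries using machine learning. This demonstrates that the study establishes an effective unsupervised, single-document, extractive summarization method that depends on neither related corpora nor semantic-relation dictionaries.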

Parallel Abstract


This study proposes a feature-based and a graph-based summarization method, both built on graphs that represent the text by connecting words and sentences using NGD. The methods remove the reliance on text corpora and lexical databases, because NGD is calculated using only the words in the document and the Google search result counts for word pairs. We also propose an aggregation method that combines the results of the two summarization methods to generate better summaries. The experimental results show that the ROUGE score of the proposed feature-based method is higher than that of a feature-based method using TF-IDF. The ROUGE scores of the proposed graph-based and aggregate methods are only slightly lower than those of one of the DUC 2002 peers. These results indicate that the proposed approach is an effective unsupervised single-document summarization method that requires neither a text corpus nor a lexical database.
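
As an illustration of why NGD needs only search result counts, here is a minimal sketch of the standard Normalized Google Distance formula. It is not the thesis's implementation: the function name `ngd` and its parameters are hypothetical, the hit counts are assumed to have been obtained elsewhere, and `n` (the assumed total number of indexed pages) must be chosen by the caller.

```python
import math


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Normalized Google Distance between two terms x and y.

    fx, fy -- hit counts for each term alone (assumed inputs)
    fxy    -- hit count for pages containing both terms
    n      -- assumed total number of indexed pages; must exceed min(fx, fy)
    """
    # Terms with no hits, or that never co-occur, are treated as
    # infinitely distant.
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))
```

A smaller value means the two terms co-occur more often relative to their individual frequencies. Such pairwise distances could serve as edge weights in a word or sentence graph.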
