
Automatic Text Summarization Based on Weights of Words (運用詞彙權重技術於自動文件摘要之研究)

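Abstract


The page summaries that search engines currently generate mostly fail to give users enough information to judge the content, and may even mislead them. This study aims for search engines to return, along with the query results, not merely fragmentary and incomplete information but a more helpful summary, from which users can grasp the gist of the full text and then decide whether to read the whole web page. The study applies weighting techniques to perform text mining on web page content: text is segmented with the Chinese word segmentation system developed by Academia Sinica (CKIP), summaries are produced separately with TF-ISF and with similarity weighting, and the union and intersection of the two yield a "rough summary" and an "accurate summary" respectively, in order to improve the quality of automatic summarization. The experimental results confirm that the proposed method effectively improves the accuracy of automatic document summarization.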
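The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the text has already been segmented into word lists (the step the paper delegates to CKIP), that a sentence's TF-ISF score is the mean TF-ISF of its words, and that "similarity weighting" means cosine similarity between a sentence and the whole document. The abstract does not specify either formula, and the function names and the cutoff `k` are hypothetical.

```python
import math
from collections import Counter


def tf_isf_scores(sentences):
    """Score each tokenized sentence by the mean TF-ISF of its words.

    Assumption: ISF(w) = log(N / sf(w)), where N is the number of sentences
    and sf(w) is the number of sentences that contain w.
    """
    n = len(sentences)
    sf = Counter(w for s in sentences for w in set(s))
    scores = []
    for s in sentences:
        tf = Counter(s)
        total = sum(tf[w] * math.log(n / sf[w]) for w in tf)
        scores.append(total / len(s) if s else 0.0)
    return scores


def similarity_scores(sentences):
    """Cosine similarity of each sentence to the whole document (assumed measure)."""
    doc = Counter(w for s in sentences for w in s)
    doc_norm = math.sqrt(sum(c * c for c in doc.values()))
    scores = []
    for s in sentences:
        vec = Counter(s)
        norm = math.sqrt(sum(c * c for c in vec.values()))
        dot = sum(c * doc[w] for w, c in vec.items())
        scores.append(dot / (norm * doc_norm) if norm and doc_norm else 0.0)
    return scores


def top_k(scores, k):
    """Return the indices of the k highest-scoring sentences as a set."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def summarize(sentences, k):
    """Return (rough, accurate) summaries as sorted lists of sentence indices."""
    by_tf_isf = top_k(tf_isf_scores(sentences), k)
    by_similarity = top_k(similarity_scores(sentences), k)
    rough = sorted(by_tf_isf | by_similarity)     # union: broader coverage
    accurate = sorted(by_tf_isf & by_similarity)  # intersection: stricter agreement
    return rough, accurate
```

Calling `summarize(sentences, k)` on a list of tokenized sentences returns two index lists. The accurate summary is always a subset of the rough one, since it keeps only the sentences that both rankings select.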

Parallel Abstract


Purpose - The objective of text document summarization is to extract the essential sentences that cover most of the concepts of a document, so that users can grasp the ideas the document addresses simply by reading the corresponding summary. This study aims to develop an automatic text summarization technique that produces summaries of web pages by extracting the sentences that cover most of their concepts. Design/methodology/approach - The research framework was developed from the CKIP (Chinese Knowledge Information Processing) system and automatic text summarization techniques. Two studies were designed to evaluate the accuracy and applicability of five automatic text summarization techniques, using 10 samples drawn from 184 web articles. Findings - The results show that TF-ISF (Term Frequency-Inverse Sentence Frequency) outperforms the other techniques on F-measure. Further, the "Rough Summary" performs best on recall and the "Accurate Summary" performs best on precision. Research limitations/implications - This paper focuses on Chinese web articles. Future research is recommended to develop an automatic text summarization system on an ontology-based architecture. Practical implications - This paper provides several automatic text summarization techniques that produce summaries of web pages by extracting the sentences that cover most of their concepts. The experimental results indicate that the proposed approach significantly improves the accuracy of automatic text summarization. Originality/value - This paper is the first to apply union and intersection operations, which yield the "Rough Summary" and the "Accurate Summary" respectively, to improve the quality of automatic text summarization.
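To make the evaluation measures concrete, the sketch below computes precision, recall, and F-measure for a summary against a reference set of sentences. It assumes sentence-level matching by index and the balanced F-measure (F1). The abstract does not state how the paper defines its F-measure, and the function name and sample index sets are hypothetical.

```python
def evaluate(selected, reference):
    """Return (precision, recall, f_measure) for selected vs. reference sentence indices.

    Assumption: f_measure is the balanced F1, 2PR / (P + R).
    """
    selected, reference = set(selected), set(reference)
    hits = len(selected & reference)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical data: a reference summary and two system summaries.
reference = {0, 2, 5}
rough = {0, 1, 2, 5}   # a union-style summary: larger, so it tends to cover more
accurate = {0, 2}      # an intersection-style summary: smaller, so it tends to be cleaner

rough_p, rough_r, rough_f = evaluate(rough, reference)
acc_p, acc_r, acc_f = evaluate(accurate, reference)
```

In this made-up example the larger set reaches full recall at lower precision, while the smaller set reaches full precision at lower recall. That pattern is consistent with the finding that the rough summary does best on recall and the accurate summary does best on precision.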


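Cited by


莊秉哲 (2017). 中文新聞自動摘要產生系統 [An automatic summary generation system for Chinese news] [master's thesis, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2017.00003
Lin, C. H. (2017). 以文字探勘探討社群媒體文件分類之研究─以線上遊戲為例 [A study of social media document classification using text mining: the case of online games] [master's thesis, National Taipei University of Business]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0064-0201201815280103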
