
An Adaptive Approach to Concept Extraction from Searched Documents

具調適性之概念擷取法於擷取搜尋所得之文件

Abstract


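For users, gaining a complete picture of a specific topic by browsing the information returned by search engines is impractical, if not impossible; and because that information is dynamic in nature, keeping track of its latest changes is harder still. This study therefore focuses on developing a method that automatically extracts concepts from searched documents and detects how those concepts change over time. It proposes representing a concept by paragraphs and their corresponding keywords, and incrementally adjusting the concept structure to reflect these changes. To assess the effectiveness of the proposed method, two experiments were conducted; the results show that both recall and precision reach a high level. Finally, the method was tested on a practical application, the Southeast Asian tsunami event; the results show that it helps readers quickly grasp the different concepts in the tsunami coverage and their latest developments.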

Parallel Abstract


It is usually impractical, if not impossible, for users to browse through the information collected by search engines and gain an overall picture of a given topic, let alone follow how that information changes over time, given its dynamic nature. This research develops an approach that automatically extracts concepts from searched documents and detects how those concepts change over time. Concepts are represented as paragraph summaries with associated key terms, and the concept structure is adaptively modified to accommodate changes. Two experiments were conducted to evaluate the approach; the results show both high recall and high precision. The tsunami event was chosen as an illustrative real-world application. The results show that the approach lets readers readily grasp the different concepts in the tsunami reports and follow their changes.
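
The abstract reports recall and precision without saying how they were computed. As a minimal sketch, assuming the standard set-based definitions applied to extracted key terms, the two measures could be calculated as follows. The function name and the term lists are hypothetical, invented purely for illustration, and are not taken from the paper's experiments.

```python
def precision_recall(extracted, relevant):
    """Set-based precision and recall (standard definitions, assumed here).

    extracted: terms the method returned (hypothetical input)
    relevant:  terms judged correct by a reference (hypothetical input)
    """
    extracted, relevant = set(extracted), set(relevant)
    hits = extracted & relevant  # terms that are both returned and correct
    precision = len(hits) / len(extracted) if extracted else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data only; these are not results from the paper.
extracted_terms = ["tsunami", "earthquake", "relief", "tourism"]
relevant_terms = ["tsunami", "earthquake", "relief", "warning", "casualties"]

p, r = precision_recall(extracted_terms, relevant_terms)
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision is the share of extracted terms that are correct, and recall is the share of correct terms that were extracted, so a method can only score high on both by returning most of the right terms and few wrong ones.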

