
Domain Content and Knowledge Integration Based on Web Information Extraction and Ontology Fusion

Advisor: 林宣華

Abstract


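This thesis applies Web mining methods to automatically collect the large volume of web page information scattered across websites and directories, and to distill from it the data that is highly relevant to a domain. We implement the proposed methods in an intelligent system that analyzes and extracts the knowledge structures of domain-related websites (for example, sitemaps). Starting from these simple domain ontologies, the system analyzes the relationships between catalog and catalog, catalog and object, and object and object, and fuses them into a more complete Domain Ontology (DO). Besides supporting the construction of domain knowledge, the DO can give users a friendlier browsing path, so that they find the results they want more quickly and accurately.

With these motivations, this thesis designs and implements a system that integrates domain content and knowledge through Web information extraction and ontology fusion. The system automatically fuses the domain ontologies of different domain-related websites (a process called Ontology Fusion) and serves as a prototype for building domain knowledge automatically. It is built on the I3S (Intelligent Internet Information System) [8] platform and reaches this goal in three stages.

First, I3DDC (Domain Data Collector) collects domain-related web pages quickly and effectively. I3DME (Domain Metadata Extractor) then extracts the important domain-related metadata and domain concepts. Next, the system develops I3CF (Catalog Fusion), a component of I3DKF (Domain Knowledge Fusioner), which merges and moves catalogs in two modes: Concept-Based CF and Object-Based CF. We propose an Ontology Fusion method that analyzes the relatedness among catalogs and can further extract the relations between catalogs and objects, extending the catalog structure into an ontology.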

Parallel Abstract


In this thesis, we apply web mining techniques to automatically collect large amounts of information scattered across websites and page directories, and to extract the information that is highly relevant to a domain. We then propose an intelligent system that analyzes the structures of websites and directories and extracts their structural knowledge as simple ontologies for the domain. Each simple ontology corresponds to a portal directory and consists of relationships between “catalog and catalog”, “catalog and object”, and “object and object”. Fusing these ontologies into the Domain Ontology (DO) is feasible because they are all extracted from domain-related websites and directories. Based on these three types of relationships, “concept-based”, “object-based”, and “relation-based” fusion approaches are proposed and integrated into the ontology-fusion process. Experiments show that the fused DO not only serves as the largest portal directory for the domain, but is also better than any single portal site at organizing structural knowledge and at supporting browsing for and finding objects.
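The concept-based and object-based fusion modes named above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the thesis's actual I3CF implementation: the `Catalog` class, the `normalize`, `jaccard`, and `fuse_catalogs` helpers, label matching by normalized name, Jaccard overlap as the object-similarity measure, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """A directory node: a concept label plus the objects filed under it."""
    name: str
    objects: set[str] = field(default_factory=set)


def normalize(name: str) -> str:
    # Assumption: two labels denote the same concept when they are equal
    # after lowercasing and collapsing whitespace.
    return " ".join(name.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    # Share of objects common to both catalogs; 0.0 when both are empty.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fuse_catalogs(sites: list[list[Catalog]], threshold: float = 0.5) -> list[Catalog]:
    """Fuse the catalogs of several sites into one list (hypothetical rule set)."""
    fused: list[Catalog] = []
    for site in sites:
        for cat in site:
            # Concept-based fusion: look for a catalog with the same label.
            target = next(
                (c for c in fused if normalize(c.name) == normalize(cat.name)),
                None,
            )
            # Object-based fusion: otherwise look for enough shared objects.
            if target is None:
                target = next(
                    (c for c in fused if jaccard(c.objects, cat.objects) >= threshold),
                    None,
                )
            if target is None:
                fused.append(Catalog(cat.name, set(cat.objects)))
            else:
                target.objects |= cat.objects
    return fused


# Two made-up portal directories with partly overlapping content.
site_a = [Catalog("Computer Science", {"b1", "b2"}), Catalog("History", {"b3"})]
site_b = [
    Catalog("computer  science", {"b2", "b4"}),
    Catalog("Past Events", {"b3"}),
    Catalog("Cooking", {"b5"}),
]
fused = fuse_catalogs([site_a, site_b])
```

In this toy input, the two "computer science" catalogs merge by label, "Past Events" merges into "History" because they hold the same object, and "Cooking" stays separate. Relation-based fusion is not covered by the sketch.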

References


[1] Chinese MARC, http://catweb.ncl.edu.tw/.
[2] Chakrabarti, S., van den Berg, M. and Dom, B., “Focused crawling: A new approach to topic-specific web resource discovery,” Proceedings of the 8th World Wide Web Conference, Toronto, 1999.
[3] Glover, E., Flake, G., Lawrence, S., Birmingham, W. P., Kruger, A., Giles, C. L. and Pennock, D., “Improving category specific web search by learning query modifications,” Symposium on Applications and the Internet (SAINT), San Diego, CA, January 8–12, 2001.
[4] Dublin Core, “Dublin Core Metadata Initiative,” available at http://dublincore.org/.
[5] Findbook, http://findbook.tw/.
