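With the explosion of information on the Internet, people need an effective way to retrieve the information they truly need. The Semantic Web adds a metadata layer on top of the existing World Wide Web (WWW) and uses metadata to describe resources on the Web. It extends the current Web structure, gives information a well-defined meaning, and enables people and computers to process information cooperatively. In this thesis, we design and implement a system for extracting domain events and providing semantic services. The architecture consists of three parts: back-end extraction components, an ontology-based store, and a service front end. The back end contains several components for extracting domain events. The ontology-based store serves as a common interface that converts the extracted domain events into data of specific formats and stores the converted data in specific repositories. The service front end provides several semantic services. After building the whole system, we evaluate it by extracting events of a particular domain and discuss the factors that affect the extraction results.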
As the volume of information on the web explodes, people need an efficient way to extract the information they actually need. The Semantic Web is an emerging technology that builds a metadata layer on top of the current World Wide Web (WWW) and uses a metadata description language to describe the resources on it. It extends the current Web by giving information well-defined meaning, better enabling computers and people to work in cooperation. In this thesis, we design and implement a system that extracts domain events from a large number of relevant documents and provides semantic services over them. The architecture consists of three parts: the Back End Extraction Components, the Ontology-based Store, and the Service Front End. The Back End consists of several components that extract domain events. The Ontology-based Store serves as a common interface: it takes the extracted domain events as input, converts them into specific data formats, and stores each format in a corresponding repository. The Service Front End provides several semantic services. After building the whole system, we evaluate it by extracting events of a specific domain from relevant documents and examine which factors influence the extraction results.
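The three-part architecture described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the names `DomainEvent`, `OntologyStore`, `ServiceFrontEnd`, `keyword_extractor`, and `run_pipeline` are hypothetical, the "earthquake" domain is an arbitrary example, and the in-memory list of subject-predicate-object triples merely stands in for whatever ontology repository the actual system uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A (subject, predicate, object) statement; a stand-in for RDF-style metadata.
Triple = Tuple[str, str, str]


@dataclass
class DomainEvent:
    """Hypothetical record produced by a back-end extraction component."""
    event_type: str
    attributes: Dict[str, str]


# Back end: any callable that maps document text to a list of events.
Extractor = Callable[[str], List[DomainEvent]]


def keyword_extractor(text: str) -> List[DomainEvent]:
    """Toy extractor (assumption): a sentence mentioning 'earthquake' is an event."""
    events = []
    for sentence in text.split("."):
        sentence = sentence.strip()
        if "earthquake" in sentence.lower():
            events.append(DomainEvent("Earthquake", {"description": sentence}))
    return events


class OntologyStore:
    """Common interface: converts events to triples and keeps them in memory."""

    def __init__(self) -> None:
        self.triples: List[Triple] = []
        self._next_id = 0

    def add_event(self, event: DomainEvent) -> str:
        subject = f"event:{self._next_id}"
        self._next_id += 1
        self.triples.append((subject, "rdf:type", event.event_type))
        for predicate, value in event.attributes.items():
            self.triples.append((subject, predicate, value))
        return subject

    def subjects_of_type(self, event_type: str) -> List[str]:
        return [s for s, p, o in self.triples
                if p == "rdf:type" and o == event_type]


class ServiceFrontEnd:
    """Front end: one example semantic service, lookup of events by type."""

    def __init__(self, store: OntologyStore) -> None:
        self.store = store

    def find_events(self, event_type: str) -> List[Dict[str, str]]:
        results = []
        for subject in self.store.subjects_of_type(event_type):
            # Collect every predicate/object pair recorded for this subject.
            results.append({p: o for s, p, o in self.store.triples
                            if s == subject})
        return results


def run_pipeline(documents: List[str],
                 extractors: List[Extractor],
                 store: OntologyStore) -> None:
    """Run every extractor over every document and store the resulting events."""
    for document in documents:
        for extract in extractors:
            for event in extract(document):
                store.add_event(event)
```

In this sketch the store is the only component the extractors and the front end share, which mirrors the role of the ontology-based store as a common interface: new extractors or new services could be added without either side knowing about the other.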