Term Relation Discovery via an Example-Based Web Mining Approach

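Many resources on the World Wide Web generate new terms every day. Term-based approaches have long been the mainstream in information retrieval. Organizing terms into a structure that represents information, such as a term graph, benefits many information retrieval applications, including question answering and automatic summarization. However, organizing the terms added to the Web each day faces two difficulties. First, these resources provide no context for the terms, whereas traditional relation extraction methods take documents as the basis of analysis. Second, new terms keep appearing and can stand in many possible kinds of relation, which are hard to define fully in advance. This study proposes an example-based Web mining approach to address both problems. The user first gives an example of a term relation of interest. A search engine is then used to obtain features for each term, and other related term pairs holding the same relation are found by comparing their similarity to the example. Finally, features shared across the group of term pairs holding the same relation are identified and used as the relation label for that group. Experiments evaluate the accuracy of the approach and discuss how its performance varies with the choice of examples, the number of examples, and several types of term relation.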

Keywords

term relation; term graph; term organization; Web mining

Parallel Abstract

Terminological resources on the Web are numerous and grow day by day. Term-based approaches are the major information retrieval methods. Organizing terms into a well-formed information structure such as a term graph is helpful for advanced IR applications, such as question answering and summarization. However, two problems arise in constructing a useful term graph from these growing terminological resources. One is that terminological resources offer no context information, unlike the documents used in document-based relation extraction. The other is that no explicit, specific relation types are predefined. To solve these problems, we propose an example-based Web mining approach to discover term relations from a term set. We identify relations by organizing Related Term Pairs (RTPs) according to the similarity of their relations to a user-given RTP example. We use a Web mining approach to estimate similarity from the context words that occur in the search results obtained by querying the RTP. We test our approach on a simulated term set. The experiments examine performance on several relation types, as well as the influence of example selection and of the number of examples.
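The steps named in the abstract (collect context words for each RTP from search results, compare each candidate RTP with the user-given example, then label the matching group by its shared features) can be illustrated with a small sketch. Everything concrete here is an assumption for illustration and not taken from the thesis: the snippets are toy data standing in for search-engine results, the stopword list is arbitrary, cosine similarity over bag-of-words counts is one plausible similarity measure, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Toy stand-in for search-engine snippets returned when querying each term pair.
# Hypothetical data; a real system would query a search engine here.
TOY_SNIPPETS = {
    ("Paris", "France"): ["Paris is the capital of France",
                          "the capital city of France is Paris"],
    ("Tokyo", "Japan"): ["Tokyo is the capital of Japan",
                         "Japan moved its capital to Tokyo"],
    ("Toyota", "Japan"): ["Toyota is a car maker based in Japan",
                          "Toyota builds cars in Japan"],
}

# Arbitrary small stopword list (assumption).
STOPWORDS = {"is", "the", "of", "a", "in", "to", "its"}


def context_vector(pair, snippets):
    """Count context words in the snippets, excluding the pair's own terms."""
    own_terms = {term.lower() for term in pair}
    counts = Counter()
    for snippet in snippets:
        for word in re.findall(r"[a-z]+", snippet.lower()):
            if word not in own_terms and word not in STOPWORDS:
                counts[word] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_example(example, candidates, snippets_by_pair):
    """Score each candidate pair against the example pair, best first."""
    example_vec = context_vector(example, snippets_by_pair[example])
    scored = [
        (cosine(example_vec, context_vector(pair, snippets_by_pair[pair])), pair)
        for pair in candidates
    ]
    return sorted(scored, reverse=True)


def relation_label(pairs, snippets_by_pair, top_n=1):
    """Pick the most frequent context words shared by every pair in the group."""
    vectors = [context_vector(pair, snippets_by_pair[pair]) for pair in pairs]
    shared = set(vectors[0])
    for vec in vectors[1:]:
        shared &= set(vec)
    totals = Counter()
    for vec in vectors:
        totals.update(vec)
    return sorted(shared, key=lambda word: (-totals[word], word))[:top_n]


example = ("Paris", "France")
ranking = rank_by_example(
    example, [("Tokyo", "Japan"), ("Toyota", "Japan")], TOY_SNIPPETS
)
label = relation_label([example, ranking[0][1]], TOY_SNIPPETS)
```

With this toy data, the pair ("Tokyo", "Japan") ranks above ("Toyota", "Japan") because it shares the context word "capital" with the example, and that shared word becomes the group's label. A real system would need a threshold or clustering step to decide which candidates belong to the example's relation; the sketch leaves that choice open.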

Parallel Keywords

term relation; term graph; term organizing; Web mining

