Context-Sensitive In-Page Search

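This thesis describes an automatic search method that links a user-supplied query term, based on its context, to articles in a Wikipedia-like knowledge base. The matched articles are presented to the user as reference material, which relieves the user of the burden of choosing the correct sense of the term. The method first uses information from a large wiki-like knowledge base to augment a smaller one, then computes a hyperlink similarity model from the augmented knowledge base, and finally uses the quantified output of that model as training data for a support vector machine. The trained model can then determine, from the surrounding context, which knowledge-base article a query term should map to. Experimental results show that the system effectively selects the correct Wikipedia article for a given query term and its context. The result can serve as the core of a domain-specific search system and, with appropriate modification, could be applied to cross-lingual search.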

Keywords

context-sensitive search; word sense disambiguation; entity linking; Wikipedia; support vector machine

Parallel Abstract

In this paper we introduce a method for retrieving appropriate articles from knowledge bases (e.g. Wikipedia) for a given query and its context. In our approach, this problem is transformed into a multi-class classification of candidate articles. The method involves automatically augmenting small knowledge bases using larger knowledge bases and learning to choose adequate articles based on hyperlink similarity between an article and the context. At run-time, keyphrases in the given context are extracted, and the sense ambiguity of the query term is resolved by computing the similarity of keyphrases between the context and the candidate articles. Evaluation shows that the method significantly outperforms the strong baseline of assigning the most frequent article to each query term. Our method effectively determines adequate articles for given query-context pairs, suggesting the possibility of using it in context-aware search engines.
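The run-time step described above (comparing keyphrases in the context with those of each candidate article) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Jaccard overlap as the similarity measure and picks the highest-scoring candidate directly, whereas the abstract describes feeding similarity values to a support vector machine. The function names and the toy data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of keyphrases (assumed measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(context_keyphrases, candidates):
    """Score each candidate article against the context.

    candidates: dict mapping an article title to the keyphrases
    (e.g. hyperlink targets) found in that article.
    Returns (score, title) pairs, highest score first, ties broken by title.
    """
    scored = [
        (jaccard(context_keyphrases, links), title)
        for title, links in candidates.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Toy data, invented for illustration only.
candidates = {
    "Java (programming language)": {"compiler", "class", "virtual machine", "object"},
    "Java (island)": {"Indonesia", "Jakarta", "volcano", "island"},
}
context = ["compiler", "class", "bytecode"]

ranking = rank_candidates(context, candidates)
best = ranking[0][1]
print(best)
```

In a setup closer to the one the abstract outlines, scores like these would be features for a trained classifier rather than the final decision rule.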
