  • Degree thesis

An Implemented System for Correcting Chinese Information Retrieval Problems Based on eHownet and the Apriori Algorithm

Web query expansion based on association rules mining with eHownet and Google chrome extension

Advisor: 劉如生

Abstract


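Search engine users may fail to find the web pages they want when they lack background knowledge of the query topic or use Chinese vocabulary imprecisely, for example children whose language skills are still developing and who therefore choose imprecise Chinese keywords. The problem is compounded by the habit of most users to consult only the first few pages of search results. In this thesis, we propose a query expansion system built as a Google Chrome extension. On the one hand, the system uses Academia Sinica's eHownet knowledge ontology to expand the user's original, imprecise Chinese query. On the other hand, it uses the Apriori algorithm to reduce the noise words that the knowledge base may introduce. Finally, the system offers the user suggested revisions based on the original query and improves the precision of Chinese retrieval with the Google search engine.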

Parallel Abstract


In information retrieval, the same concept may be referred to using different words. This issue, known as synonymy, affects the precision and recall of most information retrieval systems. According to many surveys of search engine user behavior, around 50% of users have insufficient or no background knowledge of the topics they query. Users may misuse Chinese synonyms or enter imprecise keywords as search queries, which decreases precision. Methods by which a system can help with query refinement are therefore crucial to improving the quality of users' search results. In this paper, we propose a query expansion system, built as a Google Chrome extension and based on the online thesaurus eHownet, that suggests additional query terms to users of the Google search engine. However, eHownet may introduce considerable noise into the expansion because it is collection-independent. To overcome this issue, we use the Apriori learning algorithm to discover association rules in the retrieved web pages and to filter out noise words, improving the precision of the queries.
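The filtering step described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the eHownet lookup is replaced by a hard-coded candidate list, each retrieved page is reduced to a set of terms, and the names `frequent_itemsets`, `filter_expansions`, `min_support`, and `min_confidence`, as well as the threshold values and toy data, are assumptions made for illustration. A candidate expansion term is kept only if the rule "query terms → candidate" is supported by the retrieved pages.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori: map each frequent itemset (frozenset) to its support."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    # Level 1 candidates: every distinct term seen in any page.
    current = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many pages contain each candidate itemset.
        counts = {c: sum(1 for t in sets if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def filter_expansions(query_terms, candidates, transactions,
                      min_support=0.4, min_confidence=0.6):
    """Keep candidates for which the rule query -> candidate is strong enough."""
    frequent = frequent_itemsets(transactions, min_support)
    query = frozenset(query_terms)
    base = frequent.get(query)
    if base is None:  # the query itself is too rare in the retrieved pages
        return []
    kept = []
    for term in candidates:
        joint = frequent.get(query | {term})
        # confidence(query -> term) = support(query + term) / support(query)
        if joint is not None and joint / base >= min_confidence:
            kept.append(term)
    return kept


# Hypothetical toy data: each set stands for the terms of one retrieved page.
pages = [
    {"apple", "fruit", "juice"},
    {"apple", "fruit", "pie"},
    {"apple", "fruit"},
    {"apple", "phone"},
    {"banana", "fruit"},
]
# Hypothetical thesaurus output for the query "apple".
print(filter_expansions(["apple"], ["fruit", "phone", "company"], pages))
# prints ['fruit']
```

In this toy run, "phone" is dropped because it co-occurs with the query in only one of five pages, and "company" is dropped because it never appears in the retrieved pages. That matches the role the abstract assigns to Apriori, which is to remove thesaurus terms that the collection does not support.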

Cited By


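李琳 (2017). A Study of Event Detection and Tracking Based on Social Media Messages (基於社群媒體訊息的事件偵測與追蹤之研究) [Master's thesis, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2017.00563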

Further Reading