
A 2-Stage Ranking Method to Merge Multiple Search Results

Advisor: 鄭卜壬 (Pu-Jen Cheng)

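Abstract


Metasearch studies how to treat the results of multiple search systems as existing evidence and merge them into a single, better result. In this thesis, we use machine learning to propose a new two-stage ranking algorithm for merging multiple search results. The method builds on classification: every document is classified first, and the classification results are then used to rank the documents and produce the final answer. In the first stage, we assign each document one of four relevance levels. Once every document carries this label, the second stage uses a linear combination to rank the documents further and obtain the final result. We implemented the method on the NTCIR4 English-English document standard test collection. In our experiments, the two-stage ranking method significantly outperformed several baseline methods, which indicates that the algorithm is effective.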

Keywords

Metasearch

Parallel Abstract


Metasearch is the problem of combining the results of multiple independent search algorithms into a single result list in order to improve retrieval effectiveness. We propose a novel 2-stage ranking method that addresses this problem with machine learning, using classification as its core idea. In the first stage, we label each document in the search results as relevant or irrelevant by classification, and we discuss how general classification and cost-sensitive classification differ within our algorithm. Once all documents in the search results are labeled, the second stage uses these labels to produce the final ranking through a linear combination. We evaluate the 2-stage ranking method on the NTCIR4 English-English IR data. The experimental results show that our method outperforms existing metasearch algorithms by a significant margin.
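The two stages described above can be illustrated with a minimal Python sketch. It is not the thesis implementation, because the abstract does not specify the features, the classifier, or the weights of the linear combination. Everything named here is an assumption made for illustration: the reciprocal-rank features, the stand-in `toy_classifier` (which grades a document 0 to 3 by how many input lists contain it, in place of a trained classifier), and the `weight` parameter.

```python
from collections import defaultdict


def rank_features(result_lists):
    """Give each document one feature per input list: its reciprocal rank, or 0.0 if absent."""
    features = defaultdict(lambda: [0.0] * len(result_lists))
    for i, docs in enumerate(result_lists):
        for rank, doc in enumerate(docs, start=1):
            features[doc][i] = 1.0 / rank
    return dict(features)


def toy_classifier(feature_vector):
    """Hypothetical stage-1 stand-in for a trained classifier.

    Returns a relevance grade from 0 to 3 based only on how many
    input lists retrieved the document.
    """
    hits = sum(1 for value in feature_vector if value > 0)
    return min(hits, 3)


def two_stage_rank(result_lists, classify=toy_classifier, weight=0.7):
    """Merge several ranked lists into one.

    Stage 1: assign every document a relevance grade with `classify`.
    Stage 2: rank by a linear combination of the normalized grade and
             the mean reciprocal rank across the input lists.
    The value of `weight` is an assumed, untuned constant.
    """
    scored = []
    for doc, vector in rank_features(result_lists).items():
        grade = classify(vector)                 # stage 1
        fused = sum(vector) / len(vector)        # simple rank-based evidence
        score = weight * (grade / 3.0) + (1.0 - weight) * fused  # stage 2
        scored.append((score, doc))
    # Highest score first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc for _, doc in scored]


if __name__ == "__main__":
    lists = [["a", "b", "c"], ["b", "a", "d"], ["b", "e", "a"]]
    print(two_stage_rank(lists))
```

In practice the `classify` argument would be replaced by a model trained on judged documents, and a cost-sensitive variant could be swapped in the same way without changing the second stage.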

