
Analyzing Users' Query Needs from the MSN Query Log

Are We Searching for the Same Thing? A Large-Scale Analysis of Search Engine Logs

Advisor: 陳宜欣

Abstract


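Many web search techniques are motivated by the concern that users who enter the same query string may have different intentions. However, no one has yet presented strong evidence that this actually happens in real searches. This study therefore analyzes users' search behavior to measure how much user intention diverges for the same query string, as a reference for related web search techniques.

Analyzing users' actual query and browsing behavior is an important way to understand user intention. This study obtained MSN search engine query logs and user click logs provided by Microsoft Research, totaling more than 20 million records of real user query data, for analysis.

We treat the pages that users click after issuing a query as pseudo relevance feedback, and extract from them two kinds of data related to user intention: the user goal and the information need. The analysis shows that query strings with a Navigational goal (finding a website) mostly have highly consistent user intention, whereas query strings with an Informational goal (gathering knowledge) or a Resource goal (obtaining resources) show clearly more variation in intention across users. We also propose an automated method that can quickly classify the user goal of a query. Together, these results can help a search engine decide which technique to use for different query strings.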
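The abstract does not state which measure of intention divergence the thesis uses. As a minimal sketch only, the idea of treating clicked pages as pseudo relevance feedback can be illustrated with click entropy, a commonly used stand-in: a query string whose clicks all land on one page scores 0, and the score grows as clicks spread over more pages. The function names, the `(query, url)` record layout, and the toy URLs below are assumptions made for illustration, not the format of the MSN logs.

```python
import math
from collections import Counter, defaultdict


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the click distribution for one query string."""
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    # p * log2(1/p), summed over every distinct clicked URL
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_by_query(click_log):
    """Group (query, url) records by normalized query string and score each group."""
    clicks = defaultdict(list)
    for query, url in click_log:
        clicks[query.strip().lower()].append(url)
    return {q: click_entropy(urls) for q, urls in clicks.items()}


# Toy records, invented for illustration (not taken from the MSN logs).
toy_log = [
    ("msn", "portal.example"), ("msn", "portal.example"),
    ("MSN", "portal.example"), ("msn", "portal.example"),
    ("jaguar", "cars.example"), ("jaguar", "animals.example"),
    ("jaguar", "cars.example"), ("jaguar", "sports.example"),
]

scores = entropy_by_query(toy_log)
for query, score in sorted(scores.items()):
    print(f"{query}: {score:.2f} bits")
```

In this toy data the navigational-style query "msn" scores 0 bits because every click goes to the same page, while "jaguar" scores 1.5 bits because its clicks are split across three pages. A threshold on such a score is one conceivable way to separate consistent from divergent query strings, but that is an assumption here, not the classification method the thesis proposes.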

Parallel Abstract (English)


Many researchers have been working on advanced search techniques such as personalization, query reformulation, and collaborative filtering to enhance the quality of the search results. One common motivation addressed in these research works is the ambiguous-query problem. However, few studies have evaluated the ambiguity of query strings. We analyze 15 million queries from MSN search engine logs in this study. Three major questions are addressed: 1) How many people are using the same query string to search? 2) Are they searching for the same thing while using the same query string? 3) Can we distinguish ambiguous and unambiguous query strings?

