Author: Tu, Shih-Wei
Thesis Title: Identify Specialization and Generalization Relationships between Terms Using Query Refinement Behaviors
Advisor: Chen, Yi-Shin
Keywords: search log, term relationship
Many aspects of the search log (e.g., phrase extraction and query recommendations) have gained much attention from researchers. In this work, we investigate the search log and focus on analyzing user query refinement behaviors. We find that search engine users often specialize and generalize queries in order to get better search results. According to this finding, we further propose the hypothesis that a refinement behavior can be considered a specialized or generalized action of a query term. To verify the hypothesis, we introduce a framework that aims to automatically mine the specialization, generalization, and replacement relationships between terms from the search log. Furthermore, the mining system is built based on a large-scale search log by using the annotations of numerous Web users (i.e., Web users annotate what they want by queries). Our results demonstrate that the semantic relationships between terms can be retrieved by analyzing the searching behaviors of search engine users. We also show the feasibility of systematically creating an ontology using the “wisdom of the crowd.”
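In recent years, search engine logs have been widely used across research areas such as keyword extraction and query recommendation. This study analyzes search engine logs, focusing on users' query refinement behaviors. While examining the logs, we found that many search engine users refine their queries by specializing or generalizing them, using broader or narrower query terms in the hope of obtaining results that better match their needs. Based on this observation, we propose a hypothesis: a user's query refinement behavior can be regarded as a semantic specialization or generalization between terms. To verify this hypothesis, we present a system that automatically extracts semantic relationships between terms, such as specialization, generalization, and replacement, from search engine logs. The system not only uses a large-scale search engine log as its data source but also aggregates the descriptions that numerous search engine users provide through the queries they issue and the result pages they click, and extracts the semantic relationships between terms from them. Our results confirm that semantic relationships between terms can indeed be obtained by analyzing the search behaviors of search engine users. We also present a feasible method for systematically and automatically building an ontology using the "wisdom of the crowd."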
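The hypothesis in the abstract can be illustrated with a small example. The following is a minimal sketch only, assuming that queries are split on whitespace and compared as sets of terms; the function name `classify_refinement` and the set-comparison rule are hypothetical illustrations and are not taken from the thesis, whose actual extraction method may differ.

```python
from typing import Optional


def classify_refinement(prev_query: str, next_query: str) -> Optional[str]:
    """Label a pair of consecutive queries from one session.

    Assumption (not from the thesis): terms are whitespace-separated
    tokens, and a refinement is judged by comparing the two term sets.
    """
    prev_terms = set(prev_query.lower().split())
    next_terms = set(next_query.lower().split())

    # Empty or identical queries carry no refinement signal.
    if not prev_terms or not next_terms or prev_terms == next_terms:
        return None
    # Terms were only added: the user narrowed the query.
    if prev_terms < next_terms:
        return "specialization"
    # Terms were only removed: the user broadened the query.
    if next_terms < prev_terms:
        return "generalization"
    # Some terms kept, others swapped: treat as a replacement.
    if prev_terms & next_terms:
        return "replacement"
    # No shared terms: likely a new topic rather than a refinement.
    return None


print(classify_refinement("american idol", "american idol winners"))
print(classify_refinement("american idol winners", "american idol"))
print(classify_refinement("american idol winners", "american idol judges"))
```

Under these assumptions, adding "winners" to "american idol" is labeled a specialization, dropping it is labeled a generalization, and swapping "winners" for "judges" is labeled a replacement. A real system would also need session boundaries so that only queries from the same search task are paired.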
Chinese Abstract
Abstract
Acknowledgement
List of Tables
List of Figures
1 Introduction
3 Framework
3.1 Session Segmentation
3.2 Refinement Extraction
3.3 Relationship Builder
3.3.1 Specialization Relationship Builder
3.3.2 Generalization Relationship Builder
3.3.3 Replacement Relationship Builder
3.4 Relationship Extractor
4 System Demonstration
4.1 Prototype System
4.2 System Interface
4.3 Demo Scenario
References

List of Tables
3.1 An example of a user session

List of Figures
3.1 System framework
3.2 Number of concepts in query pairs
4.1 A small term relationship graph given the query term “american idol”