A Study of Browsing-Style Tag Recommendation Queries in Social Tagging Systems

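When users query tagged resources, they usually supply short query terms and retrieve the data objects whose tags include those terms. When a query term has a broad meaning, the query often returns a large number of objects, so users must spend time browsing them one by one to find the data they actually need. This thesis therefore studies, for social tagging systems, how to recommend further query tags based on the user's query term so that users can quickly filter the results and find the required data. From the objects whose tags include the query term, we take all tags of those objects as candidate tags. The representation score of a candidate tag is determined by its relatedness to the query term and its dissimilarity from the tags already recommended, and the k tags with the highest scores are chosen as recommended query tags. We adopt the concept of faceted search to present the recommended tags: after the user selects a recommended tag, the system recommends a next level of query tags that can further filter the results, helping the user narrow the scope of the query result step by step. In addition, this thesis proposes a two-level index structure to speed up query processing in social tagging systems; the index structure also supports error-tolerant set containment queries. Experimental results show that the proposed method effectively reduces the browsing cost of searching for data, and that the proposed index structure effectively improves the efficiency of error-tolerant set containment queries, with larger gains for queries containing more keywords.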
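The scoring and selection step described above can be sketched in Python. This is only an illustration under stated assumptions: the abstract does not give the scoring formula, so relatedness is taken here as the fraction of result objects that carry a tag, dissimilarity as one minus the largest Jaccard similarity to an already selected tag, and the two are combined with a hypothetical weight `lam`. The names `recommend_tags`, `jaccard`, and `lam` are made up for this sketch.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two object-id sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_tags(objects: Dict[str, Set[str]], query: Set[str],
                   k: int = 3, lam: float = 0.5) -> List[str]:
    """Greedily pick up to k tags that are related to the query yet
    dissimilar from the tags picked so far (assumed scoring, see lead-in)."""
    # Query result: objects whose tag set contains every query tag.
    result = {oid for oid, tags in objects.items() if query <= tags}

    # Candidate tags: every non-query tag on a result object,
    # mapped to the result objects that carry it.
    postings: Dict[str, Set[str]] = {}
    for oid in result:
        for tag in objects[oid] - query:
            postings.setdefault(tag, set()).add(oid)

    selected: List[str] = []
    while len(selected) < min(k, len(postings)):
        best_tag, best_score = "", -1.0
        for tag in sorted(postings):  # sorted for deterministic tie-breaking
            if tag in selected:
                continue
            # Assumed relatedness: share of result objects carrying the tag.
            relatedness = len(postings[tag]) / len(result)
            # Assumed dissimilarity: 1 - max similarity to any selected tag.
            max_sim = max((jaccard(postings[tag], postings[s]) for s in selected),
                          default=0.0)
            score = lam * relatedness + (1.0 - lam) * (1.0 - max_sim)
            if score > best_score:
                best_tag, best_score = tag, score
        selected.append(best_tag)
    return selected
```

The step-by-step narrowing corresponds to calling the function again with the chosen tag added to the query, for example `recommend_tags(objects, query | {chosen_tag})`.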

Keywords

social tagging system; query tag recommendation; index structure; set containment query

Parallel Abstract

Most users give brief keywords when querying a social tagging system to retrieve the objects whose tag sets contain the given query keywords. When the query keyword is a general term, the system usually returns a large number of objects as the query result, and users have to spend much time browsing all the returned objects to find the data they need. To solve this problem, this thesis proposes a query recommendation method for social tagging systems. Given a query keyword, we study how to provide additional tags as further query terms that help the user filter the dataset effectively and find the required data quickly. First, we find the query result, which consists of all objects whose tag sets contain the query keyword. All the tags of these objects are called candidate tags. Next, for each candidate tag, we consider its relatedness to the query and its diversity with respect to the already selected recommendation tags to decide its representation score. According to the representation scores, the top-k tags are chosen as recommendation tags. We then adopt the concept of faceted search to present the recommended tags. After the user chooses a specific recommended tag, the system adds the chosen tag to the query and performs tag recommendation recursively. Furthermore, this thesis proposes a two-level index structure, which aggregates similar tag sets into clusters according to the similarity between tag sets. A two-level bounding mechanism is proposed to process tag set containment queries. In addition, the Jaccard containment function is used to evaluate the degree of set containment in order to support error-tolerant set containment search. The experimental results show that the proposed query recommendation method can effectively reduce the cost of user browsing. Moreover, the proposed two-level index structure and query processing strategies provide better execution time for tag set containment queries, especially for queries consisting of many tags.
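To make the error-tolerant containment idea concrete, here is a minimal Python sketch. It uses the usual definition of Jaccard containment, |Q ∩ S| / |Q|, and accepts a tag set S when that value reaches a threshold. The cluster-level pruning is a simplified stand-in for the two-level bounding mechanism, whose actual bounds the abstract does not describe: it relies only on the fact that no tag set in a cluster can score higher than the union of the cluster's tags. The names `jaccard_containment`, `tolerant_containment_search`, and the cluster layout are assumptions made for illustration.

```python
from typing import Dict, List, Set


def jaccard_containment(query: Set[str], tags: Set[str]) -> float:
    """Fraction of the query tags that appear in the tag set."""
    if not query:
        return 1.0  # an empty query is trivially contained
    return len(query & tags) / len(query)


def tolerant_containment_search(clusters: List[Dict[str, Set[str]]],
                                query: Set[str],
                                threshold: float) -> List[str]:
    """Return ids of objects whose tag sets contain at least `threshold`
    of the query tags. Each cluster maps object id -> tag set."""
    hits: List[str] = []
    for cluster in clusters:
        # Level 1 (assumed bound): every member is a subset of the cluster's
        # tag union, so the union's score is an upper bound for all members.
        tag_union: Set[str] = set().union(*cluster.values())
        if jaccard_containment(query, tag_union) < threshold:
            continue  # no member of this cluster can qualify
        # Level 2: check each tag set in the surviving cluster.
        for oid, tags in cluster.items():
            if jaccard_containment(query, tags) >= threshold:
                hits.append(oid)
    return sorted(hits)
```

With `threshold=1.0` this reduces to exact set containment; lowering the threshold admits objects that miss some of the query tags.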

Parallel Keywords

social-tagging system; query tag recommendation; index structure; set containment search
