A Web Page Recommendation System That Mines Same-Type Preferences from User Queries and Feedback

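This thesis presents techniques for building a web page query and recommendation system. The system filters useful information out of users' query keywords and web browsing logs. Each user query transaction record consists of the query keywords, the browsed web pages, and their feedback values, and is stored in the user profile. The thesis proposes a browsing-association bipartite graph clustering algorithm that uses binary bit vectors to speed up the computation of query keyword similarity and finds clusters of query keywords and browsed web pages, called query interest clusters. A user with several different types of interests can belong to multiple query interest clusters, so that when recommendations are made by collaborative filtering, the browsing behavior of users who share only some interests can also contribute important recommendation information. Next, the system partitions the user query transaction records according to the query interest clusters, mines association rules between query keywords and web pages within each partition, and computes the support and confidence of the rules from the feedback values of the pages users browsed. Finally, based on the mined information, for member users the system uses collaborative filtering to recommend highly relevant web pages from the query interest clusters contained in the user profile. For anonymous users, the system offers keyword search in the manner of a search engine and recommends highly relevant web pages from the cluster to which the query keywords belong, making the query results more concise and better matched to user needs.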
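The bit-vector idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure the thesis uses, so the Jaccard coefficient below is an assumption, as are the names `to_bit_vector`, `popcount`, `jaccard`, and the sample queries and page identifiers. Each query keyword is encoded as an integer whose bit i is set when the query led to browsing page i, so that overlap between two queries reduces to bitwise AND and OR.

```python
def to_bit_vector(pages, page_index):
    """Encode the set of browsed pages as an integer bitmask (hypothetical helper)."""
    vec = 0
    for page in pages:
        vec |= 1 << page_index[page]
    return vec


def popcount(x):
    """Count the set bits of a non-negative integer."""
    return bin(x).count("1")


def jaccard(a, b):
    """Jaccard similarity of two bitmasks; an assumed measure, not taken from the thesis."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0


# Illustrative data only: page ids mapped to bit positions.
page_index = {"p1": 0, "p2": 1, "p3": 2, "p4": 3}
browsed = {
    "jaguar car": ["p1", "p2"],
    "jaguar price": ["p1", "p2", "p3"],
    "jaguar habitat": ["p4"],
}
vectors = {q: to_bit_vector(ps, page_index) for q, ps in browsed.items()}

sim_close = jaccard(vectors["jaguar car"], vectors["jaguar price"])
sim_far = jaccard(vectors["jaguar car"], vectors["jaguar habitat"])
```

Under these assumptions, queries that share browsed pages score high and could be grouped into the same query interest cluster, while queries with disjoint page sets score zero.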

Keywords

Web Mining; Clustering Analysis; Association Rule; Collaborative Filtering; Recommendation System

Parallel Abstract

Most previous work on web page recommendation systems relies on collaborative filtering over clusters of user browsing behavior. In these approaches, a user belongs to exactly one cluster. If most users have multiple kinds of browsing interests, the number of users in the same cluster will be small and the information available for recommendation is limited. In addition, information from users whose behavior is only partially similar is not considered. This thesis proposes strategies for constructing a query and recommendation system for web pages. First, the query keywords, browsed web pages, and user feedback values are extracted from web logs to form query transactions. A clustering algorithm is proposed to find, from the query transactions, the clusters of queries and related web pages, called the clusters of query interest. A user who has multiple kinds of query interests can belong to more than one cluster. The user query transactions are then partitioned according to the clusters of query interest. In each partition, association rules between queries and web pages are mined, and the support and confidence of the rules are computed from the users' feedback values. Based on the mined information, the system provides two main functions. A member user can submit a recommendation request, and the system recommends the highly associated web pages from the clusters of query interest contained in the user profile. An anonymous user can submit a query recommendation request by giving query keywords, and the system returns the highly associated web pages from the cluster of query interest to which the query keywords belong. As a result, the query results are more concise and meet the requirements of most users.
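As a rough illustration of computing support and confidence from feedback values within one partition, the following sketch weights each query transaction by its feedback value. The abstract does not give the exact formulas, so the definitions here are assumptions: support is the feedback mass of a (query, page) pair divided by the total feedback in the partition, and confidence is that mass divided by the total feedback of the query. The function name `mine_rules`, the thresholds, and the sample data are hypothetical.

```python
from collections import defaultdict


def mine_rules(transactions, min_support=0.0, min_confidence=0.0):
    """Return {(query, page): (support, confidence)} for one partition.

    transactions: iterable of (query, page, feedback) tuples.
    The weighting scheme is an assumption, not the thesis's definition.
    """
    pair_weight = defaultdict(float)   # feedback mass per (query, page)
    query_weight = defaultdict(float)  # feedback mass per query
    total = 0.0
    for query, page, feedback in transactions:
        pair_weight[(query, page)] += feedback
        query_weight[query] += feedback
        total += feedback

    rules = {}
    for (query, page), weight in pair_weight.items():
        support = weight / total if total else 0.0
        q_total = query_weight[query]
        confidence = weight / q_total if q_total else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules[(query, page)] = (support, confidence)
    return rules


# Illustrative partition: (query, page, feedback value).
partition = [
    ("python tutorial", "docs.example/intro", 1.0),
    ("python tutorial", "docs.example/intro", 0.5),
    ("python tutorial", "blog.example/post", 0.5),
    ("python install", "docs.example/setup", 2.0),
]
rules = mine_rules(partition)
```

With this weighting, a page that users rate highly after a given query gains both support and confidence, so it would rank ahead of pages that were merely visited.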

Parallel Keywords

Web Mining; Clustering Analysis; Association Rule; Collaborative Filtering; Recommendation System
