於社群網路中之高效能鏈結預測與群組查詢

As the development and popularization of social networking websites, many recommendation systems tend to leverage the information in social networks to provide helpful suggestions for users, and a great deal of research studies on social network analysis are thereby motivated. Recently, the sizes of social networks have been increasing rapidly, and this growth results in a significant increase in the computational cost of the sophisticated recommendations. The huge size and complexity of social networks create a considerable burden for recommendation systems while processing the information from social networks to provide suggestions. Therefore, in this dissertation, we study three important recommendation problems in social networks and aim to improve their efficiency. First, we focus on the relationship between two users and study the link prediction problem in large-scale networks. During the link prediction, numerous feature values need to be calculated and then combined to make recommendations, and the computational cost grows quickly as the network size becomes larger. Some previous studies involving network processing attempt to lower the computational cost by reducing the network size via sparsification. However, sparsification might remove important information and hurt the prediction accuracy. To address this issue, we propose a framework called Diverse Ensemble of Drastic Sparsification (DEDS), which constructs ensemble classifiers with good accuracy while keeping the prediction time short. DEDS includes various sparsification methods that are designed to preserve different measures of a network. Therefore, DEDS can generate sparsified networks with significant structural differences and increase the diversity of the ensemble classifier, which is key to improving prediction performance. Second, we extend the scope from the relationship between two users to the relationship among a group of users, and study the social group query problem with its applications in activity planning. Considering social links among all users to recommend a mutually acquainted group of attendees for an activity is an NP-hard problem. In addition to finding a group of attendees familiar with each other, selecting an activity period available to all is also essential for activity planning. Therefore, we need to further consider the available time of users, which makes the problem even harder due to the complexity of social connectivity and the diversity of user schedules. In this dissertation, we propose the Social-Temporal Group Query (STGQ) to find suitable time and attendees with minimum total social distance. We design two algorithms, SGSelect and STGSelect, which include various effective pruning strategies to substantially reduce running time. Experimental results indicate that SGSelect and STGSelect are significantly more efficient than baseline approaches. We also conduct a user study to compare the proposed approach with manual activity coordination. The results show that our approach obtains higher quality solutions with less coordination effort, thereby increasing users' willingness to organize activities. Finally, we study the consecutive group query problem to support a sequence of recommendations. When planning an activity, it is difficult for a user to specify all the conditions right at once to find the perfect group of attendees and time. Fortunately, with the aforementioned social-temporal group query, it is easy for the user to tune the parameters to find alternative recommendations. As users may iteratively adjust query parameters to fine tune the results, we further propose Consecutive Social Group Query (CSGQ) to support such needs. Envisaging that exploiting the intermediate solutions of previous queries may improve processing of the succeeding queries, we design a new tree structure, namely, Accumulative Search Tree, which caches the intermediate solutions of historical queries in a compact form for reuse. To facilitate efficient lookup, we further propose a new index structure, called Social Boundary, which effectively indexes the intermediate solutions required for processing each CSGQ with specified parameters. According to the experimental results, with the caching mechanisms, processing time of consecutive queries can be further reduced considerably.

並列關鍵字

Algorithm Design and Analysis ； Social Networks ； Link Prediction ； Network Sparsification ； Query Processing ； Index Structure

參考文獻

[1] L. Adamic and E. Adar. Friends and neighbors on the web. Social Networks, 25:211–230, 2001.

[2] A. An, M. Kargar, and M. ZiHayat. Finding affordable and collaborative teams from a network of experts. In SIAM Int’l Conf. Data Mining (SDM), 2013.

[4] B. Balasundaram, S. Butenko, and I. V. Hicks. Clique relaxations in social network analysis: The maximum k-plex problem. Operations Research, 59(1):133–142, 2011.

[5] A. L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286:509–512, 1999.

[7] S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. In Proc. 7th Int’l Conf. World Wide Web (WWW), 1998.

國際替代計量

於社群網路中之高效能鏈結預測與群組查詢

主題瀏覽