透過您的圖書館登入
IP:3.144.113.197
  • 學位論文

異質性社群網路上關係資訊的取樣

Sampling Relational Profiles on Heterogeneous Social Network

指導教授 : 林守德

摘要


近年來,社群網路分析吸引了廣泛的注意。目前已有許多針對同質性社群網路的研究成果,例如Clustering Coefficient、Centrality以及Betweenness。然而針對異質性社群網路的研究並不多•另外,許多社群網路分析方法需要針對網路進行高複雜度的計算,因此並不適用於現今線上社群網站所產生的大規模社群網路。因此,對網路取樣便成為一個能同時縮小網路規模,又能保留重要網路特徵的方法,並且能讓我們在可接受的時間內對大規模社群網路進行深入的了解。 此論文中,我們提出一個能捕捉異質性社群網路中,各種不同型別之間的條件機率的網路性質,稱為Relational Profile。此性質可以表現出異質性社群網路中的型別之間的相依性。接著我們提出了新穎的取樣問題:保留Relational Profile的取樣。我們針對此問題提出了新穎的取樣方法:Difference Maximization Sampling與Difference Proportional Sampling。我們在三個真實世界中的異質性社群網路上驗證了我們的取樣方法的良好表現,並且討論了不同Relational Profile與取樣大小對各種取樣方法的表現的影響。

並列摘要


In recent years, social network analysis becomes a topic that raises wide interest in. There are many existing works on analyzing important network properties on homogeneous social network, such as clustering coefficient, centrality and betweenness. But such works on heterogeneous social network are limited. Additionally, many of these analysis need to perform high complexity computation on the network, which is not practical on the large scale social networks provided by modern online social network sites. To gain better understanding to such large-scale network in a reasonable time, graph sampling becomes an important method to reduce the scale of network and preserve important properties in the original network in the same time. In this paper, we proposed relational profile, which is a network property contains the conditional probabilities between various types in the heterogeneous social network to capture the dependence between types in the heterogeneous social network. Then, we proposed a novel sampling problem, relational profile preserving sampling. We also proposed sampling methods, Difference Maximization Sampling and Difference Proportion Sampling, which designed to solve the sampling problem. We evaluated our sampling methods on three real-world networks and show that our algorithm outperforms other common sampling algorithms when the sample size is small. Finally, We discussed the effect of relational profile and sample size on the performance of different sample methods.

參考文獻


[1] Leskovec et al. Sampling from large graphs. Proceedings of the 12th ACM SIGKDD … (2006)
[2] Hubler et al. Metropolis algorithms for representative subgraph sampling. IEEE International Conference on Data Mining (2008)
[3] Maiya et al. Sampling community structure. WWW (2010)
[4] Rafiei et al. Effectively visualizing large networks through sampling. 16th IEEE Visualization (2005)
[5] Jhao-Yin Li and Mi-Yen Yeh. On Sampling Type Distribution from Heterogeneous Social Networks. PAKDD (2011)

延伸閱讀