Mining Temporal Social Network Patterns

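As social network applications continue to proliferate, finding meaningful patterns in large, complex social networks has become an active research topic. Social interactions can be represented by one or more social networks in which every interaction has its own time interval, and data mining techniques can help us discover frequent interaction behaviors among the objects in such temporal social networks. This thesis therefore studies how to find frequent patterns in temporal social networks. We propose an efficient mining algorithm, TSP-Miner, for finding closed frequent patterns in temporal social networks. The algorithm has two main phases. First, it generates all frequent patterns of length one. Then it uses a frequent pattern tree to generate all frequent patterns recursively in a depth-first manner. During generation, it checks whether each pattern is closed and applies pruning strategies to remove unnecessary candidate patterns. Because TSP-Miner scans the database only once and generates no unnecessary patterns, experimental results show that it outperforms a modified Apriori algorithm on both synthetic and real datasets.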

Keywords

data mining; temporal social network; closed patterns

Parallel Abstract

With increasing interest in social network applications, the problem of finding meaningful patterns in social networks has attracted growing attention. The interactions in a social network can naturally be modeled by a temporal network, where a node represents an individual and an edge between two nodes denotes an interaction between the two individuals during a certain time interval. Mining frequent patterns in temporal social networks can help us discover frequent interaction behaviors. Therefore, in this thesis, we propose a novel algorithm, TSP-Miner (Temporal Social network Patterns Miner), to mine frequent closed temporal social network patterns. The proposed algorithm consists of two phases. First, we find all frequent patterns of length one in the database. Second, for each pattern found in the first phase, we recursively generate frequent patterns using a frequent pattern tree in a depth-first manner. During the mining process, we eliminate impossible candidates and check whether each frequent pattern is closed. Because TSP-Miner scans the database only once and does not generate unnecessary candidates, it is more efficient and scalable than the modified Apriori algorithm. The experimental results show that TSP-Miner outperforms the modified Apriori algorithm on both synthetic and real datasets.
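The two phases described above can be illustrated with a minimal sketch. It is not the TSP-Miner implementation, whose pattern representation and pruning strategies are not given in the abstract. As a simplifying assumption, each interaction is reduced to a hypothetical `(person_a, person_b, time_slot)` triple, a network is a set of such triples, and a pattern is a set of triples that occurs in at least `min_support` networks. The function name `mine_closed` and all type aliases are illustrative only.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical encoding: an interaction between two people in a discretized time slot.
Interaction = Tuple[str, str, int]
Network = FrozenSet[Interaction]
Pattern = FrozenSet[Interaction]


def mine_closed(networks: List[Network], min_support: int) -> Dict[Pattern, int]:
    """Return every closed frequent pattern with its support (simplified sketch)."""
    # Phase 1: one scan of the database records which networks contain each
    # interaction, then keeps the frequent patterns of length one.
    occurrences: Dict[Interaction, Set[int]] = {}
    for net_id, net in enumerate(networks):
        for item in net:
            occurrences.setdefault(item, set()).add(net_id)
    frequent = sorted(i for i, ids in occurrences.items() if len(ids) >= min_support)

    closed: Dict[Pattern, int] = {}

    def grow(pattern: List[Interaction], ids: Set[int], start: int) -> None:
        # A pattern is closed when no interaction outside it occurs in all of
        # its supporting networks (such an item would give a superset with
        # the same support).
        is_closed = all(
            item in pattern or not ids <= occurrences[item] for item in frequent
        )
        if pattern and is_closed:
            closed[frozenset(pattern)] = len(ids)
        # Phase 2: depth-first extension, using only later items so that each
        # candidate is generated once.
        for k in range(start, len(frequent)):
            new_ids = ids & occurrences[frequent[k]]
            if len(new_ids) >= min_support:  # prune infrequent candidates
                grow(pattern + [frequent[k]], new_ids, k + 1)

    grow([], set(range(len(networks))), 0)
    return closed
```

Keeping the supporting network ids with each pattern means the database is read only during the first phase; later support counts come from set intersections. This sketch ignores the temporal relations between intervals, which a real temporal pattern miner would have to encode.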

Parallel Keywords

data mining; temporal social network; closed patterns
