Social Network Analysis Based on Network Motifs

Hong-lin, Xu; Han-bing, Yan; Cui-fang, Gao; Ping, Zhu

doi:https://doi.org/10.1155/2014/874708

Journal of Applied Mathematics

On this page

Abstract Introduction Conclusions Acknowledgments References Copyright Related Articles

Special Issue

Game Theory and Applications in Economics

View this Special Issue

Research Article | Open Access

Volume 2014 | Article ID 874708 | https://doi.org/10.1155/2014/874708

Social Network Analysis Based on Network Motifs

Xu Hong-lin,¹Yan Han-bing,¹Gao Cui-fang,¹and Zhu Ping¹

Academic Editor: Pu-yan Nie

Received14 Nov 2013

Accepted25 Dec 2013

Published16 Feb 2014

Abstract

Based on the community structure characteristics, theory, and methods of frequent subgraph mining, network motifs findings are firstly introduced into social network analysis; the tendentiousness evaluation function and the importance evaluation function are proposed for effectiveness assessment. Compared with the traditional way based on nodes centrality degree, the new approach can be used to analyze the properties of social network more fully and judge the roles of the nodes effectively. In application analysis, our approach is shown to be effective.

1. Introduction

A large number of systems in the real world exist as networks, such as social networks (coauthor network, criminal networks, etc.), biological networks (protein interaction networks, metabolic networks, etc.), and technology networks (electricity networks, the Internet, etc.) [1–12]. In order to reveal their structure and principle, Milo et al. first proposed the concept of “network motifs,” which can be defined as patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized network [13]. Later, research on network motifs has been developed extensively. Kim et al. defined biological network motifs as biologically significant subgraphs [14]. Farina et al. identified regulatory network motifs from gene expression data, and they proposed the corresponding algorithm [15]. In order to specify network motifs, Ohnishi et al. analyzed an interfirm network consisting of about one million firms and four million directed links [16].

The study of social networks has always been a hot research topic. In order to judge the importance of nodes, the staple methods of traditional social network analysis are basing on the calculation of the centrality of nodes in network, [17, 18]. In recent years, various new methods are introduced into social network analysis; network motif is an important kind of them [19, 20]. Analyzing motifs for the large social networks derived from email communication firstly, Juszczyszyn found that the distribution of motifs in all analyzed real social networks is similar and can be treated as the network fingerprint. This property is most distinctive for stronger human relationships [21, 22].

In this paper, we introduce network motifs to develop a set of network analysis methods, which is different from the traditional social network analysis, and also illustrate its application.

2. Research Methods

2.1. Directed Graph and Point Centrality

A network with nodes is denoted by , where is node set, is edge set. is adjacency matrix of , of which the elements are as follows: where .

The centrality analysis is the staple method of traditional social network analysis [17, 18]. In a network, if there are direct links between an actor and other actors, this actor resides in the centre of the network, having more “power” [17]. The importance of a node, point centrality, can be measured by the number of contacted nodes [18]. Based on the adjacency matrix, the formula of point centrality of node is as follows: where .

2.2. Network Motifs Finding

2.2.1. Frequent Graph

Frequent subgraph mining is an important method of network information mining. Frequent subgraph mining algorithms are divided into breadth-first search (BFS) algorithm and depth-first search (DFS) algorithm based on subgraph search path [23]. As a breadth-first search algorithm, Apriori graph mining (AGM) algorithm is an early adopter of Apriori idea. AGM algorithm takes an adjacency matrix to represent the graph. Then it generates code based on adjacency matrix and takes minimum coding as unique identification for the graph in order to solve NP problem of subgraph isomorphism [24].

The graph which is constituted by node set and edge set is a subgraph of . Let . Furthermore, the node set of subgraph is denoted as .

Based on the adjacency matrix of a subgraph, the maximum encoding is obtained as unique identification of the subgraph. AGM algorithm is used to mine frequent subgraphs based on the maximum encoding.

2.2.2. Random Network Model

In typical network motifs finding algorithms, random network model maintains the degree sequence of the real network very well [25]. Exchange algorithm is an algorithm for generating random network according to degree sequence, which is as follows [26].

Algorithm A. Input: degree sequence Output: random network Step 1: Construct a network according to degree sequence. Step 2: Randomly select a pair of edges (e.g.,,). Step 3: Carry out the Monte Carlo exchange (,). Step 4: Cancel the exchange if the exchange has led to multiple edges or loops. Step 5: Repeat until reaching the target number of times.In this way, a set of random networks with the same degree sequence as can be obtained.

2.2.3. Statistical Significance of Network Motifs

Network motifs are frequent subgraphs with special statistical significance, which have some special functions in the network. Network motifs satisfy the following conditions: occurrence of the subgraph in real network is not less than a minimum and is significantly higher than their occurrence in random network [13, 27].

The statistical significance of network motifs is denoted by : where denotes the occurrences of a subgraph in real network and and denote mean and standard deviation of the occurrences of the subgraph in random networks.

2.3. Frequency Matrix

The nodes of a network always have a lot of roles, such as teacher and student. Most social networks can be simulated by role network model (RNM). The role set is denoted by , where is the number of roles. The set of nodes whose role is ) is denoted by , so the set of is denoted by . denotes the set of subgraphs whose structure is the same as the network motifs in the network.

In order to determine the role tendentiousness of the unknown role nodes, we can count the frequency of the unknown role nodes occurring in different network motifs, respectively, through the composition of nodes in network motifs, which contain different known role nodes.

Based on network motifs, frequency matrix is obtained. The elements of denotes the total occurrences of node in the network motifs that contain the known role nodes, whose role is .

The algorithm for calculating frequency matrix is as follows.

Algorithm B. Input: , , Output: Step 1: For any , for any . Step 2: For any , for any , if , then for any ,. Step 3: Output . Step 4: End.

2.4. Evaluation Function

2.4.1. Tendentiousness Evaluation Function (TEF)

Based on frequency matrix , the tendentiousness of node with respect to role is evaluated by TEF.

Definition 1. The TEF value is defined by where , , and . Obviously, the greater , the greater the tendentiousness of node with respect to role .

2.4.2. Importance Evaluation Function (IEF)

Based on point centrality and TEF, the importance of node with respect to role in the network is evaluated by IEF.

Definition 2. The IEF value is defined by The normalized form is where , , and . The greater , obviously, the greater the importance of node with respect to role in the network.

3. Application Analysis

The Intergalactic Crime Modelers (ICM) is investigating a conspiracy to commit a criminal act. The case involves 83 members and 400 messages between these people, as shown in Figure 1. As priorly known in [28], Jean, Alex, Elsie, Paul, Ulf, Yao, Harvey are conspirators, Darlene, Tran, Jia, Ellin, Gard, Chris, Paige, Este are nonconspirators.

Now, we analyze the set of prior conspirator and the set of prior non-conspirator by using the theory and methods of network motifs.

Firstly, let be “conspirator” and let be “non-conspirator”. Then the links are divided into two categories, of which daily topic is denoted by topic 1, and conspiracy topic is denoted by topic 2.

Based on adjacency matrix of the network, point centrality of nodes is calculated by using formulas (2) as shown in Table 1. The point centrality reflects the influence of a node in the network, which means the larger point centrality a criminal node have, the bigger negative impact on the network will occur. Therefore, ICM should focus on high-ranking members in Table 1.

According to the method in Section 2.2, frequent subgraphs and network motifs of the network are obtained, as shown in Table 2 (example).

Depending on the network motifs, frequency matrix is obtained by applying Algorithm B. Based on the evaluation functions TEF and IEF, the priority list of criminal tend and the priority list for key monitoring are obtained, as shown respectively, in Tables 3 and 4.

By comparing and analyzing, the “network motifs” offer a much more comprehensive way to analyze social networks; the high-ranking members having both higher point centrality and more criminal tend in Tables 3 and 4 are more suspect than others, so ICM should monitor them. Our research confirmed that the method is suitable in social network and the results are reliable.

4. Conclusions

In this paper, we developed a set of network analysis methods based on the theory and methods of social network analysis, frequent subgraph mining, network motifs, and so forth, which is different from the traditional social network analysis. In application analysis, a series of priority lists are obtained based on the evaluation functions. The priority lists reflect network information effectively, which is of great reference value for ICM.

Based on the study on node connection relationship of social networks, a follow-up study will involve more attention to structural relationship with more practical value.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The research is supported in part by the National Natural Science Foundation of China under Grant 11271163 and the Fundamental Research Funds for the Central Universities under Grant JUSRP51317B (China).

References

S. H. Strogatz, “Exploring complex networks,” Nature, vol. 410, no. 6825, pp. 268–276, 2001.
View at: Publisher Site | Google Scholar
M. E. J. Newman, “Assortative mixing in networks,” Physical Review Letters, vol. 89, no. 20, Article ID 208701, 4 pages, 2002.
View at: Publisher Site | Google Scholar
D. J. Watts and S. H. Strogatz, “Collective dynamics of “small-world” networks,” Nature, vol. 393, no. 6684, pp. 440–442, 1998.
View at: Publisher Site | Google Scholar
T. N. Friemel, “Dynamics of social networks,” Procedia-Social and Behavioral Sciences, vol. 22, pp. 2–3, 2011.
View at: Publisher Site | Google Scholar
J. Xu and H. Chen, “Criminal network analysis and visualization,” Communications of the Association for Computing Machinery, vol. 48, no. 6, pp. 100–107, 2005.
View at: Publisher Site | Google Scholar
C. Brun, F. Chevenet, D. Martin, J. Wojcik, A. Guénoche, and B. Jacq, “Functional classification of proteins for the prediction of cellular function from a protein-protein interaction network,” Genome Biology, vol. 5, no. 1, pp. 1474–1760, 2003.
View at: Publisher Site | Google Scholar
E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barabási, “Hierarchical organization of modularity in metabolic networks,” Science, vol. 297, no. 5586, pp. 1551–1555, 2002.
View at: Publisher Site | Google Scholar
K. Musiał and P. Kazienko, “Social networks on the Internet,” World Wide Web, vol. 16, no. 1, pp. 31–72, 2013.
View at: Publisher Site | Google Scholar
C. E. Hutchins and M. Benham-Hutchins, “Hiding in plain sight: criminal network analysis,” Computational and Mathematical Organization Theory, vol. 16, no. 1, pp. 89–111, 2010.
View at: Publisher Site | Google Scholar
M. Girvan and M. E. J. Newman, “Community structure in social and biological networks,” Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 12, pp. 7821–7826, 2002.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
T. Velden, A. Haque, and C. Lagoze, “A new approach to analyzing patterns of collaboration in co-authorship networks: mesoscopic analysis and interpretation,” Scientometrics, vol. 85, no. 1, pp. 219–242, 2010.
View at: Publisher Site | Google Scholar
M. E. J. Newman, “The structure and function of complex networks,” SIAM Review, vol. 45, no. 2, pp. 167–256, 2003.
View at: Google Scholar | Zentralblatt MATH
R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, “Network motifs: simple building blocks of complex networks,” Science, vol. 298, no. 5594, pp. 824–827, 2002.
View at: Publisher Site | Google Scholar
W. Kim, M. Li, J. Wang, and Y. Pan, “Biological network motif detection and evaluation,” BMC Systems Biology, vol. 5, Supplement 3, article S5, 2011.
View at: Publisher Site | Google Scholar
L. Farina, A. Germani, G. Mavelli, and P. Palumbo, “Identification of regulatory network motifs from gene expression data,” Journal of Mathematical Modelling and Algorithms, vol. 9, no. 3, pp. 233–245, 2010.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
T. Ohnishi, H. Takayasu, and M. Takayasu, “Network motifs in an inter-firm network,” Journal of Economic Interaction and Coordination, vol. 5, no. 2, pp. 171–180, 2010.
View at: Publisher Site | Google Scholar
Q. H. Zhu and L. Li, “Social network analysis method and application study in information science,” Information Study, vol. 31, no. 2, pp. 179–183, 2008.
View at: Google Scholar
P. Chen and H. Y. Yuan, “Social network analysis of crime organization structures,” Journal of Tsinghua University, vol. 51, no. 8, pp. 1097–1101, 2011.
View at: Google Scholar
D. Gómez, E. González-Arangüena, C. Manuel, G. Owen, M. del Pozo, and M. Saboyá, “The cohesiveness of subgroups in social networks: a view from game theory,” Annals of Operations Research, vol. 158, no. 1, pp. 33–46, 2008.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
X. G. Wang, “Discovering critical nodes in social networks based on cooperative games,” Computer Science, vol. 40, no. 4, pp. 155–159, 2013.
View at: Google Scholar
K. Juszczyszyn, P. Kazienko, and K. Musial, “Local topology of social network based on motif analysis,” in Knowledge-Based Intelligent Information and Engineering Systems, vol. 5178 of Lecture Notes in Computer Science, pp. 97–105, 2008.
View at: Publisher Site | Google Scholar
K. Juszczyszyn and K. Musial, “Motif analysis and the periodic structural changes in an organizational email-based social network,” International Journal of Virtual Communities and Social Networking, vol. 1, no. 2, 14 pages, 2009.
View at: Publisher Site | Google Scholar
X.-T. Li, J.-Z. Li, and H. Gao, “Efficient frequent subgraph mining algorithm,” Journal of Software, vol. 18, no. 10, pp. 2469–2480, 2007.
View at: Publisher Site | Google Scholar
H. M. Lu, B. Q. Feng, and Q. B. Song, “Survey of frequent subgraph mining research,” Microelectronics & Computer, vol. 26, no. 3, pp. 156–161, 2009.
View at: Google Scholar
G.-M. Qin, L. Gao, and J.-L. Hu, “A review on algorithms for network motif discovery in biological networks,” Acta Electronica Sinica, vol. 37, no. 10, pp. 2258–2265, 2009.
View at: Google Scholar
R. Milo, N. Kashtan, S. Itzkovitz, M. E. J. Neuman, and U. Alon, “On the uniform generation of random graphs with prescribed degree sequences,” Cond-mat/0312028, 2003.
View at: Google Scholar
N. Kashtan, S. Itzkovitz, R. Milo, and U. Alon, “Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs,” Bioinformatics, vol. 20, no. 11, pp. 1746–1758, 2004.
View at: Publisher Site | Google Scholar
ICM, “Modeling for crime busting,” The Interdisciplinary Contest in Modeling, 2012.
View at: Google Scholar

Copyright

Copyright © 2014 Xu Hong-lin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

PDF Download Citation

Download other formats

Order printed copies

Views

3794

Downloads

1932

Citations