Descriptive statistics for networks
Networks and Discrete Mathematics
Descriptive Statistics on Real Networks Chronis Moyssiadis Vassilis Karagiannis
08/11/2011 WS.04 Webscience: lecture 2
Aristotle University, School of Mathematics Master in Web Science
Lesson 2 Overview
Scope
Introduce contemporary basic descriptive statistics on the topology of real networks.
Provide the skills to explore networks.
Means
Mathematical definition and interpretation of each statistical measure on the topology of networks.
Network examples (using the igraph and NodeXL applications) and simple exercises solved by hand.
Next Lesson
Applications from Algebraic Graph Theory in network analysis.
Random models of networks (ER, Small-Worlds, Scale-Free).
8/11/2011 WS.04 lecture 2:Descriprive Stat on Real Nets - C. Moyssiadis V. Karagiannis 2
Main Part (4 hours)
(after completing the first lesson)
Networking Complex Systems
A layout step (probably the final one) of the network of connections among the Twitter users who had recently tweeted the word "Worldbank" when queried on June 20, 2011, with nodes scaled by number of followers (outliers thresholded). Connections are created when users reply to, mention or follow one another.
Networking Complex Systems
Most researchers would probably agree that a complex system is a system composed of many interacting parts, such that the collective behavior of those parts together is more than the sum of their individual behaviors.
A complex system can thus be said to be a system of interacting parts that displays emergent behavior.
2010. Mark Newman, Complex Systems: A Survey
Networking Complex Systems
To quantify the details of the system one must first specify its topology (who interacts with whom) and then its dynamics (how the individual atoms behave and how they interact).
Topology is usually specified in terms of lattices or networks, and this is one of the best developed areas of complex systems theory.
The final step, or another initial one, is the construction of a model that agrees with the empirical observations and is ideally capable of making predictions in a statistical sense. If the model makes significant errors, one continues by gathering new observations and adjusting or reconstructing the initially proposed model.
2010. Mark Newman, Complex Systems: A Survey
Topology helps us understand Function and Evolution
Tokyo railway network
Lattice as a chessboard
Huge Complicated Topologies
Most complex systems, however, have more complicated non-regular topologies that require a more general network framework for their representation
The size, the scale, and the shape
The Greek Statisticians Scientific Collaboration Network, over 20 years (2010).
Weighted vs. Unweighted Networks
A common way to represent a complex system as a network is by some transformation of a similarity or dissimilarity measure between objects. Each transformed value constitutes the weight of the link between a pair of objects (Horvath, ch. 5). Accordingly, a link (or tie) is considered present if its weight is higher than a cut-off, and is removed otherwise. The resulting network is either weighted (a) or binary (b, c, d).
(a) The weighted net
(b) The binary net without weights
(c) The 0-1 net using a cut-off w(link) > 1
(d) The 0-1 net using a cut-off w(link) > 2
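The cut-off rule can be sketched in a few lines of Python. This is only an illustration: the function name `binarize` and the link weights are hypothetical, not taken from the slides.

```python
def binarize(weights, cutoff):
    """Return the set of links kept in the 0-1 network.

    weights: dict mapping an undirected link (u, v) to its weight.
    A link is kept when its weight is strictly greater than cutoff.
    """
    return {link for link, w in weights.items() if w > cutoff}


# Hypothetical weighted network (weights invented for illustration).
weights = {("a", "b"): 3, ("a", "c"): 1, ("b", "c"): 2, ("c", "d"): 1}

binary_no_cutoff = binarize(weights, 0)  # every link survives
binary_cut_1 = binarize(weights, 1)      # links with w(link) > 1
binary_cut_2 = binarize(weights, 2)      # links with w(link) > 2
```

Raising the cut-off can only remove links, so the resulting 0-1 networks are nested.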
Global and Local Statistical Measures
Local measures (statistics, indices) are those that characterize individual nodes, links or their neighborhood.
Global measures (statistics, indices) are the distribution of any local measure over the set of the nodes or links, or some summary statistic that is produced in accordance with the distribution of any local measure over the set of nodes or links.
Network type and Connectivity
The first questions to be answered
Does the network have multiple edges or loops? (Usually we delete them, but the researcher is responsible for the answer.)
Is it connected? If so, find the node and edge connectivity numbers. If not, find the components.
Is it a directed network? If so, find the strongly connected components and the weakly connected components.
Compute the network density.
Giant component and component distribution: in some disconnected networks one component is observed to be much larger than all the others (the giant component). What proportion of the nodes does it include? How many nodes are in the second largest component? What is the distribution of the components with respect to the number of nodes they include?
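A minimal sketch of how these first questions could be answered for an undirected simple graph, using only the Python standard library. The toy edge list and the helper names `components` and `density` are assumptions made for illustration; the course itself uses igraph and NodeXL.

```python
from collections import Counter, deque


def components(n, edges):
    """Connected components of an undirected graph on nodes 0..n-1, largest first."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:  # breadth-first search from start
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)


def density(n, edges):
    """Fraction of the n(n-1)/2 possible undirected links that are present."""
    return 2 * len(edges) / (n * (n - 1))


# Hypothetical network: 8 nodes, 5 links, two isolated nodes.
n = 8
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (4, 5)]

comps = components(n, edges)
giant_share = len(comps[0]) / n                     # proportion of nodes in the largest component
size_distribution = Counter(len(c) for c in comps)  # component size -> number of components
net_density = density(n, edges)
```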
Example
The giant component includes 70% of the nodes while the second largest component has 15% of the nodes
Giant Component
Example (Scientific Collaboration Net)
Component distribution
Nodes in component   # of components   % (nodes / n)
  1                  67                 0.1
  2                  26                 0.3
  3                  25                 0.4
  4                   6                 0.5
  5                   3                 0.7
  6                   7                 0.8
  8                   1                 1.1
  9                   1                 1.2
 10                   1                 1.3
 11                   1                 1.5
 12                   2                 1.6
418                   1                55.4
Sparse: density = 0.003
A first look concerning vulnerability: articulation points, bi-components and nodes at the edge of the network
Articulation points in a connected net
Red nodes are the articulation points and blue ones are those of degree 1 (at the edge of the biological network)
Bi-components in a connected net
bi-components show dense parts of the network and possible weaknesses
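Articulation points can be found with a brute-force check: a node is an articulation point if deleting it increases the number of connected components. The sketch below favours clarity over speed (a depth-first-search based algorithm would be used on large networks); the toy graph and function names are hypothetical.

```python
def count_components(adj, nodes):
    """Number of connected components of the subgraph induced by nodes."""
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in nodes and v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count


def articulation_points(adj):
    """Nodes whose removal increases the number of components."""
    all_nodes = set(adj)
    base = count_components(adj, all_nodes)
    return {v for v in adj if count_components(adj, all_nodes - {v}) > base}


# Hypothetical network: a path a-b-c attached to a triangle c-d-e.
adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["c", "e"],
    "e": ["c", "d"],
}

cut_nodes = articulation_points(adj)
edge_nodes = {v for v, nbrs in adj.items() if len(nbrs) == 1}  # degree-1 nodes
```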
A crucial step towards the topology: degree, link weight and weighted degree distributions
Find the degree distribution regardless of connectivity
Descriptive statistics
Degree sequence: 1, 1, 4, 2, 3, 2, 4, 3, 2, 2, 4, 2, 3, 1, 1, 1, 0, 2, 2, 2
Minimum degree δ(G) = 0
Maximum degree Δ(G) = 4
Median degree = 2.0
Average degree = 2.1
SD = 1.12
Coefficient of Variation = SD / Average degree = 53%
Example
Degree sequence: 1, 1, 4, 2, 3, 2, 4, 3, 2, 2, 4, 2, 3, 1, 1, 1, 0, 2, 2, 2
Degree k   Frequency P(K = k)   CDF P(K <= k)   CCDF P(K >= k)
0          0.05                 0.05            1.00
1          0.25                 0.30            0.95
2          0.40                 0.70            0.70
3          0.15                 0.85            0.30
4          0.15                 1.00            0.15
Plots of the CCDF, CCDF with log(degree), CCDF with log(degree) and log (freq)
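The summary statistics and the three distributions of this example can be reproduced from the degree sequence with the Python standard library. This is a sketch; it assumes the SD on the slide is the sample standard deviation (dividing by n - 1), which is what reproduces the reported value.

```python
from collections import Counter
from statistics import mean, median, stdev

# Degree sequence of the 20-node example network.
degrees = [1, 1, 4, 2, 3, 2, 4, 3, 2, 2, 4, 2, 3, 1, 1, 1, 0, 2, 2, 2]
n = len(degrees)

avg = mean(degrees)
med = median(degrees)
sd = stdev(degrees)  # sample standard deviation (assumption, see lead-in)
cv = sd / avg        # coefficient of variation

counts = Counter(degrees)
ks = sorted(counts)
pmf = {k: counts[k] / n for k in ks}                              # P(K = k)
cdf = {k: sum(counts[j] for j in ks if j <= k) / n for k in ks}   # P(K <= k)
ccdf = {k: sum(counts[j] for j in ks if j >= k) / n for k in ks}  # P(K >= k)
```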
Degree distributions
Usually, for the sake of noise reduction, a logarithmic binning of the degrees is performed with more than M = 10 bins. The bin size is given by
$r = \frac{1}{M} \ln \Delta(G)$, where $\Delta(G)$ is the maximum degree,
so the bin number assigned to a node with degree $k_i$ is
$M_i = \left[ \frac{1}{r} \ln k_i \right], \quad M_i = 0, 1, \dots, M - 1$.
The end of bin $M_i$ on the k-axis is given by
$k_{m_i} = e^{r (M_i + 1)}$.
In practice we do not see the zero values, because log(0) = −∞.
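A sketch of logarithmic binning under stated assumptions: the bin width on the log scale is taken as r = ln(k_max) / M, a node of degree k goes to bin floor(ln(k) / r), and the bin with index j ends at exp(r (j + 1)). Taking the maximum degree as the argument of the logarithm, and clamping the maximum-degree node into the last bin, are assumptions of this sketch. Nodes of degree 0 are skipped because log(0) is undefined.

```python
import math
from collections import Counter


def log_bins(degrees, M=10):
    """Assign positive degrees to M logarithmic bins.

    Returns (r, counts, upper_ends) where counts maps bin index -> number of
    nodes and upper_ends maps bin index -> right end of the bin on the k-axis.
    Assumes the maximum degree is greater than 1.
    """
    positive = [k for k in degrees if k > 0]  # degree 0 cannot be log-binned
    r = math.log(max(positive)) / M
    counts = Counter()
    for k in positive:
        index = min(int(math.log(k) / r), M - 1)  # clamp k_max into the last bin
        counts[index] += 1
    upper_ends = {i: math.exp(r * (i + 1)) for i in counts}
    return r, counts, upper_ends


# Hypothetical degrees, two bins for readability.
r, counts, upper_ends = log_bins([0, 1, 3, 20, 100], M=2)
```

With M = 2 and maximum degree 100, the first bin covers degrees up to 10 and the second covers degrees up to 100.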
logarithmic binning on log-log plot
[Figure: the same bins plotted on a log-log scale. x-axis: integer value; y-axis: frequency.]
Noise in the tail: for x > 500 there are only 0, 1 or 2 observations of each value of x, whereas for x < 10 there are tens of thousands of observations.
Slide from Lada Adamic
Newman 2003
Distribution of the (link) weights
Weighted networks are similarly described by a matrix Wij specifying the weight on the edge connecting the vertices i and j (Wij = 0 if the nodes i and j are not connected, or positive if they are connected)
A first characterization of weights is obtained by the distribution P( w ) that any given edge
has weight w. This distribution may a priori be homogeneous and characterized by a typical scale, or on the contrary carry a novel heterogeneity.
2007, Caldarelli and Vespignani book.
Weighted Degree Distribution - Strength
Questions: How can we take into consideration the number of links in the computation of the strength? What is the distribution of the node strength? (The answer to this question is similar to the one about the estimation of the degree distribution.)
Association between degree and strength
In the absence of correlation between the weight of the links and the degree of the nodes, the weights $w_{ij}$ are on average independent of the link {i, j}. Hence, using the mean weight $\langle w \rangle$, the strength of each node can be approximated by
$s_i \simeq \langle w \rangle \, d_i$
This can be checked with a linear regression of strength on degree; when it holds, the weights give the same information as the degrees.
In the presence of correlations between topology and weights we obtain in general a relation of the form
$s_i = C \, d_i^{\,b}$, with $b = 1$ and $C \neq \langle w \rangle$, or $b > 1$.
2007, Caldarelli and Vespignani book.
Example: correlation between topology and weights
Estimated equation (symbols not recoverable from the slide): 1.161, 1.414 and 1.108 (0.001, for both coefficients)
Degree-degree correlation (how they choose and how we choose)
The average nearest neighbor degree (ANND)
Are the nodes preferentially connected to other nodes with similar degree or with dissimilar degree?
2007, Caldarelli and Vespignani book.
Assortativity
Example
4, 2, 3, 2, 4, 3, 2, 2, 4, 2, 3, 1, 2, 2, 0, 1, 1, 1, 1, 2
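Node 3 has neighbors N(3) = {1, 2, 4, 5} with degrees {1, 1, 2, 3}, hence knn(3) = (1 + 1 + 2 + 3)/4 = 1.75.
There are 3 nodes of degree 4, with knn(3) = 1.75, knn(7) = 2.5 and knn(10) = 2.25, hence the average nearest neighbor degree for degree 4 is knn(k = 4) = (1.75 + 2.5 + 2.25)/3 ≈ 2.17.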
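The two averaging steps (first over the neighbors of a node, then over all nodes of the same degree) can be sketched as follows. The small edge list is hypothetical; it only reproduces the neighborhood of node 3 from the example (neighbors 1, 2, 4, 5 with degrees 1, 1, 2, 3), not the full slide network.

```python
from collections import defaultdict


def neighbor_degree_stats(edges):
    """Return (degree, knn per node, knn averaged per degree class)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    # knn(i): mean degree of the neighbors of node i.
    knn = {v: sum(degree[u] for u in nbrs) / degree[v] for v, nbrs in adj.items()}
    # knn(k): mean of knn(i) over all nodes i with degree k.
    by_degree = defaultdict(list)
    for v, value in knn.items():
        by_degree[degree[v]].append(value)
    knn_of_k = {k: sum(vals) / len(vals) for k, vals in by_degree.items()}
    return degree, knn, knn_of_k


# Hypothetical edge list built around node 3.
edges = [(3, 1), (3, 2), (3, 4), (3, 5), (4, 5), (5, 6)]
degree, knn, knn_of_k = neighbor_degree_stats(edges)
```

A decreasing knn(k) indicates a disassortative network, an increasing one an assortative network.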
Example
Great fluctuations around the line exist, hence this is a poor example of a small disassortative network
Example
A disassortative collaboration network (at HSI conferences professors collaborate with their students more often than with other professors). Hubs prefer non-hubs (found in biological, social media, technological nets)
Weighted ANND
2007, Caldarelli and Vespignani book.
Node i with small average nearest neighbors degree but large weighted average nearest neighbors degree is mostly connected to low-degree nodes but the link with largest weight points towards a well-connected hub
Example on Weighted ANND
The disassortative collaboration network (HSI conferences). Using weights we get a clearer picture of disassortativity: hubs prefer non-hubs.
But in disassortative networks hubs can be interconnected
Example
The rich-club coefficient
Colizza et al., 2006
What is the tendency of nodes with high connectivity?
The rich-club coefficient
Colizza et al., 2006
The rich-club phenomenon: hubs are interconnected in a disassortative network (a property of both computer and social networks).
Opsahl (2010) proposed the quotient $\rho^w(r) = \phi^w(r) / \phi^w_{null}(r)$, where $\phi^w_{null}(r)$ comes from a randomized network.
The weighted rich-club coefficient
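A weighted network with 5 hubs (Opsahl: http://toreopsahl.com/tnet/two-mode-networks/weighted-rich-club-effect/)
$\phi^w(r) = \frac{W_{>r}}{\sum_{l=1}^{E_{>r}} w_l^{rank}}$
Here $W_{>r}$ is the total weight of the links among the nodes with richness greater than r, $E_{>r}$ is the number of those links, and $w_l^{rank}$ is the l-th largest link weight in the whole network.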
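A sketch of the weighted rich-club coefficient, assuming the definition phi_w(r) = W_{>r} / (sum of the E_{>r} largest link weights in the network), with degree used as the richness parameter. The weights below are invented. The normalised quotient against a randomized network is not computed here, since it needs a reshuffling procedure that the slides do not specify.

```python
def weighted_rich_club(weights, richness, r):
    """Weighted rich-club coefficient phi_w(r), or None if the club has no links.

    weights: dict mapping an undirected link (u, v) to its weight.
    richness: dict mapping each node to its richness (here: degree).
    """
    club = {v for v, value in richness.items() if value > r}
    club_weights = [w for (u, v), w in weights.items() if u in club and v in club]
    if not club_weights:
        return None
    # The E_{>r} strongest links of the whole network.
    strongest = sorted(weights.values(), reverse=True)[: len(club_weights)]
    return sum(club_weights) / sum(strongest)


# Hypothetical weighted network.
weights = {("a", "b"): 5, ("a", "c"): 4, ("b", "c"): 1, ("c", "d"): 2, ("a", "d"): 1}

# Richness = degree, computed from the link list.
richness = {}
for u, v in weights:
    richness[u] = richness.get(u, 0) + 1
    richness[v] = richness.get(v, 0) + 1

phi_2 = weighted_rich_club(weights, richness, 2)  # club = nodes with degree > 2
phi_1 = weighted_rich_club(weights, richness, 1)  # club = all nodes
phi_3 = weighted_rich_club(weights, richness, 3)  # empty club
```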
Example
Weighted rich club effect: scientists with at most 10 collaborators tend to collaborate with each other, while this is not the case for the hubs of the network (points under the horizontal line y = 1)
http://sites.google.com/site/vcolizza2/PhysRevLett_101_168702.pdf?attredirects=0
Binary rich club effect: the last result is clearly observed in that picture
Weighted rich club effect: the last result is clearly observed in that picture
The distance
Distances in Real-World Networks
Distance in binary networks
The distance matrix. The distribution of distances is another useful exploration tool.
When is the distance meaningful? This net is disconnected.
Mean distance (giant component): sum all the elements of the distance matrix and divide by 14(14 - 1); the result is 3.32.
Diameter (giant component) = 8
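The mean distance and the diameter of a connected component can be obtained from breadth-first searches. This is a minimal sketch on a hypothetical four-node path, not the network of the slide; it follows the same recipe of summing the distance matrix and dividing by n(n - 1).

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_distance_and_diameter(adj, component):
    """Mean distance over ordered pairs and diameter of one connected component."""
    total, diameter = 0, 0
    for s in component:
        dist = bfs_distances(adj, s)
        total += sum(dist.values())  # one row of the distance matrix
        diameter = max(diameter, max(dist.values()))
    n = len(component)
    return total / (n * (n - 1)), diameter


# Hypothetical connected network: the path a-b-c-d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
mean_dist, diam = mean_distance_and_diameter(adj, list(adj))
```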
Distance in Weighted Networks
Example
The Giant Component of the Scientific collaboration network. Diameter: 16 Average distance: 6.8 (binary case)
Node Importance Centrality Indices
Centrality Indices
A centrality coefficient is a measure that captures the importance of a node's or a link's position in the network.
Degree Centrality Strength (weighted)
The degree centrality of a node is its degree. Nodes with more connections tend to have more power.
Eigenvector Centrality
Example
Nonzero values only for the nodes in the giant component
Depends both on the number and the quality of the connections
Node   Eigenvector centrality
3      0.094966
1      0.033984
2      0.033984
4      0.082647
5      0.114758
7      0.135986
6      0.08973
8      0.092863
9      0.061756
10     0.061756
11     0.079709
12     0.047158
13     0.052068
14     0.018633
15     0
16     0
18     0
19     0
20     0
17     0
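Eigenvector centrality can be approximated by power iteration. The sketch below assumes a connected undirected graph (for instance the giant component) and normalises the scores to sum to 1, which appears to be the convention of the listed values since they add up to about 1. Iterating with A + I instead of A is a choice of this sketch: it keeps the same eigenvectors and avoids oscillation on bipartite graphs.

```python
def eigenvector_centrality(adj, iterations=1000, tol=1e-12):
    """Power iteration on A + I for a connected undirected graph.

    adj: dict mapping each node to an iterable of its neighbors.
    Returns scores normalised so that they sum to 1.
    """
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        new = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        total = sum(new.values())
        new = {v: value / total for v, value in new.items()}
        if max(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x


# Hypothetical network: a star with center c and leaves a, b, d.
adj = {"c": ["a", "b", "d"], "a": ["c"], "b": ["c"], "d": ["c"]}
scores = eigenvector_centrality(adj)
```

For the star the leading eigenvector is proportional to sqrt(3) for the center and 1 for each leaf, so the center scores highest even though every leaf has the same single connection.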
Closeness Centrality
It is based on the concept of distances between nodes
Closeness Centrality
In a disconnected network, each component has to be examined separately because in such case closeness is not well defined.
Node   Closeness centrality
3      0.023256
1      0.018182
2      0.018182
4      0.026316
5      0.027027
7      0.03125
6      0.025
8      0.03125
9      0.027778
10     0.027778
11     0.025
12     0.02
13     0.020408
14     0.016393
15     1
16     1
18     0.5
19     0.5
20     0.5
17     0
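A sketch of closeness computed separately inside each component, taking closeness as the reciprocal of the sum of distances to the reachable nodes (and 0 for an isolated node). This convention is an assumption, chosen because it reproduces the values listed for the small components: 1 for a connected pair, 0.5 for a triangle, 0 for an isolated node. The adjacency below mimics only those small components.

```python
from collections import deque


def closeness(adj):
    """Closeness of every node within its own connected component."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search restricted to the component of s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[s] = 1 / total if total else 0.0
    return result


# Hypothetical small components: a pair, an isolated node and a triangle.
adj = {15: [16], 16: [15], 17: [], 18: [19, 20], 19: [18, 20], 20: [18, 19]}
scores = closeness(adj)
```

Values from different components are not comparable: the pair scores higher than any node of a large component simply because it has only one node to reach.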
Betweenness Centrality
It is based on the concept of network shortest paths between nodes
Example
In a disconnected network, each component has to be examined separately because in such case closeness is not well defined.
Node   Betweenness centrality
3      23.5
1      0
2      0
4      12
5      15
7      43.5
6      0
8      42.5
9      16
10     16
11     30.5
12     0
13     12
14     0
15     0
16     0
18     0
19     0
20     0
17     0
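Unnormalised betweenness of the kind listed above can be computed with Brandes' accumulation scheme for unweighted graphs. This is a sketch under the assumption that each unordered pair of nodes is counted once (hence the final division by 2); the two toy graphs are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised shortest-path betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}  # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:  # back-propagate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints.
    return {v: value / 2 for v, value in score.items()}


path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["a", "c"]}
path_scores = betweenness(path)
cycle_scores = betweenness(cycle)
```

Half-integer values arise when a pair of nodes has two shortest paths, as in the four-node cycle, where each node carries half of the paths between its two neighbors.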
Comparison between centrality indices
Kolaczyk 2009
[Figure panels: closeness, betweenness, eigenvector]
Clustering, Cliquishness, Cohesiveness and Hierarchical Structure
A broader aspect of connectedness and centrality
One important question on social networks is how tightly clustered they are. For example, the question of to what extent my friends are friends with each other captures one facet of this.
Clustering has an interesting history as a term, growing out of the earlier sociology literature, based on partitioning signed graphs into subsets where nodes within elements of the partition have only positive relationships between them, and only negative relationships exist across elements of the partition (Ch. 6 in Wasserman and Faust).
Additionally, a variety of concepts measure how cohesive or closely knit a social network is. An early concept related to this is the idea of a clique. One measure of cliquishness is to count the number and size of the cliques of a network. Cliques are generally required to contain at least 3 nodes.
In recent network literature the notion of clustering has been related to that of transitivity: a friend of a friend is a friend (Ch. 4 in Wasserman and Faust).
Transitivity and the Clustering Coefficient
A connected triple centered at node i is defined as a path of length 2 having node i as the intermediate node. The number of all possible connected triples at node i having degree $d_i$ is:
$T(i) = \binom{d_i}{2} = \frac{d_i (d_i - 1)}{2}$
The transitivity of a node i, also called the clustering coefficient of node i, is defined as the number of triangles containing node i divided by T(i).
[Figure: a connected triple centered at node 1 (nodes 1, 2, 3).]
[Figure: a triangle on nodes 1, 2, 3: one triangle centered at node 1, but 3 triangles centered at nodes 1, 2 and 3.]
Two measures of the aspect that a friend of a friend is a friend
Some facts on transitivity and clustering coefficient of the entire network
The Watts and Strogatz clustering coefficient tends to weight the contributions of low-degree vertices more heavily than the transitivity coefficient, because such vertices have a small denominator. Bollobas verified that T = C if all nodes have the same degree or all clustering coefficients are equal
Example
Node   Clustering coefficient
3      0
1      0
2      0
4      0
5      0.333333
7      0.166667
6      1
8      0
9      0
10     0
11     0.166667
12     1
13     0.333333
14     0
15     0
16     0
18     1
19     1
20     1
17     0
Whole network: average clustering coefficient = 0.3, transitivity = 0.257
Giant component: average clustering coefficient = 0.214, transitivity = 0.1875
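The difference between the average clustering coefficient and the transitivity can be seen in a short sketch. It assumes that nodes of degree 0 or 1 get a local coefficient of 0 and are still included in the average, which is the convention that reproduces the average of 0.3 over the 20 listed values. The toy graph is hypothetical.

```python
from itertools import combinations


def clustering_summary(adj):
    """Return (local coefficients, average clustering, transitivity).

    adj: dict mapping each node to the set of its neighbors.
    """
    local, closed_total, triples_total = {}, 0, 0
    for v, nbrs in adj.items():
        d = len(nbrs)
        possible = d * (d - 1) // 2  # connected triples centered at v
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        local[v] = closed / possible if possible else 0.0
        closed_total += closed       # equals 3 x (number of triangles) at the end
        triples_total += possible
    average_c = sum(local.values()) / len(local)
    transitivity = closed_total / triples_total if triples_total else 0.0
    return local, average_c, transitivity


# Hypothetical network: a triangle a-b-c with a pendant node d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
local, average_c, transitivity = clustering_summary(adj)
```

Here the average clustering coefficient is lower than the transitivity because the pendant node contributes a zero to the average but no triples to the transitivity; in other networks the ordering can be reversed.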
Weighted Transitivity and Clustering Coefficient
Facts and Example
Transitivity = 0.316, clustering coefficient = 0.46, weighted clustering coefficient = 0.292.
Cw < C: triples are formed by scientists who either are old but did not collaborate frequently, or are new scientists with close collaboration and few articles.
If Cw > C, we are in presence of a network in which the interconnected triples are more likely formed by the edges with larger weights. On the contrary, Cw < C signals a network in which the topological clustering is generated by edges with low weight. (Caldarelli book p. 69)
Cliquishness
The clique number of the network and the maximal sets of cliques (biological net). Clearly they constitute a cohesive group of proteins
The k-core decomposition
Larger values of coreness clearly correspond to groups of vertices with larger degree and a more central position, as a group, in the network's structure. Properties of the network that can be seen: hierarchical arrangement, degree correlations and centrality
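Coreness can be computed by repeatedly peeling off a node of minimum remaining degree. The sketch below is a simple quadratic-time version of that idea, written for readability; the toy graph is hypothetical.

```python
def core_numbers(adj):
    """Coreness of every node: the largest k such that the node belongs to the k-core."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # node of minimum remaining degree
        k = max(k, degree[v])               # the core level never decreases
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical network: a triangle a-b-c with a pendant node d attached to c.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
core = core_numbers(adj)
```

Note that node c has the largest degree but the same coreness as a and b: coreness describes membership in a densely connected group rather than individual degree.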
2005, Ignacio Alvarez-Hamelin et al.
Hierarchical Structure by the C(k) function
To investigate if any hierarchical organization is present in real networks we measured the C(k) function for several networks for which large topological maps are available.
Actor Network: the high-k range of C(k) scales as k^-1. The majority of actors with a few links (small k) appear only in one movie. Each such actor has a clustering coefficient equal to one, as all are part of the same cast, and are therefore connected to each other. The high-k nodes include many actors that acted in several movies, and thus their neighbors are not necessarily linked to each other, resulting in a smaller C(k).
The scaling of C(k) for (a) actor network, (b) The semantic web, connecting two words if they are listed as synonyms in the Merriam Webster Dictionary, (c) The WWW , (d) Internet at the Autonomous System level, each node representing a domain. The dashed line in each figure has slope -1
Ravasz, 2004
Important subgraphs that may uncover Functionality and evolutionary principles of the network
Motifs
Disadvantage: we don't know whether a motif is part of a larger cohesive community
Example
0 0 26 3
Example (scientific collaboration net)
z score = (Nreal - Nrand)/SD
Motif                     Frequency [Original]   Mean-Freq [Random]   Standard-Dev [Random]   Z-Score    p-Value
Connected triple (open)   86.573%                99.999%              0.00011008              -1219.6    1
Triangle                  13.427%                0.00101%             0.00011008              1219.6     0
Although the comparison of the weighted and unweighted clustering coefficients suggested that triangles are not due to scientists with frequent collaboration, the motif analysis (following Milo's study) makes it clear that the 13.427% of triangles contained in the network constitutes a statistically significant characteristic of its evolution.
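As a quick arithmetic check, the reported z-scores follow from z = (N_real - N_rand) / SD when the frequencies are written as fractions rather than percentages. The function name is hypothetical; small differences from the reported 1219.6 come from rounding of the published inputs.

```python
def motif_z_score(freq_real, mean_freq_random, sd_random):
    """z = (N_real - N_rand) / SD, with all frequencies given as fractions."""
    return (freq_real - mean_freq_random) / sd_random


# Values for the scientific collaboration network, percentages converted to fractions.
z_triangle = motif_z_score(0.13427, 0.0000101, 0.00011008)
z_open_triple = motif_z_score(0.86573, 0.99999, 0.00011008)
```

A z-score of this size means that triangles are over-represented by more than a thousand standard deviations of the randomized ensemble, while open triples are under-represented by the same amount.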
Community structure of the network
Triadic Closure
A weak link (tie) bridge
After some time
A weak link (tie) local bridge
The strength of weak ties (links) edge betweenness
Finding Communities
Social and other networks have a natural community structure.
We want to discover this structure rather than impose a certain size of community or fix the number of communities.
Without looking, can we discover community structure in an automated way?
Girvan & Newman: betweenness clustering
Finding community structure in very large networks (fast greedy algorithm)
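Consider edges that fall within a community or between a community and the rest of the network.
Define modularity:
$Q = \frac{1}{2m} \sum_{vw} \left[ A_{vw} - \frac{k_v k_w}{2m} \right] \delta(c_v, c_w)$
$A_{vw}$: adjacency matrix.
$k_v k_w / 2m$: the probability of an edge between two vertices is proportional to their degrees.
$\delta(c_v, c_w) = 1$ if the vertices are in the same community, and 0 otherwise.
For a random network, Q = 0: the number of edges within a community is no different from what you would expect.
0 ≤ Q ≤ 1 for the partitions of interest (Q can be negative for a partition with fewer within-community edges than expected).
Clauset, M. E. J. Newman, Cristopher Moore, 2004. Slide from Lada Adamic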
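A sketch of the modularity of a given partition of an unweighted, undirected network. It uses the equivalent per-community form Q = sum over communities of [L_c / m - (D_c / 2m)^2], where L_c is the number of links inside community c and D_c is the sum of the degrees of its nodes. The toy network (two triangles joined by one link) and the function name are hypothetical.

```python
def modularity(edges, community):
    """Modularity Q of a partition.

    edges: list of undirected links (u, v), each listed once.
    community: dict mapping each node to its community label.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Observed fraction of links that fall inside communities (the A_vw term).
    inside = sum(1 for u, v in edges if community[u] == community[v]) / m
    # Expected fraction when links are placed in proportion to degrees (the k_v k_w / 2m term).
    degree_sum = {}
    for v, k in degree.items():
        degree_sum[community[v]] = degree_sum.get(community[v], 0) + k
    expected = sum((total / (2 * m)) ** 2 for total in degree_sum.values())
    return inside - expected


# Hypothetical network: two triangles joined by the link (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_group = {v: "A" for v in range(1, 7)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

Placing every node in a single community always gives Q = 0, so a useful partition must do better than that baseline.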
Communities: edge betweenness
modularity = 0.45, very low
Extensions to weighted networks (with fast greedy algorithm)
Betweenness clustering? It will not work: strong ties will have a disproportionate number of short paths, and those are the ones we want to keep.
Modularity (Analysis of weighted networks, M. E. J. Newman):
$Q = \frac{1}{2m} \sum_{vw} \left[ A_{vw} - \frac{k_v k_w}{2m} \right] \delta(c_v, c_w)$
where $A_{vw}$ is now the weight of the edge and $k_i = \sum_j A_{ij}$.
[Figure: Reuters news articles keywords.]
Slide from Lada Adamic
Weighted Community structure of the giant component
modularity = 0.86, 16 communities