Introduction to complex networks
Part I: Structure
Ginestra Bianconi
Physics Department, Northeastern University, Boston, USA
NetSci 2010 Boston, May 10 2010
Complex networks describe the underlying structure of interacting complex Biological, Social and Technological systems.
Identify your network!
Why work on networks?
Because NETWORKS encode the ORGANIZATION, FUNCTION, ROBUSTNESS AND DYNAMICAL BEHAVIOR of the entire complex system.
Types of networks
Simple: Each link either exists or does not exist, and the links have no directionality (protein interaction map, Internet, ...)
Directed: The links have directionality, i.e., arrows (World-Wide-Web, social networks, ...)
Signed: The links have a sign (transcription factor networks, epistatic networks, ...)
Types of networks
Weighted: The links are associated with a real number indicating their weight (airport networks, phone-call networks, ...)
With features of the nodes: The nodes might have a weight or a color (social networks, diseasome, etc.)
Bipartite networks: ex. Metabolic network
Reaction pathway:
1: A + B → C + D
2: A + D → E
3: B + C → E
Bipartite graph: metabolites A, B, C, D, E linked to reactions 1, 2, 3
[Figure: metabolites projection (nodes A, B, C, D, E) and reactions (enzymes) projection (nodes 1, 2, 3)]
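The two projections of the bipartite graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `reactions` dictionary is a hypothetical encoding of the three reactions above, and two metabolites are linked whenever they take part in the same reaction (the slide may use a finer rule, such as linking substrates only to products).

```python
from itertools import combinations

# Hypothetical encoding of the three reactions of the slide: each reaction
# number maps to the set of metabolites taking part in it. The split into
# substrates and products is ignored in this sketch.
reactions = {
    1: {"A", "B", "C", "D"},
    2: {"A", "D", "E"},
    3: {"B", "C", "E"},
}

def metabolite_projection(reactions):
    """Link two metabolites if they take part in the same reaction."""
    links = set()
    for members in reactions.values():
        for u, v in combinations(sorted(members), 2):
            links.add((u, v))
    return links

def reaction_projection(reactions):
    """Link two reactions if they share at least one metabolite."""
    links = set()
    for r1, r2 in combinations(sorted(reactions), 2):
        if reactions[r1] & reactions[r2]:
            links.add((r1, r2))
    return links

metabolite_links = metabolite_projection(reactions)
reaction_links = reaction_projection(reactions)
print(sorted(reaction_links))
print(sorted(metabolite_links))
```

With this membership-based rule every pair of reactions shares a metabolite, so the reactions projection is a triangle.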
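Adjacency matrix
$$a = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 & 0 \end{pmatrix}$$
[Figure: example network with nodes 1, 2, 3, 4, 5]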
Network: A set of labeled nodes and links between them
Adjacency matrix: The matrix of entries
a(i,j)=1 if there is a link between node i and j
a(i,j)=0 otherwise
Total number of nodes N and links L
The total number of nodes N and the total number of links L are the most basic characteristics of a network.
Degree distribution
Degree of node i: the number of links of node i
$$k_i = \sum_j a_{ij}$$
Degree distribution P(k): how many nodes have degree k
[Figure: example network with nodes 1, 2, 3, 4, 5 and its degree distribution P(k) versus k]
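As a small worked example, the degrees and the degree distribution of the five-node example network can be read off its adjacency matrix. The sketch below assumes the matrix is stored as a plain list of rows; the variable names are illustrative.

```python
from collections import Counter

# Adjacency matrix of the five-node example network (row i = node i + 1).
A = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 1, 1, 0],
]

N = len(A)                            # total number of nodes
degrees = [sum(row) for row in A]     # k_i = sum_j a_ij
L = sum(degrees) // 2                 # each link is counted from both ends
P = Counter(degrees)                  # P[k] = number of nodes with degree k

print("N =", N, "L =", L)
print("degrees:", degrees)
print("P(k):", dict(sorted(P.items())))
```

Here P(k) is kept as a count of nodes, as on the slide; dividing by N would give the normalized distribution.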
Random graphs
G(N,p) ensemble: graphs with N nodes, each pair of nodes linked with probability p
Binomial distribution:
$$P(k) = \binom{N-1}{k} p^k (1-p)^{N-1-k}$$
G(N,L) ensemble: graphs with exactly N nodes and L links
Poisson distribution:
$$P(k) = \frac{1}{k!} c^k e^{-c}$$
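The two degree distributions can be compared numerically. The sketch below is illustrative only: the values of `N` and `c` are arbitrary choices, and `p` is set so that the binomial mean (N-1)p equals the Poisson mean c.

```python
from math import comb, exp, factorial

def binomial_degree_pmf(k, N, p):
    """P(k) for G(N,p): k of the N-1 possible links of a node are present."""
    return comb(N - 1, k) * p**k * (1 - p) ** (N - 1 - k)

def poisson_degree_pmf(k, c):
    """Poisson P(k) with mean degree c."""
    return c**k * exp(-c) / factorial(k)

N = 1000          # illustrative network size
c = 4.0           # illustrative mean degree
p = c / (N - 1)   # link probability giving the same mean degree

for k in range(9):
    print(k, round(binomial_degree_pmf(k, N, p), 5), round(poisson_degree_pmf(k, c), 5))
```

For a sparse graph such as this one, the two columns should agree closely, which is why the Poisson form is commonly used for large random graphs.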
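Universalities: Scale-free degree distribution
$$P(k) \propto k^{-\gamma}, \quad \gamma \in (2,3)$$
$$\langle k \rangle \text{ finite}, \quad \langle k^2 \rangle \to \infty$$
$$P(k) \sim (k + k_0)^{-\gamma} \exp\left(-\frac{k + k_0}{k_\tau}\right)$$
[Figure: degree distribution P(k) versus k for actor networks, the WWW and the Internet]
Faloutsos et al. 1999, Barabasi-Albert 1999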
Topology of the yeast protein network
$$P(k) \sim (k + k_0)^{-\gamma} e^{-(k + k_0)/k_\tau}, \quad \gamma \approx 2.5$$
H. Jeong et al. Nature (2001)
Scale-free networks
• Technological networks:
– Internet, World-Wide Web
• Biological networks:
– Metabolic networks
– Protein-interaction networks
– Transcription networks
• Transportation networks:
– Airport networks
• Social networks:
– Collaboration networks
– Citation networks
• Economical networks:
– Networks of shareholders in the financial market
– World Trade Web
Why this universality?
• Growing networks:
– Preferential attachment
Barabasi & Albert 1999, Dorogovtsev & Mendes 2000, Bianconi & Barabasi 2001, etc.
• Static networks:
– Hidden variables mechanism
Chung & Lu 2002, Caldarelli et al. 2002, Park & Newman 2003
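The growing-network mechanism in the list above can be sketched as a minimal preferential-attachment simulation. This is a simplified illustration, not the exact model of any of the cited papers: the function name, the fully connected starting core and the parameters `n_nodes`, `m` and `seed` are assumptions made for the example.

```python
import random

def preferential_attachment(n_nodes, m, seed=0):
    """Grow a network: each new node brings m links, attached to existing
    nodes with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumed starting point: a fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears in `ends` once per link end, so a uniform draw
    # from `ends` selects a node with probability proportional to its degree.
    ends = [node for edge in edges for node in edge]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:          # m distinct targets, no multi-links
            targets.add(rng.choice(ends))
        for t in sorted(targets):
            edges.append((new, t))
            ends.extend((new, t))
    return edges

edges = preferential_attachment(200, 2)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print("links:", len(edges), "largest degree:", max(degree.values()))
```

Early nodes tend to accumulate many links, which is the qualitative origin of the broad degree distribution.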
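Scale-free networks
$$P(k) \propto k^{-\gamma}$$
with $\gamma > 3$: $\langle k \rangle$ finite, $\langle k^2 \rangle$ finite
with $2 < \gamma < 3$: $\langle k \rangle$ finite, $\langle k^2 \rangle \to \infty$
with $1 < \gamma < 2$: $\langle k \rangle \to \infty$, $\langle k^2 \rangle \to \infty$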
Hippocampus functional neural network
$$P(k) \propto k^{-\gamma}, \quad \gamma \in (1,2], \quad \langle k \rangle \to \infty, \quad \langle k^2 \rangle \to \infty$$
Dense scale-free networks
Bonifazi et al. Science 2009
Social network of phone calls
$$P(k) = \frac{a}{(k + k_0)^{\gamma}} e^{-k/k_c}, \quad \gamma = 8.4$$
Finite scale network
J. P. Onnela PNAS 2007
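Clustering coefficient
Definition of local clustering coefficient:
$$C_i = \frac{\#\text{ of closed triangles through node } i}{k_i (k_i - 1)/2}$$
[Figure: example network with nodes 1, 2, 3, 4, 5]
Clustering coefficient of nodes 2 and 3:
$$C_2 = \frac{1}{3}, \quad C_3 = \frac{2}{3}$$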
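The values for nodes 2 and 3 can be checked with a short sketch on the five-node example network (the one given by the adjacency matrix earlier). The adjacency-dictionary representation and the convention of returning 0 for nodes with fewer than two neighbors are assumptions of this example.

```python
from fractions import Fraction
from itertools import combinations

# Five-node example network as an adjacency dictionary.
adj = {1: {2}, 2: {1, 3, 5}, 3: {2, 4, 5}, 4: {3, 5}, 5: {2, 3, 4}}

def local_clustering(adj, i):
    """Closed triangles through i divided by k_i (k_i - 1) / 2."""
    k = len(adj[i])
    if k < 2:
        return Fraction(0)   # assumed convention for nodes of degree 0 or 1
    closed = sum(1 for u, v in combinations(sorted(adj[i]), 2) if v in adj[u])
    return Fraction(closed, k * (k - 1) // 2)

for node in sorted(adj):
    print(node, local_clustering(adj, node))
```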
Shortest distance
The shortest distance between two nodes is the minimal number of links that a path must hop to go from the source to the destination.
[Figure: example network with nodes 1, 2, 3, 4, 5]
The shortest distance between node 4 and node 1 is 3; between node 3 and node 1 it is 2.
Diameter and average distance
The diameter D of a network is the maximal length of the shortest distance between any pair of nodes in the network.
The average distance $\ell$ is the average length of the shortest distance between any pair of nodes in the network.
$$D \geq \ell$$
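Shortest distances, the diameter and the average distance can be obtained with a breadth-first search. The sketch below uses the five-node example network and assumes the network is connected; the function names are illustrative.

```python
from collections import deque
from itertools import combinations

adj = {1: {2}, 2: {1, 3, 5}, 3: {2, 4, 5}, 4: {3, 5}, 5: {2, 3, 4}}

def bfs_distances(adj, source):
    """Shortest distance (number of links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_and_average_distance(adj):
    """Maximum and mean shortest distance over all pairs (connected network assumed)."""
    lengths = [bfs_distances(adj, s)[t] for s, t in combinations(sorted(adj), 2)]
    return max(lengths), sum(lengths) / len(lengths)

D, avg = diameter_and_average_distance(adj)
print("distance 4 -> 1:", bfs_distances(adj, 4)[1])
print("diameter:", D, "average distance:", avg)
```

On this network the diameter is 3 and the average distance is 1.5, consistent with the inequality between the two quantities.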
Universality: Small world
$$C_i = \frac{\#\text{ of links between the } k_i \text{ neighbors of node } i}{k_i (k_i - 1)/2}$$
Networks are clustered (large average $C_i$, i.e. C) but have a small characteristic path length (small L).

Network        C          Crand     L          N
WWW            0.1078     0.00023   3.1        153127
Internet       0.18-0.3   0.001     3.7-3.76   3015-6209
Actor          0.79       0.00027   3.65       225226
Coauthorship   0.43       0.00018   5.9        52909
Metabolic      0.32       0.026     2.9        282
Foodweb        0.22       0.06      2.43       134
C. elegans     0.28       0.05      2.65       282

Watts and Strogatz (1999)
Small world in social systems
1967 Milgram experiment: People from Nebraska and Kansas were asked to contact a person in Boston through their network of acquaintances.
The strength of weak ties (Granovetter 1973): Weak ties help navigate the social networks.
Small-world model (Watts & Strogatz 1998)
The model for coexistence of high clustering and small distances in complex networks
Diameter and average length
Poisson networks (small world property):
$$D \approx \frac{\log(N)}{\log(\langle k \rangle)}, \quad \ell \approx \frac{\log(N)}{\log(\langle k \rangle)}$$
Scale-free networks (ultra small world property):
$$D \approx \log(N)$$
$$\ell \approx \begin{cases} \dfrac{\log(N)}{\log(\log(N))} & \text{if } \gamma = 3 \\ \log(\log(N)) & \text{if } \gamma < 3 \end{cases}$$
Here D is the diameter and $\ell$ the average distance.
Chung & Lu 2004, Cohen & Havlin 2003
Giant component
A connected component of a network is a subgraph such that for each pair of nodes in the subgraph there is at least one path linking them.
The giant component is the connected component of the network which contains a number of nodes of the same order of magnitude as the total number of nodes.
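Molloy-Reed principle
A network has a giant component if
$$\frac{\langle k^2 \rangle}{\langle k \rangle} \geq 2$$
Second moment $\langle k^2 \rangle$ and Molloy-Reed condition:
Poisson networks: $\langle k^2 \rangle = \langle k \rangle^2 + \langle k \rangle$; Molloy-Reed condition: $\langle k \rangle \geq 1$
Scale-free networks: $\langle k^2 \rangle \approx \begin{cases} \log(N) & \text{if } \gamma = 3 \\ N^{(3-\gamma)/2} & \text{if } \gamma < 3 \end{cases}$; Molloy-Reed condition: $\gamma \leq 3$
Molloy-Reed 1995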
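The Molloy-Reed ratio is straightforward to evaluate from a degree sequence. The sketch below is only an illustration: the criterion is a statement about large random networks with a given degree distribution, so applying it to the short degree sequences used here is a toy check, and the function names are assumptions.

```python
def molloy_reed_ratio(degrees):
    """<k^2> / <k> for a degree sequence (the factors 1/N cancel)."""
    return sum(k * k for k in degrees) / sum(degrees)

def satisfies_molloy_reed(degrees):
    """True if <k^2> / <k> >= 2."""
    return molloy_reed_ratio(degrees) >= 2

example_degrees = [1, 3, 3, 2, 3]   # degrees of the five-node example network
isolated_pairs = [1, 1, 1, 1]       # two disconnected links, no giant component

print(molloy_reed_ratio(example_degrees), satisfies_molloy_reed(example_degrees))
print(molloy_reed_ratio(isolated_pairs), satisfies_molloy_reed(isolated_pairs))
```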
Robustness
Complex systems maintain their basic functions even under errors and failures.
[Figure: node failure; S versus the fraction of removed nodes f (from 0 to 1), with the threshold fc marked]
Robustness of scale-free networks
[Figure: S versus the fraction of removed nodes f (from 0 to 1) for attacks and for failures, with the threshold fc marked; topological error tolerance]
R. Albert et al. 2000, Cohen et al. 2000, Cohen et al. 2001
Robustness
[Figure: two panels. In one, Failure = Attack; in the other, the network is robust against failure and weak against attack]
R. Albert et al. 2000
Betweenness Centrality
The betweenness centrality of a node v is given by
$$B(v) = \sum_{s,t \neq v} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
where $\sigma_{st}$ is the total number of shortest paths between s and t, and $\sigma_{st}(v)$ is the number of such paths passing through v.
The nodes A and B are also called bridges and have high betweenness.
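The definition can be evaluated directly on a small network. The sketch below is a brute-force illustration on the five-node example network, not an efficient algorithm; it counts each unordered pair (s, t) once, which is an assumed convention (summing over ordered pairs would double every value).

```python
from collections import deque
from itertools import combinations

adj = {1: {2}, 2: {1, 3, 5}, 3: {2, 4, 5}, 4: {3, 5}, 5: {2, 3, 4}}

def shortest_path_counts(adj, s):
    """Distances from s and the number of shortest paths from s to each node."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """B(v) = sum over pairs s, t != v of sigma_st(v) / sigma_st."""
    info = {s: shortest_path_counts(adj, s) for s in adj}
    B = {v: 0.0 for v in adj}
    for s, t in combinations(sorted(adj), 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue                     # s and t are not connected
        for v in adj:
            if v in (s, t) or v not in dist_s:
                continue
            # v lies on a shortest s-t path iff the two distances add up
            if dist_s[v] + dist_t[v] == dist_s[t]:
                B[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return B

B = betweenness(adj)
print(B)
```

Node 2, the only connection of node 1 to the rest of the network, gets the highest value.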
Link probability in uncorrelated networks
In uncorrelated networks the probability that a node i is linked to a node j is given by
$$p_{ij} = \frac{k_i k_j}{\langle k \rangle N}$$
[Figure: nodes i and j] For example, for $k_i = 2$ and $k_j = 3$:
$$p_{ij} = \frac{2 \cdot 3}{\langle k \rangle N}$$
Degree correlations
Link probability between nodes of degree classes k and k':
$$P_R(k,k') = \frac{k k'}{\langle k \rangle N} P(k) P(k')$$
The map of
$$\frac{P(k,k')}{P_R(k,k')}$$
reveals correlations in the protein interaction map.
S. Maslov and K. Sneppen Science 2002
Specific characteristic of networks: Degree correlations
Average degree of a neighbor of a node of degree k:
$$k_{nn}(k) = \left\langle \frac{1}{k_i} \sum_{j \in N(i)} k_j \right\rangle_{k_i = k}$$
$$k_{nn}(k) \approx k^{\alpha}$$
Assortative networks: $\alpha > 0$
Uncorrelated networks: $\alpha = 0$
Disassortative networks: $\alpha < 0$
[Figure: log(knn(k)) versus log(k)]
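The quantity knn(k) can be computed by averaging, over all nodes of degree k, the mean degree of their neighbors. The sketch below applies this to the five-node example network; the function name is illustrative, and a network this small says nothing about the exponent of the power law.

```python
from collections import defaultdict

adj = {1: {2}, 2: {1, 3, 5}, 3: {2, 4, 5}, 4: {3, 5}, 5: {2, 3, 4}}

def average_neighbor_degree(adj):
    """knn(k): mean over nodes i with k_i = k of (1/k_i) * sum of neighbor degrees."""
    per_degree = defaultdict(list)
    for i, neighbors in adj.items():
        k_i = len(neighbors)
        if k_i == 0:
            continue                     # isolated nodes have no neighbors
        mean_neighbor_degree = sum(len(adj[j]) for j in neighbors) / k_i
        per_degree[k_i].append(mean_neighbor_degree)
    return {k: sum(values) / len(values) for k, values in per_degree.items()}

knn = average_neighbor_degree(adj)
for k in sorted(knn):
    print(k, knn[k])
```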
Degree-degree correlations in the Internet at the AS level
Average degree of a neighbor of a node of degree k:
$$k_{nn}(k) = \left\langle \frac{1}{k_i} \sum_{j \in N(i)} k_j \right\rangle_{k_i = k}$$
$$k_{nn}(k) \approx k^{\alpha}, \quad \alpha < 0$$
The network is disassortative.
Vazquez et al. PRL 2001
Correlations and clustering coefficient
Average clustering coefficient C(k) of nodes of degree k:
$$C(k) = \left\langle \frac{\sum_{j,r} a_{ij} a_{jr} a_{ri}}{k (k - 1)} \right\rangle_{k_i = k}$$
$$C(k) \approx k^{-\delta}$$
Uncorrelated networks: $\delta = 0$
Modular networks: $\delta > 0$
[Figure: log(C(k)) versus log(k)]
Clustering coefficient of metabolic networks
$$C(k) \approx k^{-\delta}$$
There are important correlations in the network! The network is not 'random'. Highly connected nodes link more "distant" nodes of the network.
Ravasz et al. Science (2002).
Loops or cycles of size L
A loop or a cycle of size L is a path of the network that starts at one point and ends, after hopping L links, on the same point without crossing the intermediate nodes more than one time.
[Figure: example network with nodes 1, 2, 3, 4, 5]
This network has 2 loops of size 3 and 1 loop of size 4.
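The loop counts quoted for the example network can be reproduced by brute-force enumeration. The sketch below explores all simple paths, so it is only practical for small networks and small loop sizes; it assumes loop sizes of at least 3, and the function name is illustrative.

```python
adj = {1: {2}, 2: {1, 3, 5}, 3: {2, 4, 5}, 4: {3, 5}, 5: {2, 3, 4}}

def count_loops(adj, size):
    """Number of distinct loops with `size` links (size >= 3 assumed)."""
    found = 0

    def extend(path):
        nonlocal found
        last = path[-1]
        if len(path) == size:
            if path[0] in adj[last]:     # the path closes back on its start
                found += 1
            return
        for nxt in adj[last]:
            if nxt not in path:          # intermediate nodes are visited once
                extend(path + [nxt])

    for start in adj:
        extend([start])
    # each loop is found from every starting node and in both directions
    return found // (2 * size)

for size in (3, 4, 5):
    print(size, count_loops(adj, size))
```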
Small loops in the Internet at the AS level
The number of loops of size L = 3, 4, 5 grows with the network size N as
$$\aleph_L \propto N^{\xi(L)}$$
In Poisson networks instead they are a fixed number, independent of N.
G. Bianconi et al. PRE 2005
Cliques of size c
A clique of size c is a fully connected subgraph of the network of c nodes and c(c-1)/2 links.
[Figure: example network with nodes 1, 2, 3, 4, 5]
This network contains 4 cliques of size 3 (triangles) and 1 clique of size 4.
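Clique counts can be checked by testing every subset of c nodes. The network drawn on the slide is not recoverable from the text, so the sketch below uses an assumed network that matches the stated counts: nodes 2, 3, 4, 5 fully connected, plus a link between nodes 1 and 2.

```python
from itertools import combinations

# Assumed example network (not taken from the slide): a fully connected
# group {2, 3, 4, 5} plus the link 1-2.
adj = {1: {2}, 2: {1, 3, 4, 5}, 3: {2, 4, 5}, 4: {2, 3, 5}, 5: {2, 3, 4}}

def count_cliques(adj, c):
    """Number of sets of c nodes in which every pair is linked."""
    return sum(
        1
        for nodes in combinations(sorted(adj), c)
        if all(v in adj[u] for u, v in combinations(nodes, 2))
    )

for c in (3, 4, 5):
    print(c, count_cliques(adj, c))
```

The enumeration grows combinatorially with c and with the network size, so it is meant for small examples only.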
Small subgraphs appear abruptly when we increase the average number of links in the random graphs
$$c = \frac{L}{N} \approx N^{\alpha}$$
Subgraph thresholds
[Figure: axis of the exponent α with thresholds marked at -∞, -1, -1/2, -1/3, -1/4, 0, 1/3, 1/2; labels: finite average connectivity (Poisson distribution) and diverging average connectivity]
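Small subgraphs in SF networks
$$P(k) \approx k^{-\gamma}$$
[Figure: axis of γ with the values 3, 2, 1 marked; regions labeled $\langle k \rangle$ finite and $\langle k \rangle \to \infty$]
Small loops of length L, depending on γ:
$$\aleph_L \text{ finite}, \qquad \aleph_L \approx \frac{1}{2L} N^{L(3-\gamma)/2}, \qquad \aleph_L \approx \frac{1}{2L} N^{L(3-\gamma)/\gamma}$$
Cliques of size c, depending on γ:
$$c_{max} = 3, \qquad c_{max} \to \infty \text{ as } N \to \infty$$
[Equations: scaling of the number of cliques $\aleph_c$ with N and c in each regime of γ]
G. Bianconi and M. Marsili (2004), (2005)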
K-core of a network
A K-core of a network is the subgraph of a network obtained by removing the nodes with connectivity $k_i < K$ iteratively until the network has only nodes with $k_i \geq K$.
K-core of the Internet: DIMES Internet data. Visualization: Alvarez-Hamelin et al. 2005
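The iterative pruning that defines the K-core translates directly into code. The sketch below applies it to the five-node example network from the earlier slides; the function name and the adjacency-dictionary representation are assumptions of this example.

```python
adj = {1: {2}, 2: {1, 3, 5}, 3: {2, 4, 5}, 4: {3, 5}, 5: {2, 3, 4}}

def k_core(adj, K):
    """Remove nodes with degree < K until every remaining node has degree >= K."""
    core = {i: set(neighbors) for i, neighbors in adj.items()}   # work on a copy
    changed = True
    while changed:
        changed = False
        for i in [n for n, neighbors in core.items() if len(neighbors) < K]:
            for j in core.pop(i):
                if j in core:
                    core[j].discard(i)   # removing i lowers the degree of j
            changed = True
    return core

print(sorted(k_core(adj, 2)))
print(sorted(k_core(adj, 3)))
```

Removing a node can push its neighbors below the threshold, which is why the pruning has to be repeated until nothing changes. In this example the 3-core is empty even though three nodes start with degree 3.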
K-cores on trees with given degree distributions
Scale-free networks
Maximal K of the K-cores:
$$K_{max} \approx p \, m^{\gamma - 2} K_{cut}^{3 - \gamma}$$
Size of the minimal K-core:
$$M(K_{max}) \approx p \left( \frac{m}{K_{cut}} \right)^{\gamma - 1}$$
S. Dorogovtsev et al. PRL (2006), Carmi et al. PNAS (2007)
How to build a null model from a given network: swap of connections
Choose two random links linking four distinct nodes.
If possible (not already existing links), swap the ends of the links.
Maslov & Sneppen 2002
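The two steps above can be sketched as a small randomization routine. This is a minimal illustration, assuming an undirected network given as a list of links; the function name, the ring used as input and the parameters `n_swaps` and `seed` are choices made for the example.

```python
import random

def swap_links(edges, n_swaps, seed=0):
    """Attempt n_swaps link swaps; every node keeps its original degree."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)    # two random links
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue                               # need four distinct nodes
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue                               # would duplicate an existing link
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)        # swap the ends of the links
    return edges

def degree_sequence(edges):
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree

ring = [(n, (n + 1) % 10) for n in range(10)]      # illustrative input network
randomized = swap_links(ring, n_swaps=100)
print(randomized)
```

Rejected attempts are simply skipped, so the number of swaps actually performed can be smaller than `n_swaps`.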
Motifs
The motifs are subgraphs which appear with higher frequency in real networks than in randomized networks.
In biological networks the motifs are said to be selected by evolution and are relevant to understand the function of the network.
Motifs in the transcriptome network of E. coli: S.S. Shen-Orr et al., 2002
Milo et al. 2002
Motifs of size 3 in directed networks
The number of distinguishable possible motifs increases exponentially with the motif size limiting the extension of this method to large subgraphs
Specific characteristics of a network: communities
A community of a network defines a set of nodes with a similar connectivity pattern.
[Figure: dolphins social network; high-school dating networks]
S. Fortunato Phys. Rep. 2010
Girvan and Newman algorithm
The algorithm:
1. The betweenness of all existing edges in the network is calculated first.
2. The edge with the highest betweenness is removed.
3. The betweenness of all edges affected by the removal is recalculated.
4. Steps 2 and 3 are repeated until no edges remain.
Girvan and Newman PNAS 2002
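The four steps can be sketched as follows. This is a simplified illustration rather than the authors' implementation: edge betweenness is computed by accumulating shortest-path counts from every node, all edges are recomputed after each removal (instead of only the affected ones), and the test network of two triangles joined by one link is an assumption of the example.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path betweenness of every edge of an undirected network."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1}, {s: []}, []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # walk back from the farthest nodes, sharing path counts among edges
        delta = {v: 0.0 for v in order}
        for v in reversed(order):
            for u in preds[v]:
                share = sigma[u] / sigma[v] * (1.0 + delta[v])
                score[frozenset((u, v))] += share
                delta[u] += share
    # every pair (s, t) has been counted from both of its ends
    return {edge: value / 2.0 for edge, value in score.items()}

def girvan_newman_order(adj):
    """Order in which edges are removed (steps 1 to 4)."""
    work = {u: set(neighbors) for u, neighbors in adj.items()}
    removed = []
    while any(work.values()):
        scores = edge_betweenness(work)        # steps 1 and 3 (full recomputation)
        edge = max(scores, key=scores.get)     # step 2: highest betweenness
        u, v = tuple(edge)
        work[u].discard(v)
        work[v].discard(u)
        removed.append(edge)
    return removed

# Assumed test network: triangles {1, 2, 3} and {4, 5, 6} joined by the link 3-4.
two_triangles = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
order = girvan_newman_order(two_triangles)
print(order[0])
```

The link joining the two triangles carries all shortest paths between them, so it is removed first and the two communities separate.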
Modularity
Assign a community $s_i = 1, 2, \ldots, K$ to each node. The modularity Q is given by
$$Q = \frac{1}{\langle k \rangle N} \sum_{ij} \left[ a_{ij} - \frac{k_i k_j}{\langle k \rangle N} \right] \delta(s_i, s_j)$$
Tight communities can be found by maximizing the modularity function.
Newman PNAS 2006
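The modularity of a given community assignment can be evaluated directly from the formula. The sketch below assumes an adjacency dictionary and a dictionary mapping each node to its community label; the test network (two triangles joined by one link) is an assumption of the example.

```python
def modularity(adj, community):
    """Q = (1 / (<k> N)) * sum_ij [a_ij - k_i k_j / (<k> N)] * delta(s_i, s_j)."""
    k_N = sum(len(neighbors) for neighbors in adj.values())   # <k> N = sum of degrees
    Q = 0.0
    for i in adj:
        for j in adj:
            if community[i] != community[j]:
                continue                                      # delta(s_i, s_j) = 0
            a_ij = 1 if j in adj[i] else 0
            Q += a_ij - len(adj[i]) * len(adj[j]) / k_N
    return Q / k_N

# Assumed test network: triangles {1, 2, 3} and {4, 5, 6} joined by the link 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
by_triangle = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
single_group = {node: "a" for node in adj}

print(modularity(adj, by_triangle))
print(modularity(adj, single_group))
```

Splitting the network along the two triangles gives Q = 5/14, while putting every node in one community gives Q = 0.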
Cliques for community detection
Overlapping sets of cliques can be helpful to identify community structures in different networks.
For this method to work systematically, networks must have many cliques (ex. scale-free networks).
Palla et al. Nature (2005)
Weight distribution and weight-degree correlations
The weights of a network might correlate with the degree of the nodes which are linked by them.
Strength $S_k$: a measure of the inhomogeneities of the distribution of the weights among the nodes of the network.
Disparity $Y_2(k)$: a measure of the inhomogeneities of the weights ending at a node of degree k.
Strength of the node and weight-degree correlations
The strength of a node is the sum of the weights of the links connecting that node; $S_k$ is its average over the nodes of degree k.
$$S_k = \left\langle \sum_j w_{ij} \right\rangle_{k_i = k} \propto \begin{cases} k \\ k^{1+\eta} \end{cases}$$
$S_k \propto k$: all weights contribute equally to the strength of the nodes.
$S_k \propto k^{1+\eta}$: weight-degree correlations. Highly weighted links are connected to highly connected nodes.
Coauthorship networks
The coauthorship network is a scale-free network with a power-law strength distribution and a linear dependence of strength versus degree.
A. Barrat et al. PNAS 2004
Airport network
The airport network is a scale-free network with a power-law strength distribution. The strength versus k shows weight-degree correlations.
A. Barrat et al. PNAS 2004
The Disparity
$$Y_2(k) = \left\langle \sum_j \left( \frac{w_{ij}}{s_i} \right)^2 \right\rangle_{k_i = k} \approx \begin{cases} 1/k & \text{all weights contribute equally} \\ 1 & \text{there is one dominating link} \end{cases}$$
In many networks we observe
$$Y_2(k) = 1/k^{\alpha} \text{ with } \alpha \in (0,1)$$
indicating a hierarchy of weights.
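The two limiting cases of the disparity can be checked on a single node. The sketch below assumes the weights are stored as a dictionary of dictionaries; the node names and weight values are made up for the illustration.

```python
def strength(weights, i):
    """s_i: sum of the weights of the links of node i."""
    return sum(weights[i].values())

def disparity(weights, i):
    """Sum over the links of node i of (w_ij / s_i)^2."""
    s_i = strength(weights, i)
    return sum((w / s_i) ** 2 for w in weights[i].values())

# Hypothetical node with k = 4 links, once with equal weights and once
# with a single dominating link.
equal = {"hub": {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}}
dominated = {"hub": {"a": 100.0, "b": 1.0, "c": 1.0, "d": 1.0}}

print(strength(equal, "hub"), disparity(equal, "hub"))
print(strength(dominated, "hub"), disparity(dominated, "hub"))
```

With equal weights the result is exactly 1/k, and with one dominating link it approaches 1. Averaging the per-node value over all nodes of degree k gives the disparity as a function of k.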
Metabolic Network
The degree distribution of the metabolite projection of the metabolic network is scale-free.
The distribution of the fluxes in metabolic networks (E. coli in succinate substrate) is power-law.
The $Y_2(k)$ predicted by flux-balance analysis techniques is non-trivial:
$$Y_2(k) \approx k^{-0.27}$$
Almaas et al. Nature 2004
Extraction of sector information in financial markets
Minimal-Spanning-Trees; Planar maximally filtered graphs
NYSE daily returns; USA equity market 1995-98
Bonanno et al. (2003), Tumminello et al. (2007)
Why work on networks?
Because NETWORKS encode the ORGANIZATION, FUNCTION, ROBUSTNESS AND DYNAMICAL BEHAVIOR of the entire complex system.