Class 7: Evolving Network Models
Network Science: Evolving Network Models, February 2012
Prof. Albert-László Barabási
Dr. Baruch Barzel, Dr. Mauro Martino

EMPIRICAL FINDINGS FOR REAL NETWORKS
Small world: distances scale logarithmically with the network size.
Clustered: the clustering coefficient does not depend on the network size.
Scale-free: the degrees follow a power-law distribution, $P(k) \sim k^{-\gamma}$.

BENCHMARK 1: Regular Lattices
Constant degree: $P(k) = \delta(k - k_d)$.
Constant clustering coefficient: $C = C_d$.
The average path length varies as $\langle l \rangle \sim N^{1/D}$ (two-dimensional lattice: $N^{1/2}$; D-dimensional lattice: $N^{1/D}$).
Example lattice with six neighbors per node: degree distribution $P(k) = \delta(k - 6)$.

BENCHMARK 2: Random Network Model
Erdős-Rényi model, Publ. Math. Debrecen 6, 290 (1959).
Fixed node number N; pairs of nodes are connected with probability p.
Clustering coefficient: $C = p = \langle k \rangle / N$.
Path length: $\langle l \rangle \sim \ln N / \ln \langle k \rangle$.
Degree distribution: Poisson.

BENCHMARK 3: Small World Model
Watts-Strogatz algorithm, Nature (1998).
Fixed node number N; links of a regular lattice are rewired with probability p.
Clustering coefficient: high and independent of N for small p.
Path length: $\langle l \rangle \sim \ln N$.
Degree distribution: exponential.

EMPIRICAL DATA FOR REAL NETWORKS
Model             Path length                      Clustering               Degree distr.
Regular network   $N^{1/D}$                        constant                 $P(k) = \delta(k - k_d)$
Erdős-Rényi       $\ln N / \ln \langle k \rangle$  $\langle k \rangle / N$  Poisson
Watts-Strogatz    $\ln N$                          constant                 exponential
Real networks     $\ln N$                          independent of N         $P(k) \sim k^{-\gamma}$

SCALE-FREE MODEL (BA model)

BA MODEL: Growth
Real networks continuously expand by the addition of new nodes.
ER, WS models: the number of nodes, N, is fixed (static models).
Barabási & Albert, Science 286, 509 (1999)

BA MODEL: Growth (Actors/Internet)
[Figures: actor network, number of movies in IMDB; Internet, growth of the Internet routing table.]
Herr II, Bruce W., Ke, Weimao, Hardy, Elisha, and Börner, Katy (2007). Movies and Actors: Mapping the Internet Movie Database. In Conference Proceedings of the 11th Annual Information Visualization International Conference (IV 2007), Zurich, Switzerland, July 4-6.

BA MODEL: Growth (WWW/Pubs)
[Figures: growth of the WWW; growth of scientific publications.]

BA MODEL: Growth
(1) Networks continuously expand by the addition of new nodes.
Add a new node with m links.

BA MODEL: Preferential Attachment
Where will the new node link to?
ER, WS models: choose randomly.
New nodes prefer to link to highly connected nodes (WWW, citations, IMDB).
PREFERENTIAL ATTACHMENT: the probability that a new node connects to a node with k links is proportional to k, $\Pi(k_i) = k_i / \sum_j k_j$.

ORIGIN OF SF NETWORKS: Growth and preferential attachment
(1) Networks continuously expand by the addition of new nodes. WWW: addition of new documents.
    GROWTH: add a new node with m links.
(2) New nodes prefer to link to highly connected nodes. WWW: linking to well-known sites.
    PREFERENTIAL ATTACHMENT: the probability that a new node connects to a node with k links is proportional to k.
Result: $P(k) \sim k^{-3}$.
Barabási & Albert, Science 286, 509 (1999)

MFT - Degree Dynamics
All nodes follow the same growth law:
$\partial k_i / \partial t \propto \Pi(k_i) = A \, k_i / \sum_j k_j$
During a unit time (time step) $\Delta k = m$, hence $A = m$. Using $\sum_j k_j = 2mt$:
$\partial k_i / \partial t = k_i / (2t)$, so $k_i(t) = m \, (t / t_i)^{\beta}$ with $\beta = 1/2$.
$\beta$: dynamical exponent.
A.-L. Barabási, R. Albert and H. Jeong, Physica A 272, 173 (1999)

Fitness Model: Can Latecomers Make It?
SF model: $k(t) \sim t^{1/2}$ (first-mover advantage).
[Figure: degree k versus time.]

MFT - Degree Distribution
A node i can arrive with equal probability at any time between $t_i = m_0$ and t, hence $P(t_i) = 1 / (m_0 + t)$ and
$P(k_i(t) < k) = P(t_i > m^2 t / k^2) = 1 - \frac{m^2 t}{k^2 (t + m_0)}$
$P(k) = \frac{\partial P(k_i(t) < k)}{\partial k} = \frac{2 m^2 t}{m_0 + t} \, \frac{1}{k^3} \sim k^{-\gamma}$, with $\gamma = 3$.
(i) The degree exponent is independent of m.
(ii) As the power law describes systems of rather different ages and sizes, a correct model should provide a time-independent degree distribution. Indeed, asymptotically the degree distribution of the BA model is independent of time (and of the system size N): the network reaches a stationary scale-free state.
(iii) The coefficient of the power-law distribution is proportional to $m^2$.
A.-L. Barabási, R. Albert and H. Jeong, Physica A 272, 173 (1999)

NUMERICAL SIMULATION OF THE BA MODEL
[Figure: m-dependence of P(k) for m = 1, 3, 5, 7. Stationarity: P(k) is independent of N for N = 100,000; 150,000; 200,000. Inset: degree dynamics.]

The mean-field theory offers the correct scaling, BUT it provides the wrong coefficient of the degree distribution. So it is correct asymptotically ($k \to \infty$), but not in the details (particularly for small k). To fix this, we need to calculate P(k) exactly, which we will do next using a rate-equation-based approach.

MFT - Degree Distribution: Rate Equation
$N(k,t) = N \, P(k,t)$: number of nodes with degree k at time t.
Since at each timestep we add one node, we have N = t (total number of nodes = number of timesteps).
Preferential attachment: $\Pi(k) = k / \sum_j k_j = k / (2mt)$. The factor 2m arises because each node adds m links, but each link contributes to the degree of 2 nodes.
Number of links added to degree-k nodes after the arrival of a new node:
$\frac{k}{2mt} \times N P(k,t) \times m = \frac{k}{2} P(k,t)$
(preferential attachment × total number of k-nodes × the m new links that the new node adds to other nodes).
Number of degree-(k-1) nodes that acquire a new link, becoming degree k: $\frac{k-1}{2} P(k-1,t)$.
Number of degree-k nodes that acquire a new link, becoming degree k+1: $\frac{k}{2} P(k,t)$.
Rate equation for k > m:
$(N+1) P(k,t+1) = N P(k,t) + \frac{k-1}{2} P(k-1,t) - \frac{k}{2} P(k,t)$
(# k-nodes at time t+1 = # k-nodes at time t + gain of k-nodes via k-1 → k - loss of k-nodes via k → k+1).
We do not have k = 0, 1, ..., m-1 nodes in the network (each node arrives with degree m), so we need a separate equation for degree-m nodes:
$(N+1) P(m,t+1) = N P(m,t) + 1 - \frac{m}{2} P(m,t)$
(# m-nodes at time t+1 = # m-nodes at time t + one added degree-m node - loss of m-nodes via m → m+1).
We assume that there is a stationary state in the $N = t \to \infty$ limit, where $P(k,\infty) = P(k)$. Then
$P(k) = \frac{k-1}{k+2} P(k-1)$ for k > m, and $P(m) = \frac{2}{m+2}$.
Iterating from m+1, m+2, m+3, ... up to k:
$P(k) = \frac{2m(m+1)}{k(k+1)(k+2)}$, so $P(k) \sim k^{-3}$ for large k.
A.-L. Barabási, R. Albert and H. Jeong, Physica A 272, 173 (1999); Krapivsky, Redner, Leyvraz, PRL 2000; Dorogovtsev, Mendes, Samukhin, PRL 2000; Bollobás et al., Random Struc. Alg.

MFT - Degree Distribution: A Pretty Caveat
Start from the stationary rate equation and take its continuum limit:
$P(k) = -\frac{1}{2} \frac{\partial [k P(k)]}{\partial k}$
Its solution is $P(k) \sim k^{-3}$.
Dorogovtsev and Mendes, 2003

Do we need both growth and preferential attachment?
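One way to explore this question numerically is the minimal sketch below. It grows a network one node at a time and lets each new node pick its m targets either with probability proportional to degree (the BA rule) or uniformly at random. The function names, the seed graph of m+1 fully connected nodes, and the choice of m distinct targets per new node are assumptions made for this illustration, not details taken from the slides. The empirical P(k) can then be compared with the rate-equation result $P(k) = 2m(m+1)/[k(k+1)(k+2)]$.

```python
import random
from collections import Counter


def grow_network(n_nodes, m, preferential=True, seed=0):
    """Grow a network node by node; each new node adds m links.

    preferential=True  -> targets drawn with probability proportional to degree
    preferential=False -> targets drawn uniformly among existing nodes
    Returns the list of node degrees (index = arrival order).
    """
    rng = random.Random(seed)
    # Assumed seed graph: m + 1 fully connected nodes, each of degree m.
    degree = [m] * (m + 1)
    # Each node appears in `stubs` once per link end, so a uniform draw
    # from `stubs` is a degree-proportional draw over nodes.
    stubs = [node for node in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:  # m distinct targets, no multi-links
            if preferential:
                targets.add(rng.choice(stubs))
            else:
                targets.add(rng.randrange(new))
        degree.append(m)
        for target in targets:
            degree[target] += 1
            stubs.append(target)
        stubs.extend([new] * m)
    return degree


def ba_exact_pk(k, m):
    """Stationary degree distribution from the rate equation, for k >= m."""
    return 2.0 * m * (m + 1) / (k * (k + 1) * (k + 2))


def empirical_pk(degree):
    """Fraction of nodes having each observed degree."""
    counts = Counter(degree)
    n = len(degree)
    return {k: c / n for k, c in sorted(counts.items())}


if __name__ == "__main__":
    m, n = 3, 20000
    ba = grow_network(n, m, preferential=True)
    uniform = grow_network(n, m, preferential=False)
    pk_ba = empirical_pk(ba)
    for k in (3, 6, 12, 24):
        print(k, pk_ba.get(k, 0.0), ba_exact_pk(k, m))
    print("largest degree, preferential:", max(ba))
    print("largest degree, uniform:", max(uniform))
```

With preferential attachment the largest degree grows far beyond what uniform attachment produces at the same size, which is the qualitative difference the two limiting models are meant to expose.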
MODEL A
Growth: kept. Preferential attachment: removed.
$\Pi(k_i)$: uniform.

MODEL B
Growth: removed. Preferential attachment: kept.
P(k): power law (initially) → Gaussian → fully connected.

Do we need both growth and preferential attachment?
YEP.

EMPIRICAL DATA FOR REAL NETWORKS
The comparison table (path length, clustering, degree distribution) now gains a Barabási-Albert row next to the regular network, Erdős-Rényi and Watts-Strogatz rows. Its degree distribution is $P(k) \sim k^{-\gamma}$, as in the empirical data; the path-length and clustering entries are discussed next.

DISTANCES IN SCALE-FREE NETWORKS
$\gamma = 2$: $\langle l \rangle \sim$ const. The size of the biggest hub is of order O(N). Most nodes can be connected within two layers of it, thus the average path length will be independent of the system size.
$2 < \gamma < 3$: $\langle l \rangle \sim \ln \ln N$. The average path length increases slower than logarithmically. In a random network all nodes have comparable degree, thus most paths will have comparable length. In a scale-free network the vast majority of the paths go through the few high-degree hubs, reducing the distances between nodes.
$\gamma = 3$: $\langle l \rangle \sim \ln N / \ln \ln N$. Some key models produce $\gamma = 3$, so the result is of particular importance for them. This was first derived by Bollobás and collaborators for the network diameter in the context of a dynamical model, but it holds for the average path length as well.
$\gamma > 3$ (small world): $\langle l \rangle \sim \ln N$. The second moment of the distribution is finite, thus in many ways the network behaves as a random network. Hence the average path length follows the result that we derived for the random network model earlier.
Cohen, Havlin, Phys. Rev. Lett. 90, 058701 (2003); Cohen, Havlin and ben-Avraham, in Handbook of Graphs and Networks, Eds. Bornholdt and Schuster (Wiley-VCH, NY, 2002), Chap. 4. Confirmed also by: Dorogovtsev et al. (2002), Chung and Lu (2002); Bollobás, Riordan, 2002; Bollobás, 1985; Newman, 2001.

PATH LENGTHS IN THE BA MODEL
[Figure: average path length of the BA model versus N.]
Bollobás, Riordan, 2002

CLUSTERING COEFFICIENT OF THE BA MODEL
Reminder: for a random graph we have $C_{rand} = \langle k \rangle / N \sim N^{-1}$.
The numerical results indicate a slightly slower decay. What is the functional form of C(N)?
Konstantin Klemm, Victor M. Eguiluz, Growing scale-free networks with small-world behavior, Phys. Rev. E 65 (2002)

CLUSTERING COEFFICIENT OF THE BA MODEL
Denote the probability to have a link between nodes i and j by P(i,j).
The probability that three nodes i, j, l form a triangle is P(i,j) P(i,l) P(j,l).
The expected number of triangles in which a node l with degree $k_l$ participates is thus
$Nr_l(\triangle) = \int_{i=1}^{N} di \int_{j=1}^{N} dj \; P(i,j) \, P(i,l) \, P(j,l)$
We need to calculate P(i,j).

CLUSTERING COEFFICIENT OF THE BA MODEL
Calculate P(i,j). Node j arrives at time $t_j = j$, and the probability that it will link to node i with degree $k_i$ already in the network is determined by preferential attachment:
$P(i,j) = m \, \Pi(k_i(j)) = m \, \frac{k_i(j)}{\sum_l k_l(j)} = m \, \frac{k_i(j)}{2mj}$
With $k_i(t) = m (t/t_i)^{1/2} = m (j/i)^{1/2}$, where we used that the arrival time of node j is $t_j = j$ and the arrival time of node i is $t_i = i$, this gives
$P(i,j) = \frac{m}{2} (ij)^{-1/2}$
Let us approximate $k_l(N) = m (N/l)^{1/2}$, which is the degree of node l at the current time, t = N. Dividing the number of triangles by $k_l (k_l - 1)/2$ gives
$C \sim \frac{(\ln N)^2}{N}$
There is a factor of two difference... Where does it come from?
Konstantin Klemm, Victor M. Eguiluz, Phys. Rev. E 65 (2002)

The origins of preferential attachment

PREFERENTIAL ATTACHMENT: a brief history of collective amnesia
György Pólya, 1923: Pólya process in the mathematics literature.
George Udny Yule, 1925: the number of species per genus of flowering plants; Yule process in statistics.
Robert Gibrat, 1931: rule of proportional growth (the size and the growth rate of a firm are independent); Gibrat process in economics.
George Kingsley Zipf, 1949: the distribution of wealth in society.
Herbert Alexander Simon, 1955: the distribution of city sizes and other phenomena.
Derek de Solla Price, 1976: used it to explain the citation statistics of scientific publications; "cumulative advantage".
Robert Merton, 1968: Matthew effect. Gospel of Matthew: "For everyone who has will be given more, and he will have an abundance. Whoever does not have, even what he has will be taken from him."
"Not the first, but the last"

CAN WE MEASURE PREFERENTIAL ATTACHMENT?
Plot the change in the degree $\Delta k$ during a fixed time $\Delta t$ for nodes with degree k, and you get $\Pi(k)$.
To reduce noise, plot the integral of $\Pi(k)$ over k: $\kappa(k) = \sum_{K < k} \Pi(K)$.
No pref. attach: $\kappa \sim k$.
Linear pref. attach: $\kappa \sim k^2$.
(Jeong, Néda, A.-L. B., Europhys. Lett. 2003)

CAN WE MEASURE PREFERENTIAL ATTACHMENT?
[Figure: the integral of $\Pi(k)$ over k for the citation network, the Internet, the neuroscience collaboration network and the actor collaboration network.]

MECHANISMS RESPONSIBLE FOR PREFERENTIAL ATTACHMENT
1. Copying mechanism (directed network): select a node and an edge of this node; attach to the endpoint of this edge.
2. Walking on a network (directed network): the new node connects to a node, then to every first, second, ... neighbor of this node.
3. Attaching to edges: select an edge; attach to both endpoints of this edge.
4. Node duplication: duplicate a node with all its edges; randomly prune edges of the new node.

Copying Mechanism

ORIGIN OF THE SCALE-FREE TOPOLOGY IN THE CELL: Gene Duplication
Proteins with more interactions are more likely to obtain new links: $\Pi(k) \sim k$ (preferential attachment).
Wagner 2001; Vazquez et al. 2003; Sole et al. 2001; Rzhetsky & Gomez 2001; Qian et al. 2001; Bhan et al.

PREFERENTIAL ATTACHMENT IN PROTEIN INTERACTION NETWORKS
k vs. $\Delta k$, where $\Delta k$ is the increase in the number of links in a unit time.
No PA: $\Delta k$ is independent of k.
PA: $\Delta k \sim k$.
Eisenberg E, Levanon EY, Phys. Rev. Lett.; Jeong, Néda, A.-L. B., Europhys. Lett.

SUMMARY: PROPERTIES OF THE BA MODEL
Nr. of nodes: $N = t$
Nr. of links: $L = mt$
Average degree: $\langle k \rangle = 2m$
Degree dynamics: $k_i(t) = m (t/t_i)^{\beta}$, with $\beta = 1/2$ the dynamical exponent
Degree distribution: $P(k) \sim k^{-\gamma}$, with $\gamma = 3$ the degree exponent
Average path length: $\langle l \rangle \sim \ln N / \ln \ln N$
Clustering coefficient: $C \sim (\ln N)^2 / N$
The network grows, but the degree distribution is stationary.

DEGREE EXPONENTS
[Figure: degree exponents of real networks (WWW in, WWW out, Internet, actor, collaboration, metabolic, citation, synonyms, sexual contacts) and of the BA model on the $\gamma$ axis, with $\gamma = 1, 2, 3$ marked and the regions where the moments of the degree distribution diverge or are finite.]
Can we change the degree exponent?

Evolving network models
The BA model is only a minimal model.
It makes the simplest assumptions:
    linear growth
    linear preferential attachment
It does not capture:
    variations in the shape of the degree distribution
    variations in the degree exponent
    the size-independent clustering coefficient
Hypothesis: the BA model can be adapted to describe most features of real networks. We need to incorporate mechanisms that are known to take place in real networks: addition of links without new nodes, link rewiring, link removal, node removal, constraints or optimization.

BA ALGORITHM WITH DIRECTED EDGES
(the simplest way to change the degree exponent)
Undirected BA network: $\Pi(k_i) = k_i / \sum_j k_j$; $\beta = 1/2$, $\gamma = 3$.
Directed BA network: $\Pi(k_{in})$ depends on the in-degree only; $\beta = 1$ (dynamical exponent), $\gamma_{in} = 2$ (degree exponent); $P(k_{out}) = \delta(k_{out} - m)$.
A.-L. Barabási, R. Albert and H. Jeong, Physica A 272, 173 (1999)

EXTENDED MODEL: Other ways to change the exponent
prob. p: internal links
prob. q: link deletion
prob. 1-p-q: add node
$P(k) \sim (k + \kappa(p,q,m))^{-\gamma(p,q,m)}$, with $\gamma \in [1, \infty)$.

EXTENDED MODEL: Small-k cutoff
$P(k) \sim (k + \kappa(p,q,m))^{-\gamma(p,q,m)}$, with $\gamma \in [1, \infty)$.
Actor network: p = 0.937, m = 1, $\gamma$ = 3.07.
The model predicts a small-k cutoff: a correct model should predict all aspects of the degree distribution, not only the degree exponent.
The degree exponent is a continuous function of p, q, m.

More models

NONLINEAR PREFERENTIAL ATTACHMENT
Non-linear preferential attachment: $\Pi(k) \sim k^{\alpha}$.
P(k) does not follow a power law for $\alpha \neq 1$.
$\alpha < 1$: stretched exponential.
$\alpha > 1$: no scaling ($\alpha > 2$: gelation).
P. Krapivsky, S. Redner, F. Leyvraz, Phys. Rev. Lett. 85, 4629 (2000)

INITIAL ATTRACTIVENESS
BA model: k = 0 nodes cannot acquire links, as $\Pi(k=0) = 0$ (the probability that a new node will attach to them is zero).
Initial attractiveness: $\Pi(k) \sim A + k$, where A is the initial attractiveness.
Initial attractiveness shifts the degree exponent: $\gamma_{in} = 2 + A/m$.
Note: the parameter A can be measured from real data, being the rate at which k = 0 nodes acquire links, i.e. $\Pi(k=0) = A$.
Dorogovtsev, Mendes, Samukhin, Phys. Rev. Lett. 85, 4633 (2000)

GROWTH CONSTRAINTS AND AGING CAUSE CUTOFFS
Finite lifetime to acquire new edges: L. A. N. Amaral et al., PNAS 97 (2000).
Gradual aging: S. N. Dorogovtsev and J. F. F. Mendes, Phys. Rev. E 62, 1842 (2000).

THE LAST PROBLEM: HIGH, SYSTEM-SIZE INDEPENDENT C(N)
In the comparison table (regular network, Erdős-Rényi, Watts-Strogatz, Barabási-Albert versus empirical data), the clustering column is the remaining mismatch.

A MODEL WITH HIGH CLUSTERING COEFFICIENT
Each node of the network can be either active or inactive. There are m active nodes in the network at any moment.
1. Start with m active, completely connected nodes.
2. At each timestep add a new (active) node that connects to the m active nodes.
3. Deactivate one active node, chosen with probability proportional to $(a + k_j)^{-1}$.
$C \to C^*$ when $N \to \infty$.
K. Klemm and V. Eguiluz, Phys. Rev. E 65 (2002)

Fitness Model

FITNESS MODEL: Can Latecomers Make It?
SF model: $k(t) \sim t^{1/2}$ (first-mover advantage).
Fitness model: each node has a fitness $\eta$, $\Pi(k_i) \sim \eta_i k_i$, and $k(\eta, t) \sim t^{\beta(\eta)}$ with $\beta(\eta) = \eta / C$.
[Figure: degree k versus time.]
Bianconi & Barabási, Physical Review Letters 2001; Europhys. Lett.

MAPPING TO A QUANTUM GAS
Network                                 Bose gas
Fitness $\eta$                          Energy level $\varepsilon$
New node with fitness $\eta$            New energy level $\varepsilon$
Link pointing to node $\eta$            Particle at level $\varepsilon$
Network                                 Quantum gas
G. Bianconi and A.-L. Barabási, Physical Review Letters 2001

BOSE-EINSTEIN CONDENSATION
$f(\varepsilon) = e^{-\beta(\varepsilon - \mu)}$
The dynamic exponent $f(\varepsilon)$ depends on $\mu$, determined by the self-consistent equation:
$\int d\varepsilon \, g(\varepsilon) \, \frac{1}{e^{\beta(\varepsilon - \mu)} - 1} = 1$
[Figure: degree k versus time.]
Bianconi & Barabási, Physical Review Letters 2001; Europhys. Lett.

FITNESS MODEL: Can Latecomers Make It?
[Figure: the two phases of the fitness model, fit-gets-rich and Bose-Einstein condensation.]
Bianconi & Barabási, Physical Review Letters 2001; Europhys. Lett.

LESSONS LEARNED: evolving network models
1. There is no universal exponent characterizing all networks.
2. Growth and preferential attachment are responsible for the emergence of the scale-free property.
3. The origin of preferential attachment is system-dependent.
4. Modeling real networks:
    identify the microscopic processes that take place in the system
    measure their frequency from real data
    develop dynamical models that capture these processes.
5. If the model is correct, it should correctly predict not only the degree exponent, but both the small-k and large-k cutoffs.

LESSONS LEARNED: evolving network models
Philosophical change in network modeling:
ER, WS models are static models: the role of the network modeler is to cleverly place the links between a fixed number of nodes so that the network topology mimics the networks seen in real systems.
BA and evolving network models are dynamical models: they aim to reproduce how the network was built and evolved. Thus their goal is to capture the network dynamics, not the structure. As a byproduct, you get the topology right.

5 slides
Discuss:
    What are your nodes and links?
    How will you collect the data?
    Expected size of the network (number of nodes, links).
    What questions do you plan to ask? (They may change as we move along with the class.)
    Why do we care about the network you plan to study?

The end