Cartography of complex networks: From organizations to the metabolism
Roger Guimerà

Slide 1
Cartography of complex networks: From organizations to the metabolism
Roger Guimerà
Department of Chemical and Biological Engineering, Northwestern University
Oxford, June 19, 2006

Slide 2
From a linear world...
[Figure: food chains and a food tree; node labels: Predator, Consumer, Resource]

Slide 3
...to the real world
The Biosphere2 project

Slide 4
Trophic interactions in the North Atlantic fishery: a real food web

Slide 5
The email network of a real organization
Guimerà, Danon, Díaz-Guilera, Giralt, Arenas, PRE (2002)

Slide 6
The worldwide air transportation network: a real socio-economic network
Guimerà, Mossa, Turtschi, Amaral, PNAS (2005)

Slide 7
The protein interactome of yeast: a real biochemical network
Jeong, Mason, Barabási, Oltvai, Nature (2001)

Slide 8
Summary
  • What is (was) missing in the analysis of complex systems?
  • Cartography of complex networks:
      Modules in complex networks
      Roles in complex networks
  • Can we discover new therapeutic drugs by analyzing complex networks?

Slide 9
Let's assume that... proteins/people interact at random with other proteins/people

Slide 10
Let's assume that... individuals live in a square lattice!!

Slide 11
Nodes in real networks are (often) close to each other

Slide 12
Nodes in real networks (often) have structured neighborhoods

Slide 13
Real networks are (often) highly inhomogeneous

Slide 14
Real networks are (often) modular

Slide 15
What can we learn by studying the interaction network topology?

Slide 16
Extracting information from complex networks
Protein interactions in fruit fly
Giot et al., Science (2003)

Slide 17
We need a cartography of complex networks
  • Modules: one divides the system into regions
  • Roles: one highlights important players

Slide 18
Heuristic methods to identify modules in complex networks: Girvan-Newman algorithm
Girvan & Newman, PNAS (2002)
1. Identify the most central edge in the network.
2. Remove the most central edge in the network.
3. Iterate the process.
[Figure: example network with nodes labeled A to I]

Slide 19
The Girvan-Newman algorithm for module detection is remarkably effective

Slide 20
The community tree of a real organization

Slide 21
Shortcomings of the GN algorithm
  • It is very slow: O(N^3)
  • One needs to decide where to stop the process
  • It does not work that well when the modular structure becomes fuzzy

Slide 22
We define a quantitative measure of modularity
Newman & Girvan, PRE (2003)
Intuitively: high modularity = many links within modules and few links between modules
[Figure: a low-modularity and a high-modularity partition]

Slide 23
We define a quantitative measure of modularity
Newman & Girvan, PRE (2003); Guimerà, Sales-Pardo, Amaral, PRE (2004)
  f_s: fraction of links within module s
  F_s: expected fraction of links within module s, for a random partition of the nodes
Modularity of a partition:
  M = \sum_s (f_s - F_s)

Slide 24
We define a quantitative measure of modularity
Modularity of a partition:
  M = \sum_s \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^2 \right]
where:
  l_s is the number of links within module s
  d_s is the sum of the degrees of the nodes in module s
  L is the total number of links in the network

Slide 25
But now that we have modularity, we can try optimization-based approaches
Brute force: Find all possible partitions of the network, calculate their modularity, and keep the partition with the highest modularity.
Uphill search:
1. Start from a random partition of the network.
2. Try to randomly move a node from one module to another. Does the modularity increase?
     Yes: Accept the movement.
     No: Reject the movement.
3. Repeat from 2.

Slide 26
Uphill search does not give the best possible partition

Slide 27
We use simulated annealing to obtain the partition with largest modularity
Guimerà & Amaral, Nature (2005)
Simulated annealing:
1. Start from a random partition of the network.
2. Define a computational temperature T. Set T to a high value.
3. Try to randomly move a node from one module to another. Does the modularity increase?
     Yes: Accept the movement.
     No: Is the decrease in modularity much larger than T?
          Yes: Reject the movement.
          No: Sometimes accept the movement.
4. Decrease T and repeat from 3.

Slide 28
Simulated Annealing
We use simulated annealing to obtain the partition with largest modularity

Slide 29
The new algorithm for module detection outperforms previous algorithms

Slide 30
As we already knew, geo-political factors determine the modular structure of the air transportation network
Guimerà, Mossa, Turtschi, Amaral, PNAS (2005)

Slide 31
Now we need to identify the role of each node

Slide 32
Previous approaches to role identification: Structural equivalence
Definition: Two nodes i and j are structurally equivalent if, for all actors k = 1, 2, ..., g (k ≠ i, j), and all relations r = 1, 2, ..., R, actor i has a tie to k if and only if j also has a tie to k, and i has a tie from k if and only if j also has a tie from k. (Wasserman & Faust)
Translation: Two nodes are structurally equivalent if they have the exact same connections.

Slide 33
Previous approaches to role identification: Regular equivalence
Definition: If actors i and j are regularly equivalent, and actor i has a tie to/from some actor, k, then actor j must have the same kind of tie to/from some actor, m, and k and m must be regularly equivalent. (Wasserman & Faust)
Translation: Two nodes are regularly equivalent if they have identical connections to equivalent nodes.
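
As a rough illustration of the modularity measure (Slide 24) and the simulated-annealing search (Slide 27), here is a minimal Python sketch. It is not the authors' implementation. The function names, the cooling schedule, the parameter values, and the Metropolis acceptance rule exp(ΔM / T) used for "sometimes accept" are all assumptions made for this example, and only single-node moves are attempted.

```python
import math
import random


def modularity(edges, module_of):
    """M = sum_s [ l_s / L - (d_s / (2L))^2 ] for an undirected, unweighted network."""
    L = len(edges)
    links_within = {}  # l_s: number of links inside module s
    degree_sum = {}    # d_s: sum of the degrees of the nodes in module s
    for u, v in edges:
        su, sv = module_of[u], module_of[v]
        degree_sum[su] = degree_sum.get(su, 0) + 1
        degree_sum[sv] = degree_sum.get(sv, 0) + 1
        if su == sv:
            links_within[su] = links_within.get(su, 0) + 1
    return sum(links_within.get(s, 0) / L - (d / (2 * L)) ** 2
               for s, d in degree_sum.items())


def anneal(edges, nodes, t_start=0.1, t_end=1e-4, cooling=0.95,
           moves_per_t=200, seed=0):
    """Single-node-move simulated annealing (hypothetical parameters)."""
    rng = random.Random(seed)
    labels = list(range(len(nodes)))
    # Step 1: start from a random partition.
    module_of = {n: rng.choice(labels) for n in nodes}
    current = modularity(edges, module_of)
    best, best_partition = current, dict(module_of)
    # Step 2: set the computational temperature to a high value.
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            # Step 3: try to move a random node to a random module.
            node = rng.choice(nodes)
            old = module_of[node]
            module_of[node] = rng.choice(labels)
            candidate = modularity(edges, module_of)
            delta = candidate - current
            # Always accept increases; accept decreases with probability
            # exp(delta / t), which is tiny when the decrease is >> t.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current = candidate
                if current > best:
                    best, best_partition = current, dict(module_of)
            else:
                module_of[node] = old  # reject: undo the move
        # Step 4: decrease T and repeat.
        t *= cooling
    return best, best_partition


# Toy network: two 4-node cliques joined by a single link.
clique_a = [(i, j) for i in range(4) for j in range(i + 1, 4)]
clique_b = [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
toy_edges = clique_a + clique_b + [(3, 4)]
toy_nodes = list(range(8))
best_m, best_partition = anneal(toy_edges, toy_nodes)
print(best_m, best_partition)
```

Recomputing M from scratch after every move keeps the sketch short but costs one pass over all links per attempted move; a practical implementation would update l_s and d_s incrementally.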

Slide 34
We define the within-module degree
Within-module relative degree:
  z_i = \frac{\kappa_i - \bar{\kappa}_{s_i}}{\sigma_{\kappa_{s_i}}}
where:
  \kappa_i: number of links of node i inside its own module
  \bar{\kappa}_{s_i}, \sigma_{\kappa_{s_i}}: mean and standard deviation of \kappa over the nodes in the module s_i of node i

Slide 35
We define the participation coefficient
Participation coefficient:
  P_i = 1 - \sum_s f_{is}^2
where:
  f_{is}: fraction of links of node i in module s

Slide 36
The within-module degree and the participation coefficient define the role of each node

Slide 37
We define seven different roles
[Figure labels: Hubs, Non-hubs; Ultra-peripheral, Peripheral, Satellite connector, Provincial hub, Global hub]

Slide 38
Our definition of roles enables us to identify important cities

Slide 39
How does network cartography help us understand the metabolism?
Metabolic network of E. coli

Slide 40
The cartographic representation of the metabolic network of E. coli
Guimerà & Amaral, Nature (2005)
[Figure labels: Satellite, Global]

Slide 41
Satellite connectors are more conserved across species than provincial hubs
Comparison between 12 organisms: 4 archaea, 4 bacteria, 4 eukaryotes
[Figure categories: Ultra-peripheral, Peripheral, Satellite connectors, Provincial hubs, Global hubs]

Slide 42
Fluxes involving satellite connectors are essential
Guimerà, Sales-Pardo, Amaral, submitted (2006)

Slide 43
Questions for us to think about
  • Can we design better organizations / transportation systems / ... by using these new tools?
  • What can we learn from organizations / ... that could help us design better drugs?
  • How are topology, dynamics, and function related?

Slide 44
Acknowledgements
Luís A. N. Amaral, Marta Sales-Pardo
Fulbright Commission and Spanish Ministry of Education, Culture, and Sports.
More information: http://amaral.northwestern.edu/ http://amaral.northwestern.edu/roger/

Slide 45
What happens if the modular structure of the network is hierarchically organized?
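
The two role coordinates from Slides 34 and 35 can be computed directly from a link list and a module assignment. The sketch below is only illustrative: the function name `role_coordinates` is hypothetical, the population standard deviation is an assumed choice for the z-score, and z is set to 0 when all nodes in a module have the same within-module degree.

```python
from statistics import mean, pstdev


def role_coordinates(edges, module_of):
    """Return (z, P): within-module relative degree and participation coefficient."""
    neighbours = {n: set() for n in module_of}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # kappa_i: number of links of node i inside its own module.
    kappa = {n: sum(1 for m in neighbours[n] if module_of[m] == module_of[n])
             for n in module_of}

    # Collect the kappa values of each module to get its mean and spread.
    kappas_in = {}
    for n, s in module_of.items():
        kappas_in.setdefault(s, []).append(kappa[n])

    z, P = {}, {}
    for n, s in module_of.items():
        mu = mean(kappas_in[s])
        sigma = pstdev(kappas_in[s])  # assumption: population standard deviation
        z[n] = (kappa[n] - mu) / sigma if sigma > 0 else 0.0

        # f_is: fraction of the links of node i that go to module s.
        k = len(neighbours[n])
        links_to = {}
        for m in neighbours[n]:
            links_to[module_of[m]] = links_to.get(module_of[m], 0) + 1
        P[n] = 1 - sum((c / k) ** 2 for c in links_to.values()) if k else 0.0
    return z, P


# Toy network: two 4-node cliques joined by the link (3, 4).
edges = ([(i, j) for i in range(4) for j in range(i + 1, 4)]
         + [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
         + [(3, 4)])
modules = {n: (0 if n < 4 else 1) for n in range(8)}
z, P = role_coordinates(edges, modules)
print(P[0], P[3])  # node 0 has all links inside its module; node 3 has one link outside
```

A node whose links all stay inside its module gets P = 0, and P approaches 1 as its links spread evenly over many modules.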

Slide 46
To determine the hierarchical modular structure of the network, we sample the whole modularity landscape
Sales-Pardo, Guimerà, Moreira, Amaral, submitted (2006)

Slide 47
We are able to identify the modules at each of the hierarchical levels
Sales-Pardo, Guimerà, Moreira, Amaral, submitted (2006)
[Figure axis label: Nodes]

Slide 48
We are able to identify the modules at each of the hierarchical levels
Sales-Pardo, Guimerà, Moreira, Amaral, submitted (2006)