Evolutionary Clues Embedded In Network Structure, EPJB 85, 106 (2012). Zhu Guimei, NGS Graduate School for Integrative Science & Engineering, Centre for Computational Sciences & Engineering, National University of Singapore


Page 1: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Evolutionary Clues Embedded In Network Structure

——EPJB,85,106(2012)

Zhu Guimei

NGS Graduate School for Integrative Science & Engineering,Centre for Computational Sciences & Engineering,

National University of Singapore


Page 2: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Outline

• Introduction

• Localizations on complex networks

• Evolutionary ages

• Conclusions

Page 3: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Objective and Scope

Detecting structural patterns at different scales: multi-scale structure.

Finding interesting network evolution mechanisms based on the multi-scale structure of networks.

Page 4: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Complex Networks

Figure panels (A, B, C): protein networks, Internet networks, scientific collaboration networks.

Real-World Networks

• Communication networks: telephone, internet, WWW…
• Transportation networks: airports, highways, rail, electric power…
• Biological networks: genetic, protein-protein interaction, metabolic…
• Social networks: friendship networks, collaboration networks…

Page 5: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Structure, Functions, Dynamics

• Structure: degree, motif, modularity…
• Dynamics: mass, energy, signal, information…
• Function: ?

Dynamic processes at different structure scales

Structural measures are therefore the cornerstone for understanding the relations between structure, dynamics, and function.

Page 6: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

How to measure multi-scale structure?

• Microscopic: degree, motif, clustering coefficient…
• Meso (intermediate): what is a meso pattern?
• Macroscopic: modules

Dynamics on different structure scales: ?

Page 7: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Define the Mesoscopic pattern

In physics, "mesoscopic" is well defined: in condensed matter physics it refers to materials with an intermediate length scale, with sizes between those of molecules and microns.

But in complex networks it is not yet well defined.

We detect different structural patterns through a localization method.

Page 8: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

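Detect structure through localization: how?

We map networks to large clusters (nodes as atoms; edges as bonds).

Consider an undirected complex network with N identical nodes. Its topological structure can be described by an adjacency matrix $A_{ij}$ (or a Laplace matrix $L_{ij}$).

Adjacency matrix: $A_{ij}$ is 1 if nodes i and j are connected and 0 otherwise. All diagonal elements are 0.

Laplace matrix:

$$L_{ij} = \begin{cases} k_i = \sum_{j \neq i}^{N} A_{ij}, & i = j \\ -A_{ij}, & i \neq j \end{cases}$$

For an electron moving in such a molecule, the tight-binding Hamiltonian is:

$$\hat{H} = \sum_{n=1}^{N} \varepsilon_n |n\rangle\langle n| + \sum_{m \neq n}^{N} t_{mn} |m\rangle\langle n|$$

Hückel model: $\varepsilon_i = \varepsilon_0 = 0$ $(i = 1, 2, 3, \dots, N)$ and $t_{mn} = 1$ for connected nodes m and n (0 otherwise), so that $\hat{H} = A$.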
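The two matrices defined on this slide can be built in a few lines. The following is only a minimal sketch, assuming NumPy and a small hypothetical 4-node graph (the edge list and the helper names `adjacency_from_edges` and `laplacian` are illustrative, not taken from the talk).

```python
import numpy as np

def adjacency_from_edges(n_nodes, edges):
    """Symmetric 0/1 adjacency matrix with a zero diagonal."""
    A = np.zeros((n_nodes, n_nodes), dtype=float)
    for i, j in edges:
        if i != j:                      # self-loops are ignored
            A[i, j] = A[j, i] = 1.0
    return A

def laplacian(A):
    """L_ij = k_i * delta_ij - A_ij, with degree k_i = sum_j A_ij."""
    return np.diag(A.sum(axis=1)) - A

# Hypothetical example: a triangle (0, 1, 2) with a pendant node 3.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
A = adjacency_from_edges(4, edges)
L = laplacian(A)

# In the Hueckel mapping (zero on-site energy, unit hopping on edges)
# the Hamiltonian matrix is simply the adjacency matrix.
H = A.copy()
```

Every row of `L` sums to zero, which is why the smallest Laplacian eigenvalue is 0.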

Page 9: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

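Emergence of different scale structures on complex networks

• Dynamics on networks: diffusive processes. Transport processes on networks run from micro- to macro-scales.
• Structures of networks: from the micro-scale (motif) to the macro-scale (module).
• Different eigenvalues represent different structural patterns.

1. Laplace matrix: diffusive process.

2. Tight-binding Hamiltonian (Hückel model with $\hat{H} = L$):

$$\hat{H} = \sum_{n=1}^{N} \varepsilon_n |n\rangle\langle n| + \sum_{m \neq n}^{N} t_{mn} |m\rangle\langle n|$$

with $\varepsilon_i = k_i$ $(i = 1, 2, 3, \dots, N)$ and $t_{mn} = -1$ for connected nodes m and n (0 otherwise).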

Page 10: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

How to describe Localizations on complex networks?

The localization properties of electrons in the clusters can be used as measures of the structural properties of the networks.

We detect different structural patterns from the spectra of complex networks.

The eigenvalues of L can be ranked as $\lambda_N \ge \dots \ge \lambda_2 \ge \lambda_1 = 0$. They correspond to the eigenfunctions from high to low energies.

Page 11: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Different eigenvalues represent different structure-scale patterns: how?

In eigenspace (for complex networks), each eigenstate represents a specific wave function, and it is sensitive to the structural patterns that match its wavelength in size.

Eigenvectors associated with small eigenvalues usually have long wavelengths, so they are sensitive to perturbations applied to a large set of nodes in the network.

Eigenvectors associated with large eigenvalues have short wavelengths and are most sensitive to localized perturbations applied to a small set of nodes in the network.

Hence, the eigenvalues from $\lambda_2$ to $\lambda_N$ can detect the structural patterns from macro- to micro-scales.

Page 12: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Eigenstates are sensitive to structural patterns that match their wavelength in size

(a) The eigenstates on a perfect regular network are periodic waves with wavelengths from $N$ to 2.

(b) We construct a local deformation in the segment from the 40th to the 60th node by adding edges. The eigenstates with large eigenvalues localize mainly in this region (local peak).
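The localization effect described in panel (b) can be reproduced numerically. The sketch below is an illustration under assumptions that are not in the slide: a ring of N = 100 nodes with nearest-neighbor edges, 0-based node indices, and a deformation made by adding second-neighbor edges inside the segment 40 to 60. It then measures how much of the top eigenvector's weight falls inside that segment.

```python
import numpy as np

def ring_laplacian(n, extra_edges=()):
    """Laplacian of a nearest-neighbor ring, plus optional extra edges."""
    A = np.zeros((n, n))
    for i in range(n):
        j = (i + 1) % n
        A[i, j] = A[j, i] = 1.0
    for i, j in extra_edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

n = 100
# Assumed deformation: second-neighbor edges inside nodes 40..60.
extra = [(i, i + 2) for i in range(40, 59)]
L_def = ring_laplacian(n, extra)

# eigh returns eigenvalues in ascending order, so column -1 is the
# eigenvector of the largest eigenvalue (highest energy).
eigvals, eigvecs = np.linalg.eigh(L_def)
top = eigvecs[:, -1]

# Fraction of the (normalized) eigenvector weight inside the segment.
weight_in_segment = float(np.sum(top[40:61] ** 2))
```

On the undeformed ring the largest Laplacian eigenvalue is 4, so any eigenvalue above 4 in `eigvals` belongs to a state created by the added edges.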

Page 13: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Methods to detect multi-scale structures

The components of every eigenvector of L: $X_i$, $i = 1, 2, \dots, N$.

Standardized components of the eigenvectors:

$$X_i^s = \frac{X_i}{\max(X_i)}, \quad i = 1, 2, \dots, N$$

where $\max(X_i)$ is the value of the largest component.

For each scale structure, the components of the nodes involved in it are distinguishably large compared with the others. Hence, the $X^s$-based results are robust.

The nodes with large values of the standardized components ($X^s$) are regarded as the nodes involved in the corresponding scale structure. A threshold can then be used to identify the nodes involved in each scale structure.
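The standardize-then-threshold step can be sketched as follows. This is a minimal illustration, not the authors' code: taking absolute values (because the sign of an eigenvector is arbitrary) and the threshold value 0.5 are both assumptions, and the function name is hypothetical.

```python
import numpy as np

def nodes_in_scale_structure(eigvec, threshold=0.5):
    """Return indices of nodes whose standardized component is large.

    Components are divided by the largest component, so the
    standardized values lie in [0, 1]. Absolute values are used here
    as an assumption, since an eigenvector's overall sign is arbitrary.
    """
    x = np.abs(np.asarray(eigvec, dtype=float))
    x_s = x / x.max()
    return np.flatnonzero(x_s >= threshold), x_s

# Hypothetical eigenvector that is concentrated on node 1.
example_vec = [0.05, -0.9, 0.4, 0.1]
members, standardized = nodes_in_scale_structure(example_vec)
```

Lowering the threshold enlarges the detected node set, so in practice the result should be checked for stability over a range of thresholds.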

Page 14: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

The Santa Fe collaboration network (part)

We consider a part of the largest component of the Santa Fe Institute collaboration network, N = 76.

• The largest eigenvalues $\lambda_{76}$, $\lambda_{75}$, $\lambda_{74}$ detect the three hubs 40, 7 and 67 (red).
• $\lambda_{73}$ involves a group of nodes numbered 17 ~ 25 (green).
• The next eigenvalues shown ($\lambda_{72}$, $\lambda_{70}$, $\lambda_{69}$, $\lambda_{68}$) pick out further groups: nodes 26 ~ 29 and 34 (cyan), 41 ~ 47 (blue), 1 ~ 6 (magenta), and 48 ~ 53 (violet).

As the eigenvalue decreases, clusters at much larger scales can be identified (not shown).

Page 15: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Eigenmodes and Average Evolutionary Age: BA scale-free network

Three scale-free networks with edge density w = 2, 4, 8: (a–c) average evolutionary ages, (d–f) average degree (on a logarithmic scale), (g–i) size of eigenvector versus the eigenvalue index i.

Eigenvectors associated with large eigenvalues generally have small sizes, but their nodes are "older" in the network.
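The relation between eigenmodes and node age can be explored on a small synthetic network. The sketch below rests on several assumptions that are not in the slides: a hand-written preferential-attachment generator, node age defined as N minus arrival index, a threshold of 0.5 on standardized absolute components to decide which nodes belong to a mode, and the network size N = 300.

```python
import numpy as np

def barabasi_albert(n, w, seed=0):
    """Adjacency matrix of a preferential-attachment network.

    Starts from w + 1 fully connected seed nodes; each new node adds
    w edges to distinct existing nodes chosen proportionally to degree.
    """
    rng = np.random.default_rng(seed)
    A = np.zeros((n, n))
    stubs = []                      # each node appears once per edge end
    for i in range(w + 1):
        for j in range(i + 1, w + 1):
            A[i, j] = A[j, i] = 1.0
            stubs += [i, j]
    for new in range(w + 1, n):
        targets = set()
        while len(targets) < w:     # no double edges
            targets.add(stubs[int(rng.integers(len(stubs)))])
        for t in targets:
            A[new, t] = A[t, new] = 1.0
            stubs += [new, t]
    return A

def average_age_per_mode(A, threshold=0.5):
    """Mean age and size of the node set selected by each eigenvector."""
    n = A.shape[0]
    ages = n - np.arange(n)         # node 0 arrived first, so it is oldest
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)
    mean_age = np.empty(n)
    size = np.empty(n, dtype=int)
    for i in range(n):
        x = np.abs(eigvecs[:, i])
        members = x / x.max() >= threshold
        mean_age[i] = ages[members].mean()
        size[i] = members.sum()
    return eigvals, mean_age, size

A_ba = barabasi_albert(300, 2, seed=0)
eigvals, mean_age, size = average_age_per_mode(A_ba)
```

Plotting `mean_age` and `size` against the eigenvalue index gives curves that can be compared qualitatively with panels (a–c) and (g–i).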

Page 16: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Eigenmodes and Average Evolutionary Age: BA scale-free network

Eigenvalue compared with degree as a descriptor of the average evolutionary age

Page 17: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Eigenmodes and Average Evolutionary Age: scale-free networks generated by another mechanism

Scale-free networks generated by a duplication/divergence-based mechanism from the PPI network of baker's yeast.

(d) Average age versus degree. Because of large fluctuations, the degree cannot give age-related information, but the eigenvalues can.

Page 18: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Y11k: PPI network: Evolutionary Age

Yeast 11k network: the original data set contains 80 000 interactions among 5400 yeast proteins. We focus on the 11 855 interactions with high and medium confidence among 2617 proteins, and finally consider only the largest component, with 2235 of the 2617 proteins.

Protein-protein interaction networks: isotemporal classification of proteins

First, all yeast proteins are classified into four isotemporal categories: prokaryotes, eukarya, fungi, and yeast only (the yeast proteins without annotation).

Based on the universal tree of life, we assign evolutionary ages 4, 3, 2, 1 from ancient to modern: prokaryotes 4, eukarya 3, fungi 2, and yeast only 1.

(1) C. von Mering et al., Nature 417, 399 (2002).
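The age assignment amounts to a lookup followed by an average. A minimal sketch, assuming the ages run 4, 3, 2, 1 from ancient to modern over the four categories named on this slide; the function name and the example category list are hypothetical.

```python
# Evolutionary age per isotemporal category (ancient = 4, modern = 1).
AGE_BY_CATEGORY = {
    "prokaryotes": 4,
    "eukarya": 3,
    "fungi": 2,
    "yeast only": 1,
}

def average_age(categories):
    """Average evolutionary age of a group of proteins.

    `categories` holds one isotemporal category label per protein.
    """
    ages = [AGE_BY_CATEGORY[c] for c in categories]
    return sum(ages) / len(ages)

# Hypothetical group of four proteins selected by one eigenvector.
group = ["prokaryotes", "prokaryotes", "eukarya", "yeast only"]
group_age = average_age(group)
```

The same function can be applied to the node set picked out by each eigenvector to obtain an average age per eigenmode.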

Page 19: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Eigenmodes and Average Evolutionary Age: PPI network

Results for the largest connected component of the PPI network of baker's yeast, with 2235 nodes.

(d) Average age versus degree. We see that degree does not reveal age-related information.

Page 20: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Summary

The localization properties of the eigenvectors from high to low energies can detect patterns from micro- to macro-scales.

Interestingly, the patterns contain significant clues of evolutionary ages.

Page 21: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

References

(1) G.M. Zhu, H.J. Yang, R. Yang, J. Ren, B. Li, and Y.-C. Lai, Eur. Phys. J. B 85, 106 (2012).

(2) G.M. Zhu, H.J. Yang, C. Yin, and B. Li, Localizations on Complex Networks, Phys. Rev. E 77, 066113 (2008).

(3) H.J. Yang, C. Yin, G.M. Zhu, and B. Li, Phys. Rev. E 77, 045101(R) (2008).


Page 24: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Complex Networks: Nontrivial Properties

A: random; small-world; scale-free (power-law degree distribution)
B: motif, modularity, hierarchy
C: fractal properties, and so on

Figure panels:

• ER random network, N = 100, link connection probability p = 0.02
• SW network, link rewiring probability r = 0.1
• BA scale-free network, N = 100, average degree w = 2
• Hierarchical networks
• Cauliflower is fractal in nature: self-similarity

Santo Fortunato, Physics Reports 486 (2010) 75–174

Page 25: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Complex Networks: Basic Concepts

Structure Description

Hierarchical Description: Module Function

Page 26: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Structure Multi-Scale Measures

• Graph theory: degree, clustering coefficient, shortest path, small-world, scale-free
• Bioinformatics: motif. What is more?
• Social nets: community, hierarchy, clustering
• Micro: node/edge-based, average
• Macro: global (Newman)
• Dynamics: micro to macro

Dynamic processes are the bridge between structure and functions.

R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002); M. Newman, SIAM Review 45, 167-256 (2003); C. Song et al., Nature 433, 392 (2005); Nature Physics 2, 275 (2006).

Page 27: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

What is a Mesoscopic pattern?

In physics, "mesoscopic" is well defined: in condensed matter physics it refers to materials with an intermediate length scale, with sizes between a quantity of atoms (such as molecules) and materials measuring microns.

In complex networks it is not yet well defined. One could regard it as a community (but there are also other formations, such as tree or star structures).

We define it as intermediate-length-scale structures based on structure-induced localization.

Page 28: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

What is a community on complex networks?

Community (clusters, modules): groups of vertices characterized by having more internal than external connections. Vertices in a community share common properties and/or play similar roles within the graph.

Community detection methods: graph partitioning, hierarchical clustering, partitional clustering, spectral clustering.

Community detection is a hot topic, but even the definition of a community is controversial. People are still improving the methods to detect the true communities in the real world.

Fortunato, S., and C. Castellano, 2009, (Springer, Berlin, Germany), volume 1, eprint arXiv:0712.2716.

Santo Fortunato, Physics Reports 486 (2010) 75–174

Page 29: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)


Y11k: PPI network: multi-scale analysis

Page 30: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

How to detect communities? Several methods

Traditional methods

• Graph partitioning: dividing the vertices into g groups of predefined size.
• Hierarchical clustering: based on the definition of a similarity measure between vertices.
• Partitional clustering: separate the points into k clusters so as to maximize or minimize a given cost function based on distances between points.
• Spectral clustering: uses the eigenvectors of the adjacency or Laplace matrix.

Modularity-based methods

• Modularity optimization
• Modifications of modularity
• Limits of modularity

Spectral algorithms: use the eigenvalues and eigenvectors.

Divisive algorithms

• The algorithm of Girvan and Newman: edges are removed according to the values of measures of edge centrality, which estimate the importance of edges according to some property or process running on the graph.

Ahn, Y. Y., J. P. Bagrow, et al. (2010). "Link communities reveal multiscale complexity in networks." Nature 466(7307): 761-764.
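As one concrete instance of the spectral methods listed on this slide, the sketch below bisects a graph by the sign of the eigenvector of the second-smallest Laplacian eigenvalue (often called the Fiedler vector). It is only an illustration on a hypothetical toy graph, two 4-node cliques joined by a single edge, and is not the multi-scale method of the talk.

```python
import numpy as np

def fiedler_bisection(A):
    """Split nodes into two groups by the sign of the Fiedler vector."""
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)   # ascending eigenvalues
    fiedler = eigvecs[:, 1]                # second-smallest eigenvalue
    return fiedler >= 0

# Hypothetical toy graph: cliques {0,1,2,3} and {4,5,6,7}, bridge 3-4.
n = 8
A_toy = np.zeros((n, n))
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                A_toy[i, j] = 1.0
A_toy[3, 4] = A_toy[4, 3] = 1.0

labels = fiedler_bisection(A_toy)
```

Which clique receives the label `True` depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful.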

Page 31: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Small-world, scale-free, and whole-cell networks

WSSW model: we first construct a regular circular lattice with each node connected to its d right-handed nearest neighbors. Each edge is then rewired with probability $p_r$ to another randomly selected node. Self- and double edges are forbidden.

BASF model: preferential growth mechanism. Starting from several connected nodes as a seed, at each growth step a new node is added and w edges are established between this node and the existing network. The probability for an existing node to be connected with the new node is proportional to its degree. Self- and double edges are forbidden. For the resulting networks, the number of edges per node obeys a power law.

Whole-cell networks: these cover cellular functions such as intermediate metabolism and bioenergetics, information pathways, electron transport, and transmembrane transport. The directed edges are simply replaced with undirected edges. We consider only cellular networks with sizes larger than 500.
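The WSSW construction described above can be sketched with the standard library alone. The function name, the neighbor-set representation, and the example parameters (n = 50, d = 2, rewiring probability 0.1) are assumptions chosen for illustration; the sketch also assumes n > 2d so the initial lattice has no double edges.

```python
import random

def wssw_network(n, d, p_r, seed=0):
    """Small-world network as a list of neighbor sets.

    Each node is first linked to its d right-handed nearest neighbors
    on a ring; every such edge (i, j) is then rewired with probability
    p_r to (i, k), where k is a random node that is neither i nor
    already a neighbor of i (no self- or double edges).
    """
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    edges = []
    for i in range(n):
        for step in range(1, d + 1):
            j = (i + step) % n
            neighbors[i].add(j)
            neighbors[j].add(i)
            edges.append((i, j))
    for i, j in edges:
        if rng.random() < p_r:
            candidates = [k for k in range(n)
                          if k != i and k not in neighbors[i]]
            if candidates:
                k = rng.choice(candidates)
                neighbors[i].discard(j)
                neighbors[j].discard(i)
                neighbors[i].add(k)
                neighbors[k].add(i)
    return neighbors

net = wssw_network(50, 2, 0.1, seed=1)
```

Rewiring moves edges without creating or deleting any, so the total number of edges stays at n times d.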

Page 32: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

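Statistical properties of the spectra

The PDF of the nearest-neighbor level spacing (NNLS) distribution obeys the Brody distribution:

$$U(s) = \frac{\beta}{\eta}\left(\frac{s}{\eta}\right)^{\beta - 1}\exp\left[-\left(\frac{s}{\eta}\right)^{\beta}\right]$$

where s is the NNLS and $\eta$ the characteristic distribution width. $\beta = 1$ corresponds to localized states and $\beta = 2$ to extended states.

In order to obtain the value of $\beta$, we use the accumulated function:

$$C(s) = \int_0^s U(x)\,dx$$

Some trivial calculations lead to:

$$\ln R(s) \equiv \ln\ln\frac{1}{1 - C(s)} = \beta \ln s - \beta \ln \eta$$

From this formula, we can determine the values of $\beta$ and $\eta$.

Fig. Value of the Brody parameter $\beta$ versus network parameters $p_r$ and w. (a) WSSW and (b) BASF networks.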
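The linear relation between ln R(s) and ln s suggests a simple least-squares fit for the Brody parameter. The sketch below is one possible way to do this, assuming the spacings have already been unfolded; the empirical estimate of C(s) from ranks, the function name `fit_brody`, and the synthetic test data are assumptions, not the authors' procedure.

```python
import numpy as np

def fit_brody(spacings):
    """Estimate (beta, eta) from nearest-neighbor level spacings.

    Uses ln ln[1 / (1 - C(s))] = beta * ln(s) - beta * ln(eta),
    with C(s) estimated from the ranks of the sorted spacings.
    """
    s = np.sort(np.asarray(spacings, dtype=float))
    s = s[s > 0]                               # ln(s) needs s > 0
    n = s.size
    C = (np.arange(1, n + 1) - 0.5) / n        # stays strictly below 1
    ln_R = np.log(np.log(1.0 / (1.0 - C)))
    slope, intercept = np.polyfit(np.log(s), ln_R, 1)
    beta = slope
    eta = np.exp(-intercept / beta)
    return beta, eta

# Synthetic checks: exponential spacings have beta = 1, and Weibull
# spacings with shape 2 have beta = 2 in this parametrization.
rng = np.random.default_rng(0)
beta_loc, eta_loc = fit_brody(rng.exponential(1.0, 5000))
beta_ext, eta_ext = fit_brody(rng.weibull(2.0, 5000))
```

For real spectra the fit range usually needs care, since the smallest and largest spacings are the noisiest points on the log-log plot.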

Page 33: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Wavelets Transform


Wavelets are mathematical functions that cut up data into different frequency components, and then study each component with a resolution matched to its scale.

They have advantages over traditional Fourier methods in analyzing physical situations where the signal contains discontinuities and sharp spikes.

Wavelets were developed independently in the fields of mathematics, quantum physics, electrical engineering, and seismic geology. Interchanges between these fields during the last ten years have led to many new wavelet applications such as image compression, turbulence, human vision, radar, and earthquake prediction.

http://www.amara.com/IEEEwave/IEEEwavelet.html

Page 34: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Fractal properties on networks: wavelet transform (WT)

The WT can detect the fractal properties based on the ascending-order-ranked series. We assume the N probability values have been sorted in ascending order.

As a standard procedure, we first find the WT maximal values

$$\{T_g[a, k_j(a)],\ j = 1, 2, \dots, J\}$$

where a is the given scale. The partition function should scale in the limit of small scales as

$$Z(a, q) = \sum_{j=1}^{J} \left|T_g[a, k_j(a)]\right|^{q} \sim a^{\tau(q)}$$

The fractal dimension (statistical properties of subsets) can be obtained through the Legendre transform:

$$h = \frac{d\tau(q)}{dq}, \qquad D(h) = q h - \tau(q)$$

• Local Hurst exponent h: denotes local subsets.
• Positive q reflects the scaling of large fluctuations.
• Negative q reflects the scaling of small fluctuations.

Fig. 5. The branched multifractal behavior of the whole-cell network of M. jannaschii is presented as a typical example.
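Once a scaling exponent function has been estimated from the partition function, the Legendre transform is a short numerical step. The sketch below assumes the convention h = dτ/dq and D(h) = q h − τ(q), with τ(q) sampled on a grid of q values; the function name and the monofractal test case are illustrative assumptions.

```python
import numpy as np

def singularity_spectrum(q, tau):
    """Numerical Legendre transform of tau(q).

    Returns h(q) = d tau / d q and D(h) = q * h - tau(q), both sampled
    at the same q values.
    """
    q = np.asarray(q, dtype=float)
    tau = np.asarray(tau, dtype=float)
    h = np.gradient(tau, q)       # finite-difference derivative
    D = q * h - tau
    return h, D

# Hypothetical monofractal test case: tau(q) = 0.7 * q - 1, for which
# h is constant (0.7) and the spectrum collapses to the point D = 1.
q_grid = np.linspace(-5.0, 5.0, 41)
tau_grid = 0.7 * q_grid - 1.0
h, D = singularity_spectrum(q_grid, tau_grid)
```

A multifractal signal gives a curved τ(q), and the resulting D(h) then spans a range of h values instead of a single point.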

Page 35: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Structure, Functions, Dynamics

• Structural measures (the cornerstone for understanding the relations): degree, clustering coefficient, shortest path
• Dynamics (can be regarded as the transport processes of): mass, energy, signal, information, and so on
• Functions: ?

Dynamic diffusive processes at different structure scales

L. K. Gallos, C. Song, S. Havlin, Proc. Natl. Acad. Sci. U.S.A. 104, 7746 (2007).
H. Yang, C. Yin, G. Zhu, and B. Li, Phys. Rev. E 77, 045101(R) (2008).
G.M. Zhu, H. Yang, C. Yin, and B. Li, Phys. Rev. E 77, 066113 (2008).

Page 36: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Protein-protein interaction networks

From DNA sequence to proteins to functions: the functions of proteins are realized by protein-protein interactions.

Protein-protein interactions

1. Signal transduction: interactions between signaling molecules.

2. Protein complexes:
 * One protein carries another, e.g. from cytoplasm to nucleus.
 * One protein modifies another.
 * Complex formation often serves to activate or inhibit one or more of the associated proteins.

Page 37: Evolutionary Clues Embedded In Network Structure —— EPJB,85,106(2012)

Metabolic networks (life processes)

The metabolism of an organism is the basic chemical system that generates essential components such as amino acids, sugars and lipids, together with the energy required to synthesize them and to use them in creating proteins and cellular structures.

This system of connected chemical reactions is a metabolic network.