233-234


Sedgewick & Wayne (2004); Chazelle (2005)

Adjacency lists

Hansel & Gretel

1. Birds eat the bread crumbs: random walk

2. They don’t: DFS/BFS

Random walk, diffusion equation, normal distribution: 3 views of the same thing.

With bread crumbs one can find the exit in time proportional to V+E (DFS/BFS).
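The bread-crumb idea can be sketched as a depth-first search over adjacency lists, where the set of visited vertices plays the role of the crumbs. This is a minimal illustration, not the slides' own code; the `maze` graph and its vertex names are hypothetical. Each vertex is marked once and each adjacency list is scanned once, which is where the V+E bound comes from.

```python
from typing import Dict, List, Set


def dfs(adj: Dict[str, List[str]], start: str) -> List[str]:
    """Return vertices in the order an iterative DFS first visits them."""
    visited: Set[str] = set()  # the "bread crumbs"
    order: List[str] = []
    stack = [start]
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        order.append(v)
        # Push neighbors in reverse so the first listed neighbor is explored first.
        for w in reversed(adj[v]):
            if w not in visited:
                stack.append(w)
    return order


# Hypothetical maze stored as adjacency lists (vertex -> list of neighbors).
maze = {
    "entrance": ["hall"],
    "hall": ["entrance", "dead_end", "exit"],
    "dead_end": ["hall"],
    "exit": ["hall"],
}
print(dfs(maze, "entrance"))
```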

Breadth First Search

Example graph with vertices A to H, searched from A. Edges implied by the walkthrough: A-F, A-B, A-C, A-G, F-D, F-E, G-E, D-E, E-H.

Queue: A. visit(A), distance from A = 0.

get A. F discovered (1), B discovered (1), C discovered (1), G discovered (1). Queue: F B C G. A finished.

get F. A already visited. D discovered (2), E discovered (2). Queue: B C G D E. F finished.

get B. A already visited. Queue: C G D E. B finished.

get C. A already visited. Queue: G D E. C finished.

get G. A already visited, E already visited. Queue: D E. G finished.

get D. E already visited, F already visited. Queue: E. D finished.

get E. D already visited, F already visited, G already visited. H discovered (3). Queue: H. E finished.

get H. E already visited. Queue: empty. H finished. STOP.

Distance from A: A = 0; F, B, C, G = 1; D, E = 2; H = 3.
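The walkthrough above can be sketched in Python with a FIFO queue. The adjacency lists below are an assumption: they are inferred from the discovery order on the slides (the drawn graph itself did not survive), and the function name `bfs_distances` is illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return a dict mapping each reachable vertex to its distance from source."""
    dist = {source: 0}            # visit(source), distance 0
    queue = deque([source])
    while queue:
        v = queue.popleft()       # "get"
        for w in adj[v]:
            if w not in dist:     # w discovered
                dist[w] = dist[v] + 1
                queue.append(w)
            # otherwise w was already visited
        # v finished
    return dist


# Assumed adjacency lists, ordered to match the slides' discovery order.
graph = {
    "A": ["F", "B", "C", "G"],
    "F": ["A", "D", "E"],
    "B": ["A"],
    "C": ["A"],
    "G": ["A", "E"],
    "D": ["E", "F"],
    "E": ["D", "F", "G", "H"],
    "H": ["E"],
}
dist = bfs_distances(graph, "A")
print(dist)
```

Because Python dictionaries keep insertion order, the keys of `dist` also record the order in which vertices were discovered.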


[Figure: actor network with Rod Steiger (#1), Donald Pleasence (#2), Martin Sheen (#3) and Kevin Bacon (#876) marked. Source: Barabasi]

Rank  Name               Average distance  # of movies  # of links
1     Rod Steiger        2.537527          112          2562
2     Donald Pleasence   2.542376          180          2874
3     Martin Sheen       2.551210          136          3501
4     Christopher Lee    2.552497          201          2993
5     Robert Mitchum     2.557181          136          2905
6     Charlton Heston    2.566284          104          2552
7     Eddie Albert       2.567036          112          3333
8     Robert Vaughn      2.570193          126          2761
9     Donald Sutherland  2.577880          107          2865
10    John Gielgud       2.578980          122          2942
11    Anthony Quinn      2.579750          146          2978
12    James Earl Jones   2.584440          112          3787
…
876   Kevin Bacon        2.786981          46           1811
…

Why Kevin Bacon?

Measure the average distance between Kevin Bacon and all other actors.
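This measurement can be sketched as one breadth-first search per actor. The co-star graph and the names `actor_a` to `actor_e` are hypothetical stand-ins, not data from the slides, and the sketch assumes every actor can reach at least one other actor.

```python
from collections import deque


def average_distance(adj, source):
    """Mean BFS distance from source to every other reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    # Assumes source reaches at least one other vertex (otherwise this divides by zero).
    others = [d for v, d in dist.items() if v != source]
    return sum(others) / len(others)


# Hypothetical co-star graph: an edge means two actors appeared in the same movie.
costars = {
    "actor_a": ["actor_b", "actor_c"],
    "actor_b": ["actor_a", "actor_c", "actor_d"],
    "actor_c": ["actor_a", "actor_b"],
    "actor_d": ["actor_b", "actor_e"],
    "actor_e": ["actor_d"],
}
# Rank actors from most central (smallest average distance) to least central.
ranking = sorted(costars, key=lambda a: average_distance(costars, a))
print(ranking)
```

Sorting by this value is how a ranking like the one in the table would be produced; a low rank number means a small average distance.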

876 Kevin Bacon 2.786981 46 1811 (Barabasi)

Langston et al., A combinatorial approach to the analysis of differential gene expression data….

Minimum Dominating Set


Expected size of dominating set

Assume each node has at least d neighbors

Naïve algorithm still n/2 in worst case

Simple probabilistic algorithm:

1. For each vertex v, color v red with probability p


2. Color blue any non-dominated vertex

X = number of red nodes

Y = number of blue nodes

Size of dominating set = X+Y

Expected size of dominating set S: $E|S| = E[X] + E[Y]$
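The slide's own expression for the expectation did not survive, so the following is a sketch of the standard analysis of this algorithm rather than a transcription. A vertex is red with probability $p$, and it is blue only if neither it nor any of its at least $d$ neighbors is red:

```latex
\begin{align*}
E[X] &= np, \\
E[Y] &\le n(1-p)^{d+1} \le n e^{-p(d+1)}, \\
E|S| &\le np + n e^{-p(d+1)}.
\end{align*}
% Choosing p = \ln(d+1)/(d+1) balances the two terms:
\[
E|S| \le n \, \frac{1 + \ln(d+1)}{d+1}.
\]
```

The choice $p = \ln(d+1)/(d+1)$ is one convenient value, not necessarily the one used on the slide.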

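Markov’s inequality: $P(|S| \ge j) \le E|S| / j$

proof: $E|S| \ge j \cdot P(|S| \ge j)$, since $|S| \ge 0$

With $j = k\,E|S|$: $P(|S| \ge k\,E|S|) \le 1/k$

Probability that $|S| > 2\,E|S|$ is < 1/2

Run algorithm 10 times and keep smallest S: $|S| \le 2\,E|S|$ with probability > 0.999, since $(1/2)^{10} < 0.001$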
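The two coloring steps and the keep-the-smallest-of-10 loop can be sketched as follows. The test graph (a ring where each vertex is linked to its two nearest neighbors on each side, so d = 4), the value of `p`, and all function names are assumptions made for illustration.

```python
import math
import random


def random_dominating_set(adj, p, rng):
    """One run: color vertices red with probability p, then blue if non-dominated."""
    red = {v for v in adj if rng.random() < p}
    # A vertex is non-dominated if it is not red and has no red neighbor.
    blue = {v for v in adj if v not in red and not any(w in red for w in adj[v])}
    return red | blue


def is_dominating(adj, s):
    """Check that every vertex is in s or has a neighbor in s."""
    return all(v in s or any(w in s for w in adj[v]) for v in adj)


def best_of(adj, p, runs=10, seed=0):
    """Repeat the randomized algorithm and keep the smallest set found."""
    rng = random.Random(seed)
    return min((random_dominating_set(adj, p, rng) for _ in range(runs)), key=len)


# Hypothetical test graph: ring of n vertices, each joined to i±1 and i±2 (d = 4).
n, d = 30, 4
ring = {i: [(i + s) % n for s in (-2, -1, 1, 2)] for i in range(n)}
p = math.log(d + 1) / (d + 1)  # one common choice of p, assumed here
smallest = best_of(ring, p)
print(len(smallest), is_dominating(ring, smallest))
```

Every run returns a valid dominating set by construction; the repetition only improves the chance that the set is also small.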

GENOME

PROTEOME: protein-protein interactions

METABOLISM: bio-chemical reactions (Citrate Cycle)

Barabasi

Tucker-Gera-Uetz

Local network motifs

SIM, MIM, FFL, FBL

[Alon; Horak, Luscombe et al (2002), Genes & Dev, 16: 3017 ]

Barabasi

The New Science of Networks by Barabasi

Degree Distribution

P(k) = probability a given node has exactly k neighbors

Random Network: P(k) = Poisson ~ No hubs

Scale free Network: $P(k) \sim k^{-\gamma}$ ~ A few hubs
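The definition of P(k) can be made concrete with a few lines of Python. The star graph below is a hypothetical example chosen because it has one obvious hub; the function name is illustrative.

```python
from collections import Counter


def degree_distribution(adj):
    """Return {k: fraction of nodes with exactly k neighbors}."""
    n = len(adj)
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: c / n for k, c in sorted(counts.items())}


# Hypothetical star graph: one hub connected to four leaves.
star = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub"],
}
print(degree_distribution(star))
```

On real network data one would typically plot these values on log-log axes, where a power law appears as a straight line.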

Metabolic network

Organisms from all three domains of life are scale-free networks!

H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 407 651 (2000)

Archaea Bacteria Eukaryotes

Barabasi & Albert, Science 286, 509 (1999)

Nodes: Actors            Links: Movies

Nodes: Web-pages         Links: Hyper-links

Nodes: Trans. stations   Links: Power lines

Scale-free networks

Why scale-free topology in biological networks?

Preferential attachment

Mean Field Theory

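$\frac{\partial k_i}{\partial t} = A\,\Pi(k_i) = A\,\frac{k_i}{\sum_j k_j} = \frac{k_i}{2t}$

$k_i(t) = m\sqrt{t/t_i}$, with initial condition $k_i(t_i) = m$

$P(k_i(t) < k) = P_t\left(t_i > \frac{m^2 t}{k^2}\right) = 1 - P_t\left(t_i \le \frac{m^2 t}{k^2}\right) = 1 - \frac{m^2 t}{k^2 (t + m_0)}$

$P(k) = \frac{\partial P(k_i(t) < k)}{\partial k} = \frac{2 m^2 t}{m_0 + t}\,\frac{1}{k^3} \sim k^{-3}$

γ = 3

($m_0$: number of initial nodes; $t_i$: time at which node i was added)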

A.-L.Barabási, R. Albert and H. Jeong, Physica A 272, 173 (1999)

Clustering in protein interaction networks

Goldberg and Roth, PNAS, 2003

high clustering = high quality of interaction

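$C_{vw} = -\log \sum_{i=|N(v)\cap N(w)|}^{\min(|N(v)|,\,|N(w)|)} \frac{\binom{|N(v)|}{i}\binom{N-|N(v)|}{|N(w)|-i}}{\binom{N}{|N(w)|}}$

(N(v): set of neighbors of v; N: total number of nodes in the network)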
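Assuming the formula on this slide is the hypergeometric clustering coefficient of Goldberg and Roth (2003), it is the negative log of the probability that two nodes with |N(v)| and |N(w)| neighbors share at least the observed number of neighbors by chance. A minimal sketch, with hypothetical argument names and made-up counts:

```python
import math


def hypergeom_clustering(n_total, n_v, n_w, shared):
    """-log P(at least `shared` common neighbors) under the hypergeometric null.

    n_total: number of nodes in the network (N)
    n_v, n_w: neighborhood sizes |N(v)| and |N(w)|
    shared:   observed overlap |N(v) ∩ N(w)|
    """
    upper = min(n_v, n_w)
    # Count the ways to pick N(w) so that it overlaps N(v) in i >= shared nodes.
    tail = sum(
        math.comb(n_v, i) * math.comb(n_total - n_v, n_w - i)
        for i in range(shared, upper + 1)
    )
    return -math.log(tail / math.comb(n_total, n_w))


# Made-up numbers: 100 proteins, both partners have 5 neighbors.
few_shared = hypergeom_clustering(100, 5, 5, 1)
many_shared = hypergeom_clustering(100, 5, 5, 3)
print(few_shared, many_shared)
```

A larger overlap is less likely by chance, so it yields a larger score, which matches the slide's reading that high clustering indicates a high-quality interaction.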

Scale-free model

(1) GROWTH: At every timestep we add a new node with m edges (connected to the nodes already present in the system).

(2) PREFERENTIAL ATTACHMENT: The probability Π that a new node will be connected to node i depends on the connectivity $k_i$ of that node:

$\Pi(k_i) = \frac{k_i}{\sum_j k_j}$

A.-L. Barabási, R. Albert, Science 286, 509 (1999)

$P(k) \sim k^{-3}$
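The growth and preferential-attachment rules can be simulated in a few lines. This is a sketch under stated assumptions: the starting graph (m+1 fully connected nodes) is not specified on the slide, and the function name and parameters are illustrative.

```python
import random
from collections import Counter


def barabasi_albert(n, m, seed=0):
    """Grow a graph to n nodes; each new node attaches to m distinct existing nodes."""
    rng = random.Random(seed)
    # Assumed seed graph: m + 1 nodes, all connected to each other.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Node i appears k_i times in this list, so a uniform draw from it
    # picks node i with probability k_i / sum_j k_j.
    endpoints = [v for e in edges for v in e]
    for new in range(m + 1, n):                 # (1) growth
        targets = set()
        while len(targets) < m:                 # (2) preferential attachment
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


edges = barabasi_albert(n=200, m=2)
degree = Counter(v for e in edges for v in e)
print(len(edges), max(degree.values()))
```

Early nodes tend to accumulate many links, which is the mechanism behind the hubs; checking the exponent of 3 would need a much larger n and a log-log fit.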

Why scale-free topology in biological networks?

Yeast protein network

Nodes: proteins

Links: physical interactions (binding)

P. Uetz, et al. Nature, 2000; Ito et al., PNAS, 2001; …
