
UBI529

3. Distributed Graph Algorithms


Distributed Algorithms Models

• Interprocess communication method: accessing shared memory, point-to-point or broadcast messages, or remote procedure calls.

• Timing model: synchronous or asynchronous models.

• Failure models: reliable or faulty behavior; Byzantine failures (a failed processor can behave arbitrarily).


We assume

• A distributed network, modeled as a graph: nodes are processors and edges are communication links.

• Nodes can communicate directly (only) with their neighbors through the edges.

• Nodes have unique processor identities.

• Synchronous model: Time is measured in rounds (time steps).

• One message (typically of size O(log n)) can be sent through an edge in a time step. A node can send messages simultaneously through all its edges at once in a round.

• No failure of nodes or edges. No malicious nodes.

2.1 Vertex and Tree Coloring

• Vertex Coloring

• Sequential Vertex Coloring Algorithms

• Distributed Synchronous Vertex Coloring Algorithm

• Distributed Tree Coloring Algorithms


Preliminaries

Vertex Coloring Problem: Given an undirected graph G = (V, E), assign a color c_u to each vertex u ∈ V such that if e = (v, w) ∈ E, then c_v ≠ c_w. The aim is to use the minimum number of colors.

Definition 2.1.1 : Given an undirected graph G, the chromatic number χ(G) is the minimum number of colors needed to color it. A vertex k-coloring uses exactly k colors. If χ(G) = k, G is k-colorable but not (k-1)-colorable. Calculating χ(G) is NP-hard. The 3-coloring decision problem is NP-complete.

Applications :

• Assignment of radio frequencies : Colors represent frequencies, transmitters are the vertices. Two stations are neighbors when they interfere.

• University course scheduling : Vertices are courses; students define the edges (two courses are neighbors if a student takes both).

• Fast register allocation for computer programming : Vertices are variables; they are neighbors if they can be active at the same time.


Sequential Algorithm for Vertex Coloring

Algorithm 2.1.1 : Sequential Vertex Coloring

Input : G with vertices v1, v2, ..., vn

Output : Vertex coloring f : V_G -> {1, 2, 3, ...}

1. For i = 1 to n do
2.     f(vi) := smallest color number that does not conflict with any already colored neighbor of vi
3. Return vertex coloring f
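The three steps of Algorithm 2.1.1 can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the adjacency-dictionary representation, the function name `greedy_coloring`, and the 5-cycle example are assumptions made here.

```python
def greedy_coloring(adj, order=None):
    """Color vertices one by one with the smallest color not used by a colored neighbor."""
    if order is None:
        order = list(adj)              # v1, v2, ..., vn in dictionary order
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:               # smallest color number without a conflict
            c += 1
        color[v] = c
    return color

# Hypothetical example: a cycle on 5 vertices (maximum degree 2).
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(cycle5)
print(coloring)  # {0: 1, 1: 2, 2: 1, 3: 2, 4: 3}
```

On this odd cycle the sketch needs three colors, which matches the Δ+1 bound discussed next.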


Vertex Coloring Algorithms

Definition 2.1.2 : The number of neighbors of a vertex v is called the degree of v, written δ(v). The maximum degree over all vertices of a graph G is called the graph degree Δ(G) = Δ.

Theorem 2.1.1 : Algorithm 2.1.1 is correct and terminates in O(n) steps. The algorithm uses at most Δ+1 colors.

Proof: Correctness and termination are straightforward. Since each node has at most Δ neighbors, there is always at least one color free in the range {1, …, Δ+1}.

Remarks:

• For many graphs, coloring can be done with far fewer than Δ+1 colors.

• This algorithm is not distributed; only one processor is active at a time. But: Use idea of Algorithm 1.4 to define “local” coloring subroutine 1.7


Heuristic Vertex Coloring Algorithm : Largest Degree First

Idea (two observations) : A vertex of large degree is more difficult to color than a vertex of smaller degree. Also, a vertex with more colored neighbors will be more difficult to color later.

Algorithm 2.1.2 : Largest Degree First Algorithm

Input : G with vertices v1, v2, ..., vn

Output : Vertex coloring f : V_G -> {1, 2, 3, ...}

1. While there are uncolored vertices of G
2.     Among the uncolored vertices of maximum degree, choose the vertex v with the maximum colored degree
3.     Assign the smallest possible color k to v : f(v) := k
4. Return vertex coloring f

Colored degree : number of different colors used to color the neighbors of v.

The coloring order in the diagram is v3, v1, v2, v4, v8, v6, v7, v5.
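A rough Python sketch of the Largest Degree First heuristic is given below. The graph, the function name `largest_degree_first`, and the tie-breaking rule (first candidate in dictionary order) are assumptions for illustration; the slide's diagram is not reproduced.

```python
def largest_degree_first(adj):
    """Repeatedly color the uncolored max-degree vertex with the largest colored degree."""
    color = {}

    def colored_degree(v):
        # Number of different colors already used on the neighbors of v.
        return len({color[u] for u in adj[v] if u in color})

    while len(color) < len(adj):
        uncolored = [v for v in adj if v not in color]
        max_deg = max(len(adj[v]) for v in uncolored)
        candidates = [v for v in uncolored if len(adj[v]) == max_deg]
        v = max(candidates, key=colored_degree)   # ties: first candidate wins
        used = {color[u] for u in adj[v] if u in color}
        k = 1
        while k in used:
            k += 1
        color[v] = k
    return color

# Hypothetical example: a triangle a-b-c with a pendant vertex d attached to a.
g = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": ["a", "b"], "d": ["a"]}
ldf = largest_degree_first(g)
print(ldf)  # {'a': 1, 'b': 2, 'c': 3, 'd': 2}
```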


Coloring Trees : A Distributed Algorithm

Lemma 2.1.1: χ(Tree) ≤ 2.

Proof: If the distance of a node to the root is odd (even), color it 1 (0). An odd node has only even neighbors and vice versa.

If we assume that each node knows its parent (the root has no parent) and its children in the tree, this constructive proof gives a very simple algorithm.

Algorithm 2.1.3 [Slow tree coloring]:

1. Root sends color 0 to its children. (Root is colored 0.)
2. When receiving a message x from its parent, a node u picks color c_u = 1 − x and sends c_u to its children.
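The slow tree coloring can be simulated round by round. The sketch below assumes a synchronous model and a tree given as a hypothetical `children` dictionary; the names are illustrative only.

```python
def slow_tree_coloring(children, root):
    """Simulate Algorithm 2.1.3 in synchronous rounds; returns (colors, rounds used)."""
    color = {root: 0}                  # the root is colored 0
    frontier = [root]                  # nodes that send their color in the next round
    rounds = 0
    while frontier:
        next_frontier = []
        for u in frontier:
            for child in children.get(u, []):
                color[child] = 1 - color[u]   # child receives x and picks 1 - x
                next_frontier.append(child)
        if next_frontier:
            rounds += 1
        frontier = next_frontier
    return color, rounds

# Hypothetical tree of height 3: r -> a -> c -> d, and r -> b.
tree = {"r": ["a", "b"], "a": ["c"], "c": ["d"]}
colors, rounds = slow_tree_coloring(tree, "r")
print(colors, rounds)  # {'r': 0, 'a': 1, 'b': 1, 'c': 0, 'd': 1} 3
```

The number of rounds equals the height of the tree, which is why the algorithm is called slow.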


Distributed Tree Coloring

Remarks:

• By the proof of Lemma 2.1.1, Algorithm 2.1.3 is correct.

• The time complexity of the algorithm is the height of the tree.

• When the root is chosen randomly, this can be up to the diameter of the tree.

2.2 Distributed Tree based Communication Algorithms

• Broadcast

• Convergecast

• BFS Tree Construction


Broadcast

Broadcasting means sending a message from a source node to all other nodes of the network.

Two basic broadcasting approaches are flooding and spanning tree-based broadcast.

Flooding:

• A source node s wants to send a message to all nodes in the network. s simply forwards the message over all its edges.

• Any vertex v ≠ s, upon receiving the message for the first time (over an edge e), forwards it on every other edge.

• Upon receiving the message again, it does nothing.


Broadcast

Definition 2.2.1 [Broadcast]: A broadcast operation is initiated by a single processor, the source. The source wants to send a message to all other nodes in the system.

Definition 2.2.2 [Distance, Radius, Diameter]:

• The distance between two nodes u, v in an undirected graph is the number of hops of a minimum path between u and v.

• The radius of a node u in a graph is the maximum distance between u and any other node. The radius of a graph is the minimum radius of any node in the graph.

• The diameter of a graph is the maximum distance between two arbitrary nodes.
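Definition 2.2.2 translates directly into breadth-first search. The following sketch assumes a connected, undirected graph stored as an adjacency dictionary; the function names and the path example are made up for illustration.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def node_radius(adj, u):
    """Maximum distance between u and any other node."""
    return max(bfs_distances(adj, u).values())

def graph_radius(adj):
    return min(node_radius(adj, u) for u in adj)

def graph_diameter(adj):
    return max(node_radius(adj, u) for u in adj)

# Hypothetical example: a path 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(node_radius(path, 0), graph_radius(path), graph_diameter(path))  # 4 2 4
```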


Broadcast

Theorem 2.2.1 [Lower Bound]: The message complexity of a broadcast is at least n-1. The radius of the graph is a lower bound for the time complexity.

Proof: Every node must receive the message.

Remarks:

• You can use a pre-computed spanning tree to do the broadcast with tight message complexity.

• If the spanning tree is a breadth-first spanning tree (for a given source), then the time complexity is tight as well.

Definition 2.2.3 : A graph (system/network) is clean if the nodes do not know the topology of the graph.

Theorem 2.2.2 [Clean Lower Bound]: For a clean network, the number of edges is a lower bound for the broadcast message complexity.

Proof: If you do not try every edge, you might miss a whole part of the graph behind it.


Flooding

Algorithm 2.2.1 [Flooding]: The source sends the message to all neighbors. Each node receiving the message the first time forwards to all (other) neighbors.

Remarks:

• If node v receives the message first from node u, then node v calls node u “parent”. This parent relation defines a spanning tree T. If the flooding algorithm is executed in a synchronous system, then T is a breadth-first spanning tree (with respect to the root).

• More interestingly, in asynchronous systems the flooding algorithm also terminates after r time units, where r is the radius of the source. (But note that the constructed spanning tree need not be breadth-first.)
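A small synchronous simulation of flooding is sketched below. It assumes that simultaneous arrivals are resolved in list order and uses hypothetical names (`flood`, `ring`); it records the parent relation, the round in which each node is first reached, and the total number of messages.

```python
def flood(adj, source):
    """Synchronous flooding; returns (parent, first-receipt round, message count)."""
    parent = {source: None}
    depth = {source: 0}
    messages = 0
    round_no = 0
    in_flight = [(source, v) for v in adj[source]]     # (sender, receiver) pairs
    while in_flight:
        round_no += 1
        messages += len(in_flight)
        newly_informed = []
        for sender, receiver in in_flight:
            if receiver not in parent:                 # first receipt: adopt sender as parent
                parent[receiver] = sender
                depth[receiver] = round_no
                newly_informed.append(receiver)
            # a repeated receipt is ignored
        # each newly informed node forwards on every other edge
        in_flight = [(u, v) for u in newly_informed for v in adj[u] if v != parent[u]]
    return parent, depth, messages

# Hypothetical example: a ring 0 - 1 - 2 - 3 - 0 with source 0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
parent, depth, messages = flood(ring, 0)
print(parent, depth, messages)
```

On this ring the run uses 5 messages, between |E| = 4 and 2|E| = 8, and the parent pointers form a breadth-first tree, as the synchronous remark predicts.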


Flooding Analysis

Theorem : The message complexity of flooding is Θ(|E|) and the time complexity is Θ(D), where D is the diameter of G.

Proof. The message complexity follows from the fact that each edge delivers the message at least once and at most twice (once in each direction). To show the time complexity, we use induction on t to show that after t time units, the message has already reached every vertex at a distance of t or less from the source.


Broadcast Over a Rooted Spanning Tree

Suppose processors already have information about a rooted spanning tree of the communication topology:

• tree: connected graph with no cycles

• spanning tree: contains all processors

• rooted: there is a unique root node

The tree is implemented via parent and children local variables at each processor, which indicate which incident channels lead to the parent and children in the rooted spanning tree.


Broadcast Over a Rooted Spanning Tree: A Simple Algorithm

1. The root initially sends msg to its children.
2. When a node receives msg from its parent, it
   – sends msg to its children
   – terminates (sets a local boolean to true)

Synchronous model:

• time is the depth of the spanning tree, which is at most n - 1

• number of messages is n - 1, since one message is sent over each spanning tree edge

Asynchronous model:

• same time and messages
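The simple tree broadcast can be simulated in synchronous rounds as below. The `children` dictionary stands in for the local children variables, and the example tree is an assumption for illustration.

```python
def tree_broadcast(children, root):
    """Broadcast over a rooted spanning tree; returns (round of receipt, message count)."""
    received_round = {root: 0}
    messages = 0
    frontier = [root]
    t = 0
    while frontier:
        t += 1
        next_frontier = []
        for u in frontier:
            for child in children.get(u, []):
                messages += 1                  # one message per tree edge
                received_round[child] = t
                next_frontier.append(child)
        frontier = next_frontier
    return received_round, messages

# Hypothetical rooted spanning tree with n = 6 nodes and depth 2.
span_tree = {"a": ["b", "c"], "b": ["d"], "c": ["e", "f"]}
received_round, sent = tree_broadcast(span_tree, "a")
print(max(received_round.values()), sent)  # 2 5
```

The output shows time equal to the depth of the tree and n - 1 messages.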


Tree Broadcast

Assume that a spanning tree has been constructed.

Theorem. For every n-vertex graph G with a spanning tree T rooted at r0, the message complexity of broadcast is n−1 and the time complexity is depth(T).

A broadcast algorithm can be used to construct a spanning tree in G.

The message complexity of broadcast is asymptotically equivalent to the message complexity of spanning tree construction.

Using a breadth-first spanning tree, we get the optimal message and time complexities for broadcast.


Convergecast

Again, suppose a rooted spanning tree has already been computed by the processors

parent and children variables at each processor

Do the opposite of broadcast:

leaves send messages to their parents

non-leaves wait to get message from each child, then send combined info to parent
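Convergecast can be sketched as a post-order pass over the tree. In the sketch below the recursion plays the role of "wait for every child, then report to the parent"; the names `convergecast`, `value`, and `combine` are hypothetical.

```python
def convergecast(children, root, value, combine):
    """Combine the values of all nodes at the root; returns (result, message count)."""
    messages = 0

    def report(u):
        nonlocal messages
        combined = value[u]
        for child in children.get(u, []):
            combined = combine(combined, report(child))   # wait for the child's report
            messages += 1                                 # child -> u
        return combined                                   # sent on to u's parent

    return report(root), messages

# Hypothetical tree and values: count the nodes and find the maximum value.
cc_tree = {"a": ["b", "c"], "b": ["d"], "c": ["e", "f"]}
ones = {u: 1 for u in "abcdef"}
vals = {"a": 3, "b": 9, "c": 1, "d": 4, "e": 7, "f": 2}
count, msgs = convergecast(cc_tree, "a", ones, lambda x, y: x + y)
largest, _ = convergecast(cc_tree, "a", vals, max)
print(count, msgs, largest)  # 6 5 9
```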


Convergecast

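[Figure: convergecast on a rooted spanning tree with nodes a through h. The message labels show the combined information sent upward: d; e,g; f,h; b,d; c,f,h. Solid arrows: parent-child relationships. Dotted lines: non-tree edges.]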

Finding a Spanning Tree Given a Root

• A distinguished processor is known, to serve as the root.

• The root sends M to all its neighbors.

• When a non-root first gets M, it
   – sets the sender as its parent
   – sends a "parent" msg to the sender
   – sends M to all other neighbors

• When it gets M otherwise, it
   – sends a "reject" msg to the sender

• Nodes use the "parent" and "reject" msgs to set their children variables and to know when to terminate.
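The message handling above can be sketched with a single global FIFO queue, which is only one possible delivery schedule (an assumption made here; an asynchronous system may deliver in other orders). All names are illustrative.

```python
from collections import deque

def build_spanning_tree(adj, root):
    """Spanning tree construction from a known root; returns (parent, children, messages)."""
    parent = {root: None}
    children = {u: set() for u in adj}
    messages = 0
    queue = deque(("M", root, v) for v in adj[root])      # (kind, sender, receiver)
    while queue:
        kind, sender, receiver = queue.popleft()
        messages += 1
        if kind == "M":
            if receiver not in parent:                    # first M: adopt the sender
                parent[receiver] = sender
                queue.append(("parent", receiver, sender))
                queue.extend(("M", receiver, w) for w in adj[receiver] if w != sender)
            else:                                         # M received again (or at the root)
                queue.append(("reject", receiver, sender))
        elif kind == "parent":
            children[receiver].add(sender)
        # on "reject" the receiver only learns that the sender is not its child
    return parent, children, messages

# Hypothetical example: a ring 0 - 1 - 2 - 3 - 0 rooted at 0.
ring4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
st_parent, st_children, st_messages = build_spanning_tree(ring4, 0)
print(st_parent, st_messages)
```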


Execution of Spanning Tree Alg.

[Figure: two executions of the algorithm on the same graph with nodes a through h.]

Synchronous: always gives a breadth-first search (BFS) tree.

Asynchronous: not necessarily a BFS tree.

Both models: O(m) messages, O(diam) time.

2.3 Distributed Minimum Spanning Tree Algorithms


Minimum Spanning Tree

Minimum spanning tree. Given a connected graph G = (V, E) with real-valued edge weights c_e, an MST is a subset of the edges T ⊆ E such that T is a spanning tree whose sum of edge weights is minimized.

[Figure: an example weighted graph G = (V, E) and a spanning tree T with Σ_{e ∈ T} c_e = 50.]

Cayley's Theorem. There are n^(n-2) spanning trees of K_n, so the problem can't be solved by brute force.

Applications

MST is a fundamental problem with diverse applications.

Network design
– telephone, electrical, hydraulic, TV cable, computer, road

Approximation algorithms for NP-hard problems
– traveling salesperson problem, Steiner tree

Indirect applications
– max bottleneck paths
– LDPC codes for error correction
– image registration with Renyi entropy
– learning salient features for real-time face verification
– reducing data storage in sequencing amino acids in a protein
– model locality of particle interactions in turbulent fluid flows
– autoconfig protocol for Ethernet bridging to avoid cycles in a network

Cluster analysis.


Greedy Algorithms

Kruskal's algorithm. Start with T = ∅. Consider edges in ascending order of cost. Insert edge e in T unless doing so would create a cycle.

Reverse-Delete algorithm. Start with T = E. Consider edges in descending order of cost. Delete edge e from T unless doing so would disconnect T.

Prim's algorithm. Start with some root node s and greedily grow a tree T from s outward. At each step, add the cheapest edge e to T that has exactly one endpoint in T.

Remark. All three algorithms produce an MST.


Greedy Algorithms

Simplifying assumption. All edge costs c_e are distinct.

Cut property. Let S be any subset of nodes, and let e be the min cost edge with exactly one endpoint in S. Then the MST contains e.

Cycle property. Let C be any cycle, and let f be the max cost edge belonging to C. Then the MST does not contain f.

[Figure: a cut S with crossing edge e (e is in the MST) and a cycle C with edge f (f is not in the MST).]

Cycles and Cuts

Cycle. Set of edges of the form a-b, b-c, c-d, …, y-z, z-a.

Cutset. A cut is a subset of nodes S. The corresponding cutset D is the subset of edges with exactly one endpoint in S.

[Figure: example graph on nodes 1 through 8, shown once with the cycle and once with the cut.]

Cycle C = 1-2, 2-3, 3-4, 4-5, 5-6, 6-1

Cut S = { 4, 5, 8 }
Cutset D = 5-6, 5-7, 3-4, 3-5, 7-8

Cycle-Cut Intersection

Claim. A cycle and a cutset intersect in an even number of edges.

Pf. (by picture)

[Figure: the example graph with the cut S, its complement V - S, and the cycle C.]

Cycle C = 1-2, 2-3, 3-4, 4-5, 5-6, 6-1
Cutset D = 3-4, 3-5, 5-6, 5-7, 7-8
Intersection = 3-4, 5-6
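The intersection in this example can be checked mechanically. The sketch below uses the cycle C and the cut S = {4, 5, 8} from the slide; the helper name `cutset` is an assumption, and only the cycle edges are examined because the full edge set of the pictured graph is not given in the text.

```python
def cutset(edges, S):
    """Edges with exactly one endpoint in S."""
    return {e for e in edges if (e[0] in S) != (e[1] in S)}

cycle = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)}
S = {4, 5, 8}
crossing = cutset(cycle, S)       # cycle edges that also belong to the cutset
print(sorted(crossing))           # [(3, 4), (5, 6)]
```

The cycle leaves S as often as it enters it, so the count of crossing edges is even.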


Greedy Algorithms


Cut property. Let S be any subset of nodes, and let e be the min cost edge with exactly one endpoint in S. Then the MST T* contains e.

Pf. (exchange argument)

• Suppose e does not belong to T*, and let's see what happens.

• Adding e to T* creates a cycle C in T* ∪ { e }.

• Edge e is both in the cycle C and in the cutset D corresponding to S ⇒ there exists another edge, say f, that is in both C and D.

• T' = T* ∪ { e } - { f } is also a spanning tree.

• Since c_e < c_f, cost(T') < cost(T*).

• This is a contradiction. ▪

[Figure: the tree T*, the cut S, and the edges e and f crossing the cut.]

Greedy Algorithms


Cycle property. Let C be any cycle in G, and let f be the max cost edge belonging to C. Then the MST T* does not contain f.

Pf. (exchange argument)

• Suppose f belongs to T*, and let's see what happens.

• Deleting f from T* creates a cut S in T*.

• Edge f is both in the cycle C and in the cutset D corresponding to S ⇒ there exists another edge, say e, that is in both C and D.

• T' = T* ∪ { e } - { f } is also a spanning tree.

• Since c_e < c_f, cost(T') < cost(T*).

• This is a contradiction. ▪

[Figure: the tree T*, the cut S, and the edges e and f crossing the cut.]

Prim's Algorithm: Proof of Correctness

Prim's algorithm. [Jarník 1930, Dijkstra 1957, Prim 1959]

• Initialize S = any node.

• Apply cut property to S.

• Add min cost edge in cutset corresponding to S to T, and add one new explored node u to S.

[Figure: the explored set S.]

Implementation: Prim's Algorithm

Prim(G, c) {
   foreach (v ∈ V) a[v] ← ∞
   Initialize an empty priority queue Q
   foreach (v ∈ V) insert v onto Q
   Initialize set of explored nodes S ← ∅

   while (Q is not empty) {
      u ← delete min element from Q
      S ← S ∪ { u }
      foreach (edge e = (u, v) incident to u)
         if ((v ∉ S) and (c_e < a[v]))
            decrease priority a[v] to c_e
   }
}

Implementation. Use a priority queue à la Dijkstra.

• Maintain set of explored nodes S.

• For each unexplored node v, maintain attachment cost a[v] = cost of cheapest edge from v to a node in S.

• O(n^2) with an array; O(m log n) with a binary heap.
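A runnable Python sketch of Prim's algorithm follows. It is a variant rather than a transcription of the pseudocode: Python's `heapq` has no decrease-key operation, so this version pushes candidate edges and skips stale entries (lazy deletion). The graph and names are assumptions.

```python
import heapq

def prim(adj, start):
    """adj: {u: [(v, cost), ...]} for a connected undirected graph; returns MST edges."""
    explored = {start}
    tree = []
    heap = [(cost, start, v) for v, cost in adj[start]]
    heapq.heapify(heap)
    while heap and len(explored) < len(adj):
        cost, u, v = heapq.heappop(heap)      # cheapest candidate edge leaving S
        if v in explored:
            continue                          # stale entry: both endpoints already in S
        explored.add(v)
        tree.append((u, v, cost))
        for w, c in adj[v]:
            if w not in explored:
                heapq.heappush(heap, (c, v, w))
    return tree

# Hypothetical weighted graph with distinct edge costs.
wg = {
    "a": [("b", 1), ("c", 3)],
    "b": [("a", 1), ("c", 2)],
    "c": [("a", 3), ("b", 2), ("d", 4)],
    "d": [("c", 4)],
}
mst = prim(wg, "a")
print(mst, sum(c for _, _, c in mst))  # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 4)] 7
```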


Kruskal's Algorithm: Proof of Correctness

Kruskal's algorithm. [Kruskal, 1956]

• Consider edges in ascending order of weight.

• Case 1: If adding e to T creates a cycle, discard e according to the cycle property.

• Case 2: Otherwise, insert e = (u, v) into T according to the cut property, where S = set of nodes in u's connected component.

[Figure: Case 1, edge e closes a cycle; Case 2, edge e = (u, v) leaves the component S.]

Implementation: Kruskal's Algorithm

Kruskal(G, c) {
   Sort edge weights so that c1 ≤ c2 ≤ ... ≤ cm.
   T ← ∅

   foreach (u ∈ V) make a set containing singleton u

   for i = 1 to m
      (u, v) = ei
      if (u and v are in different sets) {      // are u and v in different connected components?
         T ← T ∪ {ei}
         merge the sets containing u and v      // merge two components
      }
   return T
}

Implementation. Use the union-find data structure.

• Build set T of edges in the MST.

• Maintain a set for each connected component.

• O(m log n) for sorting and O(m α(m, n)) for union-find.

Note: m ≤ n^2 ⇒ log m is O(log n); α(m, n) is essentially a constant.
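The Kruskal pseudocode can be made runnable with a small union-find, sketched below with union by rank and path halving. The vertex numbering 0..n-1, the (cost, u, v) edge format, and the example graph are assumptions for illustration.

```python
def kruskal(n, edges):
    """edges: list of (cost, u, v) with vertices 0..n-1; returns MST edges (u, v, cost)."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return False                      # same component: edge would close a cycle
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx                       # merge the two components
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    tree = []
    for cost, u, v in sorted(edges):          # ascending order of cost
        if union(u, v):
            tree.append((u, v, cost))
    return tree

# Hypothetical graph: 4 vertices, 4 edges with distinct costs.
k_edges = [(3, 0, 2), (1, 0, 1), (4, 2, 3), (2, 1, 2)]
k_mst = kruskal(4, k_edges)
print(k_mst)  # [(0, 1, 1), (1, 2, 2), (2, 3, 4)]
```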


Distributed Spanning tree construction

Chang-Roberts algorithm

{The root is known}

Uses signals and acks, similar to the termination detection algorithm. Uses the same rule for sending acknowledgment.

[Figure: example graph with nodes 0 through 5 and a marked root.]

For a graph G = (V, E), a spanning tree is a maximally connected subgraph T = (V, E’), E’ ⊆ E, such that if one more edge is added, then the subgraph is no longer a tree. Used for broadcasting in a network.

Question: What if the root is not designated?

Chang-Roberts Spanning Tree Algorithm

program probe-echo
define N : integer (no. of neighbors)
       C, D : integer;
initially parent := i; C = 0; D = 0;

{for the initiator}
send probes to each neighbor;
D := no. of neighbors;
do D ≠ 0 ∧ echo -> D := D-1 od
{D = 0 signals end}

{for a non-initiator process i > 0}
do probe ∧ parent = i ∧ C = 0 -> C := 1; parent := sender;
       if i is not a leaf -> send probes to non-parent neighbors;
              D := no. of non-parent neighbors
       fi
[] echo -> D := D-1
[] probe ∧ sender ≠ parent -> send echo to sender
[] C = 1 ∧ D = 0 -> send echo to parent; C := 0
od


Graph traversal

Many applications involve exploring an unknown graph by a visitor (a token, a mobile agent, or a robot). The goal of traversal is to visit every node at least once, and return to the starting point.

- How efficiently can this be done?

- What is the guarantee that all nodes will be visited?

- What is the guarantee that the algorithm will terminate?

Consider web crawlers, exploration of social networks, graph layouts for visualization or drawing, etc.


Graph traversal and Spanning Tree Formation

Tarry’s algorithm is one of the oldest (1895).

Rule 1. Send the token towards each neighbor exactly once.

Rule 2. If rule 1 is not applicable, then send the token to the parent.

[Figure: example graph with nodes 0 through 6; node 0 is the root.]

A possible route is: 0 1 2 5 3 1 4 6 2 6 4 1 3 5 2 1 0

Nodes and their parent pointers generate a spanning tree that may not be a DFS tree.
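The two rules can be sketched as a token simulation. The graph below is an assumption: it is reconstructed from the eight edges that the slide's route uses, since the figure itself is not available. The order in which a node tries its neighbors is also arbitrary, so the route produced here differs from the slide's route while obeying the same rules.

```python
def tarry(adj, root):
    """Tarry's traversal; returns (route of the token, parent pointers)."""
    parent = {root: None}
    sent = {u: set() for u in adj}            # neighbors each node has already used
    route = [root]
    current = root
    while True:
        # Rule 1: use each edge once, trying non-parent neighbors first.
        options = [v for v in adj[current]
                   if v not in sent[current] and v != parent[current]]
        if options:
            nxt = options[0]
        elif parent[current] is not None and parent[current] not in sent[current]:
            nxt = parent[current]             # Rule 2: fall back to the parent
        else:
            break                             # back at the root with nothing left to try
        sent[current].add(nxt)
        if nxt not in parent:
            parent[nxt] = current             # first visit defines the parent pointer
        current = nxt
        route.append(current)
    return route, parent

# Assumed graph: the edges 0-1, 1-2, 2-5, 5-3, 3-1, 1-4, 4-6, 6-2.
tg = {0: [1], 1: [0, 2, 3, 4], 2: [1, 5, 6], 3: [5, 1], 4: [1, 6], 5: [2, 3], 6: [4, 2]}
route, t_parent = tarry(tg, 0)
print(route)
```

The token crosses every edge once in each direction, so the route has 2|E| = 16 hops and ends at the root.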


Distributed MST

Def MST Fragment : In a weighted graph G = (V, E, w), a tree T in G is called an MST fragment of G iff there exists an MST of G such that T is a subgraph of that MST.

Def MWOE : An edge e is an outgoing edge of an MST fragment T iff exactly one of its endpoints belongs to T. The minimum weight outgoing edge is denoted MWOE(T).

Lemma : Consider an MST fragment T of a graph G = (V, E, w). Let e = MWOE(T). Then T ∪ {e} is an MST fragment as well.

Proof : Let T_M be an MST containing T. If T_M contains e, we are done. Otherwise, adding e to T_M creates a graph C with a cycle through e. Let e’ be another edge on this cycle that connects T to the rest of T_M. Clearly, e’ is an outgoing edge of T, so w(e’) ≥ w(e). Discarding e’ from C yields a new spanning tree T’_M with w(T’_M) ≤ w(T_M). Since T_M is an MST, T’_M is an MST as well, and it contains T ∪ {e}.


Minimum Spanning Tree

Given a weighted graph G = (V, E), generate a spanning tree T = (V, E’) such that the sum of the weights of all the edges is minimum.

Applications

• On the Euclidean plane, approximate solutions to the traveling salesman problem

• Lease phone lines to connect the different offices with a minimum cost

• Visualizing multidimensional data (how entities are related to each other)

We are interested in distributed algorithms only.

The traveling salesman problem asks for the shortest route to visit a collection of cities and return to the starting point.



Sequential algorithms for MST

Review (1) Prim’s algorithm and (2) Kruskal’s algorithm.

Theorem. If the weight of every edge is distinct, then the MST is unique.

[Figure: a weighted graph whose nodes are split into two trees T1 and T2, connected by an edge e.]

Gallager-Humblet-Spira (GHS) Algorithm

GHS is a distributed version of Prim’s algorithm.

Bottom-up approach. MST is recursively constructed by fragments joined by an edge of least cost.

[Figure: two fragments joined by an edge; edge weights 3, 7, and 5 are shown.]

Challenges

[Figure: the weighted graph with fragments T1 and T2 joined by edge e.]

Challenge 1. How will the nodes in a given fragment identify the edge to be used to connect with a different fragment?

A root node in each fragment is the coordinator.

Challenges

[Figure: the weighted graph with fragments T1 and T2 joined by edge e.]

Challenge 2. How will a node in T1 determine if a given edge connects to a node of a different tree T2 or the same tree T1? Why will node 0 choose the edge e with weight 8, and not the edge with weight 4?

Nodes in a fragment acquire the same name before augmentation.

Two main steps

Each fragment has a level. Initially each node is a fragment at level 0.

(MERGE) Two fragments at the same level L combine to form a fragment of level L+1.

(ABSORB) A fragment at level L is absorbed by another fragment at level L’ (L < L’).


Least weight outgoing edge

To test if an edge is outgoing, each node sends a test message through a candidate edge. The receiving node may send accept or reject.

Root broadcasts initiate in its own fragment, collects the report from other nodes about eligible edges using a convergecast, and determines the least weight outgoing edge.

[Figure: fragments T1 and T2 joined by edge e, with test, reject, and accept messages shown on candidate edges.]

Accept or reject?

Let i send test to j.

Case 1. If name(i) = name(j), then send reject.

Case 2. If name(i) ≠ name(j) ∧ level(i) ≤ level(j), then send accept.

Case 3. If name(i) ≠ name(j) ∧ level(i) > level(j), then wait until level(j) = level(i).

Levels can only increase.

Question: Can fragments wait forever and lead to a deadlock?

[Figure: test and reject messages exchanged between nodes.]
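The three cases amount to a small decision function evaluated at the receiver j. The sketch below is illustrative only; the function name, the string return values, and the example fragment names are assumptions.

```python
def respond_to_test(name_i, level_i, name_j, level_j):
    """Reply of node j to a test message from node i (fragment name and level of each)."""
    if name_i == name_j:
        return "reject"          # Case 1: same fragment, so the edge is not outgoing
    if level_i <= level_j:
        return "accept"          # Case 2: different fragment, j's level is high enough
    return "wait"                # Case 3: delay the reply until level(j) reaches level(i)

print(respond_to_test("T1", 2, "T1", 2))  # reject
print(respond_to_test("T1", 2, "T2", 3))  # accept
print(respond_to_test("T1", 5, "T2", 3))  # wait
```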


Delayed response

[Figure: fragment A at level 5 and fragment B at level 3, with join, initiate, and test messages between them.]

B is about to change its level to 5. So B does not send an accept response to A in response to test.

The major steps

Repeat

• Test edges as outgoing or not

• Determine lwoe - it becomes a tree edge

• Send join (or respond to join)

• Update level & name & identify new coordinator

until done


Classification of edges

• Basic (initially all edges are basic)

• Branch (all tree edges)

• Rejected (not a tree edge)

Branch and rejected are stable attributes


Wrapping it up

Merge

The edge through which the join message is sent changes its status to branch, and becomes a tree edge.

Each root broadcasts an (initiate, L+1, name) message to the nodes in its own fragment.

[Figure: example of merge. (a) L = L’: fragment T at level L sends (join, L, T) and fragment T’ at level L’ sends (join, L’, T’). (b) L > L’: T’ sends (join, L’, T’).]

Wrapping it up

Absorb

T’ receives an initiate message. This indicates that the fragment at level L has been absorbed by the other fragment at level L’. They collectively search for the lwoe.

The edge through which the join message was sent changes its status to branch.

[Figure: example of absorb. (a) L = L’: fragment T at level L sends (join, L, T) and fragment T’ at level L’ sends (join, L’, T’). (b) L > L’: T’ sends (join, L’, T’) and receives initiate.]

Example

[Figure sequence: GHS on an example weighted graph with nodes 0 through 6 and edge weights 1 through 9. Step 1: the initial graph. Step 2: three merges. Step 3: a merge and an absorb. Step 4: an absorb.]

Message complexity

At least two messages (test + reject) must pass through each rejected edge. The upper bound is 2|E| messages.

At each of the log N levels, a node can receive at most (1) one initiate message, (2) one accept message, (3) one join message, (4) one test message not leading to a rejection, and (5) one changeroot message.

So, the total number of messages has an upper bound of 2|E| + 5N log N.


MST Algorithms: Theory

Deterministic comparison based algorithms.

• O(m log n). [Jarník, Prim, Dijkstra, Kruskal, Boruvka]

• O(m log log n). [Cheriton-Tarjan 1976, Yao 1975]

• O(m β(m, n)). [Fredman-Tarjan 1987]

• O(m log β(m, n)). [Gabow-Galil-Spencer-Tarjan 1986]

• O(m α(m, n)). [Chazelle 2000]

Holy grail. O(m).

Notable.

• O(m) randomized. [Karger-Klein-Tarjan 1995]

• O(m) verification. [Dixon-Rauch-Tarjan 1992]

Euclidean.

• 2-d: O(n log n). Compute MST of edges in Delaunay.

• k-d: O(k n^2). Dense Prim.


Distributed MST Algorithms

• Gallager, Humblet, & Spira ’83: O(n log n) running time; message complexity O(|E| + n log n) (optimal)

• Chin & Ting ’85: O(n log log n) time

• Gafni ’85: O(n log* n)

• Awerbuch ’87: O(n), existentially optimal

• Garay, Kutten, & Peleg ’98: O(D + n^0.61), where D is the diameter

• Kutten & Peleg ’98: O(D + √n log* n)

• Elkin ’04: Õ(μ + √n), where μ is called the MST radius
   – Cannot detect termination unless μ is given as input.

• Peleg & Rubinovich (’99) showed a lower bound of Ω̃(√n) for running time.


Distributed Graph Algs : Other areas of interest

Distributed Cycle/Knot Detection

Distributed Center Finding

Distributed Connected Dominating Set Construction in MANETs, WSNs

Distributed Clustering based on Graph Partitioning


References

Introduction to Graph Theory, Douglas West, Prentice Hall, 2000 (basics)

Graph Theory and Its Applications, Gross and Yellen, CRC Press, 1998 (basics)

Distributed Algorithm Course notes, J. Welch, TAMU (flooding and tree algorithms)

CS590A Fall 2007, G. Pandurangan, Purdue University

Distributed Computing Principles Course Notes, Roger Wattenhofer, ETH (coloring algorithms)

Introduction to Algorithm Design, Kleinberg, Tardos, Prentice-Hall, 2005 (MST dependent)

22C:166 Distributed Systems and Algorithms Course, Sukumar Ghosh, University of Iowa (routing part heavily dependent)