Graph Algorithms


Graph theory is an area of mathematics that deals with the following types of problems:

Connection problems

Scheduling problems

Transportation problems

Network analysis

Games and puzzles

Graph theory has important applications in critical path analysis, social psychology, matrix theory, set theory, topology, group theory, molecular chemistry, and searching.


Digraph

A directed graph, or digraph G consists of a finite nonempty set of vertices V, and a finite set of edges E, where an edge is an ordered pair of vertices in V. Vertices are also commonly referred to as nodes. Edges are sometimes referred to as arcs.

As an example, we could define a graph G=(V, E) as follows:

V = {1, 2, 3, 4}

E = {(1, 2), (2, 4), (4, 2), (4, 1)}

Here is a pictorial representation of this graph.
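As a rough illustration of how such a definition can be stored in a program, the sketch below builds an adjacency list and an adjacency matrix from the vertex set and edge set given above. The helper names adjacency_list and adjacency_matrix are assumptions made for this example and do not come from the notes.

```python
# Vertex set and edge set of the example digraph G = (V, E).
V = {1, 2, 3, 4}
E = {(1, 2), (2, 4), (4, 2), (4, 1)}

def adjacency_list(vertices, edges):
    """Map each vertex to the sorted list of vertices reached by its outgoing edges."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
    return {v: sorted(neighbours) for v, neighbours in adj.items()}

def adjacency_matrix(vertices, edges):
    """Return (order, matrix) with matrix[i][j] = 1 iff (order[i], order[j]) is an edge."""
    order = sorted(vertices)
    index = {v: i for i, v in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
    return order, matrix

adj = adjacency_list(V, E)
order, A = adjacency_matrix(V, E)
```

Vertex 3 has no incident edges, so its adjacency list is empty and its row and column in the matrix are all zeros.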

The definition of a graph implies that a graph can be drawn knowing only its vertex set and its edge set. For example, our first example has vertex set V and edge set E where V = {1,2,3,4} and E = {(1,2),(2,4),(4,3),(3,1),(1,4),(2,1),(4,2),(3,4),(1,3),(4,1)}. Notice that each edge is listed twice, once in each direction, which is how an undirected edge appears when edges are written as ordered pairs.

Another example is the Petersen graph G = (V, E), which has 10 vertices and 15 edges. The sets printed here, V = {1,2,3,4} and E = {(1,2),(2,4),(4,3),(3,1),(1,4),(2,1),(4,2),(3,4),(1,3),(4,1)}, repeat the previous example and do not describe the Petersen graph.

We'll quickly cover the following three important topics from an algorithmic perspective.

1. Transpose

2. Square

3. Incidence Matrix

1. Transpose

If graph G = (V, E) is a directed graph, its transpose, G^T = (V, E^T), is the same as graph G with all arrows reversed. We define the transpose of an adjacency matrix A = (a_ij) to be the adjacency matrix A^T = (a^T_ij) given by a^T_ij = a_ji. In other words, rows of matrix A become columns of matrix A^T and columns of matrix A become rows of matrix A^T. Since in an undirected graph (u, v) and (v, u) represent the same edge, the adjacency matrix A of an undirected graph is its own transpose: A = A^T.

Formally, the transpose of a directed graph G = (V, E) is the graph G^T = (V, E^T), where E^T = {(v, u) ∈ V×V : (u, v) ∈ E}. Thus, G^T is G with all its edges reversed.

We can compute G^T from G in both the adjacency-matrix and the adjacency-list representations of graph G.

Algorithm for Computing G^T from G in Adjacency-Matrix Representation

ALGORITHM MATRIX TRANSPOSE (G, G^T)

For i = 0 to |V[G]| - 1
    For j = 0 to |V[G]| - 1
        G^T(j, i) = G(i, j)

To see why it works, notice that after the loops every entry satisfies G^T(j, i) = G(i, j), which is exactly the definition of the transpose. The two nested loops each run |V| times, so the time complexity is O(V^2).
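The matrix version can be sketched in Python as follows. This is a minimal illustration that assumes the graph is stored as a square list of lists of 0/1 entries; the function name matrix_transpose is chosen for the example.

```python
def matrix_transpose(G):
    """Return the adjacency matrix of the transpose graph in O(V^2) time."""
    n = len(G)
    GT = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Edge i -> j in G becomes edge j -> i in the transpose.
            GT[j][i] = G[i][j]
    return GT

# Adjacency matrix of the digraph with edges (1,2), (2,4), (4,2), (4,1),
# using row/column 0 for vertex 1, 1 for vertex 2, and so on.
A = [[0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [1, 1, 0, 0]]
AT = matrix_transpose(A)
```

Transposing twice gives back the original matrix, which is a quick sanity check on any implementation.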

Algorithm for Computing G^T from G in Adjacency-List Representation

In this representation, a new adjacency list Adj^T must be constructed for the transpose of G. Every list in the adjacency list of G is scanned. While scanning the adjacency list of v (say), if we encounter u, we put v in the new adjacency list of u.

ALGORITHM LIST TRANSPOSE [G]

for u = 1 to |V[G]|
    for each element v ∈ Adj[u]
        Insert u into the front of Adj^T[v]

To see why it works, notice that if an edge exists from u to v, i.e., v is in the adjacency list of u, then u is present in the adjacency list of v in the transpose of G. Each vertex and each edge is handled once, so the running time is O(V + E).
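A possible Python rendering of the list version is shown below. It assumes the graph is a dictionary in which every vertex appears as a key, and it appends to the end of each new list rather than inserting at the front, since the order inside a list does not matter and appending to a Python list takes constant amortized time. The name list_transpose is an assumption for this sketch.

```python
def list_transpose(adj):
    """Return the adjacency lists of the transpose graph in O(V + E) time."""
    adj_t = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            # Edge u -> v in G becomes edge v -> u in the transpose.
            adj_t[v].append(u)
    return adj_t

adj = {1: [2], 2: [4], 3: [], 4: [1, 2]}
adj_t = list_transpose(adj)
```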

2. Square

The square of a directed graph G = (V, E) is the graph G^2 = (V, E^2) such that (a, b) ∈ E^2 if and only if for some vertex c ∈ V, both (a, c) ∈ E and (c, b) ∈ E. That is, G^2 contains an edge from vertex a to vertex b whenever G contains a path with exactly two edges from vertex a to vertex b.

Algorithm for Computing G^2 from G in the Adjacency-List Representation of G

Create a new array Adj'(A), indexed by V[G]
For each v in V[G] do
    For each u in Adj[v] do
        // v has a path of length 2 to each of the neighbors of u
        make a copy of Adj[u] and append it to Adj'[v]
Return Adj'(A).

For each vertex, we must make a copy of at most |E| list elements. The total time is O(|V| * |E|).
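The adjacency-list procedure might look like this in Python. The sketch assumes a dictionary of lists with every vertex present as a key, and it removes duplicate entries at the end, which the pseudocode above does not do; the name square_adjacency_list is hypothetical.

```python
def square_adjacency_list(adj):
    """Return adjacency lists of G^2: v -> w whenever v -> u -> w in G."""
    adj2 = {v: [] for v in adj}
    for v in adj:
        for u in adj[v]:
            # Every neighbour of u is two edges away from v.
            adj2[v].extend(adj[u])
    # Drop duplicates created when several two-edge paths reach the same vertex.
    return {v: sorted(set(ws)) for v, ws in adj2.items()}

adj = {1: [2], 2: [4], 3: [], 4: [1, 2]}
adj2 = square_adjacency_list(adj)
```

In this example vertex 2 gets a self-loop in G^2 because of the two-edge path 2 -> 4 -> 2.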

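Algorithm for Computing G^2 from G in the Adjacency-Matrix Representation of G

Let A be the adjacency matrix of G, and let C, the matrix for G^2, start with all entries equal to 0.

For i = 1 to |V[G]|
    For j = 1 to |V[G]|
        For k = 1 to |V[G]|
            C[i, j] = C[i, j] + A[i, k] * A[k, j]

After the loops, C[i, j] counts the two-edge paths from i to j, so (i, j) is an edge of G^2 exactly when C[i, j] > 0. Because of the three nested loops, the running time is O(V^3).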
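A short Python sketch of the matrix version follows. It assumes a square 0/1 list-of-lists matrix and writes into a separate result matrix so the input is never overwritten while it is still being read; it records only whether a two-edge path exists rather than counting paths. The function name is an assumption for the example.

```python
def square_adjacency_matrix(A):
    """Return the 0/1 adjacency matrix of G^2 in O(V^3) time."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # A two-edge path i -> k -> j puts edge (i, j) in G^2.
                if A[i][k] and A[k][j]:
                    C[i][j] = 1
    return C

A = [[0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [1, 1, 0, 0]]
C = square_adjacency_matrix(A)
```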

3. Incidence Matrix

The incidence matrix of a directed graph G = (V, E) is a |V| × |E| matrix B = (b_ij) such that

b_ij = -1 if edge j leaves vertex i,
b_ij = 1 if edge j enters vertex i,
b_ij = 0 otherwise.

If B is the incidence matrix and B^T is its transpose, the diagonal of the product matrix BB^T gives the degrees of all the nodes, i.e., if P is the product matrix BB^T then P[i, i] is the degree of node i.

Specifically we have

BB^T(i, j) = Σ_{e ∈ E} b_ie b^T_ej = Σ_{e ∈ E} b_ie b_je

Now,

If i = j, then b_ie b_je = 1 whenever edge e enters or leaves vertex i, and 0 otherwise.

If i ≠ j, then b_ie b_je = -1 when e = (i, j) or e = (j, i), and 0 otherwise.

Therefore

BB^T(i, j) = deg(i) = in_deg(i) + out_deg(i) if i = j

BB^T(i, j) = -(# of edges connecting i and j) if i ≠ j
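The identity can be checked numerically on a small digraph. The sketch below assumes a graph without self-loops, given as an ordered list of vertices and an ordered list of edges; the helper names incidence_matrix and times_own_transpose are made up for this illustration.

```python
def incidence_matrix(vertices, edges):
    """Build B with B[i][j] = -1 if edge j leaves vertex i, 1 if it enters, else 0."""
    index = {v: i for i, v in enumerate(vertices)}
    B = [[0] * len(edges) for _ in vertices]
    for j, (u, v) in enumerate(edges):
        B[index[u]][j] = -1  # edge j leaves u
        B[index[v]][j] = 1   # edge j enters v
    return B

def times_own_transpose(B):
    """Return the product B * B^T as a list of lists."""
    n, m = len(B), len(B[0])
    return [[sum(B[i][e] * B[j][e] for e in range(m)) for j in range(n)]
            for i in range(n)]

vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 4), (4, 2), (4, 1)]
B = incidence_matrix(vertices, edges)
P = times_own_transpose(B)
```

Here the diagonal of P holds the degrees 2, 3, 0, 3, and the entry for vertices 2 and 4 is -2 because two edges, (2, 4) and (4, 2), connect them.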

Breadth First Search (BFS)

Ideas similar to those in breadth-first search are used in

Prim's MST algorithm.

Dijkstra's single-source shortest-path algorithm.

Like depth-first search, BFS traverses a connected component of a given graph and defines a spanning tree.

Algorithm Breadth First Search

BFS starts at a given vertex, which is at level 0. In the first stage, we visit all vertices at level 1. In the second stage, we visit all vertices at level 2, that is, the new vertices adjacent to level 1 vertices, and so on. The BFS traversal terminates when every vertex has been visited.

BREADTH FIRST SEARCH (G, S)

Input: A graph G and a vertex S.

Output: Edges labeled as discovery and cross edges in the connected component.

Create a queue Q.
ENQUEUE (Q, S) // Insert S into Q.
While Q is not empty do
    v = DEQUEUE (Q) // Remove the vertex at the front of Q.
    for all edges e incident on v do
        if edge e is unexplored then
            let w be the other endpoint of e.
            if vertex w is unexplored then
                - mark e as a discovery edge
                - insert w into Q
            else
                - mark e as a cross edge

BFS labels each vertex with the length of a shortest path (in terms of number of edges) from the start vertex.
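One way to turn this description into running code is sketched below in Python. It assumes a simple undirected graph (no parallel edges) stored as a dictionary of neighbour lists, and the return format (a level dictionary plus two edge lists) is a choice made for this example rather than something specified in the notes.

```python
from collections import deque

def bfs(adj, s):
    """Return (level, discovery_edges, cross_edges) for the component containing s."""
    level = {s: 0}              # shortest-path length, in edges, from s
    discovery, cross = [], []
    explored = set()            # undirected edges that already carry a label
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            e = frozenset((v, w))
            if e in explored:
                continue
            explored.add(e)
            if w not in level:
                # First time w is reached: e joins the BFS tree.
                level[w] = level[v] + 1
                discovery.append((v, w))
                queue.append(w)
            else:
                cross.append((v, w))
    return level, discovery, cross

adj = {'s': ['a', 'b'], 'a': ['s', 'b', 'c'], 'b': ['s', 'a'], 'c': ['a']}
level, discovery, cross = bfs(adj, 's')
```

In this small graph the edge between a and b joins two vertices on the same level, so it is reported as a cross edge.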

Example (CLR)

[Figures: Steps 1 through 9 of the BFS traversal.]

Starting vertex (node) is S.

Solid edge = discovery edge.

Dashed edge = cross edge (since none of them connects a vertex to one of its ancestors).

As with depth-first search (DFS), the discovery edges form a spanning tree, which in this case we call the BFS-tree.

BFS can be used to solve the following problems:

Testing whether graph is connected.

Computing a spanning forest of graph.

Computing, for every vertex in graph, a path with the minimum number of edges between start vertex and current vertex or reporting that no such path exists.

Computing a cycle in graph or reporting that no such cycle exists.

Analysis

Total running time of BFS is O(V + E).

Bipartite Graph

We define a bipartite graph as follows: A bipartite graph is an undirected graph G = (V, E) in which V can be partitioned into two sets V1 and V2 such that (u, v) ∈ E implies either u ∈ V1 and v ∈ V2, or u ∈ V2 and v ∈ V1. That is, all edges go between the two sets V1 and V2.

In order to determine if a graph G = (V, E) is bipartite, we perform a BFS on it with a little modification: whenever the BFS is at a vertex u and encounters a vertex v that is already 'gray', our modified BFS checks whether the depths of u and v are both even or both odd. If either of these conditions holds, which means d[u] and d[v] have the same parity, then the graph is not bipartite. Note that this modification does not change the running time of BFS, which remains O(V + E).

Formally, to check if the given graph is bipartite, the algorithm traverses the graph labeling the vertices 0, 1, or 2, corresponding to unvisited, partition 1, and partition 2 nodes. If an edge is detected between two vertices in the same partition, the algorithm returns 0 to report that the graph is not bipartite.

ALGORITHM: BIPARTITE (G, S)

For each vertex u ∈ V[G] - {s} do
    color[u] = WHITE
    d[u] = ∞
    partition[u] = 0
color[s] = GRAY
partition[s] = 1
d[s] = 0
Q = [s]
While queue Q is not empty do
    u = head[Q]
    for each v in Adj[u] do
        if partition[u] = partition[v] then
            return 0
        else if color[v] = WHITE then
            color[v] = GRAY
            d[v] = d[u] + 1
            partition[v] = 3 - partition[u]
            ENQUEUE (Q, v)
    DEQUEUE (Q)
    color[u] = BLACK
Return 1

Correctness

As BIPARTITE (G, S) traverses the graph, it labels the vertices with a partition number consistent with the graph being bipartite. If at any vertex the algorithm detects an inconsistency, it signals this by returning 0. The partition value of u will always be a valid number, since u was enqueued at some point and its partition was assigned at that point. At the test on color[v], the partition of v is left unchanged if it is already set; otherwise it is set to the value opposite to that of vertex u.

Analysis

The lines added to the BFS algorithm take constant time to execute, so the running time is the same as that of BFS, which is O(V + E).
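A compact Python sketch of the same idea is given below. It assumes an undirected graph stored as a dictionary of neighbour lists, tests only the connected component containing s, and returns True or False in place of 1 or 0. The colors and depths of the pseudocode are dropped because the partition label alone is enough to tell visited from unvisited vertices; that simplification is a choice made for this example.

```python
from collections import deque

def is_bipartite(adj, s):
    """Two-color the component of s with BFS; False if some edge joins equal labels."""
    partition = {u: 0 for u in adj}   # 0 = unvisited, 1 and 2 = the two sides
    partition[s] = 1
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if partition[v] == partition[u]:
                return False          # both endpoints landed on the same side
            if partition[v] == 0:
                partition[v] = 3 - partition[u]
                queue.append(v)
    return True

square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
```

A cycle of even length such as the square passes the test, while the triangle, an odd cycle, fails it.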

Diameter of Tree

The diameter of a tree T = (V, E) is the largest of all shortest-path distances in the tree, given by max[dist(u, v)] over all pairs of vertices u and v. As mentioned above, BFS can be used to compute, for every vertex in a graph, a path with the minimum number of edges between the start vertex and that vertex. It is therefore quite easy to compute the diameter of a tree. For each vertex in the tree, we run BFS to get the shortest-path distances from it. Using a global variable maxlength, we record the largest of all shortest-path distances. This clearly takes O(V(V + E)) time.

ALGORITHM: TREE_DIAMETER (T)

maxlength = 0
For S = 0 to |V[T]| - 1
    temp = BFS(T, S) // largest shortest-path distance from S
    if maxlength < temp
        maxlength = temp
return maxlength
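The procedure can be sketched in Python as follows, assuming a non-empty tree stored as a dictionary of neighbour lists. The helper farthest_distance, which stands in for the BFS call in the pseudocode, and the name tree_diameter are assumptions made for this illustration.

```python
from collections import deque

def farthest_distance(adj, s):
    """Run BFS from s and return the largest shortest-path distance found."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def tree_diameter(adj):
    """Try every vertex as the BFS source and keep the largest distance."""
    return max(farthest_distance(adj, s) for s in adj)

# Path 1 - 2 - 3 - 4 with an extra leaf 5 attached to vertex 2.
tree = {1: [2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
diameter = tree_diameter(tree)
```

Since a tree has |V| - 1 edges, each BFS costs O(V) and the whole loop costs O(V^2), which agrees with the O(V(V + E)) bound.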

Depth First Search (DFS)

Depth-first search (DFS) is useful for

Finding a path from one vertex to another.

Testing whether or not a graph is connected.

Computing a spanning tree of a connected graph.

DFS uses the backtracking technique.

Algorithm Depth First Search

The algorithm starts at a specific vertex S in G, which becomes the current vertex. The algorithm then traverses the graph along any edge (u, v) incident to the current vertex u. If the edge (u, v) leads to an already visited vertex v, then we backtrack to the current vertex u. If, on the other hand, edge (u, v) leads to an unvisited vertex v, then we go to v and v becomes our current vertex. We proceed in this manner until we reach a dead end. At this point we start backtracking. The process terminates when backtracking leads back to the start vertex.

Edges that lead to a new vertex are called discovery or tree edges, and edges that lead to an already visited vertex are called back edges.

DEPTH FIRST SEARCH (G, v)

Input: A graph G and a vertex v.

Output: Edges labeled as discovery and back edges in the connected component.

For all edges e incident on v do
    If edge e is unexplored then
        w ← opposite (v, e) // return the endpoint of e distinct from v
        If vertex w is unexplored then
            - mark e as a discovery edge
            - Recursively call DFS (G, w)
        else
            - mark e as a back edge

Example (CLR)

Solid edge = discovery or tree edge.

Dashed edge = back edge.

Each vertex has two time stamps: the first records when the vertex is first discovered, and the second records when the search finishes examining the adjacency list of the vertex.
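The two time stamps can be recorded with a small recursive routine. The Python sketch below assumes a graph stored as a dictionary of neighbour lists and explores only the vertices reachable from s; d and f follow the usual discovery and finishing notation, and the function name dfs_timestamps is hypothetical.

```python
def dfs_timestamps(adj, s):
    """Return (d, f): discovery and finishing times of vertices reachable from s."""
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time               # u is discovered
        for v in adj[u]:
            if v not in d:        # unvisited, so (u, v) is a tree edge
                visit(v)
        time += 1
        f[u] = time               # adjacency list of u fully examined

    visit(s)
    return d, f

adj = {'s': ['a', 'b'], 'a': ['s', 'b', 'c'], 'b': ['s', 'a'], 'c': ['a']}
d, f = dfs_timestamps(adj, 's')
```

For very deep graphs the recursion can hit Python's recursion limit, in which case an explicit stack is the usual alternative.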

DFS can be used to solve the following problems:

Testing whether graph is connected.

Computing a spanning forest of graph.

Computing a path between two vertices of graph or equivalently reporting that no such path exists.

Computing a cycle in graph or equivalently reporting that no such cycle exists.

Analysis

The running time of DFS is O(V + E).

Consider vertex u and vertex v in V[G] after a DFS. Suppose vertex v is a descendent of vertex u in some DFS-tree. Then we have d[u] < d[v] < f[v] < f[u] for the following reasons:

1. Vertex u was discovered before vertex v; and

2. Vertex v was fully explored before vertex u was fully explored.

Note that the converse also holds: if d[u] < d[v] < f[v] < f[u], then vertex v is in the same DFS-tree as vertex u and vertex v is a descendent of vertex u.

Suppose vertex u and vertex v are in different DFS-trees, or suppose vertex u and vertex v are in the same DFS-tree but neither vertex is a descendent of the other. Then one vertex was discovered and fully explored before the other was discovered, i.e., f[u] < d[v] or f[v] < d[u].

Consider a directed graph G = (V, E). After a DFS of graph G we can put each edge into one of four classes:

A tree edge is an edge in a DFS-tree.

A back edge connects a vertex to an ancestor in a DFS-tree. Note that a self-loop is a back edge.

A forward edge is a nontree edge that connects a vertex to a descendent in a DFS-tree.

A cross edge is any other edge in graph G. It connects vertices in two different DFS-trees, or two vertices in the same DFS-tree neither of which is an ancestor of the other.
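The four classes can be told apart during the search itself by looking at the color of the vertex an edge leads to, together with the discovery times. The Python sketch below assumes a directed graph stored as a dictionary of neighbour lists with every vertex present as a key; the function name classify_edges and the string labels are choices made for this example.

```python
def classify_edges(adj):
    """Label every edge of a digraph as 'tree', 'back', 'forward' or 'cross'."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {u: WHITE for u in adj}
    d = {}
    kinds = {}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = GRAY
        time += 1
        d[u] = time
        for v in adj[u]:
            if color[v] == WHITE:
                kinds[(u, v)] = "tree"
                visit(v)
            elif color[v] == GRAY:
                kinds[(u, v)] = "back"      # v is an ancestor still being explored
            elif d[u] < d[v]:
                kinds[(u, v)] = "forward"   # v is a finished descendent of u
            else:
                kinds[(u, v)] = "cross"     # v finished before u was discovered
        color[u] = BLACK

    for u in adj:
        if color[u] == WHITE:
            visit(u)
    return kinds

adj = {1: [2, 3], 2: [3], 3: [1], 4: [3]}
kinds = classify_edges(adj)
```

With vertices started in the order 1, 2, 3, 4, the edge (4, 3) is a cross edge between two different DFS-trees.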

Lemma 1 An Edge (u, v) is a back edge if and only if d[v] < d[u] < f[u] < f[v].

Proof

(=> direction) From the definition of a back edge, it connects vertex u to an ancestor vertex v in a DFS-tree. Hence, vertex u is a descendent of vertex v. Corollary 23.7 in CLR states that vertex u is a proper descendent of vertex v if and only if d[v] < d[u] < f[u] < f[v]. This proves the forward direction. □

(<= direction) Suppose d[v] < d[u] < f[u] < f[v]. Again by Corollary 23.7 (CLR), vertex u is a proper descendent of vertex v. Hence if an edge (u, v) exists from u to v, then it is an edge connecting a descendent vertex u to its ancestor vertex v, so it is a back edge. This proves the backward direction.

Conclusion: Immediate from both directions.

Lemma 2 An edge (u, v) is a cross edge if and only if d[v] < f[v] < d[u] < f[u].

Proof

First take => direction.

Observation 1 For an edge (u, v), d[u] < f[u] and d[v] < f[v], since any vertex has to be discovered before we can finish exploring it.

Observation 2 From the definition of a cross edge, it is an edge which is not a tree edge, forward edge, or back edge. This implies that neither of the relationships for a forward edge [ d[u] < d[v] < f[v] < f[u] ] or a back edge [ d[v] < d[u] < f[u] < f[v] ] can hold for a cross edge.

Since the intervals [d[u], f[u]] and [d[v], f[v]] are either nested or disjoint, from the above two observations we conclude that the only two possibilities are:

(1) d[u] < f[u] < d[v] < f[v] and

(2) d[v] < f[v] < d[u] < f[u]

When the cross edge (u, v) is discovered we must be at vertex u, and vertex v must be black. The reason is that if v were white then edge (u, v) would be a tree edge, and if v were gray then edge (u, v) would be a back edge. Therefore, d[v] < d[u] and hence possibility (2) holds true.

Now take <= direction.

We can prove this direction by eliminating the other edge types that the given relation could describe. If d[v] < f[v] < d[u] < f[u], then edge (u, v) cannot be a tree or a forward edge, since both require d[u] < d[v]. Also, it cannot be a back edge by Lemma 1. Hence it must be a cross edge (please go above and look again at the definition of a cross edge).

Conclusion: Immediately from both directions.

Just for the hell of it, let's determine whether or not an undirected graph contains a cycle. It is not difficult to see that the algorithm for this problem is very similar to DFS(G), except that a cycle is detected when an adjacent vertex is already GRAY. While doing this, the algorithm also takes care not to report a cycle when the GRAY vertex is simply the parent, reached back along the tree edge just traversed.

ALGORITHM DFS_DETECT_CYCLES [G]

For each vertex u in V[G] do
    color[u] = white
    Predecessor[u] = NIL
time = 0
For each vertex u in V[G] do
    if color[u] = white
        DFS_visit(u)

The subalgorithm DFS_visit(u) is as follows:

DFS_visit(u)
    color[u] = GRAY
    d[u] = time = time + 1
    For each v in adj[u] do
        if color[v] = GRAY and Predecessor[u] ≠ v do
            return "cycle exists"
        if color[v] = white do
            Predecessor[v] = u
            Recursively DFS_visit(v)
    color[u] = BLACK
    f[u] = time = time + 1

Correctness

To see why this algorithm works, suppose the node v about to be visited is a gray node. Then there are two possibilities:

1. The node v is the parent of u and we are going back along the tree edge which we traversed to reach u from v. In that case it is not a cycle.

2. The second possibility is that v has already been encountered once during DFS_visit and the edge we are traversing now is a back edge, and hence a cycle is detected.

Time Complexity

A graph G without a cycle has at most |V| - 1 edges. Hence, after examining at most |V| distinct edges the algorithm has either finished or found a back edge, so it detects a cycle at the latest at the |V|th edge, if not before it. Therefore, the algorithm runs in O(V) time.
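A Python sketch of the cycle test is given below. It assumes a simple undirected graph (no parallel edges) stored as a dictionary of neighbour lists, and it returns True or False instead of a message; the time stamps are left out because they play no part in the decision. The function name has_cycle is an assumption for this example.

```python
def has_cycle(adj):
    """Return True if the undirected graph contains a cycle, else False."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {u: WHITE for u in adj}
    pred = {u: None for u in adj}

    def visit(u):
        color[u] = GRAY
        for v in adj[u]:
            # A gray neighbour other than the parent closes a cycle.
            if color[v] == GRAY and pred[u] != v:
                return True
            if color[v] == WHITE:
                pred[v] = u
                if visit(v):
                    return True
        color[u] = BLACK
        return False

    # Start a search from every vertex that is still undiscovered.
    return any(color[u] == WHITE and visit(u) for u in adj)

path = {1: [2], 2: [1, 3], 3: [2]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
```

The search stops as soon as the first cycle is found, which is what keeps the number of examined edges bounded by |V|.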
