PSU CS 311 - Algorithms Analysis and Design Graphs


Graph definitions

There are two kinds of graphs: directed graphs (sometimes called digraphs) and undirected graphs

[Figure: an undirected graph of French cities (Nantes, Paris, Lyon, Grenoble, Bordeaux, Marseille, Monaco) with edge weights 400, 340, 590, 420, 350, 120, 320, 510]

[Figure: a directed graph of steps: start, fill pan with water, take egg from fridge, break egg into pan, boil water, add salt to water]


Graph terminology I

• A graph is a collection of nodes (or vertices; the singular is vertex) and edges (or arcs)
  • Each node contains an element
  • Each edge connects two nodes together (or possibly the same node to itself) and may contain an edge attribute
• A directed graph is one in which the edges have a direction
• An undirected graph is one in which the edges do not have a direction
  • Note: Whether a graph is directed or undirected is a logical distinction: it describes how we think about the graph
  • Depending on the implementation, we may or may not be able to follow a directed edge in the “backwards” direction


Graph terminology II

• The size of a graph is the number of nodes in it
• The empty graph has size zero (no nodes)
• If two nodes are connected by an edge, they are neighbors (and the nodes are adjacent to each other)
• The degree of a node is the number of edges it has
• For directed graphs,
  • If a directed edge goes from node S to node D, we call S the source and D the destination of the edge
    • The edge is an out-edge of S and an in-edge of D
    • S is a predecessor of D, and D is a successor of S
  • The in-degree of a node is the number of in-edges it has
  • The out-degree of a node is the number of out-edges it has


Graph terminology III

A path is a list of nodes such that each node (but the last) is the predecessor of the next node in the list

A cycle is a path whose first and last nodes are the same

Example: (Paris, Nantes, Bordeaux, Lyon) is a path

Example: (Paris, Nantes, Lyon, Paris) is a cycle

A cyclic graph contains at least one cycle

An acyclic graph does not contain any cycles

[Figure: the undirected graph of French cities, repeated for reference]


Graph terminology IV

An undirected graph is connected if there is a path from every node to every other node

A directed graph is strongly connected if there is a path from every node to every other node

A directed graph is weakly connected if the underlying undirected graph is connected

Node X is reachable from node Y if there is a path from Y to X

A subset of the nodes of the graph is a connected component (or just a component) if there is a path from every node in the subset to every other node in the subset, and no node outside the subset can be added without breaking this property


Adjacency-matrix representation I

One simple way of representing a graph is the adjacency matrix

A 2-D array has a mark at [i][j] if there is an edge from node i to node j

For an undirected graph, the adjacency matrix is symmetric about the main diagonal

[Figure: an undirected graph on nodes A to G and its adjacency matrix, with rows and columns labeled A to G]

• This representation is only suitable for small graphs! (Why?)
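As a minimal sketch of the idea (in Python, using a hypothetical four-node undirected graph rather than the one pictured on the slide), the matrix can be held as a list of lists. The names `adjacency_matrix`, `example_edges`, and the node numbering are assumptions made for illustration. The sketch also hints at the answer to "Why?": the matrix always has n × n cells, however few edges the graph has.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix; edges is a list of (i, j) index pairs."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = 1
        if not directed:
            matrix[j][i] = 1  # mirror the mark for an undirected edge
    return matrix

# Hypothetical example graph: nodes 0..3 joined by three undirected edges.
example_edges = [(0, 1), (1, 2), (2, 3)]
m = adjacency_matrix(4, example_edges)

# Symmetric about the main diagonal because the graph is undirected.
is_symmetric = all(m[i][j] == m[j][i] for i in range(4) for j in range(4))

# Storage grows as n * n regardless of the number of edges.
cells = sum(len(row) for row in m)
```

Passing `directed=True` stores each mark only at `[i][j]`, which is the digraph variant.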


Adjacency-matrix representation II

An adjacency matrix can equally well be used for digraphs (directed graphs)

A 2-D array has a mark at [i][j] if there is an edge from node i to node j

Again, this is only suitable for small graphs!

[Figure: a directed graph on nodes A to G and its adjacency matrix, with rows and columns labeled A to G]


Edge-set representation I

• An edge-set representation uses a set of nodes and a set of edges
  • The sets might be represented by, say, linked lists
  • The set links are stored in the nodes and edges themselves
• The only other information in a node is its element (that is, its value); it does not hold information about its edges
• The only other information in an edge is its source and destination (and attribute, if any)
  • If the graph is undirected, we keep links to both nodes, but don’t distinguish between source and destination
• This representation makes it easy to find nodes from an edge, but you must search to find an edge from a node
• This is seldom a good representation


Edge-set representation II

Here we have a set of nodes, and each node contains only its element (not shown)

[Figure: a directed graph on nodes A to G with edges labeled p to w]

Each edge contains references to its source and its destination (and its attribute, if any)

edgeSet = { p: (A, E), q: (B, C), r: (E, B), s: (D, D), t: (D, A), u: (E, G), v: (F, E), w: (G, D) }

nodeSet = {A, B, C, D, E, F, G}
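The edge set and node set above can be sketched in Python as follows. The `Edge` tuple and the helper names `endpoints` and `out_edges` are hypothetical, chosen only for illustration. The sketch shows the asymmetry of this representation: going from an edge to its nodes is a direct lookup, while going from a node to its edges means scanning the whole edge set.

```python
from collections import namedtuple

# Each edge knows only its name, source, and destination.
Edge = namedtuple("Edge", ["name", "source", "destination"])

node_set = {"A", "B", "C", "D", "E", "F", "G"}
edge_set = {
    Edge("p", "A", "E"), Edge("q", "B", "C"), Edge("r", "E", "B"),
    Edge("s", "D", "D"), Edge("t", "D", "A"), Edge("u", "E", "G"),
    Edge("v", "F", "E"), Edge("w", "G", "D"),
}

def endpoints(edge):
    """Edge -> nodes: a direct lookup."""
    return edge.source, edge.destination

def out_edges(node):
    """Node -> edges: requires a scan over every edge in the set."""
    return sorted(e.name for e in edge_set if e.source == node)
```

For example, `out_edges("E")` has to inspect all eight edges before it can report the two that leave E.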


Adjacency-set representation I

• An adjacency-set representation uses a set of nodes
  • Each node contains a reference to the set of its edges
  • For a directed graph, a node might only know about (have references to) its out-edges
• Thus, there is not one single edge set, but rather a separate edge set for each node
  • Each edge would contain its attribute (if any) and its destination (and possibly its source)


Adjacency-set representation II *

Here we have a set of nodes, and each node refers to a set of edges

[Figure: the same directed graph on nodes A to G with edges labeled p to w]

A { p }

B { q }

C { }

D { s, t }

E { r, u }

F { v }

G { w }

Each edge contains references to its source and its destination (and its attribute, if any)

p: (A, E)

q: (B, C)

r: (E, B)

s: (D, D)

t: (D, A)

u: (E, G)

v: (F, E)

w: (G, D)


Adjacency-set representation III

• If the edges have no associated attribute, there is no need for a separate Edge class
  • Instead, each node can refer to a set of its neighbors
  • In this representation, the edges would be implicit in the connections between nodes, not a separate data structure
• For an undirected graph, the node would have references to all the nodes adjacent to it
• For a directed graph, the node might have:
  • references to all the nodes adjacent to it, or
  • references to only those adjacent nodes connected by an out-edge from this node


Adjacency-set representation IV

Here we have a set of nodes, and each node refers to a set of other (pointed to) nodes

The edges are implicit

[Figure: the same directed graph on nodes A to G, without edge labels]

A { E }

B { C }

C { }

D { D, A }

E { B, G }

F { E }

G { D }
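One possible way to hold these sets in Python is a dictionary from each node to the set of nodes it points to. This is only a sketch: the names `successors`, `has_edge`, `out_degree`, and `in_degree` are assumptions, not part of the slides.

```python
# Each node maps to the set of nodes reached by its out-edges.
successors = {
    "A": {"E"}, "B": {"C"}, "C": set(), "D": {"D", "A"},
    "E": {"B", "G"}, "F": {"E"}, "G": {"D"},
}

def has_edge(source, destination):
    """The edge is implicit: it exists if destination is in source's set."""
    return destination in successors[source]

def out_degree(node):
    return len(successors[node])

def in_degree(node):
    # Only out-edges are stored, so in-edges must be found by scanning.
    return sum(1 for targets in successors.values() if node in targets)
```

Following an edge forwards is a single set lookup, while the "backwards" direction needs a scan, which is the trade-off of storing only out-edges.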


Graphs

A graph is a pair G = (V, E), where
• V is a set, and
• E is a set of pairs (v, w) with v, w ∈ V.

We call the elements of V the vertices of G; the elements of E are the edges of G.

Example:

V = {a, b, c, d, e, f, g}
E = {(a, c), (b, d), (b, e), (b, g), (d, e), (d, g), (e, f)}

[Figure: the example graph on vertices a to g]

• Vertices a and c are adjacent; vertices f and g are not.
• Vertex b and edge (b, g) are incident; vertex e and edge (d, g) are not.
• The neighbourhood of vertex b is the set N(b) = {d, e, g}.
• The degree deg(b) of vertex b is 3.
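The example can be checked with a short Python sketch. The function names are hypothetical, each pair in E is treated as an undirected edge, and the degree is computed as the size of the neighbourhood, which assumes no self-loops or repeated edges.

```python
V = {"a", "b", "c", "d", "e", "f", "g"}
E = {("a", "c"), ("b", "d"), ("b", "e"), ("b", "g"),
     ("d", "e"), ("d", "g"), ("e", "f")}

def neighbourhood(vertex):
    """Return N(vertex), treating each pair in E as an undirected edge."""
    result = set()
    for v, w in E:
        if v == vertex:
            result.add(w)
        elif w == vertex:
            result.add(v)
    return result

def degree(vertex):
    # Valid here because the example has no self-loops or repeated edges.
    return len(neighbourhood(vertex))

def adjacent(v, w):
    return (v, w) in E or (w, v) in E
```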


Undirected vs. Directed Graphs

[Figure: the example graph drawn with undirected edges]

In an undirected graph, edges are unordered pairs (v, w).

That is, edges (v, w) and (w, v) are the same edge; there is no sense of direction.

[Figure: the example graph drawn with directed edges]

In a directed graph, edges are ordered pairs (v, w).

That is, edges (v, w) and (w, v) are two different edges. Edge (v, w) is directed from v to w. Vertex v is the source of edge (v, w); vertex w is its target.
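A small Python sketch of this distinction, assuming we model an unordered pair as a `frozenset` and an ordered pair as a tuple (both helper names are hypothetical):

```python
def undirected_edge(v, w):
    """Unordered pair: the order of the endpoints does not matter."""
    return frozenset((v, w))

def directed_edge(v, w):
    """Ordered pair: position 0 is the source, position 1 is the target."""
    return (v, w)

same = undirected_edge("v", "w") == undirected_edge("w", "v")
different = directed_edge("v", "w") != directed_edge("w", "v")
source, target = directed_edge("v", "w")
```

One caveat of this choice: a self-loop (v, v) collapses to a one-element `frozenset`, so code that unpacks two endpoints would need to handle that case.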


Paths and Cycles

A path is a sequence of vertices P = (v_0, v_1, …, v_k) such that, for 1 ≤ i ≤ k, edge (v_{i-1}, v_i) ∈ E.

Path P is simple if no vertex appears more than once in P.

[Figure: an example graph on vertices a to j]


P1 = (g, d, e, b, d, a, h) is not simple: vertex d appears twice.


P2 = (f, i, c, j) is simple.


A cycle is a sequence of vertices C = (v_0, v_1, …, v_{k-1}) such that, for 0 ≤ i < k, edge (v_i, v_{(i+1) mod k}) ∈ E.

Cycle C is simple if path (v_0, v_1, …, v_{k-1}) is simple.

C1 = (a, h, j, c, i, f, g, d) is simple.


C2 = (g, d, b, h, j, c, i, e, b) is not simple: vertex b appears twice.
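The path and cycle definitions translate almost directly into code. Because the edges of the graph pictured on these slides are not listed in the text, this sketch checks paths against the smaller example graph E = {(a, c), (b, d), …} given earlier, treated as undirected; the function names are hypothetical. The simplicity test needs no edges at all, so it can also be applied to P1 and P2.

```python
EDGES = {("a", "c"), ("b", "d"), ("b", "e"), ("b", "g"),
         ("d", "e"), ("d", "g"), ("e", "f")}

def has_edge(v, w):
    return (v, w) in EDGES or (w, v) in EDGES

def is_path(p):
    """Every consecutive pair (v_{i-1}, v_i) must be an edge."""
    return all(has_edge(p[i - 1], p[i]) for i in range(1, len(p)))

def is_simple(p):
    """No vertex appears more than once."""
    return len(set(p)) == len(p)

def is_cycle(c):
    """Every pair (v_i, v_{(i+1) mod k}) must be an edge, wrapping around."""
    k = len(c)
    return k > 0 and all(has_edge(c[i], c[(i + 1) % k]) for i in range(k))

P1 = ("g", "d", "e", "b", "d", "a", "h")
P2 = ("f", "i", "c", "j")
```

Here `is_simple(P1)` is false because d repeats, while `is_simple(P2)` is true.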


Subgraphs

A graph H = (W, F) is a subgraph of a graph G = (V, E) if W ⊆ V and F ⊆ E.
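With Python sets, the two subset conditions can be checked directly. This is a sketch under stated assumptions: graphs are (vertices, edges) pairs, edges are tuples written in one consistent orientation, and the example graphs `G`, `H1`, `H2`, `H3` are hypothetical. The extra check that every edge of F stays inside W reflects the requirement that H is itself a graph.

```python
def is_subgraph(H, G):
    """H and G are (vertices, edges) pairs of sets."""
    W, F = H
    V, E = G
    # H must itself be a graph: every edge in F joins two vertices of W.
    edges_stay_inside = all(v in W and w in W for v, w in F)
    return W <= V and F <= E and edges_stay_inside

G = ({"a", "b", "c"}, {("a", "b"), ("b", "c")})
H1 = ({"a", "b"}, {("a", "b")})
H2 = ({"a", "b"}, {("a", "c")})   # edge is not in E, and c is not in W
H3 = ({"a", "b", "c"}, set())     # keeps all vertices but no edges
```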

Spanning Graphs

A spanning graph of G is a subgraph of G that contains all vertices of G.


Connectivity

The connected components of a graph are its maximal connected subgraphs.

A graph G is connected if there is a path between any two vertices in G.

Trees and Forests

A tree is a graph that is connected and contains no cycles.

A forest is a graph that contains no cycles.

The connected components of a forest are trees.
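One way to test these definitions is a union-find pass over the edges; this is a sketch of that technique, not something the slides prescribe. It assumes an undirected graph with no repeated edges, and the names `is_forest` and `is_tree` are hypothetical. The tree test relies on the standard fact that a forest on |V| vertices with |V| - 1 edges has exactly one component; treating the empty graph as not a tree is a convention chosen here.

```python
def is_forest(vertices, edges):
    """True if the undirected graph contains no cycles."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for v, w in edges:
        root_v, root_w = find(v), find(w)
        if root_v == root_w:
            return False  # v and w were already connected: a cycle closes
        parent[root_v] = root_w
    return True

def is_tree(vertices, edges):
    # A nonempty forest with exactly |V| - 1 edges is connected.
    return (len(vertices) > 0
            and len(edges) == len(vertices) - 1
            and is_forest(vertices, edges))
```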


Spanning Trees and Spanning Forests

A spanning tree of a graph is a spanning graph that is a tree.

A spanning forest of a graph is a spanning graph that is a forest.

Graph Representations

Incidence list representation:

[Figure: a graph with vertices u, v, w, x, y, z and edges a, b, c, d, e, shown with its incidence lists]


Searching a graph

• With certain modifications, any tree search technique can be applied to a graph
  • This includes depth-first, breadth-first, depth-first iterative deepening, and other types of searches
• The difference is that a graph may have cycles
  • We don’t want to search around and around in a cycle
• To avoid getting caught in a cycle, we must keep track of which nodes we have already explored
• There are two basic techniques for this:
  • Keep a set of already explored nodes, or
  • Mark the node itself as having been explored


Example: Depth-first search

Here is how to do DFS on a tree:

    Put the root node on a stack;
    while (stack is not empty) {
        remove a node from the stack;
        if (node is a goal node) return success;
        put all children of the node onto the stack;
    }
    return failure;

Here is how to do DFS on a graph:

    Put the starting node on a stack;
    while (stack is not empty) {
        remove a node from the stack;
        if (node has already been visited) continue;
        mark the node as visited;
        if (node is a goal node) return success;
        put all adjacent nodes of the node onto the stack;
    }
    return failure;
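The graph version of the pseudocode can be sketched in Python as follows, using the "keep a set of already explored nodes" technique. The function name, the `is_goal` callback, and the dictionary-of-sets graph format are assumptions; the sample graph reuses the adjacency sets A { E }, B { C }, … from the adjacency-set slides.

```python
def depth_first_search(graph, start, is_goal):
    """graph maps each node to an iterable of adjacent nodes."""
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue          # already explored: do not go around a cycle
        visited.add(node)
        if is_goal(node):
            return True
        stack.extend(graph[node])
    return False

# Directed graph from the adjacency-set slides; it contains cycles.
graph = {
    "A": {"E"}, "B": {"C"}, "C": set(), "D": {"D", "A"},
    "E": {"B", "G"}, "F": {"E"}, "G": {"D"},
}
found_c = depth_first_search(graph, "A", lambda n: n == "C")
found_f = depth_first_search(graph, "A", lambda n: n == "F")
```

The second search fails because F is not reachable from A, and it terminates only because visited nodes are skipped.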


Finding connected components

• A depth-first search can be used to find connected components of a graph
  • A connected component is a set of nodes; therefore,
  • A set of connected components is a set of sets of nodes
• To find the connected components of a graph:

    while there is a node not assigned to a component {
        put that node in a new component
        do a DFS from the node, and put every node reached into the same component
    }
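A minimal Python sketch of this loop, assuming an undirected graph stored as a dictionary from each node to the set of its neighbours. The function name and the six-node example graph are hypothetical.

```python
def connected_components(graph):
    """Return a list of components, each a set of nodes."""
    components = []
    assigned = set()
    for start in graph:
        if start in assigned:
            continue
        component = set()          # a new component for this node
        stack = [start]
        while stack:               # iterative DFS from start
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        assigned |= component
        components.append(component)
    return components

# Hypothetical undirected graph with three components.
example = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}, 6: set()}
result = connected_components(example)
```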


Graph applications

Graphs can be used for:
• Finding a route to drive from one city to another
• Finding connecting flights from one city to another
• Determining least-cost highway connections
• Designing optimal connections on a computer chip
• Implementing automata
• Implementing compilers
• Doing garbage collection
• Representing family histories
• PERT charts
• Playing games


Shortest-path

Suppose we want to find the shortest path from node X to node Y

It turns out that, in order to do this, we need to find the shortest path from X to all other nodes. Why? If we don’t know the shortest path from X to Z, we might overlook a shorter path from X to Y that contains Z

Dijkstra’s Algorithm finds the shortest path from a given node to all other reachable nodes


Dijkstra’s algorithm I

Dijkstra’s algorithm builds up a tree: there is a path from each node back to the starting node

For example, in the following graph, we want to find shortest paths from node B

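• Edge values in the graph are weights
• Node values in the tree are total weights
• The arrows point in the right direction for what we need

[Figure: left, the example graph on nodes A to F with edge weights 3, 5, 1, 5, 2, 4, 5, 1; right, the resulting shortest-path tree rooted at B with total weights 0, 3, 4, 6, 8, 9]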


Dijkstra’s algorithm II

For each vertex v, Dijkstra’s algorithm keeps track of three pieces of information:
• A boolean telling whether we know the shortest path to that node (initially false for every node; the starting node becomes known in the first phase)
• The length of the shortest path to that node known so far (0 for the starting node, infinity for every other node)
• The predecessor of that node along the shortest known path (initially unknown for all nodes)

Dijkstra’s algorithm proceeds in phases. At each step:
• From the vertices for which we don’t know the shortest path, pick a vertex v with the smallest distance known so far
• Set v’s “known” field to true
• For each vertex w adjacent to v, test whether its distance so far is greater than v’s distance plus the distance from v to w; if so, set w’s distance to the new distance and w’s predecessor to v
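The phases above can be sketched in Python with a binary heap standing in for "pick a vertex with the smallest distance known so far". This is one common way to implement the selection step, not necessarily the one the slides have in mind. The graph below is an assumption: its seven directed edges and weights are inferred from the predecessor entries of the worked example in these slides, since the figure itself is not recoverable from the text.

```python
import heapq
import math

def dijkstra(graph, start):
    """graph maps node -> {neighbour: weight}; weights must be non-negative."""
    distance = {node: math.inf for node in graph}
    predecessor = {node: None for node in graph}
    known = set()
    distance[start] = 0
    heap = [(0, start)]
    while heap:
        dist_v, v = heapq.heappop(heap)
        if v in known:
            continue              # stale heap entry for a finished vertex
        known.add(v)              # v's shortest distance is now final
        for w, weight in graph[v].items():
            if distance[w] > dist_v + weight:
                distance[w] = dist_v + weight
                predecessor[w] = v
                heapq.heappush(heap, (distance[w], w))
    return distance, predecessor

# Assumed edges, inferred from the worked example (start node B).
graph = {
    "A": {"C": 1}, "B": {"A": 3, "C": 5}, "C": {"D": 2, "E": 4},
    "D": {"F": 5}, "E": {"F": 1}, "F": {},
}
distance, predecessor = dijkstra(graph, "B")
```

Following `predecessor` from any node back to B recovers the shortest path in reverse, which is the tree the algorithm builds up.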


Dijkstra’s algorithm III

Three pieces of information for each node:
• + if the minimum distance is known for sure
• The best distance so far
• The node’s predecessor

[Figure: the example graph on nodes A to F with edge weights 3, 5, 1, 5, 2, 4, 5, 1]

node   init’ly   1     2     3     4     5     6
A      inf       3B    +3B   +3B   +3B   +3B   +3B
B      0-        +0-   +0-   +0-   +0-   +0-   +0-
C      inf       5B    4A    +4A   +4A   +4A   +4A
D      inf       inf   inf   6C    +6C   +6C   +6C
E      inf       inf   inf   8C    8C    +8C   +8C
F      inf       inf   inf   inf   11D   9E    +9E

[Figure: the resulting shortest-path tree rooted at B, with total weights B 0, A 3, C 4, D 6, E 8, F 9]


Summary

• A graph may be directed or undirected
• The edges (=arcs) may have weights or contain other data, or they may be just connectors
• Similarly, the nodes (=vertices) may or may not contain data
• There are various ways to represent graphs
  • The “best” representation depends on the problem to be solved
  • You need to consider what kind of access needs to be quick or easy
• Many tree algorithms can be modified for graphs
  • Basically, this means some way to recognize cycles


A graph puzzle

[Figure: a chain of nodes (the leader), beginning at "start here", that leads into a cycle of nodes (the loop)]

• Suppose you have a directed graph with the above shape
  • You don’t know how many nodes are in the leader
  • You don’t know how many nodes are in the loop
  • You don’t know how many nodes there are total
• Devise an O(n) algorithm (n being the total number of nodes) to decide when you must already be in the loop
  • This is not asking to find the first node in the loop
  • You can only use a fixed (constant) amount of extra memory
  • You cannot mark the graph as you go