Notes

• This set of slides (Handout #7) is missing the material on Reductions and Greed that was presented in class. Those slides are still under construction, and will be posted as Handout #6.

• The quiz on Thursday of 8th week will cover Greed and Dynamic Programming, and material through HW #4

Topological Sorting

• Prelude to shortest paths
• Generic scheduling problem
• Input:
  – Set of tasks {T1, T2, T3, …, Tn}
    • Example: getting dressed in the morning: put on shoes, socks, shirt, pants, belt, …
  – Set of dependencies {T1 → T2, T3 → T4, T5 → T1, …}
    • Example: must put on socks before shoes, pants before belt, …
• Want:
  – An ordering of the tasks that is consistent with the dependencies
• Problem representation: Directed Acyclic Graph (DAG)
  – Vertices = tasks; directed edges = dependencies
  – Acyclic: if there is a cycle of dependencies, no solution is possible

Topological Sorting

• TOP_SORT PROBLEM: Given a DAG G = (V,E) with |V| = n, assign labels 1, ..., n to the vertices v_i ∈ V s.t. if v has label k, all vertices reachable from v have labels > k
• “Induction idea”:
  – Know how to label DAGs with < n vertices
• Claim: A DAG G always has some vertex with indegree = 0
  – Take an arbitrary vertex v. If v doesn’t have indegree = 0, traverse any incoming edge to reach a predecessor of v. If this vertex doesn’t have indegree = 0, traverse any incoming edge to reach a predecessor, etc.
  – Eventually, this process will either identify a vertex with indegree = 0, or else reach a vertex that has been reached previously (a contradiction, given that G is acyclic).
• “Inductive approach”: Find v with indegree(v) = 0, give it the lowest available label, then delete v (and its incident edges), update the indegrees of the remaining vertices, and repeat
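The inductive approach above can be sketched in Python as follows. This is a minimal sketch, not the course's reference code: the function name `top_sort`, the 0-based vertex numbering, and the small getting-dressed instance `DRESS_EDGES` are assumptions made for illustration.

```python
from collections import deque

def top_sort(n, edges):
    """Label vertices 0..n-1 of a DAG with 1..n so every edge goes
    from a lower label to a higher label.

    Raises ValueError if the graph contains a cycle.
    """
    succ = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    ready = deque(v for v in range(n) if indeg[v] == 0)  # indegree-0 vertices
    label = [0] * n
    next_label = 1
    while ready:
        v = ready.popleft()
        label[v] = next_label      # lowest available label
        next_label += 1
        for w in succ[v]:          # "delete" v: update indegrees of successors
            indeg[w] -= 1
            if indeg[w] == 0:
                ready.append(w)
    if next_label <= n:            # some vertex never reached indegree 0
        raise ValueError("graph has a cycle; no consistent ordering exists")
    return label

# Hypothetical instance: 0=socks, 1=shoes, 2=pants, 3=belt, 4=shirt
DRESS_EDGES = [(0, 1), (2, 1), (2, 3), (4, 3)]
```

Each vertex and each edge is handled once, so the sketch runs in O(V+E) time.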

Dynamic Programming Key Phrases

• As we discuss shortest paths, and then the dynamic programming approach, keep in mind the following “key phrases”
  – “Principle of Optimality”: Any subsolution of an optimal solution is itself an optimal solution
  – “Overlapping Subproblems”: Can exploit a polynomially bounded number of possible subproblems
  – “Bottom-Up D/Q” / “Table (cache) of Subproblem Solutions”: avoid recomputation by tabulating subproblem solutions
  – “Relaxation / Successive Approximation”: Often, the DP approach makes multiple passes, each time solving a less restricted version of the original problem instance
    • Eventually, solve completely unrestricted = original problem instance

Single-Source Shortest Paths

• Given G = (V,E) a directed graph, L : E → ℜ+ a length function, and a distinguished source s ∈ V. Find the shortest path in G from s to each v_i ∈ V, v_i ≠ s.
• Let L*(i,j) = length of SP from v_i to v_j
• Lemma: Suppose S ⊂ V contains s = v_1 and L*(1,w) is known ∀ w ∈ S. For v_i ∉ S:
  – Let D(v_i) = min_{w ∈ S} [ L*(1,w) + L(w,i) ]   (*)
    L*(1,w) = path length, with path restricted to vertices of S
    L(w,i) = edge length
  – Let v_v minimize D(v_i) over all nodes v_i ∉ S   (**)
  Then L*(1,v) = D(v_v).
• Notation: D(v) = length of the 1-to-v SP that uses only vertices of S (except for v). (D(v) is not necessarily the same as L*(1,v), because the path is restricted!)

Single-Source Shortest Paths

• Proof of Lemma: To prove equality, prove ≤ and ≥
• L*(1,v) ≤ D(v) is obvious because D(v) is the minimum length of a restricted path, while L*(1,v) is unrestricted
• Let L*(1,v) be the length of the shortest path s = v_1, v_2, …, v_r = v from the source v_1 to v. Let v_j be the first vertex in this SP that is not in S
• Then:
    L*(1,v) = L*(1,v_{j-1}) + L(v_{j-1},v_j) + L*(v_j,v)   // else, not a shortest path
            ≥ D(v_j) + L*(v_j,v)   // D(v_j) minimizes over all w in S, including v_{j-1}
            ≥ D(v) + L*(v_j,v)   // because both v_j and v_v ∉ S, but v_v chosen first
            ≥ D(v)   // since L*(v_j,v) ≥ 0
• Lemma ⇒ Dijkstra’s Algorithm

A Fact About Shortest Paths

• Triangle Inequality:

δ(u,v) ≤ δ(u,x) + δ(x,v)

(shortest path distances induce a metric)

[Figure: triangle on vertices u, x, v]

Shortest Path Formulations

• Given a graph G = (V,E) and w: E → ℜ
  – (1 to 2) “s-t”: Find a shortest path from s to t
  – (1 to all) “single-source”: Find a shortest path from a source s to every other vertex v ∈ V
  – (All to all) “all-pairs”: Find a shortest path from every vertex to every other vertex
• Weight of path <v[1],...,v[k]> = Σ_i w(v[i],v[i+1])
• Sometimes: no negative edges
  – Examples of “negative edges”: travel inducements, exothermic reactions in chemistry, unprofitable transactions in arbitrage, …
• Always: no negative cycles
  – Makes the shortest-path problem well-defined

[Figure: a cycle with total weight < 0]

Shortest Paths

• First case: All edges have positive length
  – Length of (v_i,v_j) edge = d_ij
• Condition 1: d_ij > 0
• Condition 2: d_ij + d_jk ≤ d_ik for some i, j, k
  – else shortest-path problem would be trivial
• Observation 1: Length of a path > length of any of its proper subpaths
• Observation 2: Any subpath of a shortest path is itself a shortest path ⇒ Principle of Optimality
• Observation 3: Any shortest path contains ≤ n-1 edges ⇐ pigeonhole principle; assumes no negative cycles; n nodes total

Shortest Paths

• Scenario: All shortest paths from v_0 = source to other nodes are ordered by increasing length:
  – |P_1| ≤ |P_2| ≤ … ≤ |P_{n-1}|
  – Index nodes accordingly
• Algorithm: Find P_1, then find P_2, etc.
• Q: How many edges are there in P_1?
  – Exactly 1 edge, else can find a subpath that is shorter
• Q: How many edges are there in P_k?
  – At most k edges, else can find k (shorter) subpaths, which would contradict the definition of P_k
• Observation 4: P_k contains ≤ k edges
• To find P_1: only look at one-edge paths (min = P_1)
• To find P_2: only look at one- and two-edge paths
  – But, need only consider two-edge paths of form d_01 + d_1i
  – Else would have more than one path shorter than P_2, a contradiction

Another Presentation of Dijkstra’s Algorithm

• Terminology
  – Permanent label: true SP distance from v_0 to v_i
  – Temporary label: restricted SP distance from v_0 to v_i (going through only existing permanently-labeled nodes)
  – Permanently labeled nodes = set S in previous development
• Dijkstra’s Algorithm
  0. All vertices v_i, i = 1, …, n-1, receive temporary labels l_i with value d_0i
  LOOP:
  1. Among all temporary labels, pick l_k = min_i l_i and change l_k to l_k* (i.e., make v_k’s label permanent) // stop if no temporary labels left
  2. Replace all temporary labels of v_k’s neighbors, using l_i ← min(l_i, l_k* + d_ki)
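The labeling loop above can be sketched directly on a length matrix. This is a minimal O(n²) sketch under stated assumptions: the name `dijkstra_labels`, the matrix encoding with `INF` for a missing edge, and the `EXAMPLE` matrix (the five-vertex graph of these slides, with edges reconstructed from its label table) are illustrative choices, not part of the original slides.

```python
import math

INF = math.inf

def dijkstra_labels(d, source=0):
    """Label-setting Dijkstra on an n x n length matrix d (INF = no edge).

    Assumes all edge lengths are non-negative. Returns the permanent
    labels, i.e. the shortest-path distances from `source`.
    """
    n = len(d)
    label = [d[source][i] for i in range(n)]   # step 0: temporary labels d_0i
    label[source] = 0
    permanent = [False] * n
    permanent[source] = True
    for _ in range(n - 1):
        # step 1: the smallest temporary label becomes permanent
        k = min((i for i in range(n) if not permanent[i]),
                key=lambda i: label[i])
        permanent[k] = True
        # step 2: update the temporary labels using l_k* + d_ki
        for i in range(n):
            if not permanent[i]:
                label[i] = min(label[i], label[k] + d[k][i])
    return label

# Vertex order S, A, B, C, D; undirected edges
# S-A 8, S-B 3, S-D 2, A-B 4, A-C 1, C-D 2 (assumed reconstruction)
EXAMPLE = [
    [0,   8,   3,   INF, 2],
    [8,   0,   4,   1,   INF],
    [3,   4,   0,   INF, INF],
    [INF, 1,   INF, 0,   2],
    [2,   INF, INF, 2,   0],
]
```

On `EXAMPLE`, the labels become permanent in the order D, B, C, A.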

Prim’s Algorithm vs. Dijkstra’s Algorithm

• Prim: Iteratively add edge e_ij to T, such that v_i ∈ T, v_j ∉ T, and d_ij is minimum
• Dijkstra: Iteratively add edge e_ij to T, such that v_i ∈ T, v_j ∉ T, and l_i + d_ij is minimum
• Both are building trees, in very similar ways!
  – Prim: Minimum Spanning Tree
  – Dijkstra: Shortest Path Tree
• What kind of tree does the following algorithm build?
• Prim-Dijkstra: Iteratively add edge e_ij to T, such that v_i ∈ T, v_j ∉ T, and c · l_i + d_ij is minimum // 0 ≤ c ≤ 1

Bellman-Ford Algorithm

• Idea: Successive Approximation / Relaxation
  – Find SP using ≤ 1 edges
  – Find SP using ≤ 2 edges
  – …
  – Find SP using ≤ n-1 edges ⇒ have true shortest paths
• Let l_j^(k) denote shortest v_0 – v_j pathlength using ≤ k edges
• Then, l_j^(1) = d_0j ∀ j = 1, …, n-1 // d_ij = ∞ if no i-j edge
• In general, l_j^(k+1) = min { l_j^(k), min_i (l_i^(k) + d_ij) }
  – l_j^(k): don’t need k+1 arcs
  – min_i (l_i^(k) + d_ij): view as length-k SP plus a single edge
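The recurrence above can be run pass by pass, keeping every label vector so the passes can be inspected. This is a sketch under assumptions: `bellman_ford_passes` and the matrix `EXAMPLE` (vertex order S, A, B, C, D, edges reconstructed from the label table in these slides) are names introduced here for illustration.

```python
import math

INF = math.inf

def bellman_ford_passes(d, source=0):
    """Return the label vectors l^(1), ..., l^(n-1).

    d is an n x n length matrix with INF for missing edges and 0 on the
    diagonal; no negative cycles are assumed. Entry j of pass k is the
    shortest source-to-j length using at most k edges.
    """
    n = len(d)
    labels = [d[source][j] for j in range(n)]      # l_j^(1) = d_0j
    labels[source] = 0
    history = [labels]
    for _ in range(n - 2):                         # passes 2 .. n-1
        prev = history[-1]
        nxt = [min(prev[j], min(prev[i] + d[i][j] for i in range(n)))
               for j in range(n)]
        history.append(nxt)
    return history

# Undirected edges S-A 8, S-B 3, S-D 2, A-B 4, A-C 1, C-D 2 (assumed reconstruction)
EXAMPLE = [
    [0,   8,   3,   INF, 2],
    [8,   0,   4,   1,   INF],
    [3,   4,   0,   INF, INF],
    [INF, 1,   INF, 0,   2],
    [2,   INF, INF, 2,   0],
]
```

Each pass costs O(n²) on a matrix, so the whole run is O(n³); with an edge list the same idea gives O(VE).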

Bellman-Ford vs. Dijkstra

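[Figure: example graph on vertices S, A, B, C, D with edge lengths S–A = 8, S–B = 3, S–D = 2, A–B = 4, A–C = 1, C–D = 2 (edge assignment as implied by the table)]

Bellman-Ford labels by pass:

Label   Pass 1   Pass 2                        Pass 3                        Pass 4
A       8        min(8, 3+4, ∞+1, 2+∞) = 7     min(7, 3+4, 4+1, 2+∞) = 5     min(5, 3+4, 4+1, 2+∞) = 5
B       3        min(3, 8+4, ∞+∞, 2+∞) = 3     min(3, 7+4, 4+∞, 2+∞) = 3     min(3, 5+4, 4+∞, 2+∞) = 3
C       ∞        min(∞, 8+1, 3+∞, 2+2) = 4     min(4, 7+1, 3+∞, 2+2) = 4     min(4, 5+1, 3+∞, 2+2) = 4
D       2        min(2, 8+∞, 3+∞, ∞+2) = 2     min(2, 7+∞, 3+∞, 4+2) = 2     min(2, 5+∞, 3+∞, 4+2) = 2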

Bellman-Ford vs. Dijkstra

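[Figure: example graph on vertices S, A, B, C, D with edge lengths S–A = 8, S–B = 3, S–D = 2, A–B = 4, A–C = 1, C–D = 2 (edge assignment as implied by the table)]

Dijkstra labels by pass (* = label made permanent):

Label   Pass 1   Pass 2                Pass 3                Pass 4
A       8        min([8], 2+∞) = 8     min([8], 3+4) = 7     min([7], 4+1) = 5*
B       3        min([3], 2+∞) = 3*
C       ∞        min([∞], 2+2) = 4     min(4, 3+∞) = 4*
D       2*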

Special Case: DAGs (24.2)

• Longest-Path Problem: well-defined only when there are no cycles
• DAG: topologically sort the vertices ⇒ labels v_1, …, v_n s.t. all edges directed from v_i to v_j, i < j
• Let l_j denote longest v_0 – v_j pathlength
  – l_0 = 0
  – l_1 = d_01 // d_ij = -∞ if no i-j edge
  – l_2 = max(d_01 + d_12, d_02)
  – In general, l_k = max_{j<k} (l_j + d_jk)
• Shortest pathlength in DAG: replace max by min, use d_ij = +∞ if no i-j edge
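The longest-path recurrence for a topologically numbered DAG can be sketched as below. The function name `dag_longest_paths`, the edge-triple format, and the four-vertex instance `SMALL_DAG` are hypothetical and chosen only to illustrate the recurrence.

```python
import math

def dag_longest_paths(n, edges):
    """Longest path lengths from vertex 0 in a DAG whose vertices are
    already topologically numbered: every edge (j, k, length) has j < k.

    Returns l with l[k] = longest 0-to-k path length (-inf if unreachable).
    """
    incoming = [[] for _ in range(n)]
    for j, k, length in edges:
        if not j < k:
            raise ValueError("vertices must be numbered in topological order")
        incoming[k].append((j, length))
    l = [-math.inf] * n
    l[0] = 0
    for k in range(1, n):                  # l_k = max over j < k of (l_j + d_jk)
        for j, length in incoming[k]:
            l[k] = max(l[k], l[j] + length)
    return l

# Hypothetical instance: (tail, head, length)
SMALL_DAG = [(0, 1, 3), (0, 2, 2), (1, 2, 4), (1, 3, 1), (2, 3, 5)]
```

Every edge is examined exactly once, which is where the O(V+E) bound comes from; replacing `max` by `min` and `-math.inf` by `math.inf` gives the shortest-path version.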

DAG Shortest Paths Complexity (24.2)

• Bellman-Ford = O(VE)
• Topological sort O(V+E) (DFS)
• Will never relax edges out of vertex v until have done all edges in to v
  – Runtime O(V+E)
• Application: PERT (program evaluation and review technique): critical path is the longest path through the DAG

All-Pairs Shortest Paths (25.1-25.2)

• Directed graph G = (V,E), weight w: E → ℜ
• Goal: Create n × n matrix of SP distances δ(u,v)
• Running Bellman-Ford once from each vertex: O(V²E) = O(V⁴) on dense graphs
• Adjacency-matrix representation of graph:
  – n × n matrix W = (w_ij) of edge weights
  – assume w_ii = 0 ∀ i; SP to self has no edges, as long as there are no negative cycles

Simple APSP Dynamic Programming (25.1)

• d_ij^(m) = weight of s-p from i to j with ≤ m edges
    d_ij^(0) = 0 if i = j and d_ij^(0) = ∞ if i ≠ j
    d_ij^(m) = min_k { d_ik^(m-1) + w_kj }
• Runtime = O(n⁴)
  – n-1 passes, each computing n² d’s in O(n) time

[Figure: paths from i to j, with path segments labeled m-1]

Matrix Multiplication (25.1)

• Similar: C = A · B, two n × n matrices
    c_ij = Σ_k a_ik · b_kj ⇒ O(n³) operations
• replacing: ‘‘+’’ → ‘‘min’’, ‘‘·’’ → ‘‘+’’
  – gives c_ij = min_k {a_ik + b_kj}
  – D^(m) = D^(m-1) ‘‘×’’ W
  – identity matrix is D^(0)
  – Cannot use Strassen’s because no subtraction
• Time is still O(n · n³) = O(n⁴)
• Repeated squaring: W^(2n) = W^n × W^n (addition chains). Compute W, W², W⁴, ..., W^(2^k), k = log n ⇒ O(n³ log n)
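The (min, +) product and the repeated-squaring idea can be sketched as follows. The names `min_plus` and `apsp_repeated_squaring`, and the three-vertex matrix `W3` with one negative edge, are assumptions introduced for illustration.

```python
import math

INF = math.inf

def min_plus(a, b):
    """'Multiply' two n x n matrices with + replaced by min and * by +."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apsp_repeated_squaring(w):
    """All-pairs shortest-path distances from the weight matrix w.

    Assumes w[i][i] == 0, INF for missing edges, and no negative cycles.
    Squares the matrix until paths of at least n-1 edges are covered.
    """
    n = len(w)
    d, m = w, 1                 # d covers paths of at most m edges
    while m < n - 1:
        d = min_plus(d, d)      # O(n^3) per squaring
        m *= 2
    return d

# Hypothetical directed instance: 0->1 (4), 1->2 (-2), 2->0 (1)
W3 = [
    [0,   4,   INF],
    [INF, 0,   -2],
    [1,   INF, 0],
]
```

Overshooting n-1 edges is harmless here because, without negative cycles, extra edges never shorten a path.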

Floyd-Warshall Algorithm (26.2/25.2)

• Also DP, but even faster (by another log n factor ⇒ O(n³))
• c_ij^(m) = weight of SP from i to j with intermediate vertices in the set {1, 2, ..., m} ⇒ δ(i,j) = c_ij^(n)
• DP: compute c_ij^(n) in terms of smaller c_ij^(n-1)
  – c_ij^(0) = w_ij
  – c_ij^(m) = min { c_ij^(m-1), c_im^(m-1) + c_mj^(m-1) }

[Figure: path from i to j with intermediate nodes in {1, 2, ..., m}: either the direct path of weight c_ij^(m-1), or a path through m of weight c_im^(m-1) + c_mj^(m-1)]

Floyd-Warshall Algorithm (26.2/25.2)

• Difference from previous: we do not check all possible intermediate vertices.
• for m = 1..n do
    for i = 1..n do
      for j = 1..n do
        c_ij^(m) = min { c_ij^(m-1), c_im^(m-1) + c_mj^(m-1) }
• Runtime O(n³)
• Transitive Closure G* of graph G:
  – (i,j) ∈ G* iff ∃ path from i to j in G
  – Adjacency matrix, elements in {0,1}
  – Floyd-Warshall with ‘‘min’’ → ‘‘OR’’, ‘‘+’’ → ‘‘AND’’
  – Runtime O(n³)
  – Useful in many problems
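The triple loop and its Boolean variant can be sketched as below. The function names and the small test matrices are assumptions; the sketch also updates a single matrix in place instead of keeping one matrix per value of m, which is a common space-saving variation rather than something stated on the slides.

```python
import math

INF = math.inf

def floyd_warshall(w):
    """All-pairs shortest-path distances; w[i][i] == 0, INF = no edge,
    and no negative cycles are assumed. One matrix is updated in place,
    which is safe because row m and column m do not change in pass m."""
    n = len(w)
    c = [row[:] for row in w]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                c[i][j] = min(c[i][j], c[i][m] + c[m][j])
    return c

def transitive_closure(adj):
    """Boolean Floyd-Warshall (min -> OR, + -> AND). adj[i][j] is True
    when edge (i, j) exists; the result marks pairs joined by a path
    of at least one edge."""
    n = len(adj)
    reach = [row[:] for row in adj]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][m] and reach[m][j])
    return reach

# Hypothetical directed instance: 0->1 (4), 1->2 (-2), 2->0 (1)
W3 = [[0, 4, INF], [INF, 0, -2], [1, INF, 0]]
# Hypothetical chain 0 -> 1 -> 2
CHAIN = [[False, True, False], [False, False, True], [False, False, False]]
```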

BEGIN DIGRESSION

(alternative presentation, can be skipped)

CLRS Notation: Bellman-Ford (24.1) SSSP in General Graphs

• Essentially a BFS based algorithm
• Shortest paths (tree) easy to reconstruct

for each v ∈ V do d[v] ← ∞; d[s] ← 0          // initialization
for i = 1, ..., |V|-1 do
    for each edge (u,v) ∈ E do
        d[v] ← min{d[v], d[u] + w(u,v)}        // relaxation
for each edge (u,v) ∈ E do
    if d[v] > d[u] + w(u,v) then no solution   // negative cycle checking
for each v ∈ V, d[v] = δ(s,v)                  // have true shortest paths
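The pseudocode above translates almost line for line into Python. This is a sketch under assumptions: vertices are numbered 0..|V|-1, the graph is an edge list of `(u, v, weight)` triples, and returning `None` to signal "no solution" is a choice made here.

```python
import math

def bellman_ford(num_vertices, edges, s):
    """Edge-list Bellman-Ford; edges is a list of (u, v, weight).

    Returns the list of distances from s, or None if a negative cycle
    is reachable from s.
    """
    d = [math.inf] * num_vertices          # initialization
    d[s] = 0
    for _ in range(num_vertices - 1):
        for u, v, w in edges:
            if d[u] + w < d[v]:            # relaxation
                d[v] = d[u] + w
    for u, v, w in edges:                  # negative-cycle check
        if d[u] + w < d[v]:
            return None
    return d
```

For example, on the hypothetical edge list `[(0, 1, 4), (1, 2, -2), (2, 0, 1)]` the cycle has weight 3, so distances are returned; lowering the last weight to -3 makes the cycle negative and the check fires.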

Bellman-Ford Analysis (CLRS 24.1)

• Runtime = O(VE)

• Correctness
  – Lemma: d[v] ≥ δ(s,v)
    • Initially true
    • Let d[v] = d[u] + w(u,v)
    • by triangle inequality, for first violation: d[v] < δ(s,v) ≤ δ(s,u) + w(u,v) ≤ d[u] + w(u,v), a contradiction
  – After |V|-1 passes all d values are δ’s if there are no negative cycles
    • s → v[1] → v[2] → ... → v (some shortest path)
    • After i-th iteration d[v[i]] is correct and final

Dijkstra Yet Again (CLRS 24.3)

• Faster than Bellman-Ford because can exploit having only non-negative weights
• Like BFS, but uses priority queue

for each v ∈ V do d[v] ← ∞; d[s] ← 0
S ← ∅; Q ← V
while Q ≠ ∅ do
    u ← Extract-Min(Q)
    S ← S + u
    for v adjacent to u do
        d[v] ← min{d[v], d[u] + w(u,v)}    (relaxation = Decrease-Key)
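A priority-queue version can be sketched with Python's standard `heapq` module. Note one swapped-in technique: `heapq` offers no Decrease-Key, so this sketch pushes a fresh entry on every successful relaxation and skips stale entries when they are popped ("lazy deletion"). The adjacency-list format and the name `dijkstra` are assumptions.

```python
import heapq
import math

def dijkstra(adj, s):
    """adj[u] is a list of (v, weight) pairs with weight >= 0.

    Returns the list of shortest-path distances from s.
    """
    d = [math.inf] * len(adj)
    d[s] = 0
    done = [False] * len(adj)            # membership in the set S
    heap = [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)      # Extract-Min
        if done[u]:
            continue                     # stale entry, u already in S
        done[u] = True
        for v, w in adj[u]:
            if du + w < d[v]:            # relaxation
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))   # stands in for Decrease-Key
    return d
```

With lazy deletion the heap can hold up to |E| entries, so the bound is O(E log E) = O(E log V), the same order as the binary-heap figure.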

Dijkstra Runtime (24.3)

• Extract-Min executed |V| times
• Decrease-Key executed |E| times
• Time = |V| T(Extract-Min)    // find+delete = O(log V)
       + |E| T(Decrease-Key)   // delete+add = O(log V)
  – Binary Heap = E log V (30 years ago)
  – Fibonacci Heap = E + V log V (10 years ago)
• Optimal time algorithm found 1 year ago; runs in time O(E) (Mikkel Thorup; for undirected graphs with positive integer weights)

Dijkstra Correctness (24.3)

• Same Lemma as for Bellman-Ford: d[v] ≥ δ(s,v)
• Thm: Whenever u is added to S, d[u] = δ(s,u)
  Proof:
  – Assume that u is the first vertex s.t. d[u] > δ(s,u)
  – Let y be first vertex ∈ V-S that is on actual shortest s-u path ⇒ d[y] = δ(s,y)
    • For y’s predecessor x, d[x] = δ(s,x)
    • At the moment that we put x in S, d[y] gets value δ(s,y)
  – d[u] > δ(s,u) = δ(s,y) + δ(y,u) = d[y] + δ(y,u) ≥ d[y], contradicting the choice of u by Extract-Min

[Figure: source s and vertex x inside S; vertices y and u in Q, on the shortest s-u path]

END DIGRESSION

(alternative presentation, can be skipped)

Dynamic Programming (CLRS 15)

• Third Major Paradigm so far
  – DQ, Greed are previous paradigms

• Dynamic Programming = metatechnique (not a particular algorithm)

• “Programming” refers to use of a “tableau” in the method (cf. “mathematical programming”), not to writing of code

Longest Common Subsequence (CLRS 15.4)

• Problem: Given x[1..m] and y[1..n], find LCS

    x:   A B C B D A B
    LCS: B C B A
    y:   B D C A B A

• Brute-force algorithm:
  – for every subsequence of x, check if it is also a subsequence of y
  – O(n 2^m) time
    • 2^m subsequences of x (each element is either in or out of the subsequence)
    • O(n) for scanning y with x-subsequence (m ≤ n)

Recurrent Formula for LCS (15.4)

• Let c[i,j] = length of LCS of X[i] = x[1..i], Y[j] = y[1..j]
• Then c[m,n] = length of LCS of x and y
• Theorem:
    c[i,j] = c[i-1,j-1] + 1               if x[i] = y[j]
    c[i,j] = max(c[i,j-1], c[i-1,j])      otherwise
  with c[i,0] = c[0,j] = 0
• Proof: x[i] = y[j] ⇒ LCS(X[i],Y[j]) = LCS(X[i-1],Y[j-1]) + x[i]
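The recurrence for c[i,j] can be tabulated bottom-up and then walked backwards to recover one LCS. This is a minimal sketch: the names `lcs_table` and `lcs` are assumptions, and Python strings are 0-indexed, so x[i] on the slides corresponds to `x[i - 1]` in the code.

```python
def lcs_table(x, y):
    """Return c with c[i][j] = length of an LCS of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]    # row 0 and column 0 stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c

def lcs(x, y):
    """Recover one longest common subsequence by walking back from c[m][n]."""
    c = lcs_table(x, y)
    i, j, out = len(x), len(y), []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])                 # matched character is in the LCS
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

For x = "ABCBDAB" and y = "BDCABA" the table entry c[7][6] is 4. Several LCSs of that length exist, and which one `lcs` returns depends on how ties are broken in the walk back.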

DP Properties(15.3)

• Any part of the optimal answer is also optimal
  – A subsequence of LCS(X,Y) is the LCS for some subsequences of X and Y.
• Subproblems overlap
  – LCS(X[m],Y[n-1]) and LCS(X[m-1],Y[n]) have common subproblem LCS(X[m-1],Y[n-1])
  – There are polynomially few subproblems in total = mn for LCS
  – Unlike divide and conquer

DP (CLRS 15.3)

• After computing solution of a subproblem, store in table

• Time = O(mn)

• When computing c[i,j] we need O(1) time if we have:

– x[i], y[j]

– c[i,j-1]

– c[i-1,j]

– c[i-1,j-1]

DP Table for LCS

[Table: empty DP grid; columns y = B D C A B A, rows x = A B C B D A B]

DP Table for LCS

(partially filled, column by column)

  x \ y        B  D  C  A  B  A
           0   0  0
    A      0   0  0
    B      0   1  1
    C      0   1
    B      0   1
    D      0   1
    A      0   1
    B      0   1

DP Table for LCS

(partially filled, column by column)

  x \ y        B  D  C  A  B  A
           0   0  0
    A      0   0  0
    B      0   1  1
    C      0   1  1
    B      0   1  1
    D      0   1  2
    A      0   1
    B      0   1

DP Table for LCS

(completed)

  x \ y        B  D  C  A  B  A
           0   0  0  0  0  0  0
    A      0   0  0  0  1  1  1
    B      0   1  1  1  1  2  2
    C      0   1  1  2  2  2  2
    B      0   1  1  2  2  3  3
    D      0   1  2  2  2  3  3
    A      0   1  2  2  3  3  4
    B      0   1  2  2  3  4  4

Optimal Polygon Triangulation

• Polygon has sides and vertices
• Polygon is simple = not self-intersecting.
• Polygon P is convex if any line segment with ends in P lies entirely in P
• Triangulation of P is partition of P with chords into triangles.
• Problem: Given a convex polygon and a weight function defined on triangles (e.g. the perimeter), find a triangulation of minimum weight (of minimum total length).
• # of triangles: Always have n-2 triangles with n-3 chords

Optimal Polygon Triangulation

• Optimal sub-triangulation of optimal triangulation

• Recurrent formula: t[i,j] = the weight of the optimal triangulation of the polygon <v[i-1], v[i], ..., v[j]>
  – If i = j, then t[i,j] = 0; else
      t[i,j] = min_{i ≤ k ≤ j-1} { t[i,k] + t[k+1,j] + w(v[i-1] v[k] v[j]) }
• Runtime O(n³) and space is O(n²)

[Figure: polygon <v[i-1], v[i], ..., v[j]> split by the triangle v[i-1] v[k] v[j], shown for two choices of v[k]]
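The recurrence for t[i,j] can be filled in by increasing sub-polygon size. This is a sketch under assumptions: the polygon is given as a list of (x, y) points v[0..n] in order around a convex polygon with at least three vertices, the triangle weight defaults to the perimeter, and the names `perimeter` and `min_triangulation` are introduced here.

```python
import math

def perimeter(p, q, r):
    """Triangle weight used in this sketch: the perimeter."""
    return math.dist(p, q) + math.dist(q, r) + math.dist(r, p)

def min_triangulation(v, weight=perimeter):
    """Minimum total weight of a triangulation of the convex polygon v[0..n].

    t[i][j] is the optimum for the sub-polygon <v[i-1], ..., v[j]>,
    for 1 <= i <= j <= n; t[i][i] = 0 (a degenerate two-vertex polygon).
    """
    n = len(v) - 1
    t = [[0.0] * (n + 1) for _ in range(n + 1)]
    for size in range(2, n + 1):                 # size = j - i + 1
        for i in range(1, n - size + 2):
            j = i + size - 1
            # try every apex v[k] for the triangle on side v[i-1] v[j]
            t[i][j] = min(t[i][k] + t[k + 1][j] + weight(v[i - 1], v[k], v[j])
                          for k in range(i, j))
    return t[1][n]
```

There are O(n²) table entries and each takes O(n) candidates for k, which matches the O(n³) time and O(n²) space stated above. With the perimeter weight, every chord is counted twice because it borders two triangles.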
