1 Approximation Algorithm Instructor: yedeshi [email protected]

1

Approximation Algorithm

Instructor: yedeshi

[email protected]

2

Dealing with Hard Problems

What to do if:

Divide and conquer

Dynamic programming

Greedy

Linear Programming/Network Flows

…

does not give a polynomial time algorithm?

3


Solution I: Ignore the problem

Can't do it! There are thousands of problems for which we do not know polynomial-time algorithms

For example:

Traveling Salesman Problem (TSP)

Set Cover

4

Traveling Salesman Problem

Traveling Salesman Problem (TSP)

Input: undirected graph with lengths on edges

Output: shortest cycle that visits each vertex exactly once

Best known exact algorithm: O(n^2 2^n) time (dynamic programming).

5

The vertex-cover problem

A vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that if (u, v) ∈ E, then u ∈ V' or v ∈ V' (or both).

A vertex cover for G is a set of vertices that covers all the edges in E.

As a decision problem, we define VERTEX-COVER = {〈G, k〉 : graph G has a vertex cover of size k}.

Best known algorithm: O(kn + 1.274^k) time.

6


Exponential time algorithms for small inputs. E.g., (100/99)^n time is not bad for n < 1000.

Polynomial time algorithms for some (e.g., average-case) inputs

Polynomial time algorithms for all inputs, but which return approximate solutions

7

Approximation Algorithms

An algorithm A is ρ-approximate if, on any input of size n:

The cost C_A of the solution produced by the algorithm, and

The cost C_OPT of the optimal solution

are such that C_A ≤ ρ C_OPT

We will see: 2-approximation algorithm for TSP in the plane

2-approximation algorithm for Vertex Cover

8

Comments on Approximation

“C_A ≤ ρ C_OPT” makes sense only for minimization problems

For maximization problems, replace it by C_OPT ≤ ρ C_A

Additive approximation “C_A ≤ ρ + C_OPT” also makes sense, although it is difficult to achieve

9

The Vertex-cover problem

10

The vertex-cover problem

A vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that if (u, v) ∈ E, then u ∈ V' or v ∈ V' (or both).

A vertex cover for G is a set of vertices that covers all the edges in E.

The goal is to find a vertex cover of minimum size in a given undirected graph G.

11

Naive Algorithm

APPROX-VERTEX-COVER(G)
    C ← Ø
    E′ ← E[G]
    while E′ ≠ Ø
        do let (u, v) be an arbitrary edge of E′
           C ← C ∪ {u, v}
           remove from E′ every edge incident on either u or v
    return C
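The pseudocode above can be sketched in Python as follows. This is a minimal sketch rather than the lecture's own code: the function name `approx_vertex_cover`, the edge-list input format, and the example graph are all assumptions made for illustration. Scanning the edges once and skipping every edge that already has an endpoint in C has the same effect as deleting incident edges from E′.

```python
def approx_vertex_cover(edges):
    """Return a vertex cover of size at most twice the minimum.

    `edges` is an iterable of (u, v) pairs (an assumed input format).
    """
    cover = set()
    for u, v in edges:
        # The edge is still in E' exactly when neither endpoint is covered.
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Hypothetical example graph, not necessarily the one drawn on the slide.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"),
         ("d", "e"), ("d", "f"), ("d", "g"), ("e", "f")]
cover = approx_vertex_cover(edges)
print(sorted(cover))
```

On this graph the sketch picks the edges ab, cd and ef and returns six vertices, while {b, d, e} is a cover of size three, so the factor of 2 is actually attained here.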

12

Illustration of Naive algorithm

[Figure: the naive algorithm run on an example input graph]

Edge bc is chosen: C = {b, c}

Edge ef is chosen

Optimal solution: {b, e, d}

Naive algorithm: C = {b, c, d, e, f, g}

13

Approximation 2

Theorem. APPROX-VERTEX-COVER is a 2-approximation algorithm.

Pf. Let A denote the set of edges that were picked by APPROX-VERTEX-COVER. To cover the edges in A, any vertex cover, in particular an optimal cover C*, must include at least one endpoint of each edge in A. No two edges in A share an endpoint. Thus no two edges in A are covered by the same vertex from C*, and we have the lower bound

|C*| ≥ |A|

On the other hand, the algorithm only picks an edge for which neither endpoint is already in C, so each picked edge adds exactly two vertices:

|C| = 2|A|

Hence, |C| = 2|A| ≤ 2|C*|. ▪

14

Vertex cover: summary

No better constant-factor approximation is known!!

More precisely, minimum vertex cover is known to be approximable within $2 - \frac{\log\log|V|}{2\log|V|}$ (for a given |V| ≥ 2), but cannot be approximated within 1.1666 for any sufficiently large vertex degree.

15

The Traveling Salesman Problem

Traveling Salesman Problem (TSP)

Input: undirected graph G = (V, E) with a cost c(u, v) associated with each edge (u, v) ∈ E

Output: shortest cycle that visits each vertex exactly once

The cost function c satisfies the triangle inequality if, for all vertices u, v, w ∈ V,

c(u, w) ≤ c(u, v) + c(v, w).

[Figure: triangle on the vertices u, v, w]

16

2-approximation for TSP with triangle inequality

Compute an MST T of the complete graph on the points

  An edge between any pair of points

  Weight = distance between endpoints

Compute a tree walk W of T

  Each edge is visited twice

Convert W into a cycle H using shortcuts

17

Algorithm

APPROX-TSP-TOUR(G, c)
    select a vertex r ∈ V[G] to be a "root" vertex
    compute a minimum spanning tree T for G from root r using MST-PRIM(G, c, r)
    let L be the list of vertices visited in a preorder tree walk of T
    return the Hamiltonian cycle H that visits the vertices in the order L
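A possible Python rendering of APPROX-TSP-TOUR for points in the plane is sketched below. It assumes Euclidean distances (which satisfy the triangle inequality), at least one input point, and vertex 0 as the root; the names `approx_tsp_tour` and `tour_cost` are illustrative and not taken from the lecture. Prim's algorithm is written in its simple O(n^2) form, which is adequate for a complete graph.

```python
import math


def approx_tsp_tour(points):
    """Return a tour (list of point indices) of cost at most twice optimal."""
    n = len(points)

    def dist(a, b):
        return math.dist(points[a], points[b])

    # Prim's algorithm from root 0 on the complete graph, O(n^2).
    parent = [None] * n
    best = [math.inf] * n          # cheapest known edge into the tree
    best[0] = 0.0
    in_tree = [False] * n
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]),
                key=best.__getitem__)
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist(u, v) < best[v]:
                best[v], parent[v] = dist(u, v), u

    # Preorder walk of the MST; an explicit stack avoids deep recursion.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour


def tour_cost(points, tour):
    """Length of the closed cycle that visits the points in tour order."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(approx_tsp_tour(square), tour_cost(square, approx_tsp_tour(square)))
```

Listing the vertices in preorder is exactly the shortcutting step: each vertex is kept at its first appearance in the full tree walk and skipped afterwards.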

18

Preorder Traversal

Preorder (root-left-right): visit the root first, then traverse the left subtree, then traverse the right subtree.

Example:

[Figure: binary tree on the nodes A to I]

Order: A, B, C, D, E, F, G, H, I

19

Illustration

A full walk of the tree visits the vertices in the order a, b, c, b, h, b, a, d, e, f, e, g, e, d, a.

[Figure panels: MST; tree walk W; preorder walk (final solution H); OPT solution]

20

2-approximation

Theorem. APPROX-TSP-TOUR is a polynomial-time 2-approximation algorithm for the traveling-salesman problem with the triangle inequality.

Pf. Let C_OPT be the optimal cycle.

Cost(T) ≤ Cost(C_OPT): removing an edge from C_OPT gives a spanning tree, and T is a spanning tree of minimum cost

Cost(W) = 2 Cost(T): each edge is visited twice

Cost(H) ≤ Cost(W): triangle inequality

Hence Cost(H) ≤ 2 Cost(C_OPT) ▪

21

Load Balancing

Input. m identical machines; n jobs, job j has processing time t_j.

Job j must run contiguously on one machine.

A machine can process at most one job at a time.

Def. Let J(i) be the subset of jobs assigned to machine i. The load of machine i is L_i = Σ_{j ∈ J(i)} t_j.

Def. The makespan is the maximum load on any machine, L = max_i L_i.

Load balancing. Assign each job to a machine to minimize makespan.

22

Load Balancing: List Scheduling

List-scheduling algorithm. Consider n jobs in some fixed order. Assign job j to the machine whose load is smallest so far.

List-Scheduling(m, n, t_1, t_2, …, t_n) {
   for i = 1 to m {
      L_i ← 0                 (load on machine i)
      J(i) ← ∅                (jobs assigned to machine i)
   }

   for j = 1 to n {
      i = argmin_k L_k        (machine i has smallest load)
      J(i) ← J(i) ∪ {j}       (assign job j to machine i)
      L_i ← L_i + t_j         (update load of machine i)
   }
}

Implementation. O(n log n) using a priority queue.
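As a rough illustration of the priority-queue implementation, here is a sketch using Python's standard `heapq` module. The function name `list_schedule`, the 0-based job and machine indices, and the sample instance are assumptions for this example only.

```python
import heapq


def list_schedule(m, times):
    """Greedy list scheduling on m machines.

    Returns (makespan, assignment), where assignment[i] lists the
    indices of the jobs placed on machine i.
    """
    heap = [(0, i) for i in range(m)]       # (load, machine), already a heap
    assignment = [[] for _ in range(m)]
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)       # machine with the smallest load
        assignment[i].append(j)             # assign job j to machine i
        heapq.heappush(heap, (load + t, i))  # update the load of machine i
    makespan = max(load for load, _ in heap)
    return makespan, assignment


makespan, assignment = list_schedule(3, [2, 3, 4, 6, 2, 2])
print(makespan, assignment)
```

Each job costs one pop and one push on a heap of size m, so the loop runs in O(n log m) time, which is within the O(n log n) bound when m ≤ n.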

23

Load Balancing: List Scheduling Analysis

Theorem. [Graham, 1966] Greedy algorithm is a (2 − 1/m)-approximation.

First worst-case analysis of an approximation algorithm.

Need to compare resulting solution with optimal makespan L*.

Lemma 1. The optimal makespan L* ≥ max_j t_j.

Pf. Some machine must process the most time-consuming job. ▪

Lemma 2. The optimal makespan $L^* \ge \frac{1}{m} \sum_j t_j$.

Pf. The total processing time is Σ_j t_j. One of the m machines must do at least a 1/m fraction of the total work. ▪

24


Theorem. Greedy algorithm is a (2 − 1/m)-approximation.

Pf. Consider load L_i of bottleneck machine i.

Let j be the last job scheduled on machine i.

When job j was assigned to machine i, i had the smallest load. Its load before the assignment is L_i − t_j, so L_i − t_j ≤ L_k for all 1 ≤ k ≤ m.

[Figure: timeline of machine i from 0 to L = L_i; the blue jobs scheduled before j end at L_i − t_j, and job j runs from L_i − t_j to L_i]

25


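Theorem. Greedy algorithm is a (2 − 1/m)-approximation.

Pf. Consider load L_i of bottleneck machine i.

Let j be the last job scheduled on machine i.

When job j was assigned to machine i, i had the smallest load. Its load before the assignment is L_i − t_j, so L_i − t_j ≤ L_k for all 1 ≤ k ≤ m.

Sum these inequalities over all k (using L_i − t_j itself as the bound for machine i) and divide by m:

$$L_i - t_j \;\le\; \frac{1}{m}\Bigl(\sum_k L_k - t_j\Bigr) \;=\; \frac{1}{m}\sum_k t_k - \frac{t_j}{m} \;\le\; L^* - \frac{t_j}{m} \qquad \text{(Lemma 2)}$$

Now

$$L_i = (L_i - t_j) + t_j \;\le\; L^* + \Bigl(1 - \frac{1}{m}\Bigr) t_j \;\le\; L^* + \Bigl(1 - \frac{1}{m}\Bigr) L^* = \Bigl(2 - \frac{1}{m}\Bigr) L^* \qquad \text{(Lemma 1)}$$

▪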

26


Q. Is our analysis tight?

A. Essentially yes. Indeed, the LS algorithm has tight bound 2 − 1/m.

Ex: m machines, m(m−1) jobs of length 1 followed by one job of length m. List scheduling first spreads the unit jobs evenly, m − 1 per machine, and then places the long job on top of one of them, for a makespan of 2m − 1.

[Figure: list schedule for m = 10; machines 2 to 10 sit idle while the long job finishes on machine 1]

m = 10: list scheduling makespan = 19

27


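Q. Is our analysis tight?

A. Essentially yes. Indeed, the LS algorithm has tight bound 2 − 1/m.

Ex: m machines, m(m−1) jobs of length 1 and one job of length m. An optimal schedule gives the long job a machine of its own and spreads the unit jobs evenly over the other m − 1 machines, m per machine.

m = 10: optimal makespan = 10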
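The tight instance can be checked numerically. The sketch below is an illustration under assumptions: `list_schedule_makespan` is a hypothetical helper that tracks only machine loads, and the jobs are fed in the adversarial order (all unit jobs first, the long job last).

```python
import heapq


def list_schedule_makespan(m, times):
    """Makespan of greedy list scheduling; only the loads are tracked."""
    loads = [0] * m
    heapq.heapify(loads)
    for t in times:
        # Pop the smallest load and push it back increased by t.
        heapq.heapreplace(loads, loads[0] + t)
    return max(loads)


m = 10
jobs = [1] * (m * (m - 1)) + [m]       # m(m-1) unit jobs, then one long job
greedy = list_schedule_makespan(m, jobs)
longest_first = list_schedule_makespan(m, sorted(jobs, reverse=True))
print(greedy, longest_first)           # prints "19 10"
```

The ratio 19/10 equals 2 − 1/m for m = 10. Feeding the same jobs longest first reaches the optimum on this instance, which is the idea behind the LPT rule.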

28

Load Balancing on 2 Machines

Claim. Load balancing is hard even if there are only 2 machines.

Pf. NUMBER-PARTITIONING ≤_P LOAD-BALANCE. ▪

[Figure: jobs a to g drawn as bars whose length is the job length; in the yes instance shown, machine 1 runs a, d, f and machine 2 runs b, c, e, g over the time interval from 0 to L]

29

Load Balancing: LPT Rule

Longest processing time (LPT). Sort n jobs in descending order of processing time, and then run list scheduling algorithm.

LPT-List-Scheduling(m, n, t_1, t_2, …, t_n) {
   Sort jobs so that t_1 ≥ t_2 ≥ … ≥ t_n

   for i = 1 to m {
      L_i ← 0                 (load on machine i)
      J(i) ← ∅                (jobs assigned to machine i)
   }

   for j = 1 to n {
      i = argmin_k L_k        (machine i has smallest load)
      J(i) ← J(i) ∪ {j}       (assign job j to machine i)
      L_i ← L_i + t_j         (update load of machine i)
   }
}
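A short Python sketch of the LPT rule follows, again using `heapq` for the smallest-load lookup. The name `lpt_schedule` and the sample job lengths are assumptions for illustration; jobs are identified by their index in the input list.

```python
import heapq


def lpt_schedule(m, times):
    """Longest-processing-time-first scheduling on m machines.

    Returns (makespan, assignment), where assignment[i] lists the
    indices (into `times`) of the jobs placed on machine i.
    """
    # Consider the jobs in descending order of processing time.
    order = sorted(range(len(times)), key=lambda j: times[j], reverse=True)
    heap = [(0, i) for i in range(m)]       # (load, machine)
    assignment = [[] for _ in range(m)]
    for j in order:
        load, i = heapq.heappop(heap)       # machine with the smallest load
        assignment[i].append(j)
        heapq.heappush(heap, (load + times[j], i))
    return max(load for load, _ in heap), assignment


print(lpt_schedule(3, [2, 3, 4, 6, 2, 2])[0])   # prints 7
```

On this sample instance plain list scheduling in the given order yields makespan 8, while sorting first yields 7, which is optimal here because the total work of 19 on 3 machines forces a makespan of at least 7.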

30


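Observation. If there are at most m jobs, then list scheduling is optimal.

Pf. Each job is put on its own machine. ▪

Lemma 3. If there are more than m jobs, L* ≥ 2 t_{m+1}.

Pf. Consider the first m+1 jobs t_1, …, t_{m+1}. Since the t_i's are in descending order, each takes at least t_{m+1} time. There are m+1 jobs and m machines, so by the pigeonhole principle at least one machine gets two jobs. ▪

Theorem. LPT rule is a 3/2-approximation algorithm.

Pf. Same basic approach as for list scheduling. Let j be the last job on the bottleneck machine i. If machine i holds a single job, then L_i = t_j ≤ L* by Lemma 1. Otherwise (by the observation, we can assume the number of jobs is greater than m) the first m jobs went to m different machines, so j ≥ m+1 and t_j ≤ t_{m+1} ≤ L*/2 by Lemma 3. Since L_i − t_j ≤ L* as in the list-scheduling analysis,

$$L_i = (L_i - t_j) + t_j \;\le\; L^* + \tfrac{1}{2} L^* = \tfrac{3}{2} L^*.$$

▪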

31


Q. Is our 3/2 analysis tight?

A. No.

Theorem. [Graham, 1969] LPT rule is a (4/3 − 1/(3m))-approximation.

Pf. More sophisticated analysis of the same algorithm.

Q. Is Graham's (4/3 − 1/(3m)) analysis tight?

A. Essentially yes.

Ex: m machines, n = 2m+1 jobs: 2 jobs each of length m+1, m+2, …, 2m−1 and 3 jobs of length m. LPT has makespan 4m − 1, while the optimal makespan is 3m.

32

LPT

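Proof. Jobs are indexed so that t_1 ≥ t_2 ≥ … ≥ t_n.

If n ≤ m, LPT is already optimal (each machine processes at most one job).

If n > 2m, then some machine processes at least three jobs in an optimal schedule, so t_n ≤ L*/3. The rest is similar to the analysis of the LS algorithm.

Otherwise, suppose there are 2m − h jobs in total, 0 ≤ h < m. Check that LPT already gives an optimal solution.

[Figure: schedule over time with jobs labeled 1, …, h, h+1, h+2, …, n−1, n]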

33

Approximation Scheme

Some NP-complete problems allow polynomial-time approximation algorithms that can achieve increasingly smaller approximation ratios by using more and more computation time.

Tradeoff between computation time and the quality of the approximation.

An approximation scheme for an optimization problem is an approximation algorithm that takes as input an instance of the problem and a value ε > 0 such that, for any fixed ε > 0, the scheme is a (1 + ε)-approximation algorithm.

34

PTAS and FPTAS

We say that an approximation scheme is a polynomial-time approximation scheme (PTAS) if, for any fixed ε > 0, the scheme runs in time polynomial in the size n of its input instance.

Example: O(n^{2/ε}).

An approximation scheme is a fully polynomial-time approximation scheme (FPTAS) if its running time is polynomial both in 1/ε and in the size n of the input instance.

Example: O((1/ε)^2 n^3).

35

The Subset Sum

Input. A pair (S, t), where S is a set {x_1, x_2, ..., x_n} of positive integers and t is a positive integer

Output. A subset S′ of S

Goal. Maximize the sum of the elements of S′, subject to the sum being at most t.

36

An exponential-time exact algorithm

If L is a list of positive integers and x is another positive integer, then we let L + x denote the list of integers derived from L by increasing each element of L by x. For example, if L = 〈 1, 2, 3, 5, 9 〉 , then L + 2 = 〈 3, 4, 5, 7, 11 〉 . We also use this notation for sets, so that

S + x = {s + x : s ∈ S}.

37

Exact algorithm

MERGE-LISTS(L, L′): returns the sorted list that is the merge of its two sorted input lists L and L′ with duplicate values removed.

EXACT-SUBSET-SUM(S, t)
    n ← |S|
    L_0 ← 〈0〉
    for i ← 1 to n
        do L_i ← MERGE-LISTS(L_{i-1}, L_{i-1} + x_i)
           remove from L_i every element that is greater than t
    return the largest element in L_n
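The exact algorithm translates almost line by line into Python. The sketch below is illustrative only: `merge_lists` and `exact_subset_sum` are assumed names, and S is passed as a list of positive integers.

```python
def merge_lists(a, b):
    """Merge two sorted lists into one sorted list without duplicates."""
    out, i, j = [], 0, 0
    while i < len(a) or j < len(b):
        # Take the smaller head element; fall back to a when b is exhausted.
        if j == len(b) or (i < len(a) and a[i] <= b[j]):
            x, i = a[i], i + 1
        else:
            x, j = b[j], j + 1
        if not out or out[-1] != x:
            out.append(x)
    return out


def exact_subset_sum(S, t):
    """Largest subset sum of S that does not exceed t (exponential time)."""
    L = [0]
    for x in S:
        L = merge_lists(L, [y + x for y in L])   # L_{i-1} and L_{i-1} + x_i
        L = [y for y in L if y <= t]             # drop sums greater than t
    return L[-1]                                 # L is sorted


print(exact_subset_sum([1, 4, 5], 8))   # prints 6
```

With S = {1, 4, 5} and t = 8, the reachable sums are 0, 1, 4, 5, 6, 9, 10, and the largest one that does not exceed 8 is 6.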

38

Example

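Let P_i denote the set of all values that can be obtained by selecting a (possibly empty) subset of {x_1, x_2, ..., x_i} and summing its members. For example, if S = {1, 4, 5}, then

P_1 = {0, 1},
P_2 = {0, 1, 4, 5},
P_3 = {0, 1, 4, 5, 6, 9, 10}.

Given the identity

$$P_i = P_{i-1} \cup (P_{i-1} + x_i),$$

L_i is a sorted list containing every element of P_i whose value is at most t.

Since the length of L_i can be as much as 2^i, it is an exponential-time algorithm in general.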

39

The Subset-sum problem: FPTAS

Trimming or rounding: if two values in L are close to each other, then for the purpose of finding an approximate solution there is no reason to maintain both of them explicitly.

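Let δ be a trimming parameter such that 0 < δ < 1.

To trim a list L by δ is to remove as many elements from L as possible such that, if L′ is the result of trimming L, then for every element y that was removed from L there is an element z still in L′ that approximates y, that is,

$$\frac{y}{1+\delta} \le z \le y.$$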

40

Example

For example, if δ = 0.1 and

L = 〈10, 11, 12, 15, 20, 21, 22, 23, 24, 29〉,

then we can trim L to obtain

L′ = 〈10, 12, 15, 20, 23, 29〉.

TRIM takes a sorted list L = 〈y_1, y_2, ..., y_m〉:

TRIM(L, δ)
    m ← |L|
    L′ ← 〈y_1〉
    last ← y_1
    for i ← 2 to m
        do if y_i > last · (1 + δ)    ▹ y_i ≥ last because L is sorted
              then append y_i onto the end of L′
                   last ← y_i
    return L′
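The trimming step can be tried out directly. The following is a small sketch, assuming a non-empty sorted input list; the function name `trim` mirrors the pseudocode, and δ is passed as a `Fraction` so that boundary comparisons such as 11 against 10 · 1.1 are exact rather than subject to floating-point rounding.

```python
from fractions import Fraction


def trim(L, delta):
    """Trim the sorted list L: keep y only if it exceeds last * (1 + delta)."""
    trimmed = [L[0]]
    last = L[0]
    for y in L[1:]:
        if y > last * (1 + delta):   # y >= last because L is sorted
            trimmed.append(y)
            last = y
    return trimmed


L = [10, 11, 12, 15, 20, 21, 22, 23, 24, 29]
print(trim(L, Fraction(1, 10)))   # prints [10, 12, 15, 20, 23, 29]
```

Every removed value is represented by the kept value just below it: 11 by 10, 21 and 22 by 20, and 24 by 23.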

41

(1 + ε)-Approximation algorithm

APPROX-SUBSET-SUM(S, t, ε)
1  n ← |S|
2  L_0 ← 〈0〉
3  for i ← 1 to n
4      do L_i ← MERGE-LISTS(L_{i-1}, L_{i-1} + x_i)
5         L_i ← TRIM(L_i, ε/2n)
6         remove from L_i every element that is greater than t
7  let z* be the largest value in L_n
8  return z*
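Putting the pieces together, a self-contained Python sketch of the approximation scheme might look as follows. It assumes S is a non-empty list of positive integers and replaces MERGE-LISTS by a sorted set union for brevity; the function names and the sample instance are illustrative choices, not part of the lecture.

```python
from fractions import Fraction


def trim(L, delta):
    """Keep an element of sorted L only if it exceeds last * (1 + delta)."""
    trimmed, last = [L[0]], L[0]
    for y in L[1:]:
        if y > last * (1 + delta):
            trimmed.append(y)
            last = y
    return trimmed


def approx_subset_sum(S, t, eps):
    """Return a subset sum z* <= t with optimum / z* <= 1 + eps."""
    n = len(S)
    delta = Fraction(eps) / (2 * n)                 # trimming parameter eps/2n
    L = [0]
    for x in S:
        L = sorted(set(L) | {y + x for y in L})     # stands in for MERGE-LISTS
        L = trim(L, delta)
        L = [y for y in L if y <= t]                # drop sums greater than t
    return L[-1]                                    # largest value in L_n


z = approx_subset_sum([104, 102, 201, 101], 308, Fraction(2, 5))
print(z)   # prints 302
```

For this instance the optimal sum is 307 = 104 + 102 + 101, so the returned value 302 is well within the requested factor of 1 + ε = 1.4.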

42

FPTAS

Theorem. APPROX-SUBSET-SUM is a fully polynomial-time approximation scheme for the subset-sum problem.

Pf. The operations of trimming Li in line 5 and removing from Li every element that is greater than t maintain the property that every element of Li is also a member of Pi. Therefore, the value z* returned in line 8 is indeed the sum of some subset of S.

43

Pf. (cont.)

Let y* ∈ P_n denote an optimal solution to the subset-sum problem. We know that z* ≤ y*. We need to show that y*/z* ≤ 1 + ε.

By induction on i, it can be shown that for every element y in P_i that is at most t, there is a z ∈ L_i such that

$$\frac{y}{(1+\varepsilon/2n)^{i}} \le z \le y.$$

Thus, there is a z ∈ L_n such that

$$\frac{y^*}{(1+\varepsilon/2n)^{n}} \le z \le y^*.$$

44

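Pf. (cont.)

And thus,

$$\frac{y^*}{z} \le \Bigl(1+\frac{\varepsilon}{2n}\Bigr)^{n}.$$

Since there is a z ∈ L_n satisfying this inequality, it also holds for z*, the largest value in L_n:

$$\frac{y^*}{z^*} \le \Bigl(1+\frac{\varepsilon}{2n}\Bigr)^{n}.$$

Hence, for 0 < ε < 1,

$$\Bigl(1+\frac{\varepsilon}{2n}\Bigr)^{n} \le e^{\varepsilon/2} \le 1 + \frac{\varepsilon}{2} + \Bigl(\frac{\varepsilon}{2}\Bigr)^{2} \le 1 + \varepsilon.$$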

45

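Pf. (cont.)

To show that the scheme is an FPTAS, we need to bound the length of L_i.

After trimming, successive elements z and z′ of L_i must satisfy z′/z > 1 + ε/2n.

Each list therefore contains the value 0, possibly the value 1, and up to ⌊log_{1+ε/2n} t⌋ additional values, where

$$\log_{1+\varepsilon/2n} t = \frac{\ln t}{\ln(1+\varepsilon/2n)} \le \frac{2n(1+\varepsilon/2n)\ln t}{\varepsilon} \le \frac{4n \ln t}{\varepsilon}.$$

This bound is polynomial in n, in ln t, and in 1/ε, so the running time of APPROX-SUBSET-SUM is polynomial in the size of the input and in 1/ε. ▪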
