
CSC373 - Approximation Algorithms

Vincent Maccio


Algorithm Classification

The following is all informal.

• Problems can be broken down and classified into different groups based on how hard they are to solve

• Most problems we’ve seen in this course and most you’ve seen previously are easy problems

• Let P be the set of problems which can be solved in polynomial time; that is, if a problem has an algorithm which solves it in O(n^k) time for some constant k, then that problem is in P

• Sorting a list, finding a shortest path in a graph, determining primality, etc. are all in P

• If, given a candidate solution to a problem, one can determine in polynomial time whether that solution is correct or not, then that problem is said to be in NP


Algorithm Classification - cont

• For example, determining whether there exists a solution to a Knapsack problem which gives you value at least V is hard to do, but determining whether a given set of items is a solution which gives a value of at least V is easy

• Just sum up the values to see that the total is at least V, and sum up the weights to see that the total is within the capacity
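This certificate check can be sketched in a few lines of Python (the `(value, weight)` pair representation and the numbers below are assumptions for illustration, not from the slides):

```python
# Sketch: verifying a claimed Knapsack solution in linear (hence polynomial)
# time; items are hypothetical (value, weight) pairs.

def is_valid_solution(items, capacity, target_value):
    total_value = sum(v for v, w in items)
    total_weight = sum(w for v, w in items)
    # correct iff the items fit in the knapsack and achieve value at least V
    return total_weight <= capacity and total_value >= target_value

print(is_valid_solution([(60, 3), (50, 4)], capacity=8, target_value=100))  # True
print(is_valid_solution([(60, 3), (50, 4)], capacity=6, target_value=100))  # False
```

Note the check never searches over subsets; it only sums the given items, which is what makes verification easy even when finding a solution is hard.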

• A problem is said to be NP-complete if it is among the hardest problems in NP (if you can solve any NP-complete problem in polynomial time, you can solve all NP problems in polynomial time)

• A problem is said to be NP-hard if it’s at least as hard as NP-complete problems


Algorithm Classification - Observations

See if these make sense to you from the previous definitions

• P ⊆ NP

• There exist problems which are NP-hard, but not in NP

• (NP-hard − NP-complete) ∩ P = ∅

• P ∩ NP-complete = ?


Coping with Complexity

If tasked to solve an NP-hard problem, what can one do? Solving it exactly is typically too costly.

1 Sacrifice generality
  • Limit the solution to a specific case
  • Assume the input won't get too large

2 Sacrifice optimality
  • Write something which runs quickly but returns a solution which may only be “reasonable”, not necessarily optimal

3 Sacrifice reliability
  • Write something which returns the optimal solution sometimes, but not always

Or some combination of the above.


Vertex Cover

• Given a graph G = (V ,E ), a set of vertices C is said to be a vertex cover if C ⊆ V and ∀(u, v) ∈ E : u ∈ C or v ∈ C

• A popular problem is to find an optimal vertex cover C ∗, such that |C ∗| is as small as possible (while still being a vertex cover)

• This problem is NP-hard

• But we can come up with an approximation


Approximation

C = ∅
E′ = E
while E′ ≠ ∅:
    choose an arbitrary edge (u, v) from E′
    C = C ∪ {u, v}   # two vertices, not an edge
    remove all edges incident to u or v from E′
return C
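The pseudocode above translates almost line for line into Python. This is a sketch; the edge-list representation of the graph is an assumption:

```python
# Sketch: the 2-approximation above, assuming the graph is given as a
# list of edges (u, v).

def vertex_cover_2approx(edges):
    cover = set()
    remaining = set(edges)
    while remaining:
        u, v = remaining.pop()   # choose an arbitrary edge from E'
        cover |= {u, v}          # add two vertices, not an edge
        # remove all edges incident to u or v from E'
        remaining = {e for e in remaining if u not in e and v not in e}
    return cover

edges = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 4)]
cover = vertex_cover_2approx(edges)
# every edge has at least one endpoint in the cover
print(all(u in cover or v in cover for (u, v) in edges))  # True
```

Which edges get chosen depends on set iteration order, but the result is always a valid cover of size at most twice the optimum.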


Approximation

• This is a “2-approximation”, that is, for the vertex cover C returned by this algorithm it holds that |C| ≤ 2|C∗|

• Proof:
  • Let A be the set of edges “chosen” in the previous algorithm
  • No two edges in A share an endpoint; therefore, for every edge (u, v) ∈ A, every vertex cover (including the optimal vertex cover) must contain at least one of u and v
  • Therefore |A| ≤ |C∗|, but each edge that's added to A adds exactly two vertices to C
  • Therefore |C| = 2|A| ⇒ |C| ≤ 2|C∗|


The Travelling Salesman Problem

• Given a graph G = (V, E) and a start node a ∈ V, find a path in G which starts at a, ends at a, and visits each vertex other than a exactly once

• Such a path is called a tour of G

• The length of a tour is the sum of all the edge weights on said tour

• In the travelling salesman problem, the goal is to find the shortest tour possible

• This is an NP-hard problem

• We will make two assumptions which will allow us to derive a reasonable approximation
  1 The graph is dense: from each node there is an edge to every other node (if the graph is sparse, a tour usually does not exist in that graph)
  2 The triangle inequality holds, i.e. w(a, c) ≤ w(a, b) + w(b, c) (oftentimes for the travelling salesman problem the distance between nodes is thought of as the Euclidean distance, so this assumption makes sense)


TSP Approximation

• Consider an optimal tour of G denoted by T∗, and let the length or total weight of this tour be c(T∗)

• Now consider removing any one edge from T∗: this results in a spanning tree of G, and because an edge was removed from T∗ to create it, the total weight of that spanning tree is at most the total weight of T∗. Therefore, for any minimum spanning tree, denoted MST, c(MST) ≤ c(T∗)

• The idea is to create, from an MST, a tour T such that c(T) ≤ 2c(T∗), or in other words, a tour T which is a 2-approximation

• The general approach will be to derive an MST, from that MST create a walk of the graph, and from that walk create a tour of G

• The first 2 steps can be seen graphically on the next slide


The Travelling Salesman Problem

[Figure: an example graph with its MST, and the walk of that MST]

TSP Approximation

• A walk is a sequence of vertices of the graph such that each vertex appears at least once in the walk (it also starts and ends at a)

• A walk of an MST can be defined recursively, where one simply calls “walk” on each child node of a given vertex

• Informally, one can note that while a vertex may be visited an arbitrary number of times (1 + the number of children it has), each edge of the MST is travelled across exactly twice (see the previous slide to convince yourself)

• Therefore, letting W denote a walk of an MST and letting the total weight or length of that walk be denoted by c(W), it is known that c(W) = 2c(MST)
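The recursive walk can be sketched in Python. The children map below is a hypothetical rooted MST (stored as an adjacency dict, an assumption) chosen so that its walk reproduces the example walk W from the slides:

```python
# Sketch: recursive walk of an MST rooted at the start node 'a'.
# The tree structure here is a hypothetical example matching the slides' W.

def walk(tree, node):
    """Preorder walk that returns to `node` after each child's subtree,
    so every tree edge is travelled across exactly twice."""
    sequence = [node]
    for child in tree.get(node, []):
        sequence += walk(tree, child)
        sequence.append(node)   # walk back up the edge (node, child)
    return sequence

mst = {'a': ['b', 'd'], 'b': ['c', 'h'], 'd': ['e'], 'e': ['f', 'g']}
print(''.join(walk(mst, 'a')))  # abcbhbadefegeda
# 7 MST edges, each crossed twice, plus the start: 15 vertices in the walk
```

Each edge contributes one downward and one upward traversal, which is exactly the c(W) = 2c(MST) observation.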


TSP Approximation

• In the graph seen 2 slides ago, the walk of the MST would be

W = abcbhbadefegeda

• From this we want to create a tour, which means it cannot visit the same node more than once

• Note that the fourth node we visit on W is b, which is a node we've already visited

• W can be altered into a new walk, say W′, such that the second visit to b is removed

W′ = abchbadefegeda

• But from the triangle inequality we know w(c, h) ≤ w(c, b) + w(b, h)

• Therefore, c(W′) ≤ c(W)


TSP Approximation

• This simplification of W can be iteratively applied until no repeat visits to nodes exist (except for the start node a); let this altered version of the walk be denoted by T

• Continuing with our example

T = abchdefga

• Note the walk T is also a tour

• We also know from our previous observations

c(T) ≤ c(W) = 2c(MST) ≤ 2c(T∗)

which is what we're trying to show
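Putting the pieces together, the whole 2-approximation can be sketched end to end. This is an illustration under stated assumptions, not the course's reference code: the instance is a made-up set of Euclidean points (so the triangle inequality holds), the MST comes from Prim's algorithm, and a preorder traversal of the MST directly produces the shortcut tour T:

```python
import math
from itertools import permutations

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def prim_mst(points):
    """Prim's algorithm; returns the MST as a children map rooted at vertex 0."""
    n = len(points)
    in_tree = {0}
    children = {i: [] for i in range(n)}
    while len(in_tree) < n:
        # cheapest edge leaving the tree so far
        u, v = min(((a, b) for a in in_tree for b in range(n) if b not in in_tree),
                   key=lambda e: dist(points[e[0]], points[e[1]]))
        children[u].append(v)
        in_tree.add(v)
    return children

def tsp_2approx(points):
    children = prim_mst(points)
    tour = []
    def visit(u):
        # preorder traversal: the walk W with every repeat visit
        # shortcut away (the W -> W' -> T step)
        tour.append(u)
        for c in children[u]:
            visit(c)
    visit(0)
    return tour + [0]   # end the tour back at the start node

def tour_cost(points, tour):
    return sum(dist(points[tour[i]], points[tour[i + 1]])
               for i in range(len(tour) - 1))

points = [(0, 0), (0, 2), (2, 2), (2, 0), (1, 3)]   # small made-up instance
tour = tsp_2approx(points)
approx = tour_cost(points, tour)
# brute-force the optimal tour to check the guarantee c(T) <= 2 c(T*)
best = min(tour_cost(points, [0] + list(p) + [0])
           for p in permutations(range(1, len(points))))
print(approx <= 2 * best)  # True
```

On instances this small the brute-force check is feasible; the point of the approximation is that the MST-based tour needs only polynomial time while staying within the factor-2 guarantee.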


Load Balancing

• Given m identical machines and n jobs, where job j has size sj, you're tasked with scheduling the jobs (determining which jobs are processed by which machines). Let Ti be the total workload of machine Mi. Let the “makespan” be the highest workload of all the machines, denoted by T, i.e.

T = max_{1 ≤ i ≤ m} Ti

• Let T∗ denote the optimal (minimum) makespan; ideally you'd like to schedule the jobs optimally, but this is an NP-hard problem

• Consider the greedy algorithm which looks at the next job and schedules it on the machine which currently has the lowest load

• This algorithm turns out to be a 2-approximation
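The greedy rule can be sketched with a min-heap of machine loads (the heap is an implementation choice for illustration, not part of the slides; the job sizes are made up):

```python
import heapq

# Sketch: greedy list scheduling; each job goes to the machine with the
# currently smallest load. A min-heap makes this O(n log m).

def greedy_makespan(sizes, m):
    loads = [0.0] * m
    heapq.heapify(loads)                  # min-heap keyed by current load
    for s in sizes:
        lightest = heapq.heappop(loads)   # machine with the lowest load
        heapq.heappush(loads, lightest + s)
    return max(loads)

sizes = [2, 3, 4, 6, 2, 2]
T = greedy_makespan(sizes, m=3)
print(T)  # 8.0
# lower bounds on T* used in the proof: average load and the largest job
lower = max(sum(sizes) / 3, max(sizes))
print(T <= 2 * lower)  # True, as the 2-approximation guarantees
```

The two printed lower bounds are exactly the ones the proof on the next slides relies on.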


Load Balancing - Proof

• We can note two bounds on the optimal makespan
  1 T∗ ≥ (1/m) ∑_{j=1}^{n} sj
  2 T∗ ≥ max_j sj

• Let Tk be the workload of the kth machine after the greedy algorithm executes

• Let Ti be the makespan (the greatest workload among the machines); therefore, Mi is the most heavily loaded machine

• Consider the last job to be scheduled on Mi (let it be job j with size sj) and consider how the system looked the moment before that job was scheduled

• At that moment in time, Mi had a workload of Ti − sj, and because the job was sent to Mi, we know all other machines had a workload of at least Ti − sj


Load Balancing - Proof - cont

• Therefore, ∀k : Tk ≥ (Ti − sj)

⇒ m(Ti − sj) ≤ ∑_{k=1}^{m} Tk

⇒ (Ti − sj) ≤ (1/m) ∑_{k=1}^{m} Tk

⇒ (Ti − sj) ≤ (1/m) ∑_{j=1}^{n} sj

Therefore, from bound 1, (Ti − sj) ≤ T∗