2.3 Scheduling jobs on identical parallel machines
• There are n jobs to be processed, and there are m identical machines (running in parallel) to which each job may be assigned
• Each job j = 1, …, n, must be processed on one of these machines for p_j time units without interruption, and each job is available for processing at time 0
• Each machine can process at most one job at a time
9-Apr-15 MAT-72606 ApprAl, Spring 2015 96
• The aim is to complete all jobs as soon as possible; that is, if job j completes at time C_j (presuming that the schedule starts at time 0), then we wish to minimize C_max = max_{j=1,…,n} C_j, which is often called the makespan or length of the schedule
• An equivalent view of the same problem is as a load-balancing problem: there are n items, each of a given weight p_j, and they are to be distributed among m machines
• The aim is to assign each item to one machine so as to minimize the maximum total weight assigned to one machine
• Local search algorithms make a set of local changes or moves that change one feasible solution to another
• Start with any schedule; consider the job l that finishes last; check whether or not there exists a machine to which it can be reassigned that would cause this job to finish earlier
• If so, transfer job l to this other machine
• We can determine whether to transfer job l by checking if there exists a machine that finishes its currently assigned jobs earlier than C_l − p_l
• Repeat this procedure until the last job to complete cannot be transferred
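The procedure above can be sketched in Python as follows. This is a minimal sketch: the function name and the round-robin starting schedule are my own choices, and any job on a maximally loaded machine is treated as "the job that finishes last" (job order within a machine does not affect the makespan).

```python
def local_search_schedule(p, m):
    """Local search for makespan scheduling on m identical machines.

    p: list of processing times. Starts from an arbitrary schedule and
    repeatedly transfers a job that finishes last to the machine that
    currently finishes earliest, whenever that makes the job finish
    earlier. Returns (assignment, makespan).
    """
    n = len(p)
    assign = [j % m for j in range(n)]        # arbitrary starting schedule
    load = [0] * m
    for j in range(n):
        load[assign[j]] += p[j]
    improved = True
    while improved:
        improved = False
        busiest = max(range(m), key=load.__getitem__)
        for j in range(n):                    # any job on the busiest machine
            if assign[j] != busiest:          # may be viewed as finishing last
                continue
            target = min(range(m), key=load.__getitem__)
            # transfer if some machine finishes before job j would start
            if load[target] < load[busiest] - p[j]:
                load[busiest] -= p[j]
                load[target] += p[j]
                assign[j] = target
                improved = True
                break
    return assign, max(load)
```

On p = [3, 3, 2, 2, 2] with m = 2, the round-robin start gives loads 7 and 5, and no single transfer improves the last job, so the algorithm stops at makespan 7 even though the optimum is 6 — consistent with the factor-2 guarantee proved below.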
• Some natural lower bounds on the length of an optimal schedule, C*_max:
• Since each job must be processed, it follows that C*_max ≥ max_{j=1,…,n} p_j
• On the other hand, there is, in total, Σ_{j=1}^n p_j units of processing to accomplish, and only m machines to do this work
• On average, a machine will be assigned (1/m) Σ_{j=1}^n p_j units of work, and there must exist one machine that is assigned at least that much work; hence C*_max ≥ (1/m) Σ_{j=1}^n p_j
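Both bounds are trivial to compute; a small helper (the function name is mine) combining them:

```python
def makespan_lower_bound(p, m):
    """Lower bound on the optimal makespan C*_max: the larger of the
    longest processing time and the average load per machine."""
    return max(max(p), sum(p) / m)
```

This combined bound is exactly the quantity against which the local search and greedy schedules are compared in the analyses that follow.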
• Consider the solution produced by the local search algorithm
• Let l be a job that completes last in this final schedule; the completion time of job l, C_l, is equal to this solution's objective function value
• By the fact that the algorithm terminated with this schedule, every other machine must be busy from time 0 until the start of job l at time S_l = C_l − p_l
• We can partition the schedule into two disjoint time intervals, from time 0 until S_l, and the time during which job l is being processed
• By the first lower bound, the latter interval has length at most C*_max: p_l ≤ C*_max
• Now consider the former time interval; we know that each machine is busy processing jobs throughout this period
• The total amount of work being processed in this interval is m·S_l, which is clearly no more than the total work to be done, Σ_{j=1}^n p_j
• Hence, S_l ≤ (1/m) Σ_{j=1}^n p_j ≤ C*_max
• By combining this with the first lower bound, we see that C_max = S_l + p_l ≤ 2·C*_max
• The length of the schedule before the start of job l is at most C*_max, as is the length of the schedule afterward; in total, the makespan of the schedule computed is at most 2·C*_max
• The value of C_max for the sequence of schedules produced, iteration by iteration, never increases
• It can remain the same, but then the number of machines that achieve the max value decreases
• Assume that when we transfer a job to another machine, then we reassign that job to the one that is currently finishing earliest
• Let C_min be the completion time of a machine that completes all its processing the earliest
• C_min never decreases, and if it remains the same, then the number of machines that achieve this minimum value decreases
• This implies that we never transfer a job twice
– Suppose this claim is not true, and consider the first time that a job j is transferred twice, say, from machine i to i', and later then to i''
– When job j is reassigned from machine i to i', it then starts at C_min for the current schedule
– Similarly, when job j is reassigned from machine i' to i'', it then starts at the current C'_min
– Furthermore, no change occurred to the schedule on machine i' in between these two moves for job j
– Hence, C'_min must be strictly smaller than C_min (in order for the transfer to be an improving move), but this contradicts our claim that the value C_min is non-decreasing over the iterations of the local search algorithm
• Hence, no job is transferred twice, and after at most n iterations, the algorithm must terminate
Theorem 2.5: The local search procedure for scheduling jobs on identical parallel machines is a 2-approximation algorithm.
• It is not hard to see that the analysis of the approximation ratio can be refined slightly
• In deriving the previous inequality, we included job l among the work to be done prior to the start of job l
• Hence, we actually derived that S_l ≤ (1/m) Σ_{j≠l} p_j, and hence the total length of the schedule produced is at most

  p_l + (1/m) Σ_{j≠l} p_j = (1 − 1/m) p_l + (1/m) Σ_{j=1}^n p_j
• By the two lower bounds, we see that the schedule has length at most (2 − 1/m)·C*_max
• The difference between this bound and 2 is significant only if there are very few machines
• Another algorithm assigns the jobs as soon as there is machine availability to process them
• When a machine becomes idle, then one of the remaining jobs is assigned to start processing
• This is called the list scheduling algorithm, since one can equivalently view it as first ordering the jobs in a list (arbitrarily), and the next job to be processed is the one at the top of the list
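The list scheduling rule can be sketched with a heap of machine loads (a minimal sketch; the function name is mine):

```python
import heapq

def list_scheduling(p, m):
    """List scheduling: process jobs in list order, always assigning the
    next job to the machine that becomes idle first (least loaded)."""
    machines = [(0, i) for i in range(m)]     # (current load, machine id)
    heapq.heapify(machines)
    assign = [None] * len(p)
    for j, pj in enumerate(p):
        load, i = heapq.heappop(machines)     # least loaded machine
        assign[j] = i
        heapq.heappush(machines, (load + pj, i))
    return assign, max(load for load, _ in machines)
```

For example, on p = [2, 2, 2, 3, 3] with m = 2 the list order produces loads 7 and 5 (makespan 7), while the optimum is 6 — within the factor of 2 stated in Theorem 2.6 below.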
• From the load-balancing perspective, the next job on the list is assigned to the machine that is currently the least heavily loaded
• In this sense the algorithm is a greedy one
• If one uses this as the starting point for the local search procedure, that algorithm would immediately declare that the solution cannot be improved!
• Consider a job l that is last to complete
• Each machine is busy until C_l − p_l, since otherwise we would have assigned job l to that other machine
• Hence, no transfers are possible
Theorem 2.6: The list scheduling algorithm for the problem of minimizing the makespan on identical parallel machines is a 2-approximation algorithm.
• It is natural to use an additional greedy rule that first sorts the jobs in non-increasing order of processing time
• The relative error in the length of the schedule produced in Thms 2.5 and 2.6 is entirely due to the length of the last job to finish
• If that job is short, then the error is not too big
• This greedy algorithm is called the longest processing time rule, or LPT
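A sketch of the LPT rule (function name mine): sort the jobs once, then apply the same least-loaded-machine assignment as list scheduling.

```python
def lpt_schedule(p, m):
    """LPT rule: sort jobs by non-increasing processing time, then
    list-schedule onto the currently least loaded machine."""
    load = [0] * m
    assign = [None] * len(p)
    for j in sorted(range(len(p)), key=lambda j: -p[j]):
        i = min(range(m), key=load.__getitem__)
        assign[j] = i
        load[i] += p[j]
    return assign, max(load)
```

On p = [2, 2, 2, 3, 3] with m = 2, LPT places the two long jobs first and ends with makespan 7 against an optimum of 6, within the 4/3 factor of Theorem 2.7 below.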
Theorem 2.7: The LPT rule is a 4/3-approximation algorithm for scheduling jobs to minimize the makespan on identical parallel machines.
Proof: Suppose that the theorem is false, and consider an input that provides a counterexample to the theorem. Assume that p_1 ≥ p_2 ≥ … ≥ p_n. We can assume that the last job to complete is indeed the last (and smallest) job in the list. This follows w.l.o.g.: any counterexample for which the last job l to complete is not the smallest can yield a smaller counterexample, simply by omitting all of the jobs l + 1, …, n.
The length of the schedule produced is the same, and the optimal value of the reduced input can be no larger. Hence the reduced input is also a counterexample.
So the last job to complete in the schedule is job n. If this is a counterexample, what do we know about p_n? If p_n ≤ C*_max/3, then the analysis of Theorem 2.6 implies that the schedule length is at most (4/3)·C*_max, and so this is not a counterexample. Hence, we know that in this purported counterexample, job n (and therefore all of the jobs) has a processing requirement strictly greater than C*_max/3.
In the optimal schedule, each machine may process at most two jobs (since otherwise the total processing assigned to that machine is more than C*_max). We have reduced our assumed counterexample to the point where it simply cannot exist.
Lemma 2.8: For any input to the problem of minimizing the makespan on identical parallel machines for which the processing requirement of each job is more than one-third the optimal makespan, the LPT rule computes an optimal schedule.
As a consequence no counterexample to the theorem can exist, and hence it must be true.
2.4 The traveling salesman problem
• In the TSP there is a given set of cities {1, 2, …, n}, and the input consists of a symmetric n × n matrix C = (c_ij) that specifies the cost of traveling from city i to city j
• By convention,
– we assume that the cost of traveling from any city to itself is equal to 0, and
– costs are nonnegative;
– the cost of traveling from city i to city j is equal to the cost of traveling from j to city i
• Input: an undirected complete graph with a cost associated with each edge; a feasible solution (tour) is a Hamiltonian cycle in this graph
• I.e., we specify a cyclic permutation σ of the cities or, equivalently, a traversal of the cities in the order σ(1), σ(2), …, σ(n), where each city j is listed as a unique image σ(i)
• The cost of the tour is equal to

  c_{σ(n)σ(1)} + Σ_{i=1}^{n−1} c_{σ(i)σ(i+1)}

• Each tour has n distinct representations; it does not matter which city is the start of the tour
• There are severe limits on our ability to compute near-optimal tours
• It is NP-complete to decide whether a given undirected graph G = (V, E) has a Hamiltonian cycle
• An approximation algorithm for the TSP can be used to solve the Hamiltonian cycle problem:
• Given a graph G = (V, E), form an input to the TSP by setting, for each pair (i, j), the cost c_ij = 1 if (i, j) ∈ E, and equal to αn + 2 otherwise
• If there is a Hamiltonian cycle in G, then there is a tour of cost n, and otherwise each tour costs at least αn + n + 1
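The reduction is easy to code directly (a sketch; the function name and argument layout are my own):

```python
def hc_to_tsp_costs(n, edges, alpha):
    """Hamiltonian-cycle-to-TSP reduction: graph edges get cost 1,
    non-edges get cost alpha*n + 2. G has a Hamiltonian cycle iff the
    optimal tour costs n; otherwise every tour uses at least one
    non-edge and costs at least (n - 1) + alpha*n + 2."""
    E = {frozenset(e) for e in edges}
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            cost = 1 if frozenset((i, j)) in E else alpha * n + 2
            c[i][j] = c[j][i] = cost
    return c
```

Any α-approximation algorithm run on this input returns a tour of cost at most αn exactly when G is Hamiltonian, which is what makes the hardness argument work.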
• If there were to exist a 2-approximation algorithm for the TSP, then we could use it to distinguish graphs with Hamiltonian cycles from those without any
• Run the approximation algorithm on the new TSP input, and if the tour computed has cost at most αn, then there exists a Hamiltonian cycle in G, and otherwise there does not
• Setting the cost for the “non-edges” to be αn + 2 has a similarly inflated consequence for any α, and we obtain an input to the TSP of polynomial size provided that, for example, α = O(2^n)
Theorem 2.9: For any α > 1, there does not exist an α-approximation algorithm for the traveling salesman problem on n cities, provided P ≠ NP. In fact, the existence of an O(2^n)-approximation algorithm for the TSP would similarly imply that P = NP.
• This however is not the end of the story
• A natural assumption to make about the input to the TSP is to restrict attention to those inputs that are metric; that is, for each triple i, j, k ∈ V, we have that the triangle inequality c_ik ≤ c_ij + c_jk holds
• This rules out the construction used in the reduction for the HC problem above; the non-edges can be given cost at most 2 for the triangle inequality to hold, and this is too small to yield a nontrivial nonapproximability result
• A natural greedy heuristic to consider for the TSP is often referred to as the nearest addition algorithm
• Find the two closest cities, say, i and j, and start by building a tour on that pair of cities consisting of going from i to j and then back to i again
• This is the first iteration
• In each subsequent iteration, we extend the tour on the current subset S by including one additional city, until we include all cities
• In each iteration, we find a pair of cities i ∈ S and j ∉ S for which the cost c_ij is minimum; let k be the city that follows i in the current tour on S
• We add j to S, and insert j into the current tour between i and k
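The two iteration steps above can be sketched as follows (a minimal sketch; the function name is mine, and ties are broken by first occurrence):

```python
def nearest_addition(c):
    """Nearest addition heuristic for the metric TSP.
    c: symmetric cost matrix; returns the tour as a list of cities."""
    n = len(c)
    # first iteration: a tour on the two closest cities
    i0, j0 = min(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda e: c[e[0]][e[1]])
    tour, in_tour = [i0, j0], {i0, j0}
    while len(tour) < n:
        # cheapest pair with i in the tour and j outside it
        i, j = min(((i, j) for i in tour
                    for j in range(n) if j not in in_tour),
                   key=lambda e: c[e[0]][e[1]])
        tour.insert(tour.index(i) + 1, j)   # insert j right after i
        in_tour.add(j)
    return tour
```

This straightforward version costs O(n^3) time; maintaining, for each city outside the tour, its nearest tour city brings it down to O(n^2), exactly as in Prim's algorithm.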
• The crux of the analysis is the relationship of this algorithm to Prim's algorithm for the minimum spanning tree (MST) in an undirected graph
• A spanning tree of a connected graph G = (V, E) is a minimal subset of edges F ⊆ E such that each pair of nodes in V is connected by a path using edges only in F
• A MST is a spanning tree for which the total edge cost is minimized
• Prim's algorithm computes a MST by iteratively constructing a set S along with a tree T, starting with S = {v} for some (arbitrarily chosen) node v ∈ V and T = (S, F) with F = ∅
• In each iteration, it determines the edge (i, j) such that i ∈ S and j ∉ S is of minimum cost, and adds the edge (i, j) to F
• Clearly, this is the same sequence of vertex pairs identified by the nearest addition algorithm
• Furthermore, there is another important relationship between the MST problem and the TSP
Lemma 2.10: For any input to the traveling salesman problem, the cost of the optimal tour is at least the cost of the minimum spanning tree on the same input.
Theorem 2.11: The nearest addition algorithm for the metric traveling salesman problem is a 2-approximation algorithm.
Proof: Let S_2, …, S_n be the subsets identified at the end of each iteration of the nearest addition algorithm (where |S_k| = k), and let F = {(i_2, j_2), …, (i_n, j_n)}, where (i_k, j_k) is the edge identified in iteration k − 1 (with i_k ∈ S_{k−1}, k = 3, …, n). We also know that ({1, …, n}, F) is a MST for the original input, when viewed as a complete undirected graph with edge costs. Thus, if OPT is the optimal value for the TSP input, then

  OPT ≥ Σ_{k=2}^n c_{i_k j_k}.
The cost of the tour on the first two nodes i_2 and j_2 is exactly 2c_{i_2 j_2}. Consider an iteration in which a city j is inserted between cities i and k in the current tour. How much does the length of the tour increase? An easy calculation gives c_ij + c_jk − c_ik. By the triangle inequality, we have that c_jk ≤ c_ji + c_ik or, equivalently, c_jk − c_ik ≤ c_ij. Hence, the increase in cost in this iteration is at most c_ij + c_ij = 2c_ij. Thus, overall, we know that the final tour has cost at most

  2 Σ_{k=2}^n c_{i_k j_k} ≤ 2·OPT,

and the theorem is proved.
• A graph is said to be Eulerian if there exists a permutation of its edges of the form (i_0, i_1), (i_1, i_2), …, (i_{k−1}, i_k), (i_k, i_0)
• We will call this permutation a traversal of the edges, since it allows us to visit every edge exactly once
• A graph is Eulerian if and only if it is connected and each node has even degree (the number of edges with that node as one of its endpoints)
• Furthermore, if a graph is Eulerian, one can easily construct the required traversal of the edges, given the graph
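One standard way to construct such a traversal (Hierholzer's algorithm, which the slide does not name) can be sketched for an undirected multigraph as follows:

```python
def euler_traversal(adj):
    """Hierholzer's algorithm: a closed walk using every edge exactly
    once. adj: dict node -> list of neighbours; each undirected edge
    appears in both endpoint lists (parallel edges appear repeatedly)."""
    rem = {u: list(vs) for u, vs in adj.items()}   # edges still unused
    start = next(iter(rem))
    stack, walk = [start], []
    while stack:
        u = stack[-1]
        if rem[u]:
            v = rem[u].pop()
            rem[v].remove(u)        # consume one copy of edge {u, v}
            stack.append(v)
        else:
            walk.append(stack.pop())
    return walk                     # walk[0] == walk[-1]
```

The walk it returns is exactly the node sequence i_0, i_1, …, i_k, i_0 used by the shortcutting arguments below; the running time is linear in the number of nodes and edges (up to the list removals in this simple sketch).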
• To find a good tour for a TSP input, suppose that we first compute a MST (e.g., by Prim's alg.)
• Suppose that we then replace each edge by two copies of itself
• The resulting (multi)graph has cost at most 2·OPT and is Eulerian
• We can construct a tour of the cities from the Eulerian traversal of the edges, (i_0, i_1), (i_1, i_2), …, (i_k, i_0)
• Consider the sequence of nodes i_0, i_1, …, i_k, and remove all but the first occurrence of each city in this sequence
• This yields a tour containing each city exactly once (assuming we then return to i_0 at the end)
• To bound the length of this tour, consider two consecutive cities in this tour, i_l and i_m
• Cities i_{l+1}, …, i_{m−1} have already been visited “earlier” in the tour
• By the triangle inequality, c_{i_l i_m} can be upper bounded by the total cost of the edges traversed in the Eulerian traversal between i_l and i_m, i.e., the total cost of the edges (i_l, i_{l+1}), …, (i_{m−1}, i_m)
• In total, the cost of the tour is at most the total cost of all of the edges in the Eulerian graph, which is at most 2·OPT
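Putting the pieces together gives the double-tree algorithm. A sketch (function name mine), using the observation that a DFS preorder of the MST is exactly the shortcut of the doubled-tree Eulerian traversal:

```python
def double_tree_tour(c):
    """Double-tree 2-approximation for the metric TSP: build an MST
    (Prim), double its edges, and shortcut the Eulerian traversal.
    A DFS preorder of the tree yields that shortcut directly."""
    n = len(c)
    in_tree = [False] * n
    in_tree[0] = True
    children = [[] for _ in range(n)]
    for _ in range(n - 1):                      # Prim's algorithm
        i, j = min(((i, j) for i in range(n) if in_tree[i]
                    for j in range(n) if not in_tree[j]),
                   key=lambda e: c[e[0]][e[1]])
        in_tree[j] = True
        children[i].append(j)
    tour, stack = [], [0]                       # DFS preorder = shortcut tour
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour
```

By the argument above, the tour returned has cost at most twice the MST cost, hence at most 2·OPT.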
Theorem 2.12: The double-tree algorithm for the metric traveling salesman problem is a 2-approximation algorithm.
• The message of the analysis of the double-tree algorithm is also quite useful
• If we can efficiently construct an Eulerian subgraph of the complete input graph, for which the total edge cost is at most α times the optimal value of the TSP input, then we have derived an α-approximation algorithm as well
• This strategy can be carried out to yield a 3/2-approximation algorithm
• Consider the output from the MST computation
• This graph is not Eulerian, since any tree must have nodes of degree one, but it is possible that not many nodes have odd degree
• Let O be the set of odd-degree nodes in the MST
• The sum of node degrees must be even, since each edge in the graph contributes 2 to this total
• The total degree of the even-degree nodes must also be even, but then the total degree of the odd-degree nodes must also be even
• I.e., we must have an even number of odd-degree nodes; |O| = 2k for some positive integer k
• Suppose that we pair up the nodes in O: (o_1, o_2), (o_3, o_4), …, (o_{2k−1}, o_{2k})
• Such a collection of edges that contain each node in O exactly once is called a perfect matching of O
• One of the classic results of combinatorial optimization is that given a complete graph (on an even number of nodes) with edge costs, it is possible to compute the perfect matching of minimum total cost in polynomial time
• Given the MST, we identify the set O of odd-degree nodes with even cardinality, and then compute a minimum-cost perfect matching on O
• If we add this set of edges to our MST, we have constructed an Eulerian graph on our original set of cities: it is connected (since the spanning tree is connected) and has even degree (since we added a new edge incident to each node of odd degree in the spanning tree)
• As in the double-tree algorithm, we can shortcut this graph to produce a tour of no greater cost
• This is known as Christofides' algorithm
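A sketch of the whole pipeline follows. All names are mine, and the minimum-cost perfect matching is done here by brute-force recursion (fine only for tiny odd sets); a real implementation would use a polynomial-time blossom-based matching algorithm, as the classic result above guarantees.

```python
def _min_matching(nodes, c):
    """Brute-force min-cost perfect matching on an even-sized node list."""
    if not nodes:
        return []
    u = nodes[0]
    best, best_cost = [], float("inf")
    for k in range(1, len(nodes)):
        rest = nodes[1:k] + nodes[k + 1:]
        m = _min_matching(rest, c)
        cost = c[u][nodes[k]] + sum(c[a][b] for a, b in m)
        if cost < best_cost:
            best_cost, best = cost, [(u, nodes[k])] + m
    return best

def christofides(c):
    """Christofides sketch: MST + matching on odd nodes + shortcut."""
    n = len(c)
    adj = [[] for _ in range(n)]                # MST as a multigraph
    in_tree = [False] * n
    in_tree[0] = True
    for _ in range(n - 1):                      # Prim's algorithm
        i, j = min(((i, j) for i in range(n) if in_tree[i]
                    for j in range(n) if not in_tree[j]),
                   key=lambda e: c[e[0]][e[1]])
        in_tree[j] = True
        adj[i].append(j); adj[j].append(i)
    odd = [v for v in range(n) if len(adj[v]) % 2 == 1]
    for u, v in _min_matching(odd, c):          # make all degrees even
        adj[u].append(v); adj[v].append(u)
    stack, walk = [0], []                       # Euler tour (Hierholzer)
    while stack:
        u = stack[-1]
        if adj[u]:
            v = adj[u].pop()
            adj[v].remove(u)
            stack.append(v)
        else:
            walk.append(stack.pop())
    seen, tour = set(), []                      # shortcut repeated cities
    for u in walk:
        if u not in seen:
            seen.add(u); tour.append(u)
    return tour
```

The Eulerian graph here has cost at most MST + matching ≤ OPT + OPT/2, which is where the 3/2 factor of Theorem 2.13 comes from.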
Theorem 2.13: Christofides' algorithm for the metric traveling salesman problem is a 3/2-approximation algorithm.
Proof: We want to show that the edges in the Eulerian graph produced by the algorithm have total cost at most (3/2)·OPT. We know that the MST edges have total cost at most OPT. So we need only show that the perfect matching on O has cost at most OPT/2.
First observe that there is a tour on just the nodes in O of total cost at most OPT. This again uses the shortcutting argument. Start with the optimal tour on the entire set of cities, and if for two cities i and j, the optimal tour between i and j contains only cities that are not in O, then include edge (i, j) in the tour on O.
Each edge in the tour corresponds to disjoint paths in the original tour, and hence by the triangle inequality, the total length of the tour on O is no more than the length of the original tour.
Now consider this “shortcut” tour on the node set O. Color these edges red and blue, alternating colors. This partitions the edges into two sets; each of these is a perfect matching on the node set O. In total, these two edge sets have cost at most OPT. Thus, the cheaper of these two sets has cost at most OPT/2. Hence, there is a perfect matching on O of cost at most OPT/2. Therefore, the algorithm must find a matching of cost at most OPT/2.
• No better approximation algorithm for the metric TSP is known
• Substantially better algorithms might yet be found, since the strongest negative result is
Theorem 2.14: Unless P = NP, for any constant α < 1.0045, no α-approximation algorithm for the metric TSP exists.
• It is possible to obtain a PTAS in the case that cities correspond to points in the Euclidean plane and the cost of traveling between two cities is equal to the Euclidean distance between the two points
3 Rounding data and dynamic programming
• Dynamic programming is a standard technique in algorithm design in which an optimal solution for a problem is built up from optimal solutions for a number of subproblems, normally stored in a table or multidimensional array
• Approximation algorithms can be designed using dynamic programming in a variety of ways, many of which involve rounding the input data in some way
3.1 The knapsack problem
• We are given a set of items I = {1, …, n}, where each item i has a value v_i and a size s_i
• All sizes and values are positive integers
• The knapsack has a positive integer capacity B
• The goal is to find a subset of items S ⊆ I that maximizes the value Σ_{i∈S} v_i of items in the knapsack subject to the constraint that the total size of these items is no more than the capacity; that is, Σ_{i∈S} s_i ≤ B
• We consider only items that could actually fit in the knapsack, so that s_i ≤ B for each i ∈ I
• We can use dynamic programming to find the optimal solution to the knapsack problem
• We maintain an array entry A(j) for j = 1, …, n
• Each entry A(j) is a list of pairs (t, w)
• A pair (t, w) in the list of entry A(j) indicates that there is a set S from the first j items that uses space exactly t and has value exactly w; i.e., there exists a set S ⊆ {1, …, j} s.t. Σ_{i∈S} s_i = t ≤ B and Σ_{i∈S} v_i = w
• Each list keeps track of only the most efficient pairs
• To do this, we need the notion of one pair dominating another one:
– (t, w) dominates another pair (t', w') if t ≤ t' and w ≥ w'; that is, the solution indicated by the pair (t, w) uses no more space than (t', w'), but has at least as much value
• Domination is a transitive property: if (t, w) dominates (t', w') which dominates (t'', w''), then (t, w) also dominates (t'', w'')
• We will ensure that in any list, no pair dominates another one
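The per-item update and the pruning of dominated pairs can be sketched as follows (function name mine; only the final list's best value is returned here):

```python
def knapsack(values, sizes, B):
    """Exact knapsack DP over lists of non-dominated (size, value)
    pairs. Each iteration merges A(j-1) with A(j-1) shifted by item j,
    then discards dominated pairs."""
    pairs = [(0, 0)]                                   # A(0): the empty set
    for v, s in zip(values, sizes):
        merged = pairs + [(t + s, w + v) for t, w in pairs if t + s <= B]
        # prune: after sorting by (size, -value), a pair survives only
        # if its value beats every pair of smaller-or-equal size
        merged.sort(key=lambda p: (p[0], -p[1]))
        pairs, best_w = [], -1
        for t, w in merged:
            if w > best_w:
                pairs.append((t, w))
                best_w = w
    return max(w for _, w in pairs)
```

After pruning, each surviving list is strictly increasing in both size and value, which is exactly the no-pair-dominates-another invariant described above.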