2.3 Scheduling jobs on identical parallel machines
• There are n jobs to be processed, and there are m identical machines (running in parallel) to which each job may be assigned
• Each job j = 1, …, n, must be processed on one of these machines for p_j time units without interruption, and each job is available for processing at time 0
• Each machine can process at most one job at a time
9-Apr-15 MAT-72606 ApprAl, Spring 2015 96
• The aim is to complete all jobs as soon as possible; that is, if job j completes at time C_j (presuming that the schedule starts at time 0), then we wish to minimize C_max = max_{j=1,…,n} C_j, which is often called the makespan or length of the schedule
• An equivalent view of the same problem is as a load-balancing problem: there are n items, each of a given weight p_j, and they are to be distributed among m machines
• The aim is to assign each item to one machine so as to minimize the maximum total weight assigned to one machine
• Local search algorithms make a set of local changes or moves that change one feasible solution to another
• Start with any schedule; consider the job l that finishes last; check whether or not there exists a machine to which it can be reassigned that would cause this job to finish earlier
• If so, transfer job l to this other machine
• We can determine whether to transfer job l by checking if there exists a machine that finishes its currently assigned jobs earlier than C_l − p_l
• Repeat this procedure until the last job to complete cannot be transferred
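The procedure above can be sketched in Python as follows. This is a minimal sketch: the function name and the round-robin starting schedule are my own choices, and any job on a maximally loaded machine is treated as "the job that finishes last" (job order within a machine does not affect the makespan).

```python
def local_search_schedule(p, m):
    """Local search for makespan scheduling on m identical machines.

    p: list of processing times. Starts from an arbitrary schedule and
    repeatedly transfers a job that finishes last to the machine that
    currently finishes earliest, whenever that makes the job finish
    earlier. Returns (assignment, makespan).
    """
    n = len(p)
    assign = [j % m for j in range(n)]        # arbitrary starting schedule
    load = [0] * m
    for j in range(n):
        load[assign[j]] += p[j]
    improved = True
    while improved:
        improved = False
        busiest = max(range(m), key=load.__getitem__)
        for j in range(n):                    # any job on the busiest machine
            if assign[j] != busiest:          # may be viewed as finishing last
                continue
            target = min(range(m), key=load.__getitem__)
            # transfer if some machine finishes before job j would start
            if load[target] < load[busiest] - p[j]:
                load[busiest] -= p[j]
                load[target] += p[j]
                assign[j] = target
                improved = True
                break
    return assign, max(load)
```

On p = [3, 3, 2, 2, 2] with m = 2, the round-robin start gives loads 7 and 5, and no single transfer improves the last job, so the algorithm stops at makespan 7 even though the optimum is 6 — consistent with the factor-2 guarantee proved below.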
• Some natural lower bounds on the length of an optimal schedule, C*_max:
• Since each job must be processed, it follows that C*_max ≥ max_{j=1,…,n} p_j
• On the other hand, there is, in total, Σ_{j=1}^n p_j units of processing to accomplish, and only m machines to do this work
• On average, a machine will be assigned (1/m) Σ_{j=1}^n p_j units of work, and there must exist one machine that is assigned at least that much work; hence C*_max ≥ (1/m) Σ_{j=1}^n p_j
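Both bounds are trivial to compute; a small helper (the function name is mine) combining them:

```python
def makespan_lower_bound(p, m):
    """Lower bound on the optimal makespan C*_max: the larger of the
    longest processing time and the average load per machine."""
    return max(max(p), sum(p) / m)
```

This combined bound is exactly the quantity against which the local search and greedy schedules are compared in the analyses that follow.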
• Consider the solution produced by the local search algorithm
• Let l be a job that completes last in this final schedule; the completion time of job l, C_l, is equal to this solution's objective function value
• By the fact that the algorithm terminated with this schedule, every other machine must be busy from time 0 until the start of job l at time S_l = C_l − p_l
• We can partition the schedule into two disjoint time intervals, from time 0 until S_l, and the time during which job l is being processed
• By the first lower bound, the latter interval has length at most C*_max: p_l ≤ C*_max
• Now consider the former time interval; we know that each machine is busy processing jobs throughout this period
• The total amount of work being processed in this interval is m·S_l, which is clearly no more than the total work to be done, Σ_{j=1}^n p_j
• Hence, S_l ≤ (1/m) Σ_{j=1}^n p_j ≤ C*_max
• By combining this with the first lower bound, we see that C_max = S_l + p_l ≤ 2·C*_max
• The length of the schedule before the start of job l is at most C*_max, as is the length of the schedule afterward; in total, the makespan of the schedule computed is at most 2·C*_max
• The value of C_max for the sequence of schedules produced, iteration by iteration, never increases
• It can remain the same, but then the number of machines that achieve the max value decreases
• Assume that when we transfer a job to another machine, then we reassign that job to the one that is currently finishing earliest
• Let C_min be the completion time of a machine that completes all its processing the earliest
• C_min never decreases, and if it remains the same, then the number of machines that achieve this minimum value decreases
• This implies that we never transfer a job twice
– Suppose this claim is not true, and consider the first time that a job j is transferred twice, say, from machine i to i', and later then to i''
– When job j is reassigned from machine i to i', it then starts at C_min for the current schedule
– Similarly, when job j is reassigned from machine i' to i'', it then starts at the current C'_min
– Furthermore, no change occurred to the schedule on machine i' in between these two moves for job j
– Hence, C'_min must be strictly smaller than C_min (in order for the transfer to be an improving move), but this contradicts our claim that the value C_min is non-decreasing over the iterations of the local search algorithm
• Hence, no job is transferred twice, and after at most n iterations, the algorithm must terminate
Theorem 2.5: The local search procedure for scheduling jobs on identical parallel machines is a 2-approximation algorithm.
• It is not hard to see that the analysis of the approximation ratio can be refined slightly
• In deriving the previous inequality, we included job l among the work to be done prior to the start of job l
• Hence, we actually derived that S_l ≤ (1/m) Σ_{j≠l} p_j, and hence the total length of the schedule produced is at most

  p_l + (1/m) Σ_{j≠l} p_j = (1 − 1/m) p_l + (1/m) Σ_{j=1}^n p_j
• By the two lower bounds, we see that the schedule has length at most (2 − 1/m)·C*_max
• The difference between this bound and 2 is significant only if there are very few machines
• Another algorithm assigns the jobs as soon as there is machine availability to process them
• When a machine becomes idle, then one of the remaining jobs is assigned to start processing
• This is called the list scheduling algorithm, since one can equivalently view it as first ordering the jobs in a list (arbitrarily), and the next job to be processed is the one at the top of the list
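The list scheduling rule can be sketched with a heap of machine loads (a minimal sketch; the function name is mine):

```python
import heapq

def list_scheduling(p, m):
    """List scheduling: process jobs in list order, always assigning the
    next job to the machine that becomes idle first (least loaded)."""
    machines = [(0, i) for i in range(m)]     # (current load, machine id)
    heapq.heapify(machines)
    assign = [None] * len(p)
    for j, pj in enumerate(p):
        load, i = heapq.heappop(machines)     # least loaded machine
        assign[j] = i
        heapq.heappush(machines, (load + pj, i))
    return assign, max(load for load, _ in machines)
```

For example, on p = [2, 2, 2, 3, 3] with m = 2 the list order produces loads 7 and 5 (makespan 7), while the optimum is 6 — within the factor of 2 stated in Theorem 2.6 below.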
• From the load-balancing perspective, the next job on the list is assigned to the machine that is currently the least heavily loaded
• In this sense the algorithm is a greedy one
• If one uses this as the starting point for the local search procedure, that algorithm would immediately declare that the solution cannot be improved!
• Consider a job l that is last to complete
• Each machine is busy until C_l − p_l, since otherwise we would have assigned job l to that other machine
• Hence, no transfers are possible
Theorem 2.6: The list scheduling algorithm for the problem of minimizing the makespan on identical parallel machines is a 2-approximation algorithm.
• It is natural to use an additional greedy rule that first sorts the jobs in non-increasing order of processing time
• The relative error in the length of the schedule produced in Thms 2.5 and 2.6 is entirely due to the length of the last job to finish
• If that job is short, then the error is not too big
• This greedy algorithm is called the longest processing time rule, or LPT
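A sketch of the LPT rule (function name mine): sort the jobs once, then apply the same least-loaded-machine assignment as list scheduling.

```python
def lpt_schedule(p, m):
    """LPT rule: sort jobs by non-increasing processing time, then
    list-schedule onto the currently least loaded machine."""
    load = [0] * m
    assign = [None] * len(p)
    for j in sorted(range(len(p)), key=lambda j: -p[j]):
        i = min(range(m), key=load.__getitem__)
        assign[j] = i
        load[i] += p[j]
    return assign, max(load)
```

On p = [2, 2, 2, 3, 3] with m = 2, LPT places the two long jobs first and ends with makespan 7 against an optimum of 6, within the 4/3 factor of Theorem 2.7 below.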
Theorem 2.7: The LPT rule is a 4/3-approximation algorithm for scheduling jobs to minimize the makespan on identical parallel machines.
Proof: Suppose that the theorem is false, and consider an input that provides a counterexample to the theorem. Assume that p_1 ≥ p_2 ≥ … ≥ p_n. We can assume that the last job to complete is indeed the last (and smallest) job in the list. This follows w.l.o.g.: any counterexample for which the last job l to complete is not the smallest can yield a smaller counterexample, simply by omitting all of the jobs l + 1, …, n.
The length of the schedule produced is the same, and the optimal value of the reduced input can be no larger. Hence the reduced input is also a counterexample.
So the last job to complete in the schedule is job n. If this is a counterexample, what do we know about p_n? If p_n ≤ C*_max/3, then the analysis of Theorem 2.6 implies that the schedule length is at most (4/3)·C*_max, and so this is not a counterexample. Hence, we know that in this purported counterexample, job n (and therefore all of the jobs) has a processing requirement strictly greater than C*_max/3.
In the optimal schedule, each machine may process at most two jobs (since otherwise the total processing assigned to that machine is more than C*_max). We have reduced our assumed counterexample to the point where it simply cannot exist.
Lemma 2.8: For any input to the problem of minimizing the makespan on identical parallel machines for which the processing requirement of each job is more than one-third the optimal makespan, the LPT rule computes an optimal schedule.
As a consequence no counterexample to the theorem can exist, and hence it must be true.
2.4 The traveling salesman problem
• In the TSP there is a given set of cities {1, 2, …, n}, and the input consists of a symmetric n × n matrix C = (c_ij) that specifies the cost of traveling from city i to city j
• By convention,
– we assume that the cost of traveling from any city to itself is equal to 0, and
– costs are nonnegative;
– the cost of traveling from city i to city j is equal to the cost of traveling from j to city i
• Input: an undirected complete graph with a cost associated with each edge; a feasible solution (tour) is a Hamiltonian cycle in this graph
• I.e., we specify a cyclic permutation σ of the cities or, equivalently, a traversal of the cities in the order σ(1), σ(2), …, σ(n), where each city j is listed as a unique image σ(i)
• The cost of the tour is equal to

  c_{σ(n)σ(1)} + Σ_{i=1}^{n−1} c_{σ(i)σ(i+1)}

• Each tour has n distinct representations; it does not matter which city is the start of the tour
• There are severe limits on our ability to compute near-optimal tours
• It is NP-complete to decide whether a given undirected graph G = (V, E) has a Hamiltonian cycle
• An approximation algorithm for the TSP can be used to solve the Hamiltonian cycle problem:
• Given a graph G = (V, E), form an input to the TSP by setting, for each pair (i, j), the cost c_ij = 1 if (i, j) ∈ E, and equal to αn + 2 otherwise
• If there is a Hamiltonian cycle in G, then there is a tour of cost n, and otherwise each tour costs at least αn + n + 1
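The reduction is easy to code directly (a sketch; the function name and argument layout are my own):

```python
def hc_to_tsp_costs(n, edges, alpha):
    """Hamiltonian-cycle-to-TSP reduction: graph edges get cost 1,
    non-edges get cost alpha*n + 2. G has a Hamiltonian cycle iff the
    optimal tour costs n; otherwise every tour uses at least one
    non-edge and costs at least (n - 1) + alpha*n + 2."""
    E = {frozenset(e) for e in edges}
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            cost = 1 if frozenset((i, j)) in E else alpha * n + 2
            c[i][j] = c[j][i] = cost
    return c
```

Any α-approximation algorithm run on this input returns a tour of cost at most αn exactly when G is Hamiltonian, which is what makes the hardness argument work.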
• If there were to exist a 2-approximation algorithm for the TSP, then we could use it to distinguish graphs with Hamiltonian cycles from those without any
• Run the approximation algorithm on the new TSP input, and if the tour computed has cost at most αn, then there exists a Hamiltonian cycle in G, and otherwise there does not
• Setting the cost for the “non-edges” to be αn + 2 has a similarly inflated consequence for any α, and we obtain an input to the TSP of polynomial size provided that, for example, α = O(2^n)
Theorem 2.9: For any α > 1, there does not exist an α-approximation algorithm for the traveling salesman problem on n cities, provided P ≠ NP. In fact, the existence of an O(2^n)-approximation algorithm for the TSP would similarly imply that P = NP.
• This however is not the end of the story
• A natural assumption to make about the input to the TSP is to restrict attention to those inputs that are metric; that is, for each triple i, j, k ∈ V, we have that the triangle inequality c_ik ≤ c_ij + c_jk holds
• This rules out the construction used in the reduction for the HC problem above; the non-edges can be given cost at most 2 for the triangle inequality to hold, and this is too small to yield a nontrivial nonapproximability result
• A natural greedy heuristic to consider for the TSP is often referred to as the nearest addition algorithm
• Find the two closest cities, say, i and j, and start by building a tour on that pair of cities consisting of going from i to j and then back to i again
• This is the first iteration
• In each subsequent iteration, we extend the tour on the current subset S by including one additional city, until we include all cities
• In each iteration, we find a pair of cities i ∈ S and j ∉ S for which the cost c_ij is minimum; let k be the city that follows i in the current tour on S
• We add j to S, and insert j into the current tour between i and k
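The two iteration steps above can be sketched as follows (a minimal sketch; the function name is mine, and ties are broken by first occurrence):

```python
def nearest_addition(c):
    """Nearest addition heuristic for the metric TSP.
    c: symmetric cost matrix; returns the tour as a list of cities."""
    n = len(c)
    # first iteration: a tour on the two closest cities
    i0, j0 = min(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda e: c[e[0]][e[1]])
    tour, in_tour = [i0, j0], {i0, j0}
    while len(tour) < n:
        # cheapest pair with i in the tour and j outside it
        i, j = min(((i, j) for i in tour
                    for j in range(n) if j not in in_tour),
                   key=lambda e: c[e[0]][e[1]])
        tour.insert(tour.index(i) + 1, j)   # insert j right after i
        in_tour.add(j)
    return tour
```

This straightforward version costs O(n^3) time; maintaining, for each city outside the tour, its nearest tour city brings it down to O(n^2), exactly as in Prim's algorithm.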
• The crux of the analysis is the relationship of this algorithm to Prim's algorithm for the minimum spanning tree (MST) in an undirected graph
• A spanning tree of a connected graph G = (V, E) is a minimal subset of edges F ⊆ E such that each pair of nodes in V is connected by a path using edges only in F
• A MST is a spanning tree for which the total edge cost is minimized
• Prim's algorithm computes a MST by iteratively constructing a set S along with a tree T, starting with S = {v} for some (arbitrarily chosen) node v ∈ V and T = (S, F) with F = ∅
• In each iteration, it determines the edge (i, j) such that i ∈ S and j ∉ S is of minimum cost, and adds the edge (i, j) to F
• Clearly, this is the same sequence of vertex pairs identified by the nearest addition algorithm
• Furthermore, there is another important relationship between the MST problem and the TSP
Lemma 2.10: For any input to the traveling salesman problem, the cost of the optimal tour is at least the cost of the minimum spanning tree on the same input.
Theorem 2.11: The nearest addition algorithm for the metric traveling salesman problem is a 2-approximation algorithm.
Proof: Let S_2, …, S_n be the subsets identified at the end of each iteration of the nearest addition algorithm (where |S_k| = k), and let F = {(i_2, j_2), …, (i_n, j_n)}, where (i_k, j_k) is the edge identified in iteration k − 1 (with i_k ∈ S_{k−1}, k = 3, …, n). We also know that ({1, …, n}, F) is a MST for the original input, when viewed as a complete undirected graph with edge costs. Thus, if OPT is the optimal value for the TSP input, then

  OPT ≥ Σ_{k=2}^n c_{i_k j_k}.
The cost of the tour on the first two nodes i_2 and j_2 is exactly 2c_{i_2 j_2}. Consider an iteration in which a city j is inserted between cities i and k in the current tour. How much does the length of the tour increase? An easy calculation gives c_ij + c_jk − c_ik. By the triangle inequality, we have that c_jk ≤ c_ji + c_ik or, equivalently, c_jk − c_ik ≤ c_ij. Hence, the increase in cost in this iteration is at most c_ij + c_ij = 2c_ij. Thus, overall, we know that the final tour has cost at most

  2 Σ_{k=2}^n c_{i_k j_k} ≤ 2·OPT,

and the theorem is proved.
• A graph is said to be Eulerian if there exists a permutation of its edges of the form (i_0, i_1), (i_1, i_2), …, (i_{k−1}, i_k), (i_k, i_0)
• We will call this permutation a traversal of the edges, since it allows us to visit every edge exactly once
• A graph is Eulerian if and only if it is connected and each node has even degree (the number of edges with that node as one of its endpoints)
• Furthermore, if a graph is Eulerian, one can easily construct the required traversal of the edges, given the graph
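One standard way to construct such a traversal (Hierholzer's algorithm, which the slide does not name) can be sketched for an undirected multigraph as follows:

```python
def euler_traversal(adj):
    """Hierholzer's algorithm: a closed walk using every edge exactly
    once. adj: dict node -> list of neighbours; each undirected edge
    appears in both endpoint lists (parallel edges appear repeatedly)."""
    rem = {u: list(vs) for u, vs in adj.items()}   # edges still unused
    start = next(iter(rem))
    stack, walk = [start], []
    while stack:
        u = stack[-1]
        if rem[u]:
            v = rem[u].pop()
            rem[v].remove(u)        # consume one copy of edge {u, v}
            stack.append(v)
        else:
            walk.append(stack.pop())
    return walk                     # walk[0] == walk[-1]
```

The walk it returns is exactly the node sequence i_0, i_1, …, i_k, i_0 used by the shortcutting arguments below; the running time is linear in the number of nodes and edges (up to the list removals in this simple sketch).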
• To find a good tour for a TSP input, suppose that we first compute a MST (e.g., by Prim's alg.)
• Suppose that we then replace each edge by two copies of itself
• The resulting (multi)graph has cost at most 2·OPT and is Eulerian
• We can construct a tour of the cities from the Eulerian traversal of the edges, (i_0, i_1), (i_1, i_2), …, (i_k, i_0)
• Consider the sequence of nodes i_0, i_1, …, i_k, and remove all but the first occurrence of each city in this sequence
• This yields a tour containing each city exactly once (assuming we then return to i_0 at the end)
• To bound the length of this tour, consider two consecutive cities in this tour, i_l and i_m
• Cities i_{l+1}, …, i_{m−1} have already been visited “earlier” in the tour
• By the triangle inequality, c_{i_l i_m} can be upper bounded by the total cost of the edges traversed in the Eulerian traversal between i_l and i_m, i.e., the total cost of the edges (i_l, i_{l+1}), …, (i_{m−1}, i_m)
• In total, the cost of the tour is at most the total cost of all of the edges in the Eulerian graph, which is at most 2·OPT
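Putting the pieces together gives the double-tree algorithm. A sketch (function name mine), using the observation that a DFS preorder of the MST is exactly the shortcut of the doubled-tree Eulerian traversal:

```python
def double_tree_tour(c):
    """Double-tree 2-approximation for the metric TSP: build an MST
    (Prim), double its edges, and shortcut the Eulerian traversal.
    A DFS preorder of the tree yields that shortcut directly."""
    n = len(c)
    in_tree = [False] * n
    in_tree[0] = True
    children = [[] for _ in range(n)]
    for _ in range(n - 1):                      # Prim's algorithm
        i, j = min(((i, j) for i in range(n) if in_tree[i]
                    for j in range(n) if not in_tree[j]),
                   key=lambda e: c[e[0]][e[1]])
        in_tree[j] = True
        children[i].append(j)
    tour, stack = [], [0]                       # DFS preorder = shortcut tour
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour
```

By the argument above, the tour returned has cost at most twice the MST cost, hence at most 2·OPT.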
Theorem 2.12: The double-tree algorithm for the metric traveling salesman problem is a 2-approximation algorithm.
• The message of the analysis of the double-tree algorithm is also quite useful
• If we can efficiently construct an Eulerian subgraph of the complete input graph, for which the total edge cost is at most α times the optimal value of the TSP input, then we have derived an α-approximation algorithm as well
• This strategy can be carried out to yield a 3/2-approximation algorithm
• Consider the output from the MST computation
• This graph is not Eulerian, since any tree must have nodes of degree one, but it is possible that not many nodes have odd degree
• Let O be the set of odd-degree nodes in the MST
• The sum of node degrees must be even, since each edge in the graph contributes 2 to this total
• The total degree of the even-degree nodes must also be even, but then the total degree of the odd-degree nodes must also be even
• I.e., we must have an even number of odd-degree nodes; |O| = 2k for some positive integer k
• Suppose that we pair up the nodes in O: (o_1, o_2), (o_3, o_4), …, (o_{2k−1}, o_{2k})
• Such a collection of edges that contain each node in O exactly once is called a perfect matching of O
• One of the classic results of combinatorial optimization is that given a complete graph (on an even number of nodes) with edge costs, it is possible to compute the perfect matching of minimum total cost in polynomial time
• Given the MST, we identify the set O of odd-degree nodes with even cardinality, and then compute a minimum-cost perfect matching on O
• If we add this set of edges to our MST, we have constructed an Eulerian graph on our original set of cities: it is connected (since the spanning tree is connected) and has even degree (since we added a new edge incident to each node of odd degree in the spanning tree)
• As in the double-tree algorithm, we can shortcut this graph to produce a tour of no greater cost
• This is known as Christofides' algorithm
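A sketch of the whole pipeline follows. All names are mine, and the minimum-cost perfect matching is done here by brute-force recursion (fine only for tiny odd sets); a real implementation would use a polynomial-time blossom-based matching algorithm, as the classic result above guarantees.

```python
def _min_matching(nodes, c):
    """Brute-force min-cost perfect matching on an even-sized node list."""
    if not nodes:
        return []
    u = nodes[0]
    best, best_cost = [], float("inf")
    for k in range(1, len(nodes)):
        rest = nodes[1:k] + nodes[k + 1:]
        m = _min_matching(rest, c)
        cost = c[u][nodes[k]] + sum(c[a][b] for a, b in m)
        if cost < best_cost:
            best_cost, best = cost, [(u, nodes[k])] + m
    return best

def christofides(c):
    """Christofides sketch: MST + matching on odd nodes + shortcut."""
    n = len(c)
    adj = [[] for _ in range(n)]                # MST as a multigraph
    in_tree = [False] * n
    in_tree[0] = True
    for _ in range(n - 1):                      # Prim's algorithm
        i, j = min(((i, j) for i in range(n) if in_tree[i]
                    for j in range(n) if not in_tree[j]),
                   key=lambda e: c[e[0]][e[1]])
        in_tree[j] = True
        adj[i].append(j); adj[j].append(i)
    odd = [v for v in range(n) if len(adj[v]) % 2 == 1]
    for u, v in _min_matching(odd, c):          # make all degrees even
        adj[u].append(v); adj[v].append(u)
    stack, walk = [0], []                       # Euler tour (Hierholzer)
    while stack:
        u = stack[-1]
        if adj[u]:
            v = adj[u].pop()
            adj[v].remove(u)
            stack.append(v)
        else:
            walk.append(stack.pop())
    seen, tour = set(), []                      # shortcut repeated cities
    for u in walk:
        if u not in seen:
            seen.add(u); tour.append(u)
    return tour
```

The Eulerian graph here has cost at most MST + matching ≤ OPT + OPT/2, which is where the 3/2 factor of Theorem 2.13 comes from.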
Theorem 2.13: Christofides' algorithm for the metric traveling salesman problem is a 3/2-approximation algorithm.
Proof: We want to show that the edges in the Eulerian graph produced by the algorithm have total cost at most (3/2)·OPT. We know that the MST edges have total cost at most OPT. So we need only show that the perfect matching on O has cost at most OPT/2.
First observe that there is a tour on just the nodes in O of total cost at most OPT. This again uses the shortcutting argument. Start with the optimal tour on the entire set of cities, and if for two cities i and j, the optimal tour between i and j contains only cities that are not in O, then include edge (i, j) in the tour on O.
Each edge in the tour corresponds to disjoint paths in the original tour, and hence by the triangle inequality, the total length of the tour on O is no more than the length of the original tour.
Now consider this “shortcut” tour on the node set O. Color these edges red and blue, alternating colors. This partitions the edges into two sets; each of these is a perfect matching on the node set O. In total, these two edge sets have cost at most OPT. Thus, the cheaper of these two sets has cost at most OPT/2. Hence, there is a perfect matching on O of cost at most OPT/2. Therefore, the algorithm must find a matching of cost at most OPT/2.
• No better approximation algorithm for the metric TSP is known
• Substantially better algorithms might yet be found, since the strongest negative result is
Theorem 2.14: Unless P = NP, for any constant α < 1.0045, no α-approximation algorithm for the metric TSP exists.
• It is possible to obtain a PTAS in the case that cities correspond to points in the Euclidean plane and the cost of traveling between two cities is equal to the Euclidean distance between the two points
3 Rounding data and dynamic programming
• Dynamic programming is a standard technique in algorithm design in which an optimal solution for a problem is built up from optimal solutions for a number of subproblems, normally stored in a table or multidimensional array
• Approximation algorithms can be designed using dynamic programming in a variety of ways, many of which involve rounding the input data in some way
3.1 The knapsack problem
• We are given a set of items I = {1, …, n}, where each item i has a value v_i and a size s_i
• All sizes and values are positive integers
• The knapsack has a positive integer capacity B
• The goal is to find a subset of items S ⊆ I that maximizes the value Σ_{i∈S} v_i of items in the knapsack subject to the constraint that the total size of these items is no more than the capacity; that is, Σ_{i∈S} s_i ≤ B
• We consider only items that could actually fit in the knapsack, so that s_i ≤ B for each i ∈ I
• We can use dynamic programming to find the optimal solution to the knapsack problem
• We maintain an array entry A(j) for j = 1, …, n
• Each entry A(j) is a list of pairs (t, w)
• A pair (t, w) in the list of entry A(j) indicates that there is a set S from the first j items that uses space exactly t and has value exactly w; i.e., there exists a set S ⊆ {1, …, j} s.t. Σ_{i∈S} s_i = t ≤ B and Σ_{i∈S} v_i = w
• Each list keeps track of only the most efficient pairs
• To do this, we need the notion of one pair dominating another one:
– (t, w) dominates another pair (t', w') if t ≤ t' and w ≥ w'; that is, the solution indicated by the pair (t, w) uses no more space than (t', w'), but has at least as much value
• Domination is a transitive property: if (t, w) dominates (t', w') which dominates (t'', w''), then (t, w) also dominates (t'', w'')
• We will ensure that in any list, no pair dominates another one
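The per-item update and the pruning of dominated pairs can be sketched as follows (function name mine; only the final list's best value is returned here):

```python
def knapsack(values, sizes, B):
    """Exact knapsack DP over lists of non-dominated (size, value)
    pairs. Each iteration merges A(j-1) with A(j-1) shifted by item j,
    then discards dominated pairs."""
    pairs = [(0, 0)]                                   # A(0): the empty set
    for v, s in zip(values, sizes):
        merged = pairs + [(t + s, w + v) for t, w in pairs if t + s <= B]
        # prune: after sorting by (size, -value), a pair survives only
        # if its value beats every pair of smaller-or-equal size
        merged.sort(key=lambda p: (p[0], -p[1]))
        pairs, best_w = [], -1
        for t, w in merged:
            if w > best_w:
                pairs.append((t, w))
                best_w = w
    return max(w for _, w in pairs)
```

After pruning, each surviving list is strictly increasing in both size and value, which is exactly the no-pair-dominates-another invariant described above.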