Integer Programming: Large Scale Methods, Part II (Chapter 12 and Chapter 16)

University of Chicago Booth School of Business

Kipp Martin

November 27, 2017
Outline
List of Files
Key Concepts
Cut Generation
Lagrangian Approach
Capstone Case: Capacitated Lot Sizing
Lagrangian Approach Application – Convex Optimization
Files
- genAssign.gms
- genAssignSub.gms
- genAssignData.txt
Key Concepts
- the quality of the formulation is what counts, not the size – quality usually measured by the lower bound
- if we can read the problem into memory we can solve the linear relaxation (of course it may be a poor relaxation)
- many interesting applications have too many constraints or variables to read into memory
- if there are too many variables we generate variables as needed
- we generate variables as needed by a pricing algorithm
- if there are too many constraints we generate constraints as needed
- we generate constraints as needed by a separation algorithm
Key Concepts

The mixed integer linear program under consideration is:

min c^T x
(MIP) s.t. Ax ≥ b
           Bx ≥ d
           x ≥ 0
           x_j ∈ Z, j ∈ I

One solution approach is to solve the LP relaxation within branch and bound:

min c^T x
(LP) s.t. Ax ≥ b
          Bx ≥ d
          x ≥ 0
Key Concepts
As an alternative, we try to define A and B appropriately and solve:

min c^T x
(conv(MIP)) s.t. Ax ≥ b
                 x ∈ conv(Γ)

where Γ = {x : Bx ≥ d, x ≥ 0, x_j ∈ Z for j ∈ I}.

How to break the constraints into A and B takes a fair amount of skill and practice. We want the characterization of conv(Γ) to be an easy hard problem.

If it is too easy there may not be a big win in terms of better lower bounds.
Key Concepts
There are three approaches to actually solving

min c^T x
(conv(MIP)) s.t. Ax ≥ b
                 x ∈ conv(Γ)

Approach 1: Characterize conv(Γ) by the inequalities that define the polyhedron. This is the cutting plane approach. (Today)

Approach 2: Characterize conv(Γ) through inverse projection. Often called decomposition or column generation.

Approach 3: Lagrangian Relaxation. (Today)
Cut Generation
There are three approaches to actually solving

min c^T x
(conv(MIP)) s.t. Ax ≥ b
                 x ∈ conv(Γ)

We now illustrate cut generation with the TSP.

We need to figure out how to break up the constraints.
Cut Generation
Let’s go back and solve the cut generation problem. The problem is:

min ∑_{e∈E} c_e x_e                                 (1)
s.t. ∑_{e∈δ(i)} x_e = 2,   i ∈ V                    (2)
     ∑_{e∈E(S)} x_e ≤ |S| − 1,   S ⊂ V \ {1}        (3)
     ∑_{e∈E} x_e = n                                (4)
     x_e ∈ {0, 1},   ∀e ∈ E                         (5)
Cut Generation
Rewrite this as

min ∑_{e∈E} c_e x_e
s.t. ∑_{e∈δ(i)} x_e = 2,   i ∈ V
     x ∈ Γ

where Γ corresponds to constraints (3)-(5). In the previous slide deck, we characterized Γ by its extreme points and used column generation.

However, we can also generate the constraints in (3) through separation.
Cut Generation

We now work with this example:

[Figure: an example graph on nodes 1-7 with costs on the edges.]
Cut Generation
Form a problem relaxation by deleting the tour-breaking (subtour elimination) constraints:

min ∑_{e∈E} c_e x_e
s.t. ∑_{e∈δ(i)} x_e = 2,   i ∈ V
     ∑_{e∈E} x_e = n
     x_e ∈ {0, 1},   ∀e ∈ E
Cut Generation

Here is the solution. Argh! Subtours!

[Figure: the solution of the relaxation. It contains subtours – the separation step below finds the one on nodes 2, 3, and 6.]
Cut Generation
The separation problem:

max ∑_{e∈E} x̄_e α_e − ∑_{i=1, i≠k}^{n} θ_i          (6)
(SP_k(x̄)) s.t. α_e ≤ θ_i,   e ∈ δ(i), i ∈ V          (7)
               θ_i ≥ 0                                (8)
Cut Generation
Let k = 2. The solution is:

θ_2 = 1, θ_3 = 1, θ_6 = 1
α_(2,3) = 1, α_(2,6) = 1, α_(3,6) = 1

This identifies S = {2, 3, 6} with objective value 3 − 2 = 1 > 0, so the subtour cut x_(2,3) + x_(2,6) + x_(3,6) ≤ |S| − 1 = 2 is violated and can be added to the relaxation.
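As a concrete illustration, here is a minimal GAMS sketch of (SP_k(x̄)). All node and edge data below are placeholders chosen to mimic the example, not read from the actual instance, and the upper bounds theta.up = 1 and alpha.up = 1 are an assumption: they reflect the set-membership meaning of θ and α and keep the LP bounded.

SET i "nodes" / n1*n7 / ;
ALIAS (i, j) ;
SET edges(i, j) "edges in the current relaxation solution (placeholder data)"
    / n2.n3, n2.n6, n3.n6, n1.n4, n4.n5, n5.n7, n1.n7 / ;
PARAMETER xbar(i, j) "relaxation value on each edge" ;
xbar(edges) = 1 ;
SCALAR k "index of the node fixed inside S" / 2 / ;

VARIABLE sepobj ;
POSITIVE VARIABLES theta(i), alpha(i, j) ;
theta.up(i) = 1 ;
alpha.up(i, j) = 1 ;

EQUATIONS objdef, lim1(i, j), lim2(i, j) ;
objdef .. sepobj =e= sum(edges(i, j), xbar(i, j)*alpha(i, j))
                     - sum(i$(ord(i) <> k), theta(i)) ;
* alpha(e) can be 1 only if both endpoints of e are in S
lim1(i, j)$edges(i, j) .. alpha(i, j) =l= theta(i) ;
lim2(i, j)$edges(i, j) .. alpha(i, j) =l= theta(j) ;

MODEL sep / objdef, lim1, lim2 / ;
theta.fx(i)$(ord(i) = k) = 1 ;
SOLVE sep USING LP MAXIMIZING sepobj ;
* a violated subtour cut exists if and only if sepobj.l > 0

With the placeholder data and k = 2 this reproduces the solution above: θ_3 = θ_6 = 1 and the triangle edges at α = 1, with objective value 1.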
Lagrangian Approach
The MIP is

min c^T x
(MIP) s.t. Ax ≥ b
           x ∈ Γ

where, as always, the set Γ has special structure. Dualize the constraints Ax ≥ b with a conformable vector λ ≥ 0 of Lagrange multipliers and get

min c^T x + λ^T (b − Ax)
s.t. x ∈ Γ
Lagrangian Approach
Define L(λ) := min{c^T x + λ^T (b − Ax) | x ∈ Γ}.

The Lagrangian dual problem (LD) is:

(LD) max{L(λ) : λ ≥ 0}.

Theorem (Weak Duality): v(LD) ≤ v(MIP).

Proof?

Theorem: L(λ) is a concave function of λ.

Proof?

Remark: No matter how ugly and non-convex the set Γ is, L(λ) is concave, so at least we are trying to maximize a concave function.
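For reference, one-line proof sketches (filling in steps the slides leave as exercises):

Weak duality: for any x feasible to (MIP) and any λ ≥ 0,

L(λ) ≤ c^T x + λ^T (b − Ax) ≤ c^T x,

since x ∈ Γ and λ^T (b − Ax) ≤ 0 when Ax ≥ b. Maximizing over λ and minimizing over feasible x gives v(LD) ≤ v(MIP).

Concavity: L(λ) is the pointwise minimum over x ∈ Γ of the functions λ ↦ c^T x + λ^T (b − Ax), each of which is affine in λ; a pointwise minimum of affine functions is concave.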
Lagrangian Approach – Using Subgradients
Concave is not too bad, but this is a linear programming class! I don’t have the ambition to tackle a concave function. You know me: 1) I worry; and 2) I like easy problems.

γ^k is a subgradient of the concave function L(λ) at λ^k iff

L(λ) ≤ L(λ^k) + (γ^k)^T (λ − λ^k)

for all λ.

Solve the following linear programming relaxation:

max θ
s.t. θ ≤ L(λ^k) + (γ^k)^T (λ − λ^k),   k = 1, . . . , K        (9)
     λ ≥ 0

where γ^k, k = 1, . . . , K are subgradients. What is the geometry of this approach?
Lagrangian Approach – Finding Subgradients

Here is how to find a subgradient. Given a λ̄ ≥ 0, solve

L(λ̄) = min{c^T x + λ̄^T (b − Ax) | x ∈ Γ}

and obtain an optimal x(λ̄).

Theorem: The vector γ̄ = b − Ax(λ̄) is a subgradient of L(λ) at λ = λ̄.

Proof: Next slide.
Lagrangian Approach – Finding Subgradients

If

L(λ̄) = min{c^T x + λ̄^T (b − Ax) | x ∈ Γ} = c^T x(λ̄) + λ̄^T (b − Ax(λ̄))

then

L(λ̄) + (b − Ax(λ̄))^T (λ − λ̄)
   = c^T x(λ̄) + (b − Ax(λ̄))^T λ̄ + (b − Ax(λ̄))^T (λ − λ̄)
   = c^T x(λ̄) + (b − Ax(λ̄))^T λ̄ + (b − Ax(λ̄))^T λ − (b − Ax(λ̄))^T λ̄
   = c^T x(λ̄) + (b − Ax(λ̄))^T λ
   ≥ L(λ)

where the final inequality holds because x(λ̄) ∈ Γ, so c^T x(λ̄) + λ^T (b − Ax(λ̄)) ≥ min{c^T x + λ^T (b − Ax) | x ∈ Γ} = L(λ).
Lagrangian Approach - Subgradient Algorithm
Step 0: Initialize. Choose a tolerance ε, set K ← 1 and λ^K ← 0. Solve for L(λ^K) and obtain a subgradient γ^K.

Step 1: Solve the relaxed Lagrangian dual (9) and obtain an optimal θ̄ and λ̄.

Step 2: K ← K + 1, λ^K ← λ̄, solve

L(λ^K) = min{c^T x + (λ^K)^T (b − Ax) | x ∈ Γ}

and save γ^K = b − Ax(λ^K) and L(λ^K).

Step 3: Stop if the ε tolerance is satisfied, e.g. θ̄ − max_k L(λ^k) ≤ ε (θ̄ overestimates v(LD) and every L(λ^k) underestimates it). Otherwise, add the cut

θ ≤ L(λ^K) + (γ^K)^T (λ − λ^K)

to the relaxed Lagrangian dual (9).

Step 4: Go to Step 1.
Lagrangian Approach - Subgradient Algorithm
Consider the generalized assignment problem. Let x_ij be 1 if task i is assigned to person j, and 0 otherwise.

min ∑_{i=1}^{n} ∑_{j=1}^{m} c_ij x_ij
s.t. ∑_{j=1}^{m} x_ij = 1,   i = 1, . . . , n
     ∑_{i=1}^{n} a_ij x_ij ≤ b_j,   j = 1, . . . , m
     x_ij ∈ {0, 1},   i = 1, . . . , n,  j = 1, . . . , m

How do we partition into Ax ≥ b and Γ? We want an easy hard problem.
Lagrangian Approach - Subgradient Algorithm

The model genAssign.gms solves the integer programming formulation for the generalized assignment problem.

SETS
   i "task", j "server" ;
PARAMETERS
   c(i, j), a(i, j), b(j) ;
VARIABLES
   z "minimize the cost" ;
BINARY VARIABLES
   x(i, j) ;
EQUATIONS
   cost, assign(i), resource(j) ;

* total assignment cost
cost .. z =e= sum( (i, j), c(i, j)*x(i, j) ) ;
* every task is assigned
assign(i) .. sum(j, x(i, j) ) =g= 1 ;
* server capacity
resource(j) .. sum(i, a(i, j)*x(i, j) ) =l= b(j) ;

MODEL genAssign / cost, assign, resource / ;
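A hypothetical driver for this model might look as follows; the $include mechanism and the display are assumptions, since the actual course files may read the data differently:

* assumes genAssignData.txt declares the members of i and j and the data c, a, b
$include genAssignData.txt
SOLVE genAssign USING MIP MINIMIZING z ;
DISPLAY z.l, x.l ;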
Lagrangian Approach - Subgradient Algorithm
The data file is genAssignData.txt. For these data the optimal LP value is 20.32, and the optimal IP value is 34.0. A huge gap!

The GAMS model file genAssignSub.gms solves the Lagrangian dual. It contains two models: 1) genAssign without the assignment constraints; and 2) lagDual, which is the relaxed Lagrangian dual.

Model lagDual calls model genAssign inside the cut generation loop.

The value of the Lagrangian dual is 29.5, so we are really closing the gap a lot.
Lagrangian Approach - Subgradient Algorithm
* Lagrangian subproblem: same model, but with the dualized (adjusted) costs
cost .. z =e= sum( (i, j), adj_c(i, j)*x(i, j) ) ;
assign(i) .. sum(j, x(i, j) ) =g= 1 ;
resource(j) .. sum(i, a(i, j)*x(i, j) ) =l= b(j) ;
* the assign equations are deliberately left out of the model
MODEL genAssign / cost, resource / ;

* relaxed Lagrangian dual: one subgradient cut per element of allcuts
cut(allcuts) .. theta =l= LagValue(allcuts)
    + sum( i, cutCoeffSub(allcuts, i)*(lambda(i) - cutCoeffLambda(allcuts, i)) ) ;
MODEL lagDual / cut / ;

Inside the loop,

adj_c(i, j) = c(i, j) - lambda.l(i) ;
SOLVE genAssign USING MIP MINIMIZING z ;
SOLVE lagDual USING LP MAXIMIZING theta ;
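To see how these fragments fit together, here is a sketch of the cut generation loop. Only the names genAssign, lagDual, adj_c, lambda, theta, allcuts, LagValue, cutCoeffSub, and cutCoeffLambda come from the slides; the dynamic set activecut, the tolerance, and the convergence test are assumptions, and the sketch presumes the cut equation is generated only for cuts created so far (e.g. cut(allcuts)$activecut(allcuts) .. ...).

SET activecut(allcuts) "cuts generated so far" ;
activecut(allcuts) = no ;
SCALARS done / 0 /, gap / inf / ;
lambda.lo(i) = 0 ;
lambda.l(i) = 0 ;

LOOP(allcuts$(not done),
*  solve the Lagrangian subproblem at the current multipliers
   adj_c(i, j) = c(i, j) - lambda.l(i) ;
   SOLVE genAssign USING MIP MINIMIZING z ;
*  record the cut: L(lambda) = z + sum(i, lambda(i)), subgradient b - Ax
   LagValue(allcuts) = z.l + sum(i, lambda.l(i)) ;
   cutCoeffSub(allcuts, i) = 1 - sum(j, x.l(i, j)) ;
   cutCoeffLambda(allcuts, i) = lambda.l(i) ;
   activecut(allcuts) = yes ;
*  re-solve the relaxed Lagrangian dual over the cuts so far
   SOLVE lagDual USING LP MAXIMIZING theta ;
*  theta.l overestimates v(LD); the best L(lambda) underestimates it
   gap = theta.l - smax(activecut, LagValue(activecut)) ;
   done$(gap < 1e-6) = 1 ;
) ;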
Lagrangian Approach
If the Lagrangian dual is

max{L(λ) : λ ≥ 0}

where L(λ) := min{c^T x + λ^T (b − Ax) | x ∈ Γ}, then this is equivalent to

(LD) max θ
     s.t. θ ≤ c^T x + λ^T (b − Ax),   ∀x ∈ Γ          (10)
          λ ≥ 0
Lagrangian Approach
KEY Theorem: The optimal value of (LD) is equal to the optimal value of conv(MIP).

Proof: I am too tired! Accept on faith.

All of our approaches are equivalent!
Capstone Case: Capacitated Lot Sizing

Dynamic Lot Sizing: Variables:

- x_it – units of product i produced in period t
- I_it – inventory level of product i at the end of period t
- y_it – is 1 if there is nonzero production of product i during period t, 0 otherwise

Parameters:

- d_it – demand for product i in period t
- f_it – fixed cost associated with nonzero production of product i in period t
- c_it – marginal production cost for product i in period t
- h_it – marginal holding cost charged to product i at the end of period t
- g_t – production capacity in period t
Capstone Case: Capacitated Lot Sizing
Objective: Minimize the sum of marginal production cost, holding cost, and fixed cost:

∑_{i=1}^{N} ∑_{t=1}^{T} (c_it x_it + h_it I_it + f_it y_it)

Constraint 1: Do not exceed total capacity in each period:

∑_{i=1}^{N} x_it ≤ g_t,   t = 1, . . . , T
Capstone Case: Capacitated Lot Sizing
Constraint 2: Inventory balance equations

I_{i,t−1} + x_it − I_it = d_it,   i = 1, . . . , N,  t = 1, . . . , T

Constraint 3: Fixed cost forcing constraints

x_it − M_it y_it ≤ 0,   i = 1, . . . , N,  t = 1, . . . , T

(A standard tight choice, though not specified here, is M_it = ∑_{k=t}^{T} d_ik, the total remaining demand.)
Capstone Case: Capacitated Lot Sizing

Dynamic Lot Sizing: A “standard formulation” is:

min ∑_{i=1}^{N} ∑_{t=1}^{T} (c_it x_it + h_it I_it + f_it y_it)
s.t. ∑_{i=1}^{N} x_it ≤ g_t,   t = 1, . . . , T
     I_{i,t−1} + x_it − I_it = d_it,   i = 1, . . . , N,  t = 1, . . . , T
     x_it − M_it y_it ≤ 0,   i = 1, . . . , N,  t = 1, . . . , T
     x_it, I_it ≥ 0,   i = 1, . . . , N,  t = 1, . . . , T
     y_it ∈ {0, 1},   i = 1, . . . , N,  t = 1, . . . , T
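A minimal GAMS sketch of this standard formulation. All data below are placeholders, not the course data file, and the big-M uses the remaining-demand choice assumed above:

SETS i "products" / p1*p2 /, t "periods" / t1*t5 / ;
ALIAS (t, k) ;
PARAMETERS d(i, t), c(i, t), h(i, t), f(i, t), g(t), M(i, t) ;
* placeholder data -- replace with the real instance
d(i, t) = 10 ; c(i, t) = 1 ; h(i, t) = 0.5 ; f(i, t) = 25 ; g(t) = 40 ;
* big-M as total remaining demand (assumed tight choice)
M(i, t) = sum(k$(ord(k) >= ord(t)), d(i, k)) ;

VARIABLES cost ;
POSITIVE VARIABLES x(i, t), stock(i, t) ;
BINARY VARIABLES y(i, t) ;

EQUATIONS obj, cap(t), bal(i, t), force(i, t) ;
obj .. cost =e= sum((i, t), c(i, t)*x(i, t) + h(i, t)*stock(i, t) + f(i, t)*y(i, t)) ;
cap(t) .. sum(i, x(i, t)) =l= g(t) ;
* inventory balance; stock(i, t-1) vanishes in the first period (zero initial inventory)
bal(i, t) .. stock(i, t-1) + x(i, t) - stock(i, t) =e= d(i, t) ;
force(i, t) .. x(i, t) - M(i, t)*y(i, t) =l= 0 ;

MODEL lotsize / all / ;
SOLVE lotsize USING MIP MINIMIZING cost ;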
Capstone Case: Capacitated Lot Sizing
The generic model is

min c^T x
(MIP) s.t. Ax ≥ b
           x ∈ Γ

Map the capacitated lot sizing model into this as follows: the capacity constraints ∑_{i=1}^{N} x_it ≤ g_t map into the Ax ≥ b constraints, and the rest of the constraints

- inventory balance
- fixed charge big M
- nonnegativity and binary

define the Γ.
Capstone Case: Capacitated Lot Sizing
Four approaches for generating bounds to find the optimal value of

min c^T x
(conv(MIP)) s.t. Ax ≥ b
                 x ∈ conv(Γ)

1. Column generation
2. Subgradient optimization to find optimal L(λ)
3. Generate the cuts that characterize conv(Γ)
4. The z_itk formulation that also gives conv(Γ) (via Homework 5, question 5)
Capstone Case: Capacitated Lot Sizing
Dynamic Lot Sizing: An alternate formulation:

z_itk is 1 if, for product i in period t, the decision is to produce enough items to satisfy demand for periods t through k, and 0 otherwise.

∑_{k=1}^{T} z_i1k = 1
−∑_{j=1}^{t−1} z_{i,j,t−1} + ∑_{k=t}^{T} z_itk = 0,   i = 1, . . . , N,  t = 2, . . . , T
∑_{k=t}^{T} z_itk ≤ y_it,   i = 1, . . . , N,  t = 1, . . . , T
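To make the flow structure concrete, here is the constraint set for a single product with T = 3, writing z_tk for z_itk with the product index dropped (a worked instance, not from the slides):

z_11 + z_12 + z_13 = 1
−z_11 + z_22 + z_23 = 0
−z_12 − z_22 + z_33 = 0
z_11 + z_12 + z_13 ≤ y_1,   z_22 + z_23 ≤ y_2,   z_33 ≤ y_3

The first constraint starts one unit of “coverage” in period 1; the balance constraints pass it forward, so exactly one production run covers the demand of each period.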
Capstone Case: Capacitated Lot Sizing
Dynamic Lot Sizing: Characterize conv(Γ) through cutting planes. Barany, Van Roy, and Wolsey have characterized conv(Γ) by the (ℓ, S) inequalities:

∑_{t∈S} x_t ≤ ∑_{t∈S} (∑_{k=t}^{ℓ} d_k) y_t + I_ℓ

for ℓ = 1, . . . , T and ∀S ⊆ {1, . . . , ℓ}.

Example: Let ℓ = 5 and S = {5}. The cut is

x_5 ≤ (∑_{k=5}^{5} d_k) y_5 + I_5 = d_5 y_5 + I_5.
Capstone Case: Capacitated Lot Sizing
The separation algorithm (formulated as an LP/IP) is: given (x̄, Ī, ȳ) and ℓ = 1, . . . , T, solve

max ∑_{t=1}^{ℓ} x̄_t α_t − ∑_{t=1}^{ℓ} (∑_{k=t}^{ℓ} d_k) ȳ_t α_t − Ī_ℓ
s.t. α_t ∈ {0, 1},   t = 1, . . . , ℓ

and the α_t = 1 index the elements in the set S. The corresponding (ℓ, S) cut is violated if and only if the optimal value is positive.
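Because the objective is separable in α, this separation problem solves by inspection: put t into S exactly when the coefficient of α_t is positive. A GAMS sketch with placeholder identifiers and data (xbar, ybar, d, Ibar stand for the current relaxation values):

SET t "periods" / t1*t5 / ;
ALIAS (t, k) ;
PARAMETERS xbar(t), ybar(t), d(t), coeff(t), inS(t) ;
SCALARS ell "the period l defining the cut" / 5 /, Ibar "inventory at period l" / 0 /, viol ;
* placeholder data -- replace with the current relaxation solution
xbar(t) = 1 ; ybar(t) = 0.02 ; d(t) = 10 ;

* coefficient of alpha(t) in the separation objective
coeff(t)$(ord(t) <= ell) = xbar(t) - sum(k$((ord(k) >= ord(t)) and (ord(k) <= ell)), d(k))*ybar(t) ;
* choose S = { t : coeff(t) > 0 } ...
inS(t) = 1$(coeff(t) > 0) ;
* ... and the (l,S) cut is violated iff viol > 0
viol = sum(t$inS(t), coeff(t)) - Ibar ;
DISPLAY inS, viol ;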
Capstone Case: Capacitated Lot Sizing

Dynamic Lot Sizing: An alternate approach: use subgradient optimization and Lagrangian duality to get the result. The Lagrangian problem to solve in order to generate the subgradient is

min ∑_{i=1}^{N} ∑_{t=1}^{T} ((c_it + λ_t) x_it + h_it I_it + f_it y_it) − ∑_{t=1}^{T} λ_t g_t
s.t. I_{i,t−1} + x_it − I_it = d_it,   i = 1, . . . , N,  t = 1, . . . , T
     x_it − M_it y_it ≤ 0,   i = 1, . . . , N,  t = 1, . . . , T
     x_it, I_it ≥ 0,   i = 1, . . . , N,  t = 1, . . . , T
     y_it ∈ {0, 1},   i = 1, . . . , N,  t = 1, . . . , T

What is the economic interpretation of λ_t in the term (c_it + λ_t) x_it?
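An observation the slides leave implicit: once the capacity constraints are dualized, nothing couples the products, so the Lagrangian subproblem separates into N independent single-product uncapacitated lot-sizing problems:

L(λ) = ∑_{i=1}^{N} min{ ∑_{t=1}^{T} ((c_it + λ_t) x_it + h_it I_it + f_it y_it) : (x_i, I_i, y_i) satisfies the product-i constraints } − ∑_{t=1}^{T} λ_t g_t

Each inner problem is a Wagner-Whitin style problem, which is exactly the easy hard structure we want.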
Capstone Case: Capacitated Lot Sizing
Dynamic Lot Sizing: An alternate approach: use column generation to get the convex hull.

This is what we did with the TSP problem in the previous set of slides. See also Example 12.20 in the text.
Capstone Case: Capacitated Lot Sizing
Final Exam: Take the two product, five period data:

http://faculty.chicagobooth.edu/kipp.martin/root/htmls/coursework/36900/datafiles/lotsizeDataExam.txt

and use one of the following techniques to find the convex hull relaxation:

- the Barany, Van Roy, and Wolsey cuts
- subgradient optimization
- column generation

IMPORTANT: The final exam must be done on an individual basis. You are not allowed to collaborate in any way with your classmates.
Lagrangian Approach Application – Convex Optimization
Objective: Apply projection and Lagrangian duality to convex optimization.

(CP)  sup_{x ∈ ℝ^n} f(x)
      s.t. g_i(x) ≥ 0 for i = 1, . . . , p
           x ∈ Ω

where

- f(x) and g_i(x) for i = 1, . . . , p are concave functions
- Ω is a closed, convex set.
Lagrangian Approach Application – Convex Optimization
The Lagrangian function L(λ) is

L(λ) := sup{f(x) + ∑_{i=1}^{p} λ_i g_i(x) : x ∈ Ω}.

The Lagrangian dual is

inf_{λ≥0} L(λ).

Weak Duality: inf_{λ≥0} L(λ) ≥ v(CP), where v(CP) is the optimal value of (CP).
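The weak duality argument is one line (a sketch; the slide states only the result): for any x feasible to (CP) and any λ ≥ 0,

L(λ) ≥ f(x) + ∑_{i=1}^{p} λ_i g_i(x) ≥ f(x),

since x ∈ Ω, g_i(x) ≥ 0, and λ_i ≥ 0. Taking the sup over feasible x and then the inf over λ ≥ 0 gives inf_{λ≥0} L(λ) ≥ v(CP).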
Lagrangian Approach Application – Convex Optimization
Unfortunately, even for a convex problem, there may be a duality gap, that is,

inf_{λ≥0} L(λ) > v(CP).

A great deal of research has gone into finding sufficient conditions that guarantee a zero duality gap.

These are often called constraint qualifications because they typically involve conditions on the constraints.

One of the most well-known constraint qualifications is the Slater constraint qualification.
Lagrangian Approach Application – Convex Optimization
The Slater constraint qualification is that there exists an x* ∈ Ω such that g_i(x*) > 0 for all i = 1, . . . , p.

The Slater constraint qualification gives the following classic theorem in convex optimization.

Theorem: Suppose the convex program (CP) is feasible and bounded. Moreover, suppose there exists an x* ∈ Ω such that g_i(x*) > 0 for all i = 1, . . . , p. Then there is zero duality gap between the convex program (CP) and its Lagrangian dual (LD). Moreover, there exists a λ* ≥ 0 such that v(CP) = L(λ*), i.e., the Lagrangian dual is solvable.

We prove this result by using projection (because, as you know by now, that is all I can do).
Lagrangian Approach Application – Convex Optimization

Construct the semi-infinite linear program (yes, linear):

(CP-SILP)  inf σ
           s.t. σ − ∑_{i=1}^{p} λ_i g_i(x) ≥ f(x)   for x ∈ Ω
                λ ≥ 0

This is equivalent to the Lagrangian dual

(LD)  inf L(λ)
      s.t. L(λ) = sup{f(x) + ∑_{i=1}^{p} λ_i g_i(x) : x ∈ Ω}
           λ ≥ 0

Replace

L(λ) = sup{f(x) + ∑_{i=1}^{p} λ_i g_i(x) : x ∈ Ω}

with

L(λ) ≥ f(x) + ∑_{i=1}^{p} λ_i g_i(x),   x ∈ Ω.

(Since L(λ) is being minimized, the inequality version attains the sup, so nothing is lost.)
Lagrangian Approach Application – Convex Optimization
Theorem: Assume

- the convex program (CP) is feasible and bounded, and
- there exists an x* ∈ Ω such that g_i(x*) > 0 for all i = 1, . . . , p;

then

- there is a zero duality gap between the convex program (CP) and its Lagrangian dual (LD), and
- there exists a λ* ≥ 0 such that v(LD) = L(λ*), i.e., the Lagrangian dual is solvable.
Lagrangian Approach Application – Convex Optimization
KEY IDEA: Show that the semi-infinite linear program (CP-SILP) is solvable and has zero duality gap. Do this using, yes you guessed it, projection!

Put (CP-SILP) into our linear programming format and solve with Fourier-Motzkin elimination:

z − σ ≥ 0
σ − ∑_{i=1}^{p} λ_i g_i(x) ≥ f(x)   ∀x ∈ Ω
λ_i ≥ 0,   i = 1, . . . , p

Eliminate σ and end up with

z − ∑_{i=1}^{p} λ_i g_i(x) ≥ f(x)   ∀x ∈ Ω
λ_i ≥ 0,   i = 1, . . . , p
Lagrangian Approach Application – Convex Optimization
Claim: The variables λ_1, . . . , λ_p remain clean as the Fourier-Motzkin elimination procedure proceeds.

Use induction to show that after eliminating variables λ_1, . . . , λ_k, there is an inequality

z − ∑_{i=k+1}^{p} λ_i g_i(x*) ≥ f(x*).

1. This implies the variables λ_i, for i = k + 1, . . . , p, are clean. Why?
2. If there are no dirty variables, there is no duality gap. Why?
Lagrangian Approach Application – Convex Optimization
We can also show:

v(CP-SILP) = v(LD) ≥ v(CP) = v(CP-FDSILP)

See http://www.optimization-online.org/DB_HTML/2013/04/3817.html.

We just showed

v(CP-SILP) = v(CP-FDSILP)

therefore

v(LD) = v(CP).
Lagrangian Approach Application – Convex Optimization
The Slater result can be strengthened.

Theorem: An instance of (CP-SILP) with finite optimal value is tidy if and only if the set of optimal solutions to (CP-SILP) is bounded.

See http://www.optimization-online.org/DB_HTML/2013/04/3817.html.