Integer Programming: Large Scale Methods Part II (Chapter 12 and Chapter 16)
University of Chicago, Booth School of Business
Kipp Martin
November 27, 2017




Outline

List of Files

Key Concepts

Cut Generation

Lagrangian Approach

Capstone Case: Capacitated Lot Sizing

Lagrangian Approach Application – Convex Optimization


Files

- genAssign.gms
- genAssignSub.gms
- genAssignData.txt


Key Concepts

- the quality of the formulation is what counts, not the size; quality is usually measured by the lower bound
- if we can read the problem into memory we can solve the linear relaxation (of course it may be a poor relaxation)
- many interesting applications have too many constraints or variables to read into memory
- if there are too many variables we generate variables as needed
- we generate variables as needed by a pricing algorithm
- if there are too many constraints we generate constraints as needed
- we generate constraints as needed by a separation algorithm


Key Concepts

The mixed integer linear program under consideration is:

      min  cᵀx
(MIP) s.t. Ax ≥ b
           Bx ≥ d
           x ≥ 0
           x_j ∈ Z, j ∈ I

One solution approach is to solve the LP relaxation within branch and bound:

     min  cᵀx
(LP) s.t. Ax ≥ b
          Bx ≥ d
          x ≥ 0


Key Concepts

As an alternative, we try to define A and B appropriately and solve:

            min  cᵀx
(conv(MIP)) s.t. Ax ≥ b
                 x ∈ conv(Γ)

where Γ = {x ≥ 0 : Bx ≥ d, x_j ∈ Z, j ∈ I}. How to break the constraints into A and B takes a fair amount of skill and practice. We want the characterization of conv(Γ) to be an easy hard problem.

If it is too easy there may not be a big win in terms of better lower bounds.


Key Concepts

There are three approaches to actually solving

            min  cᵀx
(conv(MIP)) s.t. Ax ≥ b
                 x ∈ conv(Γ)

Approach 1: Characterize conv(Γ) by the inequalities that define the polyhedron. This is the cutting plane approach. (Today)

Approach 2: Characterize conv(Γ) through inverse projection. Often called decomposition or column generation.

Approach 3: Lagrangian relaxation. (Today)


Cut Generation

There are three approaches to actually solving

            min  cᵀx
(conv(MIP)) s.t. Ax ≥ b
                 x ∈ conv(Γ)

We now illustrate cut generation with the TSP.

We need to figure out how to break up the constraints.


Cut Generation

Let's go back and solve the cut generation problem. The problem is:

    min  Σ_{e∈E} c_e x_e                            (1)
    s.t. Σ_{e∈δ(i)} x_e = 2,         i ∈ V          (2)
         Σ_{e∈E(S)} x_e ≤ |S| − 1,   S ⊂ V \ {1}    (3)
         Σ_{e∈E} x_e = n                            (4)
         x_e ∈ {0, 1},               ∀e ∈ E         (5)


Cut Generation

Rewrite this as

    min  Σ_{e∈E} c_e x_e
    s.t. Σ_{e∈δ(i)} x_e = 2,  i ∈ V
         x ∈ Γ

where Γ corresponds to constraints (3)-(5). In the previous slide deck, we characterized Γ by its extreme points and used column generation.

However, we can also generate the constraints in (3) through separation.

Cut Generation

We now work with this example:

[Figure: an example graph with numbered vertices and edge costs; the layout and edge labels are not recoverable from the extracted text.]


Cut Generation

Form a problem relaxation by deleting the tour-breaking constraints (3):

    min  Σ_{e∈E} c_e x_e
    s.t. Σ_{e∈δ(i)} x_e = 2,  i ∈ V
         Σ_{e∈E} x_e = n
         x_e ∈ {0, 1},  ∀e ∈ E

12 / 49

Page 13: Integer Programming: Large Scale Methods Part II (Chapter ...faculty.chicagobooth.edu/kipp.martin/root/htmls/coursework/36900/... · Methods Part II (Chapter 12 and Chapter 16)

Cut Generation

Here is the solution. Argh! Subtours!

[Figure: the solution of the relaxation, which contains subtours; in particular, vertices 2, 3, and 6 form a subtour.]


Cut Generation

The separation problem

          max  Σ_{e∈E} x_e α_e − Σ_{i=1, i≠k}^n θ_i     (6)
(SP_k(x)) s.t. α_e ≤ θ_i,  e ∈ δ(i), i ∈ V              (7)
               θ_i ≥ 0.                                  (8)


Cut Generation

Let k = 2. The solution is:

θ_2 = 1, θ_3 = 1, θ_6 = 1

α_(2,3) = 1, α_(2,6) = 1, α_(3,6) = 1

This identifies the subtour set S = {2, 3, 6} and the violated cut x_(2,3) + x_(2,6) + x_(3,6) ≤ |S| − 1 = 2.


Lagrangian Approach

The MIP is

      min  cᵀx
(MIP) s.t. Ax ≥ b
           x ∈ Γ

where, as always, the set Γ has special structure. Dualize the constraints Ax ≥ b with a conformable vector λ ≥ 0 of Lagrange multipliers and get

    min  cᵀx + λᵀ(b − Ax)
    s.t. x ∈ Γ


Lagrangian Approach

Define L(λ) := min{cᵀx + λᵀ(b − Ax) | x ∈ Γ}.

The Lagrangian Dual problem (LD) is:

(LD) max{L(λ) : λ ≥ 0}.

Theorem (Weak Duality): v(LD) ≤ v(MIP)

Proof?

Theorem: L(λ) is a concave function of λ.

Proof?

Remark: No matter how ugly and non-convex the set Γ is, L(λ) is concave, so at least we are trying to maximize a concave function.
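To make the two theorems concrete, here is a small numeric sketch on an invented one-dimensional instance (not from the course files): min 3x s.t. x ≥ 1.5, x ∈ Γ = {0, 1, 2}. For each fixed x, the Lagrangian cᵀx + λᵀ(b − Ax) is affine in λ, so L(λ) is a pointwise minimum of affine functions, hence concave; and for every λ ≥ 0, L(λ) is at most v(MIP).

```python
# Hypothetical toy MIP: min 3x s.t. x >= 1.5, x in Gamma = {0, 1, 2}.
# Dualize x >= 1.5: L(lam) = min_{x in Gamma} 3x + lam*(1.5 - x).
GAMMA = [0, 1, 2]
c, b = 3.0, 1.5

def L(lam):
    # pointwise minimum of affine functions of lam -> concave
    return min(c * x + lam * (b - x) for x in GAMMA)

v_mip = min(c * x for x in GAMMA if x >= b)    # = 6.0, attained at x = 2

lams = [0.1 * k for k in range(101)]           # grid on [0, 10]
# Weak duality: L(lam) <= v(MIP) for every lam >= 0
assert all(L(l) <= v_mip + 1e-9 for l in lams)
# Concavity (midpoint test): L((a+b)/2) >= (L(a) + L(b))/2
for i in range(len(lams)):
    for j in range(i, len(lams)):
        a, bb = lams[i], lams[j]
        assert L((a + bb) / 2) >= (L(a) + L(bb)) / 2 - 1e-9
print(round(max(L(l) for l in lams), 6))       # best dual bound on the grid
```

On this instance the best dual bound is 4.5 (at λ = 3), strictly below v(MIP) = 6 but above the naive bound L(0) = 0.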


Lagrangian Approach – Using Subgradients

Concave is not too bad, but this is a linear programming class! I don't have the ambition to tackle a general concave function. You know me: 1) I worry; and 2) I like easy problems.

γ^k is a subgradient of the concave function L(λ) at λ^k iff

    L(λ) ≤ L(λ^k) + (γ^k)ᵀ(λ − λ^k)

for all λ.

Solve the following linear programming relaxation:

    max  θ
    s.t. θ ≤ L(λ^k) + (γ^k)ᵀ(λ − λ^k),  k = 1, …, K     (9)
         λ ≥ 0

where γ^k, k = 1, …, K are subgradients. What is the geometry of this approach?


Lagrangian Approach – Finding Subgradients

Here is how to find a subgradient. Given a λ̄ ≥ 0, solve

    L(λ̄) = min{cᵀx + λ̄ᵀ(b − Ax) | x ∈ Γ}

and obtain an optimal x(λ̄).

Theorem: The vector γ̄ = b − Ax(λ̄) is a subgradient of L(λ) at λ = λ̄.

Proof: Next slide.


Lagrangian Approach – Finding Subgradients

If

    L(λ) = min{cᵀx + λᵀ(b − Ax) | x ∈ Γ}
    L(λ̄) = cᵀx(λ̄) + λ̄ᵀ(b − Ax(λ̄))

then

    L(λ̄) + (b − Ax(λ̄))ᵀ(λ − λ̄)
      = cᵀx(λ̄) + (b − Ax(λ̄))ᵀλ̄ + (b − Ax(λ̄))ᵀ(λ − λ̄)
      = cᵀx(λ̄) + (b − Ax(λ̄))ᵀλ̄ + (b − Ax(λ̄))ᵀλ − (b − Ax(λ̄))ᵀλ̄
      = cᵀx(λ̄) + (b − Ax(λ̄))ᵀλ
      ≥ L(λ)

The final inequality holds because x(λ̄) ∈ Γ, so it is feasible in the minimization defining L(λ).
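A quick numeric check of the theorem on a made-up one-dimensional instance (Γ = {0, 1, 2}, c = 3, A = 1, b = 1.5): evaluate L at λ̄ = 1, take γ̄ = b − Ax(λ̄), and verify the subgradient inequality on a grid.

```python
# Hypothetical toy instance: Gamma = {0, 1, 2}, c = 3, A = 1, b = 1.5.
GAMMA = [0, 1, 2]
c, b = 3.0, 1.5

def lagrangian(lam):
    # returns (L(lam), a minimizer x(lam))
    val, x = min((c * x + lam * (b - x), x) for x in GAMMA)
    return val, x

lam_bar = 1.0
L_bar, x_bar = lagrangian(lam_bar)     # L(1) = 1.5, attained at x = 0
gamma = b - x_bar                      # subgradient b - A x(lam_bar) = 1.5

# Subgradient inequality: L(lam) <= L(lam_bar) + gamma*(lam - lam_bar), all lam
for k in range(201):
    lam = 0.05 * k                     # grid on [0, 10]
    assert lagrangian(lam)[0] <= L_bar + gamma * (lam - lam_bar) + 1e-9
print("subgradient inequality verified on the grid")
```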


Lagrangian Approach - Subgradient Algorithm

Step 0: Initialize: choose tolerance ε, set K ← 1 and λ^K ← 0. Solve L(λ^K) and obtain subgradient γ^K.

Step 1: Solve the relaxed Lagrangian dual (9) and obtain optimal θ and λ.

Step 2: K ← K + 1, λ^K ← λ, solve

    L(λ^K) = min{cᵀx + (λ^K)ᵀ(b − Ax) | x ∈ Γ}

and save γ^K = b − Ax(λ^K) and L(λ^K).

Step 3: Stop if the ε tolerance is satisfied. Otherwise, add

    θ ≤ L(λ^K) + (γ^K)ᵀ(λ − λ^K)

to the relaxed Lagrangian dual (9).

Step 4: Go to Step 1.
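A minimal end-to-end sketch of Steps 0-4 on an invented one-dimensional toy (min 3x s.t. x ≥ 1.5, x ∈ Γ = {0, 1, 2}). Since λ is scalar here, the master problem (9) can be solved exactly by scanning the breakpoints of the piecewise-linear cut model; λ is restricted to a box 0 ≤ λ ≤ 10 so the first master is bounded. This is a sketch of the scheme, not of genAssignSub.gms.

```python
# Hypothetical toy instance: min 3x s.t. x >= 1.5, x in Gamma = {0, 1, 2}.
GAMMA, c, b = [0, 1, 2], 3.0, 1.5
LAM_MAX = 10.0                               # box on lambda so master (9) is bounded

def solve_lagrangian(lam):
    """L(lam) = min_{x in Gamma} c*x + lam*(b - x); return value and subgradient."""
    val, x = min((c * x + lam * (b - x), x) for x in GAMMA)
    return val, b - x                        # gamma = b - A x(lam)

def solve_master(cuts):
    """max theta s.t. theta <= Lk + gk*(lam - lamk) for each cut, 0 <= lam <= LAM_MAX.
    In 1-D the optimum sits at lam = 0, lam = LAM_MAX, or a pairwise cut crossing."""
    cand = {0.0, LAM_MAX}
    for (L1, g1, l1) in cuts:
        for (L2, g2, l2) in cuts:
            if abs(g1 - g2) > 1e-12:
                lam = ((L2 - g2 * l2) - (L1 - g1 * l1)) / (g1 - g2)
                if 0.0 <= lam <= LAM_MAX:
                    cand.add(lam)
    def model(lam):
        return min(Lk + gk * (lam - lk) for (Lk, gk, lk) in cuts)
    lam = max(cand, key=model)
    return model(lam), lam                   # (theta, lam)

# Step 0: initialize with lam = 0
lam = 0.0
Lval, gamma = solve_lagrangian(lam)
cuts = [(Lval, gamma, lam)]
best = Lval
for _ in range(20):                          # Steps 1-4
    theta, lam = solve_master(cuts)          # Step 1: relaxed dual (9)
    Lval, gamma = solve_lagrangian(lam)      # Step 2: evaluate L, get subgradient
    best = max(best, Lval)
    if theta - best <= 1e-9:                 # Step 3: eps-tolerance satisfied
        break
    cuts.append((Lval, gamma, lam))          # add the cut, Step 4: go to Step 1
print(best)
```

On this toy the loop converges in three Lagrangian evaluations to v(LD) = 4.5, which equals the LP-relaxation value since Γ here has no side constraints.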


Lagrangian Approach - Subgradient Algorithm

Consider the generalized assignment problem. Let x_ij be 1 if task i is assigned to person j, and 0 otherwise.

    min  Σ_{i=1}^n Σ_{j=1}^m c_ij x_ij
    s.t. Σ_{j=1}^m x_ij = 1,           i = 1, …, n
         Σ_{i=1}^n a_ij x_ij ≤ b_j,    j = 1, …, m
         x_ij ∈ {0, 1},  i = 1, …, n, j = 1, …, m

How do we partition into Ax ≥ b and Γ? We want an easy hard problem.
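One natural partition (the one used on the following slides) dualizes the assignment constraints with multipliers λ_i ≥ 0, leaving the knapsack constraints in Γ. Then L(λ) = Σ_i λ_i + Σ_j [min Σ_i (c_ij − λ_i) x_ij s.t. Σ_i a_ij x_ij ≤ b_j, x binary], i.e., the Lagrangian subproblem separates into m independent 0/1 knapsacks. A sketch with invented integer data (not the genAssignData.txt instance):

```python
# Hypothetical GAP data: n = 3 tasks, m = 2 servers.
cdata = [[9, 2], [1, 2], [3, 8]]      # c[i][j]
adata = [[6, 8], [7, 5], [4, 6]]      # a[i][j] (integer resource use)
bdata = [11, 7]                       # b[j]
n, m = 3, 2

def min_knapsack(weights, sizes, cap):
    """min sum w_i x_i s.t. sum s_i x_i <= cap, x binary (integer sizes).
    Only items with negative weight can lower the cost, so this is a
    max-knapsack over those items."""
    best = [0.0] * (cap + 1)          # best[u] = max cost reduction with capacity u
    for w, s in zip(weights, sizes):
        if w < 0 and s <= cap:
            for u in range(cap, s - 1, -1):
                best[u] = max(best[u], best[u - s] - w)
    return -best[cap]

def L(lam):
    # L(lam) = sum_i lam_i + sum over servers j of a knapsack with
    # adjusted costs c_ij - lam_i (cf. adj_c in genAssignSub.gms)
    val = sum(lam)
    for j in range(m):
        val += min_knapsack([cdata[i][j] - lam[i] for i in range(n)],
                            [adata[i][j] for i in range(n)], bdata[j])
    return val

print(L([0.0, 0.0, 0.0]))   # lam = 0: no adjusted cost is negative, so L = 0.0
print(L([4.0, 3.0, 5.0]))   # a nonzero multiplier vector gives a better bound
```

By weak duality, every printed value is a valid lower bound on the IP optimum of this little instance.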


Lagrangian Approach - Subgradient Algorithm

The model genAssign.gms solves the integer programming formulation for the generalized assignment problem.

SETS
  i "task", j "server";

PARAMETERS
  c(i, j), a(i, j), b(j);

VARIABLES
  z "minimize the cost";

BINARY VARIABLES
  x(i, j);

EQUATIONS
  cost, assign(i), resource(j);

cost ..        z =e= sum( (i, j), c(i, j)*x(i, j) );
assign(i) ..   sum(j, x(i, j)) =g= 1;
resource(j) .. sum(i, a(i, j)*x(i, j)) =l= b(j);

MODEL genAssign / cost, assign, resource /;


Lagrangian Approach - Subgradient Algorithm

The data file is genAssignData.txt. For these data the optimal LP value is 20.32, and the optimal IP value is 34.0. A huge gap!

The GAMS model file genAssignSub.gms solves the Lagrangian dual. It contains two models: 1) genAssign without the assignment constraints; and 2) lagDual, which is the relaxed Lagrangian dual.

Model lagDual calls model genAssign inside the cut generation loop.

The value of the Lagrangian dual is 29.5, so we really are closing the gap a lot.


Lagrangian Approach - Subgradient Algorithm

cost ..        z =e= sum( (i, j), adj_c(i, j)*x(i, j) );
assign(i) ..   sum(j, x(i, j)) =g= 1;
resource(j) .. sum(i, a(i, j)*x(i, j)) =l= b(j);

MODEL genAssign / cost, resource /;

cut(allcuts) .. theta =l= LagValue(allcuts)
    + sum( i, cutCoeffSub(allcuts, i)*(lambda(i) - cutCoeffLambda(allcuts, i)) );

MODEL lagDual / cut /;

Inside the loop,

adj_c(i, j) = c(i, j) - lambda.l(i);

SOLVE genAssign USING MIP MINIMIZING z ;

SOLVE lagDual USING LP MAXIMIZING theta ;


Lagrangian Approach

If the Lagrangian dual is

    max{L(λ) : λ ≥ 0}

where L(λ) := min{cᵀx + λᵀ(b − Ax) | x ∈ Γ}, then this is equivalent to

(LD) max  θ
     s.t. θ ≤ cᵀx + λᵀ(b − Ax),  ∀x ∈ Γ     (10)
          λ ≥ 0.


Lagrangian Approach

KEY Theorem: The optimal value of (LD) is equal to the optimal value of conv(MIP).

Proof: I am too tired! Accept on faith.

All of our approaches are equivalent!


Capstone Case: Capacitated Lot Sizing

Dynamic Lot Sizing. Variables:

x_it – units of product i produced in period t
I_it – inventory level of product i at the end of period t
y_it – 1 if there is nonzero production of product i during period t, 0 otherwise

Parameters:

d_it – demand for product i in period t
f_it – fixed cost associated with nonzero production of product i in period t
c_it – marginal production cost for product i in period t
h_it – marginal holding cost charged to product i at the end of period t
g_t – production capacity in period t


Capstone Case: Capacitated Lot Sizing

Objective: Minimize the sum of marginal production cost, holding cost, and fixed cost:

    Σ_{i=1}^N Σ_{t=1}^T (c_it x_it + h_it I_it + f_it y_it)

Constraint 1: Do not exceed total capacity in each period:

    Σ_{i=1}^N x_it ≤ g_t,  t = 1, …, T


Capstone Case: Capacitated Lot Sizing

Constraint 2: Inventory balance equations

    I_{i,t−1} + x_it − I_it = d_it,  i = 1, …, N, t = 1, …, T

Constraint 3: Fixed cost forcing constraints

    x_it − M_it y_it ≤ 0,  i = 1, …, N, t = 1, …, T


Capstone Case: Capacitated Lot Sizing

Dynamic Lot Sizing. A "standard formulation" is:

    min  Σ_{i=1}^N Σ_{t=1}^T (c_it x_it + h_it I_it + f_it y_it)
    s.t. Σ_{i=1}^N x_it ≤ g_t,              t = 1, …, T
         I_{i,t−1} + x_it − I_it = d_it,    i = 1, …, N, t = 1, …, T
         x_it − M_it y_it ≤ 0,              i = 1, …, N, t = 1, …, T
         x_it, I_it ≥ 0,                    i = 1, …, N, t = 1, …, T
         y_it ∈ {0, 1},                     i = 1, …, N, t = 1, …, T.


Capstone Case: Capacitated Lot Sizing

The generic model is

      min  cᵀx
(MIP) s.t. Ax ≥ b
           x ∈ Γ

Map the capacitated lot sizing model into this as follows: the capacity constraints Σ_{i=1}^N x_it ≤ g_t map into the Ax ≥ b constraints, and the rest of the constraints

- inventory balance
- fixed charge big-M
- nonnegativity and binary

define the Γ.


Capstone Case: Capacitated Lot Sizing

Four approaches for generating bounds to find the optimal value of

            min  cᵀx
(conv(MIP)) s.t. Ax ≥ b
                 x ∈ conv(Γ)

1. Column generation
2. Subgradient optimization to find the optimal L(λ)
3. Generate the cuts that characterize conv(Γ)
4. The z_itk formulation that also gives conv(Γ) (via Homework 5, question 5)


Capstone Case: Capacitated Lot Sizing

Dynamic Lot Sizing. An alternate formulation:

z_itk is 1 if, for product i in period t, the decision is to produce enough items to satisfy demand for periods t through k, and 0 otherwise.

    Σ_{k=1}^T z_i1k = 1,                                i = 1, …, N
    −Σ_{j=1}^{t−1} z_{ij,t−1} + Σ_{k=t}^T z_itk = 0,    i = 1, …, N, t = 2, …, T
    Σ_{k=t}^T z_itk ≤ y_it,                             i = 1, …, N, t = 1, …, T


Capstone Case: Capacitated Lot Sizing

Dynamic Lot Sizing. Characterize conv(Γ) through cutting planes. Barany, Van Roy, and Wolsey have characterized conv(Γ) by the (ℓ, S) inequalities:

    Σ_{t∈S} x_t ≤ Σ_{t∈S} (Σ_{k=t}^ℓ d_k) y_t + I_ℓ

for ℓ = 1, …, T and ∀S ⊆ {1, …, ℓ}.

Example: Let ℓ = 5 and S = {5}. The cut is

    x_5 ≤ (Σ_{k=5}^5 d_k) y_5 + I_5 = d_5 y_5 + I_5.


Capstone Case: Capacitated Lot Sizing

The separation algorithm (formulated as an LP/IP): given (x̄, Ī, ȳ) and ℓ ∈ {1, …, T}, solve

    max  Σ_{t=1}^ℓ x̄_t α_t − Σ_{t=1}^ℓ (Σ_{k=t}^ℓ d_k) ȳ_t α_t − Ī_ℓ
    s.t. α_t ∈ {0, 1},  t = 1, …, ℓ

and the α_t = 1 index the elements in the set S. If the optimal value is positive, the corresponding (ℓ, S) inequality is violated.
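Because the α_t are independent binaries, this separation problem solves by inspection: put t in S exactly when its objective coefficient x̄_t − (Σ_{k=t}^ℓ d_k) ȳ_t is positive. A sketch on an invented single-product fractional point (T = 2, fractional ȳ from a big-M of 5; not the exam data):

```python
# Hypothetical fractional point for one product, T = 2 periods.
d = [3.0, 2.0]                 # demand
x = [3.0, 2.0]                 # production in an LP-relaxation solution
I = [0.0, 0.0]                 # end-of-period inventory
y = [0.6, 0.4]                 # fractional setup variables (e.g. M_t = 5)

def separate(ell):
    """Most-violated (ell,S) cut: S = {t <= ell with positive coefficient}.
    Returns (violation, S); positive violation means the cut is violated."""
    S, viol = [], -I[ell - 1]
    for t in range(1, ell + 1):
        coef = x[t - 1] - sum(d[t - 1:ell]) * y[t - 1]
        if coef > 1e-9:
            S.append(t)
            viol += coef
    return viol, S

for ell in (1, 2):
    viol, S = separate(ell)
    print(ell, S, round(viol, 6))
# ell = 1 gives S = {1}: the cut x_1 <= d_1 y_1 + I_1 is violated by 1.2
# ell = 2 gives S = {2}: the cut x_2 <= d_2 y_2 + I_2 is violated by 1.2
```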


Capstone Case: Capacitated Lot Sizing

Dynamic Lot Sizing. An alternate approach: use subgradient optimization and Lagrangian duality to get the result. The Lagrangian problem to solve in order to generate the subgradient is

    min  Σ_{i=1}^N Σ_{t=1}^T ((c_it + λ_t) x_it + h_it I_it + f_it y_it) − Σ_{t=1}^T λ_t g_t
    s.t. I_{i,t−1} + x_it − I_it = d_it,    i = 1, …, N, t = 1, …, T
         x_it − M_it y_it ≤ 0,              i = 1, …, N, t = 1, …, T
         x_it, I_it ≥ 0,                    i = 1, …, N, t = 1, …, T
         y_it ∈ {0, 1},                     i = 1, …, N, t = 1, …, T.

What is the economic interpretation of λ_t in the term (c_it + λ_t) x_it?
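For fixed λ there is no capacity coupling left, so this subproblem separates by product into single-item uncapacitated lot sizing problems with adjusted cost c̃_t = c_it + λ_t, each solvable by the classic Wagner-Whitin dynamic program (which exploits the fact that some optimal solution produces only when entering inventory is zero). A sketch for one product with invented data (not the course data file):

```python
# Invented data for one product, T = 4; costs already adjusted: c~_t = c_t + lam_t.
T = 4
d = [4, 2, 5, 3]        # demand
f = [10, 12, 20, 6]     # fixed setup cost
cu = [1, 1, 3, 1]       # adjusted marginal production cost
h = [1, 1, 1, 1]        # holding cost per unit per period

def prod_cost(t, u):
    """Cost of producing period u's demand in period t <= u (production + holding)."""
    return d[u] * (cu[t] + sum(h[t:u]))

def wagner_whitin():
    """F[k] = min cost of covering periods 0..k-1; try each last setup period t."""
    F = [0.0] + [float("inf")] * T
    for k in range(1, T + 1):
        for t in range(k):               # t = last setup period before k
            cost = F[t] + f[t] + sum(prod_cost(t, u) for u in range(t, k))
            F[k] = min(F[k], cost)
    return F[T]

print(wagner_whitin())                   # optimal subproblem cost for this product
```

For this data the DP selects setups in periods 1 and 4 with total cost 42 (produce demand for periods 1-3 in period 1, and period 4's demand in period 4).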


Capstone Case: Capacitated Lot Sizing

Dynamic Lot Sizing. An alternate approach: use column generation to get the convex hull.

This is what we did with the TSP problem in the previous set of slides. See also Example 12.20 in the text.


Capstone Case: Capacitated Lot Sizing

Final Exam: Take the two product, five period data:

http://faculty.chicagobooth.edu/kipp.martin/root/htmls/coursework/36900/datafiles/lotsizeDataExam.txt

and use one of the following techniques to find the convex hull relaxation:

- the Barany, Van Roy, and Wolsey cuts
- subgradient optimization
- column generation

IMPORTANT: The final exam must be done on an individual basis. You are not allowed to collaborate in any way with your classmates.


Lagrangian Approach Application – Convex Optimization

Objective: Apply projection and Lagrangian duality to convex optimization.

     sup  f(x)
(CP) s.t. g_i(x) ≥ 0,  i = 1, …, p
          x ∈ Ω

where

- f(x) and g_i(x), i = 1, …, p, are concave functions on ℝⁿ
- Ω ⊆ ℝⁿ is a closed, convex set.


Lagrangian Approach Application – Convex Optimization

The Lagrangian function L(λ) is

    L(λ) := sup{f(x) + Σ_{i=1}^p λ_i g_i(x) : x ∈ Ω}.

The Lagrangian dual is

    inf_{λ≥0} L(λ).

Weak Duality: inf_{λ≥0} L(λ) ≥ v(CP), where v(CP) is the optimal value of (CP).
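A grid-based sketch on a made-up instance that happens to satisfy the Slater condition discussed below: f(x) = 4 − (x − 2)², g(x) = 1 − x, Ω = [0, 3]. Here v(CP) = 3 (at x = 1), and the dual infimum is also 3 (at λ = 2), so weak duality holds with zero gap.

```python
# Hypothetical concave program: sup 4-(x-2)^2  s.t. 1-x >= 0,  x in [0, 3].
def f(x): return 4.0 - (x - 2.0) ** 2
def g(x): return 1.0 - x

xs = [3.0 * k / 3000 for k in range(3001)]          # grid on Omega = [0, 3]

v_cp = max(f(x) for x in xs if g(x) >= 0)           # primal value: 3 at x = 1

def L(lam):
    # Lagrangian function: sup over Omega of f + lam*g (approximated on the grid)
    return max(f(x) + lam * g(x) for x in xs)

lams = [5.0 * k / 500 for k in range(501)]          # grid on [0, 5]
dual = min(L(l) for l in lams)                      # approximates inf_{lam>=0} L(lam)

assert dual >= v_cp - 1e-6                          # weak duality
print(round(v_cp, 3), round(dual, 3))               # zero gap on this instance
```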


Lagrangian Approach Application – Convex Optimization

Unfortunately, even for a convex problem there may be a duality gap, that is,

    inf_{λ≥0} L(λ) > v(CP).

A great deal of research has gone into finding sufficient conditions that guarantee a zero duality gap.

These are often called constraint qualifications because they typically involve conditions on the constraints.

One of the most well-known constraint qualifications is the Slater constraint qualification.


Lagrangian Approach Application – Convex Optimization

The Slater constraint qualification is that there exists an x∗ ∈ Ω such that g_i(x∗) > 0 for all i = 1, …, p.

The Slater constraint qualification gives the following classic theorem in convex optimization.

Theorem: Suppose the convex program (CP) is feasible and bounded. Moreover, suppose there exists an x∗ ∈ Ω such that g_i(x∗) > 0 for all i = 1, …, p. Then there is zero duality gap between the convex program (CP) and its Lagrangian dual (LD). Moreover, there exists a λ∗ ≥ 0 such that v(CP) = L(λ∗), i.e., the Lagrangian dual is solvable.

We prove this result by using projection (because, as you know by now, that is all I can do).


Lagrangian Approach Application – Convex Optimization

Construct the semi-infinite linear program (yes, linear)

           inf  σ
(CP-SILP)  s.t. σ − Σ_{i=1}^p λ_i g_i(x) ≥ f(x),  ∀x ∈ Ω
                λ ≥ 0.

This is equivalent to the Lagrangian dual

     inf  L(λ)
(LD) s.t. L(λ) = sup{f(x) + Σ_{i=1}^p λ_i g_i(x) : x ∈ Ω}
          λ ≥ 0.

Replace

    L(λ) = sup{f(x) + Σ_{i=1}^p λ_i g_i(x) : x ∈ Ω}

with

    L(λ) ≥ f(x) + Σ_{i=1}^p λ_i g_i(x),  x ∈ Ω.


Lagrangian Approach Application – Convex Optimization

Theorem: Assume

- the convex program (CP) is feasible and bounded, and
- there exists an x∗ ∈ Ω such that g_i(x∗) > 0 for all i = 1, …, p;

then

- there is a zero duality gap between the convex program (CP) and its Lagrangian dual (LD), and
- there exists a λ∗ ≥ 0 such that v(LD) = L(λ∗), i.e., the Lagrangian dual is solvable.


Lagrangian Approach Application – Convex Optimization

KEY IDEA: Show that the semi-infinite linear program (CP-SILP) is solvable and has zero duality gap. Do this using, yes you guessed it, projection!

Put (CP-SILP) into our linear programming format and solve with Fourier-Motzkin elimination:

    z − σ ≥ 0
    σ − Σ_{i=1}^p λ_i g_i(x) ≥ f(x),  ∀x ∈ Ω
    λ_i ≥ 0,  i = 1, …, p

Eliminate σ and end up with

    z − Σ_{i=1}^p λ_i g_i(x) ≥ f(x),  ∀x ∈ Ω
    λ_i ≥ 0,  i = 1, …, p


Lagrangian Approach Application – Convex Optimization

Claim: The variables λ_1, …, λ_p remain clean as the Fourier-Motzkin elimination procedure proceeds.

Use induction to show that after eliminating variables λ_1, …, λ_k, there is an inequality

    z − Σ_{i=k+1}^p λ_i g_i(x∗) ≥ f(x∗).

1. This implies the variables λ_i, for i = k + 1, …, p, are clean. Why?
2. If there are no dirty variables, there is no duality gap. Why?


Lagrangian Approach Application – Convex Optimization

We can also show:

    v(CP-SILP) = v(LD) ≥ v(CP) = v(CP-FDSILP)

See http://www.optimization-online.org/DB_HTML/2013/04/3817.html.

We just showed

    v(CP-SILP) = v(CP-FDSILP)

therefore

    v(LD) = v(CP).


Lagrangian Approach Application – Convex Optimization

The Slater result can be strengthened.

Theorem: An instance of (CP-SILP) with finite optimal value is tidy if and only if the set of optimal solutions to (CP-SILP) is bounded.

See http://www.optimization-online.org/DB_HTML/2013/04/3817.html.
