Active-set method · Frank-Wolfe method · Penalty method · Barrier methods

Solution methods for constrained optimization problems

Mauro Passacantando
Department of Computer Science, University of Pisa
[email protected]

Optimization Methods
Master of Science in Embedded Computing Systems – University of Pisa
http://pages.di.unipi.it/passacantando/om/OM.html
M. Passacantando Optimization Methods 1 / 28 –
Problems with linear equality constraints
Consider
    min f(x)
    s.t. Ax = b
with
- f strongly convex and twice continuously differentiable
- A a p × n matrix with rank(A) = p.
It is equivalent to an unconstrained problem: write A = (A_B, A_N) with det(A_B) ≠ 0; then Ax = b is equivalent to
    A_B x_B + A_N x_N = b  ⟹  x_B = A_B^{-1} (b − A_N x_N),
thus min f(x) s.t. Ax = b is equivalent to
    min f(A_B^{-1}(b − A_N x_N), x_N),  x_N ∈ R^{n−p}
Problems with linear equality constraints
Example. Consider
    min x_1^2 + x_2^2 + x_3^2
    s.t. x_1 + x_3 = 1
         x_1 + x_2 − x_3 = 2
Since x_1 = 1 − x_3 and x_2 = 2 − x_1 + x_3 = 1 + 2x_3, the original constrained problem is equivalent to the following unconstrained problem:
    min (1 − x_3)^2 + (1 + 2x_3)^2 + x_3^2 = 6x_3^2 + 2x_3 + 2,  x_3 ∈ R
Therefore, the optimal solution is x_3 = −1/6, x_1 = 7/6, x_2 = 2/3.
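The elimination above can be checked numerically. The following Python/NumPy sketch (the course exercises use MATLAB; Python is used here only for illustration, and the variable names are ad hoc) rebuilds the reduced one-dimensional quadratic from three samples and minimizes it in closed form:

```python
import numpy as np

# Check of the example: min x1^2 + x2^2 + x3^2 s.t. x1 + x3 = 1, x1 + x2 - x3 = 2.
# Eliminate x_B = (x1, x2) via x_B = A_B^{-1}(b - A_N x_N) and minimize the
# reduced one-dimensional quadratic in x3.
A = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, -1.0]])
b = np.array([1.0, 2.0])
AB, AN = A[:, :2], A[:, 2:]                # basis columns of x1, x2; det(AB) = 1

def f_reduced(x3):
    """f evaluated at (A_B^{-1}(b - A_N x3), x3); a quadratic in x3."""
    xB = np.linalg.solve(AB, b - AN @ np.array([x3]))
    return xB @ xB + x3**2

# Recover the quadratic a*x3^2 + b1*x3 + c0 from three samples, then take
# its vertex -b1/(2a); here it equals 6 x3^2 + 2 x3 + 2, so x3 = -1/6.
c0, f1, fm1 = f_reduced(0.0), f_reduced(1.0), f_reduced(-1.0)
a, b1 = (f1 + fm1) / 2 - c0, (f1 - fm1) / 2
x3 = -b1 / (2 * a)
xB = np.linalg.solve(AB, b - AN @ np.array([x3]))
print(xB[0], xB[1], x3)                    # 7/6, 2/3, -1/6
```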
Active-set method
Consider a quadratic programming problem
    min (1/2) x^T Q x + c^T x
    s.t. Ax ≤ b
where
- Q is positive definite
- for any feasible point x the vectors {A_i : A_i x = b_i} are linearly independent.
The active-set method solves at each iteration a quadratic programming problem with equality constraints only.
Active-set method
0. Choose a feasible point x^0, set W_0 = {i : A_i x^0 = b_i} (working set) and k = 0.
1. Find the optimal solution y^k of the problem
       min (1/2) x^T Q x + c^T x
       s.t. A_i x = b_i  ∀ i ∈ W_k
2. If y^k ≠ x^k then go to step 3, else go to step 4.
3. If y^k is feasible then t_k = 1,
   else t_k = min { (b_i − A_i x^k) / (A_i (y^k − x^k)) : i ∉ W_k, A_i (y^k − x^k) > 0 }.
   Set x^{k+1} = x^k + t_k (y^k − x^k), W_{k+1} = W_k ∪ {i ∉ W_k : A_i x^{k+1} = b_i}, k = k + 1 and go to step 1.
4. Compute the KKT multipliers μ^k related to y^k.
   If μ^k ≥ 0 then STOP,
   else set x^{k+1} = x^k, pick j such that μ_j^k = min_{i ∈ W_k} μ_i^k, set W_{k+1} = W_k \ {j}, k = k + 1 and go to step 1.
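A minimal Python/NumPy sketch of the scheme above (the course uses MATLAB; the name `active_set_qp` is illustrative). Each equality-constrained subproblem of step 1 is solved through its KKT linear system; as a simplification, only the single blocking constraint is added to the working set, rather than all newly active ones:

```python
import numpy as np

def active_set_qp(Q, c, A, b, x0, max_iter=100):
    """Active-set sketch for min 1/2 x'Qx + c'x s.t. Ax <= b.
    Assumes Q positive definite and x0 feasible. The equality-constrained
    QP on the working set W is solved via its KKT linear system:
        [ Q   A_W' ] [y ]   [ -c  ]
        [ A_W  0   ] [mu] = [ b_W ]
    """
    Q, c, A, b = (np.asarray(M, float) for M in (Q, c, A, b))
    x = np.asarray(x0, float)
    W = {i for i in range(len(b)) if abs(A[i] @ x - b[i]) < 1e-9}
    for _ in range(max_iter):
        idx = sorted(W)
        if idx:
            Aw = A[idx]
            K = np.block([[Q, Aw.T], [Aw, np.zeros((len(idx), len(idx)))]])
            sol = np.linalg.solve(K, np.concatenate([-c, b[idx]]))
            y, mu = sol[:len(x)], sol[len(x):]
        else:
            y, mu = np.linalg.solve(Q, -c), np.empty(0)
        if np.allclose(y, x, atol=1e-9):
            if mu.size == 0 or mu.min() >= -1e-9:
                return x                      # KKT holds: global minimum
            W.remove(idx[int(mu.argmin())])   # drop most negative multiplier
        else:
            d = y - x
            t, blocking = 1.0, None           # step size as in step 3
            for i in range(len(b)):
                if i not in W and A[i] @ d > 1e-12:
                    ti = (b[i] - A[i] @ x) / (A[i] @ d)
                    if ti < t:
                        t, blocking = ti, i
            x = x + t * d
            if blocking is not None:          # only the blocking constraint is
                W.add(blocking)               # added (the slides add all new ones)
    return x
```

On the worked example of the next slides it reproduces the iterates x^0 = (0,0), (11/9, 22/9), (5/3, 7/3) and the solution (7/3, 5/3).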
Active-set method
Example. Solve the problem
    min (1/2)(x_1 − 3)^2 + (x_2 − 2)^2
    s.t. −2x_1 + x_2 ≤ 0
         x_1 + x_2 ≤ 4
         −x_2 ≤ 0
by means of the active-set method starting from x^0 = (0, 0).

The working set is W_0 = {1, 3}, hence y^0 = x^0; the KKT multipliers are μ_1^0 = −3/2, μ_3^0 = −11/2. The new point is x^1 = x^0 with W_1 = {1}; y^1 is the optimal solution of
    min (1/2)(x_1 − 3)^2 + (x_2 − 2)^2
    s.t. −2x_1 + x_2 = 0
which is equivalent to
    min (9/2) x_1^2 − 11 x_1,  x_1 ∈ R
thus y^1 = (11/9, 22/9), which is feasible; therefore x^2 = y^1 and W_2 = {1}. We already know that y^2 = x^2; the KKT multiplier is μ_1^2 = −8/9, hence x^3 = x^2 and W_3 = ∅.
Active-set method
The optimal solution y^3 = (3, 2) is not feasible; the step size is
    t_3 = min { (b_2 − A_2 x^3) / (A_2 (y^3 − x^3)), (b_3 − A_3 x^3) / (A_3 (y^3 − x^3)) } = min {1/4, 11/2} = 1/4,
so x^4 = x^3 + t_3 (y^3 − x^3) = (5/3, 7/3) and W_4 = {2}. The optimal solution y^4 = (7/3, 5/3) is feasible, hence x^5 = y^4 and W_5 = {2}. Finally, y^5 = x^5 and μ_2^5 = 2/3 > 0, thus x^5 is the global minimum of the original problem.
Active-set method
Exercise 1. Solve the problem
    min 2x_1^2 + x_2^2 − x_1 x_2 − 2x_1 + x_2
    s.t. −x_1 ≤ 0
         −x_2 ≤ 0
         x_1 + x_2 ≤ 12
by means of the active-set method starting from the point (0, 10).
Frank-Wolfe method
Consider the problem
    min f(x)
    s.t. Ax ≤ b        (P)
where f is convex and continuously differentiable and the feasible region Ω = {x : Ax ≤ b} is bounded.

Linearize the objective function at x^k, i.e., f(x) ≈ f(x^k) + ∇f(x^k)^T (x − x^k), and solve the linear programming problem:
    min f(x^k) + ∇f(x^k)^T (x − x^k)
    s.t. Ax ≤ b
The optimal solution y^k of the linearized problem gives a lower bound for (P):
    f(x^k) ≥ v(P) = min_{x ∈ Ω} f(x) ≥ min_{x ∈ Ω} [ f(x^k) + ∇f(x^k)^T (x − x^k) ] = f(x^k) + ∇f(x^k)^T (y^k − x^k),
where −∇f(x^k)^T (y^k − x^k) is the optimality gap.
If ∇f(x^k)^T (y^k − x^k) = 0 then x^k solves (P); else ∇f(x^k)^T (y^k − x^k) < 0, i.e., y^k − x^k is a descent direction for f at x^k.
Frank-Wolfe method
0. Choose a feasible point x^0 and set k = 0.
1. Compute an optimal solution y^k of the LP problem
       min ∇f(x^k)^T x
       s.t. Ax ≤ b
2. If ∇f(x^k)^T (y^k − x^k) = 0 then STOP, else compute a step size t_k.
3. Set x^{k+1} = x^k + t_k (y^k − x^k), k = k + 1 and go to step 1.

The step size t_k can be predetermined, e.g., t_k = 1/(k + 1), or computed by an (exact/inexact) line search along the direction y^k − x^k, with t ∈ [0, 1].

Theorem. For any starting point x^0 the generated sequence {x^k} is bounded and any of its cluster points is a global optimum.
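The scheme above can be sketched in Python/NumPy (the course uses MATLAB; the name `frank_wolfe_box` is illustrative). To keep the sketch self-contained, the feasible region is restricted to a box, where the LP subproblem has a closed-form solution; for a quadratic objective the exact line search also has a closed form:

```python
import numpy as np

def frank_wolfe_box(Q, c, lo, hi, x0, tol=1e-6, max_iter=1000):
    """Frank-Wolfe sketch for min 1/2 x'Qx + c'x over the box lo <= x <= hi.
    Over a box the LP subproblem min grad'y has the closed-form solution
    y_i = lo_i if grad_i > 0, else hi_i. Exact line search: for a quadratic,
    the minimizer of t -> f(x + t d) over [0, 1] is t = clip(-grad'd / d'Qd, 0, 1).
    """
    Q, c, lo, hi = (np.asarray(M, float) for M in (Q, c, lo, hi))
    x = np.asarray(x0, float)
    for _ in range(max_iter):
        g = Q @ x + c
        y = np.where(g > 0, lo, hi)       # solution of the linearized problem
        d = y - x
        gap = -g @ d                      # optimality gap f(x^k) - lower bound
        if gap < tol:
            break
        curv = d @ Q @ d
        t = 1.0 if curv <= 0 else min(1.0, gap / curv)
        x = x + t * d
    return x
```

On the box example of the next slide it reproduces the iterates (0,0), (2,2), (2,1) and stops with zero gap.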
Frank-Wolfe method
Example. Solve the problem
    min (x_1 − 3)^2 + (x_2 − 1)^2
    s.t. 0 ≤ x_1 ≤ 2
         0 ≤ x_2 ≤ 2
by means of the Frank-Wolfe method starting from the point x^0 = (0, 0) and using an exact line search to compute the step size.

∇f(x) = (2x_1 − 6, 2x_2 − 2) and ∇f(x^0) = (−6, −2); the optimal solution of the linearized problem
    min −6x_1 − 2x_2
    s.t. x ∈ Ω
is y^0 = (2, 2). Since ∇f(x^0)^T (y^0 − x^0) = −16, a line search is needed: the step size t_0 is the optimal solution of
    min (x_1 − 3)^2 + (x_2 − 1)^2
    s.t. x_1 = 2t, x_2 = 2t, t ∈ [0, 1]
Frank-Wolfe method
which is equivalent to
    min 8t^2 − 16t,  t ∈ [0, 1]
thus t_0 = 1 and x^1 = (2, 2). Since ∇f(x^1) = (−2, 2), the optimal solution of the linearized problem
    min −2x_1 + 2x_2
    s.t. x ∈ Ω
is y^1 = (2, 0). Now ∇f(x^1)^T (y^1 − x^1) = −4, so the step size is the optimal solution of
    min (x_1 − 3)^2 + (x_2 − 1)^2
    s.t. x_1 = 2, x_2 = 2 − 2t, t ∈ [0, 1]
hence t_1 = 1/2, x^2 = (2, 1) and ∇f(x^2) = (−2, 0), thus y^2 = (2, 0) (an alternative optimal solution is y^2 = (2, 2)). Since ∇f(x^2)^T (y^2 − x^2) = 0, the point x^2 is the global minimum of the original problem.
Frank-Wolfe method
Exercise 2. Implement in MATLAB the Frank-Wolfe method with exact line search for solving the problem
    min (1/2) x^T Q x + c^T x
    s.t. Ax ≤ b
where Q is a positive definite matrix.

Exercise 3. Run the Frank-Wolfe method with exact line search for solving the problem
    min (1/2)(x_1 − 3)^2 + (x_2 − 2)^2
    s.t. −2x_1 + x_2 ≤ 0
         x_1 + x_2 ≤ 4
         −x_2 ≤ 0
starting from the point (0, 0). [Use ∇f(x^k)^T (y^k − x^k) > −10^{−2} as stopping criterion.]

Exercise 4. Solve the previous problem by means of the Frank-Wolfe method with predetermined step size t_k = 1/(k + 1) starting from (0, 0).
Penalty method
Consider a constrained optimization problem
    min f(x)
    s.t. g_i(x) ≤ 0  ∀ i = 1, …, m        (P)
Define the penalty function
    p(x) = Σ_{i=1}^m (max{0, g_i(x)})^2
and consider the unconstrained penalized problem
    min p_ε(x) := f(x) + (1/ε) p(x),  x ∈ R^n        (P_ε)
Note that
    p_ε(x) = f(x) if x ∈ Ω,  p_ε(x) > f(x) if x ∉ Ω.
Penalty method
Proposition.
- If f, g_i are continuously differentiable, then p_ε is continuously differentiable and
      ∇p_ε(x) = ∇f(x) + (2/ε) Σ_{i=1}^m max{0, g_i(x)} ∇g_i(x)
- If f and g_i are convex, then p_ε is convex.
- Any (P_ε) is a relaxation of (P), i.e., v(P_ε) ≤ v(P) for any ε > 0.
- If x*_ε solves (P_ε) and x*_ε ∈ Ω, then x*_ε is optimal also for (P).
  If x*_ε solves (P_ε) and x*_ε ∉ Ω, then v(P_ε) > v(P_ε′) for any ε′ > ε.

Penalty method
0. Set ε_0 > 0, τ ∈ (0, 1), k = 0.
1. Find an optimal solution x^k of the penalized problem (P_{ε_k}).
2. If x^k ∈ Ω then STOP, else ε_{k+1} = τ ε_k, k = k + 1 and go to step 1.

Theorem. If f is coercive, then the sequence {x^k} is bounded and any of its cluster points is an optimal solution of (P).
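The penalty method can be sketched in Python/NumPy (the course uses MATLAB; the name `penalty_qp` is illustrative). For a QP the penalized function is piecewise quadratic, so here each subproblem is minimized by a semismooth Newton loop, an implementation choice not prescribed by the slides; any unconstrained minimizer would do:

```python
import numpy as np

def penalty_qp(Q, c, A, b, x0, eps0=5.0, tau=0.5, tol=1e-3, max_outer=60):
    """Penalty-method sketch for min 1/2 x'Qx + c'x s.t. Ax <= b.
    The penalized function f(x) + (1/eps) * sum_i max{0, A_i x - b_i}^2 is
    piecewise quadratic; each subproblem is minimized by a Newton loop using
    the generalized Hessian built from the currently violated rows."""
    Q, c, A, b = (np.asarray(M, float) for M in (Q, c, A, b))
    x = np.asarray(x0, float)
    eps = eps0
    for _ in range(max_outer):
        for _ in range(50):                       # inner Newton iterations
            viol = np.maximum(A @ x - b, 0.0)     # max{0, g_i(x)}
            grad = Q @ x + c + (2.0 / eps) * A.T @ viol
            act = viol > 0
            H = Q + (2.0 / eps) * A[act].T @ A[act]
            step = np.linalg.solve(H, grad)
            x = x - step
            if np.linalg.norm(step) < 1e-12:
                break
        if np.max(A @ x - b) < tol:               # stopping rule of Exercise 6
            return x
        eps *= tau
    return x
```

The iterates approach the feasible region from outside, so the stopping rule accepts a point whose constraint violation is below the tolerance rather than an exactly feasible one.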
Penalty method
Exercise 5. Implement in MATLAB the penalty method for solving the problem
    min (1/2) x^T Q x + c^T x
    s.t. Ax ≤ b
where Q is a positive definite matrix.

Exercise 6. Run the penalty method with τ = 0.5 and ε_0 = 5 for solving the problem
    min (1/2)(x_1 − 3)^2 + (x_2 − 2)^2
    s.t. −2x_1 + x_2 ≤ 0
         x_1 + x_2 ≤ 4
         −x_2 ≤ 0
[Use max(Ax − b) < 10^{−3} as stopping criterion.]
Barrier methods
Consider
    min f(x)
    s.t. g(x) ≤ 0
where
- f, g_i are convex and twice continuously differentiable
- there is no isolated point in Ω
- there exists an optimal solution (e.g., f coercive or Ω bounded)
- the Slater constraint qualification holds: there exists x̄ such that x̄ ∈ dom(f) and g_i(x̄) < 0 ∀ i = 1, …, m.
Hence strong duality holds.
Special cases: linear programming, convex quadratic programming.
Unconstrained reformulation
The constrained problem
    min f(x)
    s.t. g(x) ≤ 0
is equivalent to the unconstrained problem
    min f(x) + Σ_{i=1}^m I_−(g_i(x)),  x ∈ R^n
where
    I_−(u) = 0 if u ≤ 0,  +∞ if u > 0
is called the indicator function of R_−; it is neither finite nor differentiable.
Barrier function
However, I_− can be approximated by the smooth convex function
    u ↦ −ε/u  (u < 0)
with parameter ε > 0; the approximation improves as ε → 0.
Hence, we can approximate the problem
    min f(x) + Σ_{i=1}^m I_−(g_i(x)),  x ∈ R^n
with
    min f(x) + ε Σ_{i=1}^m 1/(−g_i(x)),  x ∈ int(Ω)
The function
    B(x) = Σ_{i=1}^m 1/(−g_i(x))
is called barrier function: dom(B) = int(Ω), and B is convex and smooth.
Barrier method
If x*(ε) is the optimal solution of
    min f(x) + ε Σ_{i=1}^m 1/(−g_i(x)),  x ∈ int(Ω)
then
    ∇f(x*(ε)) + ε Σ_{i=1}^m [1 / g_i(x*(ε))^2] ∇g_i(x*(ε)) = 0.
Define λ*_i(ε) = ε / [g_i(x*(ε))]^2 > 0 for any i = 1, …, m.
Then the Lagrangian function L(x, λ*(ε)) = f(x) + Σ_{i=1}^m λ*_i(ε) g_i(x) is convex and ∇_x L(x*(ε), λ*(ε)) = 0, hence
    f(x*(ε)) ≥ v(P) ≥ φ(λ*(ε)) = min_x L(x, λ*(ε)) = L(x*(ε), λ*(ε)) = f(x*(ε)) − ε B(x*(ε)),
so ε B(x*(ε)) is the optimality gap.
Barrier method
0. Set δ > 0, τ ∈ (0, 1) and ε_1 > 0. Choose x^0 ∈ int(Ω), set k = 1.
1. Find the optimal solution x^k of the unconstrained problem
       min f(x) + ε_k Σ_{i=1}^m 1/(−g_i(x)),  x ∈ int(Ω)
   using x^{k−1} as starting point.
2. If ε_k B(x^k) < δ then STOP, else ε_{k+1} = τ ε_k, k = k + 1 and go to step 1.

Theorem. If Ω is bounded, then the sequence {x^k} is bounded and any of its cluster points is an optimal solution of (P).
Logarithmic barrier
The indicator function I_− can also be approximated by the smooth convex function
    u ↦ −ε log(−u)  (u < 0)
with ε > 0, and the approximation improves as ε → 0.
Hence, we can approximate the problem
    min f(x) + Σ_{i=1}^m I_−(g_i(x)),  x ∈ R^n
with
    min f(x) − ε Σ_{i=1}^m log(−g_i(x)),  x ∈ int(Ω)
Logarithmic barrier
Logarithmic barrier function
    B(x) = −Σ_{i=1}^m log(−g_i(x))
- dom(B) = int(Ω)
- B is convex
- B is smooth, with
      ∇B(x) = −Σ_{i=1}^m [1 / g_i(x)] ∇g_i(x)
      ∇²B(x) = Σ_{i=1}^m [1 / g_i(x)^2] ∇g_i(x) ∇g_i(x)^T + Σ_{i=1}^m [1 / (−g_i(x))] ∇²g_i(x)
Logarithmic barrier
If x*(ε) is the optimal solution of
    min f(x) − ε Σ_{i=1}^m log(−g_i(x)),  x ∈ int(Ω)
then
    ∇f(x*(ε)) + Σ_{i=1}^m [ε / (−g_i(x*(ε)))] ∇g_i(x*(ε)) = 0.
Define λ*_i(ε) = ε / (−g_i(x*(ε))) > 0 for any i = 1, …, m. Then the Lagrangian function
    L(x, λ*(ε)) = f(x) + Σ_{i=1}^m λ*_i(ε) g_i(x)
is convex and ∇_x L(x*(ε), λ*(ε)) = 0, hence
    f(x*(ε)) ≥ v(P) ≥ φ(λ*(ε)) = min_x L(x, λ*(ε)) = L(x*(ε), λ*(ε)) = f(x*(ε)) − m ε,
so m ε is the optimality gap.
Interpretation via KKT conditions
The KKT system of the original problem is
    ∇f(x) + Σ_{i=1}^m λ_i ∇g_i(x) = 0
    −λ_i g_i(x) = 0  ∀ i = 1, …, m
    λ ≥ 0
    g(x) ≤ 0
Notice that (x*(ε), λ*(ε)) solves the system
    ∇f(x) + Σ_{i=1}^m λ_i ∇g_i(x) = 0
    −λ_i g_i(x) = ε  ∀ i = 1, …, m
    λ ≥ 0
    g(x) ≤ 0
which is an approximation of the above KKT system.
Logarithmic barrier method
0. Set δ > 0, τ ∈ (0, 1) and ε_1 > 0. Choose x^0 ∈ int(Ω), set k = 1.
1. Find the optimal solution x^k of
       min f(x) − ε_k Σ_{i=1}^m log(−g_i(x)),  x ∈ int(Ω)
   using x^{k−1} as starting point.
2. If m ε_k < δ then STOP, else ε_{k+1} = τ ε_k, k = k + 1 and go to step 1.

The choice of τ involves a trade-off: a smaller τ means fewer outer iterations, but each barrier subproblem then requires more inner iterations, since the warm start x^{k−1} is further from the new subproblem's solution.
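The steps above can be sketched in Python/NumPy for the QP of Exercises 7-8 (the course uses MATLAB; the name `log_barrier_qp` is illustrative). The inner subproblem is solved here by damped Newton with backtracking, one reasonable choice among many:

```python
import numpy as np

def log_barrier_qp(Q, c, A, b, x0, eps=1.0, tau=0.5, delta=1e-3):
    """Logarithmic-barrier sketch for min 1/2 x'Qx + c'x s.t. Ax <= b.
    x0 must be strictly feasible (A x0 < b). Each barrier subproblem
    min f(x) - eps * sum_i log(b_i - A_i x) is solved by damped Newton,
    warm-started at the previous solution."""
    Q, c, A, b = (np.asarray(M, float) for M in (Q, c, A, b))
    x = np.asarray(x0, float)

    def phi(x, eps):                         # barrier objective
        s = b - A @ x
        if np.min(s) <= 0:
            return np.inf                    # outside int(Omega)
        return 0.5 * x @ Q @ x + c @ x - eps * np.log(s).sum()

    while len(b) * eps >= delta:             # stop when m * eps < delta
        for _ in range(100):                 # inner Newton loop
            s = b - A @ x                    # slacks, positive on int(Omega)
            grad = Q @ x + c + eps * A.T @ (1.0 / s)
            H = Q + eps * (A.T * (1.0 / s**2)) @ A
            step = np.linalg.solve(H, grad)
            if grad @ step < 1e-12:          # Newton decrement small: done
                break
            t = 1.0                          # backtracking keeps x in int(Omega)
            while phi(x - t * step, eps) > phi(x, eps) - 0.25 * t * (grad @ step):
                t *= 0.5
            x = x - t * step
        eps *= tau
    return x
```

When the loop stops, the duality gap bound gives f(x) − v(P) ≤ m ε < δ.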
Choice of starting point
How to find x^0 ∈ int(Ω)? Consider the auxiliary problem
    min_{x,s} s
    s.t. g_i(x) ≤ s  ∀ i = 1, …, m
- Take any x̄ ∈ R^n and find s̄ > max_{i=1,…,m} g_i(x̄)
  [(x̄, s̄) is in the interior of the feasible region of the auxiliary problem]
- Find an optimal solution (x*, s*) of the auxiliary problem using a barrier method starting from (x̄, s̄)
- If s* < 0 then x* ∈ int(Ω), else int(Ω) = ∅.
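For linear constraints Ax ≤ b this phase-I scheme can be sketched in Python/NumPy (the course uses MATLAB; the name `strictly_feasible_point` is illustrative). It runs the logarithmic barrier method on the auxiliary problem and, as a common shortcut, returns as soon as an iterate reaches s < 0, since any strictly feasible auxiliary point with s < 0 already satisfies Ax < b:

```python
import numpy as np

def strictly_feasible_point(A, b, eps=1.0, tau=0.5, delta=1e-4):
    """Phase-I sketch for linear constraints Ax <= b: log-barrier method on
    the auxiliary problem  min_{x,s} s  s.t.  A_i x - b_i <= s, started at
    x = 0 and s = max_i g_i(0) + 1, which is strictly feasible for it.
    Assumes the augmented rows (A_i, -1) span R^{n+1}. Returns x with
    Ax < b, or None if no strictly feasible point was found."""
    m, n = A.shape
    z = np.zeros(n + 1)
    z[-1] = np.max(-b) + 1.0                     # s > max_i g_i(0) = max_i(-b_i)
    cc = np.zeros(n + 1); cc[-1] = 1.0           # objective: min s
    Aa = np.hstack([A, -np.ones((m, 1))])        # constraints A_i x - s <= b_i

    def phi(z, eps):                             # barrier objective
        s = b - Aa @ z
        return np.inf if np.min(s) <= 0 else cc @ z - eps * np.log(s).sum()

    while eps >= delta:
        for _ in range(100):                     # damped Newton on the barrier
            s = b - Aa @ z
            grad = cc + eps * Aa.T @ (1.0 / s)
            H = eps * (Aa.T * (1.0 / s**2)) @ Aa
            step = np.linalg.solve(H, grad)
            if grad @ step < 1e-12:
                break
            t = 1.0                              # backtracking line search
            while phi(z - t * step, eps) > phi(z, eps) - 0.25 * t * (grad @ step):
                t *= 0.5
            z = z - t * step
            if z[-1] < 0:                        # s < 0: strict interior found
                return z[:n]
        eps *= tau
    return z[:n] if z[-1] < 0 else None          # None suggests int(Omega) empty
```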
Logarithmic barrier method
Exercise 7. Implement in MATLAB the logarithmic barrier method for solving the problem
    min (1/2) x^T Q x + c^T x
    s.t. Ax ≤ b
where Q is a positive definite matrix.

Exercise 8. Run the logarithmic barrier method with δ = 10^{−3}, τ = 0.5, ε_1 = 1 and x^0 = (1, 1) for solving the problem
    min (1/2)(x_1 − 3)^2 + (x_2 − 2)^2
    s.t. −2x_1 + x_2 ≤ 0
         x_1 + x_2 ≤ 4
         −x_2 ≤ 0