59
The Ongoing Impact of Interior-Point Methods Stephen Wright University of Wisconsin-Madison SIAM Conference on Optimization Toronto May 20, 2002 (Slightly expanded version, prepared May 27, 2002) 1

The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

  • Upload
    others

  • View
    12

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

The Ongoing Impact of Interior-Point Methods

Stephen WrightUniversity of Wisconsin-Madison

SIAM Conference on OptimizationToronto

May 20, 2002

(Slightly expanded version, prepared May 27, 2002)

1

Page 2: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

themes

Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all aspects of optimization: algorithms, theory, software, applica-tions, discrete and continuous, linear and nonlinear, convex and noncon-vex.

Research has moved beyond the “frenetic” stage into a phase of consoli-dation and maturity. Important developments continue to occur.

This talk gives some background, and presents a selection of highlightsfrom the past 10 years.

Disclaimer: Opinions are personal!

2

Page 3: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

interior-point methods have been used in:

• linear programming

• quadratic programming

. support vector machines

. optimal control, model predictive control

• monotone linear complementarity

• column generation algorithms for large LP

• analytic center methods

• network optimization / multicommodity flow

3

Page 4: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

• semidefinite programming:

. combinatorial optimization: approximations

. control

. cluster analysis and statistics

. structural optimization

• second-order cone programming

• branch-and-bound methods for integer and combinatorial optimization

• convex nonlinear programming, monotone nonlinear complementarity

• general nonlinear programming

• stochastic optimization

4

Page 5: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

background on interior-point methods: outline

• the problem classes:

. linear programming (LP)

. nonlinear programming (NLP)

. semidefinite programming (SDP)

. second-order cone programming (2OCP)

. conic programming (CP)

• primal methods: general framework

• primal-dual methods for LP and SDP

5

Page 6: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

linear programming (LP)

min cTx subject to Ax = b, x ≥ 0,

where x ∈ Rn, A ∈ Rm×n, etc. Dual:

max bTy subject to ATy + s = c, s ≥ 0,

where y ∈ Rm, s ∈ Rn. KKT (optimality conditions) can be expressed as

Ax− b = 0,

ATy + s− c = 0,

XSe = 0,

x ≥ 0, s ≥ 0,

whereX = diag(x1, x2, . . . , xn), S = diag(s1, s2, . . . , sn), e = (1,1, . . . ,1)T .

The equation XSe = 0 expresses complementarity, i.e.xisi = 0 for all i = 1,2, . . . , n.

6

Page 7: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

nonlinear programming (NLP)

min f(x) subject to c(x) = 0, h(x) ≥ 0,

where f : Rn → R, c : Rn → Rm, h : Rn → Rp are smooth functions.

KKT conditions: If x is a stationary point (no first-order feasible descentdirections) and a constraint qualification is satisfied, there are multipliersλ ∈ Rm and ρ ∈ Rp and a slack vector s ∈ Rp such that

∇f(x)−∇c(x)λ−∇h(x)ρ = 0,

c(x) = 0,

h(x)− s = 0,

siρi = 0, i = 1,2, . . . , p,

s ≥ 0, ρ ≥ 0.

7

Page 8: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

second-order cone programming (2OCP)

maxm∑i=1

cTi xi + diti

subject to

m∑i=1

Aixi + aiti = b,

‖xi‖2 ≤ ti, i = 1,2, . . . ,m,

where xi ∈ Rni and ti ∈ R, i = 1,2, . . . ,m.

Dual:

min bTy subject to ‖ATi y − ci‖2 ≤ aTi y − di, i = 1,2, . . . ,m.

8

Page 9: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

semidefinite programming (SDP)

min 〈C,X〉 subject to 〈Ai, X〉 = bi, i = 1,2, . . . ,m, X = XT ≥ 0,

• X ∈ Rn×n, Ai ∈ Rn×n symmetric, C ∈ Rn×n symmetric, bi ∈ R;

• Inner product: 〈F,G〉 = trace F TG =∑nj,k=1FjkGjk;

• X ≥ 0 indicates thatX is semidefinite.

Dual is:

max bTy subject tom∑i=1

yiAi + S = C, S = ST ≥ 0.

9

Page 10: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

SDP: optimality conditions

〈Ai, X〉 = bi, i = 1,2, . . . ,m,m∑i=1

yiAi + S = C,

〈X,S〉 = 0,

X ≥ 0, S ≥ 0.

10

Page 11: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

general form: conic programming (CP)

min 〈c, x〉 subject to Ax = b, x ∈ K, (1)

where Ax = b is an affine constraint, K is a cone.

• LP: K = {x ∈ Rn |x ≥ 0};

• SDP: K = {X |X = XT , X ≥ 0};

• 2OCP: Ki = {(xi, ti) | ‖xi‖2 ≤ ti}, K = K1 ×K2 × · · ·.

These are all convex, pointed cones, and special in another sense too...

Can even write NLP as a CP (but K not convex).

11

Page 12: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

NLP as a CP

We can even write NLP in the form (1). Given

min f(x) subject to h(x) ≥ 0,

define

K = {(x, t) | t ≥ 0, h(x/t) ≥ 0}.

Then constraint h(x) ≥ 0 can be expressed as

t = 1, (x, t) ∈ K.

(Of course, this K is not convex, in general.)

12

Page 13: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

primal interior-point methods

In the CP setting, define a barrier function F : int (K)→ R:

F (x)→∞ as x→ bdryK

Seek successive minima of

〈c, x〉+ µF (x) subject to Ax = b,

for a sequence of µ values, decreasing to zero.

Under reasonable conditions, the sequence of minimizers x(µ) approachesthe solution x∗ of the CP as µ ↓ 0.

Find the minimizers using some variant of Newton’s method (or SQP, ifconstraints Ax = b are present).

13

Page 14: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

self-concordant barrier functions

(Nesterov, Nemirovskii 1993): If the barrier function F is “self-concordant”,we can devise primal IP algorithms with nice properties.

• F is “not too nonquadratic”:

|F ′′′(x)hhh| ≤ 2(F ′′(x)hh

)3/2, all x ∈ int (K);

• “Newton decrement” is bounded:

〈F ′(x),[F ′′(x)

]−1F ′(x)〉 ≤ ϑ, for some ϑ ∈ R, all x ∈ int (K).

Moreover, it’s possible (in principle) to construct self-concordant barriersfor all convex, pointed cones with nonempty interiors.

14

Page 15: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

self-concordant barriers: examples

• K = {x ∈ Rn |x ≥ 0} (LP):

F (x) = −n∑i=1

logxi (ϑ = n).

• K = {X |X = XT , X ≥ 0} (SDP):

F (X) = − log det(X) (ϑ = n).

• K = {(x, t) ∈ Rn+1 | ‖x‖2 ≤ t} (2OCP):

F (x, t) = − log(t2 − ‖x‖2

2

)(ϑ = 2).

15

Page 16: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

primal interior-point method: general form

• start at point x0 ∈ int (K), µ = µ0 (possibly alsoAx0 = b);

• apply Newton-like method (with “enhanced” starting point? with line search?)to find approximate solution of 〈c, x〉+ µF (x) s.t. Ax = b;

• decrease µ and repeat.

Nesterov & Nemirovskii show that to reduce µ by a factor of ε, require

• O(√ϑ log(1/ε)) iterations for “short-step” methods (modest decreases

in µ at each iteration, single full Newton step);

• O(ϑ log(1/ε)) iterations for “long-step” methods (larger decreases inµ at each iteration, multiple Newton steps with line search).

16

Page 17: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

primal-dual interior-point methods for LP

Recall the KKT conditions:

Ax− b = 0,

ATy + s− c = 0,

XSe = 0,

x ≥ 0, s ≥ 0.

The first three conditions form a nonlinear system of 2n+m equations in2n+m unknowns.

Primal-dual methods apply a modified Newton’s method to this system,choosing modifications and step lengths to ensure that x > 0 and s > 0

are satisfied at all iterates.17

Page 18: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

central path

Many primal-dual methods use the concept of the central path C: set ofpoints (x(µ), y(µ), s(µ)) satisfying

Ax− b = 0,

ATy + s− c = 0,

XSe = µe, for µ > 0

x > 0, s > 0.

That is, feasible (x, y, s) with xisi = µ, all i = 1,2, . . . , n.

Connection to primal methods: x(µ) ∈ C is the same x(µ) that minimizesthe primal barrier problem, when F (x) = −

∑logxi.

18

Page 19: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

primal-dual search directions

Search directions obtained by modifying the complementarity equation asfollows:

XSe = σµe,

where µ = xT s/n and σ ∈ [0,1]. Calculate Newton step: A 0 00 AT IS 0 X

∆x

∆y∆s

= −

Ax− bATy + s− cXSe− σµe

.

19

Page 20: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

primal-dual algorithms for LP

path-following:

• short-step: modest decreases in µ (i.e. σ slightly less than 1), one fullNewton step at each iteration;

• long-step: large decrease in µ (i.e. σ closer to 0), use a line search tostay inside a (generous) nbd of C;

• feasible / infeasible: depends on whether conditionsAx = b, ATy + s = c are/are not satisfied at each iteration.

potential-reduction:

• generate steps in the same way, but use a primal-dual potential function tomonitor progress and choose step lengths.

20

Page 21: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

convergence of primal-dual algorithms for LP

• complexity

. needO(nτ log(1/ε)) iterations, where τ = .5,1,2 dependingon the algorithm

• rate

. superlinear variants of long-step algorithms make aggressive choicesof σ (closer to 0) on later iterations

. besides improving convergence rate, superlinear algorithms havebetter robustness

• practical algorithm

. Mehrotra (1992) (includes many enhancements)

21

Page 22: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

SDP central path

Recall optimality conditions:

m∑i=1

yiAi + S = C, (2a)

〈Ai, X〉 = bi, i = 1,2, . . . ,m, (2b)

〈X,S〉 = 0, (2c)

X ≥ 0, S ≥ 0. (2d)

Define central path C by replacing (2c) by

XS = µI, for µ > 0. (3)

22

Page 23: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

primal-dual methods for SDP

Can’t apply Newton-like method to (2b), (2a), (3), because this system isnot square!

• variables: (X, y, S) ∈ Sn ×Rm × Sn;

• equations: Sn ×Rm ×Rn×n, becauseXS is not symmetric.

Restate complementarity condition in a symmetric form, to recover a “square”system.

Example: “AHO” method uses XS + SX = 2µI.

Search directions calculated similarly to LP primal-dual methods (but thelinear algebra is more complicated).

23

Page 24: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

selected highlights

1. semidefinite programming algorithms

2. primal-dual algorithms and symmetric cones

3. geometry of the central path in LP

4. nonlinear programming algorithms

5. software tools for LP, NLP, SDP, 2OCP, QP

6. stable linear algebra

7. classification of self-scaled barriers

24

Page 25: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

1. primal-dual semidefinite programming algorithms

• use different symmetrizations of the complementarity condition

• MZ family: For nonsingular P , define

PXSP−1 + P−TSXP T = 2µI

. AHO: P = I

. H..K..M: P = S1/2; dual H..K..M : P = X−1/2;

. NT:P = W−1/2, whereW = X1/2(X1/2SX1/2)−1/2X1/2

• many others—(Todd, 1999) lists twenty!

25

Page 26: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

convergence of SDP algorithms

primal-dual, path-following.

• similar to the analogous LP algorithms: choice of σ, definition of neighbor-hood, and Mehrotra heuristics for practical codes

• short-step and long-step variants; convergence inO(nτ log(1/ε)) iter-ations, where τ = .5,1,1.5

• infeasible variants require onlyX > 0 and S > 0 at starting point

• superlinearly convergent variants

See Todd in Acta Numerica (2001); Wolkowicz, Saigal, Vandenberghe(eds.) Handbook of Semidefinite Programming (2000)

26

Page 27: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

SDP least-squares approaches

Alternatively, don’t bother symmetrizing. Treat the central path conditions:

m∑i=1

yiAi + S = C, (4a)

〈Ai, X〉 = bi, i = 1,2, . . . ,m, (4b)

XS = µI (4c)

as an overdetermined system (zero residual at the solution). Apply a formof Gauss-Newton. (Kruk et al., 1997)

(Wolkowicz, 2001): Find particular solution X of (4b), and a linear parametriza-tion N : Rn(n+1)/2−m → null(A) of the null space of (4b); expressX = X +N (x).

27

Page 28: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

Substitute forX and S in (4c) to obtain a reduced, overdetermined system:

Fµ(x, y) = (X +N (x))

C − m∑i=1

yiAi

− µI = 0.

(Require X + N (x) ≥ 0, C −∑mi=1 yiAi ≥ 0 at most iterates and at

solution.)

• apply Gauss-Newton, with preconditioning;

• long-step strategy for decreasing µ;

• eventually “cross over” and set µ = 0;

• results obtained for max-cut problem: (4b) is diag(X) = (1,1, . . . ,1)T .

28

Page 29: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

dual barrier algorithms for SDP

max bTy subject tom∑i=1

yiAi + S = C, S = ST ≥ 0,

yields dual barrier subproblem:

max f(y, S;µ)def= bTy + µ log det(S) subject to (5)∑m

i=1 yiAi + S = C, S = ST > 0.

or alternatively

max g(y;µ)def= bTy + µ log det

C − m∑i=1

yiAi

. (6)

Assume that constraints include “diag(X) = given” or “trace(X) = given”.

29

Page 30: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

• (Burer et al., 1998, 2001) find a reparametrization that maintains S =ST ≥ 0

. substitute forS to obtain barrier function with only nonnegativity con-straints;

. use steepest descent to solve it.

• (Kojima et al., 2001) use both forms (5), (6)

. Newton or BFGS to solve the barrier subproblem;

. preconditioned CG to calculate predictor for each new subproblem.

• gradients can be calculated efficiently, sparsity exploited.

30

Page 31: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

2. self-scaled cones

(a.k.a. homogeneous self-dual cones, symmetric cones)

K in the conic program

min 〈c, x〉, subject to Ax = b, x ∈ K,is self-scaled if it is convex, self-dual (K = K∗), pointed, has nonemptyinterior, and admits a self-concordant barrier function F satisfying

• F (τx) = F (x)− ϑ log τ for all x ∈ intK , τ > 0;

• F ′′(v)x ∈ intK for all x, v ∈ intK ;

• F∗(F ′′(v)x) = F (x)− 2F (v)− ϑ.

(Nesterov & Todd, 1997, 1998)

31

Page 32: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

primal-dual algorithms for self-scaled cones

For any x, s ∈ intK, there is a unique scaling point w such that F ′′(w)x =s. (Critical in development of the theory and algorithms.)

Primal-dual algorithms for CP on self-scaled cones use search directionssatisfying A 0 0

0 A∗ IF ′′(w) 0 I

∆x

∆y∆s

= −

00

F ′(x) + 1σµs

,from feasible (x, y, s), for µ = 〈x, s〉.

This is the NT direction (already encountered in context of SDP). It’s goodin practice.

32

Page 33: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

self-scaled cones: notes

• self-scaled cones provide an vehicle for extending primal-dual LP algo-rithms to a larger class of conic programming problems.

• they include the cones from LP, 2OCP, SDP (and their Cartesian products)—but not much else.

• they had been well studied already by analysts, differential geometers.

• contributions also from Guler (1996) and Faybusovich (1997) (exploredconnections with Jordan algebras).

• see new monograph of Renegar (2001).

33

Page 34: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

3. geometry of the central path in LP

(Vavasis & Ye, 1994)

Recall the system defining central path points (x(µ), y(µ), s(µ)):

Ax− b = 0,

ATy + s− c = 0,

XSe = µe, (xisi = µ, i = 1,2, . . . , n)

x > 0, s > 0,

The geometry of C is relevant to path-following methods. Sharp turns,highly nonlinear behavior make C hard to follow.

34

Page 35: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

Vavasis-Ye observations

• C consists of “nearly straight” segments (easy to traverse) with possiblysharp turns in between.

• turns are associated with crossover events: values of µ below which sidefinitely larger than sj for some index pair (i, j);

• define a “layered step” algorithm to follow the path efficiently

. traverse nearly straight segments in one step;

. use short steps to round the turns.

35

Page 36: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

simple example of a twisted path

Given ε > 0:

min x0 subject to x0 ≥ εixi, 0 ≤ xi ≤ 1, i = 1,2, . . . , n,

Solution (x∗0, x∗1, . . . , x

∗n) = (0,0, . . . ,0).

36

Page 37: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

x0

x1

x2

C

37

Page 38: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

results from LP solvers

n = 20, ε = 0.1. Turn off presolve, scaling, crossover to simplex.

• MOSEK:Iter. 7 8 9 10 11 12 13 14logµ −1.2 −2.2 −3.2 −4.2 −5.1 −6.2 −7.3 −8.4

x∗ ≈ (10−11,10−10,10−9,10−8,10−7,10−6,10−5,10−4,10−3, .0024, .39, .49, .5, .5, . . .)

• PCx:Iter. 1 2 3 4 5 6 7 8logµ −2.6 −3.7 −4.8 −5.8 −6.8 −7.8 −8.8 −10.1

x∗ ≈ (10−8,10−7,10−6,10−5,10−4,10−3,10−3, .40, .49, .5, .5, . . .)

Each iteration resolves a single component.

38

Page 39: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

results from LP solvers

n = 20, ε = 0.5. Turn off presolve, scaling, crossover to simplex.

• MOSEK:Iter. 7 8 9 10 11 12logµ −0.7 −1.5 −2.3 −3.0 −4.0 −6.3

x∗ ≈ (2× 10−10,3× 10−10,1× 10−9, . . . ,8× 10−5,1× 10−4)

• PCx:Iter. 4 5 6 7 8 9 10logµ −4.7 −5.4 −6.2 −6.9 −7.7 −8.6 −12.0

x∗ ≈ (4× 10−12,7× 10−11,1× 10−10, . . . ,3× 10−5,2× 10−5)

Each iteration resolves multiple components.

39

Page 40: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

consequences of the geometry

• In theory, superlinear convergence doesn’t happen until the final almost-straight leg (basic and nonbasic indices identified);

. Might happen only when µ is very small

• Practical algorithms can step past many corners of C at once, providedthey are not too sharp.

. Can we get “semi-local” convergence results that yield fast conver-gence beyond the final leg?

40

Page 41: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

4. nonlinear programming algorithms

Consider the problem with only inequality constraints, slacks added:

min f(x) subject to h(x)− s = 0, s ≥ 0,

where f : Rn → R, h : Rn → Rp are smooth functions.

Barrier form:

min f(x)− µp∑

i=1

log si subject to h(x)− s = 0.

KKT conditions: There is ρ ∈ Rp such that

∇f(x)−∇h(x)ρ = 0,

h(x)− s = 0,

siρi = 0, i = 1,2, . . . , p,

s ≥ 0, ρ ≥ 0.

41

Page 42: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

primal-dual approach

As in LP, can obtain steps by applying Newton’s method to perturbationsof the equality conditions in the KKT system: W (x, ρ) −∇h(x) 0

∇h(x)T 0 −I0 S R

∆x

∆ρ∆s

= −

∇f(x)−∇h(x)ρh(x)− sSRe− σµe

.

For convex problems, can define “infeasible path-following” algorithms basedon these steps, that maintain (s, ρ) > 0. Nice properties (global conver-gence, fast asymptotic convergence). (Ralph, Wright, others).

For nonconvex problems, need much more complex algorithmic strategiesto ensure global convergence. (Wachter-Biegler example.)

42

Page 43: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

primal-dual modifications

• line search for merit function: primal barrier, `1, augmented Lagrangian

. doesn’t ameliorate Wachter-Biegler

• don’t insist on satisfaction of linearized equality constraint∇h(x)T∆x−∆s = −h(x) + s

. relax and penalize instead (Tits et al.)

. require only a fraction of the best possible improvement in feasibility(Byrd et al., KNITRO)

• use line search filter approach (Wachter)

. use two merit functions: ‖h(x)− s‖ and primal barrier;

. includes feasibility restoration steps, second-order correction.

43

Page 44: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

combining primal and primal-dual approaches

KNITRO (Byrd, Gilbert, Hribar, Liu, Nocedal, Waltz): Find approximatesolution of a trust-region SQP-like subproblem for the primal barrier:

min ∇f(x)T∆s+ 12∆xTW (x, ρ)∆x−µeTS−1∆s+ 1

2∆sT (S−1R)∆s

h(x) +∇h(x)T∆x− s−∆s = rh

∆s ≥ −τs,‖(∆x, S−1∆s)‖2 ≤ ∆k.

rh close to the smallest vector that makes this problem feasible (obtainedby solving for a “normal step” that minimizes

‖rh‖2 = ‖h(x) +∇h(x)T∆x− s−∆s‖2subject to trust-region constraints.

44

Page 45: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

5. software

A large range of interior-point software tools is now available.

Mittlemann’s optimization software benchmarks give battery-test perfor-mance data, including comparisons with non-interior-point competitors.

45

Page 46: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

software: linear programming

All based on Mehrotra’s predictor-corrector method, Gondzio correctors.Most use sparse Cholesky factorization of the A(XS−1)AT formulation.

Main difference between codes is in the sparse linear algebra.

Preprocessing, crossover to simplex, linear algebra modifications also im-prove efficiency and robustness.

In general, highly competitive with simplex codes on larger problems.

CPLEX Barrier, XPRESS-MP Barrier, MOSEK, PCx, BPMPD, HOPDM.

46

Page 47: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

software: quadratic programming

Convex: Primal-dual codes have similar algorithmic heuristics to LP, butrequire a symmetric indefinite solver (rather than a Cholesky).

CPLEX Barrier, XPRESS-MP Barrier, MOSEK.

Object-oriented code: OOQP. C++; default version uses MA27, MA47 fac-torizers; can incorporate specialized linear algebra modules.

Nonconvex: Can be solved using NLP codes (LOQO, KNITRO).

Specialized code: Galahad/QPB (two-phase, primal trust-region algorithmwith primal-dual scaling, preconditioned CG/Lanczos linear algebra).

47

Page 48: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

software: semidefinite programming

• primal-dual codes (typically implement H..K..M, NT search directions)

. SDPT3: Matlab/C, exploits sparsity;

. SeDuMi: Matlab/C, product-form storage of X and S promotes nu-merical stability;

. CDSP: C, modified Cholesky;

. SDPA: C++, exploits sparsity;

. PENNON, numerous others...

• dual codes (avoid storage of potentially denseX)

. DSDP

. BMZ, BMPR48

Page 49: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

semidefinite programming software issues

• input format: nice ones available

• homogeneous and self-dual reformulations;

• starting points

• exploiting structure (sparse, block diagonal) is critical to efficiency

• linear algebra software used:

. Meschach

. Ng-Peyton / modified Cholesky

. Matlab sparse matrix routines

. LAPACK49

Page 50: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

software: second-order cone programming

• MOSEK: C, presolver, self-dual formulation, NT search direction. Also sup-ports “rotated” second-order cones.

• SDPT3: see above.

• SeDuMi: see above.

50

Page 51: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

software: nonlinear programming

Interior-point codes are proving to be highly competitive with codes basedon other algorithms.

• IPOPT (www.coin-or.org ): line search, filter, preconditioned CG.

• KNITRO (www.ziena.com ): trust-region, primal barrier/SQP with primal-dual scaling, preconditioned CG.

• LOQO (orfe.princeton.edu/ ∼loqo/ ): primal-dual, direct fac-torization of a regularized KKT matrix.

51

Page 52: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

6. linear algebra

• robust, efficient linear algebra essential to practical effectiveness

• structure and size of linear systems to be solved typically the same at alliterations

. easier to exploit structure, sparsity than in other approaches

. easier to write an efficient code

• ill conditioning may cause concern, particularly near the solution

52

Page 53: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

stable linear algebra for LP

Formulate search direction linear system as either “augmented system:”[−X−1S AT

A 0

] [∆x∆y

]=

[rbrc

]or “normal equations:”

A(XS−1)AT∆y = r,

where X, S are positive diagonal (but X−1S increasingly ill conditionednear the solution).

Even when A has full rank, A(XS−1)AT may become ill conditioned onlater iterations—may lose numerical positive definiteness.

How to deal with the small or negative pivots that arise during Choleskyfactorization?

53

Page 54: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

• simple fixes often work:

. replace the offending diagonal with∞;

. replace the diagonal, and its row/column, and the correspondingright-hand side element, by 0.

• can show that these measures (and roundoff) may induce a large error in(∆x,∆y,∆s), but still yield a good step for the algorithm.

. errors are in a subspace that doesn’t matter much (S. Wright, 1996)

• more sophisticated fix: product-form Cholesky (Goldfarb & Scheinberg,2000).

54

Page 55: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

stable linear algebra for NLP

Compute Newton-like steps for

∇f(x)−∇h(x)ρ = 0,

h(x)− s = 0,

siρi = 0, i = 1,2, . . . , p

from points (x, s, ρ) with (s, ρ) > 0.

Usually use an “augmented system” or a “condensed” form of the stepequations, which approaches a singular limit.

Direct factorization approaches applied to these formulations yield reason-able steps for the primal-dual, until µ ≈

√u.

(M. Wright, 1997; S. Wright, 1998)

55

Page 56: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

7. classification of self-scaled barriers

(Schmieta, Hauser, Guler, Lim, 1999-2001)

• self-scaled cones were classified (by their identity with symmetric cones),but self-scaled barriers were not

• for each ofRn+, Sn, second-order cone, the self-scaled barrier function F

with smallest possible ϑ is well known. Is it unique?

. YES! (up to an additive constant)

56

Page 57: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

proof via Jordan algebras

self-scaled barrier for K

↓Euclidean Jordan algebra J for which K is the cone of squares

↓the self-scaled barrier for J : − log detx+ constant.

• technique for deriving J from K and a self-scaled barrier forK ;

• technique for generating the self-scaled barrier from J :

− log detx+ constant.57

Page 58: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

topics for further investigation

• NLP algorithms

. more computation-guided development of algorithms and theory

. practical algorithms appear to tie together many ideas, new and old(SQP, trust-region, filter, linear algebra of different types,...)

• linear algebra / software implementations

. SDP issues;

. NLP issues: iterative vs. direct linear algebra

• structured applications in emerging areas

. bioinformatics?58

Page 59: The Ongoing Impact of Interior-Point Methodspages.cs.wisc.edu/~swright/talks/siopt_talk_may02.pdf · Since 1984 (Karmarkar), interior-point techniques and ideas have per-vaded all

interior-point research has changed optimization

• LP software much better than in 1984

• opened a new line of research in practical NLP algorithms

• cross-fertilization between optimization disciplines

• semidefinite programming!

59