Nonlinear Programming / Computation, Applications, Software

Continuous Optimization: Recent Developments and Applications

Stephen Wright
Department of Computer Sciences, University of Wisconsin-Madison

SIAM, Portland, July, 2004
Stephen Wright Continuous Optimization: Recent Developments
Nonlinear Programming
I Background
I Filter Methods
I Interior-Point Methods
I Degenerate NLP
I MPECs: Equilibrium Constraints

Computation, Applications, Software
I Optimization on the Grid
I New Applications
I Software Tools
I Optimization Modeling Languages
I The Internet
Discuss developments within the past 10 years (though the origins go further back in some instances).
Nonlinear Programming: Themes
I Nonlinear Programming (NLP) continues to provide fertile ground for research into algorithms and software.
I New ideas;
I Continual reevaluation of old ideas.
I NLP is a broad paradigm encompassing many pathological situations.
I Difficult to design robust algorithms;
I Relative performance of algorithms/software varies widely between problems.
I Applications are becoming more plentiful and diverse, and are driving developments in algorithms and software.
Background
Focus mainly on the inequality constrained problem.
min f(x) subject to c(x) ≥ 0,

where f : IR^n → IR and c : IR^n → IR^m are assumed smooth.
I Algorithms usually find local solutions (not global);
I May converge to less desirable points.
Constraint Qualifications
Since NLP algorithms work with approximations based on the algebraic representation of the feasible set, we need conditions to ensure that such approximations capture the essential geometry of the set: Constraint Qualifications.
Define A∗ def= {i = 1, 2, . . . , m | ci(x∗) = 0}.
I Linear Independence (LICQ): The gradients ∇ci(x∗), i ∈ A∗, are linearly independent.

I Mangasarian-Fromovitz (MFCQ): There is a direction d such that ∇ci(x∗)^T d > 0 for all i ∈ A∗.
[Figure: two feasible sets and their linear approximations: one where MFCQ is satisfied but not LICQ, and one where no CQ is satisfied.]
First-Order Necessary Conditions
Karush-Kuhn-Tucker (KKT): If x∗ is a local minimizer and a CQ is satisfied at x∗, there is λ∗ ∈ IR^m such that

∇f(x∗) = Σ_{i=1}^m λ∗_i ∇ci(x∗),
0 ≤ c(x∗) ⊥ λ∗ ≥ 0.

Complementarity ⊥ means that (λ∗)^T c(x∗) = 0, that is,

λ∗_i = 0 or ci(x∗) = 0, for all i = 1, 2, . . . , m.
“Objective gradient is a conic combination of active constraint gradients.”
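As a concrete check, the two KKT conditions can be evaluated numerically. This is a minimal NumPy sketch (function and variable names are illustrative, not from the talk) that measures how far a primal-dual pair is from satisfying them:

```python
import numpy as np

def kkt_residual(grad_f, jac_c, c_val, lam):
    """Residual of the KKT conditions for min f(x) s.t. c(x) >= 0.

    grad_f: gradient of f at x, shape (n,)
    jac_c:  Jacobian of c at x, shape (m, n); row i is the gradient of c_i
    c_val:  constraint values c(x), shape (m,)
    lam:    multiplier estimate, shape (m,)
    """
    stationarity = grad_f - jac_c.T @ lam
    # min(lam, c(x)) = 0 componentwise encodes 0 <= c(x) ⊥ lam >= 0
    complementarity = np.minimum(lam, c_val)
    return np.linalg.norm(np.concatenate([stationarity, complementarity]))

# Toy check: min x^2 s.t. x - 1 >= 0 has x* = 1 with lambda* = 2
x = 1.0
r = kkt_residual(np.array([2 * x]), np.array([[1.0]]),
                 np.array([x - 1.0]), np.array([2.0]))  # r == 0.0
```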
Optimality conditions can be expressed in terms of the Lagrangian function, which is a linear combination of the objective and constraints, with Lagrange multipliers as coefficients:
L(x, λ) = f(x) − λ^T c(x).
Can write KKT conditions as
∇xL(x∗, λ∗) = 0,
0 ≤ c(x∗) ⊥ λ∗ ≥ 0.
Second-Order Conditions
There is a critical cone of feasible directions C for which first-derivative information alone cannot determine whether f increases or decreases.

We identify “tiebreaking” conditions based on second derivatives to ensure increase in f along these directions.
Second-Order Sufficient: If x∗ satisfies KKT for some λ∗, and
d^T ∇²xx L(x∗, λ∗) d > 0, for all d ∈ C with d ≠ 0,
then x∗ is a strict local solution.
A Basic Algorithm: SQP for Inequality Constraints

No “outer loop.” From current iterate x, obtain step ∆x by solving:

min ∇f(x)^T ∆x + ½ ∆x^T H ∆x subject to c(x) + ∇c(x)^T ∆x ≥ 0.

H typically contains curvature information about both objective and active constraints; the “ideal” choice is the Hessian of the Lagrangian:

∇²xx L(x, λ) = ∇²f(x) − Σ_i λ_i ∇²ci(x).
I H can also be approximated using a quasi-Newton technique.
I ∆x modified via line-search and trust-region variants.
I Merit function often used to determine acceptability of step.
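The codes discussed later implement SQP with considerable care; as a hedged illustration only, SciPy's generic SLSQP solver (itself an SQP line-search method with a quasi-Newton H) can be run on an assumed toy instance of the inequality-constrained problem:

```python
import numpy as np
from scipy.optimize import minimize

# Toy instance of min f(x) s.t. c(x) >= 0:
# minimize (x0 - 2)^2 + (x1 - 1)^2 subject to 1 - x0^2 - x1^2 >= 0.
f = lambda x: (x[0] - 2) ** 2 + (x[1] - 1) ** 2
cons = [{"type": "ineq", "fun": lambda x: 1 - x[0] ** 2 - x[1] ** 2}]

# SLSQP solves a quadratic model of the problem at each iterate, as above.
res = minimize(f, x0=np.zeros(2), method="SLSQP", constraints=cons)
# res.x is (approximately) the projection of (2, 1) onto the unit disk,
# i.e. (2, 1)/sqrt(5).
```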
SQP for Equality Constraints

If all constraints are equalities:

min f(x) subject to h(x) = 0,   (1)

get the SQP step by solving

min ∇f(x)^T ∆x + ½ ∆x^T H ∆x subject to h(x) + ∇h(x)^T ∆x = 0.

When H = ∇²xx L(x, µ) and second-order conditions are satisfied, the same ∆x can be obtained by applying Newton’s method for algebraic equations to the KKT conditions for (1), which are

∇x L(x, µ) = ∇f(x) − ∇h(x)µ = 0,
h(x) = 0.
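A minimal sketch of this Newton-on-KKT view, on an assumed toy problem (quadratic f, linear h), where the QP model is exact and one step solves the problem:

```python
import numpy as np

# Newton on the KKT conditions of min f(x) s.t. h(x) = 0, for the toy
# problem f(x) = x0^2 + x1^2, h(x) = x0 + x1 - 1 (solution x* = (0.5, 0.5)).
def sqp_equality(x, mu, iters=5):
    for _ in range(iters):
        grad_f = 2 * x                      # ∇f(x)
        A = np.array([[1.0, 1.0]])          # ∇h(x)^T (constant: h is linear)
        H = 2 * np.eye(2)                   # ∇²xx L (exact: f is quadratic)
        # KKT system: [H  -A^T; A  0] [Δx; Δμ] = -[∇f - A^T μ; h]
        KKT = np.block([[H, -A.T], [A, np.zeros((1, 1))]])
        rhs = -np.concatenate([grad_f - A.T @ mu, [x.sum() - 1.0]])
        step = np.linalg.solve(KKT, rhs)
        x, mu = x + step[:2], mu + step[2:]
    return x, mu

x_star, mu_star = sqp_equality(np.array([3.0, -2.0]), np.array([0.0]))
# One Newton step suffices here; further steps are zero.
```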
1. Filter Methods: Background
Many NLP algorithms use a merit function to gauge progress and to decide whether to accept a candidate step. Usually a weighted combination of objective value and constraint violation, e.g.

P(x; ν) = f(x) + ν r(c(x)), where r(c(x)) def= Σ_{i=1}^m max(0, −ci(x)) = ‖max(0, −c(x))‖₁.

Choosing the penalty parameter ν may be tricky.
I Too small: P(x; ν) unbounded below;
I Too large: Slow progress along the boundary of Ω.

Seek a “minimalist” approach that doesn’t require choice of a parameter, and rejects good steps less often than does P.
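The ℓ1 merit function itself is one line of NumPy. This sketch (names illustrative) shows a violation of 0.5 being charged at rate ν:

```python
import numpy as np

def l1_merit(f_val, c_val, nu):
    """P(x; nu) = f(x) + nu * ||max(0, -c(x))||_1 for constraints c(x) >= 0."""
    return f_val + nu * np.maximum(0.0, -c_val).sum()

# At x = -0.5, the constraint x >= 0 is violated by 0.5, charged at rate nu:
x = -0.5
p = l1_merit(x ** 2, np.array([x, 1 - x]), nu=10.0)  # 0.25 + 10 * 0.5 = 5.25
```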
(Fletcher & Leyffer, 1997): Filter methods treat NLP as a two-objective optimization problem:

I minimize f;
I minimize r(c(x)) = ‖max(0, −c(x))‖₁ (but we need a zero min value).

Construct a filter from the f(xk) and r(c(xk)) values visited, e.g.
[Figure: filter entries plotted with infeasibility r(c(x)) on the horizontal axis and f on the vertical axis.]
Filter-SQP
Subproblem is a trust-region SQP:

min ∇f(x)^T ∆x + ½ ∆x^T H ∆x subject to c(x) + ∇c(x)^T ∆x ≥ 0, ‖∆x‖∞ ≤ ρ,

where ρ is the radius of the trust region.

Includes a “restoration” phase to recover a (nearly) feasible point (minimize r(c(x)), ignoring f(x)) if the QP is infeasible.
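A filter acceptance test can be sketched in a few lines. This is a simplified version with an assumed margin parameter gamma, not the exact envelope condition used in filterSQP:

```python
def acceptable(f_new, r_new, filter_pairs, gamma=1e-3):
    """A trial point passes if, against every filter entry (f_k, r_k), it
    improves either the objective or the infeasibility by a small margin."""
    return all(f_new <= fk - gamma * rk or r_new <= rk - gamma * rk
               for fk, rk in filter_pairs)

filt = [(1.0, 0.5), (0.8, 0.9)]
ok = acceptable(0.95, 0.1, filt)   # beats each entry in at least one measure
bad = acceptable(1.2, 0.6, filt)   # dominated by the entry (1.0, 0.5)
```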
Many enhancements have been added to yield good theoretical behavior, and to deal effectively with some difficult problems.

I Second-order correction: Accounts better for curvature in constraints by solving

min ∇f(x)^T ∆x + ½ ∆x^T H ∆x subject to c(x + ∆x̄) + ∇c(x)^T (∆x − ∆x̄) ≥ 0, ‖∆x‖∞ ≤ ρ,

where ∆x̄ is the previous guess of ∆x.

I Use the Lagrangian in place of f ; allows rapid local convergence.

I Sufficient decrease conditions: Accept a candidate only if it is a significant improvement over the current filter.
I Allow for non-global and inexact solution of the subproblem.
I Others...
Following the original proposal by Fletcher and Leyffer, contributors to filter-SQP theory have included Gould, M. Ulbrich, S. Ulbrich, Toint, Wächter, and others.

The filter approach is also used in conjunction with interior-point methods (see below).
Recent codes using filters:
I FilterSQP (Fletcher and Leyffer; SQP/filter)
I IPOPT (Wachter and Biegler; interior-point/filter).
2. Interior-Point Methods: Background

Logarithmic barrier: Choose parameter µ; x(µ) solves

Bx(µ) :  min_x P(x; µ) def= f(x) − µ Σ_{i=1}^m log ci(x)   (c(x) > 0).
Under certain assumptions, have x(µ)→ x∗ as µ→ 0.
Optimality condition:

∇f(x) − Σ_{i=1}^m (µ / ci(x)) ∇ci(x) = 0.
[Figure: central path, with barrier minimizers x(4), x(1), x(.3), x(.1) approaching the solution as µ decreases.]
Barrier-Unconstrained
k ← 1; set µ0 > 0; set x0;
repeat
  Solve Bx(µk) for x(µk), starting from xk−1;
  Choose µk+1 ∈ (0, µk);
  Choose next starting point xk;
  k ← k + 1;
until convergence.
Newton steps obtained from:
∇²xx P(x; µ) ∆x = −∇x P(x; µ).
Primal barrier was one of the original methods for NLP (Frisch, ’55; Fiacco & McCormick, ’68). It fell into disuse for various reasons, including severe ill-conditioning of the Hessian ∇²xx P(x; µ). (Many difficulties later turned out to be fixable.)
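A toy illustration of the primal barrier loop on a one-variable problem (this safeguarded Newton iteration is a sketch, not any production method); the minimizers x(µ) march toward x∗ = 1 as µ shrinks:

```python
import numpy as np

# Primal barrier for min x^2 s.t. x - 1 >= 0 (solution x* = 1):
# P(x; mu) = x^2 - mu * log(x - 1), minimized by safeguarded Newton.
def barrier_min(mu, x=2.0, iters=50):
    for _ in range(iters):
        g = 2 * x - mu / (x - 1)          # ∇P(x; mu)
        h = 2 + mu / (x - 1) ** 2         # ∇²P(x; mu): blows up near x = 1
        step = g / h
        while x - step <= 1.0:            # keep the iterate strictly feasible
            step *= 0.5
        x -= step
    return x

# x(mu) -> 1 from the interior as mu decreases:
path = [barrier_min(mu) for mu in (1.0, 0.1, 0.01)]
```

Here x(µ) has the closed form (1 + sqrt(1 + 2µ))/2, so the computed path can be checked exactly.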
Barrier-EQP
Introduce a slack variable w :
min f (x) subject to c(x)− w = 0, (w > 0).
Obtain x(µ) by solving an equality-constrained problem:

Bw(µ) :  min f(x) − µ Σ_{i=1}^m log wi, subject to c(x) − w = 0, (w > 0).
Bx(µ) and Bw (µ) have identical solutions, but obtained differently:
I unconstrained algorithms are applied to Bx(µ)
I equality-constrained algorithms are applied to Bw(µ) (these don’t maintain feasibility of c(x) − w = 0 at every iteration).

Recent successful codes take their theoretical underpinnings from solving Bw(µk) for a sequence of µk ↓ 0.
Applying SQP to Bw(µ)
KKT conditions for Bw (µ):
∇f (x)−∇c(x)λ = 0,
c(x)− w = 0,
−µW−1e + λ = 0, (λ > 0, w > 0),
where W = diag(w1,w2, . . . ,wm), e = (1, 1, . . . , 1)T .
Get the pure SQP step by applying Newton’s method to the KKT conditions:

[ ∇²L(x, λ)   −∇c(x)   0       ] [ ∆x ]     [ ∇x L(x, λ)      ]
[ ∇c(x)^T      0       −I      ] [ ∆λ ]  = −[ c(x) − w        ]
[ 0            I       µW^{-2} ] [ ∆w ]     [ −µW^{-1}e + λ   ]
Eliminate ∆w to get a more tractable form:

[ ∇²L(x, λ)   −∇c(x)    ] [ ∆x ]     [ ∇x L(x, λ)    ]
[ ∇c(x)^T     µ^{-1}W²  ] [ ∆λ ]  = −[ r(x, w, λ, µ) ]
(Byrd, Gilbert, Nocedal, ’00): Apply a customized SQP trust-region algorithm to Bw(µ).

(Wächter-Biegler, ’02): Apply a filter-SQP method to Bw(µ), with
I objective f(x) − µ Σ log wi;
I infeasibility measure ‖c(x) − w‖₂.

However, the theory isn’t the whole story! Explicit manipulation of the dual variables λ is key to making these algorithms effective in practice.
Primal-Dual Approach
Transform the last KKT condition for Bw(µ) by multiplying by W; then apply a Newton-like method for algebraic equations to
∇f (x)−∇c(x)λ = 0,
c(x)− w = 0,
Wλ− µe = 0, (λ > 0, w > 0),
Choose search directions and steplengths to
I maintain λ > 0, w > 0;
I make progress toward the solution (according to certainmeasures).
Newton equations for the primal-dual step, after elimination of ∆w:

[ ∇²L(x, λ)   −∇c(x)   ] [ ∆x ]     [ ∇x L(x, λ)    ]
[ ∇c(x)^T     Λ^{-1}W  ] [ ∆λ ]  = −[ r(x, w, λ, µ) ]
Take steps jointly in (x, w, λ); maintain positivity of w and λ. Choose αx, αw, αλ in (0, 1) (not necessarily the same!) and set
x ← x + αx∆x ,
w ← w + αw∆w > 0,
λ ← λ + αλ∆λ > 0.
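The steplength choice is usually a fraction-to-boundary rule; a sketch, with the typical (but here assumed) parameter τ = 0.995:

```python
import numpy as np

def fraction_to_boundary(v, dv, tau=0.995):
    """Largest alpha in (0, 1] with v + alpha * dv >= (1 - tau) * v,
    the usual rule for keeping w and lambda strictly positive."""
    neg = dv < 0
    if not np.any(neg):
        return 1.0
    return min(1.0, tau * np.min(v[neg] / -dv[neg]))

w = np.array([1.0, 0.5])
dw = np.array([0.2, -1.0])              # second component heads for the boundary
alpha_w = fraction_to_boundary(w, dw)   # 0.995 * 0.5 = 0.4975
```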
Primal-Dual Discussion
Primal-dual approaches for linear programming were well known by ’87 and were the practical approach of choice by ’90.

Extension to NLP was obvious in principle, but the devil was in the details! Many enhancements were needed to improve practical robustness and efficiency.

The codes LOQO (Benson, Shanno, Vanderbei), KNITRO (Byrd et al.) and IPOPT (Wächter & Biegler) all generate steps of primal-dual type but differ in important respects.

KNITRO and IPOPT both derive their theoretical justification from the Barrier-EQP approach, but replace µ^{-1}W² in the reduced system by Λ^{-1}W. (IPOPT checks explicitly that Λ^{-1}W does not stray too far from µ^{-1}W².)
Software

IPOPT (Wächter, Biegler, ’04): Line-search filter method based on Bw(µ), but with primal-dual steps.

Solves the step equations using a direct solver (MA27) for symmetric-indefinite systems, adding perturbations to the diagonal to “correct” the inertia if necessary.

LOQO uses direct factorization of the reduced step equations, with additions to the (1,1) block to fix the inertia.

KNITRO: Trust-region algorithm for Bw(µ), with “primal-dual” scaling (Λ^{-1}W). Merit function.

Solves the step equations using an iterative method: projected conjugate gradient.

KNITRO-Direct: Also uses direct solution of the step equations and a line search on some iterations.
3. Degenerate NLP

Reminder of optimality conditions:

∇f(x∗) = Σ_{i=1}^m λ∗_i ∇ci(x∗),
0 ≤ c(x∗) ⊥ λ∗ ≥ 0.

Given the active constraints

A∗ def= {i = 1, 2, . . . , m | ci(x∗) = 0},

we can rewrite the optimality conditions as

∇f(x∗) − Σ_{i∈A∗} λ∗_i ∇ci(x∗) = 0,
ci(x∗) = 0, for all i ∈ A∗

(since by complementarity we have λ∗_i = 0 for i ∉ A∗).
Local convergence analysis of algorithms usually assumes:

I Linear independence of active constraints: ∇ci(x∗), i ∈ A∗, linearly independent (implies that λ∗ is unique).
I Strict complementarity: λ∗_i > 0 for all i ∈ A∗.
I Second-order sufficient conditions.

Under these assumptions, the system

∇f(x∗) − Σ_{i∈A∗} λ∗_i ∇ci(x∗) = 0,
ci(x∗) = 0, for all i ∈ A∗

is a “square” nonlinear algebraic system whose Jacobian

[ ∇²L(x∗, λ∗)    −∇cA∗(x∗) ]
[ ∇cA∗(x∗)^T     0         ]

is nonsingular at the solution (x∗, λ∗_A∗).
Hence, for the inequality problem, can get rapid convergence by
I Identifying A∗ correctly;
I Applying Newton’s method to the reduced system above.
(Many algorithms do this, implicitly or explicitly.)
The three assumptions are rather strong. Degenerate problems are those for which they are not satisfied.
I A∗ hard to identify;
I Jacobian of Newton system is singular in the limit.
We discuss an approach that obtains fast convergence without linear independence and strict complementarity (but still needs second-order sufficient conditions).

Denote by S the set of (x∗, λ∗) satisfying the KKT conditions.

Under our assumptions, x∗ is unique but λ∗ may not be. In fact, the set of optimal λ∗ may be unbounded.
First Ingredient: Active Set Identification
First issue is to identify A∗ correctly.
Assume that estimates of both x and λ are available.
The following defines a computable estimate of the distance to the solution set S:

η(x, λ) = ‖ ( ∇x L(x, λ),  min(λ, c(x)) ) ‖₁.
Can show that for (x , λ) near S, we have
η(x , λ) ∼ dist((x , λ),S).
Active set estimate: Choose σ ∈ (0, 1) (say σ = 1/2) and set

A(x, λ) = {i | ci(x) ≤ η(x, λ)^{1/2}}.
For (x , λ) sufficiently close to S, we have
A∗ = A(x , λ).
Proof. Have η(x, λ) → 0 as (x, λ) → S.

I For i ∉ A∗, have ci(x) → ci(x∗) > 0, so eventually ci(x) > η(x, λ)^{1/2}.

I For i ∈ A∗, have ci(x) = ci(x) − ci(x∗) = O(‖x − x∗‖) = O(η(x, λ)) ≤ η(x, λ)^{1/2}.

QED
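Both ingredients of the estimate are directly computable. A small NumPy sketch on assumed data, with one nearly active and one clearly inactive constraint:

```python
import numpy as np

def eta(grad_L, c_val, lam):
    """eta(x, lam) = || (grad_x L, min(lam, c(x))) ||_1."""
    return np.abs(grad_L).sum() + np.abs(np.minimum(lam, c_val)).sum()

def active_estimate(c_val, eta_val):
    """A(x, lam) = { i : c_i(x) <= eta^(1/2) }."""
    return {i for i, ci in enumerate(c_val) if ci <= eta_val ** 0.5}

# Near a solution: constraint 0 nearly active, constraint 1 clearly inactive.
c_val = np.array([1e-6, 0.8])
e = eta(np.array([1e-6, 0.0]), c_val, np.array([0.3, 1e-6]))
A = active_estimate(c_val, e)   # {0}
```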
Second Ingredient: A Stabilized Method for Equalities
Solve the equality-constrained problem
min f (x) subject to cA(x) = 0.
Define the distance to the solution by

η(x, λA) = ‖ ( ∇x L(x, λA),  cA(x) ) ‖

and solve a stabilized SQP subproblem:

[ ∇²xx L(x, λA)   −∇cA(x) ] [ ∆x  ]     [ ∇x L(x, λA) ]
[ ∇cA(x)^T         ηI     ] [ ∆λA ]  = −[ cA(x)       ]

and set (x, λA) ← (x, λA) + (∆x, ∆λA).
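The stabilized system can be formed and solved directly. In this sketch (toy data), the constraint gradients are deliberately duplicated, so the η = 0 system would be singular while the stabilized one is not:

```python
import numpy as np

def stabilized_step(H, A, grad_L, cA, eta_val):
    """Solve the stabilized system; the eta*I block keeps it nonsingular
    even when the rows of A (active constraint gradients) are dependent."""
    m = A.shape[0]
    KKT = np.block([[H, -A.T], [A, eta_val * np.eye(m)]])
    return np.linalg.solve(KKT, -np.concatenate([grad_L, cA]))

# Duplicated constraint rows: with eta = 0 this system would be singular.
H = np.eye(2)
A = np.array([[1.0, 0.0], [1.0, 0.0]])
step = stabilized_step(H, A, np.array([0.5, 0.0]), np.array([0.1, 0.1]), 1e-3)
```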
A Complete Algorithm
Use the two ingredients above to build a rapidly convergent local phase into any algorithm that generates primal-dual iterates (xk, λk).

I When η(xk, λk) < τ, enter the local phase:
I estimate A and apply the stabilized algorithm to

min f(x) subject to cA(x) = 0;

I check that the new iterate is acceptable for the original inequality-constrained problem. If not, set τ ← τ/2 and return to the outer loop.
With appropriate checks, get superlinear convergence.
4. Optimization with Equilibrium Constraints (MPEC)
Simplest example: Given x1 and x2 in IR,
min f (x1, x2) subject to x1 ≥ 0, x2 ≥ 0, x1 ⊥ x2.
Can write complementarity condition as x1x2 = 0 or −x1x2 ≥ 0.
[Figure: the feasible set is the union of the two nonnegative axes in the (x1, x2) plane.]
MPEC constraints look simple, but they
I Do not satisfy constraint qualifications at any feasible point.
I Are highly nonconvex at (0, 0).
I Are a union of “nice” feasible regions, rather than an intersection. (Unions are much harder to deal with.)
If x1 = 0, we have active constraints x1 ≥ 0 and −x1x2 ≥ 0, with gradients

[ 1 ]        [ −x2 ]
[ 0 ]  and   [  0  ],

which are linearly dependent.
At (x1, x2) = 0, all three constraints are active, with gradients

[ 1 ]    [ 0 ]    [ 0 ]
[ 0 ],   [ 1 ],   [ 0 ].
Application of MPEC: Equilibrium Problems with Design

Systems described by equilibrium conditions (e.g. economic or mechanical), with an optimization or design problem overlaid.

Example: Nash equilibrium in a 2-player game. Player 1 aims to maximize some function f1(x1, x2) by choosing “his” variables from the set x1 ≥ 0. If f1(·, x2) is concave, the optimal choice for a given x2 satisfies
0 ≤ x1 ⊥ −∇x1f1(x1, x2) ≥ 0.
Similarly for Player 2 (whose variables are x2), optimal choice has
0 ≤ x2 ⊥ −∇x2f2(x1, x2) ≥ 0.
The Nash equilibrium is the point (x1, x2) at which both sets of conditions are satisfied.
Suppose there’s a design variable z that should be chosen to maximize some overall utility, say g(x1, x2, z).

Adding z to the utility functions f1, f2 and assuming appropriate convexity conditions, we can state the overall MPEC as
max_{x1, x2, z} g(x1, x2, z) subject to
0 ≤ x1 ⊥ −∇x1f1(x1, x2, z) ≥ 0,
0 ≤ x2 ⊥ −∇x2f2(x1, x2, z) ≥ 0.
(Possibly other constraints on x1, x2, z .)
Solving MPECs: 1996-2002
I Lack of CQ ⇒ Standard NLP algorithms may not work!
I Surge of interest prompted in part by the ’96 monograph of Luo, Pang, Ralph.
I Various specialized algorithms proposed (’96-’01) to deal with the special nature of the constraints.

However, Fletcher & Leyffer (’02) showed that ordinary SQP codes performed excellently on MPEC benchmarks (and interior-point codes were also quite good).
Why?
Can robustness of standard NLP approaches be improved further?
MPEC First-Order Optimality Conditions
Consider this formulation (where G and H are functions from IR^n → IR^m):
min f (x) subject to 0 ≤ G (x) ⊥ H(x) ≥ 0.
Can include general equality (h(x) = 0) and inequality (c(x) ≥ 0)constraints—they complicate the notation but not the concepts.
Active sets: IG = {i | Gi(x∗) = 0}, IH = {i | Hi(x∗) = 0}.

Complementarity ⇒ IG ∪ IH = {1, 2, . . . , m}.
Stationarity Conditions: Have x∗ feasible for the MPEC, and Lagrange multipliers τ∗_i, ν∗_i such that

∇f(x∗) − Σ_{i∈IG} τ∗_i ∇Gi(x∗) − Σ_{i∈IH} ν∗_i ∇Hi(x∗) = 0,
Gi(x∗) = 0, i ∈ IG,
Hi(x∗) = 0, i ∈ IH.

For i ∈ IG\IH, have Hi(x) > 0 for all x near x∗, so complementarity implies Gi(x) = 0.

Hence, Gi behaves as an equality constraint and so the multiplier τ∗_i can have either sign.
(Similarly for ν∗i , i ∈ IH\IG .)
However, for i ∈ IG ∩ IH (the biactive indices) it is less clear how the signs of τ∗_i and ν∗_i should be restricted. Different flavors of stationarity have been proposed:

I C-stationary: τ∗_i and ν∗_i have the same sign;
I M-stationary: Does not allow τ∗_i and ν∗_i of different signs, and one of them must be nonnegative;
I Strong stationarity: τ∗_i and ν∗_i both nonnegative.

Strong stationarity is the most appealing, as it ensures that there is no first-order direction of descent.
Regularization (Scholtes ’01)
For parameter σ > 0, define
Reg(σ): min f(x) subject to G(x) ≥ 0, H(x) ≥ 0, Gi(x)Hi(x) ≤ σ, i = 1, 2, . . . , m.

I Solve for a decreasing sequence of σ = σk with σk ↓ 0; can apply a more or less standard NLP algorithm.
I For each σ > 0, CQ are satisfied (under “reasonable”assumptions).
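As a hedged sketch (an assumed toy MPEC, and SciPy's generic SLSQP solver rather than any code from the talk), Reg(σ) can be solved for a decreasing sequence of σ, warm-starting each solve:

```python
import numpy as np
from scipy.optimize import minimize

# Toy MPEC: min (x0-1)^2 + (x1-1)^2 s.t. x >= 0, x0 * x1 = 0;
# its solutions are (1, 0) and (0, 1). Reg(sigma) relaxes to x0 * x1 <= sigma.
f = lambda x: (x[0] - 1) ** 2 + (x[1] - 1) ** 2

def solve_reg(sigma, start):
    cons = [{"type": "ineq", "fun": lambda x: x[0]},
            {"type": "ineq", "fun": lambda x: x[1]},
            {"type": "ineq", "fun": lambda x, s=sigma: s - x[0] * x[1]}]
    return minimize(f, start, method="SLSQP", constraints=cons).x

x = np.array([1.0, 0.1])            # warm start biased toward the (1, 0) branch
for sigma in (0.1, 0.01, 0.001):
    x = solve_reg(sigma, x)         # x(sigma) drifts toward a solution
```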
[Figure: feasible sets of Reg(σ) in the (x1, x2) plane for two values of σ.]
The relationship between the solution x(σ) of Reg(σ) and the original MPEC is quite complex. We may have
I ‖x(σ)− x∗‖ = O(σ);
I ‖x(σ) − x∗‖ = O(σ^{1/2}),
depending on exactly what properties hold at a strongly stationary x∗ (Scholtes ’01, Ralph and SW ’04).
Penalization
For parameter ρ > 0 define
Pen(ρ): min f(x) + ρ G(x)^T H(x) subject to G(x) ≥ 0, H(x) ≥ 0.

I Force G(x)^T H(x) = 0 by penalizing positive values of G^T H.
I Solve Pen(ρk) for an increasing sequence of ρk.
I STOP when G^T H = 0 at some ρk; the solution of Pen(ρk) is then at least stationary for the MPEC.
I CQ are satisfied (under “reasonable” assumptions).
I Good computational success reported for interior-point methods applied to this formulation.
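A sketch of the penalization loop on an assumed toy MPEC (min (x0−1)² + (x1−1)² over x ≥ 0 with x0x1 = 0). Since Pen(ρ) retains only the bounds, plain projected gradient descent serves as the inner NLP solver here; the step size and ρ values are illustrative:

```python
import numpy as np

# Pen(rho): min (x0-1)^2 + (x1-1)^2 + rho * x0 * x1 over x >= 0.
def solve_pen(rho, x, steps=2000, lr=0.05):
    for _ in range(steps):
        grad = np.array([2 * (x[0] - 1) + rho * x[1],
                         2 * (x[1] - 1) + rho * x[0]])
        x = np.maximum(0.0, x - lr * grad)   # project onto the bounds x >= 0
    return x

x = np.array([1.0, 0.1])                     # biased toward the (1, 0) branch
for rho in (10.0, 100.0):                    # increasing rho_k
    x = solve_pen(rho, x)
# For rho large enough the bound x1 >= 0 becomes active, x0 * x1 = 0 exactly,
# and the iterate is (at least) stationary for the MPEC.
```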
Some Results for Penalization
(Anitescu ’01; Ralph and SW ’04; Anitescu, Tseng, SW, ’04)
If x∗ solves the MPEC, then it is also a local solution of Pen(ρ) for all ρ sufficiently large.

If x∗ satisfies the KKT conditions for Pen(ρ) and G(x∗)^T H(x∗) = 0, then x∗ is strongly stationary for the MPEC.

(Global Convergence) If second-order necessary conditions are satisfied at each solution of Pen(ρk), then either

I for some k, the solution of Pen(ρk) is a strongly stationary point of the MPEC; or
I any accumulation point is infeasible or does not satisfy LICQ.
SQP for MPEC
I In Fletcher-Leyffer tests (’02), the SQP codes filterSQP and SNOPT solved almost all test problems.
I Subsequent analysis (Fletcher et al. ’02) provided some theoretical support for local convergence of SQP.

MPEC is a prime candidate for the stabilized SQP approach for degenerate NLP outlined above.

I No longer need to assume feasibility of SQP subproblems;
I Can be embedded in other approaches to produce rapid local convergence;
I However, it needs second-order conditions that are stronger than the ideal.
Computation, Applications, Software
I Large optimization problems on computational grids
I Applications
I Software packages; Trends in software design
I Modeling languages
I The Internet
Optimization on the Grid
The Grid isn’t all hype! Optimizers have done serious computations on it. Some examples:

I Traveling Salesman Problem: Applegate et al., pre-“Grid”.
I metaNEOS (’97-’01): NSF-funded collaboration between optimizers and grid computing groups: Condor, Globus.
I MW: Toolkit for implementing master-worker algorithms on grids.
I Many interesting parallel algorithmic paradigms can be shoehorned effectively into the master-worker framework (task-farming, branch-and-bound, “native” master-worker).
I Development of MW is continuing.
Successful Implementations with MW
I Quadratic Assignment Problem: Solved the touchstone problem NUG30 (Anstreicher et al., ’02). Seven years of CPU time in a week on more than 1000 computers (10 locations, 3 continents).
I Two-stage stochastic linear programming (Linderoth & SW, ’03). Solved problems with up to 10^8 constraints and 10^10 variables on similar platforms. Allowed extensive testing of statistical properties of solutions to such problems and insight into the nature of solutions (Linderoth, Shapiro, SW, ’02).
“New” Applications
The NLP user base is smaller than for linear programming or integer (linear) programming, though there has been steady demand for codes such as MINOS, CONOPT, NPSOL for many years.

NLPs have arisen (or have attracted renewed attention) in a host of applications in the past 10-15 years.

Optimization with PDE constraints. An area of particular focus. Typically design or control problems overlaid on a system described by PDEs. (See the volume of Biegler et al., Springer LNCS, ’03.)
Cancer Treatment Planning. Find radiotherapy treatments for cancer that deposit a given dose of radiation in the tumor while avoiding nearby normal tissue and critical organs. Involves optimization problems of many types; recent interest in biological objective functions gives rise to NLPs.

Meteorological Data Assimilation in medium-range weather forecasting. A special case of PDE-constrained optimization. The unknown is the state of the atmosphere 48 hours ago; the objective function is the fit between model and observations. NLP formulations (with specialized algorithms) are used at all major forecasting centers.
Model Predictive Control in process engineering. Solve a sequence of nonlinear optimal control problems over finite time horizons; implement closed-loop control using open-loop strategies. Feasible SQP methods have proved to be useful.

Statistics and Machine Learning. Multicategory support vector machines, robust estimation.

Image Reconstruction. Reconstructing images from undersampled MRI data.
Others were presented at this meeting.
Connecting with Users
Optimization is a consumer-oriented discipline with a wide and diverse user base.
I Science
I Engineering
I Operations Research
I Finance and Economics
I “Retail” Users
How is optimization technology transferred to the market? Whatare the trends?
NLP Software Packages

See NEOS Server www-neos.mcs.anl.gov

∗ = freely available. (Red in the original slides marked codes under active development.)
I IPOPT∗: interior-point
I KNITRO: interior-point and SLQP variants
I SNOPT: SQP
I LOQO: interior-point
I FILTER: Filter-SQP
I CONOPT: Generalized reduced gradient
I LANCELOT∗: Augmented Lagrangian
I MINOS: sequential linearly constrained
I PENNON: Generalized augmented Lagrangian
Other NLP solvers are embedded into larger optimization or numerical computing systems: Matlab, Excel, NAG.
Numerous other commercial packages are reviewed in the 1998 OR/MS Today survey: www.lionhrtpub.com/orms/surveys/nlp/nlp.html.

NEOS Guide: www.mcs.anl.gov/otc/Guide/ (somewhat out of date).
Trends in Optimization Software Design
I Once were monolithic Fortran codes. Fortran still popular; C now used in some leading codes; a few experiments in C++.
I Object-oriented design used in some recent codes; allows specialization according to problem structure and modular use of linear algebra:
I TAO (unconstrained and bound-constrained)
I OOPS and OOQP (quadratic programming)
I IOTR (NLP; forthcoming)
I Some interior-point codes are especially able to use off-the-shelf linear algebra packages (e.g. IPOPT and OOQP use Harwell’s MA27).
I Multiple user interfaces: native-language calls, batch files, modeling languages, high-level languages (Matlab, Octave), spreadsheets.
Open-Source Software
Some good codes are now open-source.
The COIN-OR initiative www.coin-or.org was founded in 2000 to organize and support community development of optimization codes.
I Peer review process;
I Sessions at optimization meetings;
I OSI: common interface to optimization solvers (currently forLP and MIP);
I Includes several excellent codes, e.g. IPOPT for NLP.
Optimization Modeling Languages
Allow problems to be defined (and the results presented) in a way that is natural to the application. No shoehorning into the matrices/vectors of the underlying software.

AMPL, GAMS, Maximal, AIMMS.

They have become an extremely popular way to call optimization software (the method of choice on the NEOS Server).

Usually incorporate automatic differentiation, relieving the user of the need to code first derivatives by hand. (Several codes optionally allow second derivatives to be used; these must be provided by the user.)

They make optimization the “outer loop”; it is not easy to embed the optimization problem in a larger computation.
The Internet
Using the Internet to reach users and build community.
I Following on from pioneers in the NA community: NAnet, netlib (mid-’80s).
I NEOS (’94-present)
I Server: online solution of optimization problems; now 1500-3000 submissions/week.
I Guide: informational resource; not much recent development.
I Preprint repositories:
I IPMO: Interior-Point Methods Online (’94-’01) was vital to the interior-point boom.
I Optimization Online (’00-present).
THANKS FOR YOUR ATTENTION!