Nonlinear Programming / Computation, Applications, Software

Continuous Optimization: Recent Developments and Applications

Stephen Wright
Department of Computer Sciences, University of Wisconsin-Madison

SIAM, Portland, July, 2004
Stephen Wright Continuous Optimization: Recent Developments
Nonlinear Programming
I Background
I Filter Methods
I Interior-Point Methods
I Degenerate NLP
I MPECs: Equilibrium Constraints

Computation, Applications, Software
I Optimization on the Grid
I New Applications
I Software Tools
I Optimization Modeling Languages
I The Internet
Discuss developments within the past 10 years (though the origins go further back in some instances).
Nonlinear Programming: Themes
I Nonlinear Programming (NLP) continues to provide fertile ground for research into algorithms and software.
I New ideas;
I Continual reevaluation of old ideas.
I NLP is a broad paradigm encompassing many pathological situations.
I Difficult to design robust algorithms;
I Relative performance of algorithms/software varies widely between problems.
I Applications are becoming more plentiful and diverse, and are driving developments in algorithms and software.
Background
Focus mainly on the inequality constrained problem.
min f(x) subject to c(x) ≥ 0,

where f : IR^n → IR and c : IR^n → IR^m are assumed smooth.
I Algorithms usually find local solutions (not global);
I May converge to less desirable points.
Constraint Qualifications
Since NLP algorithms work with approximations based on the algebraic representation of the feasible set, we need conditions to ensure that such approximations capture the essential geometry of the set: Constraint Qualifications.
Define A∗ def= {i = 1, 2, . . . , m | ci(x∗) = 0}.
I Linear Independence (LICQ): The gradients ∇ci(x∗), i ∈ A∗, are linearly independent.

I Mangasarian-Fromovitz (MFCQ): There is a direction d such that ∇ci(x∗)^T d > 0 for all i ∈ A∗.
[Figure: two feasible sets and their linear approximations: one where MFCQ is satisfied but not LICQ, and one where no CQ is satisfied.]
First-Order Necessary Conditions
Karush-Kuhn-Tucker (KKT): If x∗ is a local minimizer and a CQ is satisfied at x∗, there is λ∗ ∈ IR^m such that

∇f(x∗) = Σ_{i=1}^m λ∗_i ∇ci(x∗),
0 ≤ c(x∗) ⊥ λ∗ ≥ 0.

Complementarity ⊥ means that (λ∗)^T c(x∗) = 0, that is,

λ∗_i = 0 or ci(x∗) = 0, for all i = 1, 2, . . . , m.
“Objective gradient is a conic combination of active constraint gradients.”
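As a concrete check, the two KKT conditions can be evaluated numerically. This is a minimal NumPy sketch (function and variable names are illustrative, not from the talk) that measures how far a primal-dual pair is from satisfying them:

```python
import numpy as np

def kkt_residual(grad_f, jac_c, c_val, lam):
    """Residual of the KKT conditions for min f(x) s.t. c(x) >= 0.

    grad_f: gradient of f at x, shape (n,)
    jac_c:  Jacobian of c at x, shape (m, n); row i is the gradient of c_i
    c_val:  constraint values c(x), shape (m,)
    lam:    multiplier estimate, shape (m,)
    """
    stationarity = grad_f - jac_c.T @ lam
    # min(lam, c(x)) = 0 componentwise encodes 0 <= c(x) ⊥ lam >= 0
    complementarity = np.minimum(lam, c_val)
    return np.linalg.norm(np.concatenate([stationarity, complementarity]))

# Toy check: min x^2 s.t. x - 1 >= 0 has x* = 1 with lambda* = 2
x = 1.0
r = kkt_residual(np.array([2 * x]), np.array([[1.0]]),
                 np.array([x - 1.0]), np.array([2.0]))  # r == 0.0
```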
Optimality conditions can be expressed in terms of the Lagrangian function, which is a linear combination of the objective and constraints, with Lagrange multipliers as coefficients:
L(x, λ) = f(x) − λ^T c(x).
Can write KKT conditions as
∇xL(x∗, λ∗) = 0,
0 ≤ c(x∗) ⊥ λ∗ ≥ 0.
Second-Order Conditions
There is a critical cone of feasible directions C for which first-derivative information alone cannot determine whether f increases or decreases.

We identify “tiebreaking” conditions based on second derivatives to ensure increase in f along these directions.
Second-Order Sufficient: If x∗ satisfies KKT for some λ∗, and
d^T ∇²xx L(x∗, λ∗) d > 0, for all d ∈ C with d ≠ 0,
then x∗ is a strict local solution.
A Basic Algorithm: SQP for Inequality Constraints

No “outer loop.” From current iterate x, obtain step ∆x by solving:

min ∇f(x)^T ∆x + ½ ∆x^T H ∆x subject to c(x) + ∇c(x)^T ∆x ≥ 0.

H typically contains curvature information about both objective and active constraints; the “ideal” choice is the Hessian of the Lagrangian:

∇²xx L(x, λ) = ∇²f(x) − Σ_i λ_i ∇²ci(x).
I H can also be approximated using a quasi-Newton technique.
I ∆x modified via line-search and trust-region variants.
I Merit function often used to determine acceptability of step.
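The codes discussed later implement SQP with considerable care; as a hedged illustration only, SciPy's generic SLSQP solver (itself an SQP line-search method with a quasi-Newton H) can be run on an assumed toy instance of the inequality-constrained problem:

```python
import numpy as np
from scipy.optimize import minimize

# Toy instance of min f(x) s.t. c(x) >= 0:
# minimize (x0 - 2)^2 + (x1 - 1)^2 subject to 1 - x0^2 - x1^2 >= 0.
f = lambda x: (x[0] - 2) ** 2 + (x[1] - 1) ** 2
cons = [{"type": "ineq", "fun": lambda x: 1 - x[0] ** 2 - x[1] ** 2}]

# SLSQP solves a quadratic model of the problem at each iterate, as above.
res = minimize(f, x0=np.zeros(2), method="SLSQP", constraints=cons)
# res.x is (approximately) the projection of (2, 1) onto the unit disk,
# i.e. (2, 1)/sqrt(5).
```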
SQP for Equality Constraints

If all constraints are equalities:

min f(x) subject to h(x) = 0,   (1)

get the SQP step by solving

min ∇f(x)^T ∆x + ½ ∆x^T H ∆x subject to h(x) + ∇h(x)^T ∆x = 0.

When H = ∇²xx L(x, µ) and second-order conditions are satisfied, the same ∆x can be obtained by applying Newton’s method for algebraic equations to the KKT conditions for (1), which are

∇x L(x, µ) = ∇f(x) − ∇h(x)µ = 0,
h(x) = 0.
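A minimal sketch of this Newton-on-KKT view, on an assumed toy problem (quadratic f, linear h), where the QP model is exact and one step solves the problem:

```python
import numpy as np

# Newton on the KKT conditions of min f(x) s.t. h(x) = 0, for the toy
# problem f(x) = x0^2 + x1^2, h(x) = x0 + x1 - 1 (solution x* = (0.5, 0.5)).
def sqp_equality(x, mu, iters=5):
    for _ in range(iters):
        grad_f = 2 * x                      # ∇f(x)
        A = np.array([[1.0, 1.0]])          # ∇h(x)^T (constant: h is linear)
        H = 2 * np.eye(2)                   # ∇²xx L (exact: f is quadratic)
        # KKT system: [H  -A^T; A  0] [Δx; Δμ] = -[∇f - A^T μ; h]
        KKT = np.block([[H, -A.T], [A, np.zeros((1, 1))]])
        rhs = -np.concatenate([grad_f - A.T @ mu, [x.sum() - 1.0]])
        step = np.linalg.solve(KKT, rhs)
        x, mu = x + step[:2], mu + step[2:]
    return x, mu

x_star, mu_star = sqp_equality(np.array([3.0, -2.0]), np.array([0.0]))
# One Newton step suffices here; further steps are zero.
```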
1. Filter Methods: Background
Many NLP algorithms use a merit function to gauge progress and to decide whether to accept a candidate step. Usually a weighted combination of objective value and constraint violation, e.g.

P(x; ν) = f(x) + ν r(c(x)), where r(c(x)) def= Σ_{i=1}^m max(0, −ci(x)) = ‖max(0, −c(x))‖₁.

Choosing the penalty parameter ν may be tricky.
I Too small: P(x; ν) unbounded below;
I Too large: Slow progress along the boundary of Ω.

Seek a “minimalist” approach that doesn’t require choice of a parameter, and rejects good steps less often than does P.
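The ℓ1 merit function itself is one line of NumPy. This sketch (names illustrative) shows a violation of 0.5 being charged at rate ν:

```python
import numpy as np

def l1_merit(f_val, c_val, nu):
    """P(x; nu) = f(x) + nu * ||max(0, -c(x))||_1 for constraints c(x) >= 0."""
    return f_val + nu * np.maximum(0.0, -c_val).sum()

# At x = -0.5, the constraint x >= 0 is violated by 0.5, charged at rate nu:
x = -0.5
p = l1_merit(x ** 2, np.array([x, 1 - x]), nu=10.0)  # 0.25 + 10 * 0.5 = 5.25
```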
(Fletcher & Leyffer, 1997): Filter methods treat NLP as a two-objective optimization problem:

I minimize f;
I minimize r(c(x)) = ‖max(0, −c(x))‖₁ (but we need a zero min value).

Construct a filter from the f(xk) and r(c(xk)) values visited, e.g.
[Figure: filter entries plotted with infeasibility r(c(x)) on the horizontal axis and f on the vertical axis.]
Filter-SQP
Subproblem is a trust-region SQP:

min ∇f(x)^T ∆x + ½ ∆x^T H ∆x subject to c(x) + ∇c(x)^T ∆x ≥ 0, ‖∆x‖∞ ≤ ρ,

where ρ is the radius of the trust region.

Includes a “restoration” phase to recover a (nearly) feasible point (minimize r(c(x)), ignoring f(x)) if the QP is infeasible.
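A filter acceptance test can be sketched in a few lines. This is a simplified version with an assumed margin parameter gamma, not the exact envelope condition used in filterSQP:

```python
def acceptable(f_new, r_new, filter_pairs, gamma=1e-3):
    """A trial point passes if, against every filter entry (f_k, r_k), it
    improves either the objective or the infeasibility by a small margin."""
    return all(f_new <= fk - gamma * rk or r_new <= rk - gamma * rk
               for fk, rk in filter_pairs)

filt = [(1.0, 0.5), (0.8, 0.9)]
ok = acceptable(0.95, 0.1, filt)   # beats each entry in at least one measure
bad = acceptable(1.2, 0.6, filt)   # dominated by the entry (1.0, 0.5)
```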
Many enhancements have been added to yield good theoretical behavior, and to deal effectively with some difficult problems.

I Second-order correction: Accounts better for curvature in constraints by solving

min ∇f(x)^T ∆x + ½ ∆x^T H ∆x subject to c(x + ∆x̄) + ∇c(x)^T (∆x − ∆x̄) ≥ 0, ‖∆x‖∞ ≤ ρ,

where ∆x̄ is the previous guess of ∆x.

I Use the Lagrangian in place of f ; allows rapid local convergence.

I Sufficient decrease conditions: Accept a candidate only if it is a significant improvement over the current filter.
I Allow for non-global and inexact solution of the subproblem.
I Others...
Following the original proposal by Fletcher and Leyffer, contributors to filter-SQP theory have included Gould, M. Ulbrich, S. Ulbrich, Toint, Wächter, and others.

The filter approach is also used in conjunction with interior-point methods (see below).
Recent codes using filters:
I FilterSQP (Fletcher and Leyffer; SQP/filter)
I IPOPT (Wachter and Biegler; interior-point/filter).
2. Interior-Point Methods: Background

Logarithmic barrier: Choose parameter µ; x(µ) solves

Bx(µ) :  min_x P(x; µ) def= f(x) − µ Σ_{i=1}^m log ci(x)   (c(x) > 0).
Under certain assumptions, have x(µ)→ x∗ as µ→ 0.
Optimality condition:

∇f(x) − Σ_{i=1}^m (µ / ci(x)) ∇ci(x) = 0.
[Figure: central path, with barrier minimizers x(4), x(1), x(.3), x(.1) approaching the solution as µ decreases.]
Barrier-Unconstrained
k ← 1; set µ0 > 0; set x0;
repeat
  Solve Bx(µk) for x(µk), starting from xk−1;
  Choose µk+1 ∈ (0, µk);
  Choose next starting point xk;
  k ← k + 1;
until convergence.
Newton steps obtained from:
∇²xx P(x; µ) ∆x = −∇x P(x; µ).
Primal barrier was one of the original methods for NLP (Frisch, ’55; Fiacco & McCormick, ’68). It fell into disuse for various reasons, including severe ill-conditioning of the Hessian ∇²xx P(x; µ). (Many difficulties later turned out to be fixable.)
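A toy illustration of the primal barrier loop on a one-variable problem (this safeguarded Newton iteration is a sketch, not any production method); the minimizers x(µ) march toward x∗ = 1 as µ shrinks:

```python
import numpy as np

# Primal barrier for min x^2 s.t. x - 1 >= 0 (solution x* = 1):
# P(x; mu) = x^2 - mu * log(x - 1), minimized by safeguarded Newton.
def barrier_min(mu, x=2.0, iters=50):
    for _ in range(iters):
        g = 2 * x - mu / (x - 1)          # ∇P(x; mu)
        h = 2 + mu / (x - 1) ** 2         # ∇²P(x; mu): blows up near x = 1
        step = g / h
        while x - step <= 1.0:            # keep the iterate strictly feasible
            step *= 0.5
        x -= step
    return x

# x(mu) -> 1 from the interior as mu decreases:
path = [barrier_min(mu) for mu in (1.0, 0.1, 0.01)]
```

Here x(µ) has the closed form (1 + sqrt(1 + 2µ))/2, so the computed path can be checked exactly.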
Barrier-EQP
Introduce a slack variable w :
min f (x) subject to c(x)− w = 0, (w > 0).
Obtain x(µ) by solving an equality-constrained problem:

Bw(µ) :  min f(x) − µ Σ_{i=1}^m log wi, subject to c(x) − w = 0, (w > 0).
Bx(µ) and Bw (µ) have identical solutions, but obtained differently:
I unconstrained algorithms are applied to Bx(µ)
I equality-constrained algorithms are applied to Bw(µ) (these don’t maintain feasibility of c(x) − w = 0 at every iteration).

Recent successful codes take their theoretical underpinnings from solving Bw(µk) for a sequence of µk ↓ 0.
Applying SQP to Bw(µ)
KKT conditions for Bw (µ):
∇f (x)−∇c(x)λ = 0,
c(x)− w = 0,
−µW−1e + λ = 0, (λ > 0, w > 0),
where W = diag(w1,w2, . . . ,wm), e = (1, 1, . . . , 1)T .
Get the pure SQP step by applying Newton’s method to the KKT conditions:

[ ∇²L(x, λ)   −∇c(x)   0       ] [ ∆x ]     [ ∇x L(x, λ)      ]
[ ∇c(x)^T      0       −I      ] [ ∆λ ]  = −[ c(x) − w        ]
[ 0            I       µW^{-2} ] [ ∆w ]     [ −µW^{-1}e + λ   ]
Eliminate ∆w to get a more tractable form:

[ ∇²L(x, λ)   −∇c(x)    ] [ ∆x ]     [ ∇x L(x, λ)    ]
[ ∇c(x)^T     µ^{-1}W²  ] [ ∆λ ]  = −[ r(x, w, λ, µ) ]
(Byrd, Gilbert, Nocedal, ’00): Apply a customized SQP trust-region algorithm to Bw(µ).

(Wächter-Biegler, ’02): Apply a filter-SQP method to Bw(µ), with
I objective f(x) − µ Σ log wi;
I infeasibility measure ‖c(x) − w‖₂.

However, the theory isn’t the whole story! Explicit manipulation of the dual variables λ is key to making these algorithms effective in practice.
Primal-Dual Approach
Transform the last KKT condition for Bw(µ) by multiplying by W; then apply a Newton-like method for algebraic equations to
∇f (x)−∇c(x)λ = 0,
c(x)− w = 0,
Wλ− µe = 0, (λ > 0, w > 0),
Choose search directions and steplengths to
I maintain λ > 0, w > 0;
I make progress toward the solution (according to certainmeasures).
Newton equations for the primal-dual step, after elimination of ∆w:

[ ∇²L(x, λ)   −∇c(x)   ] [ ∆x ]     [ ∇x L(x, λ)    ]
[ ∇c(x)^T     Λ^{-1}W  ] [ ∆λ ]  = −[ r(x, w, λ, µ) ]
Take steps jointly in (x, w, λ); maintain positivity of w and λ. Choose αx, αw, αλ in (0, 1) (not necessarily the same!) and set
x ← x + αx∆x ,
w ← w + αw∆w > 0,
λ ← λ + αλ∆λ > 0.
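The steplength choice is usually a fraction-to-boundary rule; a sketch, with the typical (but here assumed) parameter τ = 0.995:

```python
import numpy as np

def fraction_to_boundary(v, dv, tau=0.995):
    """Largest alpha in (0, 1] with v + alpha * dv >= (1 - tau) * v,
    the usual rule for keeping w and lambda strictly positive."""
    neg = dv < 0
    if not np.any(neg):
        return 1.0
    return min(1.0, tau * np.min(v[neg] / -dv[neg]))

w = np.array([1.0, 0.5])
dw = np.array([0.2, -1.0])              # second component heads for the boundary
alpha_w = fraction_to_boundary(w, dw)   # 0.995 * 0.5 = 0.4975
```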
Primal-Dual Discussion
Primal-dual approaches for linear programming were well known by ’87 and were the practical approach of choice by ’90.

Extension to NLP was obvious in principle, but the devil was in the details! Many enhancements were needed to improve practical robustness and efficiency.

The codes LOQO (Benson, Shanno, Vanderbei), KNITRO (Byrd et al.) and IPOPT (Wächter & Biegler) all generate steps of primal-dual type but differ in important respects.

KNITRO and IPOPT both derive their theoretical justification from the Barrier-EQP approach, but replace µ^{-1}W² in the reduced system by Λ^{-1}W. (IPOPT checks explicitly that Λ^{-1}W does not stray too far from µ^{-1}W².)
Software

IPOPT (Wächter, Biegler, ’04): Line-search filter method based on Bw(µ), but with primal-dual steps.

Solves the step equations using a direct solver (MA27) for symmetric-indefinite systems, adding perturbations to the diagonal to “correct” the inertia if necessary.

LOQO uses direct factorization of the reduced step equations, with additions to the (1,1) block to fix the inertia.

KNITRO: Trust-region algorithm for Bw(µ), with “primal-dual” scaling (Λ^{-1}W). Merit function.

Solves the step equations using an iterative method: projected conjugate gradient.

KNITRO-Direct: Also uses direct solution of the step equations and a line search on some iterations.
3. Degenerate NLP

Reminder of optimality conditions:

∇f(x∗) = Σ_{i=1}^m λ∗_i ∇ci(x∗),
0 ≤ c(x∗) ⊥ λ∗ ≥ 0.

Given the active constraints

A∗ def= {i = 1, 2, . . . , m | ci(x∗) = 0},

we can rewrite the optimality conditions as

∇f(x∗) − Σ_{i∈A∗} λ∗_i ∇ci(x∗) = 0,
ci(x∗) = 0, for all i ∈ A∗

(since by complementarity we have λ∗_i = 0 for i ∉ A∗).
Local convergence analysis of algorithms usually assumes:

I Linear independence of active constraints: ∇ci(x∗), i ∈ A∗, linearly independent (implies that λ∗ is unique).
I Strict complementarity: λ∗_i > 0 for all i ∈ A∗.
I Second-order sufficient conditions.

Under these assumptions, the system

∇f(x∗) − Σ_{i∈A∗} λ∗_i ∇ci(x∗) = 0,
ci(x∗) = 0, for all i ∈ A∗

is a “square” nonlinear algebraic system whose Jacobian

[ ∇²L(x∗, λ∗)    −∇cA∗(x∗) ]
[ ∇cA∗(x∗)^T     0         ]

is nonsingular at the solution (x∗, λ∗_A∗).
Hence, for the inequality problem, can get rapid convergence by
I Identifying A∗ correctly;
I Applying Newton’s method to the reduced system above.
(Many algorithms do this, implicitly or explicitly.)
The three assumptions are rather strong. Degenerate problems are those for which they are not satisfied.
I A∗ hard to identify;
I Jacobian of Newton system is singular in the limit.
We discuss an approach that obtains fast convergence without linear independence and strict complementarity (but still needs second-order sufficient conditions).

Denote by S the set of (x∗, λ∗) satisfying the KKT conditions.

Under our assumptions, x∗ is unique but λ∗ may not be. In fact, the set of optimal λ∗ may be unbounded.
First Ingredient: Active Set Identification
First issue is to identify A∗ correctly.
Assume that estimates of both x and λ are available.
The following defines a computable estimate of the distance to the solution set S:

η(x, λ) = ‖ ( ∇x L(x, λ),  min(λ, c(x)) ) ‖₁.
Can show that for (x , λ) near S, we have
η(x , λ) ∼ dist((x , λ),S).
Active set estimate: Choose σ ∈ (0, 1) (say σ = 1/2) and set

A(x, λ) = {i | ci(x) ≤ η(x, λ)^{1/2}}.
For (x , λ) sufficiently close to S, we have
A∗ = A(x , λ).
Proof. Have η(x, λ) → 0 as (x, λ) → S.

I For i ∉ A∗, have ci(x) → ci(x∗) > 0, so eventually ci(x) > η(x, λ)^{1/2}.

I For i ∈ A∗, have ci(x) = ci(x) − ci(x∗) = O(‖x − x∗‖) = O(η(x, λ)) ≤ η(x, λ)^{1/2}.

QED
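Both ingredients of the estimate are directly computable. A small NumPy sketch on assumed data, with one nearly active and one clearly inactive constraint:

```python
import numpy as np

def eta(grad_L, c_val, lam):
    """eta(x, lam) = || (grad_x L, min(lam, c(x))) ||_1."""
    return np.abs(grad_L).sum() + np.abs(np.minimum(lam, c_val)).sum()

def active_estimate(c_val, eta_val):
    """A(x, lam) = { i : c_i(x) <= eta^(1/2) }."""
    return {i for i, ci in enumerate(c_val) if ci <= eta_val ** 0.5}

# Near a solution: constraint 0 nearly active, constraint 1 clearly inactive.
c_val = np.array([1e-6, 0.8])
e = eta(np.array([1e-6, 0.0]), c_val, np.array([0.3, 1e-6]))
A = active_estimate(c_val, e)   # {0}
```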
Second Ingredient: A Stabilized Method for Equalities
Solve the equality-constrained problem
min f (x) subject to cA(x) = 0.
Define the distance to the solution by

η(x, λA) = ‖ ( ∇x L(x, λA),  cA(x) ) ‖

and solve a stabilized SQP subproblem:

[ ∇²xx L(x, λA)   −∇cA(x) ] [ ∆x  ]     [ ∇x L(x, λA) ]
[ ∇cA(x)^T         ηI     ] [ ∆λA ]  = −[ cA(x)       ]

and set (x, λA) ← (x, λA) + (∆x, ∆λA).
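The stabilized system can be formed and solved directly. In this sketch (toy data), the constraint gradients are deliberately duplicated, so the η = 0 system would be singular while the stabilized one is not:

```python
import numpy as np

def stabilized_step(H, A, grad_L, cA, eta_val):
    """Solve the stabilized system; the eta*I block keeps it nonsingular
    even when the rows of A (active constraint gradients) are dependent."""
    m = A.shape[0]
    KKT = np.block([[H, -A.T], [A, eta_val * np.eye(m)]])
    return np.linalg.solve(KKT, -np.concatenate([grad_L, cA]))

# Duplicated constraint rows: with eta = 0 this system would be singular.
H = np.eye(2)
A = np.array([[1.0, 0.0], [1.0, 0.0]])
step = stabilized_step(H, A, np.array([0.5, 0.0]), np.array([0.1, 0.1]), 1e-3)
```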
A Complete Algorithm
Use the two ingredients above to build a rapidly convergent local phase into any algorithm that generates primal-dual iterates (xk, λk).

I When η(xk, λk) < τ, enter the local phase:
I estimate A and apply the stabilized algorithm to

min f(x) subject to cA(x) = 0;

I check that the new iterate is acceptable for the original inequality-constrained problem. If not, set τ ← τ/2 and return to the outer loop.
With appropriate checks, get superlinear convergence.
4. Optimization with Equilibrium Constraints (MPEC)
Simplest example: Given x1 and x2 in IR,
min f (x1, x2) subject to x1 ≥ 0, x2 ≥ 0, x1 ⊥ x2.
Can write complementarity condition as x1x2 = 0 or −x1x2 ≥ 0.
[Figure: the feasible set is the union of the two nonnegative axes in the (x1, x2) plane.]
MPEC constraints look simple, but they
I Do not satisfy constraint qualifications at any feasible point.
I Are highly nonconvex at (0, 0).
I Are a union of “nice” feasible regions, rather than an intersection. (Unions are much harder to deal with.)
If x1 = 0, we have active constraints x1 ≥ 0 and −x1x2 ≥ 0, with gradients

[ 1 ]        [ −x2 ]
[ 0 ]  and   [  0  ],

which are linearly dependent.
At (x1, x2) = 0, all three constraints are active, with gradients

[ 1 ]    [ 0 ]    [ 0 ]
[ 0 ],   [ 1 ],   [ 0 ].
Application of MPEC: Equilibrium Problems with Design

Systems described by equilibrium conditions (e.g. economic or mechanical), with an optimization or design problem overlaid.

Example: Nash equilibrium in a 2-player game. Player 1 aims to maximize some function f1(x1, x2) by choosing “his” variables from the set x1 ≥ 0. If f1(·, x2) is concave, the optimal choice for a given x2 satisfies
0 ≤ x1 ⊥ −∇x1f1(x1, x2) ≥ 0.
Similarly for Player 2 (whose variables are x2), optimal choice has
0 ≤ x2 ⊥ −∇x2f2(x1, x2) ≥ 0.
The Nash equilibrium is the point (x1, x2) at which both sets of conditions are satisfied.
Suppose there’s a design variable z that should be chosen to maximize some overall utility, say g(x1, x2, z).

Adding z to the utility functions f1, f2 and assuming appropriate convexity conditions, we can state the overall MPEC as
max_{x1, x2, z} g(x1, x2, z) subject to
0 ≤ x1 ⊥ −∇x1f1(x1, x2, z) ≥ 0,
0 ≤ x2 ⊥ −∇x2f2(x1, x2, z) ≥ 0.
(Possibly other constraints on x1, x2, z .)
Solving MPECs: 1996-2002
I Lack of CQ ⇒ Standard NLP algorithms may not work!
I Surge of interest prompted in part by the ’96 monograph of Luo, Pang, Ralph.
I Various specialized algorithms proposed (’96-’01) to deal with the special nature of the constraints.

However, Fletcher & Leyffer (’02) showed that ordinary SQP codes performed excellently on MPEC benchmarks (and interior-point codes were also quite good).
Why?
Can robustness of standard NLP approaches be improved further?
MPEC First-Order Optimality Conditions
Consider this formulation (where G and H are functions from IR^n → IR^m):
min f (x) subject to 0 ≤ G (x) ⊥ H(x) ≥ 0.
Can include general equality (h(x) = 0) and inequality (c(x) ≥ 0)constraints—they complicate the notation but not the concepts.
Active sets: IG = {i | Gi(x∗) = 0}, IH = {i | Hi(x∗) = 0}.

Complementarity ⇒ IG ∪ IH = {1, 2, . . . , m}.
Stationarity Conditions: Have x∗ feasible for the MPEC, and Lagrange multipliers τ∗_i, ν∗_i such that

∇f(x∗) − Σ_{i∈IG} τ∗_i ∇Gi(x∗) − Σ_{i∈IH} ν∗_i ∇Hi(x∗) = 0,
Gi(x∗) = 0, i ∈ IG,
Hi(x∗) = 0, i ∈ IH.

For i ∈ IG\IH, have Hi(x) > 0 for all x near x∗, so complementarity implies Gi(x) = 0.

Hence, Gi behaves as an equality constraint and so the multiplier τ∗_i can have either sign.
(Similarly for ν∗i , i ∈ IH\IG .)
However, for i ∈ IG ∩ IH (the biactive indices) it is less clear how the signs of τ∗_i and ν∗_i should be restricted. Different flavors of stationarity have been proposed:

I C-stationary: τ∗_i and ν∗_i have the same sign;
I M-stationary: Does not allow τ∗_i and ν∗_i of different signs, and one of them must be nonnegative;
I Strong stationarity: τ∗_i and ν∗_i both nonnegative.

Strong stationarity is the most appealing, as it ensures that there is no first-order direction of descent.
Regularization (Scholtes ’01)
For parameter σ > 0, define
Reg(σ): min f(x) subject to G(x) ≥ 0, H(x) ≥ 0, Gi(x)Hi(x) ≤ σ, i = 1, 2, . . . , m.

I Solve for a decreasing sequence of σ = σk with σk ↓ 0; can apply a more or less standard NLP algorithm.
I For each σ > 0, CQ are satisfied (under “reasonable”assumptions).
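As a hedged sketch (an assumed toy MPEC, and SciPy's generic SLSQP solver rather than any code from the talk), Reg(σ) can be solved for a decreasing sequence of σ, warm-starting each solve:

```python
import numpy as np
from scipy.optimize import minimize

# Toy MPEC: min (x0-1)^2 + (x1-1)^2 s.t. x >= 0, x0 * x1 = 0;
# its solutions are (1, 0) and (0, 1). Reg(sigma) relaxes to x0 * x1 <= sigma.
f = lambda x: (x[0] - 1) ** 2 + (x[1] - 1) ** 2

def solve_reg(sigma, start):
    cons = [{"type": "ineq", "fun": lambda x: x[0]},
            {"type": "ineq", "fun": lambda x: x[1]},
            {"type": "ineq", "fun": lambda x, s=sigma: s - x[0] * x[1]}]
    return minimize(f, start, method="SLSQP", constraints=cons).x

x = np.array([1.0, 0.1])            # warm start biased toward the (1, 0) branch
for sigma in (0.1, 0.01, 0.001):
    x = solve_reg(sigma, x)         # x(sigma) drifts toward a solution
```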
[Figure: feasible sets of Reg(σ) in the (x1, x2) plane for two values of σ.]
The relationship between the solution x(σ) of Reg(σ) and the original MPEC is quite complex. We may have
I ‖x(σ)− x∗‖ = O(σ);
I ‖x(σ) − x∗‖ = O(σ^{1/2}),
depending on exactly what properties hold at a strongly stationary x∗ (Scholtes ’01, Ralph and SW ’04).
Penalization
For parameter ρ > 0 define
Pen(ρ): min f(x) + ρ G(x)^T H(x) subject to G(x) ≥ 0, H(x) ≥ 0.

I Force G(x)^T H(x) = 0 by penalizing positive values of G^T H.
I Solve Pen(ρk) for an increasing sequence of ρk.
I STOP when G^T H = 0 at some ρk; the solution of Pen(ρk) is then at least stationary for the MPEC.
I CQ are satisfied (under “reasonable” assumptions).
I Good computational success reported for interior-point methods applied to this formulation.
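A sketch of the penalization loop on an assumed toy MPEC (min (x0−1)² + (x1−1)² over x ≥ 0 with x0x1 = 0). Since Pen(ρ) retains only the bounds, plain projected gradient descent serves as the inner NLP solver here; the step size and ρ values are illustrative:

```python
import numpy as np

# Pen(rho): min (x0-1)^2 + (x1-1)^2 + rho * x0 * x1 over x >= 0.
def solve_pen(rho, x, steps=2000, lr=0.05):
    for _ in range(steps):
        grad = np.array([2 * (x[0] - 1) + rho * x[1],
                         2 * (x[1] - 1) + rho * x[0]])
        x = np.maximum(0.0, x - lr * grad)   # project onto the bounds x >= 0
    return x

x = np.array([1.0, 0.1])                     # biased toward the (1, 0) branch
for rho in (10.0, 100.0):                    # increasing rho_k
    x = solve_pen(rho, x)
# For rho large enough the bound x1 >= 0 becomes active, x0 * x1 = 0 exactly,
# and the iterate is (at least) stationary for the MPEC.
```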
Some Results for Penalization
(Anitescu ’01; Ralph and SW ’04; Anitescu, Tseng, SW, ’04)
If x∗ solves the MPEC, then it is also a local solution of Pen(ρ) for all ρ sufficiently large.

If x∗ satisfies the KKT conditions for Pen(ρ) and G(x∗)^T H(x∗) = 0, then x∗ is strongly stationary for the MPEC.

(Global Convergence) If second-order necessary conditions are satisfied at each solution of Pen(ρk), then either

I for some k, the solution of Pen(ρk) is a strongly stationary point of the MPEC; or
I any accumulation point is infeasible or does not satisfy LICQ.
SQP for MPEC
I In Fletcher-Leyffer tests (’02), the SQP codes filterSQP and SNOPT solved almost all test problems.
I Subsequent analysis (Fletcher et al. ’02) provided some theoretical support for local convergence of SQP.

MPEC is a prime candidate for the stabilized SQP approach for degenerate NLP outlined above.

I No longer need to assume feasibility of SQP subproblems;
I Can be embedded in other approaches to produce rapid local convergence;
I However, it needs second-order conditions that are stronger than the ideal.
Computation, Applications, Software
I Large optimization problems on computational grids
I Applications
I Software packages; Trends in software design
I Modeling languages
I The Internet
Optimization on the Grid
The Grid isn’t all hype! Optimizers have done serious computations on it. Some examples:

I Traveling Salesman Problem: Applegate et al., pre-“Grid”.
I metaNEOS (’97-’01): NSF-funded collaboration between optimizers and grid computing groups: Condor, Globus.
I MW: Toolkit for implementing master-worker algorithms on grids.
I Many interesting parallel algorithmic paradigms can be shoehorned effectively into the master-worker framework (task-farming, branch-and-bound, “native” master-worker).
I Development of MW is continuing.
Successful Implementations with MW
I Quadratic Assignment Problem: Solved the touchstone problem NUG30 (Anstreicher et al., ’02). Seven years of CPU time in a week on more than 1000 computers (10 locations, 3 continents).
I Two-stage stochastic linear programming (Linderoth & SW, ’03). Solved problems with up to 10^8 constraints and 10^10 variables on similar platforms. Allowed extensive testing of statistical properties of solutions to such problems and insight into the nature of solutions (Linderoth, Shapiro, SW, ’02).
“New” Applications
The NLP user base is smaller than for linear programming or integer (linear) programming, though there has been steady demand for codes such as MINOS, CONOPT, NPSOL for many years.

NLPs have arisen (or have attracted renewed attention) in a host of applications in the past 10-15 years.

Optimization with PDE constraints. An area of particular focus. Typically design or control problems overlaid on a system described by PDEs. (See the volume of Biegler et al., Springer LNCS, ’03.)
Cancer Treatment Planning. Find radiotherapy treatments for cancer that deposit a given dose of radiation in the tumor while avoiding nearby normal tissue and critical organs. Involves optimization problems of many types; recent interest in biological objective functions gives rise to NLPs.

Meteorological Data Assimilation in medium-range weather forecasting. A special case of PDE-constrained optimization. The unknown is the state of the atmosphere 48 hours ago; the objective function is the fit between model and observations. NLP formulations (with specialized algorithms) are used at all major forecasting centers.
Model Predictive Control in process engineering. Solve a sequence of nonlinear optimal control problems over finite time horizons; implement closed-loop control using open-loop strategies. Feasible SQP methods have proved to be useful.

Statistics and Machine Learning. Multicategory support vector machines, robust estimation.

Image Reconstruction. Reconstructing images from undersampled MRI data.
Others were presented at this meeting.
Connecting with Users
Optimization is a consumer-oriented discipline with a wide and diverse user base.
I Science
I Engineering
I Operations Research
I Finance and Economics
I “Retail” Users
How is optimization technology transferred to the market? Whatare the trends?
NLP Software Packages

See NEOS Server www-neos.mcs.anl.gov

∗ = freely available. (Red in the original slides marked codes under active development.)
I IPOPT∗: interior-point
I KNITRO: interior-point and SLQP variants
I SNOPT: SQP
I LOQO: interior-point
I FILTER: Filter-SQP
I CONOPT: Generalized reduced gradient
I LANCELOT∗: Augmented Lagrangian
I MINOS: sequential linearly constrained
I PENNON: Generalized augmented Lagrangian
Other NLP solvers are embedded into larger optimization or numerical computing systems: Matlab, Excel, NAG.
Numerous other commercial packages are reviewed in the 1998 OR/MS Today survey: www.lionhrtpub.com/orms/surveys/nlp/nlp.html.

NEOS Guide: www.mcs.anl.gov/otc/Guide/ (somewhat out of date).
Trends in Optimization Software Design
I Once were monolithic Fortran codes. Fortran still popular; C now used in some leading codes; a few experiments in C++.
I Object-oriented design used in some recent codes; allows specialization according to problem structure and modular use of linear algebra:
I TAO (unconstrained and bound-constrained)
I OOPS and OOQP (quadratic programming)
I IOTR (NLP; forthcoming)
I Some interior-point codes are especially able to use off-the-shelf linear algebra packages (e.g. IPOPT and OOQP use Harwell’s MA27).
I Multiple user interfaces: native-language calls, batch files, modeling languages, high-level languages (Matlab, Octave), spreadsheets.
Open-Source Software
Some good codes are now open-source.
The COIN-OR initiative www.coin-or.org was founded in 2000 to organize and support community development of optimization codes.
I Peer review process;
I Sessions at optimization meetings;
I OSI: common interface to optimization solvers (currently forLP and MIP);
I Includes several excellent codes, e.g. IPOPT for NLP.
Optimization Modeling Languages
Allow problems to be defined (and the results presented) in a way that is natural to the application. No shoehorning into the matrices/vectors of the underlying software.

AMPL, GAMS, Maximal, AIMMS.

They have become an extremely popular way to call optimization software (the method of choice on the NEOS Server).

Usually incorporate automatic differentiation, relieving the user of the need to code first derivatives by hand. (Several codes optionally allow second derivatives to be used; these must be provided by the user.)

They make optimization the “outer loop”; it is not easy to embed the optimization problem in a larger computation.
The Internet
Using the Internet to reach users and build community.
I Following on from pioneers in the NA community: NAnet, netlib (mid-’80s).
I NEOS (’94-present)
I Server: online solution of optimization problems; now 1500-3000 submissions/week.
I Guide: informational resource; not much recent development.
I Preprint repositories:
I IPMO: Interior-Point Methods Online (’94-’01) was vital to the interior-point boom.
I Optimization Online (’00-present).
THANKS FOR YOUR ATTENTION!