

Short course: Optimality Conditions and Algorithms in Nonlinear Optimization

Part I - Introduction to nonlinear optimization

Gabriel Haeser

Department of Applied Mathematics
Institute of Mathematics and Statistics
University of São Paulo, São Paulo, SP, Brazil

Santiago de Compostela, Spain, October 28-31, 2014

www.ime.usp.br/∼ghaeser


Outline

Part I - Introduction to nonlinear optimization
  Examples and historical notes
  First and second order optimality conditions
  Penalty methods
  Interior point methods

Part II - Optimality Conditions
  Algorithmic proof of the Karush-Kuhn-Tucker conditions
  Sequential optimality conditions
  Algorithmic discussion

Part III - Constraint Qualifications
  Geometric interpretation
  First and second order constraint qualifications

Part IV - Algorithms
  Augmented Lagrangian methods
  Inexact Restoration algorithms
  Dual methods


Optimization

Optimization is a mathematical problem with many “real world” applications. The goal is to find minimizers or maximizers of a multivariable real function over a restricted domain.

To draw a map of America with areas proportional to the real areas.

Hard-spheres problem: to place m points on an n-dimensional sphere in such a way that the smallest distance between two points is maximized.


Problem America

To draw a map of America, similar to the usual map, with areas proportional to real areas.

Minimize (1/2) ∑_{i=1}^{m} ‖p_i − p̄_i‖²,
Subject to (1/2) ∑_{i=1}^{n_j} (p_i^x p_{i+1}^y − p_{i+1}^x p_i^y) = β_j, j = 1, . . . , c.

c = 17 countries; β_j is the real area of country j.
m = 132 given points p̄_i on the frontiers of the usual map.
Green-Gauss formula to compute areas.
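The area constraints come from the Green-Gauss (shoelace) formula applied to the boundary points of each country. A minimal sketch of how one such area and the fitting objective could be evaluated (numpy; the toy data and function names are illustrative assumptions, not the original implementation):

```python
import numpy as np

def shoelace_area(p):
    """Signed area of a closed polygon via the Green-Gauss (shoelace) formula.
    p is an array of shape (n, 2) holding the vertices in order."""
    x, y = p[:, 0], p[:, 1]
    return 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)

def fitting_objective(p, p_bar):
    """Half the sum of squared distances between the drawn points p and the given points p_bar."""
    return 0.5 * np.sum((p - p_bar) ** 2)

# Toy instance: one "country" given as the unit square; its area would enter
# the constraint shoelace_area(p_country) = beta_j.
p_bar = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
p = p_bar.copy()                                       # starting guess: the usual map itself
print(shoelace_area(p), fitting_objective(p, p_bar))   # 1.0 0.0
```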


Problem America

United States (without Alaska and Hawaii) = 8,080,464 km²

Brazil = 8,514,876 km²

Usual map ratio ≈ 1.32

Real ratio ≈ 0.95

Figure panels: usual map; areas proportional to real areas.


Problem America

Figure panels: areas proportional to GDP; areas proportional to population.


Kissing and hard-spheres problems

The kissing number of dimension n is the largest number of unit spheres that may be put touching an n-dimensional unit sphere without overlapping.

The hard-spheres problem consists of maximizing the smallest distance d between m points on the n-dimensional sphere of radius 2.

n    Kissing number
2    6
3    12
4    24
5    40–44
6    72–78
7    126–134
8    240
9    306–364
10   500–554

d∗ ≥ 2 ⇒ kissing number ≥ m

n = 2, m = 6, d∗ = 2          n = 3, m = 12, d∗ ≈ 2.194
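A minimal sketch of evaluating the hard-spheres objective, the smallest pairwise distance among m points on the sphere of radius 2, checked here on the planar configuration n = 2, m = 6 with d∗ = 2 mentioned above (the point placement is an illustrative assumption):

```python
import numpy as np

def smallest_pairwise_distance(points):
    """Objective of the hard-spheres problem: the smallest distance between two of the points."""
    m = len(points)
    return min(np.linalg.norm(points[i] - points[j])
               for i in range(m) for j in range(i + 1, m))

# m = 6 equally spaced points on the circle of radius 2 (the case n = 2): d* = 2,
# which certifies that the kissing number in dimension 2 is at least 6.
angles = 2.0 * np.pi * np.arange(6) / 6
points = 2.0 * np.column_stack([np.cos(angles), np.sin(angles)])
print(smallest_pairwise_distance(points))   # 2.0 (up to rounding)
```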


Applications: Packing


Applications: Packing
Initial configuration for molecular dynamics


Large scale problems: Finance

Jacek Gondzio and Andreas Grothey (May 2005): a convex quadratic program with 353 million constraints and 1,010 million variables.

Tool: Interior Point Method


Large scale problems: Localization

Find a point in the rectangle but not in the ellipse such that the sum of the distances to the polygons is minimized.

1,567,804 polygons.
3,135,608 variables.
1,567,804 upper-level constraints.
12,833,106 lower-level constraints.
Convergence in 10 outer iterations, 56 inner iterations, 133 function evaluations, 185 seconds.

Tool: Augmented Lagrangian method


TANGO Project - www.ime.usp.br/∼egbirgin/tango

Trustable Algorithms for Nonlinear General Optimization


TANGO Project - www.ime.usp.br/∼egbirgin/tango

40,370 visits registered by Google Analytics since 2007 (more than 3,000 downloads).

USA: 7,969, Brazil: 7,230, Germany: 2,974


TANGO Project - www.ime.usp.br/∼egbirgin/tango

Spain: 733


Historical Notes

Military programs formulated as a system of linear inequalities gave rise to the term “Programming in a linear structure” (title of the first paper by G. Dantzig, 1948). Koopmans shortened the term to Linear Programming.

Dorfman (in 1949) thought that Linear Programming was too restrictive and suggested the more general term Mathematical Programming, now called Mathematical Optimization.

Nonlinear Programming is the title of the 1951 paper by Kuhn and Tucker that deals with optimality conditions. These results extend the Lagrange rule of multipliers (1813) to the case of equality and inequality constraints, and were previously considered in the 1939 unpublished master’s thesis of Karush (KKT conditions).

These works are particularly important because they suggested the development of algorithms to deal with practical problems.


Historical Notes

Linear Programming is part of a revolutionary development that gave humanity the capability to formulate an objective and to determine detailed decisions to reach this goal in the best possible way.

Tools: models, algorithms, computers and software.

The impossibility of performing large computations is, according to Dantzig, the main reason for the lack of interest in optimization before 1947.

Important topics in computing: (a) dealing with sparsity allows for solving larger problems; (b) global optimization; (c) automatic differentiation of a function represented in a programming language.


Automatic Differentiation

f (x1, x2) = sin(x1) + x1x2
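One way to see what automatic differentiation does with this example is forward mode: propagate a value and a derivative through every elementary operation. A minimal dual-number sketch (a generic illustration in Python, not a specific AD tool used in the course):

```python
import math

class Dual:
    """Dual number: carries a value and the derivative of that value."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def sin(u):
    return Dual(math.sin(u.val), math.cos(u.val) * u.dot)

def f(x1, x2):                        # f(x1, x2) = sin(x1) + x1 * x2
    return sin(x1) + x1 * x2

# Partial derivative with respect to x1 at (1, 2): seed x1 with derivative 1.
y = f(Dual(1.0, 1.0), Dual(2.0, 0.0))
print(y.val, y.dot)                   # sin(1) + 2  and  cos(1) + 2
```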


Duality

Game theory and linear programming: 1948 - G. Dantzig visited John von Neumann in Princeton.

J. von Neumann, 1963. Discussion of a maximum problem.
D. Gale, H. W. Kuhn, A. W. Tucker, 1951. Linear programming and the theory of games.

Elements of duality:

a pair of optimization problems, one a maximum problem with objective function f and the other a minimum problem with objective function h, based on the same data

for feasible solutions to the pair of problems, always h ≥ f

a necessary and sufficient condition for optimality is h = f


Duality

(Fermat, XVII century): Given 3 points p1, p2 and p3 on the plane, find the point x that minimizes the sum of the distances from x to p1, p2 and p3.


Duality

(Thomas Moss, The Ladies Diary, 1755): “In the three sides of an equiangular field stand three trees, at the distances of 10, 12 and 16 chains from one another: to find the content of the field, it being the greatest the data will admit.”


Duality

(J.D. Gergonne (ed.), Annales de Mathématiques Pures et Appliquées, 1810-1811): Given any triangle, circumscribe the largest possible equilateral triangle about it.

Solution given in the 1811-1812 edition by Rochat, Vecten, Fauguier and Pilatte, where duality was acknowledged.


The problem (NLP)

Minimize f(x),
Subject to h_i(x) = 0, i = 1, . . . , m,
           g_j(x) ≤ 0, j = 1, . . . , p.

f, h_i, g_j : Rⁿ → R are (twice) continuously differentiable functions.

Ω = {x ∈ Rⁿ | h(x) = 0, g(x) ≤ 0} (feasible set)


Solution

Global Solution: A feasible point x∗ ∈ Ω is a global minimizer of NLP when

f(x∗) ≤ f(x), ∀x ∈ Ω

Local Solution: A feasible point x∗ ∈ Ω is a local minimizer of NLP when there exists a neighbourhood B(x∗, ε) of x∗ such that

f(x∗) ≤ f(x), ∀x ∈ Ω ∩ B(x∗, ε)

A(x) = {j ∈ {1, . . . , p} | g_j(x) = 0} (set of active inequalities at x ∈ Ω)


Example

Minimize x² + y²,
Subject to x + y − 1 = 0.


First order optimality condition - Lagrange multipliers

Minimize x² + y²,
Subject to x + y − 1 = 0.

x = 1/2, y = 1/2,

(1, 1)ᵀ + (−1) (1, 1)ᵀ = 0
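A quick numeric check of this stationarity condition at (1/2, 1/2) with multiplier λ = −1 (a small sketch; the gradients are written out by hand):

```python
import numpy as np

x = np.array([0.5, 0.5])              # candidate solution
grad_f = 2.0 * x                      # gradient of x^2 + y^2  ->  (1, 1)
grad_h = np.array([1.0, 1.0])         # gradient of x + y - 1
lam = -1.0                            # Lagrange multiplier

print(grad_f + lam * grad_h)          # [0. 0.]: Lagrange condition holds
print(x.sum() - 1.0)                  # 0.0: feasibility
```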


Example

Maximize x² + y²,
Subject to x + 2y − 2 ≤ 0,
           x ≥ 0, y ≥ 0.


Minimize −x² − y²,
Subject to x + 2y − 2 ≤ 0,
           −x ≤ 0,
           −y ≤ 0.

x = 2, y = 0:   (−4, 0)ᵀ + 4 (1, 2)ᵀ + 8 (0, −1)ᵀ = 0

x = 0, y = 1:   (0, −2)ᵀ + 1 (1, 2)ᵀ + 1 (−1, 0)ᵀ = 0

x = 0.4, y = 0.8:   (−0.8, −1.6)ᵀ + 0.8 (1, 2)ᵀ = 0


First order optimality condition - KKT condition

(Karush-Kuhn-Tucker) Under some condition (constraint qualification), if x∗ is a local solution, there exist Lagrange multipliers λ ∈ Rᵐ and µ ∈ Rᵖ such that:

∇f(x∗) + ∑_{i=1}^{m} λ_i ∇h_i(x∗) + ∑_{j=1}^{p} µ_j ∇g_j(x∗) = 0,   (Lagrange condition)

µ_j g_j(x∗) = 0, j = 1, . . . , p,   (complementarity)

h(x∗) = 0, g(x∗) ≤ 0,   (feasibility)

µ ≥ 0.   (dual feasibility)

Interpretation: up to first order, a feasible direction cannot be a descent direction.
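In computations it is useful to measure how far a candidate point and multipliers are from satisfying these conditions. A minimal sketch of such a KKT residual check, applied, as an illustration, to the point (0.4, 0.8) of the previous example (numpy; the function and variable names are assumptions):

```python
import numpy as np

def kkt_residuals(grad_f, jac_h, jac_g, h, g, lam, mu):
    """Norms of the Lagrange, complementarity and (primal/dual) feasibility residuals."""
    lagrange = grad_f + jac_h.T @ lam + jac_g.T @ mu
    complementarity = mu * g
    feasibility = np.concatenate([h, np.maximum(g, 0.0), np.maximum(-mu, 0.0)])
    return (np.linalg.norm(lagrange),
            np.linalg.norm(complementarity),
            np.linalg.norm(feasibility))

# Minimize -x^2 - y^2 s.t. x + 2y - 2 <= 0, -x <= 0, -y <= 0, at (0.4, 0.8), mu = (0.8, 0, 0).
x, y = 0.4, 0.8
grad_f = np.array([-2.0 * x, -2.0 * y])
jac_g = np.array([[1.0, 2.0], [-1.0, 0.0], [0.0, -1.0]])
g = np.array([x + 2.0 * y - 2.0, -x, -y])
mu = np.array([0.8, 0.0, 0.0])
jac_h, h, lam = np.zeros((0, 2)), np.zeros(0), np.zeros(0)   # no equality constraints
print(kkt_residuals(grad_f, jac_h, jac_g, h, g, lam, mu))    # all three residuals ~ 0
```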


Second order optimality condition

x∗ = (0.4, 0.8)ᵀ,   ∇g₁(x∗) = (1, 2)ᵀ,   ∇²f(x∗) = [ −2  0 ; 0  −2 ].

There exists some d ∈ Rⁿ with ∇g₁(x∗)ᵀd ≤ 0 and dᵀ∇²f(x∗)d < 0.

Theorem: Under some conditions, if x∗ is a local minimizer, then

dᵀ [ ∇²f(x∗) + ∑_{i=1}^{m} λ_i ∇²h_i(x∗) + ∑_{j=1}^{p} µ_j ∇²g_j(x∗) ] d ≥ 0,

for every d ∈ Rⁿ such that

∇f(x∗)ᵀd ≤ 0,
∇h_i(x∗)ᵀd = 0, i = 1, . . . , m,
∇g_j(x∗)ᵀd ≤ 0, j ∈ A(x∗).

Interpretation: All critical directions must be of ascent nature.
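At the point x∗ = (0.4, 0.8) this necessary condition fails: along a critical direction tangent to the active constraint the quadratic form is negative, so x∗ is not a local minimizer. A small numeric check (a sketch; since the constraints are linear, the Hessian of the Lagrangian reduces to ∇²f(x∗)):

```python
import numpy as np

hess_L = np.array([[-2.0, 0.0], [0.0, -2.0]])   # Hessian of the Lagrangian at x*
grad_g1 = np.array([1.0, 2.0])                  # gradient of the active constraint

d = np.array([2.0, -1.0])                       # critical direction: grad_g1 @ d = 0
print(grad_g1 @ d)                              # 0.0  (d is a critical direction)
print(d @ hess_L @ d)                           # -10.0 < 0: the second-order condition fails
```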


History of nonlinear programming

Kuhn, Tucker, 1951. Nonlinear programming.

Albert William Tucker (1905-1995)
Princeton University
Topology

Harold William Kuhn (1925-2014)
Princeton University
PhD 1950, Algebra
Game Theory, Optimization

Saddle point problem

φ(x∗, u) ≤ φ(x∗, u∗) ≤ φ(x, u∗), ∀x, u


History of nonlinear programming

William Karush (1917-1997)

1939. Minima of Functions of Several Variables with Inequalities as Side Conditions.
M.Sc. thesis, Department of Mathematics, University of Chicago.

Calculus of Variations and Optimization

University of Chicago and California State University (also Manhattan Project)

I concluded that you two had exploited and developed the subject so much further than I, that there was no justification for my announcing to the world, “Look what I did, first.” (1975)


History of nonlinear programming

Fritz John (1910 - 1994)

1948. Extremum problems with inequalities as subsidiary conditions.

PhD 1933 in Göttingen under Courant
New York University

Partial differential equations, convex geometry, nonlinear elasticity


History of nonlinear programming

Fritz John (1910 - 1994)

Let S be a bounded set in Rᵐ. Find the sphere of least positive radius enclosing S.

Minimize F(x) := x_{m+1},
Subject to G(x, y) := x_{m+1} − ∑_{i=1}^{m} (x_i − y_i)² ≥ 0 for all y ∈ S.

The boundary of a compact convex set S in Rⁿ lies between two homothetic ellipsoids of ratio ≤ n, and the outer ellipsoid can be taken to be the ellipsoid of least volume containing S.


Snell’s law of refraction

sin θ_y / v_y = sin θ_z / v_z


Snell’s law of refraction

sin θ_y / v_y = sin θ_z / v_z

Minimize T(x) := ‖x − y‖/v_y + ‖x − z‖/v_z,
Subject to h(x) = 0.

At the solution x∗, ∇T(x∗) = (x∗ − y)/(v_y ‖y − x∗‖) + (x∗ − z)/(v_z ‖z − x∗‖) is parallel to ∇h(x∗), the normal vector to the surface.

Define ȳ = x∗ + (y − x∗)/(v_y ‖y − x∗‖) and z̄ = x∗ + (z − x∗)/(v_z ‖z − x∗‖). Hence −∇T(x∗) = (ȳ − x∗) + (z̄ − x∗) is the diagonal of the following parallelogram:


Snell’s law of refraction

sin θ_y / v_y = sin θ_z / v_z

By triangle similarity, ȳ and z̄ are equally far from the normal line. Hence ‖ȳ − x∗‖ sin θ_y = ‖z̄ − x∗‖ sin θ_z. The calculation ‖ȳ − x∗‖ = 1/v_y and ‖z̄ − x∗‖ = 1/v_z yields Snell’s law.
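A small numeric check of this derivation (a sketch; the flat interface h(x) = x₂ = 0, the points y, z and the speeds are illustrative assumptions): minimizing the travel time over the interface reproduces sin θ_y / v_y = sin θ_z / v_z.

```python
import numpy as np

y, z = np.array([0.0, 1.0]), np.array([3.0, -1.0])   # points on either side of the interface x2 = 0
vy, vz = 1.0, 2.0                                    # propagation speeds in the two media

# Travel time T(x) for x = (t, 0) on the interface, evaluated on a fine grid.
t = np.linspace(0.0, 3.0, 300001)
T = np.sqrt((t - y[0])**2 + y[1]**2) / vy + np.sqrt((t - z[0])**2 + z[1]**2) / vz
x_star = np.array([t[np.argmin(T)], 0.0])            # approximate minimizer of T on h(x) = 0

# Sines of the angles between each ray and the normal to the interface.
sin_ty = abs(x_star[0] - y[0]) / np.linalg.norm(x_star - y)
sin_tz = abs(z[0] - x_star[0]) / np.linalg.norm(z - x_star)
print(sin_ty / vy, sin_tz / vz)                      # equal up to the grid resolution
```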


External Penalty Method

Choose a sequence ρ_k with ρ_k → +∞ and for each k solve the problem

Minimize f(x) + ρ_k P(x),

obtaining the (global) solution x^k, if it exists.

P is a smooth function,
P(x) ≥ 0,
P(x) = 0 ⇔ h(x) = 0, g(x) ≤ 0.

For example: P(x) = ‖h(x)‖₂² + ‖max{0, g(x)}‖₂².


External Penalty Method

Theorem: If x^k is well defined, then every limit point of {x^k} is a global solution of Minimize P(x).

Theorem: If x^k is well defined and there exists a point where the function P vanishes (the feasible region is not empty), then every limit point of {x^k} is a global solution of Minimize f(x), Subject to h(x) = 0, g(x) ≤ 0.

The External Penalty Method can be used as a theoretical tool to prove the KKT conditions, but it can also be adjusted into an efficient algorithm (the Augmented Lagrangian method).


External Penalty Method

Minimize x₁² + x₂²,
Subject to x₁ − 1 = 0,
           x₂ − 1 ≤ 0.

Minimize x₁² + x₂² + ρ_k ((x₁ − 1)² + max{0, x₂ − 1}²)   (= Φ_k(x)).

Solving ∇Φ_k(x) = 0 we get x^k = (ρ_k/(1 + ρ_k), 0) → (1, 0).

Show simulation
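A minimal sketch of such a simulation (numpy; the unconstrained subproblems are solved here by crude gradient descent with an illustrative stepsize, which is an assumption for demonstration only):

```python
import numpy as np

def grad_phi(x, rho):
    """Gradient of Phi(x) = x1^2 + x2^2 + rho*((x1 - 1)^2 + max(0, x2 - 1)^2)."""
    g = 2.0 * x.copy()
    g[0] += 2.0 * rho * (x[0] - 1.0)
    g[1] += 2.0 * rho * max(0.0, x[1] - 1.0)
    return g

x = np.array([0.0, 2.0])                        # infeasible starting point
for rho in [1.0, 10.0, 100.0, 1000.0]:
    for _ in range(2000):                       # crude gradient descent on the subproblem
        x = x - (0.4 / (1.0 + rho)) * grad_phi(x, rho)
    print(f"rho = {rho:6.0f}   x = {np.round(x, 4)}")
# The iterates approach (1, 0), the solution of the constrained problem, as rho grows.
```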


Internal Penalty Method

Choose a sequence µ_k with µ_k → 0⁺ and for each k solve the problem

Minimize f(x) + µ_k B(x),
Subject to h(x) = 0,
           g(x) < 0.

B is smooth,
B(x) ≥ 0,
B(x) → +∞ if some g_j(x) → 0 with g(x) < 0.
For example: B(x) = −∑_{j=1}^{p} log(−g_j(x)).
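A minimal sketch of the barrier idea on a one-dimensional example, minimize f(x) = −x subject to g(x) = x − 1 ≤ 0 (the instance is an illustrative assumption): the minimizers of the barrier subproblems stay strictly feasible and approach the boundary as µ → 0⁺.

```python
import numpy as np

def barrier_value(x, mu):
    """f(x) + mu * B(x) for f(x) = -x, g(x) = x - 1 <= 0 and B(x) = -log(1 - x)."""
    return -x - mu * np.log(1.0 - x)

for mu in [1.0, 0.1, 0.01, 0.001]:
    grid = np.linspace(0.0, 0.999999, 200001)         # strictly feasible points only
    x_mu = grid[np.argmin(barrier_value(grid, mu))]   # numerical minimizer of the subproblem
    print(f"mu = {mu:6.3f}   x(mu) = {x_mu:.4f}   (closed form: 1 - mu = {1.0 - mu:.4f})")
# The minimizers approach the solution x* = 1 from the strictly feasible side as mu -> 0+.
```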


Interior Point Method

Consider the convex quadratic problem

Minimize cᵀx + (1/2) xᵀQx,
Subject to Ax = b,
           x ≥ 0,

and the barrier subproblem

Minimize cᵀx + (1/2) xᵀQx − µ ∑_{j=1}^{n} log x_j,
Subject to Ax = b,
           x > 0.

KKT condition

c − Aᵀλ + Qx − µX⁻¹e = 0,
Ax = b,

where X⁻¹ = diag{x₁⁻¹, . . . , x_n⁻¹} and e = (1, . . . , 1)ᵀ. Denoting s = µX⁻¹e we get

Aᵀλ + s − Qx = c,
Ax = b,
XSe = µe, (x, s) > 0.


Interior Point Method

Active-set methods

Aᵀλ + s − Qx = c,
Ax = b,
XSe = 0,
(x, s) ≥ 0.

Interior point methods

Aᵀλ + s − Qx = c,
Ax = b,
XSe = µe,
(x, s) > 0.


Interior Point Method

Complementarity: x_i s_i = 0, ∀i = 1, . . . , n.

Active-set methods try to guess the optimal active subset A ⊆ {1, . . . , n} and set x_i = 0 for i ∈ A (active constraints), s_i = 0 for i ∉ A (inactive constraints).

Interior point methods use ε-mathematics: replace x_i s_i = 0, ∀i = 1, . . . , n, by x_i s_i = µ, ∀i = 1, . . . , n.

Force convergence by letting µ → 0⁺.


Interior Point Method

Solve the nonlinear system of equations

f(x, λ, s) = 0,

where f : R²ⁿ⁺ᵐ → R²ⁿ⁺ᵐ is the mapping:

f(x, λ, s) = ( Aᵀλ + s − Qx − c,  Ax − b,  XSe − µe ).


Interior Point Method

Newton direction:

[ −Q  Aᵀ  I ] [ ∆x ]   [ c − Aᵀλ − s + Qx ]
[  A   0  0 ] [ ∆λ ] = [ b − Ax            ]
[  S   0  X ] [ ∆s ]   [ µe − XSe          ]

Reduce µ at each Newton iteration.
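A minimal numpy sketch of assembling and solving this Newton system for a tiny randomly generated convex quadratic program (the instance and the starting point are illustrative assumptions):

```python
import numpy as np

def newton_direction(Q, A, c, b, x, lam, s, mu):
    """Solve the primal-dual Newton system of the slide for (dx, dlam, ds)."""
    n, m = Q.shape[0], A.shape[0]
    X, S, I = np.diag(x), np.diag(s), np.eye(n)
    KKT = np.block([[-Q, A.T, I],
                    [A, np.zeros((m, m)), np.zeros((m, n))],
                    [S, np.zeros((n, m)), X]])
    rhs = np.concatenate([c - A.T @ lam - s + Q @ x,
                          b - A @ x,
                          mu * np.ones(n) - x * s])
    d = np.linalg.solve(KKT, rhs)
    return d[:n], d[n:n + m], d[n + m:]

# Tiny illustrative instance: minimize c^T x + (1/2) x^T Q x  s.t.  Ax = b, x >= 0.
rng = np.random.default_rng(0)
n, m = 4, 2
M = rng.standard_normal((n, n)); Q = M @ M.T + np.eye(n)   # symmetric positive definite
A = rng.standard_normal((m, n)); c = rng.standard_normal(n)
x, s, lam = np.ones(n), np.ones(n), np.zeros(m)
b = A @ x                                                  # start feasible for Ax = b
dx, dlam, ds = newton_direction(Q, A, c, b, x, lam, s, mu=0.5)
print(dx, dlam, ds)
```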


Interior Point Method

Algorithm:

Step 0: Choose (x^0, λ^0, s^0) with (x^0, s^0) > 0, µ_0 > 0 and parameters 0 < γ < 1 and ε > 0. Set k = 0.

Step 1: Compute the Newton direction (∆x, ∆λ, ∆s) at (x, λ, s) := (x^k, λ^k, s^k).

Step 2: Choose a stepsize α such that (x^k + α∆x, s^k + α∆s) > 0 and set (x^{k+1}, λ^{k+1}, s^{k+1}) := (x^k, λ^k, s^k) + α(∆x, ∆λ, ∆s).

Step 3: Update µ_{k+1} = γµ_k.

Step 4: If (x^k)ᵀ s^k ≤ ε (x^0)ᵀ s^0, stop. Else set k := k + 1 and go to Step 1.
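A minimal sketch of the whole loop on a small convex quadratic program (numpy; the instance, the fraction-to-the-boundary stepsize rule and the parameter values are illustrative assumptions, and this is a sketch rather than a robust implementation):

```python
import numpy as np

def ipm_qp(Q, A, c, b, gamma=0.5, eps=1e-8, max_iter=100):
    """Primal-dual interior point sketch for: min c^T x + 0.5 x^T Q x  s.t.  Ax = b, x >= 0."""
    n, m = Q.shape[0], A.shape[0]
    x, s, lam, mu = np.ones(n), np.ones(n), np.zeros(m), 1.0
    gap0 = x @ s
    for _ in range(max_iter):
        if x @ s <= eps * gap0:                       # Step 4: stopping test
            break
        # Step 1: Newton direction for the perturbed KKT system.
        KKT = np.block([[-Q, A.T, np.eye(n)],
                        [A, np.zeros((m, m)), np.zeros((m, n))],
                        [np.diag(s), np.zeros((n, m)), np.diag(x)]])
        rhs = np.concatenate([c - A.T @ lam - s + Q @ x, b - A @ x, mu - x * s])
        d = np.linalg.solve(KKT, rhs)
        dx, dlam, ds = d[:n], d[n:n + m], d[n + m:]
        # Step 2: stepsize keeping (x, s) strictly positive (fraction to the boundary).
        alpha = 1.0
        for v, dv in [(x, dx), (s, ds)]:
            neg = dv < 0
            if neg.any():
                alpha = min(alpha, 0.95 * np.min(-v[neg] / dv[neg]))
        x, lam, s = x + alpha * dx, lam + alpha * dlam, s + alpha * ds
        mu = gamma * mu                               # Step 3: reduce the barrier parameter
    return x, lam, s

# Illustrative instance.
rng = np.random.default_rng(1)
n, m = 5, 2
M = rng.standard_normal((n, n)); Q = M @ M.T + np.eye(n)
A = rng.standard_normal((m, n)); c = rng.standard_normal(n); b = A @ np.ones(n)
x, lam, s = ipm_qp(Q, A, c, b)
print("x =", np.round(x, 4), "   complementarity x^T s =", x @ s)
```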


Interior Point Method

Consider the merit function

ψ(x, s) = (n + √n) log(xᵀs) − ∑_{i=1}^{n} log(x_i s_i).

(Note that ψ(x, s) → −∞ ⇒ xᵀs → 0.)

Choosing the stepsize α that minimizes ψ(x^k + α∆x, s^k + α∆s) (exact line search) we get:

Theorem: If γ = n/(n + √n), we have (x^k)ᵀ s^k ≤ ε (x^0)ᵀ s^0 in O(√n log(n/ε)) iterations.


Algorithms

There is no “direct method” to solve NLP.
NLP is solved using iterative methods.
An iterative method generates a sequence of points x^k ∈ Rⁿ that converges (or not) to a solution of the problem.
Iterative methods are programmed and implemented on computers, where real mathematical operations are replaced by floating-point operations.


Algorithms

Theory is necessary to avoid performing an infinite number of experiments.
Useful theory should be able to predict the behavior of many experiments.
Usually, the theory does not refer to the real sequences generated by the computer, but to theoretical sequences defined by the algorithms.
The analogy between real sequences and theoretical sequences is not perfect.
There are practical phenomena that the theory is not able to predict, but relevant theory is the one that contributes to explaining practical phenomena.
