
Optimization (Introduction)

Goal: Find the minimizer $x^*$ that minimizes the objective (cost) function $f(x): \mathbb{R}^n \to \mathbb{R}$

Optimization

Unconstrained Optimization

1D: $f(x): \mathbb{R} \to \mathbb{R}$
ND: $f(\mathbf{x}): \mathbb{R}^n \to \mathbb{R}$

$$f(x^*) = \min_x f(x) \qquad \text{or} \qquad x^* = \arg\min_x f(x)$$

Optimization

Constrained Optimization

$$f(x^*) = \min_x f(x) \quad \text{s.t.} \quad h_i(x) = 0 \;\;\text{(equality constraints)}, \qquad g_i(x) \le 0 \;\;\text{(inequality constraints)}$$

Unconstrained Optimization
• What if we are looking for a maximizer $x^*$?

$$f(x^*) = \max_x f(x) \iff -f(x^*) = \min_x \big(-f(x)\big)$$

Calculus problem: maximize the rectangle area subject to a perimeter constraint

$$\max_{d_1,\, d_2 \in \mathbb{R}} \; \text{Area} = d_1 d_2 \qquad \text{s.t.} \qquad \text{Perimeter} = 2(d_1 + d_2) \le c$$

Without the perimeter constraint one could take, for example, $d_1 = d_2 = 10$ for an area of 100, but that choice violates the perimeter constraint. The constrained optimum must trade off the two side lengths; for a fixed perimeter, the rectangle of maximum area is a square.

(Figure: rectangles with side lengths $d_1, d_2$, their areas, and the perimeter constraint.)

What is the optimal solution? (1D)

$$f(x^*) = \min_x f(x)$$

(First-order) Necessary condition: $f'(x^*) = 0$ (gives the stationary points: candidate minima, maxima, or inflection points)

(Second-order) Sufficient condition:
$f''(x^*) > 0 \;\Rightarrow\; x^*$ is a minimum
$f''(x^*) < 0 \;\Rightarrow\; x^*$ is a maximum

Types of optimization problems

$$f(x^*) = \min_x f(x)$$

$f$: nonlinear, continuous and smooth

Gradient-free methods: evaluate $f(x)$
Gradient (first-derivative) methods: evaluate $f(x)$, $f'(x)$
Second-derivative methods: evaluate $f(x)$, $f'(x)$, $f''(x)$

Does the solution exist? Is it a local or a global solution?

Example (1D)

Consider the function

$$f(x) = \frac{x^4}{4} - \frac{x^3}{3} - 11x^2 + 40x$$

Find the stationary points and check the sufficient condition.

First-order necessary condition:

$$f'(x) = x^3 - x^2 - 22x + 40 = 0 \;\Rightarrow\; x^* \in \{-5,\, 2,\, 4\}$$

Second-order condition, with $f''(x) = 3x^2 - 2x - 22$:

$f''(-5) = 75 + 10 - 22 = 63 > 0 \;\Rightarrow\; x = -5$ is a minimum
$f''(2) = 12 - 4 - 22 = -14 < 0 \;\Rightarrow\; x = 2$ is a maximum
$f''(4) = 48 - 8 - 22 = 18 > 0 \;\Rightarrow\; x = 4$ is a minimum

(Figure: plot of $f(x)$ for $x \in [-6, 6]$, values roughly between $-200$ and $100$, showing two local minima and one local maximum.)
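As a sanity check, a minimal sketch (assuming NumPy is available) that recomputes the stationary points and applies the second-derivative test:

```python
import numpy as np

# f(x) = x^4/4 - x^3/3 - 11x^2 + 40x and its derivatives
f   = lambda x: x**4 / 4 - x**3 / 3 - 11 * x**2 + 40 * x
d2f = lambda x: 3 * x**2 - 2 * x - 22          # second derivative

# Stationary points: real roots of f'(x) = x^3 - x^2 - 22x + 40
roots = np.roots([1, -1, -22, 40])             # -> -5, 2, 4

for x in sorted(roots.real):
    kind = "minimum" if d2f(x) > 0 else "maximum" if d2f(x) < 0 else "inconclusive"
    print(f"x* = {x: .4f}, f(x*) = {f(x):8.2f}, f''(x*) = {d2f(x):6.1f} -> {kind}")
```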

What is the optimal solution? (ND)

$$f(\mathbf{x}^*) = \min_{\mathbf{x}} f(\mathbf{x})$$

(First-order) Necessary condition (1D: $f'(x) = 0$):
ND: $\nabla f(\mathbf{x}^*) = \mathbf{0}$ gives the stationary solution $\mathbf{x}^*$

(Second-order) Sufficient condition (1D: $f''(x) > 0$):
ND: the Hessian $\mathbf{H}(\mathbf{x}^*)$ is positive definite $\Rightarrow \mathbf{x}^*$ is a minimizer

Taking derivatives… For $f: \mathbb{R}^n \to \mathbb{R}$, $f(\mathbf{x}) = f(x_1, x_2, \dots, x_n)$:

Gradient of $f$ ($n \times 1$):

$$\nabla f = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} \\ \vdots \\ \dfrac{\partial f}{\partial x_n} \end{bmatrix}$$

Hessian of $f$ ($n \times n$):

$$\mathbf{H} = \nabla^2 f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$
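Not from the notes, but handy alongside these definitions: a minimal central-finite-difference sketch for evaluating the gradient and Hessian numerically (the step sizes `h` are illustrative choices; the test function is the ND example from a later slide):

```python
import numpy as np

def gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros(len(x))
    for i in range(len(x)):
        e = np.zeros(len(x)); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h**2)
    return H

f = lambda x: 2 * x[0]**3 + 4 * x[1]**2 + 2 * x[1] - 24 * x[0]
print(gradient(f, np.array([3.0, 0.0])))   # ~ [30., 2.]
print(hessian(f, np.array([3.0, 0.0])))    # ~ [[36., 0.], [0., 8.]]
```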

From linear algebra:

A symmetric $n \times n$ matrix $\mathbf{A}$ is positive definite if $\mathbf{y}^T \mathbf{A}\, \mathbf{y} > 0$ for any $\mathbf{y} \neq \mathbf{0}$

A symmetric $n \times n$ matrix $\mathbf{A}$ is positive semi-definite if $\mathbf{y}^T \mathbf{A}\, \mathbf{y} \ge 0$ for any $\mathbf{y} \neq \mathbf{0}$

A symmetric $n \times n$ matrix $\mathbf{A}$ is negative definite if $\mathbf{y}^T \mathbf{A}\, \mathbf{y} < 0$ for any $\mathbf{y} \neq \mathbf{0}$

A symmetric $n \times n$ matrix $\mathbf{A}$ is negative semi-definite if $\mathbf{y}^T \mathbf{A}\, \mathbf{y} \le 0$ for any $\mathbf{y} \neq \mathbf{0}$

A symmetric $n \times n$ matrix $\mathbf{A}$ that is neither positive semi-definite nor negative semi-definite is called indefinite


$$f(\mathbf{x}^*) = \min_{\mathbf{x}} f(\mathbf{x})$$

First-order necessary condition: $\nabla f(\mathbf{x}^*) = \mathbf{0}$
Second-order sufficient condition: $\mathbf{H}(\mathbf{x}^*)$ is positive definite

How can we find out if the Hessian is positive definite? Let $(\lambda_i, \mathbf{y}_i)$ be the eigenpairs of $\mathbf{H}$:

All $\lambda_i > 0 \;\Rightarrow\; \mathbf{y}^T \mathbf{H}\, \mathbf{y} > 0$ for all $\mathbf{y} \neq \mathbf{0} \;\Rightarrow\; \mathbf{H}$ is positive definite $\;\Rightarrow\; \mathbf{x}^*$ is a minimizer

All $\lambda_i < 0 \;\Rightarrow\; \mathbf{y}^T \mathbf{H}\, \mathbf{y} < 0$ for all $\mathbf{y} \neq \mathbf{0} \;\Rightarrow\; \mathbf{H}$ is negative definite $\;\Rightarrow\; \mathbf{x}^*$ is a maximizer

Mixed-sign $\lambda_i \;\Rightarrow\; \mathbf{H}$ is indefinite $\;\Rightarrow\; \mathbf{x}^*$ is a saddle point
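In code, this eigenvalue test is a single call; a minimal sketch with NumPy (the matrix `H` below is an illustrative symmetric example, not from the notes):

```python
import numpy as np

def classify(H, tol=1e-12):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    lam = np.linalg.eigvalsh(H)   # eigenvalues of a symmetric matrix, ascending
    if np.all(lam > tol):
        return "positive definite (minimizer)"
    if np.all(lam < -tol):
        return "negative definite (maximizer)"
    return "indefinite or semi-definite (saddle point / inconclusive)"

H = np.array([[2.0, 1.0],
              [1.0, 3.0]])        # illustrative symmetric matrix
print(classify(H))                # -> positive definite (minimizer)
```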

Types of optimization problems

$$f(\mathbf{x}^*) = \min_{\mathbf{x}} f(\mathbf{x})$$

$f$: nonlinear, continuous and smooth

Gradient-free methods: evaluate $f(\mathbf{x})$
Gradient (first-derivative) methods: evaluate $f(\mathbf{x})$, $\nabla f(\mathbf{x})$
Second-derivative methods: evaluate $f(\mathbf{x})$, $\nabla f(\mathbf{x})$, $\nabla^2 f(\mathbf{x})$

Example (ND)

Consider the function

$$f(x_1, x_2) = 2x_1^3 + 4x_2^2 + 2x_2 - 24x_1$$

Find the stationary points and check the sufficient condition.

Gradient:

$$\nabla f = \begin{bmatrix} 6x_1^2 - 24 \\ 8x_2 + 2 \end{bmatrix}$$

First-order condition $\nabla f = \mathbf{0}$:
$6x_1^2 = 24 \;\Rightarrow\; x_1^2 = 4 \;\Rightarrow\; x_1 = \pm 2$
$8x_2 = -2 \;\Rightarrow\; x_2 = -0.25$

Stationary points: $\mathbf{x}^{*(1)} = (2, -0.25)$ and $\mathbf{x}^{*(2)} = (-2, -0.25)$

Second-order condition, with Hessian

$$\mathbf{H}(\mathbf{x}) = \begin{bmatrix} 12x_1 & 0 \\ 0 & 8 \end{bmatrix}$$

At $\mathbf{x}^{*(1)}$: $\mathbf{H} = \mathrm{diag}(24, 8)$, positive definite $\Rightarrow$ minimizer.
At $\mathbf{x}^{*(2)}$: $\mathbf{H} = \mathrm{diag}(-24, 8)$, indefinite $\Rightarrow$ saddle point.
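A quick numerical check of this classification, reusing the eigenvalue test from above (a minimal sketch assuming NumPy):

```python
import numpy as np

# Hessian of f(x1, x2) = 2*x1**3 + 4*x2**2 + 2*x2 - 24*x1
hess = lambda x1: np.array([[12 * x1, 0.0],
                            [0.0,     8.0]])

# Stationary points from grad f = 0: x1 = +/-2, x2 = -0.25
for x1, x2 in [(2.0, -0.25), (-2.0, -0.25)]:
    lam = np.linalg.eigvalsh(hess(x1))
    label = ("minimizer" if np.all(lam > 0)
             else "saddle point" if lam.min() < 0 < lam.max()
             else "maximizer")
    print(f"({x1}, {x2}): eigenvalues {lam} -> {label}")
```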

Optimization (1D Methods)

Optimization in 1D: Golden Section Search
• Similar in spirit to the bisection method for root finding
• Needs to bracket the minimum inside an interval
• Requires the function to be unimodal

A function $f: \mathbb{R} \to \mathbb{R}$ is unimodal on an interval $[a, b]$ if:

✓ There is a unique $x^* \in [a, b]$ such that $f(x^*)$ is the minimum in $[a, b]$

✓ For any $x_1, x_2 \in [a, b]$ with $x_1 < x_2$:

§ $x_2 < x^* \;\Rightarrow\; f(x_1) > f(x_2)$
§ $x_1 > x^* \;\Rightarrow\; f(x_1) < f(x_2)$

(Figure: a unimodal function decreasing to a unique minimum in $[a, b]$ and increasing afterwards.)

(Figure: an interval $[a, b]$ with two interior points $x_1 < x_2$ and function values $f_1 = f(x_1)$, $f_2 = f(x_2)$ used to decide which subinterval keeps the minimum.)

Golden Section Search: one iteration

Starting from the bracket $[a, b]$ with length $h_k = b - a$, propose the two interior points

$$x_1 = a + (1 - \tau)\, h_k, \qquad x_2 = a + \tau\, h_k$$

and evaluate $f_1 = f(x_1)$ and $f_2 = f(x_2)$. Then compare:

If $f_1 > f_2$: the minimum lies in $[x_1, b]$, so set $a = x_1$. The old $x_2$ becomes the new $x_1$ (and $f_2$ becomes $f_1$); only the new $x_2 = a + \tau\, h_{k+1}$ requires a fresh function evaluation.

If $f_1 \le f_2$: the minimum lies in $[a, x_2]$, so set $b = x_2$. The old $x_1$ becomes the new $x_2$ (and $f_1$ becomes $f_2$); only the new $x_1 = a + (1 - \tau)\, h_{k+1}$ requires a fresh function evaluation.

In either case the interval shrinks to $h_{k+1} = \tau\, h_k$ with $\tau = \frac{\sqrt{5} - 1}{2} \approx 0.618$, and the old interior points line up exactly with the new ones precisely because of this choice of $\tau$.

Golden Section Search

What happens to the length of the interval after one iteration?

$$h_1 = \tau\, h_0, \qquad \text{or in general:} \qquad h_{k+1} = \tau\, h_k$$

Hence the interval gets reduced by $\tau$ (for the bisection method for nonlinear equations, the factor is 0.5).

For the recursion to reuse points:

$$\tau\, h_1 = (1 - \tau)\, h_0 \;\Rightarrow\; \tau\,(\tau\, h_0) = (1 - \tau)\, h_0 \;\Rightarrow\; \tau^2 = 1 - \tau \;\Rightarrow\; \tau = \frac{\sqrt{5} - 1}{2} \approx 0.618$$

• Derivative-free method!

• Slow convergence:

$$\lim_{k \to \infty} \frac{|e_{k+1}|}{|e_k|} = 0.618, \quad r = 1 \;\text{(linear convergence)}$$

• Only one function evaluation per iteration


Example

A = - to → ho = 20TT b -- to h

,= ?

he,=I ko → 0.618×20 = 12.36
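A minimal golden section search sketch following the recursion above (the test function at the end is an illustrative unimodal choice, not from the notes):

```python
import math

def golden_section_search(f, a, b, tol=1e-8):
    """Minimize a unimodal f on [a, b] via golden section search."""
    tau = (math.sqrt(5) - 1) / 2          # ~0.618, interval reduction factor
    x1 = a + (1 - tau) * (b - a)
    x2 = a + tau * (b - a)
    f1, f2 = f(x1), f(x2)
    while (b - a) > tol:
        if f1 > f2:                        # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2         # old x2 becomes the new x1
            x2 = a + tau * (b - a)
            f2 = f(x2)                     # only one new evaluation
        else:                              # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1         # old x1 becomes the new x2
            x1 = a + (1 - tau) * (b - a)
            f1 = f(x1)                     # only one new evaluation
    return (a + b) / 2

# Illustrative test: minimum of (x - 3)^2 + 1 on [-10, 10]
print(golden_section_search(lambda x: (x - 3) ** 2 + 1, -10, 10))  # ~3.0
```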

Newton’s Method

Using a Taylor expansion, we can approximate the function $f$ with a quadratic function about $x_k$:

$$f(x) \approx f(x_k) + f'(x_k)(x - x_k) + \tfrac{1}{2} f''(x_k)(x - x_k)^2$$

We then find the minimum of this quadratic using the first-order necessary condition: setting its derivative with respect to $x$ to zero gives the stationary point

$$f'(x_k) + f''(x_k)(x - x_k) = 0 \;\Rightarrow\; h = x - x_k = -\frac{f'(x_k)}{f''(x_k)}$$

so the update is $x_{k+1} = x_k + h$.

Newton’s Method
• Algorithm:
$x_0$ = starting guess
$x_{k+1} = x_k - f'(x_k) / f''(x_k)$

• Convergence:
• Typical quadratic convergence
• Local convergence (start guess close to the solution)
• May fail to converge, or converge to a maximum or point of inflection

Newton’s Method (Graphical Representation)

(Figure: successive quadratic approximations of $f$ generating the sequence $x_0, x_1, x_2, x_3, \dots$ converging toward the minimizer.)

Example

Consider the function $f(x) = 4x^3 + 2x^2 + 5x + 40$.

If we use the initial guess $x_0 = 2$, what would be the value of $x$ after one iteration of Newton’s method?

$$f'(x) = 12x^2 + 4x + 5, \qquad f''(x) = 24x + 4$$

$$h = -\frac{f'(x_0)}{f''(x_0)} = -\frac{12(4) + 4(2) + 5}{24(2) + 4} = -\frac{61}{52}$$

$$x_1 = x_0 + h = 2 - \frac{61}{52} \approx 0.8269$$
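A minimal 1D Newton sketch that reproduces this step (the tolerance and iteration cap are illustrative choices):

```python
def newton_1d(df, d2f, x0, tol=1e-10, max_iter=50):
    """Minimize via Newton's method: x_{k+1} = x_k - f'(x_k)/f''(x_k)."""
    x = x0
    for _ in range(max_iter):
        h = -df(x) / d2f(x)        # Newton step
        x += h
        if abs(h) < tol:           # step is tiny -> stationary point reached
            break
    return x

df  = lambda x: 12 * x**2 + 4 * x + 5   # f'(x) for f(x) = 4x^3 + 2x^2 + 5x + 40
d2f = lambda x: 24 * x + 4              # f''(x)

print(newton_1d(df, d2f, x0=2.0, max_iter=1))   # one step -> ~0.8269
# Note: f'(x) > 0 for all real x (negative discriminant), so this f has no
# stationary point and the full iteration will not converge -- an instance
# of the "may fail to converge" caveat above.
```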

Optimization (ND Methods)

Optimization in ND: Steepest Descent Method

Given a function $f(\mathbf{x}): \mathbb{R}^n \to \mathbb{R}$ at a point $\mathbf{x}$, the function decreases its value fastest in the direction of steepest descent: $-\nabla f(\mathbf{x})$.

$$f(x_1, x_2) = (x_1 - 1)^2 + (x_2 - 1)^2$$

What is the steepest descent direction?

(Figure: contours of $f$ with the steepest descent step from $\mathbf{x}_0$ toward the minimizer.)

Steepest Descent Method

$$f(x_1, x_2) = (x_1 - 1)^2 + (x_2 - 1)^2$$

Start with the initial guess $\mathbf{x}_0 = \begin{bmatrix} 3 \\ 3 \end{bmatrix}$ and check the update $\mathbf{x}_1 = \mathbf{x}_0 - \alpha \nabla f(\mathbf{x}_0)$:

$$\nabla f = \begin{bmatrix} 2(x_1 - 1) \\ 2(x_2 - 1) \end{bmatrix}, \qquad \nabla f(\mathbf{x}_0) = \begin{bmatrix} 4 \\ 4 \end{bmatrix}$$

Steepest Descent Method

$$f(x_1, x_2) = (x_1 - 1)^2 + (x_2 - 1)^2$$

Update the variable with $\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha_k \nabla f(\mathbf{x}_k)$.

How far along the gradient should we go? What is the “best size” for $\alpha_k$?

With $\alpha = 0.5$: $\mathbf{x}_1 = \mathbf{x}_0 - 0.5\, \nabla f(\mathbf{x}_0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, which is exactly the minimizer here. How can we get $\alpha$ in general?

Steepest Descent Method

Algorithm:

Initial guess: $\mathbf{x}_0$

Evaluate: $\mathbf{s}_k = -\nabla f(\mathbf{x}_k)$

Perform a line search to obtain $\alpha_k$ (for example, Golden Section Search):

$$\alpha_k = \arg\min_{\alpha} f(\mathbf{x}_k + \alpha\, \mathbf{s}_k)$$

Update: $\mathbf{x}_{k+1} = \mathbf{x}_k + \alpha_k\, \mathbf{s}_k$


Line Search

With the update $\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha_k \nabla f(\mathbf{x}_k)$, we want to find $\alpha_k$ such that

$$f(\mathbf{x}_{k+1}) = \min_{\alpha} f\big(\mathbf{x}_k - \alpha\, \nabla f(\mathbf{x}_k)\big)$$

The first-order condition $\dfrac{\partial f}{\partial \alpha} = 0$ gives (by the chain rule)

$$\nabla f(\mathbf{x}_{k+1}) \cdot \nabla f(\mathbf{x}_k) = 0$$

i.e., $\nabla f(\mathbf{x}_{k+1})$ is orthogonal to $\nabla f(\mathbf{x}_k)$, which produces the characteristic zigzag convergence pattern.
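A minimal steepest descent sketch with the line search done by SciPy's golden-section minimizer, applied to the quadratic example above (the tolerance and iteration cap are illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(f, grad, x0, tol=1e-8, max_iter=500):
    """Steepest descent with a golden-section line search for alpha_k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = -grad(x)                                  # steepest descent direction
        if np.linalg.norm(s) < tol:                   # gradient ~ 0: done
            break
        # 1D line search: alpha_k = argmin_alpha f(x + alpha * s)
        alpha = minimize_scalar(lambda a: f(x + a * s), method='golden').x
        x = x + alpha * s
    return x

f    = lambda x: (x[0] - 1) ** 2 + (x[1] - 1) ** 2
grad = lambda x: np.array([2 * (x[0] - 1), 2 * (x[1] - 1)])

print(steepest_descent(f, grad, x0=[3.0, 3.0]))       # -> [1. 1.]
```

Because this example is a symmetric quadratic, the exact line search ($\alpha = 0.5$) lands on the minimizer in a single step; in general the iterates zigzag as described above.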

Example

Consider minimizing the function

$$f(x_1, x_2) = 10x_1^3 - x_2^2 + x_1 - 1$$

Given the initial guess $x_1 = 2$, $x_2 = 2$, what is the direction of the first step of gradient descent?

$$\nabla f = \begin{bmatrix} 30x_1^2 + 1 \\ -2x_2 \end{bmatrix}, \qquad \nabla f(\mathbf{x}_0) = \begin{bmatrix} 121 \\ -4 \end{bmatrix}$$

Direction of the first step: $-\nabla f(\mathbf{x}_0) = \begin{bmatrix} -121 \\ 4 \end{bmatrix}$

Newton’s Method

Using a Taylor expansion, we build the approximation

$$f(\mathbf{x} + \mathbf{s}) \approx f(\mathbf{x}) + \nabla f(\mathbf{x})^T \mathbf{s} + \tfrac{1}{2}\, \mathbf{s}^T \mathbf{H}(\mathbf{x})\, \mathbf{s} =: \hat{f}(\mathbf{s})$$

which replaces the nonlinear $f$ by a quadratic model in $\mathbf{s}$. The first-order condition $\dfrac{\partial \hat{f}}{\partial \mathbf{s}} = \mathbf{0}$ gives (using that $\mathbf{H}$ is symmetric)

$$\mathbf{H}(\mathbf{x}_k)\, \mathbf{s} = -\nabla f(\mathbf{x}_k)$$

so we solve this linear system to find the Newton step $\mathbf{s}$.

Newton’s Method

Algorithm:
Initial guess: $\mathbf{x}_0$
Solve: $\mathbf{H}(\mathbf{x}_k)\, \mathbf{s}_k = -\nabla f(\mathbf{x}_k)$
Update: $\mathbf{x}_{k+1} = \mathbf{x}_k + \mathbf{s}_k$

Note that the Hessian is related to the curvature and therefore contains the information about how large the step should be.

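A minimal ND Newton sketch with a direct linear solve for the step, tried on the earlier ND example (the starting guess and stopping rule are illustrative choices):

```python
import numpy as np

def newton_nd(grad, hess, x0, tol=1e-10, max_iter=50):
    """Newton's method in ND: solve H(x_k) s_k = -grad f(x_k), then x += s_k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = np.linalg.solve(hess(x), -grad(x))   # Newton step via a linear solve
        x = x + s
        if np.linalg.norm(s) < tol:
            break
    return x

# Gradient and Hessian of f(x1, x2) = 2*x1**3 + 4*x2**2 + 2*x2 - 24*x1
grad = lambda x: np.array([6 * x[0] ** 2 - 24, 8 * x[1] + 2])
hess = lambda x: np.array([[12 * x[0], 0.0],
                           [0.0,       8.0]])

print(newton_nd(grad, hess, x0=[3.0, 0.0]))      # -> [ 2.   -0.25], the minimizer
```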

Try this out!

$$f(x, y) = 0.5x^2 + 2.5y^2$$

When using Newton’s Method to find the minimizer of this function, estimate the number of iterations it would take for convergence.

A) 1  B) 2-5  C) 5-10  D) More than 10  E) Depends on the initial guess
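One way to see the answer: $f$ is already quadratic, so Newton's quadratic model is exact and a single step lands on the minimizer from any starting guess. A quick check (the starting point below is arbitrary):

```python
import numpy as np

# f(x, y) = 0.5*x**2 + 2.5*y**2: constant Hessian, exact quadratic model
grad = lambda z: np.array([z[0], 5.0 * z[1]])
hess = lambda z: np.array([[1.0, 0.0],
                           [0.0, 5.0]])

x0 = np.array([7.0, -3.0])                       # arbitrary starting guess
s = np.linalg.solve(hess(x0), -grad(x0))         # one Newton step
print(x0 + s)                                    # -> [0. 0.], the minimizer
```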

Newton’s Method Summary

Algorithm:
Initial guess: $\mathbf{x}_0$
Solve: $\mathbf{H}(\mathbf{x}_k)\, \mathbf{s}_k = -\nabla f(\mathbf{x}_k)$
Update: $\mathbf{x}_{k+1} = \mathbf{x}_k + \mathbf{s}_k$

About the method…
• Typical quadratic convergence ☺
• Need second derivatives ☹
• Local convergence (start guess close to the solution)
• Works poorly when the Hessian is nearly indefinite
• Cost per iteration: $O(n^3)$
