Derivative-free Methods using Linesearch Techniques

Stefano Lucidi

joint works with

P. Tseng
L. Grippo (the father of the linesearch approach)
M. Sciandrone
G. Liuzzi
F. Lampariello
V. Piccialli
F. Rinaldi
G. Fasano

(in order of appearance in this research activity)

PROBLEM DEFINITION:

min f(x)
s.t. g(x) ≤ 0

f : R^n → R, f ∈ C^1
g : R^n → R^m, g_i ∈ C^1
x ∈ R^n

∇f, ∇g_i are not available

MOTIVATIONS:

In many engineering problems the objective and constraint function values are obtained by

direct measurements

complex simulation programs

first order derivatives can often be neither explicitly calculated nor approximated

MOTIVATIONS:

In fact

the mathematical representations of the objective function and the constraints are not available

the source codes of the programs are not available

the values of the objective function and the constraints can be affected by the presence of noise

the evaluations of the objective function and the constraints can be very expensive

MOTIVATIONS:

the mathematical representations of the objective function and the constraints are not available

the first order derivatives of the objective function and the constraints cannot be computed analytically

MOTIVATIONS:

the source codes of the programs are not available

automatic differentiation techniques cannot be applied

MOTIVATIONS:

the evaluations of the objective function and the constraints can be very expensive

finite difference approximations can be too expensive (they need at least n function evaluations)
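As a rough illustration of that cost, a forward-difference gradient estimate needs one evaluation at x plus n perturbed evaluations. This is a hypothetical sketch (the function `fd_gradient` and the evaluation counter are illustrative, not part of the methods presented here):

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient estimate: besides f(x) it needs n extra
    evaluations of f, which is what makes it expensive for costly simulations."""
    x = np.asarray(x, dtype=float)
    fx = f(x)
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g

calls = 0
def quadratic(x):
    global calls
    calls += 1          # count how many times the "simulation" is run
    return float(x @ x)

g = fd_gradient(quadratic, [1.0, 2.0, 3.0])
# one evaluation at x plus n = 3 evaluations at perturbed points
```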

MOTIVATIONS:

finite difference approximations can produce severely inaccurate estimates of the first order derivatives

the values of the objective function and the constraints can be affected by the presence of noise

NUMERICAL EXPERIENCE:

we considered 41 box constrained standard test problems

we perturbed such problems in the following way:

f̃(x) = f(x)(1 + ε),  ε ∼ N(0, σ²)

where N(0, σ²) denotes a Gaussian distributed random number with zero mean and variance σ²
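The perturbation above can be sketched as follows (an illustrative snippet; the test function and the seeded generator `rng` are assumptions, not the actual test set):

```python
import numpy as np

rng = np.random.default_rng(0)

def perturbed(f, x, sigma):
    """Multiplicative Gaussian noise: f~(x) = f(x) * (1 + eps), eps ~ N(0, sigma^2)."""
    eps = rng.normal(0.0, sigma)
    return f(x) * (1.0 + eps)

f = lambda x: float(np.sum(np.asarray(x) ** 2))
exact = f([1.0, 2.0])                              # 5.0
noisy = perturbed(f, [1.0, 2.0], sigma=1e-1)       # exact value distorted by noise
```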

NUMERICAL EXPERIENCE:

we considered two codes:

DF_box = derivative-free method
E04UCF = NAG subroutine using finite-differences gradients

Number of Failures (out of 41 problems):

          σ² = 0    σ² = 10^-2    σ² = 10^-1
DF_box       0          3              2
E04UCF       2          3             23

GLOBALLY CONVERGENT DF METHODS

Direct search methods use only function values:

- pattern search methods, where the function is evaluated on specified geometric patterns

- line search methods, which use one-dimensional minimization along suitable search directions

Modelling methods approximate the functions by suitable models which are progressively built and updated

UNCONSTRAINED MINIMIZATION PROBLEMS

min f(x)
s.t. x ∈ R^n

f : R^n → R, f ∈ C^1

∇f is not available

L(x_0) = {x ∈ R^n : f(x) ≤ f(x_0)} is compact

THE ROLE OF THE GRADIENT

x_{k+1} = x_k + α_k d_k

∇f characterizes accurately the local behaviour of f; it allows us

- to determine an "efficient" descent direction d_k

- to determine a "good" step length α_k along the direction d_k

THE ROLE OF THE GRADIENT

∇f(x)^T e_i is the directional derivative of f along e_i

∇f provides the rates of change of f along the 2n directions ±e_i

∇f characterizes accurately the local behaviour of f

∇f(x) = 0  ⟺  ∇f(x)^T e_1 = 0, …, ∇f(x)^T e_n = 0

HOW TO OVERCOME THE LACK OF GRADIENT

a set of directions p_k^i, i = 1,…,r, can be associated with each x_k

the local behaviour of f along p_k^i, i = 1,…,r, should be indicative of the whole local behaviour of f

∇f(x) = 0  ⟺  ∇f(x)^T p_k^1 ≥ 0, …, ∇f(x)^T p_k^r ≥ 0

ASSUMPTION D

Given {x_k}, the bounded sequences {p_k^i}, i = 1,…,r, are such that

lim_{k→∞} min_{i=1,…,r} ∇f(x_k)^T p_k^i ≥ 0   ⟹   lim_{k→∞} ∇f(x_k) = 0

EXAMPLES OF SETS OF DIRECTIONS

p_k^i, i = 1,…,n, linearly independent and bounded, together with p_k^{n+i} = −p_k^i, i = 1,…,n

e.g. (n = 2): p_k^1 = e_1, p_k^2 = e_2, p_k^3 = −e_1, p_k^4 = −e_2

EXAMPLES OF SETS OF DIRECTIONS (Lewis, Torczon)

p_k^i, i = 1,…,r, bounded, with cone{p_k^1, …, p_k^r} = R^n

e.g. (n = 2): p_k^1 = e_1, p_k^2 = e_2, p_k^3 = −(e_1 + e_2)

EXAMPLES OF SETS OF DIRECTIONS

(figure: at x_k the directions p_k^1 = e_1, p_k^2 = ±e_2 are selected by comparing the function values f(x_k), f(v_k^1), f(v_k^2) at auxiliary points v_k^1, v_k^2)

UNCONSTRAINED MINIMIZATION PROBLEMS

Assumption D ensures that, by performing finer and finer samplings of f along p_k^i, i = 1,…,r, it is possible:

- either to realize that the point x_k is a good approximation of a stationary point of f

- or to find a point x_{k+1} where f is decreased

GLOBAL CONVERGENCE

By Assumption D we have:

if there exist steps α_k^i > 0, i = 1,…,r, with α_k^i → 0, such that

f(x_k + α_k^i p_k^i) ≥ f(x_k), i = 1,…,r

then

lim_{k→∞} ∇f(x_k)^T p_k^i = lim_{k→∞} ( f(x_k + α_k^i p_k^i) − f(x_k) ) / α_k^i ≥ 0, i = 1,…,r

and hence, by Assumption D, lim_{k→∞} ∇f(x_k) = 0

GLOBAL CONVERGENCE

By using directions p_k^i, i = 1,…,r, satisfying Assumption D it is possible to characterize the global convergence of a sequence of points {x_k} by means of the existence of suitable sequences of failures in decreasing the objective function f along the directions p_k^i, i = 1,…,r

GLOBAL CONVERGENCE

By Assumption D we have:

if there exist points y_k^i and steps α_k^i > 0, i = 1,…,r, with

α_k^i → 0 and ‖y_k^i − x_k‖ → 0, i = 1,…,r,

such that

f(y_k^i + α_k^i p_k^i) ≥ f(y_k^i) − o(α_k^i), i = 1,…,r

then

lim_{k→∞} ∇f(x_k)^T p_k^i = lim_{k→∞} ( f(y_k^i + α_k^i p_k^i) − f(y_k^i) ) / α_k^i ≥ 0, i = 1,…,r

and hence, by Assumption D, lim_{k→∞} ∇f(x_k) = 0

PROPOSITION Let {x_k} and {p_k^i}, i = 1,…,r, be such that:

- the directions p_k^i, i = 1,…,r, satisfy Assumption D

- f(x_{k+1}) ≤ f(x_k)

- there exist sequences of points {y_k^i} and scalars {α_k^i > 0}, i = 1,…,r, such that

  lim_{k→∞} α_k^i = 0,  lim_{k→∞} ‖y_k^i − x_k‖ = 0,

  f(y_k^i + α_k^i p_k^i) ≥ f(y_k^i) − o(α_k^i)

then

lim_{k→∞} ∇f(x_k) = 0

GLOBAL CONVERGENCE

• the sampling of f along all the directions p_k^i, i = 1,…,r, can be distributed along the iterations of the algorithm

• it is not necessary to perform at each point x_k a sampling of f along all the directions p_k^i, i = 1,…,r

• the Proposition characterizes in "some sense" the requirements on the acceptable samplings of f along the directions that guarantee the global convergence

GLOBAL CONVERGENCE

The use of directions satisfying Assumption D and the production of sequences of points satisfying the hypotheses of the Proposition are the common elements of all the globally convergent direct search methods

The direct search methods can be divided into:

- pattern search methods
- line search methods

PATTERN SEARCH METHODS

Cons: all the points produced must lie in a suitable lattice; this implies
- additional assumptions on the search directions
- restrictions on the choices of the steplengths

(in the line search methods there are no additional requirements with respect to Assumption D and the assumptions of the Proposition)

Pros: they require that the new point produces a simple decrease of f

(in the line search methods the new point must guarantee a "sufficient" decrease of f)

LINESEARCH TECHNIQUES

with derivatives (Armijo-type condition): compute α_k such that

f(x_k + α_k d_k) ≤ f(x_k) + γ α_k ∇f(x_k)^T d_k,  γ ∈ (0,1)

(figure: f(x_k + α d_k) compared with the line f(x_k) + γ α ∇f(x_k)^T d_k)

derivative-free: the directional-derivative term is replaced by −γ α_k² ‖d_k‖²:

f(x_k + α_k d_k) ≤ f(x_k) − γ α_k² ‖d_k‖²,  γ > 0

(figure: f(x_k + α d_k) compared with the parabola f(x_k) − γ α² ‖d_k‖², with trial steps α_k, 2α_k)

ALGORITHM DF

STEP 1 Compute p_k^1, …, p_k^r satisfying Assumption D

STEP 2 Minimization of f along p_k^1, …, p_k^r:

y_k^1 = x_k,  y_k^{i+1} = y_k^i + α_k^i p_k^i, i = 1,…,r

STEP 3 Compute x_{k+1} (with f(x_{k+1}) ≤ f(y_k^{r+1})) and set k = k+1
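A minimal sketch of this scheme, using the 2n coordinate directions ±e_i (which satisfy Assumption D) and the sufficient decrease test f(y + α p) ≤ f(y) − γ α²: the parameter values, the step-doubling on success, and the halving on failure are illustrative choices, not the exact rules of the original algorithm:

```python
import numpy as np

def algorithm_df(f, x0, gamma=1e-6, theta=0.5, tol=1e-8, max_iter=2000):
    """Hypothetical minimal instance of Algorithm DF with directions +/- e_i."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    dirs = [s * np.eye(n)[i] for i in range(n) for s in (+1.0, -1.0)]
    alpha = [1.0] * len(dirs)                  # initial steps alpha~_k^i
    for _ in range(max_iter):
        y = x.copy()
        for i, p in enumerate(dirs):
            a = alpha[i]
            if f(y + a * p) <= f(y) - gamma * a * a:
                y = y + a * p                  # success along p_k^i
                alpha[i] = 2.0 * a             # try a larger step next time
            else:
                alpha[i] = theta * a           # failure: shrink the step
        x = y                                  # STEP 3: accept the sampled point
        if max(alpha) < tol:                   # max alpha~ as stationarity measure
            break
    return x

x_star = algorithm_df(lambda z: (z[0] - 1.0) ** 2 + (z[1] + 2.0) ** 2, [0.0, 0.0])
```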

STEP 2

The aim of this step is:

- to detect the "promising" directions, i.e. the directions along which the function decreases "sufficiently":

  f(y_k^i + α_k^i p_k^i) ≤ f(y_k^i) − γ (α_k^i)²

- to compute steplengths along these directions which guarantee both a "sufficient" decrease of the function and a "sufficient" move away from the previous point

LINESEARCH TECHNIQUE

The linesearch along p_k^i tests the sufficient decrease condition

f(y_k^i + α p_k^i) ≤ f(y_k^i) − γ α²

starting from the initial step α = α̃_k^i

- if the test fails at α̃_k^i, set α_k^i = 0, y_k^{i+1} = y_k^i and α̃_{k+1}^i = α̃_k^i / 2

(figure: a failed sampling of f at α̃_k^i along p_k^i)

- if the test holds, the step is expanded (α̃_k^i, 2α̃_k^i, 4α̃_k^i, 8α̃_k^i, …) as long as the condition remains satisfied; the last accepted value gives α_k^i and y_k^{i+1} = y_k^i + α_k^i p_k^i

(figure: expansion trials 2α̃_k^i, 4α̃_k^i, 8α̃_k^i, with α_k^i = 4α̃_k^i accepted)
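The expansion scheme above can be sketched as a standalone routine (a hypothetical minimal version; the function name, the safeguard cap on the expansion, and the parameter values are illustrative):

```python
import numpy as np

def df_linesearch(f, y, p, alpha_init, gamma=1e-6):
    """Derivative-free linesearch with extrapolation: accept alpha when
    f(y + alpha p) <= f(y) - gamma alpha^2, then keep doubling the step
    while the condition still holds; on failure return alpha = 0 and
    halve the initial step for the next iteration."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    fy = f(y)
    a = alpha_init
    if f(y + a * p) > fy - gamma * a * a:
        return 0.0, alpha_init / 2.0           # failure: alpha_k^i = 0
    while a < 1e12 and f(y + 2.0 * a * p) <= fy - gamma * (2.0 * a) ** 2:
        a *= 2.0                               # expansion: 2a, 4a, 8a, ...
    return a, a                                # accepted step, next initial step

# trial on f(t) = ||t||^2 from y = (1, 0) along p = -e_1
step, next_init = df_linesearch(lambda t: float(t @ t), [1.0, 0.0], [-1.0, 0.0], 0.5)
```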

STEP 2

ik

~ The value of the initial step along the i-th direction derives from

the linesearch performed along the i-th direction at the previuos

iteration

If the set of search directions does not depend on the iteration

namely

the scalar should be representative of the behaviour of the objective

function along the i-th direction

r,1 ,i ,iik pp

ik

~ip

STEP 3

Find x_{k+1} such that f(x_{k+1}) ≤ f(y_k^{r+1}); otherwise set x_{k+1} = y_k^{r+1}

set k = k+1 and go to Step 1

At Step 3, every approximation technique can be used to produce a new better point

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by Algorithm DF; then there exists an accumulation point of {x_k}, and every accumulation point of {x_k} is a stationary point of the objective function f

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

(LCP) min f(x)
      s.t. Ax ≤ b

f : R^n → R, f ∈ C^1
x ∈ R^n, A ∈ R^{m×n}, b ∈ R^m

∇f is not available

F = {x ∈ R^n : Ax ≤ b}

L(x_0) = {x ∈ F : f(x) ≤ f(x_0)} is compact

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

Given a feasible point x ∈ F it is possible to define

• the set of the indices of the active constraints

I(x) = {j ∈ {1,…,m} : a_j^T x = b_j}

• the set of the feasible directions

T(x) = {d ∈ R^n : a_j^T d ≤ 0, j ∈ I(x)}

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

x* ∈ F is a stationary point for Problem (LCP)  ⟺  ∇f(x*)^T d ≥ 0 for all d ∈ T(x*)

given directions p^1, …, p^r with cone{p^1, …, p^r} = T(x*):

x* ∈ F is a stationary point for Problem (LCP)  ⟺  ∇f(x*)^T p^1 ≥ 0, …, ∇f(x*)^T p^r ≥ 0

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

x̄ ∈ F is a stationary point for Problem (LCP) if the directions satisfy

cone{p_k^1, …, p_k^{r_k}} = T(x_k)

and

f(x_k + α_k^i p_k^i) ≥ f(x_k), with α_k^i > 0, α_k^i → 0, i = 1,…,r_k,

lim_{k→∞} x_k = x̄,  lim_{k→∞} T(x_k) = T(x̄)

so that

lim_{k→∞} ∇f(x_k)^T p_k^i = lim_{k→∞} ( f(x_k + α_k^i p_k^i) − f(x_k) ) / α_k^i ≥ 0, i = 1,…,r_k

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

Given x ∈ F and ε > 0 it is possible to define

• an estimate of the set of the indices of the active constraints

I(x; ε) = {j ∈ {1,…,m} : a_j^T x ≥ b_j − ε}

• an estimate of the set of the feasible directions

T(x; ε) = {d ∈ R^n : a_j^T d ≤ 0, j ∈ I(x; ε)}

• T(x_k; ε) has good properties which allow us to define globally convergent algorithms

ASSUMPTION D2 (an example)

Given {x_k} and ε > 0, the set of directions {p_k^1, …, p_k^{r_k}}, with ‖p_k^i‖ = 1, i = 1,…,r_k, satisfies:

- r_k is uniformly bounded

- cone{p_k^1, …, p_k^{r_k}} = T(x_k; ε)

ALGORITHM DFL

STEP 1 Compute p_k^1, …, p_k^{r_k} satisfying Assumption D2

STEP 2 Minimization of f along p_k^1, …, p_k^{r_k}

STEP 3 Compute the new point x_{k+1} and set k = k+1

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by Algorithm DFL; then there exists an accumulation point of {x_k}, and every accumulation point of {x_k} is a stationary point for Problem (LCP)

BOX CONSTRAINED MINIMIZATION PROBLEMS

(BCP) min f(x)
      s.t. l ≤ x ≤ u

f : R^n → R, f ∈ C^1
x ∈ R^n, l ∈ R^n, u ∈ R^n

∇f is not available

F = {x ∈ R^n : l ≤ x ≤ u}

L(x_0) = {x ∈ F : f(x) ≤ f(x_0)} is compact

the set {e_1, …, e_n, −e_1, …, −e_n} satisfies Assumption D2
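For box constraints the ε-active coordinate directions are easy to compute. The sketch below (the helper name `box_directions` and the filtering rule are illustrative assumptions in the spirit of Assumption D2) keeps +e_i unless the upper bound on x_i is ε-active, and −e_i unless the lower bound is:

```python
import numpy as np

def box_directions(x, l, u, eps):
    """Coordinate directions feasible w.r.t. the eps-active bounds of [l, u]."""
    x, l, u = (np.asarray(v, dtype=float) for v in (x, l, u))
    n = x.size
    dirs = []
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0
        if x[i] < u[i] - eps:      # upper bound not eps-active: +e_i feasible
            dirs.append(e)
        if x[i] > l[i] + eps:      # lower bound not eps-active: -e_i feasible
            dirs.append(-e)
    return dirs

# at x = (0, 0.5) in [0,1]^2 with eps = 0.1 the direction -e_1 is excluded
ds = box_directions([0.0, 0.5], [0.0, 0.0], [1.0, 1.0], 0.1)
```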

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

(NCP) min f(x)
      s.t. g(x) ≤ 0
           Ax ≤ b

f : R^n → R, f ∈ C^1
g_i : R^n → R, g_i ∈ C^1, i = 1,…,p
x ∈ R^n, A ∈ R^{m×n}, b ∈ R^m

∇f is not available
∇g_i, i = 1,…,p, are not available

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

We define

F = {x ∈ R^n : g(x) ≤ 0, Ax ≤ b}

F̃ = {x ∈ R^n : Ax ≤ b}

and, given a point x ∈ F̃,

I(x) = {j ∈ {1,…,m} : a_j^T x = b_j}

T(x) = {d ∈ R^n : a_j^T d ≤ 0, j ∈ I(x)}

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

ASSUMPTION A1 The set F̃ is compact

ASSUMPTION A2 For every x ∈ F̃ there exists a vector d ∈ T(x) such that

∇g_i(x)^T d < 0 for all i : g_i(x) ≥ 0

Assumption A1 → boundedness of the iterates

Assumption A2 → existence and boundedness of the Lagrange multipliers

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

We consider the following continuously differentiable penalty function:

P(x; ε) = f(x) + (1/ε) Σ_{i=1}^{p} max{0, g_i(x)}^q

where ε > 0 (penalty parameter) and q > 1
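The penalty function can be sketched directly from its definition (an illustrative snippet; q = 2 is one choice satisfying q > 1, and the toy constraint is an assumption):

```python
import numpy as np

def penalty(f, gs, x, eps, q=2.0):
    """P(x; eps) = f(x) + (1/eps) * sum_i max(0, g_i(x))**q.
    With q > 1 the max-term is continuously differentiable wherever
    f and the g_i are."""
    x = np.asarray(x, dtype=float)
    violation = sum(max(0.0, g(x)) ** q for g in gs)
    return f(x) + violation / eps

f = lambda x: float(x[0] ** 2)
g = [lambda x: 1.0 - x[0]]        # constraint g(x) = 1 - x <= 0, i.e. x >= 1

feasible = penalty(f, g, [2.0], eps=0.1)     # no violation: P = f = 4
infeasible = penalty(f, g, [0.0], eps=0.1)   # g = 1 violated: P = 0 + 1/0.1
```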

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

the problem

min f(x)  s.t. g(x) ≤ 0, Ax ≤ b

is replaced by

min P(x; ε)  s.t. Ax ≤ b

? ε_k → 0 ?

ALGORITHM DFN

STEP 1 Compute p_k^1, …, p_k^{r_k} satisfying Assumption D2

STEP 2 Minimization of P(x; ε_k) along p_k^1, …, p_k^{r_k}

STEP 3 Compute the new point (x_{k+1}, ε_{k+1}) and set k = k+1

new STEP 3 (ε_0 > 0, θ ∈ (0,1), s > 0)

Find x_{k+1} ∈ F̃ such that P(x_{k+1}; ε_k) ≤ P(y_k^{r_k+1}; ε_k); otherwise set x_{k+1} = y_k^{r_k+1}

if  max_{i=1,…,r_k} α̃_k^i ≤ ε_k^s  and  max_{i=1,…,p} max{0, g_i(x_{k+1})} > 0,  then set ε_{k+1} = θ ε_k; otherwise set ε_{k+1} = ε_k

set k = k+1 and go to Step 1

new STEP 3

ε_k is reduced whenever a better approximation of a stationary point of the penalty function has been obtained

max_{i=1,…,r_k} α̃_k^i can be viewed as a stationarity measure:

max_{i=1,…,r_k} α̃_k^i → 0

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by Algorithm DFN; then there exists an accumulation point of {x_k} which is a stationary point for Problem (NCP)

MIXED NONLINEAR MINIMIZATION PROBLEMS

(MNCP) min f(x)
       s.t. g(x) ≤ 0
            l ≤ x ≤ u
            x_i ∈ Z, i ∈ I_z

We define

F = {x ∈ R^n : g(x) ≤ 0, l ≤ x ≤ u, x_i ∈ Z, i ∈ I_z}

F̃ = {x ∈ R^n : l ≤ x ≤ u}

n_c = |I_c| number of continuous variables, n_z = |I_z| number of discrete variables

MIXED NONLINEAR MINIMIZATION PROBLEMS

We write x = (x_c, x_z), with x_c the continuous variables (i ∈ I_c) and x_z the discrete variables (i ∈ I_z), and define the discrete neighbourhood

N(x̄) = {x ∈ F : x_c = x̄_c, ‖x_z − x̄_z‖ = 1}

x̄ ∈ F is a stationary point of Problem (MNCP) if there exists λ ∈ R^p such that:

- the KKT conditions w.r.t. the continuous variables x_c hold at (x̄, λ)

- f(x) ≥ f(x̄) for all x ∈ N(x̄)

ALGORITHM MDFN

STEP 1 Compute the directions p_k^1 = e_1, …, p_k^n = e_n

STEP 2 Minimization of P(x; ε_k) along p_k^1, …, p_k^n:

- if i ∈ I_c, perform a continuous linesearch along p_k^i

- if i ∈ I_z, perform a discrete linesearch along p_k^i

STEP 3 Compute the new point (x_{k+1}, ε_{k+1}) and set k = k+1

Continuous linesearch

Continuous linesearch of MDFN = linesearch of DFN, with the additional requirement y_k^{i+1} ∈ F̃

it produces the point y_k^{i+1} = y_k^i + α_k^i p_k^i

LINESEARCH TECHNIQUE (discrete)

The discrete linesearch along p_k^i, i ∈ I_z, uses integer steps and the sufficient decrease condition

f(y_k^i + α p_k^i) ≤ f(y_k^i) − ξ,  ξ > 0

starting from the initial step α = α̃_k^i

- if the test fails at α̃_k^i, set α_k^i = 0, y_k^{i+1} = y_k^i and α̃_{k+1}^i = max{1, α̃_k^i / 2}

- if the test holds, the step is expanded (α̃_k^i, 2α̃_k^i, 4α̃_k^i, 8α̃_k^i, …) as long as the condition remains satisfied; the last accepted value gives α_k^i, with y_k^{i+1} = y_k^i + α_k^i p_k^i and α̃_{k+1}^i = 2 α_k^i
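A standalone sketch of such a discrete linesearch (hypothetical and simplified: the function name, the fixed decrease ξ, and the update rules mirror the scheme above but are illustrative):

```python
def discrete_linesearch(f, y, i, sign, alpha_init, xi=1e-6):
    """Discrete linesearch along +/- e_i with integer steps: accept alpha if
    f(y + alpha*e_i) <= f(y) - xi, expand by doubling while the test holds;
    on failure return 0 and halve the initial step (never below 1)."""
    def shifted(a):
        z = list(y)
        z[i] += sign * a
        return f(z)
    fy = f(list(y))
    a = max(1, int(alpha_init))
    if shifted(a) > fy - xi:
        return 0, max(1, a // 2)              # failure: alpha_k^i = 0
    while shifted(2 * a) <= fy - xi:
        a *= 2                                # expansion with integer steps
    return a, 2 * a                           # accepted step, next initial step

# integer minimization of f(y) = y^2 starting from y = 16 along -e_0
step, next_init = discrete_linesearch(lambda z: z[0] ** 2, [16], 0, -1, 1)
```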

MIXED NONLINEAR MINIMIZATION PROBLEMS

ASSUMPTION A3

Either the nonlinear constraint functions g_i, i = 1,…,m, do not depend on the integer variables x_i, i ∈ I_z,

or every accumulation point x̄ of the sequence produced by the algorithm satisfies: the points x ∈ F̃ with x_c = x̄_c, ‖x_z − x̄_z‖ = 1 are such that g_i(x) ≠ 0, i = 1,…,p

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by Algorithm MDFN; then there exists an accumulation point of {x_k} which is a stationary point for Problem (MNCP)

MIXED NONLINEAR MINIMIZATION PROBLEMS

More complex (and expensive) derivative-free algorithms allow us

- to determine "better" stationary points

- to tackle "more difficult" mixed nonlinear optimization problems

MIXED NONLINEAR MINIMIZATION PROBLEMS

to determine "better" stationary points for Problem (MNCP):

x̄ ∈ F is an extended stationary point if every point x̃ ∈ F with x̃_c = x̄_c, ‖x̃_z − x̄_z‖ = 1 and f(x̃) = f(x̄) also satisfies the KKT conditions w.r.t. the continuous variables x_c

to tackle "more difficult" mixed nonlinear optimization problems:

Three different sets of variables:

- x: continuous variables
- y: discrete general variables
- z: discrete dimensional variables

Discrete dimensional variables z: vector of discrete variables which determine the number of continuous and discrete variables

HARD MIXED NONLINEAR MINIMIZATION PROBLEMS

(Hard-MNCP) min f(x, y, z)
            s.t. x ∈ X(y, z), y ∈ Y(z), z ∈ Z

with x ∈ R^{n_x(z)} (continuous), y ∈ Z^{n_y(z)} (discrete), z ∈ Z^{n_z} (dimensional)

The feasible set of y depends on the dimensional variables z

The feasible set of x depends on the discrete variables y and on the dimensional variables z

NONSMOOTH MINIMIZATION PROBLEMS

min max{(x−5)² + y², x² + (y−2)²}

the cone of descent directions can be made arbitrarily narrow

NONSMOOTH MINIMIZATION PROBLEMS

Possible approaches:

smoothing techniques

“larger” set of search directions

NONSMOOTH MINIMIZATION PROBLEMS

smoothing techniques

min max_{1≤i≤q} f_i(x)
s.t. Ax ≤ b

f_i : R^n → R, f_i ∈ C², i = 1,…,q
x ∈ R^n, A ∈ R^{m×n}, b ∈ R^m

NONSMOOTH MINIMIZATION PROBLEMS

f(x) = max_{1≤i≤q} f_i(x)

smoothed approximation (μ > 0):

f(x; μ) = μ ln Σ_{i=1}^{q} exp( f_i(x) / μ )

f(x) ≤ f(x; μ) ≤ f(x) + μ ln q
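The log-sum-exp smoothing and its bounds can be sketched as follows (illustrative; the stable computation that factors out the maximum is a standard implementation choice, not part of the slides):

```python
import math

def smooth_max(fs, x, mu):
    """Log-sum-exp smoothing of f(x) = max_i f_i(x):
    f(x; mu) = mu * ln(sum_i exp(f_i(x)/mu)), computed stably by factoring
    out the maximum, so that f(x) <= f(x; mu) <= f(x) + mu*ln(q)."""
    vals = [fi(x) for fi in fs]
    m = max(vals)
    return m + mu * math.log(sum(math.exp((v - m) / mu) for v in vals))

# the minmax example: f1 = (x-5)^2 + y^2, f2 = x^2 + (y-2)^2
fs = [lambda p: (p[0] - 5.0) ** 2 + p[1] ** 2,
      lambda p: p[0] ** 2 + (p[1] - 2.0) ** 2]

x = (0.0, 0.0)
exact = max(25.0, 4.0)                 # f(x) = 25
smoothed = smooth_max(fs, x, mu=0.1)   # within mu*ln(2) of the exact max
```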

ALGORITHM DFN

STEP 1 Compute p_k^1, …, p_k^{r_k} satisfying Assumption D2

STEP 2 Minimization of f(x; μ_k) along p_k^1, …, p_k^{r_k}

STEP 3 Compute the new point (x_{k+1}, μ_{k+1}) and set k = k+1

new STEP 3

Find x_{k+1} ∈ F̃ such that f(x_{k+1}; μ_k) ≤ f(y_k^{r_k+1}; μ_k); otherwise set x_{k+1} = y_k^{r_k+1}

set μ_{k+1} = min{ μ_k, max_i { (α̃_k^i)^{1/2}, (α_k^i)^{1/2} } }

set k = k+1 and go to Step 1

new STEP 3

μ_k is reduced whenever a better approximation of a stationary point of the smoothed function has been obtained

max_{i=1,…,r_k} α̃_k^i can be viewed as a stationarity measure:

max_{i=1,…,r_k} α̃_k^i → 0

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by the Algorithm; then there exists an accumulation point of {x_k} which is a stationary point for the MinMax Problem

NONSMOOTH MINIMIZATION PROBLEMS

"larger" set of search directions

(NCP) min f(x)
      s.t. g(x) ≤ 0
           l ≤ x ≤ u

f : R^n → R locally Lipschitz-continuous
g_i : R^n → R, i = 1,…,p, locally Lipschitz-continuous
x ∈ R^n, l ∈ R^n, u ∈ R^n

We consider the following nonsmooth penalty function:

Z(x; ε) = f(x) + (1/ε) Σ_{i=1}^{p} max{0, g_i(x)}

where ε > 0 (penalty parameter)

NONSMOOTH MINIMIZATION PROBLEMS

ASSUMPTION A1 The set F̃ is compact

ASSUMPTION A2 For every x ∈ F̃ there exists a vector d ∈ T(x) such that

∇g_i(x)^T d < 0 for all i : g_i(x) ≥ 0

for every ε ∈ (0, ε̄], the problem

min f(x)  s.t. g(x) ≤ 0, Ax ≤ b

is equivalent to

min Z(x; ε)  s.t. Ax ≤ b

NONSMOOTH MINIMIZATION PROBLEMS

It is possible to define algorithms globally convergent towards stationary points (in the Clarke sense) by assuming that the algorithms use sets of search directions p_k^1, …, p_k^{r_k} which are asymptotically dense in the unit sphere
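A common way to obtain directions that are (with probability one) asymptotically dense in the unit sphere is to draw normalized Gaussian vectors; this sketch is illustrative and is only one of several possible generation schemes:

```python
import numpy as np

def dense_directions(n, r, rng):
    """Draw r random unit vectors in R^n: normalized Gaussian samples are
    uniformly distributed on the unit sphere, so over the iterations the
    union of such samples is dense in the sphere with probability one."""
    d = rng.standard_normal((r, n))
    return d / np.linalg.norm(d, axis=1, keepdims=True)

rng = np.random.default_rng(0)
P = dense_directions(3, 5, rng)   # 5 unit directions in R^3
```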

NONSMOOTH MINIMIZATION PROBLEMS

Multiobjective optimization problem (work in progress)

min (f_1(x), …, f_q(x))
s.t. g(x) ≤ 0
     l ≤ x ≤ u

f_j : R^n → R, j = 1,…,q, locally Lipschitz-continuous
g_i : R^n → R, i = 1,…,p, locally Lipschitz-continuous
x ∈ R^n, l ∈ R^n, u ∈ R^n

NONSMOOTH MINIMIZATION PROBLEMS

Bilevel optimization problem (work in progress)

min F(x, y)
s.t. G(x, y) ≤ 0
     l_x ≤ x ≤ u_x

where

y ∈ arg min f(x, y)
        s.t. g(x, y) ≤ 0
             l_y ≤ y ≤ u_y

Our DF-codes are available at:

http://www.dis.uniroma1.it/~lucidi/DFL

Thank you

for your attention

Optimal Design of Magnetic Resonance apparatus

(figure: magnet configurations with n. magnets = 3, n. rings = 6 and n. magnets = 4; half magnet shown)

z = n. rings, y = n. magnets

Design Variables

Positions of the rings along the X-axis: x1, x2, x3, x4, x5, x6

Angular positions of each row of small magnets: θ1, θ2, θ3, θ4

Design Variables

Offsets of the 4 outermost rings w.r.t. the 2 innermost ones: b1, b2, b3, b4

Radius r of the magnets (integer values)

Objective Function

The objective function measures the non-uniformity of the magnetic field within a specified target region:

U(x) = [ Σ_{i=1}^{N} ( B_X(p_i)² + B_Y(p_i)² + (B_Z(p_i) − B̄_Z)² ) ]^{1/2} / B̄_Z

with  B̄_Z = (1/N) Σ_{i=1}^{N} B_Z(p_i)

Magnetic field as uniform as possible and directed along the Z axis

Magnetic Resonance Results

Starting point (commercial devices): nr = 5, nm = 3, r = 22, f = 51 ppm

Final point: nr = 7, nm = 3, r = 27, f = 18 ppm

(figure: behavior of (B_Z(x) − B_Z*) / B_Z* on the ZY plane for the 51 ppm and 18 ppm configurations)
