Derivative-free Methods using Linesearch Techniques

Stefano Lucidi

joint works with

P. Tseng
L. Grippo (the father of the linesearch approach)
M. Sciandrone
G. Liuzzi
F. Lampariello
V. Piccialli
F. Rinaldi
G. Fasano

(in order of appearance in this research activity)

PROBLEM DEFINITION:

min f(x)
s.t. g(x) ≤ 0

f : R^n → R, f ∈ C^1
g : R^n → R^m, g_i ∈ C^1
x ∈ R^n

∇f, ∇g_i are not available

MOTIVATIONS:

In many engineering problems the objective and constraint function values are obtained by

direct measurements

complex simulation programs

first order derivatives can often be neither explicitly calculated nor approximated

MOTIVATIONS:

In fact

the mathematical representations of the objective function and the constraints are not available

the source codes of the programs are not available

the values of the objective function and the constraints can be affected by the presence of noise

the evaluations of the objective function and the constraints can be very expensive

MOTIVATIONS:

the mathematical representations of the objective function and the constraints are not available

the first order derivatives of the objective function and the constraints cannot be computed analytically

MOTIVATIONS:

the source codes of the programs are not available

automatic differentiation techniques cannot be applied

MOTIVATIONS:

the evaluations of the objective function and the constraints can be very expensive

finite difference approximations can be too expensive (they need at least n function evaluations)
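As a rough illustration of that cost, a forward-difference gradient estimate needs one evaluation at x plus n perturbed evaluations. This is a hypothetical sketch (the function `fd_gradient` and the evaluation counter are illustrative, not part of the methods presented here):

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient estimate: besides f(x) it needs n extra
    evaluations of f, which is what makes it expensive for costly simulations."""
    x = np.asarray(x, dtype=float)
    fx = f(x)
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g

calls = 0
def quadratic(x):
    global calls
    calls += 1          # count how many times the "simulation" is run
    return float(x @ x)

g = fd_gradient(quadratic, [1.0, 2.0, 3.0])
# one evaluation at x plus n = 3 evaluations at perturbed points
```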

MOTIVATIONS:

finite difference approximations can produce severely inaccurate estimates of the first order derivatives

the values of the objective function and the constraints can be affected by the presence of noise

NUMERICAL EXPERIENCE:

we considered 41 box constrained standard test problems

we perturbed such problems in the following way:

f̃(x) = f(x)(1 + ε),  ε ∼ N(0, σ²)

where N(0, σ²) denotes a Gaussian distributed random number with zero mean and variance σ²
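The perturbation above can be sketched as follows (an illustrative snippet; the test function and the seeded generator `rng` are assumptions, not the actual test set):

```python
import numpy as np

rng = np.random.default_rng(0)

def perturbed(f, x, sigma):
    """Multiplicative Gaussian noise: f~(x) = f(x) * (1 + eps), eps ~ N(0, sigma^2)."""
    eps = rng.normal(0.0, sigma)
    return f(x) * (1.0 + eps)

f = lambda x: float(np.sum(np.asarray(x) ** 2))
exact = f([1.0, 2.0])                              # 5.0
noisy = perturbed(f, [1.0, 2.0], sigma=1e-1)       # exact value distorted by noise
```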

NUMERICAL EXPERIENCE:

we considered two codes:

DF_box = derivative-free method
E04UCF = NAG subroutine using finite-differences gradients

Number of Failures (out of 41 problems):

          σ² = 0    σ² = 10^-2    σ² = 10^-1
DF_box       0          3              2
E04UCF       2          3             23

GLOBALLY CONVERGENT DF METHODS

Direct search methods use only function values:

- pattern search methods, where the function is evaluated on specified geometric patterns

- line search methods, which use one-dimensional minimization along suitable search directions

Modelling methods approximate the functions by suitable models which are progressively built and updated

UNCONSTRAINED MINIMIZATION PROBLEMS

min f(x)
s.t. x ∈ R^n

f : R^n → R, f ∈ C^1

∇f is not available

L(x_0) = {x ∈ R^n : f(x) ≤ f(x_0)} is compact

THE ROLE OF THE GRADIENT

x_{k+1} = x_k + α_k d_k

∇f characterizes accurately the local behaviour of f; it allows us

- to determine an "efficient" descent direction d_k

- to determine a "good" step length α_k along the direction d_k

THE ROLE OF THE GRADIENT

∇f(x)^T e_i is the directional derivative of f along e_i

∇f provides the rates of change of f along the 2n directions ±e_i

∇f characterizes accurately the local behaviour of f

∇f(x) = 0  ⟺  ∇f(x)^T e_1 = 0, …, ∇f(x)^T e_n = 0

HOW TO OVERCOME THE LACK OF GRADIENT

a set of directions p_k^i, i = 1,…,r, can be associated with each x_k

the local behaviour of f along p_k^i, i = 1,…,r, should be indicative of the whole local behaviour of f

∇f(x) = 0  ⟺  ∇f(x)^T p_k^1 ≥ 0, …, ∇f(x)^T p_k^r ≥ 0

ASSUMPTION D

Given {x_k}, the bounded sequences {p_k^i}, i = 1,…,r, are such that

lim_{k→∞} min_{i=1,…,r} ∇f(x_k)^T p_k^i ≥ 0   ⟹   lim_{k→∞} ∇f(x_k) = 0

EXAMPLES OF SETS OF DIRECTIONS

p_k^i, i = 1,…,n, linearly independent and bounded, together with p_k^{n+i} = −p_k^i, i = 1,…,n

e.g. (n = 2): p_k^1 = e_1, p_k^2 = e_2, p_k^3 = −e_1, p_k^4 = −e_2

EXAMPLES OF SETS OF DIRECTIONS (Lewis, Torczon)

p_k^i, i = 1,…,r, bounded, with cone{p_k^1, …, p_k^r} = R^n

e.g. (n = 2): p_k^1 = e_1, p_k^2 = e_2, p_k^3 = −(e_1 + e_2)

EXAMPLES OF SETS OF DIRECTIONS

(figure: at x_k the directions p_k^1 = e_1, p_k^2 = ±e_2 are selected by comparing the function values f(x_k), f(v_k^1), f(v_k^2) at auxiliary points v_k^1, v_k^2)

UNCONSTRAINED MINIMIZATION PROBLEMS

Assumption D ensures that, by performing finer and finer samplings of f along p_k^i, i = 1,…,r, it is possible:

- either to realize that the point x_k is a good approximation of a stationary point of f

- or to find a point x_{k+1} where f is decreased

GLOBAL CONVERGENCE

By Assumption D we have:

if there exist steps α_k^i > 0, i = 1,…,r, with α_k^i → 0, such that

f(x_k + α_k^i p_k^i) ≥ f(x_k), i = 1,…,r

then

lim_{k→∞} ∇f(x_k)^T p_k^i = lim_{k→∞} ( f(x_k + α_k^i p_k^i) − f(x_k) ) / α_k^i ≥ 0, i = 1,…,r

and hence, by Assumption D, lim_{k→∞} ∇f(x_k) = 0

GLOBAL CONVERGENCE

By using directions p_k^i, i = 1,…,r, satisfying Assumption D it is possible to characterize the global convergence of a sequence of points {x_k} by means of the existence of suitable sequences of failures in decreasing the objective function f along the directions p_k^i, i = 1,…,r

GLOBAL CONVERGENCE

By Assumption D we have:

if there exist points y_k^i and steps α_k^i > 0, i = 1,…,r, with

α_k^i → 0 and ‖y_k^i − x_k‖ → 0, i = 1,…,r,

such that

f(y_k^i + α_k^i p_k^i) ≥ f(y_k^i) − o(α_k^i), i = 1,…,r

then

lim_{k→∞} ∇f(x_k)^T p_k^i = lim_{k→∞} ( f(y_k^i + α_k^i p_k^i) − f(y_k^i) ) / α_k^i ≥ 0, i = 1,…,r

and hence, by Assumption D, lim_{k→∞} ∇f(x_k) = 0

PROPOSITION Let {x_k} and {p_k^i}, i = 1,…,r, be such that:

- the directions p_k^i, i = 1,…,r, satisfy Assumption D

- f(x_{k+1}) ≤ f(x_k)

- there exist sequences of points {y_k^i} and scalars {α_k^i > 0}, i = 1,…,r, such that

  lim_{k→∞} α_k^i = 0,  lim_{k→∞} ‖y_k^i − x_k‖ = 0,

  f(y_k^i + α_k^i p_k^i) ≥ f(y_k^i) − o(α_k^i)

then

lim_{k→∞} ∇f(x_k) = 0

GLOBAL CONVERGENCE

• the sampling of f along all the directions p_k^i, i = 1,…,r, can be distributed along the iterations of the algorithm

• it is not necessary to perform at each point x_k a sampling of f along all the directions p_k^i, i = 1,…,r

• the Proposition characterizes in "some sense" the requirements on the acceptable samplings of f along the directions that guarantee the global convergence

GLOBAL CONVERGENCE

The use of directions satisfying Assumption D and the production of sequences of points satisfying the hypotheses of the Proposition are the common elements of all the globally convergent direct search methods

The direct search methods can be divided into:

- pattern search methods
- line search methods

PATTERN SEARCH METHODS

Cons: all the points produced must lie in a suitable lattice; this implies
- additional assumptions on the search directions
- restrictions on the choices of the steplengths

(in the line search methods there are no additional requirements with respect to Assumption D and the assumptions of the Proposition)

Pros: they require that the new point produces a simple decrease of f

(in the line search methods the new point must guarantee a "sufficient" decrease of f)

LINESEARCH TECHNIQUES

with derivatives (Armijo-type condition): compute α_k such that

f(x_k + α_k d_k) ≤ f(x_k) + γ α_k ∇f(x_k)^T d_k,  γ ∈ (0,1)

(figure: f(x_k + α d_k) compared with the line f(x_k) + γ α ∇f(x_k)^T d_k)

derivative-free: the directional-derivative term is replaced by −γ α_k² ‖d_k‖²:

f(x_k + α_k d_k) ≤ f(x_k) − γ α_k² ‖d_k‖²,  γ > 0

(figure: f(x_k + α d_k) compared with the parabola f(x_k) − γ α² ‖d_k‖², with trial steps α_k, 2α_k)

ALGORITHM DF

STEP 1 Compute p_k^1, …, p_k^r satisfying Assumption D

STEP 2 Minimization of f along p_k^1, …, p_k^r:

y_k^1 = x_k,  y_k^{i+1} = y_k^i + α_k^i p_k^i, i = 1,…,r

STEP 3 Compute x_{k+1} (with f(x_{k+1}) ≤ f(y_k^{r+1})) and set k = k+1
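A minimal sketch of this scheme, using the 2n coordinate directions ±e_i (which satisfy Assumption D) and the sufficient decrease test f(y + α p) ≤ f(y) − γ α²: the parameter values, the step-doubling on success, and the halving on failure are illustrative choices, not the exact rules of the original algorithm:

```python
import numpy as np

def algorithm_df(f, x0, gamma=1e-6, theta=0.5, tol=1e-8, max_iter=2000):
    """Hypothetical minimal instance of Algorithm DF with directions +/- e_i."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    dirs = [s * np.eye(n)[i] for i in range(n) for s in (+1.0, -1.0)]
    alpha = [1.0] * len(dirs)                  # initial steps alpha~_k^i
    for _ in range(max_iter):
        y = x.copy()
        for i, p in enumerate(dirs):
            a = alpha[i]
            if f(y + a * p) <= f(y) - gamma * a * a:
                y = y + a * p                  # success along p_k^i
                alpha[i] = 2.0 * a             # try a larger step next time
            else:
                alpha[i] = theta * a           # failure: shrink the step
        x = y                                  # STEP 3: accept the sampled point
        if max(alpha) < tol:                   # max alpha~ as stationarity measure
            break
    return x

x_star = algorithm_df(lambda z: (z[0] - 1.0) ** 2 + (z[1] + 2.0) ** 2, [0.0, 0.0])
```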

STEP 2

The aim of this step is:

- to detect the "promising" directions, i.e. the directions along which the function decreases "sufficiently":

  f(y_k^i + α_k^i p_k^i) ≤ f(y_k^i) − γ (α_k^i)²

- to compute steplengths along these directions which guarantee both a "sufficient" decrease of the function and a "sufficient" move away from the previous point

LINESEARCH TECHNIQUE

The linesearch along p_k^i tests the sufficient decrease condition

f(y_k^i + α p_k^i) ≤ f(y_k^i) − γ α²

starting from the initial step α = α̃_k^i

- if the test fails at α̃_k^i, set α_k^i = 0, y_k^{i+1} = y_k^i and α̃_{k+1}^i = α̃_k^i / 2

(figure: a failed sampling of f at α̃_k^i along p_k^i)

- if the test holds, the step is expanded (α̃_k^i, 2α̃_k^i, 4α̃_k^i, 8α̃_k^i, …) as long as the condition remains satisfied; the last accepted value gives α_k^i and y_k^{i+1} = y_k^i + α_k^i p_k^i

(figure: expansion trials 2α̃_k^i, 4α̃_k^i, 8α̃_k^i, with α_k^i = 4α̃_k^i accepted)
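The expansion scheme above can be sketched as a standalone routine (a hypothetical minimal version; the function name, the safeguard cap on the expansion, and the parameter values are illustrative):

```python
import numpy as np

def df_linesearch(f, y, p, alpha_init, gamma=1e-6):
    """Derivative-free linesearch with extrapolation: accept alpha when
    f(y + alpha p) <= f(y) - gamma alpha^2, then keep doubling the step
    while the condition still holds; on failure return alpha = 0 and
    halve the initial step for the next iteration."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    fy = f(y)
    a = alpha_init
    if f(y + a * p) > fy - gamma * a * a:
        return 0.0, alpha_init / 2.0           # failure: alpha_k^i = 0
    while a < 1e12 and f(y + 2.0 * a * p) <= fy - gamma * (2.0 * a) ** 2:
        a *= 2.0                               # expansion: 2a, 4a, 8a, ...
    return a, a                                # accepted step, next initial step

# trial on f(t) = ||t||^2 from y = (1, 0) along p = -e_1
step, next_init = df_linesearch(lambda t: float(t @ t), [1.0, 0.0], [-1.0, 0.0], 0.5)
```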

STEP 2

ik

~ The value of the initial step along the i-th direction derives from

the linesearch performed along the i-th direction at the previuos

iteration

If the set of search directions does not depend on the iteration

namely

the scalar should be representative of the behaviour of the objective

function along the i-th direction

r,1 ,i ,iik pp

ik

~ip

STEP 3

Find x_{k+1} such that f(x_{k+1}) ≤ f(y_k^{r+1}); otherwise set x_{k+1} = y_k^{r+1}

set k = k+1 and go to Step 1

At Step 3, every approximation technique can be used to produce a new better point

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by Algorithm DF; then there exists an accumulation point of {x_k}, and every accumulation point of {x_k} is a stationary point of the objective function f

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

(LCP) min f(x)
      s.t. Ax ≤ b

f : R^n → R, f ∈ C^1
x ∈ R^n, A ∈ R^{m×n}, b ∈ R^m

∇f is not available

F = {x ∈ R^n : Ax ≤ b}

L(x_0) = {x ∈ F : f(x) ≤ f(x_0)} is compact

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

Given a feasible point x ∈ F it is possible to define

• the set of the indices of the active constraints

I(x) = {j ∈ {1,…,m} : a_j^T x = b_j}

• the set of the feasible directions

T(x) = {d ∈ R^n : a_j^T d ≤ 0, j ∈ I(x)}

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

x* ∈ F is a stationary point for Problem (LCP)  ⟺  ∇f(x*)^T d ≥ 0 for all d ∈ T(x*)

given directions p^1, …, p^r with cone{p^1, …, p^r} = T(x*):

x* ∈ F is a stationary point for Problem (LCP)  ⟺  ∇f(x*)^T p^1 ≥ 0, …, ∇f(x*)^T p^r ≥ 0

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

x̄ ∈ F is a stationary point for Problem (LCP) if the directions satisfy

cone{p_k^1, …, p_k^{r_k}} = T(x_k)

and

f(x_k + α_k^i p_k^i) ≥ f(x_k), with α_k^i > 0, α_k^i → 0, i = 1,…,r_k,

lim_{k→∞} x_k = x̄,  lim_{k→∞} T(x_k) = T(x̄)

so that

lim_{k→∞} ∇f(x_k)^T p_k^i = lim_{k→∞} ( f(x_k + α_k^i p_k^i) − f(x_k) ) / α_k^i ≥ 0, i = 1,…,r_k

LINEARLY CONSTRAINED MINIMIZATION PROBLEMS

Given x ∈ F and ε > 0 it is possible to define

• an estimate of the set of the indices of the active constraints

I(x; ε) = {j ∈ {1,…,m} : a_j^T x ≥ b_j − ε}

• an estimate of the set of the feasible directions

T(x; ε) = {d ∈ R^n : a_j^T d ≤ 0, j ∈ I(x; ε)}

• T(x_k; ε) has good properties which allow us to define globally convergent algorithms

ASSUMPTION D2 (an example)

Given {x_k} and ε > 0, the set of directions {p_k^1, …, p_k^{r_k}}, with ‖p_k^i‖ = 1, i = 1,…,r_k, satisfies:

- r_k is uniformly bounded

- cone{p_k^1, …, p_k^{r_k}} = T(x_k; ε)

ALGORITHM DFL

STEP 1 Compute p_k^1, …, p_k^{r_k} satisfying Assumption D2

STEP 2 Minimization of f along p_k^1, …, p_k^{r_k}

STEP 3 Compute the new point x_{k+1} and set k = k+1

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by Algorithm DFL; then there exists an accumulation point of {x_k}, and every accumulation point of {x_k} is a stationary point for Problem (LCP)

BOX CONSTRAINED MINIMIZATION PROBLEMS

(BCP) min f(x)
      s.t. l ≤ x ≤ u

f : R^n → R, f ∈ C^1
x ∈ R^n, l ∈ R^n, u ∈ R^n

∇f is not available

F = {x ∈ R^n : l ≤ x ≤ u}

L(x_0) = {x ∈ F : f(x) ≤ f(x_0)} is compact

the set {e_1, …, e_n, −e_1, …, −e_n} satisfies Assumption D2
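For box constraints the ε-active coordinate directions are easy to compute. The sketch below (the helper name `box_directions` and the filtering rule are illustrative assumptions in the spirit of Assumption D2) keeps +e_i unless the upper bound on x_i is ε-active, and −e_i unless the lower bound is:

```python
import numpy as np

def box_directions(x, l, u, eps):
    """Coordinate directions feasible w.r.t. the eps-active bounds of [l, u]."""
    x, l, u = (np.asarray(v, dtype=float) for v in (x, l, u))
    n = x.size
    dirs = []
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0
        if x[i] < u[i] - eps:      # upper bound not eps-active: +e_i feasible
            dirs.append(e)
        if x[i] > l[i] + eps:      # lower bound not eps-active: -e_i feasible
            dirs.append(-e)
    return dirs

# at x = (0, 0.5) in [0,1]^2 with eps = 0.1 the direction -e_1 is excluded
ds = box_directions([0.0, 0.5], [0.0, 0.0], [1.0, 1.0], 0.1)
```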

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

(NCP) min f(x)
      s.t. g(x) ≤ 0
           Ax ≤ b

f : R^n → R, f ∈ C^1
g_i : R^n → R, g_i ∈ C^1, i = 1,…,p
x ∈ R^n, A ∈ R^{m×n}, b ∈ R^m

∇f is not available
∇g_i, i = 1,…,p, are not available

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

We define

F = {x ∈ R^n : g(x) ≤ 0, Ax ≤ b}

F̃ = {x ∈ R^n : Ax ≤ b}

and, given a point x ∈ F̃,

I(x) = {j ∈ {1,…,m} : a_j^T x = b_j}

T(x) = {d ∈ R^n : a_j^T d ≤ 0, j ∈ I(x)}

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

ASSUMPTION A1 The set F̃ is compact

ASSUMPTION A2 For every x ∈ F̃ there exists a vector d ∈ T(x) such that

∇g_i(x)^T d < 0 for all i : g_i(x) ≥ 0

Assumption A1 → boundedness of the iterates

Assumption A2 → existence and boundedness of the Lagrange multipliers

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

We consider the following continuously differentiable penalty function:

P(x; ε) = f(x) + (1/ε) Σ_{i=1}^{p} max{0, g_i(x)}^q

where ε > 0 (penalty parameter) and q > 1
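The penalty function can be sketched directly from its definition (an illustrative snippet; q = 2 is one choice satisfying q > 1, and the toy constraint is an assumption):

```python
import numpy as np

def penalty(f, gs, x, eps, q=2.0):
    """P(x; eps) = f(x) + (1/eps) * sum_i max(0, g_i(x))**q.
    With q > 1 the max-term is continuously differentiable wherever
    f and the g_i are."""
    x = np.asarray(x, dtype=float)
    violation = sum(max(0.0, g(x)) ** q for g in gs)
    return f(x) + violation / eps

f = lambda x: float(x[0] ** 2)
g = [lambda x: 1.0 - x[0]]        # constraint g(x) = 1 - x <= 0, i.e. x >= 1

feasible = penalty(f, g, [2.0], eps=0.1)     # no violation: P = f = 4
infeasible = penalty(f, g, [0.0], eps=0.1)   # g = 1 violated: P = 0 + 1/0.1
```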

NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS

the problem

min f(x)  s.t. g(x) ≤ 0, Ax ≤ b

is replaced by

min P(x; ε)  s.t. Ax ≤ b

? ε_k → 0 ?

ALGORITHM DFN

STEP 1 Compute p_k^1, …, p_k^{r_k} satisfying Assumption D2

STEP 2 Minimization of P(x; ε_k) along p_k^1, …, p_k^{r_k}

STEP 3 Compute the new point (x_{k+1}, ε_{k+1}) and set k = k+1

new STEP 3 (ε_0 > 0, θ ∈ (0,1), s > 0)

Find x_{k+1} ∈ F̃ such that P(x_{k+1}; ε_k) ≤ P(y_k^{r_k+1}; ε_k); otherwise set x_{k+1} = y_k^{r_k+1}

if  max_{i=1,…,r_k} α̃_k^i ≤ ε_k^s  and  max_{i=1,…,p} max{0, g_i(x_{k+1})} > 0,  then set ε_{k+1} = θ ε_k; otherwise set ε_{k+1} = ε_k

set k = k+1 and go to Step 1

new STEP 3

ε_k is reduced whenever a better approximation of a stationary point of the penalty function has been obtained

max_{i=1,…,r_k} α̃_k^i can be viewed as a stationarity measure:

max_{i=1,…,r_k} α̃_k^i → 0

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by Algorithm DFN; then there exists an accumulation point of {x_k} which is a stationary point for Problem (NCP)

MIXED NONLINEAR MINIMIZATION PROBLEMS

(MNCP) min f(x)
       s.t. g(x) ≤ 0
            l ≤ x ≤ u
            x_i ∈ Z, i ∈ I_z

We define

F = {x ∈ R^n : g(x) ≤ 0, l ≤ x ≤ u, x_i ∈ Z, i ∈ I_z}

F̃ = {x ∈ R^n : l ≤ x ≤ u}

n_c = |I_c| number of continuous variables, n_z = |I_z| number of discrete variables

MIXED NONLINEAR MINIMIZATION PROBLEMS

We write x = (x_c, x_z), with x_c the continuous variables (i ∈ I_c) and x_z the discrete variables (i ∈ I_z), and define the discrete neighbourhood

N(x̄) = {x ∈ F : x_c = x̄_c, ‖x_z − x̄_z‖ = 1}

x̄ ∈ F is a stationary point of Problem (MNCP) if there exists λ ∈ R^p such that:

- the KKT conditions w.r.t. the continuous variables x_c hold at (x̄, λ)

- f(x) ≥ f(x̄) for all x ∈ N(x̄)

ALGORITHM MDFN

STEP 1 Compute the directions p_k^1 = e_1, …, p_k^n = e_n

STEP 2 Minimization of P(x; ε_k) along p_k^1, …, p_k^n:

- if i ∈ I_c, perform a continuous linesearch along p_k^i

- if i ∈ I_z, perform a discrete linesearch along p_k^i

STEP 3 Compute the new point (x_{k+1}, ε_{k+1}) and set k = k+1

Continuous linesearch

Continuous linesearch of MDFN = linesearch of DFN, with the additional requirement y_k^{i+1} ∈ F̃

it produces the point y_k^{i+1} = y_k^i + α_k^i p_k^i

LINESEARCH TECHNIQUE (discrete)

The discrete linesearch along p_k^i, i ∈ I_z, uses integer steps and the sufficient decrease condition

f(y_k^i + α p_k^i) ≤ f(y_k^i) − ξ,  ξ > 0

starting from the initial step α = α̃_k^i

- if the test fails at α̃_k^i, set α_k^i = 0, y_k^{i+1} = y_k^i and α̃_{k+1}^i = max{1, α̃_k^i / 2}

- if the test holds, the step is expanded (α̃_k^i, 2α̃_k^i, 4α̃_k^i, 8α̃_k^i, …) as long as the condition remains satisfied; the last accepted value gives α_k^i, with y_k^{i+1} = y_k^i + α_k^i p_k^i and α̃_{k+1}^i = 2 α_k^i
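A standalone sketch of such a discrete linesearch (hypothetical and simplified: the function name, the fixed decrease ξ, and the update rules mirror the scheme above but are illustrative):

```python
def discrete_linesearch(f, y, i, sign, alpha_init, xi=1e-6):
    """Discrete linesearch along +/- e_i with integer steps: accept alpha if
    f(y + alpha*e_i) <= f(y) - xi, expand by doubling while the test holds;
    on failure return 0 and halve the initial step (never below 1)."""
    def shifted(a):
        z = list(y)
        z[i] += sign * a
        return f(z)
    fy = f(list(y))
    a = max(1, int(alpha_init))
    if shifted(a) > fy - xi:
        return 0, max(1, a // 2)              # failure: alpha_k^i = 0
    while shifted(2 * a) <= fy - xi:
        a *= 2                                # expansion with integer steps
    return a, 2 * a                           # accepted step, next initial step

# integer minimization of f(y) = y^2 starting from y = 16 along -e_0
step, next_init = discrete_linesearch(lambda z: z[0] ** 2, [16], 0, -1, 1)
```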

MIXED NONLINEAR MINIMIZATION PROBLEMS

ASSUMPTION A3

Either the nonlinear constraint functions g_i, i = 1,…,m, do not depend on the integer variables x_i, i ∈ I_z,

or every accumulation point x̄ of the sequence produced by the algorithm satisfies: the points x ∈ F̃ with x_c = x̄_c, ‖x_z − x̄_z‖ = 1 are such that g_i(x) ≠ 0, i = 1,…,p

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by Algorithm MDFN; then there exists an accumulation point of {x_k} which is a stationary point for Problem (MNCP)

MIXED NONLINEAR MINIMIZATION PROBLEMS

More complex (and expensive) derivative-free algorithms allow us

- to determine "better" stationary points

- to tackle "more difficult" mixed nonlinear optimization problems

MIXED NONLINEAR MINIMIZATION PROBLEMS

to determine "better" stationary points for Problem (MNCP):

x̄ ∈ F is an extended stationary point if every point x̃ ∈ F with x̃_c = x̄_c, ‖x̃_z − x̄_z‖ = 1 and f(x̃) = f(x̄) also satisfies the KKT conditions w.r.t. the continuous variables x_c

to tackle "more difficult" mixed nonlinear optimization problems:

Three different sets of variables:

- x: continuous variables
- y: discrete general variables
- z: discrete dimensional variables

Discrete dimensional variables z: vector of discrete variables which determine the number of continuous and discrete variables

HARD MIXED NONLINEAR MINIMIZATION PROBLEMS

(Hard-MNCP) min f(x, y, z)
            s.t. x ∈ X(y, z), y ∈ Y(z), z ∈ Z

with x ∈ R^{n_x(z)} (continuous), y ∈ Z^{n_y(z)} (discrete), z ∈ Z^{n_z} (dimensional)

The feasible set of y depends on the dimensional variables z

The feasible set of x depends on the discrete variables y and on the dimensional variables z

NONSMOOTH MINIMIZATION PROBLEMS

min max{(x−5)² + y², x² + (y−2)²}

the cone of descent directions can be made arbitrarily narrow

NONSMOOTH MINIMIZATION PROBLEMS

Possible approaches:

smoothing techniques

“larger” set of search directions

NONSMOOTH MINIMIZATION PROBLEMS

smoothing techniques

min max_{1≤i≤q} f_i(x)
s.t. Ax ≤ b

f_i : R^n → R, f_i ∈ C², i = 1,…,q
x ∈ R^n, A ∈ R^{m×n}, b ∈ R^m

NONSMOOTH MINIMIZATION PROBLEMS

f(x) = max_{1≤i≤q} f_i(x)

smoothed approximation (μ > 0):

f(x; μ) = μ ln Σ_{i=1}^{q} exp( f_i(x) / μ )

f(x) ≤ f(x; μ) ≤ f(x) + μ ln q
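The log-sum-exp smoothing and its bounds can be sketched as follows (illustrative; the stable computation that factors out the maximum is a standard implementation choice, not part of the slides):

```python
import math

def smooth_max(fs, x, mu):
    """Log-sum-exp smoothing of f(x) = max_i f_i(x):
    f(x; mu) = mu * ln(sum_i exp(f_i(x)/mu)), computed stably by factoring
    out the maximum, so that f(x) <= f(x; mu) <= f(x) + mu*ln(q)."""
    vals = [fi(x) for fi in fs]
    m = max(vals)
    return m + mu * math.log(sum(math.exp((v - m) / mu) for v in vals))

# the minmax example: f1 = (x-5)^2 + y^2, f2 = x^2 + (y-2)^2
fs = [lambda p: (p[0] - 5.0) ** 2 + p[1] ** 2,
      lambda p: p[0] ** 2 + (p[1] - 2.0) ** 2]

x = (0.0, 0.0)
exact = max(25.0, 4.0)                 # f(x) = 25
smoothed = smooth_max(fs, x, mu=0.1)   # within mu*ln(2) of the exact max
```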

ALGORITHM DFN

STEP 1 Compute p_k^1, …, p_k^{r_k} satisfying Assumption D2

STEP 2 Minimization of f(x; μ_k) along p_k^1, …, p_k^{r_k}

STEP 3 Compute the new point (x_{k+1}, μ_{k+1}) and set k = k+1

new STEP 3

Find x_{k+1} ∈ F̃ such that f(x_{k+1}; μ_k) ≤ f(y_k^{r_k+1}; μ_k); otherwise set x_{k+1} = y_k^{r_k+1}

set μ_{k+1} = min{ μ_k, max_i { (α̃_k^i)^{1/2}, (α_k^i)^{1/2} } }

set k = k+1 and go to Step 1

new STEP 3

μ_k is reduced whenever a better approximation of a stationary point of the smoothed function has been obtained

max_{i=1,…,r_k} α̃_k^i can be viewed as a stationarity measure:

max_{i=1,…,r_k} α̃_k^i → 0

GLOBAL CONVERGENCE

THEOREM Let {x_k} be the sequence of points produced by the Algorithm; then there exists an accumulation point of {x_k} which is a stationary point for the MinMax Problem

NONSMOOTH MINIMIZATION PROBLEMS

"larger" set of search directions

(NCP) min f(x)
      s.t. g(x) ≤ 0
           l ≤ x ≤ u

f : R^n → R locally Lipschitz-continuous
g_i : R^n → R, i = 1,…,p, locally Lipschitz-continuous
x ∈ R^n, l ∈ R^n, u ∈ R^n

We consider the following nonsmooth penalty function:

Z(x; ε) = f(x) + (1/ε) Σ_{i=1}^{p} max{0, g_i(x)}

where ε > 0 (penalty parameter)

NONSMOOTH MINIMIZATION PROBLEMS

ASSUMPTION A1 The set F̃ is compact

ASSUMPTION A2 For every x ∈ F̃ there exists a vector d ∈ T(x) such that

∇g_i(x)^T d < 0 for all i : g_i(x) ≥ 0

for every ε ∈ (0, ε̄], the problem

min f(x)  s.t. g(x) ≤ 0, Ax ≤ b

is equivalent to

min Z(x; ε)  s.t. Ax ≤ b

NONSMOOTH MINIMIZATION PROBLEMS

It is possible to define algorithms globally convergent towards stationary points (in the Clarke sense) by assuming that the algorithms use sets of search directions p_k^1, …, p_k^{r_k} which are asymptotically dense in the unit sphere
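A common way to obtain directions that are (with probability one) asymptotically dense in the unit sphere is to draw normalized Gaussian vectors; this sketch is illustrative and is only one of several possible generation schemes:

```python
import numpy as np

def dense_directions(n, r, rng):
    """Draw r random unit vectors in R^n: normalized Gaussian samples are
    uniformly distributed on the unit sphere, so over the iterations the
    union of such samples is dense in the sphere with probability one."""
    d = rng.standard_normal((r, n))
    return d / np.linalg.norm(d, axis=1, keepdims=True)

rng = np.random.default_rng(0)
P = dense_directions(3, 5, rng)   # 5 unit directions in R^3
```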

NONSMOOTH MINIMIZATION PROBLEMS

Multiobjective optimization problem (work in progress)

min (f_1(x), …, f_q(x))
s.t. g(x) ≤ 0
     l ≤ x ≤ u

f_j : R^n → R, j = 1,…,q, locally Lipschitz-continuous
g_i : R^n → R, i = 1,…,p, locally Lipschitz-continuous
x ∈ R^n, l ∈ R^n, u ∈ R^n

NONSMOOTH MINIMIZATION PROBLEMS

Bilevel optimization problem (work in progress)

min F(x, y)
s.t. G(x, y) ≤ 0
     l_x ≤ x ≤ u_x

where

y ∈ arg min f(x, y)
        s.t. g(x, y) ≤ 0
             l_y ≤ y ≤ u_y

Our DF-codes are available at:

http://www.dis.uniroma1.it/~lucidi/DFL

Thank you

for your attention

Optimal Design of Magnetic Resonance apparatus

(figure: magnet configurations with n. magnets = 3, n. rings = 6 and n. magnets = 4; half magnet shown)

z = n. rings, y = n. magnets

Design Variables

Positions of the rings along the X-axis: x1, x2, x3, x4, x5, x6

Angular positions of each row of small magnets: θ1, θ2, θ3, θ4

Design Variables

Offsets of the 4 outermost rings w.r.t. the 2 innermost ones: b1, b2, b3, b4

Radius r of the magnets (integer values)

Objective Function

The objective function measures the non-uniformity of the magnetic field within a specified target region:

U(x) = [ Σ_{i=1}^{N} ( B_X(p_i)² + B_Y(p_i)² + (B_Z(p_i) − B̄_Z)² ) ]^{1/2} / B̄_Z

with  B̄_Z = (1/N) Σ_{i=1}^{N} B_Z(p_i)

Magnetic field as uniform as possible and directed along the Z axis

Magnetic Resonance Results

Starting point (commercial devices): nr = 5, nm = 3, r = 22, f = 51 ppm

Final point: nr = 7, nm = 3, r = 27, f = 18 ppm

(figure: behavior of (B_Z(x) − B_Z*) / B_Z* on the ZY plane for the 51 ppm and 18 ppm configurations)
