
Multivariate Optimization
J. McNames, Portland State University, ECE 4/557, Ver. 1.14


Example 1: Optimization Problem

[Figure: contour plot of the example objective over a_1, a_2 ∈ [−5, 5]; companion slides show quiver and 3-D surface views generated by the code below.]

Overview of Multivariate Optimization Topics

• Problem definition

• Algorithms

– Cyclic coordinate method

– Steepest descent

– Conjugate gradient algorithms

– PARTAN

– Newton’s method

– Levenberg-Marquardt

• Concise, subjective summary



Multivariate Optimization Overview

• The “unconstrained optimization” problem is a generalization of the line search problem

• Find a vector a∗ such that

a∗ = argmin_a f(a)

• Note that there are no constraints on a

• Example: find the vector of coefficients w ∈ R^{p×1} that minimizes the average absolute error of a linear model

• Akin to a blind person trying to find their way to the bottom of a valley in a multidimensional landscape

• We want to reach the bottom with the minimum number of “cane taps”

• Also vaguely similar to taking core samples for oil prospecting
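As a concrete illustration of the problem statement (an added sketch, not part of the original slides), the following MATLAB fragment minimizes a made-up two-parameter objective with the built-in fminsearch; the objective f here is a hypothetical stand-in for the course's OptFn.

% Minimal sketch: unconstrained minimization of a made-up f(a).
f  = @(a) (a(1)-1).^2 + 2*(a(2)+2).^2 + cos(a(1));
a0 = [0; 0];                          % starting guess
[aStar,fStar] = fminsearch(f,a0);     % derivative-free simplex search
fprintf('a* = (%.3f, %.3f), f(a*) = %.4f\n',aStar(1),aStar(2),fStar);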


Example 1: MATLAB Code

function [] = OptimizationProblem();
%==============================================================================
% User-Specified Parameters
%==============================================================================
x = -5:0.05:5;
y = -5:0.05:5;
%==============================================================================
% Evaluate the Function
%==============================================================================
[X,Y] = meshgrid(x,y);
[Z,G] = OptFn(X,Y);
functionName = 'OptimizationProblem';
fileIdentifier = fopen([functionName '.tex'],'w');
%==============================================================================
% Contour Map
%==============================================================================
figure;
FigureSet(2,'Slides');
contour(x,y,Z,50);
xlabel('a_1');
ylabel('a_2');
zoom on;
AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Contour');


case 1, view(45,10);
case 2, view(-55,22);
case 3, view(-131,10);
otherwise, error('Not implemented.');
end
fileName = sprintf('%s-%s%d',functionName,'Surface',c1);
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
end
%==============================================================================
% List the MATLAB Code
%==============================================================================
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: MATLAB Code}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\t\\matlabcode{Matlab/%s.m}\n',functionName);
fclose(fileIdentifier);

print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\stepcounter{exc}\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
%==============================================================================
% Quiver Map
%==============================================================================
figure;
FigureSet(1,'Slides');
axis([-5 5 -5 5]);
contour(x,y,Z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
hold on;
xCoarse = -5:0.5:5;
yCoarse = -5:0.5:5;
[X,Y] = meshgrid(xCoarse,yCoarse);
[ZCoarse,GCoarse] = OptFn(X,Y);
nr = length(xCoarse); % number of grid points (size(xCoarse,1) is 1 for a row vector)
dzx = GCoarse(1:nr,1:nr);
dzy = GCoarse(nr + (1:nr),1:nr);
quiver(xCoarse,yCoarse,dzx,dzy);
hold off;
xlabel('a_1');
ylabel('a_2');
zoom on;

Global Optimization?

• In general, all optimization algorithms find a local minimum in as few steps as possible

• There are also “global” optimization algorithms based on ideas such as

– Evolutionary computing

– Genetic algorithms

– Simulated annealing

• None of these guarantee convergence in a finite number of iterations

• All require a lot of computation

AxisSet(8);
fileName = sprintf('%s-%s',functionName,'Quiver');
print(fileName,'-depsc');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\newslide\n');
fprintf(fileIdentifier,'\\slideheading{Example \\arabic{exc}: Optimization Problem}\n');
fprintf(fileIdentifier,'%%==============================================================================\n');
fprintf(fileIdentifier,'\\includegraphics[scale=1]{Matlab/%s}\n',fileName);
fprintf(fileIdentifier,'\n');
%==============================================================================
% 3D Maps
%==============================================================================
figure;
set(gcf,'Renderer','zbuffer');
FigureSet(1,'Slides');
h = surf(x,y,Z);
set(h,'LineStyle','None');
xlabel('a_1');
ylabel('a_2');
shading interp;
grid on;
AxisSet(8);
hl = light('Position',[0,0,30]);
set(hl,'Style','Local');
set(h,'BackFaceLighting','unlit');
material dull
for c1 = 1:3
switch c1

Cyclic Coordinate Method

1. For i = 1 to p,

a_i := argmin_α f([a_1, a_2, . . . , a_{i−1}, α, a_{i+1}, . . . , a_p])

2. Loop to 1 until convergence

+ Simple to implement

+ Each line search can be performed semi-globally to avoid shallow local minima

+ Can be used with nominal variables

+ f(a) can be discontinuous

+ No gradient required

− Very slow compared to gradient-based optimization algorithms

− Usually only practical when the number of parameters, p, is small

• There are modified versions with faster convergence
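To make step 1 concrete, here is a minimal added sketch (not from the slides) of cyclic coordinate descent on a hypothetical two-parameter objective; the bounded fminbnd call plays the role of a “semi-global” line search over each coordinate.

% Cyclic coordinate descent sketch; f is a made-up objective.
f = @(a) (a(1)-2).^2 + (a(2)+1).^2 + 0.3*sin(3*a(1));
a = [-3; 1];                              % starting point
for sweep = 1:20                          % step 2: loop until convergence
    for i = 1:numel(a)                    % step 1: for i = 1 to p
        g = @(alpha) f([a(1:i-1); alpha; a(i+1:end)]);
        a(i) = fminbnd(g,-5,5);           % minimize over coordinate i only
    end
end
fprintf('a = (%.3f, %.3f), f(a) = %.4f\n',a(1),a(2),f(a));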

Optimization Comments

• Ideally, when we construct models we should favor those which can be optimized with few shallow local minima and reasonable computation

• Graphically, you can think of the function to be minimized as the elevation in a complicated high-dimensional landscape

• The problem is to find the lowest point

• The most common approach is to go downhill

• The gradient points in the most “uphill” direction

• The steepest downhill direction is the opposite of the gradient

• Most optimization algorithms use a line search algorithm

• The methods mostly differ only in the way that the “direction of descent” is generated

Example 2: Cyclic Coordinate Method

[Figure: contour plot of the objective with the cyclic coordinate search path; X and Y axes span −5 to 5.]

Optimization Algorithm Outline

• The basic steps of these algorithms are as follows (a minimal code sketch appears below)

1. Pick a starting vector a

2. Find the direction of descent, d

3. Move in that direction until a minimum is found:

α∗ := argmin_α f(a + αd)

a := a + α∗d

4. Loop to 2 until convergence

• Most of the theory of these algorithms is based on quadratic surfaces

• Near local minima, this is a good approximation

• Note that the functions should (must) have continuous gradients (almost) everywhere
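The outline maps almost line-for-line onto code. Below is a minimal added sketch of the generic loop, assuming a hypothetical objective f and gradient gradFn, and using a crude backtracking rule in place of the course's LineSearch routine.

% Generic descent loop sketch (f and gradFn are made-up stand-ins).
f      = @(a) (a(1)-2).^2 + 4*(a(2)+3).^2;
gradFn = @(a) [2*(a(1)-2); 8*(a(2)+3)];
a = [-3; 1];                                % 1. pick a starting vector
for k = 1:100
    d = -gradFn(a);                         % 2. direction of descent
    if norm(d) < 1e-8, break; end           % 4. stop when converged
    alpha = 1;                              % 3. crude backtracking line search
    while f(a + alpha*d) >= f(a) && alpha > 1e-12
        alpha = alpha/2;
    end
    a = a + alpha*d;
end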

Example 2: Cyclic Coordinate Method

[Figure: Euclidean position error versus iteration (0 to 25) for the cyclic coordinate method.]

Example 2: Cyclic Coordinate Method

[Figure: zoomed contour plot of the search path near the minimum; X from −3 to 0, Y from −3.5 to 0.5.]

Example 2: Relevant MATLAB Code

function [] = CyclicCoordinate();
%clear all;
close all;
ns = 26;
x  = -3;
y  = 1;
b0 = -1;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,dzx,dzy] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
for cnt = 2:ns,
    if rem(cnt,2)==1,
        d = [1 0]'; % Along x direction
    else
        d = [0 1]'; % Along y direction
    end;
    [b,fmin] = LineSearch([x y]',d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);

Example 2: Cyclic Coordinate Method

[Figure: function value versus iteration (0 to 25) for the cyclic coordinate method.]

print -depsc CyclicCoordinateContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc CyclicCoordinatePositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);

    a(cnt,:) = [x y];
    f(cnt) = fmin;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');

print -depsc CyclicCoordinateErrorLinear;

set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc CyclicCoordinateContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-1.5 + (-2:0.05:2),-1.5 + (-2:0.05:2));
[z,dzx,dzy] = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);

Example 3: Steepest Descent

[Figure: contour plot of the objective with the steepest descent search path; X and Y axes span −5 to 5.]

Steepest Descent

The gradient of the function f(a) is defined as the vector of partial derivatives:

∇_a f(a) ≡ [ ∂f(a)/∂a_1   ∂f(a)/∂a_2   . . .   ∂f(a)/∂a_p ]^T

• It can be shown that the gradient, ∇_a f(a), “points” in the direction of maximum ascent

• The negative of the gradient, −∇_a f(a), “points” in the direction of maximum descent

• A vector d is a direction of descent if there exists an ε such that f(a + λd) < f(a) for all 0 < λ < ε

• It can also be shown that d is a direction of descent if (∇_a f(a))^T d < 0

• The algorithm of steepest descent uses d = −∇_a f(a)

• It is the most fundamental of all algorithms for minimizing a continuously differentiable function
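As a quick numeric check (added, not from the slides), the descent condition (∇_a f(a))^T d < 0 is easy to test directly; the gradient value here is hypothetical.

g = [2; -1];       % hypothetical gradient at some point a
d = -g;            % steepest descent direction
g'*d               % -5: negative, so d is a direction of descent
g'*[1; 0]          %  2: positive, so [1 0]' moves uphill here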

Example 3: Steepest Descent

[Figure: zoomed contour plot of the steepest descent path near the minimum; X from −2 to −1.2, Y from −2.2 to −1.2.]

Steepest Descent

+ Very stable algorithm

− Can converge very slowly once near a local minimum where the surface is approximately quadratic

Example 3: Relevant MATLAB Code

function [] = SteepestDescent();
%clear all;
close all;
ns = 26;
x  = -3;
y  = 1;
b0 = 0.01;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
d = -g/norm(g);
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y]',d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    [z,g] = OptFn(x,y);
    d = -g;
    d = d/norm(d);

Example 3: Steepest Descent

[Figure: function value versus iteration (0 to 25) for steepest descent.]

    a(cnt,:) = [x y];
    f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
[zopt zopt2]
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;

Example 3: Steepest Descent

[Figure: Euclidean position error versus iteration (0 to 25) for steepest descent.]

AxisSet(8);
print -depsc SteepestDescentErrorLinear;

h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc SteepestDescentContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-1.6 + (-0.5:0.01:0.5),-1.7 + (-0.5:0.01:0.5));
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;

Conjugate Gradient Algorithms

1. Take a steepest descent step

2. For i = 2 to p

• α := argmin_α f(a + αd)

• a := a + αd

• g_i := ∇f(a)

• β := (g_i^T g_i) / (g_{i−1}^T g_{i−1})

• d := −g_i + βd

3. Loop to 1 until convergence

• Based on quadratic approximations of f

• Called the Fletcher-Reeves method
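Since conjugate gradient converges in at most p steps on a quadratic with exact line searches (as the Polak-Ribiere slide notes), here is a minimal added sketch of Fletcher-Reeves applied to a made-up quadratic f(a) = (1/2)a^T Q a − b^T a, for which the exact line search has a closed form.

% Fletcher-Reeves CG on a hypothetical 2-D quadratic.
Q = [4 1; 1 3];  b = [1; 2];          % made-up SPD matrix and vector
a = [0; 0];
g = Q*a - b;                          % gradient of the quadratic
d = -g;                               % 1. steepest descent step first
for i = 1:numel(a)                    % 2. at most p steps for a quadratic
    alpha = -(g'*d)/(d'*Q*d);         % exact line search along d
    a = a + alpha*d;
    gOld = g;
    g = Q*a - b;
    beta = (g'*g)/(gOld'*gOld);       % Fletcher-Reeves beta
    d = -g + beta*d;
end
% a now matches Q\b up to round-off.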

AxisSet(8);
print -depsc SteepestDescentContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc SteepestDescentPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');

Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: function value versus iteration (0 to 25) for Fletcher-Reeves.]

Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: contour plot of the objective with the Fletcher-Reeves search path; X and Y axes span −5 to 5.]

Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: Euclidean position error versus iteration (0 to 25) for Fletcher-Reeves.]

Example 4: Fletcher-Reeves Conjugate Gradient

[Figure: zoomed contour plot of the Fletcher-Reeves path near the minimum; X from 1.5 to 2.5, Y from −3.5 to −2.5.]

h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc FletcherReevesContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;

Example 4: Relevant MATLAB Code

function [] = FletcherReeves();
%clear all;
close all;
ns = 26;
x  = -3;
y  = 1;
b0 = 0.01;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
d = -g/norm(g); % First direction
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y]',d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    go = g; % Old gradient
    [z,g] = OptFn(x,y);
    beta = (g'*g)/(go'*go);

AxisSet(8);
print -depsc FletcherReevesContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc FletcherReevesPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');

    d = -g + beta*d;
    a(cnt,:) = [x y];
    f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;

Example 5: Polak-Ribiere Conjugate Gradient

[Figure: contour plot of the objective with the Polak-Ribiere search path; X and Y axes span −5 to 5.]

AxisSet(8);
print -depsc FletcherReevesErrorLinear;

Example 5: Polak-Ribiere Conjugate Gradient

[Figure: zoomed contour plot of the Polak-Ribiere path near the minimum; X from 1.5 to 2.5, Y from −3.5 to −2.5.]

Conjugate Gradient Algorithms Continued

• There is also a variant called Polak-Ribiere where

β := ((g_i − g_{i−1})^T g_i) / (g_{i−1}^T g_{i−1})

+ Only requires the gradient

+ Converges in a finite number of steps when f(a) is quadratic and perfect line searches are used

− Less stable numerically than steepest descent

− Sensitive to inexact line searches

Example 5: MATLAB Code

function [] = PolakRibiere();
%clear all;
close all;
ns = 26;
x  = -3;
y  = 1;
b0 = 0.01;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
d = -g/norm(g); % First direction
for cnt = 2:ns,
    [b,fmin] = LineSearch([x y]',d,b0,ls);
    x = x + b*d(1);
    y = y + b*d(2);
    go = g; % Old gradient
    [z,g] = OptFn(x,y);
    beta = ((g-go)'*g)/(go'*go);

Example 5: Polak-Ribiere Conjugate Gradient

[Figure: function value versus iteration (0 to 25) for Polak-Ribiere.]

    d = -g + beta*d;
    a(cnt,:) = [x y];
    f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;

Example 5: Polak-Ribiere Conjugate Gradient

[Figure: Euclidean position error versus iteration (0 to 25) for Polak-Ribiere.]

AxisSet(8);
print -depsc PolakRibiereErrorLinear;

h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PolakRibiereContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;

Parallel Tangents (PARTAN)

1. First gradient step

• d := −∇f(a)

• α := argmin_α f(a + αd)

• s_p := αd

• a := a + s_p

2. Gradient step

• d_g := −∇f(a)

• α := argmin_α f(a + αd_g)

• s_g := αd_g

• a := a + s_g

3. Conjugate step

• d_p := s_p + s_g

• α := argmin_α f(a + αd_p)

• s_p := αd_p

• a := a + s_p

4. Loop to 2 until convergence

AxisSet(8);
print -depsc PolakRibiereContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PolakRibierePositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');

Example 6: PARTAN

[Figure: zoomed contour plot of the PARTAN path near the minimum; X from 1.5 to 2.5, Y from −3.5 to −2.5.]

PARTAN Concept

[Diagram: the PARTAN zig-zag path through the iterates a_0, a_1, . . . , a_7.]

• First two steps are steepest descent

• Thereafter, each iteration consists of two steps (see the sketch below)

1. Search along the direction

d_i = a_i − a_{i−2}

where a_i is the current point and a_{i−2} is the point from two steps ago

2. Search in the direction of the negative gradient

d_i = −∇f(a_i)
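For a quadratic, the acceleration step can be verified directly. The following added sketch (hypothetical quadratic, exact line searches) runs two steepest descent steps and then one PARTAN acceleration step, which lands on the minimizer in this 2-D case.

% PARTAN sketch on a made-up quadratic f(a) = 0.5*a'*Q*a - b'*a.
Q = [4 1; 1 3];  b = [1; 2];
gradFn = @(a) Q*a - b;
step   = @(a,d) -(gradFn(a)'*d)/(d'*Q*d);  % exact line search alpha
a0 = [0; 0];
d  = -gradFn(a0);  a1 = a0 + step(a0,d)*d; % steepest descent step
d  = -gradFn(a1);  a2 = a1 + step(a1,d)*d; % gradient step
d  = a2 - a0;      a3 = a2 + step(a2,d)*d; % accelerate along a2 - a0
% a3 matches Q\b up to round-off for this 2-D quadratic.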

Example 6: PARTAN

[Figure: function value versus iteration (0 to 25) for PARTAN.]

Example 6: PARTAN

[Figure: contour plot of the objective with the PARTAN search path; X and Y axes span −5 to 5.]

cnt = 2;
while cnt<ns,
    % Gradient step
    [z,g] = OptFn(x,y);
    d = -g/norm(g); % Direction
    [bg,fmin] = LineSearch([x y]',d,b0,ls);
    xg = x + bg*d(1);
    yg = y + bg*d(2);
    cnt = cnt + 1;
    a(cnt,:) = [xg yg];
    f(cnt) = OptFn(xg,yg);
    fprintf('G : %d %5.3f\n',cnt,f(cnt));
    if cnt==ns,
        break;
    end;
    % Conjugate step
    d = [xg-xa yg-ya]';
    if norm(d)~=0,
        d = d/norm(d);
        [bp,fmin] = LineSearch([xg yg]',d,b0,ls);
    else
        bp = 0;
    end;
    if bp>0, % Line search in conjugate direction was successful
        fprintf('P :');
        x = xg + bp*d(1);
        y = yg + bp*d(2);

Example 6: PARTAN

[Figure: Euclidean position error versus iteration (0 to 25) for PARTAN.]

    else % Could not move - do another gradient update
        cnt = cnt + 1;
        a(cnt,:) = a(cnt-1,:);
        f(cnt) = f(cnt-1);
        if cnt==ns,
            break;
        end;
        fprintf('G2:');
        [z,g] = OptFn(xg,yg);
        d = -g/norm(g); % Direction
        [bp,fmin] = LineSearch([xg yg]',d,b0,ls);
        x = xg + bp*d(1);
        y = yg + bp*d(2);
    end;
    % Update anchor point
    xa = xg;
    ya = yg;
    cnt = cnt + 1;
    a(cnt,:) = [x y];
    f(cnt) = OptFn(x,y);
    fprintf(' %d %5.3f\n',cnt,f(cnt));
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);

Example 6: MATLAB Code

function [] = Partan();
%clear all;
close all;
ns = 26;
x  = -3;
y  = 1;
b0 = 0.01;
ls = 30;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
xa = x;
ya = y;
% First step - substitute for a conjugate step
d = -g/norm(g); % First direction
[bp,fmin] = LineSearch([x y]',d,b0,100);
x = x + bp*d(1); % Stand-in for a conjugate step
y = y + bp*d(2);
a(2,:) = [x y];
f(2) = fmin;

xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PartanPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc PartanErrorLinear;

yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;

PARTAN Pros and Cons

[Diagram: the same PARTAN path through a_0, . . . , a_7 as on the concept slide.]

+ For quadratic functions, converges in a finite number of steps

+ Easier to implement than 2nd order methods

+ Can be used with a large number of parameters

+ Each (composite) step is at least as good as steepest descent

+ Tolerant of inexact line searches

− Each (composite) step requires two line searches

AxisSet(8);
print -depsc PartanContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc PartanContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');

Example 7: Newton’s with Steepest Descent Safeguard

[Figure: zoomed contour plot of the safeguarded Newton path; X from 0 to 2, Y from −3 to −1.5.]

Newton’s Method

a_{k+1} = a_k − H(a_k)^{−1} ∇f(a_k)

where ∇f(a_k) is the gradient and H(a_k) is the Hessian of f(a),

H(a_k) ≡ [ ∂²f(a)/∂a_1²       ∂²f(a)/∂a_1∂a_2   . . .   ∂²f(a)/∂a_1∂a_p ]
         [ ∂²f(a)/∂a_2∂a_1    ∂²f(a)/∂a_2²      . . .   ∂²f(a)/∂a_2∂a_p ]
         [       . . .              . . .        . . .        . . .      ]
         [ ∂²f(a)/∂a_p∂a_1    ∂²f(a)/∂a_p∂a_2   . . .   ∂²f(a)/∂a_p²    ]

• Based on a quadratic approximation of the function f(a)

• If f(a) is quadratic, converges in one step

• If H(a) is positive-definite, the problem is well defined near local minima where f(a) is nearly quadratic
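A minimal added sketch of the pure Newton iteration, using a hypothetical objective with a hand-coded gradient and Hessian (not the course's OptFn):

% Newton's method on f(a) = a1^4 + a1^2 + (a2+2)^2, minimum at (0,-2).
gradFn = @(a) [4*a(1)^3 + 2*a(1); 2*(a(2)+2)];
hessFn = @(a) [12*a(1)^2 + 2, 0; 0, 2];   % positive definite everywhere
a = [3; 0];
for k = 1:20
    g = gradFn(a);
    if norm(g) < 1e-12, break; end
    a = a - hessFn(a)\g;                  % solve H*step = g; avoids inv(H)
end
% a approaches (0,-2); convergence near the minimum is very fast because
% the Hessian is positive definite there.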

Example 7: Newton’s with Steepest Descent Safeguard

[Figure: function value versus iteration (0 to 90) for safeguarded Newton.]

Example 7: Newton’s with Steepest Descent Safeguard

[Figure: contour plot of the objective with the safeguarded Newton search path; X and Y axes span −5 to 5.]

    y = y + b*d(2);
    [z,g,H] = OptFn(x,y);
    a(cnt,:) = [x y];
    f(cnt) = z;
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);

Example 7: Newton’s with Steepest Descent Safeguard

[Figure: Euclidean position error versus iteration (0 to 90) for safeguarded Newton.]

axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc NewtonsContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.0 + (-1:0.02:1),-2.4 + (-1:0.02:1));
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');

Example 7: Relevant MATLAB Code

function [] = Newtons();
%clear all;
close all;
ns = 100;
x = -3; % Starting x
y = 1;  % Starting y
b0 = 1;
a = zeros(ns,2);
f = zeros(ns,1);
[z,g,H] = OptFn(x,y);
a(1,:) = [x y];
f(1) = z;
for cnt = 2:ns,
    d = -inv(H)*g;
    if d'*g>0, % Revert to steepest descent if d is not a direction of descent
        %fprintf('(%2d of %2d) Min. Eig: %5.3f Reverting...\n',cnt,ns,min(eig(H)));
        d = -g;
    end;
    d = d/norm(d);
    [b,fmin] = LineSearch([x y]',d,b0,100);
    %a(cnt,:) = (a(cnt-1,:)' - inv(H)*g)'; % Pure Newton's Method
    x = x + b*d(1);

Newton’s Method Pros and Cons

a_{k+1} = a_k − H(a_k)^{−1} ∇f(a_k)

+ Very fast convergence near local minima

− Not guaranteed to converge (may actually diverge)

− Requires the p × p Hessian

− Requires a p × p matrix inverse that uses O(p³) operations

ylabel('Y');
zoom on;
AxisSet(8);
print -depsc NewtonsContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc NewtonsPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);

Levenberg-Marquardt

1. Determine if ε_k I + H(a_k) is positive definite. If not, ε_k := 4ε_k and repeat.

2. Solve the following equation for a_{k+1}:

[ε_k I + H(a_k)] (a_{k+1} − a_k) = −∇f(a_k)

3. Compute the ratio of the actual to the predicted decrease,

r_k ≡ (f(a_k) − f(a_{k+1})) / (q(a_k) − q(a_{k+1}))

where q(a) is the quadratic approximation of f(a) based on f(a_k), ∇f(a_k), and H(a_k)

4. If r_k < 0.25, then ε_{k+1} := 4ε_k
If r_k > 0.75, then ε_{k+1} := ε_k/2
If r_k ≤ 0, then a_{k+1} := a_k

5. If not converged, k := k + 1 and loop to 1.
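One damped step of the algorithm above, as a minimal added sketch (gradFn and hessFn are hypothetical stand-ins; steps 3-4 are indicated in comments):

% One Levenberg-Marquardt-style step on a made-up objective.
gradFn = @(a) [4*a(1)^3 + 2*a(1); 2*(a(2)+2)];
hessFn = @(a) [12*a(1)^2 + 2, 0; 0, 2];
a = [3; 0];  epsk = 1e-4;                 % damping parameter eps_k
g = gradFn(a);  H = hessFn(a);
while min(eig(epsk*eye(2) + H)) <= 0      % 1. force positive definiteness
    epsk = 4*epsk;
end
aNew = a - (epsk*eye(2) + H)\g;           % 2. solve for a_{k+1}
% 3-4. compare the actual decrease f(a)-f(aNew) with the decrease
% predicted by the quadratic model q, then grow or shrink eps_k.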

grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc NewtonsErrorLinear;

Example 8: Levenberg-Marquardt

[Figure: zoomed contour plot of the Levenberg-Marquardt path near the minimum; X from 1.5 to 2.5, Y from −3.5 to −2.5.]

Levenberg-Marquardt Comments

• Similar to Newton’s method

• Has safety provisions for regions where the quadratic approximation is inappropriate

• Compare

Newton’s: a_{k+1} = a_k − H(a_k)^{−1} ∇f(a_k)
LM:       [ε_k I + H(a_k)] (a_{k+1} − a_k) = −∇f(a_k)

• If ε = 0, these are equivalent

• If ε → ∞, a_{k+1} → a_k

• ε is chosen to ensure that the smallest eigenvalue of ε_k I + H(a_k) is positive and sufficiently large (≥ δ)

Example 8: Levenberg-Marquardt

[Figure: function value versus iteration (0 to 25) for Levenberg-Marquardt.]

Example 8: Levenberg-Marquardt

[Figure: contour plot of the objective with the Levenberg-Marquardt search path; X and Y axes span −5 to 5.]

    y = a(cnt,2);
    zo = zn; % Old function value
    zn = OptFn(x,y);
    xd = (a(cnt,:)'-ap);
    qo = zo;
    qn = zn + g'*xd + 0.5*xd'*H*xd;
    if qo==qn, % Test for convergence
        x = a(cnt,1);
        y = a(cnt,2);
        a(cnt:ns,:) = ones(ns-cnt+1,1)*[x y];
        f(cnt:ns,:) = OptFn(x,y);
        break;
    end;
    r = (zo-zn)/(qo-qn);
    if r<0.25,
        eta = eta * 4;
    elseif r>0.50, % 0.75 is recommended, but much slower
        eta = eta / 2;
    end;
    if zn>zo, % Back up
        a(cnt,:) = a(cnt-1,:);
    else
        ap = a(cnt,:)';
    end;
    x = a(cnt,1);

Example 8: Levenberg-Marquardt

[Figure: Euclidean position error versus iteration (0 to 25) for Levenberg-Marquardt.]

    y = a(cnt,2);
    a(cnt,:) = [x y];
    f(cnt) = OptFn(x,y);
    %disp([cnt a(cnt,:) f(cnt) r eta])
end;
[x,y] = meshgrid(0+(-0.01:0.001:0.01),3+(-0.01:0.001:0.01));
[z,dzx,dzy] = OptFn(x,y);
[zopt,id1] = min(z);
[zopt,id2] = min(zopt);
id1 = id1(id2);
xopt = x(id1,id2);
yopt = y(id1,id2);
[x,y] = meshgrid(1.883+(-0.02:0.001:0.02),-2.963+(-0.02:0.001:0.02));
[z,dzx,dzy] = OptFn(x,y);
[zopt2,id1] = min(z);
[zopt2,id2] = min(zopt2);
id1 = id1(id2);
xopt2 = x(id1,id2);
yopt2 = y(id1,id2);
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(-5:0.1:5,-5:0.1:5);
z = OptFn(x,y);
contour(x,y,z,50);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');

Example 8: Relevant MATLAB Code

function [] = LevenbergMarquardt();
%clear all;
close all;
ns = 26;
x = -3; % Starting x
y = 1;  % Starting y
eta = 0.0001;
a = zeros(ns,2);
f = zeros(ns,1);
[zn,g,H] = OptFn(x,y);
a(1,:) = [x y];
f(1) = zn;
ap = [x y]'; % Previous point
for cnt = 2:ns,
    [zn,g,H] = OptFn(x,y);
    while min(eig(eta*eye(2)+H))<0,
        eta = eta * 4;
    end;
    a(cnt,:) = (ap - inv(eta*eye(2)+H)*g)';
    x = a(cnt,1);

set(gca,'Box','Off');
AxisSet(8);
print -depsc LevenbergMarquardtErrorLinear;

hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
h = plot(xopt,yopt,'kx',xopt,yopt,'rx');
set(h(1),'LineWidth',1.5);
set(h(2),'LineWidth',0.5);
set(h(1),'MarkerSize',5);
set(h(2),'MarkerSize',4);
hold off;
xlabel('X');
ylabel('Y');
zoom on;
AxisSet(8);
print -depsc LevenbergMarquardtContourA;
figure;
FigureSet(1,4.5,2.75);
[x,y] = meshgrid(1.5:0.01:2.5,-3.5:0.01:-2.5);
z = OptFn(x,y);
contour(x,y,z,75);
h = get(gca,'Children');
set(h,'LineWidth',0.2);
axis('square');
hold on;
h = plot(a(:,1),a(:,2),'k',a(:,1),a(:,2),'r');
set(h(1),'LineWidth',1.2);
set(h(2),'LineWidth',0.6);
hold off;
xlabel('X');
ylabel('Y');

Levenberg-Marquardt Pros and Cons

[ε_k I + H(a_k)] (a_{k+1} − a_k) = −∇f(a_k)

• Many equivalent formulations

+ No line search required

+ Can be used with approximations to the Hessian

+ Extremely fast convergence (2nd order)

− Requires gradient and Hessian (or approximate Hessian)

− Requires O(p³) operations for each solution to the key equation

zoom on;
AxisSet(8);
print -depsc LevenbergMarquardtContourB;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
xerr = (sum(((a-ones(ns,1)*[xopt2 yopt2])').^2)').^(1/2);
h = plot(k-1,xerr,'b');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Euclidean Position Error');
xlim([0 ns-1]);
ylim([0 xerr(1)]);
grid on;
set(gca,'Box','Off');
AxisSet(8);
print -depsc LevenbergMarquardtPositionError;
figure;
FigureSet(2,4.5,2.75);
k = 1:ns;
h = plot(k-1,f,'b',[0 ns],zopt*[1 1],'r',[0 ns],zopt2*[1 1],'g');
set(h(1),'Marker','.');
set(h,'MarkerSize',6);
xlabel('Iteration');
ylabel('Function Value');
ylim([0 f(1)]);
xlim([0 ns-1]);
grid on;

Optimization Algorithm Summary

Algorithm             Convergence   Stable   ∇f(a)   H(a)   LS
Cyclic Coordinate     Slow          Y        N       N      Y
Steepest Descent      Slow          Y        Y       N      Y
Conjugate Gradient    Fast          N        Y       N      Y
PARTAN                Fast          Y        Y       N      Y
Newton’s Method       Very Fast     N        Y       Y      N
Levenberg-Marquardt   Very Fast     Y        Y       Y      N

(LS = whether a line search is required)