7/29/2019 NLP Unconstrained Multivariable
Optimization in Engineering Design
Georgia Institute of Technology
Systems Realization Laboratory
What you can do for one variable, you can do for many (in principle).
Method of Steepest Descent
The method of steepest descent (also known as the gradient method) is the simplest example of a gradient-based method for minimizing a function of several variables.

Its core is the following recursion formula:

    x_{k+1} = x_k - α_k ∇F(x_k)

x_k, x_{k+1} = values of the variables in the k-th and (k+1)-th iteration
F(x) = objective function to be minimized (or maximized)
∇F = gradient of the objective function, constituting the direction of travel
α_k = the size of the step in the direction of travel

Advantage: Simple.
Disadvantage: Seldom converges reliably.

Remember: Direction = d_k = S^(k) = -∇F(x^(k))

Refer to Section 3.5 for Algorithm and Stopping Criteria.
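As an illustration, the recursion above can be sketched in plain Python. The quadratic test function, fixed step size, and stopping tolerance below are assumptions made for this example, not taken from the slides:

```python
def steepest_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10_000):
    """Minimize F via x_{k+1} = x_k - step * grad_F(x_k) with a fixed step size."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:   # stop when ||grad F|| is small
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# Made-up example: F(x) = (x0 - 1)^2 + 2*(x1 + 2)^2, minimum at (1, -2)
grad_F = lambda x: [2 * (x[0] - 1), 4 * (x[1] + 2)]
print(steepest_descent(grad_F, [0.0, 0.0]))  # close to [1.0, -2.0]
```

Even on this well-conditioned quadratic the fixed step takes dozens of iterations, which is the slow convergence the later slides set out to fix.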
Newton's Method (multi-variable case)
How do we extend Newton's method to the multivariable case?

    x_{k+1} = x_k - y'(x_k) / y''(x_k)

Is this correct? No. Why?

Start again with the Taylor expansion:

    y(x) = y(x_k) + ∇y(x_k)^T (x - x_k) + 0.5 (x - x_k)^T H(x_k) (x - x_k)

Note that H is the Hessian, containing the second-order derivatives. The remainder is dropped. Significance?

    x_{k+1} = x_k - ∇y(x_k) H(x_k)

Is this correct? Not yet. Why?

Newton's method for finding an extreme point is

    x_{k+1} = x_k - H^{-1}(x_k) ∇y(x_k)

Like the Steepest Descent Method, Newton's method searches in the negative gradient direction. See Sec. 1.4. Don't confuse H^{-1} with the step size α.
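A minimal sketch of one multivariable Newton step, solving the 2x2 system H d = ∇y directly rather than forming H^{-1}. The quadratic test function is a made-up example; on a quadratic, Newton reaches the extreme point in a single step:

```python
def newton_step_2d(grad, hess, x):
    """One Newton step: x_next = x - H(x)^{-1} grad(x), via the explicit 2x2 inverse."""
    g, H = grad(x), hess(x)
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    # Solve H d = g using the 2x2 inverse formula, then step to x - d
    d0 = ( H[1][1] * g[0] - H[0][1] * g[1]) / det
    d1 = (-H[1][0] * g[0] + H[0][0] * g[1]) / det
    return [x[0] - d0, x[1] - d1]

# Made-up quadratic y(x) = x0^2 + 3*x1^2 - 4*x0 + 6*x1, minimum at (2, -1)
grad_y = lambda x: [2 * x[0] - 4, 6 * x[1] + 6]
hess_y = lambda x: [[2.0, 0.0], [0.0, 6.0]]
print(newton_step_2d(grad_y, hess_y, [10.0, 10.0]))  # [2.0, -1.0] in one step
```

For general n one would solve H d = g with a linear solver instead of inverting H, which is part of why the next slide calls the inverse Hessian expensive.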
Properties of Newton's Method
Good properties (fast convergence) if started near the solution.
However, it needs modifications if started far away from the solution.
Also, the (inverse) Hessian is expensive to calculate.
To overcome this, several modifications are often made.
One of them is to add a search parameter α in front of the Hessian (similar to steepest descent). This is often referred to as the modified Newton's method.
Other modifications focus on enhancing the properties of the combined first- and second-order gradient information. Quasi-Newton methods build up curvature information by observing the behavior of the objective function and its first-order gradient. This information is used to generate an approximation of the Hessian.
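As one concrete (hypothetical) illustration of the quasi-Newton idea, the symmetric rank-1 (SR1) update builds curvature information from a step s and the observed gradient change y, so that the updated approximation satisfies the secant condition H_new s = y. The quadratic test function is a made-up example:

```python
def sr1_update(H, s, y):
    """Symmetric rank-1 quasi-Newton update of a Hessian approximation H (2x2 case).
    s = x_{k+1} - x_k, y = g_{k+1} - g_k; the result satisfies H_new s = y."""
    r = [y[i] - sum(H[i][j] * s[j] for j in range(2)) for i in range(2)]  # r = y - H s
    denom = sum(r[i] * s[i] for i in range(2))                            # r^T s
    return [[H[i][j] + r[i] * r[j] / denom for j in range(2)] for i in range(2)]

# Observe the gradient of a made-up quadratic f(x) = x0^2 + 3*x1^2 along one step
grad_f = lambda x: [2 * x[0], 6 * x[1]]
x_old, x_new = [1.0, 1.0], [0.5, 0.2]
s = [x_new[i] - x_old[i] for i in range(2)]
y = [grad_f(x_new)[i] - grad_f(x_old)[i] for i in range(2)]
H = sr1_update([[1.0, 0.0], [0.0, 1.0]], s, y)   # start from the identity
print([sum(H[i][j] * s[j] for j in range(2)) for i in range(2)])  # matches y
```

Practical quasi-Newton codes more commonly use the BFGS update, but the principle is the same: no second derivatives are ever evaluated.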
Conjugate Directions Method
Conjugate direction methods can be regarded as somewhat in between steepest descent and Newton's method, having the positive features of both.

Motivation: Desire to accelerate the slow convergence of steepest descent, but avoid the expensive evaluation, storage, and inversion of the Hessian.

Application: Conjugate direction methods are invariably invented and analyzed for the quadratic problem:

    Minimize: (1/2) x^T Q x - b^T x

Note: The condition for optimality is ∇y = Qx - b = 0, or Qx = b (a linear equation).

Note: The textbook uses A instead of Q.
Basic Principle
Definition: Given a symmetric matrix Q, two vectors d1 and d2 are said to be Q-orthogonal or Q-conjugate (with respect to Q) if d1^T Q d2 = 0.

Note that orthogonal vectors (d1^T d2 = 0) are a special case of conjugate vectors.

So, since the vectors d_i are independent, the solution to the n x n quadratic problem can be rewritten as

    x* = α_0 d_0 + ... + α_{n-1} d_{n-1}

Multiplying by Q and taking the scalar product with d_i, you can express α_i in terms of d, Q, and either x* or b.

Note that A is used instead of Q in your textbook.
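A small numeric check of the definition (the matrix Q and the directions below are made-up example values): two directions can be Q-conjugate without being orthogonal in the ordinary sense.

```python
def q_inner(d1, Q, d2):
    """Compute d1^T Q d2 for 2-vectors."""
    return sum(d1[i] * Q[i][j] * d2[j] for i in range(2) for j in range(2))

Q = [[4.0, 1.0], [1.0, 3.0]]          # symmetric positive definite (example)
d1 = [1.0, 0.0]
d2 = [-1.0, 4.0]                      # chosen so that d1^T Q d2 = -4 + 4 = 0
print(q_inner(d1, Q, d2))             # 0.0: d1 and d2 are Q-conjugate
print(d1[0] * d2[0] + d1[1] * d2[1])  # -1.0: yet not orthogonal in the usual sense
```

Setting Q to the identity recovers ordinary orthogonality, which is the special case noted above.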
Conjugate Gradient Method
The conjugate gradient method is the conjugate direction method obtained by selecting the successive direction vectors as a conjugate version of the successive gradients obtained as the method progresses. You generate the conjugate directions as you go along.

Search direction at iteration k:

    d_k = -g_k + Σ_{i=0}^{k-1} β_{ki} d_i    or    d_{k+1} = -g_{k+1} + β_k d_k

Three advantages:
1) The gradient is always nonzero and linearly independent of all previous direction vectors.
2) Simple formula to determine the new direction; only slightly more complicated than steepest descent.
3) The process makes good progress because it is based on gradients.
Pure Conjugate Gradient Method (Quadratic Case)

0 - Starting at any x_0, define d_0 = -g_0 = b - Q x_0, where g_k is the column vector of gradients of the objective function at the point x_k.

1 - Using d_k, calculate the new point x_{k+1} = x_k + α_k d_k, where

    α_k = - (g_k^T d_k) / (d_k^T Q d_k)

2 - Calculate the new conjugate gradient direction d_{k+1}, according to

    d_{k+1} = -g_{k+1} + β_k d_k,  where  β_k = (g_{k+1}^T Q d_k) / (d_k^T Q d_k)

This is slightly different from your current textbook. Note that α_k is calculated directly (no line search is needed in the quadratic case).
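Steps 0-2 above can be sketched for a small 2x2 system (Q, b, and the starting point are made-up example values; on an n-dimensional quadratic, the method terminates in at most n steps):

```python
def conjugate_gradient(Q, b, x, iters=2):
    """Pure CG for minimizing 0.5 x^T Q x - b^T x, i.e. solving Q x = b (n = 2 case).
    alpha_k = -g_k^T d_k / d_k^T Q d_k,  beta_k = g_{k+1}^T Q d_k / d_k^T Q d_k."""
    mv = lambda M, v: [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    g = [gi - bi for gi, bi in zip(mv(Q, x), b)]     # g_0 = Q x_0 - b
    d = [-gi for gi in g]                            # d_0 = -g_0
    for _ in range(iters):
        Qd = mv(Q, d)
        alpha = -dot(g, d) / dot(d, Qd)              # step length, no line search
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g = [gi - bi for gi, bi in zip(mv(Q, x), b)]
        beta = dot(g, Qd) / dot(d, Qd)               # conjugacy coefficient
        d = [-gi + beta * di for gi, di in zip(g, d)]
    return x

Q, b = [[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]
print(conjugate_gradient(Q, b, [2.0, 1.0]))  # solves Qx = b exactly in n = 2 steps
```

Here the solution of Qx = b is (1/11, 7/11), and two CG steps reach it to machine precision.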
Non-Quadratic Conjugate Gradient Methods
For non-quadratic cases, you have the problem that you do not know Q, and you would have to make an approximation.

One approach is to substitute the Hessian H(x_k) for Q. The problem is that the Hessian then has to be evaluated at each point.

Other approaches avoid Q completely by using line searches.

Examples: the Fletcher-Reeves and Polak-Ribiere methods.

Differences from the pure conjugate gradient algorithm: α_k is found through a line search, and different formulas are used for calculating β_k.
Polak-Ribiere & Fletcher-Reeves Methods for Minimizing f(x)

0 - Starting at any x_0, define d_0 = -g_0, where g is the column vector of gradients of the objective function at the point x.

1 - Using d_k, find the new point x_{k+1} = x_k + α_k d_k, where α_k is found using a line search that minimizes f(x_k + α_k d_k).

2 - Calculate the new conjugate gradient direction d_{k+1}, according to

    d_{k+1} = -g_{k+1} + β_k d_k

where β_k can vary depending on which (update) formula you use.

Fletcher-Reeves:

    β_k = (g_{k+1}^T g_{k+1}) / (g_k^T g_k)

Polak-Ribiere:

    β_k = (g_{k+1}^T (g_{k+1} - g_k)) / (g_k^T g_k)

Note: g_{k+1} is the gradient of the objective function at point x_{k+1}.
Fletcher-Reeves Method for Minimizing f(x)
0 - Starting at any x_0, define d_0 = -g_0, where g is the column vector of gradients of the objective function at the point x.

1 - Using d_k, find the new point x_{k+1} = x_k + α_k d_k, where α_k is found using a line search that minimizes f(x_k + α_k d_k).

2 - Calculate the new conjugate gradient direction d_{k+1}, according to

    d_{k+1} = -g_{k+1} + β_k d_k

where

    β_k = (g_{k+1}^T g_{k+1}) / (g_k^T g_k)

See also Example 3.9 (page 73) in your textbook.
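The Fletcher-Reeves recipe above can be sketched in Python. The line search here is a simple ternary search standing in for an exact minimization, and the test function, search interval, and iteration counts are made-up example choices:

```python
def fletcher_reeves(grad, f, x, outer=10):
    """Fletcher-Reeves CG: d_{k+1} = -g_{k+1} + beta_k d_k,
    beta_k = (g_{k+1}^T g_{k+1}) / (g_k^T g_k), alpha_k from a line search."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    g = grad(x)
    d = [-gi for gi in g]
    for _ in range(outer):
        # Line search: minimize phi(a) = f(x + a*d) by ternary search on [0, 2]
        phi = lambda a: f([xi + a * di for xi, di in zip(x, d)])
        lo, hi = 0.0, 2.0
        for _ in range(100):
            m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
            if phi(m1) < phi(m2):
                hi = m2
            else:
                lo = m1
        a = (lo + hi) / 2
        x = [xi + a * di for xi, di in zip(x, d)]
        g_new = grad(x)
        if dot(g_new, g_new) < 1e-18:              # converged: gradient ~ 0
            break
        beta = dot(g_new, g_new) / dot(g, g)       # Fletcher-Reeves update
        d = [-gi + beta * di for gi, di in zip(g_new, d)]
        g = g_new
    return x

# Made-up quadratic f(x) = (x0 - 1)^2 + 2*(x1 + 2)^2, minimum at (1, -2)
f = lambda x: (x[0] - 1) ** 2 + 2 * (x[1] + 2) ** 2
grad_f = lambda x: [2 * (x[0] - 1), 4 * (x[1] + 2)]
print(fletcher_reeves(grad_f, f, [0.0, 0.0]))  # converges to about [1.0, -2.0]
```

Only gradients and function values are needed; Q never appears, which is exactly the point of the line-search variants.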
Conjugate Gradient Method Advantages
Attractive are the simple formulae for updating the direction vector. The method is slightly more complicated than steepest descent, but converges faster. Recommended.

For animations of all the preceding search techniques, see 'em in action at:

http://www.esm.vt.edu/~zgurdal/COURSES/4084/4084-Docs/Animation.html