- Home
- Documents
*Higher-Order Derivatives in Engineering Derivatives in Engineering Applications, ... By applying...*

prev

next

out of 61

View

214Category

## Documents

Embed Size (px)

Higher-Order Derivatives in Computational Systems Engineering

Problem Solving

Wolfgang MarquardtRWTH Aachen, AVT.PSE, CCES and AICES

5th International Conference on Automatic Differentiation AD08Bonn, August 11-15, 2008

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 2

AD and its Applications

Automatic Differentiation (AD) is a set of techniques based on the mechanical application of the chain rule to obtain derivatives of a function given as a computer program.

AD is used in the following areas: Numerical Methods Sensitivity Analysis Design Optimization Data Assimilation & Inverse Problems

www.autodiff.org

http://www.autodiff.org/

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 3

Applications of AD in this Talk

Numerical Methods nonlinear, differential / algebraic equation solving bifurcation analysis nonlinear programming Sensitivity Analysis first and second order sensitivitiesDesign Optimization under uncertaintyInverse Problems state estimation (data assimiliation) parameter and function estimation model-based optimal experimental design

Optimal Control trajectory optimization dynamic real-time optimization

topics listed at www.autodiff.org

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 4

Higher-Order Derivatives and AD

AD exploits the fact that every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations such as additions or elementary functions such as exp(). By applying the chain rule of derivative calculus repeatedly to these operations, derivatives of arbitrary order can be computed automatically, and accurate to working precision.

Are higher-order derivatives just a nice-to-have in applications ? Nonlinear Programming Optimal Control Dynamic Real-time Optimization Inverse Problems Model-based Experimental Design Design Optimization under Uncertainty

www.autodiff.org

http://www.autodiff.org/

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 5

AD tools implement the semantic transformation that systematically applies the chain rule of differential calculus to source code written in various programming languages. 29 AD tools listed on www.autodiff.org

Higher order derivatives from AD have not reached the level of attention (at least in applications) they deserve !

Are current tools only supporting expert users?Is there sufficient awareness of higher-order AD in applications ?

AD Tools for the User

derivative order # tools # citations1 18 2962 6 17n 5 45

http://www.autodiff.org/

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 6

Agenda

Introduction AD for Higher Order Derivatives and their Applications AD Tools for the User

Higher-Order Derivatives in Applications Numerical Methods Nonlinear Programming Optimal Control Dynamic Real-time Optimization Inverse Problems Model-Based Experimental Design Design Optimization under Uncertainty

Software Tools Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 7

Agenda

Introduction AD for Higher Order Derivatives and their Applications AD Tools for the User

Higher-Order Derivatives in Applications Numerical Methods Nonlinear Programming Optimal Control Dynamic Real-time Optimization Inverse Problems Model-Based Experimental Design Design Optimization under Uncertainty

Software Tools Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 8

Numerical Optimization Methods

Why optimize ?Engineering design means to search a large space of decision variables for an appropriate solution. Optimization supports this search effectively.

Applications often result in Nonlinear Programming problems (NLP):

Solution algorithms are based on first order necessary (Karush-Kuhn-Tucker) conditions.

I E i ,0(x)cs.t.

(x) min

i

x

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 9

Necessary Conditions of Optimality

Lagrange function

first-order necessary (KKT) conditions

Iixc

Ii

Iixc

ixc

xL

ii

i

i

i

x

=

=

=

,0)(

,0

,0)(

,0)(

0),(

**

*

*

*

**

)()(),()(

xcxfxL ixAi

i

=

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 10

Hessian of Lagrange Function ?

gradient based evaluation of KKT conditions requires Hessian of Lagrange function

first-order BFGS updates of approximate Hessian vs.

exact Hessian

improved robustness and faster convergence with exact Hessian and suitable NLP solvers warm start facilitated by exact Hessian in repetitive optimization check of sufficient conditions of optimality requires projected Hessian of Lagrange function

NLP algorithms evaluate symmetrically projected Hessiancan be and should be exploited by AD algorithms

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 11

Agenda

Introduction AD for Higher Order Derivatives and their Applications AD Tools for the User

Higher-Order Derivatives in Applications Numerical Methods Nonlinear Programming Optimal Control Dynamic Real-time Optimization Inverse Problems Model-Based Experimental Design Design Optimization under Uncertainty

Software Tools Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 12

0),,,,( s.t.

min,

=avxvxf

t fta f&& Finish

.)(,0)(ftff

xtxtv =

Start

0)0(),0( =vx

max)( vtv

0 10 20 30

-1

-0.5

0

0.5

1MAX

MIN

PATH

acceleration a

time [second]0 10 20 300

2

4

6

8

10

0 10 20 30 0

50

100

150

200

250

velocity vdistance x

time [second]

A Simple Optimal Control Problem

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 13

Optimal Control Optimization with ODEs

Mayer-type optimal control problemfixed final time without loss of generality

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 14

Semi-Discretization by Control Vector Parameterization

ui(t)

i,j(t)

pij

discretization of the continuous control u(t) by projection intofinite-dimensional function space (e.g. low-order B-splines)

example: piecewise constantapproximation of the control

finite dimensional decision vector p

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 15

Semi-Discretization Results in NLP

Lagrangian functional

How to efficiently determine the Hessian of the Lagrangian ?

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 16

Composite Adjoint Approach for Hessian Evaluation

composite adjoint approach is a slight modification of second-order adjoint sensitivity analysis

set of ODEs involving terms with second-orderderivatives but partly separable functions

(zyurt & Barton, SISC, 2005)

(Hannemann & Marquardt, 2008)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 17

Example: Optimal Control of van der Pol Oscillator (1)

0

10

20

30

40

50

60

np=25 np=50 np=100 np=250

BFGS (1st-order derivatives)

Hessian (discrete compositeadjoints, RK4)qual. Hessian (discretecomposite adjoint, Euler)

number of NLP iterations

total CPU time for optimization (sec)30 % reduction in NLPiterationsreduced computational effortimproved robustness

(Hannemann & Marquardt, 2008)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 18

Hessian evaluation scales cost for high-accuracy approximate Hessian is same as for gradient evaluation !

scaled CPU time for Jacobian and Hessian evaluation [msec / # parameters]

Jacobian (RK4): forward sensitivities, RK4HESS (RK4): exact Hessian, discrete composite adjoints, RK4HESS (Euler): approx. Hessian, discrete composite adjoints, Euler

(Hannemann & Marquardt, 2008)

Example: Optimal Control of van der Pol Oscillator (2)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 19

adaptive discretization of control #3initial guess: 25 pars. adaptive parameterization at final solution: 129 pars.equivalent equidistant parameterization: 3072 pars.

CPU time per sensitivity integration (first-order derivative): > 2 h total CPU time for (adaptive) optimization: > 1 month

A Real World Problem

large-scale chemical plant (Shell)optimal load change14.000 DAEs, 4 controls, sparse Jacobiantime horizon >> 24 hrs

successful solution of with adaptive discretzation (~4 million sensitivity equations)

non-adaptive solution impossible (~100 million sensitivity equations)

next challenge: use of second-order derivatives

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 20

Agenda

Introduction AD for Higher Order Derivatives and their Applications AD Tools for the User

Higher-Order Derivatives in Applications Numerica