61
Higher-Order Derivatives in Computational Systems Engineering Problem Solving Wolfgang Marquardt RWTH Aachen, AVT.PSE, CCES and AICES 5th International Conference on Automatic Differentiation – AD’08 Bonn, August 11-15, 2008

Higher-Order Derivatives in Engineering · PDF fileHigher-Order Derivatives in Engineering Applications, ... By applying the chain rule of derivative calculus ... Higher order derivatives

  • Upload
    lamque

  • View
    228

  • Download
    2

Embed Size (px)

Citation preview

Higher-Order Derivatives in Computational Systems Engineering

Problem Solving

Wolfgang MarquardtRWTH Aachen, AVT.PSE, CCES and AICES

5th International Conference on Automatic Differentiation – AD’08Bonn, August 11-15, 2008

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 2

AD and its Applications

Automatic Differentiation (AD) is a set of techniques based on the mechanical application of the chain rule to obtain derivatives of a function given as a computer program.

AD is used in the following areas: • Numerical Methods • Sensitivity Analysis • Design Optimization • Data Assimilation & Inverse Problems

www.autodiff.org

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 3

Applications of AD in this Talk

Numerical Methods• nonlinear, differential / algebraic equation solving• bifurcation analysis• nonlinear programming Sensitivity Analysis• first and second order sensitivitiesDesign Optimization• …• under uncertaintyInverse Problems• state estimation (data assimiliation)• parameter and function estimation • model-based optimal experimental design

Optimal Control• trajectory optimization • dynamic real-time optimization

topics listed at www.autodiff.org

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 4

Higher-Order Derivatives and AD

AD exploits the fact that every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations such as additions or elementary functions such as exp(). By applying the chain rule of derivative calculus repeatedly to these operations, derivatives of arbitrary order can be computed automatically, and accurate to working precision.

Are higher-order derivatives just a nice-to-have in applications ?• Nonlinear Programming• Optimal Control• Dynamic Real-time Optimization • Inverse Problems – Model-based Experimental Design• Design Optimization under Uncertainty

www.autodiff.org

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 5

AD tools implement the semantic transformation that systematically applies the chain rule of differential calculus to source code written in various programming languages. 29 AD tools listed on www.autodiff.org

Higher order derivatives from AD have not reached the level of attention (at least in applications) they deserve !

Are current tools only supporting expert users?Is there sufficient awareness of higher-order AD in applications ?

AD Tools – for the User

derivative order # tools # citations1 18 2962 6 17n 5 45

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 6

Agenda

Introduction• AD for Higher Order Derivatives and their Applications• AD Tools for the User

Higher-Order Derivatives in Applications• Numerical Methods – Nonlinear Programming • Optimal Control • Dynamic Real-time Optimization• Inverse Problems – Model-Based Experimental Design • Design Optimization under Uncertainty

Software Tools Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 7

Agenda

Introduction• AD for Higher Order Derivatives and their Applications• AD Tools for the User

Higher-Order Derivatives in Applications• Numerical Methods – Nonlinear Programming • Optimal Control • Dynamic Real-time Optimization• Inverse Problems – Model-Based Experimental Design • Design Optimization under Uncertainty

Software Tools Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 8

Numerical Optimization Methods

Why optimize ?Engineering design means to search a large space of decision variables for an appropriate solution. Optimization supports this search effectively.

Applications often result in Nonlinear Programming problems (NLP):

Solution algorithms are based on first order necessary (Karush-Kuhn-Tucker) conditions.

I E i ,0(x)cs.t.

(x) min

i

x

∪∈≥

φ

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 9

Necessary Conditions of Optimality

Lagrange function

first-order necessary (KKT) conditions

Iixc

Ii

Iixc

ixc

xL

ii

i

i

i

x

∪Ε∈∀=

∈∀≥

∈∀≥

Ε∈∀=

=∇

,0)(

,0

,0)(

,0)(

0),(

**

*

*

*

**

λ

λ

λ

)()(),()(

xcxfxL ixAi

i∑∈

−= λλ

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 10

Hessian of Lagrange Function ?

gradient based evaluation of KKT conditions requires Hessian of Lagrange function

first-order BFGS updates of approximate Hessian vs.

exact Hessian

improved robustness and faster convergence with exact Hessian and suitable NLP solvers warm start facilitated by exact Hessian in repetitive optimization check of sufficient conditions of optimality requires projected Hessian of Lagrange function

NLP algorithms evaluate symmetrically projected Hessiancan be and should be exploited by AD algorithms

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 11

Agenda

Introduction• AD for Higher Order Derivatives and their Applications• AD Tools for the User

Higher-Order Derivatives in Applications• Numerical Methods – Nonlinear Programming • Optimal Control • Dynamic Real-time Optimization• Inverse Problems – Model-Based Experimental Design • Design Optimization under Uncertainty

Software Tools Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 12

0),,,,( s.t.

min,

=avxvxf

t fta f

&& Finish

.)(,0)(ftff xtxtv ≥=

Start

0)0(),0( =vx

max)( vtv ≤

0 10 20 30

-1

-0.5

0

0.5

1MAX

MIN

PATH

acceleration a

time [second]0 10 20 300

2

4

6

8

10

0 10 20 30 0

50

100

150

200

250

velocity vdistance x

time [second]

A Simple Optimal Control Problem

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 13

Optimal Control – Optimization with ODEs

Mayer-type optimal control problemfixed final time without loss of generality

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 14

Semi-Discretization by Control Vector Parameterization

ui(t)

φi,j(t)

pij

discretization of the continuous control u(t) by projection intofinite-dimensional function space (e.g. low-order B-splines)

example: piecewise constantapproximation of the control

finite dimensional decision vector p

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 15

Semi-Discretization Results in NLP

Lagrangian functional

How to efficiently determine the Hessian of the Lagrangian ?

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 16

Composite Adjoint Approach for Hessian Evaluation

composite adjoint approach is a slight modification of second-order adjoint sensitivity analysis

set of ODEs involving terms with second-orderderivatives but partly separable functions

(Özyurt & Barton, SISC, 2005)

(Hannemann & Marquardt, 2008)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 17

Example: Optimal Control of van der Pol Oscillator (1)

0

10

20

30

40

50

60

np=25 np=50 np=100 np=250

BFGS (1st-order derivatives)

Hessian (discrete compositeadjoints, RK4)qual. Hessian (discretecomposite adjoint, Euler)

number of NLP iterations

total CPU time for optimization (sec)30 % reduction in NLPiterationsreduced computational effortimproved robustness

(Hannemann & Marquardt, 2008)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 18

• Hessian evaluation scales • cost for high-accuracy approximate Hessian is same as for gradient evaluation !

scaled CPU time for Jacobian and Hessian evaluation [msec / # parameters]

Jacobian (RK4): forward sensitivities, RK4HESS (RK4): exact Hessian, discrete composite adjoints, RK4HESS (Euler): approx. Hessian, discrete composite adjoints, Euler

(Hannemann & Marquardt, 2008)

Example: Optimal Control of van der Pol Oscillator (2)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 19

adaptive discretization of control #3initial guess: 25 pars. adaptive parameterization at final solution: 129 pars.equivalent equidistant parameterization: 3072 pars.

• CPU time per sensitivity integration (first-order derivative): > 2 h• total CPU time for (adaptive) optimization: > 1 month

A Real World Problem

large-scale chemical plant (Shell)optimal load change14.000 DAEs, 4 controls, sparse Jacobiantime horizon >> 24 hrs

• successful solution of with adaptive discretzation (~4 million sensitivity equations)

• non-adaptive solution „impossible“ (~100 million sensitivity equations)

• next challenge: use of second-order derivatives

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 20

Agenda

Introduction• AD for Higher Order Derivatives and their Applications• AD Tools for the User

Higher-Order Derivatives in Applications• Numerical Methods – Nonlinear Programming • Optimal Control • Dynamic Real-time Optimization• Inverse Problems – Model-Based Experimental Design • Design Optimization under Uncertainty

Software Tools Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 21

Achieve economically optimal plant operation at any time !

DynamicReal-Time

Optimization

State and Parameter Estimation

Processmeasurements

states

controls

Chemical Process

Maximize profit

Applications of dynamic real-time optimizationoperation of chemical processes: batch or continuous during transientsflight management and many others

0 5 0 0 0 1 0 0 0 0 1 5 0 0 00 . 7 5

0 . 8

0 . 8 5

0 . 9

0 . 9 5

1

1 . 0 5

1 . 1

1 . 1 5

grade A

grade B

Dynamic Real-Time Optimization

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 22

Real-Time Optimization on Receding Horizon

process and model uncertainties require repetitive solution ofsimilar optimization problemsfast action requires short cycle times large-scale models result in challenging optimization problems

compute optimal controls u(t) to maximize profit on horizon

time

control horizon

prediction horizonestimation horizon

controls u

states x

measurements y

τ τ + Δτ

infer state x at current time from y(t) on estimation

horizon

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 23

Fast Sensitivity-based Solution Update

θ

parameterize uncertainty exploit sensitivity information of previously solved optimization problem to generate an approximation of the optimal update

⎥⎦

⎤⎢⎣

⋅−=⎥

⎤⎢⎣

⎥⎥⎦

⎤⋅−

⎢⎢⎣

)(

)(

0

)(

)(

)( ,

θ

θ

θ

λ g

Lpgg

L pTa

pTa

p

pp

0)(:

)()(:

)()(:

0

00

00

=−=Δ

Δ=−=Δ

Δ=−=Δ

inainaina

aaaa

pppp

λθλλ

θθλλθλλ

θθθ

θ

θ

Sensitivity system (Fiacco, 1983), invariant active set L: Lagrange functionf: objective functiong: constraintsp: discretized controls: uncertain param.

refrefrefp

refTp

refp

Trefpp

T

z

ggpg

pfpLpLp

+Δ−≥Δ

Δ+ΔΔ+ΔΔΔ

θ

θ

θ

θ

s.t.

5.0min ,

Changing active set (Ganesh & Biegler, 1987)

• compute first- and second-order derivatives

• solve QP for fast update• re-iterate if necessary

(Kadam & Marquardt, Dycops, 2004)

θθ ggfLL ppppp ,,,,

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 24

Example – Real-time Optimization of Semi-batch Reactor

Optimization problem

3

o

o

,

m 5)(0

C 90)(60

C 100)(20

kg/sec 784.5)(0 s.t.

)(yieldmax

≤≤

≤≤

≤≤

≤≤

f

r

w

B

fTF

tV

tT

tT

tF

twB

Uncertainties

k1

k2

k3

A+B CC+B P+EP+C G

C10a(t)]- 35[

%10sec 7.6666

3,...,1 ),15.273

exp(

o

1-0,1

=

+=

=+

=

in

r

iii

T

b

iT

bak

],[: 1bTin=θ

M

cooling

feed

Maximize product P at end of batch !

Tin, FB

Tw

(Würth, Hannemann, Kadam and Marquardt, 2008)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 26

Semi-batch Reactor Example – Fast Solution UpdateControl variables

Constrained variables

(Würth, Hannemann, Kadam and Marquardt, 2008)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 27

Semi-batch Reactor Example – Computational Performance

Time 31.25 62.5 93.75 125 156.25 187.5 218.75# NLPIt 5 1 1 1 1 1 1CPU(s) 6.25 1.29 1.13 1.11 1.09 1.06 1.04Obj. 1266.7 1265.9 1262.5 1248.9 1204.1 1128 1083.2

Time 31.25 62.5 93.75 125 156.25 187.5 218.75# NLPIt 21 5 3 6 7 8 8CPU(s) 9.58 3.49 6.09 3.53 3.78 4.25 3.89Obj. 1266.8 1265.9 1262.5 1249 1204.9 1130.4 1083.9

Exact Hessian faciliates favorable solution update !

Fast updates with second-order derivatives

Rigorous solution (only first-order derivatives)

(Würth, Hannemann, Kadam and Marquardt, 2008)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 28

Agenda

Introduction• AD for Higher Order Derivatives and their Applications• AD Tools for the User

Higher-Order Derivatives in Applications• Numerical Methods – Nonlinear Programming • Optimal Control • Dynamic Real-time Optimization• Inverse Problems – Model-Based Experimental Design • Design Optimization under Uncertainty

Software Tools Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 29

experimentexperiment measurementtechniques

experimental conditions

Model-Based Experimental Design

optimal experimental

design

numericalsimulation

computed statesand measurements

inputs, parameters,initial conditions

a-priori knowledge and intuition

mathematicalmodels

formulation and solutionof inverse problems

model structure,parameters,

inputs, states,confidence regions

sensor modelcalibration selection

measurements

… a model-assisted methodology for model building from experiments !

Marquardt, Chem. Eng. Res. Design, 2005

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 30

θ* θ

φ

φ*

φ(x(1))φ(x(2))

choose exp. conditions such that information

on parameter is maximized

=maximize curvature of

parameter estimation objective

θ* θ

φ

φ*

φ(x(1))φ(x(2))

θ* θ

φ

φ*

φ(x(1))φ(x(2))

parameter estimation objectivee.g. least-squares

error level

Optimal Experimental Design (OED) – Key Idea

OED for parameter estimation:

with

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 31

OED for Best Parameter Precision (1)

mathematical model

OED for improved parameter precision: „size“ of

Different metrics for

• D – optimality:

• E – optimality:

• A – optimality:

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 32

OED for Best Parameter Precision (2)

may involve sensitivities of the solution of a set of parametric evolution equations (ODE or PDE)

s.t. model equations θ∂∂ /y

θ∂∂ /y

first-order gradient-based NLP solversecond-order gradient-based NLP solver

second-order derivativesthird-order derivatives

numerical solution requires higher order derivatives !

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 33(Alsmeyer et al., Appl. Spec. 2003,2004; Bardow et al., AIChE J., 2003, 2006)

(experiments: Göke, Oertker, Koß, LTT, RWTH Aachen)

0.8 0.85 0.9 0.95 10

2

4

6

8

mole fraction [-]

leng

th [m

m]

0.8 0.82

7

7.5

time

t=t0

z [m

m]

mea

surin

g zo

ne

analysis of one-dimensional diffusionsimultaneous measurement of all mole fractionsnonlinear calibration high resolution (Δt = 10 s, Δz = 20 μm)measurement error: statistical ≤ 0.2 mol-%, systematic ≤ 0.5 mol-%

Example – Diffusion Modeling Using Spectroscopic Data

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 34

Experimental Design Issues

→ what mixture volume ratio?

→ where to measure?

→ how long to measure?

→ which mixture compositions?

optimal experimental designfor diffusion coefficient measurements

mea

surin

g zo

ne

z 0 z sen

sor

xL

xU

Bardow et al., AIChE J., 2003, 2006)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 35

mea

surin

g zo

ne

z 0 z sen

sor

xL

xU

example: acetone-benzene-methanol

measurements at the wall, i.e., restricted diffusion experiments• unequal volume of both phases, actual value depending on molar volume• optimal experiment duration depends on cell size → optimal Fourier number

Optimal Experimental Conditions - Visualization

maximuminformation

content

Bardow et al., AIChE J., 2003, 2006)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 36

OED-Problem for 1D Multi-Component Diffusion

subject to diffusion model:θ∂∂ /),( iji tzc

… at least 2nd, better 3rd

order sensitivities of solution to PDE model

Bardow et al., AIChE J., 2003, 2006)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 37

Agenda

Introduction• AD for Higher Order Derivatives and their Applications• AD Tools for the User

Higher-Order Derivatives in Applications• Numerical Methods – Nonlinear Programming • Optimal Control • Dynamic Real-time Optimization• Inverse Problems – Model-Based Experimental Design • Design Optimization under Uncertainty

Software Tools Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 38

time [min]te

mpe

ratu

re [C

]

reactor temperature of anindustrial ammonia synthesis:

Complexity and nonlinearities can lead to unexpected process behavior:

• ammonia synthesis on industrial scale since 1916• >80 years later still surprising behavior

Chemical processes are complex:• one unit operation is already complex

• plant is combination of many unit operations

• feedback and recycling increases complexity

Include dynamic properties and uncertainty in the process design!

Design Optimization – Constructive Nonlinear Dynamics

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 39

Simple Example

Nonlinear program

max ФF, SF

- steady state operating point- bounds on SF

subject to

0.13.0 ≤≤ FS kmol m-3

Semi-batch bioreactor

S, Substrate conc.X, Biomass density

F, SF

S, X

V

Process model of the bioreactor

,)()(

,)(

XSSSVF

dtdS

XSXVF

dtdX

F σ

μ

−−=

+−= ,)( 00 StS =

00 )( XtX =

bSaSSKSkSS

+=−=

)()(),/exp()( μσμ

1

0.8

0.6

0.4

0.2

0 1 2 3 4 5 6 7

φ

F

steady statesoptimum

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 40

Stability at the Design Optimum

4 5 6 7

0.4

0.6

0.8

1

F

SF

parameter space

0.4

t [s]

X [kg m-3]

2.4

2.8

SF[kmol m-3]

120 240 360 480

+10%

-10%

stable trial design

+1%SF

[kmol m-3]1.0

16X [kg m-3]

60 120 t [s]0

unstable optimal design

Unstable optimumnot operable!

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 41

Detection of Stability Boundary

4 5 6 7

0.4

0.6

0.8

1

F

SF

(1)

(3)

(2)Re

Im

eigenvalues

(1) (2) (3)

stableunstable

detection of stability boundaries by evaluating the eigenvalues

stability boundary separates parameter space (SF,F)stability boundary defined only implicitly

parameter space

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 42

Normal Vector Method

unstablestable

rΔ α1

Δ α2

α1

α2 model uncertainties][ )0( ααα Δ±∈

parameters not known exactly,drifting parameters

closest critical points is along the normal vector direction r (Dobson, 1993)

normal vector constraints for optimization problem(Mönnigmann & Marquardt, 2002)

Normal vector is one-dimensional regardless of the number of parameters

complexity grows only linear with the number of parameters

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 43

Design Optimization – Problem Formulation

max Ф Consider systematicallydesired dynamic behaviorprocess uncertainties

Nonlinear programmax ФF, SF

- steady state operating point- bounds on SF- normal vector constraints to stability boundary

subject to

0.13.0 ≤≤ FS kmol m-3

BioreactorF, SF

S, X

V

F

SF

Critical manifolds• stability boundaries• eigenvalue bounds• feasibility constraints

Consider systematically

• desired dynamic behavior• process uncertainties

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 44

Design Optimization with Normal Vector Constraints

objective

normal vector r(i) for critical manifold i

distance between operating pointand critical manifold i

steady state (x(eq), p, α(eq) )

Ii

nrl

rlrpxG

pxf

px

pii

iieqi

iiii

eqeq

eqeq

lpx ieqeq

,,1

,

,),,,,(0

),,,(0

,,max

)()(

)()()(

)()()()(

)()(

)()(

,,, )()()(

K=

+=

=

=

)(

)( αα

α

α

αφα

• stability boundaries• feasibility constraints• performance constraints (eigenvalue sectors)• higher codimension bifurcations (cusp, isola, ...)

Types of critical manifolds:

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 45

Normal Vector Equation Systems

Saddle-node bifurcation

vfrvvvf

f

T

T

x

α−=

−=

==

010

00

Grazing bifurcation

augmented system

rhxhxfxfx

hxhttxh

xxttxfx

x

x

tx

−+==+=

+==

==

αα

αααα

αα

00)0(,

0),),((0

)0(),,),(( 0

&

&

&

augmented system

Hopf bifurcation

)2()2()1()1(

)2()2()1()1(

)1()2()2()1(

)2()2()1()1(

)1(2

)2(1

)1()2(

)2(2

)1(1

)2()1(

)2()1(

)1()2(

)2()1(

0

00

10

0

000

0

00

wfvwfvufr

wfvwfvufwvwvwvwv

wwvvf

wwvvfww

ww

wwf

wwff

xT

xTT

xxT

xxTT

x

TT

TT

Tx

Tx

T

Tx

x

ααα

γγω

γγω

ω

ω

++−=

++=

−=

−+=

+++=

−+−=

=

=

−=

+=

=

augmented system

Normal vector constraints require first and second order derivatives.

Gradient based NLP solver requires at least second/third order derivatives

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 46

Time-Domain Constraints

Eigenvalue bounds guarantee asymptotic behavior

Time-domain constraints cannot be guaranteed !

T

t

Tmax

Re

Im

In the presence of disturbances the transient process behavior has to be considered.

Definition of new types of critical manifolds are necessary.

temperature constraint

Tsoll

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 47

Critical Manifolds for Time-Domain Constraints

Grazing Bifurcation (x) (Nordmark 1991)trajectory touchesstate constraint x(tg,p1) = xmax

critical manifold in the parameter spaceseparates region in which constraint is violated from region in which constraint is not violated

α1

α2

t

α1

α2

α1t

T Tmax

tgα1

g

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 50

Endpoint constraints

augmented system

rhxhtpttxx

ttttxh

tpttxx

x

e

−+==−=

==

αα

αα αϕ

ααϕ

0),,,),((

0),),((0

),,,),((

00

00

Normal Vector Equation Systems

Grazing bifurcation

augmented system

rhxhtpttxx

hxhttxh

tpttxx

x

tx

−+==

+===

αα

αα αϕ

ααϕ

0),,,),((

0),),((0

),,,),((

00

00

&

Normal vector equation systems for time-domain constraints

• flux of nonlinear systems• first-order sensitivitivies of flux w.r.t. uncertain parameters• gradients of constraints require second-order sensitivities of the flux w.r.t. parameters α and p

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 51

Agenda

Introduction• AD for Higher Order Derivatives and their Applications• AD Tools for the User

Higher-Order Derivatives in Applications• Numerical Methods – Nonlinear Programming • Optimal Control • Dynamic Real-time Optimization• Inverse Problems – Model-Based Experimental Design • Design Optimization under Uncertainty

Software Tools • Normal Vector Approach• Optimal Control

Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 52

process model is coded in MAPLE (Monagan et al. 2000)

normal vector constraints (1st and 2nd derivatives, SN, Hopf) are calculated by forward AD algorithm implemented in MAPLEto augment process model to result in design optimization problem

first-order derivates (gradient) for numerical solution of NLP calculated by automatic differentiation with ADIFOR (Bischof et al. 1998), generation of second-order derivatives (Hessian) not possible

optimization by standard NLP solver NPSOL (Gill et al., 1986), evaluation of test functions along the linear path between initial guess and optimal end point

Software Implementation – Normal Vector Approach

Augmented process model involves higher order derivatives of process model equations or flows (normal vector constraints)

… a work-around for higher-order derivatives based on AD

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 53

Agenda

Introduction• AD for Higher Order Derivatives and their Applications• AD Tools for the User

Higher-Order Derivatives in Applications• Numerical Methods – Nonlinear Programming • Optimal Control • Dynamic Real-time Optimization• Inverse Problems – Model-Based Experimental Design • Design Optimization under Uncertainty

Software Tools • Normal Vector Approach• Optimal Control

Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 54

requires to code a (e.g. DAE) model

with path and end point constraints

and an objective function

by means of an advanced modeling and simulation environment.

)),,((min, fup

tpuxΦ

00

0

)(0

,],[),,),(),((

xtx

ttttptutxFxM f

−=

∈=&

A Typical Optimization Problem …

))((0

,],[),,,,(0 0

f

f

txE

ttttpuxP

∈≥

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 55

DyOS yes

no

optimal trajectory

gridrefinement

DAEintegrator

stopping criterion

ESO

u(t)

u(t)

u(t)

u(t)

NLPsolver

initial trajectory

SetVariables

ESO

model server(e.g. gPROMS)

processmodel

ESO

CAPE-OPEN compliantsoftware interface

CORBA Object Bus

GetResiduals

Modelica-model

Translation

Modelica-modelModelica-model

Modelica-modelModelica-modelCAPE - ML

Automatic differentiation

Modelica-modelModelica-modelDynamic link libraryES

O

gPROMS

DyOS

Bischof & Marquardt groups

Modelica/CapeML/ADiCape

DyOS – Dynamic Optimization Software Environment

Process Systems Enterprise Ltd.

Marquardt group

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 56

commercial modeling and simulation environmentfor process engineeringhigh-level equation- /object-oriented modelinglanguage for PDAE systemsmodel compiler generatessemi-discretized DAE modeland Jacobian (by AD)ESO interface to export right hand side side and Jacobianvaluesinterfaces to native numerical solution engines

DyOS yes

no

optimal trajectory

gridrefinement

DAEintegrator

stopping criterion

ESO

u(t)

u(t)

u(t)

u(t)

NLPsolver

initial trajectory

SetVariables

ESO

model server(e.g. gPROMS)

processmodel

ESO

CAPE-OPEN compliantsoftware interface

CORBA Object Bus

GetResiduals

Modelica-model

Translation

Modelica-modelModelica-model

Modelica-modelModelica-modelCAPE - ML

Automatic differentiation

Modelica-modelModelica-modelDynamic link libraryES

O

gPROMS

DyOS

Bischof and Marquardt groups

Modelica/CapeML/ADiCape

DyOS – Dynamic Optimization Software Environment

Process Systems Enterprise Ltd.

Marquardt group

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 57

Modelica (www.modelica.org)open source, high-level equation-/object-oriented modeling language for DAE systems

CapeML (von Wedel, 2002)XML based model representation for model exchangestructured, equation-oriented models; follows systems approach

DyOS yes

no

optimal trajectory

gridrefinement

DAEintegrator

stopping criterion

ESO

u(t)

u(t)

u(t)

u(t)

NLPsolver

initial trajectory

SetVariables

ESO

model server(e.g. gPROMS)

processmodel

ESO

CAPE-OPEN compliantsoftware interface

CORBA Object Bus

GetResiduals

Modelica-model

Translation

Modelica-modelModelica-model

Modelica-modelModelica-modelCAPE - ML

Automatic differentiation

Modelica-modelModelica-modelDynamic link libraryES

O

gPROMS

DyOS

Bischof and Marquardt groups

Modelica/CapeML/ADiCape

DyOS – Dynamic Optimization Software Environment

Process Systems Enterprise Ltd.

Marquardt group

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 58

A Simple Example (Car Dynamics)

model carReal velo;Real dist;Real time;Real parameter accel;Real parameter alpha;Real constant pi;

equationder(dist) = velo;der(time) = 1.0;der(velo) = 4/pi * arctan(accel)

- alpha*pow(velo,2);end car;

Modelica code<Equation><BalancedEquation myID="V-0"><Expression><Term><Factor><FunctionCall fcn.name=“der"><Expression><Term><Factor><VariableOccurrence

definition="V-car-dist“/></Factor>

</Term></Expression>

</FunctionCall></Factor>

</Term></Expression>….

CapeML code fragment

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 59

Modelica (www.modelica.org)open source, high-level equation-/object-oriented modeling language for DAE systems

CapeML (von Wedel, 2002)XML based model representation for model exchangestructured, equation-oriented models, follows systems approach

ADiCape (Bischof et al., 2005)model transformation from Modelica to CapeML (via XSLT)AD applied to CapeML codeextended ESO interface for values of right hand side,Jacobian, Jacobian*vector, Hessian & symmetric projection of Hessian Modelica-model

Translation

Modelica-modelModelica-model

Modelica-modelModelica-modelCAPE - ML

Automatic differentiation

Modelica-modelModelica-modelDynamic link libraryES

O

Bischof and Marquardt groups

Modelica/CapeML/ADiCape

DyOS – Dynamic Optimization Software Environment

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 60

Agenda

Introduction• AD for Higher Order Derivatives and their Applications• AD Tools for the User

Higher-Order Derivatives in Applications• Numerical Methods – Nonlinear Programming • Optimal Control • Dynamic Real-time Optimization• Inverse Problems – Model-Based Experimental Design • Design Optimization under Uncertainty

Software Tools • Normal Vector Approach• Optimal Control

Summary and Conclusions

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 61

Summary

higher-order derivatives are more than a nice-to-have • NLP solvers are more efficient if exact Hessian is available• many applications result in NLP involving first- or second-order

derivates of • functions (right hand sides of models) or• flux (solution of evolution equations)

models often large-scale involving complex nonlinear terms• (manual) symbolic differentiation and implementation impractical• higher-order AD is highly desirable, but users are not aware of

existing capabilities models are coded in high-level modeling languages • AD of computer programs is therefore often not appropriate• AD technology has to be built on modeling languages in addition to

programming languages

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 62

Lessons Learned – Rough Software Architecture

neutral model representation (XML-based?)

modeling environment A

extended neutral model representation (derivatives, additional equations involving derivatives)

efficient model implementation (executable)

modeling environment Z

numerical solvers

model implementation

model translation

model extension & AD

code generation

. . . .

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 63

Conclusions – We Want Your Expertise!

easy-to-use tool for computational systems engineering applications should support• high-level modeling languages• facilitate model extensions prior to AD (e.g. symbolic

manipulation)• derivatives of functions and solutions of evolution equations up

to order 4• „generalized seeding“ to define terms involving derivatives• replacement of recurrent terms• code generation for efficient numerical evaluation • standard interfaces to numerical solvers (e.g. CAPE-OPEN)

Higher-Order Derivatives in Engineering Applications, AD 2008, August 11 - 15 64

Conclusions – We Want Your Expertise!

run-time efficiency is a decisive issue for technology acceptance• large-scale, strongly (non-polynomial) nonlinear models• computational complexity should not only grow with higher-order

derivatives• model development requires extremely short processing times to

facilitate interactive work process

Cooperation between problem owners and AD technology providers is highly desirable !