Design Optimization Utilizing Gradient/Hessian Enhanced Surrogate Model

Design Optimization UtilizingDesign Optimization UtilizingGradient/Hessian Enhanced Surrogate ModelGradient/Hessian Enhanced Surrogate Model

Dept. of Mechanical Engineering,University of Wyoming, USA

Wataru YAMAZAKI,Markus P. RUMPFKEIL,Dimitri J. MAVRIPLIS

28th, June, 2010,28th AIAA Applied Aerodynamics Conference

Yamazaki, W., Dept. of Aero. Eng., Tohoku Univ.Wataru YAMAZAKI, Univ. of Wyoming-2-

Outline

*Background- Efficient CFD Gradient/Hessian calculations- Surrogate Model Enhanced by Gradient/Hessian- Uncertainty Analysis

*Objectives

*Surrogate Model Approaches- Kriging- Direct and Indirect Gradient-enhanced Kriging- Gradient/Hessian-enhanced Kriging Approaches

*Results & Discussion- Analytical Function Fitting- Aerodynamic Data Modeling- 2D Airfoil Drag Minimization- Uncertainty Analysis at Optimal Airfoil

*Conclusions

Background~ Efficient CFD Hessian Calculation

M.P. Rumpfkeil and D.J. Mavriplis, AIAA-2010-1268“Efficient Hessian Calculations using Automatic Differentiation and the Adjoint Method”

An efficient CFD Hessian calculation methodby Adjoint method and Automatic Differentiation (AD)

For steady flowi. Solutions for grid deformation / flow residual equations

ii. Adjoint solutions for flow / grid deformation equations

iii. Ndv linear solutions each for dx/dDj and dw/dDj

iv. Ndv(Ndv+1)/2 cheap evaluations for each Hessian component

0, DxDs 0,, DwDxDR

sR jkT

Background~ Efficient CFD Hessian Calculation

Grid Deformation

Flow Residual

Flow AdjointMesh Adjoint

dx/dD1

dw/dD1

dx/dD2

dw/dD2

dx/dDNdv

dw/dDNdv

......

Gradient and Hessian

An efficient CFD Hessian calculation methodby Adjoint method and Automatic Differentiation (AD)

Background~ Approximate CFD Hessian

For steady flow, a special form of objective function

kkkk FFwF

2target

2target2target

1.. DDDLLL CCwCCwFge

2target

Last approximation is accurate only nearly optimum Approximate Hessian only requires the first-order derivatives

Background~ Uncertainty Analysis

Uncertainty due to manufacturing tolerances in-service wear-and-tear etc

Analysis of mean/variance/PDF ofobjective function w.r.t. fluctuation of design variables

Full Monte-Carlo Simulation Thousands/Millions exact function calls Accurate and easy, but computationally expensive

Moment Method Taylor series expansion by grad/Hessian at the center No information about PDF

Inexpensive Monte-Carlo Simulation Thousands/Millions surrogate model function calls Much lower computational cost

Objectives

The efficient adjoint gradient/Hessian calculation methodswill be effective…

for more efficient global design optimizationwith G/H-enhanced surrogate model approach

for more accurate and cheaper uncertainty analysisby inexpensive Monte-Carlo simulation

with G/H-enhanced surrogate model

Development of gradient/Hessian-enhanced surrogate models Application to design optimization and uncertainty analysis

Kriging, Gradient-enhanced Kriging

Kriging model approach - originally in geological statistics

Two gradient-enhanced Kriging (cokriging or GEK)

Direct CokrigingGradient information is included in the formulation(correlation between func-grad and grad-grad)

Indirect CokrigingSame formulation as original KrigingAdditional samples are created by using the gradient infoKriging model by both real and additional pts

iadd2D example

: Real Sample Point: Additional Sample Point

Gradient/Hessian-enhanced Kriging

Indirect Approach

2D example: Real Sample Point: Additional Sample Point

Arrangements to Use Full Hessian / Diagonal Terms

Major parameters : distance between real / additional pts number of additional pts per real pt

Worse matrix conditioning with smaller distancelarger number of additional pts

Severe tradeoffs for these parameters

Gradient/Hessian-enhanced KrigingDirect ApproachConsider a random process model estimating a function valueby a linear combination of function/gradient/Hessian components

iii yyywy

Minimizing Mean-Squared-Error (MSE) between exact/estimated function

ˆ YyyywEyMSEn

with an unbiasedness constraint

Solving by using the Lagrange multiplier approach

iforJJJ

JwyMSEJ

Gradient/Hessian-enhanced KrigingDirect Approach

Introducing correlation function for covariance termsCorrelation is estimated by distance between two pts with radial basis function

xRxZZCov

RZZCov

Unknown parameters are determined by the following system of equations

,,,,, 111 wT x

Final form of the gradient/Hessian-enhanced direct Kriging approach is

FYRr xx 1ˆ Ty

samplesgivenatninformatioexactyyy

dataobservedbetweennscorrelatio

dataobservedandbetweennscorrelatio

termconstantmean

,,,,, ''1

YRFFRF

Gradient/Hessian-enhanced KrigingDirect Approach

nnnnnn

111111111

111111

xxxxxxxxxxxx

Correlations between F-F, F-G, G-G, F-H, G-H and H-H Up to 4th order derivatives of correlation function Automatic Differentiation by TAPENADE No sensitive parameter Better matrix conditioning than indirect approach

FYRr xx 1ˆ Ty

0.0 0.2 0.4 0.6 0.8 1.0Design Variable

0.0E+00

2.0E-03

4.0E-03

6.0E-03

8.0E-03

1.0E-02

Exact Function Sample Points Kriging RMSE EI

Infill Sampling Criteria for Optimization How to find promising location on surrogate model ? Maximization of Expected Improvement (EI) value Potential of being smaller than current minimum (optimal) Consider both estimated function and uncertainty (RMSE)

yyyyEI minmin

Results & DiscussionResults & Discussion

2D Rastrigin Function Fitting

21 2cos2cos1020 xxxxy x

80 samples by Latin Hypercube SamplingDirect Kriging approach

Exact Rastrigin Function Function-based KrigingGradient-enhancedGradient/Hessian-enhanced

5D Rosenbrock Function Fitting

F: Function-based KrigingFG: Gradient-enhancedFGHd: G/diag. Hess-enhancedFGH: G/full Hess-enhanced

RMSE .vs. Number of sample points Superiority in direct Kriging approaches

thanks to exact enforcement of derivative information better conditioning of correlation matrix

Validation on Rosenbrock Func.

Minimization of 20D Rosenbrock 30 initial sample points by LHS EI-based infill sampling criteria Faster convergence in

G/H-enhanced direct approach

1.E-03

1.E-02

1.E-01

1.E+00

1.E+01

1.E+02

1.E+03

1.E+04

0 100 200 300

Number of Sample Points

Direct_FG

Direct_FGH

Indirect_FGH

2 1001dvN

iiii xxxy x

Uncertainty analysis on 2D Rosenbrock 5 sample points for surrogate model

(No sample point on the center location) Superior performance in

G/H-enhanced Inexpensive MC (IMC)

CDFs of Full-MC and IMCOptimization History

0 10 20 30 40 50 60 70 80 90

Function ValueC

Full-MC

IMC_FG

IMC_FGH

Aerodynamic Data Modeling Unstructured mesh CFD Steady inviscid flow, NACA0012 20,000 triangle elements Mach Number [0.5, 1.5] Angle of Attack[deg] [0.0, 5.0] 21x21=441 validation data

Exact Hypersurface of Lift Coefficient Exact Hypersurface of Drag Coefficient

Aerodynamic Data Modeling

Adjoint gradient is helpful to construct accurate surrogate model CFD Hessian is not helpful due to noisy design space

Function-based KrigingGradient-enhancedExact

2D Airfoil Shape Optimization

Unstructured mesh CFD Steady inviscid flow, M=0.755 NACA0012, 16 DVs for Hicks-Henne function Objective function of inverse design form

Exact / Approximate CFD Hessian available Computational time of F : 2 min,

FG : 4 min, FGHapprox

. : 6 min, FGHexact : 36 min (4 min in parallel)

Geometrical constraint for sectional area

2target2target

000.02

100675.0

dddlll

CCwCCwF

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

Start from 16 initial sample points which only have function info Gradient/Hessian evaluations only for new optimal designs

Faster convergence in derivative-enhanced surrogate model Best design in gradient/exact Hessian-enhanced model

Towards supercritical airfoils Shock reduction on upper surface

NACA0012 (Baseline) Optimal by G/exact H-enhanced model

2D Airfoil Shape Uncertainty Analysis

Geometrical uncertainty analysis at optimal airfoil Center = optimal obtained by Grad/exact H model

Comparison between2nd order Moment Method (MM2)

using gradient/Hessian at the centerInexpensive Monte-Carlo (IMC1)

using final surrogate model obtained in optimizationInexpensive Monte-Carlo (IMC2)

using different G/H-enhanced model by 11 samplesFull Non-Linear Monte-Carlo (NLMC)

using 3,000 CFD function calls

optimal (center) ±0.1 airfoil

Mean of objective w.r.t. standard deviation of all design variables IMC showed good agreement with NLMC at smaller st. devi. Necessity of additional sampling criteria for total model accuracy ? Promising IMC with much cheaper computational cost

MM2 using derivative at the centerIMC1 using G/H surrogate model obtained in optimizationIMC2 using different G/H model by 11 samples (for st.devi.=0.01)NLMC using 3,000 CFD function calls

Concluding Remarks / Future Works

Development of gradient/Hessian-enhanced Kriging models Application to design optimization and uncertainty analysis

Direct Kriging approach is superior to indirect approach More accurate fitting on exact function Faster convergence towards global optimal design Promising inexpensive Monte-Carlo simulation at much lower cost

Application to higher-dimensional / complicated design problem Robust design with inexpensive Monte-Carlo simulation Gradient/Hessian vector product-enhanced approach

Thank you for your attention !!

Appendix

Moment Method

Taylor series expansion by grad/Hessian at the center No information about PDF

1st order Moment Method

2nd order Moment Method

jiMMMM

Gradient/Hessian-enhanced KrigingImplementation Details

Correlation function of a RBF

Estimation of hyper parameters by maximizing likelihood function with GA

Correlation matrix inversion by Cholesky decomposition

Search of new sample point location by maximizing Expected Improvement (EI) value with GA

hforhhhhscf

13183513

Infill Sampling Criteria for Optimization How to find promising location on surrogate model ? Expected Improvement (EI) value Potential of being smaller than current minimum (optimal) Consider both estimated function and uncertainty (RMSE)

yyyyEI minmin

0.0 0.2 0.4 0.6 0.8 1.0Design Variable

0.0E+00

2.0E-03

4.0E-03

6.0E-03

8.0E-03

1.0E-02

0.0 0.2 0.4 0.6 0.8 1.0Design Variable

0.0E+00

2.0E-03

4.0E-03

6.0E-03

8.0E-03

1.0E-02

0.0 0.2 0.4 0.6 0.8 1.0Design Variable

0.0E+00

2.0E-03

4.0E-03

6.0E-03

8.0E-03

1.0E-02

0.0 0.2 0.4 0.6 0.8 1.0Design Variable

0.0E+00

2.0E-03

4.0E-03

6.0E-03

8.0E-03

1.0E-02

0.0 0.2 0.4 0.6 0.8 1.0Design Variable

0.0E+00

2.0E-03

4.0E-03

6.0E-03

8.0E-03

1.0E-02

0.0 0.2 0.4 0.6 0.8 1.0Design Variable

0.0E+00

2.0E-03

4.0E-03

6.0E-03

8.0E-03

1.0E-02

0.0 0.2 0.4 0.6 0.8 1.0Design Variable

0.0E+00

2.0E-03

4.0E-03

6.0E-03

8.0E-03

1.0E-02

EI-based criteria have good balancebetween global/local searching

5D Rosenbrock Function Fitting

# of pieces of information = sum of # of F/G/H net components To scatter samples is better than concentration at limited samples Approximated computational time factor

G/H-enhanced surrogate model provides better performancewith efficient Gradient/Hessian calculation methods

FGHFGFhasiifTTTF i

sample

//,3/2/1,1

1D Step Function Fitting

0.0 0.2 0.4 0.6 0.8 1.0

Design Variable

alue Exact

Samples

Much better fit by G/H-enhanced direct Kriging

Minimization of 20D Rosenbrock Func.

Minimization of 20 dimensional Rosenbrock function No computational cost for Func/Grad/Hess evaluation Expensive for construction - likelihood function maximization

- inversion of correlation matrix Parallel computation for the likelihood maximization problem

1.E-02

1.E-01

1.E+00

1.E+01

1.E+02

1.E+03

1.E+04

0 3000 6000 9000 12000 15000

Computational Time [sec]

Uncertainty Analysis

Uncertainty analysis at (1.0,1.0) on 2D Rosenbrock 5 sample points for surrogate model approaches

(No sample point on the center location) 2nd order Moment Method (MM2) by G/H at the center Superior results in G/H-enhanced Inexpensive MC (IMC)

St. Devi. = 0.15 means the possibility within -0.15<dx<0.15 is about 70%

0.0 0.1 0.2 0.3 0.4 0.5

Standard Deviation of DVs

Full-MC

IMC_FG

IMC_FGH

0 10 20 30 40 50 60 70 80 90

Function Value

CDF at St. Devi.=0.15

Aerodynamic Data Modeling

1.390 1.395 1.400 1.405 1.410Mach Number

CFD Data

Linear by Adj_Grad

Quadratic by Adj_G/H

0.1080

0.1082

0.1084

0.1086

0.1088

0.1090

1.390 1.395 1.400 1.405 1.410

Mach Number

CFD Data

Linear by Adj_Grad

Quadratic by Adj_G/H

NACA0012 M=1.4 AoA=3.5[deg] Noisy in Mach number direction

Cumulative Density Function at St. Devi. of 0.01 Quadratic model only by using gradient/Hessian at optimal Additional sampling criteria to increase total model accuracy

Design Optimization Utilizing Gradient/Hessian Enhanced Surrogate Model

Documents

Numerical Optimization · PDF fileNumerical Optimization Algorithms Overview 2 • Only objective function evaluations are used to ﬁnd optimum point. Gradient and Hessian of the

Blind Source Separation Using Hessian Evaluation

Randomized Hessian Estimation and Directional Search

Gradient-based Methods for Optimization. Part I.sites.science.oregonstate.edu/~gibsonn/optpart1.pdfNewton direction may not be a descent direction (if Hessian not positive deﬁnite)

A350 Take-off Conﬁguration Optimization using a Surrogate ... · A350 Take-off Conﬁguration Optimization using a Surrogate-based Steepest Descent Method ... gradient steepest

Deep Learning via Hessian-free Optimizationasamir/cifar/HFO_James.pdfGradient descent is bad at deep learning (cont.) Two hypotheses for why gradient descent fails: increased frequency

Business Process Analysis of Export of Jute Hessian Bags ... · Statistics on Export of Jute Hessian Bags to India Bangladesh exported US$ 20-30 million worth of jute hessian bags

Compressed Gradient Methods with Hessian-Aided Error ... › pdf › 1909.10327.pdfIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019 [1]. [6]

Hessian Fly ID/Management in Wheat

SGDR: STOCHASTIC GRADIENT DESCENT WITH WARM ...ml.informatik.uni-freiburg.de/papers/17-ICLR-SGDR.pdfthe-art optimization techniques attempt to approximate the inverse Hessian in a

Calculating Hessian matrices - Télécom ParisTech · Calculating Hessian matrices Calculating Hessian matrices Student Robert Mansel Gower gowerrobert@gmail.com Advisor Margarida

ATRIX ANAL. APPL c · 2020. 4. 16. · 4, we derive some properties of the manifold essential for the Riemannian trust- gradient and Hessian, and the retraction. Special attention

Optimization Methods for Non-Linear/Non-Convex Learning … · Yann LeCun Making the Hessian better conditionedMaking the Hessian better conditioned The Hessian is the covariance

Limited-Memory Quasi-Newton and Hessian-Free Newton ...opt.kyb.tuebingen.mpg.de/slides/OPT2010-schmidt.pdf · 2 L-BFGS and Hessian-Free Newton 3 Two-Metric (Sub-)Gradient Projection

Efficient Gradient-Free Variational Inference using …...Gradient-Free Variational Inference by Policy Search 2.3. Fitting a Reward Surrogate In order to use Equation5, Gaussianity

Learning Hessian matrix.pdf

First Time Surrogate Marie - International Surrogacy Center...First Time Surrogate Marie Proven: No, first time Surrogate Location: Winchester, CA Age: 37 Insurance: Surrogate has

Ventura County; and TO SURROGATE OR NOT TO SURROGATE

Surrogate To Survive: Surrogate Marketing · Surrogate To Survive: Surrogate Marketing Subject: The essence of Surrogate marketing lies on Communicating the value of a particular

Hessian recovery for finite element methods