
Metamodeling Method Using Dynamic Kriging for Design Optimization

Liang Zhao,∗ K. K. Choi,† and Ikjin Lee‡

University of Iowa, Iowa City, Iowa 52242

DOI: 10.2514/1.J051017

Metamodeling has been widely used for design optimization by building surrogate models for computationally intensive engineering application problems. Among all the metamodeling methods, the kriging method has gained significant interest for its accuracy. However, in traditional kriging methods, the mean structure is constructed using a fixed set of polynomial basis functions, and the optimization methods used to obtain the optimal correlation parameter may not yield an accurate optimum. In this paper, a new method called the dynamic kriging method is proposed to fit the true model more accurately. In this dynamic kriging method, an optimal mean structure is obtained using the basis functions that are selected by a genetic algorithm from the candidate basis functions based on a new accuracy criterion, and a generalized pattern search algorithm is used to find an accurate optimum for the correlation parameter. The dynamic kriging method generates a more accurate surrogate model than other metamodeling methods. In addition, the dynamic kriging method is applied to simulation-based design optimization with multiple efficiency strategies. An engineering example shows that the optimal design obtained by using the surrogate models from the dynamic kriging method can achieve the same accuracy as the one obtained by using the sensitivity-based optimization method.

Nomenclature

e = stochastic process vector in kriging model
F = matrix of basis functions evaluated at sample points
f = full set of basis functions
f_EHA = global optimal subset of basis functions by exhaustive algorithm
f_GA = optimal subset of basis functions by genetic algorithm
m = number of design variables
NTS = number of testing points used in local window
n = number of samples used for surrogate modeling
P = the highest order of the polynomial in basis functions
R = correlation matrix in kriging model
R(·) = correlation function in kriging model
r = correlation vector between the point of interest and the samples
Var(y) = variance of the true function values y at sample points
y = vector of true function values at sample points
ŷ(x) = prediction of response at point x
β = regression coefficient vector in kriging model
θ = correlation parameter vector in kriging model
σ² = process variance in kriging model
φ(θ) = objective function for correlation parameter estimation

I. Introduction

METAMODELING has been widely used in engineering applications when a simulation is difficult to obtain due to high computational cost. A surrogate model is used to represent the true model, with a limited number of simulations required to be evaluated. Extensive research has been carried out to investigate methods for generating surrogate models based on limited samples. A number of methods, such as least-squares regression, moving least-squares regression, support vector regression, and radial basis functions, have been developed over the years [1–7]. Recently, the kriging method has gained significant attention due to its capability of dealing with highly nonlinear problems [8,9]. In the kriging method, the response is modeled in two parts: the mean structure and a zero-mean stationary Gaussian stochastic process. The ordinary kriging method (OKG) assumes that the mean structure part is zero or constant on the entire domain. The universal kriging method (UKG) constructs the mean structure using first- or second-order polynomials [10].

However, during the practical use of these methods, two problems have been discovered. The first problem is that the performance of the optimization methods used to find the optimal correlation parameter is affected by the highly nonlinear region near the origin and the large flat region elsewhere, with multiple local minima of the objective function [11,12]. The popular DACE toolbox for the kriging method [11] uses the modified Hooke and Jeeves (H-J) algorithm to find the optimal correlation parameter. Martin [13] uses the Levenberg–Marquardt (L-M) method, employing a scoring method to calculate the Hessian matrix for optimization. Forrester and Keane [14] use a genetic algorithm (GA), which is a gradient-free method, to find the optimum. All these methods have their own advantages and disadvantages. The modified H-J method is efficient but unable to provide the true optimum. The L-M method is gradient-based; although it is efficient, it can only find a local optimum, so the obtained optimum depends on the initial search point. Moreover, due to the large flat region and multiple local minima of the objective function, the L-M method often stops prematurely before converging to a true optimum. The GA method is supposed to be able to find the global optimum, but it is less efficient, and the obtained optimum varies due to the randomness within a genetic algorithm. In this paper, a generalized pattern search algorithm is used to find the optimal correlation parameter for the kriging method accurately and efficiently based on maximum likelihood estimation (MLE).

The second problem is that neither the OKG nor the UKG can adaptively fit the mean structure of the kriging model for highly nonlinear functions, and both fail to characterize the local nonlinearity of the true function in different design areas. It is shown that different basis functions may yield different surrogate models for the same sample profile. This is especially the case when a local window is used to generate the surrogate model in the design optimization

Presented as Paper 2010-2391 at the 13th AIAA/ISSMO Multidisciplinary Analysis and Optimization, Fort Worth, TX, 13–15 September 2010; received 17 October 2010; revision received 5 March 2011; accepted for publication 15 March 2011. Copyright © 2011 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved. Copies of this paper may be made for personal or internal use, on condition that the copier pay the $10.00 per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923; include the code 0001-1452/11 and $10.00 in correspondence with the CCC.

∗Graduate Research Assistant; [email protected].
†Roy J. Carver Professor; Professor, World Class University, Department of Naval Architecture and Ocean Engineering, Seoul National University, Seoul, Korea; [email protected] (Corresponding Author).
‡Postdoctoral Research Scholar; [email protected].

AIAA JOURNAL, Vol. 49, No. 9, September 2011



process. That is, different basis functions may need to be used in different local windows. Therefore, a new method that optimally selects basis functions to represent the mean structure based on the current samples within the local window is desirable [15]. One method of adjusting the mean structure was proposed by Joseph et al. [16], who used a Bayesian framework to identify the mean structure for the kriging method. They use a Bayesian forward variable selection, which can be trapped in a local optimum and thus prevented from finding a global optimal subset of the basis functions. In this paper, a new method is proposed to find the pseudoglobal optimal basis functions by applying a GA for the selection procedure based on a new accuracy criterion.

Another issue is how to apply the proposed dynamic kriging method to simulation-based design optimization. Extensive work has been conducted on applying surrogate modeling methods to simulation-based design optimization [1,3,5,7,14,17]. The surrogate model generated by the dynamic kriging method needs to be applied efficiently and accurately for design optimization. In particular, because the dynamic kriging method uses the generalized pattern search for the correlation parameter search and the genetic algorithm for basis-function selection, the computational time may become a concern. To overcome this difficulty, three efficiency strategies, including a local window for the surrogate model, a sequential sampling technique for new sample generation, and an adaptive initial search point for both the generalized pattern search and the genetic algorithm, are applied in this work. The proposed dynamic kriging method with the efficiency strategies is applied to solve an M1A1 Abrams road-arm design optimization example. The optimal result obtained by the proposed method achieves the same accuracy as the result obtained by using sensitivity-based design optimization.

II. Dynamic Kriging Method Using Pattern Search and Basis Selection

In this section, the traditional kriging method is reviewed first. The two issues in kriging models pointed out earlier are addressed by the proposed methods: using the generalized pattern search for correlation parameter estimation and using the genetic algorithm for basis-function selection.

A. Kriging Method

In the kriging method, the outcomes are considered as a realization of a stochastic process. Consider n sample points x = [x₁, x₂, …, xₙ]ᵀ with xᵢ ∈ ℝᵐ, and n responses y = [y(x₁), y(x₂), …, y(xₙ)]ᵀ with y(xᵢ) ∈ ℝ¹. In the kriging method, the response at the samples consists of a summation of two parts as

    y = Fβ + e    (1)

The first part of the right-hand side of Eq. (1), Fβ, is the mean structure of the response, where F = [f(xᵢ)] [with f(x) = {f_k(x)}, i = 1, …, n, and k = 1, …, K] is an n × K model matrix, and f(x) represents user-selected basis functions, which are usually in simple polynomial form, such as 1, x, x², …. In Eq. (1), β = [β₁, β₂, …, β_K]ᵀ is the vector of the regression coefficients, which is obtained from the generalized least-squares method. The second part of the right-hand side of Eq. (1), e = [e(x₁), e(x₂), …, e(xₙ)]ᵀ, is a realization of the stochastic process e(x) that is assumed to have zero mean E[e(xᵢ)] = 0 and covariance structure E[e(xᵢ)e(xⱼ)] = σ²R(θ; xᵢ, xⱼ), where σ² is the process variance, θ = [θ₁, θ₂, …, θₘ]ᵀ is the unknown process correlation parameter vector of dimension m that has to be estimated, and R(θ; xᵢ, xⱼ) is the correlation function of the stochastic process [11]. In most engineering problems, the correlation function is set to be a Gaussian form expressed as

    R(θ; xᵢ, xⱼ) = ∏_{l=1}^{m} exp[−θ_l (x_{i,l} − x_{j,l})²]    (2)

where x_{i,l} is the lth component of the ith vector xᵢ. The optimal choice of θ is defined as the maximum likelihood estimator (MLE) [18], which is the maximizer of the likelihood function L, expressed as

    L = (2πσ²)^{−n/2} |R|^{−1/2} exp[ −(1/(2σ²)) (y − Fβ)ᵀ R⁻¹ (y − Fβ) ]    (3)

where R is the symmetric correlation matrix with (i, j)th component R_{ij} = R(θ; xᵢ, xⱼ) (i, j = 1, …, n), and σ² = (1/n)(y − Fβ)ᵀR⁻¹(y − Fβ) and β = (FᵀR⁻¹F)⁻¹FᵀR⁻¹y are obtained from the generalized least-squares regression. By taking the logarithm of Eq. (3) with the imposed σ² value and multiplying by −1, the maximization problem to obtain the optimal θ is equivalent to

    minimize φ(θ),  where φ(θ) = (1/2) ln|R| + (n/2) ln(σ²)    (4)
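Eqs. (2–4) translate directly into a short numerical routine. The sketch below is our own illustration, not the authors' code: the function names are ours, and a small nugget term (an assumption of this sketch, not part of Eq. (3)) is added to R for numerical conditioning.

```python
import numpy as np

def gauss_corr(theta, A, B):
    """Gaussian correlation of Eq. (2): R_ij = prod_l exp(-theta_l (A_il - B_jl)^2)."""
    d2 = (A[:, None, :] - B[None, :, :]) ** 2
    return np.exp(-(d2 * np.asarray(theta)).sum(axis=2))

def phi(theta, X, y, F, nugget=1e-10):
    """Concentrated objective of Eq. (4): 0.5*ln|R| + (n/2)*ln(sigma^2)."""
    n = len(y)
    R = gauss_corr(theta, X, X) + nugget * np.eye(n)   # nugget aids conditioning
    Ri = np.linalg.inv(R)
    beta = np.linalg.solve(F.T @ Ri @ F, F.T @ Ri @ y)  # GLS estimate of beta, Eq. (3)
    resid = y - F @ beta
    sigma2 = (resid @ Ri @ resid) / n                   # GLS estimate of sigma^2, Eq. (3)
    _, logdet = np.linalg.slogdet(R)                    # stable ln|R|
    return 0.5 * logdet + 0.5 * n * np.log(sigma2)
```

An optimizer (H-J, L-M, GA, or GPS, as compared below) then minimizes phi over the θ domain.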

After finding the optimal θ, the prediction of the kriging model, which interpolates the n sample points, and the derivative of the prediction with respect to x are expressed as

    ŷ(x) = fᵀβ + rᵀR⁻¹(y − Fβ),   ŷ′(x) = J_f(x)ᵀβ + J_r(x)ᵀR⁻¹(y − Fβ)    (5)

where f = [f_k(x)]ᵀ (k = 1, …, K) contains the basis-function values evaluated at the prediction point x; r = [R(θ; x₁, x), …, R(θ; xₙ, x)]ᵀ; and J_f(x) and J_r(x) are the Jacobians of f and r, respectively.
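A minimal interpolating predictor in the spirit of Eq. (5) can be sketched as follows (our own illustration with names of our choosing; the derivative part of Eq. (5) is omitted). Because r(xᵢ)ᵀR⁻¹ picks out the ith unit vector at a sample point, the predictor interpolates: ŷ(xᵢ) = y(xᵢ).

```python
import numpy as np

def gauss_corr(theta, A, B):
    """Gaussian correlation of Eq. (2) between row sets A and B."""
    d2 = (A[:, None, :] - B[None, :, :]) ** 2
    return np.exp(-(d2 * np.asarray(theta)).sum(axis=2))

def krig_fit(theta, X, y, basis):
    """Generalized least squares for beta, plus gamma = R^{-1}(y - F beta)."""
    F = np.array([basis(x) for x in X])
    R = gauss_corr(theta, X, X) + 1e-12 * np.eye(len(y))  # tiny nugget, our assumption
    Ri = np.linalg.inv(R)
    beta = np.linalg.solve(F.T @ Ri @ F, F.T @ Ri @ y)
    gamma = Ri @ (y - F @ beta)
    return beta, gamma

def krig_predict(x, theta, X, beta, gamma, basis):
    """y_hat(x) = f(x)^T beta + r(x)^T R^{-1}(y - F beta), the first part of Eq. (5)."""
    r = gauss_corr(theta, np.atleast_2d(np.asarray(x, float)), X)[0]
    return basis(np.asarray(x, float)) @ beta + r @ gamma
```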

B. Correlation Parameter Estimation Using Pattern Search Method

To show how the optimal θ affects the final accuracy of the kriging prediction in Eq. (5), consider one simple revised example based on Forrester and Keane's work [14]. In this example, the true function is expressed as

    y = (6x − 2)² sin(12x − 4) + 10,   x ∈ [0, 1]    (6)

and five evenly distributed samples along the x axis are used to generate the kriging prediction. Then the accuracy of the kriging prediction is tested using different θ, and the relative root-mean-squared error (rrmse) is used as the accuracy measurement. In particular, the rrmse is defined as

    rrmse = √[ (1/NTS) Σ_{i=1}^{NTS} ( (ŷ(xᵢ) − y(xᵢ)) / y(xᵢ) )² ]    (7)

where NTS is the number of testing points, and ŷ(xᵢ) and y(xᵢ) are, respectively, the kriging prediction and the true response at testing point xᵢ. In this example, NTS is 100, and all the testing points are evenly distributed along the x axis. As shown in Fig. 1a, the rrmse value changes significantly as θ changes. As θ increases from 2.7345 to 10, the rrmse decreases from its maximum (0.21544) to its minimum (0.12981), whereas in Fig. 1b, the φ(θ) function value only changes from 0.080374 to −0.20981. This behavior shows that it is very important to accurately solve the minimization problem of Eq. (4) and find the true optimal θ to generate an accurate kriging prediction.
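Eq. (7) in code form (a direct transcription; note that it assumes y(xᵢ) ≠ 0 at every testing point, as is the case in the examples here):

```python
import numpy as np

def rrmse(y_hat, y_true):
    """Relative root-mean-squared error of Eq. (7) over the testing points."""
    y_hat = np.asarray(y_hat, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(((y_hat - y_true) / y_true) ** 2)))
```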

Therefore, to accurately solve Eq. (4), it is proposed to use the generalized pattern search (GPS) method. The reason for using the GPS method is that the φ(θ) function in Eq. (4) usually has a highly nonlinear region near the origin and a large flat region elsewhere, as shown in Fig. 1b. In addition, φ(θ) usually contains multiple local minima for high-dimensional problems. For such a minimization problem, a gradient-based optimization algorithm often prematurely converges to a local minimum if the initial θ value is close to the origin, or it prematurely stops in the large flat region if the initial θ value is close to the upper bound of the θ domain. Among the non-gradient-based optimization algorithms, the GA is considered time-consuming and unreliable for such a continuous optimization problem, whereas the GPS method is not affected by the initial search



point and accurately converges to the optimal θ. The global convergence of the GPS method has been proven by Lewis and Torczon [19]. It is worth mentioning that, since φ(θ) is always highly nonlinear near the origin and largely flat elsewhere, the initial search point is set to the lower bound of the θ domain so that the optimization method converges quickly.
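The polling idea behind GPS can be illustrated with a bare-bones compass search (a simplified sketch of the pattern-search family, not the MATLAB Direct Search implementation used in the paper; all names and defaults are ours):

```python
import numpy as np

def pattern_search(f, x0, lb, ub, step=0.5, tol=1e-6, max_iter=1000):
    """Compass-style pattern search: poll +/-step along each coordinate axis,
    accept any improving move, and halve the mesh when a full poll fails."""
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    x = np.clip(np.asarray(x0, float), lb, ub)
    fx = f(x)
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for s in (step, -step):
                cand = x.copy()
                cand[i] = np.clip(cand[i] + s, lb[i], ub[i])
                fc = f(cand)
                if fc < fx:
                    x, fx, improved = cand, fc, True
        if not improved:
            step *= 0.5          # refine the mesh
            if step < tol:
                break
    return x, fx
```

Mirroring the text, the search can be started at the lower bound of the box and still converge, since no gradient information is used.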

To better demonstrate the challenge in this minimization problem and how the GPS method performs, the Branin–Hoo problem

    f(x₁, x₂) = ( x₂ − (5.1/(4π²))x₁² + (5/π)x₁ − 6 )² + 10(1 − 1/(8π)) cos(x₁) + 10,
    x₁ ∈ [−5, 10],  x₂ ∈ [0, 15]    (8)

is used. The 20-Latin-hypercube-sample (LHS) profile and the true function contour are first shown in Fig. 2a. The associated φ(θ) plot is shown in Fig. 2b, where one can see that φ(θ) indeed has a sharp corner region near the origin and a large flat region in the rest of the θ domain. The initial search point is set to be the origin of the θ domain. To have a fair comparison, it is worth mentioning the computer codes used in this example first. The modified H-J algorithm is applied by using the DACE MATLAB package. The L-M method is applied by following Martin's [13] work. The GA method and the GPS method are applied by using the MATLAB Genetic Algorithm and Direct

Search Toolbox R14, respectively. The stopping criteria set for all methods are the same:
1) The change in the θ value is less than 1E−6.
2) The change in the objective function φ(θ) value is less than 1E−6.
To compare the accuracy of the kriging predictions based on the optimal θ obtained from different optimization methods, 100 × 100 grid testing points are used to calculate the rrmse values for each method. Table 1 shows that the GPS method finds the best optimal θ value with the smallest φ(θ) value, and the kriging prediction based on the optimal θ obtained by the GPS method achieves the best accuracy as well.
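For reference, Eq. (8) in code, with the known minimum value of the standard Branin–Hoo function (≈0.3979, attained at three points, e.g. near (π, 2.275)) serving as a sanity check:

```python
import numpy as np

def branin_hoo(x1, x2):
    """Branin-Hoo test function of Eq. (8); x1 in [-5, 10], x2 in [0, 15]."""
    return ((x2 - 5.1 / (4 * np.pi ** 2) * x1 ** 2 + 5 / np.pi * x1 - 6) ** 2
            + 10 * (1 - 1 / (8 * np.pi)) * np.cos(x1) + 10)
```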

To demonstrate the performance of the GPS in finding a better optimum of the φ(θ) function compared with the other three optimization methods in a general way, a statistical study is conducted using the Branin–Hoo example again. In this statistical study, 100 randomly generated sets of 20 LHS samples are used. For each sample set, ordinary kriging is applied to generate the prediction. The four optimization methods discussed above are applied to solve Eq. (4), and the φ(θ) function values at the obtained optimum of θ are ranked from the smallest to the largest. After the 100 trials, the frequency of the rank for the four methods is shown in Table 2, which shows that the GPS found the best optimal θ 92 times out of 100. Even though it is hard to claim that one optimization algorithm always performs better than the others, to solve this

Fig. 1 Effect of different θ values on the accuracy of kriging prediction.

Fig. 2 True function and φ(θ) plot.



particular bound-constrained problem of Eq. (4), it is clearly shown that the GPS method obtains the best results in finding the accurate optimal θ, while the L-M method and the GA method have comparable performance thereafter.

To show the accuracy of the kriging models based on the different optimal θ values from the four optimization methods, 100 × 100 grid testing points are evaluated to calculate the rrmse values, and the rank of the rrmse values associated with each optimization method, from the smallest to the largest, is shown in Table 3. It shows that the kriging model using the GPS method achieves better accuracy than the other three methods. Note that the difference among the four methods in Table 3 is not as significant as the one in Table 2.

As the dimension of the design variables increases, the difference among the four optimization methods for solving Eq. (4) becomes more significant. Consider a 12-D mathematical example, expressed as

    y = (x₁ − 1)² + Σ_{i=2}^{12} i (2xᵢ² − x_{i−1})²,   −10 ≤ xᵢ ≤ 10    (9)

This function is called the Dixon–Price function. With 60 samples generated using the Latin hypercube sampling method, the ordinary kriging prediction using four different optimization methods for finding the optimal θ is generated. Table 4 shows the optimal θ obtained using the four different optimization methods and the associated objective function values and rrmse values. The GPS method finds the best optimal θ with the smallest φ(θ_opt) value and generates the most accurate kriging prediction.
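Eq. (9) in code form (our transcription of the 12-D Dixon–Price function):

```python
import numpy as np

def dixon_price(x):
    """12-D Dixon-Price function of Eq. (9) for x in [-10, 10]^12."""
    x = np.asarray(x, dtype=float)
    i = np.arange(2, len(x) + 1)                    # indices i = 2, ..., 12
    return float((x[0] - 1.0) ** 2 + np.sum(i * (2.0 * x[1:] ** 2 - x[:-1]) ** 2))
```

At x = (1, 1, …, 1) the first term vanishes and each summand reduces to i, so the value is 2 + 3 + ⋯ + 12 = 77, a convenient check.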

Like the previous example, to exclude the effect of the sample positions and show the general performance of the four optimization methods, 100 randomly generated sets of 60 LHS samples are used. The four optimization methods are applied to find the optimal θ values to generate the kriging prediction. The rrmse values are calculated based on a fixed set of 1000 LHS samples. Table 5 shows that the GPS method finds the best optimal θ value 83 times, and the L-M method finds the best optimal θ 17 times. The GA method and the H-J method fail to find the best optimal θ value. At the same time, the associated rrmse values are ranked as well, as shown in Table 6. The kriging model with the GPS method generates the most accurate surrogate model 80 times, followed by the one with the L-M method at 18 times. Tables 5 and 6 indeed show that the GPS method outperforms the other three optimization methods for this high-dimensional problem.

C. Dynamic Basis-Function Selection Using Genetic Algorithm

For the UKG method, the basis functions f of F used in Eq. (1) are fixed during the entire metamodeling process, usually up to second-order polynomial functions. However, it is clear that higher-order terms can predict a nonlinear mean structure, which may vary for different problems. Hence, in general, for highly nonlinear problems, fixed lower-order basis functions may not be suitable to describe the nonlinearity of the mean structure. On the other hand, Martin and Simpson [20] pointed out that, in some cases, the accuracy of the surrogate model may not be enhanced by using higher-order terms. That is, the surrogate model may become even worse when some particular higher-order terms are used.

The impact of the selection of basis functions can be shown using the following illustrative example:

    f(x₁, x₂) = [ (x₁ − 5)³ + e^{x₁/10} + x₂ + 200 ] / 100,   x₁, x₂ ∈ [0, 10]    (10)

where the true function plot and the samples are shown in Fig. 3. The true function is highly nonlinear in the x₁ direction and linear in the x₂ direction. The kriging method with different basis functions is applied to this problem using the 14 samples obtained from LHS, as shown in Fig. 3, and the rrmse values calculated from 100 × 100 grid testing points are compared. From Table 7 one can

Table 1 Comparison between four optimization methods

                H-J               L-M               GA                GPS
Optimal θ_opt   (1.7679, 0.2628)  (0.7897, 0.0108)  (0.7158, 0.0088)  (0.7168, 0.0087)
φ(θ_opt)        0.1169            0.0763            0.0758            0.0750
rrmse           1.9223            0.1072            0.0825            0.0755

Table 2 Frequency of rank of φ(θ) function value at optimal θ by different optimization methods

Rank     H-J   L-M   GA   GPS
First     0     4     4    92
Second    0    71    24     5
Third     2    23    72     3
Fourth   98     2     0     0

Table 3 Frequency of rank of rrmse values by different optimization methods

Rank     H-J   L-M   GA   GPS
First     0    30    31    39
Second    3    43    14    40
Third    13    22    48    17
Fourth   84     5     7     4

Table 4 Comparison between four optimization methods (12-D problem)

                H-J      L-M      GA       GPS
Optimal θ_opt   0.0100   0.0197   0.0306   0.0100
                0.0100   0.0100   0.0185   0.0100
                0.0133   0.2439   0.4174   0.0555
                0.0100   0.0100   0.0275   0.0100
                0.0157   0.2727   0.1990   0.0688
                0.0100   0.0100   0.0950   0.0100
                0.0115   0.1024   0.1316   0.0687
                0.0100   0.0100   0.1988   0.0306
                0.0170   0.1275   0.1524   0.0795
                0.0124   0.0100   0.1859   0.1122
                0.0234   0.1583   0.3100   0.1308
                0.0189   1.1422   0.9055   0.1680
φ(θ_opt)        1.1193   0.7067   0.8201   0.6719
rrmse           0.2443   0.1552   0.2095   0.1234

Table 5 Frequency of rank of φ(θ) function value at optimal θ by different optimization methods

Rank     H-J   L-M   GA   GPS
First     0    17     0    83
Second   12    16    55    17
Third    37    23    40     0
Fourth   51    44     5     0

Table 6 Frequency of rank of rrmse values by different optimization methods

Rank     H-J   L-M   GA   GPS
First     0    18     2    80
Second   16    55    12    17
Third    61    20    16     3
Fourth   23     7    70     0



see that the rrmse value decreases from zeroth-order to first-order polynomials. However, it increases from first-order to third-order polynomials, which indicates that increasing the order does not necessarily improve the accuracy of the surrogate model. Moreover, if several unnecessary basis functions are excluded to obtain the customized-order kriging, the kriging prediction becomes more accurate.

Therefore, the problem is how to find the optimal subset of the basis functions such that the kriging prediction has the best accuracy: that is, to find a subset of the basis functions such that the obtained kriging prediction has the smallest rrmse value. However, since the rrmse value is not available unless the true function is explicitly known, it is proposed to use the kriging process variance σ² as the estimator of the accuracy of the kriging prediction in this paper. Therefore, the formulation of this problem becomes to find a subset of the basis functions to minimize

    σ² = (1/n)(y − Fβ)ᵀR⁻¹(y − Fβ)    (11)

Note that different types of candidate basis functions, such as Hermite polynomials, trigonometric functions, and exponential functions, have been tested in this work, and it was found that the simple polynomial forms perform efficiently and effectively without losing accuracy. Thus, in this paper, all the candidate basis functions are assumed to be polynomials in the form of the multiplications x₁^{p₁} x₂^{p₂} ⋯ xₘ^{pₘ}, where m is the number of design variables, pᵢ ∈ [0, P] is an integer power of xᵢ,

    Σ_{i=1}^{m} pᵢ ≤ P

and P is the highest order of the mean structure in the kriging model. The total number of possible candidate basis functions is C^P_{m+P}. Therefore, the full set f becomes

    f = [1, x₁, x₂, …, xₘ, x₁², …, xₘ², x₁x₂, …, x_{m−1}xₘ, …, x₁^P, x₁^{P−1}x₂, …, xₘ^P]_{1 × C^P_{m+P}}    (12)
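The candidate set of Eq. (12) is straightforward to enumerate, and its size matches the binomial count C^P_{m+P} = C(m+P, P). The sketch below uses names of our choosing; the second helper anticipates the n − 1 safeguard on the number of candidates discussed in the text.

```python
from itertools import product
from math import comb

def candidate_exponents(m, P):
    """Exponent tuples (p1, ..., pm) with 0 <= pi and p1 + ... + pm <= P:
    the monomials x1^p1 * ... * xm^pm forming the full set f of Eq. (12)."""
    return [p for p in product(range(P + 1), repeat=m) if sum(p) <= P]

def highest_order(m, n):
    """Largest P with C(m+P, P) <= n - 1, so the candidate basis functions
    never outnumber the n samples."""
    P = 0
    while comb(m + P + 1, P + 1) <= n - 1:
        P += 1
    return P
```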

In Eq. (11), one constraint needs to be satisfied first: the total number of possible candidate basis functions cannot be larger than the number of samples used to generate the kriging prediction. Therefore, the highest order P is determined by finding the largest P such that C^P_{m+P} ≤ n − 1. The reason for using n − 1 instead of n is that, when the number of basis functions equals the number of samples, it sometimes causes an overfitting problem in the kriging prediction. Therefore, to make the kriging method work robustly, it is recommended to use n − 1 instead of n to find the highest order P. After P is determined according to the number of samples, Eq. (11) becomes a classic variable selection problem, expressed as follows:

Find the subset of f to minimize

�2 � 1

n�y F��TR1�y F�� (13)
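The rule C_{m+P}^P \le n - 1 for choosing the highest order can be sketched as a short search (the helper name is hypothetical). For the Branin-Hoo case used later in the paper (m = 2, n = 20 samples), this reproduces P = 4:

```python
from math import comb

def highest_order(m, n):
    """Largest P with C(m+P, P) <= n - 1, so that the number of candidate
    basis functions stays below the number of samples."""
    P = 0
    while comb(m + P + 1, P + 1) <= n - 1:
        P += 1
    return P

# With m = 2 design variables and n = 20 samples: C(6,4) = 15 <= 19 but
# C(7,5) = 21 > 19, so the highest usable order is P = 4.
print(highest_order(2, 20))  # 4
```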

It is obvious that the global optimal subset of these candidate basis functions can be guaranteed only by applying the exhaustive algorithm (EHA), which evaluates all possible 2^M subsets of the basis functions, where M is the number of candidate basis functions. Consequently, the computational expense of the EHA increases rapidly and becomes unaffordable when M is large. Therefore, an alternative method for solving Eq. (11) needs to be applied such that the kriging prediction based on this alternative optimal subset is accurate enough, close to the result obtained using the true optimal subset, and obtained with less computational expense.

As discussed in Sec. I, many research works have been carried out on variable selection problems. In an area related to kriging modeling, the blind kriging method uses Bayesian forward variable selection to find the significant coefficients in \beta by using the cross-validation error as the objective function. In this blind kriging framework, the forward-selection scheme can make the optimization process prematurely converge to a locally optimal selection. In this work, the GA is applied to find the optimal selection of basis functions. A main concern in using the GA method is the number of iterations and the convergence time [21]. This is true when using the GA to solve a continuous problem. However, in this particular basis-function selection problem, there are several reasons that the GA method can be efficient and attractive. The main reason is that the GA intends to find the global optimum instead of a local optimum, which leads to a more accurate kriging prediction compared with the result using a forward-selection scheme. The second reason is that Eq. (13) is a discrete minimization problem with a limited number C_{m+P}^P of possible basis functions. Unlike the encoding and decoding computation for the solution of a continuous problem, the selection of the basis functions itself can be directly expressed in genetic form, where 1 means selected and 0 means not selected. The third reason is that, with the selection of complementary (i.e., opposite) subsets for the initial generation, the GA can converge quickly. The fourth reason is that, to restrict the total computational time, one can set the maximum number of iterations and modify the highest order P for the GA method for complex engineering applications and yet obtain a satisfactory result. Efficiency strategies for how to apply the GA method to solve Eq. (10) are discussed in detail in the following sections.
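The selection scheme described above can be sketched as a small GA over 0/1 chromosomes. This is a simplified illustration, not the authors' implementation: the fitness shown is a stand-in (Hamming distance to an assumed "best" subset), where the actual method would evaluate the process variance of Eq. (11), and only the single/complementary initialization of the paper is reproduced.

```python
import random

def ga_select(M, fitness, max_iter=50, seed=0):
    """Minimal GA sketch for binary basis-subset selection: `fitness` is
    minimized (in dynamic kriging it would be the process variance of
    Eq. (11)). Chromosomes are 0/1 tuples: 1 = basis function selected."""
    rng = random.Random(seed)
    # Zeroth generation: single basis-function subsets and their complements.
    pop = [tuple(1 if j == i else 0 for j in range(M)) for i in range(M)]
    pop += [tuple(1 - b for b in c) for c in pop]
    for _ in range(max_iter):
        pop.sort(key=fitness)
        pop = pop[: max(4, len(pop) // 2)]            # elitist truncation
        children = []
        while len(pop) + len(children) < 2 * M:
            a, b = rng.sample(pop, 2)
            cut = rng.randrange(1, M)
            child = list(a[:cut] + b[cut:])           # one-point crossover
            child[rng.randrange(M)] ^= 1              # bit-flip mutation
            children.append(tuple(child))
        pop += children
    return min(pop, key=fitness)

# Stand-in fitness: Hamming distance to an assumed "best" subset.
target = (1, 0, 1, 0, 0, 1, 0, 0)
best = ga_select(8, lambda c: sum(x != t for x, t in zip(c, target)))
print(best)
```

Because the truncation is elitist, the best subset found so far is never lost, so the final fitness is at least as good as the best chromosome of the zeroth generation.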

1. Initial Generation

The GA procedure starts with an initial generation, called the zeroth generation. In this paper, the zeroth generation includes both the single basis-function subsets and the almost-full basis-function subsets. The C_{m+P}^P single basis-function subsets are defined as

[1, 0, 0, \ldots, 0], \; [0, 1, 0, \ldots, 0], \; \ldots, \; [0, 0, \ldots, 0, 1]

which indicates that each single basis-function subset is included in the initial generation. The other C_{m+P}^P almost-full basis-function subsets,

[0, 1, 1, \ldots, 1], \; [1, 0, 1, \ldots, 1], \; \ldots, \; [1, 1, \ldots, 1, 0]

Fig. 3 Plot of true function and sample profile.

Table 7 RRMSE of kriging prediction using different basis functions

Kriging method             Basis functions                                              RRMSE
OKG                        1                                                            0.1343
First-order UKG            1, x1, x2                                                    0.0819
Second-order UKG           1, x1, x2, x1x2, x1^2, x2^2                                  0.1137
Third-order UKG            1, x1, x2, x1x2, x1^2, x2^2, x1^2x2, x1x2^2, x1^3, x2^3      0.1671
Customized-order kriging   1, x1, x1^2, x1^3, x2                                        0.0810

2038 ZHAO, CHOI, AND LEE


which are complementary to the single basis-function subsets, are included in the initial generation. In addition to these subsets, the full basis-function subsets for each order (from first order to Pth order) are also included, which adds another P subsets to the initial generation. Altogether, there are 2C_{m+P}^P + P subsets in the zeroth generation. The single basis-function subsets and almost-full basis-function subsets are used to avoid reaching a local optimum in the basis selection. Based on numerous examples tested during this study, it is found that if the GA starts from only one side, either the single basis-function subsets or the almost-full basis-function subsets, there is a good chance that the GA procedure will prematurely converge to a local optimum.

2. Convergence Criteria

As discussed earlier, the convergence criterion of the GA needs to be carefully set for the GA to converge efficiently. In this work, the convergence conditions are chosen as follows:

1) Condition 1 is that the number of stalled iterations is equal to 2.

2) Condition 2 is

\left| \frac{\sigma_{k+1}^2 - \sigma_k^2}{\sigma_{k+1}^2} \right| \le 1\% \quad \text{and} \quad \left| \frac{\sigma_{k+2}^2 - \sigma_{k+1}^2}{\sigma_{k+2}^2} \right| \le 1\%   (14)

3) Condition 3 is that the maximum number of iterations is C_{m+P}^P.

Condition 1 means the GA stops if the process variances \sigma^2 in any two consecutive iterations are the same. Condition 2 means the GA stops if the absolute relative change of \sigma^2 between two consecutive iterations is less than 1%. Condition 3 means the GA stops if the maximum number of iterations reaches the number of candidate basis functions. The overall stopping criterion is that the GA stops if any of these three conditions is satisfied. After testing a number of different problems, it is found that condition 1 occurs most frequently, which indicates that the GA indeed stops very quickly.
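The three stopping conditions can be expressed as one test on the history of best process variances per iteration. The function below is an illustrative sketch, not the paper's code:

```python
def ga_converged(history, tol=0.01, max_iter=None):
    """Stopping test sketch for the GA, following the three conditions above.
    `history` is the list of best process variances, one entry per iteration."""
    if len(history) >= 2 and history[-1] == history[-2]:
        return True                                   # condition 1: stalled
    if len(history) >= 3:
        r1 = abs((history[-2] - history[-3]) / history[-2])
        r2 = abs((history[-1] - history[-2]) / history[-1])
        if r1 <= tol and r2 <= tol:
            return True                               # condition 2: < 1% change
    if max_iter is not None and len(history) >= max_iter:
        return True                                   # condition 3: iteration cap
    return False

print(ga_converged([0.50, 0.30, 0.30]))  # True (stalled)
print(ga_converged([0.50, 0.30, 0.20]))  # False
```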

3. Additional Efficiency Strategy in Basis-Function Selection

When evaluating the process variance \sigma^2 using different subsets, the optimization problem for the \theta search in Eq. (4) needs to be solved every time, which results in significant computational time if the GPS method is used. Through a number of test problems, it is found that the optimal \theta does not significantly affect the result of the basis-function selection. Therefore, in this paper, the \theta search in the evaluation of each subset during the GA process is removed; instead, the ordinary kriging model is generated first, and the optimal \theta obtained by the GPS method is used during the GA process for basis-function selection. Only after the optimal subset of the basis functions has been found by the GA method and used to generate the final kriging prediction is Eq. (4) solved again by the GPS method to find the optimal \theta. Therefore, the GPS method is used twice in the dynamic kriging process.

D. Performance of GA-Based Basis Selection

In this section, the GA-based optimal subset f_GA is compared with the global optimal subset f_EHA from the EHA for small-scale problems to demonstrate the accuracy and robustness of the GA-based selection method. The Branin-Hoo testing problem with the same 20 LHS samples used by Forrester and Keane [14] is used here. The EHA is first applied to find the true global optimal subset f_EHA. Since the total number of samples is 20, the highest possible order P is found to be 4. Therefore, the total number of possible candidate basis functions is C_{2+4}^4 = 15.

The f_EHA is obtained by running the EHA procedure, which evaluates the \sigma^2 values for all 2^15 possible subsets, and f_EHA is found to be [1, x_2, x_1x_2, x_2^2, x_1^3, x_1^2x_2, x_1x_2^2, x_2^3, x_1^4, x_2^4], with an rrmse of 0.04 based on 100 x 100 grid testing points. When the GA-based basis selection is applied, f_GA is [1, x_1, x_2, x_1x_2, x_1^2, x_2^2], with an rrmse of 0.07. The rrmse obtained by the blind kriging method [22] is 0.19, which is not as good as the result of the GA-based basis selection. Figure 4a shows the optimization process using the GA method, which converges after four iterations. The contours of the kriging prediction based on the three different basis selection methods are shown in Figs. 4b-4d. For computational efficiency, in the GA-based basis selection, only 4 x (2C_{2+4}^4 + 4) = 136 subsets have been evaluated, whereas in the EHA-based selection process, all 2^15 = 32,768 subsets have to be evaluated to find f_EHA. Thus, the GA-based selection requires only about 136/32,768 = 0.42% of the computational time spent by the EHA to find a solution for this problem. Specifically, the clock time spent on the GA-based selection is 604 ms, whereas the clock time spent on the EHA-based selection is 152,370 ms, and the clock time spent on the blind kriging is 374 ms on an Intel P8700 CPU.

To verify whether the GA-based selection algorithm is robust, a robustness study is carried out as follows. First, a performance measure is defined as

\text{performance} = \frac{\sum_{i=1}^{2^{C_{m+P}^P}} I(\text{rrmse}_{GA} < \text{rrmse}_i)}{2^{C_{m+P}^P}} \times 100\%   (15)

where I(\cdot) is the indicator function, and rrmse_i is calculated using the kriging prediction with the ith subset out of the total 2^{C_{m+P}^P} subsets during the EHA selection process. The performance in Eq. (15) indicates the percentile of the accuracy of the subset of the basis functions obtained by the GA method among all possible 2^{C_{m+P}^P} subsets.

subsets. Since the sample position has an influence on the result, 10consecutive trials with different sample sets fromLHS are carried outto see if the GA-based selection method is robust. In Fig. 5, the solidline is the rrmse values obtained by using the GA method for basisselection, the dashed line is the rrmse values obtained by the optimalbasis selection using the EHA method, and the dotted line is theperformance as defined in Eq. (14) for each trial. It shows that theGA-based selection process can find a very good subset of basisfunctions, which is better than about 97%of other subsets in the EHAprocess, while only using 0.5%of the computational time used by theEHA method.

The stepwise selection method was also tested for selecting the basis functions. It is found that the stepwise selection method was not as accurate as the proposed GA method. In Appendix A, a comparison study is carried out to demonstrate the difference between the two methods.

With the optimal f_GA obtained using the GA method by solving Eq. (13) and the optimal \theta obtained using the GPS method by solving Eq. (4), the dynamic kriging (DKG) method is formed and compared with other existing surrogate modeling methods in the following sections.

III. Comparison Study Between Dynamic Kriging and Other Metamodeling Methods

A. Comparison Procedure

To compare the performance of the DKG method against other metamodeling methods, we selected the four most widely used metamodeling methods [23]: the UKG method, the polynomial response surface (PRS) method, the radial basis-function (RBF) method, and the blind kriging (BKG) method. To make a fair comparison, we first need to specify how these methods are optimally used in this paper.

For the UKG method, the mean structure is set to be second-order polynomials. For the PRS method, the response y is considered as a linear combination of regression basis functions. The predicted response is expressed as

y = a_0 + \sum_{i_1=1}^{m} a_{i_1} x_{i_1} + \sum_{i_1=1}^{m} \sum_{i_2=1}^{m} a_{i_1 i_2} x_{i_1} x_{i_2} + \cdots + \sum_{i_1=1}^{m} \cdots \sum_{i_n=1}^{m} a_{i_1 i_2 \cdots i_n} x_{i_1} x_{i_2} \cdots x_{i_n}   (16)

To accurately apply the PRS method, the highest order P of the polynomials is decided by finding the best P such that the prediction has the smallest cross-validation error. For the RBF method,


the response y is considered as a linear combination of radial basis functions, expressed as

y = \mathbf{w}^T \boldsymbol{\phi} = \sum_{i=1}^{n} w_i \, \phi(\|x - c_i\|)   (17)

where c_i is the center of the ith basis function. In this paper, \phi(r) = e^{-r^2 / 2\sigma^2} is used. The \sigma value is also determined by minimizing the cross-validation error.
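For the 1-D case, an interpolating Gaussian RBF of the form in Eq. (17) can be sketched as follows. NumPy is assumed, and \sigma is fixed here rather than chosen by cross-validation as in the paper:

```python
import numpy as np

def rbf_fit(x, y, sigma=1.0):
    """Interpolating Gaussian RBF weights: solve Phi w = y, with
    phi(r) = exp(-r^2 / (2 sigma^2)) and centers at the sample points."""
    r = np.abs(x[:, None] - x[None, :])
    return np.linalg.solve(np.exp(-r**2 / (2 * sigma**2)), y)

def rbf_predict(xq, x, w, sigma=1.0):
    """Evaluate the fitted RBF model at query points xq."""
    r = np.abs(np.asarray(xq)[:, None] - x[None, :])
    return np.exp(-r**2 / (2 * sigma**2)) @ w

# 1-D toy data; an interpolating RBF reproduces the training responses.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.sin(x)
w = rbf_fit(x, y)
print(np.allclose(rbf_predict(x, x, w), y))  # True
```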

The SURROGATES toolbox is used to test the above three methods, with the modification of using cross-validation to find the best values of the model parameters for each method. For the blind kriging method, the original code from Joseph et al. [16] is used.

The comparison procedure is carried out as follows. First, n samples are generated by the LHS method. Second, five surrogate models are generated using the given samples. After constructing the surrogate models using the five methods, the function values from each surrogate model at S evenly distributed testing points are predicted, and the rrmse values are calculated as the accuracy measurement. Then a rank is determined for these five methods in terms of the accuracy of the generated surrogate model based on the rrmse values from each method. To eliminate the effect of the specific sample profile, the comparison is conducted for 50 trials with 50 different sample sets, and the frequency with which each method is identified as the best surrogate model is counted to find the method that performs the best.
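The best-method frequency count used in this procedure can be sketched as follows, with hypothetical rrmse values standing in for the 50-trial results:

```python
from collections import Counter

def best_counts(trial_rrmse):
    """Count, over all trials, how often each metamodeling method yields
    the smallest rrmse (i.e., is ranked first in that trial)."""
    wins = Counter()
    for trial in trial_rrmse:
        wins[min(trial, key=trial.get)] += 1
    return wins

# Hypothetical rrmse values for three trials (illustration only).
trials = [
    {"UKG": 0.12, "PRS": 0.20, "DKG": 0.05},
    {"UKG": 0.09, "PRS": 0.15, "DKG": 0.06},
    {"UKG": 0.04, "PRS": 0.18, "DKG": 0.07},
]
print(best_counts(trials))
```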

For the comparison of these methods, one important point is the level of accuracy at which the surrogate models should be compared. That is, comparing the performance of metamodeling methods when none of the surrogate models achieves an appropriate level of accuracy for the purpose of the application is meaningless. Therefore, one first needs to set the level of accuracy at which the surrogate model will be used for the comparison study. In this paper, the coefficient of determination R^2 is used as the normalized accuracy measurement to check whether the surrogate model is acceptable.

Fig. 4 Contours from different basis selection results.

Fig. 5 Comparison between GA and EHA selections (rrmse of the GA and EHA selections, left axis, and the performance measure in percent, right axis, over 10 trials).


The surrogate model is defined to be accurate when the median of the R^2 value is larger than 0.99 over 50 trials. The rank of the performance of each metamodeling method is compared at the sample size at which at least one method can generate a surrogate model with R^2 larger than 0.99.

B. Benchmark Problems for Comparison Study

The first problem chosen for comparison is the Branin-Hoo function given in Eq. (8). In this problem, the true function is a combination of polynomial and cosine functions. Therefore, it does not favor any of the five methods and can be viewed as an unbiased problem for all methods. As shown in Tables 8 and 9, the comparison started at the 16-sample case, and the DKG method achieved an acceptable surrogate model first, at the 18-sample case, where the DKG method was identified as the best 43 times, as shown in Table 9. Table 10 shows the mean rrmse of the five methods, with the DKG method performing the best.

The second problem used for the comparison study is an engineering application: an M1A1 tracked-vehicle road-arm problem. The road arm is modeled using 1572 eight-node isoparametric finite elements (SOLID45) and four beam elements (BEAM44) of ANSYS, as shown in Fig. 6, and is made of S4340 steel with Young's modulus E = 3.0 x 10^7 psi and Poisson's ratio \nu = 0.3. The durability analysis of the road arm is carried out using the Durability and Reliability Analysis Workspace (DRAW) [24] to obtain the fatigue life. The fatigue lives at the 13 critical nodes shown in Fig. 7 are chosen as the design constraints. In Fig. 8, the shape design variables consist of four cross-sectional shapes of the road arm, where the widths (x_1 direction) of the cross-sectional shapes are defined as design variables d_1, d_3, d_5, and d_7 at intersections 1, 2, 3, and 4, respectively, and the heights (x_3 direction) of the cross-sectional shapes are defined as design variables d_2, d_4, d_6, and d_8.

Since the finite element analysis and fatigue analysis are time-consuming, a surrogate model is needed when carrying out design optimization. For the comparison of the DKG and other metamodeling methods, the normalized fatigue life at the first critical node is used as the response, and the surrogate model is generated for

G(\mathbf{d}) = 1 - \frac{L(\mathbf{d})}{L_t}   (18)

Table 8 R^2 median history (Branin-Hoo, 50 trials)

Sample size   UKG     RBF     PRS     BKG     DKG
16            0.784   0.586   0.961   0.901   0.975
17            0.839   0.612   0.967   0.928   0.989
18            0.857   0.731   0.970   0.979   0.992
19            0.872   0.789   0.952   0.982   0.996
20            0.911   0.805   0.975   0.993   0.999

Table 9 Frequency of rank of five methods (Branin-Hoo, 18 points, 50 trials)

Rank     UKG   RBF   PRS   BKG   DKG
First    0     0     0     7     43
Second   15    8     0     23    4
Third    18    14    2     14    2
Fourth   16    20    8     5     1
Fifth    1     8     40    1     0

Table 10 Mean rrmse values for each method (Branin-Hoo, 18 points, 50 trials)

Method   Mean rrmse
UKG      2.8321
RBF      4.5357
PRS      6.3349
BKG      1.7615
DKG      0.6810

Fig. 6 Finite element model of road arm.

Fig. 7 Fatigue-life contour and critical nodes of road arm.


where L(\mathbf{d}) is the crack-initiation fatigue life at the first critical node, and L_t is the crack-initiation target fatigue life, which is 5 years for this example. The domain for generating the surrogate model is defined as a hypersphere with a radius of 5% x \|\mathbf{d}_0\|, where

\mathbf{d}_0 = [1.750, 3.250, 1.750, 3.170, 1.756, 3.038, 1.752, 2.908]

The same 50-trial statistical study as conducted in the previous example is carried out. In each trial, 50 LHS samples within the hypersphere are randomly generated, and the surrogate models are generated by each of the five metamodeling methods. 2000 LHS samples are first evaluated using the finite element analysis and used to calculate the rrmse value for each surrogate model. After 50 trials, the rank of the rrmse values of the surrogate models using the different metamodeling methods is calculated and shown in Table 11. Again, the DKG method performs the best, generating the most accurate surrogate model 28 times, followed by the blind kriging method. Table 12 shows the mean rrmse values for each of the surrogate modeling methods, where the rrmse from the DKG method is the smallest, followed by that of the blind kriging method.

Note that the fatigue-life response for the road-arm problem is mildly nonlinear. After checking the selected optimal subsets of the basis functions from both the dynamic kriging method and the blind kriging method, it is found that the optimal subsets from the two methods are almost the same for this example due to the mild nonlinearity. The main difference in prediction accuracy comes from the optimal \theta values, since the two methods use different optimization methods to solve Eq. (4).

IV. Design Optimization Using Dynamic Kriging Method

In simulation-based design optimization, surrogate models are widely used. In this section, the practical use of the DKG method for design optimization is discussed in detail. After an explanation of several efficiency strategies for using the DKG method, an engineering design optimization problem is used to demonstrate the overall performance of the DKG method for simulation-based design optimization.

A. Local Window for Surrogate Modeling

Since the DKG method selects the best basis-function subset according to the nonlinearity of the response, it is better to generate the surrogate model on a local window than to generate a global surrogate model on the entire design domain. When the candidate design point moves at each iteration, the sample set within the local window changes; therefore, the DKG method will choose different basis-function subsets according to the local nonlinearity of the response to generate the most accurate surrogate model locally. This local window concept is visualized in Fig. 9. The hypersphere used to define the local window is expressed as

\sum_{i=1}^{m} (x_i - d_i)^2 \le R^2   (19)

where \mathbf{d} = [d_1, d_2, \ldots, d_m] is the current design point and R is the radius. In this paper, the R value is set as 5% x \|\mathbf{d}\|.
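Generating points inside the local window of Eq. (19) can be sketched by rejection sampling. Note that the paper actually uses LCVT samples for this purpose, so the uniform sampler below is only a stand-in:

```python
import math
import random

def sample_local_window(d, n, seed=0):
    """Rejection-sampling sketch: draw n uniform points inside the local
    window, i.e., the hypersphere of Eq. (19) with radius R = 0.05 * ||d||."""
    rng = random.Random(seed)
    R = 0.05 * math.sqrt(sum(di * di for di in d))
    pts = []
    while len(pts) < n:
        x = [di + rng.uniform(-R, R) for di in d]
        if sum((xi - di) ** 2 for xi, di in zip(x, d)) <= R * R:
            pts.append(x)
    return pts, R

d = [1.750, 3.250, 1.750, 3.170]
pts, R = sample_local_window(d, 9)
print(all(sum((xi - di) ** 2 for xi, di in zip(p, d)) <= R * R for p in pts))  # True
```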

To see how the DKG method works effectively in the local window, consider a 2-D highly nonlinear polynomial function expressed as

Fig. 8 Shape design variables of road arm.

Table 11 Frequency of rank of five methods (road arm, 50 points, 50 trials)

Rank     UKG   RBF   PRS   BKG   DKG
First    0     0     0     22    28
Second   6     1     0     21    22
Third    38    2     4     6     0
Fourth   6     11    32    1     0
Fifth    0     36    14    0     0

Table 12 Mean rrmse values for each method (road arm, 50 points, 50 trials)

Method   Mean rrmse
UKG      1.4461
RBF      3.1603
PRS      2.2046
BKG      0.6518
DKG      0.5845


G(\mathbf{X}) = 1 + (0.9063X_1 + 0.4226X_2 - 6)^2 + (0.9063X_1 + 0.4226X_2 - 6)^3 - 0.6(0.9063X_1 + 0.4226X_2 - 6)^4 - (-0.4226X_1 + 0.9063X_2)   (20)

and generate surrogate models for Eq. (20) at three different design points (\mathbf{d}_1, \mathbf{d}_2, \mathbf{d}_3) using the DKG method within each local window. The locations of the three design points and the local window at each point are shown in Fig. 10.

Nine initial samples are randomly generated within each local window, and basis functions up to the second-order polynomial are used for the DKG method. After applying the DKG method at each local window, \{x_1, x_2, x_1x_2\}, \{1, x_1, x_2, x_1^2, x_2^2\}, and \{1, x_1, x_2, x_1x_2, x_1^2\} are selected as the best basis functions in the local windows at \mathbf{d}_1, \mathbf{d}_2, and \mathbf{d}_3, respectively, to accurately describe the true function. The fact that different basis-function sets are selected at different design points means that one fixed basis-function set cannot best describe the local nonlinearity of the true function, since the local nonlinearity changes as the design point moves. Furthermore, the fact that not all six possible basis terms are selected shows that it is not necessarily good to use all available terms for the generation of surrogate models; this indeed shows the effectiveness of the DKG method.

B. Sampling Strategy

After deciding the local window for the surrogate model generation, N_r initial samples are generated on the local window using Latin centroidal Voronoi tessellations (LCVT) [22] for evenly distributed samples, and then the surrogate models are generated on the local window. The minimum number of initial samples is decided by C_{m+P}^P. For example, for an 8-D example with up to second-order polynomial basis functions, the minimum N_r will be C_{8+2}^2 = 45. However, for high-dimensional problems, the minimum number of initial samples may not be sufficient to generate accurate surrogate models; more samples may be needed, depending on the accuracy of the surrogate model [25-29]. The accuracy of the surrogate model generated with the N_r initial samples can be estimated using

\epsilon = \frac{\text{mean}(\text{mse}(x_i))}{\text{Var}(y(x_j))}, \quad \text{for } i = 1\text{-}NTS, \; j = 1\text{-}n   (21)

where Var(y(x_j)) is the variance of the n true responses at the sample points and is used to normalize the accuracy measure, NTS is the total number of testing points generated using LCVT, and mse is the predicted mean square error (mse) from the DKG model. The physical meaning of the accuracy measure in Eq. (21) is related to the prediction variance of the kriging model. Hence, the smaller the prediction variance, the more accurate the surrogate model. If the accuracy of a surrogate model is satisfactory, which is defined as \epsilon \le 1\% in this work, the surrogate model can be used for optimization. However, if the accuracy does not satisfy the target, more samples are sequentially inserted within the local window until the surrogate model satisfies the target accuracy condition. The new inserted point x_{new} is chosen as the one that has the largest mse value among the testing points in the local window. For a typical design optimization problem, multiple constraints are usually involved. In this case, the accuracy measure in Eq. (21) needs to be modified to reflect the effect of multiple surrogate models. In this paper, the maximum of the accuracy measures of the individual surrogate models is used as the overall accuracy measure for multiple surrogate models, and thus the accuracy measure is given by

\epsilon_{max} = \max_k \left[ \frac{\text{mean}(\text{mse}_k(x_i))}{\text{Var}(y_k(x_j))} \right], \quad \text{for } i = 1\text{-}NTS, \; j = 1\text{-}n, \; k = 1\text{-}nc   (22)

where Var(y_k(x_j)) and mse_k are the variance of the n true responses and the mse for the kth surrogate model, respectively, and nc is the number of surrogate models. Correspondingly, the new inserted point x_{new} is chosen as the point that has the largest mse among all constraints, expressed as \arg\max_{x_i} \max_k \{\text{mse}_k(x_i)\}, for i = 1\text{-}NTS and k = 1\text{-}nc.
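The sequential insertion loop can be sketched as follows. The `fit` and `mse` callables are toy stand-ins for the DKG model construction and its predicted mse (the real accuracy criterion is Eq. (21)); only the control flow mirrors the text:

```python
def sequential_sampling(samples, test_points, fit, mse, var_y, tol=0.01):
    """Sequential insertion sketch: rebuild the surrogate and add the test
    point with the largest predicted mse until the normalized accuracy
    measure (mean mse / variance of true responses) satisfies eps <= tol."""
    while True:
        model = fit(samples)
        errors = [mse(model, x) for x in test_points]
        eps = (sum(errors) / len(errors)) / var_y
        if eps <= tol:
            return model, samples
        # Insert the testing point with the largest predicted mse.
        worst = max(range(len(errors)), key=errors.__getitem__)
        samples = samples + [test_points[worst]]

# Toy stand-ins: the "mse" shrinks as samples accumulate near a test point.
fit = lambda s: s
mse = lambda model, x: 1.0 / (1 + sum(1 for p in model if abs(p - x) < 0.2))
model, samples = sequential_sampling([0.0, 1.0], [0.1, 0.5, 0.9], fit, mse, var_y=25.0)
print(len(samples))
```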

C. Adaptive Initial Point for Pattern Search and Basis-Function Selection

When applying the DKG method to a complex engineering design optimization problem, the number of variables used for surrogate modeling is usually large. In such cases, the pattern search algorithm used to find the optimal correlation parameter \theta in Eq. (4) and the genetic algorithm used to find the optimal basis-function subset in Eq. (13) may become computationally expensive. It is known that the computational efficiency of the pattern search algorithm and the genetic algorithm is strongly affected by the initial search point. If the initial search starts from the neighborhood of the true optimum, it can find the optimal \theta within a remarkably shorter time than if it starts from a point far away from the optimum. Moreover, along the optimization history, if the design movement is small, which means that the current design is near the optimal design, the surrogate model generated at the current design will be very similar to the one generated at the previous design. This means that the two optimal \theta values will be close. Therefore, we can adaptively use the optimal \theta obtained in the previous iteration as the initial point for the pattern search of the current iteration, instead of using an arbitrary initial point. Similarly, the optimal basis-function subset f_{k-1}^{opt} found in the previous iteration is included in the initial generation of the GA for the current iteration. This saves computational time for the DKG method, in particular when the current design is in the neighborhood of the optimal design. The overall flowchart of using the DKG method for design optimization is shown in Fig. 11.

Fig. 9 Local window for surrogate model.

Fig. 10 Contour of true response at different local windows.


D. Numerical Example

This section uses the M1A1 tracked-vehicle road-arm problem to carry out design optimization using the DKG method with the efficiency strategies discussed in the previous sections. The finite element model, the critical points, and the design variable definitions are the same as shown in Figs. 6-8. Table 13 shows the initial design and the design domain for each variable.

The design optimization for the M1A1 tracked-vehicle road arm is formulated as

minimize   cost(\mathbf{d})
subject to   G_j(\mathbf{d}) < 0, \; j = 1, \ldots, 13
\mathbf{d}^L \le \mathbf{d} \le \mathbf{d}^U, \; \mathbf{d} \in R^8   (23)

where cost(\mathbf{d}) is the weight of the road arm,

G_j(\mathbf{d}) = 1 - \frac{L(\mathbf{d})}{L_t}, \quad j = 1\text{-}13   (24)

L(\mathbf{d}) is the crack-initiation fatigue life, and L_t is the crack-initiation target fatigue life (5 years).

The constraint values G_j(\mathbf{d}) and the sensitivities of G_j(\mathbf{d}) with respect to the design variables are predicted by \hat{G}_j(\mathbf{d}) and

Fig. 11 Flowchart of design optimization using the DKG method.

Table 13 Design variables and design domain

Design variable   Lower bound d^L   Initial design d^0   Upper bound d^U
d1                1.350             1.750                2.150
d2                2.650             3.250                3.750
d3                1.350             1.750                2.150
d4                2.570             3.170                3.670
d5                1.356             1.756                2.156
d6                2.438             3.038                3.538
d7                1.352             1.752                2.152
d8                2.508             2.908                3.408

Table 14 Comparison between sensitivity-based optimum and surrogate-based optimum

Design variable                  Initial   Sensitivity-based   Surrogate-based
d1                               1.750     1.653               1.653
d2                               3.250     2.650               2.650
d3                               1.750     1.922               1.911
d4                               3.170     2.570               2.570
d5                               1.756     1.478               1.478
d6                               3.038     3.287               3.297
d7                               1.752     1.630               1.630
d8                               2.908     2.508               2.508
Number of function evaluations   ----      11 + 11 x 8         146
Iterations                       ----      11                  11
Active constraints               ----      1, 3, 5, 8, 12      1, 3, 5, 8, 12
Cost                             515.09    466.80              466.93


\partial \hat{G}_j(\mathbf{d}) / \partial d_i from Eq. (5), where \hat{G}_j(\mathbf{d}) is the prediction generated by the DKG method. The sequential quadratic programming algorithm is used to solve the optimization problem of Eq. (23). In each iteration, 15 samples are used as the initial samples in the local window, and new samples are sequentially inserted if the accuracy of the surrogate model is not achieved. After 11 iterations, the optimization process converged to the optimal design, using 146 samples altogether. To verify the accuracy of the optimal result, the sensitivity-based design optimization is carried out again.

In the sensitivity-based design optimization, the sensitivity information \partial G_i / \partial d_j, for i = 1\text{-}13 and j = 1\text{-}8, is available through the design sensitivity analysis method [30] and is used to solve Eq. (23); therefore, no surrogate model is needed in this procedure. Table 14 shows the comparison results for the two approaches. The optimal design obtained using the surrogate-based design optimization is almost identical to the optimal design obtained using the sensitivity-based one, except for small differences in d_3 and d_6. The sensitivity-based design optimization requires 11 function and 11 sensitivity evaluations. One sensitivity evaluation includes sensitivity calculations for all design variables, so it requires eight sensitivity calculations in this example, whereas the sampling-based design optimization requires a total of 146 samples for the surrogate model generation using the DKG method.

In Fig. 12, the average computational time to generate each surrogate model is shown as the solid line with squares. The average computational time to generate each surrogate model without applying the efficiency strategy is also shown, as the solid line with stars. It is shown that, by applying the efficiency strategy, the average computational time is reduced from 4 to 1 s. In addition, the largest reduction occurs at the second iteration, due to the improvement of the initial \theta point and the basis-function selection.

V. Conclusions

When applying a metamodeling method to generate a surrogate model based on a limited number of samples, the kriging method is often used. The traditional UKG method has some limitations because of the fixed order of the regression basis functions for the mean structure and the optimization method that is used to obtain the optimal correlation parameter. The DKG method is proposed to find an accurate optimal correlation parameter by using the generalized pattern search method and to determine the best subset of the basis functions dynamically by applying the genetic algorithm. Comprehensive comparison studies show that the DKG method can generate more accurate surrogate models than traditional metamodeling methods such as the UKG method, the radial basis-function method, the polynomial response surface method, and the blind kriging method. With the use of sequential sampling and efficiency strategies, the DKG method is applied to a simulation-based design optimization problem to efficiently obtain an accurate optimal design. The numerical example shows that the optimal design obtained using the DKG method with efficiency strategies achieves the same accuracy as the optimal design obtained using the sensitivity-based design optimization.

Appendix: Comparison Between Stepwise Selection and Genetic Algorithm Selection for Basis Functions

To compare the performance of the stepwise selection and the genetic algorithm selection methods for solving Eq. (13), the Branin-Hoo example with the 20-run LHS samples is used. Table A1 shows the basis functions selected by the two methods and the rrmse values of the resulting surrogate models. The genetic algorithm found the optimal subset [1, x_1, x_2, x_1x_2, x_1^2, x_2^2], and the associated rrmse value of the surrogate model is 0.07, whereas the stepwise selection found the optimal subset [1, x_1, x_2, x_1x_2, x_1^2, x_2^2, x_1^3, x_1^2x_2, x_1x_2^2, x_2^3], and the associated rrmse value is 0.33.

A simulation study is also carried out to compare the performance of the genetic algorithm basis selection and the stepwise basis selection. In this study, 50 sets of 20-run LHS samples are randomly generated. For each sample set, the genetic algorithm basis selection and the stepwise basis selection are applied to find the optimal basis functions for the kriging model. After the basis functions are found, the surrogate models are constructed based on the selected basis functions, and 100 × 100 grid testing points are used to calculate the rrmse values of the surrogate models. Table A2 shows the comparison result. Again, the genetic algorithm outperforms the stepwise selection.
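The subset-selection task in this study can be sketched as follows. This is a minimal stand-in, not the paper's implementation: ordinary least squares with a leave-one-out error replaces the kriging-based criterion of Eq. (12), and because only six second-order candidate terms are used here, exhaustive enumeration (the counterpart of f_EHA in the Nomenclature) replaces the genetic algorithm. All names and the sampling scheme are illustrative.

```python
import numpy as np

def branin(x1, x2):
    # Branin-Hoo test function in its standard form
    return ((x2 - 5.1 / (4 * np.pi**2) * x1**2 + 5 / np.pi * x1 - 6) ** 2
            + 10 * (1 - 1 / (8 * np.pi)) * np.cos(x1) + 10)

# Candidate polynomial basis terms up to second order
BASIS = [
    ("1",     lambda x1, x2: np.ones_like(x1)),
    ("x1",    lambda x1, x2: x1),
    ("x2",    lambda x1, x2: x2),
    ("x1x2",  lambda x1, x2: x1 * x2),
    ("x1^2",  lambda x1, x2: x1 ** 2),
    ("x2^2",  lambda x1, x2: x2 ** 2),
]

def loo_error(cols, y):
    """Leave-one-out RMS error of a least-squares fit with the given columns."""
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(cols[mask], y[mask], rcond=None)
        errs.append(y[i] - cols[i] @ coef)
    return float(np.sqrt(np.mean(np.square(errs))))

def select_subset(x1, x2, y):
    """Score every nonempty subset of candidate terms; return the best."""
    full = np.column_stack([f(x1, x2) for _, f in BASIS])
    best_names, best_err = None, np.inf
    for code in range(1, 2 ** len(BASIS)):
        idx = [j for j in range(len(BASIS)) if code >> j & 1]
        err = loo_error(full[:, idx], y)
        if err < best_err:
            best_names, best_err = [BASIS[j][0] for j in idx], err
    return best_names, best_err
```

A genetic algorithm becomes necessary when the candidate pool is large (e.g., all terms up to order P in m variables), where the 2^k subsets cannot be enumerated; the encoding above (one bit per candidate term) is exactly the chromosome such a GA would evolve.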

Acknowledgments

Research is primarily supported by the U.S. Army Research Office project W911NF-09-1-0250. This research was also partially supported by the World Class University Program through a National Research Foundation of Korea grant funded by the Ministry of Education, Science and Technology (grant number R32-2008-000-10161-0 in 2009). This support is greatly appreciated. We would like to thank Roshan Joseph and Ying Hung for providing the original code for the blind kriging method and their valuable suggestions. We would also like to thank Felipe Viana for providing the SURROGATES toolbox used in the comparison study and for his generous suggestion for the development of the dynamic kriging method.

References

[1] Simpson, T. W., Toropov, V., Balabanov, V., and Viana, F. A. C., "Design and Analysis of Computer Experiments in Multidisciplinary Design Optimization: A Review of How Far We Have Come—Or Not," 12th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Victoria, BC, Canada, AIAA Paper 2008-5802, 2008.

[2] Sacks, J., Welch, W. J., Mitchell, T. J., and Wynn, H., "Design and Analysis of Computer Experiments," Statistical Science, Vol. 4, No. 4, 1989, pp. 409–423. doi:10.1214/ss/1177012413

[3] Kim, C., Wang, S., and Choi, K. K., "Efficient Response Surface Modeling by Using Moving Least-Squares Method and Sensitivity," AIAA Journal, Vol. 43, No. 11, 2005, pp. 2404–2411. doi:10.2514/1.12366

Fig. 12 Mean value of time spent for the DKG method in one surrogate model generation.

Table A1 Comparison of basis-function selection by the genetic algorithm and stepwise methods and rrmse values

Selection method              Selected basis functions                                    rrmse
Stepwise selection            1, x1, x2, x1x2, x1^2, x2^2, x1^3, x1^2 x2, x1 x2^2, x2^3   0.33
Genetic algorithm selection   1, x1, x2, x1x2, x1^2, x2^2                                 0.07

Table A2 Comparison of GA selection and stepwise selection (50 trials)

                                                           Branin–Hoo, 20 points
Median of rrmse based on the genetic algorithm selection   0.085
Median of rrmse based on stepwise selection                0.356
No. of cases (rrmse_GA < rrmse_stepwise)                   49

ZHAO, CHOI, AND LEE 2045


[4] Lancaster, P., and Salkauskas, K., "Surfaces Generated by Moving Least Squares Methods," Mathematics of Computation, Vol. 37, No. 155, 1981, pp. 141–158. doi:10.2307/2007507

[5] Keane, A. J., "Design Search and Optimization Using Radial Basis Functions with Regression Capabilities," Proceedings of the Conference on Adaptive Computing in Design and Manufacture, Berlin, 2004.

[6] Clarke, S. M., Griebsch, J. H., and Simpson, T. W., "Analysis of Support Vector Regression for Approximation of Complex Engineering Analyses," Journal of Mechanical Design, Vol. 127, No. 6, 2005, pp. 1077–1087. doi:10.1115/1.1897403

[7] Forrester, A., Sobester, A., and Keane, A. J., Engineering Design via Surrogate Modelling: A Practical Guide, Wiley, Hoboken, NJ, 2008.

[8] Krige, D. G., "A Statistical Approach to Some Basic Mine Valuation Problems on the Witwatersrand," Journal of the Chemical, Metallurgical and Mining Engineering Society of South Africa, Vol. 52, No. 6, 1951, pp. 119–139.

[9] Goel, T., Haftka, R., and Shyy, W., "Comparing Error Estimation Measures for Polynomial and Kriging Approximation of Noise-Free Functions," Structural and Multidisciplinary Optimization, Vol. 38, No. 5, 2008, pp. 429–442. doi:10.1007/s00158-008-0290-z

[10] Kyriakidis, P. C., and Goodchild, M. F., "On the Prediction Error Variance of Three Common Spatial Interpolation Schemes," International Journal of Geographical Information Science, Vol. 20, No. 8, 2006, pp. 823–855. doi:10.1080/13658810600711279

[11] Lophaven, S. N., Nielsen, H. B., and Sondergaard, J., "DACE: A MATLAB Kriging Toolbox," Technical Univ. of Denmark, TR IMM-TR-2002-12, Lyngby, Denmark, 2002.

[12] Toal, D. J. J., Bressloff, N. W., and Keane, A. J., "Kriging Hyperparameter Tuning Strategies," AIAA Journal, Vol. 46, No. 5, 2008, pp. 1240–1252. doi:10.2514/1.34822

[13] Martin, J. D., "Computational Improvements to Estimating Kriging Metamodel Parameters," Journal of Mechanical Design, Vol. 131, No. 8, 2009, Paper 084501. doi:10.1115/1.3151807

[14] Forrester, A., and Keane, A. J., "Recent Advances in Surrogate-Based Optimization," Progress in Aerospace Sciences, Vol. 45, Nos. 1–3, 2009, pp. 50–79. doi:10.1016/j.paerosci.2008.11.001

[15] Chen, R. B., Wang, W., and Wu, C. F. J., "Building Surrogates with Overcomplete Bases in Computer Experiments with Applications to Bistable Laser Diodes," IIE Transactions, Vol. 43, 2011, pp. 39–53. doi:10.1080/0740817X.2010.504686

[16] Joseph, V. R., Hung, Y., and Sudjianto, A., "Blind Kriging: A New Method for Developing Metamodels," Journal of Mechanical Design, Vol. 130, No. 3, 2008, Paper 031102. doi:10.1115/1.2829873

[17] Jones, D. R., "A Taxonomy of Global Optimization Methods Based on Response Surfaces," Journal of Global Optimization, Vol. 21, 2001, pp. 345–383. doi:10.1023/A:1012771025575

[18] Chiles, J. P., and Delfiner, P., Geostatistics: Modeling Spatial Uncertainty, Wiley, New York, 1999.

[19] Lewis, R. M., and Torczon, V., "Pattern Search Algorithms for Bound Constrained Minimization," SIAM Journal on Optimization, Vol. 9, No. 4, 1999, pp. 1082–1099. doi:10.1137/S1052623496300507

[20] Martin, J. D., and Simpson, T. W., "Use of Kriging Models to Approximate Deterministic Computer Models," AIAA Journal, Vol. 43, No. 4, 2005, pp. 853–863. doi:10.2514/1.8650

[21] Bäck, T., Evolutionary Algorithms in Theory and Practice, Oxford Univ. Press, New York, 1996.

[22] Burkardt, J., Gunzburger, M., Peterson, J., and Brannon, R., "User Manual and Supporting Information for Library of Codes for Centroidal Voronoi Placement and Associated Zeroth, First, and Second Moment Determination," Sandia National Labs., TR SAND2002-0099, Albuquerque, NM, Feb. 2002.

[23] Viana, F. A. C., Haftka, R. T., and Steffen, V., Jr., "Multiple Surrogates: How Cross-Validation Errors Can Help Us to Obtain the Best Predictor," Structural and Multidisciplinary Optimization, Vol. 39, No. 4, 2009, pp. 439–457. doi:10.1007/s00158-008-0338-0

[24] DRAW Concept Manual, Univ. of Iowa, Center for Computer-Aided Design, College of Engineering, Iowa City, IA, 1999.

[25] Stein, M., "Large Sample Properties of Simulations Using Latin Hypercube Sampling," Technometrics, Vol. 29, No. 2, 1987, pp. 143–151. doi:10.2307/1269769

[26] Goel, T., Haftka, R. T., Shyy, W., and Watson, L. T., "Pitfalls of Using a Single Criterion for Selecting Experimental Designs," International Journal for Numerical Methods in Engineering, Vol. 75, No. 2, 2008, pp. 127–155. doi:10.1002/nme.2242

[27] Dey, A., and Mahadevan, S., "Ductile Structural System Reliability Analysis Using Importance Sampling," Structural Safety, Vol. 20, No. 2, 1998, pp. 137–154. doi:10.1016/S0167-4730(97)00033-7

[28] Wang, G. G., "Adaptive Response Surface Method Using Inherited Latin Hypercube Design Points," Journal of Mechanical Design, Vol. 125, No. 2, 2003, pp. 210–221. doi:10.1115/1.1561044

[29] Xiong, Y., Chen, W., and Tsui, K., "A New Variable Fidelity Optimization Framework Based on Model Fusion and Objective-Oriented Sequential Sampling," Journal of Mechanical Design, Vol. 130, No. 11, 2008, Paper 111401. doi:10.1115/1.2976449

[30] Choi, K. K., and Kim, N. H., Structural Sensitivity Analysis and Optimization, Springer, New York, 2005.

A. Messac, Associate Editor
