
Bayesian assessment of the expected data impact on prediction confidence in optimal sampling design




P. C. Leube,1 A. Geiges,1 and W. Nowak1

Received 25 October 2010; revised 7 December 2011; accepted 13 December 2011; published 1 February 2012.

[1] Incorporating hydro(geo)logical data, such as head and tracer data, into stochastic models of (subsurface) flow and transport helps to reduce prediction uncertainty. Because of financial limitations for investigation campaigns, information needs toward modeling or prediction goals should be satisfied efficiently and rationally. Optimal design techniques find the best one among a set of investigation strategies. They optimize the expected impact of data on prediction confidence or related objectives prior to data collection. We introduce a new optimal design method, called PreDIA(gnosis) (Preposterior Data Impact Assessor). PreDIA derives the relevant probability distributions and measures of data utility within a fully Bayesian, generalized, flexible, and accurate framework. It extends the bootstrap filter (BF) and related frameworks to optimal design by marginalizing utility measures over the yet unknown data values. PreDIA is a strictly formal information-processing scheme free of linearizations. It works with arbitrary simulation tools, provides full flexibility concerning measurement types (linear, nonlinear, direct, indirect), allows for any desired task-driven formulation, and can account for various sources of uncertainty (e.g., heterogeneity, geostatistical assumptions, boundary conditions, measurement values, model structure uncertainty, a large class of model errors) via Bayesian geostatistics and model averaging. Existing methods fail to provide these crucial advantages simultaneously, which our method buys at relatively higher computational costs. We demonstrate the applicability and advantages of PreDIA over conventional linearized methods in a synthetic example of subsurface transport. In the example, we show that informative data are often invisible to linearized methods that confuse zero correlation with statistical independence. Hence, PreDIA will often lead to substantially better sampling designs. Finally, we extend our example to specifically highlight the consideration of conceptual model uncertainty.

Citation: Leube, P. C., A. Geiges, and W. Nowak (2012), Bayesian assessment of the expected data impact on prediction confidence in optimal sampling design, Water Resour. Res., 48, W02501, doi:10.1029/2010WR010137.

1. Introduction

[2] Satisfying the requirements for a sustainable and safe water supply in an economic context has led to increased challenges in monitoring and site characterization. Like many other fields in engineering, hydro(geo)logical modeling applications suffer from high prediction uncertainty. This limits the reliability of deterministic flow and transport predictions and has led to the widespread use of stochastic techniques. Collecting hydro(geo)logical data, such as water levels, flow velocities, head and tracer data, provides additional information. Incorporating these into stochastic models helps to reduce parameter, model, and prediction uncertainty. Yet, sampling and investigation campaigns are restricted by limited budgets or by physical constraints and therefore should be addressed in a rational and optimal way.

[3] This leads to the optimal design (OD) problem of finding the best sampling design or investigation strategy for the given problem at hand, i.e., the one that maximizes some kind of utility function. The impact or utility of a design is defined as its individual capability to reduce uncertainty associated with a prediction goal, or to maximize some related measure of data utility [e.g., Federov and Hackl, 1997; Uciński, 2005; Müller, 2007]. The key ingredients of OD are adequate statistical or stochastic methodologies that properly transfer the uncertainty in model structure and parameters to model predictions, while taking into account the impact of noisy measured and yet unmeasured (planned) data. For stochastic problems in hydrology and heterogeneous subsurface environments, the literature provides a range of methods that will be quickly reviewed below, and their adequacy for OD will be discussed.

[4] Addressing the optimal design problem within a linear error propagation framework can be accomplished using first-order second-moment (FOSM) methods. Successful applications with regard to optimal design are reported by, e.g., Kunstmann et al. [2002] and Cirpka et al. [2004]. The key advantage of FOSM is the straightforward and computationally efficient first-order propagation of mean and variance from parameters and data to predictions via sensitivity matrices [e.g., Kitanidis, 1995]. Being based on covariances, FOSM is limited to linear and weakly nonlinear problems [e.g., Schweppe, 1973]. Within geostatistical applications, the size of the involved autocovariance matrix of parameters can pose difficulties for large-grid models unless spectral methods are used [e.g., Nowak et al., 2003; Fritz et al., 2009]. For various sources of uncertainty, the most efficient way to obtain the sensitivity matrices is by employing adjoint states [e.g., Sykes et al., 1985; Sun, 1994; Cirpka and Kitanidis, 2001]. This requires intrusion into simulation codes and is often impossible for commercial software tools.

1Institute for Modeling Hydraulic and Environmental Systems (LH2)/SimTech, University of Stuttgart, Stuttgart, Germany.

Copyright 2012 by the American Geophysical Union. 0043-1397/12/2010WR010137

[5] A strictly nonintrusive and straightforward method which avoids the handling of vast autocovariance matrices is the ensemble Kalman filter (EnKF) [e.g., Evensen, 2007]. The EnKF has recently been extended toward geostatistical inversion and parameter estimation [e.g., Zhang et al., 2005; Nowak, 2009; A. W. Schöniger, W. Nowak, and H.-J. Franssen, Parameter estimation by ensemble Kalman filters with transformed data: Approach and application to hydraulic tomography, submitted to Water Resources Research, 2011]. Successful applications in OD can be found in, e.g., Herrera and Pinder [2005] and Zhang et al. [2005]. The main restriction of the EnKF is that it is optimal only for multi-Gaussian dependence among all involved variables [e.g., Evensen, 2007], which is equivalent to an implicit linearity assumption [e.g., Nowak, 2009; A. W. Schöniger, W. Nowak, and H.-J. Franssen, Parameter estimation by ensemble Kalman filters with transformed data: Approach and application to hydraulic tomography, submitted to Water Resources Research, 2011].

[6] For many cases, linear approaches in OD pose a severe restriction. Only for truly linear OD problems can best linear unbiased estimators (BLUE), such as kriging, and other linear statistical inference methods be applied [e.g., Müller, 2007]; their estimation variance is independent of the actual measurement values [e.g., Deutsch and Journel, 1997]. By contrast, for nonlinear problems, the estimation variance and more sophisticated measures of data utility depend on the actual values of measurements, which are still unknown prior to collection, i.e., during the OD procedure. Nonlinear systems require consideration of the entire range of possible measurement values, such that one can obtain the expected data impact on average over all the yet unknown future measurement values. Hence, for nonlinear problems, the conditional statistics have to be averaged over all possible measurement values [e.g., Freeze et al., 1992; Diggle and Ribeiro, 2007]. The problem gains further complexity when considering model structural uncertainty, which leads, among other approaches, to Bayesian model averaging [e.g., Hoeting et al., 1999].

[7] As a matter of principle, any sufficiently accurate conditioning method or Bayesian updating scheme can be employed in optimal design. This opens the path to all Monte Carlo (MC)-based methods. Most of the available MC-based techniques, i.e., the pilot point method (PP) [e.g., RamaRao et al., 1995] or sequential self-calibration (SSC) [e.g., Gomez-Hernandez et al., 1997], iteratively correct individual model realizations until they meet the data. However, they would need to solve an optimization problem for many realizations just in order to account for a single possible set of measurement values from one single suggested sampling pattern. This leads to an infeasible computational load, because nonlinear OD forces repeating the analysis for many (100 to 10,000) possible sets of measurement values, even per individual sampling pattern. This computational effort would again multiply with the number of different competing sampling patterns (typically 100 to 100,000).

[8] Another representative class of MC-based conditioning methods is the bootstrap filter (BF) or particle filter (PF) by Gordon et al. [1993]. The basic idea of the BF is to weight realizations according to their goodness of fit with given measurement values. The BF does not invest any effort in correcting individual realizations to meet a given data set. Therefore, repetition for many possible sets of data values per given design and for many possible designs to be compared is relatively cheap. We will base our optimal design method on the BF. BFs are technically the same as the generalized likelihood uncertainty estimator (GLUE) by Beven and Binley [1992] when using the latter with formal (e.g., Gaussian) likelihoods. However, from a philosophical perspective, BF and GLUE pursue rather different paths, and the informal likelihoods commonly associated with GLUE are often disputed in the scientific community [e.g., Mantovan and Todini, 2006]. In order to stay clear of this field of conflicting ideas, we will refer solely to the BF in the remainder of our work.

[9] Generally, all MC methods benefit from the fact that statistical features of arbitrarily high order are entirely preserved toward the prediction, provided the Monte Carlo sample is large enough. This is of particular interest when dealing with nonlinear model predictions, high-variability cases, or nonlinear task-driven optimal design formulations. To the authors' knowledge, no such OD framework for preposterior analysis has been published within the hydro(geo)logical community yet, other than by James and Gorelick [1994]. In our work, we will follow this principle, provide a method for improved convergence and a measure to assess its accuracy, and demonstrate its advantages over linearized methods in a synthetic example.

[10] Section 2 provides our concept and approach. Section 3 introduces the notation for a fully Bayesian OD framework, and section 4 presents our OD method. Section 5 applies the method to an illustrative example taken from subsurface contaminant transport, which is discussed in section 6.

2. Approach and Contributions

[11] In this work, we propose an extension of the bootstrap filter (BF) [Gordon et al., 1993] toward optimal design, i.e., toward the optimal planning of data acquisition, aimed at improved model confidence after conditioning a model on the data. We account for the fact that data values are unknown at the planning stage of optimal design (OD) by averaging over a sample of possible measured data values for each suggested sampling design. In the resulting method, called PreDIA(gnosis) (Preposterior Data Impact Assessor), we use only one single sample for both MC steps and perform an internal cross-breeding step. The sample of possible data sets and, likewise, the likelihood function may reflect both measurement and model errors. Under the assumption that both error types are multivariate Gaussian (typically uncorrelated), we analytically marginalize over possible measurement and model errors (see equations (B3) and (C2) in Appendices B and C). This acts like a kernel smoothing technique [e.g., Wand and Jones, 1995] and yields a substantial gain in computational time for the optimal design problem. The Gaussian assumption, however, poses no limitation to our methodological framework. Any kind of distribution is allowed, and the analytical marginalization shown in the appendices is a simple convolution between two error distributions that can be performed for many other combinations of parametric distributions of measurement and model errors.

[12] Because of the underlying strictly numerical MC framework, our method does not suffer from linearity assumptions and can handle all types of data in a similar fashion to Zhang et al. [2005] and Murakami et al. [2010]. Also, it is not limited to multinormality assumptions for the parameters, which are subject to increasing criticism [e.g., Fogg et al., 1998; Gomez-Hernandez and Wen, 1998; Bardossy and Li, 2008]. It opens the path to considering various sources of uncertainty, e.g., heterogeneity, geostatistical assumptions, boundary conditions, measurement values, model structure uncertainty or uncertain model choice, and model errors with known parametric distributions. This helps to minimize the strength or even the necessity of possibly subjective prior assumptions on parameter distributions and modeling assumptions, which would be hard to defend prior to data collection [e.g., Chaloner and Verdinelli, 1995; Diggle and Ribeiro, 2007; Nowak et al., 2010]. The discussion of uncertain model choice and Bayesian model averaging [e.g., Hoeting et al., 1999] is postponed to section 4.4.

[13] Many OD approaches invoke the framework of data worth, expressing the benefit of reduced uncertainty in monetary terms [e.g., James and Gorelick, 1994; Feyen and Gorelick, 2005]. Liu et al. [2012] develop a framework to assess the value of information in dense nonaqueous phase liquid (DNAPL) remediation via a BF, also expressed in monetary terms. Within the current study, data impact is treated in a more general and not necessarily monetary way. In our illustrative test cases, we define data utility via statistical and information-theoretic measures. Possible statistical measures for the data impact include, e.g., the reduction of parameter or prediction variance, relative entropy [Cover and Thomas, 2006, p. 19], or the attainable significance level of hypothesis testing [Nowak, 2008]. Although monetarization is not the primary interest of this study, our framework could easily account for monetary considerations similar to Liu et al. [2012].

3. Design Problem and the Framework for Bayesian Analysis

[14] A design is a set of decision variables d that specify, e.g., the number, locations, types, and experimental conditions for measurements which shall be acquired in the vector of measurement values y(d). If s is a vector of uncertain model parameters, the objective of optimal design is to maximize, in some sense, the uncertainty reduction in the posterior distribution p(s | y(d)) before even knowing the actual values in y.

[15] Let s ~ p(s | θ) be an n_s × 1 vector of n_s parameter values (e.g., spatially variable parameters drawn from a discretized random space function, or more general catchment properties) with structural parameters θ (e.g., geostatistical parameters that define the distribution p(s), see section 4.4). The n_y × 1 vector of measurement values y(d) is related to the parameters s through a model function y(d) = f_y(s, ξ, k) + ε_y, where ε_y follows some distribution that accounts for the (typically white-noise) measurement error and sometimes also model structure error. In particular, the Gaussian model ε_y ~ N(0, R_ε), together with the assumption that the model structure error is independent, is in common use in many fields of science and engineering, including data assimilation [e.g., Evensen, 2007]. ξ is a set of meta parameters for the model function (e.g., less obvious model parameters such as boundary condition values), and k is a (set of) parameter(s) which accounts for conceptual discrete model choice (e.g., type of rainfall-runoff model, type of boundary condition, or location of a leaky window in an aquifer). R_ε is an n_y × n_y (typically diagonal) error covariance matrix. Possible types of measurements include direct observations of s (e.g., hydraulic conductivity values from grain size analysis) or indirect dependent quantities (e.g., drawdown data from well tests, tracer concentrations, or river stages).

[16] In case the different model structures indicated by k require different numbers of parameters, the length n_s of s will simply depend on k. In case ε_y also includes model error, its distribution and, specifically, its covariance matrix R_ε will also depend on k. In order to keep notation short, however, we do not reflect these dependencies in our notation.

[17] Then, for given parameters s, the residuals between simulated values f_y(s, ξ, k) and measurements y are assumed to be due only to the measurement and model error ε_y. This allows deriving the likelihood function. For instance, assuming Gaussianity for ε_y (similar to Feyen et al. [2003] or Christensen [2004]) yields the likelihood function L(y(d) | s, ξ, k, θ) = pdf_Normal(f_y(s, ξ, k), R_ε), where f_y(s, ξ, k) is the mean and R_ε is the covariance matrix. The conditional distribution of s given y(d) and all meta parameters ξ, k, θ is determined by Bayes' theorem:

$$p(s \mid \xi, k, \theta, y(d)) \propto L(y(d) \mid s, \xi, k, \theta)\, p(s \mid \xi, k, \theta). \quad (1)$$
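To make equation (1) concrete, the following is a minimal numerical sketch (our own illustration, not code from the paper; array names such as sim_data and R_eps are hypothetical) of how a Gaussian likelihood assigns unnormalized posterior weights to an ensemble of parameter realizations:

```python
import numpy as np

def gaussian_log_likelihood(y_obs, y_sim, R_eps):
    """log L(y | s, xi, k, theta) for one realization under the Gaussian model of equation (1).

    y_obs : (n_y,) observed (or hypothetical) data values y(d)
    y_sim : (n_y,) simulated data f_y(s, xi, k) for one realization
    R_eps : (n_y, n_y) error covariance matrix (typically diagonal)
    """
    r = y_obs - y_sim
    quad = r @ np.linalg.solve(R_eps, r)      # solve instead of inverting R_eps
    _, logdet = np.linalg.slogdet(R_eps)
    return -0.5 * (quad + logdet)             # additive constants cancel after normalization

# Toy ensemble: n realizations, each yielding n_y simulated data values.
rng = np.random.default_rng(0)
n, n_y = 1000, 3
sim_data = rng.normal(size=(n, n_y))          # stands in for f_y(s_i, xi, k)
y_hypothetical = np.array([0.5, -0.2, 0.1])   # one hypothetical data set
R_eps = np.diag(np.full(n_y, 0.01))

log_w = np.array([gaussian_log_likelihood(y_hypothetical, sim_data[i], R_eps) for i in range(n)])
w = np.exp(log_w - log_w.max())               # unnormalized weights per equation (1)
w /= w.sum()                                  # normalized posterior weights
```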

[18] Of course, assuming known meta parameters ξ, k, θ cannot be justified prior to extensive data collection (see section 4.4). The marginal distribution of s given only y(d), called a Bayesian distribution [Kitanidis, 1986], is

$$\tilde{p}(s \mid y(d)) \propto \int_{\xi, k, \theta} p(s \mid \xi, k, \theta, y(d))\, p(\xi, k, \theta \mid y(d))\, d(\xi, k, \theta), \quad (2)$$

where the tilde denotes the Bayesian probability. Note that the entire distribution p(s, ξ, k, θ) has been jointly conditioned on y(d).

[19] The final purpose is to predict a dependent n_z × 1 vector of predictions z (e.g., a contaminant concentration c) related to s, ξ, k, and θ via a physical model z = f_z(s, ξ, k, θ) (e.g., the transport equation). Typically, z does not have an additional independent stochastic component other than the ones discussed above. One could, however, easily replace the set of noise-free predictions z by noisy predictions z' using yet another error ε_z with an appropriate distribution. In a hydrological context, z might be a water level connected with a free-surface flow model. The conditional prediction ~p(z | y(d)) then becomes

$$\tilde{p}(z \mid y(d)) \propto \int_{s, \xi, k, \theta} p(z \mid s, \xi, k)\, p(s \mid \xi, k, \theta, y(d))\, p(\xi, k, \theta \mid y(d))\, d(s, \xi, k, \theta), \quad (3)$$

where p(z | s, ξ, k) is the raw distribution that reflects z = f_z(s, ξ, k).

[20] The objective of optimal design is to maximize the uncertainty reduction in the conditional prediction ~p(z | y(d)) or the conditional parameters ~p(s | y(d)). To this end, a task-specific uncertainty reduction measure or utility function φ{d} that works on ~p(z | y(d)) or ~p(s | y(d)) will be set up and maximized. Extensive lists of utility measures are discussed, e.g., in Federov and Hackl [1997], Chaloner and Verdinelli [1995], Müller [2007], and Nowak [2010]. Finally, we average the data utility function φ{~p(z | y(d))} (or analogously for the parameters) over the possible, yet unknown, measurement values via ~p(y(d)):

$$\phi\{d\} \propto \int_{y(d)} \phi\{\tilde{p}(z \mid y(d))\}\, \tilde{p}(y(d))\, dy(d), \quad (4)$$

where

$$\tilde{p}(y(d)) = \int_{s, \xi, k, \theta} L(y(d) \mid s, \xi, k, \theta)\, p(s, \xi, k, \theta)\, d(s, \xi, k, \theta). \quad (5)$$
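In practice, the integrals in equations (4) and (5) are evaluated numerically. As a hedged shorthand (ours, not the paper's exact derivation, which follows in section 4 and the appendices): drawing a realization $S_j \sim p(S)$ of all uncertain quantities and adding a random error $\varepsilon_j \sim N(0, R_\varepsilon)$ produces a sample $y_j(d) = f_y(S_j, \cdot) + \varepsilon_j$ from $\tilde{p}(y(d))$ in equation (5), so that

$$\phi\{d\} \approx \frac{1}{m} \sum_{j=1}^{m} \phi\{\tilde{p}(z \mid y_j(d))\}.$$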

[21] The optimal design d_opt will depend on both the available types and possible locations of the measurements and, of course, on the prediction goal and the task-specific formulation of the objective function. It will also depend on the level, character, and composition of uncertainties, and on the (suite of) model(s) used.

[22] Equations (4) and (5) reveal that the prior distributions significantly influence the overall results. It is also well known in Bayesian model averaging (BMA) that its results are directly conditional on the set of models considered in the analysis. Therefore, the selection of model choices k, structural parameters θ, and prior assumptions on all distributions should be as general (i.e., as minimally subjective) as possible, so as to reflect the situation prior to the investigation. Lack of information could, for example, result in a flat (as minimally subjective as possible) prior or even improper priors [Kass and Wasserman, 1996]. Please note that improper or flat priors in conjunction with BMA may cause severe problems [Hoeting et al., 1999]. Maximum entropy [Jaynes, 1957] or minimum relative entropy [Woodbury and Ulrych, 1993] can also be used to keep the prior description as general as possible. Other options are reference priors [Kass and Wasserman, 1996].

[23] The optimal design will address the entire suite of uncertainties according to their relevance toward the stated prediction goal. For example, the optimal design will be informative with respect to model choice only if reducing conceptual model uncertainty can significantly contribute to minimizing the chosen objective function φ{d}. More details on this topic are provided in section 4.4.

4. Method and Implementation

[24] In the following, we assume that a (suite of) model(s) that reflects a given system under investigation is already set up, with parameters already conditioned on all existing data and model weights already adjusted. Adequate methods for this task are abundant in the literature (some of them are listed in section 1) and shall not be enumerated here. The following sections 4.1 and 4.2 are concerned solely with the search for optimal new data acquisition strategies. In section 4.1, we first look at how to evaluate the impact of new data if the possible (future) data for a given design were already hypothetically known. For the reasons discussed in the introduction, we do this using the bootstrap filter (BF). Section 4.2 provides the necessary theoretical and numerical extensions of the BF that are required to assess the impact of future data, averaged over all possible and yet unknown data values for a given design. Section 4.3 discusses optimization of the design. Section 4.4 discusses implicit Bayesian model averaging, and sections 4.5 and 4.6 discuss convergence and computational costs.

4.1. Bootstrap Filter (BF)

[25] To simplify notation, we combine all individual parameter vectors s, ξ, θ, k in the augmented vector S (see also section 4.4). After this notational abbreviation, all model averaging, uncertain boundary conditions, etc., become implicit for the remainder of this study. For n realizations of S independently drawn from p(S) and a (hypothetically given) data set y_0, the BF would evaluate an n × 1 weight vector w according to equation (2), with w_i = p(S_i | y_0). Weighted averaging of n prediction realizations z_i = f_z(S_i), i = 1 ... n, would yield the posterior expectation

$$E_{z \mid y_0}[z] \approx \frac{1}{v_1} \sum_{i=1}^{n} z_i\, w_i, \quad (6)$$

and the weighted variance

$$V_{z \mid y_0}[z] \approx \frac{v_1}{v_1^2 - v_2} \left[ \sum_{i=1}^{n} z_i^2\, w_i - \left( \sum_{i=1}^{n} z_i\, w_i \right)^2 \right], \quad (7)$$

with $v_1 = \sum_{i=1}^{n} w_i$ and $v_2 = \sum_{i=1}^{n} w_i^2$. E_a[b] is the expected value of b over the distribution of a, and V_a[b] = E_a[b²] − E_a[b]² is the respective variance. Here, we approximate both quantities in the weighted sample sense and therefore employ v_1 and v_2 [Weiss, 2006, p. 355]. The corresponding correction factor in equation (7) resembles the well-known factor 1/(n − 1) for the nonweighted sample variance. This is an unbiased estimator of the population variance even for small effective sample sizes.
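As a minimal sketch of equations (6) and (7) (illustrative only; the arrays z and w are hypothetical, and the weights would come from a likelihood evaluation such as the one sketched in section 3; the weights are normalized first, so that v_1 = 1 and the formulas apply directly):

```python
import numpy as np

def weighted_posterior_stats(z, w):
    """Posterior mean and variance of z under BF weights, cf. equations (6) and (7).

    z : (n,) prediction realizations z_i = f_z(S_i)
    w : (n,) nonnegative (possibly unnormalized) BF weights w_i
    """
    w = w / w.sum()                 # normalize so that v1 = 1 below
    v1 = w.sum()
    v2 = (w ** 2).sum()
    mean = (z * w).sum() / v1                                              # equation (6)
    var = v1 / (v1 ** 2 - v2) * ((z ** 2 * w).sum() - (z * w).sum() ** 2)  # equation (7)
    return mean, var

# Toy usage with hypothetical predictions and weights.
rng = np.random.default_rng(1)
z = rng.normal(size=1000)
w = rng.random(size=1000)
print(weighted_posterior_stats(z, w))
```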

4.2. Preposterior Data Impact Assessor (PreDIA)

[26] Now, we move from conditioning on a hypothetically given vector of measurement values y_0 to the planning phase, where the actual vector of measurement values is yet unknown and only the design d may be fixed. This situation reflects the evaluation of a single design candidate d during the planning phase of optimal design, called the preposterior stage [e.g., James and Gorelick, 1994]. We evaluate the weight vector w (equation (2)) for m possible outcomes of the measurement vector y(d), drawn from ~p(y(d)). This is implemented by calculating the weight vector w for m realizations of y(d), yielding an n × m weight matrix W. A schematic illustration of this procedure is shown in Figure 1. For detailed derivations and the analytical marginalization over the unknown measurement error ε_y, we refer to Appendices A and B. An extension to model structure error is discussed in Appendix C. Now, weighted averaging over n predictions and m possible vectors of measurement values yields the expected conditional prediction variance

$$E_{y(d)}\{V_{z \mid y(d)}[z]\} \approx \frac{1}{m} \sum_{j=1}^{m} \left\{ \frac{v_{1,j}}{v_{1,j}^2 - v_{2,j}} \left[ \sum_{i=1}^{n} z_i^2\, W_{ij} - \left( \sum_{i=1}^{n} z_i\, W_{ij} \right)^2 \right] \right\}. \quad (8)$$

[27] In a similar fashion, other utility functions that act on p(z | y(d)) can be evaluated numerically. As the alert reader may have noticed, equation (8) might consume large CPU resources. This is caused by the need to average over the yet unknown measurement values. Hence, we suggest keeping the sample size m for potential measurement values much smaller than the sample size n for potential measurement simulations, since an accurate expectation over y(d) requires fewer realizations than the conditional variance of z (also see section 4.5).
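A minimal sketch of the core PreDIA step in equation (8), assuming the n × m weight matrix W has already been computed (all names are illustrative; in the paper, W follows from the marginalized likelihood of Appendices B and C):

```python
import numpy as np

def expected_conditional_variance(z, W):
    """Expected preposterior prediction variance, cf. equation (8).

    z : (n,) prediction realizations z_i
    W : (n, m) weight matrix; column j holds the BF weights for the j-th
        possible (yet unknown) data set y_j(d)
    """
    W = W / W.sum(axis=0, keepdims=True)   # normalize each column, so v1_j = 1
    v1 = W.sum(axis=0)
    v2 = (W ** 2).sum(axis=0)
    mean_j = z @ W                                               # (m,) conditional means
    var_j = v1 / (v1 ** 2 - v2) * ((z ** 2) @ W - mean_j ** 2)   # conditional variances
    return var_j.mean()                                          # average over the m data sets

# Toy usage: n model realizations, m possible data sets (m << n, as recommended above).
rng = np.random.default_rng(2)
n, m = 5000, 200
z = rng.normal(size=n)
W = rng.random(size=(n, m))
print(expected_conditional_variance(z, W))
```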

4.3. Optimization Scheme

[28] In order to find the best among a set of allowable designs d, a utility function based on equation (8) has to be maximized:

$$d_{\mathrm{opt}} = \arg\max_{d \in D}\, [\phi\{d\}], \quad (9)$$

where D is the space of admissible designs. The space of admissible designs may be approximated as a grid of allowable locations at a certain series of time steps, restricted in space by physical and practical constraints, and restricted in total by allowable costs. The allowable space is, in most cases, extremely large, and it is often impossible to systematically evaluate all possible combinations.

[29] Still, the above optimization can be managed by a broad spectrum of optimization schemes that do not scan the entire design space. Available options include sequential exchange (SE) or greedy search (GS) [Christakos, 1992, p. 411], or heuristic methods such as genetic algorithms (GA) [e.g., Goldberg, 1989; Reed et al., 2000]. Because OD is a nonlinear case of extremely high-dimensional optimization, none of these can guarantee finding the global optimum. Heuristic methods introduce additional parameters that control the algorithm and the quality of the results, while SE is an algorithm that at least leads to Pareto-optimal designs. Although we will use SE in our later example, our method does not require or prefer any specific choice for the optimization algorithm.
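The following sketches one simple greedy (sequential placement) variant for equation (9); it is only one of the algorithmic options named above, not the paper's exact scheme, and data_utility stands for a user-supplied evaluation of equation (8) for a candidate design:

```python
def greedy_design(candidates, n_samples, data_utility):
    """Greedily build a design by adding, one at a time, the candidate that
    maximizes the utility of the augmented design (one possible SE/GS variant).

    candidates   : list of allowable measurement locations/types
    n_samples    : number of measurements to place
    data_utility : callable(design) -> float, e.g., the negative expected
                   conditional prediction variance of equation (8)
    """
    design, remaining = [], list(candidates)
    for _ in range(n_samples):
        best = max(remaining, key=lambda c: data_utility(design + [c]))
        design.append(best)
        remaining.remove(best)
    # A full sequential exchange scheme would now repeatedly try to swap each
    # placed sample against unused candidates until no swap improves the utility.
    return design


def toy_utility(design):
    """Hypothetical utility: prefer locations close to (2, 2)."""
    return -sum((x - 2) ** 2 + (y - 2) ** 2 for x, y in design)


if __name__ == "__main__":
    cand = [(x, y) for x in range(5) for y in range(5)]
    print(greedy_design(cand, 3, toy_utility))
```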

4.4. Implicit Bayesian Model Averaging and Other Uncertainties

[30] The lack of information is ubiquitous in environmental simulation problems, especially prior to a planned investigation effort. Given our quite general inability to fully validate the underlying processes occurring in the geosciences [Oreskes et al., 1994], the selection of a single geostatistical, structural, or conceptual model is often unjustifiable. Instead, a selection or an entire spectrum of models may compete. Any corresponding assumptions to restrict oneself to a unique model selection would be hard to defend prior to deeper investigation and data acquisition. To reduce the subjectivity of such prior assumptions, one may admit different model alternatives and weight them according to their a priori credibility. The modeling task is performed with all model alternatives, and posterior credibility values are assigned after comparison with available data. This procedure is called Bayesian model averaging (BMA) [e.g., Hoeting et al., 1999; Neuman, 2003].

[31] A recent advancement of BMA shifts the problem of discrete model choice to a continuous parameter problem, introducing a continuous spectrum of model alternatives [Nowak et al., 2010]. These authors parameterized the choice among different covariance models via a shape parameter that allows embracing the desired different models as special cases. They referred to this as continuous BMA. Continuous BMA should be seen as an elegant alternative to BMA, if applicable, but not as a limitation to general BMA.

Figure 1. Schematic illustration of PreDIA (Preposterior Data Impact Assessor) enveloping the bootstrap filter (BF). Both are nested together in the optimization procedure.

[32] Following the above rationales, our method can account for various additional sources of uncertainty. We distinguish between uncertainties of structural parameters θ related to potentially involved geostatistical models, uncertainties of boundary/initial condition parameters ξ associated with each physical/conceptual model (in geostatistical inversion, this was first suggested by Kitanidis [1995], while hydrology, oceanography, and the atmospheric sciences are much more used to considering them uncertain [e.g., Evensen, 2007]), and uncertain conceptual model selections within k, e.g., uncertain zonation or model forcing. All uncertainties and choices can be handled either discretely or, often, continuously.

[33] Note that, in our formulation, averaging over the unknown meta parameters θ, boundary/initial conditions ξ, and model choice indicators k is done implicitly by averaging over the augmented parameter vector S. We refer to this as implicit Bayesian model averaging. In section 5, we will show an example that exclusively addresses uncertain conceptual model selections.

4.5. Convergence

[34] In order to obtain robust Monte Carlo (MC) statistics, a sufficient number of realizations is needed. MC statistics (e.g., mean and variance) converge at a rate proportional to N^(-1/2), where N is the sample size [e.g., Caflisch, 1998; Ballio and Guadagnini, 2004]. That is why averaging an objective function such as equation (8) over possible data values y(d) is relatively unproblematic. Variance reduction schemes [e.g., Russell, 1986] can help to further improve the situation. Diggle and Lophaven [2006] and Neuman et al. [2012], for example, found empirically that their statistics stabilize at n = 100 or n = 200 randomly drawn data sets when averaging conditional variances for a given design.

[35] This is drastically different for computing the conditional variance of z itself, for an individual possible set of data values. The bootstrap filter (BF) or particle filter (PF) assigns likelihoods to the individual realizations in order to reflect conditioning. If one of the likelihoods is much larger than all of the other likelihood values, the posterior statistics are approximated by only a single point of probability mass. This impacts negatively on the accuracy of all posterior statistics and occurs mostly when conditioning on large data sets with small measurement errors. These limitations, also called the "curse of dimensionality" or "filter degeneracy," have been the scope of many studies in the past [e.g., Liu, 2008; Snyder et al., 2008; Van Leeuwen, 2009]. One common way to avoid filter degeneracy is resampling, where realizations with low weights are discarded and additional realizations are generated in order to stabilize the posterior estimation. Filter degeneracy can be measured by the effective sample size (ESS) [Liu, 2008], which, expressed in plain terms, counts the number of weights that still contribute substantially to the posterior estimation:

$$\mathrm{ESS} = \left( \sum_{i=1}^{n} W_i^2 \right)^{-1}. \quad (10)$$

[36] Obviously, PreDIA inherits the curse of dimensionality. Resampling strategies do not make sense in the context of PreDIA, because resampling from all possible posterior distributions is equivalent to enlarging the entire sample right from the start. However, the problem of filter degeneracy is mitigated to a substantial extent because we analytically marginalize the likelihood function over measurement and model errors of predicted potential data values. This widens the likelihood (see Appendix B), acting like a kernel smoothing technique [e.g., Wand and Jones, 1995], and so averages out statistical noise in conditional variances. It is important to understand that this marginalization is not an assumption or approximation that would artificially weaken the data or compromise the Bayesian framework. It is merely an analytical step within an otherwise numerical framework that takes advantage of the two Gaussian distributions. The kernel smoothing properties of this step are a crucial aspect of PreDIA, because informative and low-error data sets are exactly what one desires to have and to optimize in applications.
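To illustrate the widening effect described above: under the stated Gaussian assumptions, marginalizing over the unknown error attached to a simulated potential data value amounts to convolving two Gaussians, i.e., summing their covariances. A minimal sketch of this idea (our own illustration with hypothetical names; the exact expressions are equations (B3) and (C2) in the appendices):

```python
import numpy as np

def marginalized_gaussian_log_likelihood(y_sim_j, y_sim_i, R_actual, R_potential):
    """Log-likelihood of realization i for a potential data set generated from
    realization j, with the error of the potential data marginalized out.

    Convolving two zero-mean Gaussian error distributions yields a Gaussian
    whose covariance is the sum R_actual + R_potential; this widened covariance
    is what acts like kernel smoothing and mitigates filter degeneracy.
    """
    R = R_actual + R_potential
    r = y_sim_j - y_sim_i
    quad = r @ np.linalg.solve(R, r)
    _, logdet = np.linalg.slogdet(R)
    return -0.5 * (quad + logdet)
```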

[37] In order to quantify the filter degeneracy of PreDIA, we average the effective sample size over all possible data values to obtain the averaged effective sample size (AESS):

$$\mathrm{AESS} = \frac{1}{m} \sum_{j=1}^{m} \left( \sum_{i=1}^{n} W_{ij}^2 \right)^{-1}, \quad (11)$$

and record it during the optimization procedure. Values that are too low indicate that the current analysis requires a larger sample.
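A direct transcription of equations (10) and (11), assuming normalized weights (illustrative; W is the same n × m weight matrix used for equation (8)):

```python
import numpy as np

def ess(w):
    """Effective sample size of one (normalized) weight vector, equation (10)."""
    w = w / w.sum()
    return 1.0 / (w ** 2).sum()

def aess(W):
    """Averaged effective sample size over all m possible data sets, equation (11)."""
    W = W / W.sum(axis=0, keepdims=True)
    return (1.0 / (W ** 2).sum(axis=0)).mean()
```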

4.6. Impact of Complexity on Computational Costs

[38] The more freedom in model and parameter choice is admitted, the more complex the model may become. As a general property of MC, this does not affect the convergence rate of MC statistics, unless the involved statistical distributions of model predictions become more complex, e.g., exhibiting heavy tails. Therefore, the convergence properties of the BF and related methods may depend on the variance and higher moments of simulated data, but do not generally depend on model complexity. However, when looking at more complex problems, e.g., 3-D instead of 2-D, two additional problems may arise.

[39] First, the effects of the curse of dimensionality may be different, because the multivariate structure of possible data may become more complex. For example, when sampling only hydraulic conductivity or hydraulic heads in 3-D versus 2-D, their spatial variations remain, in principle, the same. In such cases, the number of required realizations for MC, BF, or PreDIA does not increase. For concentration measurements, however, the multivariate structure in 3-D is much more complex than in 2-D, because a transported solute plume has more spatial degrees of freedom in 3-D. In such cases, the required number of realizations may increase at a higher rate as the number of considered measurements increases. To guarantee proper preposterior statistics, the AESS has to be monitored carefully.

[40] Second, the design space increases when switching to 3-D or when optimizing the schedule of an experimental design for dynamic systems rather than static patterns. With more potential measurement locations, the burden of high-dimensional optimization in equation (9) increases. This is, however, not an artifact of our proposed PreDIA method, but rather a general problem shared by all OD methods. This problem will require additional future research on adequate optimization algorithms.

[41] As for implicit BMA, it is important to stress that PreDIA can, in principle, handle uncertain discrete choice between structurally different model alternatives at no additional conceptual cost within the data analysis (again, see section 4.4). Our method merely requires convergence of the overall (combined) sample statistics and not necessarily individual convergence for each single considered model.

5. Application

[42] To demonstrate the properties, advantages, and relevance of our method, we design a synthetic application which involves a groundwater contamination scenario. We choose an example from subsurface contaminant transport, since this allows illustrating aspects of model choice and uncertain boundary conditions, and because subsurface transport reveals strongly nonlinear and well-researched dependencies between predictions and parameters.

5.1. Scenario Definition and Configuration

[43] The synthetic scenario assumes a drinking water well or a similar sensitive location threatened by a recent but continuous contaminant source located upstream. The goal is to find the sampling pattern which optimally reduces the uncertainty of predicting the long-term (steady state) contaminant concentration to be expected at the sensitive location.

[44] For simplicity, but not as a limitation of our methodology, the test case considers steady state groundwater flow in a 2-D depth-averaged heterogeneous aquifer without sources or sinks:

$$\nabla \cdot [T(\mathbf{x}) \nabla h] = 0, \quad (12)$$

where T [L²/t] is the locally isotropic transmissivity and h [L] is the hydraulic head. All flow boundaries are specified as Dirichlet boundaries with h(x). Contaminant transport from a continuous source at late time (steady state) is modeled according to

$$\mathbf{v} \cdot \nabla c - \nabla \cdot (\mathbf{D}_d \nabla c) = 0, \quad (13)$$

with concentration c [M/L³], velocity v [L/t] = q/n_e, Darcy-specific flux q, porosity n_e, and pore-scale dispersion tensor D_d [L²/t] according to Scheidegger [1954]. Transport boundary conditions are specified as Dirichlet boundaries with c(x) = 0 at all outer boundaries and c(x) = c_0 at the source. For simplicity, we normalize the problem to c_0 = 1. All known flow and transport parameters and their values are summarized in Table 1. An overview of the domain geometry, including the source and the sensitive location, is provided in Figure 2. The source location (x_S, y_S) = (80 m, 80 m) and its width l_s = 20 m are assumed to be known. The sensitive location is at (x_W, y_W) = (180 m, 86 m). Within the geostatistical concept provided below, this is about seven expected integral scales downstream from the contaminant source and about half an integral scale offset from the centerline of the expected plume path.

[45] Figure 2 also depicts a possibly present hydraulic barrier (width = 10 m) due to uncertainty in the geological medium boundaries. For the sake of scenario variation, we assume that local hydrogeologists are uncertain about the extent of a narrow zone filled with a different geological facies, which might be present in that area. For simplicity, we implement this as a rectangle with a different mean value of log transmissivity, ln T(x) = ln 10^-7. The prior probability of this alternative model is set to 30%. Please note that the possibly present barrier is considered only in one scenario variation.

[46] In our example, we consider the global transmissivity field to be uncertain, following a geostatistical approach. As an example of model choice, the geostatistical model will be kept uncertain. We define T(x) as a discretized random space function represented by cell-wise values within the parameter vector s. Following classical geostatistical ideas, we use E[s] = Xβ with deterministic trend functions X and trend coefficients β_1, β_2, and assume that s' = s − E[s] is second-order stationary with a covariance function C(h) that depends only on the separation vector h.

[47] Bayesian geostatistics [Kitanidis, 1986] opens the path to a more general consideration of uncertainty. Let θ = (σ², λ_i, κ) contain the uncertain structural parameters of the geostatistical model, where σ² accounts for the field variance and λ_i are the correlation length scales in the spatial directions x_i. Several recent studies, including Feyen et al. [2003], Nowak et al. [2010], and Murakami et al. [2010], suggest using the Matérn covariance function [Matérn, 1986]:

$$C(l) = \frac{\sigma^2}{2^{\kappa-1}\,\Gamma(\kappa)} \left(2\sqrt{\kappa}\, l\right)^{\kappa} B_{\kappa}\!\left(2\sqrt{\kappa}\, l\right), \qquad l = \sqrt{\left(\frac{\Delta x_1}{\lambda_1}\right)^2 + \left(\frac{\Delta x_2}{\lambda_2}\right)^2}, \quad (14)$$

with gamma function Γ(·). B_κ(·) is the modified Bessel function of the third kind [Abramowitz and Stegun, 1972]. The additional shape parameter κ controls the shape of the covariance function; e.g., κ = 0.5 yields the exponential model, and the Gaussian model is approached in the limit of large κ. The benefits of the Matérn family have been discussed extensively by, e.g., Handcock and Stein [1993] and Diggle and Ribeiro [2002]. The relevance of the Matérn family within Bayesian geostatistics has recently been pointed out by Nowak et al. [2010]. This is an example of continuous Bayesian model averaging, where model structural uncertainty can be mapped onto parametric uncertainty (see section 4.4). Our intention for this choice is to illustrate the application of continuous BMA within BF and PreDIA.

Table 1. Known Parameters for the Flow, Transport, and Geostatistical Model

Numerical domain
  Domain size [L1, L2] [m]: [320, 220]
  Grid spacing [Δ1, Δ2] [m]: [0.25, 0.125]
Design domain
  Domain size [L1, L2] [m]: [260, 160]
  Grid spacing [Δ1, Δ2] [m]: [2, 2]
Transport parameters
  Head gradient [-]: 0.01
  Effective porosity n_e [-]: 0.35
  Local-scale dispersivities (longitudinal, transverse) [m]: [0.5, 0.125]
  Diffusion coefficient D_m [m² s⁻¹]: 10⁻⁹
  Transversal plume dimension l_s [m]: 20
Known geostatistical model parameters
  Global mean β_1 = ln T [-]: ln 10⁻⁵
  Trend in mean β_2 [-]: 0
Measurement error standard deviations
  Transmissivity σ_r,T [-]: 1.00
  Hydraulic head σ_r,h [m]: 0.01
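A numerical sketch of the Matérn covariance in equation (14), using the standard scipy special functions (the parameterization follows the equation above; function and argument names are illustrative):

```python
import numpy as np
from scipy.special import gamma, kv  # kv = modified Bessel function K_kappa

def matern_covariance(dx1, dx2, sigma2, lambda1, lambda2, kappa):
    """Matérn covariance C(l) of equation (14) for a separation (dx1, dx2)."""
    l = np.sqrt((np.asarray(dx1) / lambda1) ** 2 + (np.asarray(dx2) / lambda2) ** 2)
    arg = 2.0 * np.sqrt(kappa) * l
    c = sigma2 / (2.0 ** (kappa - 1.0) * gamma(kappa)) * arg ** kappa * kv(kappa, arg)
    # kv diverges at zero separation; the limit of C(l) for l -> 0 is sigma2.
    return np.where(l == 0.0, sigma2, c)

# Example: kappa = 0.5 (exponential special case) at one integral scale separation.
print(matern_covariance(15.0, 0.0, sigma2=2.0, lambda1=15.0, lambda2=15.0, kappa=0.5))
```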

[48] The values for the Dirichlet flow boundary condition are determined by two uncertain angles, which define the regional head gradient via its slope and its orientation relative to the northern/southern boundaries. All uncertain geostatistical parameters θ, boundary parameters ξ, and conceptual models k are drawn from their respective distributions (see Table 2).

[49] Concentrations c are considered not to be available as measurement data, because the spill has just happened and the plume has not evolved yet. Instead, head and transmissivity data shall be optimally collected in order to maximize the uncertainty reduction associated with the prediction based on equation (8). We define transmissivity T and hydraulic head h, with measurement errors σ_r,T and σ_r,h, respectively, to be measurable at the point scale, e.g., by disturbed core samples and small monitoring wells. For instructive reasons, we decide not to sample transmissivity T at the same locations as hydraulic head h by default, since this will help to better exhibit and discuss the underlying physics associated with the respective choice of location and data type. Locations where T is informative may not be informative for h measurements, because different physical flow and transport-related phenomena may coordinate the individual data types to different informative locations. However, our framework could easily handle constraints such that T and h measurement locations have to coincide.

[50] The implementation is done in the MATLAB environment. For generating geostatistical random fields and simulating groundwater flow and solute transport, we use the same code already used by Nowak et al. [2008, 2010]. A large sample size of 50,000 realizations has been chosen to ensure that our discussion of the method and resulting designs is not compromised by statistical noise. We use the sequential exchange algorithm (see section 4.3) in order to optimize the design, and the utility of each design candidate is evaluated with PreDIA, based on equation (8) as an objective function.

5.2. Scenario Variations

[51] We consider two different scenario objectives. They will serve to show that PreDIA can include arbitrary prediction goals regardless of their nonlinearity and that it can also include arbitrary task-driven formulations. A third scenario then exclusively addresses the consideration of conceptual model uncertainty (not done in cases 1a and 2b), i.e., via incorporating a hydraulic barrier:

[52] 1. Minimum-variance prediction of a contaminant concentration c at the sensitive location. To emphasize the difference to conventional linear methods, we compare the results of our method to results from an ensemble Kalman filter (EnKF) [e.g., Herrera and Pinder, 2005; Evensen, 2007]. Therefore, we run the first scenario with PreDIA (case 1a) and compare the results to a sampling pattern obtained from an EnKF (case 1b);

Figure 2. Spatial configuration of the synthetic test case involving a contaminant release at (x_S, y_S) = (80 m, 80 m) with source width l_s = 20 m, and prediction target at (x_W, y_W) = (180 m, 86 m). For further details, see Table 1.

Table 2. Uncertain Structural and Boundary Parameters and Their Assigned Distributions

Uncertain structural parameters θ
  Variance σ²_T [-]: N(m = 2.0, σ = 0.3)
  Integral scale λ [m]: N(m = 15, σ = 2.0)
  Matérn shape κ [-]: U(a = 5, b = 36)
Uncertain boundary parameters ξ
  Deviation from center [°]: N(m = 0.0, σ = 10)

[53] 2. Maximum-confidence prediction of whether a critical concentration threshold will be exceeded. This is equivalent to predicting an indicator quantity z = I(c > c_crit), with E[I] = P(c > c_crit) (a brief sketch of this indicator-based utility follows this list). Since the indicator is a discrete variable that depends very nonlinearly on the model parameters, it does not meet the requirements under which EnKFs could be used for comparison. Instead, two threshold values are considered with PreDIA: c_crit = P15 (case 2a) and c_crit = P75 (case 2b), where P15 and P75 are the c-values below which 15% and 75% of the c-values may be found, respectively; and

[54] 3. Consideration of a hydraulic barrier and minimum-variance prediction of a contaminant concentration c at the sensitive location (case 3).
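As referenced in scenario 2 above, predicting exceedance amounts to treating the indicator I(c > c_crit) as the prediction target z. A minimal sketch of how its expected conditional variance plugs into the same PreDIA machinery (illustrative names; the plain conditional variance of an indicator is the Bernoulli variance p(1 − p), and the bias correction of equation (8) could be applied analogously):

```python
import numpy as np

def expected_indicator_variance(c, c_crit, W):
    """Expected conditional variance of z = I(c > c_crit), in the spirit of equation (8).

    c      : (n,) simulated concentrations at the sensitive location
    c_crit : critical concentration threshold
    W      : (n, m) PreDIA weight matrix over m possible data sets
    """
    z = (c > c_crit).astype(float)            # indicator realizations
    W = W / W.sum(axis=0, keepdims=True)      # normalize each column of weights
    p = z @ W                                 # (m,) conditional exceedance probabilities
    return (p * (1.0 - p)).mean()             # Bernoulli variance, averaged over the data sets
```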

6. Results and Discussion

[55] In section 6, we present and discuss the sampling patterns resulting from the synthetic test case and its variations defined in section 5.

6.1. Sampling Pattern Optimized for Predicting Concentration (Case 1a)

[56] Case 1a features optimal sampling for minimum-variance prediction of concentrations at the sensitive location. Figure 3 shows the respective variances of T, h, and c prior to investigation. The resulting sampling pattern, obtained with PreDIA, is shown in Figure 4. We have also included in Figure 4 the expected conditional variance of transmissivity (top), hydraulic head (center), and predicted concentration (bottom), according to equation (8). The basic characteristics of the design pattern mostly coincide with the results found in the work of Nowak et al. [2010], who considered a similar scenario. However, there are important differences, since they used an EnKF and we employ PreDIA. With regard to the sampling pattern, we find two predominant groups: measurements gathering around the source and measurements flanking the expected migration path of the plume. Near-source measurements occur exclusively as transmissivity measurements. They are highly informative since they provide information about the volumetric flow rate through the source area. The flow rate through the source, in turn, is a dominant factor that dictates the total contaminant mass flux, the expected width, and the dispersion characteristics of the plume further downstream [de Barros and Nowak, 2010].

[57] The measurements flanking the plume are head measurements, which capture both the large-scale drift of the plume (due to the uncertain regional head gradient) and the mesoscale meandering of the plume (caused by heterogeneity).

[58] In principle, the prediction task leads to information needs that manifest themselves most in those regions where the statistical dependency between the measurable quantities (transmissivity or hydraulic head) and the prediction goal is highest, while avoiding measurements that are mutually too close and would merely convey redundant information. Figure 5 shows the statistical dependencies between observable quantities at potential measurement locations and the prediction target for a near-source transmissivity measurement location (A) and a near-boundary head measurement location (B). The statistical dependencies are obtained by plotting the sample of possible measurement values against the sample of predicted concentrations. Additionally, we illustrate the nonlinear dependency in the scatterplot by a moving average line.

[59] Obviously, T at the near-source location (A) has a mostly linear relation to the predicted concentration. The higher the transmissivity at the source, the higher is the source discharge and the broader is the plume, on average, after leaving the source. Therefore, the plume is far more likely to maintain high concentrations even over long travel distances and is more likely to hit the target [de Barros and Nowak, 2010].

[60] In contrast, h at the near-boundary location (B) exhibits a nonlinear dependency on the prediction goal. Extreme angles of the regional flow gradient divert the plume away from the target location, for both positive and negative values of the angle. By contrast, regional flow in the straight uniform direction drives the plume, most likely, through the target. The resulting dependency between hydraulic heads close to the boundary and the predicted concentration has an almost purely quadratic behavior and shows almost no correlation in a linear sense, i.e., it has almost zero covariance.

Figure 3. Prior uncertainties (variance) associated with (top) transmissivity, (middle) hydraulic head, and (bottom) concentration, based on the uncertain structural and boundary parameters listed in Tables 1 and 2.

[61] Figure 6 (left) illustrates how the individual transmissivity or hydraulic head measurements added during sequential design reduce the variance of the prediction goal and related physical quantities. The latter include the total solute mass flux through the source, the angle of the boundary condition (causing a large-scale drift), the width of the plume at the target (lateral spreading), and the lateral position of the plume's centroid (also affected by mesoscale meandering caused by heterogeneity).

[62] We can clearly see that transmissivity measurements located close to the source greatly reduce the prediction uncertainty of the total solute flux for this case, while the head measurements along the flanks are almost not informative for the total solute flux. Instead, the uncertainty of the boundary condition (regional flow direction) is greatly reduced by the head measurements, whereas the transmissivity measurements around the source contribute almost no related information. Likewise, the position of the plume center is revealed almost solely by head measurements. For the plume width at the prediction target, we find a sensitivity to both head and transmissivity measurements, where the first two transmissivity measurements at the source are clearly the most valuable ones.

[63] As addressed in section 4.5, we recorded the averaged effective sample size (AESS) during the optimization procedure in order to monitor and avoid filter degeneracy. Figure 6 (right) indicates the AESS (scale shown on the right axis) during the optimization scheme. The AESS drops from initially 50,000 to about 500. This is fully sufficient in order to calculate noise-free maps of the expected conditional variance (see Figure 6) and to evaluate the objective function reliably.

Figure 4. (a) PreDIA-based (case 1a) and (b) EnKF-based (case 1b) sampling patterns optimized for minimum prediction variance of concentration at the sensitive location. Notations are as follows: head measurements (crosses), transmissivity measurements (circles), source (box), and target (diamond). Maps in the background are expected preposterior variances for (top) transmissivity, (middle) hydraulic head, and (bottom) concentration.

6.2. Comparison to EnKF (Case 1b)

[64] The sampling pattern provided by the ensemble Kalman filter (EnKF) relies on exactly the same geostatistical and boundary parameters used in case 1a and hence uses the very same sample data. For technical insights into the EnKF formalism, please see Herrera and Pinder [2005] or Evensen [2007]. The resulting pattern is shown in Figure 4 (right column). The underlain maps of expected conditional variance are evaluated by PreDIA, because the maps provided by the EnKF are inaccurate and would not be comparable to those shown in the left part of Figure 4.

[65] Compared to the PreDIA-based sampling pattern (case 1a), we find again the group of transmissivity samples in the source area. However, the number of measurements in this group is much larger. The next fundamental difference from the PreDIA-based sampling pattern is that the group of head measurements at the northern and southern domain boundary is smaller, in favor of head measurements in the corners of the design domain. Apparently, the relevance of the variable boundary conditions that induce large-scale drift of the plume is also recognized, but judged differently by the EnKF analysis scheme.

[66] The EnKF assesses statistical dependencies only via covariances, which are a measure of linear dependence only. It is unable to capture even-order (e.g., quadratic) dependencies such as those between head measurements near the northern and southern boundary and the prediction goal (see Figure 5). Therefore, it simply ignores these head measurement locations as potential sources of valuable information. Hence, crucial information about the mesoscale meandering of the plume is neglected. However, four measurement locations were placed at the corners of the allowable design locations. Apparently, their nonlinear dependency exhibits a sufficiently large linear component due to the slight asymmetry of our setup.

Figure 5. Scatter density plots depicting the relation between the sample of predicted concentrations and (a) the sample of transmissivity values at a near-source location and (b) the sample of hydraulic head values at a near-boundary location. The solid line illustrates the relation via a moving average.

Figure 6. Expected variance reduction (equation (8)) for (left) PreDIA (case 1a) and (right) EnKF (case 1b) during initial placement of samples for different auxiliary quantities. The sequential exchange phase is not shown in detail and is indicated only by the gray lines. Hydraulic head measurements are denoted by crosses, and transmissivity measurements are denoted by circles. The right axis quantifies, for the PreDIA-based optimization, the respective averaged effective sample size (AESS).

[67] Overall, this leads to a significantly worse performance in reducing the uncertainty associated with the plume center, even though the EnKF captures the uncertain boundary condition reasonably well. This can be seen by comparing the expected conditional variances in Figure 6 (left and right). With a higher relative emphasis on the mostly linear source transmissivity information, the plume width and total solute flux are determined comparably well. Still, the overall prediction quality of concentration c is reduced by ignoring and misinterpreting nonlinear information, such that PreDIA clearly outmatches the EnKF. In our setup, PreDIA achieves 25% more uncertainty reduction than the EnKF with the same number of sampling positions.

[68] In more general terms, EnKFs and all linear(ized) methods can only measure correlation, which captures statistical dependence only incompletely. For example, a zero-mean, symmetrically distributed variable and its square are uncorrelated, yet the square is clearly not independent of the variable it was computed from. Hence, the limitations of the linear(ized) methods illustrated in our specific example carry over to all nonlinear applications.
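To make this distinction concrete, the short sketch below (our own numerical illustration) draws a symmetric, zero-mean sample and shows that its correlation with its own square is essentially zero even though the two quantities are deterministically related:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)   # zero-mean, symmetric variable
y = x ** 2                         # purely even-order (quadratic) dependence

# Linear view: Cov(x, x^2) = E[x^3] = 0 for a symmetric distribution, so the
# sample correlation is near zero and a covariance-based filter sees nothing.
print("corr(x, y)        =", np.corrcoef(x, y)[0, 1])

# Nonlinear view: conditioning on x strongly constrains y, so the conditional
# spread of y is far smaller than its unconditional spread.
print("var(y)            =", y.var())
print("var(y | |x| < .1) =", y[np.abs(x) < 0.1].var())
```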

6.3. Sampling Patterns Optimized for Predicting Exceedance Probability (Cases 2a and 2b)

[69] In this test case, we desire a maximum-confidence prediction of whether a critical concentration value (e.g., imposed by a regulatory threshold) will be exceeded or not. The PreDIA-based sampling patterns for cases 2a and 2b are shown in Figure 7, again obtained from the same sample.
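One way to cast such an exceedance objective numerically is sketched below. This is our own schematic reading (function and variable names are hypothetical, not taken from the paper): each simulated concentration is converted into a binary exceedance indicator, and the weighted variance of that indicator serves as the prediction uncertainty to be minimized.

```python
import numpy as np

def exceedance_indicator_variance(c_samples, weights, c_crit):
    """Weighted variance of the exceedance indicator z = 1[c > c_crit].

    c_samples : simulated concentrations at the sensitive location, one per realization
    weights   : normalized (pre)posterior weights of the realizations
    c_crit    : critical concentration, e.g., a regulatory threshold
    """
    z = (c_samples > c_crit).astype(float)
    p = np.sum(weights * z)      # weighted exceedance probability
    return p * (1.0 - p)         # Bernoulli variance of the indicator

# usage sketch with an equal-weight (prior) sample and a threshold at the 85th percentile
rng = np.random.default_rng(1)
c = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)
w = np.full(c.size, 1.0 / c.size)
print(exceedance_indicator_variance(c, w, c_crit=np.percentile(c, 85)))
```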

[70] Case 2a (c_crit = P15) exhibits a sampling pattern which is mainly based on head measurements at near-boundary and toward-target locations. Transmissivity measurements exploring the source region are practically absent. For predicting low threshold values, it is only important, and therefore sufficient, to know that the plume misses the sensitive location. This information is obtained by head measurements flanking the plume, which can reveal transverse gradients that could divert the plume from hitting the sensitive location.

[71] Case 2b (c_crit = P85) shows an inverted behavior, where the source is sampled repeatedly using six transmissivity samples that are hardly distinguishable in Figure 7. Two additional transmissivity samples north of the source support the near-source samples by addressing the contrast in transmissivity between the source and its surroundings. Instead, head measurements closely flanking the plume are disregarded. This is a direct consequence of the different information needs between cases 2a and 2b. For high threshold values, it is necessary to know whether the plume preserves its initial peak concentration over large travel distances up to the sensitive location. Highly conductive sources favor this behavior and can be identified by increasing the source sampling density. In addition, highly conductive sources statistically imply an increased downstream plume width. With the plume sufficiently wide, the chances of bypassing the sensitive location by mesoscale meandering decrease, and only a globally rotated mean flow direction can prevent the plume from hitting the sensitive location. That is the reason why the head measurements related to transverse gradients do not closely flank the plume, and why there are more remote head samples at the northern and southern boundaries that help to infer the global flow direction.

[72] In order to emphasize the task-specific character of the individual design patterns toward their respective prediction goal, we applied each design pattern to the prediction goals of all other test cases. This yields the performance indices summarized in Table 3.

[73] The performance indices show that the PreDIA-based design pattern (1a) clearly outmatches the EnKF (1b) for all three prediction goals. The EnKF-based design pattern is even surpassed in its own objective by the PreDIA-based sampling patterns designed for cases 2a (low threshold) and 2b (high threshold). The worst performance was found for pattern 2a (low threshold) when applied to the objective of case 2b (high threshold). This can be explained by the fact that these two patterns focus on opposing features in their respective design objectives, i.e., on mesoscale meandering versus source conductivity. The opposite case (applying pattern 2b to case 2a) performs better. Obviously, in our specific examples, many source conductivity measurements are more generic all-purpose information than head measurements populating the boundaries.

6.4. Sampling Patterns Accounting for Conceptual Model Uncertainty (Case 3)

[74] The optimized sampling pattern for case 3 is shown in Figure 8. In contrast to the previous cases, case 3 also considers conceptual model uncertainty, represented by a possibly present hydraulic barrier. If present, the barrier causes a flow regime that forces the plume to swerve northward and thus increases the chance that the plume hits the sensitive location. The strong dependence of the predicted concentration on the presence of the hydraulic barrier requires an adequate model choice. Therefore, the sampling pattern reacts to this additional uncertainty. Compared to case 1a, three transmissivity measurements are placed in the area of the possibly present barrier, whereas most other design features are preserved.

[75] Although we did not use the model choice as an objective function for the design (the importance of model choice is only implicit via its role in our chosen prediction goal), the reliability of correct model choice is improved by the adapted sampling pattern provided by PreDIA. This effect can best be illustrated by computing the preposterior weights of the two different hypothesized models: among all possible data sets generated with the barrier, the model with the barrier obtains (on average over all of those data sets) a weight of 98%. Among all possible data sets generated without the barrier, the model without the barrier receives an average weight of 50%. Weighting both preposterior cases by their prior probabilities to occur (i.e., 70% and 30%, respectively) yields an expected reliability of 85% to choose the correct model. This is a significantly increased reliability compared to the prior stage, where the reliability lies at 58%.
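As a compact reading of the computation just described (our own formalization, not an equation from the paper), the expected reliability of correct model choice is the prior-weighted average of the mean preposterior weight each model receives under data generated from itself:

\[
R_{\mathrm{exp}} = \sum_{k} P(M_k)\, \mathrm{E}_{d \mid M_k}\!\left[ w_k(d) \right],
\]

where P(M_k) is the prior probability of model M_k and w_k(d) is its preposterior (Bayesian model averaging) weight after observing a data set d.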

[76] PreDIA also allows the performance of a full BMA analysis, including measures like preposterior intermodel and intramodel variances, because all statistics are available. However, we omit this analysis here for the sake of brevity. As for the computational costs and convergence issues, the AESS drops in case 3 from approximately 500 (cases 1 and 2) to approximately 200. This is due to the increased variability and uncertainty in hydraulic conductivity introduced by the possibly present hydraulic barrier.

Table 3. Performance Indices for Every Sampling Design When Applied to Different Prediction Goals

               Pattern 1a   Pattern 1b   Pattern 2a   Pattern 2b
Cases 1a/1b     100.00%      75.14%       79.10%       95.99%
Case 2a          81.41%      76.03%      100.00%       69.01%
Case 2b          90.43%      38.79%       27.54%      100.00%

Figure 7. PreDIA-based sampling patterns optimized for predicting the exceedance of a low c_crit (left, case 2a) and a high c_crit (right, case 2b). Notations are as follows: head measurements (crosses), transmissivity measurements (circles), source (box), and target (diamond). Maps in the background are preposterior variances for (top) transmissivity, (center) hydraulic head, and (bottom) the indicator variable.

7. Conclusions

[77] In this work, we introduced a method for information processing, called PreDIA, which assesses the expected data utility of proposed sampling designs, such as the expected impact of data on prediction confidence, in an optimal design framework. The method operates via a purely numerical Monte Carlo implementation of Bayes' theorem and Bayesian model averaging, combined with an analytical marginalization over measurement and model errors. Since the actual measurement values at the individual planned measurement locations are unknown during the planning stage of optimal design, PreDIA averages the utility of designs over all possible measurement values that a given sampling design could produce. The method can be seen as an extension of the bootstrap filter (BF) or of particle filters (PF). Because of its fully numerical character, PreDIA allows the incorporation of various sources of uncertainty and is able to manage highly nonlinear dependencies between data, parameters, and predictions with ease.
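The resulting computational loop can be summarized by the following condensed sketch (our own schematic Python, with array names of our choosing; it uses the doubled error covariance of Appendix B and omits model-structural errors as well as the outer optimization over candidate designs): for every potential data set generated from one realization, all realizations are reweighted, the conditional prediction variance is computed, and these variances are averaged to give the design's expected (preposterior) prediction variance.

```python
import numpy as np

def expected_conditional_variance(f_y, f_z, R_eps):
    """Expected preposterior prediction variance of a candidate design (schematic sketch).

    f_y   : (n, n_y) simulated measurement values at the design's locations, one row per realization
    f_z   : (n,)     simulated values of the prediction goal for the same realizations
    R_eps : (n_y, n_y) measurement error covariance; doubled as in Appendix B
    """
    n = f_y.shape[0]
    R_inv = np.linalg.inv(2.0 * R_eps)
    var_sum = 0.0
    for k in range(n):                      # loop over potential (yet unknown) data sets
        delta = f_y - f_y[k]                # residuals of all realizations against data set k
        logL = -0.5 * np.einsum('ij,jl,il->i', delta, R_inv, delta)
        logL[k] = -np.inf                   # discard i == k, cf. equation (A2)
        w = np.exp(logL - logL.max())
        w /= w.sum()                        # conditional weights, cf. equation (A3)
        mean_z = np.sum(w * f_z)
        var_sum += np.sum(w * (f_z - mean_z) ** 2)
    return var_sum / n                      # average over all potential data sets
```

The design utility shown in Figure 6 (expected variance reduction, equation (8)) then follows by relating this quantity to the prior prediction variance.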

[78] We applied the method to an optimal design problem taken from contaminant hydrogeology, where we illustrated its applicability to different sources of uncertainty, various prediction tasks, and task-driven objective functions. Within a groundwater quality example, we considered noncolocated hydraulic head and transmissivity measurements. In order to show the limitations of linearized methods, we compared the optimal design patterns obtained via PreDIA to those from an EnKF. We find the following conclusions most important:

[79] 1. PreDIA outmatches linearized methods (such as EnKFs) because linear methods fail to recognize relevant nonlinear relations between potential measurement locations and the prediction goal, and hence oversample locations considered to be most informative from the limited viewpoint of linearized analysis.

[80] 2. PreDIA can handle arbitrary task-driven formulations of optimal design. We demonstrate this in a scenario variation that involves predicting the exceedance of a regulatory threshold value, which is important for risk management [e.g., de Barros et al., 2009]. The sampling pattern for the task-driven prediction strongly depends on the level of the threshold value, because different information needs are triggered by the underlying flow and transport physics. Neither this difference nor such classes of task-driven formulations could be handled by linearized methods.

[81] 3. The number of Monte Carlo realizations needed by PreDIA for convergence rises with the number of planned sampling points and their measurement accuracy. This is inherited from BFs in general. The averaged effective sample size (AESS) serves as a sound measure to monitor statistical convergence. Averaging analytically over measurement and model-structural error and over the yet unknown data values drastically improves convergence. However, the problem of filter degeneracy is still a challenge when planning extensive sampling campaigns. An extension of PreDIA toward more efficient stochastic methods would help to further increase the affordable sampling size. Here, linear methods are superior, as they benefit from fast analytical solutions.

[82] 4. Bayesian model averaging is implicit in PreDIA at no additional conceptual cost, and allows the reduction of subjectivity in prior assumptions on, e.g., geostatistical parameters, boundary parameters, or physical/conceptual model alternatives (like hydraulic barriers). Introducing more variability to models might increase the computational costs or lead to a decrease in the AESS. Incorrect prior assumptions could negatively affect the quality of the resulting optimal designs.

[83] 5. Our specific illustrative example showed that the uncertain direction of regional groundwater flow has a significant impact on the uncertainty of predicting contamination, and should hence not be neglected. This additional uncertainty can be quickly reduced by hydraulic head measurements at large distances.

[84] 6. In our specific case, the optimal design predominantly addressed uncertainty in head boundary conditions and contaminant source hydraulics, rather than structural uncertainty in the geostatistical model. This will change according to the relative importance of individual sources of uncertainty and the availability of data types that are adequate to address these individual uncertainties.

Appendix A: Derivation of Weight Matrices

[85] In formal Bayesian updating, the likelihood $L_{ij}$ of parameter vector $S_i$ given the synthetic noise-free $n_y \times 1$ vector of measurement values $f_y(S_j)$ resembles a multivariate normal distribution in $\Delta = f_y(S_j) - f_y(S_i)$ with mean zero and covariance matrix $\bar{R}_\varepsilon = 2 R_\varepsilon$ (Appendix B), where $R_\varepsilon$ is the covariance matrix of measurement errors. Hence,

\[
L_{ij} = \frac{1}{\left[(2\pi)^{n_y} \det \bar{R}_\varepsilon\right]^{1/2}} \exp\!\left(-\tfrac{1}{2}\,\Delta^{T} \bar{R}_\varepsilon^{-1} \Delta\right) \quad \forall\, i \neq j, \qquad (A1)
\]

\[
L_{ij} = 0 \quad \forall\, i = j. \qquad (A2)
\]

Since, for $i = j$, $S_i$ and $S_j$ would not be drawn independently, we discard $i = j$ and instead set all $L_{ii}$ to zero. Normalizing according to Bayes' theorem yields

\[
w_{ij} = \frac{L_{ij}}{\sum_{i=1}^{n} L_{ij}}. \qquad (A3)
\]
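A minimal numerical sketch of equations (A1)-(A3), assuming the simulated measurement values are stacked as rows of a NumPy array (the array layout and function name are our own choices, not from the paper):

```python
import numpy as np

def weight_matrix(f_y, R_eps):
    """Weight matrix w_ij of equations (A1)-(A3).

    f_y   : (n, n_y) array; row i holds the noise-free simulated measurements f_y(S_i)
    R_eps : (n_y, n_y) measurement error covariance; the likelihood uses 2*R_eps (Appendix B)
    """
    n = f_y.shape[0]
    R_bar_inv = np.linalg.inv(2.0 * R_eps)
    L = np.empty((n, n))
    for j in range(n):
        delta = f_y[j] - f_y                   # Delta = f_y(S_j) - f_y(S_i) for all i
        # The normalization constant of (A1) cancels in (A3) and is therefore omitted.
        L[:, j] = np.exp(-0.5 * np.einsum('ik,kl,il->i', delta, R_bar_inv, delta))
    np.fill_diagonal(L, 0.0)                   # equation (A2): discard i == j
    return L / L.sum(axis=0, keepdims=True)    # equation (A3): normalize over i
```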

Figure 8. Sampling pattern (case 3) when considering conceptual model uncertainty, exemplarily represented by a hydraulic barrier.

Appendix B: Weight Matrix Marginalized Over Synthetic Measurement Error

[86] The commonly used concept to simulate noisy data is $y_j = f_y(S_j) + \varepsilon_j$, where $\varepsilon_j \sim N(0, R_\varepsilon)$ with covariance matrix $R_\varepsilon$. In order to compute the likelihood of a realization $j$ given a data set $d_0$, a bootstrap filter (BF) would use the residual vector $\Delta_0 = d_0 - f_y(S_j)$ and, for $n_y$ measurement values, set

\[
p(d_0 \mid S_j) = \frac{1}{\left[(2\pi)^{n_y} \det R_\varepsilon\right]^{1/2}} \exp\!\left(-\tfrac{1}{2}\,\Delta_0^{T} R_\varepsilon^{-1} \Delta_0\right). \qquad (B1)
\]

Because we need to marginalize over all potential data values that might populate $d_0$, we use $d_k = f_y(S_k) + \varepsilon_k$ to generate realizations of potential data, again with $\varepsilon_k \sim N(0, R_\varepsilon)$. This yields

\[
p(S_k \mid S_j, \varepsilon_k) = \frac{1}{\left[(2\pi)^{n_y} \det R_\varepsilon\right]^{1/2}} \exp\!\left(-\tfrac{1}{2}\,(\Delta + \varepsilon_k)^{T} R_\varepsilon^{-1} (\Delta + \varepsilon_k)\right), \qquad (B2)
\]

with $\Delta = f_y(S_k) - f_y(S_j)$. Now we can average over all possible potential measurement error values of $\varepsilon_k$ via the marginalization

\[
p(S_k \mid S_j) = \int_{-\infty}^{+\infty} p(S_k \mid S_j, \varepsilon_k)\, p(\varepsilon_k)\, \mathrm{d}\varepsilon_k .
\]

After some manipulations, and using the statistical independence between $\varepsilon_j$ and $\varepsilon_k$, this results in

\[
p(S_k \mid S_j) = \frac{1}{\left[(2\pi)^{n_y} \det \bar{R}_\varepsilon\right]^{1/2}} \exp\!\left(-\tfrac{1}{2}\,\Delta^{T} \bar{R}_\varepsilon^{-1} \Delta\right), \qquad (B3)
\]

with $\bar{R}_\varepsilon = 2 R_\varepsilon$. This is equivalent to using noise-free potential data, but doubling $R_\varepsilon$ in the likelihood analysis.
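The equivalence stated after equation (B3) can be verified numerically. The sketch below (our own illustration, reduced to a scalar measurement for simplicity) compares a Monte Carlo average of (B2) over the potential measurement error with the closed-form Gaussian of (B3) that uses the doubled variance:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
sigma2 = 0.5     # measurement error variance R_eps (scalar case)
delta = 1.3      # residual f_y(S_k) - f_y(S_j)

# Marginalization of (B2) over the potential measurement error eps_k, done by Monte Carlo
eps_k = rng.normal(0.0, np.sqrt(sigma2), size=1_000_000)
mc = norm.pdf(delta + eps_k, loc=0.0, scale=np.sqrt(sigma2)).mean()

# Closed form (B3): noise-free potential data, but doubled error variance
closed = norm.pdf(delta, loc=0.0, scale=np.sqrt(2.0 * sigma2))

print(mc, closed)   # the two values agree up to Monte Carlo error
```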

Appendix C: Implementing Model Errors

[87] Following the procedure of Appendix B, a model error $\varepsilon^{m}$ can be accounted for in the same fashion. Both synthetic and potential data are then simulated by $y_j = f_y(S_j) + \varepsilon_j + \varepsilon^{m}_j$ and $d_k = f_y(S_k) + \varepsilon_k + \varepsilon^{m}_k$. Following the precisely identical procedure as in Appendix B, the likelihood becomes

\[
p(S_k \mid S_j, \varepsilon_k, \varepsilon^{m}_j, \varepsilon^{m}_k) = \frac{1}{\left[(2\pi)^{n_y} \det R_\varepsilon\right]^{1/2}} \exp\!\left(-\tfrac{1}{2}\,(\Delta + \varepsilon_k + \varepsilon^{m}_j + \varepsilon^{m}_k)^{T} R_\varepsilon^{-1} (\Delta + \varepsilon_k + \varepsilon^{m}_j + \varepsilon^{m}_k)\right), \qquad (C1)
\]

with $\Delta = f_y(S_k) - f_y(S_j)$. If $\varepsilon^{m}$ is assumed to be Gaussian with $\varepsilon^{m} \sim N(0, R^{m}_\varepsilon)$ and covariance matrix $R^{m}_\varepsilon$, it can again be absorbed into $\bar{R}_\varepsilon$ such that

\[
p(S_k \mid S_j) = \frac{1}{\left[(2\pi)^{n_y} \det \bar{R}_\varepsilon\right]^{1/2}} \exp\!\left(-\tfrac{1}{2}\,\Delta^{T} \bar{R}_\varepsilon^{-1} \Delta\right), \qquad (C2)
\]

with $\bar{R}_\varepsilon = 2 R_\varepsilon + 2 R^{m}_\varepsilon$.

Notation

d        set of decision variables
d_opt    set of optimized decision variables
f_y      physical measurement model
f_z      physical prediction model
ε_y      measurement error
ε_z      prediction error
ε^m      model error
k        set of discrete model choice parameters
n_s      number of physical parameter values
n_y      number of physical potential measurement values
n_z      number of physical prediction values
n        number of simulated measurement values
m        number of potential measurement values
R_ε      error covariance matrix
R_ε^m    model error covariance matrix
s        set of uncertain model parameters
S        augmented set of uncertain model parameters
θ        set of uncertain structural parameters
w        weight vector posterior to measurement (e.g., BF)
W        weight vector prior to measurement (e.g., PreDIA)
ξ        set of uncertain boundary parameters
y        set of potential measurement values
y_0      hypothetical measurement values
z        set of predictions

[88] Acknowledgments. The authors would like to thank the German Research Foundation (DFG) for financial support of the project within the Cluster of Excellence in Simulation Technology (EXC 310/1) at the University of Stuttgart. In addition, we would like to thank Felipe de Barros, Harrie-Jan Hendricks Franssen, and three anonymous reviewers for additional comments and discussions.

References

Abramowitz, M., and I. A. Stegun (1972), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 10th ed., 1046 pp., Dover Publications, Mineola, N. Y.
Ballio, F., and A. Guadagnini (2004), Convergence assessment of numerical Monte Carlo simulations in groundwater hydrology, Water Resour. Res., 40(4), W04603, doi:10.1029/2003WR002876.
Bardossy, A., and J. Li (2008), Geostatistical interpolation using copulas, Water Resour. Res., 44, W07412, doi:10.1029/2007WR006115.
Beven, K., and A. Binley (1992), The future of distributed models: Model calibration and uncertainty prediction, Hydrol. Processes, 6(3), 279–298, doi:10.1002/hyp.3360060305.
Caflisch, R. (1998), Monte Carlo and quasi-Monte Carlo methods, Acta Numerica, 7, 1–49, doi:10.1017/S0962492900002804.
Chaloner, K., and I. Verdinelli (1995), Bayesian experimental design: A review, Stat. Sci., 10, 273–304, doi:10.1214/ss/1177009939.
Christakos, G. (1992), Random Field Models in Earth Sciences, 2nd ed., 474 pp., Dover Publications, Mineola, N. Y.
Christensen, S. (2004), A synthetic groundwater modeling study of the accuracy of GLUE uncertainty intervals, Nordic Hydrol., 35, 45–59.
Cirpka, O. A., and P. K. Kitanidis (2001), Sensitivity of temporal moments calculated by the adjoint-state method, and joint inversing of head and tracer data, Adv. Water Resour., 24(1), 89–103, doi:10.1016/S0309-1708(00)00007-5.
Cirpka, O. A., C. M. Bürger, W. Nowak, and M. Finkel (2004), Uncertainty and data worth analysis for the hydraulic design of funnel-and-gate systems in heterogeneous aquifers, Water Resour. Res., 40(11), W11502, doi:10.1029/2004WR003352.
Cover, T. M., and J. A. Thomas (2006), Elements of Information Theory, 2nd ed., 776 pp., Wiley-Interscience, Hoboken, N. J.
de Barros, F. P. J., and W. Nowak (2010), On the link between contaminant source release conditions and plume prediction uncertainty, J. Contam. Hydrol., 116, 24–34, doi:10.1016/j.jconhyd.2010.05.004.
de Barros, F. P. J., Y. Rubin, and R. M. Maxwell (2009), The concept of comparative information yield curves and its application to risk-based site characterization, Water Resour. Res., 45(6), W06401, doi:10.1029/2008WR007324.
Deutsch, C. V., and A. G. Journel (1997), GSLIB: Geostatistical Software Library and User's Guide, 2nd ed., 384 pp., Oxford Univ. Press, New York.
Diggle, P., and S. Lophaven (2006), Bayesian geostatistical design, Scandinavian J. Stat., 33(3), 53, doi:10.1111/j.1467-9469.2005.00469.x.
Diggle, P. J., and P. J. Ribeiro Jr. (2002), Bayesian inference in Gaussian model-based geostatistics, Geogr. Environ. Model., 6(2), 129–146, doi:10.1080/1361593022000029467.

Diggle, P. J., and P. J. Ribeiro Jr. (2007), Model-Based Geostatistics, Springer Series in Statistics, 1st ed., 230 pp., Springer, N. Y.
Evensen, G. (2007), Data Assimilation: The Ensemble Kalman Filter, 2nd ed., 280 pp., Springer, Berlin, Germany.
Federov, V., and P. Hackl (1997), Model-Oriented Design of Experiments, 1st ed., 117 pp., Springer, N. Y.
Feyen, L., and S. M. Gorelick (2005), Framework to evaluate the worth of hydraulic conductivity data for optimal groundwater resources management in ecologically sensitive areas, Water Resour. Res., 41, W03019, doi:10.1029/2003WR002901.
Feyen, L., J. J. Gomez-Hernandez, P. J. Ribeiro Jr., K. J. Beven, and F. De Smedt (2003), A Bayesian approach to stochastic capture zone delineation incorporating tracer arrival times, conductivity measurements, and hydraulic head observations, Water Resour. Res., 39(5), 1126, doi:10.1029/2002WR001544.
Fogg, G. E., C. D. Noyes, and S. F. Carle (1998), Geologically based model of heterogeneous hydraulic conductivity in an alluvial setting, Hydrogeol. J., 6(1), 131–143, doi:10.1007/s100400050139.
Freeze, R. A., B. James, J. Massmann, T. Sperling, and L. Smith (1992), Hydrogeological decision analysis: 4. The concept of data worth and its use in the development of site investigation strategies, Ground Water, 30(4), 574–588, doi:10.1111/j.1745-6584.1992.tb01534.x.
Fritz, J., I. Neuweiler, and W. Nowak (2009), Application of FFT-based algorithms for large-scale universal kriging problems, Math. Geosci., 41(5), 509–533, doi:10.1007/s11004-009-9220-x.
Goldberg, D. E. (1989), Genetic Algorithms in Search, Optimization and Machine Learning, 1st ed., 432 pp., Addison-Wesley, Bonn, Germany.
Gomez-Hernandez, J. J., and X.-H. Wen (1998), To be or not to be multi-Gaussian? A reflection on stochastic hydrogeology, Adv. Water Resour., 21(1), 47–61, doi:10.1016/S0309-1708(96)00031-0.
Gomez-Hernandez, J. J., A. Sahuquillo, and J. E. Capilla (1997), Stochastic simulation of transmissivity fields conditional to both transmissivity and piezometric data: 1. Theory, J. Hydrol., 203(1–4), 162–174.
Gordon, N., D. Salmond, and A. Smith (1993), Novel approach to nonlinear/non-Gaussian Bayesian state estimation, IEE Proc. F, 140(2), 107–113.
Handcock, M. S., and M. L. Stein (1993), A Bayesian analysis of kriging, Technometrics, 35(4), 403–410.
Herrera, G. S., and G. F. Pinder (2005), Space-time optimization of groundwater quality sampling networks, Water Resour. Res., 41, W12407, doi:10.1029/2004WR003626.
Hoeting, J. A., D. Madigan, A. E. Raftery, and C. T. Volinsky (1999), Bayesian model averaging: A tutorial, Stat. Sci., 14(4), 382–417.
James, B. R., and S. Gorelick (1994), When enough is enough: The worth of monitoring data in aquifer remediation design, Water Resour. Res., 30(12), 3499–3513, doi:10.1029/94WR01972.
Jaynes, E. T. (1957), Information theory and statistical mechanics, Phys. Rev., 106(4), 620–630, doi:10.1103/PhysRev.106.620.
Kass, R., and L. Wasserman (1996), The selection of prior distributions by formal rules, J. Am. Stat. Assoc., 91, 1343–1370.
Kitanidis, P. K. (1986), Parameter uncertainty in estimation of spatial functions: Bayesian analysis, Water Resour. Res., 22(4), 499–507, doi:10.1029/WR022i004p00499.
Kitanidis, P. K. (1995), Quasi-linear geostatistical theory for inversing, Water Resour. Res., 31(10), 2411–2419, doi:10.1029/95WR01945.
Kunstmann, H., W. Kinzelbach, and T. Siegfried (2002), Conditional first-order second-moment method and its application to the quantification of uncertainty in groundwater modeling, Water Resour. Res., 38(4), 1035, doi:10.1029/2000WR000022.
Liu, J. S. (2008), Monte Carlo Strategies in Scientific Computing, 360 pp., Springer, N. Y.
Liu, X., J. Lee, P. K. Kitanidis, J. Parker, and U. Kim (2012), Value of information as a context-specific measure of uncertainty in groundwater remediation, Water Resour. Manage., doi:10.1007/s11269-011-9970-3, in press.
Mantovan, P., and E. Todini (2006), Hydrological forecasting uncertainty assessment: Incoherence of the GLUE methodology, J. Hydrol., 330(1–2), 368–381, doi:10.1016/j.jhydrol.2006.04.046.
Matern, B. (1986), Spatial Variation, 2nd ed., 151 pp., Springer, Berlin, Germany.
Müller, W. G. (2007), Collecting Spatial Data, 3rd ed., Springer, Berlin, Germany.
Murakami, H., X. Chen, M. S. Hahn, Y. Liu, M. L. Rockhold, V. R. Vermeul, J. M. Zachara, and Y. Rubin (2010), Bayesian approach for three-dimensional aquifer characterization at the Hanford 300 area, Hydrol. Earth Syst. Sci. Discuss., 7(2), 2017–2052.
Neuman, S., L. Xue, M. Ye, and D. Lu (2012), Bayesian analysis of data-worth considering model and parameter uncertainties, Adv. Water Resour., 35, 75–85, doi:10.1016/j.advwatres.2011.02.007.
Neuman, S. P. (2003), Maximum likelihood Bayesian averaging of uncertain model predictions, Stoch. Environ. Res. Risk Assess., 17(5), 291–305, doi:10.1007/s00477-003-0151-7.
Nowak, W. (2008), A hypothesis-driven approach to site investigation, Eos Trans. AGU, 89(53), Fall Meet. Suppl., Abstract H43A-0984.
Nowak, W. (2009), Best unbiased ensemble linearization and the quasi-linear Kalman ensemble generator, Water Resour. Res., 45(4), W04431, doi:10.1029/2008WR007328.
Nowak, W. (2010), Measures of parameter uncertainty in geostatistical estimation and geostatistical optimal design, Math. Geosci., 42(2), 199–221, doi:10.1007/s11004-009-9245-1.
Nowak, W., S. Tenkleve, and O. Cirpka (2003), Efficient computation of linearized cross-covariance and auto-covariance matrices of interdependent quantities, Math. Geol., 35(1), 53–66, doi:10.1023/A:1022365112368.
Nowak, W., F. P. J. de Barros, and Y. Rubin (2010), Bayesian geostatistical design: Task-driven optimal site investigation when the geostatistical model is uncertain, Water Resour. Res., 46, W03535, doi:10.1029/2009WR008312.
Oreskes, N., K. Shrader-Frechette, and K. Belitz (1994), Verification, validation, and confirmation of numerical models in the earth sciences, Science, 263(5147), 641–646, doi:10.1126/science.263.5147.641.
RamaRao, B. S., A. M. LaVenue, G. de Marsily, and M. G. Marietta (1995), Pilot point methodology for automated calibration of an ensemble of conditionally simulated transmissivity fields: 1. Theory and computational experiments, Water Resour. Res., 31(3), 475–493, doi:10.1029/94WR02258.
Reed, P., B. Minsker, and D. E. Goldberg (2000), Designing a competent simple genetic algorithm for search and optimization, Water Resour. Res., 36(12), 3757–3762, doi:10.1029/2000WR900231.
Russell, C. H. C. (1986), Variance reduction methods, in Proc. 18th Conference on Winter Simulation (WSC), pp. 60–68, Assoc. Comput. Mach., New York, doi:10.1145/318242.318261.
Scheidegger, A. E. (1954), Statistical hydrodynamics in porous media, J. Appl. Phys., 25(8), 994–1001, doi:10.1063/1.1721815.
Schweppe, F. C. (1973), Uncertain Dynamic Systems, 1st ed., 576 pp., Prentice-Hall, Englewood Cliffs, N. J.
Snyder, C., T. Bengtsson, P. Bickel, and J. Anderson (2008), Obstacles to high-dimensional particle filtering, Mon. Weather Rev., 136(12), 4629–4640, doi:10.1175/2008MWR2529.1.
Sun, N.-Z. (1994), Inverse Problems in Groundwater Modeling, 1st ed., 352 pp., Springer, Dordrecht, Netherlands.
Sykes, J. F., J. L. Wilson, and R. W. Andrews (1985), Sensitivity analysis for steady state groundwater flow using adjoint operators, Water Resour. Res., 21(3), 359–371, doi:10.1029/WR021i003p00359.
Ucinski, D. (2005), Optimal Measurement Methods for Distributed Parameter System Identification, 1st ed., 392 pp., CRC Press, Boca Raton, Fla.
Van Leeuwen, P. J. (2009), Particle filtering in geophysical systems, Mon. Weather Rev., 137(12), 4089–4114, doi:10.1175/2009MWR2835.1.
Wand, M. P., and M. C. Jones (1995), Kernel Smoothing, 1st ed., 224 pp., CRC Press, Boca Raton, Fla.
Weiss, N. A. (2006), A Course in Probability, 1st ed., 816 pp., Addison-Wesley, N. Y.
Woodbury, A. D., and T. J. Ulrych (1993), Minimum relative entropy: Forward probabilistic modeling, Water Resour. Res., 29(8), 2847–2860, doi:10.1029/93WR00923.
Zhang, Y., G. F. Pinder, and G. S. Herrera (2005), Least cost design of groundwater quality monitoring networks, Water Resour. Res., 41(8), W08412, doi:10.1029/2005WR003936.

A. Geiges, P. C. Leube, and W. Nowak, Institute for Modeling Hydraulic and Environmental Systems (LH2)/SimTech, University of Stuttgart, Pfaffenwaldring 61, 70569 Stuttgart, Germany. ([email protected])
