Calibration Guidelines

Model development
1. Start simple, add complexity carefully
2. Use a broad range of information
3. Be well-posed & be comprehensive
4. Include diverse observation data for 'best fit'
5. Use prior information carefully
6. Assign weights that reflect 'observation' error
7. Encourage convergence by making the model more accurate
8. Consider alternative models

Model testing
9. Evaluate model fit
10. Evaluate optimal parameter values

Potential new data
11. Identify new data to improve parameter estimates
12. Identify new data to improve predictions

Prediction uncertainty
13. Use deterministic methods
14. Use statistical methods
Model Development
Guideline 1: Apply principle of parsimony (start simple, add complexity carefully)

Start simple. Add complexity as warranted by the hydrogeology, the inability of the model to reproduce observations, the predictions, and possibly other things.
But my system is so complicated!!
DATA ARE LIMITED.
SIMULATING COMPLEXITY NOT SUPPORTED BY THE DATA CAN BE USELESS AND MISLEADING.
• Neither a million grid nodes nor hundreds of parameters guarantee a model capable of accurate predictions.
• Here, we see that a model that fits all the data perfectly can be produced by adding many parameters, but the resulting model has poor predictive capacity. It is fitting observation error, not system processes.
• We don’t know the ‘best’ level of complexity. We do know that we don’t want to start matching observation error. Observation error is evaluated when determining weights (Guideline 6).
[Figure: panels A and B plot Y versus X on 0-7 axes, contrasting a simple model with an overly complex one.]
Model complexity vs. accuracy

• General situation: Tradeoff between model fit and prediction accuracy with respect to the number of parameters. Neither model complexity nor model simplicity by itself ensures accuracy.
[Figure C: discrepancy between model and observations (o, model fit) and prediction error (|) plotted against number of parameters (0-60). Model fit improves as parameters are added while prediction error eventually worsens.]
Can 20% of the system detail explain 90% of the dynamics?
• The principle of parsimony calls for keeping the model as simple as possible while still accounting for the main system processes and characteristics evident in the observations, and while respecting other system information.
• Begin calibration by estimating very few parameters that together represent most of the features of interest.
• The regression methods provide tools for more rigorously evaluating the relation between the model and the data, compared to trial and error methods.
• It is expected (but not guaranteed) that this more rigorous evaluation produces more accurate models.
Flow through Highly Heterogeneous Fractured Rock at Mirror Lake, NH (Tiedeman et al., 1998)

20% of the system detail does explain 90% of the dynamics!

[Figure: conceptual cross section through the well field showing fracture zones with high T and fractures with low T, alongside a MODFLOW model with only 2 horizontal hydraulic conductivity parameters; inset plots drawdown versus time.]
Apply principle of parsimony to all aspects of model development

• Start by simulating only the major processes.
• Use a mathematical model only as complex as is warranted.
• When adding complexity, test:
  – Whether observations support the additional complexity
  – Whether the additional complexity affects the predictions

This can require substantial restraint!!!
Advantages of starting simple and building complexity as warranted

• Transparency: Easier to understand the simulated processes, parameter definition, parameter values, and their consequences. Can test whether more complexity matters.
• Refutability: Easier to detect model error.
• Helps maintain a big-picture view consistent with available data.
• Often consistent with the detail needed for accurate prediction.
• Can build prediction scenarios with detailed features to test the effect of those features.
• Shorter execution times.
Issues of computer execution time

• Computer execution time for inverse models can be approximated using the time for forward models and the number of parameters estimated (NP) as:

  T_inverse = 2(NP) x (1+NP) x T_forward

  where (1+NP) is the number of solutions per parameter-estimation iteration and 2(NP) is an average number of parameter-estimation iterations.

• To maintain overnight simulations, try for T_forward < about 30 minutes.

• Tricks:
  – Buffer sharp K contrasts where possible.
  – Consider linear versions of the problem as much as possible (for ground-water problems: replace the water table with an assigned thickness unless the saturated thickness varies a lot over time; replace nonlinear boundary conditions such as the EVT and RIV Packages of MODFLOW with the GHB Package during part of calibration).
  – Parallel runs.
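The runtime rule of thumb above is easy to check numerically; a minimal sketch (the function name and unit choices are my own):

```python
def inverse_runtime_hours(t_forward_min, n_params):
    """Rule-of-thumb inverse-model runtime from the slide:
    T_inverse = 2(NP) x (1+NP) x T_forward."""
    solutions_per_iter = 1 + n_params   # one perturbed run per parameter, plus the base run
    avg_iterations = 2 * n_params       # average number of parameter-estimation iterations
    return avg_iterations * solutions_per_iter * t_forward_min / 60.0

# a 10-minute forward model with 5 estimated parameters:
# 2*5 iterations x 6 solutions x 10 min = 600 min = 10 hours (an overnight run)
overnight = inverse_runtime_hours(10.0, 5)
```

Note how quickly the estimate grows: a 30-minute forward model with 9 parameters already implies 90 hours, which is why the slide pushes hard on keeping T_forward small.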
Model Development
Guideline 2: Use a broad range of information (soft data) to constrain the problem

• Soft data are data that cannot be directly included as observations in the regression.
• Challenge: to incorporate soft data into the model so that it (1) characterizes the supportable variability of hydrogeologic properties, and (2) can be represented by a manageable number of parameters.
• For example, in ground-water model calibration, use hydrology and hydrogeology to identify likely spatial and temporal structure in areal recharge and hydraulic conductivity, and use this structure to limit the number of parameters needed to represent the system.
Do not add features to the model to attain model fit if they contradict other information about the system!!
Example: Parameterization for simulation of ground-water flow in fractured dolomite
(Yager, USGS Water Supply Paper 2487, 1997)
How to parameterize hydraulic conductivity in this complex system? Yager took advantage of data showing that the regional fractures that dominate flow are along bedding planes.
Example: Parameterization for simulation of ground-water flow in fractured dolomite
Transmissivity estimated from aquifer tests is roughly proportional to the number of fractures intersecting the pumped well.
Thus, assume all major fractures have equal T, and calculate T for each model layer from the number of fractures in the layer.
The heterogeneous T field can then be characterized by a single model parameter and multiplication arrays.
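The single-parameter idea above can be sketched in a few lines (the fracture counts and T value below are made up for illustration, not from Yager's model):

```python
# Hypothetical sketch of the parameterization described above: the
# regression estimates a single per-fracture transmissivity T_frac,
# and each layer's T is T_frac multiplied by the number of major
# fractures intersecting that layer.
fracture_counts = [3, 1, 4, 2]   # hypothetical major fractures per model layer
T_frac = 25.0                    # the single estimated parameter, m^2/d

layer_T = [n * T_frac for n in fracture_counts]
# layer_T -> [75.0, 25.0, 100.0, 50.0]
```

The heterogeneous field is thus reduced to one estimated parameter plus fixed multiplier arrays, exactly in the spirit of Guideline 1.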
Data management, analysis, and visualization
Data management, analysis, and visualization problems can be daunting. It is difficult to allocate project time between these efforts and modeling in an effective manner, because:
• There are many kinds of data (point well data, 2D and 3D geophysics, cross sections, geologic maps, etc) and the subsurface is often very complex. Capabilities for integrating these data exist, but can be cumbersome.
• The hardware and software change often. Thus far, products have been useful, but not dependable or comprehensive.
• Low end: Rockworks ~US$2,000. High end: Earthvision ~US$100,000 + US$20,000/yr.
• GUIs provide some capabilities.
Model Development
Guideline 3: Be Well-Posed & Be Comprehensive

Well-posed: Don't spread observation data too thinly. For a well-posed problem, estimated parameters are supported by the calibration observations, and the regression converges to optimal values. In earth systems, observations are usually sparse, so being well-posed often leads to models with few parameters.

Comprehensive: Include many system aspects. Characterize as many system attributes as possible using defined model parameters. Leads to many parameters.
Is achieving Guideline 3 possible?
Challenge: Bridge the gap. Develop a useful model that has complexity the observation data can support and the predictions need.
Be Well-Posed and Be Comprehensive

It is often harder to be well-posed than to be comprehensive:
• Easy to add lots of complexity to a model.
• Harder to limit complexity to what is supported by the observations and most important to predictions.

Keeping the model well-posed can be facilitated by:
• Scaled sensitivities, parameter correlation coefficients, leverage statistics
  – Advantage: independent of model fit; can be used before the model is calibrated.
• Cook's D, DFBETAS (influence statistics)
  – Advantage: integrate sensitivities and parameter correlations.
  – Caution: dependent on model fit. Use cautiously with an uncalibrated model.
Dimensionless Scaled Sensitivities: Support of each observation for each parameter (example from Death Valley)

[Figure: dimensionless scaled sensitivity for parameter K4 plotted against sequential observation number (heads: obs #1-501; flows: obs #502-517). Three dominant head observations and one dominant flow observation stand out, including obs #17 (value -127.04), obs #107 (value 110.45), and obs #505 (value -18.65).]

• Estimation of parameter K4 seems to be dominated by 4 observations: 3 heads and 1 flow.
• Scaled sensitivities neglect parameter correlation, so some observations may be more important than indicated. In ground-water problems, flows are very important for reducing correlations.
Composite Scaled Sensitivities: Support of whole observation set for each parameter

Supportable model complexity

[Figure: composite scaled sensitivity bar chart (0-250) for the parameters K1, K2, K3, K4, ANIV3, ANIV1, RCH, and ETM.]

• CSS for the initial Death Valley model with only 9 parameters.
• The graph clearly reveals the relative support the observations as a whole provide towards estimating each parameter.
• Observations provide much information about RCH and 2 or 3 of the K parameters; little information about ANIV or ETM.

The observations provide enough information to add complexity to the K and RCH parameterization.
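The composite scaled sensitivity for each parameter summarizes the dimensionless scaled sensitivities over all observations; a minimal numpy sketch using the standard definition css_j = [sum_i dss_ij^2 / ND]^(1/2) (the dss values below are made up):

```python
import numpy as np

def composite_scaled_sensitivity(dss):
    """css_j = sqrt(sum_i dss_ij^2 / ND), where dss is an
    (ND observations x NP parameters) array of dimensionless
    scaled sensitivities."""
    nd = dss.shape[0]
    return np.sqrt((dss ** 2).sum(axis=0) / nd)

# two observations, two parameters: the second parameter has no support
dss = np.array([[3.0, 0.0],
                [4.0, 0.0]])
css = composite_scaled_sensitivity(dss)   # -> [sqrt(12.5), 0.0]
```

A parameter with css near zero, like the second column here, is a candidate for prior information or for exclusion during calibration (Guideline 5).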
Composite Scaled Sensitivities: Support of whole observation set for each parameter

[Figure: composite scaled sensitivity bar chart (0-12) for the parameters of the final Death Valley model: K1-K5, K6(el), K7(Nwflt), K8(dr), K9(fmtn), ANIV1, ANIV3, RCH0-RCH3, ETM, GHBam, GHBgs, GHBo, GHBfc, GHBt, Q, Q, and Poros.]

Supportable model complexity
• Black bars: parameters estimated by regression.
• Grey bars: not estimated by regression because of parameter correlation, insensitivity, or other reasons.

A good way to show the observation support as the number of defined parameters becomes large. This graph is from the final Death Valley model.
Parameter correlations: DVRFS model

pcc > 0.95 for 4 parameter pairs in the three-layer DVRFS model with:
• all 23 parameters active
• no prior information
• 501 head observations
• 16 flow observations

Parameter pair   Correlation
GHBgs, K7        -0.99
K2, RCH3          0.98
RCH3, Q2          0.97
K1, RCH3          0.96

• With head data alone, all parameters except vertical anisotropy are perfectly correlated -- multiply all by any positive number and get an identical head distribution, by Darcy's Law.
• The flow observations reduce the correlation to what is shown above.
Influence Statistics

• Like DSS, they help indicate if parameter estimates are largely affected by just a few observations.
• Like DSS, they depend on the type, location, and time of the observation.
• Unlike DSS, they depend on model fit to the observed value.
• Unlike DSS, they include the effect of pcc (parameter correlation coefficients). (Leverage does this, too.)
• Cook's D: a measure of how a set of parameter estimates would change with omission of an observation, relative to how well the parameters are estimated given the entire set of observations.
Cook’s D – Which observations are most important to estimating all the parameters?
(3-layer Death Valley example)
• Estimation dominated by ~10% of the observations
• 5 obs very important: 3 heads, 2 flows.
• Importance of flows is better reflected by Cook’s D than scaled sensitivities. In gw problems, flows often resolve extreme correlation. Need flows to uniquely estimate parameter values.
• Although dependent on model fit, relative valuesof Cook’s D can be useful for uncalibrated models.
1.E-03
1.E-02
1.E-01
1.E+00
1.E+01
1.E+02
1.E+03
1.E+04
1.E+05
1.E+06
1 101 201 301 401 501
Observation number
Coo
k's
D
17107
145
Accounts for sensitivities, parameter correlations, and model fit
flow obs(502-517)
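For ordinary linear regression, Cook's D can be computed directly from residuals and leverages; a minimal sketch (the groundwater codes compute the nonlinear-regression analogue of this statistic, and the data below are made up):

```python
import numpy as np

def cooks_d(X, y):
    """Cook's D for ordinary least squares y ~ X b: how much the
    parameter estimates would change if observation i were omitted,
    relative to how well the parameters are estimated overall."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat (leverage) matrix
    h = np.diag(H)                              # leverage of each observation
    e = y - H @ y                               # residuals
    s2 = e @ e / (n - p)                        # residual variance
    return e ** 2 * h / (p * s2 * (1.0 - h) ** 2)

# a straight-line fit with one influential point at high leverage
X = np.column_stack([np.ones(5), np.arange(5.0)])
y = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
d = cooks_d(X, y)      # the last observation dominates the estimation
```

Combining the residual (model fit) with the leverage is what lets Cook's D flag observations, such as the flows above, that scaled sensitivities understate.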
Sensitivity Analysis for 2 parameters
• CSS
• DSS
• Leverage
• Cook's D
• DFBETAS

Conclusion: flow qleft has a small sensitivity but is critical to uncoupling otherwise completely correlated parameters.

EFFECT OF PARAMETER CORRELATION
Not integrated: CSS, DSS. Integrated: leverage, Cook's D, DFBETAS.
Parameter pair K-W: correlation 0.999991

[Figures: for observations h1-h5, qleft, and qright -- composite scaled sensitivities for parameters W and K; dimensionless scaled sensitivities; leverage; DFBETAS; and Cook's D (measures of influence).]
Which statistics address which relations?

Observations – Parameters: dss, css, pcc, leverage, parameter cv, AIC, BIC, DFBETAS, Cook's D
Parameters – Predictions: pss, ppr
Observations – Predictions: opr
Problems with Sensitivity Analysis Methods
• Nonlinearity of simulated values with respect to the parameters
• Inaccurate sensitivities
Nonlinearity

Parameter correlation coefficients commonly differ for different parameter values. Extreme correlation is indicated if pcc = 1.0 for all parameter values; the regression can look okay -- but beware! (See example in Hill and Tiedeman, 2003.)

Scaled sensitivities change for different parameter values because (1) the sensitivities are different and (2) the scaling [dss = (∂y'/∂b) b ω^(1/2)].

Consider decisions based on scaled sensitivities to be preliminary. Test by trying to estimate parameters. If conclusions drawn from scaled sensitivities about which parameters are important and can be estimated change dramatically for different parameter values, the problem may be too nonlinear for this kind of sensitivity analysis and regression to be useful.

[Figure: nonlinearity -- sensitivities (and pcc) differ for different parameter values. From Poeter and Hill, 1997; see book p. 58.]
Inaccurate sensitivities

How accurate are the sensitivities?
• Most accurate: sensitivity-equation method, as in MODFLOW-2000. [Generally 5-7 digits]
• Less accurate: perturbation methods, as in UCODE_2005 or PEST. [Often only 2-3 digits] Both programs can use model-produced sensitivities if available.

When does it NOT matter?
• Scaled sensitivities and regression often do not require accurate sensitivities. Regression convergence improves with more accurate sensitivities for problems on the edge. [Mehl and Hill, 2002]

When does it matter?
• Parameter correlation coefficients. [Hill and Østerby, 2003]
  – Values of 1.00 and -1.00 reliably indicate parameter correlation; smaller absolute values do not guarantee lack of correlation unless the sensitivities are known to be sufficiently accurate.
  – Parameter correlation coefficients have more problems as sensitivity accuracy declines for all parameters, but the problem is most severe for pairs in which one or both parameters have small composite scaled sensitivity.
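The accuracy gap between perturbation schemes is easy to demonstrate on a toy "model" with a known analytic sensitivity (a sketch: real codes perturb model parameters and rerun the simulation, not a sine function):

```python
import math

def forward_diff(f, b, eps):
    # one-sided perturbation: truncation error ~ O(eps)
    return (f(b + eps) - f(b)) / eps

def central_diff(f, b, eps):
    # two-sided perturbation: truncation error ~ O(eps^2), costs one extra run
    return (f(b + eps) - f(b - eps)) / (2.0 * eps)

f, b, eps = math.sin, 1.0, 1e-3
exact = math.cos(b)                               # analytic sensitivity dy/db
err_fwd = abs(forward_diff(f, b, eps) - exact)    # roughly 3-4 correct digits
err_ctr = abs(central_diff(f, b, eps) - exact)    # roughly 6-7 correct digits
```

With a noisy forward model the picture worsens: roundoff in the simulated values can wipe out most of the perturbation signal, which is why sensitivity-equation methods retain more digits.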
Model Development Guideline 4: Include many kinds of data as observations (hard data) in the regression
Adding different kinds of data generally provides more information about the properties of the simulated system.
In ground-water flow model calibration
Flow data are important. With only head data, if all major K and Recharge parameters are being estimated, extreme values of parameter correlation coefficients will likely occur (Darcy’s Law).
Advective transport (or concentration first-moment data) can provide valuable information about the rate and direction of ground-water flow.
In ground-water transport model calibration
Advective-transport (or concentration first-moment) data are important because they are more stable numerically and their misfit increases monotonically as the fit to observations becomes worse (Barth and Hill, 2005a,b, Journal of Contaminant Hydrology).
[Figures: from Barth and Hill (2005a); book p. 224. Observed and simulated breakthrough curves (C/Co versus time, 0-50 days) and the corresponding |weighted residual| plots (log scale), comparing weighted residuals computed using observed-value weights with those computed using simulated-value weights.]

Here, model fit does not change with changes in the parameter values unless overlap occurs.
Contoured or kriged data values as ‘observations’?
(book p. 284)
Has the advantage of creating additional ‘observations’ for the regression.
However, a significant disadvantage is that the interpolated values are not necessarily consistent with processes governing the true system, e.g. the physics of ground-water flow for the true system. For example, interpolated values could be unrealistically smooth across abrupt hydrogeologic boundaries in the true subsurface.
This can cause estimated parameter values to represent the true system poorly.
Proceed with Caution !!!!
Model Development
Guideline 5: Use prior information carefully

Prior information allows some types of soft data to be included in the objective function (e.g., T from an aquifer test).

Prior information penalizes estimated parameter values that are far from 'expected' values through an additional term in the objective function. What are the 'expected' values?

S(b) = Σ_{i=1}^{nh} ω_i [h_i − h'_i(b)]²   (HEADS)
     + Σ_{i=1}^{nq} ω_i [q_i − q'_i(b)]²   (FLOWS)
     + Σ_{i=1}^{npr} ω_i [P_i − P'_i(b)]²  (PRIOR)
     + …
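The objective function above is just a weighted sum of squared residuals over the observation groups plus the prior term; a minimal sketch with illustrative numbers:

```python
import numpy as np

def ssr(obs, sim, weights):
    """Weighted sum of squared residuals for one group of terms."""
    r = np.asarray(obs) - np.asarray(sim)
    return float(np.sum(np.asarray(weights) * r ** 2))

# head, flow, and prior-information terms simply add
S = (ssr([10.0, 12.0], [9.5, 12.5], [4.0, 4.0])   # heads
     + ssr([1.0], [0.8], [25.0])                  # flows
     + ssr([2.0], [2.2], [100.0]))                # prior on a parameter
# S -> 2.0 + 1.0 + 4.0 = 7.0
```

Because the prior term uses the same weighted-squared form as the observations, an aquifer-test T estimate enters the regression on the same footing as a head or flow observation.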
Ground-Water Modeling

Hydrologic and hydrogeologic data (less accurate): relate to model inputs.
Dependent-variable observations (more accurate): relate to model outputs -- calibration.

Ground-Water Model → Parameters → Predictions → Prediction uncertainty → Societal decisions
Suggestions

• Begin with no prior information, to determine the information content of the observations.
• Insensitive parameters (parameters with small css):
  – Can be included in the regression using prior information to maintain a well-posed problem.
  – Or can be excluded during calibration to reduce execution time, then included when calculating prediction uncertainty and associated measures (Guidelines 12-14).
• Sensitive parameters:
  – Do not use prior information to make unrealistic optimized parameter values realistic.
  – Figure out why the model and calibration data together cause the regression to converge to unrealistic values (see Guideline 9).
Highly parameterized models

• # parameters > # observations
• Methods:
  – Pilot points (de Marsily, RamaRao, LaVenue)
  – Pilot points with smoothing (Tikhonov)
  – Pilot points with regularization (Alcolea, Doherty)
  – Sequential self-calibration (Gomez-Hernandez, Hendricks Franssen)
  – Representer (Valstar)
  – Moment method (Guadagnini, Neuman)
• Most common current usage: PEST regularization capability, by John Doherty
Why highly parameterize?

• Can easily get close fits to observations.
• Intuitive appeal of the resulting distributions:
  – We know the real field varies.
  – Highly parameterized methods can be used to develop models with variable distributions.
  – Mostly used to represent K; can use it for other aspects of the system.
Why not highly parameterize?

• Are the variations produced by highly parameterized fields real?
• Perhaps NO if they are produced because of:
  – Data error (erroneous constraint)
  – Lack of data (no constraint)
  – Instability
• How can we know? Here, consider a synthetic problem:
  – Start with no observation error.
  – Add error to observations.
Synthetic test case (steady state; from Hill and Tiedeman, 2006, Wiley)

11 observations: 10 heads (*), 1 flow (to river)
6 parameters: HK_1; HK_2 (multiplier); RCH_1, RCH_2; K_RB (prior); VK_CB (prior)

The top layer is really homogeneous. Consider using 36 pilot points to represent it.

[Figure: plan views of model layer 1 (head observation locations marked *) and model layer 2 (36 pilot-point locations marked by dots).]
Initial run with no data error

Here, use the constrained minimization regularization capability of PEST.

Variogram inputs: nugget = 0.0, a = 3x10^7, variance = 0.1.
Interpolation inputs: search radius = 36,000; max points = 36.

Fit criterion: phi = 1x10^-5. This equals the fit achieved with a single HK_1 parameter given a correct model and observations with no error.

[Figure: estimated K field, consistent with the constant value that actually occurs.]
Introducing variability

S(b) = (y − y_obs)ᵀ(y − y_obs) + β (Δp − 0)ᵀ(Δp − 0)

To the extent that β can be small, variability is allowed in the K distribution.
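For a linear model, the penalized objective above has a closed-form minimizer, which makes the role of β easy to see; a sketch (toy 2x2 system, preferred values p0 = 0):

```python
import numpy as np

def tikhonov_solve(X, y, beta, p0):
    """Minimize ||X b - y||^2 + beta * ||b - p0||^2 for a linear
    model: b = (X^T X + beta I)^-1 (X^T y + beta p0)."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + beta * np.eye(n),
                           X.T @ y + beta * p0)

X, y, p0 = np.eye(2), np.array([1.0, 3.0]), np.zeros(2)
b_free = tikhonov_solve(X, y, 0.0, p0)    # beta = 0: plain least squares, b = [1, 3]
b_stiff = tikhonov_solve(X, y, 1e9, p0)   # large beta: pulled toward p0
```

Small β lets the data (including their errors) drive the estimated field; large β pins the parameters to the preferred values. That is exactly the trade-off the following slides explore.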
Increased s²_lnK

The variogram and interpolation inputs reflect a much better understanding of the true distribution than would normally be the case. To create a more realistic situation, use β, the regularization weight factor.

Percent error for each parameter is calculated as 100 |(b_true − b_est)/b_true|. For HK_1, b_est = exp(mean ln K), where the mean is over the ln K values of the 36 pilot points. Also report s²_lnK.
No data error. Correct model. Perform regression starting from values other than the true values.

If β is restricted to be close to 1.0, the estimates are close to the true values and s²_lnK is small, as expected.

What happens if β can be small?
No data error. Correct model. Start from one set of parameter values. Prior = true; var = 0.1; s²_ln(k) = 0.1. Same good fit to the observations.

[Figure: parameter error, in percent, for HK_1, K_RB, VK_CB, HK_2, RCH_1, RCH_2; values include -30, -11, 50, and -33 percent.]

Average absolute error = 21%.
No data error. Correct model. Start from another set of parameter values. Prior = true; var = 0.3.

[Figure: parameter error, in percent, for the same six parameters; values include -50, -26, 86, and -57 percent.]

Average absolute error = 37%.
No data error. Correct model. Parameter estimates depend on starting parameter values.

• Disturbing.
• Means that in the following results, as β becomes small, discrepancies may not be caused by observation error.
• Consider both the parameter error and the distribution of K in layer 1.
Data error. Correct model. Parameter error.

It is not possible to determine beforehand which phi values will be accurate. Small parameter error → accurate predictions? Depends on the effect of the variability.

[Figure: average parameter error, in percent (0-50), for the true parameterization (sos = 10.5) and pilot-point parameterizations with phi = 13, 11, 5, 4, and 2; s²_ln(k) = 1.2.]
[Figure: simulated head, in m (straight contours are correct), and HK_1, in m/s (a constant value is correct), for phi = 11 and phi = 2; observations in the top and bottom layers are marked. phi = 11: basic features correct; max head 177; HK_1 estimate 0.00048 versus the true value 0.0004. phi = 2: great fit, but basic features incorrect; the large residual is reduced. Solid contours negative; dashed positive.]
Lessons

• Inaccurate solutions? Possibly. Variability can come from actual variability, or from data error or instability of the HP method.
• Exaggerated uncertainty? Possibly, if it includes variability caused by data error and instabilities. True of the Moore and Doherty method??
• How can we take advantage of highly parameterized methods?
  – Use hydrogeology to detect unrealistic solutions.
  – Analyze observation errors and accept that level of misfit. Set phi accordingly.
  – Consider weighting of regularization equations. Use β = 1?
  – Check how model fit changes as phi changes.
  – Use sensitivity analysis to identify extremely correlated parameters and parameters dominated by one observation. Use parsimonious overlays.
"Observation" error??? Measurement error vs. model error

Should weights account for only measurement errors, or also some types of model error?

• A useful definition of observation error that allows for inclusion of some types of model error is: any error with an expected value of zero that is related to aspects of the observation that are not represented in the model (for example, the model configuration and/or the simulated processes).
• Unambiguous measurement errors: errors associated with the measuring device and the spatial location of the measurement.
• More ambiguous errors: heads measured in wells that partially penetrate a model layer. Here, even in a model that perfectly represents the true system, observed and simulated heads will not necessarily match. The weights could be adjusted to account for this mismatch between model and reality if the expected value of the error is zero.
Model Development
Guideline 6: Assign weights that reflect 'observation' error

• Model calibration and uncertainty evaluation methods that do not account for 'observation' error can be misleading and are not worth using.
• 'Observation' errors commonly are accounted for by weighting.
• Can use large weights (implying small 'observation' error) to investigate solution existence.
• For uncertainty evaluation, the weights need to be realistic.
Guideline 6, continued: Strategy for uncorrelated errors

Assign weights equal to 1/s², where s² is the best estimate of the variance of the measurement error (details given in Guideline 6 of the book). Values entered can be variances (s²), standard deviations (s), or coefficients of variation (CV); the model calculates the variance as needed.

Advantages:
• Weights more accurate observations more heavily than less accurate observations -- intuitively appealing.
• Produces squared weighted residuals that are dimensionless, so they can be summed in the objective function.
• Uses information that is independent of model calibration, so the statistics used to calculate the weights generally are not changed during calibration.
• This weighting strategy is required for common uncertainty measures to be correct.

Without this, the regression becomes difficult and arbitrary.
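Weighting by ω = 1/s² is what makes each weighted residual dimensionless, so heads in metres and flows in m³/s can be summed in one objective function; a short sketch with illustrative numbers:

```python
import numpy as np

# two head observations with different estimated error standard deviations
obs = np.array([100.0, 50.0])      # observed heads, m
sim = np.array([101.0, 49.0])      # simulated heads, m
sd = np.array([2.0, 0.5])          # s_i: estimated observation-error std dev, m

# weight_i = 1/s_i^2, so sqrt(weight_i) * residual_i is dimensionless
weighted_residuals = (obs - sim) / sd          # -> [-0.5, 2.0]
ssr = float(np.sum(weighted_residuals ** 2))   # -> 4.25
```

Note that the second observation, though its raw residual is the same size, dominates the objective function because its estimated error is small.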
If weights do not reflect 'observation' error, regression becomes difficult and arbitrary

[Figure: regression line y = b0 + b1*x through data points, with one outlying observation at a large value of x.]

We are interested in predictions for large values of x. Will the accuracy of predictions be improved by weighting this observation more? No, not if the model is correct and the other data are important.
Determine weights by evaluating errors associated with the observations

Often one can assume (1) a normal distribution and (2) that different error components are additive. If so, two things are important:
(a) Add variances, not standard deviations or coefficients of variation!
(b) The mean needs to equal zero!

Quantify the possible magnitude of error using:
• a range that is symmetric about the observation or prior information, and
• the probability with which the true value is expected to occur within the range.

Examples:
• Head observation with three sources of error
• Streamflow loss observation
Head observation with three sources of error
(1) The head measurement is thought to be good to within 3 feet.
(2) The well elevation is accurate to 1.0 feet.
(3) The well is located in the model within 100 feet, and the local gradient is 2%.

• Quantify "good to within 3 feet" as "there is a 95-percent chance the true value falls within ±3 feet of the measurement."
  – Use a normal probability table to determine that a 95-percent confidence interval is the value ± 1.96 times the standard deviation, s.
  – This means that 1.96 x s = 3 ft, so s = 1.53 ft.
• Quantify the well elevation error similarly to get s = 0.51 ft.
• Quantify "located in the model within 100 feet, with a 2% local gradient" as "there is a 95-percent chance the true value falls within ±2 feet." Using the procedure above, s = 1.02 ft.
• Calculate the observation statistic. Add variances: (1.53)² + (0.51)² + (1.02)² = 3.64 ft², so s = 1.91 ft. Specify the variance or standard deviation in the input file.
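The arithmetic of the head example can be verified in a few lines (a sketch of the slide's calculation):

```python
import math

# each 95% range becomes a standard deviation via range / 1.96
s_measure = 3.0 / 1.96     # head measurement good to +/- 3 ft
s_elev = 1.0 / 1.96        # well elevation accurate to +/- 1 ft
s_location = 2.0 / 1.96    # 100 ft location error x 2% gradient = +/- 2 ft

variance = s_measure**2 + s_elev**2 + s_location**2   # add variances, not s's
s_total = math.sqrt(variance)                         # ~1.91 ft (variance ~3.64 ft^2)
```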
Streamflow loss observation
A streamflow loss observation derived by subtracting flow measurements with known error range and probability.

• Upstream and downstream flow measurements: 3.0 ft³/s and 2.5 ft³/s, so the loss observation = 0.5 ft³/s. The first flow measurement is considered 'fair', the second 'good'. Carter and Anderson (1963) suggest that a good streamflow measurement has a 5% error.
• Quantify the error: ±5% forms a 90% confidence interval on the 'fair' upstream flow; ±5% forms a 95% confidence interval on the 'good' downstream flow. Using values from a normal probability table, the standard deviations of the two flows are 0.091 and 0.064 ft³/s.
• Calculate the observation statistic. Add variances: (0.091)² + (0.064)² = 0.0124 (ft³/s)², so s = 0.111 ft³/s. Coefficient of variation: 0.111/0.5 = 0.22, or 22 percent. Specify the variance, standard deviation, or coefficient of variation in the input file.
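Likewise for the streamflow loss (a sketch; 1.645 and 1.96 are the normal-table factors for 90% and 95% intervals):

```python
import math

s_up = 0.05 * 3.0 / 1.645     # 'fair' gauging: +/- 5% taken as a 90% interval
s_down = 0.05 * 2.5 / 1.96    # 'good' gauging: +/- 5% taken as a 95% interval

variance = s_up**2 + s_down**2     # add variances: ~0.0124 (ft^3/s)^2
s_loss = math.sqrt(variance)       # ~0.111 ft^3/s
cv = s_loss / 0.5                  # ~0.22, i.e. 22% of the 0.5 ft^3/s loss
```

Note how the derived loss observation inherits a large relative error (22%) even though each gauging is only 5% uncertain, because the loss is a small difference between two larger numbers.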
What if weights are set to unrealistically large values?

[Figure: objective-function surfaces (contours of the objective function calculated for combinations of 2 parameters) for three cases: heads only; with the flow weighted using a reasonable coefficient of variation of 10%; and with the flow weighted using an unreasonable coefficient of variation of 1%. From Hill and Tiedeman, 2002; book p. 82.]

Unrealistically large weights have hazardous consequences for predictions: unreasonable weighting results in a misleading calculated confidence interval that excludes the true value of 1737 m.
Flow coefficient of variation | Prediction (distance traveled toward river in 10 yrs) (m) | Confidence interval (m) on the prediction | Includes true value?
10% | 1017 | 71 ; 1964 | Yes
1%  | 1036 | 940 ; 1131 | No

From Hill and Tiedeman, 2002. Book p. 303