Multivariate Geostatistics Winter Term 2018/19

Multivariate Geostatistics - tu-freiberg.deGeoStatistics Geostatistics is a generalization of classical statistics for georeferenced, spatially and stochastically dependent random




Stochastic

About Probability and Statistics


Mathematics

Fields of Mathematics

Analysis
Linear Algebra
Functional Analysis
Differential Equations
Integral Transforms
...

Stochastics

Probability

Mathematical Statistics

Applied Statistics

Data Analysis, Data Mining


Mathematics – Analysis, Linear Algebra

Mapping

Mappings (“Abbildungen”) are a central issue in almost all mathematical disciplines.

Analysis
real functions, continuous functions, continuously differentiable functions, integrable functions, etc.

$$f : \mathbb{R} \to \mathbb{R}_+, \quad f(x) = x^2,$$

$$f : [-\pi/2, \pi/2] \to [-1, 1], \quad f(\omega) = \cos^{2\kappa} \omega, \quad \kappa \in \mathbb{N}$$

Linear Algebra
linear mapping, linear map provided by a matrix, matrix associated with a linear map, homomorphism, isomorphism, etc.

$$A : \mathbb{R}^2 \to \mathbb{R}^2, \quad w = \begin{pmatrix} \cos\omega & -\sin\omega \\ \sin\omega & \cos\omega \end{pmatrix} v$$
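As a small illustration of a mapping given by an explicit rule, the rotation above can be sketched in Python. The function name `rotate` and the chosen angle are assumptions made for this example only.

```python
import math

def rotate(v, omega):
    """Apply the rotation matrix A(omega) to the 2-vector v = (v1, v2)."""
    c, s = math.cos(omega), math.sin(omega)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

# Quarter turn of the first unit vector; the result is (up to rounding)
# the second unit vector.
w = rotate((1.0, 0.0), math.pi / 2)
```

The rule is fully explicit: given v and ω, the image w can be computed directly from the matrix entries.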


Mathematics – Probability

Mapping

Probability
measurable mapping, random variable, real random variable

$$Z : (\Omega, \mathcal{A}, P) \to (\mathbb{R}, \mathcal{B})$$

$$P(Z(\omega) \in B) := P(\omega \in Z^{-1}(B)) \quad \text{for all } B \in \mathcal{B}$$

where $(\Omega, \mathcal{A}, P)$ denotes a probability space and $(\mathbb{R}, \mathcal{B})$ the real measurable space.

A random variable $Z$ is completely defined by its probability law, i.e. its distribution. The distribution gives the probability with which the random variable $Z$ realizes values $z = Z(\omega)$. The values $Z(\omega)$, $\omega \in \Omega$, are called realizations.


Mathematics – Probability

Random variable

In contrast to e.g. analysis or linear algebra, there is usually no explicit formula or rule (“Abbildungsvorschrift”) that tells how a random variable $Z : \omega \mapsto z$ assigns the values $z$ to the outcomes $\omega$.

A random variable is given by its distribution; what can be known of a random variable is its distribution.

If the distribution is known, then the random variable is known and all its properties can be deduced from the distribution.


Probability

Mathematical probability assumes the probability law to be known, develops a theory of how to describe random events by random variables, and investigates their properties.


Probability

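For a real random variable $Z : (\Omega, \mathcal{A}, P) \to (\mathbb{R}, \mathcal{B})$, the distribution may be given in terms of the distribution function $F$

$$F(z) := P(Z(\omega) \in (-\infty, z]), \quad z \in \mathbb{R}$$

If the distribution function $F$ can be represented as the integral

$$F(z) = \int_{-\infty}^{z} f(x)\,dx,$$

then $f$ is called the probability density function, and in this case the distribution may also be represented by its probability density function.

For instance, the exponential law with parameter $\lambda$ is given by

$$P(Z \le z) = F(z) = 1 - \exp(-\lambda z) = \lambda \int_{0}^{z} \exp(-\lambda x)\,dx, \quad z \ge 0,$$

$$f(x) = \lambda \exp(-\lambda x), \quad 0 \le x, \quad 0 < \lambda$$

with $f(x) = 0$ for $x < 0$, which is why the integral effectively starts at $0$.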
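The relation between density and distribution function can be checked numerically. The following is a minimal sketch, assuming a simple midpoint rule for the integral; the helper names and the values of λ and z are chosen for illustration only.

```python
import math

def exp_density(x, lam):
    """Density f(x) = lam * exp(-lam * x) for x >= 0, and 0 otherwise."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(z, lam):
    """Closed-form distribution function F(z) = 1 - exp(-lam * z) for z >= 0."""
    return 1.0 - math.exp(-lam * z) if z >= 0 else 0.0

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

lam, z = 2.0, 1.5  # illustrative parameter and evaluation point
numeric = integrate(lambda x: exp_density(x, lam), 0.0, z)
exact = exp_cdf(z, lam)
```

The two values `numeric` and `exact` should agree up to the discretization error of the midpoint rule.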


Probability

Expectation, variance

Major properties of a real random variable are given in terms of “moments” or “central moments” of the distribution. The two most prominent moments are the expectation of a real random variable

$$EZ = \int_{-\infty}^{\infty} z\,dF(z) = \int_{-\infty}^{\infty} z\,f(z)\,dz = \mu$$

and the variance of a real random variable

$$\mathrm{Var}\,Z = E(Z - EZ)^2 = \int_{-\infty}^{\infty} (z - \mu)^2 f(z)\,dz = \sigma^2.$$

For instance, the exponential law is a one–parameter distribution with

$$EZ = \frac{1}{\lambda}, \quad \mathrm{Var}\,Z = \frac{1}{\lambda^2}.$$
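The two moments of the exponential law can be illustrated by simulation. This is a sketch under stated assumptions: the rate λ = 2, the sample size, and the fixed seed are arbitrary choices, and the empirical values only approximate 1/λ and 1/λ².

```python
import random

rng = random.Random(0)  # fixed seed so the run is reproducible
lam = 2.0               # illustrative rate parameter
n = 100_000             # illustrative sample size

# Draw n realizations of an exponentially distributed random variable.
sample = [rng.expovariate(lam) for _ in range(n)]

mean = sum(sample) / n
variance = sum((z - mean) ** 2 for z in sample) / (n - 1)
```

With these settings `mean` should be close to 1/λ = 0.5 and `variance` close to 1/λ² = 0.25.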


Probability

Cauchy distribution

In the same way as there are probability laws for which a probability density function does not exist, there are distributions for which moments do not exist. For instance, the Cauchy distribution

$$f(z) = \frac{1}{\pi}\,\frac{\lambda}{\lambda^2 + (z - \mu)^2}, \quad 0 < \lambda, \quad \mu \in \mathbb{R},$$

has neither an expectation nor a variance.

However, its median and its mode are both given by $\mu$.
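A simulation makes the missing expectation tangible. The sketch below assumes sampling by inversion of the Cauchy distribution function, z = µ + λ tan(π(u − 1/2)) with u uniform on [0, 1); the parameter values, seed, and function name are illustrative.

```python
import math
import random
import statistics

def cauchy_sample(n, mu, lam, rng):
    """Draw n Cauchy(mu, lam) values by inverting the distribution function."""
    return [mu + lam * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

rng = random.Random(1)  # fixed seed so the run is reproducible
mu, lam = 3.0, 1.0      # illustrative location and scale
sample = cauchy_sample(100_001, mu, lam, rng)

# The sample median settles near mu ...
median = statistics.median(sample)
# ... whereas arithmetic means of growing sub-samples need not settle at all.
running_means = [statistics.fmean(sample[:k]) for k in (100, 10_000, 100_001)]
```

Inspecting `running_means` for different seeds typically shows occasional large jumps caused by single extreme values, while `median` stays close to µ.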


Probability

Covariance

The covariance of two real random variables $Z_1, Z_2$

$$\mathrm{Cov}(Z_1, Z_2) = E\big((Z_1 - EZ_1)(Z_2 - EZ_2)\big)$$

is a measure of the extent of a linear relationship between $Z_1$ and $Z_2$.

Independence of random variables

Two real random variables are called independent if their joint probability law is the product of the two individual (“marginal”) probability laws.


Probability

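The following rules hold:

$$E(aZ + b) = a\,EZ + b \quad (a, b \in \mathbb{R})$$

$$\mathrm{Var}(aZ + b) = a^2\,\mathrm{Var}\,Z$$

$$E(Z_1 \pm Z_2) = EZ_1 \pm EZ_2$$

$$\mathrm{Var}(Z_1 \pm Z_2) = \mathrm{Var}\,Z_1 + \mathrm{Var}\,Z_2 \pm 2\,\mathrm{Cov}(Z_1, Z_2)$$

with

$$\mathrm{Cov}(Z_1, Z_2) = E\big((Z_1 - EZ_1)(Z_2 - EZ_2)\big) = E(Z_1 Z_2) - EZ_1\,EZ_2$$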
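The variance rule for sums and differences can be verified exactly on a small discrete example. The joint distribution below is made up for illustration; exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution of (Z1, Z2): value pair -> probability.
quarter = Fraction(1, 4)
joint = {(0, 0): quarter, (0, 1): quarter, (1, 1): quarter, (2, 3): quarter}

def expect(g):
    """Expectation of g(Z1, Z2) under the joint distribution."""
    return sum(p * g(z1, z2) for (z1, z2), p in joint.items())

def var(g):
    """Variance of g(Z1, Z2), computed as E(g^2) - (E g)^2."""
    return expect(lambda a, b: g(a, b) ** 2) - expect(g) ** 2

cov = expect(lambda a, b: a * b) - expect(lambda a, b: a) * expect(lambda a, b: b)
var1 = var(lambda a, b: a)
var2 = var(lambda a, b: b)
var_sum = var(lambda a, b: a + b)
var_diff = var(lambda a, b: a - b)
```

Here `var_sum` equals `var1 + var2 + 2 * cov` and `var_diff` equals `var1 + var2 - 2 * cov`, in line with the rule above.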


Probability

Uncorrelatedness vs. independence

Two random variables $Z_1, Z_2$ are called uncorrelated if

$$\mathrm{Cov}(Z_1, Z_2) = 0$$

For uncorrelated random variables $Z_1, Z_2$ it holds that

$$E(Z_1 Z_2) = EZ_1\,EZ_2$$

$$\mathrm{Var}(Z_1 \pm Z_2) = \mathrm{Var}\,Z_1 + \mathrm{Var}\,Z_2$$

If two random variables are stochastically independent, then they are also uncorrelated. The converse is not generally true.
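A standard counter-example shows that uncorrelated random variables may still be dependent. The sketch assumes Z1 uniform on {−1, 0, 1} and Z2 = Z1², a choice made here for illustration and not taken from the slides.

```python
from fractions import Fraction

third = Fraction(1, 3)
# Z1 is uniform on {-1, 0, 1} and Z2 = Z1**2, so Z2 is a function of Z1.
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expect(g):
    """Expectation of g(Z1, Z2) under the joint distribution."""
    return sum(p * g(z1, z2) for (z1, z2), p in joint.items())

cov = expect(lambda a, b: a * b) - expect(lambda a, b: a) * expect(lambda a, b: b)

p_joint = joint[(0, 0)]                                   # P(Z1 = 0, Z2 = 0)
p_z1 = sum(p for (z1, _), p in joint.items() if z1 == 0)  # P(Z1 = 0)
p_z2 = sum(p for (_, z2), p in joint.items() if z2 == 0)  # P(Z2 = 0)
```

The covariance is zero, yet `p_joint` differs from `p_z1 * p_z2`, so the joint law is not the product of the marginal laws.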


Mathematical Statistics

Mathematical statistics develops methods to determine the parameters of a distribution from a mathematical sample, and it develops statistical tests to check hypotheses.


Mathematical Statistics

Mathematical sample

Mathematical statistics initially models a sequence of $n$ real observed univariate data $z_1, z_2, \ldots, z_n \in \mathbb{R}^1$ as independent realizations of a real random variable $Z$.

In the multivariate case, $z_i \in \mathbb{R}^m$, $i = 1, \ldots, n$, and $Z$ denotes a random vector.


Mathematical Statistics

Mathematical sample

The model of mathematical statistics may be generalized along the lines of the following example.

Throwing a fair die $n$ times and recording the result each time is equivalent to simultaneously throwing $n$ dice once and recording all outcomes if
(i) the dice cup is large enough that the dice do not interfere with each other, and
(ii) the $n$ dice are identical copies of the initial one.


Mathematical Statistics

Mathematical sample

The link between mathematical statistics and probability is established by employing a twofold model of the data as follows

$$\begin{array}{cccccc} Z & \to & z_1, & z_2, & \ldots, & z_n \\ & & \uparrow & \uparrow & \uparrow & \uparrow \\ & & Z_1, & Z_2, & \ldots, & Z_n \end{array}$$

where $Z_1, Z_2, \ldots, Z_n$ is a sequence of independent, identically distributed (iid) random variables.

The sequence of data is modeled both as the $n$–fold realization of a single random variable $Z$ and as a single realization of an iid sequence of $n$ random variables $Z_1, Z_2, \ldots, Z_n$. The set $Z_1, \ldots, Z_n$ is called the mathematical sample.


Mathematical Statistics

Mathematical sample

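According to the second model of mathematical statistics we find the following results. Let

$$M := \frac{1}{n} \sum_{i=1}^{n} Z_i$$

Then

$$EM = E\Big(\frac{1}{n} \sum_{i=1}^{n} Z_i\Big) = \frac{1}{n} \sum_{i=1}^{n} EZ_i = EZ = \mu$$

$$\mathrm{Var}\,M = \mathrm{Var}\Big(\frac{1}{n} \sum_{i=1}^{n} Z_i\Big) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}\,Z_i = \frac{1}{n}\,\mathrm{Var}\,Z = \frac{1}{n}\,\sigma^2$$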
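Both results can be confirmed exactly for the dice example by enumerating all outcomes. The sketch assumes n = 3 fair dice; the variance step relies on the independence of the dice, which the enumeration of equally likely outcomes encodes.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
n = 3  # number of dice thrown at once (illustrative)

mu = Fraction(sum(faces), 6)                              # EZ = 7/2
sigma2 = sum((Fraction(f) - mu) ** 2 for f in faces) / 6  # Var Z = 35/12

# Enumerate all 6**n equally likely outcomes and record the mean M of each.
means = [Fraction(sum(outcome), n) for outcome in product(faces, repeat=n)]
mean_M = sum(means) / len(means)
var_M = sum((m - mean_M) ** 2 for m in means) / len(means)
```

The enumeration gives `mean_M == mu` and `var_M == sigma2 / n`, i.e. the mean of the sample is unchanged while its variance shrinks with 1/n.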


Applied Mathematical Statistics – Data Analysis

Applied statistics attempts to find the probability law of the random variable of which the observed data are thought to be realizations, and applies statistical tests to real-world data in practice.


Applied Mathematical Statistics – Data Analysis

Statistics

Descriptive mathematical statistics attempts to condense the information conveyed by the data into a few numbers describing and characterizing the set of data. In this way, empirical mean, empirical variance, empirical covariance, etc. may be seen as empirical parameters of the set of data.

$$\bar{z} = \frac{1}{n} \sum_{i=1}^{n} z_i$$

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (z_i - \bar{z})^2, \quad s = \sqrt{s^2}$$

$$s_{12} = \frac{1}{n-1} \sum_{i=1}^{n} (z_{1i} - \bar{z}_1)(z_{2i} - \bar{z}_2)$$

$$r = \frac{\frac{1}{n-1} \sum_{i=1}^{n} (z_{1i} - \bar{z}_1)(z_{2i} - \bar{z}_2)}{s_1 s_2}$$
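The empirical parameters translate directly into code. The following is a minimal sketch with made-up data; the function names are chosen for this example and follow the formulas above term by term.

```python
import math

def mean(z):
    """Arithmetic mean of the data."""
    return sum(z) / len(z)

def variance(z):
    """Empirical variance s^2 with divisor n - 1."""
    m = mean(z)
    return sum((zi - m) ** 2 for zi in z) / (len(z) - 1)

def covariance(z1, z2):
    """Empirical covariance s_12 of paired data with divisor n - 1."""
    m1, m2 = mean(z1), mean(z2)
    return sum((a - m1) * (b - m2) for a, b in zip(z1, z2)) / (len(z1) - 1)

def correlation(z1, z2):
    """Empirical correlation coefficient r = s_12 / (s_1 * s_2)."""
    return covariance(z1, z2) / (math.sqrt(variance(z1)) * math.sqrt(variance(z2)))

# Hypothetical paired data with an exact linear relationship z2 = 2 * z1.
z1 = [1.0, 2.0, 3.0, 4.0, 5.0]
z2 = [2.0, 4.0, 6.0, 8.0, 10.0]
r = correlation(z1, z2)
```

Because the second series is an exact positive multiple of the first, `r` equals 1 up to rounding.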


Applied Mathematical Statistics – Data Analysis

Statistics

Inferential mathematical statistics attempts to infer the probability law of the random variable from the sequence of the data and its descriptive parameters, and to test hypotheses by means of statistical tests.

At best, the empirical parameters of the data turn out to be reasonable estimates of the parameters of the probability law and provide insight.


Applied Statistics

Stochastic model      Real world

Random variables      Data
Expectation           Arithmetic mean
Variance              Empirical variance
Covariance            Empirical covariance
Covariance matrix     Matrix of empirical covariances


Mathematical Statistics

The fundamental modeling assumption of applied statistics

The elements of the sample are assumed to be independent and identical repetitions of the same experiment or observation; otherwise, classical statistics does not apply.


GeoStatistics

Geostatistics is a generalization of classical statistics for georeferenced, spatially and stochastically dependent random variables. Thus, the fundamental modeling assumption of classical statistics is violated by definition.

The fundamental modeling assumption of geostatistics is homogeneity (stationarity), which is a mathematical expression for a conservation law, e.g. the increments of any two random variables do not depend on their absolute location in space.




Geostatistics in a nutshell

The problem and prerequisites of its resolution

Experiencing spatially induced correlation

Applying spatial correlation to prediction I: Heuristic models

Applying spatial correlation to prediction II: Stochastic model – “Kriging”

– Data based descriptive models of spatial correlation: The experimental semi–variogram
– Stochastic modeling with random functions: The semi–variogram
– Fundamental assumption of geostatistics – Homogeneity
– Modeling the semi–variogram
– Best linear unbiased estimator (BLUE)
– Kriging systems
– Practice of ordinary kriging

Stochastic simulation of random functions


The problem

Inventory

symbol                          description

$x_1, \ldots, x_n \in D$        sites of sampling
$z(x_1), \ldots, z(x_n)$        data (scalar or vectorial)

Let $x_0 \in D$ with $z(x_0)$ unknown; define the linear combination

$$z^*(x_0) = \sum_{\ell=1}^{n} w_\ell(x_0)\,z(x_\ell)$$

with initially unknown coefficients (“weights”) $w_\ell(x_0)$, $\ell = 1, \ldots, n$.


The problem

Linear combination

$$z^*(x_0) = \sum_{\ell=1}^{n} w_\ell(x_0)\,z(x_\ell)$$

Problem

What are the prerequisites for $z^*(x_0)$ to be a reasonable predictor of $z(x_0)$?

What is a reasonable way to determine the weights $w_\ell(x_0)$, $\ell = 1, \ldots, n$, in such a way that
– $z^*(x_0)$ is a good predictor of $z(x_0)$,
– $z^*(x_0)$ is the “best” predictor, and with respect to what criterion?

Conceptually,

given $x_0$, which $w_\ell(x_0) \neq 0$, i.e., which $z(x_\ell)$ enter the prediction $z^*(x_0)$?

if $w_\ell(x_0) \neq 0$, how is it determined?


The prerequisites

Tendency of preservation

The general possibility of a reasonable predictor $z^*(x_0)$ requires some kind of “tendency of preservation” of the properties being sampled.

Such a tendency could mathematically be captured with terms like continuity or continuous differentiability, i.e., with some measure of smoothness.

In probability or statistics it would be termed spatially induced similarity, correlation, dependence.


The prerequisites

Counter–example

Throwing dice at $x_\ell$, $\ell = 1, \ldots, n$.

“The next observation is as surprising as each previous one.”

Contradiction to the fundamental assumption of statistics

Approaching the problem stochastically, a new kind of statistics is required, as classical statistics depends on the fundamental assumption of independent, identically distributed random variables, i.e., on independent repetitions of identical sampling.


Experiencing spatially induced correlation

Scientists’ and engineers’ experience suggests, e.g., that the ore contents of the specimens in a sample from a homogeneous ore deposit

– are the more similar the closer their respective sampling locations, independently of their actual values;

– are no longer similar at all if the distance between their respective locations is larger than a specific distance characteristic of the ore deposit.

This experience can be generalized to many other spatial phenomena.


Applying spatial correlation

Some kind of spatially induced “similarity”, “continuity”, “correlation” is a prerequisite of any reasonable prediction.

According to common experience, the extent of this spatially induced “similarity”, “continuity”, “correlation” is a function of distance.

Putting this experience to constructive use, the linear ansatz may be rewritten as

$$z^*(x_0) = \sum_{\ell=1}^{n} w_\ell(x_0)\,z(x_\ell) = \sum_{\ell=1}^{n} w(x_0 - x_\ell)\,z(x_\ell)$$

with a decreasing weight function $w$ that is radially symmetric with respect to the origin.


Applying spatial correlation to prediction

Heuristic models (1)

Inverse distance weighting

$$z^*(x_0) = \sum_{\ell=1}^{n} w(x_0 - x_\ell)\,z(x_\ell)$$

with

$$w_\ell(x_0) = w(x_0 - x_\ell) = \frac{1/\|x_0 - x_\ell\|}{\sum_{k=1}^{n} 1/\|x_0 - x_k\|}$$
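Inverse distance weighting can be sketched in a few lines of Python. The function name `idw_predict` and the data are hypothetical; the handling of a prediction location that coincides with a sampling site (where the weight 1/0 is undefined) is an assumption of this sketch, not something stated on the slide.

```python
import math

def idw_predict(x0, sites, values):
    """Inverse distance weighted prediction z*(x0) from data at the given sites."""
    dists = [math.dist(x0, x) for x in sites]
    # Assumption: if x0 coincides with a sampling site, return the datum itself,
    # since the weight 1 / 0 is undefined.
    for d, z in zip(dists, values):
        if d == 0.0:
            return z
    inverse = [1.0 / d for d in dists]
    total = sum(inverse)
    weights = [v / total for v in inverse]  # weights are positive and sum to one
    return sum(w * z for w, z in zip(weights, values))

# Two hypothetical sampling sites with data 1 and 3; x0 lies midway between them.
sites = [(0.0, 0.0), (2.0, 0.0)]
values = [1.0, 3.0]
prediction = idw_predict((1.0, 0.0), sites, values)
```

At the midpoint both weights equal 1/2, so the prediction is the plain average of the two data; moving x0 towards one site shifts the prediction towards that site's value.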

Other choices?


Applying spatial correlation to prediction

Heuristic models (2)

How to choose appropriate weight functions – naive view

Weighted mean of data

$$z^*(x_0) = \sum_{\ell=1}^{n} \underbrace{w_\ell(x_0)}_{\text{weight}}\;\underbrace{z(x_\ell)}_{\text{data}}$$


Applying spatial correlation to prediction

Heuristic models (3)

How to choose an appropriate weight function – dual view

Linear combination of radially symmetric basis functions

$$z^*(x) = \sum_{\ell=1}^{n} \underbrace{w_\ell(x)}_{\text{basis functions}}\;\underbrace{z(x_\ell)}_{\text{weights}} = \sum_{\ell=1}^{n} \underbrace{w(x - x_\ell)}_{\text{basis function}}\;\underbrace{z(x_\ell)}_{\text{weights}}$$

What can be said about the smoothness of $z^*(x)$ in terms of continuity, continuous differentiability, etc.?


Applying spatial correlation to prediction

Stochastic model (1)

Apply data analysis, i.e., descriptive statistics, to derive a description of the spatially induced correlation.


Applying spatial correlation to prediction

Stochastic model (2)

h–scatter plot

An h–scatter plot is a scatter plot of all pairs of measurements $(z(x_\ell), z(x_\ell + h))$ of the same attribute $z$ at locations $x_\ell$ and $x_\ell + h$ separated by the vector $h$.

Note that $h$ is a vector such that $x_\ell, x_\ell + h \in D$.

An h–scatter plot visualizes spatial variability or continuity, respectively. It is very helpful for identifying extreme values.


Applying spatial correlation to prediction

Stochastic model (3)

h–scatter plot

For most natural phenomena it is generally expected that the spatial variability increases, i.e., spatial continuity decreases, as the length $\|h\|$ of $h$ increases.

Thus, for increasing $\|h\|$ the points cluster less tightly around the first bisector in the $(z(x), z(x + h))$–plane.

This behaviour may differ for different directions of $h$, which is referred to as anisotropy.

An h–scatter plot may be summarized by the mean of the squared orthogonal distances of the points to the first bisector. Thus, in terms of mechanics, it is the moment of inertia with respect to the first bisector.
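The summary of an h–scatter plot by its moment of inertia can be sketched for the simplest setting. The sketch assumes data on a regular one-dimensional transect, so that the lag h is a whole number of grid steps; the function names and data are hypothetical. It uses the fact that the squared orthogonal distance of a point (a, b) to the first bisector is (a − b)²/2.

```python
def h_pairs(z, lag):
    """All pairs (z(x), z(x + h)) on a regular 1-D transect, h = lag grid steps."""
    return [(z[i], z[i + lag]) for i in range(len(z) - lag)]

def moment_of_inertia(pairs):
    """Mean squared orthogonal distance of the points to the first bisector."""
    return sum((a - b) ** 2 / 2.0 for a, b in pairs) / len(pairs)

# Hypothetical measurements along a transect with unit spacing.
z = [1.0, 2.0, 4.0, 7.0, 11.0]
inertia_lag1 = moment_of_inertia(h_pairs(z, 1))
inertia_lag2 = moment_of_inertia(h_pairs(z, 2))
```

For these data the points of the lag-2 scatter plot lie further from the first bisector than those of the lag-1 plot, so `inertia_lag2` exceeds `inertia_lag1`.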


Applying spatial correlation to prediction

Stochastic model (4)

Sample semi–variogram

$$\gamma(h) := \frac{1}{2N(h)} \sum_{\ell=1}^{N(h)} [z(x_\ell) - z(x_\ell + h)]^2$$

where $N(h)$ is the number of pairs of sites separated by $h$, and $z(x_\ell) - z(x_\ell + h)$ are referred to as h–increments of $z$.

Pythagoras’ theorem helps to see that

$$\frac{1}{2}\,[z(x_\ell) - z(x_\ell + h)]^2 = \cos^2\frac{\pi}{4}\,[z(x_\ell) - z(x_\ell + h)]^2$$

is actually the squared orthogonal distance of the point with coordinates $(z(x_\ell), z(x_\ell + h))$ to the first bisector.
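A direct transcription of the sample semi–variogram for a lag vector h might look as follows. This is a sketch under assumptions: sites are coordinate tuples, a pair counts if its separation equals h up to a small tolerance `tol` (a hypothetical parameter; practical software usually works with distance and direction classes instead), and the data are made up.

```python
import math

def sample_semivariogram(sites, values, h, tol=1e-9):
    """Sample semi-variogram gamma(h) for a lag vector h.

    Uses every ordered pair of sites (x, x') whose separation x' - x equals h
    up to tol; returns None if no such pair exists.
    """
    squared_increments = []
    for x, zx in zip(sites, values):
        for y, zy in zip(sites, values):
            offset = [yi - xi - hi for xi, yi, hi in zip(x, y, h)]
            if math.hypot(*offset) <= tol:
                squared_increments.append((zx - zy) ** 2)
    if not squared_increments:
        return None
    return sum(squared_increments) / (2 * len(squared_increments))

# Hypothetical data on the four corners of a unit square.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 4.0, 8.0]
gamma_east = sample_semivariogram(sites, values, (1.0, 0.0))
gamma_north = sample_semivariogram(sites, values, (0.0, 1.0))
```

In this toy example the value for the northward lag is larger than for the eastward lag of the same length, which is the kind of direction dependence that a separate evaluation per lag vector makes visible.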


Applying spatial correlation to prediction

Stochastic model (5)

Sample semi–variogram

The sample semi–variogram is a data–driven figure describing the increasing dissimilarity of observations at any two sites with increasing distance.

From the plot of a sample semi–variogram we may read off the sill, the range, and the nugget effect, and summarize it in these three terms.


Applying spatial correlation to prediction

Stochastic model (6)

Descriptive statistics vs. mathematical statistics

What is the counterpart of the semi–variogram in terms of probability and mathematical statistics?

Since the sample semi–variogram measures the dissimilarity of $z(x_\ell)$ and $z(x_\ell + h)$, i.e. the variability of the increments $z(x_\ell) - z(x_\ell + h)$, its counterpart should be a variance.

Being a mean, the semi–variogram is reminiscent of an expectation.

Variance of increments

$$\mathrm{Var}\big(Z(x) - Z(x + h)\big) = E\big(Z(x) - Z(x + h)\big)^2 - \underbrace{\big(E(Z(x) - Z(x + h))\big)^2}_{\text{homogeneity assumption: } \equiv 0}$$


The stochastic model: Random functions (1)

Random function

The set of spatially indexed random variables $(Z(x), x \in D)$ is called a random function (RF).

Inventory of the model

symbol                        description
$x_1, \dots, x_n \in D$       sampling sites
$Z(x_1), \dots, Z(x_n)$       random variables associated with the sampling sites
$z(x_1), \dots, z(x_n)$       data; $z(x_\ell)$ is interpreted as a single realization of $Z(x_\ell)$
$Z(x_0)$                      random variable associated with the location $x_0$
$z(x_0)$                      unknown value, to be estimated
$Z^*(x_0)$                    random variable, estimator of the random variable $Z(x_0)$
$z^*(x_0)$                    realization of $Z^*(x_0)$, estimate of $z(x_0)$

The stochastic model: Random functions (2)

The novel model provided by a random function interprets the data $z_\ell$, $\ell = 1, \dots, n$, supported by the $n$ specimens at the sampling locations $x_\ell \in D$, $\ell = 1, \dots, n$, as a unique discrete realization $z(x_\ell)$, $x_\ell \in D$, $\ell = 1, \dots, n$, of a unique spatial random function $Z(x)$, $x \in D$.

The random function is also referred to as a regionalized random variable.

The stochastic model: Random functions (3)

Estimator – estimate

Let $x_0 \in D$; define the estimator $Z^*(x_0)$ as a linear combination of random variables, i.e.,

\[
Z^*(x_0) = \sum_{\ell=1}^{n} \lambda_\ell(x_0)\, Z(x_\ell)
\]

with initially unknown coefficients (“weights”) $\lambda_\ell(x_0)$, $\ell = 1, \dots, n$.

Problem rephrased

How should the weights $\lambda_\ell(x_0)$, $\ell = 1, \dots, n$, be determined such that
– $Z^*(x_0)$ is a good estimator of $Z(x_0)$,
– $Z^*(x_0)$ is the “best” estimator, and with respect to what criterion?

The stochastic model: Random functions (4)

Estimator – estimate

If

\[
Z^*(x_0) = \sum_{\ell=1}^{n} \lambda_\ell(x_0)\, Z(x_\ell)
\]

is a good estimator of $Z(x_0)$, then its realization

\[
z^*(x_0) = \sum_{\ell=1}^{n} \lambda_\ell(x_0)\, z(x_\ell)
\]

with the same weights should be a good estimate of $z(x_0)$.

Random functions – Moments (1)

Any marginal one–point distribution function is given by

\[
F(x; z) := P\bigl(Z(x) \le z\bigr)
\]

The mean of the RF is the expected value function

\[
m(x) = E\,Z(x)
\]

and its variance is the variance function

\[
\operatorname{Var} Z(x) = E\,Z^2(x) - m^2(x)
\]

of the random function $(Z(x), x \in D)$.

Moments (2)

The centered 2–point covariance is the covariance of the random variables $Z(x_1)$ and $Z(x_2)$

\[
\begin{aligned}
\operatorname{Cov}\bigl(Z(x_1), Z(x_2)\bigr) &= E\bigl[(Z(x_1) - m(x_1))(Z(x_2) - m(x_2))\bigr] \\
&= E\bigl(Z(x_1) Z(x_2)\bigr) - m(x_1)\, m(x_2)
\end{aligned}
\]

It is assumed that both moments exist, i.e. that they are finite. Then $Z(x)$ is called a second–order random function: it has a finite variance and its covariance exists everywhere.

A covariance function is an even and positive definite function.

Moments (3)

The two–point variogram is defined as

\[
\operatorname{Var}\bigl(Z(x_1) - Z(x_2)\bigr) = 2\gamma(x_1, x_2)
\]

The variogram is an even and non–negative function. More importantly, $-\gamma$ is a conditionally positive definite function.

Moments (4)

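It holds

\[
\begin{aligned}
2\gamma(x_\ell, x_k) &= \operatorname{Var}\bigl(Z(x_\ell) - Z(x_k)\bigr) \\
&= \operatorname{Var}\bigl(Z(x_\ell)\bigr) + \operatorname{Var}\bigl(Z(x_k)\bigr) - 2\operatorname{Cov}\bigl(Z(x_\ell), Z(x_k)\bigr)
\end{aligned}
\]

If

\[
\operatorname{Var}\bigl(Z(x_\ell)\bigr) = \operatorname{Var}\bigl(Z(x_k)\bigr) =: C(0)
\quad\text{and}\quad
\operatorname{Cov}\bigl(Z(x_\ell), Z(x_k)\bigr) =: C(x_\ell, x_k)
\]

then

\[
\gamma(x_\ell, x_k) = C(0) - C(x_\ell, x_k)
\]

where $C(0)$ corresponds to the sill of the two–point semivariogram.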

Statistics of random functions (1)

The problem of geostatistics

Only one discrete realization $z(x_\ell)$, $\ell = 1, \dots, n$, of the random function $Z(x)$, sampled at the locations $x_\ell \in D$, exists. That is, for just a few random variables $Z(x_\ell)$ of the random function $Z(x)$, only a single realization $z(x_\ell)$ has been sampled.

The geostatistical solution

A random function is furnished with “pleasant” properties such that the model permits deriving statistics from a single discrete realization.

The solution is provided by an appropriate generalization of the assumption of independent and identically distributed random variables inherent in classical statistics.

Statistics of random functions (2)

Homogeneity

The required “pleasant” property essentially consists in the modeling assumption that the increments $z(x_\ell + h) - z(x_\ell)$ and $z(x_k + h) - z(x_k)$ are realizations of a unique random variable $\Delta(h) := Z(x + h) - Z(x)$, which represents increments independently of the locations involved.

Note that it is not assumed that $\Delta(h_1)$ and $\Delta(h_2)$ are independent.

The arithmetic mean of the observed increments $\delta(h) = z(x + h) - z(x)$ provides a reasonable estimate of the expectation $E(\Delta(h))$.

Homogeneity of random functions (1)

Strong homogeneity

A random function is called strongly (strictly) stationary (better: homogeneous) if all finite–dimensional joint distributions are translation–invariant, i.e.

\[
P\bigl(Z(x_1) \le z_1, \dots, Z(x_k) \le z_k\bigr) = P\bigl(Z(x_1 + h) \le z_1, \dots, Z(x_k + h) \le z_k\bigr)
\]

Homogeneity of random functions (2)

Strong stationarity (homogeneity) implies that the moments of the random function are invariant under translation (if they exist), i.e.

\[
E\,Z(x) = m
\]

\[
\operatorname{Cov}\bigl(Z(x), Z(x+h)\bigr) = E\bigl[(Z(x) - m)(Z(x+h) - m)\bigr] = C(h)
\]

\[
\operatorname{Var}\bigl(Z(x+h) - Z(x)\bigr) = E\bigl[(Z(x+h) - Z(x))^2\bigr]
\]

Thus, the mean is constant and the covariance function depends only on the lag $h$.

Homogeneity of random functions (3)

Second–order homogeneity

A random function is called second–order (weakly) stationary (better: homogeneous), abbreviated SRF, if

\[
E\,Z(x) = m
\]

\[
\operatorname{Cov}\bigl(Z(x), Z(x+h)\bigr) = C(h)
\]

If $C$ is a function of $|h|$ only, then the SRF is isotropic.

Second–order stationarity (homogeneity) implies

\[
E\bigl(Z(x+h) - Z(x)\bigr) = 0
\]

\[
\operatorname{Var}\bigl(Z(x+h) - Z(x)\bigr) = 2\bigl(C(0) - C(h)\bigr)
\]

Homogeneity of random functions (4)

Intrinsic homogeneity

A random function is called intrinsically stationary (better: homogeneous), abbreviated IRF, if the increment variable $\Delta(x, h) = Z(x+h) - Z(x)$ is an SRF with respect to $x \in D$, i.e.

\[
E\bigl(Z(x+h) - Z(x)\bigr) = a^T h
\]

\[
\operatorname{Var}\bigl(Z(x+h) - Z(x)\bigr) = 2\gamma(h)
\]

Thus, the expectation of the increments is a linear function of the lag $h$ (linear drift), and their variance is given by the variogram (function). If $\gamma$ is a function of $|h|$ only, then the IRF is isotropic.

If the linear drift is zero, i.e. if the mean is constant, then the intrinsic model is of the form

\[
E\bigl(Z(x+h) - Z(x)\bigr) = 0, \qquad E\bigl[(Z(x+h) - Z(x))^2\bigr] = 2\gamma(h)
\]

Bounded variograms and stationarity

An SRF is also an IRF and therefore has a variogram. In this case

\[
\gamma(h) = C(0) - C(h)
\]

Since $|C(h)| \le C(0)$, the variogram of an SRF is bounded by $2C(0)$.

Conversely, if the variogram of an IRF is bounded, then $\gamma$ is of the form given above with a stationary covariance $C(h)$.

Sill of a variogram and sample variance

The theoretical variance of $Z(x)$ is equal to $C(0)$ if $Z$ is an SRF, or does not exist if $Z$ is a nonstationary IRF.

In the case of an SRF, the expectation of the sample variance is always smaller than the theoretical variance. Thus, the sample variance is a biased estimator of the theoretical variance $C(0)$.

The variogram, unlike the covariance, does not require knowledge of the mean, which in practice is not known. The variogram is not affected by this problem, because it automatically filters the mean.

Sample variogram

The sample variogram (experimental, empirical variogram) is defined as

\[
\gamma(h) = \frac{1}{2N(h)} \sum_{x_\ell - x_k \simeq h} \bigl(z(x_\ell) - z(x_k)\bigr)^2
\]

where $N(h)$ denotes the total number of pairs of points separated by the lag $h$.
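
The definition above can be sketched in a few lines of Python. This is a minimal illustration for one-dimensional locations only; the function name `sample_variogram`, the lag tolerance `tol` used to decide which pairs count as "separated by h", and the toy data are assumptions made for this example, not part of the lecture.

```python
import numpy as np

def sample_variogram(x, z, lags, tol):
    """Empirical semivariogram for 1-D locations x and data z.

    For each lag h in `lags`, every unordered pair with
    | |x_l - x_k| - h | <= tol contributes to the sum.
    Returns (gamma, counts); gamma is NaN where no pair falls into a lag class.
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(x), k=1)       # each unordered pair once
    dist = np.abs(x[i] - x[j])
    sqdiff = (z[i] - z[j]) ** 2
    gamma = np.full(len(lags), np.nan)
    counts = np.zeros(len(lags), dtype=int)
    for m, h in enumerate(lags):
        mask = np.abs(dist - h) <= tol
        counts[m] = mask.sum()                # N(h)
        if counts[m] > 0:
            gamma[m] = sqdiff[mask].sum() / (2.0 * counts[m])
    return gamma, counts

# Toy data on a regular transect (hypothetical values)
x = [0.0, 1.0, 2.0, 3.0, 4.0]
z = [1.0, 3.0, 2.0, 5.0, 4.0]
gamma, counts = sample_variogram(x, z, lags=[1.0, 2.0], tol=0.1)
print(gamma, counts)
```

For irregularly spaced data the tolerance controls the trade-off between the number of pairs per lag class and the resolution in h.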

Variances (1)

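What is the variance of an arbitrary linear combination $\sum_\ell \lambda_\ell Z(x_\ell)$?

Let $Z^*$ be a finite linear combination of random variables $Z(x_\ell)$ of a random function $(Z(x), x \in D)$, i.e.,

\[
Z^* = \sum_\ell \lambda_\ell Z(x_\ell), \qquad \lambda_\ell \in \mathbb{R}
\]

then

\[
\begin{aligned}
\operatorname{Var}(Z^*) &= \operatorname{Var}\Bigl(\sum_\ell \lambda_\ell Z(x_\ell)\Bigr) \\
&= \operatorname{Cov}\Bigl(\sum_\ell \lambda_\ell Z(x_\ell), \sum_k \lambda_k Z(x_k)\Bigr) \\
&= \sum_\ell \sum_k \lambda_\ell \lambda_k \operatorname{Cov}\bigl(Z(x_\ell), Z(x_k)\bigr) \\
&= \sum_\ell \sum_k \lambda_\ell \lambda_k C(x_\ell - x_k) = \lambda^T \bigl(C(x_\ell - x_k)\bigr)_{\ell,k}\, \lambda
\end{aligned}
\]

where $\lambda = (\lambda_1, \dots, \lambda_n)^T$ denotes the vector of weights.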

Variances (2)

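If $Z$ is an SRF, then

\[
0 \le \operatorname{Var}\Bigl(\sum_{\ell=1}^{n} \lambda_\ell Z(x_\ell)\Bigr) = \sum_{\ell=1}^{n} \sum_{k=1}^{n} \lambda_\ell \lambda_k C(x_\ell - x_k)
\]

A function $C$ with this property for every choice of $n$, of sites $x_\ell$ and of real weights $\lambda_\ell$ is called positive definite.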
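
As a numerical illustration of this property, the sketch below builds the matrix $C(x_\ell - x_k)$ for a handful of sites and checks that the quadratic form is non-negative. The exponential covariance, its parameters, the site coordinates and the random weights are all assumptions chosen for the example.

```python
import numpy as np

def cov_exponential(h, sill=1.0, a=10.0):
    """Assumed example covariance C(h) = sill * exp(-3|h|/a)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

x = np.array([0.0, 1.0, 2.5, 4.0, 7.0])           # distinct 1-D sites
K = cov_exponential(x[:, None] - x[None, :])       # K[l, k] = C(x_l - x_k)

rng = np.random.default_rng(0)
lam = rng.normal(size=len(x))                      # arbitrary real weights
variance = lam @ K @ lam                           # Var(sum_l lam_l Z(x_l))
eigenvalues = np.linalg.eigvalsh(K)                # K is symmetric

print(variance, eigenvalues.min())
```

Checking the smallest eigenvalue of the covariance matrix is equivalent to checking the quadratic form for all weight vectors at once, for this particular set of sites.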

Variances (3)

Assuming intrinsic stationarity (homogeneity), the first two moments of the increments, in particular the semivariogram, exist. However, the covariance function of $Z$ itself may not exist. If $Z$ is an IRF, then only linear combinations which can be represented as linear combinations of increments have a finite variance, i.e.,

\[
\sum_{\ell=1}^{n} \lambda_\ell Z(x_\ell) = \sum_{\ell=1}^{n} \lambda_\ell \bigl(Z(x_\ell) - Z(x_0)\bigr)
\quad\text{if}\quad
\sum_{\ell=1}^{n} \lambda_\ell = 0
\]

Variances (4)

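If $Z$ is an IRF and $\sum_\ell \lambda_\ell = 0$, then

\[
0 \le \operatorname{Var}\Bigl(\sum_{\ell=1}^{n} \lambda_\ell Z(x_\ell)\Bigr) = -\sum_{\ell=1}^{n} \sum_{k=1}^{n} \lambda_\ell \lambda_k \gamma(x_\ell - x_k)
\]

Thus, $-\gamma$ is a conditionally positive definite function.

In the case of $\sum_\ell \lambda_\ell = 0$ the covariance function $C$ may formally be replaced by the negative semivariogram $-\gamma$.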
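
The role of the condition $\sum_\ell \lambda_\ell = 0$ can be illustrated numerically. The sketch below uses an unbounded power variogram $|h|^{1.5}$ as an assumed example (sites and random weights are likewise hypothetical): the quadratic form in $-\gamma$ is non-negative for weights summing to zero, but not for arbitrary weights.

```python
import numpy as np

def gamma_power(h, omega=1.5):
    """Assumed example: power variogram |h|^omega with 0 < omega < 2."""
    return np.abs(h) ** omega

x = np.array([0.0, 1.0, 2.5, 4.0, 7.0])           # distinct 1-D sites
G = gamma_power(x[:, None] - x[None, :])           # G[l, k] = gamma(x_l - x_k)

rng = np.random.default_rng(1)
lam = rng.normal(size=len(x))
lam0 = lam - lam.mean()                            # enforce sum of weights = 0
var_authorized = -lam0 @ G @ lam0                  # a genuine variance

ones = np.ones(len(x))                             # weights summing to n, not 0
not_a_variance = -ones @ G @ ones                  # formal expression only

print(var_authorized, not_a_variance)
```

The second number is negative, which shows why the formal replacement of $C$ by $-\gamma$ is restricted to weight vectors with total weight zero.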

Variances (4)

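Let

\[
Z^*(x_0) = \sum_{\ell=1}^{n} \lambda_\ell Z(x_\ell)
\]

then the variance

\[
\operatorname{Var}\Bigl(\sum_{\ell=1}^{n} \lambda_\ell Z(x_\ell) - Z(x_0)\Bigr)
\]

can be represented in terms of the variogram if

\[
\sum_{\ell=1}^{n} \lambda_\ell = 1
\]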
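
The reason can be made explicit in a short worked step, sketched here under the intrinsic assumption. The auxiliary weight $\lambda_0 := -1$ is notation introduced only for this illustration. With $\sum_{\ell=1}^{n} \lambda_\ell = 1$ the estimation error is a linear combination with total weight zero, so the variance formula for such combinations applies:

```latex
\[
\begin{aligned}
Z^*(x_0) - Z(x_0) &= \sum_{\ell=0}^{n} \lambda_\ell Z(x_\ell),
  \qquad \lambda_0 := -1, \qquad \sum_{\ell=0}^{n} \lambda_\ell = 0, \\
\operatorname{Var}\bigl(Z^*(x_0) - Z(x_0)\bigr)
  &= -\sum_{\ell=0}^{n} \sum_{k=0}^{n} \lambda_\ell \lambda_k \gamma(x_\ell - x_k) \\
  &= -\sum_{\ell=1}^{n} \sum_{k=1}^{n} \lambda_\ell \lambda_k \gamma(x_\ell - x_k)
     + 2 \sum_{\ell=1}^{n} \lambda_\ell \gamma(x_\ell - x_0)
\end{aligned}
\]
```

The last line uses $\gamma(0) = 0$ and the evenness of $\gamma$.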

Variogram models (1)

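Pure nugget–effect

\[
\gamma(h) :=
\begin{cases}
0, & \text{if } h = 0 \\
1, & \text{otherwise}
\end{cases}
\]

Spherical semivariogram

\[
\gamma(h) :=
\begin{cases}
1.5\,\dfrac{h}{a} - 0.5 \left(\dfrac{h}{a}\right)^3, & \text{if } h < a \\
1, & \text{otherwise}
\end{cases}
\]

Exponential semivariogram

\[
\gamma(h) := 1 - \exp\left(-\frac{3h}{a}\right)
\]

Gaussian semivariogram

\[
\gamma(h) := 1 - \exp\left(-\frac{3h^2}{a^2}\right)
\]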
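
The four unit-sill models above translate directly into code. The following is a minimal sketch; the function names are chosen for this example, and $h$ is treated as a non-negative lag distance (the absolute value is taken as a precaution).

```python
import numpy as np

def nugget(h):
    """Pure nugget effect: 0 at h = 0, 1 otherwise."""
    h = np.asarray(h, dtype=float)
    return np.where(h == 0.0, 0.0, 1.0)

def spherical(h, a):
    """Spherical model with range a; equals 1 for h >= a."""
    r = np.minimum(np.abs(np.asarray(h, dtype=float)) / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3

def exponential(h, a):
    """Exponential model; reaches about 95 % of the sill at h = a."""
    h = np.abs(np.asarray(h, dtype=float))
    return 1.0 - np.exp(-3.0 * h / a)

def gaussian(h, a):
    """Gaussian model; reaches about 95 % of the sill at h = a."""
    h = np.asarray(h, dtype=float)
    return 1.0 - np.exp(-3.0 * h ** 2 / a ** 2)

lags = np.array([0.0, 5.0, 10.0, 20.0])
for model in (spherical, exponential, gaussian):
    print(model.__name__, model(lags, a=10.0))
```

Because of the factor 3 in the exponents, the exponential and Gaussian models attain $1 - e^{-3} \approx 0.95$ at $h = a$, so $a$ acts as a practical range for them, whereas the spherical model reaches its sill exactly at $h = a$.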

Variogram models (2)

Power variogram

\[
\gamma(h) := h^{\omega}, \qquad 0 < \omega < 2
\]

Variogram models and covariance functions

For bounded model semivariograms (with a sill), referred to as “transition models”, the corresponding model covariance is provided by

\[
c(h) := 1 - \gamma(h)
\]

For unbounded model semivariogram functions a corresponding covariance function does not exist.

Kriging (1)

Named in honour of Danie Krige, kriging is the genuine method of geostatistics for spatial prediction.

It is a stochastics-based method of spatial prediction (estimation) employing the spatial structure, i.e., the spatial correlation, as captured in the semivariogram.

Based on $n$ georeferenced data of an attribute $z(x_\ell)$, $\ell = 1, \dots, n$, the value of this attribute $z$ shall be predicted for any location $x_0 \in D$ by a linear combination of the data, i.e.

\[
z^*(x_0) := \sum_{\ell=1}^{n} \lambda_\ell(x_0)\, z(x_\ell)
\]

where the weights depend on the spatial correlation of the data.

Kriging (2)

Notation

The random function $(Z(x), x \in D)$.

The set of points $S \subset D$ where $Z(x)$ has been sampled: $S = \{x \in D \mid z(x) \text{ known}\}$. Usually $S$ is finite and consists of $n$ points.

The data $z(x_\ell)$, $\ell = 1, \dots, n$.

The mean value $m(x_\ell) = m_\ell$

The covariances $\operatorname{Cov}(Z(x_\ell), Z(x_k)) = \sigma_{\ell k}$

Change of support: Variability depends on the material support, and must therefore be considered, e.g. point kriging, block kriging, ... .

Neighborhoods: local vs. global approach

Kriging (3)

Linear approach of kriging

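\[
Z^*(x_0) = \sum_{\ell=1}^{n} \lambda_\ell(x_0)\, Z(x_\ell) + \lambda_0(x_0)
\]

or for short

\[
Z^* = \sum_{\ell=1}^{n} \lambda_\ell Z_\ell + \lambda_0
\]

Eventually we shall see that kriging is of the form

\[
Z^* = \sum_{\ell=1}^{n} \lambda_\ell Z_\ell
\]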

Kriging (4)

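A proper estimator should be unbiased

\[
E[Z^*(x_0)] = E[Z(x_0)], \quad\text{i.e.}\quad E[Z^*(x_0) - Z(x_0)] = 0
\]

and its associated estimation variance

\[
\sigma_E^2(x_0) = \operatorname{Var}[Z^*(x_0) - Z(x_0)] \overset{!}{\longrightarrow} \min
\]

should be as small as possible.

Note

\[
\operatorname{Var}[Z^*(x_0) - Z(x_0)] = \underbrace{E\bigl[(Z^*(x_0) - Z(x_0))^2\bigr]}_{\text{mean square error}} - \underbrace{E^2[Z^*(x_0) - Z(x_0)]}_{\text{squared bias}}
\]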

Kriging (5)

These two requirements are the characteristics of kriging:

blue, blup – best linear unbiased estimator, predictor

and lead to the problem of quadratic programming (optimization)

\[
\sigma_E^2(x_0) \overset{!}{\longrightarrow} \min
\]

subject to unbiasedness

\[
E[Z^*(x_0) - Z(x_0)] = 0
\]

Kriging (6)

Simple kriging (SK) refers to kriging in the case of a known constant mean and a known covariance function, and thus to second–order stationarity.

Ordinary kriging (OK) refers to the case of an unknown constant mean and a known variogram, and thus to intrinsic stationarity.

Universal kriging (UK) refers to the case of an unknown mean of known type and a known variogram.

Simple Kriging (1)

Assuming a known constant mean $m$: $m(x) = m$

Linear approach

\[
Z^* = \sum_{\ell=1}^{n} \lambda_\ell Z_\ell + \lambda_0
\]

The constant $\lambda_0$ and the weights $\lambda_\ell$ are determined so as to minimize the error $Z^* - Z_0$, characterized by its mean square $E(Z^* - Z_0)^2$.

The mean square error (mse) is

\[
E(Z^* - Z_0)^2 = \underbrace{\operatorname{Var}(Z^* - Z_0)}_{\text{variance term}} + \underbrace{E^2(Z^* - Z_0)}_{\text{bias term}}
\]

To make the bias term vanish, it is necessary that

\[
\lambda_0 = m_0 - \sum_\ell \lambda_\ell m_\ell
\]

Simple Kriging (2)

Then

\[
Z^* = m_0 + \sum_{\ell=1}^{n} \lambda_\ell \underbrace{(Z_\ell - m_\ell)}_{Y_\ell}
\]

with

\[
Y^* = \sum_{\ell=1}^{n} \lambda_\ell Y_\ell
\]

This amounts to predicting the zero–mean variable $Y(x) = Z(x) - m(x)$ by the linear estimator $Y^* = \sum_\ell \lambda_\ell Y_\ell$, and adding the mean afterwards

\[
Z^* = Y^* + m_0
\]

Therefore, the case of a known mean is equivalent to the case of a zero mean and $\lambda_0 = 0$.

Simple Kriging (3)

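The mse, which is now a variance, is

\[
\operatorname{Var}(Z^* - Z_0) = \sum_\ell \sum_k \lambda_\ell \lambda_k \operatorname{Cov}(Z_\ell, Z_k) - 2 \sum_\ell \lambda_\ell \operatorname{Cov}(Z_\ell, Z_0) + \operatorname{Var}(Z_0)
\]

Setting all partial derivatives to 0

\[
\frac{\partial}{\partial \lambda_\ell} E(Z^* - Z_0)^2 = 2 \sum_k \lambda_k \operatorname{Cov}(Z_\ell, Z_k) - 2 \operatorname{Cov}(Z_\ell, Z_0) \overset{!}{=} 0
\]

As the mse is a convex function due to the positive definiteness of the covariance, the weights are finally given as the solutions of the simple kriging system

\[
\sum_k \lambda_k^{SK} \operatorname{Cov}(Z_\ell, Z_k) = \operatorname{Cov}(Z_\ell, Z_0), \qquad \ell = 1, \dots, n
\]

These equations provide the best linear unbiased predictor (blup).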

Simple Kriging (4)

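With

\[
\sum_k \lambda_k^{SK} C(x_\ell - x_k) = C(x_\ell - x_0)
\]

the estimation variance, called the kriging variance, associated with $Z^*$ is

\[
\begin{aligned}
\sigma_{SK}^2(x_0) = E(Z^* - Z_0)^2 &= \operatorname{Var}(Z_0) - \sum_\ell \lambda_\ell^{SK} \operatorname{Cov}(Z_\ell, Z_0) \\
&= C(0) - \sum_\ell \lambda_\ell^{SK} C(x_\ell - x_0)
\end{aligned}
\]

Note that the kriging variance depends neither on the variables $Z(x_\ell)$ nor on the data $z(x_\ell)$!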

Simple Kriging (5)

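More explicitly, the simple kriging system reads

\[
\begin{pmatrix}
C(x_1 - x_1) & \cdots & C(x_1 - x_n) \\
\vdots & \ddots & \vdots \\
C(x_n - x_1) & \cdots & C(x_n - x_n)
\end{pmatrix}
\begin{pmatrix}
\lambda_1^{SK}(x_0) \\ \vdots \\ \lambda_n^{SK}(x_0)
\end{pmatrix}
=
\begin{pmatrix}
C(x_1 - x_0) \\ \vdots \\ C(x_n - x_0)
\end{pmatrix},
\]

which reduces in matrix notation to

\[
K_{SK}\, \lambda_{SK}(x_0) = k_{SK}.
\]

Then, its solution is given by

\[
\lambda_{SK}(x_0) = K_{SK}^{-1} k_{SK}
\]

and the kriging variance is accordingly

\[
\sigma_{SK}^2 = C(0) - \lambda_{SK}^T(x_0)\, k_{SK} = C(0) - k_{SK}^T K_{SK}^{-1} k_{SK}
\]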
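
The matrix form lends itself to a compact numerical sketch. The example below assumes one-dimensional sites, an exponential covariance with unit sill, and a known mean; the function names, coordinates and data values are hypothetical and serve only to illustrate the system $K_{SK}\lambda_{SK} = k_{SK}$.

```python
import numpy as np

def cov_exponential(h, sill=1.0, a=10.0):
    """Assumed example covariance C(h) = sill * exp(-3|h|/a)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def simple_kriging(x, z, x0, mean, cov):
    """Simple kriging at x0 for 1-D sites x, data z and a known constant mean."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    K = cov(x[:, None] - x[None, :])      # K[l, k] = C(x_l - x_k)
    k = cov(x - x0)                       # k[l]    = C(x_l - x0)
    lam = np.linalg.solve(K, k)           # solve K lam = k (no explicit inverse)
    estimate = mean + lam @ (z - mean)    # krige the residuals, add the mean
    variance = cov(0.0) - lam @ k         # C(0) - lam^T k
    return estimate, variance, lam

x = np.array([0.0, 2.0, 5.0, 9.0])
z = np.array([1.2, 0.7, 1.9, 1.4])
est, var, lam = simple_kriging(x, z, x0=3.0, mean=1.0, cov=cov_exponential)
print(est, var, lam)
```

Solving the linear system with `np.linalg.solve` rather than forming $K_{SK}^{-1}$ is a common numerical choice; the formulas are otherwise those given above.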

Simple Kriging (6)

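A solution of the kriging system

\[
\begin{pmatrix}
C(x_1 - x_1) & \cdots & C(x_1 - x_n) \\
\vdots & \ddots & \vdots \\
C(x_n - x_1) & \cdots & C(x_n - x_n)
\end{pmatrix}
\begin{pmatrix}
\lambda_1^{SK}(x_0) \\ \vdots \\ \lambda_n^{SK}(x_0)
\end{pmatrix}
=
\begin{pmatrix}
C(x_1 - x_0) \\ \vdots \\ C(x_n - x_0)
\end{pmatrix},
\]

in matrix notation

\[
K_{SK}\, \lambda_{SK}(x_0) = k_{SK},
\]

exists and is unique, and the kriging variance is non–negative, if $K_{SK} = \bigl(C(x_i - x_j)\bigr)_{i,j}$ is positive definite, i.e. in practice if

1 $x_i \ne x_j$ for $i \ne j$

2 the covariance function has been modeled by authorized mathematical model functions.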

Simple Kriging (9)

Properties

Interpolation

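\[
z^*(x_\ell) = z(x_\ell), \qquad \sigma_{SK}^2(x_\ell) = 0
\]

Smoothing

\[
\operatorname{Var} Z^* = \sum_\ell \sum_k \lambda_\ell^{SK} \lambda_k^{SK} \operatorname{Cov}(Z_\ell, Z_k) = \sum_\ell \lambda_\ell^{SK} \operatorname{Cov}(Z_\ell, Z_0),
\]

\[
\operatorname{Var} Z^* = \operatorname{Var} Z_0 - \sigma_{SK}^2
\]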
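
Both properties can be checked numerically. The sketch below reuses an assumed exponential covariance with unit sill and hypothetical one-dimensional data: kriging at a data location reproduces the datum with zero kriging variance, and the variance of the estimator falls short of $\operatorname{Var} Z_0 = C(0)$ by exactly the kriging variance.

```python
import numpy as np

def cov_exponential(h, sill=1.0, a=10.0):
    """Assumed example covariance C(h) = sill * exp(-3|h|/a)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

x = np.array([0.0, 2.0, 5.0, 9.0])        # hypothetical sites
z = np.array([1.2, 0.7, 1.9, 1.4])        # hypothetical data
m = 1.0                                   # known mean (assumed)
K = cov_exponential(x[:, None] - x[None, :])

def sk(x0):
    """Simple kriging estimate, kriging variance and weights at x0."""
    k = cov_exponential(x - x0)
    lam = np.linalg.solve(K, k)
    return m + lam @ (z - m), cov_exponential(0.0) - lam @ k, lam

# Interpolation: predict at the data location x[1]
est_at_data, var_at_data, _ = sk(x[1])

# Smoothing: Var Z* = lam^T K lam = C(0) - sigma_SK^2 at an unsampled location
est, var, lam = sk(3.0)
var_estimator = lam @ K @ lam

print(est_at_data, var_at_data, var_estimator, var)
```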

Simple Kriging – Dual Kriging (1)

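\[
\begin{aligned}
z^*(x) &= \sum_\ell \lambda_\ell^{SK}(x)\, z(x_\ell)
= \begin{pmatrix} z(x_1) \\ \vdots \\ z(x_n) \end{pmatrix}^T
\begin{pmatrix} \lambda_1^{SK}(x) \\ \vdots \\ \lambda_n^{SK}(x) \end{pmatrix} \\
&= \begin{pmatrix} z(x_1) \\ \vdots \\ z(x_n) \end{pmatrix}^T
\begin{pmatrix}
C(x_1 - x_1) & \cdots & C(x_1 - x_n) \\
\vdots & \ddots & \vdots \\
C(x_n - x_1) & \cdots & C(x_n - x_n)
\end{pmatrix}^{-1}
\begin{pmatrix} C(x_1 - x) \\ \vdots \\ C(x_n - x) \end{pmatrix} \\
&= \begin{pmatrix} b_1(x_1, \dots, x_n) \\ \vdots \\ b_n(x_1, \dots, x_n) \end{pmatrix}^T
\begin{pmatrix} C(x_1 - x) \\ \vdots \\ C(x_n - x) \end{pmatrix}
= \sum_\ell b_\ell\, C(x_\ell - x)
\end{aligned}
\]

or in matrix notation

\[
z^*(x) = z^T \lambda_{SK}(x) = z^T K_{SK}^{-1} k_{SK} = \sum_\ell b_\ell\, C(x_\ell - x).
\]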

Simple Kriging – Dual Kriging (2)

Reading

\[
z^*(x) = z^T \lambda_{SK}(x) = z^T K_{SK}^{-1} k_{SK} = \sum_\ell b_\ell\, C(x_\ell - x)
\]

as a superposition of shifted covariance functions $\operatorname{Cov}(Z_\ell, Z) = C(x_\ell - x)$ centered at the sampling locations $x_\ell$, the covariance function $C$ determines the “smoothness”, i.e. continuity and regularity, of $z^*$.

If the covariance is parabolic near the origin, then $z^*$ is differentiable;

if it is linear near the origin, then $z^*$ is continuous but with cusps at the data points;

if the covariance has a discontinuity at the origin, then there will be isolated jumps at the data points.

Simple Kriging – Dual Kriging (3)

Rewriting

\[
z^*(x) = z^T \lambda_{SK}(x) = z^T K_{SK}^{-1} k_{SK} = \sum_\ell b_\ell\, C(x_\ell - x)
\]

and postulating interpolation

\[
z^*(x_k) = \sum_\ell b_\ell\, C(x_\ell - x_k) \overset{!}{=} z(x_k), \qquad k = 1, \dots, n
\]

or in matrix notation

\[
K_{SK}\, b = z,
\]

the kriging estimator can be characterized as the solution of the interpolation problem.
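
The dual form can be sketched numerically as follows. The example assumes zero-mean data (i.e. residuals after subtracting a known mean), an exponential covariance, and hypothetical site coordinates and values; the names `dual_predict` and `primal_predict` are chosen for this illustration.

```python
import numpy as np

def cov_exponential(h, sill=1.0, a=10.0):
    """Assumed example covariance C(h) = sill * exp(-3|h|/a)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

x = np.array([0.0, 2.0, 5.0, 9.0])        # hypothetical sites
y = np.array([0.2, -0.3, 0.9, 0.4])       # zero-mean residual data (assumed)
K = cov_exponential(x[:, None] - x[None, :])

b = np.linalg.solve(K, y)                 # dual weights from K b = z, solved once

def dual_predict(x_new):
    """z*(x) = sum_l b_l C(x_l - x) for one or many prediction locations."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    return cov_exponential(x_new[:, None] - x[None, :]) @ b

def primal_predict(x0):
    """z*(x0) = sum_l lam_l(x0) z(x_l), solving the kriging system at x0."""
    lam = np.linalg.solve(K, cov_exponential(x - x0))
    return lam @ y

print(dual_predict([1.0, 3.0, 7.0]), primal_predict(3.0))
```

The dual weights do not depend on the prediction location, so a single linear solve serves all locations; the kriging variance, however, is not available from the dual form.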

Ordinary Kriging (1)

Assuming an unknown constant mean $m$: $m(x) = m$

Ordinary kriging is the simplest case where the random function $Z(x)$ is decomposed according to

\[
Z(x) = m(x) + Y(x)
\]

into the sum of a deterministic function $m$, called the drift, describing the systematic behaviour, and a zero–mean random function $Y(x)$, called the residuals, capturing the erratic fluctuations.

Ordinary Kriging (2)

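Linear approach

\[
Z^* = \sum_{\ell=1}^{n} \lambda_\ell Z_\ell + \lambda_0
\]

minimizing the mse

\[
\begin{aligned}
E(Z^* - Z_0)^2 &= \operatorname{Var}(Z^* - Z_0) + E^2(Z^* - Z_0) \\
&= \underbrace{\operatorname{Var}(Z^* - Z_0)}_{\text{variance term}} + \underbrace{\Bigl[\lambda_0 + \Bigl(\sum_\ell \lambda_\ell - 1\Bigr) m\Bigr]^2}_{\text{bias term}}
\end{aligned}
\]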

Ordinary Kriging (3)

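To make the bias term

\[
\Bigl[\lambda_0 + \Bigl(\sum_\ell \lambda_\ell - 1\Bigr) m\Bigr]^2
\]

vanish for every value of the unknown mean $m$, it is necessary that

\[
\lambda_0 = 0, \qquad \sum_\ell \lambda_\ell = 1
\]

Then

\[
E[Z^* - Z_0] = \sum_\ell \lambda_\ell m - m = \Bigl(\sum_\ell \lambda_\ell - 1\Bigr) m = 0
\]

as $\sum_\ell \lambda_\ell = 1$.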

Ordinary Kriging (4)

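Subject to the condition $\sum_\ell \lambda_\ell = 1$ the mse to be minimized is

\[
\operatorname{Var}(Z^* - Z_0) = -\sum_\ell \sum_k \lambda_\ell \lambda_k \gamma(Z_\ell, Z_k) + 2 \sum_\ell \lambda_\ell \gamma(Z_\ell, Z_0)
\]

Applying the method of Lagrangian multipliers to solve the constrained minimization problem leads to considering the Lagrangian function

\[
L(\lambda_1, \dots, \lambda_n; \mu_{OK}) = \operatorname{Var}(Z^* - Z_0) - 2\mu_{OK} \Bigl(\sum_\ell \lambda_\ell - 1\Bigr)
\]

where the sign of the multiplier term is chosen such that $\mu_{OK}$ enters the resulting system with a positive sign. Setting all its partial derivatives to zero

\[
\frac{\partial L}{\partial \lambda_\ell} = -2 \sum_k \lambda_k \gamma(Z_\ell, Z_k) + 2\gamma(Z_\ell, Z_0) - 2\mu_{OK} \overset{!}{=} 0
\]

\[
\frac{\partial L}{\partial \mu_{OK}} = -2 \Bigl(\sum_\ell \lambda_\ell - 1\Bigr) \overset{!}{=} 0
\]

leads to the ordinary kriging system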

Ordinary Kriging (5)

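Ordinary kriging system

\[
\sum_k \lambda_k^{OK} \gamma(Z_\ell, Z_k) + \mu_{OK} = \gamma(Z_\ell, Z_0), \qquad \ell = 1, \dots, n
\]

\[
\sum_\ell \lambda_\ell^{OK} = 1
\]

With the ordinary kriging weights $\lambda^{OK}$ the estimation variance, called the kriging variance, associated with $Z^*$ is

\[
\sigma_{OK}^2(x_0) = E(Z^* - Z_0)^2 = \sum_\ell \lambda_\ell^{OK} \gamma(Z_\ell, Z_0) + \mu_{OK}
\]

Note that the kriging variance depends neither on the variables $Z(x_\ell)$ nor on the data $z(x_\ell)$!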

Ordinary Kriging (6)

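OK in terms of covariances reads

\[
\sum_k \lambda_k^{OK} \operatorname{Cov}(Z_\ell, Z_k) + \mu_{OK} = \operatorname{Cov}(Z_\ell, Z_0), \qquad \ell = 1, \dots, n
\]

\[
\sum_\ell \lambda_\ell^{OK} = 1
\]

Here $\mu_{OK}$ denotes the Lagrange multiplier of the covariance formulation; it is the negative of the multiplier appearing in the variogram formulation.

More explicitly, OK weights are determined by

\[
\begin{pmatrix} \lambda^{OK}(x_0) \\ \mu_{OK}(x_0) \end{pmatrix}
=
\begin{pmatrix}
\bigl(C(x_\ell - x_k)\bigr)_{\ell,k} & \mathbf{1} \\
\mathbf{1}^T & 0
\end{pmatrix}^{-1}
\begin{pmatrix} \bigl(C(x_\ell - x_0)\bigr)_{\ell} \\ 1 \end{pmatrix}
\]

Then, with the ordinary kriging weights $\lambda^{OK}$ the kriging variance associated with $Z^*$ is

\[
\begin{aligned}
\sigma_{OK}^2(x_0) = E(Z^* - Z_0)^2 &= \operatorname{Var}(Z_0) - \sum_\ell \lambda_\ell^{OK} \operatorname{Cov}(Z_\ell, Z_0) - \mu_{OK} \\
&= C(0) - \sum_\ell \lambda_\ell^{OK} C(x_\ell - x_0) - \mu_{OK}
\end{aligned}
\]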
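
The bordered system in covariance form can be sketched as follows. As before, the exponential covariance, the one-dimensional sites, the data values and the function name `ordinary_kriging` are assumptions made for illustration.

```python
import numpy as np

def cov_exponential(h, sill=1.0, a=10.0):
    """Assumed example covariance C(h) = sill * exp(-3|h|/a)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def ordinary_kriging(x, z, x0, cov):
    """Ordinary kriging at x0 (covariance form) for 1-D sites x and data z."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(x)
    A = np.ones((n + 1, n + 1))           # bordered matrix [[K, 1], [1^T, 0]]
    A[:n, :n] = cov(x[:, None] - x[None, :])
    A[n, n] = 0.0
    rhs = np.append(cov(x - x0), 1.0)     # right-hand side (k, 1)
    sol = np.linalg.solve(A, rhs)
    lam, mu = sol[:n], sol[n]             # weights and Lagrange multiplier
    estimate = lam @ z                    # no mean required
    variance = cov(0.0) - lam @ rhs[:n] - mu
    return estimate, variance, lam, mu

x = np.array([0.0, 2.0, 5.0, 9.0])
z = np.array([1.2, 0.7, 1.9, 1.4])
est, var, lam, mu = ordinary_kriging(x, z, x0=3.0, cov=cov_exponential)
print(est, var, lam.sum(), mu)
```

Unlike simple kriging, the estimate is formed from the data alone; the unbiasedness constraint replaces the knowledge of the mean.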

Ordinary Kriging (7)

In the case of an SRF, OK is equivalent to optimum estimation of the unknown mean (by kriging!) followed by SK of the residuals from the (optimum) mean estimate as if the mean were known, i.e. perfectly estimated.

If the mean is estimated in a different (conventional) way, the above statement does not hold.

In the case of an IRF, the SK estimator is not defined. Moreover, since an IRF is defined by its increments, the unknown mean of its random variables is indeterminate and cannot be estimated.

Ordinary Kriging (8)

In practice, the semivariogram is modeled (variography); for numerics it is replaced by a pseudo covariance function

\[
A - \gamma(h)
\]

where $A$ denotes a sufficiently large real number such that

\[
A - \gamma(h) \ge 0
\]

for all lags $h$ which are numerically relevant.

In this way, the numerical performance of software can be improved.
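
That this replacement leaves the kriging result unchanged can be checked with a small sketch. It assumes an unbounded power variogram $|h|^{1.5}$, hypothetical one-dimensional sites, and a helper `ok_system` written for this example; the constant $A$ is taken as the largest variogram value occurring in the system.

```python
import numpy as np

def gamma_power(h, omega=1.5):
    """Assumed example: unbounded power variogram |h|^omega."""
    return np.abs(h) ** omega

def ok_system(M, rhs_vec):
    """Solve the bordered system [[M, 1], [1^T, 0]] (lam, mu) = (rhs_vec, 1)."""
    n = len(rhs_vec)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = M
    A[n, n] = 0.0
    sol = np.linalg.solve(A, np.append(rhs_vec, 1.0))
    return sol[:n], sol[n]

x = np.array([0.0, 2.0, 5.0, 9.0])        # hypothetical sites
x0 = 3.0
G = gamma_power(x[:, None] - x[None, :])  # variogram between data sites
g0 = gamma_power(x - x0)                  # variogram to the target location

# Variogram form of the OK system
lam_g, mu_g = ok_system(G, g0)
var_g = lam_g @ g0 + mu_g

# Pseudo covariance form with A >= gamma(h) for all lags that occur
A_big = max(G.max(), g0.max())
lam_c, mu_c = ok_system(A_big - G, A_big - g0)
var_c = A_big - lam_c @ (A_big - g0) - mu_c   # pseudo C(0) = A - gamma(0) = A

print(lam_g, lam_c, var_g, var_c)
```

The weights and the kriging variance coincide in both formulations, while the Lagrange multiplier changes sign; the constant $A$ cancels because the weights sum to one.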