Spatial Structure


Spatial structure is the relationship between a value measured at one point and a value measured at another point a certain distance away.

Describing spatial structure is useful for:

• Indicating the intensity of a pattern and the scale at which that pattern is exposed

• Interpolating to predict values at unmeasured points across the domain (e.g., kriging)

• Assessing the independence of variables before applying parametric tests of significance

Deterministic Solutions


Geostatistical Solutions


Deterministic interpolation methods:

• First-order polynomial interpolation

• Second-order (third, fourth, etc.) polynomial interpolation

• Local polynomial interpolation

• Radial basis function (spline) interpolation

[Figure: measured values compared with the predicted model]

Semivariance

$\gamma(d) = \frac{1}{2 n_d} \sum_{i} \sum_{j} w_{ij} (y_i - y_j)^2$

Where:

j is a point at distance d from i

n_d is the number of points in that distance class (i.e., the sum of the weights w_ij for that distance class)

w_ij is an indicator function set to 1 if the pair of points is within the distance class

For regularly spaced samples this reduces to:

$\gamma(d) = \frac{1}{2 n(d)} \sum_{i=1}^{n(d)} (y_i - y_{i+d})^2$

The geostatistical measure that describes the rate of change of the regionalized variable is known as the semivariance.

Semivariance is used for descriptive analysis, where the spatial structure of the data is investigated using the semivariogram, and for predictive applications, where the semivariogram is fitted to a theoretical model, parameterized, and used to predict the regionalized variable at other non-measured points (kriging).
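As a minimal sketch of the semivariance calculation for one distance class, the Python snippet below assumes values sampled at equal spacing along a transect, so that the lag d is an integer number of steps. The function name `transect_semivariance` and the sample readings are hypothetical and are not part of the original material.

```python
import numpy as np

def transect_semivariance(y, lag):
    """Semivariance at an integer lag for equally spaced values."""
    y = np.asarray(y, dtype=float)
    if lag < 1 or lag >= y.size:
        raise ValueError("lag must be between 1 and len(y) - 1")
    diffs = y[:-lag] - y[lag:]   # y_i - y_(i+d) for every valid i
    n_d = diffs.size             # number of pairs in this distance class
    return float(np.sum(diffs ** 2) / (2.0 * n_d))

values = [1.0, 2.0, 4.0, 7.0]    # hypothetical transect readings
gamma_1 = transect_semivariance(values, 1)
gamma_2 = transect_semivariance(values, 2)
print(gamma_1, gamma_2)
```

Semivariance that grows with lag, as it does for these readings, indicates that nearby values are more alike than distant ones.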



Given: Spatial Pattern is an outcome of the synthesis of dynamic processes operating at various spatial and temporal scales

Therefore: Structure at any given time is but one realization of several potential outcomes

Assuming: All processes are Stationary (homogeneous)

Where: Properties are independent of absolute location and direction in space

Therefore: Observations are independent, which means they are homoscedastic and follow a known distribution

That is: $\sigma^2_{Z,i} = \sigma^2_{Z,j}$ for all locations $X_i, X_j$ with $i \neq j$

And: Stationarity is a property of the process, NOT the data, allowing spatial inferences

Furthermore: Stationarity is scale dependent

Thus: Inference (spatial statistics) applies over regions of assumed stationarity

Geostatistical Solutions - Semivariance

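[Figure: sample locations on a grid with x and y axes running from 0 to 5; measured values 100, 105, 105, 100, and 115, with the value at the prediction location marked "??"]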

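Given: $z(s) = \mu + \varepsilon(s)$

Where: $\varepsilon(s)$ is spatially dependent and comes from an intrinsically stationary process

Find: $z(1,4)$

We assume: $\hat{Z}(s_0) = \sum_{i=1}^{n} \lambda_i Z(s_i)$

Where: $Z(s_i)$ is known, and $\lambda_i$ is the weight at location $i$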

IDW (inverse distance weighting): the weights depend only on distance.

Kriging: the weights depend on the semivariogram, which accounts for the spatial relationship among the measured values as well as distance.
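For contrast with kriging, here is a minimal sketch of IDW in Python. It assumes weights proportional to 1 / distance^power with power = 2, which is a common default but an assumption here. The function name `idw_weights`, the coordinates, and the values are hypothetical.

```python
import numpy as np

def idw_weights(coords, target, power=2.0):
    """Normalized inverse-distance weights for one prediction location."""
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    if np.any(dist == 0.0):
        # Target coincides with a sample: give that sample all the weight.
        raw = (dist == 0.0).astype(float)
    else:
        raw = dist ** -power
    return raw / raw.sum()

coords = [(0.0, 0.0), (2.0, 0.0), (0.0, 4.0)]   # hypothetical sample locations
values = np.array([10.0, 20.0, 30.0])           # hypothetical measurements
weights = idw_weights(coords, (1.0, 0.0))
pred = float(weights @ values)
print(weights, pred)
```

The weights here are fixed by geometry alone; nothing about how the measured values covary enters the calculation, which is the difference from kriging.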

We constrain the prediction such that: $\hat{Z}(s_i) - Z(s_i) \approx 0$ and $\sum_{i} \lambda_i = 1$

That says: The difference between the predicted and the observed values should be small

OR: minimize the statistical expectation of:

$\left[ Z(s_0) - \sum_{i=1}^{n} \lambda_i Z(s_i) \right]^2$

Empirical Semivariogram

[Figure: empirical semivariogram; x-axis: distance between paired points; y-axis: half the squared difference between paired values]

1st, recall that the Euclidean distance is:

$d_{i,j} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$

2nd, the empirical semivariance is:

$0.5 \times \text{average}\left[(\text{value at } i - \text{value at } j)^2\right]$

3rd, Bin the ranges of distances, and for each bin find:

• The average distance between all pairs in the bin

• The average semivariance of all paired observations in the bin

NOTE: In a large dataset the number of pairs can become unmanageable.

Solution: Bin pairs at similar distances, such as (1,5) and (1,3).
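The first three steps can be sketched in Python as follows. This is a minimal illustration that assumes equal-width distance bins; the function name `empirical_semivariogram`, the `bin_width` parameter, and the sample data are hypothetical.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_width):
    """Return (mean distance, mean semivariance, pair count) per distance bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)               # each pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)   # step 1: distances
    semivar = 0.5 * (values[i] - values[j]) ** 2           # step 2: semivariance
    bin_id = np.floor(dist / bin_width).astype(int)        # step 3: bin by distance
    rows = []
    for b in np.unique(bin_id):
        in_bin = bin_id == b
        rows.append((float(dist[in_bin].mean()),
                     float(semivar[in_bin].mean()),
                     int(in_bin.sum())))
    return rows

coords = [(0, 0), (1, 0), (2, 0), (3, 0)]   # hypothetical sample locations
values = [1.0, 2.0, 4.0, 7.0]               # hypothetical measurements
rows = empirical_semivariogram(coords, values, bin_width=1.0)
for mean_dist, mean_gamma, n_pairs in rows:
    print(mean_dist, mean_gamma, n_pairs)
```

Returning the pair count alongside each bin makes it easy to spot bins with too few pairs to give a stable average.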

4th, Plot the semivariogram and fit a model (i.e., a least-squares regression passing through zero):

[Figure: empirical and fitted semivariogram; x-axis: average distance in bin (h); y-axis: average semivariance in each bin]

Semivariance = slope * distance

Semivariance = 13.5 * h

$\gamma_{i,j} = \text{slope} \cdot h$

5th, Knowing $\gamma_{i,j} = \text{slope} \cdot h$, construct the $\Gamma$ (Gamma) matrix for the sample locations.

For example, for the pair (1,5) and (3,4), the lag distance is the distance between the two locations, and the semivariogram value is found by multiplying the slope (13.5) by that distance:

$\gamma_{i,j} = \text{slope} \cdot h = 13.5 \times 2.236 = 30.19$
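A minimal Python sketch of the through-origin fit and of the worked pair follows. The closed-form slope (sum of h * gamma divided by sum of h squared) is the least-squares solution for a line forced through zero. The function names and the binned values used to demonstrate the fit are hypothetical.

```python
import numpy as np

def fit_slope_through_origin(h, gamma):
    """Least-squares slope of gamma = slope * h with no intercept."""
    h = np.asarray(h, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    return float(np.sum(h * gamma) / np.sum(h * h))

def linear_semivariance(slope, p, q):
    """Model semivariance for the pair of locations p and q."""
    h = float(np.hypot(p[0] - q[0], p[1] - q[1]))   # Euclidean lag distance
    return slope * h

# Hypothetical binned values lying exactly on the line gamma = 13.5 * h
slope_fit = fit_slope_through_origin([1.0, 2.0, 3.0], [13.5, 27.0, 40.5])

# The pair from the text: (1,5) and (3,4), with slope 13.5
gamma_pair = linear_semivariance(13.5, (1, 5), (3, 4))
print(slope_fit, round(gamma_pair, 2))
```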

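5th (continued), Without going into the matrix algebra, the next step constructs the matrix of model semivariances for all pairs, such that:

$$
\begin{bmatrix}
\gamma_{1,1} & \cdots & \gamma_{1,n} & 1 \\
\vdots & \ddots & \vdots & \vdots \\
\gamma_{n,1} & \cdots & \gamma_{n,n} & 1 \\
1 & \cdots & 1 & 0
\end{bmatrix}
\begin{bmatrix} \lambda_1 \\ \vdots \\ \lambda_n \\ m \end{bmatrix}
=
\begin{bmatrix} \gamma_{1,0} \\ \vdots \\ \gamma_{n,0} \\ 1 \end{bmatrix}
$$

or: $\Gamma \lambda = g$

Where: the $\Gamma$ (Gamma) matrix holds the model's semivariance for all sampled pairs

Such that: $\lambda = \Gamma^{-1} g$

Where: the $\lambda$ (lambda) vector contains the weights assigned to the measured values surrounding the location to be predicted, and $m$ is the Lagrange multiplier that enforces $\sum_i \lambda_i = 1$

Where: the $g$ vector holds the model's semivariance between each measured location and the prediction location

Solving this system yields the weights.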

6th, In our example, to predict the value at location (1,4), the $g$ vector is:

Point    Distance    g vector for (1,4)
(1,5)    1           13.5
(3,4)    2           27.0
(1,3)    1           13.5
(4,5)    3.162       42.69
(5,1)    5           67.5

Recalling from the empirical semivariogram: semivariance = slope * distance = 13.5 * h. For a point at distance 1, for example, slope * h = 13.5 * 1 = 13.5.

[Figure: the sample grid with x and y axes running from 0 to 5, showing the measured values 100, 105, 105, 100, and 115 and the predicted value 102.6218]

7th, In our example, to predict the value at location (1,4), use the $\Gamma$ matrix (step 5) and the $g$ vector (step 6) to solve:

$\lambda = \Gamma^{-1} g$

Such that:

Point    Weight    Value    Product
(1,5)     0.467    100      46.757
(3,4)     0.098    105      10.325
(1,3)     0.469    105      49.331
(4,5)    -0.021    100      -2.113
(5,1)    -0.014    115      -1.679

Kriging predictor (sum of products): 102.6218

The weights are shown truncated to three decimals; the products use the full-precision weights.
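Steps 5 through 7 can be sketched in Python with NumPy as shown below. This is a minimal illustration, not a production implementation. It assumes the linear model (semivariance = 13.5 * h) and takes the second sample point as (3,4), the location consistent with its listed distance of 2 from (1,4). The function name `ordinary_kriging` and its arguments are hypothetical.

```python
import numpy as np

def ordinary_kriging(coords, values, target, slope):
    """Solve Gamma * lambda = g for a linear semivariogram; return prediction, weights, m."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Gamma: model semivariance for every sample pair, bordered by ones and a zero
    pair_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    gamma = np.ones((n + 1, n + 1))
    gamma[:n, :n] = slope * pair_dist
    gamma[n, n] = 0.0
    # g: model semivariance from each sample to the target, plus the constraint entry 1
    g = np.ones(n + 1)
    g[:n] = slope * np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    solution = np.linalg.solve(gamma, g)
    weights, m = solution[:n], float(solution[n])
    return float(weights @ values), weights, m

coords = [(1, 5), (3, 4), (1, 3), (4, 5), (5, 1)]
values = [100.0, 105.0, 105.0, 100.0, 115.0]
prediction, weights, m = ordinary_kriging(coords, values, (1, 4), slope=13.5)
print(np.round(weights, 3), round(prediction, 4))
```

`np.linalg.solve` is used rather than forming the inverse of Gamma explicitly; it gives the same weights and is the usual numerical choice. The last row and column of the bordered matrix are what force the weights to sum to 1.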

For predictions, the empirical semivariogram is converted to a theoretical one by fitting a statistical model (curve) that describes its range, sill, and nugget.

There are four common models used to fit semivariograms:

Linear: $\gamma(d) = c_0 + b d$ (assumes no sill or range)

Spherical: $\gamma(d) = c_0 + c\left[\frac{3d}{2a} - \frac{d^3}{2a^3}\right]$ for $d \le a$; $\gamma(d) = c_0 + c$ for $d > a$

Exponential: $\gamma(d) = c_0 + c\left[1 - \exp(-d/a)\right]$

Gaussian: $\gamma(d) = c_0 + c\left[1 - \exp(-d^2/a^2)\right]$

Where:

c0 = nugget

b = regression slope

a = range

c0 + c = sill

The sill is the value at which the semivariogram levels off (its asymptotic value)

The range is the distance at which the semivariogram levels off (the spatial extent of structure in the data)

The nugget is the semivariance at a distance of 0.0 (the y-intercept)
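As a minimal sketch, the four models can be written as Python functions of distance, using the nugget c0, the partial sill c (so that c0 + c is the sill), the range a, and the regression slope b. The function names are hypothetical, and the exponential and Gaussian forms assume the simple d/a scaling without the factor of 3 that some texts include.

```python
import numpy as np

def linear_model(d, c0, b):
    """Linear: no sill or range."""
    return c0 + b * np.asarray(d, dtype=float)

def spherical_model(d, c0, c, a):
    """Spherical: reaches the sill c0 + c exactly at d = a."""
    d = np.asarray(d, dtype=float)
    rising = c0 + c * (1.5 * d / a - 0.5 * (d / a) ** 3)
    return np.where(d <= a, rising, c0 + c)

def exponential_model(d, c0, c, a):
    """Exponential: approaches the sill asymptotically."""
    d = np.asarray(d, dtype=float)
    return c0 + c * (1.0 - np.exp(-d / a))

def gaussian_model(d, c0, c, a):
    """Gaussian: flat near the origin, then approaches the sill."""
    d = np.asarray(d, dtype=float)
    return c0 + c * (1.0 - np.exp(-(d ** 2) / a ** 2))

lags = np.array([0.0, 1.0, 2.0, 4.0])
print(spherical_model(lags, c0=1.0, c=9.0, a=2.0))
```

Every model returns the nugget at d = 0, and the spherical model stays at the sill for all distances beyond the range.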

A semivariogram is a plot of the structure function that, like autocorrelation, describes the relationship between measurements taken some distance apart.

Semivariograms define the range or distance over which spatial dependence exists.

Stationarity

Autocorrelation assumes stationarity, meaning that the spatial structure of the variable is consistent over the entire domain of the dataset.

The stationarity of interest is second-order (weak) stationarity, requiring that:

(a) the mean is constant over the region;

(b) the variance is constant and finite; and

(c) the covariance depends only on between-sample spacing.

In many cases this is not true because of larger trends in the data. In these cases, the data are often detrended before analysis. One way to detrend data is to fit a regression to the trend and use only the residuals for autocorrelation analysis.
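A minimal Python sketch of detrending by regression follows. It assumes the trend is a first-order (planar) surface in x and y, which is only one possible choice. The function name `detrend_first_order` and the sample data are hypothetical.

```python
import numpy as np

def detrend_first_order(coords, values):
    """Fit z = b0 + b1*x + b2*y by least squares and return (coefficients, residuals)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    design = np.column_stack([np.ones(len(values)), coords[:, 0], coords[:, 1]])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    residuals = values - design @ coef   # what remains after removing the trend
    return coef, residuals

# Hypothetical data lying exactly on the plane z = 2 + 3x - y
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
values = [2.0, 5.0, 1.0, 4.0, 7.0]
coef, residuals = detrend_first_order(coords, values)
print(np.round(coef, 6), np.round(residuals, 6))
```

The residuals, not the raw values, would then be passed to the semivariogram calculation. If kriging is done on residuals, the fitted trend has to be added back to the predictions afterwards.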

Anisotropy

Autocorrelation also assumes isotropy, meaning that the spatial structure of the variable is consistent in all directions.

Often this is not the case, and the variable exhibits anisotropy, meaning that there is a direction-dependent trend in the data.

If a variable exhibits different ranges in different directions, then there is geometric anisotropy. For example, a dune deposit may show a larger range in the wind direction than in the direction perpendicular to the wind.
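One way to look for anisotropy is to compute separate semivariograms from pairs aligned with different directions. The Python sketch below shows only the pair-selection step. It assumes azimuths measured clockwise from the +y axis and folded to the interval [0, 180), since a pair has no preferred orientation. The function name `directional_pairs` and the tolerance value are hypothetical.

```python
import numpy as np

def directional_pairs(coords, azimuth_deg, tolerance_deg):
    """Indices (i, j) of pairs whose direction is within tolerance of the azimuth."""
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)
    dx = coords[j, 0] - coords[i, 0]
    dy = coords[j, 1] - coords[i, 1]
    pair_az = np.degrees(np.arctan2(dx, dy)) % 180.0   # fold opposite directions together
    diff = np.abs(pair_az - (azimuth_deg % 180.0))
    diff = np.minimum(diff, 180.0 - diff)              # wrap around the 0/180 seam
    keep = diff <= tolerance_deg
    return i[keep], j[keep]

coords = [(0, 0), (0, 1), (1, 0), (1, 1)]              # hypothetical unit square
ns_i, ns_j = directional_pairs(coords, azimuth_deg=0.0, tolerance_deg=10.0)
ew_i, ew_j = directional_pairs(coords, azimuth_deg=90.0, tolerance_deg=10.0)
ns_pairs = list(zip(ns_i.tolist(), ns_j.tolist()))
ew_pairs = list(zip(ew_i.tolist(), ew_j.tolist()))
print(ns_pairs, ew_pairs)
```

Fitting a model to each directional subset and comparing the fitted ranges indicates whether geometric anisotropy is present.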

Variogram Modeling Suggestions

• Check that there are enough pairs at each lag distance (from 30 to 50).

• Remove outliers.

• Truncate at half the maximum lag distance to ensure enough pairs.

• Use a larger lag tolerance to get more pairs and a smoother variogram.

• Start with an omnidirectional variogram before trying directional variograms.

• Use other variogram measures to take into account lag means and variances (e.g., inverted covariance, correlogram, or relative variograms).

• Use transforms of the data for skewed distributions (e.g., logarithmic transforms).

• Use the mean absolute difference or median absolute difference to derive the range.
