SPATIAL DATA ANALYSIS

Tony E. Smith, University of Pennsylvania

• Point Pattern Analysis

• Spatial Regression Analysis

• Continuous Pattern Analysis

POINT PATTERN ANALYSIS

Example Application Areas

• Housing Sales

• Crime Incidents

• Infectious Diseases

Philadelphia Pneumonia Example

[Figure: point map of incident locations in Philadelphia]

Where are Conflict “Hot Spots”?

• Only meaningful relative to Population, perhaps even Racial mix

• What would random incidents look like?

• How can this be analyzed statistically?

[Figure: two point maps side by side, ACTUAL and RANDOM]

Hot Spot Analysis

• Make grid of n Reference Points (°)

• Select radius, r, for Cells

• Make cell counts: $(C_1^0, .., C_n^0)$

[Figure: grid of reference points (°) with a cell of radius r]

• Generate N random patterns of same size

• Repeat cell count procedure for each pattern: $(C_1^i, .., C_n^i)$, $i = 1,..,N$

• Rank counts at each location $j = 1,..,n$:

$$C_j^{i_1} \ge \cdots \ge C_j^{i_{m_j - 1}} \ge C_j^0$$

so that the observed count $C_j^0$ has rank $m_j$ among the $N + 1$ counts at location $j$.

• Define P-value for observed count:

$$\text{P-value}_j = \frac{m_j}{N + 1}, \quad j = 1,..,n$$

Use these to define a P-value Map
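The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the point data are invented, the null model draws points uniformly on the unit square (the slides stress that randomness should be relative to population, so a population-weighted sampler would be passed in instead), and the names `cell_counts`, `hot_spot_pvalues`, and `uniform_sampler` are hypothetical.

```python
import numpy as np

def cell_counts(points, refs, r):
    """Number of points within distance r of each reference point."""
    d = np.linalg.norm(refs[:, None, :] - points[None, :, :], axis=2)
    return (d <= r).sum(axis=1)

def hot_spot_pvalues(points, refs, r, sampler, N=199, seed=0):
    """P-value m_j / (N + 1) at each reference point j."""
    rng = np.random.default_rng(seed)
    c0 = cell_counts(points, refs, r)      # observed counts C_j^0
    m = np.ones(len(refs), dtype=int)      # the observed count itself contributes 1
    for _ in range(N):
        ci = cell_counts(sampler(rng, len(points)), refs, r)
        m += ci >= c0                      # simulated counts ranked at or above C_j^0
    return m / (N + 1)

def uniform_sampler(rng, size):
    """Assumed null model: points uniform on the unit square."""
    return rng.uniform(0.0, 1.0, size=(size, 2))

# Invented data: 40 background points plus a tight cluster of 20 near (0.25, 0.25)
data_rng = np.random.default_rng(1)
points = np.vstack([
    data_rng.uniform(0.0, 1.0, size=(40, 2)),
    data_rng.normal(loc=[0.25, 0.25], scale=0.03, size=(20, 2)),
])

# 5 x 5 grid of reference points; index 6 is the point (0.25, 0.25)
g = np.linspace(0.0, 1.0, 5)
refs = np.array([(x, y) for x in g for y in g])

pvals = hot_spot_pvalues(points, refs, r=0.15, sampler=uniform_sampler)
print(pvals.reshape(5, 5).round(3))
```

Ties are counted against the observed pattern (`>=`), which makes the P-values slightly conservative.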

P-Value Map at ¾ Mile Scale

• P-value contours are mapped by a spline interpolation of P-values at each grid point

[Figure: filled-contour prediction map of P-values at 3/4 miles, with the incident points (Geocoding_Philadelphia). P-value classes: 0.01 - 0.02, 0.02 - 0.05, 0.05 - 0.1, 0.1 - 0.15, 0.15 - 1]

[Figure: two maps side by side, EVENTS and SIGNIFICANCE]

SPATIAL REGRESSION ANALYSIS

Example Applications

• Urban Area Data by: census tracts, block groups

• National Area Data by: states, counties

Ohio Lung Cancer Example

[Figure: map of Ohio with cities labeled: Akron, Dayton, Toledo, Columbus, Cleveland, Cincinnati]

Ohio Lung Cancer Data 1998

• Age-Adjusted Mortality Rates for White Males

• Explanatory Variables

[Figure: two maps of Ohio, Per Capita Income and Percent Smokers, with the same six cities labeled]

Simple OLS Regression

• Linear Model

$$y_i = \beta_0 + \beta_I x_{Ii} + \beta_S x_{Si} + u_i, \quad i = 1,..,n$$

$$y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I)$$

• Regression Results

Variable    Coefficient    P-value
Constant     1.001567      0.000068
Income      -0.000046      0.042802
Smoking      0.942823      0.018729

$R^2_{adj} = 0.0988$

Residual Plot:

[Figure: residual plot against y; horizontal axis from 0.6 to 1.1, vertical axis from -0.8 to 0.8]
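A minimal sketch of the OLS fit with NumPy follows. The data are synthetic stand-ins, not the Ohio data, and income is measured in thousands here purely to keep the design matrix well conditioned; the function name `ols` and all numeric values are assumptions for illustration.

```python
import numpy as np

def ols(y, X):
    """OLS fit of y on X (X must already contain a column of ones)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)                       # error variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X))) # coefficient standard errors
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return beta, beta / se, r2_adj, resid

# Synthetic data (invented): income in thousands, smoking as a fraction
rng = np.random.default_rng(0)
n = 80
income = rng.normal(20.0, 3.0, n)
smoking = rng.uniform(0.20, 0.35, n)
X = np.column_stack([np.ones(n), income, smoking])
y = 1.0 - 0.02 * income + 0.9 * smoking + rng.normal(0.0, 0.1, n)

beta, t_values, r2_adj, resid = ols(y, X)
print(beta.round(4), t_values.round(2), round(r2_adj, 4))
```

The residuals returned here are the input to the spatial autocorrelation checks later in the deck.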

Spatial Autocorrelation Problem

• One-Dimensional Example

[Figure: two panels plotting y against x. Both show the data points and the TRUE TREND; the second adds the fitted REGRESSION LINE, labeled Correlated Errors]

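• Consequences of Autocorrelation

$\sigma^2$ underestimated $\Rightarrow$ $t$-value inflated $\Rightarrow$ P-value too small

Results often look too significant

• Spatial Autoregressive Errors

$$u_i = \rho \sum_j w_{ij} u_j + \varepsilon_i, \quad i = 1,..,n$$

where:

$w_{ij} > 0$ if $j$ influences $i$

$(\varepsilon_1,..,\varepsilon_n) \overset{iid}{\sim} N(0, \sigma^2)$

Reduces to OLS if $\rho = 0$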

Modeling Spatial Dependencies

• Examples of Spatial Weights

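$$w_{ij} = \begin{cases} 1, & j \text{ borders } i \\ 0, & \text{otherwise} \end{cases}$$

$$w_{ij} = \begin{cases} 1, & d(cent_i, cent_j) \le d_0 \\ 0, & \text{otherwise} \end{cases}$$

• Spatial Weights Matrix

$$W = \begin{pmatrix} w_{11} & \cdots & w_{1n} \\ \vdots & \ddots & \vdots \\ w_{n1} & \cdots & w_{nn} \end{pmatrix}, \quad w_{ii} = 0$$

• Spatial Autoregressive Errors

$$u = \rho W u + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I)$$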
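As an illustration of the centroid-distance weights, the sketch below builds $W$ from a small invented set of centroids. The function names are hypothetical, and the row normalization step is a common convention that the slides do not mention.

```python
import numpy as np

def distance_weights(centroids, d0):
    """w_ij = 1 if d(cent_i, cent_j) <= d0 and i != j, else 0."""
    d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    W = (d <= d0).astype(float)
    np.fill_diagonal(W, 0.0)          # enforce w_ii = 0
    return W

def row_normalize(W):
    """Scale each nonzero row to sum to 1 (assumed convention, not from the slides)."""
    s = W.sum(axis=1, keepdims=True)
    return np.divide(W, s, out=np.zeros_like(W), where=s > 0)

# Invented centroids: three in a row plus one isolated area
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [5.0, 5.0]])
W = distance_weights(centroids, d0=1.5)
W_std = row_normalize(W)
print(W)
print(W_std)
```

The isolated fourth area gets a row of zeros, so it neither influences nor is influenced by the others.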

Testing for Spatial Dependencies

• Moran’s Standardized Coefficient

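$$\rho > 0 \;\Rightarrow\; \mathrm{cov}(u, Wu) > 0$$

$$I = \frac{\mathrm{cov}(u, Wu)}{\mathrm{var}(u)} > 0$$

• Coefficient Estimate

$$\hat{I} = \frac{\hat{u}' W \hat{u}}{\hat{u}' \hat{u}}$$

• Permutation Test for Residuals

• Permute locations of $(\hat{u}_1, \hat{u}_2, .., \hat{u}_n)$

• Compute $\hat{I}$ for each new permutation

• Rank and compute P-Values as for Clustering

• Test Result for OLS Residuals

$$\mathrm{Prob}(\hat{I} \ge \hat{I}_{OLS}) = .038 \;\Rightarrow\; \text{SIGNIFICANT}$$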
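The permutation test can be sketched as follows. The residual vector and the chain-shaped weights matrix are invented for illustration, and `moran_I` and `moran_permutation_test` are hypothetical names; the P-value uses the same rank rule as in the clustering analysis.

```python
import numpy as np

def moran_I(u, W):
    """Moran coefficient estimate I = u'Wu / u'u."""
    return (u @ W @ u) / (u @ u)

def moran_permutation_test(u, W, N=999, seed=0):
    """One-sided permutation P-value for positive autocorrelation."""
    rng = np.random.default_rng(seed)
    I_obs = moran_I(u, W)
    I_perm = np.array([moran_I(rng.permutation(u), W) for _ in range(N)])
    m = 1 + np.sum(I_perm >= I_obs)    # rank of the observed value
    return I_obs, m / (N + 1)

# Invented example: 30 locations on a line, each bordering its neighbors
n = 30
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = W[idx + 1, idx] = 1.0

# Smoothly varying "residuals", so neighbors are strongly alike
u = np.sin(np.linspace(0.0, 2.0 * np.pi, n))
u = u - u.mean()

I_obs, p_value = moran_permutation_test(u, W)
print(round(I_obs, 3), p_value)
```

Permuting the residuals breaks any link between values and locations, which is exactly the null hypothesis of no spatial dependence.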

Spatial Autoregression Model

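• Formal Statement of the SAR Model

$$y = X\beta + u, \quad u = \rho W u + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I)$$

• Reduced Form for Analysis

$$u = \rho W u + \varepsilon \;\Rightarrow\; u = (I - \rho W)^{-1} \varepsilon$$

$$y = X\beta + u = X\beta + (I - \rho W)^{-1} \varepsilon$$

• Maximum Likelihood Estimation (MLE)

$$y \sim N\left(X\beta,\ \sigma^2\, \Omega(\rho)\right)$$

where: $\Omega(\rho) = (I - \rho W)^{-1} (I - \rho W')^{-1}$

$$L(\beta, \sigma^2, \rho \mid y, X) = N\left(X\beta,\ \sigma^2\, \Omega(\rho)\right)$$

Maximization of this function yields consistent estimates: $(\hat{\beta}, \hat{\sigma}^2, \hat{\rho})$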
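One simple way to maximize this likelihood, sketched below, is a grid search over $\rho$: for each candidate $\rho$ the data are filtered by $(I - \rho W)$, $\beta$ and $\sigma^2$ are obtained by least squares, and the concentrated log-likelihood $\log|I - \rho W| - \tfrac{n}{2}\log\hat{\sigma}^2(\rho)$ is compared across the grid. The grid search, the lattice weights, the simulated data, and the names `rook_weights` and `sar_error_mle` are all assumptions made for illustration; the slides do not say which optimizer was used.

```python
import numpy as np

def rook_weights(k):
    """Row-normalized rook-contiguity weights on a k x k lattice (assumed layout)."""
    n = k * k
    W = np.zeros((n, n))
    for r in range(k):
        for c in range(k):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < k and 0 <= cc < k:
                    W[r * k + c, rr * k + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def sar_error_mle(y, X, W, rhos=None):
    """Grid-search MLE for y = X beta + u, u = rho W u + eps."""
    if rhos is None:
        rhos = np.linspace(-0.95, 0.95, 191)
    n = len(y)
    best = None
    for rho in rhos:
        B = np.eye(n) - rho * W
        sign, logdet = np.linalg.slogdet(B)
        if sign <= 0:
            continue                               # skip inadmissible rho
        yb, Xb = B @ y, B @ X                      # spatially filtered data
        beta, *_ = np.linalg.lstsq(Xb, yb, rcond=None)
        e = yb - Xb @ beta
        sigma2 = e @ e / n
        loglik = logdet - 0.5 * n * np.log(sigma2) # concentrated log-likelihood
        if best is None or loglik > best[0]:
            best = (loglik, rho, beta, sigma2)
    return best[1], best[2], best[3]

# Simulated data with known parameters (invented for the sketch)
rng = np.random.default_rng(0)
k = 12
W = rook_weights(k)
n = k * k
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, rho_true = np.array([1.0, 2.0]), 0.6
eps = rng.normal(scale=0.5, size=n)
u = np.linalg.solve(np.eye(n) - rho_true * W, eps)
y = X @ beta_true + u

rho_hat, beta_hat, sigma2_hat = sar_error_mle(y, X, W)
print(round(rho_hat, 2), beta_hat.round(3), round(sigma2_hat, 3))
```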

Comparison of SAR and OLS

• OLS Results

Variable    Coefficient    P-value
Constant     1.001567      0.000068
Income      -0.000046      0.000018
Smoking      0.942823      0.018729

$R^2_{adj} = 0.0988$

• SAR Results

Variable    Coefficient    P-value
Constant     0.918535      0.000256
Income      -0.000036      0.142127
Smoking      0.922541      0.015640
RHO value    0.246392      0.07561 (0.0375)

$R^2_{adj} = 0.0966$

Significant Autocorrelation

CONCLUSION: More reliable estimates of parameters and goodness of fit.

CONTINUOUS PATTERN ANALYSIS

Example Application Areas

• Weather Patterns

• Mineral Exploration

• Environmental Pollution

• Geologic Analyses

Venice Example

[Figure: map marking the INDUSTRY and VENICE locations]

Model Sources of Drawdown

• Industrial Drawdown: $x_I(s)$ = Industrial Drawdown at $s$

• Local Venice Drawdown: $x_V(s)$ = Venice Drawdown at $s$

Model Water Table Levels

• $el(s)$ = Elevation at $s$

• $L_s$ = Water level at $s$

Linear Model of Effects

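$$\begin{aligned} L_s &= \beta_0 + \beta_I x_I(s) + \beta_V x_V(s) + \beta_{el}\, el(s) + \varepsilon_s \\ &= x(s)'\beta + \varepsilon_s, \quad \varepsilon_s \sim N(0, \sigma^2) \end{aligned}$$

How can one estimate this model?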

Sample Drill-Hole Data

Sample Data Points

$$L_j = x_j'\beta + \varepsilon_j, \quad j = 1,..,n$$

$$L = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \Sigma)$$

What about spatial dependencies in $\varepsilon$?

[Figure: map of sample drill-hole points classified by level_1973 (2.29 to 6.63, -0.28 to 2.29, -2.50 to -0.28, -4.79 to -2.50, -6.11 to -4.79), with the Coastline]

Spatial Covariograms

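• Assume: $C_{ij} = \mathrm{cov}(\varepsilon_i, \varepsilon_j) = C(d_{ij})$, where $d_{ij} = \|s_i - s_j\|$

Can pool data to estimate

• Variogram: $\gamma(d_{ij}) = \tfrac{1}{2} E(\varepsilon_i - \varepsilon_j)^2$

$$C(d_{ij}) = C(0) - \gamma(d_{ij}) = \sigma^2 - \gamma(d_{ij})$$

$$\hat{C}(d_{ij}) = \hat{\sigma}^2 - \hat{\gamma}(d_{ij})$$

Need only estimate the variogram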
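The pooling idea can be sketched as a binned empirical variogram: all pairs whose separation falls in the same distance bin contribute to one estimate of $\gamma$. The simulated residuals, the exponential covariance used to generate them, the bin edges, and the use of the sample variance for $\hat{\sigma}^2$ are all assumptions of this sketch; fitting a variogram model to the binned values is not shown.

```python
import numpy as np

def empirical_variogram(coords, eps, bin_edges):
    """Average 0.5 * (eps_i - eps_j)^2 over pairs whose distance falls in each bin."""
    i, j = np.triu_indices(len(eps), k=1)           # each pair once
    d = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq = 0.5 * (eps[i] - eps[j]) ** 2
    which = np.digitize(d, bin_edges) - 1           # bin index of each pair
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(gamma)):
        mask = which == b
        if mask.any():
            gamma[b] = half_sq[mask].mean()
    return gamma

# Simulated residuals with exponential covariance C(d) = exp(-d / 0.2) (invented)
rng = np.random.default_rng(0)
n = 150
coords = rng.uniform(0.0, 1.0, size=(n, 2))
D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
Sigma = np.exp(-D / 0.2)
eps = np.linalg.cholesky(Sigma + 1e-10 * np.eye(n)) @ rng.normal(size=n)

edges = np.linspace(0.0, 0.5, 6)                    # five distance bins
gamma_hat = empirical_variogram(coords, eps, edges)
C_hat = eps.var() - gamma_hat                       # covariogram from the variogram
print(gamma_hat.round(3))
print(C_hat.round(3))
```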

Standard Variogram Model

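[Figure: standard variogram model, plotting $\gamma(d)$ and $C(d)$ against distance $d$, with the Nugget, Sill, and Range marked]

Estimated from $(\varepsilon_1,..,\varepsilon_n)$ (using nonlinear least squares)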

Spatial Prediction of Residuals

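• How to predict $\varepsilon_s$ at new locations, $s \neq s_j$?

[Figure: sample points labeled 1, 2, 3, .., k]

• Linear Predictors

$$\hat{\varepsilon}_s = \sum_{i=1}^{k} \lambda_i \varepsilon_i$$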

Simple Kriging

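• Find $\lambda_s = (\lambda_1,..,\lambda_k)'$ to minimize prediction error:

$$\min_{\lambda_s} MSE(\lambda_s) = E\left[(\varepsilon_s - \lambda_s'\varepsilon)^2\right]$$

Solution: If:

$$\mathrm{cov}\begin{pmatrix} \varepsilon_s \\ \varepsilon \end{pmatrix} = \begin{pmatrix} \sigma^2 & c_s' \\ c_s & \Sigma \end{pmatrix} = C$$

then: $\hat{\lambda}_s = \Sigma^{-1} c_s$

Yielding predicted value: $\hat{\varepsilon}_s = \hat{\lambda}_s'\varepsilon = c_s' \Sigma^{-1} \varepsilon$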
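A small numerical sketch of the simple kriging solution follows. The three sample points, their residual values, and the exponential covariogram `exp_cov` with its parameters are invented; in practice the covariogram would come from the fitted variogram.

```python
import numpy as np

def exp_cov(d, sill=1.0, range_par=0.3):
    """Hypothetical covariogram C(d) = sill * exp(-d / range_par)."""
    return sill * np.exp(-d / range_par)

def simple_kriging(Sigma, c_s, eps):
    """Weights lambda = Sigma^{-1} c_s and prediction lambda' eps."""
    lam = np.linalg.solve(Sigma, c_s)
    return lam, lam @ eps

# Invented sample locations and residuals
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
eps = np.array([0.5, -0.2, 0.1])
D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
Sigma = exp_cov(D)

def predict_at(s):
    """Kriging weights and predicted residual at location s."""
    c_s = exp_cov(np.linalg.norm(coords - s, axis=1))
    return simple_kriging(Sigma, c_s, eps)

lam_data, pred_data = predict_at(coords[0])              # at a sample point
lam_new, pred_new = predict_at(np.array([0.5, 0.5]))     # at a new location
print(lam_data.round(6), pred_data)
print(lam_new.round(4), round(pred_new, 4))
```

With no nugget in the assumed covariogram, predicting at a sample point puts all the weight on that point and returns its observed residual.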

Spatial Prediction of L-Values

Universal Kriging:

• Given linear model $L = X\beta + \varepsilon$, $\varepsilon \sim N(0, \Sigma)$

• Iterate between Linear Regression and Simple Kriging to obtain consistent estimates: $\hat{\beta}, \hat{\Sigma}$

• Then predict $L_s$ by: $\hat{L}_s = x(s)'\hat{\beta} + \hat{\varepsilon}_s$
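The prediction step can be sketched as one pass of generalized least squares followed by simple kriging of the residuals. This is a simplification: $\Sigma$ is taken as known (an assumed exponential covariance), so the iteration that re-estimates it from the residuals is not shown. The data and the names `gls` and `predict_L` are invented for illustration.

```python
import numpy as np

def gls(L, X, Sigma):
    """GLS estimate of beta for L = X beta + eps, eps ~ N(0, Sigma)."""
    Si_X = np.linalg.solve(Sigma, X)
    Si_L = np.linalg.solve(Sigma, L)
    return np.linalg.solve(X.T @ Si_X, X.T @ Si_L)

def predict_L(x_s, c_s, L, X, Sigma):
    """L_hat_s = x(s)' beta_hat + eps_hat_s, with eps_hat_s from simple kriging."""
    beta = gls(L, X, Sigma)
    resid = L - X @ beta
    eps_s = c_s @ np.linalg.solve(Sigma, resid)   # c_s' Sigma^{-1} resid
    return x_s @ beta + eps_s

# Invented sample: five locations, one explanatory variable plus a constant
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
X = np.column_stack([np.ones(5), coords[:, 0]])
L = np.array([1.0, 0.4, 1.2, 0.1, 0.7])
D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
Sigma = np.exp(-D / 0.5)                          # assumed covariance, treated as known

# Prediction at a new location s with an assumed attribute vector x(s)
s = np.array([0.25, 0.75])
x_s = np.array([1.0, s[0]])
c_s = np.exp(-np.linalg.norm(coords - s, axis=1) / 0.5)
L_hat_new = predict_L(x_s, c_s, L, X, Sigma)

# Sanity check: at a sample location the prediction reproduces the observed level
L_hat_data = predict_L(X[4], Sigma[:, 4], L, X, Sigma)
print(round(L_hat_new, 4), round(L_hat_data, 4))
```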

Results for Venice:

• Predicted Water Table Levels

• Analysis for Policy Conclusions

Can be 95% confident that each meter of industrial drawdown lowers the Venice water table by at least 15 cm.

ACTION: Drawdown was restricted (1973)

RESULT: Venice elevation increased (1976)
