28
Spatial Correlation Robust Inference Ulrich K. Müller and Mark W. Watson Princeton University Presentation at University of Virginia

Spatial Correlation Robust Inference

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Spatial Correlation Robust Inference

Spatial Correlation Robust Inference

Ulrich K. Müller and Mark W. Watson

Princeton University

Presentation at University of Virginia

Page 2: Spatial Correlation Robust Inference

Motivation

• More and more empirical work using spatial data in development, trade, macro, etc.

• How to appropriately correct standard errors?

— Conley (1999): Spatial analog of HAC standard errors

— Don’t work well under moderate or high spatial dependence

— Many applications characterized by strong spatial correlations (Kelly (2019, 2020))

1

Page 3: Spatial Correlation Robust Inference

Canonical Problem

Observe

= + , = 1

• (and ) associated with observed location ∈ S ⊂ R

• drawn i.i.d. with density

• With s = (1 ): E[|s] = 0, E[|s] = ( − ), so covariance stationary

• How to test 0 : = 0 and construct confidence interval for ?

• Extensions to regression, GMM etc. follow from standard arguments (see paper)

2

Page 4: Spatial Correlation Robust Inference

Three One-Dimensional Designs

3

Page 5: Spatial Correlation Robust Inference

Two Geographic Designs

Light data from Henderson, Squires, Storeygard and Weil (2018)

4

Page 6: Spatial Correlation Robust Inference

Spatial Inference

• Usual t-statistic

=

√( − 0)

with associated critical value cv, and confidence interval with endpoints ± cv √

• Conley (1999): kernel-type “consistent” estimator 2 = −1X

µ||−||

¶ with →∞

and standard normal cv

• Bester, Conley, Hansen, and Vogelsang (2016): ‘fixed-b’ version 2 = −1X

µ||−||

and nonstandard cv obtained by simulating from ∼ N (0 1)

• Sun and Kim (2012): 2 = −1X

=1

³X−12

´2with Fourier weights and student-t

cv (analogous to Müller (2004, 2007), Phillips (2005), etc.)

5

Page 7: Spatial Correlation Robust Inference

This Paper

• Measure of strength of spatial dependence: average pairwise correlation

=1

(− 1)X=1

X6=Cor ( |s)

• Objective: method to construct and cv that “work” for

— some moderate and all small values of

— also for non-uniform spatial density

6

Page 8: Spatial Correlation Robust Inference

Outline of Talk

1. Definition of new “SCPC” method

2. Size control under generic weak correlation

3. Size control under some forms of strong correlation

4. Small sample performance

7

Page 9: Spatial Correlation Robust Inference

Motivation for SCPC Method

• Consider OLS variance estimator 2OLS = (− 1)−1P=1( − )2. Using familiar notation

2OLS = (− 1)−1y0My = (− 1)−1y0⎛⎝−1X=1

ww0

⎞⎠y = (− 1)−1 −1X=1

(w0u)2

for w such that −1w0w = 1[ = ] and l0w = 0 for l = (1 1)

0So | | cv iff

(l0u)2 cv2(− 1)−1−1X=1

(w0u)2

• Challenge with (positive) spatial correlation: Most weighted averages w0u are less variable than l0u,leading to downward bias of 2

• Solution: Select (few) weighted averages w0u that are as variable as possible under plausible spatialcovariance matrix, and use 2 = −1P

=1(w0u)

2

8

Page 10: Spatial Correlation Robust Inference

SCPC Method

• Benchmark model ( − ) = exp(−|| − ||) with associated covariance matrix Σ() of u

— worst-case correlation has = 0, weaker correlation for 0

• Use eigenvectors r1 r ofMΣ(0)M corresponding to largest eigenvalues as weights and

2SCPC() = −1X

=1

(r0u)2

“Spatial Correlation Principle Components” (SCPC) of u ∼ N (0MΣ(0)M)

• For given , compute critical value cvSCPC such that sup≥0 P0Σ()(|SCPC()| cvSCPC()|s) =

• SCPC minimizes CI length E[2SCPC() cvSCPC()|s] in i.i.d. model u|s ∼ N (0 2I)9

Page 11: Spatial Correlation Robust Inference

Eigenvectors in One-Dimensional Designs

10

Page 12: Spatial Correlation Robust Inference

Eigenvectors in Geographic Designs

11

Page 13: Spatial Correlation Robust Inference

Properties of SCPC Inference I

• Select 0 as function of implied average correlation = 0

⇒ SCPC inference invariant to scale of locations, as well as any distance preserving transformations,

such as rotations

• For fixed 0 (such as 0 = 002), SCPC = (1)

— SCPC not consistent, cvSCPC does not converge to normal critical value

— (appropriately) reflects difficulty of correctly estimating variance of l0u under ‘strong’ correlation

• SCPC easy to apply, even for (very) large , using computational short-cuts for determination ofeigenvectors if is large (STATA and matlab code available)

• By construction, SCPC controls size in benchmark model u|s ∼ N (0Σ()), ≥ 012

Page 14: Spatial Correlation Robust Inference

Properties of SCPC Inference II

• Theorem: Under regularity conditions, SCPC controls asymptotic size under generic ‘weak correlation’with

→ 0, also in non-Gaussian models

(whereas suggestions by Sun and Kim (2012) and Bester, Conley, Hansen, and Vogelsang (2016)

only do so for uniform )

• Theorem: In Gaussian model with parametric covariance matrices u|s ∼ N (0Σ()), ∈ Θ,

easy-to-compute conditions such that SCPC controls size in nonparametric class of models

u|s ∼ N (0ZΣ() ()) for all cdfs

⇒ Numerical result: in many spatial designs, SCPC controls size for mixtures of Matérn covariance

functions with implied ≤ 0

• Numerical result: SCPC close to minimizing expected confidence interval length in particular classof spatial correlations among all functions of y

13

Page 15: Spatial Correlation Robust Inference

Weak Correlation: Set-up of Lahiri (2003)

• Location are sampled i.i.d. on S ⊂ R

• Conditional on locations s = (1 ), for some sequence 0,

= ()

with a mean-zero stationary random field on R with E[()()] = ( − )

• Calculation: = (1), so weak dependence when →∞

⇒ nature of weak dependence characterized by = → ∈ [0∞)

⇒ →∞ corresponds to asymptotically negligible spatial correlation

14

Page 16: Spatial Correlation Robust Inference

Weak Correlation: Weighted Averages

• Lemma: Let the × (+ 1) matrixW0 have th row w0() where w0() = (1w()0) w() =

(1() ())0 and : S 7→ R are continuous weight functions. Under mixing and moment

assumptions on of Lahiri (2003)

12 −12W00u|s⇒ N (0Ω) with Ω = (0)V1 +

µZ()

¶V2

where

V1 =

ZSw0()w0()0() and V2 =

ZSw0()w0()0()2

⇒ V1 is what we would expect from i.i.d. data, large corresponds to very weak correlation

⇒ V2 proportional to V1 only under constant

• Convergence holds conditional on s: Randomness in locations doesn’t drive variability of weightedaverages

15

Page 17: Spatial Correlation Robust Inference

Weak Correlation: t-statistic

• Recall: 12 −12W00u|s⇒ N (0Ω) with Ω = (0)V1 + (R())V2

• Theorem: Let be t-statistic with 2 = −1P=1(w

0y)

2. WithX = (0X01:)

0 ∼ N (0Ω),

under assumptions of Lemma,

P (| | cv |s) → P

⎛⎜⎝ |0|q−1X01:X1:

cv

⎞⎟⎠

• If Ω ∝ V1 =RS w0()w0()0() = I+1, then critical value from student-t

⇒ but Ω ∝ V1 only if uniform, so Sun and Kim (2012) approach not generically valid

16

Page 18: Spatial Correlation Robust Inference

Weak Correlation: Size Control

• Since t-statistic is scale invariant, no loss of generality to normalize

Ω = V1 + (1− )V2 ∈ [0 1)where

V1 =

ZSw0()w0()0() and V2 =

ZSw0()w0()0()2

⇒ scalar parameter ∈ [0 1) fully characterizes all possible limits under weak correlation

• SCPC benchmark model (− ) = exp(−||− ||) replicates all such limits with appropriatechoices of = →∞

Theorem: SCPC critical value construction sup≥0 P0Σ()

(|SCPC()| cvSCPC()|s) = implies

size control also under arbitrary sequences ≥ 0, and, therefore, under generic weak correlation.

17

Page 19: Spatial Correlation Robust Inference

Weak Correlation: Additional Results

• Convergence of first eigenvectors ofMΣ0M to limiting eigenfunctions : S 7→ R in appropriatesense

• Analogous results to those above for t-statistics with ‘fixed-b’ kernel variance estimators

2 = −1X

µ||−||

with cv obtained from distribution in i.i.d. model, as in Bester, Conley, Hansen, and Vogelsang (2016)

⇒ generic size control under weak correlation again only with uniform

18

Page 20: Spatial Correlation Robust Inference

Size Control Under Mixtures I

• Consider Gaussian model and parametric covariance matrices u|s ∼ N (0Σ()), ∈ Θ.

Seek easy-to-compute conditions such that t-statistic is valid in nonparametric class of models

u|s ∼ N (0ZΣ() ()) for all cdfs

• Suppose 2 = u0WW0u, cv and Σ0 are such that

P(| | cv) ≤ under y ∼ N (0Σ0)so can one find inequalities relating Σ(), ∈ Θ to Σ0 such that also

P(| | cv) ≤ under y ∼ Nµ0

ZΣ() ()

¶for any cdf on Θ?

19

Page 21: Spatial Correlation Robust Inference

Size Control Under Mixtures II

Theorem: With W0 = (lW), let Ω0 = W00Σ0W0 and Ω() = W00Σ()W0. Suppose A0 =

D(cv)Ω0 with D(cv) = diag(1− cv2 I) is diagonalizable, and let P be its eigenvectors. Define

A() = P−1D(cv)Ω()P and A() = 12(A() +A()0), and suppose A0 and A() ∈ Θ are scale

normalized such that 1(A0) = 1(A()) = 1, where (·) is the th largest eigenvalue. Let

1() = (−A())− 1(A())(−A0)− (1(A())− 1)() = +1−(−A())− 1(A())+1−(−A0) for = 2

If inf∈ΘP=1 () ≥ 0 for all 1 ≤ ≤ , then for any cdf on Θ, P(| | cv) ≤ under

y ∼ N (0Σ0) implies

P(| | cv) ≤ under y ∼ Nµ0

ZΣ() ()

20

Page 22: Spatial Correlation Robust Inference

Implications of Mixture Theorem

1. Spatial context

• Apply SCPC in U.S. states “light” and uniform designs

• Use theorem to numerically establish size control in arbitrary mixtures of Matérn-class covariancematrices with ∈ 12 32 52∞ and 0 such that implied ≤ 0

2. Regular time series

• Consider equal-weighted cosine (EWC) t-statistic of Müller (2004, 2007), Lazarus et al. (2018)with critical value computed from benchmark AR(1) model with coefficient 1− 0

• Class of spectral densities where ()0() is (weakly) monotonically increasing in ||⇒ Can be written as mixture (|) ∝

R1[ ]0() ()

⇒ Theorem implies small sample and asymptotic size control (cf. results in Dou (2019))

21

Page 23: Spatial Correlation Robust Inference

Intuition for Theorem

• With Ω() =W00Σ(θ)W0 and X = (0X01:)

0 ∼ Ω()12Z ∼ N (0Ω())

P³2 cv2

´→ P

⎛⎝ 20

X01:X1: cv2

⎞⎠ = P ³20 − cv2X01:X1: 0

´= P

³X0D(cv)X 0

´= P(Z0Ω()12D(cv)Ω()12Z 0)

= P

⎛⎝ X=0

()2 0

⎞⎠ = P⎛⎝20 − X

=1

()

0()2

⎞⎠where and D(cv) = diag(1− cv2 I) and () are the eigenvalues of Ω()12D(cv)Ω()12, or,equivalently, of D(cv)Ω()

• Can show: P³20

P=1

2

´is Schur convex in =1

• Use majorization theory on eigenvalues of linear combinations of matrices (plus additional calcula-tions) to obtain result

22

Page 24: Spatial Correlation Robust Inference

Monte Carlo Comparison with Existing Methods

For each of the 48 contiguous U.S. states, draw 5 independent samples of 500 locations from density

∈ uniform, light

• Gaussian benchmark model with ( − ) = exp(−0|| − ||), 0 calibrated from 0

• Alternative methods

1. Conley (1999) with choice of that minimizes size distortion

2. Fixed-b kernel variance estimator with = max || − || (analog to Kiefer, Vogelsang andBunzel (2000))

3. Sun and Kim (2012) with chosen according to their rule using true covariance function

4. Ibragimov and Müller (2010) cluster inference with = 4 and = 9 clusters

23

Page 25: Spatial Correlation Robust Inference

Monte Carlo Results

24

Page 26: Spatial Correlation Robust Inference

Monte Carlo Results

25

Page 27: Spatial Correlation Robust Inference

Monte Carlo Results

26

Page 28: Spatial Correlation Robust Inference

Conclusions

• New method to conduct spatial correlation robust inference that accounts for

— non-uniform distribution of locations

— (some) strong spatial correlations

• Potentially applicable also in network econometrics or more generally under given plausible form of

‘worst-case’ correlation structure

27