Generalized Regression Model Based on Greene’s Note 15 (Chapter 8)


Page 1: Generalized Regression Model

Generalized Regression Model

Based on Greene’s Note 15 (Chapter 8)

Page 2: Generalized Regression Model

Generalized Regression Model

Setting: The classical linear model assumes that E[εε′|X] = Var[ε|X] = σ²I. That is, observations are uncorrelated and all are drawn from a distribution with the same variance. The generalized regression (GR) model allows the variances to differ across observations and allows correlation across observations.

Page 3: Generalized Regression Model

Implications

• The assumption that Var[ε] = σ²I is used to derive the result Var[b] = σ²(X′X)⁻¹. If it is not true, then the use of s²(X′X)⁻¹ to estimate Var[b] is inappropriate.

• The assumption was used to derive most of our test statistics, so they must be revised as well.

• Least squares gives each observation a weight of 1/n. But, if the variances are not equal, then some observations are more informative than others.

• Least squares is based on simple sums, so the information that one observation might provide about another is never used.

Page 4: Generalized Regression Model

GR Model

• The generalized regression model: y = Xβ + ε, E[ε|X] = 0, Var[ε|X] = σ²Ω. Regressors are well behaved. We consider some examples with trace(Ω) = n. (This is a normalization with no content.)

• Leading cases:
  – Simple heteroscedasticity
  – Autocorrelation
  – Panel data and heterogeneity more generally.

Page 5: Generalized Regression Model

Least Squares

• Still unbiased. (The proof did not rely on Ω.)

• For consistency, we need the true variance of b,

  Var[b|X] = E[(b − β)(b − β)′|X] = (X′X)⁻¹ E[X′εε′X|X] (X′X)⁻¹ = σ² (X′X)⁻¹ X′ΩX (X′X)⁻¹.

  Divide all four terms by n. If the middle one converges to a finite matrix of constants, we have the result, so we need to examine

  (1/n)X′ΩX = (1/n) Σᵢ Σⱼ ωᵢⱼ xᵢxⱼ′.

  This will be another assumption of the model.

• Asymptotic normality? Easy for the heteroscedasticity case, very difficult for the autocorrelation case.

Page 6: Generalized Regression Model

Robust Covariance Matrix

• Robust estimation: how to estimate Var[b|X] = σ² (X′X)⁻¹ X′ΩX (X′X)⁻¹ for the LS b?

• The distinction between estimating σ²Ω, an n by n matrix, and estimating σ²X′ΩX = σ² Σᵢ Σⱼ ωᵢⱼ xᵢxⱼ′, a K by K matrix.

• For modern applied econometrics:
  – The White estimator
  – Newey-West.

Page 7: Generalized Regression Model

The White Estimator

Est.Var[b] = (X′X)⁻¹ [Σᵢ₌₁ⁿ σ̂ᵢ² xᵢxᵢ′] (X′X)⁻¹

Use σ̂ᵢ² = eᵢ². Let Ω̂ = diag(eᵢ²/σ̂²) with σ̂² = (1/n) Σᵢ₌₁ⁿ eᵢ²; note tr(Ω̂) = n. Then

Est.Var[b] = (1/n) [(1/n)X′X]⁻¹ [σ̂² (1/n)X′Ω̂X] [(1/n)X′X]⁻¹

Does (1/n)X′Ω̂X − (1/n)X′ΩX → 0?
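The sandwich formula above is easy to compute directly. Below is a minimal numpy sketch (the function name and the simulated data are illustrative, not from the slides); it returns the OLS coefficients together with the White estimate of Var[b].

```python
import numpy as np

def white_cov(X, y):
    """Heteroscedasticity-robust (White) covariance matrix for the OLS slopes.

    A minimal sketch of the sandwich formula on this slide:
    Est.Var[b] = (X'X)^-1 [sum_i e_i^2 x_i x_i'] (X'X)^-1.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)                 # OLS coefficients
    e = y - X @ b                           # least squares residuals
    meat = (X * (e ** 2)[:, None]).T @ X    # sum_i e_i^2 x_i x_i'
    return b, XtX_inv @ meat @ XtX_inv

# Illustrative use with simulated heteroscedastic data (names are hypothetical).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n) * (1.0 + np.abs(X[:, 1]))
b, V = white_cov(X, y)
print("b:", b, "robust s.e.:", np.sqrt(np.diag(V)))
```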

Page 8: Generalized Regression Model

Newey-West Estimator

Heteroscedasticity component – diagonal elements:

S₀ = (1/n) Σᵢ₌₁ⁿ eᵢ² xᵢxᵢ′

Autocorrelation component – off-diagonal elements:

S₁ = (1/n) Σₗ₌₁ᴸ Σₜ₌ₗ₊₁ⁿ wₗ eₜeₜ₋ₗ (xₜxₜ₋ₗ′ + xₜ₋ₗxₜ′)

wₗ = 1 − l/(L+1) = the "Bartlett weight"

Est.Var[b] = (1/n) [(1/n)X′X]⁻¹ [S₀ + S₁] [(1/n)X′X]⁻¹
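A minimal numpy sketch of the same calculation, assuming a user-chosen lag truncation L (the function name is illustrative); it adds the Bartlett-weighted autocorrelation terms S₁ to the diagonal component S₀.

```python
import numpy as np

def newey_west_cov(X, y, L):
    """Newey-West (HAC) covariance for OLS with Bartlett weights.

    A minimal sketch: S = S0 + S1 as defined on this slide, then
    Est.Var[b] = (1/n) (X'X/n)^-1 S (X'X/n)^-1.
    """
    n = X.shape[0]
    XtXn_inv = np.linalg.inv(X.T @ X / n)
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    S = (X * (e ** 2)[:, None]).T @ X / n          # S0: diagonal (heteroscedasticity) part
    for l in range(1, L + 1):
        w = 1.0 - l / (L + 1.0)                    # Bartlett weight
        G = (X[l:] * (e[l:] * e[:-l])[:, None]).T @ X[:-l] / n
        S += w * (G + G.T)                         # off-diagonal (autocorrelation) part
    return XtXn_inv @ S @ XtXn_inv / n
```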

Page 9: Generalized Regression Model

Generalized Least Squares

A transformation of the model: P = Ω⁻¹/², P′P = Ω⁻¹.

Py = PXβ + Pε, or y* = X*β + ε*. Why?

E[ε*ε*′|X*] = P E[εε′|X*] P′ = P E[εε′|X] P′ = σ²PΩP′ = σ² Ω⁻¹/² Ω Ω⁻¹/² = σ² Ω⁰ = σ²I
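For a known Ω, this transformation can be carried out directly. A minimal numpy sketch, assuming Ω is symmetric positive definite (the function name is illustrative):

```python
import numpy as np

def gls(X, y, Omega):
    """Generalized least squares via the transformation P = Omega^(-1/2).

    A minimal sketch: build a symmetric inverse square root of Omega from its
    eigendecomposition, then apply OLS to (PX, Py); algebraically the same as
    (X' Omega^-1 X)^-1 X' Omega^-1 y.
    """
    vals, vecs = np.linalg.eigh(Omega)          # Omega assumed symmetric positive definite
    P = vecs @ np.diag(vals ** -0.5) @ vecs.T   # Omega^(-1/2), so P'P = Omega^-1
    Xs, ys = P @ X, P @ y
    return np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)
```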

Page 10: Generalized Regression Model

Generalized Least Squares

Aitken theorem. The generalized least squares estimator, GLS:

Py = PXβ + Pε, or y* = X*β + ε*, with E[ε*ε*′|X*] = σ²I.

Use ordinary least squares in the transformed model. It satisfies the Gauss–Markov theorem.

b* = (X*′X*)⁻¹X*′y*

Page 11: Generalized Regression Model

Generalized Least Squares

Efficient estimation of β and, by implication, the inefficiency of least squares b.

β̂ = (X*′X*)⁻¹X*′y* = (X′P′PX)⁻¹ X′P′Py = (X′Ω⁻¹X)⁻¹ X′Ω⁻¹y ≠ b.

β̂ is efficient, so by construction, b is not.

Page 12: Generalized Regression Model

Asymptotics for GLS

Asymptotic distribution of GLS. (NOTE: We apply the full set of results of the classical model to the transformed model.)

• Unbiasedness
• Consistency – "well behaved data"
• Asymptotic distribution
• Test statistics

Page 13: Generalized Regression Model

Unbiasedness

β̂ = (X′Ω⁻¹X)⁻¹X′Ω⁻¹y = β + (X′Ω⁻¹X)⁻¹X′Ω⁻¹ε

E[β̂|X] = β + (X′Ω⁻¹X)⁻¹X′Ω⁻¹ E[ε|X]
        = β if E[ε|X] = 0

Page 14: Generalized Regression Model

Consistency

Use mean square convergence.

Var[β̂|X] = (σ²/n) [(1/n)X′Ω⁻¹X]⁻¹ → 0?

Requires (1/n)X′Ω⁻¹X to be "well behaved": either converge to a constant matrix or diverge.

Heteroscedasticity case:

(1/n)X′Ω⁻¹X = (1/n) Σᵢ₌₁ⁿ (1/ωᵢᵢ) xᵢxᵢ′

Autocorrelation case:

(1/n)X′Ω⁻¹X = (1/n) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ ωⁱʲ xᵢxⱼ′, where ωⁱʲ is the ij-th element of Ω⁻¹. n² terms. Convergence is unclear.

Page 15: Generalized Regression Model

Asymptotic Normality

√n(β̂ − β) = [(1/n)X′Ω⁻¹X]⁻¹ (1/√n)X′Ω⁻¹ε

Converge to normal with a stable variance O(1)?

Is (1/n)X′Ω⁻¹X a constant matrix?
Is (1/√n)X′Ω⁻¹ε a mean to which we can apply the central limit theorem?

Heteroscedasticity case?

(1/√n)X′Ω⁻¹ε = (1/√n) Σᵢ₌₁ⁿ (1/ωᵢᵢ) xᵢεᵢ. Var[εᵢ] = σ²ωᵢᵢ; xᵢ is just data.
Apply Lindeberg-Feller. (Or, assuming xᵢ/ωᵢᵢ is a draw from a common distribution with mean and fixed variance – some recent treatments.)

Autocorrelation case?

Page 16: Generalized Regression Model

Asymptotic Normality (Cont.)

For the autocorrelation case:

(1/√n)X′Ω⁻¹ε = (1/√n) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ ωⁱʲ xᵢεⱼ

Does the double sum converge? Uncertain. Requires elements of Ω to become small as the distance between i and j increases. (Has to resemble the heteroscedasticity case.)

Page 17: Generalized Regression Model

Test Statistics (Assuming Known Ω)

• With known Ω, apply all familiar results to the transformed model.

• With normality, t and F statistics apply to least squares based on Py and PX.

• With asymptotic normality, use Wald statistics and the chi-squared distribution, still based on the transformed model.

Page 18: Generalized Regression Model

Generalized (Weighted) Least Squares: Heteroscedasticity

Var[ε] = σ²Ω = σ² diag(ω₁, ω₂, ..., ωₙ)

Ω⁻¹/² = diag(1/√ω₁, 1/√ω₂, ..., 1/√ωₙ)

β̂ = (X′Ω⁻¹X)⁻¹X′Ω⁻¹y = [Σᵢ₌₁ⁿ (1/ωᵢ) xᵢxᵢ′]⁻¹ [Σᵢ₌₁ⁿ (1/ωᵢ) xᵢyᵢ]

σ̂² = [Σᵢ₌₁ⁿ (yᵢ − xᵢ′β̂)²/ωᵢ] / (n − K)
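A minimal numpy sketch of this weighted least squares calculation, assuming the ωᵢ are known up to scale (names are illustrative):

```python
import numpy as np

def wls(X, y, omega):
    """Weighted least squares for heteroscedasticity, Var[eps_i] = sigma^2 * omega_i.

    A minimal sketch: divide each observation by sqrt(omega_i) and apply OLS,
    which reproduces beta_hat = [sum x_i x_i'/omega_i]^-1 [sum x_i y_i/omega_i];
    sigma^2 is then estimated from the transformed residuals.
    """
    w = 1.0 / np.sqrt(omega)
    Xs, ys = X * w[:, None], y * w
    beta = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)
    resid = ys - Xs @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])   # sum (y_i - x_i'b)^2/omega_i / (n - K)
    return beta, sigma2
```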

Page 19: Generalized Regression Model

Autocorrelation

εₜ = ρεₜ₋₁ + uₜ  ('First order autocorrelation.' How does this come about?)
Assume −1 < ρ < 1. Why?
uₜ = 'nonautocorrelated white noise'

εₜ = ρεₜ₋₁ + uₜ  (the autoregressive form)
   = ρ(ρεₜ₋₂ + uₜ₋₁) + uₜ
   = ...  (continue to substitute)
   = uₜ + ρuₜ₋₁ + ρ²uₜ₋₂ + ρ³uₜ₋₃ + ...  (the moving average form)

(Some observations about modeling time series.)

Page 20: Generalized Regression Model

Autocorrelation

Var[εₜ] = Var[uₜ + ρuₜ₋₁ + ρ²uₜ₋₂ + ...]
        = Σᵢ₌₀^∞ ρ²ⁱ Var[uₜ₋ᵢ]
        = σᵤ²/(1 − ρ²)

An easier way: since Var[εₜ] = Var[εₜ₋₁] and uₜ is uncorrelated with εₜ₋₁,

Var[εₜ] = ρ²Var[εₜ₋₁] + Var[uₜ] + 2ρCov[εₜ₋₁, uₜ]
        = ρ²Var[εₜ] + σᵤ² + 0,

so Var[εₜ] = σᵤ²/(1 − ρ²).

Page 21: Generalized Regression Model

Autocovariances

Continuing...

Cov[εₜ, εₜ₋₁] = Cov[ρεₜ₋₁ + uₜ, εₜ₋₁]
             = ρCov[εₜ₋₁, εₜ₋₁] + Cov[uₜ, εₜ₋₁]
             = ρVar[εₜ₋₁] = ρVar[εₜ]
             = ρσᵤ²/(1 − ρ²)

Cov[εₜ, εₜ₋₂] = Cov[ρεₜ₋₁ + uₜ, εₜ₋₂]
             = ρCov[εₜ₋₁, εₜ₋₂] + Cov[uₜ, εₜ₋₂]
             = ρCov[εₜ, εₜ₋₁]
             = ρ²σᵤ²/(1 − ρ²)

and so on.

Page 22: Generalized Regression Model

Autocorrelation Matrix

σ²Ω = [σᵤ²/(1 − ρ²)] ×
  [ 1         ρ         ρ²        ...  ρ^(T−1) ]
  [ ρ         1         ρ         ...  ρ^(T−2) ]
  [ ρ²        ρ         1         ...  ρ^(T−3) ]
  [ ...       ...       ...       ...  ...     ]
  [ ρ^(T−1)   ρ^(T−2)   ρ^(T−3)   ...  1       ]

(Note, trace(Ω) = n, as required.)
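A minimal numpy sketch that builds this covariance matrix for given ρ and σᵤ (the function name is illustrative); the correlation factor has ones on the diagonal, so its trace is T, matching the trace(Ω) = n normalization.

```python
import numpy as np

def ar1_omega(T, rho, sigma_u=1.0):
    """Covariance matrix of an AR(1) disturbance: [sigma_u^2/(1-rho^2)] * rho^|t-s|."""
    idx = np.arange(T)
    corr = rho ** np.abs(idx[:, None] - idx[None, :])   # Toeplitz correlation matrix, 1s on the diagonal
    return (sigma_u ** 2 / (1.0 - rho ** 2)) * corr

# The trace of the correlation factor equals T.
print(np.trace(ar1_omega(5, 0.6) * (1.0 - 0.6 ** 2)))   # 5.0
```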

Page 23: Generalized Regression Model

Generalized Least Squares

Ω⁻¹/² = [1/√(1 − ρ²)] ×
  [ √(1 − ρ²)   0     0    ...   0 ]
  [ −ρ          1     0    ...   0 ]
  [ 0          −ρ     1    ...   0 ]
  [ ...        ...   ...   ...  ... ]
  [ 0           0    ...   −ρ    1 ]

Transformed data (the scale factor is irrelevant for GLS):

y* = [ √(1 − ρ²)y₁,  y₂ − ρy₁,  y₃ − ρy₂,  ...,  y_T − ρy_{T−1} ]′

Page 24: Generalized Regression Model

The Autoregressive Transformation

yₜ = xₜ′β + εₜ,  εₜ = ρεₜ₋₁ + uₜ
ρyₜ₋₁ = ρxₜ₋₁′β + ρεₜ₋₁

yₜ − ρyₜ₋₁ = (xₜ − ρxₜ₋₁)′β + (εₜ − ρεₜ₋₁)
yₜ − ρyₜ₋₁ = (xₜ − ρxₜ₋₁)′β + uₜ

(Where did the first observation go?)
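A minimal numpy sketch of this quasi-difference transformation for a known ρ (the function name and the keep_first switch are illustrative); keeping a rescaled first observation gives the Prais-Winsten variant, dropping it gives Cochrane-Orcutt.

```python
import numpy as np

def ar1_transform(X, y, rho, keep_first=True):
    """Autoregressive (quasi-difference) transformation for AR(1) disturbances.

    Rows 2..T become y_t - rho*y_{t-1} and x_t - rho*x_{t-1}. With
    keep_first=True the first observation is retained, rescaled by
    sqrt(1 - rho^2); otherwise it is simply dropped.
    """
    ys = y[1:] - rho * y[:-1]
    Xs = X[1:] - rho * X[:-1]
    if keep_first:
        c = np.sqrt(1.0 - rho ** 2)
        ys = np.concatenate([[c * y[0]], ys])
        Xs = np.vstack([c * X[:1], Xs])
    return Xs, ys

# FGLS for a known rho is then just OLS on the transformed data:
# Xs, ys = ar1_transform(X, y, rho); beta = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)
```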

Page 25: Generalized Regression Model

Unknown Ω

• The problem (of course): Ω is unknown. For now, we will consider two methods of estimation:
  – Two step, or feasible estimation. Estimate Ω first, then do GLS. Emphasize: same logic as White and Newey-West. We don't need to estimate Ω. We need to find a matrix that behaves the same as (1/n)X′Ω⁻¹X.
  – Properties of the feasible GLS estimator.
• Maximum likelihood estimation of θ, σ², and β all at the same time.
  – Joint estimation of all parameters. Fairly rare. Some generalities…
  – We will examine two applications: Harvey's model of heteroscedasticity and Beach-MacKinnon on the first order autocorrelation model.

Page 26: Generalized Regression Model

Specification

• Ω must be specified first.
• A full unrestricted Ω contains n(n+1)/2 − 1 parameters. (Why minus 1? Remember, tr(Ω) = n, so one element is determined.)
• Ω is generally specified in terms of a few parameters. Thus, Ω = Ω(θ) for some small parameter vector θ. It becomes a question of estimating θ.
• Examples:

Page 27: Generalized Regression Model

Heteroscedasticity: Harvey’s Model

• Var[εᵢ | X] = σ² exp(θ′zᵢ)
• Cov[εᵢ, εⱼ | X] = 0

  e.g.: zᵢ = firm size
  e.g.: zᵢ = a set of dummy variables (e.g., countries) (The groupwise heteroscedasticity model.)

• [σ²Ω] = diagonal [exp(α + θ′zᵢ)], α = log(σ²)

Page 28: Generalized Regression Model

AR(1) Model of Autocorrelation

σ²Ω = [σᵤ²/(1 − ρ²)] ×
  [ 1         ρ         ρ²        ...  ρ^(T−1) ]
  [ ρ         1         ρ         ...  ρ^(T−2) ]
  [ ρ²        ρ         1         ...  ρ^(T−3) ]
  [ ...       ...       ...       ...  ...     ]
  [ ρ^(T−1)   ρ^(T−2)   ρ^(T−3)   ...  1       ]

Page 29: Generalized Regression Model

Two Step Estimation

The general result for estimation when Ω is estimated.

GLS uses [X′Ω⁻¹X]⁻¹X′Ω⁻¹y, which converges in probability to β.

We seek a vector which converges to the same thing that this does. Call it "Feasible GLS" or FGLS, based on [X′Ω̂⁻¹X]⁻¹X′Ω̂⁻¹y.

The object is to find a set of parameters such that

[X′Ω̂⁻¹X]⁻¹X′Ω̂⁻¹y − [X′Ω⁻¹X]⁻¹X′Ω⁻¹y → 0.

Page 30: Generalized Regression Model

Feasible GLS

For FGLS estimation, we do not seek an estimator of Ω such that

Ω̂ − Ω → 0.

This makes no sense, since Ω is n×n and does not "converge" to anything. We seek a matrix Ω̂ such that

(1/n)X′Ω̂⁻¹X − (1/n)X′Ω⁻¹X → 0.

For the asymptotic properties, we will require that

(1/√n)X′Ω̂⁻¹ε − (1/√n)X′Ω⁻¹ε → 0.

Note in this case, these are two random vectors, which we require to converge to the same random vector.

Page 31: Generalized Regression Model

Two Step FGLS

(Theorem 8.5) To achieve full efficiency, we do not need an efficient estimate of the parameters in Ω, only a consistent one. Why?

Page 32: Generalized Regression Model

Harvey’s Model

Examine Harvey’s model once again. Methods of estimation:

Two step FGLS: Use the least squares residuals to estimate θ, then use {X′[Ω(θ̂)]⁻¹X}⁻¹X′[Ω(θ̂)]⁻¹y to estimate β.

Full maximum likelihood estimation. Estimate all parameters simultaneously.

A handy result due to Oberhofer and Kmenta - the “zig-zag” approach.

Examine a model of groupwise heteroscedasticity.
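A minimal numpy sketch of the two-step FGLS calculation just described, assuming the variance drivers are collected in an n×p matrix Z (function and variable names are illustrative). The constant in the log-squared-residual regression absorbs log σ² and any bias in it; only the slope estimates of θ matter, and a consistent θ̂ is all FGLS needs.

```python
import numpy as np

def harvey_fgls(X, y, Z):
    """Two-step FGLS for Harvey's model, Var[eps_i] = sigma^2 * exp(theta'z_i).

    A minimal sketch: (1) OLS residuals; (2) regress log(e_i^2) on a constant
    and z_i to get a consistent estimate of theta; (3) weighted least squares
    with weights 1/exp(theta_hat'z_i).
    """
    n = X.shape[0]
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b_ols
    Zc = np.column_stack([np.ones(n), Z])                    # constant absorbs log(sigma^2)
    gamma = np.linalg.solve(Zc.T @ Zc, Zc.T @ np.log(e ** 2))
    theta = gamma[1:]
    omega = np.exp(Z @ theta)                                # variance function, up to scale
    w = 1.0 / np.sqrt(omega)
    Xs, ys = X * w[:, None], y * w
    beta_fgls = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)
    return beta_fgls, theta
```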

Page 33: Generalized Regression Model

Harvey’s Model for Groupwise Heteroscedasticity

Groupwise sample, yig, xig,…

N groups, each with Ng observations.

Var[εig] = σg²

Let djg = 1 if observation i,g is in group j, 0 else (a group dummy variable).

Var[εig] = σ1² exp(θ2d2 + … + θGdG)

Var1 = σ1², Var2 = σ1² exp(θ2), and so on.

Page 34: Generalized Regression Model

Estimating Variance Components

• OLS is still consistent:
• Est.Var1 = e1′e1/N1 estimates σ1²
• Est.Var2 = e2′e2/N2 estimates σ1² exp(θ2)
• Estimator of θ2 is ln[(e2′e2/N2)/(e1′e1/N1)]
• (1) Now use FGLS – weighted least squares
• Recompute residuals using WLS slopes
• (2) Recompute variance estimators
• Iterate to a solution… between (1) and (2)
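A minimal numpy sketch of this iteration, assuming groups is a numpy array of group labels, one per observation (names are illustrative):

```python
import numpy as np

def groupwise_fgls(X, y, groups, n_iter=10):
    """Iterated FGLS for groupwise heteroscedasticity, Var[eps_ig] = sigma_g^2.

    A minimal sketch of the zig-zag between the two steps on this slide:
    (1) estimate each group variance as e_g'e_g/N_g from the current residuals,
    (2) re-estimate the slopes by weighted least squares, and repeat.
    """
    beta = np.linalg.solve(X.T @ X, X.T @ y)              # start from OLS (consistent)
    labels = np.unique(groups)
    for _ in range(n_iter):
        e = y - X @ beta
        sig2 = {g: np.mean(e[groups == g] ** 2) for g in labels}   # e_g'e_g / N_g
        w = 1.0 / np.sqrt(np.array([sig2[g] for g in groups]))
        Xs, ys = X * w[:, None], y * w
        beta = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)
    return beta, sig2
```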