Charles University, FSV UK, Institute of Economic Studies, Faculty of Social Sciences. Jan Ámos Víšek: Econometrics (STAKAN III), Third Lecture. Tuesday, 14.00 – 15.20. http://samba.fsv.cuni.cz/~visek/Econometrics_Up_To_2010/


Page 1: Charles University

Charles University, FSV UK
Institute of Economic Studies, Faculty of Social Sciences
STAKAN III

Jan Ámos Víšek

Econometrics

Third Lecture
Tuesday, 14.00 – 15.20

http://samba.fsv.cuni.cz/~visek/Econometrics_Up_To_2010/

Page 2: Charles University

Schedule of today's talk

Recalling OLS and the definition of a linear estimator.

Discussion of the restrictions implied by linearity, for estimators and for models.

Proof of the theorem given at the end of the last lecture.

Definition of the best (linear unbiased) estimator.

Under normality of disturbances, OLS is BUE.

Page 3: Charles University

Ordinary Least Squares (odhad metodou nejmenších čtverců)

$\hat\beta^{(OLS,n)} = (X^T X)^{-1} X^T Y = \beta^0 + (X^T X)^{-1} X^T \varepsilon .$

Definition

An estimator $\tilde\beta(Y,X) = L Y$, where $L = L(X)$ is a $(p \times n)$ matrix, is called a linear estimator.
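A minimal numerical sketch (not part of the slides; NumPy and simulated data assumed) showing that OLS is indeed a linear estimator: the same $(p \times n)$ matrix $L = (X^TX)^{-1}X^T$ maps any response vector $Y$ to the estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
beta0 = np.array([1.0, -2.0, 0.5])
Y = X @ beta0 + rng.normal(scale=0.1, size=n)

L = np.linalg.inv(X.T @ X) @ X.T           # the (p x n) matrix L(X)
beta_hat = L @ Y                           # the linear estimator L Y
beta_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]

assert L.shape == (p, n)
assert np.allclose(beta_hat, beta_lstsq)   # agrees with library OLS
```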

Page 4: Charles University

Theorem

Assumptions: Let $\{\varepsilon_i\}_{i=1}^{\infty}$ be a sequence of r.v.'s with $E\varepsilon_i = 0$ and $E\varepsilon_i\varepsilon_j = \sigma^2\delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta, i.e. $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ for $i \neq j$.

Assertions: Then $\hat\beta^{(OLS,n)}$ is the best linear unbiased estimator.

Assumptions: If moreover $X^T X = O(n)$, $(X^T X)^{-1} = O(n^{-1})$ and the $\varepsilon_i$'s are independent,

Assertions: then $\hat\beta^{(OLS,n)}$ is consistent.

Assumptions: If further $\lim_{n\to\infty}\frac{1}{n}X^T X = Q$, a regular matrix,

Assertions: then $\mathcal{L}\left(\sqrt{n}\,(\hat\beta^{(OLS,n)} - \beta^0)\right) \to N(0, \sigma^2 Q^{-1})$, where $\mathrm{cov}\left(\sqrt{n}\,(\hat\beta^{(OLS,n)} - \beta^0)\right) \to \sigma^2 Q^{-1}$.
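A Monte Carlo sketch (illustration only, hypothetical data) of the unbiasedness behind the first assertion: averaging OLS estimates over many independent draws of the disturbances, with the design matrix held fixed, recovers $\beta^0$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, reps = 40, 2, 2000
X = rng.normal(size=(n, p))
beta0 = np.array([2.0, -1.0])
L = np.linalg.inv(X.T @ X) @ X.T

# repeat the experiment: new disturbances each time, same design matrix X
est = np.array([L @ (X @ beta0 + rng.standard_normal(n)) for _ in range(reps)])

assert est.shape == (reps, p)
assert np.allclose(est.mean(axis=0), beta0, atol=0.05)   # E beta_hat = beta0
```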

Page 5: Charles University

Proof

$\hat\beta^{(OLS,n)}$ is linear:
$\hat\beta^{(OLS,n)} = (X^T X)^{-1}X^T Y = LY$. Remember that we have denoted $L = (X^T X)^{-1}X^T$.

$\hat\beta^{(OLS,n)}$ is unbiased:
$E\hat\beta^{(OLS,n)} = E\left\{\beta^0 + (X^T X)^{-1}X^T\varepsilon\right\} = \beta^0 + (X^T X)^{-1}X^T E\varepsilon = \beta^0 .$

$\hat\beta^{(OLS,n)}$ is BLUE: see the next slides.

Page 6: Charles University

Definition

The estimator $\hat\theta$ is the best one in a given class of estimators if for any other estimator $\tilde\theta$ of the class the matrix $\mathrm{cov}\{\tilde\theta\} - \mathrm{cov}\{\hat\theta\}$ is positive semidefinite, i.e. for any $\lambda \in R^p$ we have

$\lambda^T\left(\mathrm{cov}\{\tilde\theta\} - \mathrm{cov}\{\hat\theta\}\right)\lambda \geq 0 .$

Recalling that $\mathrm{cov}\{Z\} = E\left\{(Z - EZ)(Z - EZ)^T\right\}$.

Page 7: Charles University

$\hat\beta^{(OLS,n)}$ is the best in the class of unbiased linear estimators:

$\mathrm{cov}\{\hat\beta^{(OLS,n)}\} = \sigma^2 L L^T .$

Let $\tilde\beta = \tilde{L}Y$ be another unbiased linear estimator. Unbiasedness requires $\beta^0 = E\tilde{L}Y = \tilde{L}X\beta^0$ for all $\beta^0 \in R^p$, i.e. $\tilde{L}X = I$ (the unit matrix). Then

$\mathrm{cov}\{\tilde\beta\} = E\left\{(\tilde{L}Y - \beta^0)(\tilde{L}Y - \beta^0)^T\right\} = E\left\{(\tilde{L}Y - \tilde{L}X\beta^0)(\tilde{L}Y - \tilde{L}X\beta^0)^T\right\}$
$= \tilde{L}\,E\left\{(Y - X\beta^0)(Y - X\beta^0)^T\right\}\tilde{L}^T = \tilde{L}\,E\{\varepsilon\varepsilon^T\}\,\tilde{L}^T = \sigma^2\tilde{L}\tilde{L}^T .$

Page 8: Charles University

$\hat\beta^{(OLS,n)}$ is the best in the class of unbiased linear estimators:

$\mathrm{cov}\{\tilde\beta\} = \sigma^2\tilde{L}\tilde{L}^T, \qquad \mathrm{cov}\{\hat\beta^{(OLS,n)}\} = \sigma^2 L L^T .$

Since $\tilde{L}X = I$,
$(\tilde{L} - L)L^T = \tilde{L}X(X^TX)^{-1} - (X^TX)^{-1}X^TX(X^TX)^{-1} = (X^TX)^{-1} - (X^TX)^{-1} = 0 ,$
and hence
$\tilde{L}\tilde{L}^T = \left(L + (\tilde{L} - L)\right)\left(L + (\tilde{L} - L)\right)^T = LL^T + L(\tilde{L} - L)^T + (\tilde{L} - L)L^T + (\tilde{L} - L)(\tilde{L} - L)^T = LL^T + (\tilde{L} - L)(\tilde{L} - L)^T ,$
so that $\mathrm{cov}\{\tilde\beta\} - \mathrm{cov}\{\hat\beta^{(OLS,n)}\} = \sigma^2(\tilde{L} - L)(\tilde{L} - L)^T$ is positive semidefinite.
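The algebra above can be checked numerically (a sketch with simulated data, not from the slides): any matrix $M$ with $MX = 0$ yields another unbiased linear estimator $\tilde L = L + M$, and the covariance difference $\sigma^2 M M^T$ is positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 2
X = rng.normal(size=(n, p))
L = np.linalg.inv(X.T @ X) @ X.T           # the OLS matrix
P = X @ L                                  # projection onto the column space of X

# any M with M X = 0 gives another unbiased linear estimator L + M
M = rng.normal(size=(p, n)) @ (np.eye(n) - P)
L_tilde = L + M

assert np.allclose(L_tilde @ X, np.eye(p))           # unbiasedness: L~ X = I
diff = L_tilde @ L_tilde.T - L @ L.T                 # (cov difference) / sigma^2
assert np.min(np.linalg.eigvalsh(diff)) > -1e-10     # positive semidefinite
```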

Page 9: Charles University

$\hat\beta^{(OLS,n)}$ is consistent:

$\hat\beta^{(OLS,n)} - \beta^0 = (X^TX)^{-1}X^T\varepsilon = \left(\tfrac{1}{n}X^TX\right)^{-1}\tfrac{1}{n}X^T\varepsilon .$

Denote $Z^{(n)} = \tfrac{1}{n}X^T\varepsilon \in R^p$, so that $Z_k^{(n)} = \tfrac{1}{n}\sum_{i=1}^n X_{ik}\varepsilon_i$,

and put $\xi_i = X_{ik}\varepsilon_i$; then $E\xi_i = 0$ and $\mathrm{var}\{\xi_i\} = \sigma^2 X_{ik}^2$.

Page 10: Charles University

$\hat\beta^{(OLS,n)}$ is consistent:

Lemma (law of large numbers)
Let $\{\xi_i\}_{i=1}^\infty$ be a sequence of independent r.v.'s with finite means $\mu_i$ and positive variances $\sigma_i^2$, $i = 1, 2, \ldots$. Let moreover
$\tfrac{1}{n^2}\sum_{i=1}^n \mathrm{var}\{\xi_i\} \to 0$ as $n \to \infty$.
Then $\tfrac{1}{n}\sum_{i=1}^n(\xi_i - \mu_i) \to 0$ in probability.

Proof: For any $\epsilon > 0$, by the Chebyshev inequality,
$P\left(\left|\tfrac{1}{n}\sum_{i=1}^n(\xi_i - \mu_i)\right| > \epsilon\right) \leq \tfrac{1}{\epsilon^2 n^2}\sum_{i=1}^n \mathrm{var}\{\xi_i\} \to 0 .$

Page 11: Charles University

$\hat\beta^{(OLS,n)}$ is consistent:

Recalling the previous slide's lemma: if $\tfrac{1}{n^2}\sum_{i=1}^n \mathrm{var}\{\xi_i\} \to 0$, then $\tfrac{1}{n}\sum_{i=1}^n(\xi_i - \mu_i) \to 0$ in probability.

With $\xi_i = X_{ik}\varepsilon_i$, $\mu_i = E\xi_i = 0$ and $\mathrm{var}\{\xi_i\} = \sigma^2 X_{ik}^2$, the assumption $X^TX = O(n)$ gives
$\tfrac{1}{n^2}\sum_{i=1}^n \mathrm{var}\{\xi_i\} = \tfrac{\sigma^2}{n^2}\sum_{i=1}^n X_{ik}^2 \to 0 ,$
hence $Z_k^{(n)} = \tfrac{1}{n}\sum_{i=1}^n X_{ik}\varepsilon_i \to 0$ in probability for every $k$.
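A simulation sketch of the consistency claim (hypothetical data; the non-normal disturbances are allowed by the assumptions): the average estimation error shrinks as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(8)
beta0 = 3.0

def mean_abs_err(n, reps=200):
    # average |beta_hat - beta0| over repeated samples of size n
    errs = []
    for _ in range(reps):
        x = rng.normal(size=n)
        y = beta0 * x + rng.uniform(-1.0, 1.0, size=n)   # independent, non-normal errors
        errs.append(abs((x @ y) / (x @ x) - beta0))
    return float(np.mean(errs))

e_small, e_large = mean_abs_err(50), mean_abs_err(5000)
assert e_large < e_small / 3      # error shrinks with n
assert e_large < 0.02
```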

Page 12: Charles University

$\hat\beta^{(OLS,n)}$ is asymptotically normal:

Central Limit Theorem (Feller-Lindeberg)
Let $\{\xi_i\}_{i=1}^\infty$ be a sequence of independent r.v.'s with finite means $\mu_i$, positive variances $\sigma_i^2$ and d.f.'s $F_i$, $i = 1, 2, \ldots$. Put
$C_n^2 = \sum_{i=1}^n \mathrm{var}\{\xi_i\}$ and $Z_n = C_n^{-1}\sum_{i=1}^n(\xi_i - \mu_i) .$

Then $\lim_{n\to\infty}\max_{1 \leq i \leq n}\sigma_i C_n^{-1} = 0$ and $\mathcal{L}(Z_n) \to N(0,1)$ if and only if for any $\epsilon > 0$
$\lim_{n\to\infty} C_n^{-2}\sum_{i=1}^n \int_{|z - \mu_i| > \epsilon C_n}(z - \mu_i)^2\,dF_i(z) = 0 .$

Page 13: Charles University

$\hat\beta^{(OLS,n)}$ is asymptotically normal:

Varadarajan theorem
Let $\{Z^{(n)}\}_{n=1}^\infty = \{(Z_1^{(n)}, Z_2^{(n)}, \ldots, Z_p^{(n)})^T\}_{n=1}^\infty$ be a sequence of vectors from $R^p$ with d.f.'s $F^{(n)}$. Further, for any $\lambda \in R^p$ let $F_\lambda^{(n)}$ be the d.f. of $\lambda_1 Z_1^{(n)} + \lambda_2 Z_2^{(n)} + \cdots + \lambda_p Z_p^{(n)}$. Moreover, let $F$ be the d.f. of $(Z_1, Z_2, \ldots, Z_p)^T$ and $F_\lambda$ the d.f. of $\lambda_1 Z_1 + \lambda_2 Z_2 + \cdots + \lambda_p Z_p$.

If $F_\lambda^{(n)} \to F_\lambda$ for any $\lambda \in R^p$, then $F^{(n)} \to F$.

Page 14: Charles University

$\hat\beta^{(OLS,n)}$ is asymptotically normal:

$\sqrt{n}\,(\hat\beta^{(OLS,n)} - \beta^0) = \left(\tfrac{1}{n}X^TX\right)^{-1}\tfrac{1}{\sqrt{n}}X^T\varepsilon .$

Firstly we verify the conditions of the Feller-Lindeberg theorem for $\lambda^T\tfrac{1}{\sqrt{n}}X^T\varepsilon$ for arbitrary $\lambda \in R^p$, and secondly we apply the Varadarajan theorem. Then we transform the asymptotically normally distributed vector $\tfrac{1}{\sqrt{n}}X^T\varepsilon$ by the matrix $\left(\tfrac{1}{n}X^TX\right)^{-1}$.
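A Monte Carlo sketch of the asymptotic normality (hypothetical data; a single regressor, and uniform disturbances to emphasize that normality of the errors is not needed): $\sqrt{n}(\hat\beta - \beta^0)$ has mean near 0, variance near $\sigma^2/Q$, and roughly normal coverage.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 200, 4000
beta0 = 1.5
x = rng.uniform(1.0, 2.0, size=n)          # fixed design; here Q = (1/n) sum x_i^2
Q = np.mean(x ** 2)

draws = np.empty(reps)
for r in range(reps):
    # non-normal (uniform) disturbances with unit variance
    eps = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=n)
    beta_hat = (x @ (beta0 * x + eps)) / (x @ x)
    draws[r] = np.sqrt(n) * (beta_hat - beta0)

frac = np.mean(np.abs(draws) < 1.96 * np.sqrt(1.0 / Q))   # nominal 95% coverage
assert abs(draws.mean()) < 0.05
assert abs(draws.var() - 1.0 / Q) < 0.05
assert abs(frac - 0.95) < 0.02
```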

Page 15: Charles University

$\hat\beta^{(OLS,n)}$ is the best in the class of unbiased linear estimators.

REMARK

Normal equations:
$\sum_{i=1}^n X_{ij}\left(Y_i - X_i^T\hat\beta\right) = 0, \qquad j = 1, 2, \ldots, p .$

If either $\left(Y_{i_1} - X_{i_1}^T\beta\right)^2$ for some $i_1$ or $X_{i_2 j}^2$ for some $i_2$ is large, it may cause serious problems when solving the normal equations, and the solution can be rather strange. (See the next slides!)

Page 16: Charles University

Outlier

[Figure: the solution given by OLS vs. a "reasonable" model neglecting the outlier.]

Page 17: Charles University

Leverage point

[Figure: the solution given by OLS vs. a "reasonable" model neglecting the leverage point.]

Page 18: Charles University

Conclusion I

The solution given by OLS may be different from what common sense would expect. One reason is that $\hat\beta^{(OLS,n)}$ is the best only among linear estimators.

Conclusion II

Drawing the data from the previous slide on the screen of a PC, common sense proposes to reject the leverage point and then apply OLS. We then obtain a "reasonable" model, but it cannot be written as $\tilde\beta = LY$, where $Y$ is the response for all data. So this estimator is not linear.

The restriction to linear estimators can turn out to be drastic!!
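A numerical sketch of the leverage-point effect (hypothetical data; a single regressor through the origin for brevity): one remote observation dominates the normal equations and drags the OLS slope far from the bulk of the data.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.arange(1.0, 11.0)                   # ten "ordinary" observations
y = 2.0 * x + rng.normal(scale=0.2, size=10)

# one leverage point far from the bulk of the data, with an atypical response
x_all = np.append(x, 100.0)
y_all = np.append(y, 0.0)

slope_clean = (x @ y) / (x @ x)            # OLS slope, clean data
slope_all = (x_all @ y_all) / (x_all @ x_all)

assert abs(slope_clean - 2.0) < 0.1        # close to the true slope
assert abs(slope_all) < 0.3                # the single point drags the fit down
```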

Page 19: Charles University

Conclusion III

The restriction to a linear regression model is not substantial.

And what does the restriction to a linear model represent? Remember, we have considered the model

Time total = -3.62 + 1.27 * Weight - 0.53 * Puls - 0.51 * Strength + 3.90 * Time per ¼-mile.

But it is easy to test whether the model

Time total = -3.62 + 1.27 * Weight + a * Weight^2 - 0.53 * Puls + b * Puls^3 - 0.51 * Strength + c * log(Strength) + 3.90 * Time per ¼-mile

is not a better one.

Weierstrass approximation theorem

The system of all polynomials is dense in the space of continuous functions on a compact space.
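A sketch of Conclusion III (hypothetical data): a model polynomial in $x$ is still linear in the coefficients, so plain OLS applies to the augmented design matrix unchanged.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(-1.0, 1.0, size=200)
y = 1.0 + 2.0 * x - 3.0 * x ** 2 + rng.normal(scale=0.05, size=200)

# design matrix of monomials: nonlinear in x, but linear in the coefficients
X = np.column_stack([np.ones_like(x), x, x ** 2])
coef = np.linalg.lstsq(X, y, rcond=None)[0]

assert np.allclose(coef, [1.0, 2.0, -3.0], atol=0.1)
```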

Page 20: Charles University

What is the mutual relation between linearity of the estimator of the regression coefficients and linearity of the regression model?

The answer is simpler than one would expect: NONE.

Page 21: Charles University

Conclusion IV

We should find the conditions under which OLS is the best estimator among all unbiased estimators (and use OLS only under these conditions).

And why did OLS become so popular?

Firstly, it has a simple geometric interpretation, implying the existence of the solution together with an easy proof of its properties.

Secondly, there is a simple formula for evaluating it, although the evaluation need not be straightforward. Nowadays, however, there are many implementations which are safe against numerical difficulties.

Page 22: Charles University

Maximum Likelihood Estimator (maximálně věrohodný odhad): recalling the definition

Let $\mathcal{L}(\varepsilon_i) = F$ and $f(z)$ be the density of the distribution $F$. Then
$\hat\beta^{(ML,n)} = \arg\max_{\beta \in R^p}\prod_{i=1}^n f\left(Y_i - X_i^T\beta\right) .$

Theorem

Assumptions: Let $\{\varepsilon_i\}_{i=1}^\infty$ be iid. r.v.'s with $\mathcal{L}(\varepsilon_i) = N(0, \sigma^2)$, $\sigma^2 \in (0, \infty)$.

Assertions: Then $\hat\beta^{(OLS,n)} = \hat\beta^{(ML,n)}$ and $\hat\beta^{(OLS,n)}$ attains the Rao-Cramér lower bound, i.e. $\hat\beta^{(OLS,n)}$ is the best unbiased estimator (not only BLUE).

Assumptions: If, on the other hand, $\hat\beta^{(OLS,n)}$ is the best unbiased estimator attaining the Rao-Cramér lower bound of variance,

Assertions: then $\mathcal{L}(\varepsilon_i) = N(0, \sigma^2)$.

Page 23: Charles University

Maximum Likelihood Estimator under the assumption of normality of disturbances

$\hat\beta^{(ML,n)} = \arg\max_{\beta\in R^p}\prod_{i=1}^n\left(2\pi\sigma^2\right)^{-1/2}\exp\left\{-\frac{(Y_i - X_i^T\beta)^2}{2\sigma^2}\right\}$

A monotone transformation doesn't change the location of the extreme:
$= \arg\max_{\beta\in R^p}\left\{-\frac{n}{2}\log(2\pi\sigma^2) - \sum_{i=1}^n\frac{(Y_i - X_i^T\beta)^2}{2\sigma^2}\right\}$

The first term is a constant with respect to $\beta$:
$= \arg\max_{\beta\in R^p}\left\{-\sum_{i=1}^n\frac{(Y_i - X_i^T\beta)^2}{2\sigma^2}\right\}$

The change of sign changes "max" to "min":
$= \arg\min_{\beta\in R^p}\sum_{i=1}^n\left(Y_i - X_i^T\beta\right)^2 = \hat\beta^{(OLS,n)} .$
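The chain of arg-max equalities can be verified numerically (hypothetical data; $\sigma$ fixed at 1): the Gaussian negative log-likelihood evaluated at the OLS estimate lies below its value at any other $\beta$.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100
x = rng.normal(size=n)
y = 0.7 * x + rng.standard_normal(n)
b_ols = (x @ y) / (x @ x)                  # OLS for a single-regressor model

def neg_loglik(b, sigma=1.0):
    """Gaussian negative log-likelihood of the residuals y - b*x."""
    r = y - b * x
    return 0.5 * n * np.log(2.0 * np.pi * sigma ** 2) + (r @ r) / (2.0 * sigma ** 2)

# the likelihood is maximized exactly where the sum of squares is minimized
for b in (b_ols - 0.5, b_ols - 0.1, b_ols + 0.1, b_ols + 0.5):
    assert neg_loglik(b_ols) < neg_loglik(b)
```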

Page 24: Charles University

Recalling the Rao-Cramér lower bound of variance of an unbiased estimator

Denote the joint density of disturbances by $f_n(y, X, \beta)$; for brevity write $f(y, \beta)$ instead of $f_n(y, X, \beta)$.

If $\hat\beta$ is unbiased, then $\int\hat\beta_j f(y, \beta)\,dy = \beta_j$ for every $\beta$, hence for two values $\beta^{(1)}$ and $\beta^{(2)}$
$\int\hat\beta_j\left\{f(y, \beta^{(1)}) - f(y, \beta^{(2)})\right\}dy = \beta_j^{(1)} - \beta_j^{(2)} .$

Let us divide both sides by $\beta_k^{(1)} - \beta_k^{(2)}$:
$\int\hat\beta_j\,\frac{f(y, \beta^{(1)}) - f(y, \beta^{(2)})}{\beta_k^{(1)} - \beta_k^{(2)}}\,dy = \frac{\beta_j^{(1)} - \beta_j^{(2)}}{\beta_k^{(1)} - \beta_k^{(2)}} .$

Page 25: Charles University

So we have
$\int\hat\beta_j\,\frac{f(y, \beta^{(1)}) - f(y, \beta^{(2)})}{\beta_k^{(1)} - \beta_k^{(2)}}\,dy = \frac{\beta_j^{(1)} - \beta_j^{(2)}}{\beta_k^{(1)} - \beta_k^{(2)}} .$

Assume that $\beta_l^{(1)} = \beta_l^{(2)}$ for $l = 1, \ldots, k-1, k+1, \ldots, p$ and $\beta_k^{(1)} \neq \beta_k^{(2)}$. Then let $\beta^{(2)} \to \beta^{(1)}$; the right-hand side equals $\delta_{jk}$ and we obtain
$\int\hat\beta_j\,\frac{\partial\log f(y, \beta)}{\partial\beta_k}\,f(y, \beta)\,dy = \delta_{jk} .$

In matrix form:
$\int\hat\beta\left(\frac{\partial\log f(y, \beta)}{\partial\beta}\right)^T f(y, \beta)\,dy = I .$

$\beta^{(1)}$ was arbitrary, hence we write $\beta$ instead of it. Multiply by $\lambda^T$ from the left-hand side and by $\lambda$ from the right one.

Page 26: Charles University

So we have for any $\lambda \in R^p$
$\int\lambda^T\hat\beta\left(\frac{\partial\log f(y, \beta)}{\partial\beta}\right)^T\lambda\,f(y, \beta)\,dy = \lambda^T\lambda .$

Intermediate considerations:
$\frac{\partial\log f(y, \beta)}{\partial\beta}\,f(y, \beta) = \frac{\partial f(y, \beta)}{\partial\beta}$
and
$\int f(y, \beta)\,dy = 1 \;\Rightarrow\; \int\frac{\partial f(y, \beta)}{\partial\beta}\,dy = \frac{\partial}{\partial\beta}\int f(y, \beta)\,dy = \frac{\partial 1}{\partial\beta} = 0 ,$
hence
$\int\frac{\partial\log f(y, \beta)}{\partial\beta}\,f(y, \beta)\,dy = 0 .$

Page 27: Charles University

Further intermediate considerations. So we have for any $\lambda \in R^p$
$\int\lambda^T\hat\beta\left(\frac{\partial\log f(y, \beta)}{\partial\beta}\right)^T\lambda\,f(y, \beta)\,dy = \lambda^T\lambda .$

But then, since $\int\lambda^T E\hat\beta\left(\frac{\partial\log f(y, \beta)}{\partial\beta}\right)^T\lambda\,f(y, \beta)\,dy = 0$,
$\int\lambda^T\left(\hat\beta - E\hat\beta\right)\left(\frac{\partial\log f(y, \beta)}{\partial\beta}\right)^T\lambda\,f(y, \beta)\,dy = \lambda^T\lambda .$

Finally, write $f(y, \beta)$ as $\sqrt{f(y, \beta)}\cdot\sqrt{f(y, \beta)}$.

Page 28: Charles University

So we have for any $\lambda \in R^p$
$\lambda^T\lambda = \int\left[\lambda^T\left(\hat\beta - E\hat\beta\right)\sqrt{f(y, \beta)}\right]\left[\left(\frac{\partial\log f(y, \beta)}{\partial\beta}\right)^T\lambda\,\sqrt{f(y, \beta)}\right]dy .$

Applying the Cauchy-Schwarz inequality
$\left(\int g(x)h(x)\,dx\right)^2 \leq \int g^2(x)\,dx\cdot\int h^2(x)\,dx ,$
we obtain
$\left(\lambda^T\lambda\right)^2 \leq \int\left(\lambda^T(\hat\beta - E\hat\beta)\right)^2 f(y, \beta)\,dy\cdot\int\left(\left(\frac{\partial\log f(y, \beta)}{\partial\beta}\right)^T\lambda\right)^2 f(y, \beta)\,dy .$

Page 29: Charles University

Notice that both r.v.'s are scalars! So we have for any $\lambda \in R^p$
$\left(\lambda^T\lambda\right)^2 \leq \mathrm{var}\left\{\lambda^T\hat\beta\right\}\cdot\mathrm{var}\left\{\left(\frac{\partial\log f(y, \beta)}{\partial\beta}\right)^T\lambda\right\} ,$
i.e.
$\left(\lambda^T\lambda\right)^2 \leq \lambda^T\mathrm{cov}\left\{\hat\beta\right\}\lambda\cdot\lambda^T\mathrm{cov}\left\{\frac{\partial\log f(y, \beta)}{\partial\beta}\right\}\lambda .$

Page 30: Charles University

Assuming regularity of $\mathrm{cov}\left\{\frac{\partial\log f(y,\beta)}{\partial\beta}\right\}$, select $\lambda = \left[\mathrm{cov}\left\{\frac{\partial\log f(y,\beta)}{\partial\beta}\right\}\right]^{-1}\tau$ with $\tau \in R^p$. Since the inequality holds for any $\lambda \in R^p$, we have
$\left(\tau^T\left[\mathrm{cov}\left\{\tfrac{\partial\log f}{\partial\beta}\right\}\right]^{-1}\tau\right)^2 \leq \tau^T\left[\mathrm{cov}\left\{\tfrac{\partial\log f}{\partial\beta}\right\}\right]^{-1}\mathrm{cov}\{\hat\beta\}\left[\mathrm{cov}\left\{\tfrac{\partial\log f}{\partial\beta}\right\}\right]^{-1}\tau\cdot\tau^T\left[\mathrm{cov}\left\{\tfrac{\partial\log f}{\partial\beta}\right\}\right]^{-1}\tau ,$
i.e.
$\tau^T\left[\mathrm{cov}\left\{\tfrac{\partial\log f}{\partial\beta}\right\}\right]^{-1}\tau \leq \tau^T\left[\mathrm{cov}\left\{\tfrac{\partial\log f}{\partial\beta}\right\}\right]^{-1}\mathrm{cov}\{\hat\beta\}\left[\mathrm{cov}\left\{\tfrac{\partial\log f}{\partial\beta}\right\}\right]^{-1}\tau .$

Page 31: Charles University

Since it holds for any $\tau \in R^p$, we have
$0 \leq \left[\mathrm{cov}\left\{\frac{\partial\log f(y,\beta)}{\partial\beta}\right\}\right]^{-1} \leq \mathrm{cov}\left\{\hat\beta\right\}$
(the inequality is in the sense of positive semidefinite matrices).

The Cauchy-Schwarz inequality has been applied to
$\lambda^T\lambda = \int\lambda^T\left(\hat\beta - E\hat\beta\right)\left(\frac{\partial\log f(y,\beta)}{\partial\beta}\right)^T\lambda\,f(y,\beta)\,dy .$

We would like to reach equality!
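A Monte Carlo sketch (hypothetical data) that under normal disturbances the bound is attained: the empirical covariance of the OLS estimates matches $\sigma^2(X^TX)^{-1}$, the inverse of the Fisher information for $\beta$.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, reps = 60, 2, 3000
X = rng.normal(size=(n, p))
beta0 = np.array([1.0, -1.0])
L = np.linalg.inv(X.T @ X) @ X.T

# Monte Carlo covariance of the OLS estimator under N(0,1) errors
ests = np.array([L @ (X @ beta0 + rng.standard_normal(n)) for _ in range(reps)])
emp_cov = np.cov(ests.T)

# Rao-Cramer bound: inverse Fisher information = sigma^2 (X^T X)^{-1}, sigma^2 = 1
cov_bound = np.linalg.inv(X.T @ X)
assert np.allclose(emp_cov, cov_bound, atol=0.005)
```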

Page 32: Charles University

The equality in the Cauchy-Schwarz inequality is reached iff $\hat\beta(y,X) - E\hat\beta$ is a linear function of $\frac{\partial\log f(y,\beta)}{\partial\beta}$, i.e.
$\hat\beta(y,X) = c(\beta)\frac{\partial\log f_n(y,X,\beta)}{\partial\beta} + b(\beta) ,$
where $c(\beta)$ is a $(p\times p)$ matrix and $b(\beta) \in R^p$.

Remember the joint density of disturbances is
$f_n(y,X,\beta) = \prod_{i=1}^n\left(2\pi\sigma^2\right)^{-1/2}\exp\left\{-\frac{(y_i - X_i^T\beta)^2}{2\sigma^2}\right\} ,$
hence
$\frac{\partial\log f_n(y,X,\beta)}{\partial\beta} = \sigma^{-2}\sum_{i=1}^n\left(y_i - X_i^T\beta\right)X_i .$

Page 33: Charles University

Hence
$\hat\beta(y,X) = c(\beta)\sigma^{-2}\sum_{i=1}^n\left(y_i - X_i^T\beta\right)X_i + b(\beta) = c(\beta)\sigma^{-2}X^Ty - c(\beta)\sigma^{-2}\left(\sum_{i=1}^n X_iX_i^T\right)\beta + b(\beta) .$

So $\hat\beta(y,X) = \Gamma(\beta)X^TY + a(\beta)$, with $\Gamma(\beta) = c(\beta)\sigma^{-2}$ and $a(\beta) = b(\beta) - c(\beta)\sigma^{-2}X^TX\beta$.

$\hat\beta(y,X)$ is to be unbiased, i.e. for any $\beta \in R^p$
$\beta = E\hat\beta(y,X) = \Gamma(\beta)X^T EY + a(\beta) = \Gamma(\beta)X^TX\beta + a(\beta) .$

Since $\hat\beta(y,X)$ cannot depend on $\beta$, we must have $\Gamma = \left(X^TX\right)^{-1}$ with $a = 0$. Finally
$\hat\beta(Y,X) = \left(X^TX\right)^{-1}X^TY = \hat\beta^{(OLS,n)} .$

Page 34: Charles University

The proof of the opposite direction.

If $\hat\beta^{(OLS,n)}$ attains the Rao-Cramér lower bound, then the equality in the Cauchy-Schwarz inequality is reached and hence (writing $\hat\beta^{(OLS,n)}$ instead of $\hat\beta(y,X)$)
$\hat\beta^{(OLS,n)} = c(\beta)\frac{\partial\log f_n(y,X,\beta)}{\partial\beta} + b(\beta) ,$
i.e.
$\frac{\partial\log f_n(y,X,\beta)}{\partial\beta} = c^{-1}(\beta)\left(\hat\beta^{(OLS,n)} - b(\beta)\right) .$
After integration over $\beta$ (notice that $\psi(\beta) \in R^p$)
$\log f_n(y,X,\beta) = \psi(\beta)^T\hat\beta^{(OLS,n)} + \varphi(\beta) + U(y) = \psi(\beta)^T\left(X^TX\right)^{-1}X^Ty + \varphi(\beta) + U(y) .$

Page 35: Charles University

This we only rewrote from the previous slide:
$\log f_n(y,X,\beta) = \psi(\beta)^T\left(X^TX\right)^{-1}X^Ty + \varphi(\beta) + U(y) .$

Since $\psi(\beta) \in R^p$ and $X^TX$ is a regular matrix, there is a vector $\theta(\beta)$ so that $\psi(\beta)^T\left(X^TX\right)^{-1} = \sigma^{-2}\theta^T(\beta)$. Hence
$\log f_n(y,X,\beta) = \sigma^{-2}\theta^T(\beta)X^Ty + \varphi(\beta) + U(y)$
and
$f_n(y,X,\beta) = \exp\left\{\sigma^{-2}\theta^T(\beta)X^Ty\right\}\cdot\exp\left\{\tilde\varphi(\beta)\right\}\cdot\tilde U(y) .$

This has to hold for any $\beta \in R^p$ and any $X$ of type $(n \times p)$, together with
$\int f_n(y,X,\beta)\,dy = 1 \quad\text{and}\quad \int\left(X^TX\right)^{-1}X^Ty\,f_n(y,X,\beta)\,dy = \beta .$

Page 36: Charles University

Imposing the marginal conditions, we obtain finally
$f_n(y,X,\beta) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^n\exp\left\{-\frac{1}{2\sigma^2}\left(y - X\beta\right)^T\left(y - X\beta\right)\right\} .$

Page 37: Charles University

What is to be learnt from this lecture for the exam?

Linearity of the estimator and of the model: what advantages and restrictions do they represent?

What does "The estimator is the best in the class of …" mean?

OLS is the best unbiased estimator: the condition(s) for it.

All you need is on http://samba.fsv.cuni.cz/~visek/Econometrics_Up_To_2010