ECON2206 / ECON3290 Introductory Econometrics
Lecture 4: The Multiple Regression Model - Inference I

Lecturer: Garry Barrett (UNSW), Session 1, 2009


Outline

The Multiple Regression Model - Inference I

1. Sampling Distribution of OLS Estimators
2. Testing hypotheses about one population parameter (the t test)
3. Confidence Intervals
4. Summary

Inference II (Next Week)

5. Testing a single linear combination of parameters
6. Testing multiple linear restrictions (the F test)
7. SHAZAM
8. Summary


Distribution of OLS Estimators

Assumption 6. Normality 

The population error $u$ is independent of the explanatory variables $x_1, \dots, x_k$ and is normally distributed with mean 0 and variance $\sigma^2$:

$$u \sim \mathrm{Normal}(0, \sigma^2) \tag{1}$$

This assumption implies Assumptions 4 (zero conditional mean, ZCM) and 5 (homoskedasticity)

Together, Assumptions 1-6 are known as the classical linear model (CLM) assumptions

⇒ under these assumptions the OLS estimators are the minimum variance unbiased estimators (a stronger result than Gauss-Markov, since we are not limiting ourselves to estimators that are linear in the $y_i$)


Distribution of OLS Estimators

A shorthand way to state the population assumptions of the CLM is:

$$y \mid x \sim \mathrm{Normal}(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k, \ \sigma^2) \tag{2}$$

Whether normality of $u$ is a good or bad assumption depends on the application at hand.

E.g. is it reasonable to assume wages (conditional on educ, etc.) are normally distributed when no wage is below 0 and there are minimum wage laws? How about log(wages)?

Empirical evidence suggests that the latter is reasonable


[Figure: Assumptions on the error term]

Normal Sampling Distribution

Normality of the error term translates into normal distributions for the OLS estimators

Theorem 7. Normal Sampling Distributions

Under Assumptions 1-6 (CLM):

$$\hat{\beta}_j \sim \mathrm{Normal}(\beta_j, \mathrm{Var}(\hat{\beta}_j)) \tag{3}$$

where $\mathrm{Var}(\hat{\beta}_j)$ was given last week


Normal Sampling Distribution

In turn, this implies:

$$\frac{\hat{\beta}_j - \beta_j}{\mathrm{sd}(\hat{\beta}_j)} \sim \mathrm{Normal}(0, 1) \tag{4}$$

Any linear combination of the $\hat{\beta}_j$ is also normally distributed; this will be useful later when we want to test hypotheses such as $\beta_1 = \beta_2$

We'll also see later on that normality of the OLS estimators is approximately true in large samples even without normality of the error terms!


Testing Hypotheses about a Single Parameter

2. Testing Hypotheses about a Single $\beta$: The t Test

To construct tests regarding the value of $\beta_j$ we use the following result:

Theorem 8. t Distribution for the Standardised Estimators

Under Assumptions 1-6

$$\frac{\hat{\beta}_j - \beta_j}{\mathrm{se}(\hat{\beta}_j)} \sim t_{n-k-1} \tag{5}$$

Note that (5) is different to (4):

⇒ we have replaced the unknown $\sigma^2$ in (4) with its estimate $\hat{\sigma}^2$, so that $\mathrm{sd}(\hat{\beta}_j)$ becomes $\mathrm{se}(\hat{\beta}_j)$

⇒ consequently the expression has a t-distribution with $n-k-1$ degrees of freedom

The t-distribution is very similar to a normal distribution but has 'fatter' tails


Testing Hypotheses about a Single Parameter

The result in (5) will be used to test hypotheses about $\beta_j$.

We are often interested in testing the null hypothesis:

$$H_0: \beta_j = 0 \tag{6}$$

⇒ in words, this means we are interested in testing whether $x_j$ has no (partial) effect on the expected value of $y$, after controlling for all the other explanatory variables.


Example

Example: Wages

Consider the model:

$$\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 tenure + u \tag{7}$$

The null hypothesis $H_0: \beta_2 = 0$ means that once education and tenure have been accounted for, the number of years of labour market experience has no effect on the wage. If $\beta_2 > 0$ then prior work experience contributes to higher wages


The t-test

The statistic we use to test $H_0: \beta_j = 0$ against any alternative is called the "t-statistic" (or the "t-ratio"), which is given by:

$$t_{\hat{\beta}_j} \equiv \frac{\hat{\beta}_j}{\mathrm{se}(\hat{\beta}_j)} \tag{8}$$

⇒ this is simple to calculate once we have the parameter estimate and the standard error, which are reported by statistical software (the software usually reports the t-statistics as well)


The t-test

Even if $\beta_j = 0$ is true, it is rare that the 'point estimate' $\hat{\beta}_j$ equals exactly 0 in a sample of data

⇒ how far away from 0 must $\hat{\beta}_j$ be before we reject the hypothesis?

⇒ the standard error of $\hat{\beta}_j$ is an estimate of the standard deviation of $\hat{\beta}_j$, so the t-statistic measures how many estimated standard deviations $\hat{\beta}_j$ is away from 0

The precise rejection rule depends on the alternative hypothesis and the chosen significance level of the test

⇒ the significance level is the probability of rejecting the null hypothesis when it is true


The t-test

Testing against One-Sided Alternatives

Consider:

$$H_0: \beta_j = 0 \tag{9}$$
$$H_1: \beta_j > 0$$

The significance level is the probability of rejecting the null hypothesis when it is true

⇒ suppose we decide to use a 5% significance level ⇒ we are willing to mistakenly reject $H_0$ when it is true 5% of the time

⇒ given our alternative, we are looking for a "sufficiently large" positive value of $t_{\hat{\beta}_j}$ in order to reject $H_0$ in favour of $H_1$

⇒ "sufficiently large" for a 5% significance level is the 95th percentile of a t distribution with $n-k-1$ degrees of freedom (call this $c$ for "critical value")


The t-test

The rule is to reject $H_0$ in favour of $H_1$ at the 5% level of significance if

$$t_{\hat{\beta}_j} > c \tag{10}$$

⇒ this rejection rule is an example of a one-tailed test (we never reject $H_0$ in favour of $H_1$ if $t_{\hat{\beta}_j}$ is negative)


[Figure: Rejection Region]

The t-test

To determine $c$ we only need to know the level of significance and the degrees of freedom (df)

Examples: From Table G.2 (one-sided tests)

If the significance level is 10% and df = 21 then c = 1.323
If the significance level is 1% and df = 21 then c = 2.518

When the df of the t distribution is large, the t distribution approaches the standard normal distribution, and for practical purposes when df ≥ 120 we can use critical values from the standard normal distribution
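As a rough illustration of the large-df rule, the sketch below computes critical values from the standard normal distribution using only Python's standard library. It assumes df is large (about 120 or more); for small df such as 21, the t table values quoted above (1.323 and 2.518) are noticeably larger. The function name `normal_critical_value` is ours and is not part of the lecture.

```python
from statistics import NormalDist

def normal_critical_value(alpha, two_sided=False):
    """Critical value c from the standard normal distribution.

    Assumes df is large (roughly 120 or more), so that the t
    distribution is close to the standard normal.
    """
    # For a two-sided test the significance level is split across both tails.
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

for alpha in (0.10, 0.05, 0.01):
    print(f"{alpha:.0%} one-sided: c = {normal_critical_value(alpha):.3f}")
```

The one-sided values come out at roughly 1.282, 1.645 and 2.326, which match the bottom (infinite df) row of a standard t table.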


[Table: Table of t-values]

The t-test

Consider the test with the one-sided alternative:

$$H_0: \beta_j = 0$$
$$H_1: \beta_j < 0$$

⇒ this rejection rule is the mirror image of the previous case

⇒ given symmetry of the t-distribution, reject $H_0$ in favour of $H_1$ if

$$t_{\hat{\beta}_j} < -c \tag{11}$$

where $c$ is the critical value for the alternative $H_1: \beta_j > 0$


Example

Example:

If the significance level is 5% and df = 18, then c = 1.734, and hence we reject $H_0: \beta_j = 0$ in favour of $H_1: \beta_j < 0$ if $t_{\hat{\beta}_j} < -1.734$


The t-test

Two-Sided Alternatives

It is common to test the null hypothesis (6) against a 2-sided alternative:

$$H_0: \beta_j = 0 \tag{12}$$
$$H_1: \beta_j \neq 0$$

⇒ when the alternative is 2-sided, we are interested in the absolute value of the t statistic

⇒ the rejection rule is: reject $H_0$ when

$$|t_{\hat{\beta}_j}| > c \tag{13}$$


The t-test

For a significance level of 5%, $c$ is chosen so that the area in each tail of the t distribution equals 2.5% (i.e. $c$ is the 97.5th percentile of the t distribution with $n-k-1$ df)

Terminology

⇒ be explicit about the alternative hypothesis and the level of significance

E.g. if we reject $H_0$ in favour of the alternative in (12) at the 5% level, we say "$x_j$ is statistically significant, or statistically different from 0, at the 5% level"
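The three rejection rules (10), (11) and (13) can be collected into one small helper. This is only a sketch: the function name `reject_null` and the labels for the alternatives are our own, and the t-statistic used in the usage lines is a made-up value for illustration.

```python
def reject_null(t_stat, c, alternative):
    """Apply the rejection rules (10), (11) and (13).

    c is the positive critical value for the chosen significance level
    (for a two-sided test, the percentile that leaves half the level
    in each tail). alternative is "greater" (H1: beta_j > 0),
    "less" (H1: beta_j < 0) or "two-sided" (H1: beta_j != 0).
    """
    if alternative == "greater":
        return t_stat > c
    if alternative == "less":
        return t_stat < -c
    if alternative == "two-sided":
        return abs(t_stat) > c
    raise ValueError(f"unknown alternative: {alternative!r}")

# c = 1.734 is the 5% one-sided critical value for df = 18 (from the lecture);
# t = -2.0 is a hypothetical t-statistic.
print(reject_null(-2.0, 1.734, "less"))
print(reject_null(-2.0, 1.734, "greater"))
```

With a negative t-statistic, only the left-tailed alternative can lead to rejection, which is the point of the remark that we never reject in favour of $H_1: \beta_j > 0$ when the statistic is negative.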


[Figure: Rejection Region]

The t-test

Testing Other Hypotheses about $\beta_j$

Sometimes we are interested in testing whether $\beta_j$ is equal to a specific value other than 0. For instance, $H_0: \beta_j = 1$ or perhaps $H_0: \beta_j = -1$

For the null hypothesis

$$H_0: \beta_j = a \tag{14}$$

the t-statistic is given by

$$t = \frac{\text{estimate} - \text{hypothesised value}}{\text{standard error}} = \frac{\hat{\beta}_j - a}{\mathrm{se}(\hat{\beta}_j)} \tag{15}$$

and we follow exactly the same procedure as above


Example

Examples:

1. Wage Equation

The following model was estimated:

$$\widehat{\log(wage)} = \underset{(0.104)}{0.284} + \underset{(0.007)}{0.092}\,educ + \underset{(0.0017)}{0.0041}\,exper + \underset{(0.003)}{0.022}\,tenure$$

$$n = 526, \quad R^2 = 0.316$$

where the standard errors are in parentheses


Example

(i) Test whether the return to experience, after controlling for education and tenure, is 0 in the population against the alternative that it is positive.

Answer: Testing:

$$H_0: \beta_{exper} = 0$$
$$H_1: \beta_{exper} > 0$$

Test statistic is:

$$t_{\hat{\beta}_{exper}} = \frac{0.0041}{0.0017} = 2.41$$

The critical value at the 1% level of significance is 2.326 and since $t_{\hat{\beta}_{exper}}$ exceeds this value, we reject the null in favour of the alternative hypothesis (that the return to experience is positive).
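The arithmetic in this answer can be checked with a few lines of Python. The sketch assumes the normal approximation to the t distribution, which seems reasonable here because df = 526 - 3 - 1 = 522; the variable names are ours.

```python
from statistics import NormalDist

# Reported estimate and standard error for exper (from the fitted equation)
beta_hat, se = 0.0041, 0.0017

t_exper = beta_hat / se            # t-statistic for H0: beta_exper = 0
c = NormalDist().inv_cdf(0.99)     # 1% one-sided critical value (normal approx.)

print(f"t = {t_exper:.2f}, c = {c:.3f}, reject H0: {t_exper > c}")
```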


Example

(ii) Test whether the return to education, after controlling for experience and tenure, is 0 in the population against the alternative that it is not equal to 0.

Answer: Testing:

$$H_0: \beta_{educ} = 0$$
$$H_1: \beta_{educ} \neq 0$$

Test statistic is:

$$|t| = \frac{0.092}{0.007} = 13.14$$

The critical value at the 1% level of significance is 2.576 and since $|t|$ exceeds this value, we reject the null in favour of the alternative hypothesis (that the return to education is not equal to 0).


Example

Example 2. Housing Prices and Air Pollution.

Using a sample of 506 communities in a large US city, a model was estimated which related the median house price in the community (price) to various community characteristics: nox is the amount of nitrous oxide in the air (in parts per million), dist is the average distance of the community from 5 employment centres, rooms is the average number of rooms in houses in the community, and stratio is the average student-teacher ratio in the community. The population model is:

$$\log(price) = \beta_0 + \beta_1 \log(nox) + \beta_2 \log(dist) + \beta_3 rooms + \beta_4 stratio + u$$

Hence $\beta_1$ is the elasticity of price with respect to nox. We wish to test

$$H_0: \beta_1 = -1$$
$$H_1: \beta_1 \neq -1$$

The t-statistic for doing this test is

$$t = \frac{\hat{\beta}_1 + 1}{\mathrm{se}(\hat{\beta}_1)}$$


Example

Using the data, the estimated model is:

$$\widehat{\log(price)} = \underset{(0.32)}{11.08} - \underset{(0.117)}{0.954}\,\log(nox) - \underset{(0.043)}{0.134}\,\log(dist) + \underset{(0.019)}{0.255}\,rooms - \underset{(0.006)}{0.052}\,stratio$$

$$n = 506, \quad R^2 = 0.581$$

Each slope coefficient is significantly different from 0 even at very small significance levels

⇒ however we want to test $H_0: \beta_1 = -1$ against $H_1: \beta_1 \neq -1$

⇒ for this test, the t-statistic is $(-0.954 + 1)/0.117 = 0.393$

⇒ this test statistic is very small and the null is not rejected: after controlling for the factors included in the model, there is little evidence that the elasticity is different from $-1$.
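The same calculation, written as a small function that follows formula (15), might look as follows. The function name `t_statistic` is ours, and the choice of a 5% two-sided test with c = 1.96 (normal approximation, df = 506 - 4 - 1 = 501) is an assumption made for illustration, since the slide does not state a significance level.

```python
def t_statistic(estimate, std_error, hypothesised=0.0):
    """(estimate - hypothesised value) / standard error, as in (15)."""
    return (estimate - hypothesised) / std_error

# log(nox) coefficient and its standard error from the estimated equation
t_nox = t_statistic(-0.954, 0.117, hypothesised=-1.0)
c = 1.96  # assumed: 5% two-sided critical value, normal approximation

print(f"t = {t_nox:.3f}, reject H0: {abs(t_nox) > c}")
```

Passing `hypothesised=0.0` instead gives the usual t-ratio reported by software, which is far from 0 for this coefficient.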


P-values

Computing p-values for t-tests

There is a degree of arbitrariness in choosing the significance level

⇒ different researchers prefer different significance levels (e.g. 10%, 5%, 1%), and there is no "correct" significance level (though researchers tend to adopt higher significance levels when they use smaller samples of data)

⇒ rather than testing at different significance levels, it is more informative to calculate the p-value for a test
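The p-value is the smallest significance level at which the null hypothesis would be rejected, given the observed t-statistic. The sketch below approximates it with the standard normal distribution, which is an assumption that only suits large df; exact p-values would use the t distribution with $n-k-1$ df, as statistical software does. The function name `p_value` is ours.

```python
from statistics import NormalDist

def p_value(t_stat, alternative="two-sided"):
    """Approximate p-value for a t test, using the standard normal
    in place of the t distribution (an assumption that needs large df)."""
    z = NormalDist()
    if alternative == "two-sided":
        return 2 * (1 - z.cdf(abs(t_stat)))   # area in both tails
    if alternative == "greater":
        return 1 - z.cdf(t_stat)              # area in the right tail
    if alternative == "less":
        return z.cdf(t_stat)                  # area in the left tail
    raise ValueError(f"unknown alternative: {alternative!r}")

# t = 2.41 is the exper t-statistic from the wage example (df = 522)
print(f"one-sided p-value: {p_value(2.41, 'greater'):.4f}")
print(f"two-sided p-value: {p_value(2.41):.4f}")
```

A p-value below the chosen significance level means the null is rejected at that level, so one number summarises the outcome of the test at every level.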


Economic vs Statistical Significance

We have been focussing on statistical significance

⇒ the statistical significance of a variable $x_j$ is determined by the size of the t-statistic $t_{\hat{\beta}_j}$

⇒ however, the economic (or practical) significance also depends on the size of $\hat{\beta}_j$

⇒ too much focus on statistical significance can lead to the false conclusion that a variable is important for explaining $y$ even though its estimated effect is small


3. Confidence Intervals

It is straightforward to construct confidence intervals (CI) for $\beta_j$

⇒ confidence intervals are also known as "interval estimates"

⇒ under the CLM assumptions

$$\frac{\hat{\beta}_j - \beta_j}{\mathrm{se}(\hat{\beta}_j)} \sim t_{n-k-1} \tag{16}$$

manipulating this expression gives a CI for the unknown $\beta_j$:

⇒ the 95% CI is given by

$$\hat{\beta}_j \pm c \cdot \mathrm{se}(\hat{\beta}_j) \tag{17}$$

where $c$ is the critical value for the t-distribution with df $= n-k-1$ (i.e. $c$ is the 97.5th percentile of a $t_{n-k-1}$ distribution)


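More precisely, the lower and upper bounds are given by:

$$\underline{\beta}_j \equiv \hat{\beta}_j - c \cdot \mathrm{se}(\hat{\beta}_j) \tag{18}$$
$$\overline{\beta}_j \equiv \hat{\beta}_j + c \cdot \mathrm{se}(\hat{\beta}_j)$$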


Interpretation: if random samples were obtained over and over again, with the lower bound $\underline{\beta}_j$ and the upper bound $\overline{\beta}_j$ computed each time, the population value $\beta_j$ would lie in the interval $(\underline{\beta}_j, \overline{\beta}_j)$ for 95% of the samples

⇒ constructing a CI is very simple: we just need $\hat{\beta}_j$, $\mathrm{se}(\hat{\beta}_j)$ and $c$ (for $c$ we need to know df and the confidence level)

⇒ when df > 120, the $t_{n-k-1}$ distribution is close to the normal distribution, and so you can use the standard normal distribution (e.g. for a 95% CI, $c = 1.96$; as a rule of thumb, for a 95% CI take $\hat{\beta}_j \pm 2\,\mathrm{se}(\hat{\beta}_j)$)
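A minimal sketch of the interval calculation, assuming df is large enough for the standard normal critical value to stand in for the t critical value. The function name `confidence_interval` is ours; the numbers are the exper estimate and standard error from the wage equation (df = 522).

```python
from statistics import NormalDist

def confidence_interval(estimate, std_error, level=0.95):
    """CI of the form estimate +/- c * se, with c taken from the standard
    normal (an approximation to the t critical value for df > 120)."""
    c = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - c * std_error, estimate + c * std_error

# exper coefficient from the wage equation: 0.0041 with se 0.0017
lower, upper = confidence_interval(0.0041, 0.0017)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

Changing `level` to 0.90 or 0.99 shows how the width of the interval moves with the confidence level.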


Note:

- lower confidence levels (e.g. 90%) lead to narrower CIs
- higher confidence levels (e.g. 99%) lead to wider CIs

⇒ Once a CI is constructed it is easy to carry out 2-tailed hypothesis tests: if the null is $H_0: \beta_j = a$, then $H_0$ is rejected against $H_1: \beta_j \neq a$ at, say, the 5% significance level if (and only if) $a$ is not in the 95% CI.
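The link between the CI and the two-tailed test can be illustrated with the log(nox) elasticity from the housing example. This sketch assumes c = 1.96 (normal approximation, df = 501) and uses variable names of our own choosing.

```python
# 95% CI for the log(nox) elasticity in the housing example,
# using c = 1.96 (normal approximation, df = 501)
beta_hat, se, c = -0.954, 0.117, 1.96
lower, upper = beta_hat - c * se, beta_hat + c * se

a = -1.0                                # hypothesised value under H0
in_interval = lower <= a <= upper       # CI-based decision: do not reject if True
t_stat = (beta_hat - a) / se            # t-statistic for H0: beta_1 = a

# The two decisions agree: a lies inside the CI exactly when |t| <= c
print(f"CI = ({lower:.3f}, {upper:.3f}), a in CI: {in_interval}, "
      f"|t| <= c: {abs(t_stat) <= c}")
```

Here $-1$ lies inside the interval, so $H_0: \beta_1 = -1$ is not rejected at the 5% level, in line with the small t-statistic of 0.393 found earlier.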


Looking Ahead


Next Week:

Continue with inference and consider:

- Testing a single linear combination of parameters
- Testing multiple linear restrictions (the F test)

Running SHAZAM
