Further Regression Topics


    EC114 Introduction to Quantitative Economics
    19. Further Regression Topics I

    Marcus Chambers

    Department of Economics
    University of Essex

    13/15 March 2012


    Outline

    1 Dummy Variables

    2 Chow Tests

    Reference: R. L. Thomas, Using Statistics in Economics, McGraw-Hill, 2005, sections 14.1 and 14.2.

    Dummy Variables

    Sometimes the variables we want to use can't be measured in a precise quantitative way. We therefore have to introduce qualitative factors into the analysis.

    For example, suppose we want to study the demand for beef over time.

    We might consider the regression model

    \[Q_t = \alpha + \beta Y_t + \varepsilon_t, \quad t = 1, \ldots, n,\]

    where \(Q\) is per-capita demand for beef, \(Y\) is per-capita disposable income, and the \(t\) subscript denotes the time period which indexes observations.


    Suppose that we suspect the nature of demand for beef to change in certain periods because of a scare about mad cow disease (BSE) and the human illness linked to it, Creutzfeldt-Jakob disease (CJD).

    How can we deal with such a qualitative factor?

    The approach we will follow introduces a dichotomous or dummy variable such that, in period \(t\),

    \(D_t = 1\) when consumers fear getting CJD,
    \(D_t = 0\) when consumers do not fear getting CJD.

    We shall see how such variables can be used in a regression model to allow the parameters of the model to change in certain periods.

    We shall also consider tests of whether the parameters are constant over the entire sample or whether they change in the way described above.


    Suppose we have monthly data on the demand for beef for 5 years (\(n = 60\)).

    Furthermore suppose that we know that the CJD fear is particularly relevant for the 12 months of the second year and for the first 6 months of the third year.

    We can then define a dummy variable for the relevant period as:

    \(D_t = 1\) for \(t = 13, \ldots, 30\),
    \(D_t = 0\) for all other \(t\).

    This variable enters the data set as a series of zeros and ones and is treated like any other variable.

    Note that we can only construct \(D_t\) if we have information on the period affected by the CJD scare.
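The construction of the dummy series can be sketched in a few lines. This is only an illustration in Python rather than the Stata used in the course; the variable names `n`, `scare_start`, `scare_end` and `d` are hypothetical choices, while the window 13 to 30 comes from the definition above.

```python
# Build the CJD dummy for n = 60 monthly observations (t = 1, ..., 60).
n = 60
scare_start, scare_end = 13, 30  # months 13-30, as in the definition above

# D_t = 1 inside the scare window and 0 otherwise.
d = [1 if scare_start <= t <= scare_end else 0 for t in range(1, n + 1)]

# Number of months flagged as scare months (12 in year 2 plus 6 in year 3).
print(sum(d))

# Values around the edges of the window: D_12, D_13, D_30, D_31.
print(d[11], d[12], d[29], d[30])
```

The list `d` would then be stored alongside the other series and passed to the regression like any other explanatory variable.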


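    We can allow for the CJD scare to affect the intercept, \(\alpha\), simply by including \(D_t\) as an additional explanatory variable:

    \[Q_t = \alpha + \delta D_t + \beta Y_t + \varepsilon_t, \quad t = 1, \ldots, n.\]

    When \(D_t = 0\) (no CJD scare) the model reduces to

    \[Q_t = \alpha + \beta Y_t + \varepsilon_t.\]

    But during the CJD scare, \(D_t = 1\) and so

    \[Q_t = (\alpha + \delta) + \beta Y_t + \varepsilon_t,\]

    the intercept becoming \(\alpha + \delta\) (and we would probably expect \(\delta < 0\) so that demand falls for given \(Y\) during the CJD scare).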


    Suppose we run this regression and obtain

    \[\hat{Q}_t = 25 - 12 D_t + 0.02 Y_t.\]

    This implies that during the CJD scare (setting \(D_t = 1\))

    \[\hat{Q}_t = 13 + 0.02 Y_t,\]

    while when there is no CJD scare (setting \(D_t = 0\))

    \[\hat{Q}_t = 25 + 0.02 Y_t.\]

    This effect is illustrated in the following diagram:


    The presence of the dummy variable enables the estimated demand equation to shift downwards during the CJD scare.

    Note, however, that the slope remains unaffected.


    The shift in the demand equation, allowing different demand equations in the different periods, was obtained by estimating a single equation.

    It means that we can test whether the shift is significant by carrying out a t-test for the significance of the dummy variable \(D_t\).

    So, to test \(H_0: \delta = 0\) against \(H_A: \delta \neq 0\) we could use

    \[TS = \frac{\hat{\delta}}{s_{\hat{\delta}}} \sim t_{n-3} \text{ under } H_0,\]

    where \(\hat{\delta}\) denotes the estimated coefficient on \(D_t\) and \(s_{\hat{\delta}}\) is its standard error.


    It is nevertheless possible that the effect of the CJD scare is on the marginal propensity to consume beef out of income, i.e. on the parameter \(\beta\).

    This situation might appear to be a bit more complicated but it is also straightforward to handle.

    Instead of including the variable \(D_t\) by itself in the regression, we now include the product of \(D_t\) with \(Y_t\) in the regression, i.e. we include the variable \(Y_t D_t\).

    It is easy to construct a new variable in regression software that is the product of two variables.

    For example, in Stata, if d and y are the two variables, we can use the command:

    gen yd=y*d

    to generate the product.


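    The starting point is now given by the equation

    \[Q_t = \alpha + \beta Y_t + \delta Y_t D_t + \varepsilon_t.\]

    When \(D_t = 0\) it follows that \(Y_t D_t = 0\) and we have the original equation

    \[Q_t = \alpha + \beta Y_t + \varepsilon_t,\]

    while when \(D_t = 1\) it follows that \(Y_t D_t = Y_t\) and we obtain

    \[Q_t = \alpha + (\beta + \delta) Y_t + \varepsilon_t.\]

    If \(\delta < 0\) and \(\beta + \delta > 0\) the intercept remains unchanged but the slope falls while remaining positive.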


    Suppose we run this regression and obtain

    \[\hat{Q}_t = 25 + 0.02 Y_t - 0.01 Y_t D_t.\]

    This implies that during the CJD scare (setting \(D_t = 1\))

    \[\hat{Q}_t = 25 + 0.01 Y_t,\]

    while when there is no CJD scare (setting \(D_t = 0\))

    \[\hat{Q}_t = 25 + 0.02 Y_t.\]

    This effect is illustrated in the following diagram:


    The presence of the product dummy \(Y_t D_t\) results in the slope of the estimated equation changing.

    Note that the intercept remains unchanged.


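    We can, of course, allow both the intercept and the slope to change at the same time.

    This requires adding both \(D_t\) and \(Y_t D_t\) to the regression:

    \[Q_t = \alpha + \delta_1 D_t + \beta Y_t + \delta_2 Y_t D_t + \varepsilon_t.\]

    When \(D_t = 0\) it follows that \(Y_t D_t = 0\) and we have the original equation

    \[Q_t = \alpha + \beta Y_t + \varepsilon_t,\]

    while when \(D_t = 1\) it follows that \(Y_t D_t = Y_t\) and we obtain

    \[Q_t = (\alpha + \delta_1) + (\beta + \delta_2) Y_t + \varepsilon_t.\]

    We can generalise this to any multiple linear regression.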
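To make the two regimes concrete, the implied intercept and slope can be computed for each value of the dummy. The sketch below is illustrative only: the function name `regime_line` and its argument names are hypothetical, and the numbers are the fitted coefficients from the two beef-demand examples earlier in these slides.

```python
def regime_line(alpha, beta, delta1, delta2, d):
    """Return (intercept, slope) of Q on Y implied by the dummy value d (0 or 1)."""
    intercept = alpha + delta1 * d  # intercept shifts by delta1 when d = 1
    slope = beta + delta2 * d       # slope shifts by delta2 when d = 1
    return intercept, slope

# Intercept-shift example: Q = 25 - 12 D + 0.02 Y (no slope dummy, so delta2 = 0).
print(regime_line(25.0, 0.02, -12.0, 0.0, d=1))

# Slope-shift example: Q = 25 + 0.02 Y - 0.01 Y D (no intercept dummy, so delta1 = 0).
print(regime_line(25.0, 0.02, 0.0, -0.01, d=1))

# With d = 0 both examples collapse to the original equation.
print(regime_line(25.0, 0.02, -12.0, -0.01, d=0))
```

Setting one of the two shift coefficients to zero recovers the intercept-only and slope-only models as special cases of the general specification.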


    Example. Consider the logarithmic money demand model

    \[\ln(M) = \beta_1 + \beta_2 \ln(G) + \varepsilon,\]

    where \(M\) denotes the money stock and \(G\) denotes GDP.

    Data set 9.1 in Thomas contains observations for 30 countries in 1985 and we shall split the sample into those countries with GDP < $4,000 per head and those with GDP > $4,000 per head.

    We can define an appropriate dummy variable:

    D = 1 if GDP < $4,000 per head,
    D = 0 if GDP > $4,000 per head.

    We shall run regressions to see if the intercept and/or slope are different for countries in these two ranges of GDP per head.

    Including the dummy variable yields:


    . regress lm lg d

          Source |       SS       df       MS              Number of obs =      30
    -------------+------------------------------           F(  2,    27) =  119.10
           Model |  57.8613015     2  28.9306507           Prob > F      =  0.0000
        Residual |   6.5585244    27  .242908311           R-squared     =  0.8982
    -------------+------------------------------           Adj R-squared =  0.8906
           Total |  64.4198259    29  2.22137331           Root MSE      =  .49286

    ------------------------------------------------------------------------------
              lm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              lg |   .9030628    .132545     6.81   0.000     .6311029    1.175023
               d |  -.4534076   .3644604    -1.24   0.224    -1.201219    .2944033
           _cons |  -1.519285   .3323406    -4.57   0.000    -2.201191   -.8373782
    ------------------------------------------------------------------------------

    The dummy variable is statistically insignificant and so there does not appear to be any difference in the intercept between these two groups.
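The significance check can be reproduced by hand from the reported output. The following is a minimal sketch, assuming the coefficient and standard error of d are copied from the table above and that the two-sided 5% critical value for 27 degrees of freedom (2.052) is read from standard t tables rather than computed; the variable names are hypothetical.

```python
# Coefficient and standard error of d, copied from the regression output above.
coef_d = -0.4534076
se_d = 0.3644604

# Degrees of freedom: 30 observations minus 3 estimated coefficients.
df = 30 - 3

# Two-sided 5% critical value of the t distribution with 27 degrees of freedom,
# taken from standard statistical tables (an assumption of this sketch).
t_crit = 2.052

t_stat = coef_d / se_d
reject_h0 = abs(t_stat) > t_crit  # reject H0: delta = 0 only if |t| exceeds the critical value

print(round(t_stat, 2), reject_h0)
```

The ratio matches the t column of the output, and because its absolute value is below the critical value the null of no intercept difference is not rejected, in line with the reported p-value of 0.224.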

    Including the variable \(D \times \ln(G)\) yields:


    . regress lm lg dlg

          Source |       SS       df       MS              Number of obs =      30
    -------------+------------------------------           F(  2,    27) =  116.31
           Model |  57.7200283     2  28.8600142           Prob > F      =  0.0000
        Residual |  6.69979758    27  .248140651           R-squared     =  0.8960
    -------------+------------------------------           Adj R-squared =  0.8883
           Total |  64.4198259    29  2.22137331           Root MSE      =  .49814

    ------------------------------------------------------------------------------
              lm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              lg |   1.089832   .0828707    13.15   0.000     .9197953    1.259869
             dlg |  -.1650309   .1697023    -0.97   0.339    -.5132313    .1831695
           _cons |  -1.958399   .1146874   -17.08   0.000    -2.193718    -1.72308
    ------------------------------------------------------------------------------

    The product variable is also statistically insignificant, suggesting no difference in the slope parameter (income elasticity of money demand) between the two groups of countries.


    Dummy variables are widely used in econometrics.

    In cross-sections they can be used to represent things such as:

      gender: D = 1 if female, D = 0 if male;
      employment status: D = 1 if employed, D = 0 if unemployed;
      marital status: D = 1 if married, D = 0 if unmarried.

    In time series dummies can be used to represent things such as:

      season: \(D_j = 1\) if quarter \(j\), \(D_j = 0\) otherwise (\(j = 1, \ldots, 4\));
      particular event: D = 1 during wartime, D = 0 otherwise.

    Chow Tests

    It is also possible to test whether coefficients in a regression remain constant over two pre-specified sub-samples using the Chow test for parameter stability.

    Suppose we split the sample of \(n\) observations into two sub-samples, the first containing \(n_1\) observations, the second containing \(n_2\) observations, where \(n_1 + n_2 = n\).

    Suppose, in the first sub-sample, we have the population regression

    \[E(Y) = \gamma_1 + \gamma_2 X_2 + \ldots + \gamma_k X_k,\]

    while in the second sub-sample we have the population regression

    \[E(Y) = \beta_1 + \beta_2 X_2 + \ldots + \beta_k X_k.\]


    The null hypothesis of interest is

    \[H_0: \gamma_1 = \beta_1,\ \gamma_2 = \beta_2,\ \ldots,\ \gamma_k = \beta_k,\]

    i.e. that the coefficients are the same in the two sub-samples.

    The alternative hypothesis is

    \[H_A: \gamma_j \neq \beta_j \text{ for at least one } j.\]

    We need to conduct three regressions and obtain the sum of squared residuals, SSR, from each regression.


    The required regressions are as follows:

    Regression 1: use the \(n_1\) observations, obtain \(SSR_1\);
    Regression 2: use the \(n_2\) observations, obtain \(SSR_2\);
    Regression 3: use all \(n\) observations, obtain \(SSR_p\).

    Regression 3 is sometimes known as a pooled regression because all the observations have been pooled together.

    There are two versions of the Chow test.

    The first version compares all three values of SSR, while the second compares either \(SSR_1\) or \(SSR_2\) with \(SSR_p\).


    The first test statistic is

    \[TS = \frac{(SSR_p - SSR_1 - SSR_2)/k}{(SSR_1 + SSR_2)/(n_1 + n_2 - 2k)} \sim F_{k,\,n_1+n_2-2k}\]

    under \(H_0\).

    The usual decision rule applies:

    if \(TS > F_{0.05}\) reject \(H_0\) in favour of \(H_A\);
    if \(TS < F_{0.05}\) do not reject \(H_0\),

    where \(F_{0.05}\) is the 5% critical value from the \(F_{k,\,n_1+n_2-2k}\) distribution.

    This test assesses whether the gain from estimating two separate regressions, rather than the pooled regression, is statistically significant (as measured by comparing \(SSR_p\) with \(SSR_1 + SSR_2\)).


    The second test statistic is

    \[TS = \frac{(SSR_p - SSR_1)/n_2}{SSR_1/(n_1 - k)} \sim F_{n_2,\,n_1-k}\]

    under \(H_0\).

    The usual decision rule applies:

    if \(TS > F_{0.05}\) reject \(H_0\) in favour of \(H_A\);
    if \(TS < F_{0.05}\) do not reject \(H_0\),

    where \(F_{0.05}\) is the 5% critical value from the \(F_{n_2,\,n_1-k}\) distribution.

    This test is sometimes known as a predictive failure test, because it is assessing whether the additional \(n_2\) observations included in the pooled regression result in a statistically significant proportionate change in SSR.


    In time series applications the second sub-sample of \(n_2\) observations follows chronologically from the first sub-sample of \(n_1\) observations, so there is a clear ordering of the observations.

    In cross-sections, however, there is often no unique ordering, and so the roles of the two sub-samples can be reversed, resulting in the test statistic

    \[TS = \frac{(SSR_p - SSR_2)/n_1}{SSR_2/(n_2 - k)} \sim F_{n_1,\,n_2-k}\]

    under \(H_0\).

    In this case the \(n_2\) observations are taken as the reference point, which wouldn't make sense with time series.


    Another way to think of the Chow test is in terms of dummy variables.

    Let \(D\) denote the following dummy variable:

    \(D = 1\) if sub-sample 1 (\(n_1\) observations),
    \(D = 0\) if sub-sample 2 (\(n_2\) observations).

    We can then define the population regression

    \[E(Y) = \beta_1 + \beta_2 X_2 + \ldots + \beta_k X_k + \delta_1 D + \delta_2 (D X_2) + \ldots + \delta_k (D X_k).\]

    In sub-sample 1, we have

    \[E(Y) = (\beta_1 + \delta_1) + (\beta_2 + \delta_2) X_2 + \ldots + (\beta_k + \delta_k) X_k,\]

    while in sub-sample 2 we have

    \[E(Y) = \beta_1 + \beta_2 X_2 + \ldots + \beta_k X_k.\]


    The differences in the coefficients between the two sub-samples are captured by the \(\delta_j\) parameters.

    If the \(\delta_j\) are all zero then there are no differences.

    We can therefore consider our test of parameter stability as being a test of

    \[H_0: \delta_1 = 0,\ \delta_2 = 0,\ \ldots,\ \delta_k = 0\]

    against \(H_A\): at least one \(\delta_j \neq 0\).

    We can carry out an F-test of these \(k\) restrictions by running just two regressions, as follows:


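    Regression 1 is the unrestricted regression

    \[Y = \beta_1 + \beta_2 X_2 + \ldots + \beta_k X_k + \delta_1 D + \delta_2 (D X_2) + \ldots + \delta_k (D X_k) + \varepsilon;\]

    from this we need the SSR, denoted \(SSR_U\).

    Regression 2 is the restricted regression

    \[Y = \beta_1 + \beta_2 X_2 + \ldots + \beta_k X_k + \varepsilon;\]

    from this we need \(SSR_R\).

    We then construct the test statistic

    \[TS = \frac{(SSR_R - SSR_U)/k}{SSR_U/(n - 2k)} \sim F_{k,\,n-2k} \text{ under } H_0\]

    and apply the usual decision rule.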
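A short worked step may help connect this statistic to the first Chow statistic. It is a sketch that relies on two standard facts: the restricted regression is the pooled regression, and an unrestricted regression with a full set of dummy interactions fits each sub-sample separately, so its SSR is the sum of the two sub-sample SSRs.

```latex
\begin{aligned}
SSR_R &= SSR_p, \qquad SSR_U = SSR_1 + SSR_2, \qquad n = n_1 + n_2, \\
TS &= \frac{(SSR_R - SSR_U)/k}{SSR_U/(n - 2k)}
    = \frac{(SSR_p - SSR_1 - SSR_2)/k}{(SSR_1 + SSR_2)/(n_1 + n_2 - 2k)}.
\end{aligned}
```

Substituting in this way shows that the dummy-variable F-test and the three-regression version of the Chow test use the same numerator, denominator and degrees of freedom.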


    Example. Let's return to the money demand example, where we divided the countries into those with GDP > $4000 per head and those with GDP < $4000 per head.

    The pooled regression is:

    . regress lm lg

                                                           Prob > F      =  0.0000
        Residual |  6.93446511    28  .247659468           R-squared     =  0.8924
    -------------+------------------------------           Adj R-squared =  0.8885
           Total |  64.4198259    29  2.22137331           Root MSE      =  .49765

    ------------------------------------------------------------------------------
              lm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              lg |    1.04467    .068569    15.24   0.000     .9042126    1.185127
           _cons |  -1.912253    .104309   -18.33   0.000     -2.12592   -1.698586
    ------------------------------------------------------------------------------


    The two sub-sample regressions are:

    . regress lm lg if g<4

                                                           Prob > F      =  0.0000
        Residual |   4.6831173    17  .275477489           R-squared     =  0.6938
    -------------+------------------------------           Adj R-squared =  0.6758
           Total |  15.2933819    18  .849632328           Root MSE      =  .52486

    ------------------------------------------------------------------------------
              lm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              lg |   .9226822    .148673     6.21   0.000     .6090096    1.236355
           _cons |  -1.970365   .1216961   -16.19   0.000    -2.227121   -1.713609
    ------------------------------------------------------------------------------

    . regress lm lg if g>4

          Source |       SS       df       MS              Number of obs =      11
    -------------+------------------------------           F(  1,     9) =    3.52
           Model |  .714288023     1  .714288023           Prob > F      =  0.0934
        Residual |   1.8267651     9    .2029739           R-squared     =  0.2811
    -------------+------------------------------           Adj R-squared =  0.2012
           Total |  2.54105312    10  .254105312           Root MSE      =  .45053

    ------------------------------------------------------------------------------
              lm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              lg |   .7237502   .3858088     1.88   0.093    -.1490099     1.59651
           _cons |  -1.117129   .8758755    -1.28   0.234    -3.098497     .864239
    ------------------------------------------------------------------------------


    For the first test we find that

    \[SSR_p = 6.9345, \quad SSR_1 = 4.6831, \quad SSR_2 = 1.8268\]

    with \(n_1 = 19\), \(n_2 = 11\) and \(k = 2\).

    The test statistic is

    \[TS = \frac{(SSR_p - SSR_1 - SSR_2)/k}{(SSR_1 + SSR_2)/(n_1 + n_2 - 2k)} = \frac{(6.9345 - 4.6831 - 1.8268)/2}{(4.6831 + 1.8268)/(19 + 11 - 4)} = 0.8479.\]

    Under \(H_0\) (that the coefficients are the same in the two sub-samples) \(TS\) has an \(F_{2,26}\) distribution, and so \(F_{0.05} = 3.3690\).

    As \(TS = 0.8479 < 3.3690\) we are unable to reject \(H_0\).
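The arithmetic of the first Chow statistic is easy to check numerically. The sketch below is illustrative only: the function name `chow_statistic` and its argument names are hypothetical, the SSR values and sample sizes are those reported above, and the critical value is the one quoted in the text rather than computed.

```python
def chow_statistic(ssr_pooled, ssr_1, ssr_2, n_1, n_2, k):
    """First Chow statistic, distributed F(k, n_1 + n_2 - 2k) under H0."""
    numerator = (ssr_pooled - ssr_1 - ssr_2) / k
    denominator = (ssr_1 + ssr_2) / (n_1 + n_2 - 2 * k)
    return numerator / denominator

# SSR values and sample sizes from the pooled and sub-sample regressions above.
ts = chow_statistic(6.9345, 4.6831, 1.8268, n_1=19, n_2=11, k=2)

# 5% critical value of F(2, 26), as quoted in the text (not computed here).
f_crit = 3.3690

print(round(ts, 4), ts > f_crit)
```

The computed value rounds to 0.8479, well below the critical value, so the conclusion is not to reject the null of equal coefficients.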


    For the second test we have

    \[SSR_p = 6.9345, \quad SSR_1 = 4.6831, \quad SSR_2 = 1.8268\]

    with \(n_1 = 19\), \(n_2 = 11\) and \(k = 2\).

    The test statistic is

    \[TS = \frac{(SSR_p - SSR_1)/n_2}{SSR_1/(n_1 - k)} = \frac{(6.9345 - 4.6831)/11}{4.6831/(19 - 2)} = 0.7430.\]

    Under \(H_0\) (that the coefficients are the same in the two sub-samples) \(TS\) has an \(F_{11,17}\) distribution, and so \(F_{0.05} \approx 2.4153\), interpolating between the values for \(F_{10,17}\) and \(F_{12,17}\), which are 2.4499 and 2.3807, respectively.

    As \(TS = 0.7430 < 2.4153\) we are unable to reject \(H_0\).


    We can also conduct the second test by reversing the roles of the two sub-samples, by computing

    \[TS = \frac{(SSR_p - SSR_2)/n_1}{SSR_2/(n_2 - k)} = \frac{(6.9345 - 1.8268)/19}{1.8268/(11 - 2)} = 1.3244.\]

    Under \(H_0\) (that the coefficients are the same in the two sub-samples) \(TS\) has an \(F_{19,9}\) distribution, and so \(F_{0.05} \approx 2.9365\) (this is actually the value for \(F_{20,9}\)).

    As \(TS = 1.3244 < 2.9365\) we are once more unable to reject \(H_0\).
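Both directions of the predictive failure test use the same formula with the roles of the sub-samples swapped, which a small helper makes explicit. This is a sketch under stated assumptions: the function name `predictive_failure_statistic` and the argument names `n_ref` (size of the reference sub-sample) and `n_extra` (size of the added sub-sample) are hypothetical, and the inputs are the SSR values reported above.

```python
def predictive_failure_statistic(ssr_pooled, ssr_ref, n_ref, n_extra, k):
    """Second Chow statistic, distributed F(n_extra, n_ref - k) under H0."""
    numerator = (ssr_pooled - ssr_ref) / n_extra
    denominator = ssr_ref / (n_ref - k)
    return numerator / denominator

# Sub-sample 1 (19 observations) as reference, sub-sample 2 (11 observations) added.
ts_forward = predictive_failure_statistic(6.9345, 4.6831, n_ref=19, n_extra=11, k=2)

# Roles reversed: sub-sample 2 as reference, sub-sample 1 added.
ts_reverse = predictive_failure_statistic(6.9345, 1.8268, n_ref=11, n_extra=19, k=2)

print(round(ts_forward, 4), round(ts_reverse, 4))
```

The two values agree with the hand calculations in the text to four decimal places, and each is compared with its own critical value because the degrees of freedom differ between the two directions.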


    Finally, let's compute the alternative version of the first test using dummy variables, defining D = 1 for countries with GDP < $4000 per head:

    . regress lm lg d dlg

                                                           Prob > F      =  0.0000
        Residual |   6.5098824    26  .250380092           R-squared     =  0.8989
    -------------+------------------------------           Adj R-squared =  0.8873
           Total |  64.4198259    29  2.22137331           Root MSE      =  .50038

    ------------------------------------------------------------------------------
              lm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              lg |   .7237502   .4285011     1.69   0.103    -.1570464    1.604547
               d |  -.8532358    .979691    -0.87   0.392    -2.867019    1.160548
             dlg |    .198932   .4513347     0.44   0.663    -.7287999    1.126664
           _cons |  -1.117129   .9727969    -1.15   0.261    -3.116742    .8824836
    ------------------------------------------------------------------------------


    We therefore have \(SSR_U = 6.5099\) from this unrestricted regression, while our earlier pooled (restricted) regression gives \(SSR_R = 6.9345\); also, \(k = 2\) and \(n = 30\).

    The test statistic is

    \[TS = \frac{(6.9345 - 6.5099)/2}{6.5099/(30 - 4)} = 0.8479;\]

    this is exactly the same value as the first Chow statistic we computed!

    In fact, it is possible to prove that the two statistics are identical.

    Because \(TS\) has an \(F_{2,26}\) distribution under \(H_0\) (as before), we know the 5% critical value is 3.3690 and the test result is the same (do not reject \(H_0\)).

    We have actually carried out exactly the same test but by a slightly different approach.

    Summary

    Dummy variables

    Chow tests

    Next week:

    Heteroskedasticity, autocorrelation and dynamic models
