Dynamic Panel Data Analysis – iLQAM, UiTM Shah Alam, 12-13 Dec 2013. Page 1
DYNAMIC PANEL DATA ANALYSIS
USING STATA 11.0
By:
Mahyudin Ahmad
UiTM Perlis

Outline
1. Revisiting the endogeneity issue
2. Static model: IV estimation (recap)
3. Dynamic panel data model
   3.1 Difference GMM
   3.2 System GMM
4. Diagnostic tests: Sargan/Hansen and autocorrelation tests
5. Hands-on with Stata
Main references:
1. Cameron & Trivedi (2005), Verbeek (2008), Dr Nor Azam notes.
2. https://www.iser.essex.ac.uk/files/teaching/spudney/ec968/downloads/Lecture%20Notes/bergen%20notes.pdf
3. How to do xtabond2: http://ideas.repec.org/p/cgd/wpaper/103.html

1. Endogeneity
Recall: if $\mathrm{Cor}(x_{it}, v_i) \neq 0$, we have a problem of endogeneity.
The instrumental variable (IV) technique is used to overcome the problem of endogeneity.
2SLS estimation: we need to find external exogenous instruments that satisfy two requirements:
- correlated with the endogenous variable,
- uncorrelated with the error term of the original model.
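
The two instrument requirements can be illustrated with a small simulation. This is only a sketch in Python (standard library only), not part of the workshop's Stata material: the data-generating process, the coefficient 0.8, the sample size and the names `simulate` and `cov` are all assumptions made for illustration. The instrument `z` moves `x` (relevance) but is independent of the error `e` (exogeneity).

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def simulate(n=20000, beta=1.0, seed=1):
    """Draw (x, y, z) where x is endogenous and z is a valid instrument."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                  # instrument: moves x, unrelated to e
        e = rng.gauss(0, 1)                  # error term of the model
        x = z + 0.8 * e + rng.gauss(0, 1)    # x depends on e, so Cor(x, e) != 0
        xs.append(x)
        ys.append(beta * x + e)
        zs.append(z)
    return xs, ys, zs

xs, ys, zs = simulate()
beta_ols = cov(xs, ys) / cov(xs, xs)   # OLS slope: contaminated by Cor(x, e)
beta_iv = cov(zs, ys) / cov(zs, xs)    # simple IV slope using z as instrument
print(f"true beta = 1.0, OLS = {beta_ols:.3f}, IV = {beta_iv:.3f}")
```

In this setup the OLS slope should settle clearly above the true value, while the IV slope should stay close to it.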
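
2. IV estimation
The command to do IV estimation
ivregress: linear regression of depvar on varlist1 and varlist2, using varlist_iv (along with varlist1) as instruments for varlist2.
Syntax: ivregress estimator depvar [varlist1] (varlist2 = varlist_iv), where estimator is 2sls, liml or gmm.
Diagnostic tests, executed after the ivregress command:
- estat endogenous: tests of endogeneity of the instrumented regressors
- estat overid: tests of overidentifying restrictions
- estat firststage: first-stage regression statistics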

2. IV estimation
The command to do IV estimation
xtivreg: for panel-data models in which some of the right-hand-side covariates are endogenous. These estimators are two-stage least-squares generalizations of simple panel-data estimators for exogenous variables.
Options:
- be: 2SLS between estimator.
- fe: 2SLS within estimator.
- re: 2SLS random-effects estimator.
- fd: 2SLS first-differenced estimator.
xthtaylor: for panel-data random-effects models in which some of the covariates are correlated with the unobserved individual-level random effect.
Note: we are not going to cover this in our workshop. You may, however, read the Stata help and try the example given there for yourself.

3. Dynamic Panel data model
Why is it called dynamic?
Because a lagged dependent variable is included among the regressors of the equation to be estimated.
Example: economic growth analysis
- Real income/GDP or real GDP growth is the dependent variable.
- Lagged income/GDP is included for convergence analysis.
Caselli et al. (1996) and Bond et al. (2001) show that Generalized Method of Moments (GMM) dynamic panel estimation can correct for unobserved country heterogeneity, omitted variable bias, measurement error, and the endogeneity problems that frequently arise in growth estimation.

3. Panel data dynamic model
Consider the simple case below:
$$ y_{it} = \alpha y_{i,t-1} + \varepsilon_{it}, \quad t = 2, \dots, T \tag{1} $$
and
$$ \varepsilon_{it} = v_i + u_{it}, \quad t = 2, \dots, T $$
and we assume the following about the error term for $t = 2, \dots, T$:
$$ E[v_i] = 0, \quad E[u_{it}] = 0, \quad E[v_i u_{it}] = 0, \quad E[v_i^2] = \sigma_v^2 $$
$$ E[u_{it} u_{is}] = 0 \quad \text{for } t \neq s $$
and finally the initial condition for the dynamic model:
$$ E[u_{it}\, y_{i1}] = 0 \quad \text{for } t = 2, \dots, T $$

3. Panel data dynamic model
The implication of having a lagged dependent variable, as in Equation (1) earlier:
the pooled OLS, fixed-effects and random-effects estimators all become inconsistent!
How?
OLS
Linear estimation of $y_{it}$ on $y_{i,t-1}$.
The error term $(v_i + u_{it})$ is then correlated with the regressor $y_{i,t-1}$ via $v_i$.
How? Say we lag Equation (1) by one period to get
$$ y_{i,t-1} = \alpha y_{i,t-2} + \varepsilon_{i,t-1} $$
and it is obvious that $y_{i,t-1}$ is correlated with $v_i$, which is part of $\varepsilon_{i,t-1}$.

3. Panel data dynamic model
Fixed Effect (within estimator)
The within estimator regresses $(y_{it} - \bar{y}_i)$ on $(y_{i,t-1} - \bar{y}_{i,-1})$; the error term is $(u_{it} - \bar{u}_i)$.
The estimator is inconsistent since $\bar{y}_{i,-1}$ is correlated with $\bar{u}_i$ by construction, so the transformed regressor is correlated with the transformed error.
Recall,
$$ \bar{y}_{i,-1} = \sum_{t=2}^{T} y_{i,t-1} / (T-1) \quad \text{and} \quad \bar{u}_i = \sum_{t=2}^{T} u_{it} / (T-1) $$
Consistency requires $\bar{u}_i$ to become very small relative to $u_{it}$, and this is possible only when $T \to \infty$, which occurs in long panels but not in short panels (Nickell, 1981).
Random Effect estimator
Also inconsistent, because it is a linear combination of the within and between estimators.
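
The direction of the two biases can be checked by simulation. The sketch below is in Python (standard library only) rather than Stata, and everything in it is an assumption chosen for illustration: the true value of 0.5 for the coefficient, unit-variance normal errors, six periods, the starting value for $y_{i1}$, and the function names. It generates data from Equation (1) and applies pooled OLS and the within estimator.

```python
import random

def simulate_panel(n=2000, periods=6, alpha=0.5, seed=42):
    """Simulate y_it = alpha*y_i,t-1 + v_i + u_it for n units."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        v = rng.gauss(0, 1)                      # fixed effect v_i
        y = v / (1 - alpha) + rng.gauss(0, 1)    # assumed starting value y_i1
        series = [y]
        for _ in range(periods - 1):
            y = alpha * y + v + rng.gauss(0, 1)  # Equation (1), eps = v + u
            series.append(y)
        panel.append(series)
    return panel

def pooled_ols(panel):
    """Slope of y_it on y_i,t-1 with a common intercept."""
    x = [s[t - 1] for s in panel for t in range(1, len(s))]
    y = [s[t] for s in panel for t in range(1, len(s))]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def within(panel):
    """Slope after subtracting unit-specific means (fixed effects)."""
    sxy = sxx = 0.0
    for s in panel:
        x, y = s[:-1], s[1:]
        mx, my = sum(x) / len(x), sum(y) / len(y)
        sxy += sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx += sum((a - mx) ** 2 for a in x)
    return sxy / sxx

panel = simulate_panel()
alpha_ols = pooled_ols(panel)
alpha_fe = within(panel)
print(f"true alpha = 0.5, pooled OLS = {alpha_ols:.3f}, within = {alpha_fe:.3f}")
```

With a short panel like this one, pooled OLS should overstate the coefficient and the within estimator should understate it, which is the pattern described by Nickell (1981).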

3.1: Difference GMM
Difference GMM transforms Equation (1) into a differenced equation:
$$ (y_{it} - y_{i,t-1}) = \alpha (y_{i,t-1} - y_{i,t-2}) + (u_{it} - u_{i,t-1}), \quad t = 3, \dots, T \tag{2} $$
The time-invariant fixed effect $v_i$ has now disappeared.
The OLS estimator of Equation (2) is still inconsistent, as $y_{i,t-1}$ in the regressor $(y_{i,t-1} - y_{i,t-2})$ is correlated with $u_{i,t-1}$ in the error term $(u_{it} - u_{i,t-1})$.
Anderson and Hsiao (1981): estimate Equation (2) by IV, using an earlier lag of $y$ (in levels), i.e. $y_{i,t-2}$, as the instrument for $(y_{i,t-1} - y_{i,t-2})$.
- Valid instrument, since $y_{i,t-2}$ is not correlated with $(u_{it} - u_{i,t-1})$, assuming the errors $u_{it}$ are serially uncorrelated, e.g. $E(u_{i,t-1} u_{i,t-2}) = 0$.
- Good instrument, since $y_{i,t-2}$ is correlated with $(y_{i,t-1} - y_{i,t-2})$.
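
A minimal sketch of the Anderson and Hsiao idea, written in Python (standard library only) rather than Stata. The simulated data, the true coefficient of 0.5, the starting value for $y_{i1}$ and the function names are assumptions for illustration. With one instrument and one regressor, the IV estimate reduces to a ratio of two sums: instrument times dependent variable over instrument times regressor.

```python
import random

def simulate_panel(n=5000, periods=6, alpha=0.5, seed=3):
    """Simulate y_it = alpha*y_i,t-1 + v_i + u_it for n units."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        v = rng.gauss(0, 1)                      # fixed effect v_i
        y = v / (1 - alpha) + rng.gauss(0, 1)    # assumed starting value y_i1
        series = [y]
        for _ in range(periods - 1):
            y = alpha * y + v + rng.gauss(0, 1)
            series.append(y)
        panel.append(series)
    return panel

def anderson_hsiao(panel):
    """IV estimate of alpha in the differenced equation, instrument y_i,t-2."""
    num = den = 0.0
    for s in panel:
        for t in range(2, len(s)):          # list index t needs t-2 >= 0
            dy = s[t] - s[t - 1]            # dependent variable: y_it - y_i,t-1
            dy_lag = s[t - 1] - s[t - 2]    # regressor: y_i,t-1 - y_i,t-2
            z = s[t - 2]                    # instrument: level y_i,t-2
            num += z * dy
            den += z * dy_lag
    return num / den

alpha_ah = anderson_hsiao(simulate_panel())
print(f"true alpha = 0.5, Anderson-Hsiao IV = {alpha_ah:.3f}")
```

The differencing removes $v_i$, so no unit means are needed; the estimate should land near the true value, though with more sampling noise than a GMM estimator that uses additional lags.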

3.1: Difference GMM
More efficient estimation is, however, possible using additional lags of the dependent variable as instruments.
For example, use both $y_{i,t-2}$ and $y_{i,t-3}$ as instruments for $(y_{i,t-1} - y_{i,t-2})$.
The model is then overidentified (the number of instruments is greater than the number of instrumented variables), so estimation should be by 2SLS or panel GMM.
Note that the number of available instruments is highest for the periods closest to the final (most recent) period T:
- In period 3, only $y_{i1}$ is available as an instrument in the equation for $\Delta y_{i3}$.
- In period 4, both $y_{i1}$ and $y_{i2}$ are available as instruments in the equation for $\Delta y_{i4}$.
- In period 5, $y_{i1}$, $y_{i2}$ and $y_{i3}$ are available as instruments in the equation for $\Delta y_{i5}$.
- And so on.
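
3.1: Difference GMM
Use many lags, replacing missing values with zero.
Generate a separate instrument for each lag and each time period instrumented.
IV-style (a dot denotes a missing value):
$$ \begin{pmatrix} . \\ . \\ y_{i1} \\ \vdots \\ y_{i,T-2} \end{pmatrix} $$
GMM-style:
$$ \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
y_{i1} & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & y_{i2} & y_{i1} & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & y_{i3} & y_{i2} & y_{i1} & \cdots \\
\vdots & & & & & & \ddots
\end{pmatrix} $$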
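
The GMM-style layout can be built mechanically for one panel unit. The helper below is a hypothetical illustration in Python (it is not how xtabond2 is implemented): each period gets its own block of columns holding the available lagged levels, most recent first, and every other cell is zero.

```python
def gmm_style_instruments(y):
    """Build the GMM-style instrument rows for one unit.

    y is the list [y_1, ..., y_T] of levels. Row t (1-based) contains
    y_{t-2}, y_{t-3}, ..., y_1 in a block of columns reserved for period t,
    and zeros everywhere else. Rows 1 and 2 have no instruments.
    """
    T = len(y)
    n_cols = (T - 2) * (T - 1) // 2      # 1 + 2 + ... + (T-2) columns
    rows = []
    col = 0                              # first column of the current block
    for t in range(1, T + 1):
        row = [0] * n_cols
        n_lags = max(t - 2, 0)           # lags available in period t
        for k in range(n_lags):
            row[col + k] = y[t - 3 - k]  # y_{t-2-k}, stored at list index t-3-k
        col += n_lags
        rows.append(row)
    return rows

# The numbers 1..5 stand in for y_i1..y_i5 so the pattern is easy to read.
for row in gmm_style_instruments([1, 2, 3, 4, 5]):
    print(row)
```

With five periods the result has six columns, matching the first six columns of the GMM-style matrix. Stacking these blocks over all units gives the full instrument matrix.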

3.1: Difference GMM
Holtz-Eakin et al. (1988) and Arellano and Bond (1991) propose panel GMM estimators in which the instruments must satisfy the following moment restrictions for $\alpha$ to be estimated efficiently:
$$ E[\Delta u_{it}\, y_{is}] = 0 \quad \text{for } t = 3, \dots, T \text{ and } s = 1, \dots, t-2 \tag{3} $$
Assuming no serial correlation in the error term $u_{it}$, lagged levels of the variables (those differenced in the equation to be estimated) make valid instruments, as $y_{is}$ is not correlated with $\Delta u_{it} = (u_{it} - u_{i,t-1})$.
Hence the name Arellano-Bond estimator (or difference GMM).

3.1: Difference GMM
If our model contains other explanatory variables $x_{it}$ as regressors:
$$ y_{it} = \alpha y_{i,t-1} + \beta x_{it} + \varepsilon_{it}, \quad t = 2, \dots, T \tag{4} $$
the differenced equation will be
$$ (y_{it} - y_{i,t-1}) = \alpha (y_{i,t-1} - y_{i,t-2}) + \beta (x_{it} - x_{i,t-1}) + (u_{it} - u_{i,t-1}), \quad t = 3, \dots, T \tag{5} $$
Thus the lagged level $x_{i,t-2}$ makes a valid instrument for $(x_{it} - x_{i,t-1})$, since it is not correlated with $(u_{it} - u_{i,t-1})$.
Rule of thumb: for a level variable to be a valid instrument for a differenced variable, it must be lagged by at least 2 periods.

3.1: Difference GMM
The moment conditions now become:
$$ E[\Delta u_{it}\, y_{is}] = 0 \quad \text{for } t = 3, \dots, T \text{ and } s = 1, \dots, t-2 \tag{3} $$
$$ E[\Delta u_{it}\, x_{is}] = 0 \quad \text{for } t = 3, \dots, T \text{ and } s = 1, \dots, t-2 \tag{6} $$
An additional assumption is also required for the explanatory variables $x$: they are assumed to be weakly exogenous.
In other words, the explanatory variables must be orthogonal to future realizations of the error term.

3.1: Difference GMM
The Arellano-Bond estimator has, however, been shown to have conceptual and statistical shortcomings.
Alonso-Borrego and Arellano (1999) and Blundell and Bond (1998) point out that when the explanatory variables are persistent over time, lagged levels of these variables make weak instruments for the regression in differences.
Instrument weakness in turn affects both the asymptotic and the small-sample performance of the difference estimator.
Asymptotically, the variance of the coefficients rises; in small samples, Monte Carlo experiments show that weak instruments can produce biased coefficients.
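
The weak-instrument point can be made concrete by measuring how strongly the instrument $y_{i,t-2}$ is correlated with the regressor it instruments, $(y_{i,t-1} - y_{i,t-2})$, at two levels of persistence. This is a Python sketch (standard library only) under assumed settings: unit-variance normal errors, units started from their stationary distribution, and the illustrative values 0.5 and 0.95 for the autoregressive coefficient.

```python
import random

def first_stage_corr(alpha, n=3000, periods=6, seed=7):
    """Correlation between the instrument y_i,t-2 and the regressor
    (y_i,t-1 - y_i,t-2), pooled over units and periods."""
    rng = random.Random(seed)
    z, dx = [], []
    for _ in range(n):
        v = rng.gauss(0, 1)                        # fixed effect v_i
        # assumption: the first observation is drawn from the stationary
        # distribution of the process
        sd0 = (1 / (1 - alpha ** 2)) ** 0.5
        y = [v / (1 - alpha) + rng.gauss(0, sd0)]
        for _ in range(periods - 1):
            y.append(alpha * y[-1] + v + rng.gauss(0, 1))
        for t in range(2, periods):
            z.append(y[t - 2])
            dx.append(y[t - 1] - y[t - 2])
    mz, md = sum(z) / len(z), sum(dx) / len(dx)
    szd = sum((a - mz) * (b - md) for a, b in zip(z, dx))
    szz = sum((a - mz) ** 2 for a in z)
    sdd = sum((b - md) ** 2 for b in dx)
    return szd / (szz * sdd) ** 0.5

corr_moderate = first_stage_corr(0.5)
corr_persistent = first_stage_corr(0.95)
print(f"alpha = 0.50: corr = {corr_moderate:.3f}")
print(f"alpha = 0.95: corr = {corr_persistent:.3f}")
```

The correlation should be markedly closer to zero in the persistent case, which is what makes lagged levels weak instruments for the differenced equation.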

3.2: System GMM
A better and more efficient dynamic panel GMM technique is proposed by Arellano and Bover (1995), using the following moment conditions:
$$ E[\varepsilon_{it}\, \Delta y_{is}] = 0 \quad \text{for } s \le t-1 \tag{7} $$
which is equal to
$$ E[(v_i + u_{it})\, \Delta y_{is}] = 0 \quad \text{for } s \le t-1 $$
These moment conditions imply that we estimate Equation (1) in levels (not in differences), and instrument the endogenous $y_{i,t-1}$ in the model with lagged differences of $y$, i.e. $\Delta y_{is}$, $s \le t-1$.
The estimator proposed by Arellano and Bover (1995) using this moment condition is therefore called the Arellano-Bover estimator.

3.2: System GMM
For $\Delta y_{is}$ to be a valid instrument, it must be uncorrelated with the original composite error term $\varepsilon_{it}$, or $(v_i + u_{it})$, in the level equation to be estimated:
$$ E[(v_i + u_{it})\, \Delta y_{is}] = 0 \quad \text{for } s \le t-1 $$
This holds notwithstanding the possible correlation between the regressor (in levels) and the time-invariant factor $v_i$,
since the regressors are assumed to fulfill the following stationarity property:
$$ E[y_{i,t+p}\, v_i] = E[y_{i,t+s}\, v_i] \quad \text{for all } p \text{ and } s. $$
Recall that we assumed earlier $E[v_i] = 0$ and $E[v_i^2] = \sigma_v^2$, i.e. $v_i$ has zero mean and constant variance.
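
A minimal sketch of the level-equation idea, in Python (standard library only) rather than Stata. It estimates Equation (1) in levels and uses the lagged difference $\Delta y_{i,t-1}$ as the single instrument for $y_{i,t-1}$. The simulated data, the true coefficient of 0.5 and the function names are assumptions for illustration. Each unit is started from its stationary distribution so that the stationarity property above holds in the simulated data.

```python
import random

def simulate_stationary_panel(n=5000, periods=6, alpha=0.5, seed=11):
    """Simulate y_it = alpha*y_i,t-1 + v_i + u_it with a stationary start."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        v = rng.gauss(0, 1)                          # fixed effect v_i
        sd0 = (1 / (1 - alpha ** 2)) ** 0.5          # stationary std of the AR part
        series = [v / (1 - alpha) + rng.gauss(0, sd0)]
        for _ in range(periods - 1):
            series.append(alpha * series[-1] + v + rng.gauss(0, 1))
        panel.append(series)
    return panel

def level_iv(panel):
    """IV estimate of alpha in the level equation, instrument = lagged difference.

    No intercept is included because all simulated variables have mean zero;
    real data would need a constant.
    """
    num = den = 0.0
    for s in panel:
        for t in range(2, len(s)):
            dz = s[t - 1] - s[t - 2]     # instrument: y_i,t-1 - y_i,t-2
            num += dz * s[t]             # instrument times y_it
            den += dz * s[t - 1]         # instrument times regressor y_i,t-1
    return num / den

alpha_level = level_iv(simulate_stationary_panel())
print(f"true alpha = 0.5, level-equation IV = {alpha_level:.3f}")
```

The lagged difference nets out $v_i$, so it is uncorrelated with the composite error even though $y_{i,t-1}$ itself is not. If the units are not started from the stationary distribution, this moment condition can fail and the estimate can drift away from the true value.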

3.2: System GMM
Blundell and Bond (1998) combine estimation in differences (the same technique as in difference GMM) with the estimation in levels proposed by Arellano and Bover (1995).
The resulting estimator is called the system GMM estimator.
The system GMM estimator must therefore fulfill moment conditions (3) and (7):
$$ E[\Delta u_{it}\, y_{is}] = 0 \quad \text{for } t = 3, \dots, T \text{ and } s = 1, \dots, t-2 \tag{3} $$
$$ E[\varepsilon_{it}\, \Delta y_{is}] = 0 \quad \text{for } s \le t-1 \tag{7} $$
and the stationarity property
$$ E[y_{i,t+p}\, v_i] = E[y_{i,t+s}\, v_i] \quad \text{for all } p \text{ and } s. $$
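
3.2: System GMM
If our model contains other explanatory variables $x_{it}$ as regressors, as in Equation (4), i.e.
$$ y_{it} = \alpha y_{i,t-1} + \beta x_{it} + \varepsilon_{it}, \quad t = 2, \dots, T \tag{4} $$
the instruments in the system GMM estimator (level and differenced regressions) need to fulfill the following moment conditions:
$$ E[\Delta u_{it}\, y_{is}] = 0 \quad \text{for } t = 3, \dots, T \text{ and } s = 1, \dots, t-2 \tag{3} $$
$$ E[\Delta u_{it}\, x_{is}] = 0 \quad \text{for } t = 3, \dots, T \text{ and } s = 1, \dots, t-2 \tag{6} $$
$$ E[\varepsilon_{it}\, \Delta y_{is}] = 0 \quad \text{for all } s \le t-1 \tag{7} $$
$$ E[\varepsilon_{it}\, \Delta x_{is}] = 0 \quad \text{for all } s \le t-1 \tag{8} $$
and the following stationarity properties:
$$ E[y_{i,t+p}\, v_i] = E[y_{i,t+s}\, v_i] \quad \text{for all } p \text{ and } s. $$
$$ E[x_{i,t+p}\, v_i] = E[x_{i,t+s}\, v_i] \quad \text{for all } p \text{ and } s. $$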

4. Diagnostic tests
Consistency of the GMM estimator depends on the validity of the instruments.
As suggested by Arellano and Bond (1991), Arellano and Bover (1995), and Blundell and Bond (1998), two specification tests are used: the Sargan/Hansen test and the serial correlation tests (AR(1) and AR(2)).
Sargan/Hansen test of over-identifying restrictions, which tests the overall validity of the instruments:
- The null hypothesis is that all instruments as a group are exogenous. A higher (insignificant) p-value is therefore better.
- Rule of thumb: number of instruments ≤ number of panel units.
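
To apply the rule of thumb, it helps to know how quickly the instrument count grows with the number of periods. The helper below is a hypothetical Python illustration that counts only the lagged-level instruments for the lagged dependent variable in the differenced equation (period $t$ contributes $t-2$ of them). The `max_lags` argument is an assumed name for capping the lags used per period, similar in spirit to the `laglimits()` suboption of xtabond2's `gmm()` option.

```python
def count_lag_instruments(T, max_lags=None):
    """Number of lagged-level instruments for y_i,t-1 in difference GMM.

    T is the number of time periods. In period t (t = 3, ..., T) the levels
    y_1, ..., y_{t-2} are available, i.e. t - 2 instruments. If max_lags is
    given, at most that many lags are used in each period.
    """
    total = 0
    for t in range(3, T + 1):
        available = t - 2
        if max_lags is not None:
            available = min(available, max_lags)
        total += available
    return total

for T in (5, 9, 15):
    print(T, count_lag_instruments(T), count_lag_instruments(T, max_lags=2))
```

The uncapped count equals $(T-1)(T-2)/2$, so it grows quadratically in $T$ and can exceed the number of panel units in panels with many periods; capping the lags is one way to keep it down. Instruments for other regressors and for the level equation in system GMM come on top of this count.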

4. Diagnostic tests
The serial correlation test examines the null hypothesis that the error term of the differenced equation is not serially correlated at first order (AR(1)) and second order (AR(2)). So again a higher p-value is what we want, above all for AR(2).
By construction, the differenced error term is expected to be serially correlated at first order even if the original error is not. The AR(1) test relates
$$ \Delta u_{it} = u_{it} - u_{i,t-1} \quad \text{and} \quad \Delta u_{i,t-1} = u_{i,t-1} - u_{i,t-2} $$
and both contain $u_{i,t-1}$.
The AR(2) test is the most important, since it detects autocorrelation in levels. It relates
$$ \Delta u_{it} = u_{it} - u_{i,t-1} \quad \text{and} \quad \Delta u_{i,t-2} = u_{i,t-2} - u_{i,t-3} $$
While most studies that employ dynamic GMM estimation report the test for first-order serial correlation, some do not.
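
The reasoning behind the two tests can be written out as a short worked calculation. It assumes, beyond what the slides state, that $u_{it}$ has a constant variance, written here as $\sigma_u^2$ (a symbol introduced only for this illustration), in addition to being serially uncorrelated with mean zero.

```latex
% First order: the two differences share u_{i,t-1}
\operatorname{Cov}(\Delta u_{it}, \Delta u_{i,t-1})
  = E\big[(u_{it} - u_{i,t-1})(u_{i,t-1} - u_{i,t-2})\big]
  = -E\big[u_{i,t-1}^2\big]
  = -\sigma_u^2 \neq 0

% Second order: the two differences share no common term
\operatorname{Cov}(\Delta u_{it}, \Delta u_{i,t-2})
  = E\big[(u_{it} - u_{i,t-1})(u_{i,t-2} - u_{i,t-3})\big]
  = 0
```

So a significant AR(1) statistic is expected, while a significant AR(2) statistic signals that $u_{it}$ itself is serially correlated, in which case $y_{i,t-2}$ is no longer a valid instrument.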

Hands-on with Stata
The xtabond2 command is preferable.
Refer to abdata and the do-file sent.