Panel Data Econometrics kenya

COLLABORATIVE MASTERS PROGRAMME IN ECONOMICS FOR ANGLOPHONE AFRICA

(CMAP) JOINT FACILITY FOR ELECTIVES 2008

TOPICS IN MICROECONOMETRICS

DR. MOSES SICHEI*

Date: 8th September, 2008

LECTURE 10: INTRODUCTION TO PANEL ECONOMETRICS I

* Research Department, Central Bank of Kenya, Contact details: Tel. 254 20 2860000 Ext.3248 Mobile:+254 723383505;Email: [email protected] or [email protected]

Objectives: The main objective of the lecture is to provide motivation for panel data models. Specifically, the lecture presents the following:

• Places panel data in the context of other data types • Types of panel data types • Advantages of panel data • Limitations of panel data • Overview of panel data models

Key words

• Balanced panel data

• Cross-section oriented panel data

• Dynamic panel

• Macro-panel data

• Micro-panel data

• Nonstationary panel data

• Panel data

• Pooled data

• Pseudo-panel

• Rotating panels

• Seemingly unrelated regression model

• Spatial panel data

• Static panel data

• Stationary panel data

• Synthetic panel data

• Unbalanced panel data

1. INTRODUCTION AND MOTIVATION 1.1 Types of Data 1.1.1 Cross-section data

1

• Values of one or more variables are collected for several sample units/economic entities

at the same point in time.

• In other words it is a snapshot at a point in time

• Examples

o Poverty rates in different countries in Africa at a particular point in time

o Econometrics marks for the 2008 CMAP group

o Household survey data for Uganda

• Cross-section models are predominantly equilibrium models that generally do not shed

light on intertemporal dependence of events

• They fail to resolve fundamental issues about the sources of persistent behaviour

• For instance what’s the main cause of high non-performing loans in country A?

1.1.2 Time series data

• Observe the values of one or more variables over time e.g. GDP, money supply for

several years.

• They shed light on intertemporal dependence of events

• E.g. autoregressive distributed lag models, error correction models etc.

1.1.3 Panel data (cross-section and time series)

• Cross-section repeatedly sampled over time but where the same economic agent has been

followed throughout the period of the sample.

• An example is the average marks for the CMAP econometrics course for each university

over the period 2001-2008.

Period University of Nairobi

University of Dar Es Salaam

University of Ghana

University of Addis Ababa

University of Malawi

University of Zimbabwe

University of Botswana

2001

2002

2003

2004

2

2005

2006

2007

2008

• In other words panel data combines cross-section (“picture or snapshot”, or space)

with time series (“path.”, movie)

Other terms used;

• Pooled data (pooling of time series and cross-section observations)


• Combination of time series and cross-section data


• Longitudinal data (Study over time of a variable or group of subjects).

• Event history (study of the movement over time of subjects through successive states

or conditions)

• Cohort analysis (e.g. following the career path of the first CMAP graduates)

⇒ All these terminologies essentially connote movements over time of cross-sectional units.

⇒Thus panel data is used as a generic term to include one or more of these situations.

⇒Regressions based on such data are called panel data regression models

Examples

• Gravity model of trade, where you observe trade figures for different countries/products

over time

• Investment model, where your cross-sections are the firms observed over time

• Studies dealing with a panel of commercial banks

• Etc.

1.2 Structures of Panel Data

3

(a) Cross-section oriented panel data. The number of cross-sections (N) is more

than the time dimension (T) .e.g. study covering 24 banks over 10 years. This is

the original panel data

(b) Time-series oriented panel data. The time dimension (T) is greater than the

cross-sections (N) e.g. Study of the demand for 4 different oil products covering a

period of say 10 years. This is quite common in macroeconomics

(c) Balanced panel data. This is panel data where there is no missing observations

for every cross-section

(d) Unbalanced panel data. This is the case, where the cross-sections do not have

the same number of data observations. In other words some cross-sections do not

have data. For example when studying Ghana’s trade data to a number of

countries in Africa including South Africa, There would be no exports figures

before 1994 due to sanctions imposed on South Africa.

(e) Rotating panels. This is a case where in order to keep the same number of

economic agents in a survey; the fraction of economic agents that drops from the

sample in the second period is replaced by an equal number of similar economic

agents that are freshly surveyed. This is a necessity in survey panels where the

same economic agent (say household) may not want to be interviewed again and

again.

(f) Pseudo-Panels/synthetic panels. This panel data that is close to a genuine panel

data structure. For instance for some countries, panel data may not exist. Instead

the researcher may find annual household survey based on a large random sample

of the population. For instance in Kenya there are household surveys for 1993,

1994, 1997 and the recent KHIBS 2006. For these repeated cross-section

surveys, it may be impossible to track the same household over time as required

in a genuine panel. In Pseudo panels cohorts are tracked (e.g. males borne

between 1970 to 1980). For large samples, successive surveys will generate

random samples of members of each cohort. We can then estimate economic

relations based on means rather than individual observations.

(g) Spatial Panels. This is panel data dealing with space. For instance cross-section

of countries, regions, states. These aggregate units are likely to exhibit cross-

sectional correlation that has to be dealt with using special methods (spatial

econometrics)

4

(h) Limited dependent/nonlinear panel data. This is panel data where the

dependent variable is not completely continuous-binary(logit/probit models),

hierarchical (nested logit models), ordinal (ordered logit/probit),

categorical(multinomial logit/probit), count models(poisson and negative

binomial), truncated (truncated regression), censored (tobit), sample

selection(Heckit model)

1.3 Types of Panel Data Models

(a) Static Panel data Models vs Dynamic Panel Data Model. Static panel data model

has no lagged dependent variable on the rhs.

(c ) Stationary Panel Data Model vs Non Stationary Panel Data Model. Stationary

panel data model contain stationary variables (i.e. I(0) variables) as opposed to non-

stationary variables

1.4 Why Panel Data? (Baltagi, 2005, Chapter 1)

1. Control for heterogeneity among economic agents.

• Time series and cross-section studies that do not control for the heterogeneity run the

risk of obtaining biased results.

2. More informative data, more variability, less collinearity amongst variables,

more degress of freedom and more efficiency;

• Time series studies are faced with multi-collinearity in most cases for instance in

studying say the demand for beer in Kenya using time series, there is likely to be

high collinearity between price and income in aggregate time series data. This is less

likely with a panel across the 8 provinces in Kenya since the cross-section dimension

adds a lot of variability, adding more informative data on price and income. The idea

is that the variation in the data can be decomposed into variation between the 8

provinces of different sizes and characteristics and variation within each prince over

time.

• Blows up degrees of freedom (NxT);

5

• Increased precision in estimation (more efficiency). With additional more

informative data, we can produce more reliable parameter estimates.

3. Panel data are better able to study dynamics of adjustment. Panel data are better

suited for studying the duration of economic states like unemployment and poverty

and if such panels are long enough, they can shed some light on the speed of

adjustments of to economic policy changes.

• E.g. the effects of free primary education on poverty.

• Questions such as determining whether families’ experiences of poverty,

unemployment and dependency ratios are transitory or chronic necessitate the use of

panels. By studying the repeated cross-section of observations, panel data are better

suited to study the dynamics of change.

• Estimation of intertemporal relations, life cycles and inter-generational models

4. Panel data are better able to identify and measure the effects that are simply not

detectable in pure cross-section and pure time series data

• Does union membership in Kenya increase or decrease wages? We need to observe a

worker moving from union to nonunion jobs or vice versa. Holding the individual’s

characteristics constant, we will be better equipped to determine whether union

membership affects wage and by how much.

5. Panel data models allow us to construct and test more complicated behavioural

models than purely cross-section and time series data e.g technical efficiency better

studied in panel

6. Panel data are usually gathered on micro units like individuals, firms, households,

countries etc. Many variables can be more accurately measured at the micro level

and biases resulting from aggregation over firms or individuals are eliminated.

1.5 Limitations of Panel Data Analysis

1. Design and data collection problems

• Coverage (incomplete account of the population of interest)

• Non-response (due to lack of cooperation of the respondent-fear for use of the

results(tax?) or because of interviewer error

6

• Recall (respondents not remembering correctly)

• Freq of interviews

• Time in sample bias-is observed when a significantly different level for a

characteristic occurs in the first interview than in later interviews, when ideally one

would expect the same level

2. Distortions of measurement error

Measurement errors may arise because of

• Faulty responses due to unclear questions, memory errors, deliberate distortions

(e.g.prestige bias); inappropriate informants, misrecording of responses

3. Selectivity problems

These include;

• Self-selectivity-For instance people choose not to work because of the reservation

wage is higher than the offered wage. In this case we only observe the characteristics

of these individuals but not their wage. Panel data does not solve such a problem

• Non/partial-responses-Occurs mainly at the initial wave of the panel due to refusal to

participate, nobody at home, untraced sample unit, etc. This cause loss in efficiency

as well as serious identification problems for the population parameters

• Attrition-Respondents may die, or move or find that the cost of responding is too

high. In order to counter the effects of attrition, rotating panels are sometimes used,

where a fixed percentage of the respondents are replaced in every wave to replenish

the sample.

4. Very short time series dimension

• Sometimes data for the problem at hand has a short time span for each individual

(micropanels). • This means that asymptotic refinements which rely crucially on the number of

individuals tending to infinity may not be useful. 5. Cross-section dependence

• Macropanels on countries or regions may lead to misleading inference.

• Most analyses assume independence

• When we have dependence between cross-sections, it becomes complicated

• More on this when we handle panel unit roots and panel cointegration

7

2. OVERVIEW OF PANEL DATA MODELS

• Panel data model notation differs from a regular time series or cross-section regression in

that it has a double subscript on its variables ; itit xy , ;

2.1 Panel Data Models

• A very general linear model for panel data permits the intercept and slope coefficients to

vary over both individual and time

ititititit xy εβα +′+= , ,Ni ,...,1= Tt ,...,1= (1)

• Where is a scalar dependent variable, is a ity itx 1×k vector of independent

variables, itε is a scalar disturbance term, i indexes individuals (firms, country etc.),

indexes time.

t

• This model is too general and not estimable as there are more parameters to estimate

than observations

• More restrictions need to be placed on the extent to which itα and itβ can vary with

and and on the behaviour of the error term i t itε

2.2 The Pooled Data Model

• This is the most restrictive model that specifies constant coefficients, which is the usual

assumption about cross-section analysis

ititit xy εβα +′+= (2)

Where denotes households, individuals, firms, countries etc and t denotes time. i

• We assume that errors are homoscedastic and serially independent both within and

between individuals(cross-sections).

2)( σε =itVar

0),( =jsitCor εε when ji ≠ and/or st ≠

• The marginal effects β of the set of k vector of time-varying characteristics are

taken to be common across i and t, although this assumption can itself be tested.

itx

8

• If the model is correctly specified and regressors are uncorrelated with the error term, the

pooled OLS will product consistent and efficient estimates for the parameters

• This is the pooled least squares model.

∑∑

∑∑

= =

= =

′

′= N

i

T

titit

N

i

T

titit

xxNT

yxNT

1 1

1 1

~~1

~~1

β

Where ∑∑= =

=N

i

T

titx

NTx

1 1

1 , ∑∑= =

=N

i

T

tity

NTy

1 1

1 , xxx itit −=~ , yyy itit −=~ (3)

xy βα ˆˆ −=

• This formulation does not distinguish between two different individuals and the same

individual at two different points in time

• This feature undermines the accuracy of the approach when differences do exist between

cross-sectional units.

• Nonetheless, the increase in the sample by pooling data across time generates an

improvement in efficiency relative to a single cross-section.

itx

ity

ititit xy εβα +′+=

θ

• Here we do not use any panel information. The data are treated as if there was only one

single index.

9

2.3 Traditional Panel Data Model

• In this case the constant term, iα , varies from individual to individual.

ititiit xy εβα +′+= (4)

• We assume that errors are homoscedastic and serially independent both within and

between individuals(cross-sections).

2)( σε =itVar

0),( =jsitCor εε when ji ≠ and/or st ≠

• This is what we refer to in panel parlance as individual (unobserved) heterogeneity.

• The slopes are the same for all individuals i.e.

• Its graphical form is as follows;

itx

ity

1

2

3

1α

2α

3α

ititit xy εβα +′+= 1



1θ

2θ

3θ

βθθθ === )tan()tan()tan( 321

0

2.4 Traditional Seemingly Unrelated Regression (SUR) Model

• The constant terms, iα , and slope coefficients, iβ ,vary from individual to individual.

itiitiit xy εβα +′+=

10

itx

ity1

2

3

1α

2α

3α

1θ

2θ3θ

0




( ) ( ) ( )321 tantantan θθθ >>

• In the SUR models, the error terms are assumed to be contemporaneously correlated and

heteroscedastic between individuals.

2)( iitVar σε =

ijjtitCor σεε =),( Contemporaneous (same period) correlation

0),( =jsitCor εε when st ≠

2.5 Which model is appropriate for my Data?

Large number of independent individuals observed for a few time periods (N>>T).

This is common in cross-sectional panels.

o It is not possible to estimate different individual slopes, iβ , for all the exogenous

variables. The panel data model is the most appropriate.

There are medium length time series for relatively few individuals (say countries, firms,

sectors, banks,etc.). T>N.

o In this case the SUR model may be appropriate.

o Efficient SUR estimation is mainly used when . NT ≥

o Equation by Equation OLS is used if NTK ≤≤

In terms of unrestrictiveness, the relationship is as follows:

Pooled (i.e.most restrictive)<Panel<SUR( i.e.most unrestrictive)

11

References

Baltagi, B.H. (2005), Econometric Analysis of Panel Data, 3rd Edition, John Wiley Chapter 1 Cameron,A.C. and Trivedi, P.K. (2006), Microeconometrics: Methods

and Applications, Cambridge University Press. Chapter Chapter 21

Cheng Hsiao, Analysis of Panel Data, Cambridge University Press, 1986

Greene W H. Econometric Analysis, Second Edition, Macmillan, 2003.

Chapter 13

Wooldridge, J. M. (2002), Econometric Analysis of Cross-Section and

Panel Data, MIT Press: Cambridge, Massachusets. Chapter 10

12

COLLABORATIVE MASTERS PROGRAMME IN ECONOMICS FOR ANGLOPHONE AFRICA

(CMAP) JOINT FACILITY FOR ELECTIVES 2008


DR. MOSES SICHEI*


LECTURE 11: ONE-WAY ERROR COMPONENT MODELS


Objectives: The main objective of the lecture is to understand some basic panel data models in the context of one-way error components. Key words

• Balanced panel data

• Between estimator

• Cross-section oriented panel data

• Dynamic panel

• Fixed effects model(FEM)

• Generalised least squares

• Least squares dummy variable (LSDV)

• Panel data

• Pooled data

• One-way error components model

• Random effects model (REM)

• Within estimation

1. INTRODUCTION

1.1 One-Way Error Component Model

• This model allows cross-section heterogeneity in the error term.

• From the traditional panel data model in lecture 10

ititiit xy εβα +′+= (1)

• The error term in equation 1 is decomposed into;

itiit νµε += (2)

• Where iµ denotes the unobservable individual specific effect and itν denotes

idiosyncratic errors or idiosyncratic disturbances, which change across time and cross-

section.

1

• iµ is time invariant (same for all the time) and accounts for any individual-specific

effect that is not included in the regression.

1.1.1 Meaning of the unobservable individual specific effects iµ

• These refer to unobservable individual specific effects which are not included in the

equation because of :

o We do not know exactly how to specify them explicitly

o We know but have no data

• We simply want to acknowledge their existence

• For instance in a production function utilizing data on firms across time, iµ refers to the

unobservable entrepreneurial or management skills of the firm executives

The other terminologies given to iµ are;

• Latent variable

• Unobserved heterogeneity

• Substituting Equation 2 in 1 yields the following one-way error component model;

ititiit xy νβµα +′++= (3)

• Note here that ∑=

=N

iiN 1

1 αα

ααµ −= ii

α is the average individual effect while iµ is the individual deviation from the average (recall

the reference class is multinomial logit model)

2. FIXED EFFECTS MODEL (FEM)

• This is appropriate when differences between individual economic agents may

reasonably be viewed as parametric shifts in the regression function itself.

• Suppose we have a simple linear panel regression model of the form ;

ititiit xy νβµα +′++= (4)

• The iµα + are possibly correlated with the regressors itx

2

• The following figure shows how the FEM handles the heterogeneity issue.

ity Pooled

Group 1

Group2

itx

Biased slope when fixed effects are ignored

0

[ ] ititit xxyE βα +=

[ ] ititit xxyE βµα ++= 1

[ ] ititit xxyE βµα ++= 2

1µα +

2µα +

α

2.1 ESTIMATION OF FIXED EFFECTS MODEL

• The challenge of estimation is the presence of the N individual-specific effects that

increase as ∞→N

• Nonetheless, there are several methods that can be applied

o Least squares dummy variables (LSDV) which is a direct OLS with indicator

(dummy variables) for each of the N fixed effects

o Use OLS in the within estimation context

o Generalized Leased squares in the within model context

o Maximum likelihood estimation conditional on the individual means

Niyi ,...,2,1, =

o OLS in first differences

3. LEAST SQUARES DUMMY VARIABLE (LSDV) ESTIMATION

• This approach assumes that any difference across economic agents can be captured by

shifts in the intercepts of a standard OLS regression.

ititiit xy νβµα +′++=

3

• We estimate an LSDV model first by defining a series of individual-specific dummies

variables.

• In principle one simply estimates the OLS regression of on and a set of N-1

indicator variables

ity itx

tNitt ddd )1(2,1 ,... −

• The resulting estimator of β turns out to equal the within estimator (running a

regression through the mean)

• This is a special case of the Frisch-Waugh-Lovell theorem. You have been using this

theorem: running a regression through the origin (after subtracting the mean) produces

the same slope coefficients as running it with an intercept).

• The theorem was introduced by Frisch and Waugh (1933), and then reintroduced by

Lovell (1963).

• Read pages 62-75 of Econometric theory and methods by Davidson and Mackinnon

(2004) for more details of the theorem

Digression: Different ways of stacking data

• Suppose we studying private consumption in 12 Africa countries over the period 1998-

2003

• We have data on real consumption and income for the different African countries

Dependent variable: consumption

(i) stacked vertically to create 1×NT vector

4

Country Period ConsumptionBotswana 1998 7180.3Botswana 1999 7533.5Botswana 2000 7841.1Botswana 2001 7919.2Botswana 2002 8085.2Botswana 2003 8222.9Burkina Faso 1998 1283027.4Burkina Faso 1999 1297642.2Burkina Faso 2000 1306400.0Burkina Faso 2001 1411715.4Burkina Faso 2002 1513999.2Burkina Faso 2003 1626124.7Burundi 1998 462066.7Burundi 1999 492055.5Burundi 2000 465738.0Burundi 2001 469289.9Burundi 2002 491701.3Burundi 2003 486195.2Kenya 1998 596883.1Kenya 1999 594332.1Kenya 2000 609862.0Kenya 2001 629103.7Kenya 2002 650968.4Kenya 2003 680065.0Madagascar 1998 21830.2Madagascar 1999 22441.8Madagascar 2000 22483.0Madagascar 2001 22443.8Madagascar 2002 21150.2Madagascar 2003 22985.4Mauritius 1998 69552.9Mauritius 1999 71594.9Mauritius 2000 73939.3Mauritius 2001 76048.7Mauritius 2002 78570.9Mauritius 2003 82602.2Morocco 1998 240.3Morocco 1999 233.4Morocco 2000 243.0Morocco 2001 256.4Morocco 2002 256.1Morocco 2003 261.7Nigeria 1998 3307.9Nigeria 1999 2255.7Nigeria 2000 2446.5Nigeria 2001 3068.0Nigeria 2002 3665.8Nigeria 2003 3424.9Rwanda 1998 588.0Rwanda 1999 595.6Rwanda 2000 641.9Rwanda 2001 676.1Rwanda 2002 740.4Rwanda 2003 769.9Sierra Leone 1998 1180237.6Sierra Leone 1999 1032168.8Sierra Leone 2000 1142680.0Sierra Leone 2001 1369830.5Sierra Leone 2002 1547871.3Sierra Leone 2003 1613277.6South Africa 1998 516925.9South Africa 1999 531213.0South Africa 2000 556652.0South Africa 2001 579316.4South Africa 2002 598804.9South Africa 2003 614082.8Tanzania 1998 5610.4Tanzania 1999 6003.2Tanzania 2000 6069.6Tanzania 2001 6579.9Tanzania 2002 7064.1Tanzania 2003 7974.2

• This is how data for stata, SAS and PCGIVE should be

organized

• Data should be organized this way for Eviews if you would

like to use dynamic panel methods

5

(ii) Horizontal to create matrix NNT ×Country Period rcons_Bots rcons_BurkF rcons_Bur rcons_Ken rcons_Madag rcons_Maurit rcons_Mor rcons_Nig rcons_Rwa rcons_SierL rcons_rsa rcons_TanBotswana 1998 7180.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Botswana 1999 7533.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Botswana 2000 7841.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Botswana 2001 7919.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Botswana 2002 8085.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Botswana 2003 8222.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burkina Faso 1998 0.0 1283027.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burkina Faso 1999 0.0 1297642.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burkina Faso 2000 0.0 1306400.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burkina Faso 2001 0.0 1411715.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burkina Faso 2002 0.0 1513999.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burkina Faso 2003 0.0 1626124.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burundi 1998 0.0 0.0 462066.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burundi 1999 0.0 0.0 492055.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burundi 2000 0.0 0.0 465738.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burundi 2001 0.0 0.0 469289.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burundi 2002 0.0 0.0 491701.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Burundi 2003 0.0 0.0 486195.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Kenya 1998 0.0 0.0 0.0 596883.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Kenya 1999 0.0 0.0 0.0 594332.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Kenya 2000 0.0 0.0 0.0 609862.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Kenya 2001 0.0 0.0 0.0 629103.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Kenya 2002 0.0 0.0 0.0 650968.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Kenya 2003 0.0 0.0 0.0 680065.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Madagascar 1998 0.0 0.0 0.0 0.0 21830.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0Madagascar 1999 0.0 0.0 0.0 0.0 22441.8 0.0 0.0 0.0 0.0 0.0 0.0 0.0Madagascar 2000 0.0 0.0 0.0 0.0 22483.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0Madagascar 2001 0.0 0.0 0.0 0.0 22443.8 0.0 0.0 0.0 0.0 0.0 0.0 0.0Madagascar 2002 0.0 0.0 0.0 0.0 21150.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0Madagascar 2003 0.0 0.0 0.0 0.0 22985.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0Mauritius 1998 0.0 0.0 0.0 0.0 0.0 69552.9 0.0 0.0 0.0 0.0 0.0 0.0Mauritius 1999 0.0 0.0 0.0 0.0 0.0 71594.9 0.0 0.0 0.0 0.0 0.0 0.0Mauritius 2000 0.0 0.0 0.0 0.0 0.0 73939.3 0.0 0.0 0.0 0.0 0.0 0.0Mauritius 2001 0.0 0.0 0.0 0.0 0.0 76048.7 0.0 0.0 0.0 0.0 0.0 0.0Mauritius 2002 0.0 0.0 0.0 0.0 0.0 78570.9 0.0 0.0 0.0 0.0 0.0 0.0Mauritius 2003 0.0 0.0 0.0 0.0 0.0 82602.2 0.0 0.0 0.0 0.0 0.0 0.0Morocco 1998 0.0 0.0 0.0 0.0 0.0 0.0 240.3 0.0 0.0 0.0 0.0 0.0Morocco 1999 0.0 0.0 0.0 0.0 0.0 0.0 233.4 0.0 0.0 0.0 0.0 0.0Morocco 2000 0.0 0.0 0.0 0.0 0.0 0.0 243.0 0.0 0.0 0.0 0.0 0.0Morocco 2001 0.0 0.0 0.0 0.0 0.0 0.0 256.4 0.0 0.0 0.0 0.0 0.0Morocco 2002 0.0 0.0 0.0 0.0 0.0 0.0 256.1 0.0 0.0 0.0 0.0 0.0Morocco 2003 0.0 0.0 0.0 0.0 0.0 0.0 261.7 0.0 0.0 0.0 0.0 0.0Nigeria 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3307.9 0.0 0.0 0.0 0.0Nigeria 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2255.7 0.0 0.0 0.0 0.0Nigeria 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2446.5 0.0 0.0 0.0 0.0Nigeria 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3068.0 0.0 0.0 0.0 0.0Nigeria 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3665.8 0.0 0.0 0.0 0.0Nigeria 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3424.9 0.0 0.0 0.0 0.0Rwanda 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 588.0 0.0 0.0 0.0Rwanda 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 595.6 0.0 0.0 0.0Rwanda 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 641.9 0.0 0.0 0.0Rwanda 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 676.1 0.0 0.0 0.0Rwanda 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 740.4 0.0 0.0 0.0Rwanda 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 769.9 0.0 0.0 0.0Sierra Leone 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1180237.6 0.0 0.0Sierra Leone 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1032168.8 0.0 0.0Sierra Leone 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1142680.0 0.0 0.0Sierra Leone 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1369830.5 0.0 0.0Sierra Leone 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1547871.3 0.0 0.0Sierra Leone 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1613277.6 0.0 0.0South Africa 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 516925.9 0.0South Africa 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 531213.0 0.0South Africa 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 556652.0 0.0South Africa 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 579316.4 0.0South Africa 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 598804.9 0.0South Africa 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 614082.8 0.0Tanzania 1998 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 5610.4Tanzania 1999 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6003.2Tanzania 2000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6069.6Tanzania 2001 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6579.9Tanzania 2002 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 7064.1Tanzania 2003 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 7974.2

6

The following stacking is used by Eviews software when you are not

interested in using dynamic panel or the version of eviews software

cannot allow you to do so (e.g.eviews 3.1) Period rcons_Bots rcons_BurkF rcons_Bur rcons_Ken rcons_Madag rcons_Maurit rcons_Mor rcons_Nig rcons_Rwa rcons_SierL rcons_rsa rcons_Tan1990 5587.5 944244.1 652316.1 534498.2 18388.6 48387.5 203.9 2003.8 680.7 1351781.6 416324.7 4179.11991 6361.7 922686.9 616476.2 521072.4 19200.5 50061.0 230.1 2538.2 639.0 1538679.5 417907.8 4188.81992 6517.6 954058.8 686615.5 513099.7 18385.7 52803.7 219.0 3192.2 646.7 1359474.4 419952.9 4391.41993 5989.5 978916.2 712774.9 414525.6 19496.6 55480.3 208.9 2700.9 624.1 1485785.2 430586.1 4471.01994 6344.1 878232.0 733701.2 382161.6 19770.5 58196.4 229.5 2221.0 546.1 1403960.5 447390.1 4490.51995 6361.3 998737.2 548712.2 485436.4 19840.0 60635.0 221.2 2857.5 440.4 1445824.7 473595.4 4585.61996 6397.7 1103507.1 411135.4 496801.0 19766.0 63251.7 242.1 3391.5 494.8 1470327.2 494634.1 4684.21997 6633.4 1112742.4 409956.4 562446.2 21313.3 65509.1 230.5 3222.4 579.4 1322968.4 510869.8 5115.21998 7180.3 1283027.4 462066.7 596883.1 21830.2 69552.9 240.3 3307.9 588.0 1180237.6 516925.9 5610.41999 7533.5 1297642.2 492055.5 594332.1 22441.8 71594.9 233.4 2255.7 595.6 1032168.8 531213.0 6003.22000 7841.1 1306400.0 465738.0 609862.0 22483.0 73939.3 243.0 2446.5 641.9 1142680.0 556652.0 6069.62001 7919.2 1411715.4 469289.9 629103.7 22443.8 76048.7 256.4 3068.0 676.1 1369830.5 579316.4 6579.92002 8085.2 1513999.2 491701.3 650968.4 21150.2 78570.9 256.1 3665.8 740.4 1547871.3 598804.9 7064.12003 8222.9 1626124.7 486195.2 680065.0 22985.4 82602.2 261.7 3424.9 769.9 1613277.6 614082.8 7974.2

Note that if the above data is treated as a matrix, you can simply

stack it vertically by using the vectorisation algebra (available as

option in matrix algebra in most software like stata and eviews)

⎥⎦

⎤⎢⎣

⎡=

7431

A ( )⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢

⎣

⎡

=

7341

AVec

Independent variable: income

• The same stacking as the dependent variable can be done depending on the software

Individual specific dummy variables for LSDV model: Creates NNT × matrix

7

Country Period Bots BurkF Bur Ken Madag Maurit Mor Nig Rwa SierL rsa TanBotswana 1998 1 0 0 0 0 0 0 0 0 0 0 0Botswana 1999 1 0 0 0 0 0 0 0 0 0 0 0Botswana 2000 1 0 0 0 0 0 0 0 0 0 0 0Botswana 2001 1 0 0 0 0 0 0 0 0 0 0 0Botswana 2002 1 0 0 0 0 0 0 0 0 0 0 0Botswana 2003 1 0 0 0 0 0 0 0 0 0 0 0Burkina Faso 1998 0 1 0 0 0 0 0 0 0 0 0 0Burkina Faso 1999 0 1 0 0 0 0 0 0 0 0 0 0Burkina Faso 2000 0 1 0 0 0 0 0 0 0 0 0 0Burkina Faso 2001 0 1 0 0 0 0 0 0 0 0 0 0Burkina Faso 2002 0 1 0 0 0 0 0 0 0 0 0 0Burkina Faso 2003 0 1 0 0 0 0 0 0 0 0 0 0Burundi 1998 0 0 1 0 0 0 0 0 0 0 0Burundi 1999 0 0 1 0 0 0 0 0 0 0 0Burundi 2000 0 0 1 0 0 0 0 0 0 0 0Burundi 2001 0 0 1 0 0 0 0 0 0 0 0Burundi 2002 0 0 1 0 0 0 0 0 0 0 0Burundi 2003 0 0 1 0 0 0 0 0 0 0 0Kenya 1998 0 0 0 1 0 0 0 0 0 0 0Kenya 1999 0 0 0 1 0 0 0 0 0 0 0Kenya 2000 0 0 0 1 0 0 0 0 0 0 0Kenya 2001 0 0 0 1 0 0 0 0 0 0 0Kenya 2002 0 0 0 1 0 0 0 0 0 0 0Kenya 2003 0 0 0 1 0 0 0 0 0 0 0Madagascar 1998 0 0 0 0 1 0 0 0 0 0 0 0Madagascar 1999 0 0 0 0 1 0 0 0 0 0 0 0Madagascar 2000 0 0 0 0 1 0 0 0 0 0 0 0Madagascar 2001 0 0 0 0 1 0 0 0 0 0 0 0Madagascar 2002 0 0 0 0 1 0 0 0 0 0 0 0Madagascar 2003 0 0 0 0 1 0 0 0 0 0 0 0Mauritius 1998 0 0 0 0 0 1 0 0 0 0 0Mauritius 1999 0 0 0 0 0 1 0 0 0 0 0Mauritius 2000 0 0 0 0 0 1 0 0 0 0 0Mauritius 2001 0 0 0 0 0 1 0 0 0 0 0Mauritius 2002 0 0 0 0 0 1 0 0 0 0 0Mauritius 2003 0 0 0 0 0 1 0 0 0 0 0Morocco 1998 0 0 0 0 0 0 1 0 0 0 0 0Morocco 1999 0 0 0 0 0 0 1 0 0 0 0 0Morocco 2000 0 0 0 0 0 0 1 0 0 0 0 0Morocco 2001 0 0 0 0 0 0 1 0 0 0 0 0Morocco 2002 0 0 0 0 0 0 1 0 0 0 0 0Morocco 2003 0 0 0 0 0 0 1 0 0 0 0 0Nigeria 1998 0 0 0 0 0 0 0 1 0 0 0Nigeria 1999 0 0 0 0 0 0 0 1 0 0 0Nigeria 2000 0 0 0 0 0 0 0 1 0 0 0Nigeria 2001 0 0 0 0 0 0 0 1 0 0 0Nigeria 2002 0 0 0 0 0 0 0 1 0 0 0Nigeria 2003 0 0 0 0 0 0 0 1 0 0 0Rwanda 1998 0 0 0 0 0 0 0 0 1 0 0Rwanda 1999 0 0 0 0 0 0 0 0 1 0 0Rwanda 2000 0 0 0 0 0 0 0 0 1 0 0Rwanda 2001 0 0 0 0 0 0 0 0 1 0 0Rwanda 2002 0 0 0 0 0 0 0 0 1 0 0Rwanda 2003 0 0 0 0 0 0 0 0 1 0 0Sierra Leone 1998 0 0 0 0 0 0 0 0 0 1 0 0Sierra Leone 1999 0 0 0 0 0 0 0 0 0 1 0 0Sierra Leone 2000 0 0 0 0 0 0 0 0 0 1 0 0Sierra Leone 2001 0 0 0 0 0 0 0 0 0 1 0 0Sierra Leone 2002 0 0 0 0 0 0 0 0 0 1 0 0Sierra Leone 2003 0 0 0 0 0 0 0 0 0 1 0 0South Africa 1998 0 0 0 0 0 0 0 0 0 0 1 0South Africa 1999 0 0 0 0 0 0 0 0 0 0 1 0South Africa 2000 0 0 0 0 0 0 0 0 0 0 1 0South Africa 2001 0 0 0 0 0 0 0 0 0 0 1 0South Africa 2002 0 0 0 0 0 0 0 0 0 0 1 0South Africa 2003 0 0 0 0 0 0 0 0 0 0 1 0Tanzania 1998 0 0 0 0 0 0 0 0 0 0 0 0Tanzania 1999 0 0 0 0 0 0 0 0 0 0 0 0Tanzania 2000 0 0 0 0 0 0 0 0 0 0 0 0Tanzania 2001 0 0 0 0 0 0 0 0 0 0 0 0Tanzania 2002 0 0 0 0 0 0 0 0 0 0 0 0Tanzania 2003 0 0 0 0 0 0 0 0 0 0 0 0

000000000000

000000

000000000000

8

• Note the fact that the last cross-section is not coded with 1 to avoid the problem of

dummy variable trap (i.e. perfect multi-collinearity)

Use of Kronecker product

Recall that

⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢

⎣

⎡

=

⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢

⎣

⎡

⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅

=⎥⎦

⎤⎢⎣

⎡⊗⎥

⎦

⎤⎢⎣

⎡=⊗

1236309024126030

11211323310133031222112132023101

1230

1321

BA

The dummy variables can be represented as

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎣

⎡

=

100000000000010000000000001000000000000100000000000010000000000001000000000000100000000000010000000000001000000000000100000000000010000000000001

NI ,

⎥⎥⎥⎥⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢⎢⎢⎢⎢

⎣

⎡

=

111111

TJ

The reproduction of the dummy variables above amounts to

µZJI TN =⊗

Our model can then be written as

( )[ ] itxZy νβα

µ +⎥⎦

⎤⎢⎣

⎡=

The Kronecker product ( is an block diagonal matrix and X is the

matrix of nonconstant regressors

)TN JI ⊗

9

• An OLS estimator of this model yields the LSDV estimator

( ) ( ) ( )( )

( )⎥⎥⎦

⎤

⎢⎢⎣

⎡

′

′⊗×⎥⎥⎦

⎤

⎢⎢⎣

⎡

′⊗′

′⊗⊗′⊗=⎥⎦

⎤⎢⎣

⎡−

yxyJI

xxJIxxJIJIJI TN

TN

TNTNTN

LSDV

LSDV

1

ˆˆβα

⎥⎦

⎤⎢⎣

⎡′

×⎥⎦

⎤⎢⎣

⎡′′

=−

yxy

xxxTxTTIN

1

• The LSDV model can easily be estimated using over the full panel

to yield LSDV estimators.

• This model is appealing

• But for short panels the problem is that it estimates too many

(incidental) parameters that may not be of intrinsic value

K+1 + (N−1)

K parameters for the original X-regressors;

1 parameter for the intercept;

N-1 parameters for cross-section fixed effects (omitted cross-

section captured by the common intercept i.e the reference class-

Tanzania.)

Problems with LSDV Model

1. There are too many incidental/nuisance parameters since iµ grows

as N increases. The usual proof of consistency for an estimator

does not hold for LSDV model.

10

2. Inverting (K+1)+(N-1) matrix may be impossible if N is very

large. Even when it is possible it can be inaccurate.

Eviews panel results of LSDV model Dependent Variable: LN_RCONS? Method: Pooled Least Squares Date: 09/08/08 Time: 21:11 Sample: 1990 2003 Included observations: 14 Cross-sections included: 12 Total pool (unbalanced) observations: 163

Variable Coefficient Std. Error t-Statistic Prob.

LN_RGDP? 0.683031 0.050445 13.54021 0.0000_BOTS--C 2.042464 0.502558 4.064134 0.0001

_BURKF--C 4.224775 0.718861 5.877042 0.0000_BUR--C 4.730530 0.630225 7.506101 0.0000_KEN--C 3.976501 0.681912 5.831401 0.0000

_MADAG--C 3.065602 0.508071 6.033802 0.0000_MAURIT--C 3.215030 0.580446 5.538895 0.0000

_MOR--C 1.471438 0.295430 4.980671 0.0000_NIG--C 2.076701 0.434629 4.778103 0.0000_RWA--C 2.025546 0.325644 6.220130 0.0000

_SIERL--C 4.366440 0.721471 6.052135 0.0000_RSA--C 3.813676 0.687518 5.547019 0.0000_TAN--C 2.569156 0.444138 5.784585 0.0000

R-squared 0.998770 Mean dependent var 10.40442Adjusted R-squared 0.998671 S.D. dependent var 2.939862S.E. of regression 0.107162 Akaike info criterion -1.552558Sum squared resid 1.722550 Schwarz criterion -1.305817Log likelihood 139.5335 F-statistic 10147.81Durbin-Watson stat 0.961912 Prob(F-statistic) 0.000000

Stata is not quite good in estimating the LSDV model. Eviews does a good job We need a trick to deal with these problems 4. WITHIN/Q ESTIMATOR 11

• Using the “WITHIN” estimation we can still assume individual effects, although we no

longer directly estimate them.

• We demean the data so as “wipe out the incidental parameters (individual effects) and

estimate β only.

• This means subtracting the mean for each cross-section from each observation. • Demeaning the data will not change the estimates for β. (Think of the econometric

exercise of “running a regression line through the origin”.)

• In order to wipe out the individual effects, we define a Q matrix

itQxQQy νβ +′=

Where PIQ N −=

( ) µµµµ ZZZZP 1−′=

• P is a centering matrix that averages across time for each

individual cross-section

• Consequently, pre-multiplying this regression by Q obtains

deviations from the means within each cross-section

Qyy =~ and Qxx =~

The OLS estimator is

( ) QyxQxx ′′= −1~β

( ) 1~var −′= Qxxvσβ

Let’s look at the same stuff using common parlance

12


The mean model is

••• +′++= iiii xy νβµα

Demeaning the model

( )••• +′+−+′++=− iiiititiiit xxyy νβµανβµα

( ) ( ) ( ) ( )•• −+′−′+−+−= iitiitii xx ννβµµαα

( ) ( )••• −+′−′=− iitiitiit xxyy ννβ

Where

∑=

• =T

titi y

Ty

1

1

∑=

• =T

titi x

Tx

1

1

∑=

• =T

titi T 1

1 νν

Notice that we have wiped out the individual effect coefficients since

( ) 0=−αα

( ) 0=− ii µµ

Using OLS yields the within estimator

( )( ) ( )( )∑∑∑∑= =

••

−

= =•• −−⎥⎦

⎤⎢⎣

⎡ ′−−=N

i

T

tiitiit

N

i

T

iiitiitW yyxxxxxx

1 1

1

1 1β

• There are no incidental parameters and the errors still satisfy the usual assumptions. • We can therefore use OLS on the above equation to obtain consistent estimates.

• Averaging across all observations yields

•••••• ++= νβα xy

• Individual effects can be solved (not estimated) with the assumption:

13

to avoid the dummy variable trap or perfect multicollinearity ∑=

=N

ii

10µ

and solving:

•••••• −−= xyy 21~~ ββα

21~~~ ββαµ −−−= ••• iii xy

• In other words, we can use First Order Conditions to derive individual effects.

• Note that the total individual effect is the sum of the common constant and the

constructed individual component.

14

A B A-CCountry Period Consumption mean consumption demeaned consumptionBotswana 1998 7180.3 7797.021896 -616.7Botswana 1999 7533.5 7797.021896 -263.6Botswana 2000 7841.1 7797.021896 44.1Botswana 2001 7919.2 7797.021896 122.2Botswana 2002 8085.2 7797.021896 288.2Botswana 2003 8222.9 7797.021896 425.9Burkina Faso 1998 1283027.4 1406484.816 -123457.4Burkina Faso 1999 1297642.2 1406484.816 -108842.6Burkina Faso 2000 1306400.0 1406484.816 -100084.8Burkina Faso 2001 1411715.4 1406484.816 5230.6Burkina Faso 2002 1513999.2 1406484.816 107514.4Burkina Faso 2003 1626124.7 1406484.816 219639.9Burundi 1998 462066.7 477841.1 -15774.4Burundi 1999 492055.5 477841.1 14214.4Burundi 2000 465738.0 477841.1 -12103.1Burundi 2001 469289.9 477841.1 -8551.2Burundi 2002 491701.3 477841.1 13860.2Burundi 2003 486195.2 477841.1 8354.1Kenya 1998 596883.1 626869.0426 -29986.0Kenya 1999 594332.1 626869.0426 -32537.0Kenya 2000 609862.0 626869.0426 -17007.0Kenya 2001 629103.7 626869.0426 2234.7Kenya 2002 650968.4 626869.0426 24099.3Kenya 2003 680065.0 626869.0426 53196.0Madagascar 1998 21830.2 22222.41045 -392.2Madagascar 1999 22441.8 22222.41045 219.4Madagascar 2000 22483.0 22222.41045 260.6Madagascar 2001 22443.8 22222.41045 221.4Madagascar 2002 21150.2 22222.41045 -1072.2Madagascar 2003 22985.4 22222.41045 763.0Mauritius 1998 69552.9 75384.83324 -5831.9Mauritius 1999 71594.9 75384.83324 -3789.9Mauritius 2000 73939.3 75384.83324 -1445.5Mauritius 2001 76048.7 75384.83324 663.9Mauritius 2002 78570.9 75384.83324 3186.1Mauritius 2003 82602.2 75384.83324 7217.4Morocco 1998 240.3 248.4780447 -8.1Morocco 1999 233.4 248.4780447 -15.0Morocco 2000 243.0 248.4780447 -5.5Morocco 2001 256.4 248.4780447 7.9Morocco 2002 256.1 248.4780447 7.6Morocco 2003 261.7 248.4780447 13.2Nigeria 1998 3307.9 3028.129497 279.7Nigeria 1999 2255.7 3028.129497 -772.4Nigeria 2000 2446.5 3028.129497 -581.6Nigeria 2001 3068.0 3028.129497 39.9Nigeria 2002 3665.8 3028.129497 637.6Nigeria 2003 3424.9 3028.129497 396.7Rwanda 1998 588.0 668.6450459 -80.7Rwanda 1999 595.6 668.6450459 -73.1Rwanda 2000 641.9 668.6450459 -26.7Rwanda 2001 676.1 668.6450459 7.5Rwanda 2002 740.4 668.6450459 71.8Rwanda 2003 769.9 668.6450459 101.2Sierra Leone 1998 1180237.6 1254557.641 -74320.1Sierra Leone 1999 1032168.8 1254557.641 -222388.8Sierra Leone 2000 1142680.0 1254557.641 -111877.6Sierra Leone 2001 1369830.5 1254557.641 115272.9Sierra Leone 2002 1547871.3 1254557.641 293313.6Sierra Leone 2003 1613277.6 1254557.641 358720.0South Africa 1998 516925.9 566165.8325 -49239.9South Africa 1999 531213.0 566165.8325 -34952.8South Africa 2000 556652.0 566165.8325 -9513.8South Africa 2001 579316.4 566165.8325 13150.5South Africa 2002 598804.9 566165.8325 32639.1South Africa 2003 614082.8 566165.8325 47917.0Tanzania 1998 5610.4 6550.223129 -939.8Tanzania 1999 6003.2 6550.223129 -547.0Tanzania 2000 6069.6 6550.223129 -480.6Tanzania 2001 6579.9 6550.223129 29.7Tanzania 2002 7064.1 6550.223129 513.9Tanzania 2003 7974.2 6550.223129 1423.9

15

Within estimation results from Eviews Dependent Variable: LN_RCONS? Method: Pooled Least Squares Date: 09/08/08 Time: 21:16 Sample: 1990 2003 Included observations: 14 Cross-sections included: 12 Total pool (unbalanced) observations: 163


C 3.082438 0.540824 5.699521 0.0000LN_RGDP? 0.683031 0.050445 13.54021 0.0000

Fixed Effects (Cross) _BOTS--C -1.039974

_BURKF--C 1.142337 _BUR--C 1.648092 _KEN--C 0.894063

_MADAG--C -0.016836 _MAURIT--C 0.132592

_MOR--C -1.611000 _NIG--C -1.005736 _RWA--C -1.056892

_SIERL--C 1.284002 _RSA--C 0.731238 _TAN--C -0.513282

Effects Specification

Cross-section fixed (dummy variables)


• The fixed effects have not been computed but simply recovered

• This can be seen from the lack of standard errors as opposed to

the LSDV results

• The interpretation of the country-specific fixed effects is as

follows 16

• For those countries with positive values, it means that there are

some unobservable factors which tend to enhance consumption

• For those countries with negative country-specific fixed effects,

there are unobservable characteristics that hinder the consumption Look at stata within results xtreg ln_cons ln_rgdp, fe i(country) Fixed-effects (within) regression Number of obs = 154 Group variable (i): country Number of groups = 11 R-sq: within = 0.5644 Obs per group: min = 14 between = 0.9903 avg = 14.0 overall = 0.9888 max = 14 F(1,142) = 183.96 corr(u_i, Xb) = 0.9591 Prob > F = 0.0000 ------------------------------------------------------------------------------ ln_cons | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- ln_rgdp | .6576598 .0484889 13.56 0.000 .5618063 .7535133 _cons | 3.255495 .5148909 6.32 0.000 2.237653 4.273337 -------------+---------------------------------------------------------------- sigma_u | 1.0877784 sigma_e | .10204792 rho | .99127587 (fraction of variance due to u_i) ------------------------------------------------------------------------------ F test that all u_i=0: F(10, 142) = 127.58 Prob > F = 0.0000

Properties of the WITHIN Estimators

• The slope coefficients are consistent if N or T become large

• The fixed effects are only consistent if T is large

• The number of degrees of freedom must be adjusted. The degrees of

freedom k=NT-N-K. Please note that the usual OLS

programs(software) not designed for panel data assume that the

degrees of freedom k=NT-K, which is wrong!!! Their standard

errors, test statistics and p-values must be corrected as follows. Let’s

use the following notation KNTku −= (unadjusted degrees of freedom)

and (adjusted degrees of freedom) Nkk ua −=

17

( ) )ˆ(ˆ ββ ua

ua se

kkse =

( ) )ˆ(ˆ ββ uu

aa t

kkt =

Which is distributed under the null hypothesis avt

Note that “a” denotes adjusted and “u” denotes unadjusted

• The parameter estimates from the LSDV are the same as from the

WITHIN regression model.

• Please note that this is not a general result since incidental paremeters

do cause inconsistencies in may applied models.

Important disadvantage with the WITHIN method:

• Demeaning the data means that X-regressors which are dummy

variables cannot be used.

• For example sex, race, religion, etc.

• Thus we would be able to say nothing about the relationship

between the dependent variable and the time-invariant

characteristics using this estimator.

5. RANDOM EFECTS MODEL (REM)

• The benefit of REM approach is that you concede variation across

the cross-sections, but don’t estimate “N-1” of these variations. 18

• However, in this approach you introduce a more complicated

variance structure and OLS is no longer appropriate.

• This method is best suited to “random” draws from a large

population ex. household surveys which claim to be

“representative”. (N is usually quite large.)

• The problems of too many parameters with LSDV model and

“sweeping away” the time-invariant regressors “can be avoided”

if µi are assumed random, i.e. drawn from a given distribution.

),0(~ 2µσµ IIDi

),0(~ 2vit IIDv σ

and µi are independent of vit.

• In other words we are assuming that the individual effects have an

empirical distribution function.

0

2 0

4 0

6 0

8 0

1 0 0

1 2 0

-2 .5 0 -1 .2 5 0 .0 0 1 .2 5 2 .5 0 µˆ

Freq

uenc

y

Which has certain characteristics.

19

∑=

==N

iiN

average1

1 µµα

=2µσ Variance of µ

• We can use these definitions to write the panel data model in form

of REM

( ) itiitit xy ναµβα +−+′+=

The new error term is ( ) itiitu ναµ +−=

We can then rewrite the REM model as;

ititit uxy +′+= βα

This is almost like the pooled model, except for the following;

• The constant term can be interpreted as the average individual

effects

• The error term has a special complicated form

• We can estimate the REM model using OLS to obtain estimates of

α and β .

• These estimates will only be consistent if the following conditions

hold; o ( ) ( ) ( ) 0=+−= itiit EEuE ναµ

o ( ) ( ) ( ) 0,,, =+= itititiitit xCovxCvxuCov νµ , i.e. no correlation between

individual effects and regressors

20

5.1 Efficiency in the REM

For REM to be efficient, two conditions must fulfilled;

• Homoscedasticity : for all i and t. Here we assume

that

22)( νµ σσ +=ituVar

iµ and are independent itu

• Serial independence in the error term , ( ) itiitu ναµ +−=

( ) 22, νµ σσ +=jsit uuCov if ji = and ts = (Same cross-section, same

year)

( ) 0, =jsit uuCov if ji ≠ and ts = If individuals are

independent)

( ) 0, 2 ≠= µσjsit uuCov if ji = and ts ≠ (Same cross-section,

different year)

The last condition violates the serial independence assumption.

OLS is thus inefficient in a REM and thus yields incorrect standard

errors and tests.

5.2 FGLS ESTIMATOR

ititit uxy +′+= βα

• The FGLS estimator for the REM can be implemented by OLS

regression of the transformed equation as follows

1. Define 1

1ˆσσθ ν−= where 222

1 νµ σσσ += T

2. Calculate “pseudo within differences”

•−= iitit yyy θ* , •−= iitit xxx θ*

3. Perform an OLS regression

21

****ititit uxy ++= βα

Where ( )αθα ˆ1* −= and ( ) ( )iitiitu εθεαθ ˆˆ1* −+−=

4. The REM estimate of β is given by;

( )( )

( )∑∑

∑∑

= =•

= =••

−

−−= N

i

T

tiit

N

i

T

tiitiit

re

xx

yyxx

1 1

2**

1 1

****

β

The estimator of the intercept can be shown to equal

xy rere βµ ˆ−= (Greene, 2003)

5.2.1 The Crucial Problem: We do not know θ

• The unfortunate thing is that we do not know and 2µσ

2νσ

• If the errors ,itu itν and iµ were known, we could estimate the

variances easily as follows;

1. ∑=

•=N

iiu

NT

1

21σ

2. ( )2

1 1

2

)1(1ˆ ∑∑

= =•−

−=

N

i

T

tiit uu

TNνσ

( )2

1 1)1(1 ∑∑

= =•−

−=

N

i

T

tiitTN

νν

3. ( )2

1 1

2

)1(1ˆ ∑∑

= =

−−

=N

i

T

tiN

µµσµ

• Since ,itu itν and iµ are unknown, there are a number of suggestions

on how they can be estimated.

22

• These methods use various residuals instead of unknown

parameters.

Possible Residuals

1. =REM residuals from the pooled regression olsu ititit uxy +′+= βα ,

number of observations is NT

2. =REM residuals from the bu between regression

••• +′+= iii uxy βα , number of observations is N.

3. wν =FE residuals from the Within regression ititit xy νβ ~~~ +′= ,

number of observations is NT.

4. =REM residuals from the wu Within regression. This can be

computed as ( )www µµν ˆˆˆ −+ .

5. =REM residuals from the regression , number of

observations is NT.

reu ****ititit uxy ++= βα

1. Wallace and Hussein (1969)

Use (unbiased and consistent but not efficient) instead of u in 4

and 5

olsu

2. Swamy and Arora (1972)

Use in 4 and bu wν in 5. This is the approach used in Eviews

econometric software when you estimate a REM. It is also the

default in stata when you use re option.

3. Amemiya(1971)

Use in 4 and wu wν in 5

23

4. Nerlove (1971)

Use wµ in 6 and wν in 5

5. Wansbeek-Kapteyn (1989) for incomplete panels

Some Comments

• There is no much difference between the models when the REM

specification is correct

• Only Nerlove (1971) guarantees that . Many users of the

other methods set

02 >µσ

1=θ (fixed effects model) if a negative value of

is found. 2µσ

• There are no general rules as to which method to use. The most

common is the Swammy-Arora(also used in Eviews)

• The REM estimates are more efficient when REM specification is

correct. They are inconsistent when the model is incorrect

• It is important to test which model is correct xtreg ln_cons ln_rgdp, re i(country) Random-effects GLS regression Number of obs = 154 Group variable (i): country Number of groups = 11 R-sq: within = 0.5644 Obs per group: min = 14 between = 0.9903 avg = 14.0 overall = 0.9888 max = 14 Random effects u_i ~ Gaussian Wald chi2(1) = 879.56 corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ ln_cons | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- ln_rgdp | .8916337 .0300645 29.66 0.000 .8327083 .950559 _cons | .7713074 .3363147 2.29 0.022 .1121428 1.430472 -------------+---------------------------------------------------------------- sigma_u | .31717419 sigma_e | .10204792 rho | .90619338 (fraction of variance due to u_i) ------------------------------------------------------------------------------

24

5.3 BETWEEN ESTIMATOR

• The between estimator uses just the cross-sectional variation.

• For instance for a model ititiit xy νβα +′+=

• We could average all the years to yield iiii xy νβα +′+=

• This can be rewritten as a between model

( )iiii xy νααβα +−+′+=

Where ∑=

−=T

titi yTy

1

1 , ∑=

−=T

titi T

1

1 εε and ∑=

−=T

titi xTx

1

1

• The between estimator is the OLS estimator for regression of

iy on time averaged regressors

• The concern is the difference between different individuals

(i.e. “between estimator”) and is the analogue of cross-section

regression which is a special case T=1

• For instance for our consumption example, the data for

consumption will be as follows Country Mean real consumptionBotswana 6926.8Burkina Faso 1166573.8Burundi 545623.9Kenya 547946.8Madagascar 20678.3Mauritius 64759.6Morocco 234.0Nigeria 2878.3Rwanda 618.8Sierra Leone 1376062.0South Africa 500589.7Tanzania 5386.2

Do the same for gdp • The between estimator is consistent if the regressors ix are

independent of the composite error ( )ii ναα +−

25

• This will be case for the constant-coefficients model and the

REM model.

• For the fixed effects model, the between estimator is

inconsistent as iα is assumed to be correlated with and

hence

ix

ix

Between regression (regression on group means) Number of obs = 154 Group variable (i): country Number of groups = 11 R-sq: within = 0.5644 Obs per group: min = 14 between = 0.9903 avg = 14.0 overall = 0.9888 max = 14 F(1,9) = 920.77 sd(u_i + avg(e_i.))= .3183446 Prob > F = 0.0000 ------------------------------------------------------------------------------ ln_cons | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- ln_rgdp | .9996306 .0329431 30.34 0.000 .9251081 1.074153 _cons | -.375336 .3627005 -1.03 0.328 -1.195821 .4451495 ------------------------------------------------------------------------------

5.3.1 Comments

• The REM estimator of the slope parameters converge to the

within estimator as

reβ

∞→T and as 1→θ

• We can show that the GLS estimator is a weighted average of

the within and between estimator

bewithinre ww βββ ˆˆ11 +=

5.4 FEM vs. REM MODEL

• The FEM vs REM is an issue that has generated a hot debate in

panel econometrics literature

26

(1) Traditional criterion: Is iµ viewed as a random variable or as

parameter to be estimated?

iµ is “random effect”- when the individual effects are randomly

distributed across the cross-sections

iµ is “fixed effect”- when it is treated as a parameter to estimated for

each cross-section observation i.

(2) Modern panel data econometrics-The key issue here is whether iµ is

correlated with the regressors or not.

iµ is random effect-When there is zero correlation between the

observed explanatory variables and the iµ i.e. 0),( =iitXCov µ

iµ is fixed effect-when there is correlation between the observed

explanatory variables and iµ .

In other words, we allow for arbitrary correlation between the

unobserved effects iµ and the observed explanatory variables

5.4. 1 Individual Specific Variables (Time-invariant regressors)

• Many times when we conduct research we have exogenous

variables that vary between individuals but which do not vary over

time within a given individual (e.g.gender, race, language,

nationality etc.).

27

• These are called time-invariant regressors

Which is the best model to deal with such exogenous variables?

To understand this let’s denote and individual specific variable as is

In a FEM, we will write it as follows;

ititiiit xsy νβλµα +′+++=

Let’s look at the two FEM approachs: LSDV and WITHIN estimation

a) LSDV: The LSDV will not be able to estimate it because of perfect

multicollinearity between the individual-specific effects, iµ , and the

individual specific variables . The reason is because in both cases

dummy variables are used

is

b) WITHIN: Under this approach the term ( )ii sλµα ++ does not vary

over time and will thus be removed by within transformation

(demeaning process). ( ) ( )••• −+−=− iitiitiit xxyy ννβ 1

This means that the parameters of the individual specific variables

cannot be estimated using the FEM. This means that we cannot

distinguish between observed and unobserved heterogeneity, a feature

that may be important for policy.

28

In REM we will write the model as follows

ititiit uxsy +′++= βλα

In this case λ can easily be estimated, although not when using the

Nerlove method. It is however, important to note that for the REM to

be appropriate, the observed heterogeneity, , must be independent of

the unobserved heterogeneity,

is

iµ .

The REM thus offers an added advantage by allowing us to estimate

parameters for time-invariant regressors which may be of policy

relevance.

Note however, that in the context of the REM, we cannot

interpret the coefficients for the unobserved heterogeneity!!.

Look at eviews output

Dependent Variable: LN_RCONS?

Method: Pooled EGLS (Cross-section random effects)

Date: 09/08/08 Time: 22:56

Sample: 1990 2003

Included observations: 14

Cross-sections included: 12

Total pool (unbalanced) observations: 163

Swamy and Arora estimator of component variances


C 1.019244 0.371701 2.742110 0.0068

LN_RGDP? 0.879092 0.032618 26.95085 0.0000

Random Effects (Cross)

29

_BOTS--C -0.922566

_BURKF--C 0.411866

_BUR--C 1.256634

_KEN--C 0.307806

_MADAG--C 0.074454

_MAURIT--C -0.057190

_MOR--C -0.687420

_NIG--C -0.625203

_RWA--C -0.253274

_SIERL--C 0.542767

_RSA--C 0.124030

_TAN--C -0.171903


S.D. Rho

Cross-section random 0.419148 0.9386

Idiosyncratic random 0.107162 0.0614

Weighted Statistics

R-squared 0.812596 Mean dependent var 0.721526

Adjusted R-squared 0.811432 S.D. dependent var 0.266552

S.E. of regression 0.115749 Sum squared resid 2.157038

F-statistic 698.1075 Durbin-Watson stat 0.954621

Prob(F-statistic) 0.000000

Unweighted Statistics

R-squared 0.964211 Mean dependent var 10.40442

Sum squared resid 50.10865 Durbin-Watson stat 0.041094

We cannot interpret the coefficients of the random effects

coefficients because they are randomly distributed across the

cross-sections

Question:

Suppose you have a one-way error components model of the form


Prove that the sum of iµ is equal to zero.

30

Solution

The question is equivalent to 01

=∑=

N

iiµ

Use the fact that ααµ −= ii

( ) 01111

=−=−=−= ∑∑∑∑====

ααααααµ NNN

i

N

ii

N

ii

N

ii

References

Baltagi, B.H. (2005), Econometric Analysis of Panel Data, 3rd Edition, John Wiley Chapter 2 Cameron,A.C. and Trivedi, P.K. (2006), Microeconometrics:

Methods and Applications, Cambridge University Press. Chapter

Chapter 21


Greene W H. Econometric Analysis, Second Edition, Macmillan,

2003. Chapter 13

Wooldridge, J. M. (2002), Econometric Analysis of Cross-Section

and Panel Data, MIT Press: Cambridge, Massachusets. Chapter

10

31

Stata 9 reference on Longitudinal/panel data

32

COLLABORATIVE MASTERS PROGRAMME

IN ECONOMICS FOR ANGLOPHONE AFRICA

(CMAP)

JOINT FACILITY FOR ELECTIVES 2008


DR. MOSES SICHEI*



LECTURE 12: HYPOTHESIS TESTING AND TWO-WAY

ERROR COMPONENT MODELS

Objectives: The main objective of the lecture is to understand some basic panel data models in the context of one-way error components. Key words

• Breusch-Pagan LM test

• Chow test

• Hausman test

• Generalised least squares

• Least squares dummy variable (LSDV)

• Panel data

• Pooled data

• Two-way error components model

• Random effects model (REM)

• Within estimation

1. INTRODUCTION

• Hypothesis testing is central to statistical inference in

econometrics.

• In econometrics, we distinguish 3 types of tests;

o Parameter test: do the parameters have specified values e.g.

are a parameter significant?

o Specification tests: Is the model correct? e.g. pooled, REM,

FEM or SUR 1

o Misspecification test: Are any of the statistical assumptions

violated? E.g. homoscedasticity or serial independence

1.1 Parameter tests

• The principles used in ordinary regression can be applied in panel

data models

• The usual way to test the parameters is to use the t-test (one

parameter) or F-test for several parameters.

2. SPECIFICATION TESTS

• We have considered the FEM, REM, SUR and pooled model in

lectures 10 and 11.

Pooled: ititit xy εβα +′+=

FEM: ititit xy εβα ~~~ +′+= i.e. transformation with Q (demeaning)

REM: i.e transformation with ititit uxy *** ++= βα22

1ˆvT σσ

σθµ

ν

+−=

SUR: itiitiit xy εβα +′+= i.e. allowing for contemporaneous correlation in error

terms

• The error terms in these models satisfy standard OLS assumption

if the respective model is correct

The specification tests are shown below;

2

Pooled

REM

FEM

Cho

w te

st

Bre

usch

-Pag

an

LM te

stH

ausm

ante

st

2.1 The Chow Test of the Pooled Model against the FEM

:0H the pooled (restricted) model is correct

The FE (unrestricted) model is correct :AH

• The URSS is calculated using the residuals from within regression

model i.e. ititit xy εβα ~~~ +′+= .

• The number of parameters is KN +

• The RRSS is computed from the pooled regression model

ititit xy εβα +′+=

• The number of parameters is 1+K

• The number of observations in both models are NT

The Chow test sum of squares test is thus;

)(),1(~0

)/()1/()(

KNNTN

H

FKNNTURSS

NURSSRRSSF −−−−−−−

=

3

• This test is called the Chow test because of the similarity to the

well known chow test for parameter stability.

2.2 Pooled or REM?: Breusch-Pagan (1980) LM Test

• Breusch and Pagan (1980) derived Lagrange multiplier tests for

the presence of individual specific random effects against the null

hypothesis assumption of iid errors

• The REM reduces to the pooled model if the variance of the

individual effects become zero i.e. 0→µσ so that 01ˆ22=

+−=

vT σσσθµ

ν

• The LM test uses this notion and tests the following hypothesis; 0: 2

0 =µσH

0: 2 >µσAH

• The Breusch-Pagan LM statistic is calculated using OLS residuals

from the pooled model

21

1 1

2

1

22

~1)1(2

χν

ν

⎟⎟⎟⎟

⎠

⎞

⎜⎜⎜⎜

⎝

⎛

−−

=

∑∑

∑

= =

=•

N

i

T

tit

N

iiT

TNTLM under 0H

• Unfortunately, the B-P test is two-sided test against the alternative

despite of the fact that we know that the variance cannot

be negative

0: 2 ≠µσAH

xttest0 Breusch and Pagan Lagrangian multiplier test for random effects: ln_cons[country,t] = Xb + u[country] + e[country,t] Estimated results: | Var sd = sqrt(Var) ---------+----------------------------- ln_cons | 8.644141 2.940092 e | .0104138 .1020479

4

u | .1005995 .3171742 Test: Var(u) = 0 chi2(1) = 731.09 Prob > chi2 = 0.0000

• The above BP LM test is based on Breusch and Pagan (1980) and

as modified by Baltagi and Li (1990) to deal with

unbalanced/incomplete panels

• We reject the null of implying that REM is better than a

pooled model

0: 20 =µσH

• It is important to note that LM tests generally have low power.

Experiments have shown that it is better to use the Chow test for

FE against pooled even if we suspect that RE is the correct

alternative model.

Eviews does not do Breusch pagan test directly

2.3 Testing the Joint Validity of Fixed Effects

0...: 1210 ==== −NH µµµ (no individual effects; same intercept for

all cross sections) 0...: 121 ≠≠≠≠ −NAH µµµ

• We test the null hypothesis of no individual effects with an applied

Chow or F-test, combining the Residual Sum of Squares for the

regression both with constraints (under the null) and without

(under alternative):

5

RRSS − OLS on pooled model (constant intercept)

URSS −From LSDV model or Within estimation

)(),1(~0

)/()1/()(

KNNTN

H

FKNNTURSS

NURSSRRSSF −−−−−−−

=

If N is “large enough”, can use “within” estimation instead of LSDV for

the residual sum of squares. Eviews 5 test Redundant Fixed Effects Tests Pool: POOL1 Test cross-section fixed effects

Effects Test Statistic d.f. Prob.

Cross-section F 193.077198 (11,150) 0.0000Cross-section Chi-square 443.130831 11 0.0000

We reject the null of 0...: 1210 ==== −NH µµµ Stata produces the test of fixed effects as default output from the fixed effects command as follows . xtreg ln_cons ln_rgdp,fe i(country) Fixed-effects (within) regression Number of obs = 154 Group variable (i): country Number of groups = 11 R-sq: within = 0.5644 Obs per group: min = 14 between = 0.9903 avg = 14.0 overall = 0.9888 max = 14 F(1,142) = 183.96 corr(u_i, Xb) = 0.9591 Prob > F = 0.0000 ------------------------------------------------------------------------------ ln_cons | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- ln_rgdp | .6576598 .0484889 13.56 0.000 .5618063 .7535133 _cons | 3.255495 .5148909 6.32 0.000 2.237653 4.273337 -------------+---------------------------------------------------------------- sigma_u | 1.0877784 sigma_e | .10204792

6

rho | .99127587 (fraction of variance due to u_i) ------------------------------------------------------------------------------ F test that all u_i=0: F(10, 142) = 127.58 Prob > F = 0.0000

• We reject the null that fixed effects are redundant just as we found

using eviews software

2.4 The Hausman Specification Test for the FE against the REM

• The Hausman test is a general test procedure, which is used when

we want to test the validity of an assumption that is necessary for

efficient estimation.

• In the panel framework the test checks for the following;

( ) 0,:0 =iti xCorH µ i.e. REM is correct

( ) 0,:0 ≠iti xCorH µ i.e. FEM is correct

The Hausman test is calculated in a matrix form or omitted variable

form.

• We begin by assuming that the true model is the random effects

model with individual effects uncorrelated with regressors and

error term

• The estimator is fully efficient reβ

• The hausman test statistic is

( ) ( ) ( )[ ] ( )wrerewwre VVH ββββββ ˆˆˆˆˆˆˆˆ 1−−

′−=

−

:0H i.e. the correct model is a REM 0ˆˆ =− wre ββ

:AH i.e. the correct model is a FEM 0ˆˆ ≠− wre ββ

The test statistic is asymptotically chi-square distributed under the

null hypothesis

7

Correlated Random Effects - Hausman Test Pool: POOL1 Test cross-section random effects

Test Summary Chi-Sq.Statistic Chi-Sq. d.f. Prob.

Cross-section random 25.960427 1 0.0000

Cross-section random effects test comparisons:

Variable Fixed Random Var(Diff.) Prob.

LN_RGDP? 0.683031 0.879092 0.001481 0.0000

We reject the null that the FE and RE models are exactly the same

In stata, we have to run the FEM and REM and store the values

and use the hausman test (see table 4.3 in Baltagi,2005:71)

2.5 Testing for heteroscedasticity

• Estimating heteroscedastic errors with the assumption of

homoscedasticity will yield consistent estimates but they will not

be efficient.

• The standard errors will be biased and one should compute robust

standard errors correcting for the possible presence of

heteroscedasticity.

• From 222νµ σσσ +=i

• Which component on the right hand side is heteroscedastic?

• There are two different situations that can arise;

8

o Variance of the individual effects iµ is heteroscedastic as

suggested by Mazodier and Trognon (1978) in the case of

REM

o Variance of itν is heteroscedastic as suggested by Baltagi and

Griffin (1988)

• There is quite a lot of details regarding this in Baltagi 2005 chapter

5

• Testing for heteroscedasticity in one-way error component is

complicated especially when iµ is heteroscedastic

Verbon (1980) derived a LM test for the null of

homoscedasticity against heteroscedastic alternative

( )2,0~ii µσµ and ( )2,0~

ititv σ

Li and Stengos (1994) suggested a motified Breusch-Pagan

test for significance of random individual effects:

which is robust to heteroscedasticity of unknown form in the

remainder error term

0: 20 =µσH

• An easy test could be done as follows;

σσ =iH :0 for all i i.e. homoscedasticity

σσ ≠iAH : for all i i.e. heteroscedasticity

Where ( )eeN

′=1ˆ 2σ -the pooled rss

( iii eeN

′=1ˆ 2σ )-cross-section specific residual vectors

9

The LM test is as follows (2

1

2

12

2

~1ˆˆ

2 −=∑ ⎥

⎦

⎤⎢⎣

⎡−= N

N

i

iTLM χσσ

) . This is a one tail test

Notice that this is an imperfect test since 222ˆ νµ σσσ +=i

2.6 Testing for serial correlation

• Ignoring serial correlation when it is present results in consistent

but inefficient estimates of the regression coefficient and biased

standard errors

• There are two types of correlation here

Effects-drive serial correlation: classical one-way error

component model, itiit νµε += , assumes that the only

correlation over time is due to the presence of the same

individual across the panel.

The correlation between itε and jsε is;

( )⎪⎩

⎪⎨

⎧

≠=+

==== stji

stjiforcor

v

jsit22

2,1

,σσ

σεερµ

µ

• It is the same no matter how far s is from t. this is

quite restrictive for economic relationships like

investment, where unobserved shock this period will

affect the behavioural relationship for at least the next

few periods

Traditional time/distance driven serial correlation, which is

ignored above

10

• There are different ways to model this serial correlation

o AR processes e.g. AR(1) ititit vv ηρ += −1 , AR(2) :

itititit vvv ηρρ ++= −− 2211

o Moving average process e.g.MA(1) 1−+= ititit vv λη

• One could compute first-order within individual autocorrelation

coefficient from the within regression estimation

∑∑

∑∑

= =

= =−

= N

i

T

tit

N

i

T

titit

r

1 2

2

1 21

ˆ

ˆˆ

ε

εε

• The simplest test is the LM test due to Breusch and Godfrey

)1,0(~1

2

NrTNTLM−

= under 0H

• We can also use the Durbin-Watson in panel data

2.7 Robust standard errors

• If we discover (or suspect) heteroscedasticity or serial correlation,

we must decide what to do

• One approach is to try and model the variances and or correlation.

For instance we could estimate the regression assuming that the

disturbance term is first-order autoregressive.

• This can be done in stata using the xtregar command

itiitit xy εµβα ++′+=

Where titit ηρεε += −1

11

• An alternative approach is to accept the usual estimates, but to

calculate their robust standard errors

• If we only suspect heteroscedasticity, we can use the White’s

robust standard errors

• If we suspect heteroscedasticity and /or within individual

autocorrelation we can use Arrellano’(1987)s robust standard

errors

• The white’s method is standard in most econometric software

• Arrelano’s method is available in stata

3. TWO-WAY ERROR COMPONENT MODEL

• In a one-way error component model, we assume that there exist

unobserved individual heterogeneity but that the model is

homogenous over time.

• Under the two-way error component model, the error term is

decomposed into two key components

• ittiit νλµαε +++=

• Where iµ denotes the unobserved individual effects discussed

earlier, tλ denotes the unobserved time effect and itν is the

remainder stochastic disturbance term

• Note that tλ is individual-invariant and it accounts for any time-

specific effect that is not included in the regression

• Note also that iµ is time-invariant

12

itx

ity

i=1,t=1

i=2,t=2

i=3,t=2ititiit xy εβα +′+=

i=1,t=2

i=2,t=1

i=3,t=1

• Our model now becomes

itittiit xy νβλµα +′+++=

Where 0,011

== ∑∑==

T

tt

N

ii λµ

The individual/time effects can be defined as follows;

tiit λµαα ++=

∑∑= =

•• ≡=N

i

T

titNT 1 1

1 ααα i.e. the average effect

∑=

• ≡=+T

titii T 1

1 ααµα i.e. the individual effect. This is time-invariant

∑=

• ≡=+N

iittt N 1

1 ααλα i.e. the time effect. This is cross-section

invariant.

13

• Note that some software report the individual effects as •iα while

others report as iµ . Notice also that 0=−−− •••• αααα tiit

• In a two way error component models both the individual and time effects can be fixed or random e.g.

o FF λµ , i.e.both are fixed effects o FR λµ , or RF λµ , i.e.mixed FE/RE model o RR λµ , both are RE

3.1. FULLY FIXED EFFECTS MODEL ( FF λµ , )

1. ESTIMATING FIXED EFFECTS

• We extend the one-way model assumptions, iµ and tλ are fixed parameters to be estimated and ( )2,0~ vit IID σν

LSDV Estimation with LSDV requires the estimation of (N-1)+(T-1) dummies.

itittiit xy νβλµα +′+++=

14

Country Period 1998 1999 2000 2001 2002 2003

Botswana 1998 1 0 0 0 0 0Botswana 1999 0 1 0 0 0 0Botswana 2000 0 0 1 0 0 0Botswana 2001 0 0 0 1 0 0Botswana 2002 0 0 0 0 1 0Botswana 2003 0 0 0 0 0 0Burkina Faso 1998 1 0 0 0 0 0Burkina Faso 1999 0 1 0 0 0 0Burkina Faso 2000 0 0 1 0 0 0Burkina Faso 2001 0 0 0 1 0 0Burkina Faso 2002 0 0 0 0 1 0Burkina Faso 2003 0 0 0 0 0 0Burundi 1998 1 0 0 0 0 0Burundi 1999 0 1 0 0 0 0Burundi 2000 0 0 1 0 0 0Burundi 2001 0 0 0 1 0 0Burundi 2002 0 0 0 0 1 0Burundi 2003 0 0 0 0 0 0Kenya 1998 1 0 0 0 0 0Kenya 1999 0 1 0 0 0 0Kenya 2000 0 0 1 0 0 0Kenya 2001 0 0 0 1 0 0Kenya 2002 0 0 0 0 1 0Kenya 2003 0 0 0 0 0 0Madagascar 1998 1 0 0 0 0 0Madagascar 1999 0 1 0 0 0 0Madagascar 2000 0 0 1 0 0 0Madagascar 2001 0 0 0 1 0 0Madagascar 2002 0 0 0 0 1 0Madagascar 2003 0 0 0 0 0 0Mauritius 1998 1 0 0 0 0 0Mauritius 1999 0 1 0 0 0 0Mauritius 2000 0 0 1 0 0 0Mauritius 2001 0 0 0 1 0 0Mauritius 2002 0 0 0 0 1 0Mauritius 2003 0 0 0 0 0 0Morocco 1998 1 0 0 0 0 0Morocco 1999 0 1 0 0 0 0Morocco 2000 0 0 1 0 0 0Morocco 2001 0 0 0 1 0 0Morocco 2002 0 0 0 0 1 0Morocco 2003 0 0 0 0 0 0Nigeria 1998 1 0 0 0 0 0Nigeria 1999 0 1 0 0 0 0Nigeria 2000 0 0 1 0 0 0Nigeria 2001 0 0 0 1 0 0Nigeria 2002 0 0 0 0 1 0Nigeria 2003 0 0 0 0 0 0Rwanda 1998 1 0 0 0 0 0Rwanda 1999 0 1 0 0 0 0Rwanda 2000 0 0 1 0 0 0Rwanda 2001 0 0 0 1 0 0Rwanda 2002 0 0 0 0 1 0Rwanda 2003 0 0 0 0 0 0Sierra Leone 1998 1 0 0 0 0 0Sierra Leone 1999 0 1 0 0 0 0Sierra Leone 2000 0 0 1 0 0 0Sierra Leone 2001 0 0 0 1 0 0Sierra Leone 2002 0 0 0 0 1 0Sierra Leone 2003 0 0 0 0 0 0South Africa 1998 1 0 0 0 0 0South Africa 1999 0 1 0 0 0 0South Africa 2000 0 0 1 0 0 0South Africa 2001 0 0 0 1 0 0South Africa 2002 0 0 0 0 1 0South Africa 2003 0 0 0 0 0 0Tanzania 1998 1 0 0 0 0 0Tanzania 1999 0 1 0 0 0 0Tanzania 2000 0 0 1 0 0 0Tanzania 2001 0 0 0 1 0 0Tanzania 2002 0 0 0 0 1 0Tanzania 2003 0 0 0 0 0 0

15

This can introduce a rather severe loss of degrees of freedom! WITHIN

• Once again to avoid this problem we can perform “WITHIN” transformation (similar to one-way model.)

• Now, however, we must demean across both dimensions. •••• +−−= yyyyy tiitit

~~ •••• +−−= xxxxx tiitit

~~ •••• +−−= vvvvv tiitit

~~ with

NTy

y itTt

Ni∑ ==

••

Σ= 11 ,

NTx

x itTt

Ni∑ ==

••

Σ= 11 ,

NTv

v itTt

Ni∑ ==

••

Σ= 11

• The two-way within model can thus be written as

ititit xy νβ~~~~~~ +′=

• We now need two constraints to capture individual and time

effects:

0

0

=∑

=∑

tt

ii

λ

µ

We can recover the average, individual and time effects as follows;

ttt

iii

xy

xy

xy

••

••

••••

−=

−=

−=

βλ

βµ

βα

ˆ~ˆ~ˆ~

Consistency

• α~ and are consistent as either N or T tend to infinity β

• iµ~ is only T-consistent • tλ

~ is only N-consistent

16

• The two way error components within transformation removes both observed and unobserved heterogeneity for both individual and time effects

Two-way error components FEM in Eviews Dependent Variable: LN_RCONS? Method: Pooled Least Squares Date: 09/09/08 Time: 15:39 Sample: 1990 2003 Included observations: 14 Cross-sections included: 12 Total pool (unbalanced) observations: 163


C 5.063740 0.633160 7.997570 0.0000LN_RGDP? 0.498205 0.059060 8.435613 0.0000

Fixed Effects (Cross) _BOTS--C -1.244337

_BURKF--C 1.731393 _BUR--C 1.943289 _KEN--C 1.347627

_MADAG--C -0.200967 _MAURIT--C 0.214007

_MOR--C -2.576373 _NIG--C -1.459456 _RWA--C -1.911089

_SIERL--C 1.882629 _RSA--C 1.205361 _TAN--C -0.932085

Fixed Effects (Period) 1990--C -0.003013 1991--C 0.023721 1992--C 0.031312 1993--C 0.025417 1994--C 0.019385 1995--C 0.000829 1996--C 0.024241 1997--C 0.055336 1998--C 0.082757 1999--C 0.057133 2000--C 0.064770 2001--C 0.120360 2002--C 0.171388 2003--C 0.186043


17

Cross-section fixed (dummy variables) Period fixed (dummy variables)


3.2 FULLY RANDOM EFFECTS MODEL ( RR λµ , )

• This model can be written as itittiit xy νβλµα +′+++=

• Where λµ, and x are independent • OLS will be consistent but inefficient • The efficient estimate is to use FGLS by regressing and

where **y **y

•••• +−−= yyyyy tiitit 321

* θθθ •••• +−−= xxxxx tiitit 321

* θθθ •••• +−−= vvvvv tiitit 321

* θθθ Where

11 1

σσθ v−= where 222

1 vT σσσ µ +=

22 1

σσθ v−= where 222

2 vN σσσ λ +=

13

213 −++=σσθθθ v where 22

221

23 vσσσσ −+=

3.2.1 Problem of estimating θ

• The problem is that we do not know θ • Similar alternatives as in one-way error components can be

applied

18

o Wallace which uses OLS residuals o Amemiya which uses within estimation residuals o Swammy-Arora which uses between individual and

between time residuals and within residuals Dependent Variable: LN_RCONS? Method: Pooled EGLS (Two-way random effects) Date: 09/09/08 Time: 15:44 Sample (adjusted): 1990 1998 Included observations: 9 after adjustments Cross-sections included: 12 Total pool (balanced) observations: 108 Swamy and Arora estimator of component variances


C 0.986764 0.489575 2.015554 0.0464LN_RGDP? 0.881646 0.043623 20.21071 0.0000


_BOTS--C -0.877294 _BURKF--C 0.408867

_BUR--C 1.259177 _KEN--C 0.253819

_MADAG--C 0.091398 _MAURIT--C -0.042582

_MOR--C -0.654678 _NIG--C -0.610147 _RWA--C -0.200834

_SIERL--C 0.494613 _RSA--C 0.091849 _TAN--C -0.214189

Random Effects (Period) 1990--C 0.000000 1991--C 0.000000 1992--C 0.000000 1993--C 0.000000 1994--C 0.000000 1995--C 0.000000 1996--C 0.000000 1997--C 0.000000 1998--C 0.000000

Effects Specification S.D. Rho

19

Cross-section random 0.451746 0.9510Period random 0.000000 0.0000Idiosyncratic random 0.102496 0.0490

Weighted Statistics

R-squared 0.793964 Mean dependent var 0.786370Adjusted R-squared 0.792020 S.D. dependent var 0.251936S.E. of regression 0.114895 Sum squared resid 1.399292F-statistic 408.4727 Durbin-Watson stat 1.033559Prob(F-statistic) 0.000000


R-squared 0.961920 Mean dependent var 10.42738Sum squared resid 35.22565 Durbin-Watson stat 0.041057

3.3 MIXED FE/RE MODEL ( RF λµ , ) and FR λµ , • The estimation can be done either for within or RE model • There are different transformation depending on whether we have

RF λµ , or FR λµ , • There also methods for computation of θ in the REM • Eviews can estimate this model easily

Dependent Variable: LN_RCONS? Method: Pooled EGLS (Cross-section random effects) Date: 09/09/08 Time: 15:46 Sample (adjusted): 1990 1998 Included observations: 9 after adjustments Cross-sections included: 12 Total pool (balanced) observations: 108 Swamy and Arora estimator of component variances


C 1.002894 0.421706 2.378184 0.0193LN_RGDP? 0.880140 0.039372 22.35460 0.0000


_BOTS--C -0.878631 _BURKF--C 0.413947

_BUR--C 1.261821 _KEN--C 0.257952

_MADAG--C 0.090330 _MAURIT--C -0.041596

20

_MOR--C -0.662075 _NIG--C -0.613345 _RWA--C -0.207404

_SIERL--C 0.500086 _RSA--C 0.096118 _TAN--C -0.217204

Fixed Effects (Period) 1990--C -0.010309 1991--C 0.002957 1992--C -0.001522 1993--C 0.003697 1994--C 0.011615 1995--C -0.035977 1996--C -0.016217 1997--C 0.016454 1998--C 0.029302

Effects Specification S.D. Rho

Cross-section random 0.451746 0.9510Period fixed (dummy variables) Idiosyncratic random 0.102496 0.0490

Weighted Statistics

R-squared 0.799179 Mean dependent var 10.42738Adjusted R-squared 0.780736 S.D. dependent var 0.251936S.E. of regression 0.117971 Sum squared resid 1.363876F-statistic 43.33285 Durbin-Watson stat 1.017320Prob(F-statistic) 0.000000


R-squared 0.961578 Mean dependent var 10.42738Sum squared resid 35.54244 Durbin-Watson stat 0.039038

3.4. TESTING FOR FIXED EFFECTS

• We can adapt an applied F-test to test the null hypothesis of one

common intercept across time and cross-section versus the

alternative of an intercept for each year and cross-section. 21

H0: 0and0 121121 ======= −− TN λλλµµµ KK HA: not all equal to 0. RRSS from pooled OLS (without dummies) URSS from WITHIN regression

( )

( )( ) )))1)(1(),2((1 ~0

11/

2/

KTNTN

H

FKTNURSS

TNURSSRRSSF −−−−+−−−

−+⎟⎠

⎞⎜⎝

⎛ −=

• Note: as seen previously this can also test just individual effects or

time effects. Redundant Fixed Effects Tests Pool: POOL1 Test cross-section fixed effects

Effects Test Statistic d.f. Prob.

Cross-section F 167.055245 (11,95) 0.0000

References

Arellano, M.(1987), “Computing Robust Standard Errors for Within Group Estimators”, Oxford Bulleting of Economics and Statistics, Vol.49(4), 431-434. Baltagi, B.H. and Li, Q. (1990), “A Lagrange Multiplier Test for Error Components with Incomplete Panels”, Econometric Reviews, Vol.9(1), 103-107. Baltagi, B.H. (2005), Econometric Analysis of Panel Data, 3rd Edition, John Wiley Chapter 3-5 22

Breusch, T. and Pagan, A. (1980), “The Lagrange Multiplier Test and its Applications to Model Specifications in Econometrics”. Review of Economic Studies, Vol.47, 239-253. Cameron,A.C. and Trivedi, P.K. (2006), Microeconometrics: Methods




Chapter 13




23

COLLABORATIVE MASTERS PROGRAMME

IN ECONOMICS FOR ANGLOPHONE AFRICA

(CMAP)

JOINT FACILITY FOR ELECTIVES 2008


DR. MOSES SICHEI*



LECTURE 13: INTRODUCTION TO DYNAMIC PANEL DATA

AND NONSTATIONARY PANELS

Objectives: The main objective of the lecture is to understand dynamic panels and nonstationary panel. Key words

• Arrelano and Bond

• Arrelano and Bover

• Dynamic panel

• GMM

• Kao test of cointegration

• Nickell bias

• Panel cointegration

• Panel unit roots

• Pedroni test of cointegration

• Nonstationary panels

• Orthogonal forward deviations

• Spurious regression

1. DYNAMIC PANEL DATA

• Many economic relationships are dynamic in nature

• One key advantage of panel data is that they allow the researcher to better understand

dynamics of adjustment

• Examples

o Dynamic demand for energy

o Dynamic wage equation

o Dynamic model of employment

1

• Dynamic relationships are characterized by the presence of a lagged dependent variable

among the regressors

itititit uxyy +′+= − βδ 1

• Assume one-way error component model

• itiit vu += µ

• The model then becomes itititiit vxyy +′++= − βδµ 1

Where ( )2,0~ µσµ iidi and ( )2,0~ vit iidv σ

• The dynamic model above is characterized by two sources of persistence over time

Autocorrelation due to the presence of lagged dependent variable among the

regressors

Individual effects characterizing the heterogeneity among the individuals

1.1 Problems introduced by the lagged dependent variable

• Since is a function of ity iµ then is also a function of 1−ity iµ . This can be seen from the

fact that

1121 −−−− +′++= itititiit vxyy βδµ .

• It implies that is correlated with the error term in the equation 1−ity

itititit uxyy +′+= − βδ 1

• This renders OLS estimator biased and inconsistent even if is not serially correlated itv

1.2 Nickell (1981) bias in FE estimator

• The within estimator will run an equation of the form

( ) ( ) ( )••−•−• −+−′+−+−=− iitiitiitiiiit vvxxyyyy βδµµ 11

• BUT: ( 11 −•− − iit yy ) is correlated with ( )•− iit vv even if are not serially correlated. itv

• This is because is correlated with 1−ity •iv by construction

• Indeed, the within estimator will be biased of magnitude ( )1−TO and its consistency

depends on T being large

• Consequently, only if will the within estimator for ∞→T δ and β be consistent for

the dynamic error component model.

2

1.3 Bias in Random effects model

• The random effects GLS estimator will also be biased in a dynamic panel data model

• The problem is that the GLS method entails quasi-demeaning of variables using θ

( ) ( ) ( )••−•−• −+−′+−=− iitiitiitiit vvxxyyyy θβθθδθ 11

• But •− iit yy θ will be correlated with ( )1−•− iit vu θ

2. ANDERSON AND HSIAO (1981) ESTIMATOR-AH

• Anderson and Hsiao (1981) suggested the following procedure:

Difference the model to get rid of the iµ

itititit vxyy ∆+′∆+∆=∆ − βδ 1

Use or simply as instruments for 322 −−− −=∆ ititit yyy 2−ity 211 −−− −=∆ ititit yyy

• These instruments will not be correlated with 11 −− −=∆ ititit vvv provided the are not

serially correlated

itv

Limitations of this method

• This method leads to consistent but not efficient estimates of parameters because it does

not make use of all the available moment conditions

• It also does not take into account the differenced structure on the residual disturbance

2. ARRELANO AND BOND (1991) METHOD

• Arrelano and Bond (1991) proposed a generalized method of moments (GMM)

procedure that is more efficient than Anderson and Hsiao (1981) method

• They argue that additional instruments can be obtained in a dynamic model if one

utilizes the orthogonality conditions that exist between lagged values of and the

disturbance term

ity

itv

Illustration of additional instruments

• From the model ititiit vyy ++= −1δµ

• Difference the model to eliminate the individual effects as suggested by AH

• ( ) ( 1211 −−−− )−+−=− itititititit vvyyyy δ

3

At we can find a valid instrument 3=t

• ( ) ( 231223 iiiiii vvyyyy −+ )−=− δ

• A valid instrument here is because it is highly correlated with 1iy ( )12 ii yy − but not

correlated with ( )23 ii vv −

At we can find a valid instrument 4=t

• ( ) ( 342334 iiiiii vvyyyy −+ )−=− δ

• A valid instrument here is because it is highly correlated with 2iy ( )23 ii yy − but not

correlated with ( )34 ii vv −

• We can continue the process to get additional valid instruments as a vector

( )221 ,...,, −iTii yyy

Equations Instruments

323 iii vyy ∆+∆=∆ δ 1iy

434 iii vyy ∆+∆=∆ δ 21, ii yy

. .

. .

iTiTiT vyy ∆+∆=∆ −1δ 221 ,...,, −iTii yyy

The set of dependent variables for the regression are the LHS in the equations column above

• We can put these in a matrix of instruments ( )′∆∆∆= iTiii yyyy ,...,, 43*

• The set of instruments are the variables in the instruments column above shown in the

matrix below

( )( )

( )

( )⎥⎥⎥⎥⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢⎢⎢⎢⎢

⎣

⎡

=

−221

321

21

1

,...,,....0............0..,,.00.00,000000

iTii

iii

ii

i

Di

yyy

yyyyy

y

W

• refer to the instruments for the first differenced based Arrelano and Bond

estimator

DiW

• The two-step GMM estimator is

4

• ( ) ( ) ( ) ( ) ⎥⎦⎤

⎢⎣⎡ ∆′∆⎥⎦⎤

⎢⎣⎡ ∆′∆= −

−−

−−

−−

−1

11

11

11

12ˆˆˆ yVWyyVWy NNδ

• Where ( )( ) i

N

iiiiN WvvWV′

∆∆′= ∑=1

• If we have x regressors the matrix of instruments change to

•

( )( )

( )

( )⎥⎥⎥⎥⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢⎢⎢⎢⎢

⎣

⎡

′′

′′′′

′′

=

−− 21,221

21321

2121

211

,..,...,,....0............0..,,,,.00.00,,,000000,,

iTiiTii

iiiii

iiii

iii

Di

xxyyy

xxyyyxxyy

xxy

W

• Additionally, Arrelano and Bond suggested Sargan’s test of overidentifying restrictions given by

•

( )( ) ( ) ( 1~ˆˆ 2

1

1

−−∆′⎥⎥⎦

⎤

⎢⎢⎣

⎡ ′∆∆′′∆=

−

=∑ kpvWWvvWWvm i

N

iiii χ )

Where p refers to the number of columns of W and v∆ denotes the residuals from a two-step estimation 3. ARRELANO AND BOVER (1995) SYSTEMS ESTIMATOR

• Arellano and Bover (1995) and Blundell and Bond(1998) unified the GMM framework

for looking at IV estimators for dynamic panel data models

• This method eliminates individual effects by computing orthogonal forward deviation

• Orthogonal deviations as proposed by Arellano(1988) and Arellano and Bond (1995)

express each observation as the deviation from the average of the future observations in

the sample for the same individual and weight each deviation to standardize the variance

as follows;

21

)2()1(*

1...

⎟⎠⎞

⎜⎝⎛

+−−

⎟⎟⎠

⎞⎜⎜⎝

⎛−

++−= ++

tTtT

tTxxx

xx iTtitiitit for 1,..,1 −= Tt

• Look at the matrix given in question 8.4 in Baltagi (2005:162) 3C

5

• The orthogonal forward deviations use the same instruments as in Arrelano and Bond

(1991)

• But adds the following

Equations Instruments

323 iiii vyy ++= µδ 2iy∆

434 iiii vyy ++= µδ 3iy∆

. .

. .

iTiiTiT vyy ++= − µδ 1 1−∆ iTy

• The full matrix of instruments entails a stacked matrix of the form

⎥⎥⎥⎥⎥⎥⎥⎥

⎦

⎤

⎢⎢⎢⎢⎢⎢⎢⎢

⎣

⎡

∆

∆∆

=

−1

3

2

....0............0...00.00000000

iT

i

i

Di

Di

y

yy

W

W

Instruments for the differenced estimator

• Can get details in Baltagi

4. NON-STATIONARY PANEL DATA MODELS

• With the growing use of cross-country data to study a number of macroeconomic topics, the focus of panel data econometrics has shifted towards studying the asymptotics of macroeconomic-oriented panel data rather than microeconomic-oriented panel data.

4.1 Problems with nonstationary data in general

Low power of time series tests (unit root and cointegration.) 6

Nonstandard limiting distributions of time series tests. Spurious regression problem (t-statistics diverge in misspecified regressions of

two I(1) variables.)

• Panel data can help solve the problems faced in time series data but at the cost of introducing a new issue, how homogeneous is the panel?

• If we have large T dimensions then:

• Allow estimation of heterogeneous panels (heterogeneous regressors) • Allow investigation of non-stationary spurious regression, and cointegration “The aim of the econometrics of non-stationary panel data is to combine the best of both worlds: the method of dealing with non-stationary data from the time series and the increased data from the cross-section.” • View cross-sections as repeated draws, or paths, from the same distribution. • Certain panel statistics (estimators) converge in distribution to normally distributed random

variables Some distinctive results for panel data in non-stationary world

• Many test statistics and estimators of interest have normal limiting distribution. e.g.IPS test. This is in contrast to the non-stationary time series literature where the limiting distributions are complicated functionals of Brownian/Wiener processes

• Using panel data one can avoid the problem of spurious regression. • However, unlike time series spurious regression literature, the panel data spurious

regression estimates give consistent estimate of the true value of the parameter as both N and T tend to ∞

• The reason is because the panel estimator averages across individuals and the information in the independent cross-section data in the panel leads to a stronger overall signal than in pure time series

• Note however that the introduction of ∞→N and ∞→T introduces some other complications in the asymptotic analysis (Phillips and Moon, 2000-multi-indexed processes)

2. TESTING FOR UNIT ROOTS IN PANEL DATA

• Testing for unit roots in time series is now common practice among applied researchers. • However, testing for panel unit roots is quite recent and many researches and thesis

applying panel data still disregard this crucial step. Unit root tests in time series analysis:

7

• Based on the coefficient of the AR(1) representation. (Less than one in absolute value) • ADF/PP consider the null hypothesis of non-stationarity; KPSS considers the null hypothesis

of stationarity. • A process with a unit root has an infinite memory and can be considered the sum of all past

random shocks. • Tests are notorious for low power and non-standard limiting distributions.

• Panel unit root tests are similar but not identical to unit root tests carried out in time series analysis.

• All panel unit roots begin with the following

• itittiiti xyy ελδ ++= −1,, • The represent the exogenous variables in the model itx

• If 1<iδ is weakly(trend) stationary and we shall be dealing with stationary data tiy ,

• On the other hand if 1=iδ then contains a unit root tiy ,

• We can simplify this further by substracting on both sides so that 1, −tiy( ) itittiiti xyy ελδ ++−=∆ −1,, 1

• Assuming that ( )1−= ii δρ

• Our ADF type-model is ∑=

−− +∆++=∆ip

jtijtiijittiiti yxyy

1,,1,, εθλρ

• For purposes of testing, there are two natural assumptions (1) assume that the persistence parameter are common across the cross-sections so that

ρρ =i Examples of panel unit root methods that apply this approach are;

o Levin, Lin and Chu (LLC) o Breitung o Hadri

(2) Assume that iρ vary with cross-sections o Im, Pesaran and Shin (IPS) o Fischer-ADF o Fischer PP tests

3. TESTS WITH COMMON UNIT ROOTS

8

• The Levin, Lin and Chu (LLC), Breitun and Hadri all assume that there is a common unit

root process so that iρ is identical across cross-sections • LLC and Breitung assume a null of unit root while the Hadri test uses a null of no unit

root (just like KPSS in time series) • LLC and Breitun all consider the following basic ADF specification

∑=

−− +∆++=∆ip

jtijtiijittiti yxyy

1,,1,, εθλρ

• LLC and Breitun allow the lag order for the difference terms to vary across cross-sections

ip

• The null and alternative hypotheses are H0: 0=ρ i.e. there is a unit root HA: 0<ρ i.e. there is no unit root

3.1 LEVIN, LIN AND CHU • This method attempts to derive estimates for ρ from proxies for ity∆ and that are

standardized and free of autocorrelations and deterministic components ity

• The LLC requires specification of the number of lags used in each equation ADF regression, as well as kernel choices ip

3.2 BREITUNG

• This method differs from LLC in two distinct ways o Only the autoregressive component (and not the exogenous components) is

removed when constructing standardized proxies o Proxies are transformed and detrended

• Breitung method requires only a specification of the number o lags used in each cross-section ADF and exogenous regressors

• No kernel computations are required

3.3 HADRI

• This is similar to KPSS unit root test • It has a null of no unit root in any of the series of the panel • Like KPSS, Hadri is based on the residuals from the individual OLS regressions of the

on a constant, or on a constant and a trend ity• Hadri test requires only the specification of the form of the OLS regression

4. TESTS WITH INDIVIDUAL UNIT ROOT PROCESSES

9

• Im, Pesaran and Shin, and the ADF and PP tests allow for individual unit root processes so that iρ may vary across cross-sections

• The tests are characterized by combining of individual unit root tests to derive a panel-specific result

4.1 IM, PESARAN AND SHIN (IPS)

• IPS begin by specifying a separate ADF regression for each cross-section

∑=

−− +∆++=∆ip

jtijtiijittiiti yxyy

1,,1,, εθλρ

H0: 0=iρ i.e. there is a unit root HA: 0<iρ i.e. there is no unit root

• After estimating the separate ADF regressions, the average of the t-statistics for each of the iρ from the individual ADF regressions, ( )iiT pt

i

• The average is ( )

N

ptt

N

iiiT

NT

i⎟⎠

⎞⎜⎝

⎛

=∑=1

• IPS show that a properly standardized NTt has asymptotic standard normal distribution

• ( )( )

( )( ))1,0(~

var1

1

1

1

NptN

ptENtNW

N

iiiT

N

iiiTNT

t NT

∑

∑

=

−

=

− ⎟⎠

⎞⎜⎝

⎛−

=

• The expression for the expected mean and variance of the ADF regression t-statistics are provided by IPS for various values of the T and p and differing test equation assumptions

5. PANEL COINTEGRATION

• Panel cointegration tests can be motivated by the search for more powerful tests than those obtained by applying individual time-series tests

• Time series tests are known to have low power, especially for short T Residual based tests

Residual based DF and ADF tests (Kao tests-1999) Residual based LM test (McCoskey and Kao-1998) Pedroni tests

Likelihood-based tests

• Combined individual tests 5.1 PEDRONI (ENGLE-GRANGER BASED) PANEL COINTEGRATION TEST

• Engle-Granger (1987) cointegration test is based on an examination of the residuals of a spurious regression performed using I(1) variables

10

• If the variables are cointegrated, the residual should be I(0). • If the variables are not cointegrated, then the residuals will be I(1) • Pedroni (1999, 2004) and Kao (1999) extend the Engle-Granger framework to tests

involving panel data • Pedroni proposes several tests for cointegration that allow for heterogenous intercepts

and trend coefficients across cross-sections • Consider the regression • itMitMiitiitiiiit exxxty ++++++= βββδα ..2211 • MmNiTt ,..1,,...2,1,,...2,1 ===• and y x are assumed to be integrated of order 1 • The parameters iα and iδ are individual and trends effects which may be set to zero if

desired • Under the null hypothesis that there is no cointegration, the residuals will be I(1) • The general approach is to obtain residuals from the above regression and then test

whether the residuals are I(1) just like in EG-2 step procedure • We run an auxiliary regression of the form

• ∑=

−− +∆+=∆ip

jtijtiijtiiti veee

1,,1,, ϕρ

• Pedroni’s panel cointegration statistic NTΞ is constructed from the residuals from the auxiliary regression

• Pedroni shows that the standardized statistic is asymptotically normally distributed

)1,0(~ Nv

NNT µ−Ξ

Where µ and v are Monte Carlo generated adjustment terms 5.2 KAO (ENGLE-GRANGER BASED) PANEL COINTEGRATION TEST

• Kao test follows the same basic approach as Pedroni tests but specifies cross-section specific intercepts and homogenous coefficients on the first-stage regressors

• In the bivariate case described by Kao(1999) we have the following • ititiit exy ++= βα For ititit uyy += −1

ititit yx ε+= −1 • NiTt ,...2,1,,...2,1 ==• The general approach is to obtain residuals from the above regression and then test

whether the residuals are I(1) just like in EG-2 step procedure • Kao runs an auxiliary regression of the form

• vitee titi += −1,, ρ • In order to test the null of no cointegration, the null can be written as 1:0 =ρH • The t-value is

11

( )

e

N

i

T

tit

s

e∑∑=

−−= 2

21ˆ1ρ

ρ

• Under the null of no cointegration, Kao presents several statistics for

and **,,, tt DFDFDFDF ρρ ADF

• It can be shown that the asymptotic distributions for and **,,, tt DFDFDFDF ρρ ADF converge to standard normal by sequential limit theory

5.3 COMBINED INDIVIDUAL TESTS (FISHER/JOHANSEN)

• Fisher(1932) derives a combined test that uses the results of the individual independent tests

• Maddala and Wu (1999) use Fisher’s result to propose an alternative approach to testing for cointegration in panel data by combining tests from individual cross-sections to obtain test for the full panel

• If iπ is the p-value from an individual cointegration test for cross-section i then under the null hypothesis for the panel

( ) ( )∑=

→−N

ii N

1

2 2log2 χπ

References

Anderson, T.W. and Hsiao, C. (1981), Estimation of Dynamic Panels with Error Components, Journal of the American Statistical Association, Vol.76, 598-606. Arrelano,M. and Bond,S. (1991), “Some Tests of Specification for Panel Data: Monte Carlo Evidence and Application to Employment Equations”, Review of Economic Studies, vol.58, 277-297. Arrelano,M. and Bover,O.(1995), “Another Look at the Instrumental Variables Estimation of Error Component Models”, Journal of Econometrics, vol.68, 29-51.

12

Baltagi, B.H. (2005), Econometric Analysis of Panel Data, 3rd Edition, John Wiley Chapter 8 Blundell, R.W. and Bond, S.R. (1998), “Initial Conditions and Moment

Restrictions in Dynamic Panel Data Models” Journal of Econometrics,

Vol87, 115-143.

Breitung, J. (2000), “The Local Power of Some Unit Root Tests for

Panel Data,” in B.Baltagi(ed). Advances in Econometrics, Vol.15:

Nonstationary Panels, Panel Cointegration, and Dynamic Panels,

Amsterdam: JAI Press, P.161-178.

Cameron,A.C. and Trivedi, P.K. (2006), Microeconometrics: Methods



Fisher, R.A. (1932), Statistical Methods for Research Workers, 4th

Edition, Edinburg: Oliver & Boyd.


Chapter 13

Hadri, K. (2000), “Testing for Stationarity in Heterogenous Panel Data”,

Econometric Journal, Vol.3, 148-161.

Im,K.S., Pesaran, M.H. and Shin, Y. (2003), “Testing for Unit Roots in

Heterogenous Panels”, Journal of Econometrics, Vol.115, 53-74.

13

Kao, C. (1999), “Spurius Regression and Residual-Based Tests for

Cointegration in Panel Data”, Journal of Econometrics, Vol.90, 1-144.

Levin, A., Lin, C.F. and Chu, C. (2002), “Unit Root Tests in Panel Data:

Asymptotic and Finite Sample Properties”, Journal of Econometrics,

Vol108, 1-24.

Maddal, G.S. and Wu, S. (1999), “A Comparative Study of Unit Root

Tests with Panel Data and a New Simple Test”, Oxford Bulletin of

Economics and Statistics, Vol. 61, 631-652.

Nickell, S. (1981), “Biases in Dynamic Models with Fixed Effects”,

Econometrica, vol.49, 1417-1426.

Pedroni,P. (1999), “Critical Values for Cointegration Tests in

Heterogenous Panels with Multiple Regressors”, Oxford Bulleting of

Economics and Statistics, Vol.61, 653-70.

Pedroni,P. (2004), “Panel Cointegration, Asymptotic and Finite Sample

Properties of Pooled Time Series Tests with an Application to the PPP

Hypothesis”, Econometric Theory, 20, 597-625.

14




Practice quizes

(1) Explain how nonstationary panel data analysis addresses

the problems of low power of tests, nonlimiting

distribution and spurious regression.

(2) What do you understand by the Nickell (1981) bias?

15

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 408 — #1

14Introduction to PanelData Models

14.1 Introduction

If the same units of observation in a cross-sectional sample are surveyed two ormore times, the resulting observations are described as forming a panel or longit-udinal data set. The National Longitudinal Survey of Youth that has provideddata for many of the examples and exercises in this text is such a data set. TheNLSY started with a baseline survey in 1979 and the same individuals have beenreinterviewed many times since, annually until 1994 and biennially since then.However the unit of observation of a panel data set need not be individuals. Itmay be households, or enterprises, or geographical areas, or indeed any set ofentities that retain their identities over time.

Because panel data have both cross-sectional and time series dimensions, theapplication of regression models to fit econometric models are more complexthan those for simple cross-sectional data sets. Nevertheless, they are increasinglybeing used in applied work and the aim of this chapter is to provide a briefintroduction. For comprehensive treatments see Hsiao (2003), Baltagi (2001),and Wooldridge (2002).

There are several reasons for the increasing interest in panel data sets. Animportant one is that their use may offer a solution to the problem of bias causedby unobserved heterogeneity, a common problem in the fitting of models withcross-sectional data sets. This will be discussed in the next section.

A second reason is that it may be possible to exploit panel data sets to revealdynamics that are difficult to detect with cross-sectional data. For example, ifone has cross-sectional data on a number of adults, it will be found that someare employed, some are unemployed, and the rest are economically inactive. Forpolicy purposes, one would like to distinguish between frictional unemploymentand long-term unemployment. Frictional unemployment is inevitable in a chan-ging economy, but the long-term unemployment can indicate a social problemthat needs to be addressed. To design an effective policy to counter long-termunemployment, one needs to know the characteristics of those affected or at risk.In principle the necessary information might be captured with a cross-sectionalsurvey using retrospective questions about past emloyment status, but in practice

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 409 — #2

Introduction to Econometrics 409

the scope for this is often very limited. The further back in the past one goes, theworse are the problems of a lack of records and fallible memories, and the greaterbecomes the problem of measurement error. Panel studies avoid this problem inthat the need for recall is limited to the time interval since the previous interview,often no more than a year.

A third attraction of panel data sets is that they often have very large numbersof observations. If there are n units of observation and if the survey is undertakenin T time periods, there are potentially nT observations consisting of time seriesof length T on n parallel units. In the case of the NLSY, there were just over6,000 individuals in the core sample. The survey has been conducted 19 times asof 2004, generating over 100,000 observations. Further, because it is expensiveto establish and maintain them, such panel data sets tend to be well designedand rich in content.

A panel is described as balanced if there is an observation for every unit ofobservation for every time period, and as unbalanced if some observations aremissing. The discussion that follows applies equally to both types. However, ifone is using an unbalanced panel, one needs to take note of the possibility thatthe causes of missing observations are endogenous to the model. Equally, if abalanced panel has been created artificially by eliminating all units of observationwith missing observations, the resulting data set may not be representative of itspopulation.

Example of the use of a panel data set to investigate dynamics

In many studies of the determinants of earnings it has been found that marriedmen earn significantly more than single men. One explanation is that marriageentails financial responsibilities—in particular, the rearing of children—that mayencourage men to work harder or seek better paying jobs. Another is that certainunobserved qualities that are valued by employers are also valued by potentialspouses and hence are conducive to getting married, and that the dummy variablefor being married is acting as a proxy for these qualities. Other explanationshave been proposed, but we will restrict attention to these two. With cross-sectional data it is difficult to discriminate between them. However, with paneldata one can find out whether there is an uplift at the time of marriage or soonafter, as would be predicted by the increased productivity hypothesis, or whethermarried men tend to earn more even before marriage, as would be predicted bythe unobserved heterogeneity hypothesis.

In 1988 there were 1,538 NLSY males working 30 or more hours a week,not also in school, with no missing data. The respondents were divided intothree categories: the 904 who were already married in 1988 (dummy variableMARRIED = 1); a further 212 who were single in 1988 but who married withinthe next four years (dummy variable SOONMARR = 1); and the remaining 422who were single in 1988 and still single four years later (the omitted category).Divorced respondents were excluded from the sample. The following earnings

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 410 — #3

410 14: Introduction to Panel Data Models

function was fitted (standard errors in parentheses):

LGEARN = 0.163 MARRIED + 0.096 SOONMARR + constant + controls

(0.028) (0.037) R2 = 0.27.

(14.1)

The controls included years of schooling, ASVABC score, years of tenure withthe current employer and its square, years of work experience and its square,age and its square, and dummy variables for ethnicity, region of residence, andliving in an urban area.

The regression indicates that those who were married in 1988 earned 16.3percent more than the reference category (strictly speaking, 17.7 percent, if theproportional increase is calculated properly as e0.163 − 1) and that the effectis highly significant. However, it is the coefficient of SOONMARR that is ofgreater interest here. Under the null hypothesis that the marital effect is dynamicand marriage encourages men to earn more, the coefficient of SOONMARRshould be zero. The men in this category were still single as of 1988. The tstatistic of the coefficient is 2.60 and so the coefficient is significantly differentfrom zero at the 0.1 percent level, leading us to reject the null hypothesis at thatlevel.

However, if the alternative hypothesis is true, the coefficient of SOONMARRshould be equal to that of MARRIED, but it is lower. To test whether it issignificantly lower, the easiest method is to change the reference category to thosewho were married by 1988 and to introduce a new dummy variable SINGLEthat is equal to 1 if the respondent was single in 1988 and still single four yearslater. The omitted category is now those who were already married by 1988.The fitted regression is (standard errors in parentheses)

LGEARN = −0.163 SINGLE − 0.066 SOONMARR + constant + controls

(0.028) (0.034) R2 = 0.27.

(14.2)

The coefficient of SOONMARR now estimates the difference between the coef-ficients of those married by 1988 and those married within the next four years,and if the second hypothesis is true, it should be equal to zero. The t statistic is−1.93, so we (just) do not reject the second hypothesis at the 5 percent level.The evidence seems to provide greater support for the first hypothesis, but it ispossible that neither hypothesis is correct on its own and the truth might residein some compromise.

In the foregoing example, we used data only from the 1988 and 1992 roundsof the NLSY. In most applications using panel data it is normal to exploit thedata from all the rounds, if only to maximize the number of observations in the

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 411 — #4


sample. A standard specification is

Yit = β1 +k∑

j=2

βjXjit +s∑

p=1

γpZpi + δt + εit (14.3)

where Y is the dependent variable, the Xj are observed explanatory variables,and the Zp are unobserved explanatory variables. The index i refers to the unitof observation, t refers to the time period, and j and p are used to differentiatebetween different observed and unobserved explanatory variables. εit is a dis-turbance term assumed to satisfy the usual regression model conditions. A trendterm t has been introduced to allow for a shift of the intercept over time. If theimplicit assumption of a constant rate of change seems too strong, the trend canbe replaced by a set of dummy variables, one for each time period except thereference period.

The Xj variables are usually the variables of interest, while the Zp variablesare responsible for unobserved heterogeneity and as such constitute a nuisancecomponent of the model. The following discussion will be confined to the (quitecommon) special case where it is reasonable to assume that the unobservedheterogeneity is unchanging and accordingly the Zp variables do not need atime subscript. Because the Zp variables are unobserved, there is no means ofobtaining information about the

∑sp=1 γpZpi component of the model and it is

convenient to rewrite (14.3) as

Yit = β1 +k∑

j=2

βjXjit + αi + δt + εit (14.4)

where

αi =s∑

p=1

γpZpi. (14.5)

αi, known as the unobserved effect, represents the joint impact of the Zpi onYi. Henceforward it will be convenient to refer to the unit of observation as anindividual, and to the αi as the individual-specific unobserved effect, but it shouldbe borne in mind that the individual in question may actually be a household oran enterprise, etc. If αi is correlated with any of the Xj variables, the regressionestimates from a regression of Y on the Xj variables will be subject to unobservedheterogeneity bias. Even if the unobserved effect is not correlated with any of theexplanatory variables, its presence will in general cause OLS to yield inefficientestimates and invalid standard errors. We will now consider ways of overcomingthese problems.

First, however, note that if the Xj controls are so comprehensive that theycapture all the relevant characteristics of the individual, there will be no relevantunobserved characteristics. In that case the αi term may be dropped and a pooled

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 412 — #5


OLS regression may be used to fit the model, treating all the observations for allof the time periods as a single sample.

14.2 Fixed effects regressions

The two main approaches to the fitting of models using panel data are knownas fixed effects regressions, discussed in this section, and random effects regres-sions, discussed in the next. Three versions of the fixed effects approach willbe described. In the first two, the model is manipulated in such a way that theunobserved effect is eliminated.

Within-groups fixed effects

In the first version, the mean values of the variables in the observations on agiven individual are calculated and subtracted from the data for that individual.In view of (14.4), one may write

Yi = β1 +k∑

j=2

βjXij + δt + αi + εit. (14.6)

Subtracting this from (14.4), one obtains

Yit − Yi =k∑

j=2

βj

(Xijt − Xij

)+ δ(t − t) + εit − εi (14.7)

and the unobserved effect disappears. This is known as the within-groupsregression model because it is explaining the variations about the mean of thedependent variable in terms of the variations about the means of the explanatoryvariables for the group of observations relating to a given individual. The possib-ility of tackling unobserved heterogeneity bias in this way is a major attractionof panel data for researchers.

However, there are some prices to pay. First, the intercept β1 and any Xvariable that remains constant for each individual will drop out of the model.The elimination of the intercept may not matter, but the loss of the unchangingexplanatory variables may be frustrating. Suppose, for example, that one is fittingan earnings function to data for a sample of individuals who have completed theirschooling, and that the schooling variable for individual i in period t is Sit. If theeducation of the individual is complete by the time of the first time period, Sit

will be the same for all t for that individual and Sit = Si for all t. Hence (Sit −Si)

is zero for all time periods. If all individuals have completed their schooling bythe first time period, Sit will be zero for all i and t. One cannot include a variablewhose values are all zero in a regression model. Thus if the object of the exercise

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 413 — #6


were to obtain an estimate of the returns to schooling untainted by unobservedheterogeneity bias, one ends up with no estimate at all.

A second problem is the potential impact of the disturbance term. We saw inChapter 3 that the precision of OLS estimates depends on the mean square devi-ations of the explanatory variables being large in comparison with the varianceof the disturbance term. The analysis was in the context of the simple regressionmodel, but it generalizes to multiple regression. The variation in (Xj − Xj) maywell be much smaller than the variation in Xj. If this is the case, the impact ofthe disturbance term may be relatively large, giving rise to imprecise estimates.The situation is aggravated in the case of measurement error, since this will leadto bias, and the bias is the greater, the smaller the variation in the explanatoryvariable in comparison with the variance of the measurement error.

A third problem is that we lose a substantial number of degrees of freedomin the model when we manipulate the model to eliminate the unobserved effect:we lose one degree of freedom for every individual in the sample. If the panel isbalanced, with nT observations in all, it may seem that there would be nT − kdegrees of freedom. However, in manipulating the model, the number of degreesof freedom is reduced by n, for reasons that will be explained later in this section.Hence the true number of degrees of freedom will be n(T−1)−k. If T is small, theimpact can be large. (Regression applications with a fixed regression facility willautomatically make the adjustment to the degrees of freedom when implementingthe within-groups method.)

First differences fixed effects

In a second version of the fixed effects approach, the first differences regressionmodel, the unobserved effect is eliminated by subtracting the observation for theprevious time period from the observation for the current time period, for alltime periods. For individual i in time period t the model may be written

Yit = β1 +k∑

j=2

βjXijt + δt + αi + εit. (14.8)

For the previous time period, the relationship is

Yit−1 = β1 +k∑

j=2

βjXijt−1 + δ(t − 1) + αi + εit−1. (14.9)

Subtracting (14.9) from (14.8), one obtains

Yit =k∑

j=2

βjXijt + δ + εit − εit−1 (14.10)

and again the unobserved heterogeneity has disappeared. However, the otherproblems remain. In particular, the intercept and any X variable that remains

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 414 — #7


fixed for each individual will disappear from the model and n degrees of free-dom are lost because the first observation for each individual is not defined. Inaddition, this type of differencing gives rise to autocorrelation if εit satisfies theregression model conditions. The error term for Yit is (εit −εit−1). That for theprevious observation is (εit−1−εit−2). Thus the two error terms both have a com-ponent εit−1 with opposite signs and negative moving average autocorrelationhas been induced. However, if εit is subject to autocorrelation:

εit = ρεit−1 + vit (14.11)

where vit is a well behaved innovation, the moving average disturbance term isequal to vit − (1 − ρ)εit−1. If the autocorrelation is severe, the (1 − ρ)εit−1 com-ponent could be small and so the first differences estimator could be preferableto the within-groups estimator.

Least squares dummy variable fixed effects

In the third version of the fixed effects approach, known as the least squaresdummy variable (LSDV) regression model, the unobserved effect is broughtexplicitly into the model. If we define a set of dummy variables Ai, where Ai isequal to 1 in the case of an observation relating to individual i and 0 otherwise,the model can be rewritten

Yit =k∑

j=2

βjXijt + δt +n∑

i=1

αiAi + εit. (14.12)

Formally, the unobserved effect is now being treated as the coefficient of theindividual-specific dummy variable, the αiAi term representing a fixed effect onthe dependent variable Yi for individual i (this accounts for the name given tothe fixed effects approach). Having re-specified the model in this way, it can befitted using OLS.

Note that if we include a dummy variable for every individual in the sample aswell as an intercept, we will fall into the dummy variable trap described in Section5.2. To avoid this, we could define one individual to be the reference category,so that β1 is its intercept, and then treat the αi as the shifts in the interceptfor the other individuals. However, the choice of reference category is oftenarbitrary and accordingly the interpretation of the αi in such a specification notparticularly illuminating. Alternatively, we can drop the β1 intercept and definedummy variables for all of the individuals, as has been done in (14.12). The αi

now become the intercepts for each of the individuals. Note that, in common withthe first two versions of the fixed effects approach, the LSDV method requirespanel data. With cross-sectional data, one would be defining a dummy variablefor every observation, exhausting the degrees of freedom. The dummy variableson their own would give a perfect but meaningless fit.

If there are a large number of individuals, using the LSDV method directlyis not a practical proposition, given the need for a large number of dummy

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 415 — #8


Table 14.1 Individual-specific dummy variables and anunchanging X variable

Individual Time period A1 A2 A3 A4 Xj

1 1 1 0 0 0 c11 2 1 0 0 0 c11 3 1 0 0 0 c12 1 0 1 0 0 c22 2 0 1 0 0 c22 3 0 1 0 0 c23 1 0 0 1 0 c33 2 0 0 1 0 c33 3 0 0 1 0 c34 1 0 0 0 1 c44 2 0 0 0 1 c44 3 0 0 0 1 c4

variables. However, it can be shown mathematically that the method is identicalto the within-groups method. The only apparent difference is in the number ofdegrees of freedom. It is easy to see from (14.12) that there are nT −k−n degreesof freedom if the panel is balanced. In the within-groups approach, it seemed atfirst that there were nT −k. However, n degrees of freedom are consumed in themanipulation that eliminates the αi.

Given that it is equivalent to the within-groups approach, the LSDV method issubject to the same problems. In particular, we are unable to estimate coefficientsfor the X variables that are fixed for each individual. Suppose that Xij is equalto ci for all the observations for individual i. Then

Xj =n∑

i=1

ciAi. (14.13)

To see this, suppose that there are four individuals and three time periods, asin Table 14.1, and consider the observations for the first individual. Xj is equalto c1 for each observation. A1 is equal to 1. All the other A dummies are equalto 0. Hence both sides of the equation are equal to c1. Similarly, both sides ofthe equation are equal to c2 for the observations for individual 2, and similarlyfor individuals 3 and 4.

Thus there is an exact linear relationship linking Xj with the dummy variablesand the model is subject to exact multicollinearity. Accordingly Xj cannot beincluded in the regression specification.

Example

To illustrate the use of a fixed effects model, we return to the example in Section14.1 and use all the available data from 1980 to 1996, 20,343 observations in all.Table 14.2 shows the extra hourly earnings of married men and of men who aresingle but married within the next four years. The controls (not shown) are the

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 416 — #9


Table 14.2 Earnings premium for married and soon-to-be married men, NLSY1980–96

OLS Fixed effects Random effects

Married 0.184 0.106 – 0.134 –(0.007) (0.012) (0.010)

Single, married 0.096 0.045 −0.061 0.060 −0.075within 4 years (0.009) (0.010) (0.008) (0.009) (0.007)Single, not married – – –0.106 – −0.134within 4 years (0.012) (0.010)R2 0.358 0.268 0.268 0.346 0.346DWH test – – – 205.8 205.8n 20,343 20,343 20,343 20,343 20,343

same as in Section 14.1. The first column gives the estimates obtained by simplypooling the observations and using OLS with robust standard errors. The secondcolumn gives the fixed effects estimates, using the within-groups method, withsingle men as the reference category. The third gives the fixed effects estimateswith married men as the reference category. The fourth and fifth give the randomeffects estimates, discussed in the next section.

The OLS estimates are very similar to those in the wage equation for 1988discussed in Section 14.1. The fixed effects estimates are considerably lower,suggesting that the OLS estimates were inflated by unobserved heterogeneity.Nevertheless, the pattern is the same. Soon-to-be-married men earn significantlymore than single men who stay single. However, if we fit the specification corre-sponding to equation (14.2), shown in the third column, we find that soon-to-bemarried men earn significantly less than married men. Hence both hypothesesrelating to the marriage premium appear to be partly true.

14.3 Random effects regressions

As we saw in the previous section, when the variables of interest are constantfor each individual, a fixed effects regression is not an effective tool becausesuch variables cannot be included. In this section we will consider an alternat-ive approach, known as a random effects regression that may, subject to twoconditions, provide a solution to this problem.

The first condition is that it is possible to treat each of the unobserved Zp

variables as being drawn randomly from a given distribution. This may wellbe the case if the individual observations constitute a random sample from agiven population as, for example, with the NLSY where the respondents wererandomly drawn from the US population aged 14 to 21 in 1979. If this is thecase, the αi may be treated as random variables (hence the name of this approach)

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 417 — #10


drawn from a given distribution and we may rewrite the model as

Yit = β1 +k∑

j=2

βjXjit + αi + δt + εit

= β1 +k∑

j=2

βjXjit + δt + uit (14.14)

where

uit = αi + εit. (14.15)

We have thus dealt with the unobserved effect by subsuming it into thedisturbance term.

The second condition is that the Zp variables are distributed independentlyof all of the Xj variables. If this is not the case, α, and hence u, will not beuncorrelated with the Xj variables and the random effects estimation will bebiased and inconsistent. We would have to use fixed effects estimation instead,even if the first condition seems to be satisfied.

If the two conditions are satisfied, we may use (14.14) as our regressionspecification, but there is a complication. uit will be subject to a special formof autocorrelation and we will have to use an estimation technique that takesaccount of it.

First, we will check the other regression model conditions relating to the dis-turbance term. Given our assumption that εit satisfies the usual regression modelconditions, we can see that uit satisfies the condition that its expectation be zero,since

E(uit) = E(αi + εit) = E(αi) + E(εit) = 0 for all i and t (14.16)

Here we are assuming without loss of generality that E(αi) = 0, any nonzerocomponent being absorbed by the intercept, β1. uit will also satisfy the conditionthat it should have constant variance, since

σ 2uit

= σ 2αi+εit

= σ 2α + σ 2

ε + 2σαε = σ 2α + σ 2

ε for all i and t. (14.17)

The σαε term is zero on the assumption that αi is distributed independentlyof εit. uit will also satisfy the regression model condition that it be distributedindependently of the values of Xj, since both αi and εit are assumed to satisfythis condition.

However, there is a problem with the regression model condition that thevalue of uit in any observation be generated independently of its value in allother observations. For all the observations relating to a given individual, αi willhave the same value, reflecting the unchanging unobserved characteristics of theindividual. This is illustrated in Table 14.3 for the case where there are fourindividuals and three time periods.

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 418 — #11


Table 14.3 Example of disturbance termvalues in a random effects model

Individual Time period u

1 1 α1 + ε111 2 α1 + ε121 3 α1 + ε132 1 α2 + ε212 2 α2 + ε222 3 α2 + ε233 1 α3 + ε313 2 α3 + ε323 3 α3 + ε334 1 α4 + ε414 2 α4 + ε424 3 α4 + ε43

Since the disturbance terms for individual i have a common component αi,they are correlated. For individual i in period t, the disturbance term is (αi +εit).For the same individual in any other period t′ it is (αi + εit′). The populationcovariance between them is

σuit ,uit′ = σ(αi+εit),(αi+εit′ ) = σαi ,αi + σαi ,εit′ + σεit ,αi + σεit ,εit′ = σ 2α . (14.18)

For observations relating to different individuals the problem does not arisebecause then the α components will be different and generated independently.

We have encountered a problem of the violation of this regression modelcondition once before, in the case of autocorrelated disturbance terms in a timeseries model. As in that case, OLS remains unbiased and consistent, but it isinefficient and the OLS standard errors are computed wrongly.

The solution then was to transform the model so that the transformed disturb-ance term satisfied the regression model condition, and a similar procedure isadopted in the present case. However, while the transformation in the case ofautocorrelation was very straightforward, in the present case it is more complex.Known as feasible generalized least squares, its description requires the use oflinear algebra and is therefore beyond the scope of this text. It yields consistentestimates of the coefficients and therefore depends on n being sufficiently large.For small n its properties are unknown.

Assessing the appropriateness of fixed effects and randomeffects estimation

When should you use fixed effects estimation rather than random effects estima-tion, or vice versa? In principle, random effects is more attractive becauseobserved characteristics that remain constant for each individual are retainedin the regression model. In fixed effects estimation, they have to be dropped.

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 419 — #12


Also, with random effects estimation we do not lose n degrees of freedom, as isthe case with fixed effects.

However, if either of the preconditions for using random effects is violated,we should use fixed effects instead. One precondition is that the observationscan be described as being drawn randomly from a given population. This is areasonable assumption in the case of the NLSY because it was designed to be arandom sample. By contrast, it would not be a reasonable assumption if the unitsof observation in the panel data set were countries and the sample consisted ofthose countries that are members of the Organization for Economic Cooperationand Development (OECD). These countries certainly cannot be considered torepresent a random sample of the 200-odd sovereign states in the world.

The other precondition is that the unobserved effect be distributed indepen-dently of the Xj variables. How can we tell if this is the case? The standardprocedure is yet another implementation of the Durbin–Wu–Hausman test usedto help us choose between OLS and IV estimation in models where there is sus-pected measurement error (Section 8.5) or simultaneous equations endogeneity(Section 9.3). The null hypothesis is that the αi are distributed independently ofthe Xj. If this is correct, both random effects and fixed effects are consistent,but fixed effects will be inefficient because, looking at it in its LSDV form, itinvolves estimating an unnecessary set of dummy variable coefficients. If the nullhypothesis is false, the random effects estimates will be subject to unobservedheterogeneity bias and will therefore differ systematically from the fixed effectsestimates.

As in its other applications, the DWH test determines whether the estimates ofthe coefficients, taken as a group, are significantly different in the two regressions.If any variables are dropped in the fixed effects regression, they are excluded fromthe test. Under the null hypothesis the test statistic has a chi-squared distribution.In principle this should have degrees of freedom equal to the number of slopecoefficients being compared, but for technical reasons that require matrix algebrafor an explanation, the actual number may be lower. A regression applicationthat implements the test, such as Stata, should determine the actual number ofdegrees of freedom.

Example

The fixed effects estimates, using the within-groups approach, of the coeffi-cients of married men and soon-to-be married men in Table 14.2 are 0.106 and0.045, respectively. The corresponding random effects estimates are consider-ably higher, 0.134 and 0.060, inviting the suspicion that they may be inflated byunobserved heterogeneity. The DWH test involves the comparison of 13 coeffi-cients (those of MARRIED, SOONMARR, and 11 controls). Stata reports thatthere are in fact only 12 degrees of freedom. The test statistic is 205.8. With12 degrees of freedom the critical value of chi-squared at the 0.1 percent level is32.9, so we definitely conclude that we should be using fixed effects estimation.

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 420 — #13


Our findings are the same as in the simpler example in Section 14.1. Theyconfirm that married men earn more than single men. Part of the differentialappears to be attributable to the characteristics of married men, since men whoare soon-to-marry but still single also enjoy an earnings premium. However, partof the marriage premium appears to be attributable to the effect of marriage itself,since married men earn significantly more than those who are soon-to-marry butstill single.

Random effects or OLS?

Suppose that the DWH test indicates that we can use random effects ratherthan fixed effects. We should then consider whether there are any unobservedeffects at all. It is just possible that the model has been so well specified that thedisturbance term

uit = αi + εit (14.19)

consists of only the purely random component εit and there is no individual-specific αi term. In this situation we should use pooled OLS, with two advantages.There is a gain in efficiency because we are not attempting to allow for non-existent within-groups autocorrelation, and we will be able to take advantage ofthe finite-sample properties of OLS, instead of having to rely on the asymptoticproperties of random effects.

Various tests have been developed to detect the presence of random effects.The most common, implemented in some regression applications, is the Breusch–Pagan Lagrange multiplier test, the test statistic having a chi-squared distributionwith one degree of freedom under the null hypothesis of no random effects. Inthe case of the marriage effect example the statistic is very high indeed, 20,007,but in this case it is meaningless because we are not able to use random effectsestimation.

A note on the random effects and fixed effects terminology

It is generally agreed that random effects/fixed effects terminology can be mis-leading, but that it is too late to change it now. It is natural to think that randomeffects estimation should be used when the unobserved effect can be character-ized as being drawn randomly from a given population and that fixed effectsshould be used when the unobserved effect is considered to be non-random. Thesecond part of that statement is correct. However, the first part is correct onlyif the unobserved effect is distributed independently of the Xj variables. If it isnot, fixed effects should be used instead to avoid the problem of unobserved het-erogeneity bias. Figure 14.1 summarizes the decision-making process for fittinga model with panel data.

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 421 — #14


Can the observations be described as being a random sample from a given population?

Use fixed effectsPerform both fixed effects and random effects regressions.

Does a DWH test indicate significant differences in the

coefficients?

Provisionally choose random effects. Does a test indicate the

presence of random effects?

Use fixed effects Use random effects Use pooled OLS

Yes

Yes

No

Yes

No

No

Figure 14.1 Choice of regression model for panel data

Key terms

balanced panel

Durbin–Wu–Hausman test

first differences regression

fixed effects

least squares dummy variable (LSDV)regression

longitudinal data set

panel data set

pooled OLS regression

random effects

unbalanced panel

unobserved effect

within-groups regression

Exercises

14.1 Download the OECD2000 data set from the website. See Appendix B for details Thedata set contains 32 variables:

ID This is the country identification, with 1 = Australia, 2 = Austria,3 = Belgium, 4 = Canada, 5 = Denmark, 6 = Finland, 7 = France,8 = Germany, 9 = Greece, 10 = Iceland, 11 = Ireland, 12 = Italy, 13 =Japan, 14 = Korea, 15 = Luxembourg, 16 = Mexico, 17 = Netherlands,18 = New Zealand, 19 = Norway, 20 = Portugal, 21 = Spain,

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 422 — #15


22 = Sweden, 23 = Switzerland, 24 = Turkey, 25 = United Kingdom,26 = United States. Four countries that have recently joined the OECD,the Czech Republic, Hungary, Poland, and Slovakia, are excluded becausetheir data do not go back far enough.

ID01–26 These are individual country dummy variables. For example, ID09 is thedummy variable for Greece.

E Average annual percentage rate of growth of employment for country iduring time period t.

G Average annual percentage rate of growth of GDP for country i duringtime period t.

TIME There are three time periods, denoted 1, 2, and 3. They refer to averageannual data for 1971–80, 1981–90, and 1991–2000.

TIME2 Dummy variable defined to be equal to 1 when TIME = 2, 0 otherwise.

TIME3 Dummy variable defined to be equal to 1 when TIME = 3, 0 otherwise.

Perform a pooled OLS regression of E on G. Regress E on G, TIME2, and TIME3.Perform appropriate statistical tests and give an interpretation of the regressionresults.

14.2 Using the OECD2000 data set, perform a (within-groups) fixed effects regressionof E on G, TIME2, and TIME3. Perform appropriate statistical tests, give aninterpretation of the regression coefficients, and comment on R2.

14.3 Perform the corresponding LSDV regression, using OLS to regress E on G, TIME2,TIME3, and the country dummy variables (a) dropping the intercept, and (b) drop-ping one of the dummy variables. Perform appropriate statistical tests and give aninterpretation of the coefficients in each case. Explain why either the intercept or oneof the dummy variables must be dropped.

14.4 Perform a test for fixed effects in the OECD2000 regression by evaluating theexplanatory power of the country dummy variables as a group.

14.5 Download the NLSY2000 data set from the website. See Appendix B for details. Thiscontains the variables found in the EAEF data sets for the years 1980–94, 1996,1998, and 2000 (there were no surveys in 1995, 1997, or 1999). Assuming that arandom effects model is appropriate, investigate the apparent impact of unobservedheterogeneity on estimates of the coefficient of schooling by fitting the same earningsfunction, first using pooled OLS, then using random effects.

14.6 The UNION variable in the NLSY2000 data set is defined to be equal to 1 if therespondent was a member of a union in the year in question and 0 otherwise.Assuming that a random effects model is appropriate, add UNION to the earningsfunction specification and fit it using pooled OLS and random effects.

14.7 Using the NLSY2000 data set, perform a fixed effects regression of the earningsfunction specification used in Exercise 14.5 and compare the estimated coefficients

DOUGH: “CHAP14” — 2006/8/29 — 17:05 — PAGE 423 — #16


with those obtained using OLS and random effects. Perform a Durbin–Wu–Hausmantest to discriminate between random effects and fixed effects.

14.8 Using the NLSY2000 data set, perform a fixed effects regression of the earningsfunction specification used in Exercise 14.6 and compare the estimated coefficientswith those obtained using OLS and random effects. Perform a Durbin–Wu–Hausmantest to discriminate between random effects and fixed effects.

14.9 The within-groups version of the fixed effects regression model involved subtractingthe group mean relationship

Yi = β1 +k∑

j=2

(βjXij) + δt + αi + εit

from the original specification in order to eliminate the individual-specific effect αi.Regressions using the group mean relationship are described as between effects regres-sions. Explain why the between effects model is in general inappropriate for estimatingthe parameters of a model using panel data. (Consider the two cases where the αi arecorrelated and uncorrelated with the Xj controls.)

ECON 5350 Class NotesChapter 13. Panel Data

1 Introduction

Panel (a.k.a., longitudinal or time series-cross sectional) data are observed both across sections and over

time. The advantages of panel data are that it

1. increases the number of observations,

2. increases the precision of parameter estimates and

3. allows one to sort out effects that may be impossible with only cross sectional or only time series data

(e.g., technological progress versus economies of scale).

Three famous U.S. panel data sets are the

• Panel Study of Income Dynamics (PSID),

• National Longitudinal Survey (NLS) and

• Current Population Survey (CPS).

For example, the PSID follows 6,000 families and 15,000 individuals since 1968, asking questions related

to income, job changes, marital status, other socioeconomic and demographic characteristics, etc.

Although there are clear advantages of panel data, there are also some complications. Below, I present

an introduction to estimation with panel data. Begin with the following model

yit = αi + x0itβ + it (1)

where i = 1, ..., n and t = 1, ..., T . Of course, when αi = α for i = 1, ..., n, then the data can simply

be pooled together, the model written in standard form Y = Xβ + , and estimated with standard linear

techniques.

1

2 Fixed-Effects (FE) Model

In the fixed-effects model, we treat αi as a group-specific constant term to be estimated with the other

parameters. Now write the model as

Y = Dα+Xβ + (2)

where

Y =

y1

y2...

yn

nT×1

D =

iT 0 · · · 0

0 iT 0

.... . .

...

0 0 · · · iT

nT×n

X =

x11 x12 · · · x1k

x21 x22 x2k...

. . ....

xn1 xn2 · · · xnk

nT×k

and

α =

α1

α2...

αn

n×1

β =

β1

β2...

βk

k×1

.

This commonly referred to as the Least Squares Dummy Variable (LSDV) model. Theoretically, there are

no problems in estimating (2) — assuming the standard assumptions hold, the Gauss-Markov theorem applies

and we obtain unbiased and efficient estimates. There may be computational problems, however. Notice

that the stacked coefficient vector is of length (n+k). Therefore, the standard least squares formula requires

inverting a matrix of size (n + k) × (n + k). Since many panel data sets have n > 1000, this can lead to

numerical errors.

2.1 Partitioned Regression

A partitioned regression provides a simple solution to the above problem. Recall that

b = (X 0MDX)−1(X 0MDY )

2

where

MD = I −D(D0D)−1D0 =

IT − 1T ii

0 0 · · · 0

0 IT − 1T ii

0 0

.... . .

...

0 0 · · · IT − 1T ii

0

is a symmetric, idempotent "residual-maker" matrix for the regression on the dummy variables D. Since

MD is symmetric, idempotent and premultiplication produces the average over t = 1, ..., T for each i, the

partitioned regression is equivalent to regressing

Y∗ =MDY = Y − Y =

y1 − y1

y2 − y2...

yn − yn

nT×1

on

X∗ =MDX = X − X =

x11 − x11 x12 − x12 · · · x1k − x1k

x21 − x21 x22 − x22 x2k − x2k...

. . ....

xn1 − xn1 xn2 − xn2 · · · xnk − xnk

nT×k

.

This only requires inversion of a (k × k) matrix. The group-specific constant terms can then be recovered

according to

ai = yi − b0xi

for i = 1, ..., n. The partitioned regression approach is basically a two-stage estimation procedure:

• Step #1. Transform the data by subtracting group means.

• Step #2. Run OLS on the transformed data.

2.2 Variance Estimation

The variance estimator for b is as expected

dvar(b) = s2(X 0MDX)−1

3

where the appropriate estimator for σ2 is

s2 =e0e

nT − n− k=

Pni=1

PTt=1(yit − ai − x0itb)

2

nT − n− k.

Keep in mind that if you use the partitioned-matrix approach, standard econometric programs may incor-

rectly use the degrees of freedom correction nT − k when nT − n− k is appropriate. Finally, the variance

estimator for the ai is dvar(ai) = s2

T+ x0is2(X 0MDX)

−1xi.

2.3 Testing for Group-Specific Effects

The standard F test can be used to test whether the pooled or fixed-effects model is more appropriate. The

null hypothesis is

H0: α1 = α2 = · · · = αn.

The F statistic is

F =(R2LSDV −R2pooled)/(n− 1)(1−R2LSDV )/(nT − n− k)

∼ F (n− 1, nT − n− k)

which could alternatively be written in sum-of-squared-errors form.

3 Random-Effects (RE) Model

An alternative approach is to treat αi in equation (1) as a random draw from a distribution rather than

being nonstochastic. The advantages are

• The RE model has fewer parameters to estimate.

• The RE model allows for additional explanatory variables that have equal value for all observationswithin a group (e.g., education level of parents, number of siblings, etc.).

The disadvantage of the RE approach is that

• if the unobserved group-specific effects are correlated with the explanatory variables, then the estimateswill be biased and inconsistent.

• the estimator is a bit more complicated.

4

3.1 Basic Framework

From equation (1), we will let αi = α+ µi so that the model is

yit = α+ x0itβ + (µi + it)

where the following assumptions are made

• E( it) = 0, ∀i, t

• E(µi) = 0, ∀i

• E( 2it) = σ2, ∀i, t

• E(µ2i ) = σ2µ, ∀i

• E( itµj) = 0, ∀i, t, j

• E( it js) = 0, ∀s 6= t or i 6= j

• E(µiµj) = 0, ∀i 6= j.

The RE model can be written in matrix form as

Y = Xβ + η

where η ∼ N(0,Ω). Given the assumptions above, the (nT × nT ) variance-covariance matrix Ω has the

following structure

Ω = (In ⊗ ΣT ) =

ΣT 0 · · · 0

0 ΣT 0

.... . .

...

0 0 · · · ΣT

where

ΣT =

σ2 + σ2µ σ2µ · · · σ2µ

σ2µ σ2 + σ2µ · · · σ2µ...

.... . .

...

σ2µ σ2µ · · · σ2 + σ2µ

T×T

.

5

3.2 Estimation

Although OLS will produce consistent estimates, because Ω is not diagonal, the OLS estimates will be

inefficient. Therefore, GLS is the efficient estimator. Recall that the GLS estimator is

βGLS = (X0∗X∗)

−1(X 0∗Y∗) = ((PX)

0(PX))−1((PX)0(PY )) = (X 0Ω−1X)−1(X 0Ω−1Y ). (3)

For the RE model,

P = Ω−1/2 = [In ⊗ Σ−1/2T ]

where

Σ−1/2T =

1

σ[IT − θ

TiT i

0T ]

and

θ = 1− σqσ2 + Tσ2µ

.

Therefore, the GLS estimator can be calculated by running a regression of the pseudo-deviations

Y∗ =

y1 − θy1

y2 − θy2...

yn − θyn

on the similarly transformed X∗.

3.3 Feasible Estimation

To make (3) operational, all that is left is to estimate σ2 and σ2µ. We will do this sequentially — first we

estimate σ2 and then use that to estimate σ2µ.

3.3.1 Estimation of σ2.

We begin by using the "within-groups" information given by the difference between

yit = α+ β0xit + (µi + it) (4)

6

and

yi = α+ β0xi + (µi + i). (5)

This produces

yit − yi = β0(xit − xi) + ( it − i). (6)

Now that the unobserved group-specific random effects are gone, we estimate (6) using the LSDV estimator

and use the residuals to get the following estimate of σ2:

σ2 =

Pni=1

PTt=1(eit − ei)

2

nT − n− k.

3.3.2 Estimation of σ2µ.

Now we use the "between-groups" information to estimate σ2µ. Consider (5) again

i + µi = yi − α− β0xi.

The variance of (5) is

σ2∗∗ =σ2

T+ σ2µ.

Therefore, we can estimate σ2µ using

σ2µ =e0∗∗e∗∗n− k

− σ2

T.

Finally, insert the estimates σ2 and σ2µ into P and calculate βGLS .

4 Choosing Between Fixed and Random Effects Models

A frequent question with panel data is which model to use — fixed or random effects. The answer boils down

to whether the unobserved group-specific effects are correlated with the explanatory variables or not. If

they are, then the RE model will produce inconsistent estimates. If they are not, then the RE model may

be preferable. There are two methods for choosing between RE and FE models.

4.1 Think Through the Problem

Consider the following two problems where the RE model is almost certainly inappropriate.

7

1. Returns to Schooling. Labor economists use panel datasets to explain individual wages as a function of

years of schooling, as well as other socioeconomic and demographic characteristics. Individuals almost

certainly have unobserved innate abilities that are likely to be correlated with observable explanatory

variables such as years of schooling, marital status, type of employment, etc.

2. Economic Growth and R&D Spending. Consider a regression of GDP per capita on a number of

different country-specific variables such as research and development (R&D) spending, saving rates,

population growth rates, schooling, and capital-labor ratios. There are likely to be unobserved,

country-specific cultural differences that influence economic growth and, at the same time, are corre-

lated with the explanatory variables such as saving rates, population growth rates, etc.

4.2 Hausman Specification Test

The motivation behind the Hausman test is that under the null hypothesis of no correlation (i.e., H0:

corr(Xit, µi) = 0), then both the FE and RE estimators are consistent but only the RE estimator is efficient.

Under the alternative, while the FE estimator is consistent, the RE estimator is not. The statistic is

W = (bLSDV − βGLS)0[var(b)− var(βGLS)]

−1(bLSDV − βGLS)asy∼ χ2(k − 1).

5 Heteroscedasticity and Autocorrelation

In general, there are two ways to handle nonspherical disturbances — robust estimation of the asymptotic

variance-covariance matrix (e.g., White’s estimator or the Newey-West estimator) or respecification of the

error structure and application of generalized least squares. I will present only the latter (see Greene

section 13.7 for a discussion of robust estimation). Note that LIMDEP has canned routines for handling

heteroscedasticity and autocorrelation in panel-data models.

5.1 Heteroscedasticity in the FE Model

The most straightforward way to handle heteroscedasticity in the FE model is to begin by calculating an

estimate of σ2,i using the LSDV residuals

σ2,i =1

T

XT

t=1e2i,t. (7)

8

Feasible GLS estimates are then calculated by

βFGLS = (X0Ω−1X)−1(X 0Ω−1Y )

where

Ω =

σ2,1IT 0 · · · 0

0 σ2,2IT 0

.... . .

...

0 0 · · · σ2,nIT

nT×nT

.

5.2 Heteroscedasticity in the RE Model

Begin by considering the composite error term µi + it. Although it makes sense to allow E(µ2i ) = σ2µ,i,

we will only have one observation for each i on µi. Therefore, estimation of σ2µ,i would have to be µ2i ,

which is probably not desirable. Therefore, if we let the E( 2it) = σ2,i, then all we have to do is adjust our

transformation parameter as follows:

θi = 1− σ ,iqσ2,i + Tσ2µ

.

A similar result holds for unbalanced panels where heteroscedasticity is introduced by varying group sizes.

The only remaining task is to estimate σ2,i and σ2µ. Greene suggests using the LSDV residuals to estimate

the σ2,i since the model has been purged of the µi. This estimate is given in equation (7). The group-specific

variance σ2µ can then be estimated by

σ2µ =1

n

Xn

i=1(σ2OLS,i − σ2,i)

where σ2OLS,i are comparable to σ2,i, but estimated using the pooled OLS residuals. The RE estimator then

proceeds as described earlier.

5.3 Autocorrelation in the FE Model

Autocorrelation in the FE is fairly simple (although care needs to be made that the data are stacked in the

correct manner). Begin with AR(1) errors

it = ρi i,t−1 + νit (8)

9

where some researchers choose to set ρi = ρ for all i = 1, ..., n. Either way, the ρis can be estimated with

the LSDV residuals, the data pseudo-differenced, and feasible GLS applied.

5.4 Autocorrelation in the RE Model

Autocorrelation in the RE model is only slightly more complex. Of course, in the RE model there is

always going to be autocorrelation in the composite error term µi + it because µi does not vary over time.

Therefore, it only makes sense to specify autocorrelation in it, as is done in equation (8). The LSDV

residuals can be used to get an estimate of ρ (or ρi if desired) and Cochrane-Orcutt can then be applied.

The transformed model will take the form

yit − ρiyi,t−1 = α(1− ρi) + β0(xit − ρixi,t−1) + µi(1− ρi) + νit

for t = 2, ..., T .

6 Other Types of Panel-Data Models

6.1 Unbalanced Panels

Up to this point, we have implicitly assumed that each cross section is observed for T periods. However,

because of sample attrition and new entry, it is common to have not observe each cross section for all periods.

In this case, the total number of observations will not be nT , but ratherPn

i=1 Ti. Below, I describe how to

modify the FE and RE models for an unbalanced panel, which is already programmed into most econometric

software packages.

6.1.1 FE Model

The FE model works with an unbalanced panel with no real change, provided the dummy variables are

updated to no longer have the same number of ones in each column.

6.1.2 RE Model

The RE model is only slightly more complicated. With an unbalanced panel, now define

θi = 1− σqσ2 + Tiσ2µ

10

where i = 1, ..., n. The data are then transformed according to

Y∗ =

y1 − θ1y1

y2 − θ2y2...

yn − θnyn

and similarly for X. Of course, if Ti = T , then this collapses back to the standard RE estimator.

6.2 Time-Specific Effects

It is possible, using either the FE or RE approach, to incorporate time-specific effects:

yit = αi + β0xit + γt + it.

Using the FE approach, γt can be estimated by incorporating dummy variables for T −1 of the time periods.The RE approach, which treats γt as a random variable, is more complicated.

6.3 Dynamic Models

Sometimes it is desirable to allow for dynamic effects in the model

yit = αi + γyi,t−1 + x0itβ + it.

The problem with estimating such a model, either of the FE or RE nature, is that after transforming the

model to get rid of the group-specific effect (e.g., first-differencing the model), the right-hand side variables

are correlated with the error term. The solution to this problem involves using values of yi,t−s for s > 1 as

instrumental variables.

7 Gauss Application

Consider the following panel data model taken from Woolridge (2002)

Murdersit = αi + β1Executionsit + β2Unempit + it

11

where Murdersit is the number of murders in state i in year t per 10,000 people; Executionsit is the

total number of executions for the current and prior two years; Unempit is the current unemployment rate;

i = 1, ..., 50; and t = 1987, 1990, 1993. See Gauss example 13.1 for further details.

12

Documents

Panel Data Econometrics kenya