Panel Data - nuffield.ox.ac.uk

Panel Data

Panel data refers to datasets with both cross-sectional and time-series variation

We can think of this either as:

- repeated observations for the same cross-section of observation units in

different time periods (months, years, etc.);

or as:

- time-series data for multiple observation units (households, firms, etc.)

Panel data is also known as longitudinal data

Some examples of panel datasets

Household surveys

- US Panel Study of Income Dynamics (PSID) or Health and Retirement

Study (HRS)

- British Household Panel Survey (BHPS) or English Longitudinal Study

of Ageing (ELSA)

Company accounts

- Compustat or Worldscope databases for publicly traded firms

- Bureau van Dijk datasets (FAME, AMADEUS, ORBIS) for all firms, including subsidiaries

‘Census’ of production microdata for business establishments (‘plants’)

- US Longitudinal Research Database (LRD) or UK Annual Respondents

Database (ARD)

Sectoral/regional panel datasets

- sectoral national accounts or regional accounts for individual countries

- European System of national and regional Accounts (ESA, regional data

for EU countries); OECD STAN or EU KLEMS databases (sectoral data for

OECD/EU countries)

Country panel datasets

- World Bank World Development Indicators (WDI) or Penn World Tables

Panel datasets come in all shapes and sizes

We draw a broad distinction between panels in which the number of time periods is large enough to consider estimating separate time series models for each observation unit, and to rely on asymptotic properties derived by considering the number of time period observations (T) going to infinity (long T panels)

And panels in which the number of time periods is too small for this approach to be viable (short T panels)

For short T panels with many cross-sectional observations, we rely instead

on asymptotic properties derived by considering the number of cross-section

observation units (N) going to infinity

Datasets with a small number of observation units observed for a small number of time periods are more problematic, and it may not be appropriate to rely on any asymptotic properties to characterize the behaviour of estimators and test statistics in this case

For long T panels, a natural starting point is to estimate a separate time

series model for each observation unit

We may then want to summarize the results for the panel as a whole by considering means or medians of the estimated parameters, or other summary statistics

- taking simple averages of the estimated parameters in N linear regression models estimated by OLS results in the Mean Group estimator, due to Pesaran and Smith (1995)
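The averaging step can be sketched in a few lines of NumPy. This is only an illustrative sketch on simulated data: the function name `mean_group`, the array layout (`y` of shape N × T, `X` of shape N × T × K) and all numerical values are assumptions made for the example, not part of the original notes.

```python
import numpy as np

def mean_group(y, X):
    """Average unit-by-unit OLS coefficients.

    y: array of shape (N, T); X: array of shape (N, T, K).
    Returns (mean-group estimate, N x K array of unit-level estimates).
    """
    # One OLS regression per observation unit i, using its T observations.
    betas = np.array([np.linalg.lstsq(X[i], y[i], rcond=None)[0]
                      for i in range(y.shape[0])])
    return betas.mean(axis=0), betas

# Simulated long T panel with unit-specific slopes (illustrative values only).
rng = np.random.default_rng(0)
N, T = 20, 200
slopes = 1.0 + 0.5 * rng.standard_normal(N)      # heterogeneous slopes
x = rng.standard_normal((N, T))
X = np.stack([np.ones((N, T)), x], axis=2)       # intercept and one regressor
y = 2.0 + slopes[:, None] * x + rng.standard_normal((N, T))

beta_mg, beta_i = mean_group(y, X)
```

Replacing `betas.mean(axis=0)` with `np.median(betas, axis=0)` would give the median of the unit-level estimates instead.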

Alternatively we may want to test and impose restrictions that some or all

of the estimated parameters are common to all the observation units

- such restrictions are known as ‘pooling’ restrictions, and estimators which impose them are referred to as ‘pooled’ estimators

We will look in a little more detail at some of the methods proposed for

long T panels later in the course, but our main focus will be on methods for

short T panels

Short T Panels

Having data for a small number of time periods limits our ability to allow

for heterogeneity across observation units in all parameters of the model

But we can still allow for unrestricted heterogeneity in the intercepts of

linear models, which in some contexts can be an important advantage over

cross-section datasets in controlling for omitted variables

Panel data is particularly useful when both:

- outcome variables and explanatory variables of interest vary (non-trivially)

over time

- important omitted variables are plausibly time-invariant (or very nearly)

In this setting, allowing for individual-specific intercepts controls for time-invariant omitted variables, while parameters of interest can still be estimated by exploiting the variation over time in y and X

One way to think about panel data in this setting is that it provides us with observations on the outcome y ‘before and after’ some change in the explanatory variable X

Examples

For micro production functions, measures of inputs and output generally vary over time, while hard-to-measure factors like technology and management quality may have important time-invariant components

For empirical growth models, measures of GDP per capita and investment rates generally vary over time, while historical and geographical determinants of income levels are time-invariant, and hard-to-measure factors like institutional quality may have an important time-invariant component

Warning

Having repeated observations over time is much less useful if the outcomes

or explanatory variables of interest have little or no variation over time

This should be intuitive - having repeated observations on things that don’t

change adds no information over having an observation at one point in time

For this reason, panel data is less useful if we are interested in the effect of

education on earnings

- earnings vary over time, but for most working individuals observed in

labour force surveys, measures of educational attainment vary very little

once they have completed full-time education and entered the labour market

Types of Variables and Error Components

With panel data, we distinguish between 3 types of observed variables

i) those that vary both across individual observation units and over time (for example, sales of firm i in year t are different from sales of the same firm i in year t − 1, and different from sales of firm j ≠ i in year t)

ii) those that vary across individual observation units but are constant over

time for each observation unit (for example, characteristics of individuals like

race or gender)

iii) those that vary over time but are the same for all observation units

in a given time period (for example, the exchange rate between countries A

and B in a model of exports to country B for a panel of exporting firms in

country A)

For observed variables, we reflect this in our notation by using 2 subscripts for variables which vary over both dimensions of the panel, and only 1 subscript for variables which vary only over one dimension

For example: $SALES_{it}$ denotes sales of firm i in year t; $MALE_i$ denotes an indicator for whether individual i is male; $EXCHRATE_t$ denotes the exchange rate between countries A and B in period t
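The three types of variable can be illustrated with a small panel stored in 'long' form, one row per (i, t) pair. This is a sketch under assumed names: `unit`, `period`, `x_it`, `w_i`, `s_t` and the helper `varies_within` are hypothetical, and the data are made up.

```python
import numpy as np

N, T = 3, 4
unit = np.repeat(np.arange(N), T)      # i index: 0,0,0,0,1,1,1,1,...
period = np.tile(np.arange(T), N)      # t index: 0,1,2,3,0,1,2,3,...

rng = np.random.default_rng(1)
x_it = rng.uniform(10, 20, size=N * T)            # type (i): varies over i and t
w_i = np.array([1.0, 0.0, 1.0])[unit]             # type (ii): varies over i only
s_t = np.array([1.2, 1.3, 1.1, 1.4])[period]      # type (iii): varies over t only

def varies_within(values, groups):
    """True if `values` takes more than one value inside at least one group."""
    return any(np.unique(values[groups == g]).size > 1
               for g in np.unique(groups))
```

Here `varies_within(w_i, unit)` is false because each unit has a single value, while `varies_within(s_t, period)` is false because all units share the same value in a given period.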

Similarly we can decompose the error term in linear models, or unobserved

components of models more generally, into a component which varies across

individuals but not over time, a component which varies over time but not

across individuals, and a remaining component which varies both across

individuals and over time

- and use the same notational convention to distinguish between these three

error components

With this notation, the most general linear model we could consider can be written in the form:

$$y_{it} = x_{it}\beta + w_i\gamma + s_t\delta + (\eta_i + f_t + v_{it})$$

for $i = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$, where

$y_{it}$ is a scalar outcome variable for individual i at time t

$x_{it}$ is a row vector containing observations on a set of explanatory variables that vary over individuals and time

$w_i$ is a row vector containing observations on a set of time-invariant explanatory variables

$s_t$ is a row vector containing observations on a set of common, time-varying explanatory variables

β, γ and δ are the corresponding column vectors of parameters

and the error term is decomposed into an individual-specific time-invariant component ($\eta_i$), a common time-varying component ($f_t$) and a residual component ($v_{it}$) which varies both across individuals and over time

This version of the linear model for panel data is called the three-way error components model (whether or not we include all 3 types of observed explanatory variables)

Time Dummies

Since our focus is on short T panels, we can allow for the common time-varying component of the error term ($f_t$) simply by specifying period-specific intercepts; this component of the error term simply indicates that the intercept in our linear model may take different values in different time periods

First combine the common time-varying component of the error term with any common time-varying explanatory variables included in the original model to form $\phi_t = s_t\delta + f_t$, and re-write the original model as

$$y_{it} = x_{it}\beta + w_i\gamma + (\eta_i + \phi_t + v_{it})$$

Now define a set of T dummy variables with:

$D^1_t = 1$ for observations in period 1, and $D^1_t = 0$ otherwise

$D^2_t = 1$ for observations in period 2, and $D^2_t = 0$ otherwise

....

$D^T_t = 1$ for observations in period T, and $D^T_t = 0$ otherwise

And re-write the model in the form

$$\begin{aligned}
y_{it} &= x_{it}\beta + w_i\gamma + \phi_1 D^1_t + \phi_2 D^2_t + \ldots + \phi_T D^T_t + (\eta_i + v_{it}) \\
&= x_{it}\beta + w_i\gamma + \sum_{s=1}^{T} \phi_s D^s_t + (\eta_i + v_{it})
\end{aligned}$$

This is now a two-way error components model with period-specific intercepts
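Constructing the time dummies is mechanical. The sketch below assumes the data are stacked unit by unit with a zero-based `period` index; the variable names and the values in `phi` are hypothetical and chosen only to show that the dummies pick out the right period-specific intercept for each observation.

```python
import numpy as np

N, T = 4, 3
period = np.tile(np.arange(T), N)     # t index for data stacked unit by unit

# D[:, s] equals 1 for observations in period s and 0 otherwise.
D = (period[:, None] == np.arange(T)[None, :]).astype(float)

phi = np.array([0.5, -1.0, 2.0])      # hypothetical period-specific intercepts
phi_by_obs = D @ phi                  # sum over s of phi_s * D^s_t, per observation
```

Each row of `D` contains exactly one 1, so `phi_by_obs` simply repeats the intercept of the period to which the observation belongs.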

We estimate the T additional parameters ($\phi_1, \phi_2, \ldots, \phi_T$) together with the original β and γ parameter vectors

Since the number of time periods T is small, this does not present any problems

The dummy variables ($D^1_t, D^2_t, \ldots, D^T_t$) are known as time dummies (or, with annual observations, year dummies)

There is no loss of generality with this approach, although we lose identification of the δ parameters on any observed common time-varying explanatory variables

With short T panels, this is usually not a concern, as short T panels are in any case not ideally suited to estimating the effects of such common time-varying covariates

Having estimated the $\phi_t$ parameters, we have the option of plotting these as a time series and investigating whether they vary with the business cycle, or have some other temporal pattern

The model with a full set of T time dummies is said to be ‘saturated’ in the time dimension

We have observations on T time periods, so only T degrees of freedom in the time dimension; here we estimate T parameters on included variables which vary only over time, and this is the maximum number of parameters we could estimate on explanatory variables of this type

If we tried to add one or more additional common time-varying explanatory variables, this would show up as a perfect multicollinearity problem

This also indicates that if we omit time dummies, we can include at most T common time-varying explanatory variables in the vector $s_t$
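The perfect multicollinearity can be checked numerically. In this sketch the common time-varying variable `s` and its values are hypothetical; the point is only that any variable which is constant within each period is an exact linear combination of the T time dummies, so adding it does not raise the column rank.

```python
import numpy as np

N, T = 5, 4
period = np.tile(np.arange(T), N)
D = (period[:, None] == np.arange(T)).astype(float)   # full set of T time dummies

# Hypothetical common time-varying variable: one value per period.
s = np.array([1.2, 0.7, 1.9, 1.1])[period]

rank_dummies = np.linalg.matrix_rank(D)
rank_with_s = np.linalg.matrix_rank(np.column_stack([D, s]))
# s equals D times its vector of period values, so the rank stays at T.
```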

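Having shown that it is easy to allow for period-specific intercepts, we will now suppress these and focus on the two-way error components model

$$y_{it} = x_{it}\beta + w_i\gamma + (\eta_i + v_{it})$$

for $i = 1, \ldots, N$ and $t = 1, \ldots, T$, where

$$\underset{1 \times K}{x_{it}} = (x_{1it}, \ldots, x_{Kit}), \quad
\underset{K \times 1}{\beta} = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_K \end{pmatrix}, \quad
\underset{1 \times G}{w_i} = (w_{1i}, \ldots, w_{Gi}), \quad
\underset{G \times 1}{\gamma} = \begin{pmatrix} \gamma_1 \\ \vdots \\ \gamma_G \end{pmatrix}$$

and $y_{it}$, $\eta_i$, and $v_{it}$ are scalars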

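Before studying estimation, we introduce some further notation

First stack the T observations for each individual

$$y_i = X_i\beta + W_i\gamma + (\eta_i j_T + v_i)$$

for $i = 1, \ldots, N$

$$\underset{T \times 1}{y_i} = \begin{pmatrix} y_{i1} \\ \vdots \\ y_{iT} \end{pmatrix}, \quad
\underset{T \times K}{X_i} = \begin{pmatrix} x_{1i1} & \cdots & x_{Ki1} \\ \vdots & \ddots & \vdots \\ x_{1iT} & \cdots & x_{KiT} \end{pmatrix}, \quad
\underset{T \times G}{W_i} = \begin{pmatrix} w_{1i} & \cdots & w_{Gi} \\ \vdots & \ddots & \vdots \\ w_{1i} & \cdots & w_{Gi} \end{pmatrix}, \quad
\underset{T \times 1}{\eta_i j_T} = \begin{pmatrix} \eta_i \\ \vdots \\ \eta_i \end{pmatrix}, \quad
\underset{T \times 1}{v_i} = \begin{pmatrix} v_{i1} \\ \vdots \\ v_{iT} \end{pmatrix}$$

$j_T$ is a $T \times 1$ column vector with each element equal to one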

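Then stack over the N individuals

$$y = X\beta + W\gamma + (\eta + v)$$

$$\underset{NT \times 1}{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}, \quad
\underset{NT \times K}{X} = \begin{pmatrix} X_1 \\ \vdots \\ X_N \end{pmatrix}, \quad
\underset{NT \times G}{W} = \begin{pmatrix} W_1 \\ \vdots \\ W_N \end{pmatrix}, \quad
\underset{NT \times 1}{\eta} = \begin{pmatrix} \eta_1 j_T \\ \vdots \\ \eta_N j_T \end{pmatrix}, \quad
\underset{NT \times 1}{v} = \begin{pmatrix} v_1 \\ \vdots \\ v_N \end{pmatrix}$$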
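The two stacking steps map directly onto array operations. The sketch below uses simulated data with assumed dimensions and parameter values (none of them from the notes) and stacks unit by unit, so that the first T rows belong to individual 1, the next T rows to individual 2, and so on.

```python
import numpy as np

N, T, K, G = 3, 4, 2, 1
rng = np.random.default_rng(2)
x = rng.standard_normal((N, T, K))      # x[i] is the T x K matrix X_i
w = rng.standard_normal((N, G))         # w[i] is the 1 x G row vector w_i
eta = rng.standard_normal(N)            # individual effects eta_i
v = rng.standard_normal((N, T))         # v[i] is the T-vector v_i
beta = np.array([1.0, -0.5])            # illustrative parameter values
gamma = np.array([2.0])

j_T = np.ones(T)
X = x.reshape(N * T, K)                 # X_1, ..., X_N stacked vertically
W = np.repeat(w, T, axis=0)             # each w_i repeated T times gives W_i
eta_stacked = np.kron(eta, j_T)         # eta_1 j_T, ..., eta_N j_T stacked
y = X @ beta + W @ gamma + eta_stacked + v.reshape(N * T)
```

The block of `y` for any one individual reproduces the individual-level equation, which is a useful check on the ordering of the stacked data.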

As discussed, panel data is most useful for estimating parameters in β on

time-varying explanatory variables. We now simplify further by omitting

consideration of time-invariant explanatory variables

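If our parameters of interest are in β, this is again without loss of generality, since we can combine $w_i\gamma + \eta_i = \eta_i^*$, which just re-defines the individual-specific component of the error term. Setting G = 0

$$y = X\beta + (\eta + v)$$

$$\underset{NT \times 1}{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}, \quad
\underset{NT \times K}{X} = \begin{pmatrix} X_1 \\ \vdots \\ X_N \end{pmatrix}, \quad
\underset{NT \times 1}{\eta} = \begin{pmatrix} \eta_1 j_T \\ \vdots \\ \eta_N j_T \end{pmatrix}, \quad
\underset{NT \times 1}{v} = \begin{pmatrix} v_1 \\ \vdots \\ v_N \end{pmatrix}$$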

Two important assumptions that we maintain for short T panels:

$$y_i = X_i\beta + (\eta_i j_T + v_i)$$

Cross-sectional independence: Observations on $(y_i, X_i)$ are independent over $i = 1, \ldots, N$

Slope parameter homogeneity: The parameters in β are common to all $i = 1, \ldots, N$

The form of unobserved heterogeneity that we address relates to the individual-specific intercept terms ($\eta_i$) in our linear model relating $y_{it}$ to $x_{it}$ (known as ‘fixed effects’ or ‘random effects’, depending on whether they are assumed to be correlated or uncorrelated with the explanatory variables in $x_{it}$)

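$$\begin{aligned}
y_{it} &= x_{it}\beta + (\eta_i + v_{it}) \\
&= x_{it}\beta + u_{it} \quad \text{for } i = 1, \ldots, N \text{ and } t = 1, \ldots, T
\end{aligned}$$

$$\begin{aligned}
y &= X\beta + (\eta + v) \\
&= X\beta + u
\end{aligned}$$

$$u_{it} = \eta_i + v_{it}; \quad u = \eta + v$$

We can now define the (pooled) ordinary least squares estimator of the parameter vector β

$$\hat{\beta}_{OLS} = (X'X)^{-1}X'y$$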
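The pooled OLS formula can be computed directly from the stacked arrays. This is a minimal sketch on simulated data in which the individual effects are drawn independently of the regressors; the function name `pooled_ols` and all numerical values are assumptions for the example. Solving the normal equations is used here in place of forming the inverse explicitly.

```python
import numpy as np

def pooled_ols(y, X):
    """Pooled OLS: solve (X'X) b = X'y for the stacked NT observations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(3)
N, T = 500, 4
beta = np.array([1.0, -0.5])                  # illustrative true parameters
X = rng.standard_normal((N * T, 2))
eta = np.repeat(rng.standard_normal(N), T)    # eta_i, drawn independently of X
v = rng.standard_normal(N * T)
y = X @ beta + eta + v

beta_hat = pooled_ols(y, X)
```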

Properties of the (pooled) OLS estimator

We assume that both error components have expected values of zero, so that we have $E(u_{it}) = E(\eta_i) + E(v_{it}) = 0$. Note that $E(\eta_i)$ is defined over the individual observation units, while $E(v_{it})$ and $E(u_{it})$ are defined over individual observation units and time periods

We also assume that there is no correlation between (any of the explanatory variables in) $x_{it}$ and the time-varying component of the error term $v_{it}$

- otherwise we would have a source of simultaneity, and the OLS estimator would be inconsistent

In the panel data context, we say that the explanatory variables in $x_{it}$ are predetermined with respect to $v_{it}$

- we do not rule out the possibility that current $x_{it}$ may be correlated with lagged values $v_{i,t-k}$ for some $k > 0$

Assumption ($x_{it}$ predetermined)

$$E(x_{it}v_{it}) = 0 \quad \text{for } t = 1, 2, \ldots, T$$

Whether or not the OLS estimator of β is consistent then depends on whether or not (any of the explanatory variables in) $x_{it}$ are correlated with the time-invariant component of the error term, or the individual-specific effects, $\eta_i$

Under the further assumption that all of the explanatory variables in $x_{it}$ are uncorrelated with the individual-specific effects, we satisfy all of the conditions required to establish that $\hat{\beta}_{OLS}$ is a consistent estimator of β (provided that X is full rank so that $(X'X)^{-1}$ exists and the OLS estimator can be computed)

Assumption (uncorrelated individual effects, or ‘random effects’)

$$E(x_{it}\eta_i) = 0 \quad \text{for } t = 1, 2, \ldots, T$$

Under these assumptions, we have $E(x_{it}u_{it}) = E(x_{it}\eta_i) + E(x_{it}v_{it}) = 0$, so that we have the key orthogonality condition needed for OLS to be consistent: $\hat{\beta}_{OLS} \xrightarrow{P} \beta$ as $NT \to \infty$

This consistency result holds as the total sample size (NT) goes to infinity

Importantly, it also holds in the semi-asymptotic sense, letting $N \to \infty$ with T held fixed (large N, fixed T asymptotics), which is more useful for approximating the behaviour of estimators in short T panels

Notice that the error term $u_{it} = \eta_i + v_{it}$ is serially correlated; even if the time-varying component $v_{it}$ is serially uncorrelated, we have $u_{it} = \eta_i + v_{it}$ and $u_{i,t-1} = \eta_i + v_{i,t-1}$, and these two errors are positively correlated through the common $\eta_i$ component
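The size of this correlation can be worked out under some additional simplifying assumptions that are not stated in the notes: $\eta_i$ and $v_{it}$ are uncorrelated with each other, $v_{it}$ is serially uncorrelated, and the two components have constant variances, written here as $\sigma^2_\eta$ and $\sigma^2_v$. Using the zero means assumed above:

```latex
\begin{aligned}
\operatorname{Cov}(u_{it}, u_{i,t-1})
  &= E\big[(\eta_i + v_{it})(\eta_i + v_{i,t-1})\big] \\
  &= E(\eta_i^2) + E(\eta_i v_{i,t-1}) + E(v_{it}\eta_i) + E(v_{it} v_{i,t-1}) \\
  &= \sigma^2_\eta \\[4pt]
\operatorname{Var}(u_{it}) &= \sigma^2_\eta + \sigma^2_v \\[4pt]
\operatorname{Corr}(u_{it}, u_{i,t-1}) &= \frac{\sigma^2_\eta}{\sigma^2_\eta + \sigma^2_v}
\end{aligned}
```

Under these assumptions the same correlation applies between the errors for any two different periods for the same individual, and it is positive whenever the individual effects have non-zero variance.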

Consequently valid inference for the (pooled) OLS estimator in panel data

models with individual-specific effects requires the use of cluster-robust

standard errors and test statistics

- clustering on the identifier for individual observation units (the i subscript

variable) allows for this correlation between the error terms for the same

individual in different time periods
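One common form of cluster-robust covariance estimator is the 'sandwich' formula $(X'X)^{-1}\big(\sum_i X_i'\hat{u}_i\hat{u}_i'X_i\big)(X'X)^{-1}$. The sketch below implements that formula in NumPy without any finite-sample correction, so its results will differ slightly from packaged routines that apply one; the function name and the simulated data are assumptions for illustration only.

```python
import numpy as np

def cluster_robust_cov(X, resid, cluster):
    """Sandwich covariance (X'X)^-1 [sum_i X_i' u_i u_i' X_i] (X'X)^-1.

    No finite-sample correction is applied.
    """
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(cluster):
        score = X[cluster == g].T @ resid[cluster == g]   # K-vector X_i' u_i
        meat += np.outer(score, score)
    return bread @ meat @ bread

rng = np.random.default_rng(4)
N, T = 300, 5
unit = np.repeat(np.arange(N), T)
# Regressor with a persistent unit-level part, drawn independently of eta.
x = np.repeat(rng.standard_normal(N), T) + rng.standard_normal(N * T)
X = np.column_stack([np.ones(N * T), x])
eta = np.repeat(rng.standard_normal(N), T)
y = 1.0 + 0.5 * x + eta + rng.standard_normal(N * T)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
V_cluster = cluster_robust_cov(X, resid, unit)
se_cluster = np.sqrt(np.diag(V_cluster))

# Conventional OLS standard errors, which ignore the within-unit correlation.
sigma2 = resid @ resid / (N * T - X.shape[1])
se_classic = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
```

In this simulated design both the regressor and the error are positively correlated within units, so the conventional standard errors understate the sampling variability relative to the clustered ones.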

Another consequence of this serial correlation is that the OLS estimator of β is not efficient in panel data models with individual-specific effects

- we will discuss the efficient estimator of β in a particular version of the linear model with uncorrelated individual effects later in the course

The assumption that all of the explanatory variables in $x_{it}$ are uncorrelated with the individual-specific effects requires that all of the included explanatory variables are uncorrelated with any time-invariant omitted variables whose influence on $y_{it}$ is reflected in the value of $\eta_i$

This assumption is highly restrictive in many economic applications

Notice that the OLS estimator of β in a single cross-section data sample would also be consistent under the same assumptions made in this section

- the consistency of the (pooled) OLS estimator thus follows from these assumptions, and not specifically from the availability of repeated observations over time, i.e. panel data

If one or more of the explanatory variables in $x_{it}$ is correlated with the individual-specific effects, the OLS estimator of β is inconsistent

- we can view $\eta_i$ as a time-invariant omitted variable, which in this case is both relevant and correlated with at least some of the included explanatory variables, so that we have a form of ‘omitted variable bias’

Assumption (correlated individual effects, or ‘fixed effects’)

$$E(x_{it}\eta_i) \neq 0 \quad \text{for } t = 1, 2, \ldots, T$$

In this case we have $E(x_{it}u_{it}) = E(x_{it}\eta_i) + 0 \neq 0$, so that the key orthogonality condition needed for OLS to be consistent is violated

Consequently the (pooled) OLS estimator is inconsistent in panel data

models with correlated individual effects

- this holds regardless of whether we consider $N \to \infty$ or $T \to \infty$ or both

OLS using the panel data (pooled OLS) is subject to the same kind of

omitted variable bias as OLS in a single cross-section

Simply using the repeated observations over time for each individual does

not change this
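A small simulation can make the omitted variable bias concrete. The design is an assumption chosen for illustration: a single regressor $x_{it} = \rho\,\eta_i + e_{it}$ with unit-variance $\eta_i$ and $e_{it}$, and a true slope of 1. In this design the pooled OLS slope converges to $1 + \rho/(\rho^2 + 1)$, which equals 1 when $\rho = 0$ and 1.5 when $\rho = 1$.

```python
import numpy as np

def pooled_slope(N, T, rho, rng):
    """Pooled OLS slope when x_it = rho * eta_i + noise and the true slope is 1."""
    eta = np.repeat(rng.standard_normal(N), T)
    x = rho * eta + rng.standard_normal(N * T)
    y = 1.0 * x + eta + rng.standard_normal(N * T)
    X = np.column_stack([np.ones(N * T), x])
    return np.linalg.solve(X.T @ X, X.T @ y)[1]

rng = np.random.default_rng(5)
b_uncorrelated = pooled_slope(N=5000, T=4, rho=0.0, rng=rng)   # probability limit 1
b_correlated = pooled_slope(N=5000, T=4, rho=1.0, rng=rng)     # probability limit 1.5
```

Raising `N` or `T` in the correlated case tightens the estimate around 1.5 rather than around the true value of 1.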

However the availability of repeated observations over time allows us to

transform the original model in order to construct consistent estimators of

parameters on time-varying explanatory variables in models with correlated

individual effects
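As a hedged illustration of one such transformation (one of several possibilities, and not necessarily the one developed next in the course), subtracting the equation for period t − 1 from the equation for period t removes the individual effect, because $\eta_i$ does not vary over time:

```latex
\begin{aligned}
y_{it} &= x_{it}\beta + \eta_i + v_{it} \\
y_{i,t-1} &= x_{i,t-1}\beta + \eta_i + v_{i,t-1} \\
y_{it} - y_{i,t-1} &= (x_{it} - x_{i,t-1})\beta + (v_{it} - v_{i,t-1})
\end{aligned}
```

The differenced equation no longer involves $\eta_i$. OLS applied to it is consistent only if the differenced regressors are uncorrelated with the differenced errors, which is an additional assumption beyond those stated above.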

This is one of the major advantages of empirical work using panel data

compared to empirical work using single cross-section datasets

Panel data is useful when we suspect that cross-section regression results

would be biased, due to the presence of (relevant and correlated) omitted

variables

- particularly if it is plausible that important omitted variables are time-invariant (or vary little over the sample period)

- and the dependent variable and the explanatory variables of interest vary

over time

Examples

Micro production functions: firms with better managers tend to be larger,

with higher levels of capital and labour inputs

Empirical growth models: countries with better institutions or more favorable geography tend to have better environments for investment, and so have higher investment rates

Our next task will be to consider estimators for panel data which can

estimate parameters consistently in this setting
