

Intensive Course in Econometrics

Slides

Rolf Tschernig & Harry Haupt

University of Regensburg University of Bielefeld

March 2009

We are greatly indebted to Kathrin Kagerer, Joachim Schnurbus, and Roland Weigand, who helped us enormously to improve and correct this course material. Of course, the usual disclaimer applies.

© These slides may be printed and reproduced for individual or instructional use, but may not be printed for commercial purposes.


Contents

1 Introduction: What is Econometrics?

1.1 A Trade Example: What Determines Trade Flows?

1.2 Economic Models and the Need for Econometrics

1.3 Causality and Experiments

1.4 Types of Economic Data

2 The Simple Regression Model

2.1 The Population Regression Model

2.2 The Sample Regression Model

2.3 The OLS Estimator

2.4 Best Linear Prediction, Correlation, and Causality

2.5 Algebraic Properties of the OLS Estimator

2.6 Parameter Interpretation and Functional Form

2.7 Statistical Properties: Expected Value and Variance

2.8 Estimation of the Error Variance

3 Multiple Regression Analysis: Estimation

3.1 Motivation: The Trade Example Continued

3.2 The Multiple Regression Model of the Population

3.3 The OLS Estimator: Derivation and Algebraic Properties

3.4 The OLS Estimator: Statistical Properties

3.5 Model Specification I: Model Selection Criteria

4 Multiple Regression Analysis: Hypothesis Testing

4.1 Basics of Statistical Tests

4.2 Probability Distribution of the OLS Estimator

4.3 The t Test in the Multiple Regression Model

4.4 Empirical Analysis of a Simplified Gravity Equation

4.5 Confidence Intervals

4.6 Testing a Single Linear Combination of Parameters

4.7 The F Test

4.8 Reporting Regression Results

5 Multiple Regression Analysis: Asymptotics

5.1 Large Sample Distribution of the Mean Estimator

5.2 Large Sample Inference for the OLS Estimator

6 Multiple Regression Analysis: Interpretation

6.1 Level and Log Models

6.2 Data Scaling

6.3 Dealing with Nonlinear or Transformed Regressors

6.4 Regressors with Qualitative Data

7 Multiple Regression Analysis: Prediction

7.1 Prediction and Prediction Error

7.2 Statistical Properties of Linear Predictions

8 Multiple Regression Analysis: Heteroskedasticity

8.1 Consequences of Heteroskedasticity for OLS

8.2 Heteroskedasticity-Robust Inference after OLS

8.3 The General Least Squares (GLS) Estimator

8.4 Feasible Generalized Least Squares (FGLS)

9 Multiple Regression Analysis: Model Diagnostics

9.1 The RESET Test

9.2 Heteroskedasticity Tests

9.3 Model Specification II: Useful Tests

10 Appendix

10.1 A Condensed Introduction to Probability

10.2 Important Rules of Matrix Algebra

10.3 Rules for Matrix Differentiation

10.4 Data for Estimating Gravity Equations


Organisation

Contact

Prof. Dr. Rolf Tschernig

Building RW(L), 5th floor, room 514

Universitätsstr. 31, 93040 Regensburg, Germany

Tel. (+49) 941/943 2737, Fax (+49) 941/943 4917

Email: [email protected]

http://www.wiwi.uni-regensburg.de/tschernig/


Intensive Course in Econometrics — Organisation — UR March 2009 — R. Tschernig

Schedule and Location

Date of Course: February 16 to February 27, 2009

Osteuropa-Institut and University of Regensburg

Lectures: every morning 8.30 - 10.00 W 113 Tschernig

every morning 10.30 - 12.00 W 113 Tschernig

Exercises: every afternoon 14.00 - 17.00 W 113 or PC-room Kagerer, Schnurbus, Weigand

Exam

no exam in this course


Required Text

Wooldridge, J.M. (2009). Introductory Econometrics. A Modern Approach, 4th ed., Thomson South-Western.

Additional Reading

Stock, J.H. and Watson, M.W. (2007). Introduction to Econometrics, 2nd ed., Pearson, Addison-Wesley.

plus what will be announced during the course.


1 Introduction: What is Econometrics?

1.1 A Trade Example: What Determines Trade Flows?

Goal/Research Question: Identify the factors that influence imports to Kazakhstan and quantify their impact.


• Three basic questions that have to be answered during the analysis:

1. Which (economic) relationships could be / are “known” to be relevant for this question?

2. Which data can be useful for checking the possibly relevant economic conjectures/theories?

3. How do we decide which economic conjecture to reject and which to follow?

• Let’s have a first look at some data of interest: the imports (in current US dollars) to Kazakhstan from 55 originating countries in 2004.


Imports to Kazakhstan in 2004 in current US dollars

[Figure: bar chart of imports to Kazakhstan in 2004 by country of origin, with countries labelled by code from ALB to YUG and a vertical axis from 0 to 1,400,000,000.]

The original data are from the UN Commodity Trade Statistics Database (UN COMTRADE).


• See Section 10.4 in the Appendix for detailed data descriptions. Data are provided in the EViews file Kazakhstan imports 2004.wf1. We thank Richard Frensch, Osteuropa-Institut, Regensburg, Germany, who provided all data throughout this course for analyzing trade flows.

• A first attempt to answer the three basic questions:

1. Ignore for the moment all existing economic theory and simply hypothesize that observed imports depend somehow on the GDP of the exporting country.

2. Collect GDP data for the countries of origin, e.g. from the International Monetary Fund (IMF) – World Economic Outlook Database.

3. Plot the data, e.g. by using a scatter plot. Can you decide whether there is a relationship between the trade flows from the exporting countries and their GDP?


A scatter plot

[Figure: scatter plot of TRADE_0_D_O (vertical axis, 0 to 1.2E+09) against WEO_GDPCR_O (horizontal axis, 0 to 12000).]

Some questions:

• What do you see?

• Is there a relationship?

• If so, how to quantify it?

• Is there a causal relationship, i.e. what determines what?

• By how much do the imports from Germany change if the GDP in Germany changes by 1%?

• Are there other relevant factors determining imports, e.g. distance?

• Is it possible to forecast future trade flows?


• What have you done?

– You tried to simplify reality

– by building some kind of (economic) model.

• An (economic) model

– has to reduce the complexity of reality such that it is useful for answering the question of interest;

– is a collection of cleverly chosen assumptions from which implications can be inferred (using logic); example: the Heckscher-Ohlin model;

– should be as simple as possible and as complex as necessary;

– cannot be refuted or “validated” without empirical data of some kind.


• Let us consider a simple formal model for the relationship between imports and GDP of the originating countries

$$\text{imports}_i = \beta_0 + \beta_1\,\text{gdp}_i, \qquad i = 1, \dots, 55.$$

– Does this make sense?

– How to determine the values of the so-called parameters $\beta_0$ and $\beta_1$?

– Fit a straight line through the cloud!
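One common way to fit a straight line through a cloud of points is least squares, which Section 2.3 treats in detail. The following is only a minimal sketch under stated assumptions: the helper `fit_line` is a hypothetical name, and the numbers are invented for illustration, not taken from the Kazakhstan data set.

```python
def fit_line(x, y):
    """Return (b0, b1) of the line b0 + b1*x that minimizes the sum of
    squared vertical distances to the points (least squares)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

# Invented illustration data (NOT the course data).
gdp = [1.0, 2.0, 3.0, 4.0, 5.0]
imports = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(gdp, imports)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(gdp, imports)]
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

The residuals are the vertical distances between the points and the fitted line; with an intercept in the model they sum to zero up to rounding.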

[Figure: scatter plot of TRADE_0_D_O against WEO_GDPCR_O, as on the previous slide.]


[Figure: TRADE_0_D_O vs. WEO_GDPCR_O (vertical axis 0 to 1.2E+09, horizontal axis 0 to 12000).]

More questions:

– How to fit a line through the cloud of points?

– Which properties does the fitted line have?

– What to do with other relevant factors that are currently neglected in the analysis?

– Which criteria to choose for identifying a potential relationship?


[Figure: TRADE_0_D_O together with the series TRADE_0_D_F_LEV and TRADE_0_D_F_LEV2, plotted against WEO_GDPCR_O.]

Further questions:

– Is the potential relationship really linear? Compare it to the green points of a nonlinear relationship.

– And: how much may results change with a different sample, e.g. for 2003?


1.2 Economic Models and the Need for Econometrics

• Standard problems of economic models:

– The conjectured economic model is likely to neglect some factors.

– Numerical answers to the quantitative questions posed depend in general on the choice of the data set. A different data set leads to different numerical results.

⇒ Numerical answers always carry some uncertainty.
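The dependence of numerical results on the data set can be made concrete with a small simulation. This is a sketch under assumptions: the population relation y = 1 + 2x + noise, the sample size of 55, and the helper names `ols_slope` and `draw_sample` are all invented for illustration. Each new sample from the same population yields a different estimated slope.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def draw_sample(rng, n=55):
    """Draw n observations from an invented population y = 1 + 2x + noise."""
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, 3.0) for xi in x]
    return x, y

rng = random.Random(0)  # fixed seed so that the run is reproducible
slopes = [ols_slope(*draw_sample(rng)) for _ in range(200)]

mean_slope = sum(slopes) / len(slopes)
spread = (sum((s - mean_slope) ** 2 for s in slopes) / (len(slopes) - 1)) ** 0.5
print(f"smallest slope: {min(slopes):.3f}, largest slope: {max(slopes):.3f}")
print(f"mean: {mean_slope:.3f}, standard deviation across samples: {spread:.3f}")
```

The standard deviation across samples is one way to summarize how uncertain a single estimate is; quantifying this from one sample alone is a topic of the later chapters.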


• Econometrics

– offers solutions for dealing with unobserved factors in economic models,

– provides “both a numerical answer to the question and a measure how precise the answer is” (Stock & Watson 2007, p. 7),

– as will be seen later, provides tools that allow one to refute economic hypotheses using statistical techniques by confronting theory with data, and to quantify the probability that such decisions are wrong,

– as will be seen later as well, allows one to quantify the risks of forecasts, decisions, and even of its own analysis.


• Therefore: Econometrics can also be useful for providing answers to questions like:

– How reliable are predicted growth rates or returns?

– How likely is it that the value realized in the future will be close to the predicted value? In other words, how precise are the predictions?

• Main tool: Multiple regression model

It allows one to quantify the effect of a change in one variable on another variable, holding other things constant (ceteris paribus analysis).


• Steps of an econometric analysis:

1. Careful formulation of question/problem/task of interest.

2. Specification of an economic model.

3. Careful selection of a class of econometric models.

4. Collecting data.

5. Selection and estimation of an econometric model.

6. Diagnostics of correct model specification.

7. Usage of the model.

Note that there exists a large variety of econometric models and model choice depends very much on the research question, the underlying economic theory, availability of data, and the structure of the problem.


• Goals of this course: providing you with basic econometric tools such that you can

– successfully carry out simple empirical econometric analyses and provide quantitative answers to quantitative questions,

– recognize ill-conducted econometric studies and their consequences,

– recognize when to ask an expert econometrician for help,

– attend courses in advanced econometrics / empirical economics,

– study more advanced econometric techniques.


Some Definitions of Econometrics

– “... discover empirical relation between economic variables, provide forecast of various economic quantities of interest ...” (First issue of volume 1, Econometrica, 1933).

– “The science of model building consists of a set of quantitative tools which are used to construct and then test mathematical representations of the real world. The development and use of these tools are subsumed under the subject heading of econometrics” (Pindyck & Rubinfeld 1998).


– “At a broad level, econometrics is the science and art of using economic theory and statistical techniques to analyze economic data. Econometric methods are used in many branches of economics, including finance, labor economics, macroeconomics, microeconomics, marketing, and economic policy. Econometric methods are also commonly used in other social sciences, including political science and sociology” (Stock & Watson 2007, p. 3).

So, some may also say: “Alchemy or Science?”, “Economic-tricks”, “Econo-mystiques”.

– “Econometrics is based upon the development of statistical methods for estimating economic relationships, testing economic theories, and evaluating and implementing government and business policy” (Wooldridge 2009, p. 1).


• Summary of tasks for econometric methods

– In brief: econometrics can be useful whenever you encounter (economic) data and you want to make sense of them.

– In detail:

∗ Providing a formal framework for falsifying postulated economic relationships by confronting economic theory with economic data using statistical methods: economic hypotheses are formulated and statistically tested on the basis of adequately (and repeatedly) collected data such that test results may falsify the postulated hypotheses.

∗ Analyzing the effects of policy measures.

∗ Forecasting.


1.3 Causality and Experiments

• Common understanding: “causality means that a specific action” (touching a hot stove) “leads to a specific, measurable consequence” (getting burned) (Stock & Watson 2007, p. 8).

• How to identify causality? Observe repeatedly an action and its consequence!

• Thus, in science one aims at repeating an action and its consequences under identical conditions. How to generate repetitions of actions?


• Randomized controlled experiments:

– there is a control group that receives no treatment (e.g. fertilizer) and a treatment group that receives treatment, and

– treatment is assigned randomly in order to eliminate any possible systematic relationship between the treatment and other possible influences.

• Causal effect:

A “causal effect is defined to be an effect on an outcome of a given action or treatment, as measured in an ideal randomized controlled experiment” (Stock & Watson 2007, p. 9).

• In economics randomized controlled experiments are very often difficult or impossible to conduct. Then a randomized controlled experiment provides a theoretical benchmark and econometric analysis aims at mimicking as closely as possible the conditions of a randomized controlled experiment using actual data.

• Note that for forecasting knowledge of causal effects is not necessary.

• Warning: in general multiple regression models do not allow conclusions about causality!
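Why random assignment matters can be sketched with a simulation. Everything in it is an assumption made for illustration: the outcome equation, the true effect of 2, the unobserved factor `q` (think of soil quality in the fertilizer example), and the function name `diff_in_means`. With random assignment the difference in group means is close to the true effect; when units with favourable `q` select into treatment, the same comparison is far off.

```python
import random

TRUE_EFFECT = 2.0  # invented causal effect of the treatment

def mean(values):
    return sum(values) / len(values)

def diff_in_means(randomized, n=20000, seed=1):
    """Mean outcome of the treated minus mean outcome of the controls."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        q = rng.gauss(0.0, 1.0)          # unobserved other influence
        if randomized:
            d = rng.random() < 0.5       # coin flip, unrelated to q
        else:
            d = q > 0.0                  # selection: treatment depends on q
        y = 10.0 + TRUE_EFFECT * d + 3.0 * q + rng.gauss(0.0, 1.0)
        (treated if d else control).append(y)
    return mean(treated) - mean(control)

est_randomized = diff_in_means(randomized=True)
est_selected = diff_in_means(randomized=False)
print(f"randomized assignment:   {est_randomized:.2f}")
print(f"selection on unobserved: {est_selected:.2f}")
```

In the second case the comparison mixes the treatment effect with the effect of `q`, which is the systematic relationship that randomization is meant to eliminate.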


1.4 Types of Economic Data

1. Cross-Sectional Data

• are collected across several units at a single point or period of time.

• Units: “economic agents”, e.g. individuals, households, investors, firms, economic sectors, cities, countries.

• In general: the order of observations has no meaning.

• Popular to use index i.

• Optimal: the data are a random sample of the underlying population, see Section 2.1 for details.

• Cross-sectional data allow one to explain differences between individual units.


• Example: the sample of countries that exported to Kazakhstan in 2004 from Section 1.1.

2. Time Series Data

• are sampled across differing points/periods of time.

• Popular to use index t.

• Sampling frequency is important:

– variable versus fixed;

– fixed: annually, quarterly, monthly, weekly, daily, intradaily;

– variable: ticker data, duration data (e.g. unemployment spells).

• Time series data allow the analysis of dynamic effects.

• Univariate versus multivariate time series data.


• Example: Trade flow from Germany to Kazakhstan and GDP in Germany (in current US dollars), 1990 - 2007, T = 18.

[Figure: two time series plots over 1990 to 2007, one of TRADE_0_D_O (0 to 8.0E+08) and one of WDI_GDPUSDCR_O (1.60E+12 to 3.00E+12).]

3. Panel Data

• are a collection of cross-sectional data for at least two different points/periods of time.

• Individual units remain identical in each cross-sectional sample (except if units vanish).


• Use of double index: $it$ where $i = 1, \dots, N$ and $t = 1, \dots, T$.

• Typical problem: missing values, i.e. for some units and periods there are no data.

• Example: growth rate of imports from 55 different countries to Kazakhstan from 1991 to 2008 where all 55 countries were chosen for the sample 1990 and kept fixed for all subsequent years (T = 18, N = 55).

4. Pooled Cross Sections

• also a collection of cross-sectional data, however, allowing for changing units across time.

• Example: in 1995 countries of origin are Germany, France, Russia and in 1996 countries of origin are Poland, US, Italy.
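The four data types differ mainly in how observations are indexed. The sketch below shows one possible way to hold them in plain Python dictionaries; the country codes follow the examples above, but every number is an invented placeholder and the helper `missing_cells` is a hypothetical name.

```python
# Cross-sectional data: several units i, one period.
cross_section = {"GER": 3.1, "RUS": 5.4, "CHN": 4.2}

# Time series data: one unit, several periods t.
time_series = {1990: 1.0, 1991: 1.3, 1992: 1.1}

# Panel data: the same units i in every period t (double index it).
panel = {
    ("GER", 1995): 3.0, ("GER", 1996): 3.1,
    ("RUS", 1995): 5.2, ("RUS", 1996): 5.4,
}

# Pooled cross sections: the units may change from period to period.
pooled = {
    ("GER", 1995): 3.0, ("FRA", 1995): 2.2, ("RUS", 1995): 5.2,
    ("POL", 1996): 1.9, ("USA", 1996): 4.8, ("ITA", 1996): 2.5,
}

def missing_cells(data):
    """Return all (unit, period) pairs without an observation."""
    units = sorted({i for i, _ in data})
    periods = sorted({t for _, t in data})
    return [(i, t) for i in units for t in periods if (i, t) not in data]

print(missing_cells(panel))        # no unit/period combination is missing
print(len(missing_cells(pooled)))  # every unit is observed in one year only
```

Applied to a panel, `missing_cells` lists the missing values mentioned above; applied to pooled cross sections, it shows that no unit can be followed over time.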


In this course: focus on the analysis of cross-sectional data and specific types of time series data:

• simple regression model → Chapter 2,

• multiple regression model → Chapters 3 to 9.

• Time series analysis requires advanced econometric techniques that are beyond the scope of this course (given the time constraints).

Recall the arithmetic quality of data:

• quantitative variables,

• qualitative or categorical variables.

Reading: Sections 1.1-1.3 in Wooldridge (2009).


2 The Simple Regression Model

Distinguish between the

• population regression model and the

• sample regression model.


2.1 The Population Regression Model

• In general: $y$ and $x$ are two variables that describe properties of the population under consideration for which one wants “to explain $y$ in terms of $x$” or “to study how $y$ varies with changes in $x$” or “to predict $y$ for given values of $x$”.

Example: By how much does the hourly wage change for an additional year of schooling, keeping all other influences fixed?

• If we knew everything, then the relationship between $y$ and $x$ may formally be expressed as

$$y = m(x, z_1, \dots, z_s) \tag{2.1}$$

where $z_1, \dots, z_s$ denote $s$ additional variables that, in addition to years of schooling $x$, influence the hourly wage $y$.


• For practical application it is possible

– that relationship (2.1) is too complicated to be useful,

– that there does not exist an exact relationship, or

– that there exists an exact relationship for which, however, not all $s$ influential variables $z_1, \dots, z_s$ can be observed, or

– that one has no idea about the structure of the function $m(\cdot)$.

• Our solution:

– build a useful model, cf. Section 1.1,

– which focuses on a relationship that holds on “average”. What do we mean by “average”?


• Crucial building blocks for our model:

– Consider the variable $y$ as random. You may think of $y$ as denoting the value of the variable for a unit chosen at random from all units in the population. Furthermore, if the random variable $y$ takes discrete values, a probability is assigned to each value of $y$. (If the random variable $y$ is continuous, a density value is assigned.)

In other words: apply probability theory. See Appendices B and C in Wooldridge (2009).

Examples:

∗ The population consists of all apartments in Almaty. The variable $y$ denotes the rent of a single apartment randomly chosen from all apartments in Almaty.


∗ The population consists of all possible values of imports to Kazakhstan from a specific country and period.

∗ For a die the population consists of all numbers that are written on its sides, although in this case statisticians prefer to talk about a sample space.

– In terms of probability theory the “average” of a variable $y$ is given by the expected value of this variable. In case of discrete $y$ one has

$$E[y] = \sum_{j} y_j \,\mathrm{Prob}(y = y_j),$$

where the sum runs over all different values $y_j$ in the population.

– Sometimes one may only look at a subset of the population, namely all $y$ that have the same value for another variable $x$.


Example: one only considers the rents of all apartments in Almaty of size $x = 75$ m².

– If the “average” is conditioned on specific values of another variable $x$, then one considers the conditional expected value of $y$ for a given $x$: $E[y|x]$. For discrete random variables $y$ one has

$$E[y|x] = \sum_{j} y_j \,\mathrm{Prob}(y = y_j \mid x),$$

where the sum again runs over all different values $y_j$ in the population.

(See Appendix 10.1 for a brief introduction to probability theory and corresponding definitions for continuous random variables.)

Example continued: the conditional expectation $E[y|x = 75]$ corresponds to the average rent of all apartments in Almaty of size $x = 75$ m².


– Note that the variable $x$ can be random, too. Then, the conditional expectation $E[y|x]$ is a function of the (random) variable $x$,

$$E[y|x] = g(x),$$

and therefore a random variable itself.

– From the identity

$$y = E[y|x] + (y - E[y|x]) \tag{2.2}$$

one defines the error term or disturbance term as

$$u \equiv y - E[y|x]$$

so that one obtains a simple regression model of the population

$$y = E[y|x] + u \tag{2.3}$$
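The definitions of $E[y]$, $E[y|x]$ and $u$ can be checked by hand on a tiny discrete population. The sketch below assumes an invented joint distribution of apartment size `x` and rent `y` with four equally likely pairs; the function names are hypothetical. It also illustrates a direct consequence of the definition $u \equiv y - E[y|x]$: the error term has conditional expectation zero for every value of $x$.

```python
from fractions import Fraction

# Invented joint probabilities Prob(x, y): size x in m^2, rent y.
joint = {
    (50, 400): Fraction(1, 4),
    (50, 600): Fraction(1, 4),
    (75, 600): Fraction(1, 4),
    (75, 1000): Fraction(1, 4),
}

def expected_y(joint):
    """E[y]: sum of y_j times Prob(y = y_j)."""
    return sum(y * p for (_, y), p in joint.items())

def conditional_expected_y(joint, x):
    """E[y|x]: sum of y_j times Prob(y = y_j | x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / p_x

def conditional_expected_error(joint, x):
    """E[u|x] for u = y - E[y|x]."""
    g_x = conditional_expected_y(joint, x)
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return sum((y - g_x) * p for (xi, y), p in joint.items() if xi == x) / p_x

print(expected_y(joint))                      # unconditional average rent
print(conditional_expected_y(joint, 75))      # average rent given x = 75
print(conditional_expected_error(joint, 75))  # error averages out given x
```

Exact fractions are used so that the zero conditional mean of the error is not blurred by floating-point rounding.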


• Interpretation:

– The random variable y varies randomly around the conditional

expectation E[y|x]:

y = E[y|x] + u.

– The conditional expectation E[y|x] is called the systematic

part of the regression.

– The error term u is called the unsystematic part of the regression.

• So instead of trying the impossible, namely specifying m(x, . . .)

given by (2.1), one focuses the analysis on the “average” E[y|x].


• How to determine the conditional expectation?

– This step requires assumptions!

– To keep things simple we make Assumption (A) given by

E[y|x] = β0 + β1x. (2.4)

– Discussion of Assumption (A):

∗ It restricts the flexibility of g(x) = E[y|x] such that g(x) = β0 + β1x has to be linear in x. So if E[y|x] = δ0 + δ1 log x, Assumption (A) is wrong.

∗ It can be fulfilled even if there are other variables influencing y linearly. For example, consider

E[y|x, z] = δ0 + δ1x + δ2z.


Then, by the law of iterated expectations, one obtains

E[y|x] = δ0 + δ1x + δ2E[z|x].

If E[z|x] is linear in x, say E[z|x] = α0 + α1x, one obtains

E[y|x] = δ0 + δ1x + δ2(α0 + α1x)

= γ0 + γ1x (2.5)

with γ0 = δ0 + δ2α0 and γ1 = δ1 + δ2α1. Note, however, that in this case E[y|x, z] ≠ E[y|x] in general. Then model choice depends on the goal of the analysis: the smaller model can sometimes be preferable for prediction, while the larger model is needed if controlling for z is important (⇔ controlled random experiments, see Section 1.3).

∗ In general, Assumption (A) is violated if (2.5) does not hold, e.g. if E[z|x] is nonlinear in x. Then the linear population model is called misspecified. More on that in Section 3.4.
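The step from E[y|x, z] to E[y|x] = γ0 + γ1x can be checked exactly on a small artificial population. The sketch below assumes hypothetical parameter values and mean-zero shocks e and v that take the values −1 and 1 with equal probability; none of these numbers come from the course material.

```python
from fractions import Fraction
from itertools import product

# Hypothetical parameter values, chosen only for illustration
delta0, delta1, delta2 = 1, 2, 3   # E[y|x,z] = delta0 + delta1*x + delta2*z
alpha0, alpha1 = 4, 5              # E[z|x]   = alpha0 + alpha1*x

x_values = [0, 1, 2]
shocks = [-1, 1]                   # mean-zero shocks, each value with probability 1/2

def cond_mean_y_given_x(x):
    """E[y|x], obtained by averaging y over all equally likely shock pairs (e, v)."""
    outcomes = []
    for e, v in product(shocks, shocks):
        z = alpha0 + alpha1 * x + e               # E[z|x] is linear in x
        y = delta0 + delta1 * x + delta2 * z + v  # E[y|x,z] is linear in x and z
        outcomes.append(Fraction(y))
    return sum(outcomes) / len(outcomes)

# Coefficients of the smaller model according to (2.5)
gamma0 = delta0 + delta2 * alpha0
gamma1 = delta1 + delta2 * alpha1

for x in x_values:
    print(x, cond_mean_y_given_x(x), gamma0 + gamma1 * x)
```

For every x the two printed numbers coincide, although γ1 differs from δ1 whenever δ2α1 ≠ 0.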


• Properties of the error term u: From Assumption (A)

1. E[u|x] = 0,

2. E[u] = 0,

3. Cov(x, u) = 0.
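A short derivation of the three properties may be helpful. It uses only the definition u ≡ y − E[y|x] and the law of iterated expectations; it is a sketch that assumes all expectations involved exist.

```latex
\begin{align*}
E[u|x] &= E\bigl[\,y - E[y|x] \,\big|\, x\bigr] = E[y|x] - E[y|x] = 0, \\
E[u]   &= E\bigl[E[u|x]\bigr] = E[0] = 0, \\
Cov(x,u) &= E[xu] - E[x]\,E[u] = E\bigl[x\,E[u|x]\bigr] - E[x]\cdot 0 = 0.
\end{align*}
```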


• An alternative set of assumptions:

The above result E[u|x] = 0 together with the identity (2.3) allows one to rewrite Assumption (A) in terms of the following two assumptions:

1. Assumption SLR.1

(Linearity in the Parameters)

y = β0 + β1x + u, (2.6)

2. Assumption SLR.4

(Zero Conditional Mean)

E[u|x] = 0.


• Linear Population Regression Model:

The simple linear population regression model is given by equation

(2.6)

y = β0 + β1x + u

and is obtained by specifying the conditional expectation in the regression model (2.3) by a linear function (linear in the parameters).

The parameters β0 and β1 are called the intercept parameter

and slope parameter, respectively.


• Some terminology for regressions

y                      x
Dependent variable     Independent variable
Explained variable     Explanatory variable
Response variable      Control variable
Predicted variable     Predictor variable
Regressand             Regressor
                       Covariate


• A simple example: a game of dice

Let the random numbers x and u denote the throws of two fair dice with x, u ∈ {−2.5, −1.5, −0.5, 0.5, 1.5, 2.5}. Based on both throws, the random number y denotes the following sum

$$y = \underbrace{2}_{\beta_0} + \underbrace{3}_{\beta_1}\, x + u.$$

This completely describes the population regression model.

– Derive the systematic relationship between y and x holding x

fixed.

– Interpret the systematic relationship.

– How can you obtain the values of the parameters β0 = 2 and

β1 = 3 if those values are unknown?

Next section: How can you determine/estimate β0 and β1?
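Because the dice population is finite, the systematic part can be enumerated directly. The following sketch is one possible way to approach the exercise above, not an official solution; the function name `cond_mean_y` is an assumption of this example.

```python
from fractions import Fraction

# The six equally likely faces -2.5, -1.5, ..., 2.5, stored as exact fractions
faces = [Fraction(k, 2) for k in (-5, -3, -1, 1, 3, 5)]
beta0, beta1 = 2, 3

def cond_mean_y(x):
    """E[y|x]: average of y = beta0 + beta1*x + u over the six equally likely u."""
    return sum(beta0 + beta1 * x + u for u in faces) / len(faces)

for x in faces:
    print(float(x), float(cond_mean_y(x)))

# If the parameters were unknown but the conditional means were known,
# two different values of x would be enough to recover them.
x_low, x_high = faces[2], faces[3]            # -0.5 and 0.5
slope = (cond_mean_y(x_high) - cond_mean_y(x_low)) / (x_high - x_low)
intercept = cond_mean_y(x_high) - slope * x_high
print(slope, intercept)
```

Since E[u] = 0 for a fair die with these faces, the conditional means lie exactly on the line 2 + 3x. In practice the conditional means themselves are unknown and have to be estimated from a sample.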


2.2 The Sample Regression Model

Estimators and Estimates

• In practice one has to estimate the unknown parameters β0 and β1

of the population regression model using a sample of observations.

• The sample has to be representative and has to be collected/drawn

from the population.

• A sample of the random numbers x and y of size n is given by (xi, yi) : i = 1, . . . , n.

• Now we require an estimator that allows us, given the sample observations (xi, yi) : i = 1, . . . , n, to compute estimates for the unknown parameters β0 and β1 of the population.


• Note:

– If we want to construct an estimator for the unknown parameters,

we have not yet observed a sample. An estimator is a function

that contains the sample values as arguments.

– Once we have an estimator and observe a sample, we can compute

estimates (=numerical values) for the unknown quantities.

• For estimating the unknown parameters there exist many different estimators that differ with respect to their statistical properties (statistical quality)!

Example: Two different estimators for estimating the mean:

$$\frac{1}{n}\sum_{i=1}^{n} y_i \quad\text{and}\quad \frac{1}{2}\,(y_1 + y_n).$$
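How two estimators of the same mean can differ in quality may be illustrated by enumerating every possible sample. The sketch below assumes a hypothetical population (a fair six-sided die with faces 1 to 6) and random samples of size n = 3, and compares the exact expected value and variance of both estimators.

```python
from fractions import Fraction
from itertools import product

die = [1, 2, 3, 4, 5, 6]   # hypothetical population: one fair die
n = 3                      # sample size

def sample_mean(sample):
    """Estimator 1: average of all n observations."""
    return Fraction(sum(sample), len(sample))

def endpoint_mean(sample):
    """Estimator 2: average of the first and the last observation only."""
    return Fraction(sample[0] + sample[-1], 2)

def exact_mean_and_variance(estimator):
    """Expected value and variance of an estimator over all equally likely samples."""
    values = [estimator(s) for s in product(die, repeat=n)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

print(exact_mean_and_variance(sample_mean))
print(exact_mean_and_variance(endpoint_mean))
```

Both estimators have expected value 7/2, the population mean, but the sample average has variance 35/36 while the endpoint estimator has variance 35/24, because it ignores part of the sample.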


• If one denotes estimators of the parameters β0 and β1 in the population regression model

y = β0 + β1x + u

by $\hat{\beta}_0$ and $\hat{\beta}_1$, then the sample regression model is given by

$$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{u}_i, \quad i = 1, \dots, n.$$

It consists of

– the sample regression function or regression line

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i,$$

– the fitted values $\hat{y}_i$, and

– the residuals $\hat{u}_i = y_i - \hat{y}_i$, i = 1, . . . , n.

With which method can we estimate?


2.3 The Ordinary Least Squares (OLS) Estimator

• The ordinary least squares estimator is frequently abbreviated as OLS estimator. The OLS estimator goes back to C.F. Gauss (1777-1855).

• It is derived by choosing the values $\hat{\beta}_0$ and $\hat{\beta}_1$ such that the sum of squared residuals (SSR)

$$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} \left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)^2$$

is minimized.


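• One computes the first partial derivatives with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$ and sets them equal to zero:

$$\sum_{i=1}^{n} \left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0, \tag{2.7}$$

$$\sum_{i=1}^{n} x_i \left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0. \tag{2.8}$$

The equations (2.7) and (2.8) are called normal equations.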


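From (2.7) one obtains

$$\hat{\beta}_0 = n^{-1}\sum_{i=1}^{n} y_i - \hat{\beta}_1\, n^{-1}\sum_{i=1}^{n} x_i,$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}, \tag{2.9}$$

where $\bar{z} = n^{-1}\sum_{i=1}^{n} z_i$ denotes the estimated mean of $z_i$, i = 1, . . . , n.

Inserting (2.9) into the normal equation (2.8) delivers

$$\sum_{i=1}^{n} x_i \left(y_i - \left(\bar{y} - \hat{\beta}_1 \bar{x}\right) - \hat{\beta}_1 x_i\right) = 0.$$

Rearranging terms leads to

$$\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x}).$$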


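Note that

$$\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),$$

$$\sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2,$$

such that:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}. \tag{2.10}$$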
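The two identities used in the last step follow from the fact that deviations from the sample mean sum to zero. A brief sketch of the argument:

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
  &= \sum_{i=1}^{n} x_i (y_i - \bar{y}) - \bar{x} \sum_{i=1}^{n} (y_i - \bar{y})
   = \sum_{i=1}^{n} x_i (y_i - \bar{y}),
\end{align*}
% because \sum_{i=1}^{n} (y_i - \bar{y}) = n\bar{y} - n\bar{y} = 0.
% Replacing y_i by x_i and \bar{y} by \bar{x} gives the second identity.
```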


• Terminology:

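– The sample functions (2.9) and (2.10)

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x},$$

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

are called the ordinary least squares (OLS) estimators for β0 and β1.

– For a given sample, the quantities $\hat{\beta}_0$ and $\hat{\beta}_1$ are called the OLS estimates for β0 and β1.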


– The OLS sample regression function or OLS regression line for the simple regression model is given by

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \tag{2.11}$$

with residuals $\hat{u}_i = y_i - \hat{y}_i$.

– The OLS sample regression model is denoted by

$$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{u}_i \tag{2.12}$$


Note:

– The OLS estimator $\hat{\beta}_1$ only exists if the sample observations xi, i = 1, . . . , n, exhibit variation.

Assumption SLR.3
(Sample Variation in the Explanatory Variable):
In the sample the outcomes of the independent variable xi, i = 1, 2, . . . , n, are not all the same.

– The derivation of the OLS estimator only requires Assumption SLR.3 but not the population Assumptions SLR.1 and SLR.4.

– In order to investigate the statistical properties of the OLS estimator one needs further assumptions, see Sections 2.7, 3.4, 4.2.

– One can also derive the OLS estimator from the assumptions about the population, see below.
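The estimators (2.9) and (2.10) translate almost literally into code. The following is a minimal sketch in plain Python; the function name `ols_simple` and the five data points are hypothetical and serve only as an illustration.

```python
def ols_simple(x, y):
    """OLS estimates (b0, b1) for y_i = b0 + b1*x_i + u_i via (2.9) and (2.10)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        # Assumption SLR.3 fails: all x_i are identical, the slope is not defined
        raise ValueError("no sample variation in x")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope estimate, equation (2.10)
    b0 = y_bar - b1 * x_bar   # intercept estimate, equation (2.9)
    return b0, b1

# Hypothetical toy sample
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = ols_simple(x, y)
fitted = [b0 + b1 * xi for xi in x]                  # fitted values
residuals = [yi - fi for yi, fi in zip(y, fitted)]   # residuals
print(b0, b1)
```

The explicit check on the sum of squared deviations of x mirrors Assumption SLR.3: without variation in the regressor the denominator of (2.10) is zero.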


• The OLS estimator as a Moment Estimator:

– Note that from Assumption SLR.4, E[u|x] = 0, one obtains two conditions on moments: E[u] = 0 and Cov(x, u) = 0. Inserting u = y − β0 − β1x from Assumption SLR.1 defines moment conditions for the model parameters

E[y − β0 − β1x] = 0 (2.13)

E[x(y − β0 − β1x)] = 0 (2.14)

– How to estimate the moment conditions using sample functions?

– Assumption SLR.2 (Random Sampling):
The sample of size n is obtained by random sampling, that is, the pairs (xi, yi) and (xj, yj), i ≠ j, i, j = 1, . . . , n, are pairwise identically and independently distributed following the population model.


– An important result in statistics, see Section 5.1, says:

If Assumption SLR.2 holds, then the expected value can well be

estimated by the sample average. (Assumption SLR.2 can be

weakened, see e.g. Chapter 11 in Wooldridge (2009).)

– If one replaces the expected values in (2.13) and (2.14) by their sample averages, one obtains

$$n^{-1}\sum_{i=1}^{n} \left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0, \tag{2.15}$$

$$n^{-1}\sum_{i=1}^{n} x_i \left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0. \tag{2.16}$$

By multiplying (2.15) and (2.16) by n one obtains the normal equations (2.7) and (2.8).


The Trade Example Continued

Question:

Do imports to Kazakhstan increase if the exporting country experiences

an increase in GDP?

Scatter plot (from Section 1.1)

[Figure: scatter plot of TRADE_0_D_O (vertical axis, 0.00E+00 to 1.20E+09) against WEO_GDPCR_O (horizontal axis, 0 to 12000)]


The OLS regression line is given by

$$\widehat{\text{imports}}_i = 53441198 + 2.16 \cdot 10^{-5}\, \text{gdp}_i, \quad i = 1, \dots, 55,$$

and the sample regression model by

$$\text{imports}_i = 53441198 + 2.16 \cdot 10^{-5}\, \text{gdp}_i + \hat{u}_i, \quad i = 1, \dots, 55.$$

[Figure: TRADE_0_D_O vs. WEO_GDPCR_O (vertical axis TRADE_0_D_O, 0.00E+00 to 1.20E+09; horizontal axis WEO_GDPCR_O, 0 to 12000)]


====================================================================
Dependent Variable: TRADE_0_D_O
Method: Least Squares
Date: 02/09/09   Time: 17:12
Sample: 1 55
Included observations: 52
====================================================================
Variable            Coefficient   Std. Error   t-Statistic   Prob.
====================================================================
C                      53441198     25852541      2.067155   0.0439
WDI_GDPUSDCR_O         2.16E-05     1.37E-05      1.572746   0.1221
====================================================================
R-squared            0.047139    Mean dependent var      67801339
Adjusted R-squared   0.028081    S.D. dependent var      1.77E+08
S.E. of regression   1.74E+08    Akaike info criterion   40.82943
Sum squared resid    1.52E+18    Schwarz criterion       40.90448
Log likelihood      -1059.565    F-statistic             2.473530
Durbin-Watson stat   2.297310    Prob(F-statistic)       0.122085
====================================================================

• For a data description see Appendix 10.4:

importsi (from country i): TRADE_0_D_O

gdpi (in exporting country i): WDI_GDPUSDCR_O


• Potential interpretation of estimated slope parameter:

$$\hat{\beta}_1 = \frac{\Delta \widehat{\text{imports}}}{\Delta \text{gdp}}$$

indicates by how many US dollars imports to Kazakhstan increase if GDP in an exporting country increases by 1 US dollar.

• Does this interpretation really make sense? Aren’t there other important influencing factors missing? What about using economic theory as well?

• What about the quality of the estimates?


Example: Wage Regression

Question:

How does education influence the hourly wage of an employee?

• Data (Source: Example 2.4 in Wooldridge (2009)): Sample of U.S.

employees with n = 526 observations. Available data are:

– wage per hour in dollars and

– educ years of schooling of each employee.

• The OLS regression line is given by

$$\widehat{\text{wage}}_i = -0.90 + 0.54\, \text{educ}_i, \quad i = 1, \dots, 526.$$

The sample regression model is

$$\text{wage}_i = -0.90 + 0.54\, \text{educ}_i + \hat{u}_i, \quad i = 1, \dots, 526.$$


====================================================================
Dependent Variable: WAGE
Method: Least Squares
Sample: 1 526
Included observations: 526
====================================================================
Variable            Coefficient   Std. Error   t-Statistic   Prob.
====================================================================
C                     -0.904852     0.684968     -1.321013   0.1871
EDUC                   0.541359     0.053248      10.16675   0.0000
====================================================================
R-squared            0.164758    Mean dependent var      5.896103
Adjusted R-squared   0.163164    S.D. dependent var      3.693086
S.E. of regression   3.378390    Akaike info criterion   5.276470
Sum squared resid    5980.682    Schwarz criterion       5.292688
Log likelihood      -1385.712    F-statistic             103.3627
Durbin-Watson stat   1.823686    Prob(F-statistic)       0.000000
====================================================================

• Interpretation of the estimated slope parameter:

$$\hat{\beta}_1 = \frac{\Delta \widehat{\text{wage}}}{\Delta \text{educ}}$$

indicates by how much the hourly wage changes if the years of schooling increase by one year:

– An additional year in school or university increases the hourly wage by 54 cents.

– But: Somebody without any education earns an hourly wage of −90 cents? Does this interpretation make sense?

• Is it always sensible to interpret the slope coefficient? Watch out for spurious causality, see next section.

• Are these estimates reliable or good in some sense? What do we mean by “good” in econometrics and statistics? To get more insight,

– study the statistical properties of the OLS estimator and the OLS estimates, see Section 2.7, and

– check the choice of the functional form for the conditional expectation E[y|x], see Section 2.6.


2.4 Best Linear Prediction, Correlation, and Causality

Best Linear Prediction

• What does the OLS estimator estimate if Assumptions SLR.1 and SLR.4 (alias Assumption (A)) are not valid in the population from which the sample is drawn?

• Note that $SSR(\gamma_0, \gamma_1)/n = \sum_{i=1}^{n} (y_i - \gamma_0 - \gamma_1 x_i)^2 / n$ is a sample average and thus estimates the expected value

$$E\left[(y - \gamma_0 - \gamma_1 x)^2\right] \tag{2.17}$$

if Assumption SLR.2 (or some weaker form) holds. (For existence of (2.17) it is required that 0 < Var(x) < ∞ and Var(y) < ∞.) Equation (2.17) is called the mean squared error of a linear predictor

γ0 + γ1x.


• Mimicking the minimization of SSR(γ0, γ1), the theoretically best fit of a linear predictor γ0 + γ1x to y is obtained by minimizing its mean squared error (2.17) with respect to γ0 and γ1. This leads (try to derive it) to

$$\gamma_0^* = E[y] - \gamma_1^* E[x], \tag{2.18}$$

$$\gamma_1^* = \frac{Cov(x, y)}{Var(x)} = Corr(x, y)\,\sqrt{\frac{Var(y)}{Var(x)}} \tag{2.19}$$

with

$$Corr(x, y) = \frac{Cov(x, y)}{\sqrt{Var(x)\,Var(y)}}, \quad -1 \le Corr(x, y) \le 1,$$

denoting the correlation that measures the linear dependence between two variables in a population, here x and y.

The expression

$$\gamma_0^* + \gamma_1^* x \tag{2.20}$$

is called the best linear predictor of y where “best” is defined by minimal mean squared error.
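For readers who want to compare their own derivation of (2.18) and (2.19), one possible route via the first-order conditions is sketched here. It assumes that differentiation and expectation may be interchanged.

```latex
\begin{align*}
\frac{\partial}{\partial \gamma_0} E\bigl[(y - \gamma_0 - \gamma_1 x)^2\bigr]
  &= -2\,E[y - \gamma_0 - \gamma_1 x] = 0
  \;\Rightarrow\; \gamma_0^* = E[y] - \gamma_1^* E[x], \\
\frac{\partial}{\partial \gamma_1} E\bigl[(y - \gamma_0 - \gamma_1 x)^2\bigr]
  &= -2\,E\bigl[x\,(y - \gamma_0 - \gamma_1 x)\bigr] = 0 .
\end{align*}
% Inserting \gamma_0^* into the second condition gives
\begin{align*}
E\bigl[x\,(y - E[y])\bigr] - \gamma_1^*\, E\bigl[x\,(x - E[x])\bigr]
  = Cov(x, y) - \gamma_1^*\, Var(x) = 0
  \;\Rightarrow\; \gamma_1^* = \frac{Cov(x, y)}{Var(x)} .
\end{align*}
```

The structure is the same as for the normal equations (2.7) and (2.8), with expected values in place of sample sums.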

• Now observe that for the simple regression model

$$y = \gamma_0^* + \gamma_1^* x + \varepsilon$$

one has Cov(x, ε) = 0, a weaker form of SLR.4, since

$$Cov(x, y) = \frac{Cov(x, y)}{Var(x)}\, Var(x) + Cov(x, \varepsilon).$$

This indicates that one can show that under Assumptions SLR.2 and SLR.3 the OLS estimator estimates the parameters $\gamma_0^*$ and $\gamma_1^*$ of the best linear predictor. Observe also that the OLS estimator (2.10) for the slope coefficient consists of the sample averages of the moments defining $\gamma_1^*$:

$$\hat{\gamma}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$


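• Rewriting the slope estimator $\hat{\gamma}_1$ as

$$\hat{\gamma}_1 = \widehat{Corr}(x, y)\,\sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$$

using the empirical correlation coefficient

$$\widehat{Corr}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

shows that the estimated slope coefficient is non-zero if there is sample correlation between x and y.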
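The relation between the estimated slope and the empirical correlation coefficient can be verified numerically. The sketch below uses a hypothetical toy sample; the function name `slope_and_correlation` is an assumption of this example.

```python
import math

def slope_and_correlation(x, y):
    """Return the OLS slope, the empirical correlation, and sqrt(Syy / Sxx)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx                   # OLS slope, equation (2.10)
    corr = sxy / math.sqrt(sxx * syy)   # empirical correlation coefficient
    scale = math.sqrt(syy / sxx)        # ratio of the sample dispersions
    return slope, corr, scale

# Hypothetical toy sample
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

slope, corr, scale = slope_and_correlation(x, y)
print(slope, corr * scale)
```

Both printed numbers agree up to floating-point error, and the correlation always lies between −1 and 1.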


Causality

• Recall Section 1.3.

• Be aware that the slope coefficient of the best linear predictor $\gamma_1^*$ and its OLS estimate $\hat{\gamma}_1$ cannot be automatically interpreted in terms of a causal relationship, since estimating the best linear predictor

– only captures correlation but not direction,

– may not estimate the model of interest, e.g. if Assumptions SLR.1 and SLR.4 are violated and $\beta_1 \neq \gamma_1^*$,

– may produce garbage if $\widehat{Corr}(x, y)$ estimates spurious correlation (Corr(x, y) = 0 and Assumption SLR.2 (or its weaker versions) is violated),

– or may omit relevant control variables in the simple regression model, such that the results cannot represent results of a hypothetical randomized controlled experiment, see Chapter 3 onwards.

Therefore, before any causal interpretation takes place one has to use specification and diagnostic techniques for regression models. Frequently one needs economic theory to identify causal relationships.


2.5 Algebraic Properties of the OLS Estimator

Basic properties:

• $\sum_{i=1}^{n} \hat{u}_i = 0$, because of normal equation (2.7),

• $\sum_{i=1}^{n} x_i \hat{u}_i = 0$, because of normal equation (2.8).

• The point $(\bar{x}, \bar{y})$ lies on the regression line.

Can you provide some intuition for these properties?


• Total sum of squares (SST)

$$SST \equiv \sum_{i=1}^{n} (y_i - \bar{y})^2$$

• Explained sum of squares (SSE)

$$SSE \equiv \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$

• Sum of squared residuals (SSR)

$$SSR \equiv \sum_{i=1}^{n} \hat{u}_i^2$$

• The decomposition SST = SSE + SSR holds if the regression model contains an intercept β0.


• Coefficient of Determination R² (or R-squared)

$$R^2 = \frac{SSE}{SST}.$$

– Interpretation: share of variation of yi that is explained by the variation of xi.

– If the regression model contains an intercept term β0, then

$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$

due to the decomposition SST = SSE + SSR, and therefore 0 ≤ R² ≤ 1.

– Later we will see: Choosing regressors with R² is in general misleading.
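The algebraic properties of this section can be checked on any sample. The sketch below, using a hypothetical toy sample and an assumed helper named `ols_decomposition`, computes the residuals, the three sums of squares, and R² for a regression with intercept.

```python
def ols_decomposition(x, y):
    """Fit y on x with intercept by OLS and return residuals and sums of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    sst = sum((yi - y_bar) ** 2 for yi in y)       # total sum of squares
    sse = sum((fi - y_bar) ** 2 for fi in fitted)  # explained sum of squares
    ssr = sum(ui ** 2 for ui in residuals)         # sum of squared residuals
    return residuals, sst, sse, ssr

# Hypothetical toy sample
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

residuals, sst, sse, ssr = ols_decomposition(x, y)
r_squared = sse / sst

print(sum(residuals))                                 # close to 0, normal equation (2.7)
print(sum(xi * ui for xi, ui in zip(x, residuals)))   # close to 0, normal equation (2.8)
print(sst, sse + ssr, r_squared)
```

Up to floating-point error the residuals sum to zero, are orthogonal to the regressor, and SST equals SSE + SSR, so that R² = 1 − SSR/SST.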


Reading:

• Sections 1.4 and 2.1-2.3 in Wooldridge (2009) and Appendix 10.1

if needed.

• Sections 2.4 and 2.5 in Wooldridge (2009).


2.6 Parameter Interpretation and Functional Form and Data Transformation

• The term linear in “simple linear regression models” does not imply

that the relationship between the explained and the explanatory

variable is linear. Instead it refers to the fact that the parameters

β0 and β1 enter the model linearly.

• Examples for regression models that are linear in their parameters:

yi = β0 + β1xi + ui,

yi = β0 + β1 ln xi + ui,

ln yi = β0 + β1 ln xi + ui,

ln yi = β0 + β1xi + ui,

$y_i = \beta_0 + \beta_1 x_i^2 + u_i.$


The Natural Logarithm in Econometrics

Frequently variables are transformed by taking the natural logarithm ln. Then the interpretation of the slope coefficient has to be adjusted accordingly.

Taylor approximation of the logarithmic function:

ln(1 + z) ≈ z if z is close to 0.

Using this approximation one can derive a popular approximation of growth rates or returns

$$\frac{\Delta x_t}{x_{t-1}} \equiv \frac{x_t - x_{t-1}}{x_{t-1}} \approx \ln\left(1 + \frac{x_t - x_{t-1}}{x_{t-1}}\right),$$

$$\frac{\Delta x_t}{x_{t-1}} \approx \ln(x_t) - \ln(x_{t-1}),$$

which approximates well if the relative change $\Delta x_t / x_{t-1}$ is close to 0.


One obtains percentages by multiplying by 100:

$$100\,\Delta \ln(x_t) \approx \%\Delta x_t = 100\,\frac{x_t - x_{t-1}}{x_{t-1}}.$$

Thus, the percentage change for small $\Delta x_t / x_{t-1}$ can be well approximated by $100\,[\ln(x_t) - \ln(x_{t-1})]$.
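How quickly the log approximation deteriorates can be seen from a few numbers. The sketch below compares the exact percentage change with 100 times the log difference for several hypothetical growth rates; the function names are assumptions of this example.

```python
import math

def pct_change(x_prev, x_curr):
    """Exact percentage change 100 * (x_t - x_{t-1}) / x_{t-1}."""
    return 100.0 * (x_curr - x_prev) / x_prev

def log_pct_change(x_prev, x_curr):
    """Approximate percentage change 100 * (ln x_t - ln x_{t-1})."""
    return 100.0 * (math.log(x_curr) - math.log(x_prev))

x_prev = 100.0
for growth in (0.01, 0.05, 0.20, 0.50):   # hypothetical relative changes
    x_curr = x_prev * (1.0 + growth)
    print(growth, pct_change(x_prev, x_curr), log_pct_change(x_prev, x_curr))
```

For a relative change of 1% both measures are nearly identical, while for a change of 50% the log difference understates the exact percentage change by more than nine percentage points.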

• Examples of models that are nonlinear in the parameters (β0, β1, γ, λ, π, δ):

$$y_i = \beta_0 + \beta_1 x_i^{\gamma} + u_i,$$

$$y_i^{\gamma} = \beta_0 + \beta_1 \ln x_i + u_i,$$

$$y_i = \beta_0 + \beta_1 x_i + \frac{1}{1 + \exp(\lambda (x_i - \pi))}\,(\gamma + \delta x_i) + u_i.$$

• The last example allows for smooth switching between two linear regimes. The possibilities for formulating nonlinear regression models are huge. However, their estimation requires more advanced methods such as nonlinear least squares that are beyond the scope of this course.


• Note, however, that linear regression models allow for a wide range

of nonlinear relationships between the dependent and independent

variables, some of which were listed at the beginning of this section.

Economic Interpretation of OLS Parameters

• Consider the ratio of relative changes of two non-stochastic variables y and x

$$\frac{\Delta y / y}{\Delta x / x} = \frac{\%\,\text{change of } y}{\%\,\text{change of } x} = \frac{\%\Delta y}{\%\Delta x}.$$

If ∆y → 0 and ∆x → 0, then it can be shown that $\Delta y / \Delta x \to dy/dx$.

• If this result is applied to the ratio above, one obtains the elasticity

$$\eta(x) = \frac{dy}{dx}\,\frac{x}{y}.$$


• Interpretation: If the relative change of x is 0.01, then the relative change of y is given by 0.01η(x).

In other words: If x changes by 1%, then y changes by η(x)%.

• If y, x are random variables, then the elasticity is defined with respect to the conditional expectation of y given x:

$$\eta(x) = \frac{dE[y|x]}{dx}\,\frac{x}{E[y|x]}.$$

This can be derived from

$$\frac{\dfrac{E[y|x_1 = x_0 + \Delta x] - E[y|x_0]}{E[y|x_0]}}{\dfrac{\Delta x}{x_0}} = \frac{E[y|x_1 = x_0 + \Delta x] - E[y|x_0]}{\Delta x}\,\frac{x_0}{E[y|x_0]}$$

and letting ∆x → 0.


Different Models and Interpretations of β1

For each model it is assumed that SLR.1 and SLR.4 hold.

• Models that are linear with respect to their variables

(level-level models)

y = β0 + β1x + u.

It holds that

$$\frac{dE[y|x]}{dx} = \beta_1$$

and thus

$$\Delta E[y|x] = \beta_1 \Delta x.$$

In words:

The slope coefficient denotes the absolute change in the conditional

expectation of the dependent variable y for a one-unit change in the

independent variable x.


• Level-log models

y = β0 + β1 ln x + u.

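It holds that

$$\frac{dE[y|x]}{dx} = \beta_1 \frac{1}{x}$$

and thus

$$\Delta E[y|x] \approx \beta_1 \Delta \ln x = \frac{\beta_1}{100}\, 100\, \Delta \ln x \approx \frac{\beta_1}{100}\, \%\Delta x.$$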

In words:

The conditional expectation of y changes by β1/100 units if x

changes by 1%.


• Log-level models

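ln y = β0 + β1x + u

or

$$y = e^{\ln y} = e^{\beta_0 + \beta_1 x + u} = e^{\beta_0 + \beta_1 x}\, e^{u}.$$

Thus

$$E[y|x] = e^{\beta_0 + \beta_1 x}\, E[e^{u}|x].$$

If $E[e^{u}|x]$ is constant, then

$$\frac{dE[y|x]}{dx} = \beta_1 \underbrace{e^{\beta_0 + \beta_1 x}\, E[e^{u}|x]}_{E[y|x]} = \beta_1 E[y|x].$$

One obtains the approximation

$$\frac{\Delta E[y|x]}{E[y|x]} \approx \beta_1 \Delta x, \quad\text{or}\quad \%\Delta E[y|x] \approx 100\,\beta_1 \Delta x.$$

In words: The conditional expectation of y changes by 100 β1% if x changes by one unit.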
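The quality of the 100 β1% rule depends on the size of β1. If E[e^u|x] is constant, the expression for E[y|x] above implies that a change of x by ∆x multiplies E[y|x] by exp(β1∆x), so the exact percentage change is 100(exp(β1∆x) − 1). The sketch below compares both numbers for a few hypothetical slope values; the function names are assumptions of this example.

```python
import math

def approx_pct_change(beta1, dx=1.0):
    """Approximate percentage change of E[y|x]: 100 * beta1 * dx."""
    return 100.0 * beta1 * dx

def exact_pct_change(beta1, dx=1.0):
    """Exact percentage change of E[y|x] when E[exp(u)|x] is constant."""
    return 100.0 * (math.exp(beta1 * dx) - 1.0)

for beta1 in (0.01, 0.08, 0.30):   # hypothetical slope coefficients
    print(beta1, approx_pct_change(beta1), exact_pct_change(beta1))
```

For small β1∆x the two values are almost identical; for β1 = 0.30 the approximation of 30% falls short of the exact change of roughly 35%.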


• Log-log models

are frequently called loglinear models or constant-elasticity

models and are very popular in empirical work

ln y = β0 + β1 ln x + u.

Similar to above one can show that

$$\frac{dE[y|x]}{dx} = \beta_1 \frac{E[y|x]}{x}, \quad \text{and thus} \quad \beta_1 = \eta(x)$$

if $E[e^{u}|x]$ is constant.

In these models the slope coefficient is interpreted as the elasticity

between the level variables y and x.


The Trade Example Continued

====================================================================

Dependent Variable: LOG(TRADE_0_D_O)

Method: Least Squares

Date: 02/08/09 Time: 12:29

Sample: 1 55

Included observations: 52

====================================================================

Variable Coefficient Std. Error t-Statistic Prob.

====================================================================

C -3.460963 3.280047 -1.055157 0.2964

LOG(WDI_GDPUSDCR_O) 0.770127 0.129507 5.946600 0.0000

====================================================================

R-squared 0.414260 Mean dependent var 15.97292

Adjusted R-squared 0.402545 S.D. dependent var 2.613094

S.E. of regression 2.019797 Akaike info criterion 4.281574

Sum squared resid 203.9791 Schwarz criterion 4.356622

Log likelihood -109.3209 F-statistic 2.65E-07

====================================================================

Note the very different interpretations of the estimated slope coefficient $\hat\beta_1$:

– Level-level model (Section 2.3): an increase in GDP in the exporting country by 1 billion US dollars corresponds to an increase of imports to Kazakhstan by 0.216 million US dollars.

– Log-log model: a 1% increase of GDP in the exporting country corresponds to an increase of imports by 0.77%.

But wait before you take these numbers seriously.
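The elasticity reading of the log-log slope is an approximation for small changes. The following minimal sketch compares it with the change implied by the fitted power relationship when GDP rises by exactly 1%. It uses only the slope estimate 0.770127 from the output above; the variable names are illustrative, and the comparison assumes that $E[e^{u}|x]$ is constant.

```python
import math

# Slope estimate from the log-log output above (elasticity of imports w.r.t. GDP).
beta1_hat = 0.770127

# Approximate reading: a 1% increase in x changes E[y|x] by about beta1 percent.
approx_pct_change = beta1_hat * 1.0

# Change implied by y proportional to x**beta1 when x is multiplied by 1.01,
# assuming E[exp(u)|x] is constant.
exact_pct_change = (1.01 ** beta1_hat - 1.0) * 100.0

print(f"approximate: {approx_pct_change:.4f}%  implied: {exact_pct_change:.4f}%")
```

For a 1% change the two numbers are nearly identical; the gap widens for larger percentage changes.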


2.7 Statistical Properties of the OLS Estimator: Expected Value and Variance

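• Some preparatory transformations (all sums are indexed by $i = 1, \dots, n$):

$$\hat\beta_1 = \frac{\sum_{i=1}^n (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^n (x_i - \bar x)^2} = \frac{\sum_{i=1}^n (x_i - \bar x)\, y_i}{\sum_{j=1}^n (x_j - \bar x)^2} = \sum_{i=1}^n \underbrace{\frac{(x_i - \bar x)}{\sum_{j=1}^n (x_j - \bar x)^2}}_{w_i}\, y_i = \sum w_i y_i$$

where it can be shown that (try it):

$$\sum w_i = 0, \qquad \sum w_i x_i = 1 \qquad \text{and} \qquad \sum w_i^2 = \frac{1}{\sum_{j=1}^n (x_j - \bar x)^2}.$$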
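The three properties of the weights $w_i$ can also be checked numerically before proving them. The sketch below uses simulated data; the data-generating values (intercept 1.0, slope 0.5, seed 0) are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed, illustration only
x = rng.normal(loc=5.0, scale=2.0, size=50)
y = 1.0 + 0.5 * x + rng.normal(size=50)   # hypothetical data-generating process

dev = x - x.mean()
w = dev / np.sum(dev ** 2)                # weights w_i as defined above

sum_w = np.sum(w)                         # should be 0 up to rounding
sum_wx = np.sum(w * x)                    # should be 1
sum_w2 = np.sum(w ** 2)                   # should equal 1 / sum((x_i - xbar)^2)

beta1_weights = np.sum(w * y)             # slope written as sum of w_i * y_i
beta1_polyfit = np.polyfit(x, y, 1)[0]    # slope from a least-squares line fit

print(sum_w, sum_wx, sum_w2 * np.sum(dev ** 2))
print(beta1_weights, beta1_polyfit)
```

Both slope computations should agree, which illustrates that the OLS slope is a linear function of the $y_i$ with weights that depend only on the regressor values.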


• Unbiasedness of the OLS estimator:

If Assumptions SLR.1 to SLR.4 hold, then

$$E[\hat\beta_0] = \beta_0, \qquad E[\hat\beta_1] = \beta_1.$$

Interpretation:

If one keeps repeatedly drawing new samples and estimating the regression parameters, then the average of all obtained OLS parameter estimates roughly corresponds to the population parameters.

Unbiasedness is a property of the sampling distribution of the OLS estimators for β0 and β1. It does not imply that the population parameters are perfectly estimated for a specific sample.


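Proof for $\hat\beta_1$ (clarify where each SLR assumption is needed):

1. $E[\hat\beta_1 \mid x_1, \dots, x_n]$ can be manipulated as follows:

$$\begin{aligned}
E[\hat\beta_1 \mid x_1, \dots, x_n] &= E\left[\sum w_i y_i \,\Big|\, x_1, \dots, x_n\right] \\
&= E\left[\sum w_i (\beta_0 + \beta_1 x_i + u_i) \,\Big|\, x_1, \dots, x_n\right] \\
&= \sum E\left[w_i (\beta_0 + \beta_1 x_i + u_i) \mid x_1, \dots, x_n\right] \\
&= \beta_0 \sum w_i + \beta_1 \sum w_i x_i + \sum E\left[w_i u_i \mid x_1, \dots, x_n\right] \\
&= \beta_1 + \sum w_i E\left[u_i \mid x_1, \dots, x_n\right] \\
&= \beta_1 + \sum w_i E\left[u_i \mid x_i\right] \\
&= \beta_1.
\end{aligned}$$

2. From $E[\hat\beta_1] = E\big[E[\hat\beta_1 \mid x_1, \dots, x_n]\big]$ one obtains unbiasedness

$$E[\hat\beta_1] = \beta_1.$$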
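The repeated-sampling interpretation of unbiasedness can be illustrated with a small Monte Carlo experiment. This is a sketch under assumed population values (β0 = 2.0, β1 = 0.5, standard normal errors, n = 30); none of these numbers come from the course data.

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1, n, reps = 2.0, 0.5, 30, 5000   # hypothetical population values

slopes = np.empty(reps)
for r in range(reps):
    x = rng.uniform(0.0, 10.0, size=n)       # a new random sample in each replication
    u = rng.normal(0.0, 1.0, size=n)         # E[u|x] = 0 holds by construction
    y = beta0 + beta1 * x + u
    dev = x - x.mean()
    slopes[r] = np.sum(dev * y) / np.sum(dev ** 2)   # OLS slope estimate

print("mean of slope estimates:  ", slopes.mean())
print("spread of slope estimates:", slopes.std())
```

The average of the estimates should lie close to the assumed β1, while the individual estimates scatter around it. This is the sense in which unbiasedness says nothing about the accuracy of a single sample.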


• Variance of the OLS estimator

In order to determine the variance of the OLS estimators $\hat\beta_0$ and $\hat\beta_1$ we need another assumption,

Assumption SLR.5 (Homoskedasticity):

$$Var(u|x) = \sigma^2.$$

• Variances of parameter estimators conditional on the sample observations

If Assumptions SLR.1 to SLR.5 hold, then

$$Var(\hat\beta_1 \mid x_1, \dots, x_n) = \sigma^2 \frac{1}{\sum_{i=1}^n (x_i - \bar x)^2},$$

$$Var(\hat\beta_0 \mid x_1, \dots, x_n) = \sigma^2 \frac{n^{-1} \sum x_i^2}{\sum_{i=1}^n (x_i - \bar x)^2}.$$


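Proof (for the conditional variance of $\hat\beta_1$):

$$\begin{aligned}
Var(\hat\beta_1 \mid x_1, \dots, x_n) &= Var\left(\sum w_i u_i \,\Big|\, x_1, \dots, x_n\right) \\
&= \sum Var\left(w_i u_i \mid x_1, \dots, x_n\right) \\
&= \sum w_i^2\, Var\left(u_i \mid x_1, \dots, x_n\right) \\
&= \sum w_i^2\, Var\left(u_i \mid x_i\right) \\
&= \sum w_i^2 \sigma^2 \\
&= \sigma^2 \sum w_i^2 \\
&= \sigma^2 \frac{1}{\sum (x_i - \bar x)^2}.
\end{aligned}$$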
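The variance formula is conditional on the regressor values, so a simulation check should hold the $x_i$ fixed and redraw only the errors. The following sketch does this under assumed values (σ = 1.5, n = 40, 20000 replications); all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
beta0, beta1, sigma, n, reps = 1.0, 2.0, 1.5, 40, 20000   # hypothetical values

x = rng.uniform(0.0, 5.0, size=n)        # regressors drawn once, then held fixed
dev = x - x.mean()
sxx = np.sum(dev ** 2)

u = rng.normal(0.0, sigma, size=(reps, n))   # homoskedastic errors, one row per replication
y = beta0 + beta1 * x + u                    # broadcasting: each row is one sample of y
slopes = y @ dev / sxx                       # OLS slope for every replication

var_formula = sigma ** 2 / sxx               # sigma^2 / sum((x_i - xbar)^2)
var_simulated = slopes.var()

print(var_formula, var_simulated)
```

The simulated variance of the slope estimates should be close to the value given by the formula; the remaining difference is simulation noise.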


• Covariance between the intercept and the slope estimator:

$$Cov(\hat\beta_0, \hat\beta_1 \mid x_1, \dots, x_n) = -\sigma^2 \frac{\bar x}{\sum_{i=1}^n (x_i - \bar x)^2}.$$

Proof: $Cov(\hat\beta_0, \hat\beta_1 \mid x_1, \dots, x_n)$ can be manipulated as follows:

$$\begin{aligned}
Cov(\hat\beta_0, \hat\beta_1 \mid x_1, \dots, x_n) &= Cov\left(\bar y - \hat\beta_1 \bar x,\ \hat\beta_1 \mid x_1, \dots, x_n\right) \\
&= \underbrace{Cov\left(\bar u,\ \hat\beta_1 \mid x_1, \dots, x_n\right)}_{=0 \text{, see below}} - Cov\left(\hat\beta_1 \bar x,\ \hat\beta_1 \mid x_1, \dots, x_n\right) \\
&= -\bar x\, Cov\left(\hat\beta_1,\ \hat\beta_1 \mid x_1, \dots, x_n\right) \\
&= -\bar x\, Var\left(\hat\beta_1 \mid x_1, \dots, x_n\right) \\
&= -\sigma^2 \frac{\bar x}{\sum (x_i - \bar x)^2}.
\end{aligned}$$

The first term is zero because

$$\begin{aligned}
Cov\left(\bar u, \hat\beta_1 \mid x_1, \dots, x_n\right) &= Cov\left(\bar u, \sum w_i u_i \,\Big|\, x_1, \dots, x_n\right) \\
&= Cov\left(\bar u, w_1 u_1 \mid x_1, \dots, x_n\right) + \dots + Cov\left(\bar u, w_n u_n \mid x_1, \dots, x_n\right) \\
&= w_1 Cov\left(\bar u, u_1 \mid x_1, \dots, x_n\right) + \dots + w_n Cov\left(\bar u, u_n \mid x_1, \dots, x_n\right) \\
&= \sum w_i\, Cov\left(\bar u, u_i \mid x_1, \dots, x_n\right) \\
&= Cov\left(\bar u, u_1 \mid x_1, \dots, x_n\right) \sum w_i \\
&= 0.
\end{aligned}$$


2.8 Estimation of the Error Variance

• One estimator for the error variance $\sigma^2$ is given by

$$\tilde\sigma^2 = \frac{1}{n} \sum_{i=1}^n \hat u_i^2,$$

where the $\hat u_i$'s denote the residuals of the OLS estimator.

Disadvantage: The estimator $\tilde\sigma^2$ does not take into account that 2 restrictions were imposed on obtaining the OLS residuals, namely:

$$\sum \hat u_i = 0, \qquad \sum \hat u_i x_i = 0.$$

This leads to biased estimates, $E[\tilde\sigma^2 \mid x_1, \dots, x_n] \neq \sigma^2$.

• Unbiased estimator for the error variance:

$$\hat\sigma^2 = \frac{1}{n - 2} \sum_{i=1}^n \hat u_i^2.$$


• If Assumptions SLR.1 to SLR.5 hold, then

$$E[\hat\sigma^2 \mid x_1, \dots, x_n] = \sigma^2.$$

• Standard error of the regression, standard error of the estimate or root mean squared error:

$$\hat\sigma = \sqrt{\hat\sigma^2}.$$

• In the formulas for the variances of and covariance between the parameter estimators $\hat\beta_0$ and $\hat\beta_1$ the variance estimator $\hat\sigma^2$ can be used for estimating the unknown error variance $\sigma^2$.

Example:

$$\widehat{Var}(\hat\beta_1 \mid x_1, \dots, x_n) = \frac{\hat\sigma^2}{\sum (x_i - \bar x)^2}.$$


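Denote the standard deviation as

$$sd(\hat\beta_1 \mid x_1, \dots, x_n) = \sqrt{Var(\hat\beta_1 \mid x_1, \dots, x_n)},$$

then

$$\widehat{sd}(\hat\beta_1 \mid x_1, \dots, x_n) = \frac{\hat\sigma}{\left(\sum (x_i - \bar x)^2\right)^{1/2}}$$

is frequently called the standard error of $\hat\beta_1$ and reported in the output of software packages.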

Reading: Sections 2.4 and 2.5 in Wooldridge (2009) and Appendix

10.1 if needed.
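The estimators of this section can be traced in the log-log trade output of Section 2.6. The sketch below takes the reported sum of squared residuals (203.9791), the number of included observations (52), and the reported standard error of the slope (0.129507) from that table. The back-calculated sum of squared deviations of the regressor is not reported in the output and is shown only to connect the numbers to the formula.

```python
import math

ssr = 203.9791        # "Sum squared resid" from the log-log trade output (Section 2.6)
n = 52                # "Included observations" from the same output
se_slope = 0.129507   # reported "Std. Error" of the slope coefficient

sigma2_biased = ssr / n            # divides by n, ignores the two restrictions
sigma2_unbiased = ssr / (n - 2)    # divides by n - 2
se_regression = math.sqrt(sigma2_unbiased)   # standard error of the regression

# Back-calculated (not reported) sum of squared deviations of the regressor,
# using se(beta1_hat) = sigma_hat / sqrt(sum((x_i - xbar)^2)).
sxx_implied = (se_regression / se_slope) ** 2

print(sigma2_biased, sigma2_unbiased, se_regression, sxx_implied)
```

The computed standard error of the regression should reproduce the "S.E. of regression" entry of the table up to rounding, which suggests that the software divides by n − 2.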


3 Multiple Regression Analysis: Estimation

3.1 Motivation for Multiple Regression: The Trade Example Continued

• In Section 2.6 two simple linear regression models for explaining imports to Kazakhstan were estimated (and interpreted): a level-level model and a log-log model.

• It is hardly credible that imports to Kazakhstan only depend on the GDP of the exporting country. What about, for example, distance, borders, and other factors causing trading costs?

• Such quantities have been found to be relevant in the empirical literature on gravity equations for explaining intra- and international trade. In general, bi-directional trade flows are considered. Here we consider only one-directional trade flows, namely exports to Kazakhstan in 2004. Such a simplified gravity equation reads as

$$\ln(\text{imports}_i) = \beta_0 + \beta_1 \ln(\text{gdp}_i) + \beta_2 \ln(\text{distance}_i) + u_i. \tag{3.1}$$

Standard gravity equations are based on bilateral imports and exports over a number of years and thus require panel data techniques that are beyond the scope of this course. However, in Section 4.4 we will consider cross-section data on both imports and exports for 2004. For a brief introduction to gravity equations see e.g. Fratianni (2007). A recent theoretical underpinning of gravity equations was provided by Anderson & Wincoop (2003).


• If relevant variables are neglected, Assumptions SLR.1 and/or SLR.4 could be violated and in this case interpretation of causal effects can be highly misleading, see Section 3.4. To avoid this trap, the multiple regression model can be useful.

• To get an idea about the change in the elasticity parameter due to a second independent variable, like e.g. distance, inspect the following OLS estimate of the simple import equation (3.1):

Dependent Variable: LOG(TRADE_0_D_O), Method: Least Squares, Date: 02/13/09 Time: 14:31

Sample: 1 55, Included observations: 52

====================================================================

Variable Coefficient Std. Error t-Statistic Prob.

====================================================================

C 4.800950 3.497341 1.372743 0.1761

LOG(WDI_GDPUSDCR_O) 1.088546 0.137001 7.945508 0.0000

LOG(CEPII_DIST) -1.970804 0.480555 -4.101103 0.0002

====================================================================

R-squared 0.563937 Mean dependent var 15.97292

Adjusted R-squared 0.546138 S.D. dependent var 2.613094

S.E. of regression 1.760423 Akaike info criterion 4.024946

Sum squared resid 151.8554 Schwarz criterion 4.137518

Log likelihood -101.6486 F-statistic 31.68448

Durbin-Watson stat 2.117895 Prob(F-statistic) 0.000000


Instead of an estimated elasticity of 0.77, see Section 2.6, one obtains a value of 1.09. Furthermore, the R² increases from 0.41 to 0.56, indicating a much better statistical fit. Finally, a 1% increase in distance is estimated to reduce imports by almost 2%. Is this model then better? Or is it (also) misspecified?

To answer these questions we have to study the linear multiple regression model first.


3.2 The Multiple Regression Model of the Population

• Assumptions:

The Assumptions SLR.1 and SLR.4 of the simple linear regression model have to be adapted accordingly to the multiple linear regression model (MLR) for the population (see Section 3.3 in Wooldridge (2009)):

– MLR.1 (Linearity in the Parameters)

The multiple regression model allows for more than one, say k, explanatory variables

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u \tag{3.2}$$

and the model is linear in its parameters.

Example: the import equation (3.1).


– MLR.4 (Zero Conditional Mean)

$$E[u \mid x_1, \dots, x_k] = 0 \quad \text{for all } x.$$

Observe that all explanatory variables of the multiple regression (3.2) must be included in the conditioning set. Sometimes the conditioning set is called information set.

• Remarks:

– To see the need for MLR.4, take the conditional expectation of y in (3.2) given all k regressors

$$E[y \mid x_1, x_2, \dots, x_k] = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + E[u \mid x_1, x_2, \dots, x_k].$$

If $E[u \mid x_1, x_2, \dots, x_k] \neq 0$ for some x, then the systematic part $\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$ does not model the conditional expectation $E[y \mid x_1, \dots, x_k]$ correctly.


– If MLR.1 and MLR.4 are fulfilled, then equation (3.2)

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$

is also called the linear multiple regression model for the population. Frequently it is also called the true model (even if any model may be far from the truth). Alternatively, one may think of equation (3.2) as the data generating mechanism (although, strictly speaking, a data generating mechanism also requires specification of the probability distributions of all regressors and the error).

• To guarantee nice properties of the OLS estimator and the sample regression model, we adapt SLR.2 and SLR.3 accordingly:

– MLR.2 (Random Sampling)

The sample of size n is obtained by random sampling, that is the observations $\{(x_{i1}, \dots, x_{ik}, y_i) : i = 1, \dots, n\}$ are pairwise independently and identically distributed following the population model in MLR.1.

– MLR.3 (No Perfect Collinearity)

(more on MLR.3 in Section 3.3)

• Interpretation:

– If Assumptions MLR.1 and MLR.4 are correct and the population regression model allows for a causal interpretation, then the multiple regression model is a great tool for ceteris paribus analysis. It allows one to hold the values of all explanatory variables fixed except one and check how the conditional expectation of the explained variable changes. This resembles changing one control variable in a randomized control experiment. Let $x_j$ be the control variable of interest.


– Taking conditional expectations of the multiple regression (3.2) and applying Assumption MLR.4 delivers

$$E[y \mid x_1, \dots, x_j, \dots, x_k] = \beta_0 + \beta_1 x_1 + \dots + \beta_j x_j + \dots + \beta_k x_k.$$

– Consider a change in $x_j$: $x_j + \Delta x_j$

$$E[y \mid x_1, \dots, x_j + \Delta x_j, \dots, x_k] = \beta_0 + \beta_1 x_1 + \dots + \beta_j (x_j + \Delta x_j) + \dots + \beta_k x_k.$$

∗ Ceteris-paribus effect:

In (3.2) the absolute change due to a change of $x_j$ by $\Delta x_j$ is given by

$$\begin{aligned}
\Delta E[y \mid x_1, \dots, x_j, \dots, x_k] &\equiv E[y \mid x_1, \dots, x_{j-1}, x_j + \Delta x_j, x_{j+1}, \dots, x_k] \\
&\quad - E[y \mid x_1, \dots, x_{j-1}, x_j, x_{j+1}, \dots, x_k] = \beta_j \Delta x_j,
\end{aligned}$$

where $\beta_j$ corresponds to the first partial derivative

$$\frac{\partial E[y \mid x_1, \dots, x_{j-1}, x_j, x_{j+1}, \dots, x_k]}{\partial x_j} = \beta_j.$$

The parameter $\beta_j$ gives the partial effect of changing $x_j$ on the conditional expectation of y while all other regressors are held constant.

∗ Total effect:

Of course one can also consider simultaneous changes in the regressors, for example $\Delta x_1$ and $\Delta x_k$. For this case one obtains

$$\Delta E[y \mid x_1, \dots, x_k] = \beta_1 \Delta x_1 + \beta_k \Delta x_k.$$

– Note that the specific interpretation of $\beta_j$ depends on how variables enter, e.g. as log variables. In a ceteris paribus analysis the results of Section 2.6 remain valid.


Trade Example Continued

• Considering the log-log model (3.1)

$$\ln(\text{imports}_i) = \beta_0 + \beta_1 \ln(\text{gdp}_i) + \beta_2 \ln(\text{distance}_i) + u_i$$

a 1% increase in distance leads to a change of β2% in imports, keeping GDP fixed. In other words, one can separate the effect of distance on imports from the effect of economic size. From the output table in Section 3.1 one obtains that a 1% increase in distance decreases imports by about 2%.

• Keep in mind that determining distances between countries is a

complicated matter and results may change with the choice of the

method for computing distances. Our data are from CEPII, see also

Appendix 10.4.

• There may still be missing variables, see also Section 4.4.
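The ceteris-paribus and total effects for the trade example can be computed directly from the estimates in the output table of Section 3.1. In this sketch only the two coefficient values are taken from that table; the function name and the chosen changes of 0.01 in the logs (roughly 1%) are illustrative assumptions.

```python
# Coefficient estimates from the output table in Section 3.1 (log-log model).
b_gdp, b_dist = 1.088546, -1.970804

def delta_expected_log_imports(d_log_gdp, d_log_dist):
    """Change in the fitted conditional expectation of log imports."""
    return b_gdp * d_log_gdp + b_dist * d_log_dist

# Ceteris-paribus effects: change one regressor, hold the other fixed.
partial_gdp = delta_expected_log_imports(0.01, 0.0)
partial_dist = delta_expected_log_imports(0.0, 0.01)

# Total effect of changing both regressors simultaneously.
total = delta_expected_log_imports(0.01, 0.01)

print(partial_gdp, partial_dist, total)
```

Because the model is linear in the logs, the total effect is the sum of the two partial effects.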


Wage Example Continued

• In Section 2.3 it was assumed that hourly wage is determined by

wage = β0 + β1 educ + u.

Instead of a level-level model one may also consider a log-level model

ln(wage) = β0 + β1 educ + u. (3.3)

However, since we expect that experience also matters for hourly

wages, we want to include experience as well. We obtain

ln(wage) = β0 + β1 educ + β2 exper + v. (3.4)

What about the expected log wage given the variables educ and

exper?

E[ln(wage)|educ, exper] = β0 + β1 educ + β2 exper +E[v|educ, exper]

E[ln(wage)|educ, exper] = β0 + β1 educ + β2 exper,


where the second equation only holds if MLR.4 holds, that is if

E[v|educ, exper] = 0.

• Note that if instead of (3.4) one investigates the simple linear log-level model (3.3) although the population model contains exper, one obtains

$$E[\ln(\text{wage}) \mid \text{educ}] = \beta_0 + \beta_1\, \text{educ} + \beta_2\, E[\text{exper} \mid \text{educ}] + E[v \mid \text{educ}],$$

indicating misspecification of the simple model since it ignores the influence of exper via β2. Thus, the smaller model suffers from misspecification if

$$E[\ln(\text{wage}) \mid \text{educ}] \neq E[\ln(\text{wage}) \mid \text{educ}, \text{exper}]$$

for some values of educ or exper.
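The consequence of leaving out exper can be illustrated with simulated data. The sketch below assumes a hypothetical population with β1 = 0.10, β2 = 0.01 and E[exper|educ] = 40 − 2·educ; these values are invented for illustration and are not estimates from wage1.wf1.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
b0, b1, b2 = 0.2, 0.10, 0.01                     # hypothetical population parameters

educ = rng.normal(12.0, 2.5, size=n)
exper = 40.0 - 2.0 * educ + rng.normal(0.0, 5.0, size=n)   # E[exper|educ] = 40 - 2*educ
v = rng.normal(0.0, 0.4, size=n)                 # E[v|educ, exper] = 0 by construction
log_wage = b0 + b1 * educ + b2 * exper + v

# Simple model: regress log wage on educ only.
X_short = np.column_stack([np.ones(n), educ])
coef_short = np.linalg.lstsq(X_short, log_wage, rcond=None)[0]

# Multiple regression: include exper as well.
X_long = np.column_stack([np.ones(n), educ, exper])
coef_long = np.linalg.lstsq(X_long, log_wage, rcond=None)[0]

print("slope on educ, simple model:  ", coef_short[1])   # near b1 + b2 * (-2.0)
print("slope on educ, multiple model:", coef_long[1])    # near b1
```

In this setup the simple model's slope picks up β2 times the slope of E[exper|educ], in line with the expression for E[ln(wage)|educ] above.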


• Empirical results:

See Example 2.10 in Wooldridge (2009), file: wage1.wf1, output from EViews 6:

– Simple log-level model

Dependent Variable: LOG(WAGE)

Method: Least Squares, Date: 02/17/09 Time: 15:59

Sample: 1 526, Included observations: 526

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C 0.583773 0.097336 5.997510 0.0000

EDUC 0.082744 0.007567 10.93534 0.0000

====================================================================

R-squared 0.185806 Mean dependent var 1.623268

Adjusted R-squared 0.184253 S.D. dependent var 0.531538

S.E. of regression 0.480079 Akaike info criterion 1.374061

Sum squared resid 120.7691 Schwarz criterion 1.390279

Log likelihood -359.3781 Hannan-Quinn criter. 1.380411

F-statistic 119.5816 Durbin-Watson stat 1.801328

Prob(F-statistic) 0.000000

====================================================================


$$\ln(\text{wage}_i) = 0.5838 + 0.0827\, \text{educ}_i + \hat u_i, \quad i = 1, \dots, 526, \qquad R^2 = 0.1858.$$

If SLR.1 to SLR.4 are valid, then each additional year of schooling is estimated to increase hourly wages by 8.3% on average. The sample regression model explains about 18.6% of the variation of the dependent variable ln(wage).
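The 8.3% figure rests on the approximation %ΔE[y|x] ≈ 100 β1 Δx from Section 2.6. As a rough check, the sketch below compares it with the change implied by the exponential form for one additional year of schooling, using the slope estimate from the output above and assuming that $E[e^{u}|x]$ is constant.

```python
import math

beta1_hat = 0.082744    # EDUC coefficient from the simple log-level output

approx_pct = 100.0 * beta1_hat                      # approximation 100 * beta1 * (delta x = 1)
exact_pct = 100.0 * (math.exp(beta1_hat) - 1.0)     # implied by the exponential form

print(f"approximate: {approx_pct:.2f}%  implied: {exact_pct:.2f}%")
```

The two values are close for a coefficient of this size; for larger coefficients the approximation understates the implied percentage change.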


– Multivariate log-level model:

Dependent Variable: LOG(WAGE), Method: Least Squares

Date: 02/17/09 Time: 15:48

Sample: 1 526, Included observations: 526

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C 0.216854 0.108595 1.996909 0.0464

EDUC 0.097936 0.007622 12.84839 0.0000

EXPER 0.010347 0.001555 6.653393 0.0000

====================================================================

R-squared 0.249343 Mean dependent var 1.623268

Adjusted R-squared 0.246473 S.D. dependent var 0.531538

S.E. of regression 0.461407 Akaike info criterion 1.296614

Sum squared resid 111.3447 Schwarz criterion 1.320940

Log likelihood -338.0094 Hannan-Quinn criter. 1.306139

F-statistic 86.86167 Durbin-Watson stat 1.789452

Prob(F-statistic) 0.000000

====================================================================

$$\ln(\text{wage}_i) = 0.2169 + 0.0979\, \text{educ}_i + 0.0103\, \text{exper}_i + \hat u_i, \quad i = 1, \dots, 526, \qquad R^2 = 0.2493.$$


∗ Ceteris-paribus interpretation: If MLR.1 to MLR.4 are correct, then the expected increase in hourly wages due to an additional year of schooling is about 9.8% and thus slightly larger than obtained from the simple regression model.

An additional year of experience corresponds to an increase in expected hourly wages by 1%.

∗ Model fit:

The model explains 24.9% of the variation of the dependent variable. Does this imply that the multivariate model is better than the simple regression model with an R² of 18.6%? Be careful with your answer and wait until we investigate model selection criteria.


3.3 The OLS Estimator: Derivation and Algebraic Properties

• For an arbitrary estimator the sample regression model for a sample $(y_i, x_{i1}, \dots, x_{ik})$, $i = 1, \dots, n$, is given by

$$y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \hat\beta_2 x_{i2} + \dots + \hat\beta_k x_{ik} + \hat u_i, \quad i = 1, \dots, n.$$

• Recall the idea of the OLS estimator: Choose $\hat\beta_0, \dots, \hat\beta_k$ such that the sum of squared residuals (SSR)

$$SSR(\hat\beta_0, \dots, \hat\beta_k) = \sum_{i=1}^n \hat u_i^2 = \sum_{i=1}^n \left(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik}\right)^2$$

is minimized. Taking first partial derivatives of $SSR(\hat\beta_0, \dots, \hat\beta_k)$ with respect to all k + 1 parameters and setting them to zero yields the first order conditions of a minimum:

$$\sum_{i=1}^n \left(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik}\right) = 0 \tag{3.5a}$$

$$\sum_{i=1}^n x_{i1} \left(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik}\right) = 0 \tag{3.5b}$$

$$\vdots$$

$$\sum_{i=1}^n x_{ik} \left(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik}\right) = 0 \tag{3.5c}$$

This system of normal equations contains k + 1 unknown parameters and k + 1 equations. Under some further conditions (see below) it has a unique solution.

Solving this set of equations becomes cumbersome if k is large. This can be circumvented if the normal equations are written in matrix notation.


• The Multiple Regression Model in Matrix Form

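Using matrix notation the multiple regression model can be rewritten as (Wooldridge 2009, Appendix E)

$$\mathbf{y} = \mathbf{X}\boldsymbol\beta + \mathbf{u}, \tag{3.6}$$

where

$$\underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}}_{\mathbf{y}} = \underbrace{\begin{pmatrix} x_{10} & x_{11} & x_{12} & \cdots & x_{1k} \\ x_{20} & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & & \vdots \\ x_{n0} & x_{n1} & x_{n2} & \cdots & x_{nk} \end{pmatrix}}_{\mathbf{X}} \underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix}}_{\boldsymbol\beta} + \underbrace{\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}}_{\mathbf{u}}.$$

The matrix X is called the regressor matrix and has n rows and k + 1 columns. The column vectors y and u have n rows each, the column vector β has k + 1 rows.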


• Derivation: The OLS Estimator in Matrix Notation

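– One possibility to derive the OLS estimator in matrix notation is to rewrite the normal equations (3.5) in matrix notation. We do this explicitly for the j-th equation

$$\sum_{i=1}^n x_{ij} \left(y_i - \hat\beta_0 x_{i0} - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik}\right) = 0$$

that is manipulated to

$$\sum_{i=1}^n \left(x_{ij} y_i - \hat\beta_0 x_{ij} x_{i0} - \hat\beta_1 x_{ij} x_{i1} - \dots - \hat\beta_k x_{ij} x_{ik}\right) = 0$$

and further to

$$\sum_{i=1}^n \left(\hat\beta_0 x_{ij} x_{i0} + \hat\beta_1 x_{ij} x_{i1} + \dots + \hat\beta_k x_{ij} x_{ik}\right) = \sum_{i=1}^n x_{ij} y_i.$$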


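By factoring out we have

$$\left(\sum_{i=1}^n x_{ij} x_{i0}\right) \hat\beta_0 + \left(\sum_{i=1}^n x_{ij} x_{i1}\right) \hat\beta_1 + \dots + \left(\sum_{i=1}^n x_{ij} x_{ik}\right) \hat\beta_k = \sum_{i=1}^n x_{ij} y_i.$$

Similarly, rearranging all other equations and collecting all k + 1 equations in a vector delivers

$$\begin{pmatrix} \left(\sum_{i=1}^n x_{i0} x_{i0}\right) \hat\beta_0 + \left(\sum_{i=1}^n x_{i0} x_{i1}\right) \hat\beta_1 + \dots + \left(\sum_{i=1}^n x_{i0} x_{ik}\right) \hat\beta_k \\ \vdots \\ \left(\sum_{i=1}^n x_{ik} x_{i0}\right) \hat\beta_0 + \left(\sum_{i=1}^n x_{ik} x_{i1}\right) \hat\beta_1 + \dots + \left(\sum_{i=1}^n x_{ik} x_{ik}\right) \hat\beta_k \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^n x_{i0} y_i \\ \vdots \\ \sum_{i=1}^n x_{ik} y_i \end{pmatrix}.$$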


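Applying the rules for matrix multiplication yields

$$\underbrace{\begin{pmatrix} \sum_{i=1}^n x_{i0} x_{i0} & \sum_{i=1}^n x_{i0} x_{i1} & \cdots & \sum_{i=1}^n x_{i0} x_{ik} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_{i=1}^n x_{ik} x_{i0} & \sum_{i=1}^n x_{ik} x_{i1} & \cdots & \sum_{i=1}^n x_{ik} x_{ik} \end{pmatrix}}_{\mathbf{X}'\mathbf{X}} \underbrace{\begin{pmatrix} \hat\beta_0 \\ \vdots \\ \hat\beta_k \end{pmatrix}}_{\hat{\boldsymbol\beta}} = \underbrace{\begin{pmatrix} \sum_{i=1}^n x_{i0} y_i \\ \vdots \\ \sum_{i=1}^n x_{ik} y_i \end{pmatrix}}_{\mathbf{X}'\mathbf{y}}$$

as well as the normal equations in matrix notation

$$(\mathbf{X}'\mathbf{X})\hat{\boldsymbol\beta} = \mathbf{X}'\mathbf{y}. \tag{3.7}$$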


– Note: The matrix $\mathbf{X}'\mathbf{X}$ has k + 1 columns and rows so that it is a square matrix.

The inverse $(\mathbf{X}'\mathbf{X})^{-1}$ exists if all columns (and rows) of $\mathbf{X}'\mathbf{X}$ are linearly independent. This can be shown to be the case if all columns of X are linearly independent.

This is exactly what the next assumption states.

Assumption MLR.3 (No Perfect Collinearity):

In the sample none of the regressors can be expressed as an exact

linear combination of one or more of the other regressors.

Is this a restrictive assumption?
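What a violation of MLR.3 looks like can be seen from the rank of the regressor matrix. The sketch below builds a simulated regressor x3 as an exact linear combination of x1 and x2; the data and the coefficients 2.0 and −3.0 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 - 3.0 * x2            # exact linear combination of x1 and x2

X_ok = np.column_stack([np.ones(n), x1, x2])        # constant plus two regressors
X_bad = np.column_stack([np.ones(n), x1, x2, x3])   # adds the redundant regressor

rank_ok = np.linalg.matrix_rank(X_ok)     # equals the number of columns
rank_bad = np.linalg.matrix_rank(X_bad)   # smaller than the number of columns

print(X_ok.shape[1], rank_ok)
print(X_bad.shape[1], rank_bad)
```

When the rank falls short of the number of columns, X′X is singular and the normal equations no longer have a unique solution.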


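– Finally, multiply the normal equation (3.7) by $(\mathbf{X}'\mathbf{X})^{-1}$ from the left and obtain the OLS estimator in matrix notation:

$$\hat{\boldsymbol\beta} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}. \tag{3.8}$$

This is the compact notation for

$$\underbrace{\begin{pmatrix} \hat\beta_0 \\ \vdots \\ \hat\beta_k \end{pmatrix}}_{\hat{\boldsymbol\beta}} = \underbrace{\begin{pmatrix} \sum_{i=1}^n x_{i0} x_{i0} & \sum_{i=1}^n x_{i0} x_{i1} & \cdots & \sum_{i=1}^n x_{i0} x_{ik} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_{i=1}^n x_{ik} x_{i0} & \sum_{i=1}^n x_{ik} x_{i1} & \cdots & \sum_{i=1}^n x_{ik} x_{ik} \end{pmatrix}^{-1}}_{(\mathbf{X}'\mathbf{X})^{-1}} \underbrace{\begin{pmatrix} \sum_{i=1}^n x_{i0} y_i \\ \vdots \\ \sum_{i=1}^n x_{ik} y_i \end{pmatrix}}_{\mathbf{X}'\mathbf{y}}.$$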
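The matrix formula can be evaluated in a few lines. The following sketch uses simulated data with assumed coefficients (1.0, 0.5, −2.0) and solves the normal equations (X′X)b = X′y directly rather than forming the inverse explicitly, which is a common numerical choice and not something prescribed by the derivation above. The result is compared with NumPy's least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # first column x_{i0} = 1
beta = np.array([1.0, 0.5, -2.0])                            # hypothetical parameters
y = X @ beta + rng.normal(size=n)

# OLS estimator: solve the normal equations (X'X) b = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with a least-squares solver.
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_hat)
print(beta_lstsq)
```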


Algebraic Properties of the OLS Estimator

• $\mathbf{X}'\hat{\mathbf{u}} = \mathbf{0}$, that is $\sum_{i=1}^n x_{ij} \hat u_i = 0$ for $j = 0, \dots, k$.

Proof: Plugging $\mathbf{y} = \mathbf{X}\hat{\boldsymbol\beta} + \hat{\mathbf{u}}$ into the normal equation yields $(\mathbf{X}'\mathbf{X})\hat{\boldsymbol\beta} = (\mathbf{X}'\mathbf{X})\hat{\boldsymbol\beta} + \mathbf{X}'\hat{\mathbf{u}}$ and hence $\mathbf{X}'\hat{\mathbf{u}} = \mathbf{0}$.

• If $x_{i0} = 1$, $i = 1, \dots, n$, it follows that $\sum_{i=1}^n \hat u_i = 0$.

• For the special case k = 1, the algebraic properties of the simple linear regression model follow immediately.

• The point $(\bar y, \bar x_1, \dots, \bar x_k)$ is always located on the regression hyperplane if there is a constant in the model.

• The definitions for SST, SSE and SSR are like in the simple regression.

• If a constant term is included in the model, we can decompose

SST = SSE + SSR.
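These algebraic properties hold in every sample, so they can be verified on any data set. The sketch below uses simulated data with a constant in the model; the coefficient values are arbitrary assumptions, and SSE denotes the explained sum of squares as in the text.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # constant included
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)      # hypothetical parameters

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_fit = X @ beta_hat
resid = y - y_fit

orthogonality = X.T @ resid                  # X'u_hat, should be a zero vector
sst = np.sum((y - y.mean()) ** 2)            # total sum of squares
sse = np.sum((y_fit - y.mean()) ** 2)        # explained sum of squares
ssr = np.sum(resid ** 2)                     # residual sum of squares
fit_at_means = X.mean(axis=0) @ beta_hat     # fitted value at the regressor means

print(orthogonality)
print(sst, sse + ssr)
print(fit_at_means, y.mean())
```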


• The Coefficient of Determination:

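$R^2$ is defined as in the SLR case as

\[
R^2 = \frac{SSE}{SST}
\]

or, if there is an intercept in the model,

\[
R^2 = 1 - \frac{SSR}{SST}.
\]

It can be shown that $R^2$ is the squared empirical coefficient of correlation between the observed $y_i$'s and the explained $\hat{y}_i$'s, namely

\[
R^2 = \frac{\left(\sum_{i=1}^n (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})\right)^2}{\left(\sum_{i=1}^n (y_i - \bar{y})^2\right)\left(\sum_{i=1}^n (\hat{y}_i - \bar{\hat{y}})^2\right)} = \left[\mathrm{Corr}(y, \hat{y})\right]^2.
\]

Note that $\left[\mathrm{Corr}(y, \hat{y})\right]^2$ can be used even when $R^2$ is not useful.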
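The equality between $R^2$ and the squared correlation of observed and fitted values can be verified on any sample with an intercept. A short sketch with simulated data (hypothetical values; `np.corrcoef` returns the empirical correlation matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 0.8, 0.0, -0.3]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_fit = X @ beta_hat

ssr = np.sum((y - y_fit) ** 2)
sst = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ssr / sst                      # R^2 = 1 - SSR/SST (intercept included)
corr2 = np.corrcoef(y, y_fit)[0, 1] ** 2  # squared empirical correlation

print(r2, corr2)
```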


• Adjusted R2:

If we rewrite $R^2$ by expanding the SSR/SST term by $n$,

\[
R^2 = 1 - \frac{SSR/n}{SST/n},
\]

we can interpret $SSR/n$ and $SST/n$ as estimators for $\sigma^2$ and $\sigma_y^2$, respectively. They are biased estimators, however.

Using unbiased estimators instead, one obtains the "adjusted" $R^2$:

\[
\bar{R}^2 = 1 - \frac{SSR/(n-k-1)}{SST/(n-1)} = 1 - \frac{n-1}{n-k-1} \cdot \frac{SSR}{SST}.
\]


\[
\bar{R}^2 = 1 - \frac{n-1}{n-k-1}\left(1 - R^2\right) = \frac{-k}{n-k-1} + \frac{n-1}{n-k-1} \cdot R^2.
\]

Properties of $\bar{R}^2$ (see Section 6.3 in Wooldridge (2009)):

– $\bar{R}^2$ can increase or fall when an additional regressor is included.

– $\bar{R}^2$ always increases if an additional regressor reduces the unbiased estimate of the error variance.

Attention: As with $R^2$, one may not compare $\bar{R}^2$ across regression models with different $y$.

• The quantities $R^2$ and $\bar{R}^2$ are both called goodness-of-fit measures.
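The adjustment is a one-line computation. The helper below is an illustrative sketch (the function name `adjusted_r2` is an assumption, not taken from the slides); it implements $\bar{R}^2 = 1 - \frac{n-1}{n-k-1}(1 - R^2)$ and shows that the adjusted measure can become negative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for n observations and k regressors plus a constant."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 degrees of freedom")
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)


print(adjusted_r2(0.5, n=11, k=2))   # 1 - (10/8) * 0.5
print(adjusted_r2(0.05, n=11, k=2))  # negative for a poor fit with few observations
```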


3.4 The OLS Estimator: Statistical Properties

Assumptions (Recap):

• MLR.1 (Linearity in the Parameters)

• MLR.2 (Random Sampling)

• MLR.3 (No Perfect Collinearity)

• MLR.4 (Zero Conditional Mean)


3.4.1 The Unbiasedness of Parameter Estimates

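• Let MLR.1 through MLR.4 hold. Then we have $E[\hat{\beta}] = \beta$.

Proof:

\[
\begin{aligned}
\hat{\beta} &= (X'X)^{-1}X'y && \text{(MLR.3)} \\
&= (X'X)^{-1}X'(X\beta + u) && \text{(MLR.1)} \\
&= (X'X)^{-1}X'X\beta + (X'X)^{-1}X'u \\
&= \beta + (X'X)^{-1}X'u.
\end{aligned}
\]

Taking conditional expectation:

\[
\begin{aligned}
E[\hat{\beta}|X] &= \beta + E[(X'X)^{-1}X'u|X] \\
&= \beta + (X'X)^{-1}X'E[u|X] \\
&= \beta. && \text{(MLR.2 and MLR.4)}
\end{aligned}
\]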


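The last equality holds because

\[
E[u|X] = \begin{pmatrix} E[u_1|X] \\ E[u_2|X] \\ \vdots \\ E[u_n|X] \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix},
\]

where the latter follows from

\[
\begin{aligned}
E[u_i|X] &= E[u_i|x_{11}, \ldots, x_{1k}, \ldots, x_{nk}] \\
&= E[u_i|x_{i1}, \ldots, x_{ik}] && \text{(MLR.2)} \\
&= 0 && \text{(MLR.4)}
\end{aligned}
\]

for $i = 1, \ldots, n$.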
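A small Monte Carlo experiment can make the statement $E[\hat{\beta}|X] = \beta$ concrete. The sketch below is only an illustration under assumed values: the regressor matrix is held fixed, errors with $E[u|X] = 0$ are redrawn many times, and the OLS estimates are averaged.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 5000

X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # held fixed across replications
beta = np.array([1.0, 0.5, -2.0])                           # hypothetical true parameters

A = np.linalg.solve(X.T @ X, X.T)   # (X'X)^{-1} X', shape (3, n)
U = rng.normal(size=(n, reps))      # one column of errors per replication
Y = (X @ beta)[:, None] + U         # one simulated sample of y per column
B = A @ Y                           # each column is one OLS estimate

mean_beta_hat = B.mean(axis=1)      # Monte Carlo approximation of E[beta_hat | X]
print(mean_beta_hat)
```

The average is close to `beta` but not identical, since it is itself a simulation estimate.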


• The Danger of Omitted Variable Bias

We partition the $k + 1$ regressors into an $(n \times k)$ matrix $X_A$ and an $(n \times 1)$ vector $x_a$. This yields

\[
y = X_A\beta_A + x_a\beta_a + u. \tag{3.9}
\]

In the following it is assumed that the population regression model has the same structure as (3.9).

Trade Example Continued (from Section 3.2):

Assume that in the population imports depend on gdp, distance, and whether the trading countries share to some extent the same language:

\[
\ln(imports_i) = \beta_0 + \beta_1 \ln(gdp_i) + \beta_2 \ln(distance_i) + \beta_3 \ln(language_i) + u_i, \tag{3.10}
\]

so that $X_A$ includes the constant, $gdp_i$, and $distance_i$ and $x_a$ includes $language_i$, each for $i = 1, \ldots, n$.

Imagine now that you are only interested in the values of $\beta_A$ (the parameters for the constant, gdp, and distance), and that the regressor vector $x_a$ has to be omitted because, for instance, you have no data on it.

What effect does the omission of the variable $x_a$ have on the estimation of $\beta_A$ if, for example, the model

\[
y = X_A\beta_A + w \tag{3.11}
\]

is considered? Model (3.11) is frequently called the smaller model.

Or, stated differently, which estimation properties does the OLS estimator for $\beta_A$ have on the basis of the smaller model (3.11)?


Derivation:

– Denote the OLS estimator for $\beta_A$ from the smaller model by $\tilde{\beta}_A$. Following the proof of unbiasedness for the smaller model but replacing $y$ with the true population model (3.9) delivers

\[
\begin{aligned}
\tilde{\beta}_A &= (X_A'X_A)^{-1}X_A'y \\
&= (X_A'X_A)^{-1}X_A'(X_A\beta_A + x_a\beta_a + u) \\
&= \beta_A + (X_A'X_A)^{-1}X_A'x_a\beta_a + (X_A'X_A)^{-1}X_A'u.
\end{aligned}
\]

– By the law of iterated expectations, $E[u|X_A] = E\left[E[u|X_A, x_a]\,|\,X_A\right]$, and therefore $E[u|X_A] = E[0|X_A] = 0$ by validity of MLR.4 for the population model (3.9).


– Compute the conditional expectation of $\tilde{\beta}_A$. Treating the (unobserved) $x_a$ in the same way as $X_A$, one obtains

\[
E[\tilde{\beta}_A|X_A, x_a] = \beta_A + (X_A'X_A)^{-1}X_A'x_a\beta_a.
\]

Therefore the estimator $\tilde{\beta}_A$ is unbiased only if

\[
(X_A'X_A)^{-1}X_A'x_a\beta_a = 0. \tag{3.12}
\]

Take a closer look at the term on the left-hand side of (3.12), i.e. $(X_A'X_A)^{-1}X_A'x_a\beta_a$. One observes that

\[
\hat{\delta} = (X_A'X_A)^{-1}X_A'x_a
\]

is the OLS estimator of $\delta$ in a regression of $x_a$ on $X_A$:

\[
x_a = X_A\delta + \varepsilon.
\]


Condition (3.12) holds (and there is no bias) if

∗ $\hat{\delta} = 0$, so $x_a$ is uncorrelated with $X_A$ in the sample, or

∗ $\beta_a = 0$ holds and the smaller model is the population model.

If neither of these conditions holds, then $\tilde{\beta}_A$ is biased:

\[
E[\tilde{\beta}_A|X_A, x_a] = \beta_A + \hat{\delta}\beta_a.
\]

This means that the OLS estimator $\tilde{\beta}_A$ is in general biased for every parameter in the smaller model.

Since these biases are caused by using a regression model that misses a variable that is relevant in the population model, this kind of bias is called omitted variable bias and the smaller model is said to be misspecified. (See Appendix 3A.4 in Wooldridge (2009).)
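The conditional bias formula $\beta_A + \hat{\delta}\beta_a$ can be illustrated by simulation. The following sketch uses assumed values throughout (one observed regressor `x1`, one omitted regressor `xa` correlated with it); the regressors are held fixed, the smaller model is estimated on many simulated samples, and the average estimate is compared with the formula.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 200, 5000

x1 = rng.normal(size=n)
xa = 0.7 * x1 + rng.normal(size=n)          # omitted regressor, correlated with x1
XA = np.column_stack([np.ones(n), x1])      # regressors of the smaller model
beta_A = np.array([1.0, 2.0])               # hypothetical population parameters
beta_a = 1.5

# delta_hat: OLS coefficients from regressing xa on XA.
delta_hat = np.linalg.solve(XA.T @ XA, XA.T @ xa)
predicted_mean = beta_A + delta_hat * beta_a   # beta_A + delta_hat * beta_a

# Simulate y from the full model, estimate the smaller model each time.
U = rng.normal(size=(n, reps))
Y = (XA @ beta_A + xa * beta_a)[:, None] + U
B_tilde = np.linalg.solve(XA.T @ XA, XA.T @ Y)  # shape (2, reps)
simulated_mean = B_tilde.mean(axis=1)

print(predicted_mean, simulated_mean)
```

Both the intercept and the slope estimates are shifted away from `beta_A`, in line with the remark that the bias generally affects every parameter of the smaller model.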


– One may also ask about the unconditional bias. Applying the LIE delivers

\[
\begin{aligned}
E[\tilde{\beta}_A|X_A] &= \beta_A + E[\hat{\delta}|X_A]\beta_a, \\
E[\tilde{\beta}_A] &= \beta_A + E[\hat{\delta}]\beta_a.
\end{aligned}
\]

Interpretation: The second expression delivers the expected value of the OLS estimator if one keeps drawing new samples for $y$ and $X_A$. Thus, in repeated sampling there is only bias if there is correlation in the population between the variables in $X_A$ and $x_a$ since otherwise $E[\hat{\delta}] = 0$, cf. Section 2.4.


• Wage Example Continued (from Section 3.2):

– If the observed regressor educ is correlated with the unobserved variable ability, then the regressor $x_a = ability$ is missing in the regression and the OLS estimators, e.g. for the effect of an additional year of schooling, are biased.

– Interpretation of the various information sets for computing the expectation of $\tilde{\beta}_{educ}$:

∗ First consider

\[
E[\tilde{\beta}_{educ}|educ, exper, ability] = \beta_{educ} + \hat{\delta}\beta_{ability},
\]

where

\[
ability = \begin{pmatrix} 1 & educ & exper \end{pmatrix}\delta + \varepsilon.
\]

Then the conditional expectation above indicates the average of $\tilde{\beta}_{educ}$ computed over many different samples where each sample of workers is drawn in the following way: You always guarantee that each sample has the same number of workers with e.g. 10 years of schooling, 15 years of experience, and 150 units of ability and the same number of workers with 11 years of schooling, etc., so that for each combination of characteristics there is the same number of workers although the workers are not identical.

∗ Next consider

\[
E[\tilde{\beta}_{educ}|educ, exper] = \beta_{educ} + E[\hat{\delta}|educ, exper]\,\beta_{ability}.
\]

When drawing a new sample you only guarantee that the numbers of workers with a specific number of years of schooling and experience stay the same. In contrast to above, you do not control ability.


∗ Finally consider

\[
E[\tilde{\beta}_{educ}] = \beta_{educ} + E[\hat{\delta}]\,\beta_{ability}.
\]

Here you simply draw new samples where everything is allowed to vary. If you had, let's say, 50 workers with 10 years of schooling in one sample, you may have 73 workers with 10 years of schooling in another sample. This possibility is excluded in the two previous cases.


• Effect of omitted variables on the conditional mean:

– General terminology:

∗ If $E[y|x_A, x_a] \neq E[y|x_A]$, then the smaller model omitting $x_a$ is misspecified and estimation will suffer from omitted variable bias.

∗ If $E[y|x_A, x_a] = E[y|x_A]$, then the variable $x_a$ in the larger model is redundant and should be eliminated from the regression.

∗ Trade Example Continued: Assume that the population

regression model only contains the variables gdp and distance.

Then a simple regression model with gdp is misspecified and a

multiple regression model with gdp, distance, and language

contains the redundant variable language.


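– It can happen that for a misspecified model Assumptions MLR.1 to MLR.4 are fulfilled.

To see this, consider only one variable in $X_A$:

\[
E[y|x_A, x_a] = \beta_0 + \beta_A x_A + \beta_a x_a.
\]

Then, by the law of iterated expectations one obtains

\[
E[y|x_A] = \beta_0 + \beta_A x_A + \beta_a E[x_a|x_A].
\]

If, in addition, $E[x_a|x_A]$ is linear in $x_A$,

\[
x_a = \alpha_0 + \alpha_1 x_A + \varepsilon, \quad E[\varepsilon|x_A] = 0,
\]

one obtains

\[
\begin{aligned}
E[y|x_A] &= \beta_0 + \beta_A x_A + \beta_a(\alpha_0 + \alpha_1 x_A) \\
&= \gamma_0 + \gamma_1 x_A
\end{aligned}
\]

with $\gamma_0 = \beta_0 + \beta_a\alpha_0$ and $\gamma_1 = \beta_A + \beta_a\alpha_1$ being the parameters of the best linear predictor, see Section 2.4.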


– Note that in this case SLR.1 and SLR.4 are fulfilled for the smaller model although it is not the population model. However,

\[
E[y|x_A, x_a] \neq E[y|x_A]
\]

if $\beta_a \neq 0$ and $\alpha_1 \neq 0$.

– Thus, model choice matters, see Section 3.5. If controlling for $x_a$ is important (controlled random experiments, see Section 1.3), then the smaller model is not of much use if the differences between the expected values are large for some values of the regressors.

If one needs a model for prediction, the smaller model may be preferable since it exhibits smaller estimation variance, see Sections 3.4.3 and 3.5.

Reading: Section 3.3 in Wooldridge (2009).


3.4.2 The Variance of Parameter Estimates

• Assumption MLR.5 (Homoskedasticity):

\[
\mathrm{Var}(u_i|x_{i1}, \ldots, x_{ik}) = \sigma^2, \quad i = 1, \ldots, n.
\]

• Assumptions MLR.1 through MLR.5 are frequently called the Gauss-Markov assumptions.

• Note that by the Random Sampling assumption MLR.2 one has

\[
\begin{aligned}
\mathrm{Cov}(u_i, u_j|x_{i1}, \ldots, x_{ik}, x_{j1}, \ldots, x_{jk}) &= 0 \quad \text{for all } i \neq j,\ 1 \le i, j \le n, \\
\mathrm{Cov}(u_i, u_j) &= 0 \quad \text{for all } i \neq j,\ 1 \le i, j \le n,
\end{aligned}
\]

where for the latter equations the LIE was used. Because of MLR.2 one may also write

\[
\mathrm{Var}(u_i|x_{i1}, \ldots, x_{ik}) = \mathrm{Var}(u_i|X), \quad \mathrm{Cov}(u_i, u_j|X) = 0,\ i \neq j.
\]


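One writes all $n$ variances and all covariances in a matrix

\[
\begin{aligned}
\mathrm{Var}(u|X) &\equiv
\begin{pmatrix}
\mathrm{Var}(u_1|X) & \mathrm{Cov}(u_1, u_2|X) & \cdots & \mathrm{Cov}(u_1, u_n|X) \\
\mathrm{Cov}(u_2, u_1|X) & \mathrm{Var}(u_2|X) & \cdots & \mathrm{Cov}(u_2, u_n|X) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(u_n, u_1|X) & \mathrm{Cov}(u_n, u_2|X) & \cdots & \mathrm{Var}(u_n|X)
\end{pmatrix} \\
&= \sigma^2
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}
\end{aligned}
\tag{3.13}
\]

or short (MLR.2 and MLR.5 together)

\[
\mathrm{Var}(u|X) = \sigma^2 I. \tag{3.14}
\]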


• Variance of the OLS Estimator

Under the Gauss-Markov Assumptions MLR.1 to MLR.5 we have

\[
\mathrm{Var}(\hat{\beta}_j|X) = \frac{\sigma^2}{SST_j(1 - R_j^2)}, \quad x_j \text{ not constant}, \tag{3.15}
\]

where $SST_j$ is the total sample variation (total sum of squares) of the $j$-th regressor,

\[
SST_j = \sum_{i=1}^n (x_{ij} - \bar{x}_j)^2,
\]

and the coefficient of determination $R_j^2$ is taken from a regression of the $j$-th regressor on all other regressors

\[
x_{ij} = \delta_0 x_{i0} + \cdots + \delta_{j-1}x_{i,j-1} + \delta_{j+1}x_{i,j+1} + \cdots + \delta_k x_{ik} + v_i, \quad i = 1, \ldots, n. \tag{3.16}
\]

(See Appendix 3A.5 in Wooldridge (2009) for the proof.)


Interpretation of the variance of the OLS estimator:

– The larger the error variance $\sigma^2$, the larger is the variance of $\hat{\beta}_j$.

Note: This is a property of the population so that this variance component cannot be influenced by sample size. (In analogy to the simple regression model.)

– The larger the total sample variation $SST_j$ of the $j$-th regressor $x_j$ is, the smaller is the variance $\mathrm{Var}(\hat{\beta}_j|X)$.

Note: The total sample variation can always be increased by increasing sample size since adding another observation increases $SST_j$.

– If $SST_j = 0$, assumption MLR.3 fails to hold.


– The larger the coefficient of determination $R_j^2$ from regression (3.16) is, the larger is the variance of $\hat{\beta}_j$.

– The larger $R_j^2$, the better the variation in $x_j$ can be explained by variation in the other regressors because in this case there is a high degree of linear dependence between $x_j$ and the other explanatory variables.

Then only a small part of the sample variation in $x_j$ is specific to the $j$-th regressor (precisely the error variation in (3.16)). The other part of the variation can be explained equally well by the estimated linear combination of all other regressors. The estimator cannot attribute this shared variation clearly to either $x_j$ or the linear combination of the remaining variables and thus suffers from a larger estimation variance.


– Special cases:

∗ $R_j^2 = 0$: Then $x_j$ and all other explanatory variables are empirically uncorrelated and the parameter estimator $\hat{\beta}_j$ is unaffected by all other regressors.

∗ $R_j^2 = 1$: Then MLR.3 fails to hold.

∗ $R_j^2$ near 1: This situation is called multicollinearity or near collinearity. In this case $\mathrm{Var}(\hat{\beta}_j|X)$ is very large.

– But: The multicollinearity problem is reduced in larger samples because $SST_j$ rises and hence the variance decreases for a given value of $R_j^2$. Therefore multicollinearity is always a problem of small sample sizes, too.
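The role of $R_j^2$ in (3.15) can be made tangible through the factor $1/(1 - R_j^2)$, often called the variance inflation factor (a term not used on the slides). The sketch below is illustrative only: the helper `r2_j` and all data are assumptions, and two designs with weakly and strongly correlated regressors are compared.

```python
import numpy as np


def r2_j(X: np.ndarray, j: int) -> float:
    """R^2 from regressing column j of X on all other columns (X contains a constant)."""
    xj = X[:, j]
    others = np.delete(X, j, axis=1)
    coef, *_ = np.linalg.lstsq(others, xj, rcond=None)
    resid = xj - others @ coef
    return 1.0 - resid @ resid / np.sum((xj - xj.mean()) ** 2)


rng = np.random.default_rng(5)
n = 500
x1 = rng.normal(size=n)
x2_weak = 0.1 * x1 + rng.normal(size=n)           # weakly related to x1
x2_strong = x1 + 0.05 * rng.normal(size=n)        # nearly collinear with x1

X_weak = np.column_stack([np.ones(n), x1, x2_weak])
X_strong = np.column_stack([np.ones(n), x1, x2_strong])

vif_weak = 1.0 / (1.0 - r2_j(X_weak, 1))
vif_strong = 1.0 / (1.0 - r2_j(X_strong, 1))
print(vif_weak, vif_strong)
```

For a given $\sigma^2$ and $SST_j$, the variance of $\hat{\beta}_1$ in the nearly collinear design is larger by roughly the ratio of the two factors.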


• Estimation of the error variance σ2

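– Unbiased estimation of the error variance $\sigma^2$:

\[
\hat{\sigma}^2 = \frac{\hat{u}'\hat{u}}{n - (k + 1)}.
\]

– Properties of the OLS estimator (continued):

Call $sd(\hat{\beta}_j|X) = \sqrt{\mathrm{Var}(\hat{\beta}_j|X)}$ the standard deviation; then

\[
\widehat{sd}(\hat{\beta}_j|X) = \frac{\hat{\sigma}}{\left(SST_j(1 - R_j^2)\right)^{1/2}}
\]

is the standard error of $\hat{\beta}_j$.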


• Variance-covariance-matrix of the OLS estimator:

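Basics: The covariance between the estimators of the $j$-th and the $l$-th parameter, $\hat{\beta}_j$ and $\hat{\beta}_l$, is written as

\[
\mathrm{Cov}(\hat{\beta}_j, \hat{\beta}_l|X) = E[(\hat{\beta}_j - \beta_j)(\hat{\beta}_l - \beta_l)|X], \quad j, l = 0, 1, \ldots, k,
\]

where unbiasedness of the estimators is assumed. Collect all variances and covariances in a $((k + 1) \times (k + 1))$ matrix:

\[
\mathrm{Var}(\hat{\beta}|X) \equiv
\begin{pmatrix}
\mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_0|X) & \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1|X) & \cdots & \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_k|X) \\
\mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_0|X) & \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_1|X) & \cdots & \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_k|X) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(\hat{\beta}_k, \hat{\beta}_0|X) & \mathrm{Cov}(\hat{\beta}_k, \hat{\beta}_1|X) & \cdots & \mathrm{Cov}(\hat{\beta}_k, \hat{\beta}_k|X)
\end{pmatrix}
\]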


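Rewriting yields

\[
\begin{aligned}
\mathrm{Var}(\hat{\beta}|X) &=
\begin{pmatrix}
E[(\hat{\beta}_0 - \beta_0)(\hat{\beta}_0 - \beta_0)|X] & \cdots & E[(\hat{\beta}_0 - \beta_0)(\hat{\beta}_k - \beta_k)|X] \\
E[(\hat{\beta}_1 - \beta_1)(\hat{\beta}_0 - \beta_0)|X] & \cdots & E[(\hat{\beta}_1 - \beta_1)(\hat{\beta}_k - \beta_k)|X] \\
\vdots & \ddots & \vdots \\
E[(\hat{\beta}_k - \beta_k)(\hat{\beta}_0 - \beta_0)|X] & \cdots & E[(\hat{\beta}_k - \beta_k)(\hat{\beta}_k - \beta_k)|X]
\end{pmatrix} \\
&= E\left[
\begin{pmatrix} \hat{\beta}_0 - \beta_0 \\ \vdots \\ \hat{\beta}_k - \beta_k \end{pmatrix}
\begin{pmatrix} (\hat{\beta}_0 - \beta_0) & \cdots & (\hat{\beta}_k - \beta_k) \end{pmatrix}
\,\Bigg|\, X \right].
\end{aligned}
\]

Next it will be shown that

\[
\mathrm{Var}(\hat{\beta}|X) = E\left[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'|X\right] = \sigma^2(X'X)^{-1}.
\]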


Proof:

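Remember that correct model specification implies

\[
\hat{\beta} = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + u) = \beta + (X'X)^{-1}X'u,
\]

hence $\hat{\beta} - \beta = (X'X)^{-1}X'u$, which inserted into $\mathrm{Var}(\hat{\beta}|X)$ yields

\[
\begin{aligned}
E\left[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'|X\right]
&= E\left[(X'X)^{-1}X'u\left((X'X)^{-1}X'u\right)'|X\right] \\
&= E\left[(X'X)^{-1}X'uu'X(X'X)^{-1}|X\right] \\
&= (X'X)^{-1}X'\underbrace{E[uu'|X]}_{\sigma^2 I_n}X(X'X)^{-1} \\
&= \sigma^2(X'X)^{-1}X'X(X'X)^{-1} \\
&= \sigma^2(X'X)^{-1}.
\end{aligned}
\]

From the definition of $\mathrm{Var}(\hat{\beta}|X)$ above it can be seen that the diagonal elements are the variances $\mathrm{Var}(\hat{\beta}_j|X)$, $j = 0, \ldots, k$.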
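The matrix expression $\sigma^2(X'X)^{-1}$ and the scalar formula (3.15) describe the same variances. The sketch below, on simulated data with hypothetical parameter values, plugs in the unbiased estimate $\hat{\sigma}^2$, computes the estimated variance-covariance matrix and standard errors, and checks for $j = 1$ that the diagonal element agrees with $\hat{\sigma}^2 / (SST_j(1 - R_j^2))$.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 80, 2
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat
sigma2_hat = u_hat @ u_hat / (n - (k + 1))       # unbiased error variance estimate
vcov_hat = sigma2_hat * np.linalg.inv(X.T @ X)   # estimated Var(beta_hat | X)
se = np.sqrt(np.diag(vcov_hat))                  # standard errors

# Scalar formula (3.15) for j = 1, with sigma^2 replaced by its estimate.
j = 1
others = np.delete(X, j, axis=1)
coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
v_hat = X[:, j] - others @ coef                  # residuals of regression (3.16)
sst_j = np.sum((X[:, j] - X[:, j].mean()) ** 2)
r2_j = 1.0 - v_hat @ v_hat / sst_j
var_j_formula = sigma2_hat / (sst_j * (1.0 - r2_j))

print(vcov_hat[j, j], var_j_formula, se)
```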


• Efficiency of OLS

Note: The OLS estimator is a linear estimator with respect to the dependent variable because it holds for given $X$ that

\[
\hat{\beta}_j = \sum_{i=1}^n \left(\frac{\hat{v}_i}{\sum_{l=1}^n \hat{v}_l^2}\right) y_i,
\]

where $\hat{v}_i$ are the residuals from regression (3.16). (For a derivation without matrix algebra see Appendix 3A.2 in Wooldridge (2009).)

Further, OLS is unbiased, so $E[\hat{\beta}_j] = \beta_j$.

Gauss-Markov Theorem: Under assumptions MLR.1 through MLR.5 the OLS estimator is the best linear unbiased estimator (BLUE).

"Best" means that the OLS estimator, which is unbiased since $E[\hat{\beta}_j] = \beta_j$, has minimal variance among linear unbiased estimators.
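The linearity of OLS in $y$ can be checked directly: the weights built from the residuals of regression (3.16) reproduce the corresponding element of the full OLS estimate. A minimal sketch with simulated data (all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 60
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # full OLS estimate

j = 1
others = np.delete(X, j, axis=1)
coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
v_hat = X[:, j] - others @ coef                # residuals from regressing x_j on the rest

weights = v_hat / (v_hat @ v_hat)              # weights depend on X only, not on y
beta_j_via_weights = weights @ y               # sum_i w_i * y_i

print(beta_hat[j], beta_j_via_weights)
```

Because the weights depend only on $X$, the estimator is a linear function of $y$ for given regressors, which is the class of estimators the Gauss-Markov theorem refers to.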


3.4.3 Trade-off between Bias and Multicollinearity

• Example: Let the population model be

y = β0 + β1x1 + β2x2 + u.

– For a given sample let $R_1^2$ be close to 1. Then $\beta_1$ is estimated with a large variance by (3.15).

– A possible solution? Leaving out the regressor $x_2$ and estimating the simple regression. But then, as already shown, the estimator of $\beta_1$ is biased.

Hence: If there is correlation between $x_1$ and $x_2$ near 1 or $-1$, then, for a given sample size, one faces a trade-off between variance and bias.

– What we observe is a kind of statistical uncertainty relation: The sample does not provide sufficient information to precisely answer the formulated question.

– The only good solution: Increasing sample size.

– Alternative solution: Combining highly correlated variables.

• Variance of parameter estimates in misspecified models:

Again, there are different ways in which incorrect regression models might be chosen (cf. Section 3.4.1):

– Too many variables: Parameters are estimated for variables that do not play a role in the "true" data generation mechanism (redundant variables).

– Too few variables: One or more variables are missing which are relevant in the population regression model (omitted variables).


– Wrong variables: A combination of both.

Effect on the variance of parameter estimators:

– Case 1 (redundant variables):

Consider the population model $y = X\beta + u$. Assume that instead the following sample specification is chosen:

\[
y = X\beta + z\alpha + w,
\]

where the vector $z$ contains all sample observations of the variable $z$. The variance of the parameter estimator $\hat{\beta}_j$ is

\[
\mathrm{Var}(\hat{\beta}_j|X) = \frac{\sigma^2}{SST_j(1 - R^2_{j,X,z})},
\]

where now $R^2_{j,X,z}$ is the coefficient of determination of a regression of $x_j$ on all other variables in $X$ and on $z$. It is easily seen that $R^2_{j,X,z} \ge R_j^2$ because fewer variables are included in the regression yielding the second $R^2$.

Therefore: Including additional variables in a regression model increases estimation variance or leaves it unchanged.

– Case 2 (omitted variables):

The converse of case 1 holds: If a variable is omitted, it can be shown that the estimation variance is smaller than (or equal to) the one obtained when using the true model.

– Case 3 (redundant and omitted variables):

Should really be avoided.

Correct model specification is crucial!
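Case 1 can be illustrated numerically: adding a variable `z` to the auxiliary regression of $x_j$ on the other regressors cannot lower its $R^2$, so the factor $1/(1 - R^2)$ in the variance formula cannot fall. The sketch below uses simulated data; the helper `r2_of_column` and all values are illustrative assumptions.

```python
import numpy as np


def r2_of_column(target: np.ndarray, regressors: np.ndarray) -> float:
    """R^2 from regressing `target` on `regressors` (which include a constant)."""
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    resid = target - regressors @ coef
    return 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)


rng = np.random.default_rng(9)
n = 100
const = np.ones(n)
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
z = 0.8 * x1 + rng.normal(size=n)   # redundant for y, but correlated with x1

r2_without_z = r2_of_column(x1, np.column_stack([const, x2]))
r2_with_z = r2_of_column(x1, np.column_stack([const, x2, z]))

# Ratio of Var(beta_hat_1 | X) with z included to the variance without z.
variance_ratio = (1.0 - r2_without_z) / (1.0 - r2_with_z)
print(r2_without_z, r2_with_z, variance_ratio)
```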


3.5 Model Specification I: Model Selection Criteria

• Goal of model selection:

– In principle: find the population model.

– In practice: find the “best” model for the purpose of the analysis.

– More specific: Under the assumption that the population model is a multiple linear regression model, find all regressors that are included in the regression and their appropriate transformations (log or level or ...). Avoid omitting variables and including irrelevant variables.


• Brief theory of model selection:

– There are two issues:

a) the variable (model) choice,

b) the estimation variance.

– Consider a): Choose a goal function to evaluate different models.

A popular goal function is the mean squared error (MSE).

For fixed parameters it is defined as

\[
MSE = E\left[(y - \beta_0 x_0 - \beta_1 x_1 - \cdots - \beta_k x_k)^2\right], \tag{3.17}
\]

see also equation (2.17) for the simple regression case.

Choose the model for which the MSE is minimal.


Important cases:

∗ If $x_0, \ldots, x_k$ include all relevant variables, the population model is a multiple linear regression, and MSE is minimized with respect to the parameters, then

\[
MSE = E[u^2] = \sigma^2.
\]

∗ If relevant variables are missing, it can be shown that the MSE decomposes into variance and squared bias. For simplicity, omit all variables except $x_1$ and fit the simple linear regression

\[
y = \gamma_0 + \gamma_1 x_1 + v.
\]

Then

\[
\begin{aligned}
MSE_1 &= E\left[\big((y - E[y|x_1, \ldots, x_k]) + (E[y|x_1, \ldots, x_k] - E[y|x_1])\big)^2\right] \\
&= \sigma^2 + \left(E[y|x_1, \ldots, x_k] - E[y|x_1]\right)^2.
\end{aligned}
\]

Since the squared bias term is positive, $\left(E[y|x_1, \ldots, x_k] - E[y|x_1]\right)^2 > 0$, one clearly has $MSE < MSE_1$.

– Consider a) and b): If parameters have to be estimated, a further term enters the mean squared error, namely the variances and covariances for estimating the model parameters. One has

\[
MSE = \text{Variance of population error} + (\text{Bias of chosen model})^2 + \text{Estimation variance},
\]

where the estimation variance in general increases with the number of variables. Now it can happen that for minimizing MSE it is optimal to choose a model that omits variable(s). A typical case is prediction.

– Therefore, a reliable method for estimating the MSE is needed.


• What does not work:

– Selecting the model with the smallest standard error of the regression $\hat{\sigma}$ does not work.

∗ Why? It is always possible to select a model for which every residual is zero, that is $\hat{u}_i = 0$ for all $i = 1, \ldots, n$. Then $SSR = 0$ and the estimated standard error is zero as well, although the error variance $\sigma^2 > 0$ in the true model.

∗ How? Simply take $k + 1 = n$ regressors into the sample regression model which fulfil MLR.3 and solve the normal equations (3.5). Then you obtain a perfect fit since you have a linear equation system with $n$ equations and $n$ unknown parameters.

∗ Note that you can add any regressors that fulfil MLR.3 even if they have nothing to do with the population regression model.

∗ Note also that SSR remains constant or decreases if for a given sample of size $n$ a further regressor variable is added since the linear equation system obtains more flexibility to fit the sample observations. Therefore $\tilde{\sigma}^2 = SSR/n$ remains constant or decreases as well.

∗ For the variance estimator $\hat{\sigma}^2 = SSR/(n - k - 1)$ there are opposing effects: a decrease in SSR may be offset by the decrease in $n - k - 1$.

In sum, the standard error of regression tends to decrease when

additional regressors are added so that it is not suited for selecting

those variables that are part of the population model.

– Selecting the model with the largest R2 does not work either.

Why?


– Although the adjusted $R^2$ may fall or increase when another regressor is added, it also breaks down for $k + 1 = n$ since $R^2 = 1$ in this case as well.
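The perfect-fit argument is easy to reproduce. In the sketch below (an illustration with assumed values), the dependent variable is pure noise and the $n - 1$ non-constant regressors are unrelated random numbers, yet with $k + 1 = n$ regressors every residual is numerically zero and $R^2 = 1$.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 12
y = rng.normal(size=n)                                          # pure noise
X = np.column_stack([np.ones(n), rng.normal(size=(n, n - 1))])  # k + 1 = n regressors

# With a square, invertible X the normal equations reduce to X b = y.
beta_hat = np.linalg.solve(X, y)
u_hat = y - X @ beta_hat

ssr = u_hat @ u_hat
sst = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ssr / sst
print(ssr, r2)
```

A random Gaussian matrix satisfies MLR.3 with probability one, which is why no special construction of the regressors is needed here.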

• Solution: Use model selection criteria

– Basic idea:

\[
\text{Selection criterion} = \ln\frac{\hat{u}'\hat{u}}{n} + (k + 1) \cdot \text{penalty function}(n)
\]

∗ First term: an estimator of $\ln(\sigma^2)$ for the chosen model.

Note that the estimated variance $\tilde{\sigma}^2 = \hat{u}'\hat{u}/n$ is reduced by every additionally included independent variable.

∗ Second term: a penalty term punishing the number of parameters to avoid models that include redundant variables.


Because the true error variance is typically underestimated using $\tilde{\sigma}^2$, the penalty term penalizes the inclusion of additional regressors.

The penalty term increases with $k$, and the penalty function must be chosen such that it decreases with $n$ so that a large number of parameters matters less in large samples. Why?

∗ This implies a trade-off: Regressors are included in the model if the penalty is smaller than the decrease in estimated MSE. When choosing a criterion one determines how the trade-off is shaped.

∗ Rule: Choose among all considered candidate models the specification for which the criterion is minimal.


– Popular model selection criteria:

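∗ the Akaike Criterion (AIC)

$$\mathrm{AIC} = \ln\frac{\hat{u}'\hat{u}}{n} + (k+1)\,\frac{2}{n}, \tag{3.18}$$

∗ the Hannan-Quinn Criterion (HQ)

$$\mathrm{HQ} = \ln\frac{\hat{u}'\hat{u}}{n} + (k+1)\,\frac{2\ln(\ln n)}{n}, \tag{3.19}$$

∗ the Schwarz / Bayesian Information Criterion (SC/BIC)

$$\mathrm{SC} = \ln\frac{\hat{u}'\hat{u}}{n} + (k+1)\,\frac{\ln n}{n}. \tag{3.20}$$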

It is advisable always to check all criteria, although the researcher decides which one to use. In favorable cases, all criteria deliver the same result. Note that for standard sample sizes (n ≥ 16, so that ln(ln n) > 1) SC punishes additional parameters more than HQ, and HQ more than AIC.
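The three criteria (3.18) to (3.20) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `selection_criteria` is hypothetical, and the inputs are the sum of squared residuals and sample size from the constant-only wage regression shown in the EViews output of Section 4.1 (k = 0).

```python
import math


def selection_criteria(ssr, n, k):
    """Return AIC, HQ and SC as defined in (3.18)-(3.20).

    ssr : sum of squared OLS residuals u'u
    n   : number of observations
    k   : number of regressors, not counting the constant
    """
    fit = math.log(ssr / n)   # first term: ln(u'u / n)
    p = k + 1                 # number of estimated parameters
    return {
        "AIC": fit + p * 2.0 / n,
        "HQ": fit + p * 2.0 * math.log(math.log(n)) / n,
        "SC": fit + p * math.log(n) / n,
    }


# Constant-only wage regression: u'u = 7160.414, n = 526, k = 0.
crit = selection_criteria(ssr=7160.414, n=526, k=0)
for name, value in crit.items():
    print(f"{name}: {value:.6f}")
```

Software packages may report these criteria with an additive constant. For this regression the values computed here appear to differ from the EViews figures by 1 + ln(2π), which does not change the ranking of models.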


• Trade Example Continued:

– Model 1

LOG(TRADE_0_D_O) = -3.4610 + 0.7701*LOG(WDI_GDPUSDCR_O)

AIC = 4.282, HQ = 4.310, SC = 4.357

– Model 2

LOG(TRADE_0_D_O) = 4.8009 + 1.0885*LOG(WDI_GDPUSDCR_O) - 1.9708*LOG(CEPII_DIST)

AIC = 4.025, HQ = 4.068, SC = 4.138

– Model 3

LOG(TRADE_0_D_O) = -9.5789 + 1.3566*LOG(WDI_GDPUSDCR_O) - 1.1442*LOG(CEPII_DIST)

+ 3.1265*CEPII_COMCOL_REV

AIC = 3.694, HQ = 3.751, SC = 3.844

– Model 4

Y_LOG = -13.0268 + 1.3176*LOG(WDI_GDPUSDCR_O) - 0.6249*LOG(CEPII_DIST)

+ 3.3512*CEPII_COMCOL_REV + 2.1096*CEPII_COMLANG_OFF

AIC = 3.679, HQ = 3.751, SC = 3.867

– Comparing all four models, SC selects model 3 with regressors gdp, distance and common colonizer while

AIC selects model 4 with additional regressor common official language. See Appendix 10.4 for more details

on variables. One can nicely see that SC punishes additional variables more than AIC. Statistical tests may

provide further information on which model to choose, see Sections 4.3 onwards.
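The selection rule (choose the model with the minimal criterion value) can be applied mechanically to the values reported above. The following sketch uses only the numbers listed for Models 1 to 4; the helper name `select` is hypothetical.

```python
# Criterion values as reported for Models 1 to 4 in the trade example.
reported = {
    "Model 1": {"AIC": 4.282, "HQ": 4.310, "SC": 4.357},
    "Model 2": {"AIC": 4.025, "HQ": 4.068, "SC": 4.138},
    "Model 3": {"AIC": 3.694, "HQ": 3.751, "SC": 3.844},
    "Model 4": {"AIC": 3.679, "HQ": 3.751, "SC": 3.867},
}


def select(criterion):
    """Return all models that attain the minimal value of the criterion."""
    best = min(values[criterion] for values in reported.values())
    return [model for model, values in reported.items()
            if values[criterion] == best]


for name in ("AIC", "HQ", "SC"):
    print(name, "selects", select(name))
```

At the reported three-decimal precision HQ does not discriminate between Models 3 and 4, which is why the helper returns a list rather than a single model.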


4 Multiple Regression Analysis: Hypothesis Testing and Confidence Intervals

4.1 Basics of Statistical Tests

Foundations of statistical hypothesis testing

• In general: Statistical hypothesis tests allow statistically sound and

unambiguous answers to yes-or-no questions:

– Do men and women earn equal income in Kazakhstan?


– Do certain political attempts lead to a decrease in unemployment

in 2010?

– Are imports to Kazakhstan influenced by the gdp of exporting

countries?


• Elements of a statistical test:

1. Two disjoint hypotheses about the value(s) of (a) parameter(s)

θ in a population.

That means that one of the two competing hypotheses has to

hold in the population:

– Null hypothesis H0

– Alternative hypothesis H1

2. A test statistic T that is a function of some or all sample values

(X,y). We will denote it as t(X,y).

3. A decision rule, stating for which values of t(X,y) the null

hypothesis H0 is rejected and for which values the null is

not rejected.


More precisely: Partition the domain of the test statistic T in

two disjoint regions:

– Rejection region, critical region C

If the test statistic t(X,y) is located in the critical region, H0 is rejected:

Reject H0 if t(X,y) ∈ C

– Non-rejection region

If the test statistic t(X,y) falls into the non-rejection region,

H0 is not rejected:

Do not reject H0 if t(X,y) ∉ C

– Critical value c: Boundary between rejection and non-rejection

region.


• Properties of a test:

– Type I error, α error:

The type I error measures the probability (evaluated before the sample is taken) of rejecting H0 though H0 is correct in the population,

α = P(Reject H0 | H0 is true) = P(T ∈ C | H0 is true).

The type I error is frequently called the significance level or

size of a test.

– Type II error, β error:

The type II error gives the probability of not rejecting H0 though

it is wrong,

β = P (Not reject H0|H1 is true).


– Size of a test: The significance level or size equal to α has to

be fixed by the researcher before the test is carried out.

– Power of a test: The power of a test gives the probability of rejecting a wrong null hypothesis,

power = π = P(Reject H0 | H1 is true),

that is

π = 1 − P(Not reject H0 | H1 is true) = 1 − β.

To calculate C for a given α one has to know the probability

distribution of the test statistic under H0.


Deriving Tests about the Sample Mean:

1. Consider two disjoint hypotheses about the population mean.

(For example, the mean µ of hourly wages in the US in 1976.)

a) Null hypothesis

H0 : µ = µ0

(In our example: mean hourly wage is 6 US-$,

thus H0 : µ = 6)

b) Alternative hypothesis

H1 : µ ≠ µ0

(In the example: mean hourly wages are not 6 US-$,

thus H1 : µ ≠ 6)


2. Test statistic:

a) Choice of an estimator for the unknown mean µ, e.g. the OLS

estimator of a regression of hourly wages w on a constant:

Compute the sample mean

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} w_i$$

from a sample w1, . . . , wn with n observations.

b) Obtain the probability distribution of the estimator: For simplicity assume that individual wages wi are jointly normally distributed with expected value µ and variance $\sigma_w^2$, that is

$$w_i \sim N(\mu, \sigma_w^2).$$


From the properties of jointly normally distributed random variables it follows that

$$\hat{\mu} \sim N\left(\mu, \sigma^2_{\hat{\mu}}\right),$$

where $\sigma^2_{\hat{\mu}} = Var(\hat{\mu}) = Var\left(n^{-1}\sum w_i\right) = n^{-1}\sigma_w^2$.

c) In order to obtain a test statistic t(w1, . . . , wn) all unknown parameters have to be removed from the distribution. In this simple case this can be achieved by standardizing $\hat{\mu}$:

$$t(w_1,\dots,w_n) = \frac{\hat{\mu} - \mu}{\sigma_{\hat{\mu}}} \sim N(0, 1).$$

d) The test statistic t(w1, . . . , wn) can be calculated if we know µ and $\sigma_{\hat{\mu}}$. Assume for the moment that $\sigma_{\hat{\mu}}$ is known.


Which value does µ take under H0?

H0 : µ = µ0.

Under H0 we can compute the test statistic for a given sample as

$$t(w_1,\dots,w_n) = \frac{\hat{\mu} - \mu_0}{\sigma_{\hat{\mu}}} \sim N(0, 1).$$

3. Decision rule:

When should we reject H0 and in which case shouldn’t we?

(Now the significance level α has to be chosen!)

If the deviation of the estimate $\hat{\mu}$ from the null hypothesis value µ0 is large enough, one would reject H0.


[Figure: density f(t) of the test statistic under H0. The non-rejection region lies around 0 between the critical values −c and c; the two regions of rejection lie in the tails, each with probability of error α/2.]

Intuition: If t is very large (or very small) then

a) the estimated mean $\hat{\mu}$ is far from µ0 (under H0) and / or

b) the standard deviation $\sigma_{\hat{\mu}}$ of the estimated deviation is small relative to $\hat{\mu} - \mu_0$.


• When is |t| large enough (to reject H0)?

• Note: Under H0 it holds that

$$t(w_1,\dots,w_n) = \frac{\hat{\mu} - \mu_0}{\sigma_{\hat{\mu}}} \sim N(0, 1)$$

and hence for given α the rejection region C can be determined

(see figure).

• Formally:

P (T < −c|H0) + P (T > c|H0) = α

or in this case due to the symmetry of the normal distribution

$$P(T < -c \mid H_0) = \frac{\alpha}{2} \quad\text{and}\quad P(T > c \mid H_0) = \frac{\alpha}{2}.$$

The values of −c and c are tabulated — they are the α/2 and

1 − α/2 quantiles of the standard normal distribution.
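Instead of looking up a table, the critical values of the standard normal distribution can be computed directly. The sketch below uses `statistics.NormalDist` from the Python standard library; the function names are hypothetical and only illustrate the two-sided decision rule stated above.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha):
    """Return c with P(T < -c) = P(T > c) = alpha / 2 for T ~ N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def reject_two_sided(t, alpha):
    """Reject H0 if the statistic falls into either tail of the critical region."""
    return abs(t) > two_sided_critical_value(alpha)


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: c = {two_sided_critical_value(alpha):.4f}")
```

By symmetry, −c is the α/2 quantile and c the 1 − α/2 quantile, so only one call to the inverse distribution function is needed.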


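• Under H1 it holds that

$$\frac{\hat{\mu} - \mu}{\sigma_{\hat{\mu}}} \sim N(0, 1).$$

Expanding yields

$$\frac{\hat{\mu} - \mu + \mu_0 - \mu_0}{\sigma_{\hat{\mu}}} = \frac{\hat{\mu} - \mu_0}{\sigma_{\hat{\mu}}} + \frac{\mu_0 - \mu}{\sigma_{\hat{\mu}}} = \underbrace{\frac{\hat{\mu} - \mu_0}{\sigma_{\hat{\mu}}}}_{t(w_1,\dots,w_n)} - \underbrace{\frac{\mu - \mu_0}{\sigma_{\hat{\mu}}}}_{m}$$

and therefore we have under H1

$$t(w_1,\dots,w_n) = \frac{\hat{\mu} - \mu_0}{\sigma_{\hat{\mu}}} \sim N\left(\frac{\mu - \mu_0}{\sigma_{\hat{\mu}}}, 1\right)$$

since X ∼ N(m, 1) is equivalent to X − m ∼ N(0, 1).

• Conclusion: If H1 is true, then the density of t(w1, . . . , wn) is shifted by $(\mu - \mu_0)/\sigma_{\hat{\mu}}$.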


• In the figure exhibiting the density under H1 (for a specific value of µ ≠ µ0) the power can be seen as the sum of the two shaded areas because π = P(t < −c | H1) + P(t > c | H1).

[Figure: density f(t) of the test statistic under H1, centered at (µ − µ0)/σµ instead of 0. The non-rejection region of H0 lies between the critical values; the shaded areas over the two rejection regions of H0 are the probabilities of rejection, and their sum is the power.]

• For a given $\sigma_{\hat{\mu}}$, the power of the test increases with the distance between the null hypothesis value µ0 and the true value µ.

• Recall that if H0 is true, then $(\mu - \mu_0)/\sigma_{\hat{\mu}} = 0$ holds and one obtains the distribution under H0.

• It can further be seen that the type II error — given as 1 − π =

1 − (1 − β) = β — does not equal zero!
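The power of the two-sided test with known standard deviation follows directly from the shifted normal distribution derived above. The sketch below is a minimal illustration; the function name `power_two_sided` and the true means used in the loop are hypothetical, while 0.161026 is the standard deviation of the mean estimator from the wage example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def power_two_sided(mu, mu0, sigma_mu_hat, alpha):
    """Power of the two-sided test of H0: mu = mu0 when the true mean is mu.

    Under H1 the statistic is N(m, 1) with shift m = (mu - mu0) / sigma_mu_hat,
    so the power is P(T < -c) + P(T > c) evaluated for that shifted density.
    """
    c = Z.inv_cdf(1.0 - alpha / 2.0)
    m = (mu - mu0) / sigma_mu_hat
    return Z.cdf(-c - m) + (1.0 - Z.cdf(c - m))


# Hypothetical true means, H0: mu = 6, alpha = 0.05.
for mu in (6.0, 6.1, 6.3, 6.5):
    print(mu, round(power_two_sided(mu, 6.0, 0.161026, 0.05), 3))
```

For µ = µ0 the shift is zero and the function returns α, which is the statement that under H0 one recovers the size of the test.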

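4. There remains one problem: In real world applications we do not know the standard deviation of the mean estimator $\sigma_{\hat{\mu}} = \sigma_w/\sqrt{n}$.

Remedy: Estimation by

$$\hat{\sigma}_{\hat{\mu}} = \frac{\hat{\sigma}_w}{\sqrt{n}}.$$

Then one has the popular t statistic

$$t(w_1,\dots,w_n) = \frac{\hat{\mu} - \mu_0}{\hat{\sigma}_{\hat{\mu}}},$$

however, watch out!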


The test statistic is no longer normally distributed but follows a t distribution with n − 1 degrees of freedom (short $t_{n-1}$). Therefore, under H0,

$$t(w_1,\dots,w_n) = \frac{\hat{\mu} - \mu_0}{\hat{\sigma}_{\hat{\mu}}} \sim t_{n-1}.$$

To obtain the critical values satisfying

$$P(T < -c \mid H_0) = \frac{\alpha}{2} \quad\text{and}\quad P(T > c \mid H_0) = \frac{\alpha}{2},$$

the tables of the t distribution have to be consulted (see Appendix G, Table G.2 in Wooldridge (2009)).

Wage Example Continued:

Hourly wages wi, i = 1, . . . , 526 of US employees:

1. Hypotheses:

a) Null hypothesis: H0 : µ = 6

b) Alternative hypothesis: H1 : µ ≠ 6


2. Estimation and calculation of the t statistic in EViews:

====================================================================
Dependent Variable: WAGE
Method: Least Squares
Sample: 1 526
Included observations: 526
====================================================================
Variable      Coefficient   Std. Error   t-Statistic   Prob.
====================================================================
C             5.896103      0.161026     36.61580      0.0000
====================================================================
R-squared            0.000000    Mean dependent var      5.896103
Adjusted R-squared   0.000000    S.D. dependent var      3.693086
S.E. of regression   3.693086    Akaike info criterion   5.452701
Sum squared resid    7160.414    Schwarz criterion       5.460810
Log likelihood      -1433.060    Durbin-Watson stat      1.817647
====================================================================

Thus

$$\hat{\mu} = 5.896103, \qquad \hat{\sigma}_{\hat{\mu}} = 0.161026$$

and

$$t(w_1,\dots,w_{526}) = \frac{5.896103 - 6}{0.161026} = -0.64521878.$$


3. Determination of critical values:

Suppose a significance level of α = 5%. Then the critical value

c = t525,0.05 can be obtained from the table for the t distribution

with n − 1 = 525 degrees of freedom: c = t525,0.05 = 1.96.

4. Test decision: Do not reject H0 : µ = 6 since

−c = −1.96 < t = −0.645 < c = 1.96,

and therefore t ∉ C (the test statistic is not contained in the rejection region).
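The numbers of the two-sided wage test can be reproduced from the summary statistics in the EViews output. The sketch below makes one simplifying assumption: the critical value is taken from the standard normal distribution, which for 525 degrees of freedom agrees with the tabulated value 1.96 to two decimals. Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics taken from the EViews output above.
n = 526
mean_wage = 5.896103   # sample mean of WAGE
sd_wage = 3.693086     # S.D. dependent var
mu0 = 6.0              # value of the mean under H0
alpha = 0.05

# Estimated standard deviation of the mean estimator: sigma_w / sqrt(n).
se_mean = sd_wage / math.sqrt(n)

# t statistic for H0: mu = mu0.
t_stat = (mean_wage - mu0) / se_mean

# Assumption: normal approximation to the t distribution with 525 df.
c = NormalDist().inv_cdf(1.0 - alpha / 2.0)
reject = abs(t_stat) > c

print(f"se = {se_mean:.6f}, t = {t_stat:.4f}, c = {c:.2f}, reject H0: {reject}")
```

The computed standard error coincides with the "Std. Error" column of the regression on a constant, which confirms that the reported value is the sample standard deviation divided by the square root of n.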

5. However:

Do hourly wages wi really follow a normal distribution as assumed?

Examine the histogram of the sample observations wi (see the descriptive statistics option in EViews):


Result:

[Figure: histogram of the series WAGE; frequency axis from 0 to 140, hourly wage axis from 0 to 24.]

Series: WAGE

Sample 1 526

Observations 526

Mean 5.896103

Median 4.650000

Maximum 24.98000

Minimum 0.530000

Std. Dev. 3.693086

Skewness 2.007325

Kurtosis 7.970083

Jarque-Bera 894.6195

Probability 0.000000

• The normality condition for our test does not seem to be fulfilled.

The test result could be misleading!

• There are also tests that work without the normality assumption,

see Section 5.1.
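The Jarque-Bera value in the descriptive statistics can be checked by hand. The slides do not state the formula; the sketch below assumes the standard definition JB = n/6 · (S² + (K − 3)²/4), with skewness S and kurtosis K, and its asymptotic chi-square distribution with 2 degrees of freedom under normality. The function name is hypothetical.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic and its asymptotic p-value.

    Assumes the standard definition JB = n/6 * (S^2 + (K - 3)^2 / 4).
    Under normality JB is asymptotically chi-square with 2 degrees of
    freedom, whose survival function is exp(-x / 2).
    """
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Values from the descriptive statistics of WAGE above.
jb, p = jarque_bera(n=526, skewness=2.007325, kurtosis=7.970083)
print(f"JB = {jb:.4f}, p-value = {p:.3g}")
```

A normal distribution has skewness 0 and kurtosis 3, so both the strong right skew and the heavy tails of the wage data contribute to the large statistic.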


One- and two-sided hypothesis tests

• Two-sided tests

H0 : θ = θ0 versus H1 : θ ≠ θ0

• One-sided tests

– Tests with left-sided alternative hypothesis

H0 : θ ≥ θ0 versus H1 : θ < θ0

Notice: Often, also in Wooldridge (2009), you can read H0 : θ =

θ0 versus H1 : θ < θ0. This notation, however, is somewhat

imprecise since either H0 or H1 has to be true. This is not made

clear by the latter notation.


H0 : θ ≥ θ0 versus H1 : θ < θ0

[Figure: density f(t) under H0. The region of rejection lies in the left tail below the critical value c, with probability of error α; the non-rejection region lies to the right of c.]

∗ Decision rule:

t < c ⇒ Reject H0.

∗ You do not need a rejection region on the right hand side since

all θ > θ0 are elements of H0 and thus fall into the non-

rejection region.


∗ The critical value is obtained on the basis of the density for θ = θ0, since for a given critical value c the shaded area is then larger than for any θ > θ0, and one prefers a test for which the maximum of the type I error is controlled.


Wage Example Continued:

(In the following we ignore that wages are not normally distributed.)

∗ The null hypothesis states that mean hourly wages are US-$ 6

or more (H1 says it is less than US-$ 6):

H0 : µ ≥ 6 versus H1 : µ < 6

∗ Calculation of the test statistic: as in the two-sided case, because again µ0 is the boundary between null and alternative hypothesis:

$$t(w_1,\dots,w_{526}) = \frac{5.896103 - 6}{0.161026} = -0.64521878$$


∗ Calculation of the critical value: For α = 0.05 the critical value (note: one-sided test with left-sided alternative) from the t distribution with 525 degrees of freedom (df) is c = −1.645.

∗ Decision: Since

t = −0.64521878 > c = −1.645

the null hypothesis is not rejected.


– Test with right-sided alternative

H0 : θ ≤ θ0 versus H1 : θ > θ0

[Figure: density f(t) under H0. The non-rejection region lies to the left of the critical value c; the region of rejection lies in the right tail, with probability of error α.]

As with left-sided alternatives, but reversed.

• Why do we carry out one-sided tests? Consider the following issue: Provide statistical evidence that the mean wage is above $ 5.60.

– Since by using statistical tests we can never confirm but only


reject a hypothesis, we have to choose the alternative hypothesis

such that it reflects our conjecture. Here, this is a mean wage

larger than $ 5.60. Rejecting the null hypothesis then provides

statistical evidence for the alternative hypothesis. However, there

are exceptions to this rule, see e.g. Sections 4.6 and 4.7.

– We thus have to test if the mean wage is statistically significantly

larger than $ 5.60.

We therefore need a test with a one-sided alternative. Our pair

of hypotheses is

H0 : µ ≤ 5.60 versus H1 : µ > 5.60.

– For α = P (T > c|H0) = 0.05 the critical value is c = 1.645.

– Decision:

$$t = \frac{5.896103 - 5.60}{0.161026} = 1.8388521 > c = 1.645$$


⇒ Reject H0 (at size 5%); that is, the data provide statistical evidence that the mean wage is significantly above $ 5.60.

– If, on the contrary, we want to examine whether mean wages

deviate from $ 5.60 in any direction, the pair of hypotheses is:

H0 : µ = 5.60 versus H1 : µ ≠ 5.60.

Given the chosen significance level, α = 0.05, the critical values

are -1.96 and 1.96, respectively, and hence

−1.96 < 1.84 < 1.96.

Thus, the null hypothesis cannot be rejected.

– A false null hypothesis is therefore easier to reject if one has knowledge about the location of the alternative: the whole probability of error α can then be placed in one tail, so that the critical value is smaller in absolute value (1.645 instead of 1.96).
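The contrast between the one-sided and the two-sided decision for µ0 = 5.60 can be reproduced as follows. As in the text, the critical values 1.645 and 1.96 are those of the standard normal distribution, used here as an approximation to the t distribution with 525 degrees of freedom; variable names are illustrative.

```python
from statistics import NormalDist

mean_wage = 5.896103   # estimated mean from the EViews output
se_mean = 0.161026     # estimated standard deviation of the mean estimator
mu0 = 5.60
alpha = 0.05

t_stat = (mean_wage - mu0) / se_mean

z = NormalDist()
c_one_sided = z.inv_cdf(1.0 - alpha)        # whole alpha in the right tail
c_two_sided = z.inv_cdf(1.0 - alpha / 2.0)  # alpha / 2 in each tail

reject_right_sided = t_stat > c_one_sided      # H0: mu <= 5.60 vs H1: mu > 5.60
reject_two_sided = abs(t_stat) > c_two_sided   # H0: mu = 5.60 vs H1: mu != 5.60

print(f"t = {t_stat:.4f}")
print(f"one-sided: c = {c_one_sided:.3f}, reject H0: {reject_right_sided}")
print(f"two-sided: c = {c_two_sided:.3f}, reject H0: {reject_two_sided}")
```

The same statistic leads to opposite decisions because the one-sided test spends the entire significance level on the side where the alternative is located.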


p-values

• For every test statistic one can calculate the largest significance

level for which — given a sample of observations — the computed

test statistic would have just not led to a rejection of the null. This

probability is called p-value (probability value).

In case of a one-sided test with right-hand alternative one has

(Wooldridge 2009, Appendix C.6, p. 776)

P (T ≤ t(y)|H0) ≡ 1 − p

• Since P (T > t(y)|H0) = 1 − P (T ≤ t(y)|H0), one also has

P (T > t(y)|H0) = p

and thus it is common to say that the p-value is the smallest significance level at which the null can be rejected. Cf. Section 4.2, p. 133 in Wooldridge (2009).


• The decision rule of a test can also be stated in terms of p-values:

Reject H0 if the p-value is smaller than the significance level α.

Note: In the figure t is shorthand for t(y).

Left-sided test: p = P (T < t(X,y)),

Right-sided test: p = P (T > t(X,y))

Two-sided test: p = P (T < −|t(X,y)|) + P (T > |t(X,y)|)
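The three p-value formulas translate directly into code. The sketch below assumes a test statistic that is standard normal under H0 (as in the mean example with known variance, or approximately for large degrees of freedom); the function name `p_value` and the labels for the alternatives are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # assumed distribution of T under H0


def p_value(t, alternative):
    """p-value of an observed statistic t for the given alternative."""
    if alternative == "left":
        return Z.cdf(t)                       # P(T < t)
    if alternative == "right":
        return 1.0 - Z.cdf(t)                 # P(T > t)
    if alternative == "two-sided":
        return 2.0 * (1.0 - Z.cdf(abs(t)))    # P(T < -|t|) + P(T > |t|)
    raise ValueError("alternative must be 'left', 'right' or 'two-sided'")


# Statistic from the wage example with mu0 = 5.60.
t = 1.8388521
for alt in ("left", "right", "two-sided"):
    print(f"{alt:>9}: p = {p_value(t, alt):.4f}")
```

Comparing each p-value with α = 0.05 gives the same decisions as comparing the statistic with the critical values: rejection for the right-sided alternative, no rejection for the two-sided one.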


• Software packages (e.g. EViews) often give p-values for

H0 : θ = 0 versus H1 : θ ≠ 0.

Reading: Appendix C.6 in Wooldridge (2009).


4.2 Probability Distribution of the OLS Estimator

For the multiple regression model

y = Xβ + u

we assume MLR.1 to MLR.5, as we did in Sections 3.2 and 3.4.

• Recall from Section 3.4.1 that under MLR.1 the OLS estimator

$$\hat{\beta} = (X'X)^{-1}X'y$$

can be written as

$$\hat{\beta} = \beta + \underbrace{(X'X)^{-1}X'}_{W}\, u. \tag{4.1}$$


• In order to derive the probability distribution of a test statistic one

needs the probability distribution of the underlying estimators since

the former is a function of the latter. Furthermore, the probability

distribution of the OLS estimator is necessary to construct interval

estimators, see Section 4.5.

Conditioning on the regressor matrix X, it follows from (4.1) that the probability distribution of the OLS estimator only depends on the error vector u. Similarly to the case of testing the mean we make the assumption that the relevant random variables are normally distributed.


• Assumption MLR.6 (Normality of Errors):

Conditionally on the regressor matrix X, the sample errors ui are stochastically independently and identically normally distributed,

$$u_i \mid x_{i1},\dots,x_{ik} \sim \text{i.i.d. } N(0, \sigma^2), \quad i = 1,\dots,n.$$

Jointly with MLR.2, this can equivalently be written as: u is multivariate normal with mean zero and variance-covariance matrix σ²I,

$$u \mid X \sim N(0, \sigma^2 I).$$

• Of course, one could assume for the errors u any other probability

distribution. However, assuming normally distributed errors has two

advantages:

1. The probability distribution of the OLS estimator and derived test

statistics can easily be derived, see the remaining sections.


2. Under certain conditions the resulting probability distribution for the OLS estimator holds even if the errors are not normally distributed. Then it is called asymptotic distribution, see Chapter 5.

See Appendix B and D in Wooldridge (2009) for rules and properties

of normally distributed random variables and vectors.


• Properties of the multivariate normal distribution:

– If Z ∼ N (µ, σ2), then aZ + b ∼ N (aµ + b, a2σ2).

– If the random variables Z and V are jointly normally distributed, then Z and V are stochastically independent if and only if Cov(Z, V) = 0.

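– Every linear combination of a vector of identically and independently normally distributed random variables z ∼ N(µ, σ²I) is also normally distributed. Let

$$w = \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}, \qquad z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}.$$

Then

$$\sum_{j=1}^{n} w_j z_j = w'z \sim N\left(w'\mu, \sigma^2 w'w\right).$$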


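More generally, it holds for z = (z1, . . . , zn)′ ∼ N(µ, σ²I) and

$$W = \begin{pmatrix} w_{01} & w_{02} & \cdots & w_{0n} \\ w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & & \vdots \\ w_{k1} & w_{k2} & \cdots & w_{kn} \end{pmatrix}$$

that

$$\begin{pmatrix} \sum_{j=1}^{n} w_{0j} z_j \\ \vdots \\ \sum_{j=1}^{n} w_{kj} z_j \end{pmatrix} = Wz \sim N\left(W\mu, \sigma^2 WW'\right). \tag{4.2}$$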

• The property (4.2) for linear combinations of normally distributed random variables is very helpful for us since the OLS estimator (4.1) is just such a linear combination.


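Thus, one obtains

$$\hat{\beta} - \beta = Wu \sim N\left(0, \sigma^2 WW'\right).$$

Since WW′ = (X′X)⁻¹X′X(X′X)⁻¹ = (X′X)⁻¹, one obtains

$$\hat{\beta} \sim N\left(\beta, \sigma^2 (X'X)^{-1}\right).$$

Similarly one can show that

$$\hat{\beta}_j \sim N\left(\beta_j, \sigma^2_{\hat{\beta}_j}\right) \tag{4.3}$$

with $\sigma^2_{\hat{\beta}_j} = \dfrac{\sigma^2}{SST_j(1 - R_j^2)}$ (see (3.15) in Section 3.4).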

• Note that (4.3) generalizes the example of Section 4.1 for testing

hypotheses on the mean. If X is a column vector of ones, then

β0 = µ.
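The distributional result for the OLS estimator can be illustrated with a small Monte Carlo experiment. The sketch below is not part of the course material: it uses a hypothetical simple regression with fixed regressor values, normally distributed errors as in MLR.6, and checks that the simulated slope estimates have mean β1 and variance σ²/SST_x, which is (4.3) for a single regressor (where R_j² = 0).

```python
import random
import statistics

random.seed(12345)  # fixed seed so that the experiment is reproducible

# Hypothetical population model: y = beta0 + beta1 * x + u, u ~ N(0, sigma^2).
beta0, beta1, sigma = 1.0, 0.5, 2.0
x = [float(i) for i in range(20)]   # regressor values, held fixed (conditioning on X)
n = len(x)
x_bar = sum(x) / n
sst_x = sum((xi - x_bar) ** 2 for xi in x)


def ols_slope(y):
    """OLS slope estimate in a simple regression of y on a constant and x."""
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x


slopes = []
for _ in range(5000):
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    slopes.append(ols_slope(y))

mc_mean = statistics.fmean(slopes)
mc_var = statistics.variance(slopes)
theory_var = sigma ** 2 / sst_x     # sigma^2 / SST_x

print(f"mean of estimates: {mc_mean:.4f} (beta1 = {beta1})")
print(f"variance of estimates: {mc_var:.5f} (theory: {theory_var:.5f})")
```

Because the regressor values are held fixed across replications, the simulated distribution corresponds to the distribution conditional on X that is derived above.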


4.3 The t Test in the Multiple Regression Model

• Derivation of the test statistic and its distribution

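– From (4.3): $\hat{\beta}_j \sim N\left(\beta_j, \sigma^2_{\hat{\beta}_j}\right)$.

– Standardizing leads to

$$\frac{\hat{\beta}_j - \beta_j}{\sigma_{\hat{\beta}_j}} \sim N(0, 1).$$

For estimated σ² (no proof) the test statistic follows a t distribution with n − k − 1 degrees of freedom, since estimating k + 1 regression parameters implies k + 1 restrictions from the normal equations:

$$t(X, y) = \frac{\hat{\beta}_j - \beta_j}{\hat{\sigma}_{\hat{\beta}_j}} \sim t_{n-k-1}.$$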


• Critical region and decision rule

– Two-sided test

∗ Hypotheses:

H0 : βj = βj0 versus H1 : βj ≠ βj0.

For a given significance level one obtains the critical values from

the table of the t distribution such that P (T < −c|H0) = α/2

and P (T > c|H0) = α/2 or equivalently 2 ·P (T > c|H0) = α.

∗ Decision rule:

· Reject H0 if |t(X,y)| > c, otherwise do not reject H0.

· Alternatively: Calculate p-value

p = P(|T| > |t(X,y)| | H0) = 2 · P(T > |t(X,y)| | H0)

and reject H0 if p < α, otherwise do not reject H0.


– One-sided test with left-sided alternative

∗ Hypotheses:

H0 : βj ≥ βj0 versus H1 : βj < βj0.

For a given significance level one obtains the critical value from

the table of the t distribution such that

P (T < c|H0) = α.

∗ Decision rule:

· Reject H0 if t(X,y) < c, otherwise do not reject H0.

· Alternatively: Calculate p-value

p = P (T < t(X,y)|H0).

and reject H0 if p < α, otherwise do not reject H0.


– One-sided test with right-sided alternative

∗ Hypotheses:

H0 : βj ≤ βj0 versus H1 : βj > βj0.

For a given significance level one obtains the critical value from

the table of the t distribution such that

P (T > c|H0) = α.

∗ Decision rule:

· Reject H0 if t(X,y) > c, otherwise do not reject H0.

· Alternatively: Calculate p-value

p = P (T > t(X,y)|H0)

and reject H0 if p < α, otherwise do not reject H0.
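The three decision rules can be collected in one small routine. The sketch below is an illustration using only the Python standard library: since the standard library has no t distribution, its distribution function is obtained by numerically integrating the density with Simpson's rule, which is an implementation choice made here and not the method used by EViews or the course. All function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry of the density."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * t_pdf(i * h, df)
    area = total * h / 3.0              # integral of the density from 0 to |x|
    return 0.5 + area if x > 0 else 0.5 - area


def t_test(beta_hat, se, beta_null, df, alpha, alternative="two-sided"):
    """Return (t statistic, p-value, reject H0?) for a test on one coefficient."""
    t = (beta_hat - beta_null) / se
    if alternative == "two-sided":
        p = 2.0 * (1.0 - t_cdf(abs(t), df))
    elif alternative == "left":
        p = t_cdf(t, df)
    elif alternative == "right":
        p = 1.0 - t_cdf(t, df)
    else:
        raise ValueError("unknown alternative")
    return t, p, p < alpha


# Wage example: regression on a constant, df = n - k - 1 = 526 - 0 - 1 = 525.
print(t_test(5.896103, 0.161026, 6.0, df=525, alpha=0.05))
print(t_test(5.896103, 0.161026, 5.6, df=525, alpha=0.05, alternative="right"))
```

Working with p-values instead of critical values avoids the need for an inverse distribution function, because rejecting when p < α is equivalent to the critical-value rule.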


• Economic versus statistical significance

– For a given (statistical) significance level α, the power of a test increases with increasing sample size since $\sigma_{\hat{\beta}_j}$ in the denominator of the test statistic decreases with sample size.

– Not being able to reject a null hypothesis may thus simply be caused by too small a sample size (if the null hypothesis is wrong in the population).

– On the other hand, if a variable has only a weak influence in the population, its estimated parameter will be significantly different from zero if the sample size is large enough. Thus, even if βjxj has only a small economic impact on the dependent variable, the variable can be statistically significant.

– Be careful: In order to avoid estimation bias due to a model that is too small (omitted relevant variables), significant variables must be kept in the model, see Section 3.4.1.
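The interplay of sample size and statistical significance can be illustrated numerically. The following is a minimal sketch, assuming a simple regression with one regressor and evaluating the t statistic at the true parameter value; the function name `approx_t` and all numerical values are hypothetical and not taken from the trade example.

```python
import math

def approx_t(beta, sigma_u, sd_x, n):
    """Approximate t statistic for H0: beta = 0 in a simple regression.

    Uses se(beta_hat) ~ sigma_u / (sd_x * sqrt(n)), which follows from
    Var(beta_hat) = sigma_u^2 / SST_x with SST_x ~ n * sd_x^2.
    """
    se = sigma_u / (sd_x * math.sqrt(n))
    return beta / se

# Hypothetical values: a tiny true effect and unit standard deviations.
beta, sigma_u, sd_x = 0.01, 1.0, 1.0
for n in (100, 10_000, 1_000_000):
    t = approx_t(beta, sigma_u, sd_x, n)
    print(f"n = {n:>9}: t is about {t:5.2f}, significant at 5%: {abs(t) > 1.96}")
```

The economically small effect stays the same in all three rows; only the sample size decides whether it is statistically significant.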

• Choice of significance level

– Two reasons for decreasing the significance level α with increasing

sample size n:

∗ Larger sample sizes make tests more powerful. Thus, one can decide whether the benefit of a larger sample size should be used only to reduce the Type II error β = 1 − π or whether one also wants to decrease the Type I error. In standard significance testing, the Type I error is the probability of including a variable in the model although it is irrelevant in the population model. Thus, it makes sense to reduce this probability as well.


∗ In general one selects relevant variables from a large number of possibly relevant variables. Since each significance test is carried out at significance level α, one erroneously includes on average about αK redundant variables, where K denotes the total number of variables considered. Since K is frequently allowed to increase with sample size n, the significance level α should fall in order to keep αK from increasing.

– If one uses the Hannan-Quinn (HQ) (3.19) or the Schwarz (SC)

(3.20) model selection criterion, then the significance level de-

creases with sample size. This is not the case for the AIC criterion

(3.18).


• Insignificance, multicollinearity, and sample size

– Recall: The test statistic t(X,y) is small in absolute value if

∗ the deviation between the true value and the value under the null hypothesis is small, for example between βj and βj0,

∗ or the estimated standard error $\hat\sigma_{\hat\beta_j}$ of $\hat\beta_j$ is large.

The latter can also be caused by multicollinearity in X. Thus: A high degree of multicollinearity makes it less likely that the null hypothesis is rejected (since |t(X,y)| is small on average).

– For this reason one may keep insignificant variables in the regression. However, the corresponding parameter estimates then have to be interpreted with care.

Reading: Appendices C.5, E.3 in Wooldridge (2009) if needed.


4.4 Example of an Empirical Analysis I: A Simplified Gravity Equation

Trade Example Continued (from Section 3.5):

Compare steps of an econometric analysis, see Section 1.2.

1. Question of interest:

Quantify the impact of changes in the gdp of the exporting country on imports to Kazakhstan.

2. Economic model:

Under idealized assumptions including complete specialization in

production and identical consumption preferences among countries,

no trading costs, and focusing exclusively on imports, economic the-

ory implies (see Section II, equation (5) in Fratianni (2007))

$$\mathit{imports}_i = A \cdot \mathit{gdp}_i \cdot \mathit{distance}_i^{\beta_2}, \qquad \beta_2 < 0.$$


This implies a unit elasticity (elasticity of 1) of imports with respect to gdp. This means that a 1% increase in gdp in the exporting country increases imports by 1% as well.

This hypothesis can be statistically tested.

3. Econometric model:

The simplest econometric model is obtained by taking logs of the

economic model and adding an error term. This delivers

$$\ln(\mathit{imports}_i) = \beta_0 + \beta_1 \ln(\mathit{gdp}_i) + \beta_2 \ln(\mathit{distance}_i) + u_i.$$
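The step from the economic to the econometric model can be written out explicitly. The following short derivation is a sketch that only uses the economic model of step 2; it shows why the theory corresponds to the testable restriction β1 = 1 and to β0 = ln(A).

```latex
\begin{align*}
\ln(\mathit{imports}_i)
  &= \ln\!\big(A \cdot \mathit{gdp}_i \cdot \mathit{distance}_i^{\beta_2}\big) \\
  &= \underbrace{\ln(A)}_{\beta_0}
     + \underbrace{1}_{\beta_1} \cdot \ln(\mathit{gdp}_i)
     + \beta_2 \ln(\mathit{distance}_i).
\end{align*}
```

The econometric model relaxes the restriction β1 = 1 and adds the error term ui, so that the unit elasticity becomes a hypothesis about β1 instead of an assumption.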

4. Collecting data: see Appendix 10.4.

5. Selection and estimation of an econometric model:

In practice, there may be further variables influencing imports. Thus, further control variables have to be added. Based on the Schwarz criterion, the model selection exercise in Section 3.5 suggested adding the control variable common colonizer since 1945 (Model 3),

$$\ln(\mathit{imports}_i) = \beta_0 + \beta_1 \ln(\mathit{gdp}_i) + \beta_2 \ln(\mathit{distance}_i) + \beta_3 \mathit{comcol}_i + u_i.$$

====================================================================

Dependent Variable: LOG_IMP

Method: Least Squares

Sample: 1 55, Included observations: 52

====================================================================

Variable Coefficient Std. Error t-Statistic Prob.

====================================================================

C -9.578911 4.273583 -2.241424 0.0297

LOG(WDI_GDPUSDCR_O) 1.356566 0.128793 10.53290 0.0000

LOG(CEPII_DIST) -1.144269 0.441293 -2.592991 0.0126

CEPII_COMCOL_REV 3.126534 0.674896 4.632616 0.0000

====================================================================

R-squared 0.698665 Mean dependent var 15.97292

Adjusted R-squared 0.679832 S.D. dependent var 2.613094

S.E. of regression 1.478578 Akaike info criterion 3.693842

Sum squared resid 104.9372 Schwarz criterion 3.843937

Log likelihood -92.03988 Durbin-Watson stat 2.044512

====================================================================


6. Model diagnostics: Check possible violation of MLR.5 (Homoskedas-

ticity) by plotting the residuals against the fitted values and possible

violation of MLR.6 (Normal errors) by plotting a histogram of the

residuals:

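[Figure: scatter plot of the residuals RES_OLS_IMP_KAZ (vertical axis, −4 to 3) against the fitted values FIT_OLS_IMP_KAZ (horizontal axis, 10 to 22).]

[Figure: histogram of the residuals RES_OLS_IMP_KAZ (horizontal axis −4 to 3, frequencies 0 to 12), with the following summary statistics box:]

Series: RES_OLS_IMP_KAZ

Sample 1 55

Observations 52

Mean          -2.95e-15

Median        -0.103912

Maximum        2.909987

Minimum       -3.625761

Std. Dev.      1.434431

Skewness      -0.295179

Kurtosis       3.017182

Jarque-Bera    0.755772

Probability    0.685309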

The scatter plot does not indicate a violation of MLR.5. Why?

Inspecting the box to the right of the histogram shows that the estimated kurtosis is close to 3, which is the theoretical value implied by the normal distribution.

Thus, we may continue to use this model.
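The last two lines of the summary statistics box can be reproduced from the reported skewness and kurtosis. The following is a minimal sketch, assuming the usual Jarque-Bera formula with the number of observations n = 52 and the asymptotic chi-squared distribution with 2 degrees of freedom; the variable names are chosen for illustration only.

```python
import math

# Values read from the summary statistics box of the residual histogram.
n = 52
skewness = -0.295179
kurtosis = 3.017182

# Jarque-Bera statistic: under normality skewness is 0 and kurtosis is 3.
jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)

# Under H0 (normal errors) JB is asymptotically chi-squared with 2 degrees
# of freedom, whose survival function is exp(-x / 2).
p_value = math.exp(-jb / 2)

print(f"Jarque-Bera = {jb:.4f}, p-value = {p_value:.4f}")
```

Since the p-value is far above any usual significance level, the null hypothesis of normal errors is not rejected, which supports the conclusion drawn from the kurtosis.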

7. Usage of the model: Conduct tests:

A two-sided test

• Now we can formulate the pair of statistical hypotheses:

H0 : The elasticity of imports with respect to gdp is 1 versus H1 : The elasticity is unequal to 1.

H0 : β1 = 1 versus H1 : β1 ≠ 1.

• Compute the t statistic from the relevant line of the output:

Variable Coefficient Std. Error t-Statistic Prob.

LOG(WDI_GDPUSDCR_O) 1.356566 0.128793 10.53290 0.0000

$$t(X,y) = \frac{\hat\beta_1 - \beta_{10}}{\hat\sigma_{\hat\beta_1}} = \frac{1.356566 - 1}{0.128793} = 2.76852$$


• Choose a significance level, e.g. α = 0.05.

Compute critical values: The degrees of freedom are n−k−1 =

52 − 3 − 1 = 48. One may obtain an approximate critical value

from Table G.2 in Wooldridge (2009) or a precise critical value

e.g. from

– EViews using scalar crit = @qtdist(1-alpha/2,n-k-1)

in the command window or

– Excel using c =(TINV(alpha;n-k-1))=2.0106. (Note that

the Excel function already assumes a two-sided test.)

• Since

t(X,y) = 2.76852 > 2.0106 = c

one rejects the null hypothesis.

• p-values can be computed in EViews using

scalar pval = 2*(1-@ctdist(t,n-k-1)) = 0.0080. Thus, one can reject H0 even at the 1% significance level. The p-value means that, if H0 is true, we would observe a t statistic of at least 2.7685 in absolute value in only about 8 out of 1000 samples drawn.
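The two-sided test above can also be reproduced without EViews or Excel. The following is a minimal sketch in Python that uses only the standard library; the helper functions `t_pdf`, `t_cdf` and `t_quantile` are hypothetical stand-ins for @ctdist and @qtdist and approximate the t distribution numerically (Simpson's rule and bisection), so the results agree with the values above only up to small numerical error.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry around zero."""
    a = abs(x)
    h = a / steps                      # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Quantile by bisection on the CDF."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Numbers from the regression output of Model 3.
b1, se1, beta10 = 1.356566, 0.128793, 1.0
df = 52 - 3 - 1                        # n - k - 1 = 48
alpha = 0.05

t_stat = (b1 - beta10) / se1
crit = t_quantile(1 - alpha / 2, df)           # two-sided critical value
p_value = 2 * (1 - t_cdf(abs(t_stat), df))     # two-sided p-value
reject = abs(t_stat) > crit

print(f"t = {t_stat:.4f}, c = {crit:.4f}, p = {p_value:.4f}, reject H0: {reject}")
```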

One-sided test

• Now we can formulate the pair of statistical hypotheses with re-

spect to the sign of β2, that is the impact of distance on imports.

Since we want to provide evidence for β2 < 0, we put this into

H1:

H0 : β2 ≥ 0 versus H1 : β2 < 0.

• Compute the t statistic from the relevant line of the output:

Variable Coefficient Std. Error t-Statistic Prob.

LOG(CEPII_DIST) -1.144269 0.441293 -2.592991 0.0126

$$t(X,y) = \frac{\hat\beta_2 - \beta_{20}}{\hat\sigma_{\hat\beta_2}} = \frac{-1.144269 - 0}{0.441293} = -2.592991$$

• Choosing again α = 0.05, we compute the critical value using the EViews function scalar crit = @qtdist(alpha,n-k-1) = -1.6772.

• Since

t(X,y) = −2.592991 < −1.6772 = c

one rejects the null hypothesis. Thus, log distance has a statis-

tically significant negative impact on imports at the given signif-

icance level.

• The corresponding p-value using EViews is

scalar pval = @ctdist(t,n-k-1)= 0.0063. Thus, distance

has a negative impact even at the 1% significance level.
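The one-sided p-value can be cross-checked against the regression output. The column Prob. reports the two-sided p-value for the null hypothesis that the coefficient is zero; because the t distribution is symmetric and the t statistic points in the direction of H1, the one-sided p-value is half of it:

```latex
p_{\text{one-sided}}
  = P(T < -2.592991 \mid H_0)
  = \tfrac{1}{2}\, P(|T| > 2.592991 \mid H_0)
  = \tfrac{1}{2} \cdot 0.0126
  = 0.0063 .
```

This shortcut only works when the hypothesized value is 0 and the sign of the estimate agrees with the alternative; had the estimate been positive, the one-sided p-value would have been 1 − 0.0063.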


Interpretation of β3:

In principle, this parameter can be interpreted as in a log-level model, see Section 2.6. However, since the estimate $\hat\beta_3$ is very different from 0, and because the regressor is a dummy variable, one should use the exact formula to compute the relative change, see Section 6.3: the precise value is $e^{\hat\beta_3} - 1 = 21.8$. Thus, keeping everything else fixed, imports from countries with colonial ties are larger by about 2180%, i.e. almost 23 times as large as imports from other countries! This very likely reflects trading patterns inherited from the former Soviet Union.
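The difference between the log-level approximation and the exact formula is easy to verify. A minimal sketch, using only the estimated coefficient of CEPII_COMCOL_REV from the output of Model 3 (the variable names are chosen for illustration):

```python
import math

b3 = 3.126534                      # estimated coefficient of CEPII_COMCOL_REV

approx_change = b3                 # log-level approximation of the relative change
exact_change = math.exp(b3) - 1    # exact relative change when the dummy goes 0 -> 1
ratio = math.exp(b3)               # imports with a common colonizer relative to without

print(f"approximate relative change: {100 * approx_change:.0f}%")
print(f"exact relative change:       {100 * exact_change:.0f}%")
print(f"ratio of imports:            {ratio:.1f}")
```

The approximation (about 313%) is far off the exact value, which is why it should only be used for coefficients close to zero.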

Note that we already considered other model specifications in Sec-

tion 3.5. It might be interesting to check whether these test results

are robust if other model specifications are used such as Model 2 or

Model 4.


4.5 Confidence Intervals

• How large is the probability that the estimated parameter value

corresponds to the true value?

• A parameter estimator (to be more precise, a point estimator) does not allow any conclusions about how “close” the estimate is to the true value in the population.

• Following the position of Sir Karl Popper, who advocated critical rationalism in the philosophy of science, point estimates are not very useful since they cannot be falsified. Instead, an empirical hypothesis is only scientific if it is falsifiable.

• Example: Assume we predicted a price index on the basis of an econometric model and obtained a predicted value of 5.12. The realized value, however, turns out to be 5.24. → Then we made a wrong prediction since it was not realized exactly.

This “error” can only have three reasons:

– The random error of the population regression model.

– The estimation error of the sample regression model.

– The regression model is not correct or (more realistic) it is a bad

approximation. At least one of our assumptions is not justified.

Problem:

From a subjective point of view one can have different opinions about these “explanations”:

– One believes that the deviation is due to the random error.

– Another claims that the model is wrong.


Solution

One should specify objective criteria such that one can make a scientific decision. These criteria should be determined before any predicted value is realized.

Then one cannot escape a potential falsification of a hypothesis afterwards. This makes a hypothesis scientific in the sense of Popper.

• Let’s be more precise:

How large is the probability that the estimated value $\hat\beta_j$ corresponds exactly to the true value $\beta_j$ if, as was shown in Section 4.3,

$$\hat\beta_j \sim N\left(\beta_j, \sigma^2_{\hat\beta_j}\right)$$

and $(\hat\beta_j - \beta_j)/\sigma_{\hat\beta_j} \sim N(0, 1)$, or, if $\sigma_{\hat\beta_j}$ is estimated,

$$\frac{\hat\beta_j - \beta_j}{\hat\sigma_{\hat\beta_j}} \sim t_{n-k-1}\ ?$$

• Alternative question:

How large is the probability that the true value $\beta_j$ lies in the interval

$$\left[\hat\beta_j - c \cdot \hat\sigma_{\hat\beta_j},\ \hat\beta_j + c \cdot \hat\sigma_{\hat\beta_j}\right]$$

where c is given?

Note that the endpoints of the interval are random prior to obtaining a sample. Its location is random through $\hat\beta_j$ and its length is random through $\hat\sigma_{\hat\beta_j}$.

This interval is the best-known example of an interval estimator.


• Answer for given $\sigma_{\hat\beta_j}$:

How large is the probability that the true value $\beta_j$ is contained in the interval $[\hat\beta_j - c \cdot \sigma_{\hat\beta_j},\ \hat\beta_j + c \cdot \sigma_{\hat\beta_j}]$ where the value c is chosen by you?

– It is 2Φ(c) − 1 since $(\hat\beta_j - \beta_j)/\sigma_{\hat\beta_j} \sim N(0,1)$ and thus

$$
\begin{aligned}
P\left(\hat\beta_j - c\,\sigma_{\hat\beta_j} \le \beta_j \le \hat\beta_j + c\,\sigma_{\hat\beta_j}\right)
&= P\left(-c\,\sigma_{\hat\beta_j} \le \hat\beta_j - \beta_j \le c\,\sigma_{\hat\beta_j}\right) \\
&= P\left(-c \le \frac{\hat\beta_j - \beta_j}{\sigma_{\hat\beta_j}} \le c\right) \\
&= \Phi(c) - \Phi(-c) \\
&= \Phi(c) - (1 - \Phi(c)) \\
&= 2\Phi(c) - 1.
\end{aligned}
$$

– Example: For c = 1.96 one obtains Φ(1.96) − Φ(−1.96) = 0.975 − 0.025 = 0.95: The true value $\beta_j$ lies with 95% probability within the interval $\hat\beta_j \pm c \cdot \sigma_{\hat\beta_j}$. One also relates this probability to α by writing 0.95 = 1 − α. Thus one has α = 0.05.
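The formula 2Φ(c) − 1 and its inverse can be evaluated directly. A minimal sketch for the case of a known standard deviation, using the standard normal distribution from the Python standard library; the function names `coverage` and `critical_value` are chosen for illustration:

```python
from statistics import NormalDist

def coverage(c):
    """P(-c <= Z <= c) for a standard normal Z, i.e. 2 * Phi(c) - 1."""
    return 2 * NormalDist().cdf(c) - 1

def critical_value(conf_level):
    """Inverse problem: the c for which coverage(c) equals conf_level."""
    alpha = 1 - conf_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for c in (1.0, 1.645, 1.96, 2.576):
    print(f"c = {c:5.3f}: coverage = {coverage(c):.4f}")
print(f"c for 95% confidence: {critical_value(0.95):.4f}")
```

In practice one usually goes in the second direction: the confidence level is fixed first and c is computed from it.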

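• Answer for estimated standard error $\hat\sigma_{\hat\beta_j}$: The true value $\beta_j$ lies in the interval $\hat\beta_j \pm c \cdot \hat\sigma_{\hat\beta_j}$ with probability 1 − α. Note, however, that for computing the probability one has to use the $t_{n-k-1}$ distribution since

$$P\left(\hat\beta_j - c\,\hat\sigma_{\hat\beta_j} \le \beta_j \le \hat\beta_j + c\,\hat\sigma_{\hat\beta_j}\right) = P\left(-c \le \frac{\hat\beta_j - \beta_j}{\hat\sigma_{\hat\beta_j}} \le c\right).$$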

• The interval

$$\left[\hat\beta_j - c \cdot \hat\sigma_{\hat\beta_j},\ \hat\beta_j + c \cdot \hat\sigma_{\hat\beta_j}\right]$$

is called confidence interval. One says that the confidence interval contains the true value with a probability of confidence of (1 − α)100%. The value (1 − α) is also called confidence level or coverage probability of the confidence interval.

• In practice one determines the confidence level 1 − α and then computes the value c using the appropriate distribution: either $N(0, 1)$ or $t_{n-k-1}$.

• Note:

– The constant c corresponds to the (upper) critical value of a

two-sided test with significance level α.

– Since the confidence interval is a random interval, its location and length are in general different for each sample.

– The larger (1 − α), the smaller α, the larger is the confidence

interval. In other words: the more you want to be on the safe

side, the larger the confidence interval becomes. Why?


– A two-sided t test and a confidence interval contain the same

amount of information. The null hypothesis of a two-sided t

test is rejected if and only if the value of the null hypothesis lies

outside the confidence interval. Draw a graph to make this clear.

– If one keeps drawing new samples from a population, how many confidence intervals do not contain the true value on average?
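The last question can be explored by simulation. The following is a minimal sketch, assuming the simplest case of an estimator that is normally distributed around the true value with known standard deviation; the true parameter, the standard deviation and the number of samples are hypothetical values chosen for illustration.

```python
import random
from statistics import NormalDist

random.seed(12345)            # fixed seed so that the run is reproducible

beta_true = 1.0               # hypothetical true parameter
sigma = 0.5                   # hypothetical known standard deviation of the estimator
alpha = 0.05
c = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

n_samples = 20_000
misses = 0
for _ in range(n_samples):
    beta_hat = random.gauss(beta_true, sigma)      # one "estimate" per sample
    lower, upper = beta_hat - c * sigma, beta_hat + c * sigma
    if not (lower <= beta_true <= upper):
        misses += 1

miss_rate = misses / n_samples
print(f"share of intervals missing the true value: {miss_rate:.3f}")
```

The share of intervals that miss the true value should be close to α, which is the sense in which 1 − α is called the coverage probability.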

• Trade Example Continued (from Section 4.4):

– Compute a 95% confidence interval for the elasticity βgdp of im-

ports with respect to gdp.

– From Section 4.4 it can be justified that MLR.1 to MLR.6 hold; in particular, the errors (and thus ln(imports) conditional on the regressors) are normally distributed.

– Since $\sigma_{\hat\beta_{gdp}}$ has to be estimated, one has to use the t distribution with n − k − 1 = 48 degrees of freedom. For a confidence level of 0.95 one obtains α = 0.05 and thus c = 2.0106 (e.g. by using EViews scalar crit = @qtdist(1-0.05/2,52-3-1)).

– The relevant line of output was, see Section 4.4:

Variable Coeff. Std.Err. t-Stat. Prob.

LOG(WDI_GDPUSDCR_O) 1.356566 0.128793 10.53290 0.0000

– Therefore the 95% confidence interval is given by

$$\left[\hat\beta_j - c \cdot \hat\sigma_{\hat\beta_j},\ \hat\beta_j + c \cdot \hat\sigma_{\hat\beta_j}\right]$$

$$[1.3566 - 2.0106 \cdot 0.1288,\ 1.3566 + 2.0106 \cdot 0.1288]$$

$$[1.0976,\ 1.6156].$$

– With 95% confidence, the elasticity of imports with respect to gdp lies between 1.0976 and 1.6156 (in repeated samples, 95% of the intervals constructed this way contain the true value). Note that 1 is not included in the confidence interval. This reflects the test result in Section 4.4 of rejecting H0 : βgdp = 1.
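The interval and its link to the two-sided t test can be checked with a few lines of arithmetic. A minimal sketch that takes the coefficient, the standard error and the critical value c = 2.0106 from the text as given; because unrounded inputs are used, the last digit may differ slightly from the rounded calculation above.

```python
b_gdp = 1.356566     # coefficient of LOG(WDI_GDPUSDCR_O)
se_gdp = 0.128793    # its standard error
c = 2.0106           # critical value of t(48) for alpha = 0.05, taken from the text

lower = b_gdp - c * se_gdp
upper = b_gdp + c * se_gdp
print(f"95% confidence interval: [{lower:.4f}, {upper:.4f}]")

# Duality with the two-sided t test: H0: beta = beta0 is rejected at level
# alpha exactly when beta0 lies outside the (1 - alpha) confidence interval.
beta0 = 1.0
t_stat = (b_gdp - beta0) / se_gdp
outside = not (lower <= beta0 <= upper)
rejected = abs(t_stat) > c
print(f"1 outside interval: {outside}, H0 rejected by t test: {rejected}")
```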


4.6 Testing a Single Linear Combination of Parameters

• Example: Cobb-Douglas production function

log Y = β0 + β1 log K + β2 log L + u,

where Y denotes output, K and L denote the production factors

capital and labor, respectively. Note that β1 and β2 are elasticities

here.

If the restriction β1 + β2 = 1 holds true, the production function has constant returns to scale, i.e. a 1% increase of both labor and capital leads to a 1% increase of output on average.


For an empirical test of constant returns to scale, we employ the

following pair of hypotheses:

H0 : β1 + β2 = 1 versus H1 : β1 + β2 ≠ 1.

• How to construct the test statistic:

1. First, define auxiliary parameters θ and θ0, where

θ = β1 + β2, θ0 = 1,

or, equivalently

H0 : θ = θ0 versus H1 : θ ≠ θ0.


2. Second, solve θ = β1 + β2 for one of the parameters βi, here β1,

β1 = θ − β2,

and insert it into the initial regression equation and reformulate the latter to

$$\log Y = \beta_0 + (\theta - \beta_2) \log K + \beta_2 \log L + u$$

$$\log Y = \beta_0 + \theta \log K + \beta_2 \underbrace{(\log L - \log K)}_{\text{new variable}} + u. \qquad (4.4)$$

Then estimate (4.4) and obtain the test statistic

$$t_\theta = \frac{\hat\theta - \theta_0}{\hat\sigma_{\hat\theta}}$$

which can be directly calculated from the estimation of (4.4).
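Why re-estimate instead of using the original regression? The following standard result (a sketch that is not taken from these slides) shows that computing the standard error of the estimated sum directly requires the estimated covariance between the two parameter estimators, which is not part of the usual regression output:

```latex
\hat\theta = \hat\beta_1 + \hat\beta_2, \qquad
\widehat{Var}(\hat\theta)
  = \widehat{Var}(\hat\beta_1) + \widehat{Var}(\hat\beta_2)
    + 2\,\widehat{Cov}(\hat\beta_1, \hat\beta_2), \qquad
\hat\sigma_{\hat\theta} = \sqrt{\widehat{Var}(\hat\theta)} .
```

The re-parametrized regression (4.4) delivers the same standard error directly as the standard error of the coefficient of log K.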


Example:

In a classical marketing model we regress (the natural logarithm of) sales (S) of a consumer good on (the natural logarithm of) this good’s price (P) as well as on (the natural logarithms of) cross prices (PK1, PK2) of competing goods. The following regression output is calculated from the data:

Dependent Variable: log(S), Method: Least Squares, Obs: 6917

Variable Coeff. Std.Err. t-Stat. Prob.

C 4.407786 0.079559 55.40268 0.0000

LOG(P) -3.955281 0.068095 -58.08499 0.0000

LOG(P_K1) 0.710274 0.073912 9.609683 0.0000

LOG(P_K2) 1.154163 0.079815 14.46046 0.0000

R-squared 0.332264 Mean dependent var 2.824244

Adj. R-squared 0.331974 S.D. dependent var 1.250189

S.E. regression 1.021815 Akaike info crit. 2.881617

Sum squared resid 7217.910 Schwarz criterion 2.885573

F-statistic 1146.631 Prob(F-statistic) 0.000000


We wish to test the following statement: the cross price elasticities are

identical, keeping everything else fixed (ceteris paribus) (though the

competing goods come from different market segments).

• The initial hypotheses are given by

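$$H_0: \beta_{K1} = \beta_{K2} \quad \text{versus} \quad H_1: \beta_{K1} \neq \beta_{K2}.$$

We reformulate them by re-parametrization according to

$$\theta = \beta_{K1} - \beta_{K2}, \qquad \theta_0 = 0$$

$$H_0: \theta = 0 \quad \text{versus} \quad H_1: \theta \neq 0.$$

• Thus, due to $\beta_{K1} = \theta + \beta_{K2}$, the initial regression model

$$\ln(S) = \beta_1 + \beta_2 \ln(P) + \beta_{K1} \ln(P_{K1}) + \beta_{K2} \ln(P_{K2}) + u$$

can be rewritten as

$$\ln(S) = \beta_1 + \beta_2 \ln(P) + \theta \ln(P_{K1}) + \beta_{K2} \left(\ln(P_{K1}) + \ln(P_{K2})\right) + u.$$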


• Given the estimates of the last regression

Dependent Variable: log(S), Method: Least Squares, Obs: 6917

Variable Coeff. Std.Err. t-Stat. Prob.

C 4.407786 0.079559 55.40268 0.0000

LOG(P) -3.955281 0.068095 -58.08499 0.0000

LOG(P_K1) -0.443889 0.112543 -3.944165 0.0001

LOG(P_K1)+LOG(P_K2) 1.154163 0.079815 14.46046 0.0000

R-squared 0.332264 Mean dependent var 2.824244

Adj. R-squared 0.331974 S.D. dependent var 1.250189

S.E. regression 1.021815 Akaike info crit. 2.881617

Sum squared resid 7217.910 Schwarz criterion 2.885573

F-statistic 1146.631 Prob(F-statistic) 0.000000

calculate the t statistic as

$$t = \frac{-0.443889 - 0}{0.112543} \approx -3.94.$$

For a given significance level of α = 0.05, the critical values are

-1.96 and 1.96. Thus, we have to reject H0.
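The two regression outputs can be cross-checked against each other. A minimal sketch, using only numbers from the two outputs above and the large-sample (standard normal) critical value; the variable names are chosen for illustration:

```python
from statistics import NormalDist

# Coefficients of LOG(P_K1) and LOG(P_K2) in the original regression.
b_k1, b_k2 = 0.710274, 1.154163

# Coefficient and standard error of LOG(P_K1) in the re-parametrized regression.
theta_hat, se_theta = -0.443889, 0.112543

# theta = beta_K1 - beta_K2 must agree with the difference of the original estimates.
theta_from_original = b_k1 - b_k2

t_stat = (theta_hat - 0.0) / se_theta
crit = NormalDist().inv_cdf(0.975)   # large-sample critical value, about 1.96

print(f"difference of original estimates: {theta_from_original:.6f}")
print(f"t = {t_stat:.4f}, reject H0 at 5%: {abs(t_stat) > crit}")
```

Note that the other rows of the output (constant, own price, and the coefficient of the sum of log cross prices) are unchanged by the re-parametrization, as are R-squared and the sum of squared residuals.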

Reading: Sections 4.3-4.4 in Wooldridge (2009).


4.7 Jointly Testing Several Linear Combinations of Parameters: The F Test

Some examples of possible restrictions within the MLR framework:

1. H0 : β1 = 3

2. H0 : β2 = βk

3. H0 : β1 = 1, βk = 0

4. H0 : β1 = β3, β2 = β4

5. H0 : βj = 0, j = 1, . . . , k

6. H0 : βj + 2βl = 1, βk = 2

We can already check case 1. and case 2. by applying t tests. For all

other cases we need the F test.


4.7.1 Testing of Several Exclusion Restrictions

Trade Example Continued (from Section 4.5):

Consider Model 4 in Section 3.5:

====================================================================

Dependent Variable: LOG_IMP, Method: Least Squares

Sample: 1 55 Included observations: 52

====================================================================

Variable Coefficient Std. Error t-Statistic Prob.

====================================================================

C -13.02677 4.726426 -2.756157 0.0083

LOG(WDI_GDPUSDCR_O) 1.317565 0.129079 10.20740 0.0000

LOG(CEPII_DIST) -0.624909 0.542325 -1.152279 0.2550

CEPII_COMCOL_REV 3.351230 0.678912 4.936177 0.0000

CEPII_COMLANG_OFF 2.109579 1.319296 1.599018 0.1165

====================================================================

R-squared 0.714213 Mean dependent var 15.97292

Adjusted R-squared 0.689890 S.D. dependent var 2.613094

S.E. of regression 1.455167 Akaike info criterion 3.679330

Sum squared resid 99.52303 Schwarz criterion 3.866950

Log likelihood -90.66258 F-statistic 29.36447

Durbin-Watson stat 2.195759 Prob(F-statistic) 0.000000


Are the control variables common colonizer since 1945 (CEPII_COMCOL_REV) and common official language (CEPII_COMLANG_OFF) really needed in the specification of Model 4?

To put it more precisely, are the parameters of the two variables mentioned jointly significantly different from zero?

$$H_0: \beta_{comcol\_rev} = 0 \ \text{ and } \ \beta_{comlang\_off} = 0$$

versus

$$H_1: \beta_{comcol\_rev} \neq 0 \ \text{ and/or } \ \beta_{comlang\_off} \neq 0$$


How can one jointly test several hypotheses?

• Note that SSR decreases (or stays constant) with an additional re-

gressor.

⇒ Idea: Compare the SSR of a model on which the null hypotheses

are imposed (restricted model) with the SSR of another model that

does not impose the joint restrictions (unrestricted model).

• The estimation under H0 is easy: simply exclude all regressors from

the regression whose parameters under H0 are set to zero and re-

estimate the restricted model.

In case of Model 4 for the trade example, the OLS estimates for the restricted model (which corresponds to Model 2 in Section 3.5) are:


====================================================================

Dependent Variable: LOG_IMP

Method: Least Squares

Sample: 1 55 Included observations: 52

====================================================================

Variable Coefficient Std. Error t-Statistic Prob.

====================================================================

C 4.800950 3.497341 1.372743 0.1761

LOG(WDI_GDPUSDCR_O) 1.088546 0.137001 7.945508 0.0000

LOG(CEPII_DIST) -1.970804 0.480555 -4.101103 0.0002

====================================================================

R-squared 0.563937 Mean dependent var 15.97292

Adjusted R-squared 0.546138 S.D. dependent var 2.613094

S.E. of regression 1.760423 Akaike info criterion 4.024946

Sum squared resid 151.8554 Schwarz criterion 4.137518

Log likelihood -101.6486 F-statistic 31.68448

Durbin-Watson stat 2.117895 Prob(F-statistic) 0.000000

====================================================================


Results:

– The $R^2$ of the unrestricted model is 0.7142 while the $R^2$ of the restricted model is (only) 0.5639.

– Correspondingly, the standard error of regression $\hat\sigma$ increases from 1.4552 to 1.7604.

– Are these changes large? It looks that way, but what does “large” really mean here?

– Note that all three model selection criteria, AIC, HQ, and SC,

“prefer” the unrestricted model, see Section 3.5. Will this finding

be confirmed by the test?


• In order to be able to use a statistic (a function that can be com-

puted from sample values) as a test statistic, one has to know its

probability distribution under the null hypothesis H0.

One can show (→ advanced econometrics course or Section 4.4 in Davidson & MacKinnon (2004)) that the following test statistic follows an F distribution under H0:

$$F = \frac{(SSR_{H_0} - SSR_{H_1})/q}{SSR_{H_1}/(n - k - 1)} \sim F_{q,\,n-k-1}.$$

Therefore this test is called F test and the test statistic is called the F statistic.

• Note that the F distribution has two different degrees of freedom: q degrees of freedom for the random variable in the numerator, and n − k − 1 degrees of freedom for the random variable in the denominator. The value q is the number of restrictions that are jointly tested.

• Details of the F statistic:

– Its minimum is 0 since $SSR_{H_0} \ge SSR_{H_1}$ and $SSR_{H_1} > 0$. (Therefore the F statistic cannot be normally distributed!)

– There is no upper bound.

• When should the joint null hypothesis be rejected?

– The larger the absolute difference between the SSRs of the restricted and the unrestricted model, $SSR_{H_0} - SSR_{H_1}$, the more likely is a violation of the exclusion restrictions.

– However, be aware that absolute differences do not say much.

Why?


– It makes much more sense to consider the relative difference

between the SSRs. This is exactly what the F statistic does.

It scales the difference in SSRs by the SSR of the unrestricted

model. If the relative difference is large, then the joint null hy-

pothesis is likely to be violated.

– On the other hand, if the relative difference is small, then it is

likely that the excluded variables do not have any relevant impact

in the unrestricted model since they can be neglected without any

noticeable effect.
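One way to see why the absolute difference says little is a change of units. The following minimal sketch uses the SSRs of the trade example as illustrative inputs and a hypothetical factor `scale` (not part of the slides) by which the dependent variable is multiplied. The absolute difference changes with the units, the F statistic does not.

```python
def f_statistic(ssr_h0, ssr_h1, q, df_denom):
    """F statistic computed from the restricted (H0) and unrestricted (H1) SSR."""
    return ((ssr_h0 - ssr_h1) / q) / (ssr_h1 / df_denom)

# Illustrative values: SSRs of the restricted and the unrestricted model
# in the trade example, with q = 2 restrictions and n - k - 1 = 47.
ssr_h0, ssr_h1, q, df_denom = 151.855, 99.523, 2, 47

# Hypothetical change of units: multiplying the dependent variable by `scale`
# multiplies every residual by `scale` and hence every SSR by scale**2.
scale = 10.0
ssr_h0_scaled = scale**2 * ssr_h0
ssr_h1_scaled = scale**2 * ssr_h1

abs_diff = ssr_h0 - ssr_h1
abs_diff_scaled = ssr_h0_scaled - ssr_h1_scaled
f_original = f_statistic(ssr_h0, ssr_h1, q, df_denom)
f_scaled = f_statistic(ssr_h0_scaled, ssr_h1_scaled, q, df_denom)

print(f"absolute difference: {abs_diff:.3f} vs. {abs_diff_scaled:.3f}")
print(f"F statistic:         {f_original:.4f} vs. {f_scaled:.4f}")
```

The scale factor cancels in the ratio, which is why the relative difference is the quantity that can be compared with a fixed critical value.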

• Decision rule:

Reject H0 if the test statistic is larger than the critical value:

Reject H0 if F > c.

Thus, the critical region is (c,∞).


Calculation of the critical region:

For a given significance level α, the critical value c is implicitly

defined by the probability

$$P(F > c \mid H_0) = \alpha.$$

The corresponding value for c given α can be found in tables of the F distribution, e.g. Table G.3 in Appendix G in Wooldridge (2009), or be computed in EViews or Excel. (In Excel one uses Finv(0.05;q;n-k-1) for α = 0.05.)


Trade Example Continued (from the beginning of this section):

• The joint null hypothesis contains two exclusion restrictions, thus the numerator degrees of freedom are q = 2. The degrees of freedom for the denominator correspond to the degrees of freedom of Model 4, n − k − 1 = 52 − 4 − 1 = 47. Choosing a significance level of α = 0.05, we check Table G.3 in Appendix G in Wooldridge (2009) for the appropriate critical value. The listed values are $F_{2,40} = 3.23$ and $F_{2,60} = 3.15$. Using the former as critical value implies a true significance level smaller than 0.05, using the latter implies one above 0.05. If one is interested in an exact critical value, one can obtain it from Excel as Finv(0.05;2;47) = 3.1951.

• Collecting the SSRs from the regression outputs for Model 4 and

Model 2 at the beginning of the section, the F statistic can be


computed as

$$F = \frac{(151.855 - 99.523)/2}{99.523/47} = 12.3570.$$

Since

$$F = 12.3570 > 3.1951 = c,$$

reject H0 at a significance level of 5%.

• Check that the same decision holds for a significance level of 1%. The two variables common colonizer since 1945 (CEPII_COMCOL_REV) and common official language (CEPII_COMLANG_OFF) are jointly statistically significant at the 1% significance level. Thus at least one of the two variables has an impact on imports, at the 5% as well as at the 1% significance level.
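The computation above can be reproduced without tables. The sketch below is only an illustration: the function names are hypothetical, and the critical value uses a closed-form expression that is valid only for q = 2 numerator degrees of freedom, where $P(F > c) = (1 + 2c/m)^{-m/2}$ with m denominator degrees of freedom. For other q a statistical library would be needed.

```python
def f_statistic(ssr_h0, ssr_h1, q, df_denom):
    """F statistic computed from the restricted (H0) and unrestricted (H1) SSR."""
    return ((ssr_h0 - ssr_h1) / q) / (ssr_h1 / df_denom)

def f_critical_value_q2(alpha, df_denom):
    """Critical value c with P(F > c) = alpha for an F(2, df_denom) variable.

    Assumption: numerator degrees of freedom equal to 2. The formula inverts
    P(F > c) = (1 + 2*c/df_denom) ** (-df_denom / 2).
    """
    return (df_denom / 2.0) * (alpha ** (-2.0 / df_denom) - 1.0)

# Trade example: Model 2 (restricted) versus Model 4 (unrestricted).
f_value = f_statistic(151.855, 99.523, q=2, df_denom=47)
c_05 = f_critical_value_q2(0.05, 47)
c_01 = f_critical_value_q2(0.01, 47)

print(f"F = {f_value:.4f}")
print(f"critical value at 5%: {c_05:.4f}, reject H0: {f_value > c_05}")
print(f"critical value at 1%: {c_01:.4f}, reject H0: {f_value > c_01}")
```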


Calculation of p-values for F statistics:

• In empirical work one is frequently interested in the smallest significance level at which the null hypothesis can be rejected given the observed test statistic.

As explained in Section 4.1, this information is provided by the p-value. Equivalently, it is the largest significance level at which the null cannot be rejected.

Given the significance level α that was chosen prior to any calculations, the null hypothesis is rejected if the p-value is smaller than α.


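• Trade Example Continued: The p-value can be computed in Excel as FVERT(K10;2;47) = 4.87099E-05 = $4.87099 \cdot 10^{-5}$ (FVERT is the German-language name of the Excel function FDIST; K10 is the cell that holds the value of the F statistic). The p-value can also be calculated in EViews, see below.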

Thus, there is extremely strong statistical evidence against the null

hypothesis.
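As a cross-check of the Excel value, the p-value can be sketched in a few lines. The helper name is hypothetical, and the closed form used here is an assumption that holds only for q = 2 numerator degrees of freedom, as in this example.

```python
def f_pvalue_q2(f_value, df_denom):
    """P(F > f_value) for an F(2, df_denom) distributed random variable.

    Assumption: numerator degrees of freedom equal to 2, for which the
    survival function is (1 + 2*f/df_denom) ** (-df_denom / 2).
    """
    return (1.0 + 2.0 * f_value / df_denom) ** (-df_denom / 2.0)

# F statistic of the trade example, computed from the two SSRs.
f_value = ((151.855 - 99.523) / 2) / (99.523 / 47)
p_value = f_pvalue_q2(f_value, 47)

print(f"F = {f_value:.5f}, p-value = {p_value:.5e}")
print("reject H0 at the 1% level:", p_value < 0.01)
```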

Direct Calculation of the F statistic in EViews:

• In the Equation window one clicks on View and then on

Representations where one can read how the parameter num-

bering relates to the regressor variables.

• One again clicks on View and then on Coefficient Tests and

further on Wald-Coefficient Restrictions .... Then one

enters all restrictions into the opened EViews window Wald Test.

For the test of the joint significance of the control variables in the


trade example one types C(4)=0,C(5)=0 and confirms:

Wald Test:

Equation: EQ_LN_LN_COL_R_LANG_OFF

====================================================================

Test Statistic Value df Probability

====================================================================

F-statistic 12.35704 (2, 47) 0.0000

Chi-square 24.71407 2 0.0000

====================================================================

Null Hypothesis Summary:

====================================================================

Normalized Restriction (= 0) Value Std. Err.

====================================================================

C(4) 3.351230 0.678912

C(5) 2.109579 1.319296

====================================================================

Restrictions are linear in coefficients.

Note that the Chi-square test statistic is a variant of the F statistic that is useful in large samples. It is not discussed further here.
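Although the Chi-square statistic is not discussed further, the numbers in the output above are consistent with the relation Chi-square = q · F. The sketch below checks this; the function name is hypothetical, and the p-value formula $\exp(-x/2)$ is the survival function of a $\chi^2$ distribution with exactly 2 degrees of freedom, so it applies only because q = 2 here.

```python
import math

def chi2_pvalue_df2(x):
    """P(X > x) for a chi-square random variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

q = 2                 # number of restrictions in the Wald test above
f_value = 12.35704    # F-statistic reported in the EViews output
chi2_value = q * f_value
p_value = chi2_pvalue_df2(chi2_value)

print(f"q * F = {chi2_value:.5f} (EViews reports 24.71407)")
print(f"p-value = {p_value:.6f} (EViews reports 0.0000)")
```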


Remarks:

• One can, of course, test the simple null hypothesis with a two-sided

alternative

$$H_0: \beta_j = 0 \quad \text{versus} \quad H_1: \beta_j \neq 0$$

by means of an F test.

It holds that the square of a random variable X that follows a t

distribution with n − k − 1 degrees of freedom just corresponds to

a random variable that follows an F distribution with (1, n−k− 1)

degrees of freedom

$$X \sim t_{n-k-1} \implies X^2 \sim F_{1,\,n-k-1}.$$

Therefore, a two-sided t test and an F test lead to exactly the same

result for the pair of hypotheses above.
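A brief sketch of why the squared t variable is F distributed, using the standard constructions of both distributions. The symbols Z, V and m = n − k − 1 are introduced here only for this illustration.

```latex
X = \frac{Z}{\sqrt{V/m}}, \qquad Z \sim N(0,1),\quad V \sim \chi^2(m),\quad Z \text{ and } V \text{ independent},
\\[4pt]
X^2 = \frac{Z^2/1}{V/m}, \qquad Z^2 \sim \chi^2(1),
\\[4pt]
\text{so } X^2 \text{ is a ratio of independent } \chi^2 \text{ variables, each divided by its degrees of freedom:}\quad X^2 \sim F_{1,m}.
```

Since $|X| > c$ holds exactly when $X^2 > c^2$, the two-sided t test with critical value c and the F test with critical value $c^2$ reject in the same cases.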


• It may happen that each regressor tested by itself is not statisti-

cally significant but if they are jointly tested they are statistically

significant (at the same significance level). This is a sign of mul-

ticollinearity between the regressors considered. Then, the given

sample size is only sufficient for providing statistical significance

jointly for both regressors. However, it is not sufficient for providing

statistical evidence for each regressor separately. In such cases you

may check the covariance between the parameter estimates that are

included in the test (in EViews View → Covariance Matrix).

• It may also happen that one variable is statistically significant by itself but becomes insignificant if it is jointly tested with other variables. This can happen if the other variables that are included in the joint hypothesis are redundant in the population regression. In this case, the power of the joint test is weakened by the other, irrelevant variables.


• Thus, there is no general rule on whether to prefer the results of joint or of single tests.

• Trade Example Continued (from the middle of this section):

Comparing four different model specifications using model selection

criteria, see Section 3.5, HQ and AIC favor Model 4. Inspecting its

parameter estimates at the beginning of this section, one finds two

parameters to be statistically insignificant even at the 10% level:

$\beta_{\text{distance}}$ and $\beta_{\text{common off. language}}$.

Why, then, was this Model 4 found to be best by HQ and AIC?

Answer:

The parameter estimators for $\beta_{\text{distance}}$ and $\beta_{\text{com. off. lang.}}$ might be highly correlated so that only a joint impact is significant. One reason could be that a lot of variation of distance can be explained by common off. language, among other things. In this case, it


can be expected that if both parameters are jointly tested by means

of an F test, they are statistically significant. Test the pair of

hypotheses:

$$H_0: \beta_{\text{distance}} = 0 \text{ and } \beta_{\text{comlang off}} = 0$$

$$H_1: \beta_{\text{distance}} \neq 0 \text{ and/or } \beta_{\text{comlang off}} \neq 0$$

The EViews output is:

Wald Test:

Equation: EQ_LN_LN_COL_LANG

Test Statistic Value df Probability

F-statistic 4.749269 (2, 47) 0.0132

Chi-square 9.498538 2 0.0087

Thus, reject H0 at the 5% significance level. The previous conjecture

is statistically confirmed.


The effect of multicollinearity can be seen nicely in the confidence ellipse for C(3) and C(5):

[Figure: confidence ellipse; horizontal axis C(3) from −2.5 to 1.0, vertical axis C(5) from −2 to 6.]

Note that C(3) and C(5) correspond to $\beta_{\text{distance}}$ and $\beta_{\text{com. off. lang.}}$, respectively. The ellipse is a generalization of confidence intervals to two dimensions. Thus, all points outside the ellipse are joint null hypotheses that are rejected. Note that the origin also lies outside the ellipse while zero is included in each one-dimensional confidence interval. (Get the graph in EViews via View → Coefficient Tests → Confidence Ellipse.) More on confidence ellipses can be found, e.g., in Davidson & MacKinnon (2004).


• $R^2$ version of the F statistic:

If a regression model contains a constant, then the decomposition $SSR = SST(1 - R^2)$ holds. Inserting each SSR into the F statistic delivers

$$F = \frac{(R^2_{H_1} - R^2_{H_0})/q}{(1 - R^2_{H_1})/(n - (k+1))} \sim F_{q,\,n-k-1}.$$

Note:

– SST is canceled if the dependent variable y is the same under H0

and H1 as, for example, in case of exclusion restrictions. However,

this is not always true if general linear restrictions are tested.

– There can be slight differences between both versions of the F

statistic due to rounding errors.
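The rounding effect mentioned in the last note can be illustrated with the trade example. The sketch below is only illustrative: it uses the $R^2$ values of Models 2 and 4 as reported, rounded to three digits, in the results table of Section 4.8, and compares the outcome with the SSR version. The function name is hypothetical.

```python
def f_statistic_r2(r2_h1, r2_h0, q, df_denom):
    """R-squared version of the F statistic (same dependent variable under H0 and H1)."""
    return ((r2_h1 - r2_h0) / q) / ((1.0 - r2_h1) / df_denom)

# Model 4 (unrestricted, R2 = 0.714) versus Model 2 (restricted, R2 = 0.564),
# q = 2 restrictions, n - k - 1 = 47 degrees of freedom.
f_r2 = f_statistic_r2(0.714, 0.564, q=2, df_denom=47)

# SSR version from the same two models for comparison.
f_ssr = ((151.855 - 99.523) / 2) / (99.523 / 47)

print(f"R2 version:  F = {f_r2:.4f}")
print(f"SSR version: F = {f_ssr:.4f}")
```

The two values differ in the second decimal only because the reported $R^2$ values are rounded to three digits.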


Overall F Test

Standard software packages (such as EViews) include in their OLS output for the multiple regression model $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$ the F statistic and its p-value for the pair of hypotheses:

“None of the (non-constant) regressors has an impact on the dependent variable and thus the corresponding parameters are all zero.”

$$H_0: \beta_1 = \dots = \beta_k = 0 \quad (\text{and } y = \beta_0 + u)$$

$$H_1: \beta_j \neq 0 \text{ for at least one } j = 1, \dots, k.$$

If H0 is not rejected, this possibly indicates that

- all regressors are badly/wrongly chosen,

- or at least a substantial number of regressors has no impact on y,

- or too many regressors were considered for the given sample size n.

This test is a first rough check for the validity of the model.


4.7.2 Testing of Several General Linear Restrictions

• Generalization of the F test for exclusion restrictions.

• Works equivalently by computing the relative change in the SSRs.

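• The $R^2$ version cannot be used in this case!

Examples of possible pairs of hypotheses:

$$H_0: \beta_2 = \beta_3 = 1 \quad \text{versus} \quad H_1: \beta_2 \neq 1 \text{ and/or } \beta_3 \neq 1,$$

$$H_0: \beta_1 = 1,\ \beta_j = 2\beta_l \quad \text{versus} \quad H_1: \beta_1 \neq 1 \text{ and/or } \beta_j \neq 2\beta_l.$$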

Trade Example Continued (from previous subsection):

• One may conjecture that due to the multicollinearity between the es-

timates for distance and common official language the impact

of distance might be underestimated in absolute value while the


impact of language is zero. Thus, consider the pair of hypotheses:

$$H_0: \beta_{\text{distance}} = -1 \text{ and } \beta_{\text{comlang off}} = 0$$

$$H_1: \beta_{\text{distance}} \neq -1 \text{ and/or } \beta_{\text{comlang off}} \neq 0$$

In order to compute the SSR under H0, impose these restrictions on the regression as

$$\log(\text{imports}) - (-1)\log(\text{distance}) = \beta_0 + \beta_{\text{gdp}} \log(\text{gdp}) + \beta_{\text{comcol}}\, \text{cepii\_comcol\_rev} + u$$


The EViews output is:

Dependent Variable: LOG_IMP+LOG(CEPII_DIST)

Method: Least Squares

Sample: 1 55, Included observations: 52

====================================================================

Variable Coefficient Std. Error t-Statistic Prob.

====================================================================

C -10.49513 3.196796 -3.283015 0.0019

LOG(WDI_GDPUSDCR_O) 1.344714 0.122454 10.98137 0.0000

CEPII_COMCOL_REV 3.215739 0.611625 5.257696 0.0000

====================================================================

R-squared 0.720026 Mean dependent var 24.24218

Adjusted R-squared 0.708599 S.D. dependent var 2.713964

S.E. of regression 1.465041 Akaike info criterion 3.657604

Sum squared resid 105.1709 Schwarz criterion 3.770176

Log likelihood -92.09771 F-statistic 63.00827

Durbin-Watson stat 2.053004 Prob(F-statistic) 0.000000

====================================================================


This allows one to compute the F statistic

$$F = \frac{(SSR_{H_0} - SSR_{H_1})/q}{SSR_{H_1}/(n-k-1)} = \frac{(105.1709 - 99.5230)/2}{99.5230/47} = 1.333618 < c = 3.195.$$

Alternatively, one can conduct this general F test directly in EViews

via the Wald test option C(3)=-1,C(5)=0:

Wald Test:

Equation: EQ_LN_LN_COL_LANG

====================================================================

Test Statistic Value df Probability

====================================================================

F-statistic 1.333603 (2, 47) 0.2733

Chi-square 2.667205 2 0.2635

→ The claim that a “common official language has no impact

while distance has an elasticity of −1” cannot be rejected at

any reasonable significance level since the p-value is about 27%.
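The idea of imposing a restriction by moving the restricted term to the left-hand side can be sketched with simulated data. Everything in this example is hypothetical (variable names, sample, parameter values); it uses a simplified model with only two regressors, both of which are restricted under H0, so the restricted regression contains just a constant.

```python
import random

def ssr_two_regressors(y, x1, x2):
    """SSR of the OLS regression of y on a constant, x1 and x2."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    dy = [v - my for v in y]
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    syy = sum(a * a for a in dy)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det   # OLS slope of x1
    b2 = (s11 * s2y - s12 * s1y) / det   # OLS slope of x2
    return syy - b1 * s1y - b2 * s2y

def ssr_constant_only(y):
    """SSR of the regression of y on a constant."""
    my = sum(y) / len(y)
    return sum((v - my) ** 2 for v in y)

# Simulated data for which H0: beta1 = -1 and beta2 = 0 is true.
random.seed(0)
n = 52
x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
x2 = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 - 1.0 * a + 0.0 * b + random.gauss(0.0, 1.0) for a, b in zip(x1, x2)]

# Under H0 the model reads y - (-1) * x1 = beta0 + u.
y_restricted = [v + a for v, a in zip(y, x1)]
ssr_h0 = ssr_constant_only(y_restricted)
ssr_h1 = ssr_two_regressors(y, x1, x2)

q, df_denom = 2, n - 2 - 1
f_value = ((ssr_h0 - ssr_h1) / q) / (ssr_h1 / df_denom)
print(f"SSR_H0 = {ssr_h0:.4f}, SSR_H1 = {ssr_h1:.4f}, F = {f_value:.4f}")
```

Because the dependent variable of the restricted regression differs from that of the unrestricted one, the two SSTs differ, which is why the SSR version of the F statistic is used here.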


4.8 Reporting Regression Results

In general, empirical researchers investigate a number of different spec-

ifications of regression functions.

In order to make visible how robust the conclusions are with respect

to model choice it is good practice to report the results of the most

important specifications so that each reader can evaluate the findings

in her own manner.

This is most easily achieved by summarizing the relevant results in a

table, see the example below.


For each specification, at least the following results should be reported:

• OLS parameter estimates $\hat\beta_j$ of the regression parameters $\beta_j$, j = 0, 1, . . . , k (plus variable names),

• Standard error of $\hat\beta_j$, $\hat\sigma_{\hat\beta_j}$,

• Number of observations n,

• $R^2$ and adjusted $R^2$,

• Standard error of regression or estimated variance of the regression error, $\hat\sigma^2$.

If possible, one should also report

• Model selection criteria such as AIC, HQ or SC,

• Sum of squared residuals (SSR).

Based on the SSRs one can easily compute F tests.


Trade Example Continued:

Dependent Variable: ln(imports to Kazakhstan)

Independent Variables / Model        (1)        (2)        (3)        (4)
constant                           -3.461      4.801     -9.579    -13.027
                                   (3.280)    (3.497)    (4.274)    (4.726)
ln(gdp)                             0.770      1.089      1.357      1.318
                                   (0.130)    (0.137)    (0.129)    (0.129)
ln(distance)                          —       -1.971     -1.144     -0.625
                                              (0.481)    (0.441)    (0.542)
common "colonizer" since 1945         —          —        3.127      3.351
                                                         (0.675)    (0.679)
common official language              —          —          —        2.110
                                                                    (1.319)
Number of observations                52         52         52         52
R2                                  0.414      0.564      0.699      0.714
Standard error of regression        2.020      1.760      1.479      1.455
Sum of squared residuals          203.980    151.855    104.937     99.523
AIC                                4.2816     4.0249     3.6938     3.6793
HQ                                 4.3103     4.0681     3.7514     3.7513
SC                                 4.3566     4.1375     3.8439     3.8670

Standard errors in parentheses.
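As an illustration of how a reader can compute F tests from such a table, the sketch below tests the single variable added in Model 4 and in Model 3, respectively, using only reported SSRs. It also compares each F statistic with the square of the corresponding t statistic (coefficient divided by standard error). The function name is hypothetical, and small deviations are to be expected because the table entries are rounded.

```python
def f_statistic(ssr_h0, ssr_h1, q, df_denom):
    """F statistic computed from the restricted (H0) and unrestricted (H1) SSR."""
    return ((ssr_h0 - ssr_h1) / q) / (ssr_h1 / df_denom)

n = 52
ssr = {2: 151.855, 3: 104.937, 4: 99.523}  # sums of squared residuals from the table

# Model 3 versus Model 4: H0 excludes "common official language" (k = 4 in Model 4).
f_lang = f_statistic(ssr[3], ssr[4], q=1, df_denom=n - 4 - 1)
t_lang = 2.110 / 1.319

# Model 2 versus Model 3: H0 excludes the common "colonizer" dummy (k = 3 in Model 3).
f_col = f_statistic(ssr[2], ssr[3], q=1, df_denom=n - 3 - 1)
t_col = 3.127 / 0.675

print(f"common official language: F = {f_lang:.3f}, t^2 = {t_lang ** 2:.3f}")
print(f"common colonizer:         F = {f_col:.3f}, t^2 = {t_col ** 2:.3f}")
```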

Reading: Sections 4.5-4.6 in Wooldridge (2009).


5 Multiple Regression Analysis: Asymptotics

The assumption of a normal (or Gaussian) distribution, MLR.6, is frequently violated in empirical practice. How can we then proceed to calculate test statistics or confidence intervals?

5.1 Large Sample Distribution of the Mean Estimator

• Example: Testing the mean of hourly wages: the empirical distri-

bution is steep at the left and skewed to the right (as is typical for


prices and wages which are not generated additively).

[Figure: histogram of hourly wages (series WAGE, sample 1 to 526); horizontal axis: wage from 0 to 24, vertical axis: frequency from 0 to 140.]

Summary statistics of WAGE:

Observations    526
Mean            5.896103
Median          4.650000
Maximum         24.98000
Minimum         0.530000
Std. Dev.       3.693086
Skewness        2.007325
Kurtosis        7.970083
Jarque-Bera     894.6195
Probability     0.000000

• Examples of random variables with right-skewed distribution:

– A $\chi^2(m)$ distributed random variable X is defined as the sum of m squared i.i.d. standard normal random variables

$$X = \sum_{j=1}^{m} u_j^2, \qquad u_j \sim \text{i.i.d. } N(0,1).$$


(Details on the χ2 distribution can be found in Appendix B in

Wooldridge (2009).)

[Figure: density of the $\chi^2(1)$ distribution, plotted for values from 0 to 8.]

Moments of a $\chi^2(1)$ distributed random variable:

$$E[X] = E[u^2] = Var(u) + E[u]^2 = 1,$$

$$Var(X) = E[X^2] - E[X]^2 = E[u^4] - 1^2 = 2,$$

$$\frac{u^2 - 1}{\sqrt{2}} = \frac{X - 1}{\sqrt{2}} \sim (0, 1).$$


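Note that for a standard normal random variable we have $E[u^4] = 3$.

– Linear functions of a $\chi^2(1)$ distributed random variable, e.g.

$$y_i = \nu + \sigma_y \frac{u_i^2 - 1}{\sqrt{2}}, \qquad u_i \sim \text{i.i.d. } N(0,1). \qquad (5.1)$$

Moments:

$$E[y_i] = \nu,$$

$$Var(y_i) = Var\left(\sigma_y \frac{u_i^2 - 1}{\sqrt{2}}\right) = \sigma_y^2\, Var\left(\frac{u_i^2}{\sqrt{2}}\right) = \sigma_y^2.$$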


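• Expectation and variance of the mean estimator

$$\hat\mu_n = \frac{1}{n}\sum_{i=1}^{n} y_i:$$

$$E[\hat\mu_n] = \frac{1}{n}\sum_{i=1}^{n} E[y_i] = \nu,$$

$$Var(\hat\mu_n) = \frac{1}{n^2}\sum_{i=1}^{n} Var(y_i) = \frac{Var(y_i)}{n} = \frac{\sigma_y^2}{n},$$

$$sd(\hat\mu_n) = \frac{\sigma_y}{\sqrt{n}}.$$

In this example the estimator is unbiased and its variance decreases at rate 1/n as the sample size increases.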


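• Consistency of an estimator $\hat\theta_n$:

For every $\epsilon > 0$ and $\delta > 0$ there exists an N such that

$$P\left(|\hat\theta_n - \theta| < \epsilon\right) > 1 - \delta \quad \text{for all } n > N.$$

Alternatively:

– $\lim_{n\to\infty} P\left(|\hat\theta_n - \theta| < \epsilon\right) = 1$,

– $\text{plim}\, \hat\theta_n = \theta$,

– $\hat\theta_n \xrightarrow{p} \theta$.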

The “plim” notation stands for probability limit. This concept

of convergence is usually denoted as convergence in probability or

(weak) consistency. Some notes on calculation rules for the “plim”

are given in Appendix C.3 in Wooldridge (2009).


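An estimator $\hat\theta_n$ is consistent if it has the properties

– $\lim_{n\to\infty} E[\hat\theta_n] = \theta$ and

– $\lim_{n\to\infty} Var(\hat\theta_n) = 0$.

These two conditions are sufficient for consistency. If one of them fails to hold, consistency is not guaranteed. In general: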

• Weak law of large numbers (WLLN):

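For $y_i \sim$ i.i.d. with $-\infty < E[y_i] = \mu < \infty$, the mean estimator $\hat\mu_n = \frac{1}{n}\sum_{i=1}^{n} y_i$ is weakly consistent, that is

$$\hat\mu_n \xrightarrow{p} \mu.$$

• Then we can consistently estimate the variance of i.i.d. random variables $w_i \sim \text{i.i.d.}(\mu_w, \sigma_w^2)$ with $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(w_i - \mu_w)^2$. Why?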


• But how can we derive the asymptotic probability distri-

bution of the mean estimator µn?

• Monte Carlo Simulation (MC):

The EViews program mcarlo1_est_mu.prg allows us to iteratively draw R = 1000 samples of size n with elements $y_1, \dots, y_n$, where $y_i \sim \text{i.i.d.}(\nu, \sigma_y^2)$ with $\nu = 3$ and $\sigma_y^2 = 1$ and $y_i$ is generated from (5.1). One frequently calls (5.1) the data generating process (DGP). For every sample $y_1^r, y_2^r, \dots, y_n^r$ generated in this way, where r = 1, . . . , 1000, the mean estimator $\hat\mu^r = \frac{1}{n}\sum_{i=1}^{n} y_i^r$ is calculated and stored. After all iterations, a histogram is calculated based on the R estimates $\hat\mu^1, \hat\mu^2, \dots, \hat\mu^R$.


First, the results for the simulated moments:

Elements in    average of         std. deviation of        true std. deviation
sample n       estimated means    estimated means (MC)     (DGP)
10             2.993908           0.298423                 0.3162278
30             3.003103           0.182773                 0.1825742
50             3.001263           0.142403                 0.1414214
100            3.001637           0.098750                 0.1
500            2.999619           0.046355                 0.04472136
1000           3.000503           0.031865                 0.03162278

– The true moments are accurately estimated,

– and we can observe how the LLN works.
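The simulation design can be sketched in a few lines. This is not the EViews program mentioned above but a minimal stand-in under the same DGP (5.1) with ν = 3 and σ_y = 1; the function names, the seed and the reduced set of sample sizes are choices made for this illustration, so the numbers will differ slightly from the table.

```python
import math
import random
import statistics

def draw_sample(n, rng, nu=3.0, sigma_y=1.0):
    """Draw n observations from DGP (5.1): y = nu + sigma_y * (u**2 - 1) / sqrt(2)."""
    return [nu + sigma_y * (rng.gauss(0.0, 1.0) ** 2 - 1.0) / math.sqrt(2.0)
            for _ in range(n)]

def simulate_means(n, replications=1000, seed=12345):
    """Return the estimated means of `replications` independent samples of size n."""
    rng = random.Random(seed)
    return [statistics.fmean(draw_sample(n, rng)) for _ in range(replications)]

results = {}
for n in (10, 30, 100):
    means = simulate_means(n)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n = {n:4d}: average = {results[n][0]:.4f}, "
          f"std. dev. = {results[n][1]:.4f}, true = {1.0 / math.sqrt(n):.4f}")
```

The simulated standard deviations should lie close to $\sigma_y/\sqrt{n}$ and shrink as n grows, which is the pattern shown in the table.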


[Figure: histograms of the R = 1000 estimated means (series MU_HAT) for n = 10, 30, 50, 100, 500 and 1000; one panel per sample size.]

Summary statistics reported with the histograms:

Statistic       n = 10      n = 30      n = 50      n = 100     n = 500     n = 1000
Observations    1000        1000        1000        1000        1000        1000
Mean            2.993908    3.003103    3.001263    3.001637    2.999619    3.000503
Median          2.966680    2.991643    2.992116    2.994952    2.998812    2.999499
Maximum         4.061612    3.795529    3.650012    3.434677    3.201741    3.108631
Minimum         2.394145    2.562306    2.595097    2.721445    2.861536    2.905122
Std. Dev.       0.298423    0.182773    0.142403    0.098750    0.046355    0.031865
Skewness        0.621673    0.572018    0.509651    0.383232    0.198532    0.170954
Kurtosis        3.250171    3.450607    3.745789    3.350859    3.293122    2.950809
Jarque-Bera     67.02061    62.99434    66.46577    29.60699    10.14917    4.971676
Probability     0.0000000   0.000000    0.0000000   0.000000    0.0062540   0.083256


• Results for simulated distributions:

– Right-skewness decreases with increase in sample size n.

– A test for normality (Jarque-Bera-Test): null hypothesis of normal

distribution cannot be rejected for large n.
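The Jarque-Bera values reported with the histograms can be reproduced from the reported skewness and kurtosis. The sketch below uses the standard formula $JB = \frac{R}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$ with R observations and the fact that the asymptotic $\chi^2(2)$ distribution has survival function $\exp(-x/2)$; the function names are hypothetical.

```python
import math

def jarque_bera(n_obs, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness and kurtosis."""
    return n_obs / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

def chi2_pvalue_df2(x):
    """P(X > x) for a chi-square random variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

# Reported skewness and kurtosis of the 1000 estimated means.
jb_small = jarque_bera(1000, 0.621673, 3.250171)   # samples of size n = 10
jb_large = jarque_bera(1000, 0.170954, 2.950809)   # samples of size n = 1000
p_small = chi2_pvalue_df2(jb_small)
p_large = chi2_pvalue_df2(jb_large)

print(f"n = 10:   JB = {jb_small:.4f}, p-value = {p_small:.6f}")
print(f"n = 1000: JB = {jb_large:.4f}, p-value = {p_large:.6f}")
```

At the 5% level normality is rejected for n = 10 but not for n = 1000.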

Theoretical explanation of this phenomenon: a central limit theorem, one of the most important tools in statistics, holds under certain (rather weak) conditions!

• Central limit theorem (CLT):

For $y_i \sim \text{i.i.d.}(\mu, \sigma^2)$ with $0 < \sigma^2 < \infty$, $\hat\mu_n = \frac{1}{n}\sum_{i=1}^{n} y_i$ is asymptotically normally distributed:

$$\sqrt{n}\,(\hat\mu_n - \mu) \xrightarrow{d} N(0, \sigma^2).$$


– Interpretation: the larger the number of sample elements n, the

more precise is the approximation of the exact distribution of µn

(see the MC results) by an exactly specified normal distribution.

Hence the label large sample distribution.

– But how good is the asymptotic approximation for a given sample

size n?

∗ The CLT is not informative on this question, though we may

get an answer by conducting MC simulations for certain cases

or by using rather involved finite sample statistics.

∗ Experience: as the distribution of the yi approaches the nor-

mal distribution, smaller and smaller n suffice for a very good

approximation. In some cases even n = 30 is enough.

Do some experiments using mcarlo1_est_mu.prg, and gradually increase the degrees of freedom r of the $\chi^2$ distribution (and observe that the skewness decreases in this process)!

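– Alternative notations ($\Phi(z)$ is the cumulative distribution function of the standard normal distribution):

$$\sqrt{n}\left(\frac{\hat\mu_n - \mu}{\sigma}\right) \xrightarrow{d} N(0, 1) \qquad (5.2)$$

$$P\left(\sqrt{n}\left(\frac{\hat\mu_n - \mu}{\sigma}\right) \le z\right) \longrightarrow \Phi(z) \qquad (5.3)$$

$$\frac{\hat\mu_n - \mu}{\sigma/\sqrt{n}} \overset{approx}{\sim} N(0, 1) \qquad (5.4)$$

$$\hat\mu_n \overset{approx}{\sim} N\left(\mu, \frac{\sigma^2}{n}\right) \qquad (5.5)$$

Notation: the mean estimator is asymptotically normally distributed.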


• In large samples the standardized mean estimator is approximated

by a standard normal distribution. Then, due to (5.4)

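$w_i \sim \text{i.i.d. } N(\mu, \sigma^2)$: $\quad t(w_1, \dots, w_n) = \dfrac{\hat\mu - \mu}{\sigma_{\hat\mu}} \sim N(0, 1)$

$w_i \sim \text{i.i.d.}(\mu, \sigma^2)$: $\quad t(w_1, \dots, w_n) = \dfrac{\hat\mu - \mu}{\sigma_{\hat\mu}} \overset{approx}{\sim} N(0, 1)$

and it can be shown that, with the estimated standard deviation $\hat\sigma_{\hat\mu}$ in the denominator,

$w_i \sim \text{i.i.d. } N(\mu, \sigma^2)$: $\quad t(w_1, \dots, w_n) = \dfrac{\hat\mu - \mu}{\hat\sigma_{\hat\mu}} \sim t_{n-k-1}$ (with k = 0 for the mean)

$w_i \sim \text{i.i.d.}(\mu, \sigma^2)$: $\quad t(w_1, \dots, w_n) = \dfrac{\hat\mu - \mu}{\hat\sigma_{\hat\mu}} \overset{approx}{\sim} N(0, 1)$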

and we get the following (very convenient) result: the (small sam-

ple) theory of t tests and confidence intervals for the mean

estimator of i.i.d. variables holds approximately in large

(enough) samples.
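How well the approximation works can be checked by simulation. The following sketch is an illustration under assumptions chosen here (DGP (5.1) with ν = 3 and σ_y = 1, a nominal 95% confidence interval based on the standard normal quantile, hypothetical function name and seed): it estimates how often the interval covers the true mean for a small and for a larger sample size.

```python
import math
import random
import statistics

def coverage(n, replications=1000, level=0.95, nu=3.0, seed=42):
    """Share of nominal `level` confidence intervals for the mean that cover nu."""
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    hits = 0
    for _ in range(replications):
        # Right-skewed sample from DGP (5.1) with sigma_y = 1.
        sample = [nu + (rng.gauss(0.0, 1.0) ** 2 - 1.0) / math.sqrt(2.0)
                  for _ in range(n)]
        mean = sum(sample) / n
        var = sum((v - mean) ** 2 for v in sample) / (n - 1)
        se = math.sqrt(var / n)
        if mean - z * se <= nu <= mean + z * se:
            hits += 1
    return hits / replications

cov_small = coverage(10)
cov_large = coverage(300)
print(f"coverage for n = 10:  {cov_small:.3f}")
print(f"coverage for n = 300: {cov_large:.3f}")
```

For the larger sample the simulated coverage should be close to the nominal 95%, while for n = 10 it falls short because the distribution of the data is strongly skewed.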


• Hence the test results in our empirical exercise are still approximately

valid!

• How about this concept of validity in a regression context?


5.2 Large Sample Inference for the OLS Estimator

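• The OLS estimator

$$\hat{\boldsymbol\beta} = \boldsymbol\beta + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{u} = \boldsymbol\beta + \mathbf{W}\mathbf{u}$$

depends on X or W. Hence, for the OLS estimator to be consistent and asymptotically normal, certain conditions must hold for the regressor variables as n → ∞. One of these conditions is that for all j, l = 0, 1, . . . , k we have $\text{plim}\, \frac{1}{n}\sum_{i=1}^{n} x_{ij}x_{il} = E[x_j x_l] = a_{jl}$ or

$$\frac{1}{n}\mathbf{X}'\mathbf{X} \xrightarrow{p} \mathbf{A}. \qquad (5.6)$$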


• Asymptotic normality of the OLS estimator

All necessary conditions for asymptotic normality are fulfilled if the

standard assumptions MLR.1-MLR.5 hold true. Then (see a sketch

of proof in Appendix E.4 in Wooldridge (2009)):

$$\sqrt{n}\left(\hat{\beta} - \beta\right) \overset{d}{\longrightarrow} N\left(0, \sigma^2 A^{-1}\right). \qquad (5.7)$$


For the (asymptotic) distributions of the t statistics we get:

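$$\text{MLR.1-MLR.6}: \qquad t(X, y) = \frac{\hat{\beta}_j - \beta_j}{\sigma_{\hat{\beta}_j}} \sim N(0, 1)$$

$$\text{MLR.1-MLR.5}: \qquad t(X, y) = \frac{\hat{\beta}_j - \beta_j}{\sigma_{\hat{\beta}_j}} \overset{approx}{\sim} N(0, 1)$$

and it can be shown that

$$\text{MLR.1-MLR.6}: \qquad t(X, y) = \frac{\hat{\beta}_j - \beta_j}{\hat{\sigma}/\sqrt{SST_j(1 - R_j^2)}} \sim t_{n-k-1}$$

$$\text{MLR.1-MLR.5}: \qquad t(X, y) = \frac{\hat{\beta}_j - \beta_j}{\hat{\sigma}/\sqrt{SST_j(1 - R_j^2)}} \overset{approx}{\sim} N(0, 1)$$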


A frequent observation from many Monte Carlo simulations and

empirical practice is that

– for small n one proceeds as in the case of normally distributed

errors and uses the critical values of the t distribution:

$$\text{MLR.1-MLR.5}: \qquad t(X, y) = \frac{\hat{\beta}_j - \beta_j}{\hat{\sigma}/\sqrt{SST_j(1 - R_j^2)}} \overset{approx}{\sim} t_{n-k-1}$$

– and analogously for the F statistic the critical values are deter-

mined from the F distribution.

– Note again: the critical values are valid only approximately, not

exactly. Analogously, the p-values (calculated in EViews) are valid

only approximately!
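The approximate validity of the t test without normal errors can be illustrated with a small Monte Carlo experiment. The following is only a sketch under assumed settings that are not from the course material: a simple regression with skewed, mean-zero, homoskedastic errors, n = 500 and 2000 replications. It counts how often the two-sided 5% test rejects a true null hypothesis.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so that the experiment is reproducible

def t_stat_slope(n, rng):
    """Simulate y = 1 + 2*x + u with skewed errors and return the
    t statistic for the (true) null hypothesis beta_1 = 2."""
    x = rng.normal(size=n)
    u = rng.exponential(scale=1.0, size=n) - 1.0   # mean 0, variance 1, not normal
    y = 1.0 + 2.0 * x + u
    X = np.column_stack([np.ones(n), x])
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - 2)           # n - k - 1 with k = 1
    var_beta = sigma2_hat * np.linalg.inv(X.T @ X)
    return (beta_hat[1] - 2.0) / np.sqrt(var_beta[1, 1])

def rejection_rate(n, reps=2000):
    """Share of replications in which |t| exceeds the N(0,1) critical value 1.96."""
    t = np.array([t_stat_slope(n, rng) for _ in range(reps)])
    return np.mean(np.abs(t) > 1.96)

rate_large = rejection_rate(n=500)
print("share of |t| > 1.96 with n = 500:", rate_large)
```

If the asymptotic approximation works, the printed share should be close to the nominal level of 0.05; repeating the experiment with a small n shows how the approximation deteriorates.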


• Conclusion:

– For the calculation of test statistics and confidence intervals (ex-

ception: forecast intervals) we proceed as hitherto. However, all

statistical results hold only as an approximation.

– If the assumption of homoskedasticity is violated, even the asymp-

totic results do not hold and models for heteroskedastic errors are

required (with stronger assumptions for LLN and CLT), see Chap-

ter 8.

Reading: Chapter 5 and Appendix C.3 in Wooldridge (2009).


6 Multiple Regression Analysis: Interpretation

6.1 Level and Log Models

Recall section 2.6 on level-level, level-log, log-level, log-log models. All

the results remain valid in the multiple regression model in a ceteris-

paribus analysis.


6.2 Data Scaling

• Scaling the dependent variable:

– Initial model:

y = Xβ + u.

– Variable transformation: $y_i^* = a \cdot y_i$ with scale factor a.

→ New, transformed regression equation:

$$\underbrace{ay}_{y^*} = X\underbrace{a\beta}_{\beta^*} + \underbrace{au}_{u^*}$$

$$y^* = X\beta^* + u^* \qquad (6.1)$$


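– OLS estimator for β∗ in (6.1):

$$\hat{\beta}^* = (X'X)^{-1}X'y^* = a(X'X)^{-1}X'y = a\hat{\beta}.$$

– Error variance:

$$Var(u^*) = Var(au) = a^2 Var(u) = a^2\sigma^2 I.$$

– Variance-covariance matrix:

$$Var(\hat{\beta}^*) = \sigma^{*2}(X'X)^{-1} = a^2\sigma^2(X'X)^{-1} = a^2 Var(\hat{\beta})$$

– t statistic:

$$t^* = \frac{\hat{\beta}_j^* - 0}{\hat{\sigma}_{\hat{\beta}_j^*}} = \frac{a\hat{\beta}_j}{a\hat{\sigma}_{\hat{\beta}_j}} = t.$$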


• Scaling explanatory variables:

– Variable transformation: $X^* = X \cdot a$. New regression equation:

$$y = Xa \cdot a^{-1}\beta + u = X^*\beta^* + u. \qquad (6.2)$$

– OLS estimator for β∗ in (6.2):

$$\hat{\beta}^* = (X^{*\prime}X^*)^{-1}X^{*\prime}y = (a^2X'X)^{-1}X'ay = a^{-2}a(X'X)^{-1}X'y = a^{-1}\hat{\beta}.$$

– Result: The sole magnitude of βj is no indicator for the relevance

of the impact of the j-th regressor. One always has to take the

scale of the variable into account.

– Example: In Section 2.3 a simple level-level model was estimated for imports on gdp. The parameter estimate $\hat{\beta}_{gdp} = 2.16 \cdot 10^{-5}$ appears very small. However, taking into account that gdp is measured in dollars, this estimate is not small. Simply rescale gdp to millions of dollars with $a = 10^{-6}$ and you obtain $\hat{\beta}^*_{gdp} = 10^{6} \cdot 2.16 \cdot 10^{-5} = 21.6$.

• Scaling of variables in logarithmic form

just alters the constant β0 since ln y∗ = ln ay = ln a + ln y.
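The scaling results of this section can be checked numerically. The sketch below uses simulated data; the helper `ols` and all variable names are assumptions made for illustration. It rescales first the dependent variable and then a single regressor. Note that, unlike (6.2), only the regressor column is rescaled here and not the constant, so the intercept stays the same and only the slope is divided by a.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and t statistics for H0: coefficient = 0."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)       # p = k + 1 estimated parameters
    se = np.sqrt(sigma2_hat * np.diag(XtX_inv))
    return beta_hat, beta_hat / se

rng = np.random.default_rng(1)
n = 200
x = rng.normal(loc=5.0, scale=2.0, size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

a = 100.0                                      # hypothetical scale factor
beta, t = ols(X, y)

# Scaling the dependent variable: coefficients are multiplied by a, t is unchanged.
beta_y, t_y = ols(X, a * y)

# Scaling the regressor: its coefficient is divided by a, t is unchanged.
X_scaled = np.column_stack([np.ones(n), a * x])
beta_x, t_x = ols(X_scaled, y)

print("original:", beta, t)
print("y scaled:", beta_y, t_y)
print("x scaled:", beta_x, t_x)
```

The t statistics agree in all three regressions, which is why rescaling never changes the outcome of a significance test.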

• Standardized Coefficients:

We just saw that it is not possible to deduce the relevance of ex-

planatory variables from the magnitude of the corresponding coef-

ficient. This is possible, however, if the regression is suitably stan-

dardized.


Derivation: First, consider the following sample regression model

$$y_i = \hat{\beta}_0 + x_{i1}\hat{\beta}_1 + \ldots + x_{ik}\hat{\beta}_k + \hat{u}_i, \qquad (6.3)$$

and its representation after taking means over all n observations (the OLS residuals average to zero, so no residual term remains)

$$\bar{y} = \hat{\beta}_0 + \bar{x}_1\hat{\beta}_1 + \ldots + \bar{x}_k\hat{\beta}_k. \qquad (6.4)$$

Then we subtract (6.4) from (6.3)

$$(y_i - \bar{y}) = (x_{i1} - \bar{x}_1)\hat{\beta}_1 + \ldots + (x_{ik} - \bar{x}_k)\hat{\beta}_k + \hat{u}_i. \qquad (6.5)$$

Finally, we divide equation (6.5) by the estimated standard deviation of y, say $\hat{\sigma}_y$, and expand every term on the right-hand side by the estimated standard deviations of the corresponding explanatory variables, say $\hat{\sigma}_{x_j}$, j = 1, . . . , k,

$$\frac{y_i - \bar{y}}{\hat{\sigma}_y} = \frac{x_{i1} - \bar{x}_1}{\hat{\sigma}_y} \cdot \frac{\hat{\sigma}_{x_1}}{\hat{\sigma}_{x_1}}\hat{\beta}_1 + \ldots + \frac{x_{ik} - \bar{x}_k}{\hat{\sigma}_y} \cdot \frac{\hat{\sigma}_{x_k}}{\hat{\sigma}_{x_k}}\hat{\beta}_k + \frac{\hat{u}_i}{\hat{\sigma}_y}.$$

Simple algebra gives

$$\underbrace{\frac{y_i - \bar{y}}{\hat{\sigma}_y}}_{z_{i,y}} = \underbrace{\frac{x_{i1} - \bar{x}_1}{\hat{\sigma}_{x_1}}}_{z_{i,x_1}} \underbrace{\frac{\hat{\sigma}_{x_1}}{\hat{\sigma}_y}\hat{\beta}_1}_{b_1} + \ldots + \underbrace{\frac{x_{ik} - \bar{x}_k}{\hat{\sigma}_{x_k}}}_{z_{i,x_k}} \underbrace{\frac{\hat{\sigma}_{x_k}}{\hat{\sigma}_y}\hat{\beta}_k}_{b_k} + \underbrace{\frac{\hat{u}_i}{\hat{\sigma}_y}}_{\xi_i}.$$

In the literature the transformed variables $z_{i,y}$ and $z_{i,x_1}, \ldots, z_{i,x_k}$ are usually denoted as z-scores.

In compact notation we get

$$z_{i,y} = z_{i,x_1}b_1 + \cdots + z_{i,x_k}b_k + \xi_i,$$

where the $b_j$ are denoted as standardized coefficients (or simply beta coefficients).

The magnitudes of the standardized coefficients can be compared to each other. Hence, the explanatory variable with the largest standardized coefficient $b_j$ (in absolute value) has the relatively largest impact on the dependent variable.


Interpretation: a one standard deviation increase in xj changes y

by bj standard deviations.

Standardized coefficients can be calculated in SPSS (see Example

6.1 in Wooldridge (2009)).
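Standardized coefficients are also easy to compute by hand. The following sketch works on simulated data (all names and numbers are assumptions for illustration): it regresses the z-score of y on the z-scores of the regressors and compares the result with the formula that multiplies each OLS slope by the ratio of the standard deviations of x_j and y.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(scale=10.0, size=n)     # regressor measured on a large scale
x2 = rng.normal(scale=0.1, size=n)      # regressor measured on a small scale
y = 3.0 + 0.2 * x1 + 5.0 * x2 + rng.normal(size=n)

def zscore(v):
    """Subtract the sample mean and divide by the estimated standard deviation."""
    return (v - v.mean()) / v.std(ddof=1)

# Regression in original units (with intercept).
X = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# Regression of z-scores on z-scores (no intercept: all variables have mean zero).
Z = np.column_stack([zscore(x1), zscore(x2)])
b_hat = np.linalg.lstsq(Z, zscore(y), rcond=None)[0]

# The same numbers from the slopes and the standard deviations.
sd_x = np.array([x1.std(ddof=1), x2.std(ddof=1)])
b_from_beta = beta_hat[1:] * sd_x / y.std(ddof=1)

print("slopes in original units:", beta_hat[1:])
print("standardized coefficients:", b_hat)
```

In this simulated design x2 has the larger raw slope, but x1 has the larger standardized coefficient, which illustrates why raw magnitudes alone say nothing about relevance.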


6.3 Dealing with Nonlinear or Transformed Regressors

• Further details on logarithmic variables:

Consider the following log-level regression model

ln y = β0 + β1x1 + β2x2 + u, (6.6)

where x2 is a dummy variable (it is either equal to 0 or 1).

– How can we determine the exact impact of x2, that is, how

should we interpret β2? From (6.6) follows

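$$y = e^{\ln y} = e^{\beta_0 + \beta_1x_1 + \beta_2x_2 + u} = e^{\beta_0 + \beta_1x_1 + \beta_2x_2} \cdot e^{u}$$

and for the conditional expectation

$$E[y|x_1, x_2] = e^{\beta_0 + \beta_1x_1 + \beta_2x_2} \cdot E[e^{u}|x_1, x_2]. \qquad (6.7)$$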


Inserting the two possible values of x2 into (6.7) delivers

$$E[y|x_1, x_2 = 0] = e^{\beta_0 + \beta_1x_1} \cdot E[e^{u}|x_1, x_2]$$

$$E[y|x_1, x_2 = 1] = e^{\beta_0 + \beta_1x_1} \cdot E[e^{u}|x_1, x_2] \cdot e^{\beta_2} = E[y|x_1, x_2 = 0] \cdot e^{\beta_2}.$$

– Thus, if $E[e^{u}|x_1, x_2]$ is constant (for x2), the relative mean change of the dependent variable with respect to a unit change in x2 is equal to

$$\frac{\Delta E[y|x_1, x_2]}{E[y|x_1, x_2 = 0]} = \frac{E[y|x_1, x_2 = 1] - E[y|x_1, x_2 = 0]}{E[y|x_1, x_2 = 0]} = \frac{E[y|x_1, x_2 = 0] \cdot e^{\beta_2} - E[y|x_1, x_2 = 0]}{E[y|x_1, x_2 = 0]} = e^{\beta_2} - 1.$$


This implies

$$\%\Delta E[y|x_1, x_2] = 100\left(e^{\beta_2} - 1\right).$$

– In the general case of k regressors:

$$\%\Delta E[y|x_1, x_2, \ldots, x_k] = 100\left(e^{\beta_j\Delta x_j} - 1\right). \qquad (6.8)$$

Obviously (6.8) represents the exact partial effect, whereas

the interpretation as an approximate semi-elasticity may be rather

crude in some cases.

– Trade Example Continued (from Section 4.8 and specifically

from Section 4.4):

For Model 3 we obtained the sample regression

LOG(TRADE_0_D_O) = -9.5789 + 1.3566*LOG(WDI_GDPUSDCR_O)

- 1.1443*LOG(CEPII_DIST) + 3.1265*CEPII_COMCOL_REV + RESIDUAL

Recall that CEPII_COMCOL_REV denotes a dummy variable.


∗ The approximate interpretation of $\hat{\beta}_{comcol}$ is that a one-unit change in the dummy changes imports by $100\hat{\beta}_{comcol} = 313\%$.

∗ The exact partial effect is $100\left(e^{\hat{\beta}_{comcol}} - 1\right) = 2179.5\%$, almost 7 times as large!

∗ Of course, the difference between the approximate and the exact effect is smaller if β is closer to zero.
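The comparison between the approximate and the exact effect can be reproduced with a few lines of arithmetic. This is a minimal sketch: the coefficient is the Model 3 estimate quoted above, while the three small coefficients in the loop are hypothetical values chosen only to show how the gap shrinks.

```python
import math

beta_comcol = 3.1265   # coefficient of CEPII_COMCOL_REV in Model 3

approx_pct = 100 * beta_comcol                   # approximate semi-elasticity reading
exact_pct = 100 * (math.exp(beta_comcol) - 1)    # exact effect from (6.8) with a unit change

print(f"approximate: {approx_pct:.1f}%, exact: {exact_pct:.1f}%")

# Hypothetical smaller coefficients: the two readings move closer together.
for b in (0.05, 0.3, 1.0):
    print(b, round(100 * b, 2), round(100 * (math.exp(b) - 1), 2))
```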


• Models with quadratic regressors:

– For example, consider the multiple regression

$$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_2^2 + u.$$

The marginal effect of a change in x2 on the conditional expectation of y is equal to

$$\frac{\partial E[y|x_1, x_2]}{\partial x_2} = \beta_2 + 2\beta_3x_2.$$

Therefore a (small) change of ∆x2 in x2 changes ceteris paribus the dependent variable y on average by approximately

$$(\beta_2 + 2\beta_3x_2)\Delta x_2.$$

Clearly, this effect depends on the level of x2 (and an interpreta-

tion of β2 alone does not make any sense!).


– In some empirical applications regressor variables are considered

using quadratics and logarithms, in order to approximate a non-

linear regression function.

Example: we can approximate non-constant elasticities using the

model

$$\ln y = \beta_0 + \beta_1x_1 + \beta_2\ln x_2 + \beta_3(\ln x_2)^2 + u.$$

Then the elasticity of y with respect to x2 equals

β2 + 2β3 ln x2

and is constant if and only if β3 = 0.


– Trade Example Continued:

So far we only considered multiple regression models that are

log-log or log-level in the original variables.

Now consider a further specification for modeling imports where a log regressor also enters in squared form.

Model 5:

$$\ln(imports) = \beta_0 + \beta_1\ln(gdp) + \beta_2(\ln(gdp))^2 + \beta_3\ln(distance) + \beta_4\,\text{com colonizer} + \beta_5\,\text{com language} + u.$$

Using the previous result, the elasticity of imports with respect

to gdp is

β1 + 2β2 ln(gdp). (6.9)


Estimation of Model 5 delivers:

====================================================================

Dependent Variable: LOG(TRADE_0_D_O)

Method: Least Squares

Sample: 1 55, Included observations: 52

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C -85.82647 27.89947 -3.076276 0.0035

LOG(WDI_GDPUSDCR_O) 7.087930 2.186465 3.241731 0.0022

(LOG(WDI_GDPUSDCR_O))^2 -0.111982 0.042366 -2.643219 0.0112

LOG(CEPII_DIST) -0.761431 0.513375 -1.483187 0.1448

CEPII_COMCOL_REV 3.911289 0.673603 5.806523 0.0000

CEPII_COMLANG_OFF 2.312292 1.244898 1.857414 0.0697

====================================================================

R-squared 0.751895 Mean dependent var 15.97292

Adjusted R-squared 0.724927 S.D. dependent var 2.613094

S.E. of regression 1.370499 Akaike info criterion 3.576394

Sum squared resid 86.40031 Schwarz criterion 3.801537

Log likelihood -86.98624 Hannan-Quinn criter. 3.662709

F-statistic 27.88113 Durbin-Watson stat 2.094393

Prob(F-statistic) 0.000000

====================================================================


Comparing the AIC, HQ, and SC of Model 5 with those of Models

1 to 4, see Section 4.4, one finds that Model 5 exhibits the lowest

values throughout. In addition, the (approximate) p-value of β2

is 0.011 and the quadratic term is statistically significant at the

5% significance level.

This also provides evidence for a nonlinear elasticity. Inserting the parameter estimates into (6.9) delivers

$$\hat{\eta}(gdp) = 7.088 - 0.224\ln(gdp).$$

One may plot the elasticity η(gdp) versus gdp for each observed value of gdp. In EViews this can be done with a little program:

' Model 5
equation model5.ls log(trade_0_d_o) c log(wdi_gdpusdcr_o) (log(wdi_gdpusdcr_o))^2 log(cepii_dist) cepii_comcol_rev cepii_comlang_off

' generate elasticities for gdp, see (6.9)
genr elast_gdp = model5.@coefs(2) + 2*model5.@coefs(3)*log(wdi_gdpusdcr_o)

group group_elast_gdp (wdi_gdpusdcr_o) elast_gdp ' create group
group_elast_gdp.scat ' make scatter plot
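For readers without EViews, the same elasticity can be evaluated directly from the reported estimates. This is only a sketch: the two coefficients are taken from the Model 5 output above, whereas the gdp levels are hypothetical round numbers rather than the observations in the data set.

```python
import math

b1 = 7.087930     # coefficient of LOG(WDI_GDPUSDCR_O) in Model 5
b2 = -0.111982    # coefficient of (LOG(WDI_GDPUSDCR_O))^2 in Model 5

def elasticity(gdp):
    """Elasticity of imports with respect to gdp, following (6.9)."""
    return b1 + 2 * b2 * math.log(gdp)

gdp_levels = [1e9, 1e10, 1e11, 1e12, 1e13]   # hypothetical gdp values in US dollars
elasticities = [elasticity(g) for g in gdp_levels]

for g, e in zip(gdp_levels, elasticities):
    print(f"gdp = {g:.0e}: elasticity = {e:.3f}")
```

The estimated elasticity falls as gdp rises, in line with the negative coefficient on the squared term.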


[Figure: scatter plot of ELAST_GDP (vertical axis, 0.0 to 2.4) against WDI_GDPUSDCR_O (horizontal axis, 0.0E+00 to 1.2E+13).]

The import elasticity with respect to gdp is much larger for small

economies in terms of gdp than for large economies.

Warning: Nonlinearities are sometimes due to missing variables.

Can you think of any control variables left out that should be

included in Model 5?


• Interactions:

Example:

y = β0 + β1x1 + β2x2 + β3x2x1 + u.

The marginal effect of a change in x2 is given by

∆E[y|x1, x2] = (β2 + β3x1)∆x2.

Hence, in this case the marginal effect also depends on the level of

x1!

• Selection between non-nested models of the same dependent

variable:

Definition: Non-nested means that one model cannot be represented as a special case of the other model.

Example: the choice between the models $y = \beta_0 + \beta_1x_1 + \beta_2x_1^2 + u$ and $y = \beta_0 + \beta_1\ln x_1 + u$ can be based on SC or AIC (or $R^2$). Why?


6.4 Regressors with Qualitative Data

Dummy variables or binary variables

A binary variable can take exactly two different values and thus allows one to describe two qualitatively different states.

Examples: female vs. male, employed vs. unemployed, etc.

• In general these values are coded as 0 and 1. This allows for a very

easy and straightforward interpretation. Example:

$$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \cdots + \beta_{k-1}x_{k-1} + \delta D + u,$$

where D equals 0 or 1.


• Interpretation (well known by now):

$$E[y|x_1, \ldots, x_{k-1}, D = 1] - E[y|x_1, \ldots, x_{k-1}, D = 0] = \beta_0 + \beta_1x_1 + \cdots + \beta_{k-1}x_{k-1} + \delta - (\beta_0 + \beta_1x_1 + \cdots + \beta_{k-1}x_{k-1}) = \delta$$

The coefficient of a dummy variable is equal to an intercept shift of size δ in the case D = 1. All slope parameters $\beta_i$, i = 1, . . . , k − 1, remain unchanged.

• Wage Example Continued:

– Question of interest: Do females earn significantly less than males?

– Data: a sample of n = 526 U.S. workers obtained in 1976.

(Source: Examples 2.4, 7.1 in Wooldridge (2009)).


∗ wage in dollars per hour,

∗ educ: years of schooling of each worker,

∗ exper: years of professional experience,

∗ tenure: years of employment in current firm,

∗ female: dummy=1 if female, dummy=0 otherwise.

====================================================================

Dependent Variable: ln(WAGE), Method: Least Squares, Sample: 1 526

====================================================================

Variable Coefficient Std. Error t-Statistic Prob.

====================================================================

C 0.416691 0.098928 4.212066 0.0000

FEMALE -0.296511 0.035805 -8.281169 0.0000

EDUC 0.080197 0.006757 11.86823 0.0000

EXPER 0.029432 0.004975 5.915866 0.0000

EXPER^2 -0.000583 0.000107 -5.430528 0.0000

TENURE 0.031714 0.006845 4.633035 0.0000

TENURE^2 -0.000585 0.000235 -2.493365 0.0130

====================================================================

R-squared 0.440769, Adjusted R-squared 0.434304, Akaike info crit. 1.017438

Mean dependent var 1.623268, S.D. dependent var 0.531538, Schwarz criterion 1.074200

S.E. of regression 0.399785, Sum squared resid 82.95065

F-statistic 68.17659, Prob(F-statistic) 0.000000


– Note: In order to be able to interpret the coefficients of dummy

variables one has to know the reference group. The reference

group is given by the group for which the dummy equals zero.

– Prediction: How much does a woman with 12 years of schooling, 10 years of experience, and 1 year of tenure earn? (Of course, you can insert any other numbers here.)

$$E[\ln(wage)|female = 1, educ = 12, exper = 10, tenure = 1] = 0.4167 - 0.2965 \cdot 1 + 0.0802 \cdot 12 + 0.0294 \cdot 10 - 0.0006 \cdot 10^2 + 0.0317 \cdot 1 - 0.0006 \cdot 1^2 = 1.35$$

Thus, the expected hourly wage is approximately exp(1.35) = 3.86 US dollars.

– We already know that in case of a log-level model the expected value of y given the regressors x1, x2 is given by

$$E[y|x_1, x_2] = e^{\beta_0 + \beta_1x_1 + \beta_2x_2} \cdot E[e^{u}|x_1, x_2].$$

The true value of $E[e^{u}|x_1, x_2]$ depends on the probability distribution of u.

It holds that: If u is normally distributed with variance $\sigma^2$, then

$$E[e^{u}|x_1, x_2] = e^{E[u|x_1, x_2] + \sigma^2/2}.$$

The precise prediction is therefore

$$E[y|x_1, x_2] = e^{\beta_0 + \beta_1x_1 + \beta_2x_2 + E[u|x_1, x_2] + \sigma^2/2}.$$


The exact prediction of the desired hourly wage is

$$E[wage|female = 1, educ = 12, exper = 10, tenure = 1] = \exp(0.4167 - 0.2965 \cdot 1 + 0.0802 \cdot 12 + 0.02943 \cdot 10 - 0.0006 \cdot 10^2 + 0.0317 \cdot 1 - 0.0006 \cdot 1^2 + 0.3998^2/2) = 4.18.$$

Thus, the precise value of the mean hourly wage for the specified person is about 4.18 US dollars and thus about 30 cents larger than the approximate value.

– The parameter δ corresponds to the difference between the log income of female and male workers keeping everything else constant (e.g. years of schooling, experience, etc.).

Question: How large is the exact wage difference?

Answer: $100(e^{-0.2965} - 1) = -25.66\%$, i.e. women earn about 25.7% less than comparable men. Equivalently, men earn $100(e^{0.2965} - 1) = 34.51\%$ more than comparable women.
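The wage prediction and the exact gender difference can be recomputed from the regression output. The sketch below assumes normally distributed errors, as in the text, and uses the coefficient estimates and the standard error of regression from the table; the dictionary and function names are made up for this illustration.

```python
import math

# Estimates from the ln(WAGE) regression output above.
coef = {"const": 0.416691, "female": -0.296511, "educ": 0.080197,
        "exper": 0.029432, "exper2": -0.000583,
        "tenure": 0.031714, "tenure2": -0.000585}
sigma_hat = 0.399785   # S.E. of regression

def log_wage(female, educ, exper, tenure):
    """Predicted ln(wage) for the given characteristics."""
    return (coef["const"] + coef["female"] * female + coef["educ"] * educ
            + coef["exper"] * exper + coef["exper2"] * exper ** 2
            + coef["tenure"] * tenure + coef["tenure2"] * tenure ** 2)

lw = log_wage(female=1, educ=12, exper=10, tenure=1)
naive_wage = math.exp(lw)                           # ignores E[exp(u)]
corrected_wage = math.exp(lw + sigma_hat ** 2 / 2)  # log-normal correction

# Exact percentage differences implied by the dummy coefficient.
women_vs_men_pct = 100 * (math.exp(coef["female"]) - 1)
men_vs_women_pct = 100 * (math.exp(-coef["female"]) - 1)

print(f"ln(wage) = {lw:.3f}, naive = {naive_wage:.2f}, corrected = {corrected_wage:.2f}")
print(f"women vs. men: {women_vs_men_pct:.2f}%, men vs. women: {men_vs_women_pct:.2f}%")
```

The two percentage figures describe the same gap from different reference points, so it matters which group is taken as the base.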


Note that ceteris paribus analysis is much more informative than

the comparison of the unconditional means of male and female

wages. Assuming normal errors one has

$$\frac{E[wage_f] - E[wage_m]}{E[wage_m]} = \frac{e^{E[\ln(wage_f)] + \sigma_f^2/2} - e^{E[\ln(wage_m)] + \sigma_m^2/2}}{e^{E[\ln(wage_m)] + \sigma_m^2/2}}.$$

Inserting estimates one obtains

$$\frac{e^{1.416 + 0.44^2/2} - e^{1.814 + 0.53^2/2}}{e^{1.814 + 0.53^2/2}} = -0.3570,$$

which, by the way, is very similar to inserting estimates for $(E[wage_f] - E[wage_m])/E[wage_m]$ directly, leading to -0.3538.

Females earn 36% less than males if one does not control for

other effects.


Several subgroups

• Example: A worker is female or male and married or unmarried

=⇒ 4 subgroups:

1. female and not married

2. female and married

3. male and not married

4. male and married

How to proceed:

– Choose one subgroup to be the reference group, for example:

female and not married

– Define dummy variables for the other subgroups. For example, in

EViews with the Command “generate series” (genr):


∗ FEMMARR = FEMALE*MARRIED

∗ MALESING = (1-FEMALE) * (1-MARRIED)

∗ MALEMARR = (1-FEMALE) * MARRIED

Dependent Variable: ln(WAGE), Method: Least Squares, Sample: 1 526

====================================================================

Variable Coefficient Std. Error t-Statistic Prob.

====================================================================

C 0.211028 0.096644 2.183548 0.0294

FEMMARR -0.087917 0.052348 -1.679475 0.0937

MALESING 0.110350 0.055742 1.979658 0.0483

MALEMARR 0.323026 0.050114 6.445759 0.0000

EDUC 0.078910 0.006694 11.78733 0.0000

EXPER 0.026801 0.005243 5.111835 0.0000

EXPER^2 -0.000535 0.000110 -4.847105 0.0000

TENURE 0.029088 0.006762 4.301613 0.0000

TENURE^2 -0.000533 0.000231 -2.305552 0.0215

====================================================================

R-squared 0.460877, Adjusted R-squared 0.452535, Akaike info crit. 0.988423

Mean dependent var 1.623268, S.D. dependent var 0.531538, Schwarz criterion 1.061403

S.E. of regression 0.393290, Sum squared resid 79.96799

F-statistic 55.24559, Prob(F-statistic) 0.000000


Examples for Interpretation:

– Married women earn about 8.8% less than unmarried women.

However, this effect is only significant at the 10% significance

level (for a two-sided test).

– The wage difference between married men and married women is about 32.3 − (−8.8) = 41.1%. A t test cannot be directly applied. (Solution: Re-estimate the model with one of the two subgroups as the reference group.)

Remarks:

– Using dummies for all subgroups is not recommended since then

differences with respect to the ref. group cannot be tested directly.

– If you use dummies for all subgroups you cannot include a con-

stant. Otherwise MLR.3 is violated. Why?
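The two remarks can be made concrete by building the subgroup dummies and checking the rank of the regressor matrix. This is a sketch with randomly generated group memberships, so the data are hypothetical; the dummy names mirror those used above, and FEMSING is an added name for the reference group.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
female = rng.integers(0, 2, size=n)     # 1 = female, 0 = male
married = rng.integers(0, 2, size=n)    # 1 = married, 0 = not married

femsing = female * (1 - married)        # reference group in the text
femmarr = female * married
malesing = (1 - female) * (1 - married)
malemarr = (1 - female) * married
const = np.ones(n, dtype=int)

# Constant plus three dummies: the reference group is omitted.
X_ref = np.column_stack([const, femmarr, malesing, malemarr])
# Constant plus all four dummies: the dummies add up to the constant.
X_all = np.column_stack([const, femsing, femmarr, malesing, malemarr])

rank_ref = np.linalg.matrix_rank(X_ref)
rank_all = np.linalg.matrix_rank(X_all)
print("columns:", X_ref.shape[1], "rank:", rank_ref)
print("columns:", X_all.shape[1], "rank:", rank_all)
```

With all four dummies and a constant the matrix has five columns but only rank four, so X'X cannot be inverted and the OLS estimator is not defined.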


• Using ordinal information in regression

Example: Ranking of universities

The quality difference between ranks 1 and 2 and ranks 11 and 12,

respectively, may be dramatically different. Hence, ranks should not

be used as regressors. Instead, we have to assign a dummy variable

Dj for all but one (the “reference category”) of the universities,

inducing several new parameters which have to be estimated.

Note: Then, the coefficient of a dummy variable Dj denotes the

intercept shift between university j and the reference university.

Sometimes there are too many ranks and hence too many parame-

ters to be estimated. Then it proves useful to group the data, e.g.,

ranks 1-10, 11-20, etc.


Interactions and Dummy Variables

• Interactions between dummy variables:

– May be used to define sub-groups (e.g., married males).

– Note that a useful interpretation and comparison of sub-group

effects crucially depends on a correct setup of dummies. For

example, let us include the dummies male and married and

their interaction in a wage equation

y = β0 + δ1male + δ2married + δ3male · married + . . .

Then, a comparison between male-married and male-single is

given by

E[y|male = 1,married = 1] − E[y|male = 1,married = 0]

= β0 + δ1 + δ2 + δ3 + . . . − (β0 + δ1 + . . .) = δ2 + δ3


• Interactions between dummies and quantitative variables:

– Allows different slope parameters for different groups

y = β0 + β1D + β2x1 + β3(x1 · D) + u.

Note: here β1 denotes the difference between both groups only for the case x1 = 0. If x1 ≠ 0, then this difference is equal to

E[y|D = 1, x1] − E[y|D = 0, x1]

= β0 + β1 · 1 + β2x1 + β3(x1 · 1) − (β0 + β2x1)

= β1 + β3x1

Even if β1 is negative, the total effect may be positive!


– Wage Example Continued:

Dependent Variable: ln(WAGE), Method: Least Squares, Sample: 1 526

====================================================================

Variable Coefficient Std. Error t-Statistic Prob.

====================================================================

C 0.388806 0.118687 3.275892 0.0011

FEMALE -0.226789 0.167539 -1.353643 0.1764

EDUC 0.082369 0.008470 9.724919 0.0000

EXPER 0.029337 0.004984 5.885973 0.0000

EXPER^2 -0.000580 0.000108 -5.397767 0.0000

TENURE 0.031897 0.006864 4.646956 0.0000

TENURE^2 -0.000590 0.000235 -2.508901 0.0124

FEMALE*EDUC -0.005565 0.013062 -0.426013 0.6703

====================================================================

R-squared 0.440964, Adjusted R-squared 0.433410, Akaike info crit. 1.020890

Mean dependent var 1.623268, S.D. dependent var 0.531538, Schwarz criterion 1.085761

S.E. of regression 0.400100, Sum squared resid 82.92160

F-statistic 58.37084, Prob(F-statistic) 0.000000

Are returns to schooling sensitive to gender?
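The estimated difference in log wages between women and men now depends on education. The following sketch evaluates it from the two relevant estimates in the output above; the education levels are hypothetical evaluation points.

```python
beta_female = -0.226789        # coefficient of FEMALE
beta_female_educ = -0.005565   # coefficient of FEMALE*EDUC

def log_wage_gap(educ):
    """Difference in expected ln(wage), women minus men, at a given level of educ."""
    return beta_female + beta_female_educ * educ

gaps = {educ: log_wage_gap(educ) for educ in (0, 8, 12, 16)}
for educ, gap in gaps.items():
    print(f"educ = {educ:2d}: gap in ln(wage) = {gap:.4f}")
```

The gap changes only slightly with education, which fits the large p-value (0.6703) of the interaction term in the output.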


• Testing for differences between groups

– Can be done with F tests.

– Chow Test: Allows one to test whether there is a difference between groups in the sense that there may be group-specific intercepts and/or (at least one) group-specific slope parameter.

Illustration:

$$y = \beta_0 + \beta_1D + \beta_2x_1 + \beta_3(x_1 \cdot D) + \beta_4x_2 + \beta_5(x_2 \cdot D) + u. \qquad (6.10)$$

Pair of hypotheses:

$H_0: \beta_1 = \beta_3 = \beta_5 = 0$ vs.

$H_1: \beta_1 \neq 0$ and/or $\beta_3 \neq 0$ and/or $\beta_5 \neq 0$


Application of F tests:

∗ Estimate the regression equation for each group l

$$y = \beta_{0l} + \beta_{2l}x_1 + \beta_{4l}x_2 + u, \quad l = 1, 2,$$

and calculate $SSR_1$ and $SSR_2$.

∗ Then estimate this regression for both groups together and calculate SSR.

∗ Compute the F statistic

$$F = \frac{SSR - (SSR_1 + SSR_2)}{SSR_1 + SSR_2} \cdot \frac{n - 2(k + 1)}{k + 1}$$

where k is the number of regressors in each group regression (k = 2 in the illustration) and the degrees of freedom for the F distribution are equal to k + 1 and n − 2(k + 1).
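The three steps of the Chow test translate directly into code. The sketch below uses simulated data with a hypothetical group indicator; the function names `ssr` and `chow_f` are assumptions for this illustration and not part of any library.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals."""
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_hat
    return resid @ resid

def chow_f(X, y, group):
    """Chow F statistic for H0: identical intercept and slopes in both groups.
    X must contain the constant; group is a boolean array."""
    n, p = X.shape                        # p = k + 1
    ssr_pooled = ssr(X, y)                # both groups together
    ssr_ur = ssr(X[group], y[group]) + ssr(X[~group], y[~group])
    return (ssr_pooled - ssr_ur) / ssr_ur * (n - 2 * p) / p

rng = np.random.default_rng(4)
n = 400
group = rng.integers(0, 2, size=n).astype(bool)
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
u = rng.normal(size=n)

y_same = 1.0 + 0.5 * x1 - 0.3 * x2 + u           # no group differences (H0 true)
y_diff = y_same + group * (1.0 + 0.5 * x1)       # intercept and one slope differ

f_same = chow_f(X, y_same, group)
f_diff = chow_f(X, y_diff, group)
print("F without group differences:", f_same)
print("F with group differences:   ", f_diff)
```

Each value would be compared with a critical value of the F distribution with k + 1 = 3 and n − 2(k + 1) = 394 degrees of freedom.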

Reading: Chapter 6 (without Section 6.4) and Chapter 7 (without Sections 7.5 and 7.6) in Wooldridge (2009).


7 Multiple Regression Analysis: Prediction

7.1 Prediction and Prediction Error

• Consider the multiple regression model $y = X\beta + u$, i.e.

$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + u_i, \quad 1 \le i \le n.$$

• We search for a predictor $\hat{y}_0$ for $y_0$ given $x_{01}, \ldots, x_{0k}$.

• Define the prediction error

$$y_0 - \hat{y}_0.$$


• We assume that MLR.1 to MLR.5 hold for the prediction sample

(x0, y0). Then

y0 = β0 + β1x01 + · · · + βkx0k + u0 (7.1)

and

E[u0|x01, . . . , x0k] = 0,

so that

E[y0|x01, . . . , x0k] = β0 + β1x01 + · · · + βkx0k = x′0β,

where x′0 = (1, x01, . . . , x0k).

MLR.4 guarantees that for known parameters the predictions are unbiased. Then, the prediction is, loosely speaking, correct on average (if averaged over many samples).

It can be shown that the conditional expectation is optimal in the

sense of minimizing the mean squared prediction error.


• In practice, the true regression coefficients $\beta_j$, $j = 0, \ldots, k$, are unknown. Inserting the OLS estimators $\hat\beta_j$ gives

$$\hat{y}_0 = \hat{E}[y_0 | x_{01}, \ldots, x_{0k}] = \hat\beta_0 + \hat\beta_1 x_{01} + \cdots + \hat\beta_k x_{0k}.$$

Using compact notation the prediction rule is:

$$\hat{y}_0 = x_0'\hat\beta \qquad (7.2)$$

• This prediction rule only makes sense if $(y_0, x_0')$ belongs to the population as well. Otherwise the population regression model is not valid for $(y_0, x_0')$ and the prediction based on the estimated version is possibly strongly misleading.


• General decomposition of the prediction error

$$\begin{aligned}
\hat{u}_0 &= y_0 - \hat{y}_0 \qquad (7.3) \\
&= \underbrace{(y_0 - E[y_0|x_0])}_{\text{unavoidable error } v_0} + \underbrace{(E[y_0|x_0] - x_0'\beta)}_{\text{possible specification error}} + \underbrace{(x_0'\beta - x_0'\hat\beta)}_{\text{estimation error}}
\end{aligned}$$

– If MLR.1 and MLR.4 are correct for the population and if the prediction sample also belongs to the population, then the specification error is zero. Then $v_0 = u_0$ in (7.1).

– If the estimator is consistent, $\operatorname{plim} \hat\beta = \beta$, then the estimation error becomes negligible in large samples.


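– Using the OLS estimator, the estimation error is

$$\begin{aligned}
x_0'\beta - x_0'\hat\beta &= x_0'(\beta - \hat\beta) \\
&= x_0'\beta - x_0'\left((X'X)^{-1}X'y\right) \\
&= x_0'\beta - x_0'\left(\beta + (X'X)^{-1}X'u\right) \\
&= -x_0'(X'X)^{-1}X'u. \qquad (7.4)
\end{aligned}$$

Thus, the estimation error only depends on the estimation sample.

– The OLS prediction error under MLR.1 to MLR.5 is given by (using (7.3) and (7.4)):

$$\begin{aligned}
\hat{u}_0 &= u_0 + x_0'(\beta - \hat\beta) \\
&= u_0 - x_0'(X'X)^{-1}X'u. \qquad (7.5)
\end{aligned}$$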
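The identity (7.5) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the data, the "true" parameter vector, and the error values are made up by hand so that both sides of (7.5) can be computed, which is of course impossible with real data where the errors are unobserved.

```python
import numpy as np

# Hypothetical estimation sample: n = 5, intercept plus one regressor.
X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
beta = np.array([1.0, 2.0])                  # "true" parameters, assumed for the demo
u = np.array([0.5, -0.3, 0.1, -0.4, 0.2])    # errors, fixed by hand
y = X @ beta + u

# Hypothetical new observation to be predicted.
x0 = np.array([1.0, 6.0])
u0 = 0.25
y0 = x0 @ beta + u0

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS estimator
y0_hat = x0 @ beta_hat                         # prediction rule (7.2)

lhs = y0 - y0_hat                                   # prediction error
rhs = u0 - x0 @ np.linalg.solve(X.T @ X, X.T @ u)   # right-hand side of (7.5)
print(lhs, rhs)
```

Both printed numbers should agree up to floating-point rounding.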


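• Variance of the prediction error:

– Extension of Assumption MLR.2 (Random Sampling): $u_0$ and $u$ are uncorrelated.

– Conditional variance of (7.5) given $X$ and $x_0$:

$$\begin{aligned}
Var(\hat{u}_0|X, x_0) &= Var(u_0|X, x_0) + Var\left(x_0'(\beta - \hat\beta)|X, x_0\right) \\
&= \sigma^2 + x_0' Var(\beta - \hat\beta|X) x_0 \\
&= \sigma^2 + x_0' \sigma^2 (X'X)^{-1} x_0
\end{aligned}$$

or

$$Var(\hat{u}_0|X, x_0) = \sigma^2\left(1 + x_0'(X'X)^{-1}x_0\right). \qquad (7.6)$$

– Relevant in practice: Estimated variance of the prediction error

$$\widehat{Var}(\hat{u}_0|X, x_0) = \hat\sigma^2\left(1 + x_0'(X'X)^{-1}x_0\right).$$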


• Prediction interval: For the multiple regression model, a prediction interval (given an a priori chosen confidence probability $1 - \alpha$) is given by

$$\left[\hat{y}_0 - t_{n-k-1}\sqrt{\widehat{Var}(\hat{u}_0|X, x_0)}\;,\; \hat{y}_0 + t_{n-k-1}\sqrt{\widehat{Var}(\hat{u}_0|X, x_0)}\right].$$

Notes:

– Derivation and structure are analogous to the case of confidence

intervals for the parameter estimates.

– In contrast to confidence intervals, prediction intervals are valid, even in large samples, only if the prediction errors are normally distributed. This is because the true prediction error $u_0$ is not averaged, whereas $\hat\beta - \beta = Wu$ is an average of the errors to which the central limit theorem applies.
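The prediction rule (7.2), the estimated version of (7.6), and the prediction interval can be put together as follows. This is a sketch under stated assumptions: NumPy is available, the helper name `prediction_interval` and the data are hypothetical, and the critical value of the t distribution is supplied by the caller (taken from a table) rather than computed.

```python
import numpy as np

def prediction_interval(X, y, x0, t_crit):
    """Return (y0_hat, lower, upper) for the OLS prediction at x0.

    X must contain a column of ones; t_crit is the critical value of the
    t distribution with n-k-1 degrees of freedom, supplied by the caller.
    """
    n, p = X.shape                           # p = k + 1
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)     # estimated error variance
    var_pred = sigma2_hat * (1.0 + x0 @ XtX_inv @ x0)   # estimated version of (7.6)
    y0_hat = x0 @ beta_hat                   # prediction rule (7.2)
    half_width = t_crit * np.sqrt(var_pred)
    return y0_hat, y0_hat - half_width, y0_hat + half_width

# Hypothetical data: n = 5, one regressor.
X = np.column_stack([np.ones(5), [1.0, 2.0, 3.0, 4.0, 5.0]])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
x0 = np.array([1.0, 6.0])
# 3.182 is the 97.5% quantile of the t distribution with 3 degrees of freedom.
y0_hat, lower, upper = prediction_interval(X, y, x0, t_crit=3.182)
print(y0_hat, lower, upper)
```

Because of the "1 +" term in (7.6), the interval does not shrink to a point as the estimation sample grows; only the part due to estimation error vanishes.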


7.2 Statistical Properties of Linear Predictions

Evidently the prediction rule is linear (in $y$) since

$$\hat{y}_0 = x_0'\hat\beta = x_0'(X'X)^{-1}X'y.$$

Gauss-Markov property of linear prediction

If $\hat\beta$ is the BLU estimator for $\beta$, then

$$\hat{y}_0 = x_0'\hat\beta$$

is the BLU prediction rule. Among all linear prediction rules with a mean prediction error of zero it exhibits the smallest prediction error variance.

Reading: Section 6.4 in Wooldridge (2009).


8 Multiple Regression Analysis: Heteroskedasticity

• In this chapter Assumptions MLR.1 through MLR.4 continue to

hold.

• If MLR.5 fails to hold such that

$$Var(u_i|x_{i1}, \ldots, x_{ik}) = \sigma_i^2 \neq \sigma^2, \quad i = 1, \ldots, n,$$

the errors of the regression model exhibit heteroskedasticity. More precisely (instead of MLR.5) we have


– Assumption GLS.5: Heteroskedasticity

$$Var(u_i|x_{i1}, \ldots, x_{ik}) = \sigma_i^2(x_{i1}, \ldots, x_{ik}) = \sigma^2 h(x_{i1}, \ldots, x_{ik}) = \sigma^2 h_i, \quad i = 1, \ldots, n.$$

The error variance of the i-th sample observation $\sigma_i^2$ is a function $h(\cdot)$ of the regressors.

• Examples:

– The variance of net rents depends on the size of the flat.

– The variance of consumption expenditures depends on the level

of income.

– The variance of log hourly wages depends on years of education.


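• The covariance matrix of the errors of the regression is given by:

$$Var(u|X) = E[uu'|X] = \begin{pmatrix} \sigma^2 h_1 & 0 & \cdots & 0 \\ 0 & \sigma^2 h_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 h_n \end{pmatrix} = \sigma^2 \underbrace{\begin{pmatrix} h_1 & 0 & \cdots & 0 \\ 0 & h_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & h_n \end{pmatrix}}_{\Psi}.$$

Thus, we have

$$y = X\beta + u, \quad Var(u|X) = \sigma^2\Psi, \qquad (8.1)$$

which will be referred to as the original model in matrix notation.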


• When estimating models with heteroskedastic errors three cases

have to be distinguished:

1. Function h(·) is known, see Section 8.3.

2. Function h(·) is only partially known, see Section 8.4.

3. Function h(·) is completely unknown, see Section 8.2.

8.1 Consequences of Heteroskedasticity for OLS

• The OLS estimator is unbiased and consistent.

• Variance of the OLS estimator in the presence of heteroskedastic errors (compare Section 3.4.2):

From $\hat\beta - \beta = (X'X)^{-1}X'u$ it can be derived that

$$\begin{aligned}
Var(\hat\beta|X) &= E\left[(\hat\beta - \beta)(\hat\beta - \beta)'|X\right] \\
&= E\left[(X'X)^{-1}X'uu'X(X'X)^{-1}|X\right] \\
&= (X'X)^{-1}X' \underbrace{E[uu'|X]}_{\sigma^2\Psi} X(X'X)^{-1} \\
&= (X'X)^{-1}X'\sigma^2\Psi X(X'X)^{-1}. \qquad (8.2)
\end{aligned}$$

• Note that with homoskedastic errors one has $\Psi = I$. Then (8.2) yields the usual OLS covariance matrix, namely $\sigma^2(X'X)^{-1}$.

• If heteroskedasticity is present, using the usual covariance matrix $\sigma^2(X'X)^{-1}$ is misleading and leads to faulty inference.

• The problem with using (8.2) directly is that $\Psi$ is unknown. The next section introduces an appropriate estimator.


• Even if Ψ is known, OLS is not the best linear unbiased estimator,

and thus not efficient. One has to use the GLS estimator instead,

see Section 8.3.

8.2 Heteroskedasticity-Robust Inference after OLS

• Derivation of heteroskedasticity-robust standard errors

Let $x_i' = (1, x_{i1}, \ldots, x_{ik})$. Note that the middle term in the variance-covariance matrix (8.2) with dimension $(k + 1) \times (k + 1)$ can be written as

$$X'\sigma^2\Psi X = \sum_{i=1}^{n} \sigma^2 h_i x_i x_i'.$$

Because $E[u_i^2|X] = \sigma^2 h_i$, one can estimate $\sigma^2 h_i$ by the "one observation average" $u_i^2$. Of course this is not a good estimator but for the present purpose it is doing well enough. Since $u_i$ is not known, one takes the residual $\hat{u}_i$.

Hence one can estimate the covariance matrix (8.2) of the OLS estimator in presence of heteroskedasticity by

$$\widehat{Var}(\hat\beta|X) = (X'X)^{-1}\left(\sum_{i=1}^{n} \hat{u}_i^2 x_i x_i'\right)(X'X)^{-1}. \qquad (8.3)$$

• Comments:

– Standard errors obtained from (8.3) are called heteroskedasticity-robust standard errors, or White standard errors, named after Halbert White, an econometrician at the University of California in San Diego.


– For single βj heteroskedasticity-robust standard errors can be

smaller or larger than the usual OLS standard errors.

– If heteroskedasticity-robust standard errors are used, it can be shown that the OLS estimator $\hat\beta$ no longer has a known finite-sample distribution. However, it is asymptotically normally distributed. Thus, critical values and p-values remain approximately valid if (8.3) is used.

– The OLS estimator with White standard errors is unbiased and consistent since MLR.1 to MLR.4 are unaffected by heteroskedasticity.

– However, the OLS estimator is not efficient. Efficient estima-

tors will be presented in the next sections.
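The estimator (8.3) translates almost literally into matrix code. The sketch below assumes NumPy; the function name `white_standard_errors` and the data are hypothetical, and no small-sample correction is applied, so the numbers need not match those of a particular software package exactly.

```python
import numpy as np

def white_standard_errors(X, y):
    """OLS estimates with usual and heteroskedasticity-robust standard errors.

    Implements (8.3) without any small-sample correction.
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    # Middle term of (8.3): sum over i of u_hat_i^2 * x_i x_i'
    middle = (X * resid[:, None] ** 2).T @ X
    cov_white = XtX_inv @ middle @ XtX_inv
    cov_usual = (resid @ resid / (n - p)) * XtX_inv
    return beta_hat, np.sqrt(np.diag(cov_usual)), np.sqrt(np.diag(cov_white))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X = np.column_stack([np.ones_like(x), x])
# Hypothetical data whose scatter grows with x.
y = np.array([2.0, 4.1, 5.8, 8.4, 9.5, 12.9, 13.0, 17.2])
beta_hat, se_usual, se_white = white_standard_errors(X, y)
print(beta_hat, se_usual, se_white)
```

The coefficient estimates are the same under both variants; only the standard errors differ.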


8.3 The Generalized Least Squares (GLS) Estimator

• Original model (8.1):

$$y_i = \beta_0 + \beta_1 x_{i1} + \ldots + \beta_k x_{ik} + u_i, \qquad (8.4)$$

$$Var(u_i|x_{i1}, \ldots, x_{ik}) = \sigma^2 h(x_{i1}, \ldots, x_{ik}) = \sigma^2 h_i.$$

• Basic idea: Weighted estimation of (8.4):

Transformation of the initial model to a model that satisfies all assumptions, including MLR.5. This is achieved by standardizing, in effect, the regression error $u_i$: one divides $u_i$, and thus the whole regression equation (8.4), by the square root of $h_i$:

$$\underbrace{\frac{y_i}{\sqrt{h_i}}}_{y_i^*} = \beta_0 \underbrace{\frac{1}{\sqrt{h_i}}}_{x_{i0}^*} + \beta_1 \underbrace{\frac{x_{i1}}{\sqrt{h_i}}}_{x_{i1}^*} + \ldots + \beta_k \underbrace{\frac{x_{ik}}{\sqrt{h_i}}}_{x_{ik}^*} + \underbrace{\frac{u_i}{\sqrt{h_i}}}_{u_i^*}.$$


The resulting model is

$$y_i^* = \beta_0 x_{i0}^* + \beta_1 x_{i1}^* + \ldots + \beta_k x_{ik}^* + u_i^*. \qquad (8.5)$$

Note: For the transformed error $u_i^*$ we have

$$\begin{aligned}
Var(u_i^*|x_{i1}, \ldots, x_{ik}) &= Var\left(\left.\frac{u_i}{\sqrt{h_i}}\,\right| x_{i1}, \ldots, x_{ik}\right) \\
&= E\left[\left.\frac{u_i^2}{h_i}\,\right| x_{i1}, \ldots, x_{ik}\right] \\
&= \frac{1}{h_i} E[u_i^2|x_{i1}, \ldots, x_{ik}] = \frac{1}{h_i}\sigma^2 h_i = \sigma^2.
\end{aligned}$$

Result: We have transformed the original regression (8.4) in such a way that the homoskedasticity assumption MLR.5 holds for the resulting regression model (8.5).


• Therefore the OLS estimator based on the transformed model (8.5)

has all desirable properties: BLU (best linear unbiased).

• The OLS estimator of the transformed model (8.5) is based on the minimization of a weighted sum of squared residuals

$$\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{i1} - \ldots - \beta_k x_{ik})^2 / h_i.$$

Therefore, it is called a weighted least squares (WLS) procedure. Note that in its current form it requires that $h(\cdot)$ is known.

• The transformed model does not contain a constant term if $\sqrt{h_i}$ is not identical to one of the regressors in model (8.4).

• Next we derive the transformed model in matrix notation.


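• Explicit statement of $y^*$, $X^*$, and $u^*$ in matrix notation:

$$\underbrace{\begin{pmatrix} y_1^* \\ y_2^* \\ \vdots \\ y_n^* \end{pmatrix}}_{y^*} = \underbrace{\begin{pmatrix} h_1^{-1/2} & 0 & \cdots & 0 \\ 0 & h_2^{-1/2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & h_n^{-1/2} \end{pmatrix}}_{P} \underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}}_{y},$$

$$\underbrace{\begin{pmatrix} x_{10}^* & x_{11}^* & \cdots & x_{1k}^* \\ x_{20}^* & x_{21}^* & \cdots & x_{2k}^* \\ \vdots & \vdots & & \vdots \\ x_{n0}^* & x_{n1}^* & \cdots & x_{nk}^* \end{pmatrix}}_{X^*} = P \cdot \underbrace{\begin{pmatrix} 1 & x_{11} & \cdots & x_{1k} \\ 1 & x_{21} & \cdots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{pmatrix}}_{X}, \qquad \underbrace{\begin{pmatrix} u_1^* \\ u_2^* \\ \vdots \\ u_n^* \end{pmatrix}}_{u^*} = P \cdot \underbrace{\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}}_{u}$$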


• For the transformation matrix $P$ it holds that

$$P'P = \Psi^{-1}$$

and hence

$$E[uu'|X] = \sigma^2\Psi = \sigma^2(P'P)^{-1}.$$

• Therefore, the transformed model (8.5) in matrix notation is given by

$$Py = PX\beta + Pu,$$

or

$$y^* = X^*\beta + u^*, \quad E[u^*(u^*)'|X^*] = \sigma^2 I. \qquad (8.6)$$

• Obviously (8.6) is obtained by multiplying the original model (8.1) $y = X\beta + u$ by the transformation matrix $P$ from the left.

• What is the explicit formula for the OLS estimator in terms of the

transformed model (8.6) and the original model (8.1)?


GLS (generalized least squares) estimator

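• OLS estimation of (8.6) yields

$$\begin{aligned}
\hat\beta_{GLS} &= \left(X^{*\prime}X^*\right)^{-1}X^{*\prime}y^* \\
&= \left((PX)'PX\right)^{-1}(PX)'Py \\
&= \left(X'P'PX\right)^{-1}X'P'Py
\end{aligned}$$

and therefore

$$\hat\beta_{GLS} = \left(X'\Psi^{-1}X\right)^{-1}X'\Psi^{-1}y. \qquad (8.7)$$

$\hat\beta_{GLS}$ in (8.7) is called the GLS estimator.

In case of heteroskedasticity $\Psi$ is a diagonal matrix and each of the $n$ observations is weighted by $1/\sqrt{h_i}$.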
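That OLS on the transformed model (8.6) and the GLS formula (8.7) give the same estimate can be verified numerically. The following sketch assumes NumPy, made-up data, and an assumed known variance function $h_i = x_i^2$ chosen only for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([1.2, 2.9, 4.1, 6.5, 6.8, 9.9])   # hypothetical data
h = x ** 2                                      # assumed known variance function

# Route 1: transform with P = diag(h_i^{-1/2}) and run OLS on (8.6).
P = np.diag(h ** -0.5)
X_star, y_star = P @ X, P @ y
beta_transformed = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)

# Route 2: GLS formula (8.7) with Psi^{-1} = diag(1 / h_i).
Psi_inv = np.diag(1.0 / h)
beta_gls = np.linalg.solve(X.T @ Psi_inv @ X, X.T @ Psi_inv @ y)

print(beta_transformed, beta_gls)
```

In practice one would scale the rows of `X` and `y` directly instead of building the $n \times n$ matrix `P`; the explicit matrix is used here only to mirror the notation of the slides.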


• Properties for known $h(\cdot)$:

Under MLR.1 to MLR.4 and GLS.5 the GLS estimator $\hat\beta_{GLS}$

– is unbiased and consistent,

– is BLU (best linear unbiased), and thus efficient,

– has variance-covariance matrix $Var(\hat\beta_{GLS}|X) = \sigma^2\left(X'\Psi^{-1}X\right)^{-1}$,

– is unbiased and consistent even if $\Psi$ is misspecified since $\Psi$ is a function of $X$ and not of $u$ and thus

$$E[\hat\beta_{GLS} - \beta|X] = \left(X'\Psi^{-1}X\right)^{-1}X'\Psi^{-1}E[u|X] = 0.$$

As a consequence, OLS is inefficient since OLS and GLS are both

linear estimators. OLS variances are larger than or equal to those

of the GLS estimator. This can be shown using matrix algebra.
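The efficiency claim can be illustrated by comparing the OLS covariance matrix (8.2) with the GLS covariance matrix for one particular design. This is a numerical illustration, not a proof; it assumes NumPy, $\sigma^2 = 1$, and a made-up variance function $h_i = x_i^2$.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = np.column_stack([np.ones_like(x), x])
sigma2 = 1.0
h = x ** 2                                   # assumed known variance function
Psi, Psi_inv = np.diag(h), np.diag(1.0 / h)

XtX_inv = np.linalg.inv(X.T @ X)
var_ols = sigma2 * XtX_inv @ X.T @ Psi @ X @ XtX_inv    # equation (8.2)
var_gls = sigma2 * np.linalg.inv(X.T @ Psi_inv @ X)     # GLS covariance matrix

# If GLS is efficient, var_ols - var_gls is positive semidefinite.
eigenvalues = np.linalg.eigvalsh(var_ols - var_gls)
print(np.diag(var_ols), np.diag(var_gls), eigenvalues)
```

Non-negative eigenvalues of the difference imply in particular that each OLS variance on the diagonal is at least as large as the corresponding GLS variance.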

• Analogously to MLR.6 in Section 4.2 above, we assume


– Assumption GLS.6: Normal Distribution

$$u_i|x_i \sim N(0, \sigma^2 h_i), \quad i = 1, \ldots, n,$$

which, together with MLR.2 (Random Sampling), implies the multivariate normal distribution

$$u|X \sim N\left(0, \sigma^2\Psi\right).$$

Note that GLS.6 implies that $u_i$ given $x_i$ is independently but not identically distributed since the variance changes with $i$.

All test statistics based on the transformed model (8.6) and appropriately modified for the original model (8.1) exhibit the exact distributions of Chapter 4 (normal, t, F).

• Frequent problem in practice: $h_i$ is not known. In this case, the feasible GLS estimator has to be used → Case 2.


8.4 Feasible Generalized Least Squares (FGLS)

• In general, the variance function hi is not known and has to be

estimated. Frequently neither the relevant factors nor the functional

relationship are known.

• Hence, one needs a specification that flexibly captures a large range

of possibilities, e.g.

$$h_i = h(x_{i1}, \ldots, x_{ik}) = \exp(\delta_1 x_{i1} + \ldots + \delta_k x_{ik})$$

and thus

$$Var(u_i|x_{i1}, x_{i2}, \ldots, x_{ik}) = \sigma^2 h_i = \sigma^2 \exp(\delta_1 x_{i1} + \ldots + \delta_k x_{ik}).$$

Remark: On pp. 282, Wooldridge (2009) additionally includes the factor $\exp(\delta_0)$ in $h_i$. As this factor is constant, it can also be captured by $\sigma^2$.


• How can one estimate the unknown parameters $\delta_1, \ldots, \delta_k$?

Standardizing $u_i$ delivers $v_i = u_i/(\sigma\sqrt{h_i})$ with $E[v_i|X] = 0$ and $Var(v_i|X) = 1$. Therefore $u_i = \sigma\sqrt{h_i}\,v_i$ and

$$u_i^2 = \sigma^2 h_i v_i^2, \quad i = 1, \ldots, n.$$

Taking logarithms leads to

$$\begin{aligned}
\ln u_i^2 &= \ln\sigma^2 + \ln h_i + \ln v_i^2 \\
&= \ln\sigma^2 + \ln\exp(\delta_1 x_{i1} + \cdots + \delta_k x_{ik}) + \ln v_i^2 \\
&= \underbrace{\ln\sigma^2 + E[\ln v_i^2]}_{\alpha_0} + \delta_1 x_{i1} + \cdots + \delta_k x_{ik} + \underbrace{\ln v_i^2 - E[\ln v_i^2]}_{e_i}, \\
\ln u_i^2 &= \alpha_0 + \delta_1 x_{i1} + \cdots + \delta_k x_{ik} + e_i. \qquad (8.8)
\end{aligned}$$

For the regression equation (8.8) the assumptions MLR.1-MLR.4 are satisfied. Hence, the OLS estimator for $\delta_j$ is unbiased and consistent.


In practice, the $u_i^2$ in the variance regression (8.8) are replaced by the squared OLS residuals $\hat{u}_i^2$ from the sample regression $y = X\hat\beta + \hat{u}$ of (8.1). The resulting $\hat\delta_j$ are used to get the fitted values $\hat{h}_i$, which are inserted into the GLS estimator (8.7) in step II.

• Outline of the FGLS-method:

Step I

a) Regress $y$ on $X$ and compute the residual vector $\hat{u}$ by OLS estimation of the original specification (8.1).

b) Calculate $\ln \hat{u}_i^2$, $i = 1, \ldots, n$, that are used as regressand in the variance regression (8.8).

c) Estimate the variance regression (8.8) by OLS.

d) Compute $\hat{h}_i = \exp\left(\hat\delta_1 x_{i1} + \cdots + \hat\delta_k x_{ik}\right)$, $i = 1, \ldots, n$.


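Step II

The FGLS estimator $\hat\beta_{FGLS}$ is obtained analogously to the GLS procedure. The original regression (8.1) is multiplied from the left with the matrix

$$\hat{P} = \begin{pmatrix} \hat{h}_1^{-1/2} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \hat{h}_n^{-1/2} \end{pmatrix}.$$

This delivers a variant of the transformed regression

$$y^{\#} = X^{\#}\beta + u^{\#}. \qquad (8.9)$$

Hence, OLS estimation of (8.9) leads to the FGLS estimator

$$\hat\beta_{FGLS} = \left(X'\hat\Psi^{-1}X\right)^{-1}X'\hat\Psi^{-1}y, \qquad (8.10)$$

with $\hat\Psi^{-1} = \hat{P}'\hat{P}$.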
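Steps I and II can be sketched compactly in Python. This is a minimal illustration, assuming NumPy and the exponential variance function of this section; the helper names `ols` and `fgls` and the data are hypothetical, and the code is not a translation of any particular software routine.

```python
import numpy as np

def ols(X, y):
    """Return the OLS coefficient vector."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def fgls(X, y):
    """Two-step FGLS with h_i = exp(delta_1 x_i1 + ... + delta_k x_ik).

    X must contain a column of ones as its first column.
    Returns (beta_fgls, h_hat).
    """
    # Step I a): OLS residuals of the original model (8.1).
    resid = y - X @ ols(X, y)
    # Step I b), c): regress ln(u_hat^2) on the regressors, equation (8.8).
    coef = ols(X, np.log(resid ** 2))
    # Step I d): fitted variance function; the intercept alpha_0 is left out
    # because a constant factor is absorbed by sigma^2.
    h_hat = np.exp(X[:, 1:] @ coef[1:])
    # Step II: OLS on the model weighted by h_hat^{-1/2}, equation (8.10).
    w = h_hat ** -0.5
    beta_fgls = ols(X * w[:, None], y * w)
    return beta_fgls, h_hat

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([2.0, 4.1, 5.8, 8.4, 9.5, 12.9, 13.0, 17.2])   # hypothetical data
beta_fgls, h_hat = fgls(X, y)
print(beta_fgls, h_hat)
```

Including the estimated intercept of the variance regression in `h_hat` would rescale all weights by the same constant and therefore leave the coefficient estimates unchanged.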


• Estimation properties of the FGLS estimators:

– They are consistent, that is, they converge in probability to the true parameters for $n \to \infty$:

$$\operatorname{plim} \hat\beta_{FGLS} = \beta.$$

– The FGLS estimator is asymptotically efficient: For a cor-

rectly specified hi and a sufficiently large sample, the FGLS esti-

mator is preferable to the OLS estimator as the former one has a

lower estimation-variance. (This is plausible, as FGLS also uses

information on the functional form of the heteroskedasticity while

OLS with heteroskedasticity-robust standard errors does not.)

– If the variance function hi is misspecified, then the FGLS esti-

mator is inefficient.

– Be aware that there may be considerable differences between the FGLS estimates and the OLS estimates.

• Comparing OLS with heteroskedasticity-robust standard

errors and FGLS

– If you know something about the variance function hi, then

FGLS is preferable. If you have no idea about it, then OLS with

heteroskedasticity-robust standard errors may be better.

– It is always a good idea to run an OLS regression also with

heteroskedasticity-robust standard errors in order to see

whether the significance of parameters depends on the presence

of heteroskedasticity.

– Since any estimator taking into account heteroskedasticity should

be avoided if there is no heteroskedasticity, one should test for

the presence of heteroskedasticity, see Section 9.2.


• Trade Example Continued

– Consider Model 5 of Section 6.3 and compare OLS estimates,

FGLS estimates, and OLS estimates with heteroskedasticity-robust

standard errors.

– EViews program to run OLS, FGLS with both steps, and OLS

with White standard errors, and scatter plots of residuals against

fitted values for both estimators.

'EViews program for FGLS estimation, Chapter 8 Heteroskedasticity
'RT, 2009_02_26
'requires workfile IMPORTS_KAZAKHSTAN_2004

' define variables
' define log of dependent variable
genr log_imp = log(trade_0_d_o)

' define group of regressors (right hand side variables)
' Model 5
group rhs_model5_base c log(wdi_gdpusdcr_o) (log(wdi_gdpusdcr_o))^2 log(cepii_dist) cepii_comcol_rev cepii_comlang_off

' potential regressors to add
group rhs_model5_add log(cepii_area_o) log(weo_pop_o) cepii_comlang_ethno_rev cepii_col45_rev cepii_contig

' rhs_model5_add (can be added)
group rhs_model5 rhs_model5_base

' Step I: OLS regression
' ols regression
equation eq_ols_model5.ls log_imp rhs_model5
' compute residuals
eq_ols_model5.makeresids res_ols_model5
' compute fitted values
eq_ols_model5.fit fit_ols_model5
' plot residuals versus fitted log imports in order to check for heteroskedasticity
group group_ols_res fit_ols_model5 res_ols_model5
group_ols_res.scat

' Step II: FGLS regression
' square residuals and take logs
genr ln_u_hat_sq = log(res_ols_model5^2)
' estimate variance equation
equation eq_h_model5.ls ln_u_hat_sq rhs_model5
' fitted values of the log squared residuals
eq_h_model5.fit ln_u_hat_sq_hat
' compute exponential of fitted values of variance regression
genr h_hat = exp(ln_u_hat_sq_hat)
' estimate FGLS using w=h_hat^(-1/2)
equation eq_fgls_model5.ls(w=h_hat^(-1/2)) log_imp rhs_model5
' compute fitted values based on FGLS
eq_fgls_model5.fit fit_fgls_model5
' compute residuals based on FGLS
eq_fgls_model5.makeresids res_fgls_model5
' standardize residuals with weight function
series res_fgls_model5_star = res_fgls_model5*h_hat^(-1/2)
' plot residuals versus fitted
group group_fgls_res fit_fgls_model5 res_fgls_model5_star
group_fgls_res.scat

' OLS regression with heteroskedasticity-robust standard errors
equation eq_white_model5.ls(h) log_imp rhs_model5


– OLS output with usual standard errors

====================================================================

Dependent Variable: LOG_IMP

Method: Least Squares

Sample: 1 55; Included observations: 52

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C -85.82647 27.89947 -3.076276 0.0035

LOG(WDI_GDPUSDCR_O) 7.087930 2.186465 3.241731 0.0022

(LOG(WDI_GDPUSDCR_O))^2 -0.111982 0.042366 -2.643219 0.0112

LOG(CEPII_DIST) -0.761431 0.513375 -1.483187 0.1448

CEPII_COMCOL_REV 3.911289 0.673603 5.806523 0.0000

CEPII_COMLANG_OFF 2.312292 1.244898 1.857414 0.0697

====================================================================

R-squared 0.751895 Mean dependent var 15.97292

Adjusted R-squared 0.724927 S.D. dependent var 2.613094

S.E. of regression 1.370499 Akaike info criterion 3.576394

Sum squared resid 86.40031 Schwarz criterion 3.801537

Log likelihood -86.98624 Hannan-Quinn criter. 3.662709

F-statistic 27.88113 Durbin-Watson stat 2.094393

Prob(F-statistic) 0.000000

====================================================================


– FGLS - Step I: estimate variance regression (8.8)

====================================================================

Dependent Variable: LN_U_HAT_SQ

Method: Least Squares

Sample: 1 55, Included observations: 52

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C 10.63404 44.84623 0.237122 0.8136

LOG(WDI_GDPUSDCR_O) -0.848029 3.514572 -0.241290 0.8104

(LOG(WDI_GDPUSDCR_O))^2 0.004631 0.068099 0.068007 0.9461

LOG(CEPII_DIST) 0.863174 0.825210 1.046006 0.3010

CEPII_COMCOL_REV -0.946104 1.082764 -0.873786 0.3868

CEPII_COMLANG_OFF 1.187134 2.001077 0.593248 0.5559

====================================================================

R-squared 0.177909 Mean dependent var -0.847962

Adjusted R-squared 0.088551 S.D. dependent var 2.307505

S.E. of regression 2.202971 Akaike info criterion 4.525657

Sum squared resid 223.2417 Schwarz criterion 4.750801

Log likelihood -111.6671 Hannan-Quinn criter. 4.611972

F-statistic 1.990976 Durbin-Watson stat 1.711027

Prob(F-statistic) 0.097787

====================================================================


– Estimate FGLS - Step II: estimate (8.10)

====================================================================

Dependent Variable: LOG_IMP

Method: Least Squares

Sample: 1 55, Included observations: 52

Weighting series: H_HAT^(-1/2)

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C -88.19314 22.59938 -3.902459 0.0003

LOG(WDI_GDPUSDCR_O) 7.425860 1.675365 4.432383 0.0001

(LOG(WDI_GDPUSDCR_O))^2 -0.117133 0.030951 -3.784479 0.0004

LOG(CEPII_DIST) -1.107230 0.347434 -3.186879 0.0026

CEPII_COMCOL_REV 3.704807 0.678132 5.463250 0.0000

CEPII_COMLANG_OFF 2.012321 0.989031 2.034640 0.0477

====================================================================

Weighted Statistics

====================================================================

R-squared 0.780089 Mean dependent var 16.88216

Adjusted R-squared 0.756186 S.D. dependent var 10.32249

S.E. of regression 1.034215 Akaike info criterion 3.013329

Sum squared resid 49.20159 Schwarz criterion 3.238472

Log likelihood -72.34654 Hannan-Quinn criter. 3.099643


F-statistic 32.63518 Durbin-Watson stat 2.577604

Prob(F-statistic) 0.000000

====================================================================

Unweighted Statistics

====================================================================

R-squared 0.746834 Mean dependent var 15.97292

Adjusted R-squared 0.719316 S.D. dependent var 2.613094

S.E. of regression 1.384409 Sum squared resid 88.16300

Durbin-Watson stat 2.055925

====================================================================


– OLS with heteroskedasticity-robust standard errors

====================================================================

Dependent Variable: LOG_IMP

Method: Least Squares

Sample: 1 55, Included observations: 52

White Heteroskedasticity-Consistent Standard Errors & Covariance

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C -85.82647 28.84426 -2.975513 0.0046

LOG(WDI_GDPUSDCR_O) 7.087930 2.230037 3.178391 0.0026

(LOG(WDI_GDPUSDCR_O))^2 -0.111982 0.041483 -2.699469 0.0097

LOG(CEPII_DIST) -0.761431 0.456362 -1.668480 0.1020

CEPII_COMCOL_REV 3.911289 0.777904 5.027986 0.0000

CEPII_COMLANG_OFF 2.312292 0.789404 2.929161 0.0053

====================================================================

R-squared 0.751895 Mean dependent var 15.97292

Adjusted R-squared 0.724927 S.D. dependent var 2.613094

S.E. of regression 1.370499 Akaike info criterion 3.576394

Sum squared resid 86.40031 Schwarz criterion 3.801537

Log likelihood -86.98624 Hannan-Quinn criter. 3.662709

F-statistic 27.88113 Durbin-Watson stat 2.094393

Prob(F-statistic) 0.000000

====================================================================


– Diagnostic plots: (standardized) residuals against fitted values

[Figure: two scatter plots. First panel: OLS residuals RES_OLS_MODEL5 against fitted values FIT_OLS_MODEL5. Second panel: standardized FGLS residuals RES_FGLS_MODEL5_STAR against fitted values FIT_FGLS_MODEL5.]


– Output table for Model 4 and Model 5 using various

estimators (compare Section 4.8):


Dependent Variable: ln(imports to Kazakhstan)

Independent Variables / Model      (4)-OLS    (5)-OLS    (5)-FGLS

constant                           -13.027    -85.827    -88.193
                                   (4.726)   (27.900)   (22.599)
                                   [4.778]   [28.844]

ln(gdp)                              1.318      7.088      7.426
                                   (0.129)    (2.187)    (1.675)
                                   [0.158]    [2.230]

(ln(gdp))^2                            —       -0.112     -0.117
                                              (0.042)    (0.031)
                                              [0.042]

ln(distance)                        -0.625     -0.761     -1.107
                                   (0.542)    (0.513)    (0.347)
                                   [0.438]    [0.456]

common "colonizer" since 1945        3.351      3.911      3.705
                                   (0.679)    (0.674)    (0.678)
                                   [0.782]    [0.778]

common official language             2.110      2.312      2.012
                                   (1.319)    (1.245)    (0.989)
                                   [1.020]    [0.789]

Number of observations                  52         52         52
R^2                                  0.714      0.752      0.747
Standard error of regression         1.455      1.371      1.384
Sum of squared residuals            99.523      86.40      88.16
AIC                                 3.6793      3.576
HQ                                  3.7513      3.663
SC                                  3.8670      3.802

Notes: OLS or FGLS standard errors in parentheses, White standard errors in brackets.


– Results and Interpretation:

∗ The OLS and FGLS parameter estimates for ln(distance) and common official language are quite different: the FGLS estimates imply less impact of language and more impact of distance. In addition, log(distance) is statistically significant in the FGLS estimate but insignificant in the OLS estimate with heteroskedasticity-robust standard errors. This makes the FGLS estimates more plausible.

∗ When heteroskedasticity is taken into account, no parameter is insignificant at the 5% significance level based on FGLS, and only log(distance) is insignificant based on OLS with heteroskedasticity-robust standard errors.

∗ Comparing the usual OLS standard errors and the White standard errors shows that the multicollinearity problem for the parameter estimates of distance and language decreases.

∗ Inspecting the scatter plots of OLS and standardized FGLS residuals against fitted values does not automatically suggest heteroskedasticity. Thus heteroskedasticity tests are useful, see Section 9.2.
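The significance statements above can be checked directly from the output table by forming t-ratios (coefficient divided by standard error). The following is a minimal sketch: the numbers are copied from the Model 5 columns of the table, while the critical value `T_CRIT` is an assumed approximate two-sided 5% value of the t distribution with 46 degrees of freedom (52 observations minus 6 parameters), and all variable names are chosen here for illustration.

```python
# Coefficients and standard errors copied from the Model 5 columns of the table.
# T_CRIT is an assumed approximate 5% two-sided critical value (46 degrees of freedom).
T_CRIT = 2.013

names = ["constant", "ln(gdp)", "(ln(gdp))^2", "ln(distance)",
         "common colonizer", "common official language"]
coef_ols = [-85.827, 7.088, -0.112, -0.761, 3.911, 2.312]
se_white = [28.844, 2.230, 0.042, 0.456, 0.778, 0.789]   # White standard errors
coef_fgls = [-88.193, 7.426, -0.117, -1.107, 3.705, 2.012]
se_fgls = [22.599, 1.675, 0.031, 0.347, 0.678, 0.989]    # FGLS standard errors


def insignificant(coefs, ses, crit=T_CRIT):
    """Return the regressor names whose absolute t-ratio does not exceed crit."""
    return [n for n, b, s in zip(names, coefs, ses) if abs(b / s) <= crit]


insig_ols_white = insignificant(coef_ols, se_white)
insig_fgls = insignificant(coef_fgls, se_fgls)
print("OLS with White s.e., insignificant:", insig_ols_white)
print("FGLS, insignificant:", insig_fgls)
```

Because the table entries are rounded to three decimals, the t-ratios are only approximate; the FGLS t-ratio for common official language is close to the critical value.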

• Cigarette Example (Wooldridge 2009, Example 8.7) (with EViews outputs/commands):

Step I

1. OLS estimation

Dependent Variable: CIGS, Method: Least Squares, Sample: 1 807

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -3.639855     24.07866     -0.151165     0.8799
LINCOME       0.880268     0.727783      1.209520     0.2268
LCIGPRIC     -0.750855     5.773343     -0.130056     0.8966
EDUC         -0.501498     0.167077     -3.001596     0.0028
AGE           0.770694     0.160122      4.813155     0.0000
AGE^2        -0.009023     0.001743     -5.176494     0.0000
RESTAURN     -2.825085     1.111794     -2.541016     0.0112

R-squared 0.052737, Adjusted R-squared 0.045632, Akaike info crit. 8.037737

Mean dependent var 8.686493, S.D. dependent var 13.72152, Schwarz criterion 8.078448

S.E. of regression 13.40479, Sum squared resid 143750.7, Log likelihood -3236.227

F-statistic 7.423062, Prob(F-statistic) 0.000000, Durbin-Watson stat 2.012825

2. Save the residuals using

genr u_hat = resid

3. Take the logarithm of the squared residuals using

genr ln_u_sq = log(u_hat^2)


4. Estimation of the variance regression (8.8) with OLS yields

Dependent Variable: LN_U_SQ, Method: Least Squares, Sample: 1 807

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -1.920704     2.563034     -0.749387     0.4538
LINCOME       0.291541     0.077468      3.763351     0.0002
LCIGPRIC      0.195421     0.614539      0.317996     0.7506
EDUC         -0.079704     0.017784     -4.481656     0.0000
AGE           0.204005     0.017044     11.96927      0.0000
AGE^2        -0.002392     0.000186    -12.89313      0.0000
RESTAURN     -0.627012     0.118344     -5.298212     0.0000

R-squared 0.247362, Adjusted R-squared 0.241717, Akaike info crit. 3.557469

Mean dependent var 4.207485, S.D. dependent var 1.638575, Schwarz criterion 3.598179

S.E. of regression 1.426862, Sum squared resid 1628.749, Log likelihood -1428.439

F-statistic 43.82126, Prob(F-statistic) 0.000000, Durbin-Watson stat 2.024587

– Save the fitted values $\hat{h}_i$, $i = 1, \dots, n$, using

genr h_hat = exp(ln_u_sq - resid)


Step II

Weighted LS estimate (see Options) with weights h_hat^(-1/2)

Dependent Variable: CIGS, Method: Least Squares, Sample: 1 807, Weighting series: WEIGHTS

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C             5.635433     17.80314      0.316541     0.7517
LINCOME       1.295240     0.437012      2.963855     0.0031
LCIGPRIC     -2.940305     4.460145     -0.659240     0.5099
EDUC         -0.463446     0.120159     -3.856953     0.0001
AGE           0.481948     0.096808      4.978378     0.0000
AGE^2        -0.005627     0.000939     -5.989707     0.0000
RESTAURN     -3.461064     0.795505     -4.350776     0.0000

Weighted Statistics

R-squared 0.002751, Adjusted R-squared -0.004728, Akaike info crit. 7.765025

Mean dependent var 7.158227, S.D. dependent var 11.66855, Schwarz criterion 7.805736

S.E. of regression 11.69611, Sum squared resid 109439.1, Log likelihood -3126.188

F-statistic 17.05549, Prob(F-statistic) 0.000000, Durbin-Watson stat 2.049719

Unweighted Statistics

R-squared 0.045739, Adjusted R-squared 0.038582, S.E. of regression 13.45421

Mean dependent var 8.686493, S.D. dependent var 13.72152, Sum squared resid 144812.7

(Remark: The Unweighted Statistics are based on the residuals $y - X\hat{\beta}_{GLS}$; see the EViews help file.)

– Compare with the OLS estimator based on White standard errors.
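The two-step procedure above (OLS, regression of the log squared residuals, weighting with $\hat{h}_i^{-1/2}$) can be sketched outside EViews as follows. This is a minimal illustration in plain Python on simulated data, not the cigarette data set: the helper names `ols`, `fitted` and `fgls`, the simulated design and the true parameter values are all assumptions made for this sketch.

```python
import math
import random


def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fitted(X, b):
    """Fitted values X b."""
    return [sum(xj * bj for xj, bj in zip(r, b)) for r in X]


def fgls(X, y):
    # Step I.1-2: OLS and residuals (genr u_hat = resid)
    b_ols = ols(X, y)
    u_hat = [yi - fi for yi, fi in zip(y, fitted(X, b_ols))]
    # Step I.3: log of squared residuals (genr ln_u_sq = log(u_hat^2))
    ln_u_sq = [math.log(u * u) for u in u_hat]
    # Step I.4: variance regression; h_hat = exp(fitted values)
    delta = ols(X, ln_u_sq)
    h_hat = [math.exp(f) for f in fitted(X, delta)]
    # Step II: OLS on data weighted with h_hat^(-1/2)
    w = [h ** -0.5 for h in h_hat]
    Xw = [[wi * xj for xj in r] for wi, r in zip(w, X)]
    yw = [wi * yi for wi, yi in zip(w, y)]
    return ols(Xw, yw), h_hat


# Simulated example (hypothetical): y = 1 + 2x + u with Var(u|x) = exp(x)
random.seed(0)
X, y = [], []
for _ in range(500):
    x = random.uniform(0.0, 2.0)
    X.append([1.0, x])
    y.append(1.0 + 2.0 * x + math.exp(0.5 * x) * random.gauss(0.0, 1.0))

b_fgls, h_hat = fgls(X, y)
print("FGLS estimates (constant, slope):", b_fgls)
```

Note that `exp(ln_u_sq - resid)` in the EViews command is simply the exponential of the fitted values of the variance regression, which is what `h_hat` holds here. The sketch returns point estimates only; standard errors are omitted.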


9 Multiple Regression Analysis: Model Diagnostics

9.1 The RESET Test

RESET Test (regression specification error test)

Idea and implementation:

• If the original model

$$y = x_0\beta_0 + \dots + x_k\beta_k + u = x'\beta + u$$

satisfies assumption MLR.4, $E[u|x_0, \dots, x_k] = 0$, it holds that

$$E[y|x_0, \dots, x_k] = x_0\beta_0 + \dots + x_k\beta_k = x'\beta.$$

• Then, any further term added to the model should not be significant. Thus, any nonlinear function of the independent variables should be insignificant.

• Thus, the null hypothesis of the RESET test is formulated such that one can test the significance of nonlinear functions of the fitted values $\hat{y} = x'\hat{\beta}$ that are added to the model. Note that the fitted values are a linear function of the regressors of the original specification.


• In practice, it has turned out that for implementing the RESET test it is sufficient to include only the quadratic and cubic terms of $\hat{y}$:

$$y = x'\beta + \alpha\hat{y}^2 + \gamma\hat{y}^3 + \varepsilon.$$

The pair of hypotheses is

$H_0: \alpha = 0,\ \gamma = 0$ (linear model is correctly specified)

$H_1: \alpha \neq 0$ and/or $\gamma \neq 0$.

The null hypothesis is tested using an F test with 2 degrees of freedom in the numerator and $n - k - 3$ in the denominator.

• Be aware that the null hypothesis may also be rejected because of

omitting relevant regressor variables.
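The RESET procedure described above can be sketched as follows: estimate the original model, add $\hat{y}^2$ and $\hat{y}^3$, and compare the sums of squared residuals with an F statistic. This is a minimal illustration in plain Python; the helper names `ols` and `reset_f` and the simulated data (a truly quadratic relationship fitted by a linear model) are assumptions of this sketch, not part of the course material.

```python
import random


def ols(X, y):
    """OLS via the normal equations; returns (coefficients, fitted values, SSR)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    b = [A[i][k] / A[i][i] for i in range(k)]
    fit = [sum(xj * bj for xj, bj in zip(r, b)) for r in X]
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fit))
    return b, fit, ssr


def reset_f(X, y):
    """RESET F statistic; X must contain the constant as one column (k + 1 columns)."""
    n, cols = len(X), len(X[0])
    _, y_hat, ssr_r = ols(X, y)                      # restricted (original) model
    X_aug = [r + [f ** 2, f ** 3] for r, f in zip(X, y_hat)]
    _, _, ssr_ur = ols(X_aug, y)                     # model with y_hat^2 and y_hat^3
    # 2 numerator and n - (k + 1) - 2 = n - k - 3 denominator degrees of freedom
    return ((ssr_r - ssr_ur) / 2.0) / (ssr_ur / (n - cols - 2))


# Hypothetical data: the true relationship is quadratic, the fitted model is linear
random.seed(1)
X, y = [], []
for _ in range(200):
    x = random.uniform(-1.0, 1.0)
    X.append([1.0, x])
    y.append(1.0 + x + 2.0 * x * x + random.gauss(0.0, 0.1))

F = reset_f(X, y)
print("RESET F statistic:", F)
```

With the neglected quadratic term, the F statistic is far above conventional critical values of the F distribution with 2 and n − k − 3 degrees of freedom, so the linear specification would be rejected.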


9.2 Heteroskedasticity Tests

• As already noted, it does not make sense to "automatically" use the FGLS estimator. If the errors are homoskedastic, the OLS estimator with OLS standard errors should be used.

• Thus, one should test whether there is statistical evidence for heteroskedasticity.

• In the following, two different tests for heteroskedasticity are discussed: the Breusch-Pagan test and the White test. For both, the null hypothesis is "homoskedastic errors".

• Both tests are implemented in EViews 6.0. Note, however, that for the White test without cross terms the level variables are included only in earlier EViews versions, not in EViews 6.0.


It is assumed that for the multiple linear regression

$$y = \beta_0 + x_1\beta_1 + \dots + x_k\beta_k + u$$

assumptions MLR.1 to MLR.4 hold.

The pair of hypotheses that has to be tested is

$H_0: \mathrm{Var}(u_i|x_i) = \sigma^2$ (homoskedasticity),

$H_1: \mathrm{Var}(u_i|x_i) = \sigma_i^2 \neq \sigma^2$ (heteroskedasticity).

The general idea underlying heteroskedasticity tests is that under the null hypothesis no regressor should have any explanatory power for $\mathrm{Var}(u_i|x_i)$. If the null hypothesis is not true, $\mathrm{Var}(u_i|x_i)$ can be a (nearly arbitrary) function of the regressors $x_j$, $1 \leq j \leq k$.

Note: The Breusch-Pagan test and the White test differ with respect to the specification of their alternative hypothesis.


Breusch-Pagan Test

• Idea: Consider the regression

$$u_i^2 = \delta_0 + \delta_1 x_{i1} + \dots + \delta_k x_{ik} + v_i, \quad i = 1, \dots, n. \tag{9.1}$$

Under assumptions MLR.1 to MLR.4 the OLS estimator for the $\delta_j$'s is unbiased.

The pair of hypotheses is:

$H_0: \delta_1 = \delta_2 = \dots = \delta_k = 0$ versus

$H_1: \delta_1 \neq 0$ and/or $\delta_2 \neq 0$ and/or $\dots$,

since under $H_0$ it holds that $E[u_i^2|X] = \delta_0$.


• Difference from the previous application of the F test:

– The squared errors $u_i^2$ are by no means normally distributed, since squared quantities cannot take negative values. Hence, the $v_i$ cannot be normally distributed and the F distribution of the F statistic does not hold exactly in finite samples. However, the central limit theorem (CLT) works here as well, see Section 5.2, and the F statistic approximately follows an F distribution in large samples.

– The errors $u_i$ are unknown. They can be replaced by the OLS residuals $\hat{u}_i$. In doing so, the F test remains asymptotically valid (the proof is technically demanding).


• The $R^2$ version of the test statistic can be used. Note that for a regression including only a constant, it holds that $R^2 = 0$ since SSR = SST (there are no regressors that show a variation). Denote the coefficient of determination of the OLS estimation of (9.1) by $R^2_{\hat{u}^2}$. Then

$$F = \frac{R^2_{\hat{u}^2}/k}{(1 - R^2_{\hat{u}^2})/(n - k - 1)}.$$

The F statistic for testing the joint significance of all regressors is generally given by the appropriate software.

• H0 is rejected if F exceeds the critical value for a chosen significance

level (or equivalently if the p-value is smaller than the significance

level).


• Cigarette Example Continued (from Section 8.4):

Dependent Variable: U_HAT_SQ, Method: Least Squares, Sample: 1 807

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            -636.3031     652.4946     -0.975185     0.3298
LINCOME       24.63849     19.72180      1.249302     0.2119
LCIGPRIC      60.97656     156.4487      0.389754     0.6968
EDUC         -2.384226     4.527535     -0.526606     0.5986
AGE           19.41748     4.339068      4.475034     0.0000
AGE^2        -0.214790     0.047234     -4.547398     0.0000
RESTAURN     -71.18138     30.12789     -2.362641     0.0184

R-squared 0.039973, Adjusted R-squared 0.032773, Akaike info crit. 14.63669

Mean dependent var 178.1297, S.D. dependent var 369.3519, Schwarz criterion 14.67740

S.E. of regression 363.2491, Sum squared resid 1.06E+08, Log likelihood -5898.905

F-statistic 5.551687, Prob(F-statistic) 0.000012, Durbin-Watson stat 1.937302

The F statistic for the above $H_0$ is 5.55 and the corresponding p-value is smaller than 1%. The null hypothesis of homoskedastic errors is thus rejected at the 1% level.
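As a quick check, the reported F statistic can be reproduced from the $R^2$ of the regression above using the $R^2$ version of the test statistic. A small sketch, where the function name `bp_f_stat` is chosen here for illustration and the inputs are the rounded values from the output:

```python
def bp_f_stat(r2, n, k):
    """F statistic for joint significance of k regressors, computed from R-squared."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Values from the regression of the squared residuals: R-squared 0.039973,
# n = 807 observations, k = 6 regressors besides the constant.
f_cig = bp_f_stat(0.039973, n=807, k=6)
print(round(f_cig, 2))
```

The result agrees with the F-statistic line of the output up to rounding of the reported $R^2$.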


• Note:

– If one conjectures that the heteroskedasticity is caused by specific variables that have not been included previously, they can be included in regression (9.1).

– If $H_0$ is not rejected, this does not automatically mean that the $u_i$'s are homoskedastic. If the specification (9.1) does not contain all relevant variables causing heteroskedasticity, then it may happen that all $\delta_j$, $j = 1, \dots, k$, are jointly insignificant.

– A variant of the Breusch-Pagan test is a test for multiplicative heteroskedasticity, i.e. the variance is of the form $\sigma_i^2 = \sigma^2 \cdot h(x_i'\beta)$. If, for example, the case $h(\cdot) = \exp(\cdot)$ is assumed, the test equation $\ln(u_i^2) = \ln(\sigma^2) + x_i'\beta + v_i$ results.
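The step from the variance function to this test equation can be made explicit. The following derivation is a sketch under the added assumption that $u_i = \sigma_i\varepsilon_i$ with $\varepsilon_i$ independent of $x_i$ and $E[\varepsilon_i^2] = 1$; the symbols $\varepsilon_i$ and $c$ are introduced here for illustration only.

```latex
\begin{align*}
u_i^2 &= \sigma_i^2\,\varepsilon_i^2 = \sigma^2 \exp(x_i'\beta)\,\varepsilon_i^2, \\
\ln(u_i^2) &= \ln(\sigma^2) + x_i'\beta + \ln(\varepsilon_i^2) \\
           &= \bigl[\ln(\sigma^2) + c\bigr] + x_i'\beta + v_i,
\qquad c = E[\ln(\varepsilon_i^2)], \quad v_i = \ln(\varepsilon_i^2) - c.
\end{align*}
```

Under these assumptions $v_i$ has mean zero, the constant $c$ is absorbed into the intercept, and homoskedasticity corresponds to all slope coefficients in $\beta$ being zero, which can again be tested with an F test.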


White Test

• Background:

For deriving the asymptotic distribution of the OLS estimator the assumption of homoskedastic errors MLR.5 is not necessary.

It is enough that the squared errors $u_i^2$ are uncorrelated with all regressors and the squares and cross products of the latter.

This can easily be tested using the following regression, where the errors are already replaced by the residuals:

$$\hat{u}_i^2 = \delta_0 + \delta_1 x_{i1} + \dots + \delta_k x_{ik} + \delta_{k+1} x_{i1}^2 + \dots + \delta_{J_1} x_{ik}^2 + \delta_{J_1+1} x_{i1}x_{i2} + \dots + \delta_{J_2} x_{i,k-1}x_{ik} + v_i, \quad i = 1, \dots, n. \tag{9.2}$$


• The pair of hypotheses is:

$H_0: \delta_j = 0$ for all $j = 1, 2, \dots, J_2$, vs. $H_1: \delta_j \neq 0$ for at least one $j$.

Again, an F test can be used whose distribution is approximated by the F distribution (asymptotic distribution).

• With many regressors, it is tedious to implement the F test for (9.2) manually. However, most software packages provide the White test.

• When implementing the White test, a large number of parameters has to be estimated if the original model exhibits large $k$. This is hardly possible in small samples. In that case one includes only the squares $x_{ij}^2$ in the regression and neglects all cross products.

• Note: If the null hypothesis is rejected, this may also be due to a violation of MLR.1 or MLR.4. Then, the original regression is misspecified!
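How quickly the number of parameters in the test regression (9.2) grows with $k$ can be seen by enumerating its regressors: $k$ levels, $k$ squares and $k(k-1)/2$ cross products. The following sketch uses a hypothetical helper `white_terms` and placeholder regressor names; it only lists the terms and does not drop terms that duplicate other regressors (for example the square of a dummy variable).

```python
from itertools import combinations


def white_terms(names, cross_terms=True):
    """List the regressors of the White test regression: levels, squares, cross products."""
    levels = list(names)
    squares = [f"{v}^2" for v in names]
    crosses = [f"{a}*{b}" for a, b in combinations(names, 2)] if cross_terms else []
    return levels + squares + crosses


regressors = ["x1", "x2", "x3", "x4", "x5", "x6"]     # placeholder names, k = 6
full = white_terms(regressors)                        # 6 + 6 + 15 terms
no_cross = white_terms(regressors, cross_terms=False) # 6 + 6 terms
print(len(full), len(no_cross))
```

With $k = 6$ regressors the full test regression already contains 27 slope parameters, compared with 12 when the cross products are neglected.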


• Cigarette Example Continued:

White Heteroskedasticity Test:

F-statistic 2.159257 Probability 0.000905

Obs*R-squared 52.17244 Probability 0.001140

Test Equation: Dependent Variable: RESID^2, Method: Least Squares, Sample: 1 807

Variable             Coefficient   Std. Error   t-Statistic   Prob.
C                     29374.75     20559.14      1.428793     0.1535
LINCOME              -1049.627     963.4360     -1.089462     0.2763
LINCOME^2            -3.941187     17.07122     -0.230867     0.8175
LINCOME*LCIGPRIC      329.8888     239.2417      1.378893     0.1683
LINCOME*EDUC         -9.591844     8.047067     -1.191968     0.2336
LINCOME*AGE          -3.354564     6.682195     -0.502015     0.6158
LINCOME*(AGE^2)       0.026704     0.073025      0.365689     0.7147
LINCOME*RESTAURN     -59.88701     49.69040     -1.205203     0.2285
LCIGPRIC             -10340.67     9754.559     -1.060086     0.2894
LCIGPRIC^2            668.5282     1204.316      0.555110     0.5790
LCIGPRIC*EDUC         32.91400     59.06252      0.557274     0.5775
LCIGPRIC*AGE          62.88178     55.29011      1.137306     0.2558
LCIGPRIC*(AGE^2)     -0.622372     0.594730     -1.046479     0.2957
LCIGPRIC*RESTAURN     862.1558     720.6219      1.196405     0.2319
EDUC                 -117.4717     251.2852     -0.467484     0.6403
EDUC^2               -0.290344     1.287605     -0.225492     0.8217
EDUC*AGE              3.617047     1.724659      2.097253     0.0363
EDUC*(AGE^2)         -0.035558     0.017664     -2.012988     0.0445
EDUC*RESTAURN        -2.896491     10.65709     -0.271790     0.7859
AGE                  -264.1467     235.7624     -1.120394     0.2629
AGE^2                 3.468605     3.194651      1.085754     0.2779
AGE*(AGE^2)          -0.019111     0.028655     -0.666935     0.5050
AGE*RESTAURN         -4.933195     10.84029     -0.455080     0.6492
(AGE^2)^2             0.000118     0.000146      0.807552     0.4196
(AGE^2)*RESTAURN      0.038446     0.120459      0.319160     0.7497
RESTAURN             -2868.188     2986.776     -0.960296     0.3372

R-squared 0.064650, Adjusted R-squared 0.034709, Akaike info crit. 14.65774

Mean dependent var 178.1297, S.D. dependent var 369.3519, Schwarz criterion 14.80895

S.E. of regression 362.8853, Sum squared resid 1.03E+08, Log likelihood -5888.398

F-statistic 2.159257, Prob(F-statistic) 0.000905, Durbin-Watson stat 1.933288

Result: With the White test $H_0$ is also rejected.


Trade Example Continued

(from Section 8.4):

• Breusch-Pagan test for heteroskedasticity using OLS residuals

====================================================================

Heteroskedasticity Test: Breusch-Pagan-Godfrey

====================================================================

F-statistic 4.139262 Prob. F(5,46) 0.0035

Obs*R-squared 16.13595 Prob. Chi-Square(5) 0.0065

Scaled explained SS 11.61688 Prob. Chi-Square(5) 0.0404

====================================================================

Test Equation:

Dependent Variable: RESID^2

Method: Least Squares

Sample: 1 55, Included observations: 52

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C 59.29291 40.51240 1.463575 0.1501

LOG(WDI_GDPUSDCR_O) -4.634815 3.174932 -1.459815 0.1511

(LOG(WDI_GDPUSDCR_O))^2 0.075281 0.061518 1.223720 0.2273

LOG(CEPII_DIST) 1.382417 0.745464 1.854439 0.0701


CEPII_COMCOL_REV -1.642264 0.978128 -1.678986 0.0999

CEPII_COMLANG_OFF 0.394264 1.807698 0.218103 0.8283

====================================================================

R-squared 0.310307 Mean dependent var 1.661544

Adjusted R-squared 0.235340 S.D. dependent var 2.275813

S.E. of regression 1.990081 Akaike info criterion 4.322395

Sum squared resid 182.1794 Schwarz criterion 4.547538

Log likelihood -106.3823 Hannan-Quinn criter. 4.408709

F-statistic 4.139262 Durbin-Watson stat 2.285868

Prob(F-statistic) 0.003459

====================================================================


• White test (without cross terms) for heteroskedasticity using OLS residuals

Heteroskedasticity Test: White

====================================================================

F-statistic 4.085786 Prob. F(5,46) 0.0037

Obs*R-squared 15.99159 Prob. Chi-Square(5) 0.0069

Scaled explained SS 11.51295 Prob. Chi-Square(5) 0.0421

====================================================================

Test Equation:

Dependent Variable: RESID^2

Method: Least Squares

Sample: 1 55, Included observations: 52

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C 18.75859 10.91313 1.718900 0.0924

(LOG(WDI_GDPUSDCR_O))^2 -0.055924 0.031777 -1.759911 0.0851

((LOG(WDI_GDPUSDCR_O))^2)^2 3.05E-05 2.34E-05 1.307676 0.1975

(LOG(CEPII_DIST))^2 0.090498 0.049960 1.811423 0.0766

CEPII_COMCOL_REV^2 -1.582928 0.992436 -1.594992 0.1176

CEPII_COMLANG_OFF^2 0.291977 1.766624 0.165274 0.8695

====================================================================

R-squared 0.307531 Mean dependent var 1.661544


Adjusted R-squared 0.232262 S.D. dependent var 2.275813

S.E. of regression 1.994082 Akaike info criterion 4.326412

Sum squared resid 182.9127 Schwarz criterion 4.551555

Log likelihood -106.4867 Hannan-Quinn criter. 4.412726

F-statistic 4.085786 Durbin-Watson stat 2.285412

Prob(F-statistic) 0.003749

====================================================================


• Breusch-Pagan test for heteroskedasticity using standardized FGLS residuals

====================================================================

Heteroskedasticity Test: Breusch-Pagan-Godfrey

====================================================================

F-statistic 0.683750 Prob. F(5,46) 0.6381

Obs*R-squared 3.597320 Prob. Chi-Square(5) 0.6087

Scaled explained SS 1.922069 Prob. Chi-Square(5) 0.8598

====================================================================

Test Equation:

Dependent Variable: WGT_RESID^2

Method: Least Squares

Sample: 1 55, Included observations: 52

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C -0.323375 1.044251 -0.309672 0.7582

LOG(WDI_GDPUSDCR_O)*WGT 0.122435 0.229469 0.533560 0.5962

(LOG(WDI_GDPUSDCR_O))^2*WGT -0.010813 0.007708 -1.402801 0.1674

LOG(CEPII_DIST)*WGT 0.678927 0.586156 1.158269 0.2527

CEPII_COMCOL_REV*WGT -0.781193 0.641851 -1.217095 0.2298

CEPII_COMLANG_OFF*WGT 0.299789 1.228329 0.244062 0.8083

====================================================================


R-squared 0.069179 Mean dependent var 0.946185

Adjusted R-squared -0.031997 S.D. dependent var 1.116472

S.E. of regression 1.134193 Akaike info criterion 3.197887

Sum squared resid 59.17414 Schwarz criterion 3.423031

Log likelihood -77.14507 Hannan-Quinn criter. 3.284202

F-statistic 0.683750 Durbin-Watson stat 1.973137

Prob(F-statistic) 0.638076

====================================================================


• White test (without cross terms) for heteroskedasticity using FGLS residuals

Heteroskedasticity Test: White

====================================================================

F-statistic 0.538645 Prob. F(6,45) 0.7760

Obs*R-squared 3.484361 Prob. Chi-Square(6) 0.7460

Scaled explained SS 1.861714 Prob. Chi-Square(6) 0.9320

====================================================================

Test Equation:

Dependent Variable: WGT_RESID^2

Method: Least Squares

Sample: 1 55, Included observations: 52

====================================================================

Coefficient Std. Error t-Statistic Prob.

====================================================================

C 0.568776 0.520032 1.093733 0.2799

WGT^2 6.158506 9.676452 0.636443 0.5277

(LOG(WDI_GDPUSDCR_O))^2*WGT^2 -0.014641 0.023638 -0.619379 0.5388

((LOG(WDI_GDPUSDCR_O))^2)^2*WGT^2 6.39E-06 1.41E-05 0.454559 0.6516

(LOG(CEPII_DIST))^2*WGT^2 0.021214 0.022901 0.926322 0.3592

CEPII_COMCOL_REV^2*WGT^2 -0.941910 1.019790 -0.923632 0.3606

CEPII_COMLANG_OFF^2*WGT^2 -0.421654 1.220702 -0.345420 0.7314

====================================================================


R-squared 0.067007 Mean dependent var 0.946185

Adjusted R-squared -0.057392 S.D. dependent var 1.116472

S.E. of regression 1.148063 Akaike info criterion 3.238680

Sum squared resid 59.31224 Schwarz criterion 3.501347

Log likelihood -77.20568 Hannan-Quinn criter. 3.339380

F-statistic 0.538645 Durbin-Watson stat 1.955059

Prob(F-statistic) 0.775963

====================================================================

Results:

– Note that the specification of the White test without cross terms follows EViews 6.0 and does not include level terms (in contrast to (9.2)).

– Both the Breusch-Pagan test and the White test reject the null hypothesis of homoskedastic errors for the OLS residuals at the 1% significance level. Thus, using OLS with heteroskedasticity-robust standard errors or FGLS in Section 8.4 was justified.

– Neither the Breusch-Pagan test nor the White test rejects the null hypothesis of homoskedastic standardized errors in the FGLS framework. Both p-values are above 60%. Thus, the variance regression in Section 8.4 does not seem to be misspecified.

– In sum, among all models and estimation procedures considered, the FGLS estimates of Model 5 seem to be the most reliable ones.
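The F statistics and the Obs*R-squared values ($n \cdot R^2$) in the four test outputs of the trade example can be reproduced from the $R^2$ of the respective test equation. A small sketch with the reported (rounded) values; the function and dictionary names are chosen here for illustration:

```python
def f_stat(r2, n, k):
    """F statistic for joint significance of k regressors, computed from R-squared."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def obs_r2(r2, n):
    """Obs*R-squared: number of observations times R-squared of the test equation."""
    return n * r2


N = 52  # included observations
# test -> (R-squared of the test equation, number of regressors besides the constant)
tests = {
    "BP, OLS residuals": (0.310307, 5),
    "White, OLS residuals": (0.307531, 5),
    "BP, FGLS residuals": (0.069179, 5),
    "White, FGLS residuals": (0.067007, 6),
}

results = {name: (f_stat(r2, N, k), obs_r2(r2, N)) for name, (r2, k) in tests.items()}
for name, (f_value, lm_value) in results.items():
    print(f"{name}: F = {f_value:.4f}, Obs*R-squared = {lm_value:.4f}")
```

The large statistics for the OLS residuals and the small ones for the standardized FGLS residuals mirror the test decisions summarized above.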

Reading: Chapter 8 in Wooldridge (2009) (without Section 8.5 concerning linear probability models).


9.3 Model Specification II: Useful Tests

9.3.1 Comparing Models with Identical Regressand

Starting point: two non-nested models

(M1) y = x0β0 + . . . + xkβk + u = x′β + u,

(M2) y = z0γ0 + . . . + zmγm + v = z′γ + v,

where k = m does not have to hold.

Decision between (M1) and (M2): using

• information criteria (AIC, SC, HQ, ...),

• encompassing test,

• non-nested F test,

• J test.
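For the first option, the information criteria reported in the EViews outputs can be computed from the log likelihood. The sketch below assumes the per-observation definitions AIC = −2ℓ/n + 2p/n, SC = −2ℓ/n + p·ln(n)/n and HQ = −2ℓ/n + 2p·ln(ln(n))/n, where p is the number of estimated parameters including the constant; these formulas are an assumption of this sketch, checked only against the Model 5 OLS output of the trade example.

```python
import math


def info_criteria(loglik, n, n_params):
    """Return (AIC, SC, HQ) in per-observation form (assumed EViews convention)."""
    base = -2.0 * loglik / n
    aic = base + 2.0 * n_params / n
    sc = base + n_params * math.log(n) / n
    hq = base + 2.0 * n_params * math.log(math.log(n)) / n
    return aic, sc, hq


# Model 5 estimated by OLS: log likelihood -86.98624, 52 observations, 6 parameters
aic, sc, hq = info_criteria(-86.98624, n=52, n_params=6)
print(f"AIC = {aic:.4f}, SC = {sc:.4f}, HQ = {hq:.4f}")
```

Among models with the same regressand, the one with the smallest value of the chosen criterion is preferred.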


All three tests can be based on the encompassing principle.

Encompassing Principle

Let two non-nested models be given:

(M1) y = x′β + u,

(M2) y = z′γ + v.

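To clarify the non-nested relationship between (M1) and (M2), define

$$x' = \begin{pmatrix} w' & x_B' \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_A \\ \beta_B \end{pmatrix}, \qquad z' = \begin{pmatrix} w' & z_B' \end{pmatrix}, \quad \gamma = \begin{pmatrix} \gamma_A \\ \gamma_B \end{pmatrix},$$

such that $w$ contains all common regressors:

(M1) $y = w'\beta_A + x_B'\beta_B + u$,

(M2) $y = w'\gamma_A + z_B'\gamma_B + v$.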


Idea of the encompassing principle:

• If (M1) is correctly specified, it must be able to explain the results

of an estimation of (M2) (and vice versa).

• If not, (M1) has to be rejected (and vice versa).

Derivation:

Consider the “artificial nesting model”

(ANM) y = w′a + x′Bbx + z′Bbz + ε, E[ε|w,xB, zB] = 0.

Different settings:

• (ANM) correctly specified model such that (M1) and (M2) are mis-

specified. Model (M2) is estimated.

• (M1) correctly specified model. Model (M2) is estimated.

• (M2) correctly specified model. Model (M1) is estimated.


In general an omitted variable bias results for all cases.

Details:

• (ANM) correctly specified model such that (M1) and (M2) are mis-

specified. Model (M2) is estimated. ⇒ xB omitted.

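$$\begin{aligned}
E[y|w, z_B] &= E[w'a + x_B'b_x + z_B'b_z + \varepsilon \,|\, w, z_B] \\
&= E[w'a|w, z_B] + E[x_B'b_x|w, z_B] + E[z_B'b_z|w, z_B] + E[\varepsilon|w, z_B] \\
&= w'a + E[x_B'|w, z_B]\,b_x + z_B'b_z + E[\varepsilon|w, z_B].
\end{aligned}$$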

For simplicity it is assumed that xB is scalar. Then it holds that

xB = w′q + z′Bp + ν,

E[xB|w, zB] = w′q + z′Bp.

It also holds that

E [E[ε|w,xB, zB]|w, zB] = E[ε|w, zB].


Since (ANM) is correct, it holds that E[ε|w,xB, zB] = 0 and thus

E[0] = 0 = E[ε|w, zB].

When estimating (M2) instead of (ANM), one gets

$$E[y|w, z_B] = w'a + [w'q + z_B'p]\,b_x + z_B'b_z = w'\underbrace{[a + q b_x]}_{\gamma_A} + z_B'\underbrace{[b_z + p b_x]}_{\gamma_B}. \qquad (9.3)$$

Note that the biases qbx and pbx are caused by omitting the variable

xB. These effects bias the direct impact of w via a and of zB via

bz on y.
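
As a purely numerical illustration of (9.3), suppose that w and zB are scalars and take the hypothetical values a = 1, bx = 2, bz = 0.5, q = 0.3 and p = 0.4 (these numbers are assumptions chosen for this sketch, not part of the course material):

$$\gamma_A = a + q b_x = 1 + 0.3 \cdot 2 = 1.6, \qquad \gamma_B = b_z + p b_x = 0.5 + 0.4 \cdot 2 = 1.3.$$

Estimating (M2) would then, in expectation, attribute an effect of 1.6 instead of 1 to w and of 1.3 instead of 0.5 to zB; the differences 0.6 and 0.8 are the omitted variable biases qbx and pbx.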


• (M1) correctly specified model. Model (M2) is estimated.

Then bz = 0 and from (9.3) the following restriction results:

pbx = γB.

Now it can be seen that knowing the correctly specified model (M1)

is enough for deriving model (M2), thus predicting γB or the expec-

tation of the OLS estimator. In other words: Since (M2) is “smaller”

than (M1) with respect to the relevant variables, the behavior of

(M2) can be predicted with the help of (M1) when an unbiased es-

timator is used for the latter. Then one says “(M1) encompasses

(M2)”. (Knowing (M1) is not enough here if (ANM) is the correct

model, i.e. if bz ≠ 0.)

• (M2) correctly specified model. Model (M1) is estimated.

Can be derived just as in the above case.


Thus, for the null hypothesis “(M1) encompasses (M2)” two equivalent

hypotheses can be tested:

• H0 : pbx − γB = 0 - more complicated, no details here. (This version is often termed encompassing test and sometimes has advantages in more general models.)

• H0 : bz = 0 in (ANM) - easy: with the help of a non-nested F test.

Procedure for more than two alternatives: selection procedure

• Based on this same principle, the remaining model competes with

further alternative models as long as it is not rejected.

• Problem of this principle: it can happen that both null hypotheses

have to be rejected.


Non-nested F test

Idea and implementation:

• Hypotheses: “H0: model (M1) is correct” versus “H1: model (M1)

incorrect”.

• Again, partition z′ = (w′, z′B), where the kA regressors from w are

contained in x but the kB regressors from zB are not contained.

• Formulate the artificial nesting model (ANM)

y = x′β + z′Bbz + ε.

• Based on this ANM test H0 where

H0 : bz = 0

using an F test with kB degrees of freedom in the numerator and

n − (k + 1) − kB in the denominator, since the ANM contains the k + 1 regressors of (M1) plus the kB regressors in zB.


• For the test of (M2) vs. (M1) proceed analogously with partition

x′ = (w′,x′B) ...
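
The test statistic has the usual restricted-versus-unrestricted form. As a sketch, write SSR_{M1} and SSR_{ANM} for the sums of squared OLS residuals of (M1) and of the ANM (this notation is introduced here for illustration only), and let k + 1 be the number of regressors in (M1):

$$F = \frac{(SSR_{M1} - SSR_{ANM})/k_B}{SSR_{ANM}/(n - k - 1 - k_B)}.$$

Under H0 and the classical assumptions (including homoskedastic, normally distributed errors), F follows an F distribution with kB and n − k − 1 − kB degrees of freedom; without normality this holds only approximately in large samples.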

J test (Davidson-MacKinnon test)

Idea and implementation:

• For the J test the ANM is formulated such that both (M1) and

(M2) are nested in the ANM:

y = (1 − λ)x′β + λz′γ + ε.

For the case λ = 0 the model (M1) results, for λ = 1 model (M2).

• Problem: λ, β and γ are not identified in the above approach.

• Solution: replace γ by its OLS estimator $\hat\gamma$ from (M2).

I.e. test H0 : λ = 0 with the test equation $y = x'\beta^* + \lambda \hat y_{M2} + \eta$, where $\beta^* = (1 - \lambda)\beta$ and $\hat y_{M2} = z'\hat\gamma$ is the fitted value from the OLS estimation of (M2).


• For testing whether (M2) is valid, proceed analogously ...

• Interpretation of the logic of the test:

To test model (M1), it is augmented by the fitted values of model (M2); these (i.e. the part of y explained by the regressors in (M2)) are tested for their significance in the test equation.

• Advantages of the J test compared to the non-nested F test:

– only one single restriction has to be tested,

– higher power, if kB or respectively mB are very large,

– in case of kB = 1 or respectively mB = 1 the tests are equivalent.
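
The J test steps above can be sketched in a few lines of Python. This is a minimal illustration on simulated data and not part of the course material: the data-generating process, the sample size and the helper names `solve`, `ols` and `j_stat` are assumptions made for this sketch. Because only the single restriction λ = 0 is tested, the squared t statistic equals the F statistic computed from the restricted and unrestricted sums of squared residuals, and that is what the code reports.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ols(X, y):
    """Return OLS coefficients, fitted values and the sum of squared residuals."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    b = solve(XtX, Xty)
    fitted = [sum(bi * xi for bi, xi in zip(b, r)) for r in X]
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return b, fitted, ssr

def j_stat(X_null, X_alt, y):
    """Squared t statistic of lambda when the null model is augmented
    by the fitted values of the alternative model."""
    _, fitted_alt, _ = ols(X_alt, y)
    _, _, ssr_r = ols(X_null, y)
    X_aug = [row + [f] for row, f in zip(X_null, fitted_alt)]
    _, _, ssr_u = ols(X_aug, y)
    df = len(y) - len(X_aug[0])
    return (ssr_r - ssr_u) / (ssr_u / df)

random.seed(1)
n = 300
w = [random.gauss(0, 1) for _ in range(n)]
xB = [random.gauss(0, 1) for _ in range(n)]
zB = [random.gauss(0, 1) for _ in range(n)]
# Simulated data: (M1) with regressors (1, w, xB) is the correct model.
y = [1.0 + 0.5 * wi + 2.0 * xi + random.gauss(0, 1) for wi, xi in zip(w, xB)]

X1 = [[1.0, wi, xi] for wi, xi in zip(w, xB)]  # regressors of (M1)
X2 = [[1.0, wi, zi] for wi, zi in zip(w, zB)]  # regressors of (M2)

j_m1 = j_stat(X1, X2, y)  # H0: (M1) is correct
j_m2 = j_stat(X2, X1, y)  # H0: (M2) is correct
print(f"squared t statistic, H0 (M1): {j_m1:.2f}")
print(f"squared t statistic, H0 (M2): {j_m2:.2f}")
```

In this simulated setting one would expect a very large statistic when (M2) is the null and a much smaller one when (M1) is the null. The sketch does not compute critical values; they would come from the t distribution (or, for the squared statistic, the F distribution with one numerator degree of freedom).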


9.3.2 Comparing Models with differing Regressand

Idea and implementation (of the P test):

Example linear model versus log-log alternative

• Step 1: Run an OLS estimation for both models.

• Step 2: Compute the corresponding fitted values $\hat y_{lin}$ (linear model) and $\widehat{\ln(y)}_{log}$ (log-log model).

• Step 3a: Test the linear approach against the log-log alternative using the ANM

$$y = \sum_j x_j \beta_{j,lin} + \delta_{lin}\,[\ln(\hat y_{lin}) - \widehat{\ln(y)}_{log}] + u,$$

by a t test with the null hypothesis

H0 : δlin = 0 (linear model is correct).


• Step 3b: Test the log-log approach against the linear alternative

using the ANM

$$\ln(y) = \sum_j \ln(x_j)\beta_{j,log} + \delta_{log}\,[\hat y_{lin} - \exp(\widehat{\ln(y)}_{log})] + v,$$

by a t test with the null hypothesis

H0 : δlog = 0 (log-log model is correct).

Problem: it is possible that both hypotheses are rejected (i.e. another functional form is relevant) or that neither can be rejected (e.g. because the tests lack power).

Note: since the regressands differ, a comparison using information criteria is not possible.

Reading: Chapter 9 in Wooldridge (2009).


10 Appendix

10.1 A Condensed Introduction to Probability

Preliminary statement: The following pages are not meant as a deterrent, but as a supplement to the expositions found in introductory econometrics textbooks. This supplement is intended to explain the intuition underlying the large number of definitions and concepts in probability theory. Nevertheless, formulas cannot be avoided completely, and it may take some time to digest them.


• Sample space, outcome space:

The set Ω contains all possible outcomes of a random experiment.

This set can contain finitely or infinitely many outcomes.

Examples:

– Urn with 4 balls of different colors: Ω = {yellow, red, blue, green}

– Monthly income of a household in the future: Ω = [0,∞)

Remark:

– If there is a finite number of outcomes, they are often denoted as ωi. For S outcomes, Ω appears as

Ω = {ω1, ω2, . . . , ωS}.

– If there is an infinite number of outcomes, each one is often

denoted as ω.


• Event:

Every set of possible outcomes = every subset of the set Ω including

Ω itself.

Examples:

– Urn example: possible events are for example {yellow, red} or {red, blue, green}.

– Household income: possible events are all possible subintervals and combinations of them, e.g. (0, 5000], [1000, 1001), (400,∞), {4000}, and so on.

Remark: In the general notation with the ω’s, one has

– for the case of S outcomes: {ω1, ω2}, {ωS}, {ω3, . . . , ωS}, and so on.

– for the case of infinitely many outcomes located inside an interval Ω = (−∞,∞): (a1, b1], [a2, b2), (0,∞), and so on, where the lower bound always has to be less than or equal to the upper bound (ai ≤ bi).

• Random variable:

A random variable is a function that assigns a real number X(ω) to

each outcome ω ∈ Ω.

Urn example: X(ω1) = 0, X(ω2) = 3, X(ω3) = 17, X(ω4) = 20.

• Density function

– Preliminary statement: As we have already seen, things get complicated if Ω contains infinitely many outcomes. Consider for example Ω = [0, 4]. If one wants to compute the probability that exactly the number π occurs, this probability is equal to zero. If it were not equal to zero, we would have the problem that the sum of the probabilities of all (infinitely many) numbers could not be equal to 1. What to do?

– A way out is the following trick: Consider the probability that the outcome of the random variable X lies in the interval [0, x], with x < 4. This probability can be written as P(X ≤ x). Now determine how the probability changes when the interval [0, x] is extended by h. The answer is P(X ≤ x + h) − P(X ≤ x). Relating this change in probability to the interval length h gives

$$\frac{P(X \le x + h) - P(X \le x)}{h}.$$

For an interval length h that approaches zero, one obtains the following limit:

$$\lim_{h \to 0} \frac{P(X \le x + h) - P(X \le x)}{h} = f(x).$$

This limit is called the probability density function, or density function for short, that belongs to the probability function P.

– How to interpret a density function?

By using the sloppy formulation

$$\frac{P(X \le x + h) - P(X \le x)}{h} \approx f(x)$$

and rewriting it as

$$P(X \le x + h) - P(X \le x) \approx f(x)\,h,$$

one can see that f(x) determines the rate of change of the probability that X falls into the interval [0, x] if the interval length is extended by h. Hence, the density function is a rate.

– As the density function is a derivative, we conversely get for our example

$$\int_0^x f(u)\,du = P(X \le x) = F(x).$$

Here, F(x) = P(X ≤ x) is called the probability distribution function. In this example we get

$$\int_0^4 f(u)\,du = P(X \le 4) = 1.$$

In general, the integral of the density function over the full support of the random variable yields a value of 1. Consider for example X(ω) ∈ R:

$$\int_{-\infty}^{\infty} f(u)\,du = P(X \le \infty) = 1.$$
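
The relationship between the difference quotient, the density and the distribution function can be checked numerically. The sketch below assumes a hypothetical distribution on Ω = [0, 4] with F(x) = (x/4)², so that f(x) = x/8; this choice and the function names are illustrative assumptions, not part of the course material.

```python
def F(x):
    """Distribution function of a hypothetical X on [0, 4]: F(x) = (x/4)^2."""
    return (x / 4.0) ** 2

def f(x):
    """Density belonging to F: the derivative F'(x) = x/8."""
    return x / 8.0

def difference_quotient(x, h):
    """Change in probability per unit of interval length."""
    return (F(x + h) - F(x)) / h

def integrate(g, a, b, steps=1000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    width = (b - a) / steps
    return sum(g(a + (i + 0.5) * width) for i in range(steps)) * width

x = 2.0
# The quotient approaches f(x) as the interval extension h shrinks.
for h in (1.0, 0.1, 0.001):
    print(f"h = {h}: quotient = {difference_quotient(x, h):.5f}, f(x) = {f(x):.5f}")

# Integrating the density recovers the distribution function.
print(f"integral of f over [0, 4]: {integrate(f, 0.0, 4.0):.6f}")
print(f"integral of f over [0, 2]: {integrate(f, 0.0, 2.0):.6f}, F(2) = {F(2.0):.6f}")
```

The quotient moves toward f(2) = 0.25 as h shrinks, and the integral of the density over the full support is 1.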


• Conditional probability function

Let’s begin with an example:

Let the random variable X ∈ [0,∞) be the payoff in a lottery.

The probability function (distribution function) P (X ≤ x) = F (x)

is the probability for a maximum payoff x. Additionally, we know

that there are two machines (machine A and B) that determine the

payoff.

Question: What is the probability for a maximum payoff of x if

machine A is used?

In other words, what is the probability of interest if the condition

“Machine A is used” is applied? Hence, the probability under con-

sideration is also called conditional probability and written as

P (X ≤ x|A).


Accordingly one writes P (X ≤ x|B), if the condition “Machine B

is used” is applied.

Question: What is the relationship between the unconditional

probability P (X ≤ x) and the conditional probabilities P (X ≤x|A) and P (X ≤ x|B)?

To answer this question one has to clarify what the corresponding

probabilities of using machine A or B are. Denoting these proba-

bilities by P (A) and P (B) we have:

P (X ≤ x) = P (X ≤ x|A)P (A) + P (X ≤ x|B)P (B)

F (x) = F (x|A)P (A) + F (x|B)P (B)

In this example there are two outcomes. The corresponding relationship can be extended to n discrete outcomes Ω = {A1, A2, . . . , An}:

$$F(x) = F(x|A_1)P(A_1) + F(x|A_2)P(A_2) + \dots + F(x|A_n)P(A_n) \qquad (10.1)$$

Until now we defined the conditions in terms of events and not in terms of random variables. An example of the latter would be a payoff that is determined by only one machine, where the mode of operation of this machine depends on a random magnitude Z. In this case, the conditional distribution function is F(x|Z = z), with Z = z meaning that the random variable Z takes exactly the value z. To relate the unconditional and conditional probabilities we have to replace the sum by an integral, and the probability of the conditioning event by the corresponding density function, as Z can take infinitely many values. For our example we obtain:

$$F(x) = \int_0^\infty F(x|Z = z) f(z)\,dz = \int_0^\infty F(x|z) f(z)\,dz$$

or generally

$$F(x) = \int F(x|Z = z) f(z)\,dz = \int F(x|z) f(z)\,dz \qquad (10.2)$$

Another important property:

If the random variables X and Z are stochastically independent, we

have

F (x|z) = F (x).

• Conditional density function

The conditional density function can be heuristically derived from the conditional distribution function in the same way as the unconditional density function: one simply replaces the unconditional probabilities by conditional probabilities. The conditional density function arises from

$$\lim_{h \to 0} \frac{P(X \le x + h|A) - P(X \le x|A)}{h} = f(x|A).$$

For finitely many conditions equation (10.1) becomes

$$f(x) = f(x|A_1)P(A_1) + f(x|A_2)P(A_2) + \dots + f(x|A_n)P(A_n).$$

The relationship (10.2) turns into

$$f(x) = \int f(x|Z = z) f(z)\,dz = \int f(x|z) f(z)\,dz. \qquad (10.3)$$

• Expectation

Consider again the payoff example.

Question: Which payoff would you expect “on average”?

Answer: $\int_0^\infty x f(x)\,dx$. For a payoff paid in n different discrete amounts, one would expect $\sum_{i=1}^n x_i P(X = x_i)$ on average. Each possible payoff is multiplied by its probability of occurrence and the products are added up. It is not surprising that the result is called the expectation.

In general the expectation is defined as

$$E[X] = \int x f(x)\,dx, \quad \text{continuous } X,$$

$$E[X] = \sum x_i P(X = x_i), \quad \text{discrete } X.$$


• Rules for the expectation (see e.g. Appendix B in Wooldridge (2009)):

1. For each constant c it holds that

E[c] = c.

2. For all constants a and b and all random variables X and Y it

holds that

E[aX + bY ] = aE[X ] + bE[Y ].

3. If the random variables X and Y are independent, it holds that

E[Y X ] = E[Y ]E[X ].

• Conditional expectation

So far we did not care about the machine that was used to create the payoff. If we are interested in the expected payoff when machine A is used, we have to calculate the conditional expectation

$$E[X|A] = \int_0^\infty x f(x|A)\,dx.$$

This is easily achieved by replacing the unconditional density f(x) by the conditional density f(x|A) and stating the condition in the notation of expectations accordingly. Analogously the expected payoff for machine B is determined as

$$E[X|B] = \int_0^\infty x f(x|B)\,dx.$$

In general one has for discrete conditioning events

$$E[X|A] = \int x f(x|A)\,dx, \quad \text{continuous } X,$$

$$E[X|A] = \sum x_i P(X = x_i|A), \quad \text{discrete } X,$$

and for continuous conditions

$$E[X|Z = z] = \int x f(x|Z = z)\,dx, \quad \text{continuous } X,$$

$$E[X|Z = z] = \sum x_i P(X = x_i|Z = z), \quad \text{discrete } X.$$

Remark: Frequently, the short versions are used, as in Wooldridge (2009):

$$E[X|z] = \int x f(x|z)\,dx, \quad \text{continuous } X,$$

$$E[X|z] = \sum x_i P(X = x_i|z), \quad \text{discrete } X.$$

In analogy to the relationship between unconditional and conditional probabilities, there is a similar relationship between unconditional and conditional expectations. The relationship is

E[X] = E [E[X|Z]]

which is called the law of iterated expectations (LIE).


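Sketch of proof:

$$\begin{aligned}
E[X] &= \int x f(x)\,dx \\
&= \int x \left[ \int f(x|z) f(z)\,dz \right] dx \qquad \text{(insert (10.3))} \\
&= \int \int x f(x|z) f(z)\,dz\,dx \\
&= \int \underbrace{\int x f(x|z)\,dx}_{E[X|z]} \, f(z)\,dz \qquad \text{(interchange } dx \text{ and } dz\text{)} \\
&= \int E[X|z] f(z)\,dz \\
&= E[E[X|Z]]
\end{aligned}$$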

In our example with 2 machines, the law of iterated expectations yields

E[X] = E[X|A]P(A) + E[X|B]P(B).

This example also shows that the conditional expectations E[X|A] and E[X|B] are random variables. If they are weighted by the corresponding probabilities of occurrence P(A) and P(B), they yield E[X]. Suppose that, prior to the lottery, you only know both conditional expectations but not which machine is used. Then the expected payoff is equal to E[X] and both conditional expectations are considered as random variables. Once it is known which machine is used, the corresponding conditional expectation is the outcome of the random variable. This is a general property of conditional expectations.
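
The two-machine version of the law of iterated expectations can be verified with exact arithmetic. In the sketch below the machine probabilities and the payoff distributions are hypothetical numbers chosen for illustration; they do not come from the course material.

```python
from fractions import Fraction as Fr

# Hypothetical lottery: probability of each machine and payoff distribution per machine.
p_machine = {"A": Fr(3, 10), "B": Fr(7, 10)}
payoff_given = {
    "A": {0: Fr(1, 2), 20: Fr(1, 2)},
    "B": {10: Fr(1, 2), 30: Fr(1, 2)},
}

def expectation(dist):
    """E[X] = sum of x * P(X = x) for a discrete distribution."""
    return sum(x * p for x, p in dist.items())

# Conditional expectations E[X|A] and E[X|B].
cond_exp = {m: expectation(d) for m, d in payoff_given.items()}

# Unconditional distribution via P(X = x) = sum over machines of P(X = x | m) P(m).
marginal = {}
for m, dist in payoff_given.items():
    for x, p in dist.items():
        marginal[x] = marginal.get(x, 0) + p * p_machine[m]

direct = expectation(marginal)                                  # E[X]
iterated = sum(cond_exp[m] * p_machine[m] for m in p_machine)   # E[E[X|machine]]
print(cond_exp["A"], cond_exp["B"], direct, iterated)
```

Here E[X|A] = 10 and E[X|B] = 20, and both routes give E[X] = 17. Before the machine is known, the conditional expectation takes the value 10 with probability 0.3 and the value 20 with probability 0.7, which is the sense in which it is a random variable.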


• Rules for conditional expectations

e.g. Appendix B in Wooldridge (2009).

1. For each function c(·) it holds that

E[c(X)|X ] = c(X).

2. For all functions a(·) and b(·) it holds that

E[a(X)Y + b(X)|X ] = a(X)E[Y |X ] + b(X).

3. If the random variables X and Y are independent, it holds that

E[Y |X ] = E[Y ].

4. Law of iterated expectations (LIE)

E[E[Y |X ]] = E[Y ].

5. E[Y |X ] = E[E[Y |X,Z]|X ].


6. If it holds that E[Y|X] = E[Y], then it also holds that Cov(X, Y) = 0.

7. If $E[Y^2] < \infty$ and $E[g(X)^2] < \infty$ for an arbitrary function g(·), then the following inequalities hold:

$$E\{[Y - E[Y|X]]^2 \,|\, X\} \le E\{[Y - g(X)]^2 \,|\, X\},$$

$$E\{[Y - E[Y|X]]^2\} \le E\{[Y - g(X)]^2\}.$$


10.2 Important Rules of Matrix Algebra

Matrix addition

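$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1K} \\ a_{21} & a_{22} & \dots & a_{2K} \\ \vdots & \vdots & & \vdots \\ a_{T1} & a_{T2} & \dots & a_{TK} \end{pmatrix}, \quad C = \begin{pmatrix} c_{11} & c_{12} & \dots & c_{1K} \\ c_{21} & c_{22} & \dots & c_{2K} \\ \vdots & \vdots & & \vdots \\ c_{T1} & c_{T2} & \dots & c_{TK} \end{pmatrix}.$$

If A and C are of the same dimension,

$$A + C = \begin{pmatrix} a_{11} + c_{11} & a_{12} + c_{12} & \cdots & a_{1K} + c_{1K} \\ a_{21} + c_{21} & a_{22} + c_{22} & \cdots & a_{2K} + c_{2K} \\ \vdots & \vdots & & \vdots \\ a_{T1} + c_{T1} & a_{T2} + c_{T2} & \cdots & a_{TK} + c_{TK} \end{pmatrix}.$$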


Matrix multiplication

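$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1K} \\ a_{21} & a_{22} & \cdots & a_{2K} \\ \vdots & \vdots & & \vdots \\ a_{T1} & a_{T2} & \cdots & a_{TK} \end{pmatrix}, \quad B = \begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1L} \\ b_{21} & b_{22} & \cdots & b_{2L} \\ \vdots & \vdots & & \vdots \\ b_{K1} & b_{K2} & \cdots & b_{KL} \end{pmatrix}.$$

If the number of columns in A is equal to the number of rows in B, then the product C = AB is defined and the following equality holds for every element in C:

$$c_{ij} = \begin{pmatrix} a_{i1} & \cdots & a_{iK} \end{pmatrix} \begin{pmatrix} b_{1j} \\ \vdots \\ b_{Kj} \end{pmatrix} = a_{i1}b_{1j} + \dots + a_{iK}b_{Kj} = \sum_{l=1}^{K} a_{il}b_{lj}.$$

Caution: In general it holds that AB ≠ BA.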


Transpose of a matrix

Given the (2 × 3)-matrix (i.e. 2 rows, 3 columns)

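$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix},$$

the transpose of A is the (3 × 2)-matrix

$$A' = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \end{pmatrix}.$$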

It holds that

(AB)′ = B′A′.


Inverse of a matrix

Let A be the (K × K)-matrix

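$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1K} \\ a_{21} & a_{22} & \cdots & a_{2K} \\ \vdots & \vdots & & \vdots \\ a_{K1} & a_{K2} & \cdots & a_{KK} \end{pmatrix},$$

then the inverse of A is $A^{-1}$ and is defined by

$$AA^{-1} = A^{-1}A = I_K = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & & 0 \\ \vdots & & \ddots & \\ 0 & 0 & \dots & 1 \end{pmatrix}$$

with $I_K$ as identity matrix of dimension (K × K).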


The matrix A is invertible if its rows (or, equivalently, its columns) are linearly independent. In other words: no row (column) can be written as a linear combination of the other rows (columns). Technically this is satisfied whenever the determinant of A is not equal to zero.

A noninvertible matrix is frequently called singular.

The calculation of an inverse is better left to a computer. Only for matrices with 2 or 3 rows and columns is the calculation of moderate complexity, so that a manual calculation can be useful.


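Special case of a (2 × 2) matrix:

For a square (2 × 2) matrix

$$B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}$$

the determinant is computed as

$$\det(B) = b_{11}b_{22} - b_{21}b_{12}$$

and the inverse as

$$B^{-1} = \frac{1}{\det(B)} \begin{pmatrix} b_{22} & -b_{12} \\ -b_{21} & b_{11} \end{pmatrix} = \frac{1}{b_{11}b_{22} - b_{21}b_{12}} \begin{pmatrix} b_{22} & -b_{12} \\ -b_{21} & b_{11} \end{pmatrix}.$$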


Example:

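$$C = \begin{pmatrix} 0 & 2 \\ 1 & -1 \end{pmatrix}, \quad \text{with } \det(C) = 0 \cdot (-1) - 1 \cdot 2 = -2$$

$$C^{-1} = \frac{1}{-2} \begin{pmatrix} -1 & -2 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} \tfrac{1}{2} & 1 \\ \tfrac{1}{2} & 0 \end{pmatrix}$$

Check:

$$CC^{-1} = \begin{pmatrix} 0 & 2 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} \tfrac{1}{2} & 1 \\ \tfrac{1}{2} & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$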
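
The (2 × 2) formulas and the example can be reproduced with exact fractions. The helper names `det2`, `inverse2` and `matmul` and the second matrix D are assumptions introduced for this sketch; D only serves to illustrate that matrix multiplication is not commutative in general.

```python
from fractions import Fraction

def det2(B):
    """Determinant of a 2x2 matrix: b11*b22 - b21*b12."""
    return B[0][0] * B[1][1] - B[1][0] * B[0][1]

def inverse2(B):
    """Inverse of a 2x2 matrix via the closed-form formula."""
    d = det2(B)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(B[1][1], d), Fraction(-B[0][1], d)],
            [Fraction(-B[1][0], d), Fraction(B[0][0], d)]]

def matmul(A, B):
    """Matrix product with c_ij = sum over l of a_il * b_lj."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

C = [[0, 2], [1, -1]]
C_inv = inverse2(C)
print(det2(C))
print(C_inv)
print(matmul(C, C_inv))   # should be the identity matrix

D = [[1, 1], [0, 1]]      # hypothetical second matrix
print(matmul(C, D) == matmul(D, C))
```

Using `Fraction` keeps the entries 1/2 exact, so the product with the inverse is exactly the identity matrix rather than only approximately so.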

Reading: As supplement for matrix algebra and its implementation

in the multiple linear regression framework see Appendices D, E.1 in

Wooldridge (2009).


10.3 Rules for Matrix Differentiation

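$$c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_T \end{pmatrix}, \quad w = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_T \end{pmatrix}$$

$$z = c'w = \begin{pmatrix} c_1 & c_2 & \cdots & c_T \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_T \end{pmatrix}$$

$$\frac{\partial z}{\partial w} = c$$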


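$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1T} \\ a_{21} & a_{22} & \cdots & a_{2T} \\ \vdots & \vdots & & \vdots \\ a_{T1} & a_{T2} & \cdots & a_{TT} \end{pmatrix}$$

$$z = w'Aw = \begin{pmatrix} w_1 & w_2 & \cdots & w_T \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1T} \\ a_{21} & a_{22} & \cdots & a_{2T} \\ \vdots & \vdots & & \vdots \\ a_{T1} & a_{T2} & \cdots & a_{TT} \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_T \end{pmatrix}$$

$$\frac{\partial z}{\partial w} = (A' + A)w$$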
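
Both differentiation rules can be checked against finite differences. The following sketch uses an arbitrary, deliberately non-symmetric matrix A and vectors c and w; these values and the function names are hypothetical and serve only as an illustration.

```python
def quad_form(A, w):
    """z = w'Aw."""
    n = len(w)
    return sum(w[i] * A[i][j] * w[j] for i in range(n) for j in range(n))

def analytic_gradient(A, w):
    """(A' + A) w, the derivative of w'Aw with respect to w."""
    n = len(w)
    return [sum((A[j][i] + A[i][j]) * w[j] for j in range(n)) for i in range(n)]

def numeric_gradient(func, w, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + h] + w[i + 1:]
        down = w[:i] + [w[i] - h] + w[i + 1:]
        grad.append((func(up) - func(down)) / (2 * h))
    return grad

A = [[2.0, 1.0, 0.0],
     [3.0, 1.0, -1.0],
     [0.5, 4.0, 2.0]]
w = [1.0, -2.0, 0.5]
c = [0.3, -1.0, 2.0]

# Quadratic form: numerical gradient versus (A' + A) w.
grad_quadratic = numeric_gradient(lambda v: quad_form(A, v), w)
# Linear form z = c'w: numerical gradient versus c.
grad_linear = numeric_gradient(lambda v: sum(ci * vi for ci, vi in zip(c, v)), w)

print(grad_quadratic, analytic_gradient(A, w))
print(grad_linear, c)
```

Choosing a non-symmetric A matters here: for symmetric A the rule reduces to 2Aw, and the check would not distinguish (A′ + A)w from that special case.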


10.4 Data for Estimating Gravity Equations

Legend for data in gravity data kaz.wf1

• Countries and country codes

 1 ALB Albania                  17 GBR United Kingdom   33 NLD Netherlands
 2 ARM Armenia                  18 GEO Georgia          34 NOR Norway
 3 AUT Austria                  19 GER Germany          35 POL Poland
 4 AZE Azerbaijan               20 GRC Greece           36 PRT Portugal
 5 BEL Belgium and Luxembourg   21 HRV Croatia          37 ROM Romania
 6 BGR Bulgaria                 22 HUN Hungary          38 RUS Russia
 7 BLR Belarus                  23 IRL Ireland          39 SVK Slovakia
 8 CAN Canada                   24 ISL Iceland          40 SVN Slovenia
 9 CHE Switzerland              25 ITA Italy            41 SWE Sweden
10 CYP Cyprus                   26 KAZ Kazakhstan       42 TKM Turkmenistan
11 CZE Czech Republic           27 KGZ Kyrgyzstan       43 TUR Turkey
12 DNK Denmark                  28 LTU Lithuania        44 UKR Ukraine
13 ESP Spain                    29 LVA Latvia           45 USA United States
14 EST Estonia                  30 MDA Moldova          46 YUG Serbia and Montenegro
15 FIN Finland                  31 MKD Macedonia
16 FRA France                   32 MLT Malta


Notes: Table is based on Table 1 in “Explanatory notes on gravity data.wf1 ” .

Countries that feature only as origin countries:

BIH Bosnia and Herzegovina

TJK Tajikistan

UZB Uzbekistan

CHN China

HKG Hong Kong

JPN Japan

KOR South Korea

TWN Taiwan

THA Thailand


• Endogenous variable:

– TRADE 0 D O:

Imports of country d from country o (i.e., exports of country o

to country d) in current US dollars

– Commodity classifications: Trade flows are based on aggregating

disaggregate trade flows according to the Standard International

Trade Classification, Revision 3 (SITC, Rev.3) at the lowest ag-

gregation levels (4- or 5-digit). Source: UN COMTRADE

– Without fuels and lubricants (i.e., specifically without petrol and

natural gas products). Cut-off value for underlying disaggregated

trade flows (at SITC Rev.3 5-digit level) is 500 US dollars.

• Explanatory variables:


Origin country

| Variable | Description | Source |
|---|---|---|
| WDI_GDPUSDCR_O | Origin country GDP data; in current US dollars | World Bank - World Development Indicators |
| WDI_GDPPCUSDCR_O | Origin country GDP per capita data; in current US dollars | World Bank - World Development Indicators |
| WEO_GDPCR_O | Destination and origin country GDP data; in current US dollars | IMF - World Economic Outlook database |
| WEO_GDPPCCR_O | Destination and origin country GDP per capita data; in current US dollars | IMF - World Economic Outlook database |
| WEO_POP_O | Origin country population data | IMF - World Economic Outlook database |
| CEPII_AREA_O | area of origin country in km² | CEPII |
| CEPII_COL45 | dummy; d and o country have had a colonial relationship after 1945 | CEPII |
| CEPII_COL45_REV | dummy; revised by “expert knowledge” | |
| CEPII_COLONY | dummy; d and o country have ever had a colonial link | CEPII |
| CEPII_COMCOL | dummy; d and o country share a common colonizer since 1945 | CEPII |
| CEPII_COMCOL_REV | dummy; revised by “expert knowledge” | |
| CEPII_COMLANG_ETHNO | dummy; d and o country share a language | CEPII |
| CEPII_COMLANG_ETHNO_REV | spoken by at least 9% of each population | |
| CEPII_COMLANG_OFF | dummy; d and o country share a common official language | CEPII |
| CEPII_CONTIG | dummy; d and o country are contiguous (neighboring countries) | CEPII |
| CEPII_DISINT_O | internal distance in origin country, $0.67\sqrt{\mathrm{area}/\pi}$ | CEPII |
| CEPII_DIST | geodesic distance between d and o country | CEPII |
| CEPII_DISTCAP | distance between d and o country based on capitals | CEPII |
| CEPII_DISTW | weighted distances, see CEPII for details | CEPII |
| CEPII_DISTWCES | weighted distances, see CEPII for details | CEPII |
| CEPII_LAT_O | latitude of the city | CEPII |
| CEPII_LON_O | longitude of the city | CEPII |
| CEPII_SMCTRY_REV | dummy; d and o country were/are the same country | CEPII, revised |
| ISO_O | ISO codes in three characters of origin country | CEPII |
| EBRD_TFES_O | EBRD measure of foreign trade and payments liberalisation of o country | EBRD |
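The fragment 0.67√(area/π) in the table above appears to be the CEPII internal-distance measure that belongs to the DISINT variable. The following is a minimal sketch of that formula, assuming the area is given in km² so that the result is in km. The function name and the example area are hypothetical and are not taken from the data set.

```python
import math


def internal_distance(area_km2):
    """Internal distance 0.67 * sqrt(area / pi); km if the area is in km^2."""
    if area_km2 < 0:
        raise ValueError("area must be non-negative")
    return 0.67 * math.sqrt(area_km2 / math.pi)


# Hypothetical round-number area, not a value from the data set.
example_area = 100_000.0
print(round(internal_distance(example_area), 1))
```

Because the measure grows with the square root of the area, a country with four times the area gets twice the internal distance.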


Destination country

| Variable | Description | Source |
|---|---|---|
| WDI_GDPUSDCR_D | Destination country GDP data; in current US dollars | World Bank - World Development Indicators |
| WDI_GDPPCUSDCR_D | Destination country GDP per capita data; in current US dollars | World Bank - World Development Indicators |
| WEO_GDPCR_D | Destination and origin country GDP data; in current US dollars | IMF - World Economic Outlook database |
| WEO_GDPPCCR_D | Destination and origin country GDP per capita data; in current US dollars | IMF - World Economic Outlook database |
| WEO_POP_D | Destination country population data | IMF - World Economic Outlook database |

Notes: The EBRD measures reform on a scale between 1 and 4+ (=4.33): 1 represents no or little progress; 2 indicates important progress; 3 is substantial progress; 4 indicates comprehensive progress; and 4+ indicates that countries have reached the standards and performance norms of advanced industrial countries, i.e., of OECD countries. By construction, this variable is ordinal (qualitative but ordered) rather than cardinal.
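Since the EBRD score is ordinal, the distance between two levels carries no quantitative meaning. One common way to respect this in a regression is to enter the score as a set of dummy variables instead of as a single numeric regressor. The sketch below illustrates that idea under stated assumptions: the helper name, the choice of the lowest level as baseline, and the example scores are all hypothetical and are not taken from the data set.

```python
def ordinal_dummies(scores, baseline=None):
    """Return {level: [0/1, ...]} indicator columns for every level except the baseline."""
    levels = sorted(set(scores))
    if baseline is None:
        baseline = levels[0]  # assumption: the lowest observed level is the reference
    return {
        level: [1 if s == level else 0 for s in scores]
        for level in levels
        if level != baseline
    }


# Hypothetical scores for five countries; 4.33 encodes the level "4+".
scores = [1, 3, 4.33, 3, 2]
dummies = ordinal_dummies(scores)
print(dummies)
```

Each dummy coefficient is then interpreted relative to the baseline level, without assuming that a step from 1 to 2 has the same effect as a step from 3 to 4.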

• Thanks to Richard Frensch, Osteuropa-Institut, for providing the data set.


• EViews commands to extract selected data from the main workfile:

– To select observations of countries that export to Kazakhstan: in the workfile, choose Proc → Copy/Extract from Current Page → By Value to New Page or Workfile. In “Sample - observations to copy” enter: @all if (iso_d="KAZ"). Objects to copy: select. Page Destination: select.

– To select observations for one period, e.g. 2004: as above, but in “Sample - observations to copy” enter: 2004 2004

– To select observations for trade flows from Germany to Kazakhstan for all periods: as above, but in “Sample - observations to copy” enter: @all if (iso_o="GER") and (iso_d="KAZ")

• Websites CEPII
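The EViews extraction steps above can be mirrored outside EViews with a simple row filter. The following is a minimal sketch in plain Python, assuming the data have been exported as records with the fields year, iso_o, iso_d and trade_0_d_o. The lowercase field names, the helper `select`, and all numbers are hypothetical and are not taken from the data set. Since the trade variable records imports of d from o, a flow from Germany to Kazakhstan has origin GER and destination KAZ.

```python
# Hypothetical records; the values are made up for illustration only.
rows = [
    {"year": 2003, "iso_o": "GER", "iso_d": "KAZ", "trade_0_d_o": 100.0},
    {"year": 2004, "iso_o": "GER", "iso_d": "KAZ", "trade_0_d_o": 120.0},
    {"year": 2004, "iso_o": "CHN", "iso_d": "KAZ", "trade_0_d_o": 300.0},
    {"year": 2004, "iso_o": "KAZ", "iso_d": "GER", "trade_0_d_o": 80.0},
]


def select(rows, **conditions):
    """Keep the rows whose fields equal all of the given conditions."""
    return [r for r in rows if all(r[k] == v for k, v in conditions.items())]


# Countries that export to Kazakhstan (destination is KAZ).
to_kaz = select(rows, iso_d="KAZ")

# Observations for one period only.
year_2004 = select(rows, year=2004)

# Flows from Germany (origin) to Kazakhstan (destination), all periods.
ger_to_kaz = select(rows, iso_o="GER", iso_d="KAZ")

print(len(to_kaz), len(year_2004), len(ger_to_kaz))
```

Passing several keyword conditions combines them with a logical "and", which corresponds to joining conditions with `and` in the EViews sample expression.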

