2014 02 03 and 05 econ 141 uc berkeley

Ch.4: Simple Linear Regression

Econ 141 Spring 2014

Lecture: February 03 and 05, 2014

Bart Hobijn

2/03&05/2014 Econ 141, Spring 2014 1

The views expressed in these lecture notes are solely those of the instructor and do not necessarily reflect those of UC Berkeley or other institutions with which he is affiliated.

Example: Estimate MPC

• MPC: Marginal Propensity to Consume

If households' pre-tax income increases by a dollar, what fraction of that dollar would they end up spending, rather than paying in taxes or saving?

Example: Estimate MPC

• Basic equation

$Y_i = \beta_0 + \beta_1 X_i + u_i$

– $i$: MSA (metropolitan statistical area), the unit of observation
– $Y_i$: average consumption expenditures per household
– $X_i$: average pre-tax income per household
– $\beta_0$: average consumption level at zero income
– $\beta_1$: marginal propensity to consume (MPC)
– $u_i$: MSA-specific deviation from the average linear relationship between income and spending

• How can we estimate the value of the MPC, i.e. $\beta_1$?

Income and spending by MSA

MSA ($i$)               Spending ($Y_i$)   Income ($X_i$)
Chicago                 57.7               74.4
Detroit                 50.5               79.8
Minneapolis-St. Paul    56.7               66.8
Cleveland               48.0               65.9
New York                58.7               80.2
Philadelphia            53.5               71.7
Boston                  65.0               79.8
Washington, D.C.        77.9               111.9
Baltimore               62.3               96.9
Atlanta                 51.9               71.2
Miami                   40.6               58.9
Dallas-Fort Worth       57.1               71.0
Houston                 58.2               73.5
Los Angeles             55.3               69.6
San Francisco           73.6               98.2
San Diego               56.2               76.4
Seattle                 60.7               74.1
Phoenix                 53.7               63.2

Note: Spending and income are annual averages across households, in thousands of dollars.

Source: Consumer Expenditure Survey, http://www.bls.gov/cex/csxmsa.htm



Data scatterplot

[Figure: "Average Income and Expenditures by major MSA". Scatterplot of expenditures (vertical axis, 30 to 90) against income (horizontal axis, 50 to 120), with one labeled point per MSA. Annual income and expenditures by household; thousands of dollars; 2012. Source: Consumer Expenditure Survey by MSA.]

Estimate of MPC ($\beta_1$)?

[Figure: the same scatterplot of expenditures against income.]

What is the best estimate of the line defined by $\beta_0$ and $\beta_1$?

Ordinary Least Squares

Simple linear regression model

𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖

– 𝑖 observation number

– 𝑌𝑖 dependent variable (regressand)

– 𝑋𝑖 independent (explanatory) variable (regressor)

– 𝛽0 intercept / constant

– 𝛽1 slope coefficient

– 𝑢𝑖 error term / residual

Population regression line / population regression function: $\beta_0 + \beta_1 X_i$, the average linear relationship between the dependent and the independent variable.

Error term / residual: $u_i$, the observation-specific deviation from the average linear relationship between the dependent and the independent variable.

Why we need to estimate

Observed: sample $(Y_i, X_i)$ for $i = 1, \dots, n$.

Unobserved: parameters $\beta_0$ and $\beta_1$, as well as error terms $u_i$ for $i = 1, \dots, n$.

Ordinary Least Squares (OLS)

• OLS estimates: choose $\hat\beta_0$ and $\hat\beta_1$ to minimize the sum of squared residuals (SSR)

$$(\hat\beta_0, \hat\beta_1) = \arg\min_{b_0, b_1} \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$$

• Properties:
– What is the solution for $\hat\beta_0$, $\hat\beta_1$?
– Are $\hat\beta_0$, $\hat\beta_1$ consistent estimates of the true $\beta_0$ and $\beta_1$?
– Are $\hat\beta_0$, $\hat\beta_1$ unbiased estimates of the true $\beta_0$ and $\beta_1$?
– What is their asymptotic distribution?

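Solution for $\hat\beta_0$, $\hat\beta_1$

First-order necessary condition for $\hat\beta_0$

$$
\begin{aligned}
0 &= \frac{\partial}{\partial \hat\beta_0} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2 \\
&= \sum_{i=1}^{n} \frac{\partial}{\partial \hat\beta_0} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2 \\
&= -2 \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i) \\
\Rightarrow\quad 0 &= \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i).
\end{aligned}
$$

Solving for $\hat\beta_0$

$$0 = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i) = \bar Y - \hat\beta_0 - \hat\beta_1 \bar X$$

such that

$$\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$$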


First-order necessary condition for $\hat\beta_1$

Substituting $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$ in the second line:

$$
\begin{aligned}
0 &= \frac{\partial}{\partial \hat\beta_1} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2 \\
&= \frac{\partial}{\partial \hat\beta_1} \sum_{i=1}^{n} \left( Y_i - \bar Y - \hat\beta_1 (X_i - \bar X) \right)^2 \\
&= -2 \sum_{i=1}^{n} (X_i - \bar X) \left( Y_i - \bar Y - \hat\beta_1 (X_i - \bar X) \right) \\
\Rightarrow\quad 0 &= \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \left( Y_i - \bar Y - \hat\beta_1 (X_i - \bar X) \right) \\
&= \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y) - \hat\beta_1 \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 \\
&= s_{XY} - \hat\beta_1 s_X^2.
\end{aligned}
$$

The fourth line implies that

$$0 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \hat\beta_0 - \hat\beta_1 X_i)$$

and the last line gives

$$\hat\beta_1 = \frac{s_{XY}}{s_X^2}$$

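Simple linear regression with OLS

• OLS estimators of $\beta_0$ and $\beta_1$:

– $\hat\beta_1 = \dfrac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2} = \dfrac{s_{XY}}{s_X^2}$

– $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$

• Derived estimates for each $i = 1, \dots, n$

– $\hat Y_i = \hat\beta_0 + \hat\beta_1 X_i$, predicted/fitted value of $Y_i$

– $\hat u_i = Y_i - \hat Y_i$, residual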
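The closed-form estimators can be tried directly on the MSA table. The following is a minimal sketch in plain Python, not the instructor's own computation; the function name `ols_fit` and the variable names are illustrative. Because the table values are rounded to one decimal, the estimates may differ slightly from those reported elsewhere in these notes (a slope of roughly 0.56).

```python
# (spending Y_i, income X_i) per MSA, in thousands of dollars, from the table.
data = {
    "Chicago": (57.7, 74.4), "Detroit": (50.5, 79.8),
    "Minneapolis-St. Paul": (56.7, 66.8), "Cleveland": (48.0, 65.9),
    "New York": (58.7, 80.2), "Philadelphia": (53.5, 71.7),
    "Boston": (65.0, 79.8), "Washington, D.C.": (77.9, 111.9),
    "Baltimore": (62.3, 96.9), "Atlanta": (51.9, 71.2),
    "Miami": (40.6, 58.9), "Dallas-Fort Worth": (57.1, 71.0),
    "Houston": (58.2, 73.5), "Los Angeles": (55.3, 69.6),
    "San Francisco": (73.6, 98.2), "San Diego": (56.2, 76.4),
    "Seattle": (60.7, 74.1), "Phoenix": (53.7, 63.2),
}


def ols_fit(x, y):
    """Return (beta0_hat, beta1_hat) using beta1 = s_XY / s_X^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / n
    beta1 = s_xy / s_xx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


y = [spending for spending, _ in data.values()]
x = [income for _, income in data.values()]

beta0_hat, beta1_hat = ols_fit(x, y)

# Fitted values and residuals for every MSA.
fitted = [beta0_hat + beta1_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print(f"intercept: {beta0_hat:.3f}, slope (MPC): {beta1_hat:.3f}")
```

The residuals sum to zero up to floating-point error, which is the first-order condition for $\hat\beta_0$ at work.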

Estimating the MPC

[Figure: scatterplot of expenditures against income with the estimated regression line; San Francisco highlighted.]

Estimated MPC out of pre-tax income is 56 cents on the dollar.

SF predicted value and residual

[Figure: the same scatterplot, marking San Francisco's actual expenditures (73.6) and the value predicted by the regression line (69.6).]

Expenditures in SF are higher than predicted by the regression.

Residual: the gap between actual (73.6) and predicted (69.6) expenditures.
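As a worked illustration of the San Francisco numbers, the calculation below uses the rounded coefficient estimates from the Excel regression output reported in these notes (intercept about 14.70, slope about 0.559) together with San Francisco's income of 98.2 from the table. The subscript $SF$ is notation introduced here for convenience.

$$
\begin{aligned}
\hat Y_{SF} &= \hat\beta_0 + \hat\beta_1 X_{SF} \approx 14.70 + 0.559 \times 98.2 \approx 69.6 \\
\hat u_{SF} &= Y_{SF} - \hat Y_{SF} \approx 73.6 - 69.6 = 4.0
\end{aligned}
$$

So average household spending in San Francisco is roughly 4 thousand dollars above what the regression line predicts for its income level.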

Goodness of fit

• Main question

What fraction of the variance of the dependent variable, $Y_i$, is explained by the regression line rather than unexplained? (Unexplained means part of the residuals.)

• Variance accounting

– $TSS = \sum_{i=1}^{n} (Y_i - \bar Y)^2$, total sum of squares

– $ESS = \sum_{i=1}^{n} (\hat Y_i - \bar Y)^2$, explained sum of squares, a.k.a. model sum of squares

– $SSR = \sum_{i=1}^{n} (Y_i - \hat Y_i)^2 = \sum_{i=1}^{n} \hat u_i^2$, sum of squared residuals, a.k.a. residual sum of squares

TSS = ESS + SSR decomposition

$$
\begin{aligned}
TSS &= \sum_{i=1}^{n} (Y_i - \bar Y)^2 \\
&= \sum_{i=1}^{n} \left( \hat Y_i - \bar Y + Y_i - \hat Y_i \right)^2 \\
&= \sum_{i=1}^{n} \left( \hat Y_i - \bar Y + \hat u_i \right)^2 \\
&= \sum_{i=1}^{n} \left( \hat\beta_0 + \hat\beta_1 X_i - \bar Y + \hat u_i \right)^2 \\
&= \sum_{i=1}^{n} \left( \bar Y - \hat\beta_1 \bar X + \hat\beta_1 X_i - \bar Y + \hat u_i \right)^2 \\
&= \sum_{i=1}^{n} \left( \hat\beta_1 (X_i - \bar X) + \hat u_i \right)^2 \\
&= \sum_{i=1}^{n} \left( \hat\beta_1 (X_i - \bar X) \right)^2 + 2 \hat\beta_1 \sum_{i=1}^{n} (X_i - \bar X) \hat u_i + \sum_{i=1}^{n} \hat u_i^2 \\
&= \sum_{i=1}^{n} (\hat Y_i - \bar Y)^2 + \sum_{i=1}^{n} \hat u_i^2 \\
&= ESS + SSR.
\end{aligned}
$$

The last steps use two facts:

– $\sum_{i=1}^{n} (X_i - \bar X) \hat u_i = 0$, according to the first-order necessary condition for $\hat\beta_1$, so the cross term drops out.

– $\hat\beta_1 (X_i - \bar X) = \hat Y_i - \bar Y$.
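The decomposition can be checked numerically. The sketch below recomputes the OLS fit on the MSA table values in plain Python and compares $TSS$ with $ESS + SSR$; variable names are illustrative, and since the table is rounded, the sums may differ slightly from values reported elsewhere in these notes.

```python
# Income (x) and spending (y) for the 18 MSAs, thousands of dollars,
# listed in the same order as the table.
x = [74.4, 79.8, 66.8, 65.9, 80.2, 71.7, 79.8, 111.9, 96.9,
     71.2, 58.9, 71.0, 73.5, 69.6, 98.2, 76.4, 74.1, 63.2]
y = [57.7, 50.5, 56.7, 48.0, 58.7, 53.5, 65.0, 77.9, 62.3,
     51.9, 40.6, 57.1, 58.2, 55.3, 73.6, 56.2, 60.7, 53.7]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates from the closed-form solution.
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar

y_hat = [beta0_hat + beta1_hat * xi for xi in x]
u_hat = [yi - fi for yi, fi in zip(y, y_hat)]

tss = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
ess = sum((fi - y_bar) ** 2 for fi in y_hat)  # explained sum of squares
ssr = sum(ui ** 2 for ui in u_hat)            # sum of squared residuals

# Cross term that the first-order condition for beta1_hat sets to zero.
cross = sum((xi - x_bar) * ui for xi, ui in zip(x, u_hat))

print(f"TSS = {tss:.1f}, ESS + SSR = {ess + ssr:.1f}, cross term = {cross:.2e}")
```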

Goodness of fit for MPC regression

[Figure: scatterplot of expenditures against income with the estimated regression line; San Francisco highlighted.]

$R^2$: measure of goodness of fit

Measure   Equation                                   Value     Share (percentage)
ESS       $\sum_{i=1}^{n} (\hat Y_i - \bar Y)^2$     944.0     74.9
SSR       $\sum_{i=1}^{n} \hat u_i^2$                316.4     25.1
TSS       $\sum_{i=1}^{n} (Y_i - \bar Y)^2$          1260.3    100.0

$R^2$: fraction of the variation in the dependent variable, i.e. of $TSS$, explained by the regression line. $R^2 = ESS / TSS = 0.749$.

Standard error of the regression

$SER = s_{\hat u}$, where

$$s_{\hat u}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat u_i^2$$

is an unbiased estimate of the variance of the residuals, $\mathrm{var}(u_i)$.

• Degrees of freedom correction: if we had only two observations we would be able to perfectly fit a straight line and the residuals would be zero.

• Measure of spread around the regression line: estimate of the standard deviation of the deviations from the regression line.

MPC regression in Excel

[Figure: scatterplot of expenditures against income with the estimated regression line; San Francisco highlighted.]

SUMMARY OUTPUT

Dependent variable: Expenditures

Regression Statistics
Multiple R           0.865434015
R Square             0.748976034
Adjusted R Square    0.733287037
Standard Error       4.446724217
Observations         18

ANOVA
             df    SS             MS             F              Significance F
Regression   1     943.9589518    943.9589518    47.73893411    3.51406E-06
Residual     16    316.3737002    19.77335626
Total        17    1260.332652

             Coefficients   Standard Error   t Stat         P-value        Lower 95%      Upper 95%     Lower 95.0%   Upper 95.0%
Intercept    14.69690817    6.304502449      2.331176535    0.033146025    1.331960021    28.061856     1.33196002    28.06185632
Income       0.558872878    0.080886618      6.909336734    3.51406E-06    0.387400909    0.7303448     0.38740091    0.730344847

Example Excel regression output: where to find each quantity

– Intercept row, Coefficients column: $\hat\beta_0$, estimated intercept
– Income row, Coefficients column: $\hat\beta_1$, estimated slope
– Observations: $n$, sample size
– ANOVA, SS column: the Regression row is $ESS$ (explained sum of squares), the Residual row is $SSR$ (sum of squared residuals), and the Total row is $TSS$ (total sum of squares)
– R Square: $R^2$
– ANOVA, MS column, Residual row: $s_{\hat u}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat u_i^2$
– Standard Error under Regression Statistics: $SER = s_{\hat u}$
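The relationships between the cells of the Excel output can be verified with a few lines of arithmetic. The sketch below simply copies the reported sums of squares and the sample size from the output and recomputes $R^2$, $s_{\hat u}^2$ and $SER$ from their definitions; variable names are illustrative.

```python
import math

# Values copied from the Excel SUMMARY OUTPUT.
n = 18
ess = 943.9589518   # ANOVA, SS column, Regression row
ssr = 316.3737002   # ANOVA, SS column, Residual row
tss = 1260.332652   # ANOVA, SS column, Total row

r_squared = ess / tss        # should match "R Square"
s_u_sq = ssr / (n - 2)       # should match MS of the Residual row
ser = math.sqrt(s_u_sq)      # should match "Standard Error"

print(f"R^2 = {r_squared:.6f}")
print(f"s_u^2 = {s_u_sq:.6f}")
print(f"SER = {ser:.6f}")
```

Note that the degrees-of-freedom column for the Residual row (16) is exactly the $n - 2$ used in the correction.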

Why use OLS?

• Most common estimation method
– Implemented in many different applications
– Most common methodology, and thus important to understand

• OLS has very desirable properties

Under relatively general conditions, OLS estimates
– are consistent
– are unbiased
– have a tractable asymptotic distribution

Conditions listed in book

$Y_i = \beta_0 + \beta_1 X_i + u_i$, where $i = 1, \dots, n$

• No information in $X_i$ about $u_i$

$$E[u_i] = E[u_i \mid X_i] = 0$$

Suppose not, and instead $E[u_i \mid X_i] = \gamma X_i$. Then we can write

$$
\begin{aligned}
Y_i &= \beta_0 + \beta_1 X_i + u_i \\
&= \beta_0 + \beta_1 X_i + \gamma X_i + u_i - \gamma X_i \\
&= \beta_0 + (\beta_1 + \gamma) X_i + (u_i - \gamma X_i) \\
&= \beta_0 + \tilde\beta_1 X_i + \tilde u_i
\end{aligned}
$$

where

$$E[\tilde u_i \mid X_i] = E[u_i - \gamma X_i \mid X_i] = E[u_i \mid X_i] - \gamma X_i = 0.$$

So, in this case there is an alternative representation of the linear regression line with a different slope parameter, $\tilde\beta_1 = \beta_1 + \gamma$, that satisfies this assumption.

Note that the condition $E[u_i \mid X_i] = 0$ implies that

$$E[u_i X_i] = E\left[ X_i \, E[u_i \mid X_i] \right] = 0$$

which, given $E[u_i] = 0$, implies that

$$\mathrm{cov}(X_i, u_i) = E[u_i X_i] - E[u_i] \, E[X_i] = 0$$
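The "suppose not" argument can be illustrated with a small simulation. This is a sketch under assumed, hypothetical values (`beta0`, `beta1`, `gamma`, normal draws, and the sample size are all made up for illustration): when $E[u_i \mid X_i] = \gamma X_i$, the OLS slope settles near $\beta_1 + \gamma$ rather than $\beta_1$.

```python
import random

random.seed(0)

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1, gamma = 1.0, 0.5, 0.3
n = 20000

x = [random.gauss(0.0, 2.0) for _ in range(n)]
e = [random.gauss(0.0, 1.0) for _ in range(n)]   # noise with E[e | X] = 0
u = [gamma * xi + ei for xi, ei in zip(x, e)]    # so E[u | X] = gamma * X
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]


def ols_slope(x, y):
    """OLS slope estimate s_XY / s_X^2."""
    m = len(x)
    x_bar = sum(x) / m
    y_bar = sum(y) / m
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


slope = ols_slope(x, y)
print(f"OLS slope: {slope:.3f}; beta1 = {beta1}, beta1 + gamma = {beta1 + gamma}")
```

In other words, OLS recovers the slope of the representation that satisfies the assumption, which is not the $\beta_1$ of the original equation.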





• $(X_i, Y_i)$ well-behaved random variables

– $(X_i, Y_i)$, $i = 1, \dots, n$, are independently drawn from an identical joint distribution.

– Large outliers are unlikely.

These last two assumptions are made so that we can apply the LLN and CLT to derive properties of $\hat\beta_0$ and $\hat\beta_1$.

Properties of $\hat\beta_0$ and $\hat\beta_1$

Properties of $\hat\beta_0$ and $\hat\beta_1$ are derived by manipulating the first-order necessary conditions for $\hat\beta_0$ and $\hat\beta_1$ derived earlier:

$$0 = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i) = \frac{1}{n} \sum_{i=1}^{n} \hat u_i$$

and

$$0 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \hat\beta_0 - \hat\beta_1 X_i) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \hat u_i$$

These are sample approximations of the condition

$$E[u_i] = E[u_i X_i] = 0$$

Consistency of $\hat\beta_1$

Using $Y_i = \beta_0 + \beta_1 X_i + u_i$ and $\bar Y = \beta_0 + \beta_1 \bar X + \bar u$:

$$
\begin{aligned}
0 &= \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \hat u_i \\
&= \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \left( Y_i - \bar Y - \hat\beta_1 (X_i - \bar X) \right) \\
&= \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \left( \beta_0 + \beta_1 X_i + u_i - \beta_0 - \beta_1 \bar X - \bar u - \hat\beta_1 (X_i - \bar X) \right) \\
&= \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \left( (\beta_1 - \hat\beta_1)(X_i - \bar X) + (u_i - \bar u) \right) \\
&= (\beta_1 - \hat\beta_1) \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 + \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u).
\end{aligned}
$$

As $n$ grows, the two sample averages converge in probability:

$$\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 \xrightarrow{p} \mathrm{var}(X_i) > 0, \qquad \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u) \xrightarrow{p} \mathrm{cov}(X_i, u_i) = 0.$$

In the limit the equation therefore reads $0 = (\beta_1 - \hat\beta_1) \, \mathrm{var}(X_i)$, such that

$$\beta_1 - \hat\beta_1 \xrightarrow{p} 0, \quad \text{that is,} \quad \hat\beta_1 \xrightarrow{p} \beta_1.$$

As the sample size $n$ gets arbitrarily large, i.e. $n \to \infty$, our estimate of the slope coefficient, $\hat\beta_1$, gets arbitrarily close to the true parameter value $\beta_1$ from the population regression line.

But, in real life, $n$ is finite.
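Consistency can be seen in a simulation. The following is a sketch under assumed conditions (hypothetical true parameters, standard normal $X_i$ and $u_i$ drawn independently, an arbitrary seed): the estimation error $|\hat\beta_1 - \beta_1|$ tends to shrink as $n$ grows, although for any single draw it need not fall at every step.

```python
import random


def ols_slope(x, y):
    """OLS slope estimate s_XY / s_X^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slope(n, rng, beta0=2.0, beta1=0.5):
    """Draw one i.i.d. sample of size n with E[u | X] = 0 and estimate the slope."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
    return ols_slope(x, y)


rng = random.Random(141)   # arbitrary seed for reproducibility
beta1_true = 0.5           # hypothetical true slope

errors = {}
for n in (10, 100, 1000, 10000, 100000):
    errors[n] = abs(simulate_slope(n, rng, beta1=beta1_true) - beta1_true)
    print(f"n = {n:>6}: |beta1_hat - beta1| = {errors[n]:.4f}")
```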

Small sample properties of OLS estimators

• Unbiasedness

On average, the OLS estimate equals the true parameter value of interest.

• Asymptotic distribution

The OLS assumptions imply that we can use the CLT to derive an asymptotic normal distribution of the OLS estimates, which can be used as an approximation when $n$ is big.

Unbiasedness of $\hat\beta_1$

$$0 = (\beta_1 - \hat\beta_1) \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 + \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)$$

such that

$$\hat\beta_1 = \beta_1 + \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2} = \beta_1 + \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2}$$

where the second equality holds because $\sum_{i=1}^{n} (X_i - \bar X) \bar u = 0$.

Taking expectations yields

$$
\begin{aligned}
E[\hat\beta_1] &= E[\beta_1] + E\left[ \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2} \right] \\
&= \beta_1 + E\left[ \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \, E[u_i - \bar u \mid X_1, \dots, X_n]}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2} \right] \\
&= \beta_1 + E\left[ \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \, E[u_i - \bar u \mid X_i]}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2} \right] \\
&= \beta_1 + E\left[ \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \cdot 0}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2} \right] \\
&= \beta_1.
\end{aligned}
$$

– Moving from conditioning on $X_1, \dots, X_n$ to conditioning on $X_i$ alone is where we apply the second condition (independent draws from an identical joint distribution).

– Setting the conditional expectation to zero is implied by the first condition, $E[u_i \mid X_i] = 0$.

Even when the sample size $n$ is not large, on average our estimate of the slope coefficient, $\hat\beta_1$, will equal the true parameter value $\beta_1$ from the population regression line. $E[\hat\beta_1] = \beta_1$. Thus, $\hat\beta_1$ is unbiased.

Note that the first condition, that

$$E[u_i - \bar u \mid X_i] = 0,$$

is crucial for OLS to be unbiased. If this condition is not true, the average OLS estimate will deviate from the true parameter value.
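Unbiasedness is a statement about the average of $\hat\beta_1$ over repeated samples, which a Monte Carlo experiment can mimic. The sketch below uses assumed, hypothetical settings (true parameters, standard normal draws, a deliberately small $n = 10$, and 5000 replications): individual estimates vary a lot, but their average should land close to the true slope.

```python
import random


def ols_slope(x, y):
    """OLS slope estimate s_XY / s_X^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


rng = random.Random(2014)             # arbitrary seed
beta0_true, beta1_true = 2.0, 0.5     # hypothetical true parameters
n, replications = 10, 5000            # small sample, many repeated samples

estimates = []
for _ in range(replications):
    # Each replication draws a fresh i.i.d. sample with E[u | X] = 0.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta0_true + beta1_true * xi + rng.gauss(0.0, 1.0) for xi in x]
    estimates.append(ols_slope(x, y))

mean_estimate = sum(estimates) / replications
print(f"average beta1_hat over {replications} samples: {mean_estimate:.3f}")
print(f"range of single estimates: {min(estimates):.2f} to {max(estimates):.2f}")
```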

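Asymptotic distribution of $\hat\beta_1$

From the unbiasedness derivation:

$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(u_i - \bar u)}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2} = \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2}$$

3 steps to deriving the asymptotic distribution

1. Apply CLT to numerator

2. Apply LLN to denominator

3. Combine using Slutsky's theorem (S&W page 676)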

Apply CLT to $\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i$

• Define the random variable

$$v_i = (X_i - \bar X) u_i$$

• Condition 1 ($E[u_i \mid X_i] = 0$): $E[v_i] = 0$

• Conditions 2 & 3 (i.i.d. draws, large outliers unlikely) imply that

– $\mathrm{var}(v_i)$ exists and is finite.

– CLT applies to the sample mean of $v_i$,

$$\bar v = \frac{1}{n} \sum_{i=1}^{n} v_i$$

• Note that: $\hat\beta_1 - \beta_1 = \bar v / s_X^2$

Apply the Central Limit Theorem

$$Z = \frac{\bar v - E[v_i]}{\sqrt{\mathrm{var}(v_i) / n}} = \frac{\bar v}{\sqrt{\mathrm{var}(v_i) / n}} \xrightarrow{d} N(0, 1)$$

such that

$$\bar v \xrightarrow{d} N\left( 0, \frac{\mathrm{var}(v_i)}{n} \right)$$

Thus, the numerator of

$$\hat\beta_1 - \beta_1 = \frac{\bar v}{s_X^2}$$

has an asymptotic distribution that is normal with a mean equal to zero.

Apply LLN to $s_X^2$

• Conditions 2 & 3 (i.i.d. draws, large outliers unlikely) imply that we can apply the Law of Large Numbers to $s_X^2$ and that $s_X^2$ converges in probability to the variance of $X_i$, i.e. to $\mathrm{var}(X_i)$.

• Thus

$$s_X^2 \xrightarrow{p} \mathrm{var}(X_i)$$

• Now that we know the asymptotic behavior of the numerator and denominator of $\hat\beta_1 - \beta_1 = \bar v / s_X^2$, the only thing left is to combine them.

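Apply Slutsky's theorem (S&W page 676)

Slutsky's theorem implies that we can combine the asymptotic properties of

$$\bar v \xrightarrow{d} N\left( 0, \frac{\mathrm{var}(v_i)}{n} \right)$$

and

$$s_X^2 \xrightarrow{p} \mathrm{var}(X_i)$$

such that

$$\hat\beta_1 - \beta_1 = \frac{\bar v}{s_X^2} \xrightarrow{d} N\left( 0, \frac{\mathrm{var}(v_i) / n}{[\mathrm{var}(X_i)]^2} \right)$$

and thus

$$\hat\beta_1 = \beta_1 + \frac{\bar v}{s_X^2} \xrightarrow{d} N\left( \beta_1, \frac{\mathrm{var}(v_i) / n}{[\mathrm{var}(X_i)]^2} \right)$$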

$\hat\beta_1$ has tractable asymptotic distribution

Thus, as the sample size $n$ gets large, the estimated slope coefficient has approximately a normal distribution, such that

$$\hat\beta_1 \sim N\left( \mu_{\hat\beta_1}, \sigma_{\hat\beta_1}^2 \right)$$

where

$$\mu_{\hat\beta_1} = \beta_1$$

$$\sigma_{\hat\beta_1}^2 = \frac{\mathrm{var}(v_i) / n}{[\mathrm{var}(X_i)]^2} = \frac{1}{n} \frac{\mathrm{var}(v_i)}{[\mathrm{var}(X_i)]^2}$$

Remember: $E[\hat\beta_1] = \beta_1$. Thus, $\hat\beta_1$ is unbiased.

Remember: $\sigma_{\hat\beta_1}^2 \to 0$ as $n \to \infty$. Thus, $\hat\beta_1$ is consistent.

The term $\mathrm{var}(v_i) / [\mathrm{var}(X_i)]^2$ is like a noise-to-signal ratio.

– The numerator is related to the variance of the residual.

– The denominator is related to the variance of the explanatory variable.

The larger the variation in the explanatory variable relative to the residual, the more accurate the OLS estimate.
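The variance formula can be compared with the spread of $\hat\beta_1$ across simulated samples. This is a sketch under assumed conditions, not part of the original derivation: $X_i$ and $u_i$ are independent normal draws with hypothetical standard deviations `sigma_x` and `sigma_u`, in which case $\mathrm{var}(v_i) = \mathrm{var}(X_i) \, \mathrm{var}(u_i)$ when $v_i$ is centered at the population mean of $X_i$.

```python
import random


def ols_slope(x, y):
    """OLS slope estimate s_XY / s_X^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


rng = random.Random(5)                # arbitrary seed
beta0_true, beta1_true = 2.0, 0.5     # hypothetical true parameters
sigma_x, sigma_u = 1.0, 1.0           # hypothetical standard deviations
n, replications = 200, 2000

estimates = []
for _ in range(replications):
    x = [rng.gauss(0.0, sigma_x) for _ in range(n)]
    y = [beta0_true + beta1_true * xi + rng.gauss(0.0, sigma_u) for xi in x]
    estimates.append(ols_slope(x, y))

# Monte Carlo variance of the slope estimates across samples.
mean_est = sum(estimates) / replications
mc_var = sum((b - mean_est) ** 2 for b in estimates) / (replications - 1)

# Asymptotic formula: (1/n) * var(v_i) / var(X_i)^2, with
# var(v_i) = var(X_i) * var(u_i) under the independence assumed here.
var_v = sigma_x ** 2 * sigma_u ** 2
asy_var = var_v / (n * (sigma_x ** 2) ** 2)

ratio = mc_var / asy_var
print(f"Monte Carlo variance: {mc_var:.5f}")
print(f"asymptotic formula:   {asy_var:.5f}")
print(f"ratio:                {ratio:.3f}")
```

Raising `sigma_x` relative to `sigma_u` lowers both variances, which is the noise-to-signal interpretation in action.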

Virtues of asymptotic normality

• Asymptotic normality of OLS coefficients allows us to

– Do hypothesis tests

– Calculate confidence intervals

• Simple generalizations of the same techniques applied to the population mean.

• Chapter 5! Next week.

Summary

• Linear regression model
• OLS estimators
• $R^2$ and $SER$
• OLS conditions
• Consistency of $\hat\beta_1$
• Unbiasedness of $\hat\beta_1$
• Asymptotic distribution of $\hat\beta_1$

• Even-numbered problems: 4.2, 4.4, 4.6, 4.10, 4.12, 4.14
• Study STATA tutorial
• Empirical exercises next week.

http://wps.aw.com/aw_stock_ie_2/50/13016/3332253.cw/