
ECON 351 - The Simple Regression Model

Maggie Jones


The Simple Regression Model

• Our starting point will be the simple regression model, where we look at the relationship between two variables

• In general, more complicated econometric models are used for empirical analysis, but this provides a good starting point

• Suppose we have two variables, x and y, and we are interested in the relationship between the two

• Specifically, we care about the question, “how does x affect y?”

• Typically, we don’t observe the full population of y or the full population of x, so we can think of y and x as random samples


The Simple Regression Model

• In determining the relationship between x and y, we should keep three questions in mind:

1 How do we allow for factors other than x that might affect y?

2 What is the functional relationship between x and y?

3 How can we be certain we are capturing the ceteris paribus relationship between x and y?

• We resolve these questions by writing down an equation relating y to x


The Simple Regression Model

y = β0 + β1x + u (1)

• We call equation 1 the simple linear regression model

• y is called the dependent variable

• x is called the independent variable

• u is called the error term; it represents everything else that helps to explain y but is not contained in x


The Simple Regression Model

• Equation 1 assumes a linear functional form, i.e. it assumes that the relationship between x and y is linear

• β0 is the intercept term/parameter

• β1 is the slope parameter - it measures the effect of x on y, holding all other factors constant:

∆y = β1∆x + ∆u (since β0 is constant), so holding all else fixed (∆u = 0), ∆y = β1∆x

• Note: in what instances would a linear functional form be a poor choice?



More on the Error Term

• As long as β0 is included in the equation, we can assume that the average value of u in the population is zero

E(u) = 0 (2)

• A crucial assumption is that the average value of u does not depend on x; this is known as mean independence

E(u|x) = E(u) (3)

• Combining equations 2 and 3 yields one of the most important assumptions in regression analysis, the zero conditional mean assumption

E(u|x) = 0 (4)



The Simple Regression Model

• The zero conditional mean assumption gives β1 another interpretation

• Taking conditional expectations of equation 1 yields:

E(y|x) = β0 + β1x (5)

• which is known as the population regression function

• We interpret β1 as, “a 1 unit increase in x increases the expected value of y by β1 units”


The Simple Regression Model

• We can now reconsider equation 1

y = β0 + β1x (explained part) + u (unexplained part)

• y can be decomposed into:
• the explained part - the part of y explained by x
• the unexplained part - the part of y that cannot be explained by x


Ordinary Least Squares

• Now we can begin to discuss the way to estimate β0 and β1 given a random sample of y and x

• Let {(xi, yi) : i = 1, . . . , n} be a random sample of size n drawn from the population (x, y)

yi = β0 + β1xi + ui (6)

• How do we use the data to obtain parameter estimates of the population intercept and slope?


Ordinary Least Squares

• We begin with the zero conditional mean assumption of equation 4, which implies:

Cov(x, u) = E(ux) = 0 (7)

• And the zero mean assumption of equation 2

E(u) = 0 (8)

• These two equations are known as moment conditions


Ordinary Least Squares

• We then define u in terms of the simple regression equation and our moment conditions become

E(ux) = E[(y − β0 − β1x)x] = 0 (9)

• And the zero mean assumption of equation 2

E(u) = E(y − β0 − β1x) = 0 (10)


Ordinary Least Squares

• Given our sample of x and y, using the method of moments, we choose our parameter estimates β̂0 and β̂1 to solve the sample analogues of the two moment conditions

(1/n) Σ_{i=1}^n (yi − β̂0 − β̂1xi)xi = 0 (11)

(1/n) Σ_{i=1}^n (yi − β̂0 − β̂1xi) = 0 (12)
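The sample moment conditions 11 and 12 can be rearranged into a 2×2 linear system in (β̂0, β̂1) and solved directly. Below is a minimal NumPy sketch of that calculation, not part of the original slides; the data are simulated and the “true” parameter values (β0 = 1, β1 = 0.5) are made up for illustration.

```python
import numpy as np

# Simulated data with made-up true parameters beta0 = 1, beta1 = 0.5
rng = np.random.default_rng(0)
n = 200
x = rng.normal(5.0, 2.0, size=n)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, size=n)

# Equations 11 and 12 rearranged:
#   b0           + b1 * mean(x)   = mean(y)
#   b0 * mean(x) + b1 * mean(x^2) = mean(x*y)
A = np.array([[1.0,      x.mean()],
              [x.mean(), np.mean(x ** 2)]])
c = np.array([y.mean(), np.mean(x * y)])
b0_hat, b1_hat = np.linalg.solve(A, c)
print(b0_hat, b1_hat)  # should be close to 1 and 0.5
```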


Ordinary Least Squares

• Solving yields the parameter estimate for β0

β̂0 = ȳ − β̂1x̄ (13)

• And the estimate for β1

β̂1 = Σ_{i=1}^n (xi − x̄)(yi − ȳ) / Σ_{i=1}^n (xi − x̄)² (14)

• Equation 14 is actually just the sample covariance between x and y divided by the sample variance of x
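A quick numerical check of equations 13 and 14, again only a sketch on simulated data (not from the slides): the slope from equation 14 coincides with the sample covariance of x and y divided by the sample variance of x.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=200)
y = 1.0 + 0.5 * x + rng.normal(size=200)           # made-up true values

# Equation 14: slope estimate
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
# Equation 13: intercept estimate
b0_hat = y.mean() - b1_hat * x.mean()

# Same slope via sample covariance / sample variance (the (n-1) factors cancel)
b1_alt = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
print(b1_hat, b1_alt)                              # identical up to rounding
```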


Ordinary Least Squares

• The method of moments is not the only way to arrive at these equations for parameter estimates of β0 and β1

• The focus of Econ 351 will be on the method of Ordinary Least Squares

• Our estimates β̂0 and β̂1 are also called the ordinary least squares estimates


Ordinary Least Squares

• To see why, define a fitted value as the value of yi that we obtain from combining the sample xi with our parameter estimates, β̂0 and β̂1

ŷi = β̂0 + β̂1xi

• Define the residual as the difference between the actual value of yi and the fitted value ŷi

ûi = yi − ŷi = yi − β̂0 − β̂1xi
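A short illustrative sketch (simulated data, not from the slides) that computes the fitted values ŷi and residuals ûi from the OLS estimates:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2.0 + 0.7 * x + rng.normal(size=100)           # hypothetical data

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

y_hat = b0_hat + b1_hat * x                        # fitted values
u_hat = y - y_hat                                  # residuals u_hat_i = y_i - y_hat_i
```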


Ordinary Least Squares

…as small as possible. The appendix to this chapter shows that the conditions necessary for (β̂0, β̂1) to minimize (2.22) are given exactly by equations (2.14) and (2.15), without n⁻¹. Equations (2.14) and (2.15) are often called the first order conditions for the OLS estimates, a term that comes from optimization using calculus (see Appendix A). From our previous calculations, we know that the solutions to the OLS first order conditions are given by (2.17) and (2.19). The name “ordinary least squares” comes from the fact that these estimates minimize the sum of squared residuals.

Once we have determined the OLS intercept and slope estimates, we form the OLS regression line:

ŷ = β̂0 + β̂1x, (2.23)

where it is understood that β̂0 and β̂1 have been obtained using equations (2.17) and (2.19). The notation ŷ, read as “y hat,” emphasizes that the predicted values from equation (2.23) are estimates. The intercept, β̂0, is the predicted value of y when x = 0, although in some cases it will not make sense to set x = 0. In those situations, β̂0 is not, in itself, very interesting. When using (2.23) to compute predicted values of y for various values of x, we must account for the intercept in the calculations. Equation (2.23) is also called the sample regression function (SRF) because it is the estimated version of the population regression function E(y|x) = β0 + β1x. It is important to remember that the PRF is something fixed, but unknown, in the population. Since the SRF is…

[Figure 2.4: Fitted values and residuals. The OLS regression line ŷ = β̂0 + β̂1x, with the fitted value ŷi and residual ûi shown for each observation (Wooldridge, Chapter 2, p. 31).]


Ordinary Least Squares

• It seems reasonable to want parameter values that minimize the difference between the true yi and the fitted value ŷi

• Sometimes ûi will be positive and sometimes it will be negative, so in theory the sum of the residuals could equal zero even when the individual errors are large

• However, if we square the residuals before summing, we obtain a more accurate summary of the total error in the regression


Ordinary Least Squares

• Choosing parameter values for β0 and β1 that minimize the sum of squared residuals is the basic principle behind ordinary least squares

Σ_{i=1}^n ûi² = Σ_{i=1}^n (yi − β̂0 − β̂1xi)² (15)

• To minimize equation 15 we set the first order conditions with respect to each of the βs equal to zero
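As a sanity check, equation 15 can also be minimized numerically and compared with the closed-form estimates of equations 13 and 14. The sketch below does this on simulated data; scipy.optimize.minimize is used purely as a convenient general-purpose optimizer, which is an implementation choice, not something in the slides.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
x = rng.normal(size=150)
y = -1.0 + 2.0 * x + rng.normal(size=150)          # made-up true values

def ssr(params):
    """Sum of squared residuals, equation 15, as a function of (b0, b1)."""
    b0, b1 = params
    return np.sum((y - b0 - b1 * x) ** 2)

numerical = minimize(ssr, x0=[0.0, 0.0]).x          # numerical minimizer

b1_closed = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_closed = y.mean() - b1_closed * x.mean()
print(numerical, (b0_closed, b1_closed))            # the two should agree
```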


Ordinary Least Squares

• The fitted values and parameter values form the OLS regression line

ŷ = β̂0 + β̂1x (16)

• The slope estimate tells us the amount by which ŷ changes when x changes by one unit

β̂1 = ∆ŷ / ∆x


Useful Properties of OLS Estimates

1 The sum of the OLS residuals is zero

Σ_{i=1}^n ûi = 0

2 The sample covariance between x and û is zero

Σ_{i=1}^n xiûi = 0

3 The point (x̄, ȳ) is always on the OLS regression line
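These three properties follow from the first order conditions, so they hold in every sample up to rounding error. A small simulated check, included here only as an illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=500)
y = 0.5 + 1.5 * x + rng.normal(size=500)           # hypothetical data

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)

print(np.isclose(u_hat.sum(), 0.0))                 # property 1: residuals sum to zero
print(np.isclose(np.sum(x * u_hat), 0.0))           # property 2: zero sample covariance with x
print(np.isclose(y.mean(), b0 + b1 * x.mean()))     # property 3: (xbar, ybar) on the line
```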


Useful Properties of OLS Estimates

• Rewriting yi in terms of its fitted value and its residual is useful

yi = ŷi + ûi

• From here we see that:
• If (1/n) Σ_{i=1}^n ûi = 0, then ȳ = (1/n) Σ_{i=1}^n ŷi, i.e. the sample average of the fitted values equals the sample average of the yi
• The sample covariance of ŷi and ûi is zero

• OLS decomposes yi into two parts, a fitted value and a residual, which are uncorrelated with each other


Sum of Squares

1 Total Sum of Squares

SST = Σ_{i=1}^n (yi − ȳ)²

2 Explained Sum of Squares

SSE = Σ_{i=1}^n (ŷi − ȳ)²

3 Residual Sum of Squares

SSR = Σ_{i=1}^n (yi − ŷi)²


Sum of Squares

1 Total Sum of Squares: measures the total sample variation in the yi (measures how spread out the yi are in the sample)

2 Explained Sum of Squares: measures the sample variation in the fitted values, ŷi

3 Residual Sum of Squares: measures the sample variation in the residuals, ûi

• Note that the total variation can be expressed as the sum of the explained and unexplained variation:

SST = SSE + SSR


Goodness of Fit

• One of the most common ways to measure how “well” a regression fits the data is to use the R-squared

R² = SSE/SST = 1 − SSR/SST (17)

• It tells us the ratio of the explained variation to the total variation

• So if the majority of the variation in y is due to unobserved factors, the R² tends to be very low

• R² is always between 0 and 1
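A short sketch (simulated data, illustrative only) computing SST, SSE, SSR, and the R² of equation 17, and confirming the identity SST = SSE + SSR:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=300)
y = 1.0 + 0.8 * x + rng.normal(size=300)           # hypothetical data

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

sst = np.sum((y - y.mean()) ** 2)       # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)   # explained sum of squares
ssr = np.sum(u_hat ** 2)                # residual sum of squares

print(np.isclose(sst, sse + ssr))       # SST = SSE + SSR
r_squared = sse / sst                   # equivalently 1 - ssr/sst
print(r_squared)
```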


Notes on the R²

• A low R² does not necessarily mean that the regression is “bad” and shouldn’t be used

• It simply means that the variable x does not explain much of the variation in the variable y

• i.e. there are other variables that might help to explain y

• The regression may still provide an accurate summary of the relationship between x and y


Functional Form

• Level-Level: dependent and independent variables are in levels and related linearly

y = β0 + β1x + u

• Log-Level: dependent variable is in log form, independent variable in levels

log(y) = β0 + β1x + u

• Log-Log: dependent and independent variables are in log form - β1 can be interpreted as an elasticity

log(y) = β0 + β1 log(x) + u

• Level-Log: dependent variable is in levels and independent variable in log form

y = β0 + β1 log(x) + u


Functional Form

Model     Equation                       Y        X        Interpretation of β1
Lev-Lev   y = β0 + β1x + u               y        x        ∆y = β1∆x
Log-Lev   log(y) = β0 + β1x + u          log(y)   x        %∆y = (100β1)∆x
Log-Log   log(y) = β0 + β1 log(x) + u    log(y)   log(x)   %∆y = β1 %∆x
Lev-Log   y = β0 + β1 log(x) + u         y        log(x)   ∆y = (β1/100) %∆x
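To illustrate the Log-Log row, the sketch below simulates a constant-elasticity relationship (the elasticity of 1.2 is a made-up value) and recovers it as the slope of the log-log regression; this example is not from the slides.

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.exp(rng.normal(size=400))                   # strictly positive regressor
# Constant-elasticity data: log(y) = 0.3 + 1.2*log(x) + error
y = np.exp(0.3 + 1.2 * np.log(x) + 0.1 * rng.normal(size=400))

lx, ly = np.log(x), np.log(y)
b1 = np.sum((lx - lx.mean()) * (ly - ly.mean())) / np.sum((lx - lx.mean()) ** 2)
print(b1)   # close to 1.2: a 1% increase in x goes with roughly a 1.2% increase in y
```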


Unbiasedness of OLS

• Unbiasedness is a statistical property that we will examine in the context of our simple linear regression model

• We require four assumptions to establish the unbiasedness of the OLS estimators

• SLR.1 - Linear in Parameters: the model needs to be of the form y = β0 + β1x + u

• SLR.2 - Random Sampling: {(xi, yi) : i = 1, . . . , n} must be drawn as a random sample from the population

• SLR.3 - Variation in x: the sample outcomes on x are not all the same value

• SLR.4 - Zero Conditional Mean: our previous assumption E(u|x) = 0 holds


Unbiasedness of OLS

• Now consider rewriting β̂1 as

β̂1 = Σ_{i=1}^n (xi − x̄)yi / Σ_{i=1}^n (xi − x̄)²

• Recall from the review that an estimator is unbiased if its expectation equals the true parameter value

• Substituting in the regression equation for yi yields

β̂1 = Σ_{i=1}^n (xi − x̄)(β0 + β1xi + ui) / Σ_{i=1}^n (xi − x̄)²


Unbiasedness of OLS

• Which, cancelling terms that equal 0 (using Σ_{i=1}^n (xi − x̄) = 0 and Σ_{i=1}^n (xi − x̄)xi = Σ_{i=1}^n (xi − x̄)²), is

β̂1 = β1 + Σ_{i=1}^n (xi − x̄)ui / Σ_{i=1}^n (xi − x̄)²

• Checking unbiasedness (treating the xi as fixed, i.e. conditioning on the sample values of x):

E(β̂1) = β1 + E[ Σ_{i=1}^n (xi − x̄)ui / Σ_{i=1}^n (xi − x̄)² ]
       = β1 + (1 / Σ_{i=1}^n (xi − x̄)²) Σ_{i=1}^n (xi − x̄)E(ui)

• And since E(ui) = 0, we have:

E(β̂1) = β1


Unbiasedness of OLS

• Now to verify the unbiasedness of β̂0

β̂0 = ȳ − β̂1x̄
    = β0 + β1x̄ + ū − β̂1x̄

E(β̂0) = β0 + E(β1x̄) + E(ū) − E(β̂1x̄)
       = β0 + β1x̄ + 0 − β1x̄

E(β̂0) = β0

• So β̂0 is also unbiased under SLR.1 - SLR.4
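Unbiasedness is a statement about the average of the estimator across repeated samples, so it can be illustrated with a simple Monte Carlo experiment. The sketch below generates many samples satisfying SLR.1-SLR.4 (with made-up parameter values) and shows the average of β̂1 landing close to the true β1; it is a demonstration, not a proof.

```python
import numpy as np

rng = np.random.default_rng(6)
beta0, beta1, n, reps = 1.0, 0.5, 100, 5000
b1_draws = np.empty(reps)

for r in range(reps):
    x = rng.normal(5.0, 2.0, size=n)     # SLR.3: x varies in the sample
    u = rng.normal(0.0, 1.0, size=n)     # SLR.4: u drawn independently of x, E(u|x) = 0
    y = beta0 + beta1 * x + u            # SLR.1 and SLR.2 by construction
    b1_draws[r] = (np.sum((x - x.mean()) * (y - y.mean()))
                   / np.sum((x - x.mean()) ** 2))

print(b1_draws.mean())                   # close to the true beta1 = 0.5
```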


Variance of the OLS Estimate

• We also wish to know how far we can expect β̂1 to be from β1 on average

• We can compute the variance of the OLS estimators under assumptions SLR.1 - SLR.4, plus one additional assumption

• SLR.5 - Homoskedasticity: the error term has the same variance given any value of the explanatory variable

Var(u|x) = σ² for all x


Variance of the OLS Estimate

• Under SLR.1 - SLR.5, the variances of the OLS estimators are:

Var(β̂1) = σ² / Σ_{i=1}^n (xi − x̄)²

• And

Var(β̂0) = σ² ( (1/n) Σ_{i=1}^n xi² ) / Σ_{i=1}^n (xi − x̄)²


Estimating the Error Variance

• Typically, we don’t know the true value of σ², so we need to obtain an estimate of it

• The errors are never observed, but the regression residuals are

• Note that E(u²) = σ²

• Thus, an unbiased estimator of σ² is (1/n) Σ_{i=1}^n ui²

• However, we do not observe the errors ui; we only observe the residuals ûi


Estimating the Error Variance

• Replacing ui with ûi yields the estimator

σ̂² = (1/n) Σ_{i=1}^n ûi²

• However, this estimator is biased

• Recall the two restrictions from the first order conditions: Σ_{i=1}^n ûi = 0 and Σ_{i=1}^n xiûi = 0

• If we observed n − 2 of the residuals, we could always use the above conditions to “back out” the remaining two residuals


Estimating the Error Variance

• Our estimate of the error variance makes an adjustment for the degrees of freedom

σ̂² = (1/(n − 2)) Σ_{i=1}^n ûi² (18)

• Is σ̂² unbiased? Yes!
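A Monte Carlo sketch (simulated data, made-up parameter values) contrasting the estimator that divides by n with the degrees-of-freedom-adjusted estimator of equation 18; the former averages below the true σ², the latter does not.

```python
import numpy as np

rng = np.random.default_rng(7)
sigma2, n, reps = 4.0, 30, 10000
est_n = np.empty(reps)        # divides by n (biased)
est_n2 = np.empty(reps)       # divides by n - 2 (equation 18)

for r in range(reps):
    x = rng.normal(size=n)
    y = 1.0 + 0.5 * x + rng.normal(0.0, np.sqrt(sigma2), size=n)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    ssr = np.sum((y - b0 - b1 * x) ** 2)
    est_n[r] = ssr / n
    est_n2[r] = ssr / (n - 2)

print(est_n.mean(), est_n2.mean())   # roughly 4*(n-2)/n versus roughly 4
```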


Estimators of the OLS Parameter Variances

• We can use equation 18 in Var(β̂0) and Var(β̂1) to obtain estimates of the variances of β̂0 and β̂1

Var(β̂1) = ( (1/(n − 2)) Σ_{i=1}^n ûi² ) / Σ_{i=1}^n (xi − x̄)²

Var(β̂0) = ( (1/(n − 2)) Σ_{i=1}^n ûi² ) ( (1/n) Σ_{i=1}^n xi² ) / Σ_{i=1}^n (xi − x̄)²


Additional Notes on Variance Estimates

• We call the square root of the estimate of the variance of the errors the standard error of the regression

σ̂ = √σ̂²

• σ̂ is used to compute the standard error of β̂1

se(β̂1) = σ̂ / √( Σ_{i=1}^n (xi − x̄)² )
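Putting the pieces together, here is a sketch (simulated data, not from the slides) that computes σ̂² from equation 18 and then the standard errors of β̂1 and β̂0 from the variance formulas above; the same numbers could be cross-checked against any packaged OLS routine, though none is assumed here.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)             # hypothetical data

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x

sigma2_hat = np.sum(u_hat ** 2) / (n - 2)                        # equation 18
se_b1 = np.sqrt(sigma2_hat / np.sum((x - x.mean()) ** 2))        # se of the slope
se_b0 = np.sqrt(sigma2_hat * np.mean(x ** 2)
                / np.sum((x - x.mean()) ** 2))                   # se of the intercept
print(se_b1, se_b0)
```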


Regression Through the Origin

• In some instances it makes sense to exclude the constant term from the model

• This regression equation is called a regression through the origin since we are imposing the intercept to be equal to 0

y = β1x + u (19)

• Minimizing the sum of squared residuals for this regression yields the following estimate for β1

β̂1 = Σ_{i=1}^n xiyi / Σ_{i=1}^n xi²
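A final sketch (simulated data with a true intercept of zero, values made up) computing the regression-through-the-origin slope:

```python
import numpy as np

rng = np.random.default_rng(9)
x = rng.uniform(0.0, 10.0, size=100)
y = 2.0 * x + rng.normal(size=100)           # true model has no intercept

b1_origin = np.sum(x * y) / np.sum(x ** 2)   # minimizes sum of (y_i - b1*x_i)^2
print(b1_origin)                             # close to 2.0
```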
