7/30/2019 Econometrics Chapter 4
Section 4.1
$\beta_0$ is the intercept of this line and $\beta_1$ is the slope. The equation $Y_i = \beta_0 + \beta_1 X_i + u_i$ is the linear regression model with a single regressor, in which Y is the dependent variable and X is the independent variable, or the regressor.
$\beta_0 + \beta_1 X$ is the population regression line, or the population regression function. If you knew the value of X, then according to this population regression line you would predict that the value of the dependent variable Y is $\beta_0 + \beta_1 X$.
The intercept $\beta_0$ and the slope $\beta_1$ are the coefficients of the population regression line, also known as the parameters of the population regression line. The slope $\beta_1$ is the change in Y associated with a unit change in X. The intercept $\beta_0$ is the value of the population regression line when X = 0. Sometimes this intercept has no real-world meaning; in the student-teacher ratio (STR) example, it would be the predicted value of test scores when there are no students in the class.
$u_i$ is the error term, which includes every factor other than X that determines Y for a specific observation i.
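The model above can be illustrated with a small simulation. This is a sketch with made-up values: the coefficients, sample size, regressor range, and error distribution below are all assumptions chosen for illustration, not taken from the text.

```python
import numpy as np

# Simulate the single-regressor model Y_i = beta0 + beta1*X_i + u_i.
# All numbers here are hypothetical, chosen only to illustrate the model.
rng = np.random.default_rng(0)
beta0, beta1, n = 700.0, -2.0, 100    # assumed intercept, slope, sample size

X = rng.uniform(14, 26, size=n)       # regressor (e.g., a student-teacher ratio)
u = rng.normal(0, 10, size=n)         # error term: every other factor affecting Y
Y = beta0 + beta1 * X + u             # dependent variable

# The population regression line predicts beta0 + beta1*X for a known X.
print(beta0 + beta1 * 20.0)           # prediction at X = 20
```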
Section 4.2
The OLS estimator chooses the regression coefficients so that the estimated
regression line is as close as possible to the observed data, where closeness is
measured by the sum of the squared mistakes made in predicting Y given X.
As discussed in Section 3.1, the sample average $\bar{Y}$ is the least squares estimator of the population mean E(Y); that is, $\bar{Y}$ minimizes the total squared estimation mistakes $\sum_{i=1}^{n} (Y_i - m)^2$ among all possible estimators m. To minimize this sum, set its derivative with respect to m equal to zero:
$\frac{d}{dm} \sum_{i=1}^{n} (Y_i - m)^2 = -2 \sum_{i=1}^{n} (Y_i - m) = 0.$
Solving this equation for m shows that the sum is minimized when $m = \bar{Y}$.
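The claim can be checked numerically: scanning candidate values of m, the sum of squared mistakes is smallest at the sample average. The data values below are made up for the sketch.

```python
import numpy as np

# Numerical check: among all candidate estimators m, the total squared
# estimation mistake sum((Y_i - m)^2) is minimized at m = Ybar.
Y = np.array([3.0, 7.0, 8.0, 12.0])   # made-up sample
Ybar = Y.mean()                       # sample average, 7.5

def total_squared_mistakes(m):
    return np.sum((Y - m) ** 2)

# Scan a fine grid of candidate values and pick the minimizer.
grid = np.linspace(0, 15, 3001)
best_m = grid[np.argmin([total_squared_mistakes(m) for m in grid])]

assert abs(best_m - Ybar) < 1e-6      # the minimizer is the sample average
```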
The OLS estimator extends this idea to the linear regression model $Y_i = \beta_0 + \beta_1 X_i + u_i$. The sum of the squared prediction mistakes over all n observations is:
$\sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2,$
where $b_0$ and $b_1$ are estimators of $\beta_0$ and $\beta_1$, respectively. The values of $b_0$ and $b_1$ that minimize this sum are called the ordinary least squares (OLS) estimators of $\beta_0$ and $\beta_1$.
The OLS estimators of the slope and intercept are:
$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}.$
The OLS regression line: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$. The predicted value of $Y_i$ given $X_i$ based on the OLS regression line: $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$. The residual for the ith observation: $\hat{u}_i = Y_i - \hat{Y}_i$.
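The OLS formulas above translate directly into code. The data below are made up; the result is cross-checked against NumPy's own least squares fit, which solves the same minimization problem.

```python
import numpy as np

# OLS slope and intercept from the closed-form formulas, on made-up data.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

Xbar, Ybar = X.mean(), Y.mean()
beta1_hat = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
beta0_hat = Ybar - beta1_hat * Xbar

Y_hat = beta0_hat + beta1_hat * X     # predicted values on the OLS line
u_hat = Y - Y_hat                     # residuals

# np.polyfit with degree 1 solves the same least squares problem, so it agrees.
slope, intercept = np.polyfit(X, Y, 1)
assert np.isclose(beta1_hat, slope) and np.isclose(beta0_hat, intercept)
```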
Section 4.3
A) Measures of Fit
Explained sum of squares: $ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. Total sum of squares: $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$. Sum of squared residuals (the variation in $Y_i$ NOT explained by $X_i$): $SSR = \sum_{i=1}^{n} \hat{u}_i^2$.
Regression $R^2 = ESS/TSS = 1 - SSR/TSS$, which ranges between 0 and 1 and measures the fraction of the sample variance of $Y_i$ that is explained by $X_i$.
The R2 of the regression of Y on the single regressor X is the square of the
correlation coefficient between X and Y.
TSS = ESS + SSR
If $\hat{\beta}_1 = 0$, then $X_i$ explains none of the variation of $Y_i$ and the predicted value of $Y_i$ based on the regression is just the sample average of $Y_i$. In this case, the ESS is 0 and SSR = TSS; thus the $R^2$ is 0.
Conversely, if $X_i$ explains all of the variation of $Y_i$, then $Y_i = \hat{Y}_i$ for all i and every residual is 0, so that ESS = TSS and $R^2 = 1$.
=> R2 near 0 means Xi is not very good at predicting Yi and vice versa.
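The decomposition TSS = ESS + SSR and the fact that $R^2$ equals the squared correlation coefficient can both be verified numerically. The data are made up for the sketch.

```python
import numpy as np

# Measures of fit for an OLS regression on made-up data.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
beta1_hat = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
beta0_hat = Y.mean() - beta1_hat * X.mean()
Y_hat = beta0_hat + beta1_hat * X
u_hat = Y - Y_hat

ESS = np.sum((Y_hat - Y.mean()) ** 2)   # explained sum of squares
TSS = np.sum((Y - Y.mean()) ** 2)       # total sum of squares
SSR = np.sum(u_hat ** 2)                # sum of squared residuals
R2 = ESS / TSS

assert np.isclose(TSS, ESS + SSR)       # the decomposition TSS = ESS + SSR
assert np.isclose(R2, np.corrcoef(X, Y)[0, 1] ** 2)  # R^2 = squared correlation
```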
B) The standard error of the regression (SER)
The SER is an estimator of the standard deviation of the regression error $u_i$. The units of $u_i$ and $Y_i$ are the same, so the SER is a measure of the spread of the observations around the regression line, measured in the units of the dependent variable.
The SER is computed using the OLS residuals:
$SER = s_{\hat{u}}, \quad \text{where } s_{\hat{u}}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2 = \frac{SSR}{n-2}.$
Also, the sample average of the OLS residuals is 0.
The reason to use n - 2 here is the same as the reason to use n - 1 in the sample variance: it corrects the slight downward bias introduced because two regression coefficients were estimated.
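The SER formula, including the n - 2 degrees-of-freedom correction, looks like this in code; the data are made up, and the zero-mean property of OLS residuals is checked along the way.

```python
import numpy as np

# SER: standard deviation of the OLS residuals, divided by n - 2 rather than n,
# because two coefficients (intercept and slope) were estimated.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(Y)
beta1_hat = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
beta0_hat = Y.mean() - beta1_hat * X.mean()
u_hat = Y - (beta0_hat + beta1_hat * X)   # OLS residuals

SSR = np.sum(u_hat ** 2)
SER = np.sqrt(SSR / (n - 2))              # divide by n - 2, not n

assert np.isclose(u_hat.sum(), 0.0)       # OLS residuals average to zero
```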
A high SER means there is a large spread of the scatterplot around the regression line, measured in points on the test. It also means that predictions of test scores using only the STR will often be wrong by a large amount.
A low $R^2$ (and a large SER) does not, by itself, imply that the regression is good or bad, but it does indicate that there are other important factors that influence test scores.