bethanie-webb
9 - 1
Intrinsically Linear Regression
Chapter 9
9 - 2
Introduction
• In Chapter 7 we discussed some deviations from the assumptions of the regression model.
• One of the assumptions was that the residuals are normally distributed.
• If this assumption does not hold, we may have to transform the data so that the relationship becomes linear and regression analysis can be used.
9 - 3
Non-linear Models – Introduction
• To model non-linear relationships with OLS regression, the data must first be transformed in a way that makes the relationship linear
• All the steps for linear regression may then be performed on the transformed data
• The most common forms of non-linear models are:
– Logarithmic
– Exponential
– Power
9 - 4
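Linear transformations
• Logarithmic: $y = a + b \ln x$ (already linear in $\ln x$; plot Y against ln X)
• Exponential: $y = a e^{bx}$, so $\ln y = \ln a + b x$ (plot ln Y against X)
• Power: $y = a x^{b}$, so $\ln y = \ln a + b \ln x$ (plot ln Y against ln X)
[Figure: each model shown in unit space (Y against X) and in log space, with curves for b < 0 and b > 0 (logarithmic and exponential) and for b < 0, 0 < b < 1, and b > 1 (power).]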
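The logarithmic case is the simplest of the three, because only X is transformed and the fitted parameters need no back-transformation. A minimal sketch in Python follows; the helper name `fit_logarithmic` and the data are hypothetical, and the points are generated without noise from y = 3.0 + 1.5 ln x purely so the result can be checked.

```python
import numpy as np

def fit_logarithmic(x, y):
    """Fit y = a + b * ln(x) by ordinary least squares on (ln x, y)."""
    ln_x = np.log(np.asarray(x, dtype=float))
    # np.polyfit with degree 1 returns [slope, intercept]
    b, a = np.polyfit(ln_x, np.asarray(y, dtype=float), 1)
    return float(a), float(b)

# Hypothetical, noise-free data generated from y = 3.0 + 1.5 * ln(x)
x = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
y = 3.0 + 1.5 * np.log(x)

a, b = fit_logarithmic(x, y)
print(f"y = {a:.4f} + {b:.4f} ln(x)")
```

Because Y stays in unit space here, the intercept and slope from the regression are already the a and b of the logarithmic model.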
9 - 5
Example: Exponential Model
• The data is plotted in unit space (left), then transformed and plotted on a semi-log graph (right)
• The next step is to conduct linear regression analysis on the data in semi-log space
• After the analysis is complete, we will transform the parameters of the linear equation back to unit space
[Figure, from Certification Training, Unit III - Module 6: "Scatter Plots – Exponential Function". x is plotted on the horizontal axis and ln(y) on the vertical axis.
Left panel, "Exponential Trend" (Cost against Weight): $\text{Cost} = 0.0485\, e^{1.2189\,\text{Weight}}$, $R^2 = 0.8846$.
Right panel, "Exponential Trend on Semi-Log Axes" (ln(Cost) against Weight): $\ln(\text{Cost}) = 1.2189\,\text{Weight} - 3.0269$, $R^2 = 0.8846$.
The slope on the semi-log graph is the coefficient of x in the exponential equation.]
$y = a e^{bx} \;\Rightarrow\; \ln y = \ln a + b x$
$a = e^{\ln a}$
$b = b$
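The exponential workflow (regress ln y on x, then exponentiate the intercept) can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular tool: the helper name `fit_exponential` is hypothetical, and the data points are synthetic, generated without noise from the exponential trend quoted for the example (Cost = 0.0485 e^{1.2189 Weight}) so that the recovered parameters can be checked.

```python
import numpy as np

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by ordinary least squares on (x, ln y)."""
    x = np.asarray(x, dtype=float)
    ln_y = np.log(np.asarray(y, dtype=float))
    # Regression in semi-log space: ln y = ln a + b * x
    b, ln_a = np.polyfit(x, ln_y, 1)   # [slope, intercept]
    # Back to unit space: a = e^(ln a); the slope b is unchanged
    return float(np.exp(ln_a)), float(b)

# Synthetic, noise-free points on the quoted trend line (not the original data)
weight = np.array([3.0, 4.0, 5.0, 6.0, 7.0])
cost = 0.0485 * np.exp(1.2189 * weight)

a, b = fit_exponential(weight, cost)
print(f"Cost = {a:.4f} * exp({b:.4f} * Weight)")
```

With real, noisy data the recovered a and b would only approximate the underlying trend, and the fit minimizes squared error in log space rather than in dollars.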
9 - 6
The Multiplicative Model
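• Linear Equation: $\hat{Y} = b_0 + b_1 X$. A unit change in X causes Y to change by b1
• Multiplicative Equation: $\hat{Y} = b_0 X^{b_1}$. A percentage change in X causes Y to change by a proportional percentage (roughly b1 percent for each 1 percent change in X)
[Figure: Y against X for the linear and the multiplicative model, with curves labelled b1 < 0, b1 = 0, 0 < b1 < 1, b1 = 1, and b1 > 1.]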
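The percentage interpretation of the multiplicative model follows from a short derivation, sketched here with the slide's symbols $b_0$ and $b_1$:

```latex
\[
\hat{Y} = b_0 X^{b_1}
\;\Rightarrow\;
\frac{d\hat{Y}}{dX} = b_0 b_1 X^{b_1 - 1} = b_1 \frac{\hat{Y}}{X}
\;\Rightarrow\;
\frac{d\hat{Y} / \hat{Y}}{dX / X} = b_1
\]
```

So for small changes, a 1 percent change in X produces approximately a $b_1$ percent change in $\hat{Y}$, whereas in the linear model $d\hat{Y}/dX = b_1$ is a constant change in the units of Y.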
9 - 7
The Multiplicative Model
• However, in order to produce a cost estimating relationship using the method of least squares, we must transform the multiplicative model into a linear model (at least temporarily).
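• The solution is to create a log-linear equation:
$\hat{Y} = A X^{b_1} \;\Rightarrow\; \ln(\hat{Y}) = b_0 + b_1 \ln(X)$, where A is the multiplicative coefficient and $b_0 = \ln(A)$ is the log-space intercept
• Now we can perform a linear regression on ln(Y) and ln(X), then transform the results of the linear regression back into a multiplicative (power) equation.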
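A minimal sketch of this log-log procedure in Python, assuming a single cost driver X. The helper name `fit_power` and the data are hypothetical; the points are generated without noise from Y = 2.5 X^0.8 so the back-transformed parameters can be checked.

```python
import numpy as np

def fit_power(x, y):
    """Fit Y = A * X**b1 by ordinary least squares on (ln X, ln Y)."""
    ln_x = np.log(np.asarray(x, dtype=float))
    ln_y = np.log(np.asarray(y, dtype=float))
    # Log-linear regression: ln Y = b0 + b1 * ln X
    b1, b0 = np.polyfit(ln_x, ln_y, 1)   # [slope, intercept]
    # Back to unit space: A = e^(b0); the slope is the exponent b1
    return float(np.exp(b0)), float(b1)

# Hypothetical, noise-free data generated from Y = 2.5 * X**0.8
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = 2.5 * x ** 0.8

A, b1 = fit_power(x, y)
print(f"Y = {A:.4f} * X^{b1:.4f}")
```

Both X and Y must be strictly positive for the logarithms to be defined, which is a practical limit on when the multiplicative model can be fitted this way.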
9 - 8
Interpreting the Results
• A linear regression of transformed data will provide exactly the same type of results as a linear regression of raw data.
– The computer doesn’t know you’ve given it transformed data.
• So you need to re-transform the results into a multiplicative (power) model.
• The intercept of the log-linear fit corresponds to ln(A), the log of the multiplicative coefficient, and the slope corresponds to the exponent, b1:
$A = e^{\text{Intercept}} = e^{b_0}$, $b_1 = b_1$
• Moreover, the statistics of the transformed data can be misleading.
9 - 9
Interpreting the Results
• The Standard Error and the $R^2$ reported for a log-linear model cannot be compared to those for a linear model, because both are functions of SSE, and SSE is measured on a different scale in log space.
• Recall that SSE is the error sum of squares; in unit space, the standard error is expressed in terms of dollars.
• The Standard Error in log space has a different meaning than that in unit space.
• However, we can compare between log-linear models.
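$SE_{unit} = \sqrt{\dfrac{SSE}{n-k-1}} = \sqrt{\dfrac{\sum (Y_i - \hat{Y}_i)^2}{n-k-1}} = \$XXX$

$SE_{log} = \sqrt{\dfrac{SSE}{n-k-1}} = \sqrt{\dfrac{\sum (\ln(Y_i) - \ln(\hat{Y}_i))^2}{n-k-1}} = .XXX$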
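The difference in scale between the two standard errors can be illustrated numerically. The sketch below fits one log-linear model and computes the standard error twice, once from log-space residuals and once from unit-space residuals. The data values and the helper name `standard_error` are hypothetical, chosen only for illustration.

```python
import numpy as np

def standard_error(actual, predicted, k=1):
    """sqrt(SSE / (n - k - 1)) for a model with k independent variables."""
    resid = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    n = resid.size
    return float(np.sqrt(np.sum(resid ** 2) / (n - k - 1)))

# Hypothetical data: cost in dollars against a single driver x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([12.0, 19.0, 33.0, 41.0, 58.0, 66.0])

# Log-linear fit: ln(y) = b0 + b1 * ln(x)
b1, b0 = np.polyfit(np.log(x), np.log(y), 1)
ln_y_hat = b0 + b1 * np.log(x)
y_hat = np.exp(ln_y_hat)

se_log = standard_error(np.log(y), ln_y_hat)   # residuals of ln(cost), not dollars
se_unit = standard_error(y, y_hat)             # residuals in dollars

print(f"SE in log space:  {se_log:.4f}")
print(f"SE in unit space: {se_unit:.4f}")
```

The two numbers describe the same fitted model but are on different scales, which is why a log-space standard error cannot be set beside the dollar-valued standard error of a linear model.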
9 - 10
Interpreting the Results
• When comparing linear and log-linear cost models, use section III in Cost$tat entitled Predictive Measures (unit space).
• Cost$tat provides measures in unit space in section III for log-linear models. These numbers can be compared directly to corresponding linear models.
– Standard Error
– Coefficient of Variation
– Adjusted R-squared
• Remember, you CANNOT compare these measures in log space to the same measures in unit space. All comparisons must be in unit space.
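A unit-space comparison of this kind can be sketched as follows. The formulas are common textbook definitions (SE = sqrt(SSE/(n-k-1)), CV = SE / mean(Y), and adjusted R-squared from SSE and SST); this is an assumption for illustration and is not claimed to reproduce what Cost$tat computes. The data and the helper name `unit_space_measures` are hypothetical.

```python
import numpy as np

def unit_space_measures(y, y_hat, k=1):
    """Return (SE, CV, adjusted R^2), all computed from unit-space residuals."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    sse = np.sum((y - y_hat) ** 2)        # error sum of squares
    sst = np.sum((y - y.mean()) ** 2)     # total sum of squares
    se = np.sqrt(sse / (n - k - 1))
    cv = se / y.mean()
    adj_r2 = 1.0 - (sse / (n - k - 1)) / (sst / (n - 1))
    return float(se), float(cv), float(adj_r2)

# Hypothetical data: cost in dollars against a single driver x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([12.0, 19.0, 33.0, 41.0, 58.0, 66.0])

# Candidate 1: linear model fitted in unit space
slope, intercept = np.polyfit(x, y, 1)
linear = unit_space_measures(y, intercept + slope * x)

# Candidate 2: log-linear model, with predictions transformed back to unit space
b1, b0 = np.polyfit(np.log(x), np.log(y), 1)
loglinear = unit_space_measures(y, np.exp(b0) * x ** b1)

for name, (se, cv, adj_r2) in (("linear", linear), ("log-linear", loglinear)):
    print(f"{name:>10}: SE = {se:.3f}, CV = {cv:.3f}, adj. R^2 = {adj_r2:.3f}")
```

Because both sets of measures are computed from residuals in dollars, the two rows of output can be compared directly.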