
9 - 1

Intrinsically Linear Regression

Chapter 9

9 - 2

Introduction

• In Chapter 7 we discussed some deviations from the assumptions of the regression model.

• One of the assumptions was that the residuals are normally distributed.

• If this assumption does not hold, then we may have to transform the data into a form that will make it appear linear, so that regression analysis can be used.

9 - 3

Non-linear Models – Introduction

• To model non-linear relationships with OLS regression, the data must first be transformed in a way that makes the relationship linear

• All the steps for linear regression may then be performed on the transformed data

• The most common forms of non-linear models are:

– Logarithmic

– Exponential

– Power

9 - 4

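Linear transformations

• Logarithmic: $y = a + b \ln x$

– Unit space: Y plotted against X (curves shown for b < 0 and b > 0)

– Log space: Y plotted against ln X gives a straight line

• Exponential: $y = a e^{bx}$

– Unit space: Y plotted against X (curves shown for b < 0 and b > 0)

– Log space: $\ln y = \ln a + b x$, so ln Y plotted against X gives a straight line

• Power: $y = a x^{b}$

– Unit space: Y plotted against X (curves shown for b < 0, 0 < b < 1, and b > 1)

– Log space: $\ln y = \ln a + b \ln x$, so ln Y plotted against ln X gives a straight line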
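The three transformations above can be sketched in code. This is a minimal illustration, not part of the original slides: the helper names `ols` and `fit_model` and the noise-free sample data are assumptions made for the example. In each case an ordinary least-squares line is fitted to the transformed data, and the intercept is exponentiated where the model calls for it.

```python
import math

def ols(x, y):
    """Least-squares line y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_model(x, y, model):
    """Fit one of the three forms by OLS on transformed data; return (a, b) in unit space."""
    if model == "logarithmic":    # y = a + b ln x: regress y on ln x
        return ols([math.log(v) for v in x], y)
    if model == "exponential":    # ln y = ln a + b x: regress ln y on x
        ln_a, b = ols(x, [math.log(v) for v in y])
        return math.exp(ln_a), b
    if model == "power":          # ln y = ln a + b ln x: regress ln y on ln x
        ln_a, b = ols([math.log(v) for v in x], [math.log(v) for v in y])
        return math.exp(ln_a), b
    raise ValueError(f"unknown model: {model}")

# Hypothetical, noise-free data generated with a = 2 in each case
x = [1.0, 2.0, 3.0, 4.0, 5.0]
log_fit = fit_model(x, [2.0 + 3.0 * math.log(v) for v in x], "logarithmic")
exp_fit = fit_model(x, [2.0 * math.exp(0.5 * v) for v in x], "exponential")
pow_fit = fit_model(x, [2.0 * v ** 1.5 for v in x], "power")
print(log_fit, exp_fit, pow_fit)
```

Because the sample data contain no noise, each fit should recover the generating parameters up to floating-point error. With real data the log-space fit minimizes error in log units, not in unit space.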

9 - 5

Example: Exponential Model

• The data is plotted in unit space (left), then transformed and plotted on a semi-log graph (right)

• The next step is to conduct linear regression analysis on the data in semi-log space

• After the analysis is complete, we will transform the parameters of the linear equation back to unit space

Unit III - Module 6 14

Certification Training

prepared by

Scatter Plots – Exponential Function

• Then, x is plotted on the horizontal axis and ln(y) is plotted on the vertical axis

• This transformation is shown graphically below

[Figure, left panel: "Exponential Trend", Cost versus Weight in unit space; fitted curve $\text{Cost} = 0.0485\, e^{1.2189 \cdot \text{Weight}}$, $R^2 = 0.8846$]

[Figure, right panel: "Exponential Trend on Semi-Log Axes", ln(Cost) versus Weight; fitted line $\ln(\text{Cost}) = 1.2189 \cdot \text{Weight} - 3.0269$, $R^2 = 0.8846$]

• Slope on semi-log graph is the coefficient of x in the exponential equation


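$y = a e^{bx} \quad\Longleftrightarrow\quad \ln y = \ln a + b x$

$a = e^{\ln a}$ (the exponential of the fitted intercept)

$b = b$ (the fitted slope, unchanged)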
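As a worked check of the back-transformation, the sketch below takes the intercept and slope of the fitted semi-log line from the example, ln(Cost) = 1.2189 Weight - 3.0269, and converts them to unit space. The function name `predict_cost` is introduced here for illustration only.

```python
import math

# Coefficients of the fitted semi-log line: ln(Cost) = 1.2189 * Weight - 3.0269
intercept = -3.0269   # ln a
slope = 1.2189        # b

a = math.exp(intercept)   # back-transform the intercept: a = e^(ln a)
b = slope                 # the slope carries over unchanged

def predict_cost(weight):
    """Unit-space prediction Cost = a * e^(b * Weight)."""
    return a * math.exp(b * weight)

print(round(a, 4))   # 0.0485, the coefficient of the unit-space trend line
```

The recovered coefficient agrees with the unit-space trend line Cost = 0.0485 e^(1.2189 Weight) shown in the example.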

9 - 6

The Multiplicative Model

[Figure: two panels of Y versus X, with curves labeled b1 < 0, b1 = 0, b1 < 1, b1 = 1, and b1 > 1]

• Linear Equation: $\hat{Y} = b_0 + b_1 X$. A unit change in X causes Y to change by b1

• Multiplicative Equation: $\hat{Y} = b_0 X^{b_1}$. A percentage change in X causes Y to change by a proportional percentage: a 1% change in X changes Y by approximately b1 percent
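The percentage interpretation of the multiplicative equation can be checked numerically. In the sketch below, the function name and the exponent values are hypothetical choices for illustration. Because the coefficient b0 cancels out of the ratio of new Y to old Y, only the exponent b1 matters.

```python
def pct_change_in_y(b1, pct_change_in_x):
    """Percent change in Y = b0 * X**b1 when X changes by the given percent.

    b0 cancels out of the ratio, so the result does not depend on it.
    """
    ratio = (1.0 + pct_change_in_x / 100.0) ** b1
    return (ratio - 1.0) * 100.0

# Hypothetical exponents: a 1% increase in X changes Y by roughly b1 percent
for b1 in (-0.5, 0.5, 1.0, 1.5):
    print(b1, round(pct_change_in_y(b1, 1.0), 3))
```

The approximation is close for small changes in X and drifts for large ones, since the exact relationship is a power law and not a straight line.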

9 - 7

The Multiplicative Model

• However, in order to produce a cost estimating relationship using the method of least squares, we must transform the multiplicative model into a linear model (at least temporarily).

• The solution is to create a log-linear equation:

$\hat{Y} = A X^{b_1} \quad\Longrightarrow\quad \ln(\hat{Y}) = b_0 + b_1 \ln(X)$

• Now we can perform a linear regression on ln(Y) and ln(X), then transform the results of the linear regression back into a multiplicative equation.
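The step from the multiplicative model to the log-linear equation follows from taking the natural log of both sides and applying the product and power rules for logarithms. The symbols are those used on this slide; the only addition is naming the intercept explicitly as $b_0 = \ln(A)$.

```latex
\begin{align*}
\hat{Y} &= A X^{b_1} \\
\ln(\hat{Y}) &= \ln(A) + b_1 \ln(X) \\
\ln(\hat{Y}) &= b_0 + b_1 \ln(X), \qquad b_0 = \ln(A)
\end{align*}
```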

9 - 8

Interpreting the Results

• A linear regression of transformed data will provide exactly the same type of results as a linear regression of raw data.

– The computer doesn’t know you’ve given it transformed data.

• So you need to re-transform the results into a multiplicative model.

• The intercept corresponds to ln(A), and the slope corresponds to the exponent, b1:

$A = e^{\text{Intercept}} = e^{b_0}, \qquad b_1 = b_1$

• Moreover, the statistics of the transformed data can be misleading.

9 - 9


• The Standard Error and the R² reported for a log-linear model cannot be compared to those for a linear model. Both are functions of SSE, and SSE is measured in log units for the log-linear model but in unit-space units for the linear model.

• Recall that SSE is the error sum of squares, and the standard error is expressed in terms of dollars.

• The Standard Error in log space has a different meaning than that in unit space.

• However, we can compare between log-linear models.

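$SE_{\text{unit}} = \sqrt{\dfrac{SSE}{n-k-1}} = \sqrt{\dfrac{\sum_i (Y_i - \hat{Y}_i)^2}{n-k-1}}$ (in dollars)

$SE_{\log} = \sqrt{\dfrac{SSE}{n-k-1}} = \sqrt{\dfrac{\sum_i (\ln(Y_i) - \ln(\hat{Y}_i))^2}{n-k-1}}$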
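To see why the two standard errors are not comparable, the sketch below applies the same formula, sqrt(SSE / (n - k - 1)), once to dollar values and once to their natural logs. The cost figures and predictions are hypothetical numbers chosen for illustration, and `standard_error` is a name introduced here.

```python
import math

def standard_error(actual, predicted, k=1):
    """sqrt(SSE / (n - k - 1)) for a model with k independent variables."""
    n = len(actual)
    sse = sum((y - yhat) ** 2 for y, yhat in zip(actual, predicted))
    return math.sqrt(sse / (n - k - 1))

# Hypothetical costs (dollars) and the predictions of some fitted model
cost = [120.0, 250.0, 410.0, 640.0, 900.0]
cost_hat = [130.0, 240.0, 400.0, 660.0, 880.0]

# Same predictions, two scales: dollars versus natural-log dollars
se_unit = standard_error(cost, cost_hat)
se_log = standard_error([math.log(y) for y in cost],
                        [math.log(y) for y in cost_hat])
print(se_unit, se_log)
```

The same set of predictions yields a standard error of tens of dollars in unit space and a small fraction in log space, so placing one number next to the other says nothing about which model fits better.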

9 - 10


• When comparing linear and log-linear cost models, use section III in Cost$tat entitled Predictive Measures (unit space).

• Cost$tat provides measures in unit space in section III for log-linear models. These numbers can be compared directly to corresponding linear models.

– Standard Error

– Coefficient of Variation

– Adjusted R-squared

• Remember, you CANNOT compare these measures in log space to the same measures in unit space. All comparisons must be in unit space.
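A unit-space comparison of this kind can be sketched as follows. This is not Cost$tat's actual computation, which the slides do not describe. It uses the common textbook definitions of the three measures, and all data and coefficients are hypothetical. The key step is converting the log-linear model's predictions back to dollars before computing any statistic.

```python
import math

def unit_space_measures(actual, predicted, k=1):
    """Standard error, coefficient of variation, and adjusted R-squared in unit space."""
    n = len(actual)
    mean_y = sum(actual) / n
    sse = sum((y - yhat) ** 2 for y, yhat in zip(actual, predicted))
    sst = sum((y - mean_y) ** 2 for y in actual)
    se = math.sqrt(sse / (n - k - 1))
    cv = se / mean_y                                        # SE relative to mean cost
    adj_r2 = 1.0 - (sse / (n - k - 1)) / (sst / (n - 1))
    return se, cv, adj_r2

# Hypothetical data and hypothetical fitted coefficients for two candidate models
x = [1.0, 2.0, 3.0, 4.0, 5.0]
cost = [120.0, 250.0, 410.0, 640.0, 900.0]

linear_hat = [-121.0 + 195.0 * v for v in x]                # Y-hat = b0 + b1 * X

b0_log, b1_log = math.log(118.0), 1.25                      # log-space intercept and slope
loglinear_hat = [math.exp(b0_log + b1_log * math.log(v)) for v in x]  # back to dollars

linear_measures = unit_space_measures(cost, linear_hat)
loglinear_measures = unit_space_measures(cost, loglinear_hat)
print(linear_measures, loglinear_measures)
```

Both tuples are now expressed in dollars (or as dollar-based ratios), so the standard errors, coefficients of variation, and adjusted R-squared values can be compared directly.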
