Chapter 8 Linear Regression

Fat Versus Protein: An Example
• 30 items on the Burger King menu:

• A linear model is an equation of a straight line through the data.
  – The points don’t all line up
  – A straight line can summarize the general pattern

Residuals
• The scattered points are the actual data.
• The corresponding values on the line are the predicted values.
• A residual is an error: the difference between the actual value and the predicted value (the line).

$\text{residual} = \text{actual} - \text{predicted} = y - \hat{y}$

• A negative residual means the predicted value’s too big (an overestimate).

• A positive residual means the predicted value’s too small (an underestimate).

• A residual of zero means the line predicted the value exactly.
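The sign rules above can be sketched in a few lines of Python. The line (intercept 2.0, slope 1.0) and the three points are made up for illustration and are not the Burger King values; `predict`, `residual`, and `describe` are hypothetical helper names.

```python
# Hypothetical line and points, made up for illustration only
# (these are NOT the Burger King values).
def predict(x, b0=2.0, b1=1.0):
    """Predicted value on the line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

def residual(x, y_actual):
    """Residual = actual - predicted."""
    return y_actual - predict(x)

def describe(res):
    """Translate the sign of a residual into the wording used above."""
    if res < 0:
        return "overestimate"   # predicted value was too big
    if res > 0:
        return "underestimate"  # predicted value was too small
    return "exact"

points = [(1, 2.5), (2, 5.0), (3, 5.0)]
for x, y in points:
    e = residual(x, y)
    print(f"x={x}, actual={y}, predicted={predict(x)}, residual={e}: {describe(e)}")
```

With these made-up numbers, the first point sits below the line (negative residual), the second sits above it (positive residual), and the third lies exactly on it.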

“Best Fit” Line
• We want the residuals, taken together, to be as small as possible (minimizing error)
• Residuals are squared before summing so that positives and negatives don’t cancel; the smaller the sum of squared residuals, the better the line fits
• This is why the linear model is called the line of best fit, the regression line, or the least squares line
• The line always passes through the point (mean of the x variable, mean of the y variable)
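As a rough check of the "best fit" idea, the sketch below fits a line to a small made-up data set (not the Burger King menu) and compares its sum of squared residuals with slightly nudged lines. The function name `sse` and the data are assumptions for illustration.

```python
# Hypothetical data, made up for illustration (not the Burger King menu).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def sse(b0, b1):
    """Sum of squared residuals for the line y-hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares slope and intercept from the sums of deviations.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Nudging the intercept or slope in either direction makes the sum larger.
for db0, db1 in [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1)]:
    print(sse(b0 + db0, b1 + db1) > sse(b0, b1))

# The fitted line passes through (mean of x, mean of y).
print(abs((b0 + b1 * mean_x) - mean_y) < 1e-9)
```

Every comparison prints `True`: no nearby line beats the least squares line on this data, and the line goes through the point of means.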

Regression Line
• Algebra: $y = mx + b$
• Statistics: $\hat{y} = b_1 x + b_0$
• If the model is a good one, the data values will scatter closely around it.

• slope ($b_1$) formula: $b_1 = r \frac{s_y}{s_x}$
  – In units of y per unit of x
  – Interpretation:

• intercept ($b_0$): $b_0 = \bar{y} - b_1 \bar{x}$
  – In units of y
  – Usefulness of intercept:
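A minimal sketch of the slope and intercept formulas, using a small made-up data set (not the Burger King menu). The helper `correlation` is a hypothetical name; it computes r as the average product of z-scores with n - 1 in the denominator.

```python
from statistics import mean, stdev

# Hypothetical data, made up for illustration (not the Burger King menu).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def correlation(xs, ys):
    """Correlation r: sum of products of z-scores, divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    n = len(xs)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)

r = correlation(xs, ys)

# Slope: r times (st. dev. of y) over (st. dev. of x), in units of y per unit of x.
b1 = r * stdev(ys) / stdev(xs)

# Intercept: mean of y minus slope times mean of x, in units of y.
b0 = mean(ys) - b1 * mean(xs)

print(round(b1, 3), round(b0, 3))
```

For this data the slope comes out at about 0.6 and the intercept at about 2.2, so the fitted line would be roughly y-hat = 0.6x + 2.2.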

Correlation and the Line
• The correlation coefficient tells us “how linear” the relationship is
• So we will use r in the calculation of the slope
• Neg. slope = neg. correlation
• Pos. slope = pos. correlation
• Moving 1 st. dev. away from the mean in x moves us r st. dev. away from the mean in y (and vice versa)

Ex: If you’re 1.5 st. dev. above avg. in GPA, then how many st. dev. above avg. would you predict on the SAT?
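The GPA/SAT question can be worked as a one-line calculation once r is known. The slide does not give a correlation, so the value 0.5 below is purely an assumed number for illustration, and `predicted_z_y` is a hypothetical helper name.

```python
def predicted_z_y(r, z_x):
    """Predicted st. devs. from the mean in y, given z_x st. devs. from the mean in x."""
    return r * z_x

r_assumed = 0.5  # hypothetical correlation; the slide does not state one
print(predicted_z_y(r_assumed, 1.5))  # 0.75
```

Under that assumed r, being 1.5 st. dev. above average in GPA predicts being 0.75 st. dev. above average on the SAT. Unless r is exactly 1 or -1, the prediction for y is closer to its mean than x was to its own.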

Ways to find the Regression Line

1. Use the formulas for slope and intercept

2. Use raw data and run STAT -> CALC -> LinReg on the calculator

3. Use a given computer output

• The regression line fits the Burger King data well:
  – The equation is
  – Predict the fat of a sandwich with 30 g of protein:

Check the same conditions:
– Quantitative Variables
– Straight Enough
– No Outliers

• To check whether a linear model is appropriate:

1. Look at the scatter plot of residuals vs. x var or y var (run a LinReg first)

2. If there is a pattern = linear model NOT appropriate

3. If there is NO pattern = linear model is appropriate
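To see what "a pattern" in the residuals can look like, the sketch below fits a line to deliberately curved, made-up data (y = x squared) and prints the residuals in order of x. This stands in for the residual scatter plot; it is an illustration under those assumptions, not the calculator procedure.

```python
# Made-up curved data (y = x^2), chosen so that a straight line is a poor model.
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares line through the curved data.
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum((x - mean_x) ** 2 for x in xs)
b0 = mean_y - b1 * mean_x

# Residual = actual - predicted, listed in order of increasing x.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
signs = ["+" if e > 0 else "-" if e < 0 else "0" for e in residuals]

print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
print(signs)      # ['+', '-', '-', '-', '+']
```

The residuals run positive, negative, positive as x increases, which is a clear U-shaped pattern, so a linear model would not be appropriate for this data.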

• The residuals for the BK menu regression look appropriately boring:

The Residual St. Dev.

• We want the st. dev. of the residuals, $s_e$, to be relatively small
  = the spread/scatter around the line is small
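A minimal sketch of computing the residual st. dev. on a small made-up data set (not the Burger King menu). The slide does not state a formula; the code assumes the usual one for simple regression, the square root of the sum of squared residuals divided by n - 2.

```python
from math import sqrt

# Hypothetical data, made up for illustration (not the Burger King menu).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares line.
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum((x - mean_x) ** 2 for x in xs)
b0 = mean_y - b1 * mean_x

residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Assumed formula: s_e = sqrt(sum of squared residuals / (n - 2)).
s_e = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
print(round(s_e, 3))
```

The result is in the units of y, so it can be read as a typical distance between an actual value and the line.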

$R^2$: Coefficient of Strength
• It’s the square of r
• It’s a percent
• It’s between 0 and 100%
• The closer to 100%, the stronger the linear model is at predicting
• Interpretation: “the % of the variation in the y var that is explained by the x var”

EX: Burger King Fat vs. Protein had $R^2$ = 69%.
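The two descriptions of R-squared (the square of r, and the share of variation in y that is explained) can be checked against each other. The sketch below uses a small made-up data set, not the Burger King menu, so it will not reproduce the 69% figure.

```python
from math import sqrt

# Hypothetical data, made up for illustration (not the Burger King menu).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in y
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Correlation and least squares line.
r = sxy / sqrt(sxx * syy)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Variation left over after fitting the line (sum of squared residuals).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

r_squared_from_r = r ** 2
r_squared_from_variation = 1 - sse / syy

print(round(100 * r_squared_from_r), round(100 * r_squared_from_variation))
```

Both routes give the same percentage for this data (about 60%), which is why squaring r can be read as "the % of the variation in y explained by x".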

Summary
• In order to use linear regression:
1. Check the original x, y scatter plot for straightness
2. Run a LinReg, then make a scatterplot of residuals vs. x var and check for no pattern
3. Turn on diagnostics and run a LinReg to check $R^2$

Chapter 8 Assignment

Pg: 192: #1-11 odd, 19-23 odd, 27, 31, 35, 37, 45, 49