AP Stat Day 15 63 Days until AP Exam Least Squares Regression Coefficient of Determination Residuals


AP Stat Day 15

63 Days until AP Exam

Least Squares Regression

Coefficient of Determination

Residuals

Least Squares Regression Line

• A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes.

• You may have performed a linear regression in Algebra II, and in Statistics the process is very similar.

• But before we run a regression, let’s learn about EXACTLY what we are doing.

The “Least Squares” part…

• A random scatterplot….

What your calculator does…

• In effect, your calculator considers every possible line and finds the slope and y-intercept of the one with the smallest sum of squared vertical distances between the points and the line (be GLAD you don’t have to do this by hand…)

• It then reports those values so you can write your equation in the form $\hat{y} = a + bx$.

• y-hat is the predicted value of your response (dependent) variable. It is different from y, your observed value.
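The "least squares" idea can be sketched in a few lines of Python. The five data points below are made up for illustration (they are not the class data), and the function names `sse` and `least_squares` are hypothetical. The sketch uses the standard closed-form solution, which is not necessarily the routine your calculator runs.

```python
# Made-up data for illustration only (not the class data set).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sse(a, b, xs, ys):
    """Sum of squared vertical distances between the points and y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx           # slope
    a = y_bar - b * x_bar   # y-intercept
    return a, b


a, b = least_squares(xs, ys)
best = sse(a, b, xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x, sum of squares = {best:.2f}")

# Nudging the intercept or the slope in either direction makes the sum larger.
for da, db in [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1)]:
    print(f"a{da:+.1f}, b{db:+.1f}: sum of squares = {sse(a + da, b + db, xs, ys):.3f}")
```

Every nudged line has a larger sum of squares than the fitted one, which is what "least squares" means.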

EXAMPLE:

Using the data from our Kalama children from last class, let’s run a least squares regression and write our equation.

Interpreting the Slope

• The slope in this equation is given by b. Please don’t let this confuse you: in Algebra you may have written the slope as m and used b for the y-intercept.

• When we interpret slope, we talk about how much y changes for a 1-unit change in x.

• Let’s put this in terms of our Kalama children problem…

Interpreting y-intercept

• The y-intercept in our equation is given by a.

• The y-intercept will often be an extrapolation and thus may not make any sense in terms of the problem.

• Let’s practice with our Kalama children problem…

Correlation Coefficient

• In order to see r, our correlation coefficient, we need to turn on the diagnostics in the calculator.

• Now, let’s run the regression analysis again. We have more numbers now: r and $r^2$.

• r is the correlation coefficient. Its size tells us the “strength” in our strength, form, and direction description, and its sign tells us the direction.

• Kalama children:

Coefficient of Determination

• $r^2$ is the coefficient of determination.

• $r^2$ tells us how much of the variation in y is explained by the linear relationship between x and y.

• $r^2$ is usually reported as a percentage.

• Kalama children:

Residuals

• Residuals are simply the differences between the observed (y) and the predicted (y-hat) values: residual = y − y-hat.

• In a residual plot, the residuals are plotted against x. Some are positive (above the horizontal line at 0) and some are negative (below it).

• Unlike the normal probability plot, in a residual plot PATTERNS ARE BAD!

• Residuals help us determine if a linear model is appropriate for our data.

• Kalama children:
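As a rough illustration of how residuals are computed, the sketch below uses five made-up points (not the class data) together with their least-squares line, y-hat = 2.2 + 0.6x. The variable names are hypothetical.

```python
# Made-up data and its least-squares line y-hat = 2.2 + 0.6x (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

predicted = [a + b * x for x in xs]
# residual = observed - predicted
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

for x, y, y_hat, e in zip(xs, ys, predicted, residuals):
    side = "above" if e > 0 else "below"
    print(f"x={x}: observed={y}, predicted={y_hat:.1f}, residual={e:+.1f} ({side} the line)")

# For a least-squares line, the residuals add up to zero (apart from rounding).
print("sum of residuals:", round(sum(residuals), 10))
```

Plotting these residuals against x gives the residual plot. With no curve or fan shape in the plot, a linear model is reasonable.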

ACTIVITY- Guess My Age

• On p.13 record the answers to the following questions.

• I will give you the answers to the questions, then you will create a scatterplot by hand.

• You will run a regression and interpret your slope and y-intercept. (Do these make sense?)

• You will interpret your correlation coefficient and coefficient of determination.

Minitab Outputs EXAMPLE

• The following output data from MINITAB shows the number of teachers (in thousands) for each of the states plus the District of Columbia against the number of students (in thousands) enrolled in grades K-12.

Predictor   Coef       Stdev      t-ratio   p
Constant    4.486      2.025      2.22      0.031
Enroll      0.053401   0.001692   31.57     0.000

s = 2.589   R-sq = 81.5%

• What is the equation of the least squares line? Interpret the slope.

• Find the correlation coefficient and coefficient of determination. Interpret in the context of the problem.

• Predict the number of students if the number of teachers in the state is 40,000.

• Predict the number of teachers if the number of students in the state is 35,700.
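The arithmetic for reading this printout can be sketched as follows. The coefficients and R-sq are copied from the output above; the variable and function names are hypothetical. Both variables are in thousands, so 35,700 students is entered as 35.7. The sign of r is not printed, so the sketch takes it from the sign of the slope.

```python
import math

# Values copied from the MINITAB output above.
intercept = 4.486      # "Constant" row, Coef column
slope = 0.053401       # "Enroll" row, Coef column
r_squared = 0.815      # R-sq = 81.5%


def predict_teachers(students_thousands):
    """Predicted teachers (in thousands) for a given enrollment (in thousands)."""
    return intercept + slope * students_thousands


# r is the square root of R-sq, with the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), slope)

prediction = predict_teachers(35.7)   # 35,700 students
print(f"teachers-hat = {intercept} + {slope} * enroll")
print(f"r = {r:.3f}")
print(f"predicted teachers for 35,700 students: {prediction * 1000:.0f}")
```

The slope says that each additional 1,000 students is associated with about 0.053 thousand (roughly 53) more teachers, on average.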

ACTIVITY- Reading Excel and Minitab

• On p. 14 of your notebook, answer the following questions:

• The growth and decline of forests is a matter of great public and scientific interest. The paper “Relationships Among Crown Condition, Growth, and Stand Nutrition in Seven Northern Vermont Sugarbushes” included a scatter plot of y = mean crown dieback (%), which is one indicator of growth retardation, and x = soil pH. A statistical computer package MINITAB gives the following analysis:

The regression equation is: dieback = 31.0 - 5.79 soil pH

Predictor   Coef     Stdev   t-ratio   p
Constant    31.040   5.445   5.70      0.000
soil pH     -5.792   1.363   -4.25     0.001

s = 2.981   R-sq = 51.5%

• What is the equation of the least squares line?

• Where else in the printout do you find the information for the slope and y-intercept?

• Roughly, what change in crown dieback would be associated with an increase of 1 in soil pH?

• What value of crown dieback would you predict when soil pH = 4.0?

• Would it be sensible to use the least squares line to predict crown dieback when soil pH = 5.67?

• What is the correlation coefficient?

Rules of Thumb

• Properties of Correlation

• A negative r means that there is a negative association. A positive r means that there is a positive association.

• r = 0 means that there is no linear association.

• The closer r is to -1 or 1, the stronger the linear association.

• r only measures the strength of a LINEAR relationship; it does not describe curved relationships.

• r is NOT resistant. This means that correlation is easily affected by outliers.

$-1 \le r \le 1$
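To see what "not resistant" means in practice, the sketch below computes r for five made-up points, then again after adding one made-up outlier at (10, 0). The data and the `correlation` helper are hypothetical and only for illustration.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_without = correlation(xs, ys)
r_with = correlation(xs + [10], ys + [0])   # add a single outlier at (10, 0)

print(f"r without the outlier: {r_without:.3f}")
print(f"r with the outlier:    {r_with:.3f}")
```

One extra point is enough to turn a fairly strong positive correlation into a negative one, while r still stays between -1 and 1.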

More Rules of Thumb

• Properties of the coefficient of determination:

• This value represents the proportion of variability in y that can be explained by the relationship with x.

$0 \le r^2 \le 1$
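One common way to make "proportion of variability explained" concrete is to compare the squared residuals around the regression line with the squared deviations of y around its mean. The numbers below come from five made-up points, (1,2), (2,4), (3,5), (4,4), (5,5), whose least-squares line is $\hat{y} = 2.2 + 0.6x$; they are for illustration only and are not the class data.

```latex
r^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}

% Made-up example: squared residuals sum to 2.4, squared deviations from \bar{y} = 4 sum to 6
r^2 = 1 - \frac{2.4}{6} = 0.6 \quad (60\%)
```

In this made-up example, 60% of the variation in y is explained by the linear relationship with x.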

Formulas

• We can also calculate the slope of the regression line using the standard deviations of x and y and the correlation coefficient:

$b = r \dfrac{s_y}{s_x}$

• And the intercept can be found using the means of x and y:

$a = \bar{y} - b\bar{x}$
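These two formulas can be checked with a short sketch. The data are made up for illustration, and r is computed here as the average product of z-scores (with an n − 1 divisor), which is one standard definition.

```python
import statistics

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = statistics.mean(xs)
y_bar = statistics.mean(ys)
s_x = statistics.stdev(xs)   # sample standard deviation of x
s_y = statistics.stdev(ys)   # sample standard deviation of y

# Correlation: sum of products of z-scores, divided by n - 1.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)) / (n - 1)

b = r * s_y / s_x        # slope
a = y_bar - b * x_bar    # y-intercept

print(f"r = {r:.4f}, b = {b:.2f}, a = {a:.2f}")
print(f"y-hat = {a:.2f} + {b:.2f}x")
```

Because a is computed from the means, the regression line always passes through the point (x-bar, y-bar).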

Summary p. 15

• How do we interpret the correlation coefficient?

• How do we interpret the coefficient of determination?

• What do you look for in a Minitab output to write the least squares regression equation?

Prep Questions p.16

• What is horsepower?

• What do you think YOUR horsepower would be?

• REMEMBER to wear comfortable clothes and running shoes on WEDNESDAY 10/12.