
Modeling a Linear Relationship Lecture 47 Secs. 13.1 – 13.3.1 Tue, Apr 25, 2006


Page 2: Bivariate Data

Data is called bivariate if each observation consists of a pair of values (x, y).
x is the explanatory variable, also called the independent variable.
y is the response variable, also called the dependent variable.

Page 3: Scatterplots

Scatterplot: a display in which each observation (x, y) is plotted as a point in the xy-plane.

Page 4: Example

Draw a scatterplot of the percent on-time arrivals vs. percent on-time departures for the 22 airports listed in Exercise 4.29, p. 252, and also in Exercise 13.5, p. 822. (Data file: OnTimeArrivals.xls.)

Does there appear to be a relationship?
How can we tell?
How would we describe that relationship?

Page 5: Linear Association

Draw (or imagine) an oval around the data set.
If the oval is tilted, then there is some linear association.
If the oval is tilted upwards from left to right, then there is positive association.
If the oval is tilted downwards from left to right, then there is negative association.
If the oval is not tilted at all, then there is no linear association.

Page 6: Positive Linear Association
[Figure: scatterplot with axes x and y]

Page 7: Positive Linear Association
[Figure: scatterplot with axes x and y]

Page 8: Negative Linear Association
[Figure: scatterplot with axes x and y]

Page 9: Negative Linear Association
[Figure: scatterplot with axes x and y]

Page 10: No Linear Association
[Figure: scatterplot with axes x and y]

Page 11: No Linear Association
[Figure: scatterplot with axes x and y]

Page 12: Strong vs. Weak Association

The association is strong if the oval is narrow.
The association is weak if the oval is wide.

Page 13: Strong Positive Linear Association
[Figure: scatterplot with axes x and y]

Page 14: Strong Positive Linear Association
[Figure: scatterplot with axes x and y]

Page 15: Weak Positive Linear Association
[Figure: scatterplot with axes x and y]

Page 16: Weak Positive Linear Association
[Figure: scatterplot with axes x and y]

Page 17: TI-83 - Scatterplots

To set up a scatterplot,
Enter the x values in L1.
Enter the y values in L2.
Press 2nd STAT PLOT.
Select Plot1 and press ENTER.

Page 18: TI-83 - Scatterplots

The Stat Plot display appears.
Select On and press ENTER.
Under Type, select the first icon (a small image of a scatterplot) and press ENTER.
For XList, enter L1.
For YList, enter L2.
For Mark, select the one you want and press ENTER.

Page 19: TI-83 - Scatterplots

To draw the scatterplot,
Press ZOOM. The Zoom menu appears.
Select ZoomStat (#9) and press ENTER. The scatterplot appears.
Press TRACE and use the arrow keys to inspect the individual points.

Page 20: Example

Use the TI-83 to draw a scatterplot of the following data.

 x    y
 2    3
 3    5
 5    9
 6   12
 9   16
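For readers without a TI-83 at hand, the same five points can be plotted as a rough text scatterplot. This is only a sketch: the function name `ascii_scatter` is made up for illustration, and it assumes integer data so that each point falls exactly on a character cell.

```python
def ascii_scatter(xs, ys):
    """Return a text scatterplot for integer data.

    One row per integer y (largest y at the top), one column per integer x.
    A '*' marks an observed (x, y) pair; '.' marks an empty cell.
    """
    points = set(zip(xs, ys))
    rows = []
    for y in range(max(ys), min(ys) - 1, -1):
        cells = "".join(
            "*" if (x, y) in points else "."
            for x in range(min(xs), max(xs) + 1)
        )
        rows.append(f"{y:>3} {cells}")
    return "\n".join(rows)


xs = [2, 3, 5, 6, 9]
ys = [3, 5, 9, 12, 16]
print(ascii_scatter(xs, ys))
```

The stars rise from the lower left to the upper right, which is the upward tilt described earlier as positive association.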

Page 21: Simple Linear Regression

To quantify the linear relationship between x and y, we wish to find the equation of the line that "best" fits the data.
Typically, there will be many lines that all look pretty good.
How do we measure how well a line fits the data?

Page 22: Measuring the Goodness of Fit

Start with the scatterplot.
[Figure: scatterplot with axes x and y]

Page 23: Measuring the Goodness of Fit

Draw any line through the scatterplot.
[Figure: scatterplot with axes x and y]

Page 24: Measuring the Goodness of Fit

Measure the vertical distances from every point to the line.
[Figure: scatterplot with axes x and y]

Page 25: Measuring the Goodness of Fit

Each of these represents a deviation, called a residual e, from the line.
[Figure: scatterplot with axes x and y; one vertical distance is labeled e]

Page 26: Residuals

The i-th residual: the difference between the observed value of $y_i$ and the predicted value of $y_i$.
Use $\hat{y}_i$ for the predicted $y_i$.
The formula for the i-th residual is

$e_i = y_i - \hat{y}_i$

Notice that the residual is positive if the data point is above the line and negative if the data point is below the line.
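The residual formula and its sign convention can be sketched in a few lines of Python. The helper names `residuals` and `position` are illustrative, not from the lecture; the data and the line $\hat{y} = -1 + 2x$ are borrowed from the lecture's own worked example.

```python
def residuals(xs, ys, a, b):
    """Return the residuals e_i = y_i - (a + b*x_i) for the line y-hat = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def position(e):
    """Describe where a data point lies relative to the line, given its residual."""
    if e > 0:
        return "above"
    if e < 0:
        return "below"
    return "on"


xs = [2, 3, 5, 6, 9]
ys = [3, 5, 9, 12, 16]
for x, e in zip(xs, residuals(xs, ys, a=-1, b=2)):
    print(f"x = {x}: residual {e:+d}, point is {position(e)} the line")
```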

Page 27: Measuring the Goodness of Fit

Each of these represents a deviation, called a residual e, from the line.
[Figure: scatterplot with axes x and y; labels e, x_i, ŷ_i, and y_i mark one residual]

Page 28: Measuring the Goodness of Fit

Find the sum of the squared residuals.
[Figure: scatterplot with axes x and y; labels e, x_i, ŷ_i, and y_i mark one residual]

Page 29: Measuring the Goodness of Fit

The smaller the sum of squared residuals, the better the fit.
[Figure: scatterplot with axes x and y; labels e, x_i, ŷ_i, and y_i mark one residual]

Page 30: Example

Consider the data points

 x    y
 2    3
 3    5
 5    9
 6   12
 9   16

Page 31: Example
[Figure: scatterplot of the data; x-axis from 2 to 9, y-axis marked at 5, 10, and 15]

Page 32: Least Squares Line

Let's see how good the fit is for the line

$\hat{y} = -1 + 2x$,

where $\hat{y}$ represents the predicted value of y, not the observed value.

Page 33: Sum of Squared Residuals

Begin with the data set.

 x    y
 2    3
 3    5
 5    9
 6   12
 9   16

Page 34: Sum of Squared Residuals

Compute the predicted y, using $\hat{y} = -1 + 2x$.

 x    y    ŷ
 2    3    3
 3    5    5
 5    9    9
 6   12   11
 9   16   17

Page 35: Sum of Squared Residuals

Compute the residuals, $y - \hat{y}$.

 x    y    ŷ   y - ŷ
 2    3    3     0
 3    5    5     0
 5    9    9     0
 6   12   11     1
 9   16   17    -1

Page 36: Sum of Squared Residuals

Compute the squared residuals.

 x    y    ŷ   y - ŷ   (y - ŷ)²
 2    3    3     0        0
 3    5    5     0        0
 5    9    9     0        0
 6   12   11     1        1
 9   16   17    -1        1

Page 37: Sum of Squared Residuals

Compute the sum of the squared residuals.

 x    y    ŷ   y - ŷ   (y - ŷ)²
 2    3    3     0        0
 3    5    5     0        0
 5    9    9     0        0
 6   12   11     1        1
 9   16   17    -1        1

$\sum (y - \hat{y})^2 = 2.00$

Page 38: Sum of Squared Residuals

Now let's see how good the fit is for the line

$\hat{y} = -0.5 + 1.9x$.

Page 39: Sum of Squared Residuals

Begin with the data set.

 x    y
 2    3
 3    5
 5    9
 6   12
 9   16

Page 40: Sum of Squared Residuals

Compute the predicted y, using $\hat{y} = -0.5 + 1.9x$.

 x    y     ŷ
 2    3    3.3
 3    5    5.2
 5    9    9.0
 6   12   10.9
 9   16   16.6

Page 41: Sum of Squared Residuals

Compute the residuals, $y - \hat{y}$.

 x    y     ŷ    y - ŷ
 2    3    3.3   -0.3
 3    5    5.2   -0.2
 5    9    9.0    0.0
 6   12   10.9    1.1
 9   16   16.6   -0.6

Page 42: Sum of Squared Residuals

Compute the squared residuals.

 x    y     ŷ    y - ŷ   (y - ŷ)²
 2    3    3.3   -0.3     0.09
 3    5    5.2   -0.2     0.04
 5    9    9.0    0.0     0.00
 6   12   10.9    1.1     1.21
 9   16   16.6   -0.6     0.36

Page 43: Sum of Squared Residuals

Compute the sum of the squared residuals.

 x    y     ŷ    y - ŷ   (y - ŷ)²
 2    3    3.3   -0.3     0.09
 3    5    5.2   -0.2     0.04
 5    9    9.0    0.0     0.00
 6   12   10.9    1.1     1.21
 9   16   16.6   -0.6     0.36

$\sum (y - \hat{y})^2 = 1.70$

Page 44: Sum of Squared Residuals

We conclude that $\hat{y} = -0.5 + 1.9x$ is a better fit than $\hat{y} = -1 + 2x$, because its sum of squared residuals is smaller (1.70 vs. 2.00).
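The comparison of the two candidate lines can be checked with a short Python sketch. The function name `sum_squared_residuals` is an illustrative choice; the data and both lines come from the worked tables.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Sum of (y - y-hat)^2 over all points, where y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


xs = [2, 3, 5, 6, 9]
ys = [3, 5, 9, 12, 16]

ssr_first = sum_squared_residuals(xs, ys, a=-1, b=2)       # y-hat = -1 + 2x
ssr_second = sum_squared_residuals(xs, ys, a=-0.5, b=1.9)  # y-hat = -0.5 + 1.9x

print(f"-1 + 2x:     {ssr_first:.2f}")
print(f"-0.5 + 1.9x: {ssr_second:.2f}")
print("better fit:", "-0.5 + 1.9x" if ssr_second < ssr_first else "-1 + 2x")
```

The line with the smaller sum is reported as the better fit, which is the criterion the lecture uses.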

Page 45: Sum of Squared Residuals
[Figure: scatterplot of the data with the line ŷ = -1 + 2x; x-axis from 2 to 9, y-axis marked at 5, 10, and 15]

Page 46: Sum of Squared Residuals
[Figure: scatterplot of the data with the line ŷ = -0.5 + 1.9x; x-axis from 2 to 9, y-axis marked at 5, 10, and 15]

Page 47: Least Squares Line

Least squares line: the line for which the sum of the squares of the vertical distances (the residuals) is as small as possible.
The least squares line is also called the line of best fit or the regression line.

Page 48: Example

For all the lines that one could draw through this data set,

 x    y
 2    3
 3    5
 5    9
 6   12
 9   16

it turns out that 1.70 is the smallest possible value for the sum of the squares of the residuals.

Page 49: Example

Therefore,

$\hat{y} = -0.5 + 1.9x$

is the regression line for this data set.
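The slides state this result without deriving it. As a check, the standard closed-form least squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x) can be applied to the data. These formulas are not given in this lecture, and the function name `least_squares_line` is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (a, b) minimizing the sum of squared residuals for y-hat = a + b*x.

    Uses the standard closed form: b = Sxy / Sxx and a = y_bar - b * x_bar.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b


a, b = least_squares_line([2, 3, 5, 6, 9], [3, 5, 9, 12, 16])
print(f"y-hat = {a:.1f} + {b:.1f}x")
```

For this data set the means are 5 and 9, Sxy = 57, and Sxx = 30, so the slope is 57/30 = 1.9 and the intercept is 9 - 1.9(5) = -0.5, matching the slide.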

Page 50: Regression Line

We will write the regression line as

$\hat{y} = a + bx$

a is the y-intercept.
b is the slope.
This is the usual slope-intercept form y = mx + b with the two terms rearranged and relabeled.

Page 51: TI-83 - Computing Residuals

It is not hard to compute the residuals and the sum of their squares on the TI-83. (Later, we will see a faster method.)
Enter the x-values in list L1 and the y-values in list L2.
Compute a + b*L1 and store in list L3 (the ŷ values).
Compute (L2 - L3)². This is a list of the squared residuals.
Compute sum(Ans). This is the sum of the squared residuals.

Page 52: TI-83 - Computing Residuals

Enter the data set

 x    y
 2    3
 3    5
 5    9
 6   12
 9   16

and use the equation $\hat{y} = -0.5 + 1.9x$ to compute the sum of squared residuals.
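The calculator steps for this exercise can be mirrored one-for-one in Python, which may help when checking the TI-83 result. The variable names `L1`, `L2`, `L3`, and `ans` are chosen to match the calculator's lists and are otherwise arbitrary; rounding to two decimals only hides floating-point noise.

```python
a, b = -0.5, 1.9            # coefficients of the line y-hat = a + b*x

L1 = [2, 3, 5, 6, 9]        # x-values
L2 = [3, 5, 9, 12, 16]      # y-values

# Step: compute a + b*L1 and store it in L3 (the predicted values).
L3 = [a + b * x for x in L1]

# Step: compute (L2 - L3)^2, the list of squared residuals.
ans = [(y - y_hat) ** 2 for y, y_hat in zip(L2, L3)]

# Step: compute sum(Ans), the sum of the squared residuals.
total = sum(ans)

print([round(v, 2) for v in ans])
print(round(total, 2))
```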

Page 53: Prediction

Use the regression line to predict y when
x = 4
x = 7
x = 20

Interpolation: using an x value within the observed extremes of x values to predict y.
Extrapolation: using an x value beyond the observed extremes of x values to predict y.
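The three predictions, and whether each is an interpolation or an extrapolation, can be sketched as follows. The helper names `predict` and `kind_of_prediction` are illustrative; the coefficients are those of the regression line found above, and the observed x values run from 2 to 9.

```python
def predict(x, a=-0.5, b=1.9):
    """Predicted y on the regression line y-hat = a + b*x."""
    return a + b * x


def kind_of_prediction(x, observed_xs):
    """Classify a prediction by whether x lies within the observed range of x values."""
    if min(observed_xs) <= x <= max(observed_xs):
        return "interpolation"
    return "extrapolation"


observed_xs = [2, 3, 5, 6, 9]
for x in (4, 7, 20):
    print(f"x = {x}: predicted y = {predict(x):.1f} ({kind_of_prediction(x, observed_xs)})")
```

Only x = 20 falls outside the observed range, so its predicted value is the one to treat with the most caution.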

Page 54: Interpolation vs. Extrapolation

Interpolated values are more reliable than extrapolated values.
The farther out the values are extrapolated, the less reliable they are.