
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 13 Multiple Regression Section 13.1 Using Several Variables to Predict a Response


Chapter 13 Multiple Regression

Section 13.1 Using Several Variables to Predict a Response


Regression Models

The model that contains only two variables, x and y, is called a bivariate model.

$\mu_y = \alpha + \beta x$


Suppose there are two predictors, denoted by $x_1$ and $x_2$.

$\mu_y = \alpha + \beta_1 x_1 + \beta_2 x_2$

This is called a multiple regression model.


Multiple Regression Model

The multiple regression model relates the mean $\mu_y$ of a quantitative response variable $y$ to a set of explanatory variables $x_1, x_2, \ldots$


Example: For three explanatory variables, the multiple regression equation is:

$\mu_y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$


Example: The sample prediction equation with three explanatory variables is:

$\hat{y} = a + b_1 x_1 + b_2 x_2 + b_3 x_3$


Example: Predicting Selling Price Using House Size and Number of Bedrooms

The data set “house selling prices” contains observations on 100 home sales in Florida in November 2003.

A multiple regression analysis was done with selling price as the response variable and with house size and number of bedrooms as the explanatory variables.


Output from the analysis:

Table 13.3 Regression of Selling Price on House Size and Bedrooms. The regression equation is price = 60,102 + 63.0 house size + 15,170 bedrooms.


Prediction Equation:

$\hat{y} = 60,102 + 63.0x_1 + 15,170x_2$

where $y$ = selling price, $x_1$ = house size, and $x_2$ = number of bedrooms.


One house listed in the data set had house size = 1679 square feet and number of bedrooms = 3.

Find its predicted selling price:

$\hat{y} = 60,102 + 63.0(1679) + 15,170(3) = \$211,389$


Find its residual:

$y - \hat{y} = 232,500 - 211,389 = 21,111$

The residual tells us that the actual selling price was \$21,111 higher than predicted.
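The prediction and residual for this house can be checked with a few lines of Python. This is a minimal sketch: the coefficients, the house size, the number of bedrooms, and the actual price come from the example above, while the function and constant names are illustrative choices and not part of the original analysis.

```python
# Coefficients taken from the prediction equation in the example.
INTERCEPT = 60_102        # a
SLOPE_SIZE = 63.0         # b1: dollars per square foot of house size
SLOPE_BEDROOMS = 15_170   # b2: dollars per bedroom


def predict_price(house_size, bedrooms):
    """Predicted selling price (dollars) from the fitted equation."""
    return INTERCEPT + SLOPE_SIZE * house_size + SLOPE_BEDROOMS * bedrooms


def residual(actual_price, house_size, bedrooms):
    """Observed selling price minus predicted selling price."""
    return actual_price - predict_price(house_size, bedrooms)


predicted = predict_price(1679, 3)
print(f"predicted: ${predicted:,.0f}")                    # predicted: $211,389
print(f"residual:  ${residual(232_500, 1679, 3):,.0f}")   # residual:  $21,111
```

A positive residual means the house sold for more than the equation predicts; a negative one would mean it sold for less.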


The Number of Explanatory Variables

You should not use many explanatory variables in a multiple regression model unless you have lots of data.

A rough guideline is that the sample size n should be at least 10 times the number of explanatory variables.

For example, to use two explanatory variables, you should have at least n = 20.
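The guideline is easy to express as a quick check. The sketch below assumes the factor of 10 stated above; the function names are hypothetical helpers, and the rule remains a rough guide, not a formal requirement.

```python
def min_sample_size(num_predictors, per_predictor=10):
    """Smallest n suggested by the 'at least 10 per explanatory variable' guideline."""
    return per_predictor * num_predictors


def enough_data(n, num_predictors, per_predictor=10):
    """True if the sample size n meets the rough guideline."""
    return n >= min_sample_size(num_predictors, per_predictor)


print(min_sample_size(2))    # 20
print(enough_data(100, 2))   # True: 100 home sales, two explanatory variables
```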


Plotting Relationships

Always look at the data before doing a multiple regression.

Most software has the option of constructing scatterplots on a single graph for each pair of variables. This is called a scatterplot matrix.


Figure 13.1 Scatterplot Matrix for Selling Price, House Size, and Number of Bedrooms. The middle plot in the top row has house size on the x-axis and selling price on the y-axis. The first plot in the second row reverses this, with selling price on the x-axis and house size on the y-axis. Question: Why are the plots of main interest the ones in the first row?


Interpretation of Multiple Regression Coefficients

The simplest way to interpret a multiple regression equation looks at it in two dimensions as a function of a single explanatory variable.

We can look at it this way by fixing values for the other explanatory variable(s).


Example using the housing data:

Suppose we fix the number of bedrooms at $x_2 = 3$.

The prediction equation becomes:

$\hat{y} = 60,102 + 63.0x_1 + 15,170(3) = 105,612 + 63.0x_1$


Since the slope coefficient of $x_1$ is 63, the predicted selling price increases, for houses with this number of bedrooms, by \$63.00 for every additional square foot in house size.

For a 100 square-foot increase in house size, the predicted selling price increases by 100(63.00) = \$6300.
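To see where the 6300 comes from, subtract the prediction at house size $x_1$ from the prediction at $x_1 + 100$ while holding the number of bedrooms $x_2$ fixed. The intercept and the bedroom term cancel, leaving only the slope times the change in house size:

$$
\hat{y}(x_1 + 100, x_2) - \hat{y}(x_1, x_2) = [60,102 + 63.0(x_1 + 100) + 15,170x_2] - [60,102 + 63.0x_1 + 15,170x_2] = 63.0(100) = 6300
$$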


Summarizing the Effect While Controlling for a Variable

The multiple regression model assumes that the slope for a particular explanatory variable is identical for all fixed values of the other explanatory variables.


For example, the coefficient of $x_1$ in the prediction equation

$\hat{y} = 60,102 + 63.0x_1 + 15,170x_2$

is 63.0 regardless of whether we plug in $x_2 = 1$ or $x_2 = 2$ or $x_2 = 3$.
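Plugging in each number of bedrooms shows the pattern directly: only the intercept changes, so the lines are parallel. The following is a small sketch using the coefficients from the prediction equation; the function name `line_for_bedrooms` is an illustrative choice.

```python
# Coefficients taken from the prediction equation in the example.
INTERCEPT = 60_102
SLOPE_SIZE = 63.0
SLOPE_BEDROOMS = 15_170


def line_for_bedrooms(bedrooms):
    """Return (intercept, slope) of y-hat as a function of x1 with x2 held fixed."""
    return INTERCEPT + SLOPE_BEDROOMS * bedrooms, SLOPE_SIZE


for bedrooms in (1, 2, 3):
    intercept, slope = line_for_bedrooms(bedrooms)
    print(f"x2 = {bedrooms}: y-hat = {intercept:,} + {slope} x1")
# x2 = 1: y-hat = 75,272 + 63.0 x1
# x2 = 2: y-hat = 90,442 + 63.0 x1
# x2 = 3: y-hat = 105,612 + 63.0 x1
```

Each additional bedroom raises the intercept by 15,170 while the slope for house size stays at 63.0.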


Figure 13.2 The Relationship Between $\hat{y}$ and $x_1$ for the Multiple Regression Equation $\hat{y} = 60,102 + 63.0x_1 + 15,170x_2$. This shows how the equation simplifies when number of bedrooms $x_2 = 1$, or $x_2 = 2$, or $x_2 = 3$. Question: The lines move upward (to higher $\hat{y}$-values) as $x_2$ increases. How would you interpret this fact?


Slopes in Multiple Regression and in Bivariate Regression

In multiple regression, a slope describes the effect of an explanatory variable while controlling for the effects of the other explanatory variables in the model.

Bivariate regression has only a single explanatory variable. A slope in bivariate regression describes the effect of that variable while ignoring all other possible explanatory variables.
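The difference between the two kinds of slope can be seen on a tiny made-up data set. The sketch below does not use the house selling prices data: the values of `x1`, `x2`, and `y` are invented for illustration, with `y` built exactly as 10 + 2·x1 + 5·x2 and with `x1` and `x2` correlated. The slopes are computed with the standard least squares formulas for one and two predictors.

```python
def centered_sum(u, v):
    """Sum of products of deviations from the means of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


# Made-up illustrative data (NOT the house selling prices data set).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 1, 2, 2, 3, 3]                      # tends to rise with x1
y = [10 + 2 * a + 5 * b for a, b in zip(x1, x2)]

s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)

# Bivariate least squares slope of y on x1, ignoring x2.
bivariate_slope = s1y / s11

# Least squares slopes with both predictors (solution of the normal equations).
det = s11 * s22 - s12 ** 2
multiple_slope_x1 = (s22 * s1y - s12 * s2y) / det
multiple_slope_x2 = (s11 * s2y - s12 * s1y) / det

print(f"bivariate slope for x1: {bivariate_slope:.2f}")    # 4.29
print(f"multiple slope for x1:  {multiple_slope_x1:.2f}")  # 2.00
print(f"multiple slope for x2:  {multiple_slope_x2:.2f}")  # 5.00
```

Ignoring `x2` inflates the slope for `x1` here because part of the effect of `x2` is attributed to `x1`; controlling for `x2` recovers the slope of 2 that was used to build the data.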


Importance of Multiple Regression

One of the main uses of multiple regression is to identify potential lurking variables and control for them by including them as explanatory variables in the model.

Doing so can have a major impact on a variable’s effect.

When we control a variable, we keep that variable from influencing the associations among the other variables in the study.