Chapter 12: Analyzing Association Between Quantitative Variables: Regression Analysis

Section 12.1: How Can We Model How Two Variables Are Related?

Learning Objectives

1. Regression Analysis

2. The Scatterplot

3. The Regression Line Equation

4. Outliers

5. Influential Points

6. Residuals are Prediction Errors

7. Regression Model: A Line Describes How the Mean of y Depends on x

8. The Population Regression Equation

9. Variability about the Line

10. A Statistical Model

Learning Objective 1: Regression Analysis

The first step of a regression analysis is to identify the response and explanatory variables

We use y to denote the response variable

We use x to denote the explanatory variable

Learning Objective 2: The Scatterplot

The first step in answering the question of association is to look at the data

A scatterplot is a graphical display of the relationship between the response variable (y-axis) and the explanatory variable (x-axis)

Learning Objective 2: Example: What Do We Learn from a Scatterplot in the Strength Study?

An experiment was designed to measure the strength of female athletes

The goal of the experiment was to find the maximum number of pounds that each individual athlete could bench press

57 high school female athletes participated in the study

The data consisted of the following variables:

x: the number of 60-pound bench presses an athlete could do

y: maximum bench press

For the 57 girls in this study, these variables are summarized by:

x: mean = 11.0, st. dev. = 7.1

y: mean = 79.9 lbs, st. dev. = 13.3 lbs

Learning Objective 3: The Regression Line Equation

When the scatterplot shows a linear trend, a straight line can be fitted through the data points to describe that trend

The regression line is:

$\hat{y} = a + bx$

$\hat{y}$ is the predicted value of the response variable y, $a$ is the y-intercept, and $b$ is the slope

Learning Objective 3: Example: What Do We Learn from a Scatterplot in the Strength Study?

The MINITAB output shows the following regression equation:

BP = 63.5 + 1.49 (BP_60)

The y-intercept is 63.5 and the slope is 1.49

The slope of 1.49 tells us that predicted maximum bench press increases by about 1.5 pounds for every additional 60-pound bench press an athlete can do
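
The interpretation of the intercept and slope can be checked by evaluating the MINITAB equation directly. This is a minimal sketch: the function name `predict_max_bench_press` is hypothetical, and the coefficients are the rounded values reported on the slide.

```python
def predict_max_bench_press(bp_60: float) -> float:
    """Predicted maximum bench press (lbs) from the number of 60-pound bench presses."""
    intercept = 63.5  # y-intercept a, as reported in the MINITAB output
    slope = 1.49      # slope b, as reported in the MINITAB output
    return intercept + slope * bp_60


if __name__ == "__main__":
    # Each additional 60-pound bench press adds the slope (1.49 lbs) to the prediction
    for reps in (10, 11, 12):
        print(reps, round(predict_max_bench_press(reps), 2))
```

At x = 11, the sample mean of x, the prediction is about 79.9 pounds, which agrees with the sample mean of y up to rounding.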

Learning Objective 4: Outliers

Check for outliers by plotting the data

The regression line can be pulled toward an outlier and away from the general trend of points

Learning Objective 5: Influential Points

An observation can be influential in affecting the regression line when two things happen:

Its x value is low or high compared to the rest of the data

It does not fall in the straight-line pattern that the rest of the data have

Learning Objective 6: Residuals are Prediction Errors

The regression equation is often called a prediction equation

The difference between an observed outcome and its predicted value is the prediction error, called a residual: $y - \hat{y}$

Learning Objective 6: Residuals

Each observation has a residual

A residual is the vertical distance between the data point and the regression line

We can summarize how near the regression line the data points fall by

$\text{sum of squared residuals} = \sum (\text{residual})^2 = \sum (y - \hat{y})^2$

The regression line has the smallest sum of squared residuals and is called the least squares line
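
The least squares idea can be sketched in a few lines of code. The slides do not give formulas for a and b; the closed-form expressions used here (b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², a = ȳ − b·x̄) are the standard ones for a straight-line fit, and the function names and data are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # y-intercept
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of (y - y-hat)^2 for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


if __name__ == "__main__":
    # Made-up illustrative data, not the strength study data
    xs = [1, 2, 3, 4, 5]
    ys = [2.0, 4.1, 5.9, 8.2, 9.8]
    a, b = least_squares_line(xs, ys)
    best = sum_squared_residuals(xs, ys, a, b)
    # Shifting the intercept or slope away from the fit increases the sum
    shifted = sum_squared_residuals(xs, ys, a + 0.5, b)
    tilted = sum_squared_residuals(xs, ys, a, b + 0.1)
    print(a, b, best, shifted, tilted)
```

Comparing `best` with `shifted` and `tilted` illustrates the defining property: any other line has a larger sum of squared residuals.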

Learning Objective 7: Regression Model: A Line Describes How the Mean of y Depends on x

At a given value of x, the equation:

$\hat{y} = a + bx$

Predicts a single value of the response variable

But… we should not expect all subjects at that value of x to have the same value of y

Variability occurs in the y values

Learning Objective 7: The Regression Line

The regression line connects the estimated means of y at the various x values

In summary,

$\hat{y} = a + bx$

Describes the relationship between x and the estimated means of y at the various values of x

Learning Objective 8: The Population Regression Equation

The population regression equation describes the relationship in the population between x and the means of y

The equation is:

$\mu_y = \alpha + \beta x$

In the population regression equation, α is a population y-intercept and β is a population slope

These are parameters

In practice we estimate the population regression equation using the prediction equation for the sample data

The population regression equation merely approximates the actual relationship between x and the population means of y

It is a model

A model is a simple approximation for how variables relate in the population

Learning Objective 8: The Regression Model

If the true relationship is far from a straight line, this regression model may be a poor one

Learning Objective 9: Variability about the Line

At each fixed value of x, variability occurs in the y values around their mean, µy

The probability distribution of y values at a fixed value of x is a conditional distribution

At each value of x, there is a conditional distribution of y values

An additional parameter σ describes the standard deviation of each conditional distribution

Learning Objective 10: A Statistical Model

A statistical model never holds exactly in practice.

It is merely an approximation for reality

Even though it does not describe reality exactly, a model is useful if the true relationship is close to what the model predicts

Section 12.2: How Can We Describe Strength of Association?

Learning Objectives

1. Correlation and Slope

2. Example: What’s the Correlation for Predicting Strength?

3. The Squared Correlation

Learning Objective 1: Correlation

The correlation, denoted by r, describes linear association

The correlation ‘r’ has the same sign as the slope ‘b’

The correlation ‘r’ always falls between -1 and +1

The larger the absolute value of r, the stronger the linear association

Learning Objective 1: Correlation and Slope

We can’t use the slope to describe the strength of the association between two variables because the slope’s numerical value depends on the units of measurement

The correlation is a standardized version of the slope

The correlation does not depend on units of measurement.

The correlation and the slope are related in the following way:

$r = b \left( \frac{s_x}{s_y} \right)$

Learning Objective 2: Example: What’s the Correlation for Predicting Strength?

For the female athlete strength study:

x: number of 60-pound bench presses

y: maximum bench press

x: mean = 11.0, st. dev. = 7.1

y: mean = 79.9 lbs., st. dev. = 13.3 lbs.

Regression equation:

$\hat{y} = 63.5 + 1.49x$

$r = b \left( \frac{s_x}{s_y} \right) = 1.49 \left( \frac{7.1}{13.3} \right) = 0.80$

The variables have a strong, positive association
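
The arithmetic behind this correlation can be reproduced from the slope and the two standard deviations. A minimal sketch, with a hypothetical function name and the rounded summary values from the strength study:

```python
def correlation_from_slope(b: float, s_x: float, s_y: float) -> float:
    """Correlation r = b * (s_x / s_y), the standardized version of the slope."""
    return b * (s_x / s_y)


if __name__ == "__main__":
    # Rounded summary statistics from the strength study
    r = correlation_from_slope(b=1.49, s_x=7.1, s_y=13.3)
    print(round(r, 2))
```

Because the inputs are rounded, the result agrees with the reported r = 0.80 only to two decimal places.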

Learning Objective 3: The Squared Correlation

Another way to describe the strength of association refers to how close predictions for y tend to be to observed y values

The variables are strongly associated if you can predict y much better by substituting x values into the prediction equation than by merely using the sample mean $\bar{y}$ and ignoring x

Consider the prediction error: the difference between the observed and predicted values of y

Using the regression line to make a prediction, each error is: $y - \hat{y}$

Using only the sample mean, $\bar{y}$, to make a prediction, each error is: $y - \bar{y}$

When we predict y using $\bar{y}$ (that is, ignoring x), the error summary equals:

$\sum (y - \bar{y})^2$

This is called the total sum of squares

When we predict y using x with the regression equation, the error summary is:

$\sum (y - \hat{y})^2$

This is called the residual sum of squares

When a strong linear association exists, the regression equation predictions tend to be much better than the predictions using $\bar{y}$

We measure the proportional reduction in error and call it $r^2$

We use the notation $r^2$ for this measure because it equals the square of the correlation r

$r^2 = \frac{\sum (y - \bar{y})^2 - \sum (y - \hat{y})^2}{\sum (y - \bar{y})^2}$
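
The claim that the proportional reduction in error equals the square of the correlation can be checked numerically. The sketch below uses made-up data and hypothetical function names; the least squares and correlation formulas are the standard ones and are not spelled out in the slides.

```python
import math


def fit_line(xs, ys):
    """Least squares intercept a and slope b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def r_squared(xs, ys):
    """Proportional reduction in error: (total SS - residual SS) / total SS."""
    a, b = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    total_ss = sum((y - y_bar) ** 2 for y in ys)
    residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return (total_ss - residual_ss) / total_ss


def correlation(xs, ys):
    """Sample correlation r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up illustrative data
    xs = [1, 2, 3, 4, 5, 6]
    ys = [3.1, 4.9, 7.2, 8.8, 11.3, 12.9]
    print(r_squared(xs, ys), correlation(xs, ys) ** 2)
```

The two printed values agree up to floating-point rounding.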

Learning Objective 3: The Squared Correlation

Example: What Does $r^2$ Tell Us in the Strength Study?

For the female athlete strength study:

x: number of 60-pound bench presses

y: maximum bench press

The correlation value was found to be r = 0.80

We can calculate $r^2$ from r: $(0.80)^2 = 0.64$

For predicting maximum bench press, the regression equation has 64% less error than $\bar{y}$ has

Properties:

$r^2$ falls between 0 and 1

$r^2 = 1$ when $\sum (y - \hat{y})^2 = 0$. This happens only when all the data points fall exactly on the regression line

$r^2 = 0$ when $\sum (y - \hat{y})^2 = \sum (y - \bar{y})^2$. This happens when the slope b = 0, in which case each $\hat{y} = \bar{y}$

The closer $r^2$ is to 1, the stronger the linear association: the more effective the regression equation $\hat{y}$ is compared to $\bar{y}$ in predicting y

Learning Objective 3: Correlation r and Its Square $r^2$

Both r and $r^2$ describe the strength of association

‘r’ falls between -1 and +1

It represents the slope of the regression line when x and y have been standardized

‘$r^2$’ falls between 0 and 1

It summarizes the reduction in sum of squared errors in predicting y using the regression line instead of using $\bar{y}$

Section 12.3: How Can We Make Inferences About the Association?

Learning Objectives

1. Descriptive and Inferential Parts of Regression

2. Assumptions for Regression Analysis

3. Testing Independence between Quantitative Variables

4. A Confidence Interval for β

Learning Objective 1: Descriptive and Inferential Parts of Regression

The sample regression equation, r, and r2 are descriptive parts of a regression analysis

The inferential parts of regression use the tools of confidence intervals and significance tests to provide inference about the regression equation, the correlation and r-squared in the population of interest

Learning Objective 2: Assumptions for Regression Analysis

Basic assumption for using regression line for description:

The population means of y at different values of x have a straight-line relationship with x, that is:

$\mu_y = \alpha + \beta x$

This assumption states that a straight-line regression model is valid

This can be verified with a scatterplot.

Extra assumptions for using regression to make statistical inference:

The data were gathered using randomization

The population values of y at each value of x follow a normal distribution, with the same standard deviation at each x value

Models, such as the regression model, merely approximate the true relationship between the variables

A relationship will not be exactly linear, with exactly normal distributions for y at each x and with exactly the same standard deviation of y values at each x value

Learning Objective 3: Testing Independence between Quantitative Variables

Suppose that the slope β of the regression line equals 0

Then…

The mean of y is identical at each x value

The two variables, x and y, are statistically independent:

The outcome for y does not depend on the value of x

It does not help us to know the value of x if we want to predict the value of y

Steps of Two-Sided Significance Test about a Population Slope β:

1. Assumptions:

The population satisfies regression line: $\mu_y = \alpha + \beta x$

Randomization

The population values of y at each value of x follow a normal distribution, with the same standard deviation at each x value

2. Hypotheses:

H0: β = 0, Ha: β ≠ 0

3. Test statistic:

$t = \frac{b - 0}{se}$

Software supplies sample slope b and its se

4. P-value: Two-tail probability of t test statistic value more extreme than observed

Use t distribution with df = n-2

5. Conclusions: Interpret P-value in context

If decision needed, reject H0 if P-value ≤ significance level
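
The test statistic in step 3 can be sketched from raw data. The slides leave b and its se to software; the se formula used here, s / sqrt(Σ(x − x̄)²) with s the residual standard deviation, is the standard one for simple linear regression and is an addition to the slides. The function name and data are made up, and the P-value step is omitted because the t distribution is not in Python's standard library.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b, se, t, df) for testing H0: beta = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(residual_ss / (n - 2))  # residual standard deviation
    se = s / math.sqrt(sxx)               # standard error of the slope
    t = (b - 0) / se                      # test statistic under H0: beta = 0
    return b, se, t, n - 2


if __name__ == "__main__":
    # Made-up illustrative data
    xs = [1, 2, 3, 4, 5]
    ys = [2.0, 4.1, 5.9, 8.2, 9.8]
    print(slope_t_statistic(xs, ys))
```

The returned t and df would then be passed to statistical software or a t table to obtain the two-tail P-value.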

Learning Objective 3: Example: Is Strength Associated with 60-Pound Bench Press?

Conduct a two-sided significance test of the null hypothesis of independence

Assumptions:

A scatterplot of the data revealed a linear trend so the straight-line regression model seems appropriate

The scatter of points has a similar spread at different x values

The sample was a convenience sample, not a random sample, so this is a concern

Hypotheses: H0: β = 0, Ha: β ≠ 0

Test statistic:

$t = \frac{b - 0}{se} = \frac{1.49 - 0}{0.150} = 9.96$

P-value: 0.000

Conclusion: An association exists between the number of 60-pound bench presses and maximum bench press

Learning Objective 4: A Confidence Interval for β

A small P-value in the significance test of H0: β = 0 suggests that the population regression line has a nonzero slope

To learn how far the slope β falls from 0, we construct a confidence interval:

$b \pm t_{.025}(se)$ with $df = n - 2$

Learning Objective 4: Example: Estimating the Slope for Predicting Maximum Bench Press

Construct a 95% confidence interval for β

$1.49 \pm 2.00(0.150)$, which is $1.49 \pm 0.30$ or (1.2, 1.8)

Based on a 95% CI, we can conclude that, on average, the maximum bench press increases by between 1.2 and 1.8 pounds for each additional 60-pound bench press that an athlete can do

Let’s estimate the effect of a 10-unit increase in x:

Since the 95% CI for β is (1.2, 1.8), the 95% CI for 10β is (12, 18)

On the average, we infer that the maximum bench press increases by at least 12 pounds and at most 18 pounds, for an increase of 10 in the number of 60-pound bench presses
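
The interval arithmetic for β and for 10β can be reproduced as follows. A minimal sketch: the function name is hypothetical, and the inputs (b = 1.49, se = 0.150, t-score 2.00) are the values quoted in the example.

```python
def slope_confidence_interval(b: float, se: float, t_score: float):
    """Confidence interval b +/- t_score * se for the population slope."""
    margin = t_score * se
    return b - margin, b + margin


if __name__ == "__main__":
    low, high = slope_confidence_interval(b=1.49, se=0.150, t_score=2.00)
    # Interval for the slope, rounded to one decimal place
    print(round(low, 1), round(high, 1))
    # Interval for the effect of a 10-unit increase in x
    print(round(10 * low), round(10 * high))
```

Multiplying both endpoints by 10 gives the interval for 10β because the interval scales with the size of the increase in x.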

Section 12.4: What Do We Learn from How the Data Vary Around the Regression Line?

Learning Objectives

1. Residuals and Standardized Residuals

2. Analyzing Large Standardized Residuals

3. The Residual Standard Deviation

4. Confidence Interval for µy

5. Prediction Interval for y

6. Prediction Interval for y vs Confidence Interval for µy

Learning Objective 1: Residuals and Standardized Residuals

A residual is a prediction error – the difference between an observed outcome and its predicted value

The magnitude of these residuals depends on the units of measurement for y

A standardized version of the residual does not depend on the units

Learning Objective 1: Standardized Residuals

Standardized residual:

$\frac{y - \hat{y}}{se(y - \hat{y})}$

The se formula is complex, so we rely on software to find it

A standardized residual indicates how many standard errors a residual falls from 0

If the relationship is truly linear and the standardized residuals have approximately a bell-shaped distribution, observations with standardized residuals larger than 3 in absolute value often represent outliers
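
For readers curious about what software does here, the sketch below computes standardized residuals by hand. The slides do not give the se formula; the one used here, s·sqrt(1 − h) with leverage h = 1/n + (x − x̄)² / Σ(x − x̄)², is the standard expression for simple linear regression and is an assumption added for illustration. The function name and data are made up.

```python
import math


def standardized_residuals(xs, ys):
    """Residuals divided by their standard errors, for a straight-line fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    result = []
    for x, e in zip(xs, residuals):
        leverage = 1 / n + (x - x_bar) ** 2 / sxx   # assumed standard formula
        result.append(e / (s * math.sqrt(1 - leverage)))
    return result


if __name__ == "__main__":
    # Made-up data: an exact line with one observation pushed upward
    xs = list(range(1, 11))
    ys = [2 * x + 1 for x in xs]
    ys[4] += 5
    for x, z in zip(xs, standardized_residuals(xs, ys)):
        print(x, round(z, 2))
```

In this made-up data set the shifted observation has by far the largest standardized residual, which is how such a point would be flagged.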

Learning Objective 1: Example: Detecting an Underachieving College Student

Data was collected on a sample of 59 students at the University of Georgia

Two of the variables were:

CGPA: College Grade Point Average

HSGPA: High School Grade Point Average

A regression equation was created from the data:

x: HSGPA

y: CGPA

Equation: $\hat{y} = 1.19 + 0.64x$

MINITAB highlights observations that have standardized residuals with absolute value larger than 2:

Consider the reported standardized residual of -3.14

This indicates that the residual is 3.14 standard errors below 0

This student’s actual college GPA is quite far below what the regression line predicts

Learning Objective 2: Analyzing Large Standardized Residuals

Does it fall well away from the linear trend that the other points follow?

Does it have too much influence on the results?

Note: Some large standardized residuals may occur just because of ordinary random variability. Even if the model is perfect, we’d expect about 5% of the standardized residuals to have absolute values > 2 by chance.

Learning Objective 2: Histogram of Residuals

A histogram of residuals or standardized residuals is a good way of detecting unusual observations

A histogram is also a good way of checking the assumption that the conditional distribution of y at each x value is normal

Look for a bell-shaped histogram

Suppose the histogram is not bell-shaped:

The distribution of the residuals is not normal

However…

Two-sided inferences about the slope parameter still work quite well

The t-inferences are robust

Learning Objective 3: The Residual Standard Deviation

For statistical inference, the regression model assumes that the conditional distribution of y at a fixed value of x is normal, with the same standard deviation at each x

This standard deviation, denoted by σ, refers to the variability of y values for all subjects with the same x value

The estimate of σ, obtained from the data, is:

$s = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$

Learning Objective 3: Example: How Variable are the Athletes’ Strengths?

From MINITAB output, we obtain s, the residual standard deviation of y:

$s = \sqrt{\frac{3522.8}{55}} = 8.0$

For any given x value, we estimate the mean y value using the regression equation and we estimate the standard deviation using s: s = 8.0
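
The residual standard deviation for the strength study can be recomputed from the residual sum of squares (3522.8) and the sample size (n = 57, so df = 55). A minimal sketch with a hypothetical function name:

```python
import math


def residual_standard_deviation(residual_ss: float, n: int) -> float:
    """Estimate of sigma: sqrt(residual sum of squares / (n - 2))."""
    return math.sqrt(residual_ss / (n - 2))


if __name__ == "__main__":
    s = residual_standard_deviation(residual_ss=3522.8, n=57)
    print(round(s, 1))
```

The divisor is n − 2 rather than n because two parameters, the intercept and the slope, are estimated from the data.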

Learning Objective 4: Confidence Interval for µy

We estimate µy, the population mean of y at a given value of x by:

$\hat{y} = a + bx$

We can construct a 95% confidence interval for µy using:

$\hat{y} \pm t_{.025}(\text{se of } \hat{y})$

where the t-score has df = n-2

Learning Objective 5: Prediction Interval for y

The estimate $\hat{y} = a + bx$ for the mean of y at a fixed value of x is also a prediction for an individual outcome y at the fixed value of x

Most regression software will form this interval within which an outcome y is likely to fall

$\hat{y} \pm 2s$

where s is the residual standard deviation

Learning Objective 6: Prediction Interval for y vs Confidence Interval for µy

The prediction interval for y is an inference about where individual observations fall

Use a prediction interval for y if you want to predict where a single observation on y will fall for a particular x value

The confidence interval for µy is an inference about where a population mean falls

Use a confidence interval for µy if you want to estimate the mean of y for all individuals having a particular x value

$\hat{y} \pm 2\left( \frac{s}{\sqrt{n}} \right)$

where s is the residual standard deviation

Note that the prediction interval is wider than the confidence interval - you can estimate a population mean more precisely than you can predict a single observation

Caution: in order for these intervals to be valid, the true relationship must be close to linear with about the same variability of y-values at each fixed x-value

Learning Objective 6: Example: Predicting Maximum Bench Press and Estimating its Mean

Use the MINITAB output to find and interpret a 95% CI for the population mean of the maximum bench press values for all female high school athletes who can do x = 11 sixty-pound bench presses

For all female high school athletes who can do 11 sixty-pound bench presses, we estimate the mean of their maximum bench press values falls between 78 and 82 pounds

Use the MINITAB output to find and interpret a 95% Prediction Interval for a single new observation on the maximum bench press for a randomly chosen female high school athlete who can do x = 11 sixty-pound bench presses

For all female high school athletes who can do 11 sixty-pound bench presses, we predict that 95% of them have maximum bench press values between 64 and 96 pounds
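
Both intervals at x = 11 can be roughly reproduced without the MINITAB output by using the approximate formulas ŷ ± 2(s/√n) for the mean and ŷ ± 2s for a single observation. This is only a sketch under those approximations, which are not the exact formulas software uses; the function name is hypothetical, and the inputs are the rounded values from the strength study (s = 8.0, n = 57).

```python
import math


def approximate_intervals(y_hat: float, s: float, n: int):
    """Approximate 95% CI for the mean of y and 95% prediction interval for y."""
    ci_margin = 2 * s / math.sqrt(n)  # margin for the population mean
    pi_margin = 2 * s                 # margin for a single observation
    ci = (y_hat - ci_margin, y_hat + ci_margin)
    pi = (y_hat - pi_margin, y_hat + pi_margin)
    return ci, pi


if __name__ == "__main__":
    y_hat = 63.5 + 1.49 * 11  # predicted maximum bench press at x = 11
    ci, pi = approximate_intervals(y_hat, s=8.0, n=57)
    print([round(v) for v in ci], [round(v) for v in pi])
```

Rounded to whole pounds, the approximations give 78 to 82 for the mean and 64 to 96 for an individual, in line with the intervals quoted above.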

Section 12.5: Exponential Regression: A Model for Nonlinearity

Learning Objectives

1. Nonlinear Regression Models

2. Exponential Regression Model

3. Interpreting Exponential Regression Models

Learning Objective 1: Nonlinear Regression Models

If a scatterplot indicates substantial curvature in a relationship, then equations that provide curvature are needed

Occasionally a scatterplot has a parabolic appearance: as x increases, y increases then it goes back down

More often, y tends to continually increase or continually decrease but the trend shows curvature

Learning Objective 1: Example: Exponential Growth in Population Size

Since 2000, the population of the U.S. has been growing at a rate of 2% a year

The population size in 2000 was 280 million

The population size in 2001 was $280 \times 1.02$

The population size in 2002 was $280 \times (1.02)^2$

…

The population size in 2010 is estimated to be $280 \times (1.02)^{10}$

This is called exponential growth
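
The year-by-year pattern can be written as a single function. A minimal sketch using the figures from the example (280 million in 2000, 2% growth per year); the function name and parameter names are hypothetical.

```python
def population_millions(year: int, base_year: int = 2000,
                        base_size: float = 280.0, growth_rate: float = 0.02) -> float:
    """Population size (millions) under constant percentage growth per year."""
    return base_size * (1 + growth_rate) ** (year - base_year)


if __name__ == "__main__":
    for year in (2000, 2001, 2002, 2010):
        print(year, round(population_millions(year), 1))
```

Each year multiplies the previous size by the same factor, 1.02, rather than adding a fixed amount, which is what distinguishes exponential from linear growth.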

Learning Objective 2: Exponential Regression Model

An exponential regression model has the formula:

$\mu_y = \alpha \beta^x$

For the mean µy of y at a given value of x, where α and β are parameters

In the exponential regression equation, the explanatory variable x appears as the exponent of a parameter

The mean µy and the parameter β can take only positive values

As x increases, the mean µy increases when β>1

It continually decreases when 0 < β<1

For exponential regression, the logarithm of the mean is a linear function of x

When the exponential regression model holds, a plot of the log of the y values versus x should show an approximate straight-line relation with x
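
One way to exploit this linearity is to fit a straight line to log(y) and transform the coefficients back. The sketch below does this with made-up data; it is an illustration of the log-linear idea, not necessarily the method a given regression package uses, and the function name is hypothetical.

```python
import math


def fit_exponential(xs, ys):
    """Fit y-hat = alpha * beta**x by least squares on log(y) versus x."""
    logs = [math.log(y) for y in ys]  # requires all y values to be positive
    n = len(xs)
    x_bar = sum(xs) / n
    log_bar = sum(logs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxl = sum((x - x_bar) * (l - log_bar) for x, l in zip(xs, logs))
    slope = sxl / sxx
    intercept = log_bar - slope * x_bar
    # log(y-hat) = intercept + slope * x, so alpha = e^intercept and beta = e^slope
    return math.exp(intercept), math.exp(slope)


if __name__ == "__main__":
    # Made-up data generated exactly from y = 3 * 1.5**x
    xs = [0, 1, 2, 3, 4, 5]
    ys = [3 * 1.5 ** x for x in xs]
    print(fit_exponential(xs, ys))
```

With data that follow the model exactly, the fit recovers the generating values of α and β up to floating-point rounding.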

Learning Objective 2: Example: Explosion in Number of People Using the Internet

Plot of Number of People Using Internet between 1995 and 2001

Plot of Log Number of People Using Internet between 1995 and 2001

Using regression software, we can create the exponential regression equation:

x: the number of years since 1995. Start with x = 0 for 1995, then x = 1 for 1996, etc.

y: number of internet users

Equation: $\hat{y} = 20.38(1.7708)^x$

Learning Objective 3: Interpreting Exponential Regression Models

In the exponential regression model, $\mu_y = \alpha \beta^x$,

the parameter α represents the mean value of y when x = 0;

the parameter β represents the multiplicative effect on the mean of y for a one-unit increase in x

Learning Objective 3: Example: Explosion in Number of People Using the Internet

In this model: $\hat{y} = 20.38(1.7708)^x$

The predicted number of Internet users in 1995 (for which x = 0) is 20.38 million

The predicted number of Internet users in 1996 is 20.38 times 1.7708
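
The fitted equation can be evaluated for any year in the same way. A minimal sketch with a hypothetical function name, using the coefficients 20.38 and 1.7708 from the fitted equation, with predictions in millions of users:

```python
def predicted_internet_users(year: int) -> float:
    """Predicted number of Internet users (millions) from the fitted exponential model."""
    x = year - 1995  # x counts years since 1995
    return 20.38 * 1.7708 ** x


if __name__ == "__main__":
    for year in (1995, 1996, 1997):
        print(year, round(predicted_internet_users(year), 2))
```

For 1996 this gives about 36.09 million, and each later year multiplies the previous prediction by 1.7708, an increase of roughly 77% per year.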