ANOVA for Regression

Page 1: ANOVA for Regression

ANOVA for Regression

ANOVA tests whether the regression model has any explanatory power.

In the case of simple regression analysis the ANOVA test and the test for b1 are identical.

Page 2: ANOVA for Regression

ANOVA for Regression

MSE = SSE/(n-2)

MSR = SSR/p, where p = number of independent variables

F = MSR/MSE
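The three formulas above can be sketched in a few lines of Python. This is only an illustration: `anova_f` is a hypothetical helper name, and the inputs are the SSR, SSE, and n = 29 reported in the Excel output on page 7. The error degrees of freedom are written as n - p - 1, which reduces to n - 2 when p = 1.

```python
def anova_f(ssr, sse, n, p=1):
    """Return (MSR, MSE, F) for a regression with p independent variables."""
    msr = ssr / p              # mean square due to regression
    mse = sse / (n - p - 1)    # mean square error; n - 2 when p = 1
    return msr, mse, msr / mse


# Sums of squares taken from the Excel output with temperature as y (page 7)
msr, mse, f_stat = anova_f(ssr=6963.27661, sse=689.5509766, n=29, p=1)
print(f"MSR = {msr:.2f}, MSE = {mse:.4f}, F = {f_stat:.2f}")
```

The resulting F can be compared with the F column of that output.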

Page 3: ANOVA for Regression

ANOVA Hypothesis Test

H0: b1 = 0
Ha: b1 ≠ 0

Reject H0 if F > Fα, or if p-value < α

Page 4: ANOVA for Regression

Regression and ANOVA

Source of variation   Sum of squares   Degrees of freedom   Mean square       F
Regression            SSR              1                    MSR = SSR/1       F = MSR/MSE
Error                 SSE              n-2                  MSE = SSE/(n-2)
Total                 SST              n-1

Page 5: ANOVA for Regression

ANOVA and Regression

ANOVA
             df   SS      MS     F     Significance F
Regression   1    3364    3364   273   1.23E-15
Residual     27   333.4   12.3
Total        28   3697

Fα = 4.21 given α = .05, df num. = 1, df denom. = 27

Page 6: ANOVA for Regression

Issues with Hypothesis Test Results

• Correlation does NOT prove causation
• The test does not prove we used the correct functional form

Page 7: ANOVA for Regression

Output with Temperature as Y

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.953884648
R Square            0.909895922
Adjusted R Square   0.906558734
Standard Error      5.053605155
Observations        29

ANOVA
             df   SS            MS           F          Significance F
Regression   1    6963.27661    6963.27661   272.6535   1.23118E-15
Residual     27   689.5509766   25.5389251
Total        28   7652.827586

                          Coefficients   Standard Error   t Stat       P-value    Lower 95%      Upper 95%
Intercept                 67.59301867    1.358242515      49.7650588   4.24E-28   64.80613526    70.3799021
Thousands of cubic feet   -1.372438825   0.083116544      -16.512222   1.23E-15   -1.542979885   -1.20189776
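The entries in this output are linked by the formulas from the earlier slides, and a short check makes those links concrete. The sketch below copies the reported numbers by hand (the variable names are illustrative only). It recomputes R Square as SSR/SST and the Standard Error as the square root of MSE. It also squares the slope's t statistic, which should match F because there is a single independent variable.

```python
import math

# Values copied from the summary output above
ssr, sse, sst = 6963.27661, 689.5509766, 7652.827586
n = 29
t_slope = -16.512222
f_reported = 272.6535

r_square = ssr / sst                   # share of variation explained
std_error = math.sqrt(sse / (n - 2))   # s = sqrt(MSE)
t_squared = t_slope ** 2               # equals F when there is one x

print(f"R Square       = {r_square:.6f}")
print(f"Standard Error = {std_error:.6f}")
print(f"t^2 = {t_squared:.4f}  vs  F = {f_reported}")
```

The agreement between t squared and F is the sense in which the ANOVA test and the test for b1 are identical in simple regression.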

Page 8: ANOVA for Regression

[Figure: Temperature and Natural Gas Consumed. Monthly series from Jun-07 to Oct-09 showing average daily temperature and thousands of cubic feet; vertical axis from 0 to 80.]

Page 9: ANOVA for Regression

[Figure: Monthly Natural Gas Use and Temperature. Scatter plot of thousands of cubic feet (vertical axis, 0 to 40) against average daily temperature (horizontal axis, 0 to 80).]

Page 10: ANOVA for Regression

Confidence Interval for Estimated Mean Value of y

xp = particular or given value of x
yp = value of the dependent variable for xp
E(yp) = expected value of yp, or E(y | x = xp)

$$\hat{y}_p = b_0 + b_1 x_p$$

is our estimate of E(yp)

Page 11: ANOVA for Regression

Confidence Interval for Estimated Mean Value of y

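$$s_{\hat{y}_p} = s \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$

$$\hat{y}_p \pm t_{\alpha/2} \, s_{\hat{y}_p}$$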

Page 12: ANOVA for Regression

Computing b0 and b1, Example

From example of car age, price:

x     y     (xi - x̄)   (yi - ȳ)   (xi - x̄)(yi - ȳ)   (xi - x̄)²
1     15    -3         3          -9                 9
3     14    -1         2          -2                 1
3     11    -1         -1         1                  1
4     12    0          0          0                  0
9     8     5          -4         -20                25

Sum = 20 (x), 60 (y), -30 (cross products), 36 (squared deviations of x)
Mean = 4 (x), 12 (y)

b1 = -30/36 = -0.83
b0 = ȳ - b1 x̄ = 15.33
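The same calculation can be sketched in Python using the five (car age, price) pairs from the slide. The variable names `s_xy` and `s_xx` are illustrative labels for the two column sums, not notation used in the slides.

```python
x = [1, 3, 3, 4, 9]       # car age, from the slide
y = [15, 14, 11, 12, 8]   # price, from the slide

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Column sums of the table: cross products and squared deviations of x
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = s_xy / s_xx           # slope
b0 = y_bar - b1 * x_bar    # intercept

print(f"b1 = {b1:.2f}, b0 = {b0:.2f}")
```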

Page 13: ANOVA for Regression

Confidence Interval of Conditional Mean

x     y     (x - x̄)²   ŷ       (ŷ - ȳ)²   (y - ŷ)²   (y - ȳ)²
1     15    9          14.5    6.2        0.3        9
3     14    1          12.84   0.7        1.3        4
3     11    1          12.84   0.7        3.4        1
4     12    0          12.01   0.0        0.0        0
9     8     25         7.86    17.4       0.0        16

Sum = 20 (x), 60 (y), 36 (squared deviations of x), SSR = 25.0, SSE = 5.0, SST = 30
Mean = 4 (x), 12 (y)

b1 = -0.833
b0 = 15.33

r2 = 25/30 = .833
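As a rough check on the sums of squares for the car age and price example, the sketch below refits the line and recomputes SSR, SSE, SST, and r2. It uses unrounded coefficients, so individual fitted values can differ from the slide in the second decimal place.

```python
x = [1, 3, 3, 4, 9]       # car age, from the slide
y = [15, 14, 11, 12, 8]   # price, from the slide

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]                       # fitted values
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
r2 = ssr / sst

print(f"SSR = {ssr:.1f}, SSE = {sse:.1f}, SST = {sst:.1f}, r2 = {r2:.3f}")
```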

Page 14: ANOVA for Regression

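Confidence Interval of Conditional Mean

$$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{5}{5-2}} = 1.29$$

$$s_{\hat{y}_p} = s \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 1.29 \sqrt{\frac{1}{5} + \frac{(5-4)^2}{36}} = 1.29 \sqrt{0.228} = 0.616$$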

Page 15: ANOVA for Regression

Confidence Interval of Conditional Mean

Given 1-α = .95 and df = 3:

$$\hat{y}_p \pm t_{\alpha/2} \, s_{\hat{y}_p}$$

$$11.18 \pm 3.182(0.616)$$

$$11.18 \pm 1.96 = (9.22,\ 13.14)$$
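The interval for the conditional mean at xp = 5 can be reproduced with the sketch below. The critical value 3.182 is copied from the slide rather than computed. Because the code keeps unrounded coefficients, its point estimate is about 11.17 rather than the slide's 11.18, so the endpoints differ slightly in the second decimal place.

```python
import math

x = [1, 3, 3, 4, 9]       # car age, from the slide
y = [15, 14, 11, 12, 8]   # price, from the slide
X_P = 5                   # given value of x
T_CRIT = 3.182            # t for df = 3 and 95% confidence, taken from the slide

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))                                  # standard error of the estimate
s_mean = s * math.sqrt(1 / n + (X_P - x_bar) ** 2 / s_xx)     # std. error of the estimated mean

y_p = b0 + b1 * X_P
lower = y_p - T_CRIT * s_mean
upper = y_p + T_CRIT * s_mean
print(f"s = {s:.2f}, s_mean = {s_mean:.3f}, interval = ({lower:.2f}, {upper:.2f})")
```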

Page 16: ANOVA for Regression

Confidence Interval for Predicted Values of y

A confidence interval for a predicted value of y must take into account both random error in the estimate of b1 and the random deviations of individual values from the regression line.

Page 17: ANOVA for Regression

Confidence Interval for Predicted Values of y

$$s_{ind} = s \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$

$$\hat{y}_p \pm t_{\alpha/2} \, s_{ind}$$

Page 18: ANOVA for Regression

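Confidence Interval of Individual Value

$$s_{ind} = s \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 1.29 \sqrt{1 + \frac{1}{5} + \frac{(5-4)^2}{36}} = 1.29 \sqrt{1.228} = 1.43$$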

Page 19: ANOVA for Regression

Confidence Interval of Individual Value

Given 1-α = .95 and df = 3:

$$\hat{y}_p \pm t_{\alpha/2} \, s_{ind}$$

$$11.18 \pm 3.182(1.43)$$

$$11.18 \pm 4.55 = (6.63,\ 15.73)$$
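The interval for an individual value can be sketched from the summary quantities alone. The function name `prediction_interval` is hypothetical, and the inputs are the rounded figures shown on the slides (b0 = 15.33, b1 = -0.83, SSE = 5, n = 5, mean of x = 4, sum of squared x deviations = 36, t = 3.182).

```python
import math


def prediction_interval(b0, b1, x_p, sse, n, x_bar, s_xx, t_crit):
    """Interval for an individual y at x_p in simple regression."""
    s = math.sqrt(sse / (n - 2))
    # The leading 1 accounts for the scatter of individual values around the line
    s_ind = s * math.sqrt(1 + 1 / n + (x_p - x_bar) ** 2 / s_xx)
    y_p = b0 + b1 * x_p
    return y_p, s_ind, (y_p - t_crit * s_ind, y_p + t_crit * s_ind)


y_p, s_ind, (lower, upper) = prediction_interval(
    b0=15.33, b1=-0.83, x_p=5, sse=5.0, n=5, x_bar=4, s_xx=36, t_crit=3.182
)
print(f"y_p = {y_p:.2f}, s_ind = {s_ind:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

Dropping the leading 1 inside the square root gives the narrower interval for the conditional mean.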

Page 20: ANOVA for Regression

Residual Plots Against x

Residual – the difference between the observed value and the predicted value

Look for:
• Evidence of a nonconstant variance
• Nonlinear relationship
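A minimal sketch of the residual calculation is given below, using the car age and price data from the earlier example. It prints the residuals as text rather than drawing a plot; in practice a plotting library would be used, and that choice is left open here.

```python
x = [1, 3, 3, 4, 9]       # car age, from the earlier example
y = [15, 14, 11, 12, 8]   # price, from the earlier example

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

# Residual = observed value minus predicted value
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

for xi, e in zip(x, residuals):
    print(f"x = {xi:>2}   residual = {e:+.2f}")
```

With an intercept in the model the residuals sum to zero, so it is their pattern against x that carries the information.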

Page 21: ANOVA for Regression

Regression and Outliers

Outliers can have a disproportionate effect on the estimated regression line.

[Figure: Natural Gas Usage and Temperature. Scatter plot of 000's cubic feet (vertical axis, 0 to 40) against temperature (horizontal axis, 10 to 100).]

               Coefficients
Intercept      36.19972
X Variable 1   -0.44381

Page 22: ANOVA for Regression

Regression and Outliers

One solution is to estimate the model with and without the outlier.

Questions to ask:
• Is the value an error?
• Does the value reflect some unique circumstance?
• Is the data point providing unique information about values outside the range of the other observations?
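Estimating the model with and without a point can be sketched as follows. The natural gas observations are not listed in the slides, so this illustration reuses the car age and price data and treats the last point (9, 8) as the observation under scrutiny. That choice is purely an assumption for demonstration, and `fit_line` is a hypothetical helper.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


x = [1, 3, 3, 4, 9]
y = [15, 14, 11, 12, 8]
SUSPECT = 4   # index of the point (9, 8); an assumed candidate, not a confirmed outlier

b0_all, b1_all = fit_line(x, y)
x_trim = x[:SUSPECT] + x[SUSPECT + 1:]
y_trim = y[:SUSPECT] + y[SUSPECT + 1:]
b0_trim, b1_trim = fit_line(x_trim, y_trim)

print(f"with point:    b0 = {b0_all:.2f}, b1 = {b1_all:.2f}")
print(f"without point: b0 = {b0_trim:.2f}, b1 = {b1_trim:.2f}")
```

A large shift in the coefficients between the two fits signals that the point deserves the questions listed above.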

Page 23: ANOVA for Regression

Chapter 15

Multiple Regression

Page 24: ANOVA for Regression

Regression

Multiple Regression Model
y = b0 + b1x1 + b2x2 + … + bpxp + e

Multiple Regression Equation
E(y) = b0 + b1x1 + b2x2 + … + bpxp

Estimated Multiple Regression Equation

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

Page 25: ANOVA for Regression

Car Data

MPG   Weight   Year   Cylinders
18    3504     70     8
15    3693     70     8
18    3436     70     8
16    3433     70     8
17    3449     70     8
15    4341     70     8
14    4354     70     8
14    4312     70     8
14    4425     70     8
15    3850     70     8
15    3563     70     8
14    3609     70     8
…     …        …      …

Page 26: ANOVA for Regression

Multiple Regression, Example

            Coefficients   Standard Error   t Stat
Intercept   46.3           0.800            57.8
Weight      -0.00765       0.000259         -29.4

R Square 0.687

            Coefficients   Standard Error   t Stat
Intercept   -14.7          3.96             -3.71
Weight      -0.00665       0.000214         -31.0
Year        0.763          0.0490           15.5

R Square 0.807

Page 27: ANOVA for Regression

Multiple Regression, Example

            Coefficients   Standard Error   t Stat
Intercept   -14.4          4.03             -3.58
Weight      -0.00652       0.000460         -14.1
Year        0.760          0.0498           15.2
Cylinders   -0.0741        0.232            -0.319

R Square 0.807

Predicted MPG for a car weighing 4000 lbs, built in 1980, with 6 cylinders:
-14.4 - 0.00652(4000) + 0.760(80) - 0.0741(6) = -14.4 - 26.08 + 60.8 - 0.4446 = 19.88
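The prediction is the intercept plus each coefficient times the matching value of x. A small Python sketch, with the rounded coefficients copied from the table and Year entered as 80 as on the slide (the dictionary layout is just one convenient way to organize it):

```python
# Rounded coefficients from the three-variable model on this slide
coefficients = {
    "Intercept": -14.4,
    "Weight": -0.00652,
    "Year": 0.760,
    "Cylinders": -0.0741,
}

# Car to predict: 4000 lbs, model year 80, 6 cylinders
car = {"Weight": 4000, "Year": 80, "Cylinders": 6}

predicted_mpg = coefficients["Intercept"] + sum(
    coefficients[name] * value for name, value in car.items()
)
print(f"Predicted MPG = {predicted_mpg:.2f}")
```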

Page 28: ANOVA for Regression

Multiple Regression Model

$$SSE = \sum (y_i - \hat{y}_i)^2$$

$$SSR = \sum (\hat{y}_i - \bar{y})^2$$

$$SST = \sum (y_i - \bar{y})^2$$

SST = SSR + SSE

Page 29: ANOVA for Regression

Multiple Coefficient of Determination

The share of the variation explained by the estimated model.

R2 = SSR/SST

Page 30: ANOVA for Regression

F Test for Overall Significance

H0: b1 = b2 = . . . = bp = 0

Ha: One or more of the parameters is not equal to zero

Reject H0 if F > Fα, or if p-value < α

F = MSR/MSE

Page 31: ANOVA for Regression

ANOVA Table for Multiple Regression Model

Source       Sum of Squares   Degrees of Freedom   Mean Squares        F
Regression   SSR              p                    MSR = SSR/p         F = MSR/MSE
Error        SSE              n-p-1                MSE = SSE/(n-p-1)
Total        SST              n-1
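Because SSR = R2 × SST and SSE = (1 - R2) × SST, the F statistic in this table can also be written in terms of R2, n, and p alone: F = (R2/p) / ((1 - R2)/(n - p - 1)). The sketch below applies that identity; `f_from_r2` is a hypothetical helper. The sample size for the car data is not given in the slides, so the check uses the natural gas output from page 7 (R Square 0.909895922, n = 29, p = 1).

```python
def f_from_r2(r2, n, p):
    """Overall F statistic computed from R-squared, sample size n, and p predictors."""
    msr_share = r2 / p                    # SSR/p divided by SST
    mse_share = (1 - r2) / (n - p - 1)    # SSE/(n-p-1) divided by SST
    return msr_share / mse_share


# Values from the Excel output with temperature as y
f_stat = f_from_r2(r2=0.909895922, n=29, p=1)
print(f"F = {f_stat:.2f}")
```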

Page 32: ANOVA for Regression

t Test for Coefficients

H0: b1 = 0
Ha: b1 ≠ 0

Reject H0 if t < -tα/2 or t > tα/2, or if p-value < α

t = b1/sb1

With a t distribution of n-p-1 df
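As an illustration of the rejection rule, the sketch below tests the slope from the natural gas output on page 7, where p = 1 and df = 27. The slides do not list a t critical value for 27 df. The code therefore derives one from the Fα = 4.21 given on page 5, relying on the fact that with one numerator degree of freedom the squared two-tailed t critical value equals the F critical value. That shortcut holds only when a single coefficient is being tested.

```python
import math

b1 = -1.372438825     # slope from the output with temperature as y
s_b1 = 0.083116544    # its standard error
F_CRIT = 4.21         # F critical value for alpha = .05, df = (1, 27), from the slides

t_stat = b1 / s_b1
t_crit = math.sqrt(F_CRIT)    # two-tailed t critical value for df = 27, alpha = .05

reject_h0 = t_stat < -t_crit or t_stat > t_crit
print(f"t = {t_stat:.2f}, critical t = {t_crit:.3f}, reject H0: {reject_h0}")
```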

Page 33: ANOVA for Regression

Multicollinearity

Multicollinearity occurs when two or more independent variables are highly correlated.

When multicollinearity is severe, the estimated values of the coefficients will be unreliable.

Two guidelines for detecting multicollinearity:
• The absolute value of the correlation coefficient for two independent variables exceeds 0.7
• The correlation coefficient between an independent variable and some other independent variable is greater than its correlation with the dependent variable

Page 34: ANOVA for Regression

Multicollinearity

            MPG      Weight   Year     Cylinders
MPG         1
Weight      -0.829   1
Year        0.578    -0.300   1
Cylinders   -0.773   0.895    -0.344   1
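The two guidelines from the previous slide can be applied to this correlation matrix with a short sketch. The correlations are copied from the matrix. Comparing absolute values in the second guideline is an assumption made here, since the slide does not say how negative correlations should be handled.

```python
# Pairwise correlations copied from the matrix above
corr = {
    ("MPG", "Weight"): -0.829,
    ("MPG", "Year"): 0.578,
    ("Weight", "Year"): -0.300,
    ("MPG", "Cylinders"): -0.773,
    ("Weight", "Cylinders"): 0.895,
    ("Year", "Cylinders"): -0.344,
}
DEPENDENT = "MPG"
predictors = ["Weight", "Year", "Cylinders"]


def r(a, b):
    """Look up a correlation regardless of the order of the pair."""
    return corr[(a, b)] if (a, b) in corr else corr[(b, a)]


# Guideline 1: two independent variables with |r| above 0.7
high_pairs = [
    (a, b)
    for i, a in enumerate(predictors)
    for b in predictors[i + 1:]
    if abs(r(a, b)) > 0.7
]

# Guideline 2: a predictor more strongly correlated with another predictor
# than with the dependent variable (absolute values assumed)
stronger_with_predictor = [
    a
    for a in predictors
    if any(abs(r(a, b)) > abs(r(a, DEPENDENT)) for b in predictors if b != a)
]

print("Guideline 1 flags:", high_pairs)
print("Guideline 2 flags:", stronger_with_predictor)
```

Both guidelines point at Weight and Cylinders, which is consistent with Cylinders adding nothing to R Square once Weight and Year are in the model.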