Two-Way Analysis of Variance STAT E-150 Statistical Methods

One-way ANOVA analyzes the relationship between two variables, a quantitative response variable and the groups that are categories of a factor. In a two-way ANOVA model, there are two factors, each with its own number of levels. The two-way ANOVA tests to see if the factors are significant, either separately (called the main effects) or in combination (via an interaction). 

In a one-way ANOVA, we test for the equality of the means of several levels of a variable by comparing the variation between the levels with the variation within each level (or group, or treatment).

In two-way ANOVA the total variance is again partitioned into separate components. One is the variation within the groups. But in this case the between-groups variability is further divided into the variability due to Factor A, the variability due to Factor B, and the variability due to the interaction between the factors. 
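The partition described above can be checked numerically. The following is a minimal sketch, assuming a balanced design (the same number of observations in every cell) and using made-up numbers; the variable names are illustrative and do not come from the course materials.

```python
from statistics import mean

# Hypothetical balanced data: data[i][j] holds the responses for
# level i of Factor A and level j of Factor B (values are made up).
data = [
    [[4.0, 5.0, 6.0], [7.0, 9.0, 8.0]],
    [[6.0, 6.0, 9.0], [12.0, 10.0, 11.0]],
]

n_a, n_b, n = len(data), len(data[0]), len(data[0][0])
all_values = [y for row in data for cell in row for y in cell]
grand = mean(all_values)

# Cell means and the marginal means for each factor
cell_mean = [[mean(cell) for cell in row] for row in data]
a_mean = [mean([y for cell in row for y in cell]) for row in data]
b_mean = [mean([y for row in data for y in row[j]]) for j in range(n_b)]

# Total variation and its four components
ss_total = sum((y - grand) ** 2 for y in all_values)
ss_a = n_b * n * sum((m - grand) ** 2 for m in a_mean)
ss_b = n_a * n * sum((m - grand) ** 2 for m in b_mean)
ss_ab = n * sum(
    (cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
    for i in range(n_a) for j in range(n_b)
)
ss_error = sum(
    (y - cell_mean[i][j]) ** 2
    for i in range(n_a) for j in range(n_b) for y in data[i][j]
)

print(ss_total, ss_a + ss_b + ss_ab + ss_error)
```

In a balanced design the two printed numbers agree. With unequal cell sizes the components generally no longer add up this neatly.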

The assumptions for this test are:

- Independence Assumption: The groups must be independent of each other, and the subjects within each group must be randomly assigned.

- Equal Variance Assumption: The variances of the treatment groups are equal.

- Normal Population Assumption: The values for each treatment group are normally distributed.

Example:  Suppose we are interested in analyzing the effect of gender and age on income. We will treat the ages as categories: ages 18 - 29, 30 - 39, 40 - 49, and 50 or higher.

This is the structure of this design:

                Gender
Age category    Female                                   Male
18 - 29         Income for n subjects (Female, 18-29)    Income for n subjects (Male, 18-29)
30 - 39         Income for n subjects (Female, 30-39)    Income for n subjects (Male, 30-39)
40 - 49         Income for n subjects (Female, 40-49)    Income for n subjects (Male, 40-49)
≥ 50            Income for n subjects (Female, ≥ 50)     Income for n subjects (Male, ≥ 50)

We can then address these questions:

- Are there significant mean differences for income between male and female employees?

- Are there significant mean differences for income by age category among employees?

- Is there a significant interaction on income between gender and age category?

The hypotheses are:

H0: μF1 = μF2 = μF3 = μF4 = μM1 = μM2 = μM3 = μM4 (1, 2, 3, and 4 refer to the age groups)

Ha: the means are not all equal

The first step is to look for interaction between the two factors, age and gender, by creating an interaction plot. Lines that intersect, or that are clearly not parallel, suggest that there is factor interaction. However, it is important to check the ANOVA results to see if any interaction is significant. Here is the interaction plot for our example:

Since the lines do not intersect, the plot suggests that there is no interaction between gender and age.
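An interaction plot is a plot of the cell means, with one line per level of one factor. As a rough sketch of what the plot encodes, the code below computes cell means from made-up records and the male-female gap in each age category. The data and names are hypothetical, not the course data set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records (sex, age category, income); the numbers are made up
records = [
    ("Female", "18-29", 9.0), ("Female", "18-29", 11.0),
    ("Male", "18-29", 12.0), ("Male", "18-29", 14.0),
    ("Female", "30-39", 12.0), ("Female", "30-39", 14.0),
    ("Male", "30-39", 15.0), ("Male", "30-39", 17.0),
    ("Female", "40-49", 13.0), ("Female", "40-49", 15.0),
    ("Male", "40-49", 17.0), ("Male", "40-49", 17.0),
]

# Group the incomes by cell (one cell per sex and age combination)
cells = defaultdict(list)
for sex, age, income in records:
    cells[(sex, age)].append(income)

cell_means = {key: mean(values) for key, values in cells.items()}
ages = sorted({age for _, age in cell_means})

# Vertical distance between the two lines of the plot at each age category
gaps = {age: cell_means[("Male", age)] - cell_means[("Female", age)] for age in ages}

for age in ages:
    print(age, cell_means[("Female", age)], cell_means[("Male", age)], gaps[age])
```

When the gap is about the same in every age category, the two lines are roughly parallel, which is the pattern associated with no interaction. Gaps that change markedly from one category to the next correspond to non-parallel or crossing lines.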

The next step is to check the Equal Variances assumption:

Levene's Test of Equality of Error Variances (a)
Dependent Variable: rincom2

F        df1    df2    Sig.
1.061    7      701    .387

Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept + agecat4 + sex + agecat4 * sex

Since p is large (.387), the null hypothesis of equal variances is not rejected; the data give no evidence that the Equal Variances assumption is violated.
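To show where the F statistic and its degrees of freedom come from, here is a minimal sketch of Levene's statistic, assuming the form based on absolute deviations from the group means. The function name and the data are hypothetical. The p-value is left out because it requires the F distribution, which is not in the Python standard library.

```python
from statistics import mean

def levene_statistic(groups):
    """Return (W, df1, df2) for Levene's test using deviations from group means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each observation from its own group mean
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_group = [mean(zi) for zi in z]
    z_grand = mean([v for zi in z for v in zi])
    # One-way ANOVA on the absolute deviations
    between = sum(len(zi) * (zg - z_grand) ** 2 for zi, zg in zip(z, z_group))
    within = sum((v - zg) ** 2 for zi, zg in zip(z, z_group) for v in zi)
    df1, df2 = k - 1, n_total - k
    return (between / df1) / (within / df2), df1, df2

# Three made-up groups with different spreads
groups = [[3.0, 5.0, 7.0], [2.0, 5.0, 8.0], [4.0, 5.0, 6.0]]
w, df1, df2 = levene_statistic(groups)
print(w, df1, df2)
```

In the example from the slides there are 4 x 2 = 8 groups, so df1 = 8 - 1 = 7, and df2 = 701 corresponds to 709 - 8 observations.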

Finally, we run the ANOVA test itself. A two-way ANOVA consists of three separate hypothesis tests, for

- the mean difference between levels of the first factor

- the mean difference between levels of the second factor

- any other mean differences that may result from the combination of the factors

Here are the results:

Tests of Between-Subjects Effects
Dependent Variable: rincom2

Source             Type III Sum of Squares    df     Mean Square    F           Sig.
Corrected Model    2236.798 (a)               7      319.543        15.360      .000
Intercept          121426.954                 1      121426.954     5836.953    .000
agecat4            1350.516                   3      450.172        21.640      .000
sex                842.073                    1      842.073        40.478      .000
agecat4 * sex      60.316                     3      20.105         .966        .408
Error              14583.002                  701    20.803
Total              149966.000                 709
Corrected Total    16819.800                  708

a. R Squared = .133 (Adjusted R Squared = .124)
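The columns of this table are linked by simple arithmetic: each mean square is the sum of squares divided by its df, each F is the mean square divided by the error mean square, and R Squared is the model sum of squares divided by the corrected total. The sketch below recomputes these from the table; only the dictionary keys are invented labels.

```python
# Sums of squares and degrees of freedom copied from the SPSS table
ss = {"model": 2236.798, "agecat4": 1350.516, "sex": 842.073,
      "agecat4*sex": 60.316, "error": 14583.002, "corrected_total": 16819.800}
df = {"model": 7, "agecat4": 3, "sex": 1, "agecat4*sex": 3,
      "error": 701, "corrected_total": 708}

# Error mean square is the denominator of every F ratio
ms_error = ss["error"] / df["error"]

f_ratio = {}
for source in ("model", "agecat4", "sex", "agecat4*sex"):
    mean_square = ss[source] / df[source]
    f_ratio[source] = mean_square / ms_error
    print(f"{source:12s} MS = {mean_square:8.3f}  F = {f_ratio[source]:6.3f}")

r_squared = ss["model"] / ss["corrected_total"]
adj_r_squared = 1 - (1 - r_squared) * df["corrected_total"] / df["error"]
print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}")
```

Note that the three effect sums of squares (1350.516 + 842.073 + 60.316 = 2252.905) do not add up to the Corrected Model value. This is usual for Type III sums of squares when the cell sizes are unequal.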

 

The first two of the three tests are tests for the main effects. The null hypothesis is always that there are no differences between the levels of the factor.

Example: H0: μM - μF = 0

The third test is the test for interaction; the null hypothesis is that there is no interaction between the factors.
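One common way to write the three null hypotheses is in terms of an effects model. The notation below (α for the effects of the first factor, β for the second, and αβ for the interaction) is an assumption of this sketch and does not appear in the slides.

```latex
\begin{align*}
y_{ijk} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk} \\
H_0^{A}  &: \alpha_1 = \alpha_2 = \dots = \alpha_a = 0 \\
H_0^{B}  &: \beta_1 = \beta_2 = \dots = \beta_b = 0 \\
H_0^{AB} &: (\alpha\beta)_{ij} = 0 \quad \text{for all } i, j
\end{align*}
```

Here a and b are the numbers of levels of the two factors (4 age categories and 2 sexes in the example), and k indexes the subjects within a cell.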

It is important to note that these tests are independent; the outcome of one test does not affect the outcome of any other test. Therefore, it is possible to have any combination of significant and nonsignificant main effects and interactions.

What do these results show? We can see that the two factors, age and sex, are both significant; however, the interaction is not, as the interaction plot suggested. And so we have two significant main effects and a nonsignificant interaction.

To investigate which groups are different, we can conduct a Scheffe post hoc test to compare all group combinations and identify any significant pairs. Here are the results of this test:

Multiple Comparisons
rincom2, Scheffe

(I) 4 categories    (J) 4 categories    Mean Difference    Std. Error    Sig.    95% Confidence Interval
of age              of age              (I-J)                                    Lower Bound    Upper Bound
18-29               30-39               -2.1172*           .49334        .000    -3.4997        -.7348
                    40-49               -3.9165*           .51013        .000    -5.3461        -2.4870
                    50+                 -3.2930*           .54550        .000    -4.8216        -1.7643
30-39               18-29               2.1172*            .49334        .000    .7348          3.4997
                    40-49               -1.7993*           .44207        .001    -3.0381        -.5605
                    50+                 -1.1757            .48245        .116    -2.5277        .1762
40-49               18-29               3.9165*            .51013        .000    2.4870         5.3461
                    30-39               1.7993*            .44207        .001    .5605          3.0381
                    50+                 .6235              .49961        .669    -.7765         2.0236
50+                 18-29               3.2930*            .54550        .000    1.7643         4.8216
                    30-39               1.1757             .48245        .116    -.1762         2.5277
                    40-49               -.6235             .49961        .669    -2.0236        .7765

Based on observed means. The error term is Mean Square(Error) = 20.803.

*. The mean difference is significant at the .05 level.

We can see that the age category 18 - 29 differs significantly in income from all other age categories. In addition, those 30 - 39 are significantly different in income from those 40 - 49 years of age.
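A Scheffe confidence interval has the form: mean difference ± sqrt((k - 1) F) × standard error, where k is the number of levels being compared and F is a critical value of the F distribution. As a sketch, the code below takes the six distinct comparisons from the table and backs out the multiplier sqrt((k - 1) F) from each interval, with k = 4 age categories. The variable names are illustrative.

```python
# (I, J, mean difference, std. error, lower bound, upper bound) from the table
rows = [
    ("18-29", "30-39", -2.1172, 0.49334, -3.4997, -0.7348),
    ("18-29", "40-49", -3.9165, 0.51013, -5.3461, -2.4870),
    ("18-29", "50+",   -3.2930, 0.54550, -4.8216, -1.7643),
    ("30-39", "40-49", -1.7993, 0.44207, -3.0381, -0.5605),
    ("30-39", "50+",   -1.1757, 0.48245, -2.5277,  0.1762),
    ("40-49", "50+",    0.6235, 0.49961, -0.7765,  2.0236),
]

multipliers = []
significant = {}
for i, j, diff, se, lower, upper in rows:
    # Half-width of the interval divided by the standard error
    multipliers.append((upper - lower) / 2 / se)
    # A pair differs significantly when its interval excludes zero
    significant[(i, j)] = lower > 0 or upper < 0
    print(i, j, round(multipliers[-1], 4), significant[(i, j)])

# With k = 4 levels, multiplier**2 / (k - 1) is the F critical value implied by the table
implied_f = (sum(multipliers) / len(multipliers)) ** 2 / 3
print(round(implied_f, 3))
```

Every row gives a multiplier of about 2.80, which implies an F critical value of about 2.62. This can be compared with an F table for 3 and 701 degrees of freedom at the .05 level. The intervals that exclude zero are exactly the pairs flagged with an asterisk.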

How do you report the results?

A two-way analysis of variance was conducted to investigate income differences by gender and age category among employees. The results show a significant main effect for gender (F(1, 701) = 40.48, p < .001) and for age category (F(3, 701) = 21.64, p < .001). The interaction between the factors was not significant (F(3, 701) = .966, p = .408).

The Scheffe post hoc test revealed that the age category of 18-29 differed significantly in income from the other age categories. In addition, the income for those employees 30-39 years of age differed significantly from those 40-49 years of age.

SPSS Instructions for Two-Way ANOVA

To create an interaction plot:

- Choose > Graphs > Chart Builder

- Choose Line and drag the second graph (Multiple) to the preview area.

- Select the response variable and move it to the y-axis.

- Select one predictor and move it to the x-axis.

- Select the other predictor and move it to the Set Color area.

- Click OK.

To perform a Two-Way Analysis of Variance, choose > Analyze > General Linear Model > Univariate

Identify the response variable and move it to the Dependent Variable list. Select the variables that define the groups and move them into the Fixed Factors box.

Then click on Options, and under Display, select Descriptive statistics and Homogeneity tests. Click on Continue.

Also click on Save and select the unstandardized predicted values and residuals. Click on Continue and then OK.

You will then see the results for Levene's Test and the Tests of Between-Subjects Effects.

The predicted values and residuals will be saved in your data sheet as PRE_1 and RES_1.
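For a model that includes both factors and their interaction, the unstandardized predicted value for a subject is the mean of that subject's cell, and the residual is the observed value minus that mean. The sketch below illustrates this with made-up rows; the lists `predicted` and `residuals` are hypothetical stand-ins for the PRE_1 and RES_1 columns.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows (sex, age category, income); values are made up
rows = [
    ("Female", "18-29", 10.0), ("Female", "18-29", 12.0),
    ("Male", "18-29", 13.0), ("Male", "18-29", 17.0),
    ("Female", "30-39", 14.0), ("Female", "30-39", 15.0),
    ("Male", "30-39", 18.0), ("Male", "30-39", 20.0),
]

# Mean income in each cell of the design
cells = defaultdict(list)
for sex, age, income in rows:
    cells[(sex, age)].append(income)
cell_mean = {key: mean(values) for key, values in cells.items()}

# Analogues of the saved columns PRE_1 and RES_1
predicted = [cell_mean[(sex, age)] for sex, age, _ in rows]
residuals = [income - pre for (_, _, income), pre in zip(rows, predicted)]

for row, pre, res in zip(rows, predicted, residuals):
    print(row, pre, res)
```

The residuals sum to zero within every cell, so they isolate the within-group variation that the Equal Variance and Normal Population assumptions describe.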

In the Univariate dialog box, you can also choose Post-hoc… and then in the next dialog box, choose the Scheffe post hoc test. This will produce the Multiple Comparisons table.  

Note that this analysis can also be used for a One-Way ANOVA so that the residuals and predicted values can be saved. Using the data for the Anorexia study, once these two values are saved, the residuals can be graphed.

Use > Analyze > Descriptive Statistics > Explore and choose the residuals as the dependent variable. Then choose Plots and select Histogram and Normality Plots with tests. Then click on Continue and OK.
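A normality plot compares the sorted residuals with the values expected from a normal distribution. The following is a minimal sketch of that comparison using made-up residuals. The plotting positions (i - 0.5)/n are one common convention and an assumption here; SPSS may use a different one. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_qq(residuals):
    """Pair each sorted residual with its theoretical standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def pearson_r(pairs):
    """Pearson correlation between the two coordinates of the points."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up residuals for illustration
residuals = [-1.0, 1.0, -2.0, 2.0, -0.5, 0.5, -1.0, 1.0]
points = normal_qq(residuals)
r = pearson_r(points)
print(points)
print(r)
```

Points that fall close to a straight line, with a correlation near 1, are consistent with normally distributed residuals. Strong curvature suggests a departure from normality.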

The results include the following graphs of the residuals: