Chapter 19: The Two-Factor ANOVA for Independent Groups
The Two-Factor ANOVA extends the One-Factor ANOVA to experiments with more than one independent variable, or ‘factor’.
For example, suppose we were interested in how both caffeine and beer influence response times.
You could run two separate studies, one comparing caffeine to a control group, and another comparing beer to another control group.
However, a more interesting experiment is a ‘two-factor’ design that puts each subject into one of four groups: neither beer nor caffeine, beer only, caffeine only, or both beer and caffeine.
Note that for the two-factor ANOVAs covered here, the sample size in each group is assumed to be the same.
Note: we’ll be skipping sections 19.8, 19.9, 19.10 and 19.13 from the book
Here are some summary statistics for an example data set with n = 12 subjects in each group (or cell).
SSW is the sum of squared deviations from the mean within each cell, just as in the 1-Factor ANOVA.
                 No Beer        Beer
No Caffeine      Mean: 1.08     Mean: 1.16
                 SSW = 0.66     SSW = 0.54
Caffeine         Mean: 0.80     Mean: 1.02
                 SSW = 0.67     SSW = 0.32

Grand mean: 1.02, SStotal = 3.07
[Figure: Response Time (sec) for the No Beer and Beer groups, with separate lines for No Caffeine and Caffeine.]
It is common to plot the results of Two-Way experiments like this, with error bars representing the standard error of the mean.
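The error bars can be derived from the summary statistics alone. The sketch below assumes each cell's standard deviation is computed with n - 1 in the denominator, which the slides do not state explicitly; the helper name `sem_from_ssw` is made up for illustration.

```python
import math

def sem_from_ssw(ssw, n):
    """Standard error of the mean from a cell's sum of squared deviations."""
    variance = ssw / (n - 1)        # sample variance within the cell (assumed n - 1)
    return math.sqrt(variance / n)  # standard error of the cell mean

n = 12  # subjects per cell, as in the example
cells = {
    ("No Caffeine", "No Beer"): (1.08, 0.66),
    ("No Caffeine", "Beer"): (1.16, 0.54),
    ("Caffeine", "No Beer"): (0.80, 0.67),
    ("Caffeine", "Beer"): (1.02, 0.32),
}

for (caffeine, beer), (mean, ssw) in cells.items():
    sem = sem_from_ssw(ssw, n)
    print(f"{caffeine:12s} {beer:8s} mean = {mean:.2f}  SEM = {sem:.3f}")
```

Each error bar then extends one SEM above and below the plotted cell mean.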
What statistical tests can we conduct on these results?
1) Effect of Beer on response times, averaged across Caffeine levels – a ‘main effect’ for Beer
2) Effect of Caffeine on response times, averaged across Beer levels – a ‘main effect’ for Caffeine
3) Interaction between Caffeine and Beer.
A significant interaction means that the main effects do not collectively explain all of the influence of the factors on the dependent variable.
Graphically, interactions happen when the lines are not parallel.
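One way to put a number on "not parallel" is to compare the effect of Beer at each level of Caffeine. The sketch below uses the cell means from the example; the variable names are illustrative only. It describes the sample means and is not itself a significance test.

```python
# Cell means (response time in seconds) from the example.
means = {
    ("No Caffeine", "No Beer"): 1.08,
    ("No Caffeine", "Beer"): 1.16,
    ("Caffeine", "No Beer"): 0.80,
    ("Caffeine", "Beer"): 1.02,
}

# Slope of each line in the plot: the change from No Beer to Beer.
beer_effect_no_caffeine = means[("No Caffeine", "Beer")] - means[("No Caffeine", "No Beer")]
beer_effect_caffeine = means[("Caffeine", "Beer")] - means[("Caffeine", "No Beer")]

# Parallel lines would give a difference of exactly zero.
interaction_contrast = beer_effect_caffeine - beer_effect_no_caffeine

print(f"Beer effect without caffeine: {beer_effect_no_caffeine:.2f}")
print(f"Beer effect with caffeine:    {beer_effect_caffeine:.2f}")
print(f"Difference of differences:    {interaction_contrast:.2f}")
```

Whether a nonzero difference like this is larger than expected by chance is what the interaction F-test decides.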
The main effect for rows is the difference between the means for the rows, averaging across the columns.
In this example, it is used to test for the effect of Caffeine on response times, averaging across the No Beer and Beer groups.
Statistically, its significance is determined by a One-Factor ANOVA, ‘collapsing’ across the columns.
Graphically, it tests whether the middle of the blue line differs from the middle of the green line.
The main effect for columns is the difference between the means for the columns, averaging across the rows.
In this example, it is used to test for the effect of Beer on response times, averaging across the No Caffeine and Caffeine groups.
Graphically, it tests whether the midpoint between the blue and green lines differs across the groups (or columns).
Statistically, its significance is determined by a One-Factor ANOVA, ‘collapsing’ across the rows.
                 No Beer        Beer           Row means
No Caffeine      Mean: 1.08     Mean: 1.16     Mean: 1.12
                 SSW = 0.66     SSW = 0.54
Caffeine         Mean: 0.80     Mean: 1.02     Mean: 0.91
                 SSW = 0.67     SSW = 0.32
Column means     Mean: 0.94     Mean: 1.09     Grand mean: 1.02

SStotal = 3.07
Main effects for rows and columns are calculated by averaging the data across rows and columns.
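Notation: $\bar{X}_{R1}$ and $\bar{X}_{R2}$ are the row means, $\bar{X}_{C1}$ and $\bar{X}_{C2}$ are the column means, and $\bar{X}$ is the grand mean.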
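Because every cell has the same number of subjects, the row, column and grand means can be obtained by simply averaging the cell means. A minimal sketch using the cell means from the example (the list layout is an assumption made for illustration):

```python
# Cell means: rows are No Caffeine / Caffeine, columns are No Beer / Beer.
cell_means = [
    [1.08, 1.16],  # No Caffeine
    [0.80, 1.02],  # Caffeine
]

# Average across columns to get one mean per row.
row_means = [sum(row) / len(row) for row in cell_means]

# Average across rows to get one mean per column.
col_means = [sum(col) / len(col) for col in zip(*cell_means)]

# With equal cell sizes the grand mean is the mean of the cell means.
grand_mean = sum(sum(row) for row in cell_means) / (len(cell_means) * len(cell_means[0]))

print("Row means:   ", [round(m, 2) for m in row_means])
print("Column means:", [round(m, 2) for m in col_means])
print("Grand mean:  ", round(grand_mean, 3))
```

With unequal cell sizes this shortcut would no longer hold, and the means would have to be computed from the raw scores.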
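We can partition the total variance into these components:

$$SS_{total} = \sum_{\text{all scores}} (X - \bar{X})^2$$

$SS_{total}$ splits into $SS_{within\ cell}$ and $SS_{between}$, and $SS_{between}$ splits further into $SS_{rows}$, $SS_{cols}$ and $SS_{rows \times cols}$:

$$SS_{within\ cell} = \sum_{\text{all scores}} (X - \bar{X}_{cell})^2$$

$$SS_{rows} = \sum_{\text{all rows}} n_R (\bar{X}_R - \bar{X})^2$$

$$SS_{cols} = \sum_{\text{all cols}} n_C (\bar{X}_C - \bar{X})^2$$

$$SS_{rows \times cols} = SS_{total} - SS_{within\ cell} - SS_{rows} - SS_{cols}$$

With degrees of freedom:

$$df_{total} = n_{total} - 1$$

$$df_{within\ cell} = n_{total} - R \times C$$

$$df_{rows} = R - 1 \qquad df_{cols} = C - 1 \qquad df_{rows \times cols} = (R-1)(C-1)$$

$n_C$ ($n_R$) is the total number of scores for that column (row).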
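The partition can be checked numerically on raw scores. The data below are invented purely for illustration (2 rows, 2 columns, 3 scores per cell); they are not the beer and caffeine data. The sketch computes the interaction term by subtraction, as above, and compares it with the direct formula n Σ (cell mean − row mean − column mean + grand mean)², which gives the same value when all cells are the same size.

```python
from statistics import mean

# Hypothetical scores: data[r][c] holds the scores in row r, column c.
data = [
    [[1.0, 1.2, 1.1], [1.3, 1.1, 1.2]],
    [[0.7, 0.9, 0.8], [1.0, 1.1, 0.9]],
]
R, C = len(data), len(data[0])
n = len(data[0][0])          # scores per cell (equal in every cell)
n_total = R * C * n

all_scores = [x for row in data for cell in row for x in cell]
grand = mean(all_scores)
cell_means = [[mean(cell) for cell in row] for row in data]
row_means = [mean([x for cell in row for x in cell]) for row in data]
col_means = [mean([x for row in data for x in row[c]]) for c in range(C)]

ss_total = sum((x - grand) ** 2 for x in all_scores)
ss_within = sum((x - cell_means[r][c]) ** 2
                for r in range(R) for c in range(C) for x in data[r][c])
ss_rows = sum(C * n * (m - grand) ** 2 for m in row_means)   # n_R = C * n
ss_cols = sum(R * n * (m - grand) ** 2 for m in col_means)   # n_C = R * n

# Interaction by subtraction, as in the partition above.
ss_rxc = ss_total - ss_within - ss_rows - ss_cols

# Interaction computed directly from the cell, row, column and grand means.
ss_rxc_direct = n * sum(
    (cell_means[r][c] - row_means[r] - col_means[c] + grand) ** 2
    for r in range(R) for c in range(C))

df = {"rows": R - 1, "cols": C - 1, "rxc": (R - 1) * (C - 1),
      "within": n_total - R * C, "total": n_total - 1}

print(f"SS_total={ss_total:.4f} SS_within={ss_within:.4f} "
      f"SS_rows={ss_rows:.4f} SS_cols={ss_cols:.4f} SS_rxc={ss_rxc:.4f}")
```

The degrees of freedom partition the same way: the four component values add up to the total.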
Three F-tests can then be conducted by computing the following four variances:
$$s^2_{wc} = \frac{SS_{wc}}{df_{wc}}$$

$s^2_{wc}$ estimates the variance within each cell, or ‘inherent variance’. This is also an estimate of the population variance $\sigma^2$. $s^2_{wc}$ is used as the denominator for all of the F-tests, just like the 1-way ANOVA.

$$s^2_R = \frac{SS_R}{df_R}$$

$s^2_R$ estimates the inherent variance plus the main effect for the row factor. It increases with variance across the row means.

$$s^2_C = \frac{SS_C}{df_C}$$

$s^2_C$ estimates the inherent variance plus the main effect for the column factor. It increases with variance across the column means.

$$s^2_{R \times C} = \frac{SS_{R \times C}}{df_{R \times C}}$$

$s^2_{R \times C}$ estimates the inherent variance plus the interaction effect. If $s^2_{R \times C}$ is small (near $s^2_{wc}$) then the total variance is completely explained by the inherent variance plus the effects of the row and column factors alone. Hence, no interaction between the two factors.
The four variances are used to compute the three F-ratios to make the three hypothesis tests about the row factor, column factor and the interaction:

$$F = \frac{s^2_R}{s^2_{wc}}$$

tests for the main effect of the row factor and has dfs of $(R-1)$ and $(n_{total} - R \times C)$.

$$F = \frac{s^2_C}{s^2_{wc}}$$

tests for the main effect of the column factor and has dfs of $(C-1)$ and $(n_{total} - R \times C)$.

$$F = \frac{s^2_{R \times C}}{s^2_{wc}}$$

tests for the interaction between the row and column factors and has dfs of $(R-1)(C-1)$ and $(n_{total} - R \times C)$.
Typically, we conduct a two-factor ANOVA by filling in a table like this:

Source         SS         df             s2             F
Rows           SSR        R-1            SSR/dfR        s2R/s2wc
Columns        SSC        C-1            SSC/dfC        s2C/s2wc
RxC            SSRxC      (R-1)x(C-1)    SSRxC/dfRxC    s2RxC/s2wc
Within cells   SSwc       ntotal-RxC     SSwc/dfwc
Total          SStotal    ntotal-1
We can start filling in the table from our example about beer and caffeine. With R = 2 rows, C = 2 columns and ntotal = 4 x 12 = 48 scores:

Source         SS         df     s2             F
Rows           SSR        1      SSR/dfR        s2R/s2wc
Columns        SSC        1      SSC/dfC        s2C/s2wc
RxC            SSRxC      1      SSRxC/dfRxC    s2RxC/s2wc
Within cells   SSwc       44     SSwc/dfwc
Total          3.07       47
SSwithin is the sum of the SSW values across the four cells:

$$SS_{within} = \sum_{\text{all scores}} (X - \bar{X}_{cell})^2 = 0.66 + 0.67 + 0.54 + 0.32 \approx 2.18$$

(The values on these slides were presumably computed from unrounded data, so arithmetic on the rounded numbers shown can differ slightly in the last digit.)

$$s^2_{wc} = \frac{SS_{wc}}{df_{wc}} = \frac{2.18}{44} = 0.0495$$

Source         SS         df     s2             F
Rows           SSR        1      SSR/dfR        s2R/s2wc
Columns        SSC        1      SSC/dfC        s2C/s2wc
RxC            SSRxC      1      SSRxC/dfRxC    s2RxC/s2wc
Within cells   2.18       44     0.0495
Total          3.07       47
$$SS_{rows} = \sum_{\text{all rows}} n_R (\bar{X}_R - \bar{X})^2 = 24(1.12 - 1.02)^2 + 24(0.91 - 1.02)^2 \approx 0.54$$

$$s^2_R = \frac{SS_R}{df_R} = \frac{0.54}{1} = 0.54$$

$$F = \frac{s^2_R}{s^2_{wc}} = \frac{0.54}{0.0495} \approx 10.98$$

Source         SS         df     s2             F
Rows           0.54       1      0.54           10.98
Columns        SSC        1      SSC/dfC        s2C/s2wc
RxC            SSRxC      1      SSRxC/dfRxC    s2RxC/s2wc
Within cells   2.18       44     0.0495
Total          3.07       47
$$SS_{cols} = \sum_{\text{all cols}} n_C (\bar{X}_C - \bar{X})^2 = 24(0.94 - 1.02)^2 + 24(1.09 - 1.02)^2 \approx 0.28$$

$$s^2_C = \frac{SS_C}{df_C} = \frac{0.28}{1} = 0.28$$

$$F = \frac{s^2_C}{s^2_{wc}} = \frac{0.28}{0.0495} \approx 5.62$$

Source         SS         df     s2             F
Rows           0.54       1      0.54           10.98
Columns        0.28       1      0.28           5.62
RxC            SSRxC      1      SSRxC/dfRxC    s2RxC/s2wc
Within cells   2.18       44     0.0495
Total          3.07       47
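The row and column sums of squares can be reproduced from the marginal means. The sketch below uses the rounded means shown on the slides, so it lands slightly below the reported 0.54 and 0.28, which were presumably computed from unrounded data. The helper name `ss_factor` is made up for illustration.

```python
def ss_factor(level_means, n_per_level, grand_mean):
    """Sum of squares for one factor: n * (level mean - grand mean)^2, summed over levels."""
    return sum(n_per_level * (m - grand_mean) ** 2 for m in level_means)

grand_mean = 1.02
row_means = [1.12, 0.91]   # No Caffeine, Caffeine
col_means = [0.94, 1.09]   # No Beer, Beer
n_per_level = 24           # 2 cells x 12 subjects in each row and each column

ss_rows = ss_factor(row_means, n_per_level, grand_mean)
ss_cols = ss_factor(col_means, n_per_level, grand_mean)

print(f"SS_rows from rounded means: {ss_rows:.4f} (slides report 0.54)")
print(f"SS_cols from rounded means: {ss_cols:.4f} (slides report 0.28)")
```

The same function serves both factors because the formula only differs in which set of means is passed in.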
$$SS_{rows \times cols} = SS_{total} - (SS_{within} + SS_{cols} + SS_{rows}) = 3.07 - (2.18 + 0.28 + 0.54) \approx 0.064$$

$$s^2_{R \times C} = \frac{SS_{R \times C}}{df_{R \times C}} = \frac{0.064}{1} = 0.064$$

$$F = \frac{s^2_{R \times C}}{s^2_{wc}} = \frac{0.064}{0.0495} \approx 1.30$$

Source         SS         df     s2         F
Rows           0.54       1      0.54       10.98
Columns        0.28       1      0.28       5.62
RxC            0.064      1      0.064      1.30
Within cells   2.18       44     0.0495
Total          3.07       47
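The table-filling steps can be collected into one small function. This is a sketch, not the course's own calculator; the function name and dictionary keys are illustrative. It is run here on the sums of squares reported on the slides, so the F values agree with the table only up to rounding.

```python
def two_factor_table(ss_rows, ss_cols, ss_rxc, ss_within, R, C, n_total):
    """Return degrees of freedom, variances (SS/df) and F-ratios for a two-factor ANOVA."""
    ss = {"rows": ss_rows, "cols": ss_cols, "rxc": ss_rxc, "within": ss_within}
    df = {
        "rows": R - 1,
        "cols": C - 1,
        "rxc": (R - 1) * (C - 1),
        "within": n_total - R * C,
    }
    s2 = {source: ss[source] / df[source] for source in ss}
    # Every F-ratio uses the within-cell variance as its denominator.
    F = {source: s2[source] / s2["within"] for source in ("rows", "cols", "rxc")}
    return df, s2, F

# Beer and caffeine example: 2 x 2 design, 12 subjects per cell.
df, s2, F = two_factor_table(0.54, 0.28, 0.064, 2.18, R=2, C=2, n_total=48)

for source in ("rows", "cols", "rxc"):
    print(f"{source:6s} df = {df[source]}  s2 = {s2[source]:.4f}  F = {F[source]:.2f}")
print(f"within df = {df['within']}  s2 = {s2['within']:.4f}")
```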
We can either use our F-tables (Table E) to find the critical values of F:

Source         SS         df     s2         F         Fcrit
Rows           0.54       1      0.54       10.98     4.06
Columns        0.28       1      0.28       5.62      4.06
RxC            0.064      1      0.064      1.30      4.06
Within cells   2.18       44     0.0495
Total          3.07       47

Or, more commonly, we can use our F-calculator to calculate the corresponding p-value for our observed values of F:

Source         SS         df     s2         F         P-value
Rows           0.54       1      0.54       10.98     0.0019
Columns        0.28       1      0.28       5.62      0.0214
RxC            0.064      1      0.064      1.30      0.2604
Within cells   2.18       44     0.0495
Total          3.07       47
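For readers without the course's F-calculator, the upper-tail probability of the F distribution can be sketched with the standard library alone. The code below is an illustrative implementation, not the calculator referred to above: it evaluates the regularized incomplete beta function with a continued fraction and finds a critical value by bisection. The function names are made up, and alpha = 0.05 is an assumption (it is consistent with the tabled critical value of 4.06).

```python
import math

def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_pvalue(f, df1, df2):
    """P(F >= f) for an F distribution with (df1, df2) degrees of freedom."""
    if f <= 0.0:
        return 1.0
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))

def f_critical(alpha, df1, df2, upper=1000.0):
    """Critical F found by bisection on the p-value (assumes it lies below `upper`)."""
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f_pvalue(mid, df1, df2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for name, f in [("Rows", 10.98), ("Columns", 5.62), ("RxC", 1.30)]:
    print(f"{name:8s} F = {f:5.2f}  p = {f_pvalue(f, 1, 44):.4f}")
print(f"Fcrit(alpha = 0.05, df = 1, 44) = {f_critical(0.05, 1, 44):.2f}")
```

An effect is called significant when its p-value falls below alpha, or equivalently when its F exceeds the critical value.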
We find a significant main effect for rows (Caffeine) and for columns (Beer), but not a significant interaction between rows and columns (Caffeine x Beer).
Let’s play “guess that significance!”

[Figure: Score plotted against Columns (1, 2), with separate lines for Row 1 and Row 2.]

Source    SS           df    s2          F         p-value
Rows      2629.420     1     2629.420    11.519    0.0015
Columns   6306.492     1     6306.492    27.628    0.0000
RxC       779.260      1     779.260     3.414     0.0714
Within    10043.672    44    228.265
Total     19758.844    47
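A table like this can be checked for internal consistency: the component SS values should add up to the total, each s2 should equal SS/df, and each F should equal its s2 divided by the within-cell s2. The sketch below runs that check on the first "guess that significance" table and then labels each effect; the cutoff alpha = 0.05 is an assumption, since the slides do not state one here.

```python
# (SS, df) for each source, copied from the first table.
table = {
    "Rows": (2629.420, 1),
    "Columns": (6306.492, 1),
    "RxC": (779.260, 1),
    "Within": (10043.672, 44),
}
ss_total_reported = 19758.844
reported_F = {"Rows": 11.519, "Columns": 27.628, "RxC": 3.414}
reported_p = {"Rows": 0.0015, "Columns": 0.0000, "RxC": 0.0714}

# The four components should sum to the total.
ss_sum = sum(ss for ss, _ in table.values())

# Variance for each source, then each F against the within-cell variance.
s2 = {source: ss / df for source, (ss, df) in table.items()}
F = {source: s2[source] / s2["Within"] for source in reported_F}

alpha = 0.05  # assumed significance level
significant = {source: p < alpha for source, p in reported_p.items()}

print(f"Sum of SS components: {ss_sum:.3f} (reported total {ss_total_reported})")
for source in reported_F:
    verdict = "significant" if significant[source] else "not significant"
    print(f"{source:8s} F = {F[source]:.3f} (reported {reported_F[source]})  {verdict}")
```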
Guess that significance!

[Figure: Score plotted against Columns (1, 2), with separate lines for Row 1 and Row 2.]

Source    SS           df    s2          F         p-value
Rows      5431.309     1     5431.309    19.340    0.0001
Columns   366.798      1     366.798     1.306     0.2593
RxC       3354.865     1     3354.865    11.946    0.0012
Within    12356.377    44    280.827
Total     21509.349    47
Guess that significance!

[Figure: Score plotted against Columns (1, 2), with separate lines for Row 1 and Row 2.]

Source    SS           df    s2          F         p-value
Rows      3946.596     1     3946.596    21.685    0.0000
Columns   68.646       1     68.646      0.377     0.5423
RxC       442.586      1     442.586     2.432     0.1261
Within    8007.838     44    181.996
Total     12465.666    47
Guess that significance!

[Figure: Score plotted against Columns (1, 2), with separate lines for Row 1 and Row 2.]

Source    SS           df    s2          F         p-value
Rows      7.143        1     7.143       0.027     0.8707
Columns   8.773        1     8.773       0.033     0.8568
RxC       2547.690     1     2547.690    9.565     0.0034
Within    11719.076    44    266.343
Total     14282.682    47
Guess that significance!

[Figure: Score plotted against Columns (1, 2), with separate lines for Row 1 and Row 2.]

Source    SS           df    s2          F         p-value
Rows      402.806      1     402.806     1.906     0.1743
Columns   4266.756     1     4266.756    20.193    0.0001
RxC       13.740       1     13.740      0.065     0.7999
Within    9296.999     44    211.295
Total     13980.299    47
Guess that significance!

[Figure: Score plotted against Columns (1, 2), with separate lines for Row 1 and Row 2.]

Source    SS           df    s2          F         p-value
Rows      4677.560     1     4677.560    20.283    0.0000
Columns   0.653        1     0.653       0.003     0.9578
RxC       6777.132     1     6777.132    29.387    0.0000
Within    10147.208    44    230.618
Total     21602.554    47
Guess that significance!

[Figure: Score plotted against Columns (1, 2, 3), with separate lines for Row 1 and Row 2.]

Source    SS           df    s2          F         p-value
Rows      1717.957     1     1717.957    7.282     0.0088
Columns   4.773        2     2.386       0.010     0.9899
RxC       508.478      2     254.239     1.078     0.3463
Within    15569.688    66    235.904
Total     17800.896    71
Guess that significance!

[Figure: Score plotted against Columns (1, 2, 3), with separate lines for Row 1 and Row 2.]

Source    SS           df    s2          F         p-value
Rows      439.149      1     439.149     2.011     0.1609
Columns   9877.896     2     4938.948    22.619    0.0000
RxC       324.247      2     162.123     0.742     0.4799
Within    14411.484    66    218.356
Total     25052.776    71
Guess that significance!

[Figure: Score plotted against Columns (1, 2, 3), with separate lines for Row 1 and Row 2.]

Source    SS           df    s2          F         p-value
Rows      608.976      1     608.976     2.710     0.1045
Columns   818.724      2     409.362     1.822     0.1698
RxC       4498.794     2     2249.397    10.010    0.0002
Within    14830.738    66    224.708
Total     20757.233    71
Suppose you wanted to test the effects of diet and exercise on body mass index. You choose two diets (A and B) and three levels of exercise (none, a little, a lot). You then find 12 subjects for each group and obtain the following descriptive statistics:

n = 12       Exercise
             None            A little        A lot
Diet A       Mean: 20.73     Mean: 20.12     Mean: 19.21
             SSW = 20.01     SSW = 37.86     SSW = 29.84
Diet B       Mean: 20.83     Mean: 20.57     Mean: 18.08
             SSW = 13.14     SSW = 44.59     SSW = 20.27

Grand mean: 19.92, SStotal = 235.90

Conduct a two-factor ANOVA to determine whether there is a main effect for diet, a main effect for exercise, and an interaction between diet and exercise.
First, let’s plot the data with error bars as the standard error of the means.

[Figure: BMI plotted against Exercise (none, a little, a lot), with separate lines for Diet A and Diet B.]
n = 12       Exercise
             None            A little        A lot           Row means
Diet A       Mean: 20.73     Mean: 20.12     Mean: 19.21     Mean: 20.02
             SSW = 20.01     SSW = 37.86     SSW = 29.84
Diet B       Mean: 20.83     Mean: 20.57     Mean: 18.08     Mean: 19.83
             SSW = 13.14     SSW = 44.59     SSW = 20.27
Column means Mean: 20.78     Mean: 20.35     Mean: 18.64     Grand mean: 19.92

SStotal = 235.90

Source         SS         df             s2             F             p-value
Rows           SSR        R-1            SSR/dfR        s2R/s2wc
Columns        SSC        C-1            SSC/dfC        s2C/s2wc
RxC            SSRxC      (R-1)x(C-1)    SSRxC/dfRxC    s2RxC/s2wc
Within cells   SSwc       ntotal-RxC     SSwc/dfwc
Total          SStotal    ntotal-1
Source         SS        df    s2       F        p-value
Rows           0.70      1     0.70     0.28     0.5999
Columns        61.23     2     30.61    12.19    0.0000
RxC            8.26      2     4.13     1.65     0.2007
Within cells   165.71    66    2.51
Total          235.90    71

No main effect for Rows (Diet)
Main effect for Columns (Exercise)
No interaction between rows and columns (Diet and Exercise)
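The whole exercise can be worked from the summary statistics alone. The sketch below follows the same steps as the beer and caffeine example: marginal means, then the sums of squares, then the variances and F-ratios. Because it starts from the rounded cell means, its results differ slightly from the table above (for example, SS for rows comes out near 0.67 rather than 0.70), which was presumably computed from unrounded data.

```python
n = 12  # subjects per cell
cell_means = [
    [20.73, 20.12, 19.21],  # Diet A: none, a little, a lot
    [20.83, 20.57, 18.08],  # Diet B
]
cell_ssw = [
    [20.01, 37.86, 29.84],
    [13.14, 44.59, 20.27],
]
ss_total = 235.90
R, C = len(cell_means), len(cell_means[0])
n_total = R * C * n

# Marginal and grand means (valid because all cells have the same n).
row_means = [sum(row) / C for row in cell_means]
col_means = [sum(col) / R for col in zip(*cell_means)]
grand = sum(row_means) / R

# Sums of squares.
ss_within = sum(sum(row) for row in cell_ssw)
ss_rows = sum(C * n * (m - grand) ** 2 for m in row_means)  # n_R = C * n = 36
ss_cols = sum(R * n * (m - grand) ** 2 for m in col_means)  # n_C = R * n = 24
ss_rxc = ss_total - ss_within - ss_rows - ss_cols

# Degrees of freedom, variances and F-ratios.
df_rows, df_cols = R - 1, C - 1
df_rxc = df_rows * df_cols
df_within = n_total - R * C
s2_within = ss_within / df_within
F_rows = (ss_rows / df_rows) / s2_within
F_cols = (ss_cols / df_cols) / s2_within
F_rxc = (ss_rxc / df_rxc) / s2_within

print(f"SS_within = {ss_within:.2f}, df = {df_within}, s2 = {s2_within:.2f}")
print(f"Rows:    SS = {ss_rows:.2f}, F = {F_rows:.2f}")
print(f"Columns: SS = {ss_cols:.2f}, F = {F_cols:.2f}")
print(f"RxC:     SS = {ss_rxc:.2f}, F = {F_rxc:.2f}")
```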