(One-Way) Repeated Measures ANOVA
PSYC 6130A, PROF. J. ELDER 2
One-Way Repeated Measures ANOVA
• Generalization of repeated-measures t-test to independent variable with more than 2 levels.
• Each subject has a score for each level of the independent variable.
• May be used for repeated or matched designs.
Example: Visual Grating Detection in Noise
[Figure: trial sequence for the grating detection task, with displays labeled 200 ms, Until Response, 500 ms, 500 ms]
Repeated Measures ANOVA Example: Grating Detection
Signal-to-noise ratio (SNR) at threshold; spatial frequency = 0.5 c/deg

             Noise
Subject      .04       .15       .50       Mean
1            .08480    .06830    .06540    .07283
2            .08290    .06090    .07610    .07330
3            .08880    .06440    .07120    .07480
Mean         .08550    .06453    .07090    .07364 (total)

$s_T = 0.009977$
Example Grating Detection
Sum of Squares Analysis
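$SS_T = SS_A + SS_B + SS_{AB} + SS_{err}$
$SS_A = b \sum_i (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2$
$SS_B = a \sum_j (\bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot})^2$
$SS_{resid} = SS_{AB} + SS_{err} = SS_T - SS_A - SS_B$
Let Factor A represent Subject.
Let Factor B represent the within-subjects independent variable (noise level in our example).
$SS_T = df_T \, s_T^2$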
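The partition above can be sketched on the grating-detection data. This is a minimal illustration using only the Python standard library; the variable names are hypothetical, and the scores are the table values as printed (4 significant digits), so results may differ from the slide values in the last digit.

```python
from statistics import mean, variance

# SNR at threshold: rows = subjects (Factor A), columns = noise levels (Factor B)
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
a = len(scores)      # number of subjects
b = len(scores[0])   # number of noise levels

all_scores = [x for row in scores for x in row]
grand_mean = mean(all_scores)

# SS_T = df_T * s_T^2, using the sample variance of all a*b scores
df_total = a * b - 1
s_total = variance(all_scores) ** 0.5
ss_total = df_total * s_total ** 2

# SS_A: squared deviations of subject means from the grand mean, weighted by b
ss_subjects = b * sum((mean(row) - grand_mean) ** 2 for row in scores)
# SS_B: squared deviations of noise-level means from the grand mean, weighted by a
ss_noise = a * sum((mean(col) - grand_mean) ** 2 for col in zip(*scores))
# SS_resid is what remains after removing the subject and noise effects
ss_resid = ss_total - ss_subjects - ss_noise

print(f"s_T = {s_total:.6f}")
print(f"SS_T = {ss_total:.3e}, SS_A = {ss_subjects:.3e}, "
      f"SS_B = {ss_noise:.3e}, SS_resid = {ss_resid:.3e}")
```

With one score per subject-by-level cell, the interaction and error terms cannot be separated, which is why only their sum ($SS_{resid}$) is computed.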
Degrees of Freedom Tree
$df_T = N_T - 1$
$df_A = a - 1$
$df_{within\text{-}S} = a(b - 1)$
$df_B = b - 1$
$df_{resid} = df_A \times df_B$
Test Statistic
$MS_B = \frac{SS_B}{df_B}$
$MS_{resid} = \frac{SS_{resid}}{df_{resid}}$
$F = \frac{MS_B}{MS_{resid}}$
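As a check on the formulas, the F ratio can be computed directly from the grating-detection scores. This is a sketch under the same conventions as the slides (a = number of subjects, b = number of noise levels); the variable names are illustrative only.

```python
from statistics import mean

# SNR at threshold: rows = subjects (Factor A), columns = noise levels (Factor B)
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
a, b = len(scores), len(scores[0])
grand_mean = mean([x for row in scores for x in row])

# Sums of squares
ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
ss_a = b * sum((mean(row) - grand_mean) ** 2 for row in scores)
ss_b = a * sum((mean(col) - grand_mean) ** 2 for col in zip(*scores))
ss_resid = ss_total - ss_a - ss_b

# Degrees of freedom
df_b = b - 1
df_resid = (a - 1) * (b - 1)

# Mean squares and F ratio
ms_b = ss_b / df_b
ms_resid = ss_resid / df_resid
f_stat = ms_b / ms_resid

print(f"F({df_b}, {df_resid}) = {f_stat:.3f}")
```

The result should agree, up to rounding of the printed scores, with the F value reported in the SPSS output for this example.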
Step 1. State the Hypothesis
• Same as for 1-way independent ANOVA:
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_a$: at least 2 means differ
Step 2. Select Statistical Test and Significance Level
• As usual
Step 3. Select Samples and Collect Data
• Ideally, randomly sample
• More probably, random assignment
Step 4. Find Region of Rejection
$df_T = N_T - 1$
$df_A = a - 1$
$df_{within\text{-}S} = a(b - 1)$
$df_B = b - 1 = 2$
$df_{resid} = df_A \times df_B = 2 \times 2 = 4$
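With these degrees of freedom, the boundary of the rejection region can be sketched numerically. The example assumes $\alpha = .05$ (the slides say only "as usual") and relies on a closed form that holds only when the numerator df equals 2: $P(F > f) = (1 + 2f/df_2)^{-df_2/2}$. The function names are hypothetical; for other numerator df a statistics library would be needed.

```python
def f_sf_df1_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

def f_crit_df1_2(alpha, df2):
    """Critical F for 2 and df2 degrees of freedom (inverse of f_sf_df1_2)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

alpha = 0.05          # assumed significance level
df_b, df_resid = 2, 4

f_crit = f_crit_df1_2(alpha, df_resid)
print(f"Reject H0 if F > {f_crit:.2f}")

# Sanity check: the tail area beyond the critical value equals alpha
tail = f_sf_df1_2(f_crit, df_resid)
```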
Step 5. Calculate the Test Statistic
Let Factor A represent Subject.
Let Factor B represent the within-subjects independent variable.
$SS_T = SS_A + SS_B + SS_{AB} + SS_{err}$
$SS_A = b \sum_i (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2$
$SS_B = a \sum_j (\bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot})^2$
$SS_{resid} = SS_{AB} + SS_{err} = SS_T - SS_A - SS_B$
$MS_B = \frac{SS_B}{df_B}$
$MS_{resid} = \frac{SS_{resid}}{df_{resid}}$
$F = \frac{MS_B}{MS_{resid}}$
Step 6. Make the Statistical Decisions
SPSS Output
Tests of Within-Subjects Effects
Measure: MEASURE_1

Source                             Type III Sum of Squares   df      Mean Square   F        Sig.
noise          Sphericity Assumed  .001                      2       .000          14.355   .015
               Greenhouse-Geisser  .001                      1.237   .001          14.355   .044
               Huynh-Feldt         .001                      2.000   .000          14.355   .015
               Lower-bound         .001                      1.000   .001          14.355   .063
Error(noise)   Sphericity Assumed  9.66E-005                 4       2.41E-005
               Greenhouse-Geisser  9.66E-005                 2.473   3.91E-005
               Huynh-Feldt         9.66E-005                 4.000   2.41E-005
               Lower-bound         9.66E-005                 2.000   4.83E-005

The noise sum of squares is $SS_B$; the Error(noise) sum of squares is $SS_{resid}$.
Assumptions
• Independent random sampling
• Multivariate normal distribution
• Homogeneity of variance (not a huge concern, since there is the same number of observations at each treatment level).
• Sphericity (new).
Homogeneity of Variance
• Homogeneity of Variance is the property that the variance in the dependent variable is the same at each level of the independent variable.
– In the context of RM ANOVA, this means that the variance between subjects is the same at each level of the independent variable.
– Since RM ANOVA designs are balanced by default, homogeneity of variance is not a critical issue.
Homogeneity of Variance
             Noise
Subject      0.04        0.14        0.5
1            0.0848      0.0683      0.0654
2            0.0829      0.0609      0.0761
3            0.0888      0.0644      0.0712
Variance     9.07E-06    1.37E-05    2.87E-05
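The variance row can be reproduced as the sample variance (n - 1 in the denominator) of the scores at each noise level. This is a minimal sketch; the noise labels are copied from this table and the variable names are illustrative.

```python
from statistics import variance

# SNR at threshold: rows = subjects, columns = noise levels
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
noise_levels = [0.04, 0.14, 0.5]   # labels as printed in the table above

# Between-subject sample variance at each noise level
level_variances = [variance(col) for col in zip(*scores)]

for level, var in zip(noise_levels, level_variances):
    print(f"noise {level}: variance = {var:.2e}")
```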
Sphericity
• Sphericity is the property that the degree of interaction (covariance) between any two different levels of the independent variable is the same.
• Sphericity is critical for RM ANOVA because the error term is the average of the pairwise interactions.
• Violations generally lead to inflated F statistics (and hence inflated Type I error).
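One common way to make the sphericity property concrete (a standard formulation, not stated on the slide) is to compare the variances of the difference scores for each pair of levels: under sphericity these are equal in the population. A minimal sketch on the grating-detection data follows; with only 3 subjects the estimates are very noisy, so this illustrates the calculation rather than testing the assumption.

```python
from itertools import combinations
from statistics import variance

# SNR at threshold: rows = subjects, columns = noise levels 1, 2, 3
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
levels = list(zip(*scores))   # one tuple of subject scores per noise level

# Variance of the difference scores for each pair of levels
diff_var = {}
for i, j in combinations(range(len(levels)), 2):
    diffs = [x - y for x, y in zip(levels[i], levels[j])]
    diff_var[(i + 1, j + 1)] = variance(diffs)

for pair, var in diff_var.items():
    print(f"levels {pair}: variance of differences = {var:.2e}")
```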
Sphericity Does Not Hold
[Figure: three pairwise scatter plots of subject scores (panels: Xi1 and Xi2, Xi1 and Xi3, Xi2 and Xi3)]
Sphericity Does Hold
[Figure: three pairwise scatter plots of subject scores (panels: Xi1 and Xi2, Xi1 and Xi3, Xi2 and Xi3)]
Sphericity
• Does sphericity appear to hold?
• Do these graphs suggest that the RM design will yield a large increase in statistical power?
[Figure: three scatter plots of subject thresholds at pairs of noise levels]
– Noise = .15 vs. Noise = .04: $s_{XY} = 3.2 \times 10^{-6}$
– Noise = .50 vs. Noise = .04: $s_{XY} = -4.3 \times 10^{-6}$
– Noise = .50 vs. Noise = .14: $s_{XY} = -2.0 \times 10^{-5}$
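The covariances for the three pairs of noise levels can be recomputed from the raw scores as a check, including their signs. This is a sketch using the sample covariance (n - 1 in the denominator); `sample_cov` is a hypothetical helper written out by hand so that the example runs on older Python versions.

```python
from statistics import mean

# SNR at threshold: rows = subjects, columns = noise levels 1, 2, 3
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
level1, level2, level3 = zip(*scores)

def sample_cov(x, y):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

cov12 = sample_cov(level1, level2)
cov13 = sample_cov(level1, level3)
cov23 = sample_cov(level2, level3)

print(f"s_XY(1,2) = {cov12:.1e}")
print(f"s_XY(1,3) = {cov13:.1e}")
print(f"s_XY(2,3) = {cov23:.1e}")
```

Small or negative covariances between levels mean that subjects do not keep a consistent rank across conditions, so removing subject variability buys little extra power here.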
Testing Sphericity
• Mauchly (1940) test: provided automatically by SPSS
– Test has low power (for small samples, likely to accept sphericity assumption when it is false).
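For readers who want to see what the test computes, here is a hedged sketch of Mauchly's W for the grating-detection data. Two assumptions are not from the slides: the textbook chi-square approximation $\chi^2 = -(n-1)\,d\,\ln W$ with $d = 1 - (2p^2 + p + 2)/(6p(n-1))$ and $p = k - 1$, and a shortcut for the product of eigenvalues that is valid only for k = 3 levels. All names are illustrative.

```python
import math
from statistics import mean

# SNR at threshold: rows = subjects, columns = noise levels
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
n, k = len(scores), len(scores[0])
cols = list(zip(*scores))

def sample_cov(x, y):
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

# Covariance matrix of the k levels, then double-centered
S = [[sample_cov(cols[i], cols[j]) for j in range(k)] for i in range(k)]
row_mean = [mean(r) for r in S]
grand = mean(row_mean)
Sc = [[S[i][j] - row_mean[i] - row_mean[j] + grand for j in range(k)]
      for i in range(k)]

trace = sum(Sc[i][i] for i in range(k))
trace_sq = sum(Sc[i][j] ** 2 for i in range(k) for j in range(k))

p = k - 1
assert p == 2, "eigenvalue shortcut below only holds for k = 3 levels"
eig_product = (trace ** 2 - trace_sq) / 2      # product of the 2 nonzero eigenvalues
W = eig_product / (trace / p) ** p             # Mauchly's W

d = 1 - (2 * p ** 2 + p + 2) / (6 * p * (n - 1))
chi2 = -(n - 1) * d * math.log(W)
df = p * (p + 1) // 2 - 1                      # equals 2 here
p_value = math.exp(-chi2 / 2)                  # chi-square tail area, df = 2 only

print(f"W = {W:.3f}, chi-square({df}) = {chi2:.3f}, p = {p_value:.3f}")
```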
Alternative: Assume the Worst! (Total Lack of Sphericity)
• Conservative Geisser-Greenhouse F Test (1958)
– Provides a means for calculating a correct critical F value under the assumption of a complete lack of sphericity (lower bound):
$F_{crit}(1, df_A)$, where $df_A = a - 1 = \text{number of subjects} - 1$
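For the grating-detection example (3 subjects, so $df_A = 2$), the conservative critical value can be sketched in closed form. The sketch assumes $\alpha = .05$ and uses the fact that F(1, 2) is the square of a t variable with 2 df, for which $P(F > f) = 1 - \sqrt{f/(2+f)}$; this shortcut applies only when $df_A = 2$, and the function names are hypothetical.

```python
import math

def f_sf_1_2(f):
    """P(F > f) for an F distribution with 1 and 2 degrees of freedom."""
    return 1 - math.sqrt(f / (2 + f))

def f_crit_1_2(alpha):
    """Critical F for 1 and 2 degrees of freedom (inverse of f_sf_1_2)."""
    q = (1 - alpha) ** 2
    return 2 * q / (1 - q)

alpha = 0.05       # assumed significance level
f_obs = 14.355     # F for the noise effect, from the SPSS output

f_crit_lower_bound = f_crit_1_2(alpha)
p_lower_bound = f_sf_1_2(f_obs)
reject = f_obs > f_crit_lower_bound

print(f"F_crit(1, 2) = {f_crit_lower_bound:.2f}")
print(f"p = {p_lower_bound:.3f}, reject H0: {reject}")
```

Under this worst-case assumption the observed F does not reach the critical value, which matches the Lower-bound row of the SPSS output.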
Estimating Sphericity
• What if your F statistic falls between the 2 critical values (assuming sphericity or assuming total lack of sphericity)?
$F_{crit}(df_B, df_A \times df_B) < F < F_{crit}(1, df_A)$
• Solution: estimate sphericity, and use estimate to adjust critical value.
Sphericity parameter $\varepsilon$: $\frac{1}{df_B} \le \varepsilon \le 1$
$F_{crit}(df_B, df_A \times df_B) \rightarrow F_{crit}(\varepsilon\, df_B, \varepsilon\, df_A \times df_B)$
• Two different methods for calculating $\varepsilon$:
– Greenhouse and Geisser (1959)
– Huynh and Feldt (1976) (less conservative)
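A sketch of the Greenhouse-Geisser estimate for the grating-detection data follows. It assumes the usual formula $\hat{\varepsilon} = [\mathrm{tr}(S_c)]^2 / [(k-1)\,\mathrm{tr}(S_c^2)]$, where $S_c$ is the double-centered covariance matrix of the k levels; the formula is not given on the slides, and the variable names are illustrative.

```python
from statistics import mean

# SNR at threshold: rows = subjects, columns = noise levels
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
n, k = len(scores), len(scores[0])
cols = list(zip(*scores))

def sample_cov(x, y):
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

# Covariance matrix of the k levels, then double-centered
S = [[sample_cov(cols[i], cols[j]) for j in range(k)] for i in range(k)]
row_mean = [mean(r) for r in S]
grand = mean(row_mean)
Sc = [[S[i][j] - row_mean[i] - row_mean[j] + grand for j in range(k)]
      for i in range(k)]

trace = sum(Sc[i][i] for i in range(k))
# For a symmetric matrix, trace(Sc @ Sc) is the sum of squared entries
trace_sq = sum(Sc[i][j] ** 2 for i in range(k) for j in range(k))

epsilon_gg = trace ** 2 / ((k - 1) * trace_sq)
epsilon_lower = 1 / (k - 1)        # lower bound, 1 / df_B

# Adjusted degrees of freedom for the F test
df_b, df_resid = k - 1, (n - 1) * (k - 1)
adj_df_b = epsilon_gg * df_b
adj_df_resid = epsilon_gg * df_resid

print(f"epsilon (GG) = {epsilon_gg:.3f}, lower bound = {epsilon_lower:.3f}")
print(f"adjusted df = ({adj_df_b:.3f}, {adj_df_resid:.3f})")
```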
SPSS Output
Mauchly's Test of Sphericity
Measure: MEASURE_1

                                                                         Epsilon (a)
Within Subjects Effect   Mauchly's W   Approx. Chi-Square   df   Sig.    Greenhouse-Geisser   Huynh-Feldt   Lower-bound
noise                    .383          .960                 2    .619    .618                 1.000         .500

Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.
End of Lecture 17
Multivariate Approach to Repeated Measures
• Based on forming difference scores for each pair of levels of the independent variable.
– e.g., for our 3-level example, there are 3 pairs
• Each pair of difference scores is treated as a different dependent variable in a MANOVA.
• Sphericity does not need to be assumed.
• When all assumptions of the repeated measures ANOVA are met, ANOVA is usually more powerful than MANOVA (especially for small samples).
• Thus multivariate approach should be considered only if there is doubt about sphericity assumption.
• When sphericity does not apply, MANOVA can be much more powerful for large samples.
Post-Hoc Comparisons
• If very confident about sphericity, use standard methods (e.g., Fisher’s LSD, Tukey’s HSD), with $MS_{resid}$ as error term.
• Otherwise, use conservative approach: Bonferroni test.
– Error term calculated separately for each comparison, using only the data from the two levels.
– This means that sphericity need not be assumed.
Pairwise Comparisons
Measure: MEASURE_1

                                                                         95% Confidence Interval for Difference (a)
(I) noise   (J) noise   Mean Difference (I-J)   Std. Error   Sig. (a)    Lower Bound   Upper Bound
1           2            .021*                  .002         .037         .003          .039
1           3            .015                   .004         .197        -.015          .045
2           1           -.021*                  .002         .037        -.039         -.003
2           3           -.006                   .005         1.000       -.046          .034
3           1           -.015                   .004         .197        -.045          .015
3           2            .006                   .005         1.000       -.034          .046

Based on estimated marginal means
*. The mean difference is significant at the .05 level.
a. Adjustment for multiple comparisons: Bonferroni.
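The Bonferroni-adjusted significance values can be sketched as paired t-tests on the raw scores, with each error term computed from the two levels being compared. The closed-form p-value used here, $p = 1 - |t|/\sqrt{2 + t^2}$, holds only for 2 degrees of freedom (3 subjects); for other sample sizes a statistics library would be needed. The names are illustrative.

```python
import math
from itertools import combinations
from statistics import mean, stdev

# SNR at threshold: rows = subjects, columns = noise levels 1, 2, 3
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
n = len(scores)
assert n == 3, "closed-form p-value below is only valid for df = n - 1 = 2"
levels = list(zip(*scores))
pairs = list(combinations(range(len(levels)), 2))
n_comparisons = len(pairs)

adjusted_p = {}
for i, j in pairs:
    diffs = [x - y for x, y in zip(levels[i], levels[j])]
    se = stdev(diffs) / math.sqrt(n)            # error term from these two levels only
    t = mean(diffs) / se
    p_raw = 1 - abs(t) / math.sqrt(2 + t * t)   # two-sided p for t with 2 df
    adjusted_p[(i + 1, j + 1)] = min(1.0, n_comparisons * p_raw)

for pair, p in adjusted_p.items():
    print(f"levels {pair}: Bonferroni-adjusted p = {p:.3f}")
```

Multiplying each raw p-value by the number of comparisons and capping at 1 is the same adjustment reported in the Sig. column above.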
Reporting the Result
• One-way repeated measures ANOVA reveals a significant effect of noise contrast on the signal-to-noise ratio at threshold (F[2,4]=14.4, p=.044, ε=0.618). Post-hoc pairwise Bonferroni-corrected comparisons reveal that signal-to-noise ratio at threshold was higher at 4.8% noise contrast than at 14.3% noise contrast (p=.037). No other significant pairwise differences were found (p>.05).
Varieties of Repeated-Measures and Randomized-Blocks Designs
• Simultaneous RM Design
– e.g., subject rates different aspects of stimulus on comparable rating scale.
• Successive RM Design
– Here counterbalancing becomes important.
• RM Over Time
– Track a dependent variable over time (e.g., learning effects).
– Not likely to satisfy sphericity (scores taken closer in time will have higher covariance).
• RM with Quantitative Levels
– Think about regression first.
• Randomized Blocks
– Matched design useful if you cannot avoid serious carryover effects.
• Natural Blocks
– Blocks of subjects are naturally occurring (e.g., children in same family).