(One-Way) Repeated Measures ANOVA

PSYC 6130A, PROF. J. ELDER

One-Way Repeated Measures ANOVA

• Generalization of repeated-measures t-test to independent variable with more than 2 levels.

• Each subject has a score for each level of the independent variable.

• May be used for repeated or matched designs.


Example: Visual Grating Detection in Noise

[Figure: trial sequence for the grating detection task, with intervals labeled 200 ms, Until Response, 500 ms, and 500 ms.]


Repeated Measures ANOVA Example: Grating Detection

Spatial frequency = 0.5 c/deg

Signal-to-noise ratio (SNR) at threshold

                      Noise
Subject        .04       .15       .50       Mean
1            .08480    .06830    .06540    .07283
2            .08290    .06090    .07610    .07330
3            .08880    .06440    .07120    .07480
Group Mean   .08550    .06453    .07090    .07364

\[ s_T = 0.009977 \]


Example Grating Detection


Sum of Squares Analysis

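\[ SS_T = SS_A + SS_B + SS_{AB} + SS_{err} \]

\[ SS_A = b \sum_i (\bar{X}_i - \bar{X})^2 \]

\[ SS_B = a \sum_j (\bar{X}_j - \bar{X})^2 \]

\[ SS_{resid} = SS_{AB} + SS_{err} = SS_T - SS_A - SS_B \]

Let Factor A represent Subject.

Let Factor B represent the within-subjects independent variable (noise level in our example).

Here a is the number of subjects, b is the number of levels of Factor B, \( \bar{X}_i \) is the mean for subject i, \( \bar{X}_j \) is the mean for level j, and \( \bar{X} \) is the grand mean.

\[ SS_T = df_T \, s_T^2 \]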
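
As a rough check, the partition above can be worked through on the grating-detection data. The following is a minimal Python sketch using only the standard library; the variable names are illustrative and not part of the lecture.

```python
from statistics import mean

# Threshold SNR from the example table: rows are subjects 1-3,
# columns are the three noise levels.
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]

a = len(scores)     # levels of Factor A (subjects)
b = len(scores[0])  # levels of Factor B (noise)

all_scores = [x for row in scores for x in row]
grand_mean = mean(all_scores)
subject_means = [mean(row) for row in scores]
level_means = [mean(col) for col in zip(*scores)]

# Sums of squares, following the formulas on the slide.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
ss_a = b * sum((m - grand_mean) ** 2 for m in subject_means)
ss_b = a * sum((m - grand_mean) ** 2 for m in level_means)
ss_resid = ss_total - ss_a - ss_b

# SS_T = df_T * s_T^2, so the total standard deviation can be recovered.
df_total = a * b - 1
s_total = (ss_total / df_total) ** 0.5

print(f"SS_T = {ss_total:.3e}, SS_A = {ss_a:.3e}")
print(f"SS_B = {ss_b:.3e}, SS_resid = {ss_resid:.3e}")
print(f"s_T  = {s_total:.6f}")
```

With these nine scores, SS_resid should come out at roughly 9.66 × 10^-5, which is the Error(noise) sum of squares reported in the SPSS output.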


Degrees of Freedom Tree

\[ df_T = N_T - 1 \]

    \[ df_A = a - 1 \]

    \[ df_{within\text{-}S} = a(b - 1) \]

        \[ df_B = b - 1 \]

        \[ df_{resid} = df_A \times df_B \]


Test Statistic

\[ MS_B = \frac{SS_B}{df_B} \]

\[ MS_{resid} = \frac{SS_{resid}}{df_{resid}} \]

\[ F = \frac{MS_B}{MS_{resid}} \]


Step 1. State the Hypothesis

• Same as for 1-way independent ANOVA:

\[ H_0: \mu_1 = \mu_2 = \mu_3 \]

\[ H_a: \text{at least 2 means differ} \]


Step 2. Select Statistical Test and Significance Level

• As usual (here, a one-way repeated measures ANOVA at the .05 significance level).


Step 3. Select Samples and Collect Data

• Ideally, randomly sample

• More probably, random assignment


Step 4. Find Region of Rejection

\[ df_T = N_T - 1 \]

    \[ df_A = a - 1 \]

    \[ df_{within\text{-}S} = a(b - 1) \]

        \[ df_B = b - 1 = 2 \]

        \[ df_{resid} = df_A \times df_B = 2 \times 2 = 4 \]
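
The degrees of freedom and the resulting critical value can be sketched in a few lines of Python. This sketch assumes a significance level of .05 and relies on a closed form for the F distribution that holds only when the numerator has 2 degrees of freedom (three levels of the independent variable); the function names are illustrative, and for other designs an F table or a statistics library would be needed.

```python
a = 3  # number of subjects
b = 3  # number of noise levels

# Degrees of freedom tree.
df_total = a * b - 1        # N_T - 1
df_a = a - 1                # between subjects
df_within = a * (b - 1)     # within subjects
df_b = b - 1                # treatment (noise)
df_resid = df_a * df_b      # residual (subject x noise)

alpha = 0.05  # assumed significance level


def f_upper_tail_2(f, df2):
    """P(F > f) for an F distribution with (2, df2) degrees of freedom."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


def f_critical_2(alpha, df2):
    """Critical value of F(2, df2), obtained by inverting f_upper_tail_2."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


f_crit = f_critical_2(alpha, df_resid)
print(f"df_B = {df_b}, df_resid = {df_resid}, F_crit = {f_crit:.3f}")
```

Under these assumptions the region of rejection is F greater than about 6.94.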


Step 5. Calculate the Test Statistic

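\[ SS_T = SS_A + SS_B + SS_{AB} + SS_{err} \]

\[ SS_A = b \sum_i (\bar{X}_i - \bar{X})^2 \]

\[ SS_B = a \sum_j (\bar{X}_j - \bar{X})^2 \]

\[ SS_{resid} = SS_{AB} + SS_{err} = SS_T - SS_A - SS_B \]

Let Factor A represent Subject.

Let Factor B represent the within-subjects independent variable.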


Step 5. Calculate the Test Statistic

\[ MS_B = \frac{SS_B}{df_B} \]

\[ MS_{resid} = \frac{SS_{resid}}{df_{resid}} \]

\[ F = \frac{MS_B}{MS_{resid}} \]
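
Putting the pieces of Step 5 together, a minimal Python sketch of the whole calculation might look like the following. The function name `one_way_rm_anova` and the dictionary keys are hypothetical, chosen for illustration. The p-value line uses a closed form that is valid only when df_B = 2, as in this example.

```python
def one_way_rm_anova(scores):
    """One-way repeated measures ANOVA for a subjects x levels table."""
    a = len(scores)      # subjects (Factor A)
    b = len(scores[0])   # levels of the within-subjects factor (Factor B)
    grand = sum(sum(row) for row in scores) / (a * b)
    subject_means = [sum(row) / b for row in scores]
    level_means = [sum(col) / a for col in zip(*scores)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_a = b * sum((m - grand) ** 2 for m in subject_means)
    ss_b = a * sum((m - grand) ** 2 for m in level_means)
    ss_resid = ss_total - ss_a - ss_b

    df_b = b - 1
    df_resid = (a - 1) * (b - 1)
    ms_b = ss_b / df_b
    ms_resid = ss_resid / df_resid
    return {"df_B": df_b, "df_resid": df_resid,
            "MS_B": ms_b, "MS_resid": ms_resid, "F": ms_b / ms_resid}


scores = [
    [0.0848, 0.0683, 0.0654],  # subject 1
    [0.0829, 0.0609, 0.0761],  # subject 2
    [0.0888, 0.0644, 0.0712],  # subject 3
]

result = one_way_rm_anova(scores)

# Upper-tail probability of F(2, df_resid); closed form for df_B == 2 only.
p_value = (1 + 2 * result["F"] / result["df_resid"]) ** (-result["df_resid"] / 2)

print(f"F({result['df_B']},{result['df_resid']}) = {result['F']:.3f}, p = {p_value:.3f}")
```

For the example data this gives an F of about 14.36 on (2, 4) degrees of freedom, which exceeds the .05 critical value of about 6.94 and so falls in the region of rejection when sphericity is assumed.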


Step 6. Make the Statistical Decisions


SPSS Output

Tests of Within-Subjects Effects

Measure: MEASURE_1

Source                             Type III Sum of Squares   df      Mean Square   F        Sig.
noise         Sphericity Assumed   .001                      2       .000          14.355   .015
              Greenhouse-Geisser   .001                      1.237   .001          14.355   .044
              Huynh-Feldt          .001                      2.000   .000          14.355   .015
              Lower-bound          .001                      1.000   .001          14.355   .063
Error(noise)  Sphericity Assumed   9.66E-005                 4       2.41E-005
              Greenhouse-Geisser   9.66E-005                 2.473   3.91E-005
              Huynh-Feldt          9.66E-005                 4.000   2.41E-005
              Lower-bound          9.66E-005                 2.000   4.83E-005

The sum of squares in the noise rows is \( SS_B \); the sum of squares in the Error(noise) rows is \( SS_{resid} \).


Assumptions

• Independent random sampling

• Multivariate normal distribution

• Homogeneity of variance (not a huge concern, since there is the same number of observations at each treatment level).

• Sphericity (new).


Homogeneity of Variance

• Homogeneity of Variance is the property that the variance in the dependent variable is the same at each level of the independent variable.

– In the context of RM ANOVA, this means that the variance between subjects is the same at each level of the independent variable.

– Since RM ANOVA designs are balanced by default, homogeneity of variance is not a critical issue.


Homogeneity of Variance

              Noise
Subject     0.04        0.14        0.5
1           0.0848      0.0683      0.0654
2           0.0829      0.0609      0.0761
3           0.0888      0.0644      0.0712
Variance    9.07E-06    1.37E-05    2.87E-05
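
The variance row in this table can be reproduced with a short Python sketch. It assumes the sample variance (dividing by n - 1), which is what `statistics.variance` computes; the list names are illustrative.

```python
from statistics import variance

noise_levels = [0.04, 0.14, 0.5]

# Rows are subjects 1-3, columns follow noise_levels.
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]

# Between-subject sample variance at each noise level.
level_variances = [variance(col) for col in zip(*scores)]

for noise, var in zip(noise_levels, level_variances):
    print(f"noise = {noise}: variance = {var:.2e}")

# Ratio of largest to smallest variance, as a rough gauge of heterogeneity.
variance_ratio = max(level_variances) / min(level_variances)
print(f"largest / smallest = {variance_ratio:.2f}")
```

The largest variance is roughly three times the smallest here, although with only three subjects per level such ratios are very unstable.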


Sphericity

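• Sphericity is the property that the degree of interaction between any two different levels of the independent variable is the same: the variance of the difference scores is equal for every pair of levels. (Equal covariance between all pairs of levels, together with homogeneity of variance, is sufficient for this.)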

• Sphericity is critical for RM ANOVA because the error term is the average of the pairwise interactions.

• Violations generally lead to inflated F statistics (and hence inflated Type I error).
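
One informal way to inspect sphericity is to compare the variances of the difference scores for each pair of levels. The Python sketch below does this for the grating-detection data, reading sphericity as equal variance of the pairwise differences (an assumption of this sketch). It also illustrates why the error term is described as an average of the pairwise interactions: half the mean of these variances equals MS_resid.

```python
from itertools import combinations
from statistics import mean, variance

# Rows are subjects 1-3, columns are noise levels 1-3.
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
b = len(scores[0])

# Variance of the difference scores for each pair of levels.
diff_variances = {}
for i, j in combinations(range(b), 2):
    diffs = [row[i] - row[j] for row in scores]
    diff_variances[(i + 1, j + 1)] = variance(diffs)

for pair, var in diff_variances.items():
    print(f"levels {pair}: variance of differences = {var:.3e}")

# The RM ANOVA error term is half the average of these variances.
ms_resid_from_diffs = mean(diff_variances.values()) / 2
print(f"MS_resid = {ms_resid_from_diffs:.3e}")
```

In this example the three variances differ by a factor of about five, which hints that sphericity may not hold, though three subjects are far too few to judge.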


Sphericity Does Not Hold

[Figure: three scatterplots of paired scores: Xi1 and Xi2, Xi1 and Xi3, Xi2 and Xi3.]


Sphericity Does Hold

[Figure: three scatterplots of paired scores: Xi1 and Xi2, Xi1 and Xi3, Xi2 and Xi3.]


Sphericity

• Does sphericity appear to hold?

• Do these graphs suggest that the RM design will yield a large increase in statistical power?

[Figure: three scatterplots of the subjects' thresholds at pairs of noise levels: Noise = .15 against Noise = .04 (\( s_{XY} = 3.2 \times 10^{-6} \)), Noise = .50 against Noise = .04 (\( s_{XY} = -4.3 \times 10^{-6} \)), and Noise = .50 against Noise = .14 (\( s_{XY} = -2.0 \times 10^{-5} \)).]
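
The pairwise covariances shown with these scatterplots can be recomputed from the data table. The sketch below is plain Python with a hand-written sample covariance (dividing by n - 1); the helper name `sample_covariance` is illustrative.

```python
from itertools import combinations


def sample_covariance(x, y):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


# Rows are subjects 1-3, columns are noise levels 1-3.
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
columns = list(zip(*scores))

covariances = {
    (i + 1, j + 1): sample_covariance(columns[i], columns[j])
    for i, j in combinations(range(len(columns)), 2)
}

for pair, cov in covariances.items():
    print(f"levels {pair}: s_XY = {cov:.2e}")
```

Two of the three covariances come out negative and none is large relative to the variances, so these data give little sign of the strong positive correlation between levels that makes a repeated measures design much more powerful than an independent-groups design.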


Testing Sphericity

• Mauchly (1940) test: provided automatically by SPSS

– Test has low power (for small samples, likely to accept sphericity assumption when it is false).
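
For readers curious about what SPSS computes, here is a hedged Python sketch of Mauchly's W and its chi-square approximation as they are commonly stated. It is written for exactly three levels: it uses a 2 × 2 determinant, one arbitrary choice of orthonormal contrasts, and a closed-form chi-square tail that holds only for 2 degrees of freedom. All names are illustrative.

```python
import math

# Rows are subjects 1-3, columns are noise levels 1-3.
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
n = len(scores)      # subjects
k = len(scores[0])   # levels
p = k - 1            # number of orthonormal contrasts

# One possible set of orthonormal contrasts for k = 3.
contrasts = [
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],
    [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)],
]

# Transform each subject's scores into contrast scores.
transformed = [[sum(c * x for c, x in zip(con, row)) for con in contrasts]
               for row in scores]

# Sample covariance matrix of the contrast scores (2 x 2).
means = [sum(col) / n for col in zip(*transformed)]
m = [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in transformed) / (n - 1)
      for j in range(p)] for i in range(p)]

# Mauchly's W: determinant relative to the (mean eigenvalue)^p.
det_m = m[0][0] * m[1][1] - m[0][1] * m[1][0]
trace_m = m[0][0] + m[1][1]
w = det_m / (trace_m / p) ** p

# Chi-square approximation and its degrees of freedom.
chi_sq = -((n - 1) - (2 * p ** 2 + p + 2) / (6 * p)) * math.log(w)
df = p * (p + 1) // 2 - 1

# Upper tail of chi-square with 2 df is exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print(f"W = {w:.3f}, chi-square = {chi_sq:.3f}, df = {df}, p = {p_value:.3f}")
```

A W near 1 is consistent with sphericity, while a W near 0 indicates a violation. With only three subjects the test has almost no power, so a non-significant result says little.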


Alternative: Assume the Worst! (Total Lack of Sphericity)

• Conservative Geisser-Greenhouse F Test (1958)

– Provides a means for calculating a correct critical F value under the assumption of a complete lack of sphericity (lower bound):

\[ F_{crit} = F(1, df_A) \]

where

\[ df_A = a - 1 = \text{number of subjects} - 1 \]
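
To see how much more demanding the lower-bound test is, the sketch below compares the two critical values for the grating example. It assumes a significance level of .05 and takes the observed F of 14.355 from the SPSS output. Both closed forms are special cases, valid only for F(2, df2) and F(1, 2) respectively, so this Python sketch applies only to a design with three subjects and three levels.

```python
import math

alpha = 0.05       # assumed significance level
f_observed = 14.355  # F from the SPSS output for the example


def f_critical_2(alpha, df2):
    """Critical value of F(2, df2); closed form for 2 numerator df."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


def f_upper_tail_1_2(f):
    """P(F > f) for F(1, 2), via the t distribution with 2 df."""
    return 1 - math.sqrt(f / (f + 2))


def f_critical_1_2(alpha):
    """Critical value of F(1, 2), obtained by inverting f_upper_tail_1_2."""
    q = (1 - alpha) ** 2
    return 2 * q / (1 - q)


f_crit_sphericity = f_critical_2(alpha, 4)  # F(df_B, df_resid) = F(2, 4)
f_crit_lower_bound = f_critical_1_2(alpha)  # F(1, df_A) = F(1, 2)
p_lower_bound = f_upper_tail_1_2(f_observed)

print(f"F_crit assuming sphericity: {f_crit_sphericity:.2f}")
print(f"F_crit lower bound:         {f_crit_lower_bound:.2f}")
print(f"p for F = {f_observed} under the lower bound: {p_lower_bound:.3f}")
```

The observed F lies between the two critical values (about 6.94 and 18.51), so the conservative test alone does not settle the decision for this example.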


Estimating Sphericity

• What if your F statistic falls between the 2 critical values (assuming sphericity or assuming total lack of sphericity)?

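\[ F_{crit}(df_B,\, df_A \times df_B) < F < F_{crit}(1,\, df_A) \]

• Solution: estimate sphericity, and use estimate to adjust critical value.

\[ \text{Sphericity parameter } \varepsilon: \quad \frac{1}{df_B} \le \varepsilon \le 1 \]

\[ F_{crit}(df_B,\, df_A \times df_B) \rightarrow F_{crit}(\varepsilon\, df_B,\, \varepsilon\, df_A \times df_B) \]

• Two different methods for calculating \( \varepsilon \):

– Greenhouse and Geisser (1959)

– Huynh and Feldt (1976) – less conservative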
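
As an illustration of the first method, the Greenhouse-Geisser estimate can be computed from the sample covariance matrix of the levels. The Python sketch below uses the commonly cited formula based on the diagonal, row, and grand means of that matrix; the variable names are illustrative, and the Huynh-Feldt adjustment is not shown.

```python
# Rows are subjects 1-3, columns are noise levels 1-3.
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
n = len(scores)      # subjects
k = len(scores[0])   # levels

# Sample covariance matrix of the k levels.
level_means = [sum(col) / n for col in zip(*scores)]
cov = [[sum((row[i] - level_means[i]) * (row[j] - level_means[j])
            for row in scores) / (n - 1)
        for j in range(k)] for i in range(k)]

mean_diag = sum(cov[i][i] for i in range(k)) / k   # mean variance
row_means = [sum(cov[i]) / k for i in range(k)]    # mean of each row
grand = sum(row_means) / k                         # mean of all entries
sum_sq = sum(cov[i][j] ** 2 for i in range(k) for j in range(k))

numerator = (k * (mean_diag - grand)) ** 2
denominator = (k - 1) * (sum_sq
                         - 2 * k * sum(r ** 2 for r in row_means)
                         + (k * grand) ** 2)
epsilon_gg = numerator / denominator

# Adjusted degrees of freedom for the F test.
df_b = k - 1
df_resid = (n - 1) * (k - 1)
adj_df_b = epsilon_gg * df_b
adj_df_resid = epsilon_gg * df_resid
lower_bound = 1 / df_b

print(f"epsilon (G-G) = {epsilon_gg:.3f}, lower bound = {lower_bound:.3f}")
print(f"adjusted df = ({adj_df_b:.3f}, {adj_df_resid:.3f})")
```

For the example data the estimate is about .618, giving adjusted degrees of freedom of roughly 1.24 and 2.47.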


SPSS Output

Mauchly's Test of Sphericity

Measure: MEASURE_1

                                                                      Epsilon(a)
Within Subjects Effect   Mauchly's W   Approx. Chi-Square   df   Sig.   Greenhouse-Geisser   Huynh-Feldt   Lower-bound
noise                    .383          .960                 2    .619   .618                 1.000         .500

Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.

a. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.

End of Lecture 17


Multivariate Approach to Repeated Measures

• Based on forming difference scores for each pair of levels of the independent variable.

– e.g., for our 3-level example, there are 3 pairs

• Each pair of difference scores is treated as a different dependent variable in a MANOVA.

• Sphericity does not need to be assumed.

• When all assumptions of the repeated measures ANOVA are met, ANOVA is usually more powerful than MANOVA (especially for small samples).

• Thus the multivariate approach should be considered only if there is doubt about the sphericity assumption.

• When sphericity does not apply, MANOVA can be much more powerful for large samples.


Post-Hoc Comparisons

• If very confident about sphericity, use standard methods (e.g., Fisher’s LSD, Tukey’s HSD), with MSresid as error term.

• Otherwise, use conservative approach: Bonferroni test.

– Error term calculated separately for each comparison, using only the data from the two levels.

– This means that sphericity need not be assumed.

Pairwise Comparisons

Measure: MEASURE_1

                                                                    95% Confidence Interval for Difference(a)
(I) noise   (J) noise   Mean Difference (I-J)   Std. Error   Sig.(a)   Lower Bound   Upper Bound
1           2            .021*                  .002         .037       .003          .039
            3            .015                   .004         .197      -.015          .045
2           1           -.021*                  .002         .037      -.039         -.003
            3           -.006                   .005         1.000     -.046          .034
3           1           -.015                   .004         .197      -.045          .015
            2            .006                   .005         1.000     -.034          .046

Based on estimated marginal means

*. The mean difference is significant at the .05 level.

a. Adjustment for multiple comparisons: Bonferroni.
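
The Bonferroni comparisons in this table can be approximated by running a paired t-test on each pair of levels and multiplying each p-value by the number of comparisons. The following Python sketch does so under one strong assumption: it uses a closed form for the t distribution that holds only for 2 degrees of freedom, that is, for exactly three subjects. The helper name is illustrative.

```python
import math
from itertools import combinations
from statistics import mean, stdev

# Rows are subjects 1-3, columns are noise levels 1-3.
scores = [
    [0.0848, 0.0683, 0.0654],
    [0.0829, 0.0609, 0.0761],
    [0.0888, 0.0644, 0.0712],
]
n = len(scores)


def two_tailed_p_t_df2(t):
    """Two-tailed p-value for a t statistic with 2 degrees of freedom."""
    return 1 - abs(t) / math.sqrt(2 + t * t)


pairs = list(combinations(range(len(scores[0])), 2))
results = {}
for i, j in pairs:
    # The error term uses only the data from the two levels compared.
    diffs = [row[i] - row[j] for row in scores]
    mean_diff = mean(diffs)
    std_error = stdev(diffs) / math.sqrt(n)
    t = mean_diff / std_error
    # Bonferroni: multiply by the number of comparisons, cap at 1.
    p_adjusted = min(1.0, len(pairs) * two_tailed_p_t_df2(t))
    results[(i + 1, j + 1)] = (mean_diff, std_error, p_adjusted)

for pair, (md, se, p_adj) in results.items():
    print(f"{pair}: diff = {md:.3f}, SE = {se:.3f}, p = {p_adj:.3f}")
```

Because each comparison uses its own difference scores, no assumption about sphericity enters the calculation.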


Reporting the Result

• One-way repeated measures ANOVA reveals a significant effect of noise contrast on the signal-to-noise ratio at threshold (F[2,4]=14.4, p=.044, ε=0.618). Post-hoc pairwise Bonferroni-corrected comparisons reveal that signal-to-noise ratio at threshold was higher at 4.8% noise contrast than at 14.3% noise contrast (p=.037). No other significant pairwise differences were found (p>.05).


Varieties of Repeated-Measures and Randomized-Blocks Designs

• Simultaneous RM Design

– e.g., subject rates different aspects of stimulus on comparable rating scale.

• Successive RM Design

– Here counterbalancing becomes important.

• RM Over Time

– Track a dependent variable over time (e.g., learning effects).

– Not likely to satisfy sphericity (scores taken closer in time will have higher covariance).

• RM with Quantitative Levels

– Think about regression first.

• Randomized Blocks

– Matched design useful if you cannot avoid serious carryover effects.

• Natural Blocks

– Blocks of subjects are naturally occurring (e.g., children in same family).