Lecture 9: One Way ANOVA Between Subjects Laura McAvinue School of Psychology Trinity College Dublin


Analysis of Variance

• A statistical technique for testing for differences between the means of several groups
– One of the most widely used statistical tests

• T-Test
– Compare the means of two groups
– Independent samples
– Paired samples

• ANOVA
– No restriction on the number of groups

T-test

Group 1 and Group 2 (each with its own mean)

Is the mean of one group significantly different to the mean of the other group?

• t-test: H0: µ1 = µ2; H1: µ1 ≠ µ2

F-test

Group 1, Group 2 and Group 3 (each with its own mean)

Is the mean of one group significantly different to the means of the other groups?

Analysis of Variance

• One way ANOVA: one independent variable
• Factorial ANOVA: more than one independent variable (two way, three way, four way)
• Between subjects: different participants
• Repeated measures / within subjects: same participants

A few examples…

• Between subjects one way ANOVA
– The effect of one independent variable with three or more levels on a dependent variable

• What are the independent & dependent variables in each of the following studies?
– The effect of three drugs on reaction time
– The effect of five styles of teaching on exam results
– The effect of age (old, middle, young) on recall
– The effect of gender (male, female) on hostility

Rationale

• Let’s say you have three groups and you want to see if they are significantly different…

• Recall inferential statistics
– Sample → Population

• Your question:
– Are these 3 groups representative of the same population or of different populations?

Population → draw 3 samples (1, 2, 3)

Manipulate the samples: Drug 1, Drug 2, Drug 3

Measure effect of manipulation on a DV: µ1, µ2, µ3

Did the manipulation alter the samples to such an extent that they now represent different populations?

Recall sampling error & the sampling distribution of the mean…

The means of samples drawn from the same population will differ a little due to random sampling error
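A short simulation can illustrate sampling error. The sketch below is not from the lecture: the population (reaction times with mean 500 and SD 50), the sample size of 20 and the random seed are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 reaction times (mean about 500, SD about 50).
population = [random.gauss(500, 50) for _ in range(10_000)]

# Draw three random samples of 20 from the SAME population.
samples = [random.sample(population, 20) for _ in range(3)]
sample_means = [statistics.mean(s) for s in samples]

print(f"Population mean: {statistics.mean(population):.1f}")
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean: {m:.1f}")
```

The three sample means differ from one another even though no manipulation took place, which is the random sampling error that ANOVA has to allow for.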

When comparing the means of a number of groups, your task …

• Difference due to a true difference between the samples (representative of different populations)?

• Difference due to random sampling error (representative of the same population)?

If a true difference exists, this is due to your manipulation, the independent variable

Steps of NHST

1. Specify the alternative / research hypothesis

At least one mean is significantly different from the others

At least one group is representative of a separate population

2. Set up the null hypothesis

The hypothesis that all population means are equal

All groups are representative of the same population

Omnibus Ho: µ1 = µ2 = µ3

Steps of NHST

3. Collect your data

4. Run the appropriate statistical test: between subjects one way ANOVA

5. Obtain the test statistic & associated p-value

F statistic

Compare the F statistic you obtained with the distribution of F when Ho is true

Determine the probability of obtaining such an F value when Ho is true

Steps of NHST

6. Decide whether to reject or fail to reject Ho on the basis of the p value

If the p value is very small (< .05), reject Ho…

Conclude that at least one sample mean is significantly different to the other means…

Not all groups are representative of the same population

How is ANOVA done?

Assume Ho is true

Assume that all three groups are representative of the same population

Make two estimates of the variance of this population

If Ho is true, then these two estimates should be about the same

If Ho is false, these two estimates should be different
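The logic of the two variance estimates can be checked by simulation. This is a minimal sketch under stated assumptions, not part of the lecture: the populations are normal with SD 10, each group has 20 participants, and the function names `variance_ratio` and `draw_groups` are made up for illustration.

```python
import random
from statistics import mean, variance

random.seed(2)  # fixed seed so the sketch is repeatable

def variance_ratio(groups):
    """Between-groups estimate divided by within-groups estimate (equal group sizes)."""
    n = len(groups[0])
    between = n * variance([mean(g) for g in groups])  # based on the group means
    within = mean([variance(g) for g in groups])       # pooled within-group variance
    return between / within

def draw_groups(population_means, n=20, sd=10):
    """Draw one sample of size n per population mean (hypothetical normal populations)."""
    return [[random.gauss(mu, sd) for _ in range(n)] for mu in population_means]

# Ho true: all three samples come from the same population.
ratios_h0_true = [variance_ratio(draw_groups([50, 50, 50])) for _ in range(1000)]
# Ho false: the three populations have different means.
ratios_h0_false = [variance_ratio(draw_groups([50, 55, 60])) for _ in range(1000)]

mean_ratio_h0_true = mean(ratios_h0_true)
mean_ratio_h0_false = mean(ratios_h0_false)
print(f"Average ratio when Ho is true:  {mean_ratio_h0_true:.2f}")
print(f"Average ratio when Ho is false: {mean_ratio_h0_false:.2f}")
```

On average the ratio sits close to 1 when Ho is true and well above 1 when it is false, although any single ratio can stray from 1 by chance.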

Two estimates of population variance

• Within group variance
– Pooled variability among participants in each treatment group

• Between group variance
– Variability among group means

If Ho is true…

Between Groups Variance / Within Groups Variance ≈ 1

If Ho is false…

Between Groups Variance / Within Groups Variance > 1

Calculations

Steps…

1: Sum of squares
2: Degrees of freedom
3: Mean square
4: F ratio
5: p value

Total variance in data (SStotal) = Between groups variance (SSbetween) + Within groups variance (SSwithin)

SStotal

• ∑ (xij − Grand Mean)²

• Based on the difference between each score and the grand mean

• The sum of squared deviations of all observations, regardless of group membership, from the grand mean

SSbetween

• n ∑ (Group Meanj − Grand Mean)²

• Based on the differences between groups

• Related to the variance of the group means

• The sum of squared deviations of the group means from the grand mean, multiplied by the number of observations in each group

SSwithin

• ∑ (xij − Group Meanj)²

• Based on the variability within each group

• Calculate SS within each group & add

• The sum of squared deviations within each group … or …

• SStotal - SSbetween
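The three sums of squares can be worked through on a small data set. The scores below are invented for illustration (three hypothetical drug groups with four participants each); the sketch assumes equal group sizes, as the formulas above do.

```python
from statistics import mean

# Hypothetical scores: three drug groups, n = 4 participants each.
groups = {
    "Drug 1": [4, 5, 6, 5],
    "Drug 2": [6, 7, 8, 7],
    "Drug 3": [9, 10, 11, 10],
}
n = 4
all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# SStotal: every score against the grand mean, ignoring group membership.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
# SSbetween: each group mean against the grand mean, times n.
ss_between = n * sum((mean(scores) - grand_mean) ** 2 for scores in groups.values())
# SSwithin: every score against its own group mean.
ss_within = sum((x - mean(scores)) ** 2 for scores in groups.values() for x in scores)

print(f"SStotal   = {ss_total:.3f}")
print(f"SSbetween = {ss_between:.3f}")
print(f"SSwithin  = {ss_within:.3f}")
print(f"SSbetween + SSwithin = {ss_between + ss_within:.3f}")
```

For these scores SSbetween (about 50.667) and SSwithin (6.000) add up to SStotal (about 56.667), which is why SSwithin can also be obtained by subtraction.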

Degrees of Freedom

• Total variance
– N − 1
– Total no. of observations − 1

• Between groups variance
– k − 1
– No. of groups − 1

• Within groups variance
– k(n − 1)
– No. of groups × (no. in each sample − 1)
– What's left over!
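As a quick check of the three formulas, the sketch below computes the degrees of freedom for k groups of n observations each. The function name `degrees_of_freedom` and the example values (k = 3, n = 4) are assumptions for illustration.

```python
def degrees_of_freedom(k, n):
    """Return (df_total, df_between, df_within) for k groups of n observations each."""
    total_n = k * n               # N: total number of observations
    df_total = total_n - 1        # N - 1
    df_between = k - 1            # number of groups - 1
    df_within = k * (n - 1)       # number of groups x (number in each sample - 1)
    return df_total, df_between, df_within

df_total, df_between, df_within = degrees_of_freedom(k=3, n=4)
print(df_total, df_between, df_within)  # prints: 11 2 9
```

The within groups value is indeed "what's left over": df_total minus df_between.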

Mean Square

• SS / df

• The average variance between or within groups

• An estimate of the population variance

• MSbetween

• SSbetween / dfbetween

• MSwithin

• SSwithin / dfwithin

F Ratio

F = MSbetween / MSwithin

If Ho is true, F ≈ 1

If Ho is false, F > 1
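The mean squares and the F ratio follow directly from the sums of squares and degrees of freedom. In this sketch the function name `mean_squares_and_f` and the input values are illustrative assumptions (sums of squares for three hypothetical groups of four scores).

```python
def mean_squares_and_f(ss_between, ss_within, k, n):
    """Return (MSbetween, MSwithin, F) for k groups of n observations each."""
    ms_between = ss_between / (k - 1)        # SSbetween / dfbetween
    ms_within = ss_within / (k * (n - 1))    # SSwithin / dfwithin
    return ms_between, ms_within, ms_between / ms_within

# Illustrative values only: three groups of four scores.
ms_between, ms_within, f_ratio = mean_squares_and_f(
    ss_between=50.667, ss_within=6.0, k=3, n=4
)
print(f"MSbetween = {ms_between:.3f}, MSwithin = {ms_within:.3f}, F = {f_ratio:.2f}")
```

Here F is about 38, far above 1, so the between groups estimate of the population variance is much larger than the within groups estimate.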

F = MSbetween / MSwithin = (Treatment effect + Differences due to chance) / (Differences due to chance)

If treatment has no effect…

F = (0 + Differences due to chance) / (Differences due to chance) = 1

If treatment has effect…

F = (Effect > 0 + Differences due to chance) / (Differences due to chance) > 1

F = MSBG / MSWG

Variance within groups > variance between groups: F < 1, fail to reject Ho

If there is more variance within the groups, then any difference observed is due to chance

Variance within groups = variance between groups: F = 1, fail to reject Ho

If both sources of variance are the same, then any difference observed is due to chance

Variance within groups < variance between groups: F > 1, reject Ho

The more the group means differ relative to each other, the more likely it is that the differences are not due to chance.

Size of F

• How much greater than 1 does F have to be to reject Ho?

• Compare the obtained F statistic to the distribution of F when Ho is true

• Calculate the probability of obtaining this F value when Ho is true

• p value

• If p < .05, reject Ho

• Conclude that at least one of your groups is significantly different from the others
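One way to see what "the distribution of F when Ho is true" means is to build it by shuffling. Note that this swaps in a permutation test rather than the theoretical F distribution that statistical software uses: if Ho is true the group labels are arbitrary, so reshuffling scores across groups shows which F values chance alone produces. The data, the seed and the function names are assumptions for illustration.

```python
import random
from statistics import mean, variance

def f_ratio(groups):
    """F = MSbetween / MSwithin for equal-sized groups."""
    n = len(groups[0])
    ms_between = n * variance([mean(g) for g in groups])
    ms_within = mean([variance(g) for g in groups])
    return ms_between / ms_within

def permutation_p_value(groups, n_shuffles=2000, seed=0):
    """Approximate p: share of label shuffles with F at least as large as observed."""
    rng = random.Random(seed)
    observed = f_ratio(groups)
    pooled = [x for g in groups for x in g]
    n = len(groups[0])
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign scores to groups at random
        shuffled = [pooled[i:i + n] for i in range(0, len(pooled), n)]
        if f_ratio(shuffled) >= observed:
            at_least_as_large += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return observed, (at_least_as_large + 1) / (n_shuffles + 1)

# Hypothetical scores for three drug groups.
data = [[4, 5, 6, 5], [6, 7, 8, 7], [9, 10, 11, 10]]
observed_f, p_value = permutation_p_value(data)
print(f"Observed F = {observed_f:.2f}, approximate p = {p_value:.4f}")
print("Reject Ho" if p_value < 0.05 else "Fail to reject Ho")
```

With these well separated groups almost no shuffle produces an F as large as the observed one, so p falls below .05 and Ho is rejected.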

ANOVA table

Source of variation | SS | df | MS | F | p
Between groups | n ∑ (Group Meanj − Grand Mean)² | k − 1 | SSBG / dfBG | MSBetween / MSWithin | Prob. of observing F-value when Ho is true
Within groups | ∑ (xij − Group Meanj)² | k(n − 1) | SSWG / dfWG | |
Total | ∑ (xij − Grand Mean)² | N − 1 | | |
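The table can be filled in for a small data set. This is a sketch under stated assumptions: the scores are invented, the function name `anova_table_three_groups` is made up, and the p value uses a closed form that holds only when there are exactly three groups (2 numerator degrees of freedom), where P(F ≥ f) = (1 + 2f / dfWG)^(−dfWG / 2). For other designs a statistics package would be needed.

```python
from statistics import mean

def anova_table_three_groups(groups):
    """Build the one way ANOVA table for exactly three equal-sized groups."""
    if len(groups) != 3 or len({len(g) for g in groups}) != 1:
        raise ValueError("this sketch handles exactly three equal-sized groups")
    k, n = 3, len(groups[0])
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_bg = n * sum((mean(g) - grand_mean) ** 2 for g in groups)
    ss_wg = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    df_bg, df_wg = k - 1, k * (n - 1)
    ms_bg, ms_wg = ss_bg / df_bg, ss_wg / df_wg
    f_stat = ms_bg / ms_wg
    # Upper tail of the F distribution; valid only for 2 numerator df (k = 3).
    p_value = (1 + 2 * f_stat / df_wg) ** (-df_wg / 2)
    return {
        "Between groups": (ss_bg, df_bg, ms_bg, f_stat, p_value),
        "Within groups": (ss_wg, df_wg, ms_wg, None, None),
        "Total": (ss_total, k * n - 1, None, None, None),
    }

def fmt(value, spec):
    """Format a number, or return an empty cell when the value is missing."""
    return format(value, spec) if value is not None else ""

# Hypothetical scores for three drug groups.
table = anova_table_three_groups([[4, 5, 6, 5], [6, 7, 8, 7], [9, 10, 11, 10]])
print(f"{'Source':<16}{'SS':>10}{'df':>5}{'MS':>10}{'F':>10}{'p':>10}")
for source, (ss, df, ms, f_stat, p) in table.items():
    print(f"{source:<16}{fmt(ss, '.3f'):>10}{df:>5}{fmt(ms, '.3f'):>10}"
          f"{fmt(f_stat, '.2f'):>10}{fmt(p, '.5f'):>10}")
```

Only the between groups row carries an F and a p value, matching the layout of the table above.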

A few assumptions…

• Data in each group should be…

• Interval scale

• Normally distributed
– Histograms, box plots

• Homogeneity of variance
– Variance of groups should be roughly equal

• Independence of observations
– Each person should be in only one group
– Participants should be randomly assigned to groups

Multiple Comparison Procedures

• Obtain a significant F statistic

• Reject Ho & conclude that at least one sample mean is significantly different from the others

• But which one?
– H1: µ1 ≠ µ2 ≠ µ3
– H2: µ1 = µ2 ≠ µ3
– H3: µ1 ≠ µ2 = µ3

• Necessary to run a series of multiple comparisons to compare groups and see where the significant differences lie

Problem with Multiple Comparisons

• Making multiple comparisons leads to a higher probability of making a Type I error

• The more comparisons you make, the higher the probability of making a Type I error

• Familywise error rate
– The probability that a family of comparisons contains at least one Type I error

Problem with Multiple Comparisons

αfamilywise = 1 − (1 − α)^c

c = number of comparisons

– Four comparisons run at α = .05

αfamilywise = 1 − (1 − .05)^4 = 1 − .8145 = .19

– You think you are working at α = .05, but you're actually working at α = .19
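The familywise formula is easy to tabulate for different numbers of comparisons. A minimal sketch; the function name `familywise_error_rate` and the choice of c values are assumptions for illustration.

```python
def familywise_error_rate(alpha, c):
    """Probability of at least one Type I error across c independent comparisons."""
    return 1 - (1 - alpha) ** c

for c in (1, 2, 4, 10):
    rate = familywise_error_rate(0.05, c)
    print(f"{c:>2} comparisons at alpha = .05 -> familywise rate = {rate:.3f}")
```

With four comparisons the rate is about .185, the .19 worked out above, and it keeps climbing as more comparisons are added.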

Post hoc tests

• Bonferroni Procedure
– α / c
– Divide your significance level by the number of comparisons you plan on making and use this more conservative value as your level of significance

• Four comparisons at α = .05
– .05 / 4 = .0125
– Reject Ho if p < .0125
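The Bonferroni rule can be applied mechanically to a set of comparison p values. In this sketch the four p values are invented for illustration and the function name `bonferroni_decisions` is an assumption.

```python
def bonferroni_decisions(p_values, alpha=0.05):
    """Return the corrected threshold and a reject/fail-to-reject flag per comparison."""
    threshold = alpha / len(p_values)  # alpha divided by the number of comparisons
    return threshold, [p < threshold for p in p_values]

# Hypothetical p values from four planned comparisons.
p_values = [0.001, 0.012, 0.030, 0.200]
threshold, reject = bonferroni_decisions(p_values)

print(f"Corrected significance level: {threshold}")
for p, r in zip(p_values, reject):
    print(f"p = {p:.3f}: {'reject Ho' if r else 'fail to reject Ho'}")
```

Note that p = .030 would count as significant at the uncorrected .05 level but not at the corrected .0125 level.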

Post hoc tests

• Note: Restrict the number of comparisons to the ones you are most interested in

• Tukey
– Compares each mean with each other mean in a way that keeps the maximum familywise error rate to .05
– Computes a single value that represents the minimum difference between group means that is necessary for significance

Effect Size

• A statistically significant difference might not mean anything in the real world

Eta squared

η² = SSbetween / SStotal

Percentage of variability among observations that can be attributed to the differences between the groups

Omega squared

A little less biased…

ω² = (SSbetween − (k − 1) MSwithin) / (SStotal + MSwithin)

How big is big? Similar to correlation coefficient

Cohen's d

When comparing two groups

d = (Meantreat − Meancontrol) / SDcontrol
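The three effect size measures can be computed in a few lines. This is a sketch with invented inputs: the sums of squares are illustrative values for three hypothetical groups of four scores, the treatment and control scores are made up, and the function names are assumptions. Cohen's d follows the formula above, dividing by the standard deviation of the control group.

```python
from statistics import mean, stdev

def eta_squared(ss_between, ss_total):
    """Share of total variability attributable to differences between groups."""
    return ss_between / ss_total

def omega_squared(ss_between, ss_total, ms_within, k):
    """Less biased alternative to eta squared; k is the number of groups."""
    return (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)

def cohens_d(treatment, control):
    """Difference between two group means in units of the control group SD."""
    return (mean(treatment) - mean(control)) / stdev(control)

# Illustrative values only.
eta2 = eta_squared(ss_between=50.667, ss_total=56.667)
omega2 = omega_squared(ss_between=50.667, ss_total=56.667, ms_within=0.667, k=3)
d = cohens_d(treatment=[9, 10, 11, 10], control=[4, 5, 6, 5])

print(f"Eta squared   = {eta2:.3f}")
print(f"Omega squared = {omega2:.3f}")
print(f"Cohen's d     = {d:.2f}")
```

Omega squared comes out slightly smaller than eta squared for the same data, reflecting its correction for bias.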
