Introduction to ANOVA Introduction to Statistics Chapter 13 Apr 13-15, 2010 Classes #23-24


Where are we?

Concluded material on t-tests & introduction of hypothesis testing

Good conceptual & computational foundation for more advanced inferential statistics

Turn now to ANOVA, a more complex statistic

Good preparation for more complex situations

A slightly different type of research design…

You have three different groups of people, and want to compare the outcomes of these three groups

What are you expecting?

1. you might be predicting specific differences (e.g., group A will have a higher score than both group B and group C)

2. you might be predicting a difference somewhere

What to do?

If you have a concrete idea of where the differences lie, based on theory and previous research, you can conduct planned comparisons

Directly test just for where you think the differences are

What to do?

If you think there’s a difference somewhere, but you want to be able to make all the possible comparisons to see where it might be, you can’t use this strategy

Why not?

Each test run at significance level α has a probability α of producing a statistically significant result just by chance when the null hypothesis is correct (a Type 1 error)

more tests, more likelihood of Type 1 error
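A rough sketch of how quickly this risk grows, assuming the tests are independent (pairwise comparisons on the same data are not strictly independent, so treat the numbers as an approximation). The function names are illustrative and not from the slides.

```python
from math import comb


def pairwise_comparisons(num_groups: int) -> int:
    """Number of distinct two-group comparisons among num_groups groups."""
    return comb(num_groups, 2)


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one Type 1 error across num_tests independent tests."""
    return 1 - (1 - alpha) ** num_tests


for k in (3, 4, 5):
    m = pairwise_comparisons(k)
    print(f"{k} groups -> {m} tests -> familywise error = {familywise_error_rate(m):.3f}")
# 3 groups -> 3 tests -> familywise error = 0.143
# 4 groups -> 6 tests -> familywise error = 0.265
# 5 groups -> 10 tests -> familywise error = 0.401
```

With only three groups, running every pairwise t-test at α = .05 already pushes the overall risk of a false positive to roughly 14%.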

What to do?

Need a new test, to make comparisons across all levels of predictor variable

ANOVA

Stands for analysis of variance

Just like t-tests, different types

Will discuss 3 types

First type of ANOVA

Comparable to independent samples t-test

Used with one predictor variable

Used with continuous criterion variable

Used with between-subjects design

The idea behind ANOVA

Key question = where does variability lie? Two sources:

Within people in each group or condition

Between groups or conditions

If research hypothesis is true, where will there be the most variability?

What if the null hypothesis is true?

Illustrate Logic of ANOVA

Group1 Group2 Group3 Group4

27 36 17 34

31 35 21 36

25 29 22 30

27 33 22 32

24 38 21 32

27 36 15 32

M = 26.8 M = 34.5 M = 19.7 M = 32.7

We want to evaluate the effects of 4 different drugs on participants’ level of depression as measured by the Beck Depression Inventory.

An ANOVA allows us to quantify how far apart the sample means must be before we are no longer willing to say they are all “approximately” equal.

Introduction to ANOVA ANOVA – the ANalysis Of Variance

(1) Inferential hypothesis-testing procedure

(2) Tremendous advantage over t-tests: used to compare MULTIPLE (two or more) treatments

(3) Provides researchers with much greater flexibility in design and analysis of experiments

Introduction to ANOVA ANOVA – the ANalysis Of Variance

(4) Multiple Forms: In Chapter 13 we’ll look at the simplest: Single-factor, independent measures ANOVA

(a) factor: new name for the independent variable

(b) independent measures: separate sample for each treatment

(c) level: the individual treatment conditions that make up a factor

Research Design for ANOVA

Factors and Levels

Can be multiple factors (IV’s) and levels (variations)

Expressed as factors x levels

How many factors? How many levels?

                           Therapist Experience
                           experienced (+)    inexperienced (-)
Treatment   treatment A    tx A + exp         tx A + inexp
            treatment B    tx B + exp         tx B + inexp

Example of ANOVA

Four different test times (8am, 12pm, 4pm, and 8pm)

Does time of test affect scores?

ANOVA uses variance to assess differences among the sample means

Tx1 Tx2 Tx3 Tx4

25 30 27 22

28 29 20 27

22 30 21 24

M = 25 M = 29.67 M = 22.67 M = 24.33

Variability Components for ANOVA

The Logic of ANOVA

(1) First, determine total variability for data set

Tx1 Tx2 Tx3 Tx4

25 30 27 22

28 29 20 27

22 30 21 24

M = 25 M = 29.67 M = 22.67 M = 24.33

The Logic of ANOVA

(2) Next, break this variability into two components:

(a) Between-Treatments variance: two sources

Treatment Effect: Differences are caused by treatments.

Chance: Differences simply due to chance.

(b) Within-Treatments variance: one source

Chance: Differences simply due to chance.

Partitioning variance

Math behind ANOVA = Variance between groups (MS between)

Divided by

Variance within groups (MS within, or MS error; like pooled variance from independent samples t-test)

This ratio referred to as F value

Forming an F-Ratio

(3) Finally, determine the variance due to the treatments alone by forming an F-Ratio.

F = Variance Between-Treatments / Variance Within-Treatments

Or in terms of sources…

F = (Treatment Effect + Differences due to Chance) / Differences due to Chance

If no treatment effect exists, F is expected to be about 1.00

If there IS some treatment effect, F is expected to be > 1.00 (but not automatically statistically significant)

Thinking about F

Variance can’t be negative, so F can’t be negative

If F is about 1, same amount of variance between groups as within groups --> keep null hypothesis

If F > 1, more variance between groups than within groups --> if F large enough, reject null

How big is big enough?

Just like all statistics, have critical F value

We will be using Table B.4 (page 590-592)

Size of critical F is dependent on:

Significance value (table uses either .01 or .05)

Whether one-tailed or two-tailed test

Number of groups comparing (numerator df = number of groups – 1)

Number of participants (denominator df = sum of df across all groups, or sample size – number of groups)

Error Term

Error due to chance

Does the treatment effect (difference among means) produce greater variability between groups than that expected by chance?

The denominator in the F ratio

The Structure of ANOVA Calculations

New Terms and Symbols

k = number of treatment conditions (levels of the factor). For independent-measures study, k = # of separate samples.

n = number of scores in a treatment condition

N = total number of scores in whole study (N = nk when every condition has the same n)

T = sum of scores for each treatment condition

G = sum of all scores in the study (Grand Total)

Hypothesis Testing with ANOVA (4 steps)

STEP 1: State the Hypothesis

H0: µ1 = µ2 = … = µk (k = number of factor levels)

H1: not all of the µ’s are equal

(At least one is different from the others)

Tx1 Tx2 Tx3 Tx4

25 30 27 22

28 29 20 27

22 30 21 24

M = 25 M = 29.67 M = 22.67 M = 24.33

Hypothesis Testing with ANOVA

STEP 2: Locate the Critical region

• α = .05

• Calculate dfbetween = k – 1

• Calculate dfwithin = N-k

• Calculate dftotal = N-1

• Critical F will be provided for you

• dfbetween + dfwithin = dftotal (always!)

• Begin to fill in the Source Table (ANOVA Table)

k = number of factor levels

n = number of scores in a treatment condition

N = total number of scores in whole study (N = nk)

T = sum of scores for each treatment condition

G = sum of all scores in the study (Grand Total)

Tx1 Tx2 Tx3 Tx4

25 30 27 22

28 29 20 27

22 30 21 24

M = 25 M = 29.67 M = 22.67 M = 24.33

Hypothesis Testing with ANOVA

STEP 2 continued…

Basic ANOVA Table

Source     SS           df     MS           F
Between    SSbetween    k-1    MSbetween    F = Fcalculated
Within     SSwithin     N-k    MSwithin
Total      SStotal      N-1

Hypothesis Testing with ANOVA

STEP 3: Collect Data and Compute Sample Statistics

SSbetween = Σ(T²/n) – (G²/N)

SSwithin = ΣSS inside each treatment = (SS1+SS2+SS3+...+SSk)

SStotal = ΣX² – (G²/N) or SSbetween + SSwithin

MSbetween = SSbetween/dfbetween

MSwithin = SSwithin/dfwithin

F = MSbetween/MSwithin

Fill in source table (ANOVA Table)

*note: SSbetween + SSwithin = SStotal (always!)

n = # of scores in a tx condition

N = total # of scores in whole study

T = sum of scores for each tx condition

G = sum of all scores in the study (Grand Total)

Hypothesis Testing with ANOVA

STEP 4: Make a Decision

Given the Critical F-value (Fcritical), which will be provided, decide whether or not to reject the null.

Fobtained < Fcritical --> Fail to reject Ho.

Fobtained > Fcritical --> Reject Ho.

Use Appendix B.4 (page 592-594) to find Fcritical

Bold-Faced = Fcritical for α = 0.01

Light-Faced = Fcritical for α = 0.05

df-numerator = df-between

df-denominator = df-within

Hypothesis Testing with ANOVA: Need to Organize Lots of Calculations

Basic ANOVA Table

Source     SS           df     MS           F
Between    SSbetween    k-1    MSbetween    F = Fcalculated
Within     SSwithin     N-k    MSwithin
Total      SStotal      N-1

Example 1

A researcher is interested in whether class time affects exam scores. There are four different class times being examined: 8am, 12pm, 4pm, and 8pm. Run an ANOVA, α = .05, to see if a significant difference exists between these treatments.

Null Hypothesis: H0: µ1 = µ2 = µ3 = µ4

Alternative Hypothesis: HA: not all of the µ’s are equal

(At least one µ is different from the others)

Example 1 DATA

Trt. 1      Trt. 2      Trt. 3      Trt. 4
25          30          27          22
28          29          20          27
22          30          21          24

m1=25 m2=29.67 m3=22.67 m4=24.33

T1=75 T2=89 T3=68 T4=73

SS1=18 SS2=0.67 SS3=28.67 SS4=12.67

n1=3 n2=3 n3=3 n4=3

ΣX² = 7893, G = 305, N = 12, k = 4

Example 1 Calculations

SSbetween = Σ(T²/n) – (G²/N)

SSbetween = ((75²/3)+(89²/3)+(68²/3)+(73²/3)) – (93,025/12)

SSbetween=((5625/3)+(7921/3)+(4624/3)+(5329/3))-7752.083

SSbetween = (1875+2640.33+1541.33+1776.33)-7752.083

SSbetween = 7832.99 – 7752.083

SSbetween = 80.91

SSwithin = SS1+SS2+SS3+SS4

SSwithin = 18+.67+28.67+12.67

SSwithin = 60.01

SStotal = ΣX² – (G²/N) OR SSbetween + SSwithin

SStotal = 7893-7752.083 OR 80.91+ 60.01

SStotal = 140.92

Example 1 ANOVA and Decision

Source     SS        df    MS       Fcalculated
Between    80.91     3     26.97    3.596
Within     60.01     8     7.50
TOTAL      140.92    11

Fcritical = 4.07

Fcalculated < Fcritical --> fail to reject H0

3.596 < 4.07 --> Fail to reject H0

Use Appendix B.4 (pp. 592-594) for Fcritical

df numerator = 3 (df for between)

df denominator = 8 (df for within)
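The hand calculation for Example 1 can be cross-checked with a short script. This is a minimal sketch that follows the computational formulas from Step 3 (T, G, ΣX²). The function name and the dictionary keys are illustrative choices, not part of the course materials.

```python
def one_way_anova(groups):
    """Single-factor, independent-measures ANOVA computed from raw scores."""
    k = len(groups)                                  # number of treatment conditions
    N = sum(len(g) for g in groups)                  # total number of scores
    G = sum(sum(g) for g in groups)                  # grand total
    sum_x2 = sum(x * x for g in groups for x in g)   # sum of squared scores
    correction = G ** 2 / N                          # the G²/N term

    # SSbetween = Σ(T²/n) – G²/N, SStotal = ΣX² – G²/N
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_total = sum_x2 - correction
    ss_within = ss_total - ss_between                # SSbetween + SSwithin = SStotal

    df_between, df_within = k - 1, N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Example 1 data: one list of scores per class time
scores = [[25, 28, 22], [30, 29, 30], [27, 20, 21], [22, 27, 24]]
result = one_way_anova(scores)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The script reproduces F = 3.596. Small differences from the hand-filled table (for example SSwithin of 60.00 rather than 60.01) come from rounding the per-group SS values to two decimals before adding them.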

Example 2

A researcher is interested in whether a new drug affects activity level of lab animals. There are three different doses being examined: low, medium, high. Run an ANOVA, α = .05, to see if a significant difference exists between these doses.

Null Hypothesis:

Alternative:

Example 2 DATA

Dose 1 (lo)    Dose 2 (med)    Dose 3 (hi)
0              1               5
1              3               8
3              4               6
0              1               4
1              1               7

mean1= mean2= mean3=

T1= T2= T3=

SS1= SS2= SS3=

n1=5 n2=5 n3=5

ΣX² =      G =      N =      k =

Example 2 Calculations

SSbetween = Σ(T²/n) – (G²/N)

SSbetween =

SSbetween=

SSbetween =

SSbetween =

SSbetween =

SSwithin = SS1+SS2+SS3

SSwithin =

SSwithin =

SStotal = ΣX² – (G²/N) OR SSbetween + SSwithin

SStotal =

SStotal =

Example 2 ANOVA and Decision

Source SS df MS Fcalculated

Between

Within

TOTAL

Fcritical =

If Fobtained < Fcritical fail to reject H0

df numerator = (df for between)

df denominator = (df for within)

Effect size

Don’t use Cohen’s d anymore…

Instead, r²: like always, refers to amount of variance explained by knowing which group someone belongs to

= SS between treatments, divided by SS total (SPSS will compute; need to check off “effect size estimation” under “options”)

r² = SS between treatments / (SS between + SS within) = SS between treatments / SS total

When computed for ANOVA, r² frequently referred to as eta squared (η²)
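As a quick illustration, the effect size for Example 1 can be computed from the SS values in its source table. This is a small sketch; the function name is illustrative.

```python
def eta_squared(ss_between: float, ss_within: float) -> float:
    """Proportion of total variability explained by group membership."""
    return ss_between / (ss_between + ss_within)


# SS values taken from the Example 1 source table
print(round(eta_squared(80.91, 60.01), 3))  # 0.574
```

So about 57% of the variability in exam scores is associated with class time in this sample, even though the F test did not reach significance with only three scores per group.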

There’s a difference – now, where is it?

If your F value is greater than your critical value, you can reject the null hypothesis

But, because you have more than two groups, you just know there’s a difference somewhere

Post-hoc tests

These look for where the differences are

One option: Tukey Honestly Significant Difference Test (HSD)

Strategy: test computes how large the difference between two groups needs to be, based on variance and sample size; then, any two groups whose difference exceeds this are considered to be significantly different

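Tukey HSD

HSD = q × √(MSwithin / n)

q = value from the studentized range table; MSwithin = error term from the overall ANOVA; n = number of scores in each group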
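A sketch of the Tukey strategy, assuming the usual equal-n form HSD = q × √(MSwithin / n). The value q = 4.53 is an assumption taken from a standard studentized range table (α = .05, k = 4 groups, df within = 8), so check it against your own table. The Example 1 numbers are used purely for illustration, since post-hoc tests would normally follow only a significant F.

```python
from itertools import combinations
from math import sqrt


def tukey_hsd(q: float, ms_within: float, n: int) -> float:
    """Minimum difference between two group means needed for significance."""
    return q * sqrt(ms_within / n)


def significant_pairs(means: dict, hsd: float) -> list:
    """Pairs of groups whose mean difference exceeds the HSD value."""
    return [
        (a, b)
        for (a, mean_a), (b, mean_b) in combinations(means.items(), 2)
        if abs(mean_a - mean_b) > hsd
    ]


# Example 1: MSwithin = 7.50, n = 3 per group; q = 4.53 is an assumed table value
hsd = tukey_hsd(q=4.53, ms_within=7.50, n=3)
means = {"Trt1": 25.00, "Trt2": 29.67, "Trt3": 22.67, "Trt4": 24.33}
print(round(hsd, 2))                  # 7.16
print(significant_pairs(means, hsd))  # []
```

The largest gap (Trt. 2 vs Trt. 3, 7.00 points) falls just short of the 7.16 cutoff, which agrees with the non-significant overall F for these data.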

A second option: Scheffé

More conservative than Tukey

The strategy: uses overall variance estimate (MS error) from overall ANOVA, also uses numerator df from overall ANOVA --> keeps critical F higher

Uses MS between for the specific comparison between two groups at a time
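A sketch of a pairwise Scheffé comparison as it is commonly taught in introductory texts, which is an assumption about the exact procedure intended here. SS between is computed from the two groups being compared, divided by the overall numerator df (k – 1), and the result is divided by the overall MS within. The function name is illustrative.

```python
def scheffe_f(group_a, group_b, k: int, ms_within: float) -> float:
    """F-ratio for one pairwise comparison, using the overall ANOVA's df and error term."""
    t_a, t_b = sum(group_a), sum(group_b)
    n_a, n_b = len(group_a), len(group_b)
    # SS between for just these two groups: Σ(T²/n) – G²/N with G and N from the pair
    ss_between = t_a ** 2 / n_a + t_b ** 2 / n_b - (t_a + t_b) ** 2 / (n_a + n_b)
    ms_between = ss_between / (k - 1)   # overall df between, not 1
    return ms_between / ms_within       # overall MS within as the error term


# Example 1: Trt. 2 vs Trt. 3, with k = 4 and MSwithin = 7.50 from the overall ANOVA
f_value = scheffe_f([30, 29, 30], [27, 20, 21], k=4, ms_within=7.50)
print(round(f_value, 3))  # 3.267
```

The obtained value is compared with the same critical F as the overall ANOVA (4.07 for df = 3, 8). Here 3.267 < 4.07, so this pair does not differ significantly.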

Example 3

A high school girls’ basketball coach is unhappy with the free throw shooting % of her team. In fact, last year her team finished last in the league in this category. This season she wants significant improvement, so she has hired a sport psychologist to try new techniques during preseason practices and determine which method will best help her players improve.

She teaches them two focusing strategies – 1st an internal and 2nd an external strategy

She then allows half to continue using their preferred strategy while forcing the other half to change their focus

Example 3

What is the DV? What is the IV? Make a diagram of this design. How many groups are being tested?

Example 3

For a fair comparison, during preseason competition she records only the first 15 free throws taken by each of her 16 players. The number of shots made is listed below

II IE EE EI

12 4 7 10

14 6 5 3

10 6 5 2

10 2 1 9

Example 3

Are the groups different?

Credits

http://myweb.liu.edu/~nfrye/psy801/ch13.ppt

http://homepages.wmich.edu/~malavosi/Chapter%2013_ANOVA_5thedition.ppt#2
