Introduction to ANOVA Introduction to Statistics Chapter 13 Apr 13-15, 2010 Classes #23-24


Introduction to ANOVA

Introduction to Statistics, Chapter 13

Apr 13-15, 2010, Classes #23-24


Where are we?

Concluded material on t-tests & introduction of hypothesis testing

Good conceptual & computational foundation for more advanced inferential statistics

Turn now to ANOVA – more complex statistic

Good preparation for more complex situations


A slightly different type of research design…

You have three different groups of people, and want to compare the outcomes of these three groups


What are you expecting?

1. you might be predicting specific differences (e.g., group A will have a higher score than both group B and group C)

2. you might be predicting a difference somewhere


What to do?

If you have a concrete idea of where the differences lie, based on theory and previous research, you can conduct planned comparisons

Directly test just for where you think the differences are


What to do?

If you think there’s a difference somewhere, but you want to be able to make all the possible comparisons to see where it might be, you can’t use this strategy


Why not?

Each test run at significance level α has probability α of giving a significant result just due to chance when the null hypothesis is correct (a Type 1 error)

More tests --> more likelihood of at least one Type 1 error
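A rough sketch of how quickly that risk grows. It assumes the tests are independent, which pairwise comparisons that share groups are not, so treat the numbers as an approximation; the function names are illustrative and not from the slides.

```python
from math import comb


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type 1 error across n_tests tests.

    Assumption (not from the slides): the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def n_pairwise_comparisons(k: int) -> int:
    """Number of distinct two-group comparisons among k groups."""
    return comb(k, 2)


if __name__ == "__main__":
    alpha = 0.05
    for k in (3, 4, 5):
        m = n_pairwise_comparisons(k)
        rate = familywise_error_rate(alpha, m)
        print(f"{k} groups -> {m} pairwise tests, familywise rate = {rate:.3f}")
```

With three groups and α = .05 the approximate chance of at least one false alarm is already about .14, which is the motivation for a single overall test.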


What to do?

Need a new test, to make comparisons across all levels of predictor variable

ANOVA

Stands for analysis of variance

Just like t-tests, different types

Will discuss 3 types


First type of ANOVA

Comparable to independent samples t-test

Used with one predictor variable

Used with continuous criterion variable

Used with between-subjects design


The idea behind ANOVA

Key question = where does variability lie? Two sources:

Within people in each group or condition

Between groups or conditions

If research hypothesis is true, where will there be the most variability?

What if the null hypothesis is true?


Illustrate Logic of ANOVA

Group1     Group2     Group3     Group4
27         36         17         34
31         35         21         36
25         29         22         30
27         33         22         32
24         38         21         32
27         36         15         32
M = 26.8   M = 34.5   M = 19.7   M = 32.7

We want to evaluate the effects of 4 different drugs on participants' level of depression, as measured by the Beck Depression Inventory.

An ANOVA allows us to quantify how far apart the sample means must be before we are no longer willing to say they are all “approximately” equal.
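A small sketch of that idea using the four drug groups from the slide: it recomputes the group means and compares how spread out the means are with how spread out the scores are inside each group. The variable names are illustrative, and this informal comparison is only a preview of the F-ratio, not the test itself.

```python
from statistics import mean, variance

# Beck Depression Inventory scores from the slide, one list per drug group.
groups = {
    "Group1": [27, 31, 25, 27, 24, 27],
    "Group2": [36, 35, 29, 33, 38, 36],
    "Group3": [17, 21, 22, 22, 21, 15],
    "Group4": [34, 36, 30, 32, 32, 32],
}

# Sample mean of each group (matches the M = ... row on the slide).
group_means = {name: mean(scores) for name, scores in groups.items()}

# Spread of the four group means (between-group variability).
variance_of_means = variance(list(group_means.values()))

# Average spread of scores inside a group (within-group variability).
average_within_variance = mean([variance(scores) for scores in groups.values()])

if __name__ == "__main__":
    for name, m in group_means.items():
        print(f"{name}: M = {m:.1f}")
    print(f"variance of the group means:   {variance_of_means:.2f}")
    print(f"average within-group variance: {average_within_variance:.2f}")
```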


Introduction to ANOVA

ANOVA – the ANalysis Of Variance

(1) Inferential hypothesis-testing procedure

(2) Tremendous advantage over t-tests: used to compare MULTIPLE (two or more) treatments

(3) Provides researchers with much greater flexibility in design and analysis of experiments


Introduction to ANOVA

ANOVA – the ANalysis Of Variance

(4) Multiple Forms – In Chapter 13 we’ll look at the simplest: Single-factor, independent measures ANOVA

(a) factor: new name for the independent variable

(b) independent measures: separate sample for each treatment

(c) level: the individual treatment conditions that make up a factor


Research Design for ANOVA


Factors and Levels

Can be multiple factors (IV’s) and levels (variations)

Expressed as factors x levels

How many factors? How many levels?

                 Therapist Experience
                 experienced (+)    inexperienced (-)
Treatment A      tx A + exp         tx A + inexp
Treatment B      tx B + exp         tx B + inexp


Example of ANOVA

Four different test times (8am, 12pm, 4pm, and 8pm)

Does time of test affect scores?

ANOVA uses variance to assess differences among the sample means

Tx1 Tx2 Tx3 Tx4

25 30 27 22

28 29 20 27

22 30 21 24

M = 25 M = 29.67 M = 22.67 M = 24.33


Variability Components for ANOVA


The Logic of ANOVA

(1) First, determine total variability for data set

Tx1 Tx2 Tx3 Tx4

25 30 27 22

28 29 20 27

22 30 21 24

M = 25 M = 29.67 M = 22.67 M = 24.33


The Logic of ANOVA

(2) Next, break this variability into two components:

(a) Between-Treatments variance – two sources:

Treatment Effect: Differences are caused by treatments.

Chance: Differences simply due to chance.

(b) Within-Treatments variance – one source:

Chance: Differences simply due to chance.


Partitioning variance

Math behind ANOVA = Variance between groups (MS between)

Divided by

Variance within groups (MS within, or MS error; like pooled variance from independent samples t-test)

This ratio referred to as F value


Forming an F-Ratio

(3) Finally, determine the variance due to the treatments alone by forming an F-Ratio.

$$F = \frac{\text{Variance Between-Treatments}}{\text{Variance Within-Treatments}}$$

Or in terms of sources…

$$F = \frac{\text{Treatment Effect} + \text{Differences due to Chance}}{\text{Differences due to Chance}}$$

If no treatment effect exists, F should be close to 1.00

If there IS some treatment effect, F should be greater than 1.00 (but not automatically statistically significant)


Thinking about F

Variance can’t be negative --> F is always positive

If F = 1, same amount of variance between groups as within groups --> keep null hypothesis

If F > 1, more variance between groups than within groups --> if F large enough, reject null


How big is big enough?

Just like all statistics, have critical F value

We will be using Table B.4 (page 590-592)

Size of critical F is dependent on:

Significance level (table uses either .01 or .05)

Number of groups being compared (numerator df = number of groups – 1)

Number of participants (denominator df = sum of df across all groups, or sample size – number of groups)

Note: the ANOVA F test always uses the upper tail of the F distribution, so there is no one-tailed vs. two-tailed choice to make
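The two degrees-of-freedom rules can be sketched in a few lines. The function name is illustrative; the check at the end confirms that "sum of df across all groups" and "sample size – number of groups" are the same quantity.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return df for a single-factor, independent-measures ANOVA.

    group_sizes: number of participants in each group, e.g. [3, 3, 3, 3].
    """
    k = len(group_sizes)   # number of groups
    N = sum(group_sizes)   # total sample size
    return {
        "between": k - 1,  # numerator df
        "within": N - k,   # denominator df
        "total": N - 1,
    }


sizes = [3, 3, 3, 3]
df = anova_degrees_of_freedom(sizes)

# Denominator df computed the other way: add up (n - 1) for every group.
df_within_summed = sum(n - 1 for n in sizes)
```

For four groups of three participants this gives numerator df = 3 and denominator df = 8, the pair used to look up the critical F in the table.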


Error Term

Error due to chance

Does the treatment effect (difference among means) produce greater variability between groups than that expected by chance?

The denominator in the F ratio


The Structure of ANOVA Calculations


New Terms and Symbols

k = number of treatment conditions (levels of the factor). For independent-measures study, k = # of separate samples.

n = number of scores in a treatment condition

N = total number of scores in whole study (N = nk when every condition has the same n)

T = sum of scores for each treatment condition

G = sum of all scores in the study (Grand Total)
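As a quick illustration of the notation, the sketch below computes each symbol for the time-of-test scores used throughout these slides (Tx1 to Tx4). The variable names simply mirror the symbols.

```python
# Scores for the four test times (Tx1..Tx4) from the slides.
treatments = [
    [25, 28, 22],  # Tx1
    [30, 29, 30],  # Tx2
    [27, 20, 21],  # Tx3
    [22, 27, 24],  # Tx4
]

k = len(treatments)                # number of treatment conditions
n = [len(t) for t in treatments]   # number of scores in each condition
N = sum(n)                         # total number of scores
T = [sum(t) for t in treatments]   # sum of scores for each condition
G = sum(T)                         # grand total

if __name__ == "__main__":
    print(f"k = {k}, n = {n}, N = {N}, T = {T}, G = {G}")
```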


Hypothesis Testing with ANOVA (4 steps)

STEP 1: State the Hypothesis

H0: µ1 = µ2 = … = µk (k = number of factor levels)

H1: not all of the µ's are equal

(At least one is different from the others)

Tx1 Tx2 Tx3 Tx4

25 30 27 22

28 29 20 27

22 30 21 24

M = 25 M = 29.67 M = 22.67 M = 24.33


Hypothesis Testing with ANOVA

STEP 2: Locate the Critical region

• α = .05

• Calculate dfbetween = k – 1

• Calculate dfwithin = N-k

• Calculate dftotal = N-1

• Critical F will be provided for you

• dfbetween + dfwithin = dftotal (always!)

• Begin to fill in the Source Table (ANOVA Table)

k = number of factor levels

n = number of scores in a treatment condition

N = total number of scores in whole study (N = nk)

T = sum of scores for each treatment condition

G = sum of all scores in the study (Grand Total)

Tx1 Tx2 Tx3 Tx4

25 30 27 22

28 29 20 27

22 30 21 24

M = 25 M = 29.67 M = 22.67 M = 24.33


Hypothesis Testing with ANOVA

STEP 2 continued…

Basic ANOVA Table

Source     SS           df     MS           F
Between    SSbetween    k-1    MSbetween    F = Fcalculated
Within     SSwithin     N-k    MSwithin
Total      SStotal      N-1


Hypothesis Testing with ANOVA

STEP 3: Collect Data and Compute Sample Statistics

$SS_{between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$

$SS_{within} = \sum SS_{\text{inside each treatment}} = SS_1 + SS_2 + SS_3 + \dots + SS_k$

$SS_{total} = \sum X^2 - \frac{G^2}{N}$ or $SS_{between} + SS_{within}$

$MS_{between} = SS_{between} / df_{between}$

$MS_{within} = SS_{within} / df_{within}$

$F = MS_{between} / MS_{within}$

Fill in source table (ANOVA Table)

*note: $SS_{between} + SS_{within} = SS_{total}$ (always!)

n = # of scores in a tx condition

N = total # of scores in whole study

T = sum of scores for each tx condition

G = sum of all scores in the study (Grand Total)
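The Step 3 formulas can be sketched as one small function. This is an illustrative implementation of the computational formulas above, not code from the course; the function and key names are made up for the example, and the demonstration data are the time-of-test scores from the slides.

```python
def one_way_anova(treatments):
    """Single-factor, independent-measures ANOVA via the computational formulas.

    treatments: list of lists, one list of scores per treatment condition.
    """
    k = len(treatments)
    N = sum(len(t) for t in treatments)
    G = sum(sum(t) for t in treatments)
    sum_x_squared = sum(x * x for t in treatments for x in t)
    correction = G ** 2 / N  # the G^2/N term shared by two formulas

    # SS_between = sum(T^2 / n) - G^2 / N
    ss_between = sum(sum(t) ** 2 / len(t) for t in treatments) - correction
    # SS_within = SS inside each treatment, added up
    ss_within = sum(
        sum((x - sum(t) / len(t)) ** 2 for x in t) for t in treatments
    )
    # SS_total = sum(X^2) - G^2 / N
    ss_total = sum_x_squared - correction

    df_between, df_within = k - 1, N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within, "df_total": N - 1,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova([[25, 28, 22], [30, 29, 30], [27, 20, 21], [22, 27, 24]])
```

Because the function carries full precision, its sums of squares can differ in the second decimal place from hand calculations that round intermediate values.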


Hypothesis Testing with ANOVA

STEP 4: Make a Decision

Given the Critical F-value (Fcritical) - which will be provided - decide whether or not to reject the null.

Fobtained < Fcritical --> Fail to reject Ho.

Fobtained > Fcritical --> Reject Ho.

Use Appendix B.4 (page 592-594) to find Fcritical

Bold-Faced = Fcritical for α = 0.01

Light-Faced = Fcritical for α = 0.05

df-numerator = df-between

df-denominator = df-within


Hypothesis Testing with ANOVA: Need to Organize Lots of Calculations

Basic ANOVA Table

Source     SS           df     MS           F
Between    SSbetween    k-1    MSbetween    F = Fcalculated
Within     SSwithin     N-k    MSwithin
Total      SStotal      N-1


Example 1

A researcher is interested in whether class time affects exam scores. There are four different class times being examined: 8am, 12pm, 4pm, and 8pm. Run an ANOVA, α = .05, to see if a significant difference exists between these treatments.

Null Hypothesis: H0: µ1 = µ2 = µ3 = µ4

Alternative Hypothesis: HA: not all of the µ's are equal

(At least one µ is different from the others)


Example 1 DATA

Trt. 1      Trt. 2       Trt. 3       Trt. 4
25          30           27           22
28          29           20           27
22          30           21           24

m1=25       m2=29.67     m3=22.67     m4=24.33
T1=75       T2=89        T3=68        T4=73
SS1=18      SS2=0.67     SS3=28.67    SS4=12.67
n1=3        n2=3         n3=3         n4=3

$\sum X^2$ = 7893

G = 305

N = 12

k = 4


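Example 1 Calculations

$SS_{between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$

$SS_{between} = \left(\frac{75^2}{3} + \frac{89^2}{3} + \frac{68^2}{3} + \frac{73^2}{3}\right) - \frac{93{,}025}{12}$

$SS_{between} = \left(\frac{5625}{3} + \frac{7921}{3} + \frac{4624}{3} + \frac{5329}{3}\right) - 7752.083$

$SS_{between} = (1875 + 2640.33 + 1541.33 + 1776.33) - 7752.083$

$SS_{between} = 7832.99 - 7752.083$

$SS_{between} = 80.91$

$SS_{within} = SS_1 + SS_2 + SS_3 + SS_4$

$SS_{within} = 18 + 0.67 + 28.67 + 12.67$

$SS_{within} = 60.01$

$SS_{total} = \sum X^2 - \frac{G^2}{N}$ OR $SS_{between} + SS_{within}$

$SS_{total} = 7893 - 7752.083$ OR $80.91 + 60.01$

$SS_{total} = 140.92$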


Example 1 ANOVA and Decision

Source     SS        df    MS       Fcalculated
Between    80.91     3     26.97    3.596
Within     60.01     8     7.50
TOTAL      140.92    11

Fcritical = 4.07

Fcalculated < Fcritical --> fail to reject H0

3.596 < 4.07 --> Fail to reject H0

Use Appendix B.4 (pp. 592-594) for Fcritical

df numerator = 3 (df for between)

df denominator = 8 (df for within)
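The source table and decision for Example 1 can be reproduced with a short sketch. It takes the rounded SS values and the critical F of 4.07 straight from the slides; the function names are illustrative only.

```python
def source_table(ss_between, ss_within, k, N):
    """Fill in df, MS and F for a one-way ANOVA source table."""
    df_between, df_within = k - 1, N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "df_total": N - 1,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


def decide(f_calculated, f_critical):
    """Step 4: compare the calculated F with the critical F."""
    return "reject H0" if f_calculated > f_critical else "fail to reject H0"


# SS values as rounded on the slides; Fcritical read from the F table (df = 3, 8).
table = source_table(ss_between=80.91, ss_within=60.01, k=4, N=12)
decision = decide(table["F"], f_critical=4.07)

if __name__ == "__main__":
    print(f"MS between = {table['ms_between']:.2f}")
    print(f"MS within  = {table['ms_within']:.2f}")
    print(f"F          = {table['F']:.2f}")
    print(decision)
```

The F computed here agrees with the slide's 3.596 to two decimals; the third decimal depends on where intermediate values are rounded.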


Example 2

A researcher is interested in whether a new drug affects the activity level of lab animals. There are three different doses being examined: low, medium, high. Run an ANOVA, α = .05, to see if a significant difference exists between these doses.

Null Hypothesis:

Alternative:


Example 2 DATA

Dose 1 (lo)    Dose 2 (med)    Dose 3 (hi)
0              1               5
1              3               8
3              4               6
0              1               4
1              1               7

mean1= mean2= mean3=

T1= T2= T3=

SS1= SS2= SS3=

n1=5 n2=5 n3=5

$\sum X^2$ =     G =     N =     k =


Example 2 Calculations

$SS_{between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$

$SS_{between} =$

$SS_{between} =$

$SS_{between} =$

$SS_{between} =$

$SS_{between} =$

$SS_{within} = SS_1 + SS_2 + SS_3$

$SS_{within} =$

$SS_{within} =$

$SS_{total} = \sum X^2 - \frac{G^2}{N}$ OR $SS_{between} + SS_{within}$

$SS_{total} =$

$SS_{total} =$


Example 2 ANOVA and Decision

Source SS df MS Fcalculated

Between

Within

TOTAL

Fcritical =

If Fobtained < Fcritical fail to reject H0

df numerator = (df for between)

df denominator = (df for within)


Effect size

Don’t use Cohen’s d anymore…

Instead, $r^2$ – like always, refers to amount of variance explained by knowing which group someone belongs to

= SS between treatments, divided by SS total (SPSS will compute – need to check off “effect size estimation” under “options”)

$$r^2 = \frac{SS_{\text{between treatments}}}{SS_{between} + SS_{within}} = \frac{SS_{\text{between treatments}}}{SS_{total}}$$

When computed for ANOVA, $r^2$ frequently referred to as eta squared ($\eta^2$)
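As a quick worked illustration, the sketch below applies this ratio to the rounded SS values from Example 1. The function name is illustrative.

```python
def eta_squared(ss_between: float, ss_within: float) -> float:
    """Proportion of total variability explained by group membership."""
    ss_total = ss_between + ss_within
    return ss_between / ss_total


# Rounded SS values from the Example 1 source table.
effect_size = eta_squared(ss_between=80.91, ss_within=60.01)

if __name__ == "__main__":
    print(f"eta squared = {effect_size:.3f}")
```

For Example 1 this comes to roughly .57, even though the F test did not reach significance with only three scores per group.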


There’s a difference – now, where is it?

If your F value is greater than your critical value, you can reject the null hypothesis

But, because you have more than two groups, you just know there’s a difference somewhere


Post-hoc tests

These look for where the differences are


One option: Tukey Honestly Significant Difference Test (HSD)

Strategy: test computes how large the difference between two groups needs to be, based on variance and sample size; then, any two groups whose difference exceeds this are considered to be significantly different


Tukey HSD

$$HSD = q\sqrt{\frac{MS_{within}}{n_{group}}}$$
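A minimal sketch of the HSD arithmetic, using the Example 1 numbers purely for illustration (its F was not significant, so a post-hoc test would not normally be run). The value q = 4.53 is an assumption: it is what a standard studentized range table is expected to list for 4 groups, 8 within-groups df and α = .05, so check it against your own table. Function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def tukey_hsd(q: float, ms_within: float, n_per_group: int) -> float:
    """Minimum difference between two group means needed for significance."""
    return q * sqrt(ms_within / n_per_group)


def significant_pairs(means: dict, hsd: float) -> list:
    """All pairs of groups whose mean difference exceeds the HSD."""
    return [
        (a, b)
        for a, b in combinations(means, 2)
        if abs(means[a] - means[b]) > hsd
    ]


# q is ASSUMED from a studentized range table (k = 4, df = 8, alpha = .05).
hsd = tukey_hsd(q=4.53, ms_within=7.5, n_per_group=3)

# Group means from Example 1 (T / n for each treatment).
means = {"Tx1": 75 / 3, "Tx2": 89 / 3, "Tx3": 68 / 3, "Tx4": 73 / 3}
pairs = significant_pairs(means, hsd)

if __name__ == "__main__":
    print(f"HSD = {hsd:.2f}")
    print(f"pairs exceeding HSD: {pairs}")
```

With these inputs no pair of means differs by more than the HSD, which is consistent with the non-significant overall F in Example 1.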


A second option: Scheffé

More conservative than Tukey

The strategy: uses overall variance estimate (MS error) from overall ANOVA, also uses numerator df from overall ANOVA --> keeps critical F higher

Uses MS between for the specific comparison between two groups at a time
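One way to sketch that strategy, assuming the common textbook form of the Scheffé comparison: compute SS between for just the two groups, divide by the overall ANOVA's numerator df (k – 1), and divide by the overall MS within. The function name is illustrative, and the Example 1 data are used only to show the arithmetic.

```python
def scheffe_f(group_a, group_b, k, ms_within):
    """Scheffe F for one pairwise comparison.

    k and ms_within come from the OVERALL ANOVA, not from the two groups alone.
    """
    n_a, n_b = len(group_a), len(group_b)
    t_a, t_b = sum(group_a), sum(group_b)
    # SS between for these two groups only: sum(T^2/n) - G^2/N
    ss_between = t_a ** 2 / n_a + t_b ** 2 / n_b - (t_a + t_b) ** 2 / (n_a + n_b)
    # Dividing by the overall numerator df (k - 1) is what makes the test conservative.
    ms_between = ss_between / (k - 1)
    return ms_between / ms_within


# Tx2 vs. Tx3 from Example 1, with k = 4 and MS within = 7.5 from the overall ANOVA.
f_scheffe = scheffe_f([30, 29, 30], [27, 20, 21], k=4, ms_within=7.5)

if __name__ == "__main__":
    print(f"Scheffe F = {f_scheffe:.2f}")
```

The result is compared with the same critical F as the overall ANOVA (4.07 in Example 1), so this comparison would not be significant.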


Example 3

A high school girls' basketball coach is unhappy with her team's free-throw shooting percentage. In fact, last year her team finished last in the league in this category. This season she wants significant improvement, so she has hired a sport psychologist to try new techniques during preseason practices and determine which method will help her players improve.

The psychologist teaches them two focusing strategies: first an internal strategy, then an external one

She then allows half to continue using their preferred strategy while forcing the other half to change their focus


Example 3

What is the DV? What is the IV? Make a diagram of this design. How many groups are being tested?


Example 3

For a fair comparison, during preseason competition she records only the first 15 free throws taken by each of her 16 players. The number of shots made is listed below

II IE EE EI

12 4 7 10

14 6 5 3

10 6 5 2

10 2 1 9


Example 3

Are the groups different?


Credits

http://myweb.liu.edu/~nfrye/psy801/ch13.ppt

http://homepages.wmich.edu/~malavosi/Chapter%2013_ANOVA_5thedition.ppt#2