ANOVA: Single Factor Models




ANOVA

• ANOVA (ANalysis Of VAriance) is a natural extension used to compare the means of more than 2 populations.

• Basic Question: Even if the true means of the populations being compared were equal (i.e. $\mu_1 = \mu_2 = \mu_3 = \mu_4$ when there are four), we cannot expect the sample means ($\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4$) to be equal. So when we get different values for the $\bar{x}$'s:
  – How much is due to randomness?
  – How much is due to the fact that we are sampling from different populations with possibly different $\mu_j$'s?
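
The first half of the basic question can be illustrated with a small simulation. This is only a sketch and not part of the original slides: the common mean (72), standard deviation (13) and sample size (7) are arbitrary illustrative assumptions. It draws four samples from the same normal population and prints their sample means, which differ purely because of randomness.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN = 72.0   # assumed common population mean (illustrative only)
TRUE_SD = 13.0     # assumed common standard deviation (illustrative only)
SAMPLE_SIZE = 7    # assumed sample size per population (illustrative only)

# Four samples drawn from the SAME population, so all true means are equal.
samples = [
    [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    for _ in range(4)
]
sample_means = [statistics.mean(s) for s in samples]

for j, m in enumerate(sample_means, start=1):
    print(f"sample mean {j}: {m:.2f}")
```

Even though every sample comes from one population, the four printed means are not equal; ANOVA asks whether observed differences are larger than this kind of chance variation.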

ANOVA TERMINOLOGY

• Response Variable (y)
  – What we are measuring
• Experimental Units
  – The individual unit that we will measure
• Factors
  – Independent variables whose values can change to affect the outcome of the response variable, y
• Levels of Factors
  – Values of the factors
• Treatments
  – The combination of the levels of the factors applied to an experimental unit

Example

We want to know how combinations of different amounts of water (1 ac-ft, 3 ac-ft, 5 ac-ft) and different fertilizers (A, B, C) affect crop yields.

• Response variable
  – Crop yield (bushels/acre)
• Experimental unit
  – Each acre that receives a treatment
• Factors (2)
  – Water and fertilizer
• Levels (3 for Water; 3 for Fertilizer)
  – Water: 1, 3, 5; Fertilizer: A, B, C
• Treatments (9 = 3x3)
  – 1A, 3A, 5A, 1B, 3B, 5B, 1C, 3C, 5C
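
As a quick check on the count, the treatments can be enumerated as every combination of one water level with one fertilizer level. The sketch below is an illustration only; the variable names are chosen here and are not from the slides.

```python
from itertools import product

water_levels = [1, 3, 5]               # ac-ft
fertilizer_levels = ["A", "B", "C"]

# A treatment is one combination of a water level and a fertilizer level.
# Iterating fertilizer first reproduces the order used in the example.
treatments = [f"{w}{f}" for f, w in product(fertilizer_levels, water_levels)]

print(treatments)
print(len(treatments))
```

The number of treatments is the product of the numbers of levels, here 3 x 3 = 9.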

Single Factor ANOVA: Basic Assumptions

• If we focus on only one factor (e.g. fertilizer type in the previous example), this is called single factor ANOVA.
  – In this case, levels and treatments are the same thing since there are no combinations between factors.

• Assumptions for Single Factor ANOVA
  1. Each population in the comparison has a normal distribution.
  2. The standard deviations of the populations (although unknown) are assumed to be equal (i.e. $\sigma_1 = \sigma_2 = \cdots = \sigma_k$ for the $k$ populations).
  3. Sampling is:
     – Random
     – Independent

Example

• The university would like to know if the delivery mode of the introductory statistics class affects performance in the class, as measured by scores on the final exam.

• The class is given in four different formats:
  – Lecture
  – Text Reading
  – Videotape
  – Internet

• The final exam scores from random samples of students from each of the four teaching formats were recorded.

Samples

[Table of sampled final exam scores by teaching format not shown]

Summary

• There is a single factor under observation: teaching format.

• There are k = 4 different treatments (or levels of teaching format).

• The numbers of observations (experimental units) are $n_1 = 7$, $n_2 = 8$, $n_3 = 6$, $n_4 = 5$; the total number of observations is $n = 26$.

• Treatment means: $\bar{x}_1 = 76$, $\bar{x}_2 = 65$, $\bar{x}_3 = 75$, $\bar{x}_4 = 74$

• Grand mean (of all 26 observations): $\bar{\bar{x}} = 72$
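
Because the sample sizes are unequal, the grand mean is the sample-size-weighted average of the treatment means, not their simple average. A minimal sketch using the sample sizes (7, 8, 6, 5) and treatment means (76, 65, 75, 74) from this summary; the variable names are illustrative.

```python
sample_sizes = [7, 8, 6, 5]          # n1..n4 from the summary
treatment_means = [76, 65, 75, 74]   # treatment means from the summary

n = sum(sample_sizes)

# Weight each treatment mean by its sample size, then divide by n.
grand_mean = sum(nj * xbar for nj, xbar in zip(sample_sizes, treatment_means)) / n

print(n, grand_mean)  # 26 72.0
```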

Why aren’t all the $\bar{x}$’s the same?

• There is variability due to the different treatments: Between Treatment Variability (Treatment)
• There is variability due to randomness within each treatment: Within Treatment Variability (Error)

BASIC CONCEPT

If the average Between Treatment Variability is “large” compared to the average Within Treatment Variability, we can reasonably conclude that there really are differences among the population means (i.e. at least one $\mu_j$ differs from the others).

Basic Questions

• Given this basic concept, the natural questions are:
  – What is “variability” due to treatment and due to error, and how are they measured?
  – What is “average variability” due to treatment and due to error, and how are they measured?
  – What is “large”?
    • How much larger than the observed average variability due to error does the observed average variability due to treatment have to be before we are convinced that there are differences in the true population means (the $\mu$’s)?

How Is “Total” Variability Measured?

Variability is defined as the Sum of Squared Deviations (from the grand mean). So,

• SST (Total Sum of Squares)
  – Sum of Squared Deviations of all observations from the grand mean
• SSTr (Between Treatment Sum of Squares)
  – Sum of Squared Deviations Due to Different Treatments
• SSE (Within Treatment Sum of Squares)
  – Sum of Squared Deviations Due to Error

SST = SSTr + SSE
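
The identity SST = SSTr + SSE can be checked numerically. The sketch below uses small made-up scores (not the lecture's samples, which are not reproduced here) and computes each sum of squares directly from its definition: SST from the grand mean, SSTr from the treatment means weighted by sample size, and SSE from deviations of each observation from its own treatment mean.

```python
# Made-up scores for three treatments (NOT the lecture's data).
groups = [
    [4.0, 6.0, 8.0],
    [1.0, 3.0],
    [7.0, 9.0, 11.0, 14.0],
]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)
group_means = [sum(g) / len(g) for g in groups]

# Total: every observation against the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_obs)

# Between treatments: each treatment mean against the grand mean, weighted by n_j.
sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Within treatments: each observation against its own treatment mean.
sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

print(sst, sstr + sse)
```

The two printed numbers agree up to floating-point rounding, which is why SSE can later be obtained by subtraction.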

How Is “Average” Variability Measured?

“Average” Variability is measured in:

Mean Square Values (MSTr and MSE)
  – Found by dividing SSTr and SSE by their respective degrees of freedom

ANOVA TABLE

Variability                 SS      DF      Mean Square (MS)
Between Tr. (Treatment)     SSTr    k-1     MSTr = SSTr/DFTr
Within Tr. (Error)          SSE     n-k     MSE = SSE/DFE
TOTAL                       SST     n-1

• DFT = # observations - 1 = n - 1
• DFTr = # treatments - 1 = k - 1
• DFE = DFT - DFTr = n - k
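
The degrees of freedom follow directly from the number of treatments and the sample sizes. A minimal sketch, applied to the teaching-format example (sample sizes 7, 8, 6, 5); the function name `anova_degrees_of_freedom` is hypothetical and chosen only for this illustration.

```python
def anova_degrees_of_freedom(sample_sizes):
    """Return (DFTr, DFE, DFT) for a single factor design."""
    k = len(sample_sizes)      # number of treatments
    n = sum(sample_sizes)      # total number of observations
    df_treatment = k - 1
    df_total = n - 1
    df_error = df_total - df_treatment   # equals n - k
    return df_treatment, df_error, df_total


print(anova_degrees_of_freedom([7, 8, 6, 5]))  # (3, 22, 25)
```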

Formula for Calculating SST

Just like the numerator of the variance, assuming all (26) entries come from one population:

$$\mathrm{SST} = \sum_{j}\sum_{i} (x_{ij} - \bar{\bar{x}})^2 = (82 - 72)^2 + \cdots + (81 - 72)^2 = 4394$$

Formula for Calculating SSTr

Calculating SSTr (Between Treatment Variability)

Replace all entries within each treatment by its mean. Now all the variability is between (not within) treatments:

  Treatment 1 ($n_1 = 7$): 76 76 76 76 76 76 76
  Treatment 2 ($n_2 = 8$): 65 65 65 65 65 65 65 65
  Treatment 3 ($n_3 = 6$): 75 75 75 75 75 75
  Treatment 4 ($n_4 = 5$): 74 74 74 74 74

$$\mathrm{SSTr} = \sum_{j} n_j(\bar{x}_j - \bar{\bar{x}})^2 = 7(76-72)^2 + 8(65-72)^2 + 6(75-72)^2 + 5(74-72)^2 = 578$$
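
The SSTr arithmetic can be reproduced in a few lines. This sketch uses only the summary values from the example (sample sizes 7, 8, 6, 5; treatment means 76, 65, 75, 74; grand mean 72); the variable names are illustrative.

```python
sample_sizes = [7, 8, 6, 5]
treatment_means = [76, 65, 75, 74]
grand_mean = 72

# Each treatment contributes n_j * (treatment mean - grand mean)^2.
contributions = [
    nj * (xbar - grand_mean) ** 2
    for nj, xbar in zip(sample_sizes, treatment_means)
]
sstr = sum(contributions)

print(contributions)  # [112, 392, 54, 20]
print(sstr)           # 578
```

The second treatment accounts for most of SSTr because its mean (65) is farthest from the grand mean and it has the largest sample.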

Formula for Calculating SSE

Calculating SSE (Within Treatment Variability)

The difference between SST and SSTr:

$$\mathrm{SSE} = \mathrm{SST} - \mathrm{SSTr} = 4394 - 578 = 3816$$

Can We Conclude a Difference Among the 4 Teaching Formats?

We conclude that at least one population mean differs from the others if the average between treatment variability is large compared to the average within treatment variability, that is, if MSTr/MSE is “large”.

• When the population means are all equal, the ratio of the two measures of variability for these normally distributed random variables has an F distribution, and the F-statistic (= MSTr/MSE) is compared to a critical F-value from an F distribution with:
  – Numerator degrees of freedom = DFTr
  – Denominator degrees of freedom = DFE

• If the ratio of MSTr to MSE (the F-statistic) exceeds the critical F-value, we can conclude that at least one population mean differs from the others.

Can We Conclude Different Teaching Formats Affect Final Exam Scores?

The F-test

H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$

HA: At least one $\mu_j$ differs from the others

Select α = .05.

Reject H0 (Accept HA) if:

$$F = \frac{\mathrm{MSTr}}{\mathrm{MSE}} > F_{\alpha,\,\mathrm{DFTr},\,\mathrm{DFE}} = F_{.05,\,3,\,22} = 3.05$$
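
The critical value is normally read from an F table. As an alternative sketch, assuming SciPy is available (the slides themselves use a table and Excel, not Python), it can be computed as the 95th percentile of the F distribution with 3 and 22 degrees of freedom.

```python
from scipy.stats import f

alpha = 0.05
df_treatment, df_error = 3, 22

# Upper-tail critical value: the point with area alpha to its right.
f_critical = f.ppf(1 - alpha, df_treatment, df_error)

print(round(f_critical, 2))
```

The result should agree with the tabled value of 3.05 to two decimal places.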

Hand Calculations for the F-test

$$\mathrm{MSTr} = \frac{\mathrm{SSTr}}{\mathrm{DFTr}} = \frac{578}{3} = 192.67$$

$$\mathrm{MSE} = \frac{\mathrm{SSE}}{\mathrm{DFE}} = \frac{3816}{22} = 173.45$$

$$F = \frac{192.67}{173.45} = 1.11 < 3.05$$

Cannot conclude there is a difference among the $\mu_j$’s.
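
The hand calculation can be reproduced as follows. This is a sketch using the sums of squares (578 and 3816), degrees of freedom (3 and 22) and critical value (3.05) from the example; the variable names are illustrative.

```python
sstr, sse = 578, 3816
df_treatment, df_error = 3, 22
f_critical = 3.05   # critical F-value for alpha = .05 with 3 and 22 df

mstr = sstr / df_treatment    # average between treatment variability
mse = sse / df_error          # average within treatment variability
f_stat = mstr / mse

reject_h0 = f_stat > f_critical

print(round(mstr, 2), round(mse, 2), round(f_stat, 2), reject_h0)
# 192.67 173.45 1.11 False
```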

Excel Approach

EXCEL OUTPUT

p-value = .365975 > .05
Cannot conclude differences
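
The p-value is the area to the right of the observed F-statistic under the F distribution with 3 and 22 degrees of freedom. A hedged sketch, assuming SciPy is available and rebuilding the F-statistic from the sums of squares in the hand calculation rather than from the raw scores:

```python
from scipy.stats import f

f_stat = (578 / 3) / (3816 / 22)   # MSTr / MSE from the hand calculation

# Survival function = upper-tail probability P(F > f_stat).
p_value = f.sf(f_stat, 3, 22)

print(round(p_value, 3))
```

This should land close to the Excel p-value of .365975, and since it exceeds .05 the conclusion matches the hand calculation.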

REVIEW

• ANOVA Situation and Terminology
  – Response variable, Experimental Units, Factors, Levels, Treatments, Error
• Basic Concept
  – If the “average variability” between treatments is “a lot” greater than the “average variability” due to error, conclude that at least one mean differs from the others.
• Single Factor Analysis
  – By Hand
  – By Excel