
STAT 430 (Fall 2017): Tutorial 5

Two-way ANOVA

Luyao Lin

October 17/19, 2017

Department of Statistics and Actuarial Science, Simon Fraser University

Outline

Two-way ANOVA

• two treatment factors

• equal sample sizes

• unequal sample sizes

Battery Data Description

Brief background: An engineer is designing a battery for use in a device that will be subjected to extreme variations in temperature. The only design parameter that he can select at this point is the plate material for the battery, and he has three possible choices. He also knows from experience that temperature will affect the effective battery life, so he includes temperature as a second factor in the battery life experiment.

The engineer decides to test all three plate materials at three temperature levels, 15, 70, and 125 °F, because these temperature levels are consistent with the product end-use environment.

Data Description

Table 1: Life data (in hours) for the Battery Design Experiment

                          Temperature (°F)
Material     15             70             125            $\bar{y}_{i\cdot\cdot}$
1            130  155       34   40        20   70
             74   180       80   75        82   58        83.17
             (134.75)       (57.25)        (57.5)
2            150  188       136  122       25   70
             159  126       106  115       58   45        108.33
             (155.75)       (119.75)       (49.5)
3            138  110       174  120       96   104
             168  160       150  139       82   60        125.08
             (144.00)       (145.75)       (85.5)
$\bar{y}_{\cdot j\cdot}$     144.83         107.58         64.17          105.53

Note: the numbers in parentheses are the cell means, that is, the average of the four observations for each combination of material and temperature.
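The means in Table 1 can be recomputed from the raw observations. The following is a minimal Python sketch; the variable names (`life`, `cell_means`, and so on) are illustrative choices, not part of the tutorial.

```python
# Battery life data from Table 1: life[i][j] lists the four lifetimes (hours)
# for material i+1 at temperature temps[j].
temps = [15, 70, 125]
life = [
    [[130, 155, 74, 180], [34, 40, 80, 75], [20, 70, 82, 58]],
    [[150, 188, 159, 126], [136, 122, 106, 115], [25, 70, 58, 45]],
    [[138, 110, 168, 160], [174, 120, 150, 139], [96, 104, 82, 60]],
]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Cell means (the numbers in parentheses in Table 1).
cell_means = [[mean(cell) for cell in row] for row in life]
# Material (row) means, temperature (column) means, and the grand mean.
material_means = [mean([y for cell in row for y in cell]) for row in life]
temperature_means = [
    mean([y for row in life for y in row[j]]) for j in range(len(temps))
]
grand_mean = mean([y for row in life for cell in row for y in cell])

for i, row in enumerate(cell_means, start=1):
    print(f"material {i}: cell means {row}, row mean {material_means[i - 1]:.2f}")
print("temperature means:", [round(m, 2) for m in temperature_means])
print(f"grand mean: {grand_mean:.2f}")
```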

Statistical Models:

Cell-means model:

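$$y_{ijt} = \mu + \tau_{ij} + \varepsilon_{ijt}$$

Two-way complete model:

$$y_{ijt} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijt}$$

Two-way main-effects model:

$$y_{ijt} = \mu + \alpha_i + \beta_j + \varepsilon_{ijt}$$

where

• $i = 1, \dots, a$ (here $a = 3$ materials); $j = 1, \dots, b$ (here $b = 3$ temperatures); $t = 1, \dots, n$, where $n$ (also written $r$) is the number of observations per cell;

• $\mu$ is the overall mean;

• $\varepsilon_{ijt}$ is still the random error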

Cell-means Model


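• with a constraint: $\sum_{i=1}^{a}\sum_{j=1}^{b} \tau_{ij} = 0$. Why is this necessary? (Without it, $\mu$ and the $\tau_{ij}$ are not uniquely determined: adding a constant to $\mu$ and subtracting it from every $\tau_{ij}$ leaves all cell means unchanged.)

• Similar to one-way ANOVA:

$$y_{ijt} \sim N(\mu + \tau_{ij}, \sigma^2)$$

• Each combination of levels of the two factors is considered as a new treatment

• we have $ab$ treatments in total

• the null hypothesis:

$$\tau_{ij} = 0 \quad \forall\, i, j$$

the alternative hypothesis:

at least one $\tau_{ij}$ is not equal to 0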
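Because the cell-means model treats the $ab = 9$ combinations as nine treatments, its null hypothesis can be tested exactly as in one-way ANOVA. The sketch below computes that F statistic for the battery data; it is an illustration under that reading, and the variable names are assumptions rather than anything from the tutorial.

```python
# Battery life data: life[i][j] holds the 4 lifetimes for material i+1
# at the j-th temperature (15, 70, 125 degrees F).
life = [
    [[130, 155, 74, 180], [34, 40, 80, 75], [20, 70, 82, 58]],
    [[150, 188, 159, 126], [136, 122, 106, 115], [25, 70, 58, 45]],
    [[138, 110, 168, 160], [174, 120, 150, 139], [96, 104, 82, 60]],
]

# Flatten the 3 x 3 layout into 9 "treatments", one per cell.
cells = [cell for row in life for cell in row]
n = 4  # observations per cell
all_obs = [y for cell in cells for y in cell]
grand_mean = sum(all_obs) / len(all_obs)

# One-way ANOVA sums of squares with 9 treatments.
ss_treat = sum(n * (sum(c) / n - grand_mean) ** 2 for c in cells)
ss_error = sum((y - sum(c) / n) ** 2 for c in cells for y in c)

df_treat = len(cells) - 1             # ab - 1 = 8
df_error = len(all_obs) - len(cells)  # ab(n - 1) = 27
f_stat = (ss_treat / df_treat) / (ss_error / df_error)

print(f"SS(treatments) = {ss_treat:.2f} on {df_treat} df")
print(f"SS(error)      = {ss_error:.2f} on {df_error} df")
print(f"F              = {f_stat:.3f}")
```

A large F here only says that some combination differs from the others; it does not separate main effects from interaction.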


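Cell-means model = two-way complete model

if $\tau_{ij} = \alpha_i + \beta_j + (\alpha\beta)_{ij}$

• again we assume

$$y_{ijt} \sim N(\mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}, \sigma^2)$$

• with the constraints

$$\sum_i \alpha_i = 0, \qquad \sum_j \beta_j = 0, \qquad \sum_i (\alpha\beta)_{ij} = 0 \text{ for each } j, \qquad \sum_j (\alpha\beta)_{ij} = 0 \text{ for each } i$$

• the cell-means and two-way complete models are equivalent because, given one, you can derive the other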
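The equivalence can be checked numerically: starting from cell means, one can split each $\tau_{ij}$ into $\alpha_i + \beta_j + (\alpha\beta)_{ij}$ so that all the sum-to-zero constraints hold, and then rebuild the $\tau_{ij}$ from the pieces. The sketch below uses the estimated cell means from Table 1 as stand-ins for $\mu + \tau_{ij}$; that substitution and the variable names are assumptions made for illustration.

```python
# Estimated cell means from Table 1, used here in place of mu + tau_ij.
cell_mean = [
    [134.75, 57.25, 57.5],
    [155.75, 119.75, 49.5],
    [144.0, 145.75, 85.5],
]
a, b = 3, 3

# Cell-means parameterization: overall mean and deviations tau_ij.
mu = sum(sum(row) for row in cell_mean) / (a * b)
tau = [[m - mu for m in row] for row in cell_mean]

# Two-way complete parameterization derived from the tau_ij.
alpha = [sum(tau[i]) / b for i in range(a)]                      # row averages
beta = [sum(tau[i][j] for i in range(a)) / a for j in range(b)]  # column averages
inter = [[tau[i][j] - alpha[i] - beta[j] for j in range(b)] for i in range(a)]

# Going back: tau_ij = alpha_i + beta_j + (alpha beta)_ij.
rebuilt = [[alpha[i] + beta[j] + inter[i][j] for j in range(b)] for i in range(a)]

print("mu    =", round(mu, 2))
print("alpha =", [round(x, 2) for x in alpha])
print("beta  =", [round(x, 2) for x in beta])
for row in inter:
    print("inter =", [round(x, 2) for x in row])
```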


Comparing them

Cell-means model:


• there are 9 treatment combinations in total → 9 unknown parameters $\tau_{ij}$

• with the constraint $\sum_i \sum_j \tau_{ij} = 0$

• we need to estimate 8 unknown $\tau_{ij}$

Two-way complete model:

• 2 unknowns for the $\alpha_i$

• 2 unknowns for the $\beta_j$

• 4 unknowns for the interactions $(\alpha\beta)_{ij}$

• these add up to 8
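The two counts agree for any $a$ and $b$, not only for the battery example. A short check of the identity, with the battery values $a = b = 3$ substituted:

```latex
\begin{align*}
(a-1) + (b-1) + (a-1)(b-1) &= (a-1) + (b-1)\bigl[1 + (a-1)\bigr] \\
                           &= (a-1) + a(b-1) = ab - 1, \\
2 + 2 + 4 &= 8 = 3 \cdot 3 - 1 .
\end{align*}
```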


Do they answer the same questions?

Cell-means model:


$H_0: \tau_{11} = \tau_{12} = \dots = \tau_{ab} = 0$ versus

$H_a$: at least one $\tau_{ij} \neq 0$

⇒ if one is trying to see which combination gives the 'best' outcome, the cell-means model is good enough.

Two-way complete model:

For the interactions:

$H_0: (\alpha\beta)_{ij} = 0$ for all $i, j$ versus

$H_a$: at least one $(\alpha\beta)_{ij} \neq 0$

For the first main effect $\alpha_i$:

$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_a = 0$ versus

$H_a$: at least one $\alpha_i \neq 0$

For the second main effect $\beta_j$:

$H_0: \beta_1 = \beta_2 = \dots = \beta_b = 0$ versus

$H_a$: at least one $\beta_j \neq 0$

⇒ if one is trying to learn about the effect of each treatment factor, the two-way complete model should be chosen.

Two-way main-effects model

$$y_{ijt} = \mu + \alpha_i + \beta_j + \varepsilon_{ijt}$$

• with two constraints: $\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$

• 2 unknown parameters for the $\alpha_i$

• 2 unknown parameters for the $\beta_j$

• in total we have 4 unknown parameters

• compared to the two-way complete model, we have 4 fewer parameters

• because there are no interaction terms


Which one to use?

$y_{ijt} = \mu + \alpha_i + \beta_j + \varepsilon_{ijt}$ or

$y_{ijt} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijt}$

It depends on two things:

• whether the interaction effect is large

• the sample size


Interaction effect

⇒ Use the interaction plot to check whether the interaction is large, and also consider the variability of the observations
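An interaction plot draws one line per material through its cell means across temperatures; with no interaction the lines would be roughly parallel. As a text-only stand-in for the plot, the sketch below compares the change in mean life between consecutive temperatures for each material. The `spread` summary is an ad hoc illustration, not a formal test.

```python
temps = [15, 70, 125]
# Cell means from Table 1: one row per material, one column per temperature.
cell_mean = [
    [134.75, 57.25, 57.5],
    [155.75, 119.75, 49.5],
    [144.0, 145.75, 85.5],
]

# Change in mean life between consecutive temperatures, one row per material.
# Parallel profiles (no interaction) would give identical rows.
changes = [
    [row[j + 1] - row[j] for j in range(len(temps) - 1)] for row in cell_mean
]

for i, (row, delta) in enumerate(zip(cell_mean, changes), start=1):
    print(f"material {i}: means {row}, changes {delta}")

# How much the changes differ across materials for each temperature step.
spread = [
    max(c[j] for c in changes) - min(c[j] for c in changes)
    for j in range(len(temps) - 1)
]
print("spread of changes across materials:", spread)
```

Whether such differences are large still has to be judged against the within-cell variability, which is what the F test for interaction does.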

Model selection: it also depends on the sample size

• When the sample size is small, the two-way main-effects model might be the only choice

• An extreme case: only one observation for each treatment combination, $n = 1$ (section 6.7.1)

Source         df
Temperature    $b - 1 = 2$
Material       $a - 1 = 2$
Interaction    $(a - 1)(b - 1) = 4$
Error          $ab(n - 1) = 0$
Total          $abn - 1 = 8$

What does it mean for the degrees of freedom to be 0? When the error degrees of freedom are 0, we cannot estimate $\sigma^2$.
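The degrees-of-freedom bookkeeping can be written as two small functions. This is a sketch with hypothetical function names; the main-effects error degrees of freedom, $abn - a - b + 1$, follow from subtracting the two main-effect degrees of freedom from the total.

```python
def complete_model_df(a, b, n):
    """Degrees of freedom for the two-way complete model, n observations per cell."""
    return {
        "A": a - 1,
        "B": b - 1,
        "AB": (a - 1) * (b - 1),
        "Error": a * b * (n - 1),
        "Total": a * b * n - 1,
    }


def main_effects_model_df(a, b, n):
    """Degrees of freedom for the two-way main-effects model (no interaction term)."""
    return {
        "A": a - 1,
        "B": b - 1,
        "Error": a * b * n - a - b + 1,
        "Total": a * b * n - 1,
    }


# One observation per cell: the complete model leaves nothing for error,
# while the main-effects model still has 4 error degrees of freedom.
print(complete_model_df(3, 3, 1))
print(main_effects_model_df(3, 3, 1))
# The battery experiment has n = 4 observations per cell.
print(complete_model_df(3, 3, 4))
```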


ANOVA table for two-way complete model

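When the sample sizes for each group are equal,

$$SST = SSA + SSB + SSAB + SSE$$

Each source is listed with its d.f., followed by its sum of squares.

• Total, d.f. $nab - 1$:

$$SST = \sum_{t=1}^{n}\sum_{j=1}^{b}\sum_{i=1}^{a} (y_{ijt} - \bar{y}_{\cdot\cdot\cdot})^2 = \sum_{t=1}^{n}\sum_{j=1}^{b}\sum_{i=1}^{a} y_{ijt}^2 - nab\,\bar{y}_{\cdot\cdot\cdot}^2$$

• Factor A, d.f. $a - 1$:

$$SSA = nb\sum_{i=1}^{a} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = nb\sum_{i=1}^{a} \bar{y}_{i\cdot\cdot}^2 - nab\,\bar{y}_{\cdot\cdot\cdot}^2$$

• Factor B, d.f. $b - 1$:

$$SSB = na\sum_{j=1}^{b} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = na\sum_{j=1}^{b} \bar{y}_{\cdot j\cdot}^2 - nab\,\bar{y}_{\cdot\cdot\cdot}^2$$

• Interaction, d.f. $(a - 1)(b - 1)$:

$$SSAB = n\sum_{j=1}^{b}\sum_{i=1}^{a} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2$$

• Error, d.f. $ab(n - 1)$:

$$SSE = \sum_{t=1}^{n}\sum_{j=1}^{b}\sum_{i=1}^{a} (y_{ijt} - \bar{y}_{ij\cdot})^2 = SST - SSA - SSB - SSAB$$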


Manual computation of SS for Battery Life Data

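$$SS(\text{Total}) = \sum_{t=1}^{4}\sum_{j=1}^{3}\sum_{i=1}^{3} (y_{ijt} - \bar{y}_{\cdot\cdot\cdot})^2 = \sum_{t=1}^{4}\sum_{j=1}^{3}\sum_{i=1}^{3} y_{ijt}^2 - 4 \times 3 \times 3 \times \bar{y}_{\cdot\cdot\cdot}^2$$

$$= (130)^2 + (155)^2 + \dots + (60)^2 - 36 \times 105.53^2 = 77646.97$$

$$SS(\text{Material}) = 4 \times 3 \sum_{i=1}^{3} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = 12 \times \sum_{i=1}^{3} \bar{y}_{i\cdot\cdot}^2 - 36 \times \bar{y}_{\cdot\cdot\cdot}^2$$

$$= 12 \times [(83.17)^2 + \dots + (125.08)^2] - 36 \times 105.53^2 = 10683.72$$

$$SS(\text{Temp}) = 4 \times 3 \sum_{j=1}^{3} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = 12 \times \sum_{j=1}^{3} \bar{y}_{\cdot j\cdot}^2 - 36 \times \bar{y}_{\cdot\cdot\cdot}^2$$

$$= 12 \times [(144.83)^2 + \dots + (64.17)^2] - 36 \times 105.53^2 = 39118.72$$

$$SS(\text{Interaction}) = 4 \sum_{j=1}^{3}\sum_{i=1}^{3} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2$$

$$= 4 \times [(134.75 - 83.17 - 144.83 + 105.53)^2 + \dots + (85.5 - 125.08 - 64.17 + 105.53)^2] = 9613.78$$

$$SS(\text{Error}) = SS(\text{Total}) - SS(\text{Material}) - SS(\text{Temp}) - SS(\text{Interaction}) = 18230.75$$

The means shown are rounded to two decimals; the reported sums of squares are computed from the unrounded means, so plugging in the rounded values gives slightly different results.

Note: I will show that SS(Error) indeed follows that relationship using R...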
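The same check can be sketched in plain Python (used here instead of R purely for illustration): compute each sum of squares directly from its definition, including SS(Error) from the within-cell deviations, and confirm that the pieces add up to SS(Total). Variable names are illustrative.

```python
# Battery life data: life[i][j] holds the 4 lifetimes for material i+1
# at the j-th temperature (15, 70, 125 degrees F).
life = [
    [[130, 155, 74, 180], [34, 40, 80, 75], [20, 70, 82, 58]],
    [[150, 188, 159, 126], [136, 122, 106, 115], [25, 70, 58, 45]],
    [[138, 110, 168, 160], [174, 120, 150, 139], [96, 104, 82, 60]],
]
a, b, n = 3, 3, 4


def mean(values):
    return sum(values) / len(values)


grand = mean([y for row in life for cell in row for y in cell])
row_mean = [mean([y for cell in row for y in cell]) for row in life]
col_mean = [mean([y for row in life for y in row[j]]) for j in range(b)]
cell_mean = [[mean(cell) for cell in row] for row in life]

ss_total = sum((y - grand) ** 2 for row in life for cell in row for y in cell)
ss_material = n * b * sum((m - grand) ** 2 for m in row_mean)
ss_temp = n * a * sum((m - grand) ** 2 for m in col_mean)
ss_inter = n * sum(
    (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
    for i in range(a)
    for j in range(b)
)
# SS(Error) computed directly from within-cell deviations, not by subtraction.
ss_error = sum(
    (y - cell_mean[i][j]) ** 2
    for i in range(a)
    for j in range(b)
    for y in life[i][j]
)

print(f"SS(Total)       = {ss_total:.2f}")
print(f"SS(Material)    = {ss_material:.2f}")
print(f"SS(Temp)        = {ss_temp:.2f}")
print(f"SS(Interaction) = {ss_inter:.2f}")
print(f"SS(Error)       = {ss_error:.2f}")
print("decomposition holds:",
      abs(ss_total - ss_material - ss_temp - ss_inter - ss_error) < 1e-6)
```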

ANOVA Table for Battery Life Data

With the SS's all calculated, the rest of the calculation is straightforward:

Source         SS       df    MS         F0         P-value
Temperature    39119     2    19559.4    28.9677    1.909e-07
Material       10684     2     5341.9     7.9114    0.001976
Interaction     9614     4     2403.4     3.5595    0.018611
Error          18231    27      675.2
Total          77647    35
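Each mean square is SS divided by its df, and each F0 is the mean square divided by the error mean square. A minimal sketch that rebuilds those two columns from the sums of squares computed on the previous slide follows; the P-value column would additionally need the F distribution (for example `scipy.stats.f.sf`), which is left out to keep the sketch dependency-free.

```python
# Sums of squares and degrees of freedom for the battery life data.
effects = {
    "Temperature": (39118.72, 2),
    "Material": (10683.72, 2),
    "Interaction": (9613.78, 4),
}
ss_error, df_error = 18230.75, 27

ms_error = ss_error / df_error

# For each effect: mean square = SS / df, and F0 = MS / MSE.
results = {}
for source, (ss, df) in effects.items():
    ms = ss / df
    results[source] = (ms, ms / ms_error)

for source, (ms, f0) in results.items():
    print(f"{source:<12} MS = {ms:9.1f}   F0 = {f0:.4f}")
print(f"{'Error':<12} MS = {ms_error:9.1f}")
```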


Hypotheses to be tested:

• Testing for interaction

• SSE

• SSAB

• Test statistic

• Rejection region

• Testing for main effects


Testing for interaction: SSE

• Sum of squares for the error

• For each observation $y_{ijt}$, the error is $y_{ijt} - \hat{y}_{ijt}$.

• What is $\hat{y}_{ijt}$?

• Least squares estimate: $\hat{y}_{ijt} = \bar{y}_{ij\cdot}$ (section 6.4.1)

• "Sum of squares" means

$$SSE = \sum_{t=1}^{n}\sum_{j=1}^{b}\sum_{i=1}^{a} (y_{ijt} - \bar{y}_{ij\cdot})^2$$


Testing for interaction: SSAB

(Page 153) Definition: sum of squares for the interaction.

$$SSAB = n\sum_{j=1}^{b}\sum_{i=1}^{a} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2$$

• Least squares estimate: $\widehat{(\alpha\beta)}_{ij} = \bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}$

• $SSAB = SSE_0^{AB} - SSE$

• $SSE_0^{AB}$ is the sum of squares for error when $H_0^{AB}$ is true: no interaction

• $SSE$ is the sum of squares for error under the two-way complete model

Larger SSAB ⇒ adding the interaction terms explains more of the variability ⇒ the interaction is important.


Testing for interaction: SSAB

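• $SSAB = SSE_0^{AB} - SSE$

• $SSE_0^{AB}$ is the sum of squares for error under the main-effects model

$$y_{ijt} = \mu + \alpha_i + \beta_j + \varepsilon_{ijt},$$

$$SSE_0^{AB} = \sum_{j=1}^{b}\sum_{i=1}^{a}\sum_{t=1}^{n} (y_{ijt} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2$$

• $SSE$ is the sum of squares for error under the two-way complete model

$$y_{ijt} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijt},$$

$$SSE = \sum_{j=1}^{b}\sum_{i=1}^{a}\sum_{t=1}^{n} (y_{ijt} - \bar{y}_{ij\cdot})^2$$

Larger SSAB ⇒ adding the interaction terms explains more of the variability ⇒ the interaction is important.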
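For the battery data the identity $SSAB = SSE_0^{AB} - SSE$ can be verified directly. The sketch below computes the two error sums of squares from their fitted values (valid as written for equal sample sizes) and compares their difference with SSAB computed from its definition; the variable names are illustrative.

```python
# Battery life data: life[i][j] holds the 4 lifetimes for material i+1
# at the j-th temperature (15, 70, 125 degrees F).
life = [
    [[130, 155, 74, 180], [34, 40, 80, 75], [20, 70, 82, 58]],
    [[150, 188, 159, 126], [136, 122, 106, 115], [25, 70, 58, 45]],
    [[138, 110, 168, 160], [174, 120, 150, 139], [96, 104, 82, 60]],
]
a, b, n = 3, 3, 4


def mean(values):
    return sum(values) / len(values)


grand = mean([y for row in life for cell in row for y in cell])
row_mean = [mean([y for cell in row for y in cell]) for row in life]
col_mean = [mean([y for row in life for y in row[j]]) for j in range(b)]
cell_mean = [[mean(cell) for cell in row] for row in life]

cells = [(i, j) for i in range(a) for j in range(b)]

# Error SS under the main-effects model: fitted value is row + column - grand.
sse_main = sum(
    (y - (row_mean[i] + col_mean[j] - grand)) ** 2
    for i, j in cells
    for y in life[i][j]
)
# Error SS under the two-way complete model: fitted value is the cell mean.
sse_full = sum((y - cell_mean[i][j]) ** 2 for i, j in cells for y in life[i][j])
# Interaction SS from its definition.
ss_ab = n * sum(
    (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2 for i, j in cells
)

print(f"SSE (main effects) = {sse_main:.2f}")
print(f"SSE (complete)     = {sse_full:.2f}")
print(f"difference         = {sse_main - sse_full:.2f}")
print(f"SSAB               = {ss_ab:.2f}")
```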


Testing for interaction: Test statistics and Rejection region

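From another point of view:

$$E(MSAB) = \sigma^2 + \frac{n\sum_{i}\sum_{j} (\alpha\beta)_{ij}^2}{(a - 1)(b - 1)}$$

$$E(MSE) = \sigma^2$$

Reject $H_0^{AB}$ if

$$\frac{ms_{AB}}{ms_E} > F_{(a-1)(b-1),\, N-ab,\, \alpha}$$

where $N = abn$ is the total number of observations.

⇒ Reject the null hypothesis when the F statistic is large!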
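As a worked instance, here is the test applied to the battery data, using the mean squares and P-value reported in the ANOVA table for these data. The significance level $\alpha = 0.05$ is an assumption made for the illustration.

```latex
\begin{align*}
\frac{ms_{AB}}{ms_E} &= \frac{2403.4}{675.2} \approx 3.56,
  \qquad (a-1)(b-1) = 4, \quad N - ab = 36 - 9 = 27, \\
\text{P-value} &= P\bigl(F_{4,27} > 3.56\bigr) \approx 0.0186 < 0.05
  \;\Rightarrow\; \text{reject } H_0^{AB}.
\end{align*}
```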


Testing for Main effects

Page 155:

"In this book, we take the view that the main effect of A would not be tested unless the hypothesis of no interaction were first accepted."

• Our goal in testing the main effect of A is to see whether factor A has no effect on the response

• Choice 1: the levels of A (averaged over the levels of B) have the same average effect on the response:

$$H_0^{A}: \alpha_1^* = \alpha_2^* = \dots = \alpha_a^* = 0$$

where $\alpha_i^* = \alpha_i + \overline{(\alpha\beta)}_{i\cdot}$

• Choice 2: the response depends only on B

$$H_0^{A+AB}: \{\text{both } H_0^{A} \text{ and } H_0^{AB} \text{ are true}\}$$


Testing for Main effects: Two choices

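• Choice 1:

$$H_0^{A}: \alpha_1^* = \alpha_2^* = \dots = \alpha_a^* = 0$$

where $\alpha_i^* = \alpha_i + \overline{(\alpha\beta)}_{i\cdot} = \alpha_i$

• Choice 2:

$$H_0^{A+AB}: \{\text{both } H_0^{A} \text{ and } H_0^{AB} \text{ are true}\}$$

under which the model reduces to

$$y_{ijt} = \mu + \beta_j + \varepsilon_{ijt}$$

• Choices 1 and 2 are equivalent when there is no interaction (see page 155 for the reason)

• Otherwise they are different tests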


Testing for Main effects: no interaction and equal sample sizes

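$$SSA = nb\sum_{i=1}^{a} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = nb\sum_{i=1}^{a} \bar{y}_{i\cdot\cdot}^2 - nab\,\bar{y}_{\cdot\cdot\cdot}^2$$

$$SSA = SSE_0^{A} - SSE$$

$SSE_0^{A}$ denotes the sum of squares for error when $H_0^{A}$ is true.

Again,

$$E(MSA) = \sigma^2 + \frac{bn\sum_{i} \alpha_i^2}{a - 1}$$

$$E(MSE) = \sigma^2$$

Reject $H_0^{A}$ if

$$\frac{ms_A}{ms_E} > F_{a-1,\, N-ab,\, \alpha}$$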


Testing for Main effects: unequal sample sizes

Type I and Type III sums of squares

• They are the same if the sample sizes are equal

• Otherwise, Type III compares the SSE of the full model with the SSE of the reduced model that omits the tested term

• Type I compares the SSE of the existing model with the SSE of the existing model plus the tested term. It is also called the "sequential" sum of squares.

• in the Type I calculation, the order in which the terms are tested matters
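To see why the order matters, the sketch below fits nested least squares models to a small unbalanced data set and computes sequential (Type I) sums of squares in both orders. The data are hypothetical numbers made up for this illustration (two factors with two levels each and unequal cell counts), and the tiny linear solver is included only to keep the example dependency-free.

```python
# Hypothetical unbalanced 2 x 2 data: (level of A, level of B, response).
obs = [(0, 0, 10.0), (0, 0, 12.0), (0, 1, 20.0), (1, 0, 14.0),
       (1, 1, 25.0), (1, 1, 27.0), (1, 1, 26.0)]


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        x[r] = (m[r][size] - sum(m[r][c] * x[c] for c in range(r + 1, size))) / m[r][r]
    return x


def sse(use_a, use_b):
    """Error sum of squares of the least squares fit with the chosen main effects."""
    X = [[1.0] + ([float(la)] if use_a else []) + ([float(lb)] if use_b else [])
         for la, lb, _ in obs]
    y = [v for _, _, v in obs]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(p)]
    coef = solve(xtx, xty)  # normal equations X'X b = X'y
    return sum((v - sum(c * x for c, x in zip(coef, r))) ** 2 for r, v in zip(X, y))


sse_null, sse_a, sse_b, sse_ab = sse(False, False), sse(True, False), sse(False, True), sse(True, True)

# Order 1: A entered first, then B.   Order 2: B entered first, then A.
ss_a_first, ss_b_after_a = sse_null - sse_a, sse_a - sse_ab
ss_b_first, ss_a_after_b = sse_null - sse_b, sse_b - sse_ab

print(f"A first: SS(A)   = {ss_a_first:.3f}, SS(B|A) = {ss_b_after_a:.3f}")
print(f"B first: SS(B)   = {ss_b_first:.3f}, SS(A|B) = {ss_a_after_b:.3f}")
```

Both orders account for the same total reduction in SSE, but they split it differently between A and B; with equal cell counts the two splits would coincide.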


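Model parameter estimation: least squares

• two-way complete model:

$$y_{ijt} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijt}$$

$$\hat{y}_{ijt} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \widehat{(\alpha\beta)}_{ij} = \bar{y}_{\cdot\cdot\cdot} + (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})$$

$$\hat{y}_{ijt} = \bar{y}_{ij\cdot}$$

• two-way main-effects model (equal sample sizes):

$$y_{ijt} = \mu + \alpha_i + \beta_j + \varepsilon_{ijt}$$

$$\hat{y}_{ijt} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j = \bar{y}_{\cdot\cdot\cdot} + (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})$$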
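As a worked instance with the battery data, take material 1 at 70 °F ($i = 1$, $j = 2$) and the rounded means from Table 1. The two models give different fitted values, and the gap is exactly the estimated interaction for that cell:

```latex
\begin{align*}
\text{complete model:} \quad \hat{y}_{12t} &= \bar{y}_{12\cdot} = 57.25, \\
\text{main-effects model:} \quad \hat{y}_{12t}
  &= \bar{y}_{1\cdot\cdot} + \bar{y}_{\cdot 2\cdot} - \bar{y}_{\cdot\cdot\cdot}
   = 83.17 + 107.58 - 105.53 = 85.22, \\
\widehat{(\alpha\beta)}_{12} &= 57.25 - 85.22 = -27.97 .
\end{align*}
```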


Summary

• three models

• two-way complete VS cell-means

• two-way main-effects VS two-way complete

• Definition of SSAB, SSA, SSB, SSE

• test the interaction first, then the main effects

• when sample sizes are not equal, Type I and Type III sums of squares differ

Next time

• Check the assumptions

• Contrasts

• Multiple Comparisons

• SAS example
