51
Topic 24: Two-Way ANOVA

Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Embed Size (px)

Citation preview

Page 1: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Topic 24: Two-Way ANOVA

Page 2: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Outline

• Two-way ANOVA

–Data

–Cell means model

–Parameter estimates

–Factor effects model

Page 3: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Two-Way ANOVA

• The response variable Y is continuous

• There are two categorical explanatory variables or factors

Page 4: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Data for two-way ANOVA

• Y is the response variable

• Factor A with levels i = 1 to a

• Factor B with levels j = 1 to b

• Yijk is the kth observation in cell (i,j)

• In Chapter 19, we assume equal

sample size in each cell (nij=n)

Page 5: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

KNNL Example• KNNL p 833

• Y is the number of cases of bread sold

• A is the height of the shelf display, a=3 levels: bottom, middle, top

• B is the width of the shelf display, b=2 levels: regular, wide

• n=2 stores for each of the 3x2=6 treatment combinations (nT=12)

Page 6: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Read the data

data a1; infile ‘../data/ch19ta07.txt'; input sales height width;

proc print data=a1; run;

Page 7: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

The dataObs sales height width 1 47 1 1 2 43 1 1 3 46 1 2 4 40 1 2 5 62 2 1 6 68 2 1 7 67 2 2 8 71 2 2 9 41 3 1 10 39 3 1 11 42 3 2 12 46 3 2

Page 8: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Notation

• For Yijk we use

– i to denote the level of the factor A

– j to denote the level of the factor B

–k to denote the kth observation in cell (i,j)

• i = 1, . . . , a levels of factor A

• j = 1, . . . , b levels of factor B

• k = 1, . . . , n observations in cell (i,j)

Page 9: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Model

• We assume that the response variable observations are

–Normally distributed

•With a mean that may depend on the levels of the factors A and B

•With a constant variance

– Independent

Page 10: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Cell Means Model• Yijk = μij + εijk

–where μij is the theoretical mean or expected value of all observations in cell (i,j) – the εijk are iid N(0, σ2)

• This means Yijk ~ N(μij, σ2), independent• The parameters of the model are– μij, for i = 1 to a and j = 1 to b –σ2

Page 11: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Estimates• Estimate μij by the mean of the

observations in cell (i,j), • • For each (i,j) combination, we can get

an estimate of the variance

• • We need to combine these to get an

estimate of σ2

ij.Yn/)Y(Y k ijkij.

k2

ij.ijk2ij )1n/()YY(s

Page 12: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Pooled estimate of σ2

• In general we pool the sij2, using

weights proportional to the df, nij -1

• The pooled estimate is

s2 = (Σ (nij-1)sij2) / (Σ(nij-1))

• Here, nij = n, so s2 = (Σsij2) / (ab),

which is the average sample variance

Page 13: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Run proc glm

proc glm data=a1; class height width; model sales= height width height*width; means height width height*width;run;

Page 14: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Output

Class Level InformationClass Levels Valuesheight 3 1 2 3width 2 1 2

Number of Observations Read 12Number of Observations Used 12

Page 15: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Means statement height

Level ofheight N

sales

Mean Std Dev1 4 44.0000000 3.16227766

2 4 67.0000000 3.74165739

3 4 42.0000000 2.94392029

Page 16: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Means statement width

Level ofwidth N

sales

Mean Std Dev1 6 50.0000000 12.0664825

2 6 52.0000000 13.4313067

Page 17: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Means statement ht*w

Level ofheight

Level ofwidth N

sales

Mean Std Dev1 1 2 45.0000000 2.828427121 2 2 43.0000000 4.242640692 1 2 65.0000000 4.242640692 2 2 69.0000000 2.828427123 1 2 40.0000000 1.414213563 2 2 44.0000000 2.82842712

Page 18: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Code the factor levelsdata a1; set a1; if height eq 1 and width eq 1 then hw='1_BR'; if height eq 1 and width eq 2 then hw='2_BW'; if height eq 2 and width eq 1 then hw='3_MR'; if height eq 2 and width eq 2 then hw='4_MW'; if height eq 3 and width eq 1 then hw='5_TR'; if height eq 3 and width eq 2 then hw='6_TW';

Page 19: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Plot the data

symbol1 v=circle i=none;proc gplot data=a1; plot sales*hw/frame;run;

Page 20: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

The plot

Page 21: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Put the means in a2

proc means data=a1; var sales; by height width; output out=a2 mean=avsales;proc print data=a2; run;

Page 22: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Output Data Set

Obs height width _TYPE_ _FREQ_ avsales

1 1 1 0 2 45 2 1 2 0 2 43 3 2 1 0 2 65 4 2 2 0 2 69 5 3 1 0 2 40 6 3 2 0 2 44

Page 23: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Plot the means

symbol1 v=square i=join c=black;symbol2 v=diamond i=join c=black;proc gplot data=a2; plot avsales*height=width/frame;run;

Page 24: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

The interaction plot

Page 25: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Questions to consider

• Does the height of the display affect

sales? If yes, compare top with middle,

top with bottom, and middle with bottom

• Does the width of the display affect

sales? If yes, compare regular and wide

Page 26: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

But wait!!! Are these factor level comparisons

meaningful?• Does the effect of height on sales

depend on the width?

• Does the effect of width on sales depend on the height?

• If yes, we have an interaction and we need to do some additional analysis

Page 27: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Factor effects model

• For the one-way ANOVA model, we wrote μi = μ + αi

• Here we use μij = μ + αi + βj + (αβ)ij

• Under “common” formulation– μ (μ.. in KNNL) is the “overall mean”

– αi is the main effect of A

– βj is the main effect of B

– (αβ)ij is the interaction between A and B

Page 28: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Factor effects model

• μ = (Σij μij)/(ab)

• μi. = (Σj μij)/b and μ.j = (Σi μij)/a

• αi = μi. – μ and βj = μ.j - μ

• (αβ)ij is difference between μij and μ + αi + βj

• (αβ)ij = μij - (μ + (μi. - μ) + (μ.j - μ))

= μij – μi. – μ.j + μ

Page 29: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Interpretation

• μij = μ + αi + βj + (αβ)ij

• μ is the “overall” mean

• αi is an adjustment for level i of A

• βj is an adjustment for level j of B

• (αβ)ij is an additional adjustment that takes into account both i and j that cannot be explained by the previous adjustments

Page 30: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Constraints for this framework

• α. = Σi αi= 0

• β. = Σjβj = 0

• (αβ).j = Σi (αβ)ij = 0 for all j

• (αβ)i. = Σj (αβ)ij = 0 for all i

Page 31: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Estimates for factor effects model

....j...i.ijij

....j.j.....ii

.j..j..i.i

ijk ijk...

YYYY)ˆ(

YYˆ and YYˆ

Yˆ and Yˆ

abn/)Y(Yˆ

Page 32: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

SS for ANOVA Table22

.. ...ijk

2jijk

2ijijk

2ijk .ijk

2...ijk

ˆSSA (Y Y )

ˆSSB

SSAB ( )

SSE (Y Y )

SSTO (Y Y )

ˆ

i iijk

ij

ijk

Page 33: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

df for ANOVA Table

• dfA = a-1

• dfB = b-1

• dfAB = (a-1)(b-1)

• dfE = ab(n-1)

• dfT = abn-1 = nT-1

Page 34: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

MS for ANOVA Table

• MSA = SSA/dfA

• MSB = SSB/dfB

• MSAB = SSAB/dfAB

• MSE = SSE/dfE

• MST = SST/dfT

Page 35: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Hypotheses for two-way ANOVA

• H0A: αi = 0 for all i

• H1A: αi ≠ 0 for at least one i

• H0B: βj = 0 for all j

• H1B: βj ≠ 0 for at least one j

• H0AB: (αβ)ij = 0 for all (i,j)

• H1AB: (αβ)ij ≠ 0 for at least one (i,j)

Page 36: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

F statistics

• H0A is tested by FA = MSA/MSE; df=dfA, dfE

• H0B is tested by FB = MSB/MSE; df=dfB,

dfE

• H0AB is tested by FAB = MSAB/MSE;

df=dfAB, dfE

Page 37: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

ANOVA Table

Source df SS MS F A a-1 SSA MSA MSA/MSE B b-1 SSB MSB MSB/MSE AB (a-1)(b-1) SSAB MSAB MSAB/MSEError ab(n-1) SSE MSE _ Total abn-1 SSTO MST

Page 38: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

P-values

• P-values are calculated using the F(dfNumerator, dfDenominator) distributions

• If P ≤ 0.05 we conclude that the effect being tested is statistically significant

Page 39: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

KNNL Example• NKNW p 833• Y is the number of cases of bread sold• A is the height of the shelf display, a=3

levels: bottom, middle, top• B is the width of the shelf display, b=2:

regular, wide• n=2 stores for each of the 3x2

treatment combinations

Page 40: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

PROC GLM

proc glm data=a1; class height width; model sales= height width height*width;run;

Page 41: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Output

Note that there are 6 cells inthis design…(6-1)df for model

Source DFSum of

SquaresMean

Square F Value Pr > FModel 5 1580.0000 316.000000 30.58 0.0003Error 6 62.000000 10.333333Corrected Total

11 1642.0000

Page 42: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Output ANOVA

Note Type I and Type III Analyses are the same becausenij is constant

Source DF Type III SS Mean Square F Value Pr > Fheight 2 1544.00000 772.000000 74.71 <.0001

width 1 12.000000 12.000000 1.16 0.3226

height*width 2 24.000000 12.000000 1.16 0.3747

Page 43: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Other output

R-Square Coeff Var Root MSE sales Mean0.962241 6.303040 3.214550 51.00000

Commonly do not consider R-sq when performing ANOVA…interested more in difference in levels rather than the models predictive ability

Page 44: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Results

• The main effect of height is statistically significant (F=74.71; df=2,6; P<0.0001)

• The main effect of width is not statistically significant (F=1.16; df=1,6; P=0.32)

• The interaction between height and width is not statistically significant (F=1.16; df=2,6; P=0.37)

Page 45: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Interpretation

• The height of the display affects sales of bread

• The width of the display has no apparent effect

• The effect of the height of the display is similar for both the regular and the wide widths

Page 46: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Plot of the means

Page 47: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Additional analyses

• We will need to do additional analyses to explain the height effect (factor A)

• There were three levels: bottom, middle and top

• We could rerun the data with a one-way anova and use the methods we learned in the previous chapters

• Use means statement with lines

Page 48: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Run Proc GLM

proc glm data=a1; class height width; model sales= height width height*width; means height / tukey lines; lsmeans height / adjust=tukey;run;

Page 49: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

MEANS OutputAlpha 0.05Error Degrees of Freedom 6Error Mean Square 10.33333Critical Value of Studentized Range 4.33920Minimum Significant Difference 6.9743

Means with the same letter are not significantly different.

Tukey Grouping Mean N heightA 67.000 4 2

B 44.000 4 1BB 42.000 4 3

Page 50: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

LSMEANS Outputheight sales LSMEAN

LSMEAN Number

1 44.0000000 1

2 67.0000000 2

3 42.0000000 3

Least Squares Means for effect heightPr > |t| for H0: LSMean(i)=LSMean(j)

Dependent Variable: salesi/j 1 2 31 0.0001 0.67142 0.0001 <.00013 0.6714 <.0001

Page 51: Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Last slide

• We went over Chapter 19

• We used program topic24.sas to generate the output for today.