The Completely Randomized Design (§8.3) Introduction to the simplest experimental design - the Completely Randomized Design. Introduce a statistical model for the observations in a completely randomized design.


The Completely Randomized Design (§8.3)

• Introduction to the simplest experimental design - the Completely Randomized Design.

• Introduce a statistical model for the observations in a completely randomized design.


Completely Randomized Design

Two different names for the same design:

• Experimental study - Completely randomized design (CRD)
• Sampling study - One-way classification design

Assumptions:

• Independent random samples (the response from one experimental unit does not affect responses from other experimental units).
• Responses follow a normal distribution.
• Common true variance, $\sigma^2$, across all groups/treatments.
• True mean for population $i$ is $\mu_i$.
• Interest is in comparing means.

Randomization: The $t$ treatments are randomly allocated to the experimental units in such a way that $n_1$ units receive treatment 1, $n_2$ receive treatment 2, etc.
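The randomization step can be sketched in Python. This is only an illustration and not part of the original slides: the function name `randomize_crd`, the unit labels 1 to 25, the group sizes (taken from the concrete example later in the deck) and the fixed seed are all assumptions made for the example.

```python
import random


def randomize_crd(units, group_sizes, seed=None):
    """Randomly allocate treatments to experimental units.

    group_sizes maps each treatment label to n_i, the number of units
    that should receive that treatment.
    """
    if sum(group_sizes.values()) != len(units):
        raise ValueError("group sizes must add up to the number of units")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # a random ordering of the units
    allocation = {}
    start = 0
    for treatment, n_i in group_sizes.items():
        # The next n_i units in the random ordering receive this treatment.
        for unit in shuffled[start:start + n_i]:
            allocation[unit] = treatment
        start += n_i
    return allocation


# Hypothetical example: 25 units, 5 per treatment (percent sand).
units = list(range(1, 26))
sizes = {15: 5, 20: 5, 25: 5, 30: 5, 35: 5}
plan = randomize_crd(units, sizes, seed=1)
```

Shuffling the whole list of units once and then cutting it into blocks of size $n_1, n_2, \ldots$ gives every possible allocation with those group sizes the same probability.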


AOV Model of Responses/Effects

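Model:

$y_{ij} = \mu_i + \varepsilon_{ij} = \mu + \alpha_i + \varepsilon_{ij}$

where $\mu$ is the overall mean, $\alpha_i$ is the effect due to population $i$, and $\varepsilon_{ij} \sim N(0, \sigma^2)$ is the random error.

Expected response:

$E(y_{ij}) = \mu_i = \mu + \alpha_i$

Hypotheses:

$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$

$H_a:$ At least one of the $\alpha_i$ differs from 0

All $\alpha_i = 0$ implies all groups have the same mean ($\mu$).

Requirement for $\mu$ to be the overall mean:

$\sum_{i=1}^{t} \alpha_i = 0$

Estimate:

$\hat{\mu} + \hat{\alpha}_i = \bar{y}_{i\cdot}$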


Example

A manufacturer of concrete bridge supports is interested in determining the effect of varying the sand content on the strength of the supports. Five supports are made for each of five different amounts of sand in the concrete mix and each is tested for compression resistance.

Percent Sand

  15    20    25    30    35
   7    17    14    20     7
   7    12    18    24    10
  10    11    18    22    11
  15    18    19    19    15
   9    19    19    23    11


Basic Statistics and AOV Effects

Percent Sand    15      20      25      30      35
                 7      17      14      20       7
                 7      12      18      24      10
                10      11      18      22      11
                15      18      19      19      15
                 9      19      19      23      11
MEAN           9.6    15.4    17.6    21.6    10.8     Overall Mean = 15
EFFECT        -5.4     0.4     2.6     6.6    -4.2     Sum of Effects = 0

$\hat{\alpha}_i = \bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}$
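The MEAN and EFFECT rows can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are chosen for the example and are not from the slides.

```python
from statistics import mean

# Compression resistance for each percent-sand treatment (from the example).
data = {
    15: [7, 7, 10, 15, 9],
    20: [17, 12, 11, 18, 19],
    25: [14, 18, 18, 19, 19],
    30: [20, 24, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

# Group means (y-bar_i.) and the overall mean (y-bar_..).
group_means = {trt: mean(ys) for trt, ys in data.items()}
overall_mean = mean(y for ys in data.values() for y in ys)

# AOV effects: alpha-hat_i = y-bar_i. - y-bar_..
effects = {trt: m - overall_mean for trt, m in group_means.items()}

for trt in data:
    print(trt, round(group_means[trt], 1), round(effects[trt], 1))
print("overall mean:", overall_mean)
print("sum of effects:", round(sum(effects.values()), 10))
```

Because every group has the same number of observations, the effects defined this way sum to zero, which is the constraint that makes $\mu$ the overall mean.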


Decomposing the Data

Treatment   Resistance   Overall Mean   Effect   Residual
   15            7            15         -5.4      -2.6
   15            7            15         -5.4      -2.6
   15           10            15         -5.4       0.4
   15           15            15         -5.4       5.4
   15            9            15         -5.4      -0.6
   20           17            15          0.4       1.6
   20           12            15          0.4      -3.4
   20           11            15          0.4      -4.4
   20           18            15          0.4       2.6
   20           19            15          0.4       3.6
   25           14            15          2.6      -3.6
   25           18            15          2.6       0.4
   25           18            15          2.6       0.4
   25           19            15          2.6       1.4
   25           19            15          2.6       1.4
   30           20            15          6.6      -1.6
   30           24            15          6.6       2.4
   30           22            15          6.6       0.4
   30           19            15          6.6      -2.6
   30           23            15          6.6       1.4
   35            7            15         -4.2      -3.8
   35           10            15         -4.2      -0.8
   35           11            15         -4.2       0.2
   35           15            15         -4.2       4.2
   35           11            15         -4.2       0.2

SSQ           6275          5625        486.4     163.6     (sum of squares of each column)

$y_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\varepsilon}_{ij}$

$\hat{\mu}$ = overall mean

$\hat{\alpha}_i = \hat{\mu}_i - \hat{\mu}$ = group $i$ effect

$\hat{\varepsilon}_{ij} = y_{ij} - \hat{\mu} - \hat{\alpha}_i$ = residual

(Note that the sum of the residuals for each treatment is zero.)
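The row-by-row decomposition and the SSQ line can be checked with a short Python sketch. The tuple layout and variable names are assumptions made for the example.

```python
from statistics import mean

data = {
    15: [7, 7, 10, 15, 9],
    20: [17, 12, 11, 18, 19],
    25: [14, 18, 18, 19, 19],
    30: [20, 24, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}
overall = mean(y for ys in data.values() for y in ys)

# One row per observation: (treatment, y, overall mean, effect, residual).
rows = []
for trt, ys in data.items():
    effect = mean(ys) - overall
    for y in ys:
        residual = y - overall - effect
        rows.append((trt, y, overall, effect, residual))

# Sum of squares of each column (the SSQ line of the table).
ssq = {
    "resistance": sum(r[1] ** 2 for r in rows),
    "overall_mean": sum(r[2] ** 2 for r in rows),
    "effect": sum(r[3] ** 2 for r in rows),
    "residual": sum(r[4] ** 2 for r in rows),
}

# Residuals add to zero within every treatment group.
residual_sums = {
    trt: sum(r[4] for r in rows if r[0] == trt) for trt in data
}

print({k: round(v, 1) for k, v in ssq.items()})
print({k: round(v, 10) for k, v in residual_sums.items()})
```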


Decomposing Sums of Squares


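Subtracting each column sum of squares in turn from the raw sum of squares:

6275.0 - 5625.0 = 650.0     (TSS)
 650.0 -  486.4 = 163.6     (TSS - SSB = SSW)
 163.6 -  163.6 =   0.0

so that TSS = SSB + SSW:

$\sum_i \sum_j y_{ij}^2 - n\bar{y}_{\cdot\cdot}^2 = \sum_i n_i(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_{i\cdot})^2$

The first term on the right is SSB and the second is SSW.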
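As a quick numerical check of the identity TSS = SSB + SSW, the three sums of squares can be computed directly from their definitions. This is a sketch with illustrative variable names, not code from the slides.

```python
from statistics import mean

data = {
    15: [7, 7, 10, 15, 9],
    20: [17, 12, 11, 18, 19],
    25: [14, 18, 18, 19, 19],
    30: [20, 24, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}
all_y = [y for ys in data.values() for y in ys]
grand_mean = mean(all_y)

# Total sum of squares about the overall mean.
tss = sum((y - grand_mean) ** 2 for y in all_y)

# Between-group sum of squares: n_i * (group mean - overall mean)^2.
ssb = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in data.values())

# Within-group sum of squares: deviations from each group's own mean.
ssw = sum((y - mean(ys)) ** 2 for ys in data.values() for y in ys)

print(round(tss, 1), round(ssb, 1), round(ssw, 1))
```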


[Figure: Compression Resistance. Resistance (10,000 psi) plotted against Percent Sand.]


[Figure: Compression Resistance. Resistance (10,000 psi) plotted against Percent Sand.]

Best Treatment? Is 30% significantly better than 25%?


Estimation

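Model: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$

$\hat{\mu}_i = \hat{\mu} + \hat{\alpha}_i = \bar{y}_{i\cdot}$

$\hat{\mu} = \bar{y}_{\cdot\cdot} = \frac{1}{n} \sum_{i=1}^{t} \sum_{j=1}^{n_i} y_{ij}$, where $n = \sum_{i=1}^{t} n_i$

$\hat{\alpha}_i = \bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}$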


Reference Group/Cell Model

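Model:

$y_{ij} = \mu_t + \alpha_i + \varepsilon_{ij}$, for $i = 1, 2, \ldots, t-1$

$y_{tj} = \mu_t + \varepsilon_{tj}$, for $i = t$

where $\mu_t$ is the reference group mean, $\alpha_i$ is the effect due to population $i$, and $\varepsilon_{ij} \sim N(0, \sigma^2)$ is the random error.

Mean for the last group ($i = t$) is $\mu_t$.

Mean for the first group ($i = 1$) is $\mu_t + \alpha_1$.

Thus, $\alpha_1$ is the difference between the target group mean and the mean of the reference group (cell). Any group can be the reference group.

$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_{t-1} = 0$

$H_a:$ At least one of the $\alpha_i$ differs from 0

All $\alpha_i = 0$ implies all groups have the same mean.

This is the model SAS uses.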


Basic Statistics and Reference Cell Effects

Percent Sand    15      20      25      30      35
                 7      17      14      20       7
                 7      12      18      24      10
                10      11      18      22      11
                15      18      19      19      15
                 9      19      19      23      11
MEAN           9.6    15.4    17.6    21.6    10.8     Reference Cell Mean = 10.8
EFFECT        -1.2     4.6     6.8    10.8     0       Sum of Effects = 21

$\hat{\alpha}_i = \bar{y}_{i\cdot} - \bar{y}_{t\cdot}$

$\sum_{i=1}^{t} \hat{\alpha}_i \neq 0$
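The reference cell effects are simply differences from the mean of the chosen reference group. The sketch below assumes, as in the table, that the last treatment (35% sand) is the reference; the variable names are illustrative.

```python
from statistics import mean

data = {
    15: [7, 7, 10, 15, 9],
    20: [17, 12, 11, 18, 19],
    25: [14, 18, 18, 19, 19],
    30: [20, 24, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

reference = 35  # assumed reference cell: the last treatment level
reference_mean = mean(data[reference])

# alpha-hat_i = y-bar_i. - y-bar_t. ; the reference group's own effect is 0.
ref_effects = {trt: mean(ys) - reference_mean for trt, ys in data.items()}
sum_of_effects = sum(ref_effects.values())

print(round(reference_mean, 1))
print({trt: round(e, 1) for trt, e in ref_effects.items()})
print(round(sum_of_effects, 1))
```

Unlike the effects measured from the overall mean, these effects do not sum to zero, which matters when the sums of squares are decomposed.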


Reference Cell Decomposition

Treatment   Resistance   Group Mean   Reference Cell Mean   Effect   Residual
   15            7           9.6             10.8            -1.2      -2.6
   15            7           9.6             10.8            -1.2      -2.6
   15           10           9.6             10.8            -1.2       0.4
   15           15           9.6             10.8            -1.2       5.4
   15            9           9.6             10.8            -1.2      -0.6
   20           17          15.4             10.8             4.6       1.6
   20           12          15.4             10.8             4.6      -3.4
   20           11          15.4             10.8             4.6      -4.4
   20           18          15.4             10.8             4.6       2.6
   20           19          15.4             10.8             4.6       3.6
   25           14          17.6             10.8             6.8      -3.6
   25           18          17.6             10.8             6.8       0.4
   25           18          17.6             10.8             6.8       0.4
   25           19          17.6             10.8             6.8       1.4
   25           19          17.6             10.8             6.8       1.4
   30           20          21.6             10.8            10.8      -1.6
   30           24          21.6             10.8            10.8       2.4
   30           22          21.6             10.8            10.8       0.4
   30           19          21.6             10.8            10.8      -2.6
   30           23          21.6             10.8            10.8       1.4
   35            7          10.8             10.8             0        -3.8
   35           10          10.8             10.8             0        -0.8
   35           11          10.8             10.8             0         0.2
   35           15          10.8             10.8             0         4.2
   35           11          10.8             10.8             0         0.2

SSQ           6275                           2916           927.4     163.6

Note: The sums of squares do not add up, because the sum of the $\hat{\alpha}_i$ is not zero.

6275.0 - 2916.0 = 3359.0
3359.0 -  927.4 = 2431.6
2431.6 -  163.6 = 2268.0


Decomposing Sums of Squares

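$\sum_{i=1}^{t} \sum_{j=1}^{n_i} y_{ij}^2 = \sum_{i=1}^{t} \sum_{j=1}^{n_i} (\hat{\mu}_t + \hat{\alpha}_i + \hat{\varepsilon}_{ij})^2$

with $\sum_{j=1}^{n_i} \hat{\varepsilon}_{ij} = 0$ for all $i$.

Expanding the square:

$(\hat{\mu}_t + \hat{\alpha}_i + \hat{\varepsilon}_{ij})^2 = \hat{\mu}_t^2 + 2\hat{\mu}_t\hat{\alpha}_i + \hat{\alpha}_i^2 + 2\hat{\mu}_t\hat{\varepsilon}_{ij} + 2\hat{\alpha}_i\hat{\varepsilon}_{ij} + \hat{\varepsilon}_{ij}^2$

Summing over $i$ and $j$, the cross terms that involve $\hat{\varepsilon}_{ij}$ vanish because the residuals sum to zero within each group, leaving

$\sum_{i=1}^{t} \sum_{j=1}^{n_i} y_{ij}^2 = n\hat{\mu}_t^2 + \sum_{i=1}^{t} n_i\hat{\alpha}_i^2 + \sum_{i=1}^{t} \sum_{j=1}^{n_i} \hat{\varepsilon}_{ij}^2 + 2\hat{\mu}_t \sum_{i=1}^{t} n_i\hat{\alpha}_i$

6275 = 2916.0 + 927.4 + 163.6 + 2268.0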
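The four terms of this decomposition, including the cross term that appears because the reference cell effects do not sum to zero, can be verified numerically. This is a sketch with illustrative names; the last group (35% sand) is assumed to be the reference cell.

```python
from statistics import mean

data = {
    15: [7, 7, 10, 15, 9],
    20: [17, 12, 11, 18, 19],
    25: [14, 18, 18, 19, 19],
    30: [20, 24, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}
n_total = sum(len(ys) for ys in data.values())
mu_t = mean(data[35])  # reference cell mean (assumed: last group)

raw_ss = sum(y ** 2 for ys in data.values() for y in ys)

# n * mu_t^2
term_reference = n_total * mu_t ** 2
# sum_i n_i * alpha_i^2, with alpha_i = group mean - reference mean
term_effects = sum(len(ys) * (mean(ys) - mu_t) ** 2 for ys in data.values())
# sum of squared residuals
term_residual = sum((y - mean(ys)) ** 2 for ys in data.values() for y in ys)
# cross term: 2 * mu_t * sum_i n_i * alpha_i
term_cross = 2 * mu_t * sum(len(ys) * (mean(ys) - mu_t) for ys in data.values())

print(raw_ss, round(term_reference, 1), round(term_effects, 1),
      round(term_residual, 1), round(term_cross, 1))
```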


Reference Cell Model

[Figure: Compression Resistance. Resistance (10,000 psi) plotted against Percent Sand.]


SAS Program

options ls=78 ps=49 nodate;

/* One (sand, resistance) pair per support; @@ reads several pairs per line */
data stress;
  input sand resistance @@;
  datalines;
15 7 15 7 15 10 15 15 15 9
20 17 20 12 20 11 20 18 20 19
25 14 25 18 25 18 25 19 25 19
30 20 30 24 30 22 30 19 30 23
35 7 35 10 35 11 35 15 35 11
;

/* One-way AOV; SOLUTION prints the reference cell parameter estimates */
proc glm data=stress;
  class sand;
  model resistance = sand / solution;
  title1 'Compression resistance in concrete beams as';
  title2 ' a function of percent sand in the mix';
run;


SAS Output (1)

Compression resistance in concrete beams as
 a function of percent sand in the mix

The GLM Procedure

Dependent Variable: resistance

                                Sum of
Source             DF          Squares     Mean Square    F Value    Pr > F
Model               4      486.4000000     121.6000000      14.87    <.0001
Error              20      163.6000000       8.1800000
Corrected Total    24      650.0000000

R-Square     Coeff Var      Root MSE    resistance Mean
0.748308      19.06713      2.860070           15.00000
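The derived quantities in this output follow from the two sums of squares and their degrees of freedom. The sketch below recomputes them in Python; the p-value is left out because the standard library has no F distribution, and the variable names are illustrative.

```python
import math

# Values taken from the AOV table above.
ss_model, ss_error = 486.4, 163.6
t, n = 5, 25            # number of treatments, total observations
grand_mean = 15.0

df_model = t - 1        # 4
df_error = n - t        # 20

ms_model = ss_model / df_model          # mean square for treatments
ms_error = ss_error / df_error          # mean square error (MSE)
f_value = ms_model / ms_error           # F statistic

r_square = ss_model / (ss_model + ss_error)
root_mse = math.sqrt(ms_error)
coeff_var = 100 * root_mse / grand_mean  # Root MSE as a percent of the mean

print(round(f_value, 2), round(r_square, 6),
      round(root_mse, 5), round(coeff_var, 4))
```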


SAS Output (2)

Source    DF      Type I SS    Mean Square    F Value    Pr > F
sand       4    486.4000000    121.6000000      14.87    <.0001

Source    DF    Type III SS    Mean Square    F Value    Pr > F
sand       4    486.4000000    121.6000000      14.87    <.0001

                                   Standard
Parameter         Estimate            Error    t Value    Pr > |t|
Intercept      10.80000000 B     1.27906216       8.44      <.0001
sand 15        -1.20000000 B     1.80886705      -0.66      0.5146
sand 20         4.60000000 B     1.80886705       2.54      0.0194
sand 25         6.80000000 B     1.80886705       3.76      0.0012
sand 30        10.80000000 B     1.80886705       5.97      <.0001
sand 35         0.00000000 B      .                .         .

NOTE: The X'X matrix has been found to be singular, and a generalized inverse
      was used to solve the normal equations. Terms whose estimates are
      followed by the letter 'B' are not uniquely estimable.
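The standard errors and t values in the parameter table can be recovered from the MSE and the group size. The formulas used here are the usual ones for a group mean and for a difference of two group means with a pooled variance; that SAS computes them this way is an assumption, although the numbers agree with the listing. Names are illustrative.

```python
import math

mse = 8.18           # mean square error from the AOV table
n_per_group = 5      # supports per percent-sand level

# Estimates copied from the SAS parameter table (reference cell: sand 35).
estimates = {
    "Intercept": 10.8,
    "sand 15": -1.2,
    "sand 20": 4.6,
    "sand 25": 6.8,
    "sand 30": 10.8,
}

# Intercept = reference group mean: SE = sqrt(MSE / n).
se_intercept = math.sqrt(mse / n_per_group)
# Effect = difference of two group means: SE = sqrt(MSE * (1/n + 1/n)).
se_effect = math.sqrt(mse * (1 / n_per_group + 1 / n_per_group))

t_values = {
    name: est / (se_intercept if name == "Intercept" else se_effect)
    for name, est in estimates.items()
}

print(round(se_intercept, 8), round(se_effect, 8))
print({name: round(tv, 2) for name, tv in t_values.items()})
```

Each t value tests whether that group differs from the reference group (35% sand), not whether it differs from the overall mean.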


Minitab

One-way ANOVA: Resist versus Sand

Analysis of Variance for Resist
Source     DF        SS        MS        F        P
Sand        4    486.40    121.60    14.87    0.000
Error      20    163.60      8.18
Total      24    650.00

                           Individual 95% CIs For Mean
                           Based on Pooled StDev
Level   N    Mean   StDev  -------+---------+---------+---------
15      5   9.600   3.286   (----*-----)
20      5  15.400   3.647              (-----*----)
25      5  17.600   2.074                   (----*-----)
30      5  21.600   2.074                           (----*-----)
35      5  10.800   2.864     (-----*----)
                           -------+---------+---------+---------
Pooled StDev = 2.860            10.0      15.0      20.0
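The pooled standard deviation and the intervals drawn in this display can be approximated as follows. This is a sketch under stated assumptions: the critical value 2.086 is the tabulated 97.5th percentile of the t distribution with 20 degrees of freedom (hard-coded because the Python standard library has no t quantile function), and the interval formula is the usual mean plus or minus t times pooled StDev over the square root of the group size.

```python
import math
from statistics import mean, variance

data = {
    15: [7, 7, 10, 15, 9],
    20: [17, 12, 11, 18, 19],
    25: [14, 18, 18, 19, 19],
    30: [20, 24, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

T_CRIT = 2.086  # assumed: tabulated t quantile, 0.975, 20 df

# Pool the within-group variances, weighting by degrees of freedom.
df_error = sum(len(ys) - 1 for ys in data.values())
pooled_var = sum((len(ys) - 1) * variance(ys) for ys in data.values()) / df_error
pooled_sd = math.sqrt(pooled_var)

# 95% interval for each group mean, based on the pooled StDev.
intervals = {}
for trt, ys in data.items():
    half_width = T_CRIT * pooled_sd / math.sqrt(len(ys))
    intervals[trt] = (mean(ys) - half_width, mean(ys) + half_width)

print(round(pooled_sd, 3))
for trt, (low, high) in intervals.items():
    print(trt, round(low, 2), round(high, 2))
```

Because all groups share the pooled StDev and the same size, every interval has the same width.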


Minitab

Stat > ANOVA > One-Way

Multiple comparisons (later)


Minitab Dot Plot


SPSS AOV Table

ANOVA: RESIST

                  Sum of Squares    df    Mean Square         F    Sig.
Between Groups           486.400     4        121.600    14.866    .000
Within Groups            163.600    20          8.180
Total                    650.000    24


SPSS Descriptives

Descriptives: RESIST

                                                            95% Confidence Interval for Mean
                         N      Mean   Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
15.00                    5    9.6000          3.28634      1.46969        5.5195       13.6805      7.00     15.00
20.00                    5   15.4000          3.64692      1.63095       10.8718       19.9282     11.00     19.00
25.00                    5   17.6000          2.07364       .92736       15.0252       20.1748     14.00     19.00
30.00                    5   21.6000          2.07364       .92736       19.0252       24.1748     19.00     24.00
35.00                    5   10.8000          2.86356      1.28062        7.2444       14.3556      7.00     15.00
Total                   25   15.0000          5.20416      1.04083       12.8518       17.1482      7.00     24.00
Model  Fixed Effects                          2.86007       .57201       13.8068       16.1932
       Random Effects                                      2.20545        8.8767       21.1233

Between-Component Variance (Random Effects): 22.68400
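The "Model" rows of this table appear to follow from the two mean squares in the AOV table. The sketch below uses the standard method-of-moments formulas for a balanced one-way layout; that SPSS uses exactly these formulas is an assumption, although the results agree with the printed values. Variable names are illustrative.

```python
import math

ms_between = 121.6   # mean square between groups
ms_within = 8.18     # mean square within groups
n_per_group = 5
n_total = 25

# Between-component variance: (MSB - MSW) / n for a balanced design.
between_component_var = (ms_between - ms_within) / n_per_group

# Standard error of the overall mean under each model.
se_fixed = math.sqrt(ms_within / n_total)    # uses within-group variation only
se_random = math.sqrt(ms_between / n_total)  # also reflects group-to-group variation

print(round(between_component_var, 3), round(se_fixed, 5), round(se_random, 5))
```

The random effects standard error is larger because, under that model, the five group means are themselves a sample from a population of possible treatments.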


CRD Analysis in R

> resist <- c(7,7,10,15,9,17,12,11,18,19,14, …,19,23,7,10,11,15,11)
> sand <- factor(rep(seq(15,35,5),rep(5,5)))
> myfit <- aov(resist~sand)
> summary(myfit)
            Df Sum Sq Mean Sq F value    Pr(>F)
sand         4 486.40  121.60  14.866 8.655e-06 ***
Residuals   20 163.60    8.18
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> coef(myfit)
(Intercept)      sand20      sand25      sand30      sand35
        9.6         5.8         8.0        12.0         1.2

By default, the R functions aov() and lm() use the first cell mean as the reference!
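The difference between the R coefficients (first cell as reference) and the SAS estimates (last cell as reference) is only a change of baseline. A small Python sketch makes this concrete; the helper name `reference_cell_coefs` is hypothetical and the group means are those of the example.

```python
# Group means from the example, keyed by percent sand.
group_means = {15: 9.6, 20: 15.4, 25: 17.6, 30: 21.6, 35: 10.8}


def reference_cell_coefs(means, reference):
    """Return (intercept, effects) for a chosen reference cell.

    The intercept is the reference group's mean; each effect is the
    difference between a group's mean and the reference mean.
    """
    intercept = means[reference]
    effects = {
        trt: m - intercept for trt, m in means.items() if trt != reference
    }
    return intercept, effects


# First cell as reference (the R default shown above).
r_intercept, r_effects = reference_cell_coefs(group_means, 15)
# Last cell as reference (the parametrization in the SAS output).
sas_intercept, sas_effects = reference_cell_coefs(group_means, 35)

print(r_intercept, {k: round(v, 1) for k, v in r_effects.items()})
print(sas_intercept, {k: round(v, 1) for k, v in sas_effects.items()})
```

Either way, intercept plus effect returns the same fitted group mean, so the fitted values and the F-test are unchanged.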


Fixed Effects

Normally, the “effect” of a particular treatment is assumed to be a constant value ($\alpha_i$) added to the response of all units in the group receiving the treatment.

If the treatments are well defined, easily replicable and are expected to produce the same effect on average in each replicate, we have a fixed set of treatments and the AOV model is said to describe a fixed effects model.

Examples:

• A scientist develops 3 new fungicides. Her interest is in these fungicides only.
• The impact of 4 specific soil types on plant growth is of interest.
• Three particular milling machines are being compared.
• Four particular lakes are of interest for their weed biomass densities.
• Three tests for assessing developmental learning are being compared.


Random Effects

If the treatments cannot be assumed to be from a prespecified or known set of treatments, they are assumed to be a random sample from some larger population of potential treatments. In this case, the AOV model is called a random effects model and the $\alpha_i$ are called random effects.

Examples:

• A scientist is interested in how fungicides work. Ten (10) fungicides are selected (at random) to represent the population of all fungicides in the research (plots as replicates).
• Four soil subgroups are selected for examining plant growth (pots as replicates).
• Three milling machines selected at random from the production line are compared (runs as replicates).
• 16 lakes selected at random are measured for their weed biomass densities (water samples as replicates).
• A standard test for development is given to 20 middle school classes selected at random from the over 200 available among all middle schools in the county (students as replicates).

In each case, we assume the values for the effects would change if our sample had changed. Inference is directed not to answering “which treatment is different from which other treatment?” but to the issue of “is the variability among treatments significantly greater than the residual variability?”.


Closing Comments on CRD

Even though we have introduced several variations on the same basic model for defining “effects”, the final F-test for the hypothesis of overall equal group means is the same one developed as part of the analysis of variance. There may be computational advantages to using one formulation of the model over another, but the choice has no effect on the hypothesis test. We will see this in the next Section.

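Reference cell model:

$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_{t-1} = 0$

$H_a:$ At least one of the $\alpha_i$ differs from 0

Effects model:

$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$

$H_a:$ At least one of the $\alpha_i$ differs from 0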

For simple one-factor designs, the F-test is the same whether the treatment effect is considered random or fixed; only the interpretation differs.