Upload
rahsarah
View
12
Download
0
Embed Size (px)
DESCRIPTION
nested anova
Citation preview
Nested (hierarchical) ANOVA
what it is
how you do it
variance components
power and optimal allocation of replication
Nested ANOVA
designs with subsamples nested within replicates if the nesting is not acknowledged, these designs are
pseudoreplicated
nesting is usually spatial, but can be temporal
variation is partitioned among hierarchical levels
all levels of one factor are not present in all levels of another factor
some levels are uniquely present within some levels of another factor, but not other levels
nested factors are usually random factors nested fixed factors require justification
What is a nested factor?
Factor A Factor B
A 1
A 2
A 3
B 4
B 5
B 6
C 7
C 8
C 9
Factor A Factor B
A 1
A 2
A 3
B 1
B 2
B 3
C 1
C 2
C 3
Nested B(A) Not Nested: A x B
creeks (tributaries) are unique to each river
multiple samples of a single tissue type within a rat
subsamples in time (if sampled w/o replacement) can only be sampled at one time and not another
replicates are always nested within treatments -- but we dont consider this nesting when we construct the
ANOVA model
Examples of Nesting
factors A & B
factor A with p groups or levels
factor B with q groups or levels within each level of A
nested design: different (randomly chosen) levels of Factor B in each
level of Factor A
often one or more levels of subsampling
Two factor nested ANOVA design
Factor A - fixed sea urchin density four levels:
100% of original (control) 66% 33% 0%
Factor B - random randomly chosen patches 4 within each treatment
n = 5 quadrats / patch
Example: sea urchin grazing on reefs Andrew & Underwood (1997)
effect of sea urchin density on the % cover of filamentous algae
Density: 100% 66% etc.
Patch: 1 2 3 4 5 6 7 8
Reps: n = 5 in each of 16 cells
p = 4 densities, q = 4 patches
layout: sea urchin grazing on reefs
yijk = + i + j(i) + ijk where
overall mean i effect of factor A (i - ) i(i) effect of factor B within each level of A
(ij - i) ijk unexplained variation (error term) -
variation within each cell
(% cover algae)ijk = + (sea urchin density)i + (patch within sea urchin density)j(i) + ijk
Linear model
Main effect: effect of factor A i.e., variation among factor A group means
Nested (random) effect: effect of factor B within each level of factor A variation among means of factor B within each level of A
Effects
Factor A:
H0: no difference among means of factor A (= no difference among means of urchin density treatments)
1 = 2 = = i =
is equivalent to
H0: no main effect of factor A (no effect of urchin density): 1 = 2 = = i = 0 i = (i - ) = 0
Null hypotheses
H0: Factor A: No difference in mean amount of filamentous algae
between the four sea urchin density treatments
H0: Factor B: No difference in the mean amount of filamentous algae
between all possible patches in any of the treatments
Factor B(A)
H0: no difference among means of factor B within any level of factor A
(no difference among patches in mean filamentous algae
cover within any urchin density treatment)
11 = 12 = = 1j 21 = 22 = = 2j etc.
H0: no variance among levels of nested random factor B within any level of factor A
(no variance among patches within each density treatment):
2 = 0
Null hypotheses
SSTotal
SSA + SSB(A) + SSResidual
SSA variation among A means
SSB(A) variation among B means within each level of A
SSResidual variation among replicates within each cell (each B(A))
Partitioning total variation
Nested ANOVA table
Source SS df MS
Factor A SSA p-1 SSA/(p-1)
Factor B(A) SSB(A) p(q-1) SSB(A)/(p(q-1))
Residual SSResidual pq(n-1) SSResidual/(pq(n-1))
A fixed, B random:
MSA
MSB(A)
MSResidual
Expected Mean Squares
if no main effect of factor A: H0: 1 = 2 = i = (i = 0) is
true
F-ratio: MSA / MSB(A) 1
if no effect of nested random factor B(A):
H0: 2 = 0 is true F-ratio: MSB(A) / MSResidual 1
MSA
MSB(A)
MSResidual
Testing null hypotheses estimate parameters
H0: true; no variation among A H0: false; is variation among A
patches dont differ patches do differ dont differ patches do differ
A1
Ai
B1
B2
Bj
B1
B2
Bj
Bjs=0 Bjs0 Ais=0
Bjs=0 Bjs0 Ais0
4 possible outcomes
doesnt matter whether nested factor varies or not, you can look for differences among treatments
e.g., compared to the 1-way design, the nested design un-confounds subsamples from true replicates
nested designs separate confounded additive factors
Treatment effects in nested designs
no effect of urchin density on percentage cover of filamentous
algae
filamentous algal cover varies significantly from patch to patch
about 50% of the variance in percentage cover of algae is
explained by differences between patches
remaining 50% is explained by differences at the scale of
quadrats within patches
Results: Andrew & Underwood 1993 Source df MS F p var. comp. %
Density 3 4810 2.72 0.09 - -
Patches(Density) 12 1770 5.93
Main effect: planned contrasts & trend analyses as part of design unplanned multiple comparisons if main F-ratio test
significant
Nested effect: usually random factor usually of little interest in further tests often can provide information on the characteristic
spatial signal of a population
Additional Tests
what is the effect of schools on standardized tests? (i.e., do scores differ among schools?)
is the effect of school driven in part by differences in teachers?
Another worked example
Data: three schools,
two teachers at each
schools, two scores per
teacher
True data matrix,
accounts for teachers
not being the same at
each school
Data format for statistics
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
SCHOOL$ 156.50000 2 78.25000 11.17857 0.00947
TEACHER(SCHOOL$) 567.50000 3 189.16667 27.02381 0.00070
Error 42.00000 6 7.00000
What does this mean???
ANOVA output
SYSTAT and other stats software generally will not
automatically construct the
F ratio correctly
F-ratio is: MSschool / MSteacher(school)
Big effect of teacher!
What about effect of school?
Test of Hypothesis
Source SS df MS F P
Hypothesis 156.50000 2 78.25000 0.41366 0.69397
Error 567.50000 3 189.16667
Analysis of Variance
Source Sum-of-Squares df Mean-Square F-ratio P
SCHOOL$ 156.50000 2 78.25000 11.17857 0.00947
TEACHER(SCHOOL$) 567.50000 3 189.16667 27.02381 0.00070
Error 42.00000 6 7.00000
accounting for teacher effect
No effect of school!
Before:
After:
Power
more replication always gives you more power
but in nested ANOVA, there is replication at various levels
where does your power come from? if you have nested factors within your treatments,
you need to replicate the nested factor, not the
subsamples
used to provide information on the characteristic spatial signal of populations
other techniques (geostatistical models) also can do this, but nested models are very efficient
variance component models (part of nested) can provide the percent of variation that is associated with particular spatial scales
Spatially nested designs
Region 1
Region 2
Region 3
Locations Sites
Transects
Regions
Locations(Regions)
Sites(Locations(Regions))
Transects(Sites(Locations(Regions)))
What spatial scale is most of the variance associated with?
At what scale is most
of the variance?
Source df MS F Var. comp (%)
Region 2 6658 10.4 247 (42)
Location(Region) 6 638 2.43 71 (12)
Site(Location(Region)) 18 263 1.40 88 (14)
Transect = Residual, Error 54 187 187 (32)
Optimal Allocation of Replication at different levels
can calculate at what level it is best to spend your time or money on replication
must know variance at each level (var. components) cost / effort to obtain replication at each hierarchical
level