33
eNote 5 1 eNote 5 Hierarchical Random Effects

Hierarchical Random Effects - Technical University of … · eNote 5 5.3 CONSTANT MEAN VALUE STRUCTURE 4 Obs litter pig reg status loglact Obs litter pig reg status loglact 1 1 1

Embed Size (px)

Citation preview

eNote 5 1

eNote 5

Hierarchical Random Effects

eNote 5 INDHOLD 2

Indhold

5 Hierarchical Random Effects 1

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

5.2 Main example: Lactase measurements in piglets . . . . . . . . . . . . . . . 3

5.3 Constant mean value structure . . . . . . . . . . . . . . . . . . . . . . . . . 4

5.4 The one layer model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

5.5 Two layer model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

5.6 Three or more layers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

5.7 Zero/Negative variance parameters . . . . . . . . . . . . . . . . . . . . . . 16

5.8 Comparing different variance structures . . . . . . . . . . . . . . . . . . . . 17

5.9 The balanced case - better confidence limits . . . . . . . . . . . . . . . . . . 19

5.10 Prediction of random effect coefficients . . . . . . . . . . . . . . . . . . . . 23

5.11 R-TUTORIAL: The lactase example . . . . . . . . . . . . . . . . . . . . . . . 24

5.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

5.1 Introduction

This module takes a closer look at a type of models, where the experimental designleads to a natural hierarchy of the random effects. Consider for instance the following

eNote 5 5.2 MAIN EXAMPLE: LACTASE MEASUREMENTS IN PIGLETS 3

experimental design, to determine the fat content in milk (from a given area at a giventime):

1. Ten farms were selected randomly from all milk producing farms in the area.

2. From each selected farm six cows were randomly selected.

3. From each selected cow two cups of milk were taken randomly from the dailyproduction to be analyzed in the lab.

Intuitively, two cups of milk from the same cow would be expected to be more similar,than two cups from different cows at different farms. Notice how data collected thisway is in violation with the standard assumption of independent measurements.

5.2 Main example: Lactase measurements in piglets

As part of a larger study of the intestinal health in newborn piglets, the gut enzymelactase was measured in 20 piglets taken from 5 different litters. For each of the 20 pigletsthe lactase level was measured in three different regions. At the time the measurementwas taken the piglet was either unborn (status=1) or newborn (status=2). These dataare kindly provided by Charlotte Reinhard Bjørnvad, Department of Animal Scienceand Animal Health, Division of Animal Nutrition, Royal Veterinary and AgriculturalUniversity. The 60 log transformed measurements are listed below:

eNote 5 5.3 CONSTANT MEAN VALUE STRUCTURE 4

Obs litter pig reg status loglact Obs litter pig reg status loglact

1 1 1 1 1 1.89537 31 3 11 1 2 1.58104

2 1 1 2 1 1.97046 32 3 11 2 2 1.52606

3 1 1 3 1 1.78255 33 3 11 3 2 1.65058

4 1 2 1 1 2.24496 34 4 12 1 1 1.97162

5 1 2 2 1 1.43413 35 4 12 2 1 2.11342

6 1 2 3 1 2.16905 36 4 12 3 1 2.51278

7 1 3 1 2 1.74222 37 4 13 1 1 2.06739

8 1 3 2 2 1.84277 38 4 13 2 1 2.25631

9 1 3 3 2 0.17479 39 4 13 3 1 1.79251

10 1 4 1 2 2.12704 40 5 14 1 1 1.93274

11 1 4 2 2 1.90954 41 5 14 2 1 1.82394

12 1 4 3 2 1.49492 42 5 14 3 1 1.23629

13 2 5 1 2 1.62897 43 5 15 1 2 2.07386

14 2 5 2 2 2.26642 44 5 15 2 2 1.96713

15 2 5 3 2 1.96763 45 5 15 3 2 0.47971

16 2 6 1 2 2.01948 46 5 16 1 2 2.01307

17 2 6 2 2 2.56443 47 5 16 2 2 1.85483

18 2 6 3 2 1.16387 48 5 16 3 2 2.18274

19 2 7 1 2 2.20681 49 5 17 1 2 2.86629

20 2 7 2 2 2.55652 50 5 17 2 2 2.71414

21 2 7 3 2 1.69358 51 5 17 3 2 1.60533

22 2 8 1 2 1.09186 52 5 18 1 2 1.97865

23 2 8 2 2 1.93091 53 5 18 2 2 1.93342

24 2 8 3 2 . 54 5 18 3 2 0.74943

25 3 9 1 1 2.36462 55 5 19 1 2 2.89886

26 3 9 2 1 2.72261 56 5 19 2 2 2.88606

27 3 9 3 1 2.80336 57 5 19 3 2 2.20697

28 3 10 1 1 2.42834 58 5 20 1 2 1.87733

29 3 10 2 1 2.64971 59 5 20 2 2 1.70260

30 3 10 3 1 2.54788 60 5 20 3 2 1.11077

Notice the hierarchical structure in which the measurements are naturally ordered. Firstthe five litters, then the piglets, and finally the individual measurements. This structureis illustrated in figure 5.1 for the first two litters.

Litter=1 Litter=2

Pig=1 Pig=2 Pig=3 Pig=4 Pig=5 Pig=6 Pig=7 Pig=8

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 .

Figur 5.1: The hierarchical structure of the lactase data set for the first two litters

5.3 Constant mean value structure

The mixed model framework is very flexible, and it is possible to include the mean valuestructures known from fixed effects models, as well as the random effects structure. Inthis module focus is on the variance structure, and for this reason the same simple meanvalue structure is used when the models are presented. The mean structure is simply

eNote 5 5.4 THE ONE LAYER MODEL 5

denoted µ without any subscripts. This is done in order to keep focus on the variancestructure. In the analyzed examples more realistic mean value structures will be used.

5.4 The one layer model

The simplest ’hierarchical’ model is the one layer model, which is typically denoted:

yi = µ + εi, εi ∼ i.i.d. N(0, σ2)

Here all observations are independent, so this is really not a hierarchical model, and islisted here mainly for completeness.

The mean value parameter(s) in a model with this simple variance structure is esti-mated via least squares method, maximum likelihood method, or restricted/residualmaximum likelihood method. In this case these methods all give the same result, name-ly:

β = (X′X)−1X′y

Here β is the parameter(s) of the fixed effects part of the model, y is the observations,and X is the design matrix for the fixed effects part of the model, as described in theprevious module.

The single variance parameter in the one layer model is estimated via the restricted/residualmaximum likelihood method, which gives an unbiased estimate

σ2 =1

N − p

N

∑i=1

(yi − µi)2

Here N is the total number of observations, p is the number of mean value parameters,and µi is the Best Linear Unbiased Estimate (BLUE) of the mean value for the i’th ob-servation, which is calculated as µi = (Xβ)i. The sum in the estimate for the varianceis often referred to as the Sum of Squared Derivations (SSD), or simply as the Sum ofSquares (SS).

Lactase measurements in piglets analyzed as independent observations

To analyse the lactase data set with a one layer model, the information about litter andpig is ignored. Remaining is the region (1,2 or 3), the status (1 or 2), and the interactionbetween region and status ((1,1), (1,2), (2,1), (2,2), (3,1), or (3,2)). The one layer model forthe lactase data becomes:

yi = µ + α(statusi) + β(regi) + γ(statusi, regi) + εi, εi ∼ i.i.d. N(0, σ2)

eNote 5 5.4 THE ONE LAYER MODEL 6

[I]5359 status × reg2

6

reg23

status12

011

Figur 5.2: Factor structure in the lactase example when analyzed as independent mea-surements (one layer model).

where i = 1, . . . , 60 is the observation number.

To analyze this model in R, write:

model1 <- lm(loglact ~ reg + status + reg:status, data = lactase)

anova(model1)

Analysis of Variance Table

Response: loglact

Df Sum Sq Mean Sq F value Pr(>F)

reg 2 2.5843 1.29217 5.2595 0.008248 **

status 1 1.1288 1.12882 4.5946 0.036680 *

reg:status 2 1.4074 0.70368 2.8642 0.065894 .

Residuals 53 13.0212 0.24568

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

eNote 5 5.4 THE ONE LAYER MODEL 7

In this case the mean value of the log-lactase measurements are described by region,status and the interaction between then. Notice how the intercept term µ is not specified.R adds the intercept term automatically. If the intercept term is undesired in a givenmodel it must be removed by adding -1 after the last model term.

From this output it follows that the estimate of the variance parameter is σ2 = 0.2457,and two times the negative restricted/residual log likelihood function 2`re, when eva-luated with the optimal parameters can be found from

(-2) * logLik(model1,REML=TRUE)

’log Lik.’ 89.46331 (df=7)

and is 2`re = 89.46. This is an important value as it is useful for comparing differentvariance structures. This module is about hierarchical variance structures, and hencelittle attention is paid to the mean value structure. The ANOVA table indicates a p-valueof 6.6% for removing the interaction term from the model, which is above the standard5% significance level.

Type I and Type III tests - again

• Type I (in anova):

– Succesive tests– Each term is corrected for the terms PREVIOUSLY listed in the model– Sums of squares sum to TOTAL sums of squares– Depends on the order they are specified in the model

• Type III (in drop1 and anova of lmerTest):

– Parallel tests– Each term is corrected for ALL other the terms in the model– Sums of squares do NOT generally sum to TOTAL sums of squares– Does NOT depend on the order they are specified in the model

• If balanced data: the SAME!

• drop1 is really ”Type II” = Type III with the condition:

– Do NOT test main effects which are part of (fixed) interactions!

eNote 5 5.5 TWO LAYER MODEL 8

5.5 Two layer model

The simplest real hierarchical model is the two layer model, which is denoted:

yi = µ + a(subi) + εi

where a(subi) ∼ N(0, σ2a ), and εi ∼ N(0, σ2), and all (both a’s and ε’s) are independent.

In this two layer model all observations are no longer independent, in fact the covari-ance between two observations is:

cov(yi1 , yi2) =

0 , if subi1 6= subi2 and i1 6= i2σ2

a , if subi1 = subi2 and i1 6= i2σ2

a + σ2 , if i1 = i2

which is easily calculated from the model. This means that two observations comingfrom the same subject are positively correlated. This is exactly what is needed in manysituations where several observations are taken from the same subject. It correspondswell with the intuition behind the experiment, and the variance parameters can be in-terpreted in a natural way. The subject variance parameter σ2

a is the variance betweensubjects, and the residual variance parameter σ2 is the variance between measurementson the same subject.

The variance parameters and the fixed effect parameter(s) in the two layer model areestimated via the restricted/residual maximum likelihood method. The REML methodis preferred to the maximum likelihood method, because it gives unbiased estimates ofthe variance parameters. For the fixed effect parameter(s) the REML estimate is givenby:

β = (X′V−1X)−1X′V−1y

Here β is the parameter(s) of the fixed effect part of the model, y is the observations,and X is the design matrix for the fixed effect part of the model. The matrix V is thecovariance matrix for the observations y, which is estimated from the REML estimatesof the variance parameters. In this two layer model the structure of V is:

V =

σ2a + σ2 σ2

a σ2a 0 0 0

σ2a σ2

a + σ2 σ2a 0 0 0

σ2a σ2

a σ2a + σ2 0 0 0

0 0 0 σ2a + σ2 σ2

a σ2a

0 0 0 σ2a σ2

a + σ2 σ2a

0 0 0 σ2a σ2

a σ2a + σ2

Here illustrated for the situation with two subjects and three observations on each sub-ject. The general pattern is a block diagonal covariance matrix with blocks correspon-ding to each subject.

eNote 5 5.5 TWO LAYER MODEL 9

[I]

status × reg

[pig]

reg

status

0

Figur 5.3: Factor structure in the lactase example when analyzed with a two layer model(random pig effect).

Lactase measurements in piglets analyzed with two layer model

To analyze the lactase data set with a two layer model the information about litter isstill ignored, but the information about pig is now included. This corresponds to thetwo bottom layers in figure 5.1. Alternatively the the pig information could have beenomitted and the litter information included.

The two layer model with random pig effect is:

yi = µ + α(statusi) + β(regi) + γ(statusi, regi) + d(pigi) + εi,

where d(pigi ∼ N(0, σ2d ), and εi ∼ N(0, σ2), and all (both d’s and ε’s) are independent.

Here i = 1, . . . , 60 is the observation number.

The idea behind this model is that pigs are different, and the pigs in this investigationare random representatives from the relevant population of newborn pigs. The variancebetween pigs is described with the σ2

d parameter, and the remaining variance betweenmeasurements on the same pig is described with the σ2 parameter.

To analyze this model in R, write:

eNote 5 5.5 TWO LAYER MODEL 10

require(lmerTest)

eNote 5 5.5 TWO LAYER MODEL 11

model2 <- lmer(loglact ~ reg*status + (1|pig), data = lactase)

summary(model2)

Linear mixed model fit by REML t-tests use Satterthwaite approximations

to degrees of freedom [lmerMod]

Formula: loglact ~ reg * status + (1 | pig)

Data: lactase

REML criterion at convergence: 79.9

Scaled residuals:

Min 1Q Median 3Q Max

-2.0709 -0.5030 0.1307 0.4950 1.9128

Random effects:

Groups Name Variance Std.Dev.

pig (Intercept) 0.1119 0.3345

Residual 0.1349 0.3673

Number of obs: 59, groups: pig, 20

Fixed effects:

Estimate Std. Error df t value Pr(>|t|)

(Intercept) 2.129291 0.187780 37.630000 11.339 1.06e-13 ***

reg2 0.009363 0.196334 34.820000 0.048 0.9622

reg3 -0.008660 0.196334 34.820000 -0.044 0.9651

status2 -0.121178 0.232912 37.630000 -0.520 0.6059

reg2:status2 0.109818 0.243523 34.820000 0.451 0.6548

reg3:status2 -0.655019 0.245841 34.990000 -2.664 0.0116 *

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Correlation of Fixed Effects:

(Intr) reg2 reg3 stats2 rg2:s2

reg2 -0.523

reg3 -0.523 0.500

status2 -0.806 0.421 0.421

reg2:stats2 0.421 -0.806 -0.403 -0.523

reg3:stats2 0.418 -0.399 -0.799 -0.518 0.495

eNote 5 5.6 THREE OR MORE LAYERS 12

anova(model2)

Analysis of Variance Table of type III with Satterthwaite

approximation for degrees of freedom

Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)

reg 1.64252 0.82126 2 34.930 6.0872 0.005394 **

status 0.35771 0.35771 1 17.685 2.6514 0.121142

reg:status 1.51820 0.75910 2 34.930 5.6265 0.007617 **

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

From this output it is evident, that variation between pigs is a major part of the totalvariation. The variance between pigs is estimated to σ2

d = 0.1119, and the residual vari-ance is estimated to σ2 = 0.1349. So a little less than half of the total variation in the dataset is due to the variation between pigs. Twice the negative restricted/residual log like-lihood is 2`re = 79.9. The conclusion about the fixed effect parameters is different fromthe one layer model. In the one layer model the interaction term between status andregion could be removed with a p-value of 6.6%, but now the interaction term is signi-ficant with a p-value of 0.76%. The important lesson to be learned from this example isthat the variance structure matters. Even in two models with the exact same fixed effectstructure, the conclusions can differ greatly if the variance structure is not correct. Laterin this module a method for comparing different variance structures will be shown.

5.6 Three or more layers

Hierarchical models with three or more levels are natural in some situations. Considerfor instance the situation with randomly selected farms, randomly selected cows fromeach farm, and randomly selected cups of milk from the daily production of each cow.In this situation the natural model would be a three layer model. The basic three levelmodel is denoted:

yi = µ + a(blocki) + b(subi) + εi

where a(blocki) ∼ N(0, σ2a ), b(subi) ∼ N(0, σ2

b ) and εi ∼ N(0, σ2) all independent. iis the observation index. The interpretation is similar to the interpretation of the twolayer model. The parameter σ2

a is the variance between blocks (farms), σ2b is the vari-

ance between subjects (cows) in the same block (farm), and σ2 is the variance betweenobservations from the same subject (cow).

eNote 5 5.6 THREE OR MORE LAYERS 13

The parameters are estimated by REML method, just like in the two layer model, andthe structure of the covariance matrix is:

V =

σ2a +σ2

b +σ2 σ2a +σ2

b σ2a σ2

a 0 0 0 0σ2

a +σ2b σ2

a +σ2b +σ2 σ2

a σ2a 0 0 0 0

σ2a σ2

a σ2a +σ2

b +σ2 σ2a +σ2

b 0 0 0 0σ2

a σ2a σ2

a +σ2b σ2

a +σ2b +σ2 0 0 0 0

0 0 0 0 ? ? ? ?0 0 0 0 ? ? ? ?0 0 0 0 ? ? ? ?0 0 0 0 ? ? ? ?

[The stars in the lower right corner should be substituted with a 4 by 4 block identical to the upper leftcorner.]

V =

σ2a +σ2

b +σ2 σ2a +σ2

b σ2a σ2

a 0 0 0 0σ2

a +σ2b σ2

a +σ2b +σ2 σ2

a σ2a 0 0 0 0

σ2a σ2

a σ2a +σ2

b +σ2 σ2a +σ2

b 0 0 0 0σ2

a σ2a σ2

a +σ2b σ2

a +σ2b +σ2 0 0 0 0

0 0 0 0 σ2a +σ2

b +σ2 σ2a +σ2

b σ2a σ2

a0 0 0 0 σ2

a +σ2b σ2

a +σ2b +σ2 σ2

a σ2a

0 0 0 0 σ2a σ2

a σ2a +σ2

b +σ2 σ2a +σ2

b0 0 0 0 σ2

a σ2a σ2

a +σ2b σ2

a +σ2b +σ2

Here illustrated for the case with two blocks, two subjects from each block, and twoobservations from each subject. The first 4 by 4 block in V corresponds to the first block,and the next non-zero 4 by 4 block corresponds to the second block. Within each ofthese blocks there is a similar structure, where two observations from the same subjectare higher correlated, than two observations from different subjects.

Lactase measurements in piglets analyzed with three layer model

The natural model for the lactase data set is a three layer model, with litter as top layer,pig as the middle layer, and each single measurement as the bottom layer. This hierarchyis illustrated in figure 5.1 for the first two litters.

yi = µ + α(statusi) + β(regi) + γ(statusi), regi)) + d(litteri) + e(pigi) + εi,

where d(litteri) ∼ N(0, σ2d ), e(pigi) ∼ N(0, σ2

e ), and εi ∼ N(0, σ2), and all (both d’s,e’s, and ε’s) are independent. i is observation number.

To analyze this model in R, write: (Here the litter information has been added to therandom part of the model)

eNote 5 5.6 THREE OR MORE LAYERS 14

[I]

status × reg

[pig]

reg

status

[litter]

0

Figur 5.4: Factor structure in the lactase example when analyzed with a three layer mo-del (random pig and litter effect).

eNote 5 5.6 THREE OR MORE LAYERS 15

model3 <- lmer(loglact ~ reg*status + (1|pig) + (1|litter), data = lactase)

summary(model3)

Linear mixed model fit by REML t-tests use Satterthwaite approximations

to degrees of freedom [lmerMod]

Formula: loglact ~ reg * status + (1 | pig) + (1 | litter)

Data: lactase

REML criterion at convergence: 79.9

Scaled residuals:

Min 1Q Median 3Q Max

-2.0709 -0.5030 0.1307 0.4950 1.9128

Random effects:

Groups Name Variance Std.Dev.

pig (Intercept) 0.1119 0.3345

litter (Intercept) 0.0000 0.0000

Residual 0.1349 0.3673

Number of obs: 59, groups: pig, 20; litter, 5

Fixed effects:

Estimate Std. Error df t value Pr(>|t|)

(Intercept) 2.129291 0.187780 37.630000 11.339 1.06e-13 ***

reg2 0.009363 0.196334 34.820000 0.048 0.9622

reg3 -0.008660 0.196334 34.820000 -0.044 0.9651

status2 -0.121178 0.232912 37.630000 -0.520 0.6059

reg2:status2 0.109818 0.243523 34.820000 0.451 0.6548

reg3:status2 -0.655019 0.245841 34.990000 -2.664 0.0116 *

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Correlation of Fixed Effects:

(Intr) reg2 reg3 stats2 rg2:s2

reg2 -0.523

reg3 -0.523 0.500

status2 -0.806 0.421 0.421

reg2:stats2 0.421 -0.806 -0.403 -0.523

reg3:stats2 0.418 -0.399 -0.799 -0.518 0.495

eNote 5 5.7 ZERO/NEGATIVE VARIANCE PARAMETERS 16

anova(model3)

Analysis of Variance Table of type III with Satterthwaite

approximation for degrees of freedom

Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)

reg 1.64252 0.82126 2 34.930 6.0872 0.005394 **

status 0.35771 0.35771 1 17.685 2.6514 0.121142

reg:status 1.51820 0.75910 2 34.930 5.6265 0.007617 **

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

confint(model3, 1:3)

Computing profile confidence intervals ...

2.5 % 97.5 %

.sig01 0.1850922 0.4938041

.sig02 0.0000000 0.2873889

.sigma 0.2829005 0.4426169

Compared to the output from the two layer model there is basically no difference. Thelikelihood is exactly the same and the variance for the litter effect is estimated at zero,and the conclusions about the fixed effect parameters are unchanged.

5.7 Zero/Negative variance parameters

In some software, e.g. SAS Proc Mixed, an option can be set to allow for negative va-riance components - NOT in R though. When negative variance estimates occur, it isusually as an underestimate of small (or zero) variance. In this case a reasonable way tohandle the negative estimates is simply to leave out the, or fix to zero, the correspondingvariance component.

In some rare cases it is possible to interpret a negative variance parameter estimate.For instance in the case of several observations on the same litter. It could be possiblethat some of the animals would do well and eat all the available food, and some of

eNote 5 5.8 COMPARING DIFFERENT VARIANCE STRUCTURES 17

the animals would do poorly, and because all the food was now gone they would doeven worse. In this type of competitive situations the variation within a group couldbe greater than the variation between groups, which would lead to negative varianceestimates.

In R these are instead set to zero which really makes better sense in most cases.

5.8 Comparing different variance structures

To test a reduction in the variance structure a restricted/residual likelihood ratio test canbe used. A likelihood ratio test is used to compare two models A and B, where B is asub-model of A. Typically the model including some variance component (model A),and the model without this variance component (model B) is to be compared. To do thisthe negative restricted/residual log-likelihood values (`(A)

re and `(B)re ) from both models

must be computed. The test can now be computed as:

GA→B = 2`(B)re − 2`(A)

re

The likelihood ratio test statistic follows a χ2d f –distribution, where d f is the difference

between the number of parameters in A and the number of parameters in B. In the caseof removing a single variance component d f = 1.

For the lactase data set three different variance structures have been used, the compari-son of these variance models is seen in the following table:

Model 2`re G–value df P–value(A) Pig and litter included 79.9 GA→B = 0.0 1 PA→B = 1(B) Only pig included 79.9 GB→C = 9.6 1 PB→C = 0.0019/2(C) Independent observations 89.5

It follows that the litter variance component can be left out of the model, but the pigvariance component is very significant. In other words: Two observations taken on thesame pig are correlated, but two observations from different pigs within the same litterare not correlated (positively or negatively).

Warning: It is really more correct to use DF=0.5 instead of df=1 in these tests! OR si-milarly(appxomately) divide them by 2.

To calculate these p–values via R, the following lines can be used:

eNote 5 5.8 COMPARING DIFFERENT VARIANCE STRUCTURES 18

anova(model2,model3)

refitting model(s) with ML (instead of REML)

Data: lactase

Models:

object: loglact ~ reg * status + (1 | pig)

..1: loglact ~ reg * status + (1 | pig) + (1 | litter)

Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)

object 8 83.642 100.26 -33.821 67.642

..1 9 85.642 104.34 -33.821 67.642 0 1 1

(-2) * logLik(model3)

’log Lik.’ 79.91436 (df=9)

(-2) * logLik(model2)

’log Lik.’ 79.91436 (df=8)

(-2) * logLik(model1,REML=TRUE)

’log Lik.’ 89.46331 (df=7)

eNote 5 5.9 THE BALANCED CASE - BETTER CONFIDENCE LIMITS 19

1-pchisq(0,1)

[1] 1

1-pchisq(0,0.5)

[1] 1

1-pchisq(9.55,1)

[1] 0.001999494

1-pchisq(9.55,0.5)

[1] 0.0006347892

qchisq(0.95,1)

[1] 3.841459

qchisq(0.95,0.5)

[1] 2.420232

5.9 The balanced case - better confidence limits

In R we get the confidence intervals as profile likelihood based such from the confint

function. We also saw the alternative method of using an approximate χ2-distributionbased on the Satterthwaite approximation approach - however this method requiresthe asymptotic variance-covariance matrix of the (random part) parameters. And per sethese are not so easily extracted from lmer results. (It will be built into the lmerTest-

eNote 5 5.9 THE BALANCED CASE - BETTER CONFIDENCE LIMITS 20

package, but not currently available).

In the balanced case the variance components can often be simply expressed by linearcombinations of independent Mean Squares, which are all chi-square distributed anda simple chi-square approximation can be used to construct better confidence limits.Some of the theory behind this is given here.

A factor F is said to be balanced if the same number of observations are made for eachof its levels. In this section it is assumed that all factors are balanced.

As seen in the module on factor structure diagrams, it is possible in “nice” cases to calcu-late the degrees of freedom from the factor structure diagram. The formula to calculatethe degrees of freedom dfF for a factor F is:

dfF = |F| − ∑G:G<F

dfG

Here |F| is the number of levels of the factor F, and the last sum is the sum of the degreesof freedom from all factors coarser than F.

In these “nice” cases a similar formula exists for computing the sums of squares (SS)to be used in the analysis of variance table. The sum of squares SSF corresponding to afactor F is:

SSF =1nF

∑j∈F

SF(j)2 − ∑G:G<F

SSG

This formula require some explaining. The term nF is the number of observations oneach level of the factor (as the factor is balanced this is the same on all levels). The termSF(j) is the sum of all observations at the j’th level of the factor F. For instance if F is thefactor indicating treatment (“no” or “yes”), then SF(yes) is the sum of all observationsfrom treated individuals. The last sum is the sum of the SS’s from all factors coarser thanF.

Example 5.1 Balanced two layer model

Consider again the two layer model:

yi = µ + a(subi) + ε i

where a(subi) ∼ N(0, σ2a ), and ε i ∼ N(0, σ2), and all independent. The number of subjects

is |sub|, the number of observations on each subject is nsub, and hence the total number ofobservations in the balanced case is N = |sub| · nsub.

The factor structure for this model is:

eNote 5 5.9 THE BALANCED CASE - BETTER CONFIDENCE LIMITS 21

[I]N−|sub|N [sub]|sub|−1

|sub| 011

From this diagram the degrees of freedom for each factor can be calculated (in fact theyare already printed in the diagram). For the constant factor 0 the degrees of freedom isalways df0 = 1, as it is the coarsest factor and has only one level. The subject factor has|sub| levels and only 0 is coarser, so dfsub = |sub| − 1. Finally for the finest factor I. Ihas N levels, but sub and 0 are coarser, so dfI = N − (|sub| − 1)− 1 = N − |sub|.

Similarly the SS–values can be calculated as:

SS0 = 1N (∑i=1...N yi)

2

SSsub = 1nsub ∑j∈sub Ssub(j)2 − SS0

SSI = ∑i=1...N y2i − SSsub − SS0

These formulas may seem cumbersome, but actually they are fairly simple to use, asthey only depend on the total sum, the total sum of squares, and the subject–sums.

With these formulas in place the analysis of variance table can be constructed as:

Source SS df MS EMSSubject SSsub |sub| − 1 SSsub/(|sub| − 1) σ2 + nsubσ2

aResidual SSI N − |sub| SSI/(N − |sub|) σ2

So natural estimates of the two variance parameters are:

σ2 = MSI and σ2a =

MSsub −MSInsub

In this balanced case it is also possible to write down better confidence intervals forthe variance parameters, than the general confidence intervals based on the normal ap-proximation, which will be described later. The standard 95% confidence interval for

eNote 5 5.9 THE BALANCED CASE - BETTER CONFIDENCE LIMITS 22

the residual variance is:

(N − |sub|)σ2

χ20.975;(N−|sub|)

< σ2 <(N − |sub|)σ2

χ20.025;(N−|sub|)

Here N − |sub| are the degrees of freedom corresponding to MSI. The parameter σ2a is

estimated by a linear combination of two MS values, so the exact degrees of freedomcan not be found. Satterthwaite’s approximation is used to approximate the degrees offreedom. It this case Satterthwaite determine the degrees of freedom by making the cor-responding χ2–distribution have the same variance as the estimate. The 95% confidenceinterval is:

νσ2a

χ20.975;ν

< σ2a <

νσ2a

χ20.025;ν

where the ν is estimated to:

ν =(σ2

a )2(

MSsubnsub

)2

|sub|−1 +

(MSInsub

)2

N−|sub|

Slightly more general, if fixed effects are also present in the model, or if possibly so-me inbalance occur, one may still estimate variance components by a so-called momentmethod. This is actually a simple alternative to REML estimation. It amounts to a so-lution of the linear system, where the theoretical expected random effect mean squaresare equalled with the observed mean squares. In balanced experiments these theoreticalexpected mean squares can be deduced from the factor strcuture diagram, cf. Module2, and the results of this will equal the REML estimates! In other cases, expected meansquares are almost impossible to deduce by hand calculation.

However, infact , for instance SAS PROC GLM can do this for us by means listing allthe random effects in an additional RANDOM statement like:

proc glm;

class treat sub;

model y=treat sub;

Random sub;

This is not so easily available in R.

In general, IF a variance component is estimated as a linear combination of mean squa-res:

σ2A =

K

∑k=1

bkMSk

eNote 5 5.10 PREDICTION OF RANDOM EFFECT COEFFICIENTS 23

where the mean squares have associated degrees of freedom f1, . . . , fk, then an approxi-mate 95% confidence band can be obtained by:

νσ2A

χ20.975;ν

< σ2A <

νσ2A

χ20.025;ν

where the ν is estimated to:

ν =(σ2

A)2

∑Kk=1

(bkMSk)2

fk

This is the basic Satterthwaite formula. In the case above, K = 2, b1 = 1/nsub, b2 =−1/nsub, f1 = |sub| − 1 and f2 = N − |sub|.

In general, the REML estimates cannot be written as such linear combinations, so othermethods are required. we saw in, Module 4, a different satterthwaite method is descri-bed for REML estimates. In SAS this is simply obtained by using the option CL in thePROC MIXED statement. In R this will have to be done ”manually”.

Before computers revolutionized the world of statistics, the balanced case was extreme-ly important. Great care was taken to obtain balanced experiments, but a single failedmeasurement could ruin a balanced design. With modern computers and software thebalanced case seems less important.

5.10 Prediction of random effect coefficients

This is reproduced from Module 4. A model with random effects is typically used insituations where the subjects (or blocks) in the study are to be considered as represen-tatives from a greater population. As such, the main interest is usually not in the actuallevels of the randomly selected subjects, but in the variation among them. It is howeverpossible to obtain predictions of the individual subject levels u. The formula is given inmatrix notation as:

u = GZ′V−1(y− Xβ)

If G and R are known, u can be shown to be ’the best linear unbiased predictor’ (BLUP)of u. Here, ’best’ means minimum mean squared error. In real applications G and R arealways estimated, but the predictor is still referred to as the BLUP.

In R we get the random effect BLUPs very easily:

eNote 5 5.11 R-TUTORIAL: THE LACTASE EXAMPLE 24

ranef(model2)

$pig

(Intercept)

1 -0.17600554

2 -0.12850621

3 -0.40900010

4 0.01228298

5 0.09111246

6 0.06371047

7 0.23232880

8 -0.34709933

9 0.35715158

10 0.29422020

11 -0.17171747

12 0.04975418

13 -0.06476400

14 -0.33185022

15 -0.22806704

16 0.13572511

17 0.40563619

18 -0.19458734

19 0.59731937

20 -0.18764410

5.11 R-TUTORIAL: The lactase example

The R-function lmer from the package lme4 cannot be used for models with no randomeffects. Instead lm would have to be used whenever version of our models with norandom effects are needed for reasons of comparison:

eNote 5 5.11 R-TUTORIAL: THE LACTASE EXAMPLE 25

lactase <- read.table("lactase.txt", sep = ",", header = TRUE, na.strings = ".")

lactase$litter <- factor(lactase$litter)

lactase$pig <- factor(lactase$pig)

lactase$reg <- factor(lactase$reg)

lactase$status <- factor(lactase$status)

model1 <- lm(loglact ~ reg + status + reg:status, data = lactase)

require(car)

Anova(model1, type = 3, test.statistic="F")

Anova Table (Type III tests)

Response: loglact

Sum Sq Df F value Pr(>F)

(Intercept) 31.737 1 129.1792 7.931e-16 ***

reg 0.001 2 0.0023 0.99769

status 0.067 1 0.2719 0.60420

reg:status 1.407 2 2.8642 0.06589 .

Residuals 13.021 53

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Even though type III tests for main effects are produced by the Anova function, one mustbe careful about the interpretations of these, and for unbalanced cases and in fact alsomodels with covariates we should not use such type III tests from this function (as theinterpretation is generally unclear).

In the present case there is only one test for which we would always be sure to have agood interpretation: namely the test for the interaction term and this is would e.g. alsobe returned by the drop1 function:

drop1(model1, test = c("F"))

Single term deletions

Model:

loglact ~ reg + status + reg:status

Df Sum of Sq RSS AIC F value Pr(>F)

<none> 13.021 -77.146

eNote 5 5.11 R-TUTORIAL: THE LACTASE EXAMPLE 26

reg:status 2 1.4074 14.429 -75.091 2.8642 0.06589 .

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

In R the two-layer model is specified using the function lmer

require(lme4)

lmer(loglact ~ reg*status + (1|pig), data = lactase)

Linear mixed model fit by REML [’merModLmerTest’]

Formula: loglact ~ reg * status + (1 | pig)

Data: lactase

REML criterion at convergence: 79.9144

Random effects:

Groups Name Std.Dev.

pig (Intercept) 0.3345

Residual 0.3673

Number of obs: 59, groups: pig, 20

Fixed Effects:

(Intercept) reg2 reg3 status2 reg2:status2

2.129291 0.009363 -0.008660 -0.121178 0.109818

reg3:status2

-0.655019

where the (1|pig) part specifies that there is a random effect for each level of the va-riable pig and that each random effect is multiplied with the value 1, which does notchange anything in this example, but we note it for later extensions.

require(lmerTest)

eNote 5 5.11 R-TUTORIAL: THE LACTASE EXAMPLE 27

model2 <- lmer(loglact ~ reg*status + (1|pig), data = lactase)

anova(model2)

Analysis of Variance Table of type III with Satterthwaite

approximation for degrees of freedom

Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)

reg 1.64252 0.82126 2 34.930 6.0872 0.005394 **

status 0.35771 0.35771 1 17.685 2.6514 0.121142

reg:status 1.51820 0.75910 2 34.930 5.6265 0.007617 **

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

The (default) output from anova, now using the version from the lmerTest package, isthe Type III F-statistics and P-values (defined the SAS-way). A Type I version is also anoption (use type = 1).

Remark 5.2

WARNING - a confusing detail: in general the SS and MS columns of the lme4 (andhence lmerTest) will NOT coincide with what is seen from lm as the SS and MScolunms from the lme4-anova are scaled by the proper mixed model error termssuch that when they are divided by the error variance, the proper mixed model F-statistics will appear! Confused? See the example below!

eNote 5 5.11 R-TUTORIAL: THE LACTASE EXAMPLE 28

Example 5.3 Comparing anova on lm and lmer

detach("package:lmerTest", unload=TRUE)

## We illustrate this on a balanced data set, so we remove pigs 3-8:

lactase_red <- subset(lactase, pig %in% c(1:2,9:20))

testmodel <- lmer(loglact ~ status + (1|pig), data = lactase_red)

testmodelfixed <- lm(loglact ~ status + pig, data = lactase_red)

anova(testmodel)

Analysis of Variance Table

Df Sum Sq Mean Sq F value

status 1 0.20575 0.20575 1.086

anova(testmodelfixed)

Analysis of Variance Table

Response: loglact

Df Sum Sq Mean Sq F value Pr(>F)

status 1 0.5626 0.56264 2.9697 0.09586 .

pig 12 6.2172 0.51810 2.7346 0.01386 *

Residuals 28 5.3049 0.18946

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Note that the proper mixed model F-statistic for the status effect (as pigs are nested withinstatus) is

FMixed =MSstatus

MSpig=

0.562640.51810

= 1.085968

where the SSstatus is defined the ”usual”ANOVA way: (as there are 21 observations for eachstatus in the reduced )

statusmeans <- tapply(lactase_red$loglact, lactase_red$status, mean)

overallmean <- mean(lactase_red$loglact)

21*sum((statusmeans-overallmean)^2)

[1] 0.5626422

eNote 5 5.11 R-TUTORIAL: THE LACTASE EXAMPLE 29

But from anova(model2) instead we get the number SS = 0.20575, which is coming as:

SS(lmer)status = 0.20575 = 0.56264

0.189460.51810

= SS(usual)status

MSResidual

MSPig= SS(usual)

statusMSResidual

(Mixed Error)

and finally note that the F-statistic for status effect then can be achieved by taking the (so-called) MS from anova(model2) and dividing by the residual error:

FMixed =MS(lmer)

statusMSResidual

=0.205750.18946

= 1.085968

And by the way the residual error MS from the fixed model is also the estimate of the residualerror variance of the mixed model:

sigma(testmodel)^2

[1] 0.1894598

The function lmer returns an mermod object containing all information about the fittedmodel. In the example just below model2 is an mermod object: To extract informationfrom the object various functions, so-called extractors, can be applied to the object. Wehave already met the two extractors anova and summary earlier.

A list of all such methods is found as:

methods(class = "merMod")

[1] anova Anova as.function coef

[5] confint deltaMethod deviance df.residual

[9] drop1 extractAIC family fitted

[13] fixef formula getL hatvalues

[17] isGLMM isLMM isNLMM isREML

[21] linearHypothesis logLik matchCoefs model.frame

[25] model.matrix ngrps nobs plot

[29] predict print profile ranef

[33] refit refitML residuals show

[37] sigma simulate summary terms

[41] update VarCorr vcov weights

see ’?methods’ for accessing help and source code

eNote 5 5.11 R-TUTORIAL: THE LACTASE EXAMPLE 30

The variance parameter estimates and profile likelihood baed confidence intervals areobtained from the merMod object model2 by applying the extractors VarCorr and confint:

VarCorr(model2)

Groups Name Std.Dev.

pig (Intercept) 0.33453

Residual 0.36731

confint(model2, 1:2)

Computing profile confidence intervals ...

2.5 % 97.5 %

.sig01 0.1850779 0.4938041

.sigma 0.2829007 0.4426169

Only the first two elements in the output list of confint are displayed, as they containconfidence intervals for the variance parameters in the model. The last elements containthe confidence intervals for the fixed effects parameters.

The three-layer model is specified in the following manner

require(lmerTest)

eNote 5 5.11 R-TUTORIAL: THE LACTASE EXAMPLE 31

model3 <- lmer(loglact ~ reg*status + (1|pig) + (1|litter), data = lactase)

anova(model3)

Analysis of Variance Table of type III with Satterthwaite

approximation for degrees of freedom

Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)

reg 1.64252 0.82126 2 34.930 6.0872 0.005394 **

status 0.35771 0.35771 1 17.685 2.6514 0.121142

reg:status 1.51820 0.75910 2 34.930 5.6265 0.007617 **

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

VarCorr(model3)

Groups Name Std.Dev.

pig (Intercept) 0.33453

litter (Intercept) 0.00000

Residual 0.36731

confint(model3, 1:3)

Computing profile confidence intervals ...

2.5 % 97.5 %

.sig01 0.1850922 0.4938041

.sig02 0.0000000 0.2873889

.sigma 0.2829005 0.4426169

Minus twice the Restricted Log-likelihood value can be extracted as:

(-2) * logLik(model3)

’log Lik.’ 79.91436 (df=9)

eNote 5 5.12 EXERCISES 32

5.12 Exercises

Exercise 1 Blueberries

In an experiment with 4 sorts of blueberries the fructification (the number of berriesas percent of the number of flowers earlier in the season) was determined for 5 twigson each of 6 bushes for every sort. See Also the Hierachical Section 13.3 in the datadescription chapter.

a) Download the blueberry data set and investigate if the 4 sorts have different fructi-fication.

b) Draw the factor structure diagram corresponding to the analyzed model.

c) Test if it is important to include bush as a random effect.

Exercise 2 Tractor comparison

In a comparison of two tractors, five drivers used each of the tractors to plough a fieldof a certain size. The time in minutes was recorded. This was repeated on three diffe-rent days (with the same drivers and tractors). See Also the General Factor StructureSection 13.2 in the data description chapter. The data is available her, and is shown nithe following table:

Day Tractor a b c d e1 1 3.89 2.58 2.94 2.60 2.461 2 2.57 2.89 2.95 2.69 2.672 1 3.81 2.10 2.86 2.16 2.392 2 3.06 2.99 2.85 2.63 2.463 1 4.01 2.08 2.29 2.09 2.593 2 2.95 2.82 2.67 2.26 2.64

eNote 5 5.12 EXERCISES 33

a) Analyse the data which are given in the following table with Tractor as a fixedeffect and the other two as random effects. (remember that interactions involvingrandom effects should also be included as random effects). Make the factor struc-ture diagram.

b) Remember to estimate the significant variance components (make sure that nozero-components are present in the data) including the confidence intervals.

c) Predict the values for the levels of the random effects.