
eNote 8

Analysis of Covariance


Contents

8 Analysis of Covariance
  8.1 Introduction
  8.2 The analysis of covariance models
  8.3 Example: Hormone treatment of steers
  8.4 Summary and post hoc analysis
      8.4.1 Example with equal slopes: Hormone treatment of steers (continued)
      8.4.2 Example with different slopes
      8.4.3 Models and analysis
      8.4.4 Summary and post hoc analysis
      8.4.5 Grouping treatments based on slopes post hoc analysis
  8.5 Analysis of covariance in perspective
  8.6 The use of baseline measurements
      8.6.1 Example: Concentration of a hormone in cattle
  8.7 R-TUTORIAL: Hormone treatment of steers
  8.8 R-TUTORIAL: Balanced incomplete block design
  8.9 R-TUTORIAL: Concentration of a hormone in cattle
  8.10 Exercises

8.1 Introduction

In this eNote, we consider experiments in which one or more factors enter as quantitative factors, that is, with numerical values that are used in the linear model. Such a quantitative factor can either be a covariate, which is a supplementary measurement on each experimental unit, used to reduce the random variation in the model, or it can be one of the factors of interest.

An important element in every experiment is to make the random (uncontrollable) variation as small as possible. Besides purely practical measures such as taking care, good experimental technique, good equipment and so on, there are some statistical methods to achieve this which play a role in both the planning and the analysis of an experiment. Through the systematic and random parts of the mixed model we try to explain the experimental results as well as possible based on the factors and other explanatory variables we have in the experiment. For lack of a better alternative, we describe the rest of the variation as random variation. In the case of linear models, we assume that this random variation may be described by independent samples from a normal distribution. The more success we have in describing the variation, the less unexplained variation we have to describe as random. Furthermore, the smaller the random variation is, the greater possibility we have of drawing useful conclusions regarding effects of the (fixed) experimental factors. It is important to note that the size of the random variation is not given from the experimental units and the experimental technique alone, but has to be seen (and is defined) in relation to a given model. This gives us the advantage that the possibilities for reducing the random variation do not only rest on practical circumstances but also on possible improvements of the systematic part of the model.

A common technique when planning an experiment is to divide the experimental units into blocks which are as homogeneous as possible, so that the variation between experimental units from the same block is relatively small. If this is achieved, it means that a larger part of the variation between the experimental units can be explained through the block effects in the model, and at the same time the unexplained part of the variation is reduced.

Another method consists of making one or more relevant measurements on each experimental unit before the experiment is started. Such a measurement is called a covariate and, in mathematical terms, it is just a number associated with each experimental unit. The idea is that the covariate can enter the model as an explanatory variable, so that a larger part of the variation is explained and the random variation is reduced.


8.2 The analysis of covariance models

Let i = 1, . . . , N denote the N experimental units and Yi the i'th response variable. Now we also suppose that we have a covariate xi for each experimental unit. The experiment can be with one or more factors, with or without blocks, and almost any experimental design. However, in order to illustrate the use of the covariate, we consider a single factor and a one-way analysis of variance model, in the situation where the covariate was not used. Let treat be the factor in the experiment with the k levels treat1, . . ., treatk. If treati denotes the treatment of the i'th experimental unit, so that treati is identical to one of the treatments treat1, . . ., treatk, we can write the one-way analysis of variance model as

Yi = α(treati) + εi,

supplemented with the usual assumptions that ε1, . . . , εN are independent and normally distributed with mean zero and the same variance σ².

The classical way of including a covariate in the model is by adding a term of the form β · xi. A model which emerges in such a way is called an analysis of covariance model, and the analysis which is based on such a model is called an analysis of covariance. In addition, it must be anticipated that the coefficient β, expressing the relation between x and Y within treatments, may depend on the treatment factors in the model.

The analysis of covariance model corresponding to the one-way analysis of variance model is thus given by

Yi = α(treati) + β(treati) · xi + εi (8-1)

The first step in the analysis would usually be to see if this dependence is significant, that is, to see if the simpler model given by

Yi = α(treati) + β · xi + εi (8-2)

is sufficient to describe the data. If the simpler model is not acceptable, there are definitely treatment differences, and the model summary and post hoc analysis are based on model (8-1). If the simpler model is accepted, a test for the effect of the treatments can be carried out by testing the hypothesis

Yi = α + β · xi + εi, (8-3)

corresponding to all the α's in model (8-2) being equal. If there is a significant treatment effect, the summary and post hoc analysis is based on model (8-2).
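As an illustration of this model-reduction sequence, the three models could be fitted and compared in R along the following lines. This is only a minimal sketch: the data frame dat and the variables Y, treat and x are hypothetical names, and the concrete examples later in this eNote use mixed models with a random block effect instead of lm.

# Minimal sketch of the reduction (8-1) -> (8-2) -> (8-3); 'dat', 'Y', 'treat'
# and 'x' are hypothetical names, not tied to a data set in this eNote.
m1 <- lm(Y ~ treat + treat:x, data = dat)  # different slopes, model (8-1)
m2 <- lm(Y ~ treat + x, data = dat)        # equal slopes, model (8-2)
m3 <- lm(Y ~ x, data = dat)                # no treatment effect, model (8-3)
anova(m2, m1)  # test of equal slopes (8-2) against different slopes (8-1)
anova(m3, m2)  # test of the treatment effect, given equal slopes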

The hypothesis corresponds to a simple linear regression model where the response variable is described by a linear function of the covariate. The starting point is model (8-1), where there is a linear relationship for each treatment. The intercepts, α(treati), and


slopes, β(treati), are allowed to depend on the treatment. The linear structure given by the two models is depicted in Figure 8.1.

Figure 8.1: Analysis of covariance model structures. Left panel: the equal slopes model, with parallel lines α1 + βx, α2 + βx and α3 + βx for Treat 1, Treat 2 and Treat 3. Right panel: the different slopes model, with lines α1 + β1x, α2 + β2x and α3 + β3x. Two covariate values, xLOW and xHIGH, are marked on the x-axis.


8.3 Example: Hormone treatment of steers

In an experiment with steers, the influence of 4 hormone treatments (1, 2, 3, 4) on the weight of kidney fat was examined. A total of 16 steers in 4 blocks were used, distributed with 4 on each hormone treatment. The results (from Mead and Curnow, 1983, Section 8.8) can be seen in the table below. The table shows the weight of each steer before the hormone treatment (in kg) and the weight Y of kidney fat (in grams), measured a suitable period of time after the hormone treatment. The weight of the animal before the hormone treatment is recorded as a covariate, with the aim of reducing the random variation as described above.

                       Hormone treatment
               1             2             3             4
          Weight    Y   Weight    Y   Weight    Y   Weight    Y
Block 1      560  1330     440  1280     530  1290     690  1340
Block 2      470  1320     440  1270     510  1300     420  1250
Block 3      410  1270     360  1270     380  1240     430  1260
Block 4      500  1320     460  1280     500  1290     540  1310

In Figure 8.2, the weight of kidney fat is plotted against the covariate for each of the 4 hormone treatments. There definitely seems to be a relationship between the weight of kidney fat and the initial body weight of each animal within each treatment group. The relationships seem similarly positive within treatment groups 1, 3, and 4, but in treatment group 2 the relationship seems less pronounced.

Whether this apparent difference in slopes is statistically significant is investigated by fitting the varying slopes model for this block setting:

Yi = d(blocki) + α(hormonei) + β(hormonei) · weighti + εi, (8-4)

where hormonei denotes the hormone treatment for experimental unit i, weighti is the animal's initial weight, and d(blocki) is the random block effect with variance σ²b. Note that the interpretation of an interaction between a factor (class variable) and a covariate is exactly the varying slopes model. The fixed effects F-tests become:

Source of       Numerator degrees   Denominator degrees         F    P-value
variation       of freedom          of freedom
weight          1                   3.69                  (28.09)   (0.0076)
treat           3                   6.5                    (1.59)   (0.2821)
weight*treat    3                   6.61                     1.44     0.3147


Figure 8.2: Relation in the hormone treatment of steers: the weight Y of kidney fat plotted against the initial weight (Weight), with plotting symbols 1–4 indicating the hormone treatment.

At this point, only the test for varying slopes/interaction is of interest: in this case it is non-significant, so we re-run the analysis without this effect, that is, we fit the model

Yi = d(blocki) + α(hormonei) + β · weighti + εi (8-5)

and obtain

Source of    Numerator degrees   Denominator degrees        F          P
variation    of freedom          of freedom
weight       1                   11                      67.5    <0.0001
treat        3                   11                      6.38     0.0092

From this table, the message is clear: there is a significant treatment effect (P-value: 0.0092). In addition, it is evident that there is a significant linear relationship (P-value: <0.0001). Hence the use of the covariate makes a real difference in this case.

For comparison, the consequence of ignoring the covariate, that is, using the straightforward randomized blocks model given by

Yi = d(blocki) + α(hormonei) + εi (8-6)


is illustrated in the following table of fixed effects:

Source of    Numerator degrees   Denominator degrees        F          P
variation    of freedom          of freedom
treat        3                   9                       2.04     0.1786

In this analysis of the data, there is no indication of a difference in the effect of the hormone treatments, and this can be attributed to the much larger random variation which appears when the initial weight is not used as an explanatory variable.

Based on Figure 8.2, it may seem questionable to use a linear relationship between the initial weight and the weight of kidney fat for each hormone treatment. There seems to be a clear curvature in the relationship. However, as the purpose of this analysis is not a detailed study of this relationship but merely to use the initial weights of the animals to reduce the random variation, the analysis is still valid. The reduction of the random variation is, to a large extent, achieved by using the linear relationship, and more accurate models would only help marginally in that respect. For this reasoning to be true, however, it is important that the animals are assigned to the treatments using randomization, so that the initial weights are not very different in the 4 treatment groups. If this is not fulfilled, a non-linear relationship between the response variable and the covariate could lead to a false significance of the treatments if we use the linear relationship.

8.4 Summary and post hoc analysis

The approach for presenting the important information about significant treatment differences in an analysis of covariance depends heavily on whether or not the slopes can be assumed to be equal. In the following, the two approaches will be presented by two examples.

8.4.1 Example with equal slopes: Hormone treatment of steers (continued)

The final equal slopes model was given by:

Yi = d(blocki) + α(hormonei) + β · weighti + εi


where

d(blocki) ∼ N(0, σ²b),   εi ∼ N(0, σ²)

are all independent. The unknown variance parameters are σ² and σ²b, and the unknown mean (fixed) parameters are the four α-values and the slope β, though the latter is usually of less interest. All the parameter estimates can be read off the R output. The variance parameters are estimated as:

σ̂²b = 0,   σ̂² = 126.1

Here, the block variation is estimated (set) to zero, indicating clearly that there is no difference between the four blocks.

In R, a zero variance component is automatically omitted from the model when the tests for the fixed effects are calculated. The careful reader may have wondered why the denominator degrees of freedom were 11 in the fixed effects test table for this model above: had the block effect been a real part of the model, it should have been 8 (think!).
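The estimated variance components can also be inspected directly from the fitted model object. A small sketch, assuming the kidney data have been read in and the equal slopes model has been fitted as model2, exactly as in the R tutorial in Section 8.7:

# Inspect the estimated variance components of the equal slopes model
# (model2 as fitted in the R tutorial in Section 8.7):
library(lmerTest)
model2 <- lmer(Y ~ treat + weight + (1|block), data = kidney)
VarCorr(model2)  # the block variance component is estimated as zero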

The treatment parameter estimates are:

             Parameter   Estimate
Hormone 1    α(1)          1150.6
Hormone 2    α(2)          1135.3
Hormone 3    α(3)          1122.2
Hormone 4    α(4)          1119.1
Slope        β             0.3287

The interpretations of these hormone group estimates are seen from the expression for the expected value of Yi in the model:

EYi = α(hormonei) + β · weighti

So the α-estimates express the level of kidney fat weight (Y) which is expected in each hormone group for animals with an initial body weight of zero. Clearly, a more relevant number to provide for each hormone group is the expected value for an average animal (with respect to initial body weight):

α̂(hormonei) + β̂ · weight

These are the LS-means values. The average initial body weight of the 16 animals in the experiment is

weight = 477.50

so using the α-estimates and the β-estimate provides the LS-means values:


             Parameter            LS-mean   Lower   Upper
Hormone 1    α(1) + β · 477.5        1308    1295    1320
Hormone 2    α(2) + β · 477.5        1292    1279    1305
Hormone 3    α(3) + β · 477.5        1279    1267    1292
Hormone 4    α(4) + β · 477.5        1276    1263    1289
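As a small arithmetic check, the LS-mean for hormone group 1 can be reproduced directly from the (rounded) parameter estimates given above:

# LS-mean for hormone 1 from the rounded parameter estimates above:
1150.6 + 0.3287 * 477.5  # approximately 1308, matching the table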

The 95% confidence intervals for the LS-means are included in the table. The interpretation of the parameters is obtained by imagining the 4 parallel lines corresponding to the 4 hormone treatments, added to the plot in Figure 8.2. The αs are the intercepts, so that the difference between two αs (or LS-means) is the vertical distance between the two lines, which can therefore be interpreted as the expected difference between two animals with the same initial weight which are given two different treatments.

In general, confidence intervals for the differences between pairs of treatments are of varying lengths. For example, two treatments are, all other things being equal, compared more precisely if their covariate values are close to each other than if they lie in two separate groups.

The LS-means show that, with a high degree of certainty, hormone treatment 1 differs from treatments 3 and 4, whereas treatments 2, 3, and 4 do not differ by more than the uncertainty with which they are determined.

8.4.2 Example with different slopes

This example is taken from Littell et al. (1996). It is a so-called balanced incomplete block design (BIB). Four treatments were given to 24 experimental units, which were partitioned into 8 blocks of size 3. This means that only three out of the four treatments are given in each block, but in such a way that each treatment occurs equally often (6 times) and such that each pair of treatments “meet” equally often (4 times). The response is Y and the covariate is X. The complete data set is shown in Figure 8.3 and in the following table:


id   block   treat    Y    X
 1       1       1   31   20
 2       1       2   29   18
 3       1       3   31   11
 4       2       1   29   37
 5       2       2   34   37
 6       2       4   33   39
 7       3       1   31   29
 8       3       3   28   12
 9       3       4   34   31
10       4       2   39   37
11       4       3   35   29
12       4       4   32   28
13       5       1   33   12
14       5       2   35   19
15       5       3   38   16
16       6       1   35   31
17       6       2   31   13
18       6       4   42   39
19       7       1   42   38
20       7       3   43   30
21       7       4   42   25
22       8       2   27   13
23       8       3   37   39
24       8       4   29   21

8.4.3 Models and analysis

The different slopes model given by

Yi = d(blocki) + α(treati) + β(treati) · xi + εi

where

d(blocki) ∼ N(0, σ²b),   εi ∼ N(0, σ²)

are all independent, was fitted in order to investigate the necessity of allowing for different slopes, that is, in order to test the hypothesis that β(1) = β(2) = β(3) = β(4):


Source of    Numerator degrees   Denominator degrees       F    P-value
variation    of freedom          of freedom
X*treat      3                   9.34                   5.12     0.0233

It is seen from the table above that, in this case, the different slopes model should be used to summarize the treatment differences. See Figure 8.4 for a plot of the data with the estimated lines added.

Figure 8.3: The response Y plotted against the covariate X, with plotting symbols 1–4 indicating the treatment.

8.4.4 Summary and post hoc analysis

The variance parameters are estimated at:

σ̂²b = 18.25,   σ̂² = 1.20

The treatment parameter estimates are:


           Parameter   Estimate
Treat 1    α(1)            26.8
Treat 2    α(2)            21.9
Treat 3    α(3)            28.6
Treat 4    α(4)            22.4
Slope 1    β(1)           0.219
Slope 2    β(2)           0.496
Slope 3    β(3)           0.263
Slope 4    β(4)           0.443

Computing LS-means (and differences between these) will only “tell the story about the treatment effects” at one specific value of the covariate (the average X). Since the slopes are different, the “treatment story” will be different for different values of the covariate. A practical solution is to choose a number of values for the covariate and then “tell the treatment story” for each of those. In Littell et al. (1996) it is recommended to use (at least) three values: a small, a medium, and a large value. The medium value could be the average or the median of the covariate. The small and large values could be the lower and upper γ percentiles, with potential choices of γ being 25, 10, 5 or even 0, the latter corresponding to using the minimum and the maximum values. See Figure 8.1 for an illustration of this.
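For the example analysed in this section, such covariate values could, for instance, be computed as sketched below, assuming the bib data frame from the R tutorial in Section 8.8. Note that the exact quartile values depend on the quantile definition used (the type argument of quantile), so they need not match the rounded values quoted below exactly.

# Possible choices of covariate values for the "treatment story":
quantile(bib$x, probs = c(0.25, 0.50, 0.75))  # quartiles of the covariate
mean(bib$x)                                   # the average covariate value
range(bib$x)                                  # minimum and maximum (gamma = 0)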

The estimate of the expected value of the response for treatment k at a specific covariate value x0 is given by

ÊYi = α̂(k) + β̂(k) · x0

With 4 treatments, the “full treatment story” for one specific covariate value is told by providing these four estimates (with confidence bands) together with the six possible pair-wise comparisons of these four values. Note that the difference between two such expected values, say k = 1 and k = 2, is given by

α̂(1) − α̂(2) + (β̂(1) − β̂(2)) · x0

The average covariate value is in this case 26. The lower quartile (γ = 25) is 17 and the upper quartile is 37. The full treatment story for each of these three x0-values is given in the following three tables:


Figure 8.4: The response Y plotted against the covariate X with the four expected lines.

Treatment story for X = 17
              Estimate      Lower     Upper   P-value
Treat 1           30.5       26.7      34.3         –
Treat 2           30.4       26.7      34.0         –
Treat 3           33.1       29.4      36.8         –
Treat 4           29.9       25.7      34.1         –
Trt1-Trt2       0.1553    -2.0055    2.3161    0.8748
Trt1-Trt3      -2.6071    -4.6809   -0.5334    0.0192
Trt1-Trt4       0.6255    -3.0581    4.3091    0.7116
Trt2-Trt3      -2.7625    -4.5637   -0.9612    0.0070
Trt2-Trt4       0.4702    -2.6929    3.6332    0.7457
Trt3-Trt4       3.2326    0.02771    6.4375    0.0484
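As a small arithmetic check, the estimate for treatment 1 at X = 17 can be reproduced from the (rounded) parameter estimates given above:

# Expected response for treatment 1 at x0 = 17, from the rounded estimates:
26.8 + 0.219 * 17  # = 30.5, matching the Treat 1 row in the X = 17 table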


Treatment story for X = 26
              Estimate      Lower     Upper   P-value
Treat 1           32.5       28.8      36.1         –
Treat 2           34.8       31.2      38.5         –
Treat 3           35.5       31.8      39.2         –
Treat 4           33.9       30.2      37.6         –
Trt1-Trt2      -2.3390    -3.9477   -0.7304    0.0093
Trt1-Trt3      -3.0084    -4.7111   -1.3058    0.0030
Trt1-Trt4      -1.3884    -3.3507    0.5738    0.1448
Trt2-Trt3      -0.6694    -2.3240    0.9852    0.3850
Trt2-Trt4       0.9506    -0.9093    2.8106    0.2786
Trt3-Trt4       1.6200   -0.04603    3.2861    0.0554

Treatment story for X = 37
              Estimate      Lower     Upper   P-value
Treat 1           34.9       31.1      38.7         –
Treat 2           40.3       36.4      44.1         –
Treat 3           38.4       34.5      42.3         –
Treat 4           38.7       35.0      42.5         –
Trt1-Trt2      -5.3877    -7.9110   -2.8643    0.0009
Trt1-Trt3      -3.4989    -6.2208   -0.7771    0.0172
Trt1-Trt4      -3.8498    -5.9606   -1.7390    0.0025
Trt2-Trt3       1.8887    -1.0768    4.8543    0.1846
Trt2-Trt4       1.5378    -0.9040    3.9797    0.1885
Trt3-Trt4      -0.3509    -3.2497    2.5479    0.7916

Figure 8.4 is extremely useful as a “story teller”; the tables provide some additional significance information.

8.4.5 Grouping treatments based on slopes post hoc analysis

An approach for simplifying the post hoc analysis is to group similar treatments together before the summary above is carried out. In this different slopes model, it could be useful to group together treatments for which the slopes are equal, since the treatment differences would be the same for such treatments. To investigate the possibility of this, a post hoc analysis of the four slopes is carried out, that is, the six hypotheses

H0 : β(k1) = β(k2)


are tested. The results are:

Slopes comparison
            Estimate   P-value
β1 − β2      -0.2771    0.0048
β1 − β3     -0.04459    0.5647
β1 − β4      -0.2238    0.0627
β2 − β3       0.2326    0.0143
β2 − β4      0.09714    0.5955
β3 − β4      -0.1792    0.1538

In this case, it would be an option to group together treatments 1 and 3, and also treatments 2 and 4. Note that this grouping should only be used for the slopes part of the model, not the intercept (main effects) part of the model.
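One possible way to fit such a grouped-slopes model in R is sketched below. It assumes the bib data frame from the R tutorial in Section 8.8, and the variable slopegrp is a hypothetical helper introduced here only for illustration. Note that the grouping enters only in the slope term, while trt keeps separate intercepts:

# Sketch of a grouped-slopes model: treatments 1 & 3 share one slope,
# treatments 2 & 4 another; 'slopegrp' is a hypothetical helper variable.
library(lmerTest)
bib$slopegrp <- factor(ifelse(bib$trt %in% c("1", "3"), "grp13", "grp24"))
model_grp <- lmer(y ~ trt + slopegrp:x + (1|blk), data = bib)
anova(model_grp)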

8.5 Analysis of covariance in perspective

Analysis of covariance models are, as already mentioned, not limited to the most simple experimental designs but can equally well be used in experiments with several factors and other explanatory variables. We have seen block design examples, but it could just as well have been an experiment with several fixed and random factors. Furthermore, the covariate measurement does not have to be on the level of the observational unit of an experiment. It could just as well be on the levels of some of the random factors in the model. The principle is still to:

1. Add the covariate term to the model together with possible interactions with fixed factors of the model.

2. Simplify the interactions with the covariate.

3. Do the testing, summary and post hoc analysis in the resulting model.

Also, there is nothing wrong with using several covariates in the same experiment. In an epidemiological investigation concerning the spread of a disease in cattle herds one could, for example, with herds as the experimental units, use both the herd size and the average age of the cattle as covariates. One advantage of using covariates is that each covariate only costs one degree of freedom (DFe decreases by 1 when a covariate is included in the model), corresponding to the inclusion of one slope as a parameter.
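A sketch of such a model with two covariates, using purely hypothetical names (a data frame herds with a response y, a treatment factor treat and the covariates size and age):

# Hypothetical sketch with two covariates; 'herds', 'y', 'treat', 'size'
# and 'age' are illustrative names, not data from this eNote.
m <- lm(y ~ treat + size + age, data = herds)
drop1(m, test = "F")  # each covariate uses only one degree of freedom (one slope)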


In the example with the hormone treatment of steers, one might have imagined dividing the animals into blocks based on their initial weights, but that would have cost more degrees of freedom. On the other hand, there can be other advantages of dividing into blocks, e.g. when a non-linear relationship between the covariate and the response variable is assumed or known to hold.

In the classical analysis of covariance, as described here, it is an assumption that the treatments do not influence the covariate, which one is best convinced of if the covariate is measured before the experimental units are randomized across treatments. It is problematic to do an analysis of covariance using a covariate that can be influenced by the treatments: the interpretation of the results becomes much harder, with an increased possibility of mistaken conclusions. Assume for instance that we want to compare some different feeds in a growth experiment with pigs. If ad libitum feeding is used, it is likely that some animals eat more than others, and in order to correct for this, one could note each animal's feed intake and use this as a covariate in the model for the weight increases. The problem is that one then runs the risk of correcting away the treatment effects, as a larger weight increase probably means that the animal (since it is bigger than the others) eats more and therefore has a larger feed intake. In an analysis, this might come out as the feed intake being responsible for the weight increase rather than the treatments and, logically speaking, it can be hard to distinguish between cause and effect in such an experiment. The question is, in this situation, whether one should correct for the feed intake (using an analysis of covariance) or not when testing the treatment effect.

8.6 The use of baseline measurements

A special kind of covariate is the response variable measured before the start of the experiment, a so-called baseline measurement. Baseline measurements are often made when one wants to take into account varying levels of the response variable among the experimental units in the treatment groups at the onset of the experiment. Often such data are analyzed by considering the difference between the response variable measured after and before the start of the experiment as a new response variable. In this section, we want to propose an alternative approach, namely to include the baseline measurement in the model as a covariate.

Let us be a little more specific. Suppose that we want to compare k treatments t1, . . . , tk, given as the factor t, and that for this purpose we have nT(j) experimental units corresponding to treatment tj, for j = 1, . . . , k. If xi and Yi denote the baseline measurement and the response variable for the i'th experimental unit, respectively, then a typical approach


is to form the response variable as the difference

Di = Yi − xi, i = 1, . . . , N,

and analyze D1, . . . , DN using a one-way analysis of variance model, that is, by the model (8-7),

Di = α(ti) + εi, i = 1, . . . , N, (8-7)

where ε1, . . . , εN are independent and identically N(0, σ²)-distributed random variables.

The problem is that even though we implicitly assume that the baseline measurement is made without any error, this is rarely the case, as the typical situation is that the response variable and the baseline measurements are obtained in the same way. Why should we then include the baseline measurement as a covariate instead of subtracting it from the response variable, that is, why should we consider the analysis of covariance model (8-8)?

Yi = α(ti) + β · xi + ei. (8-8)

In (8-8), we have assumed, as usual, that ei ∼ N(0, σ²), for i = 1, . . . , N, and that they are mutually independent. Let us consider a few possible scenarios. If there is a very small error associated with the baseline measurement, then the one-way analysis of variance model for the increment D given by (8-7) is a special case of the analysis of covariance model (8-8) corresponding to β = 1. Clearly it would be preferable to start out with the larger model and then test the adequacy of the one-way analysis of variance model, should this be desirable. If, on the other hand, there is a very large error associated with the baseline measurement, then it will dominate completely and hide any potential treatment effect in the one-way analysis of variance model. The only effect on the analysis of covariance model, however, would be that there is no reason to include the baseline measurement as a covariate, that is, we would expect to accept a test of the hypothesis H0 : β = 0, which is typically not of interest. If the error associated with the baseline measurement is the same as that of the response variable, the variance corresponding to the one-way analysis of variance model will always be bigger than the variance corresponding to the analysis of covariance model. For example, if the baseline measurement and the response variable are uncorrelated, the variance corresponding to the one-way analysis of variance model will be twice that of the analysis of covariance model. The conclusion is that in all cases it is preferable to use the analysis of covariance model.
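In R, the two competing models could be specified as follows. This is only a sketch with hypothetical names (a data frame dat with response Y, baseline x and treatment factor t); the concrete cattle example is analysed in Section 8.9.

# Change-score model (8-7) versus analysis of covariance model (8-8);
# 'dat', 'Y', 'x' and 't' are hypothetical names used only for illustration.
m_diff  <- lm(Y - x ~ t, data = dat)  # one-way ANOVA of the difference, model (8-7)
m_ancov <- lm(Y ~ t + x, data = dat)  # analysis of covariance, model (8-8)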


8.6.1 Example: Concentration of a hormone in cattle

In an experiment, the effect of 3 feed compositions on the concentration of a particular hormone in cattle was investigated. There were 9, 12, and 11 cows, respectively, in the 3 treatment groups, and for every cow the concentration of the hormone was measured before the start of the experiment and again after a certain period of feeding with the experimental compositions. In the following table, the initial and final concentrations are given for all cows in the 3 treatment groups:

                  Feed composition
         1                 2                 3
 Initial   Final   Initial   Final   Initial   Final
     207     216       220     225       239     268
     196     199       201     177       226     229
     217     256       229     252       235     259
     210     234       212     214       226     240
     202     203       210     202       208     182
     201     214       217     237       236     253
     214     225       240     278       218     214
     223     255       232     259       225     246
     190     182       199     184       207     194
                       223     244       238     272
                       230     246       219     222
                       231     266

In this example we have a treatment factor T with k = 3 levels corresponding to the 3 feed compositions. Furthermore, we have a total of N = 32 experimental units (cows) and nT(1) = 9, nT(2) = 12, and nT(3) = 11. We let Yi and xi denote the final and initial concentrations of the hormone for the i'th cow, i = 1, . . . , 32. If Di denotes the difference between the final and initial hormone concentrations for the i'th cow, then we can consider the one-way analysis of variance model given by (8-7) and the analysis of covariance model given by (8-8) for these data in an attempt to find out whether there is a treatment effect or not. In Figure 8.5, the final concentration is plotted against the initial hormone concentration for the 3 treatment groups.

Figure 8.5: The response Y (final hormone concentration) plotted against the covariate X (initial hormone concentration), with plotting symbols 1–3 indicating the feed composition.

A test for the hypothesis that there is no effect of the feed composition on the difference between the final and initial hormone concentrations, H0 : α(1) = α(2) = α(3), in the one-way analysis of variance model results in a p-value of 0.86, so we would clearly accept that there is no treatment effect. Based on the analysis of covariance model (a different slopes model is in this case not significant), we get a p-value of less than 0.0001 for the same hypothesis, indicating that there is a clear effect of the feed composition on the final hormone concentration when the initial concentration is taken into account. In other words, we get completely opposite conclusions from the two analyses, and our recommendation is to use the analysis of covariance model.

The three feed compositions are all significantly different from each other, and the least squares means and associated 95% confidence intervals are given by

α̂(1) + β̂ · x̄ = 248.31,   [242.71, 253.91],
α̂(2) + β̂ · x̄ = 226.72,   [222.43, 231.01],
α̂(3) + β̂ · x̄ = 217.41,   [212.68, 222.15].

8.7 R-TUTORIAL: Hormone treatment of steers

Consider the analysis of the kidney data (kidney.txt).


A plot of the relation between weight and Y is obtained using the following code and shown in Figure 8.2.

kidney <- read.table("kidney.txt", sep=" ", header=TRUE)

kidney$block <- factor(kidney$block)

kidney$treat <- factor(kidney$treat)

with(kidney, {plot(weight, Y, type="n", xlab = "Weight", ylab = "Y", las=1)

points(weight[treat == 1], Y[treat == 1], pch = "1", col=1)

points(weight[treat == 2], Y[treat == 2], pch = "2", col=2)

points(weight[treat == 3], Y[treat == 3], pch = "3", col=3)

points(weight[treat == 4], Y[treat == 4], pch = "4", col=4)

})

The function points adds points to an already existing plot, whereas plot creates a new plot every time it is evaluated. Using type = "n" causes an empty plot to be made. The option pch takes as argument an integer specifying a symbol or a single character (for example pch="a"), and this will be the plotting symbol used. Notice how the values of weight and Y belonging to a particular level of treat are obtained using the square brackets [] and the double equals sign == between the factor name and the factor level.

An analysis using only the factors block and treat is based on the model

library(lmerTest)

model0 <- lmer(Y ~ treat + (1|block), data = kidney)

anova(model0)

Type III Analysis of Variance Table with Satterthwaite’s method

Sum Sq Mean Sq NumDF DenDF F value Pr(>F)

treat 2875 958.33 3 9 2.0414 0.1786

The effect of treat is not significant.

Now consider the analysis of covariance of the data set kidney. The different slopes model is specified as:

model1 <- lmer(Y ~ treat*weight + (1|block), data = kidney)

anova(model1)

Type III Analysis of Variance Table with Satterthwaite’s method


Sum Sq Mean Sq NumDF DenDF F value Pr(>F)

treat 518.88 172.96 3 6.5908 1.5862 0.280914

weight 3062.17 3062.17 1 3.6925 28.0821 0.007596 **

treat:weight 470.62 156.87 3 6.7033 1.4386 0.313591

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Can model1 be reduced to the equal slopes model? Yes, the equal slopes model is accepted. Thus, consider the equal slopes model:

model2 <- lmer(Y ~ treat + weight + (1|block), data = kidney)

anova(model2)

Type III Analysis of Variance Table with Satterthwaite’s method

Sum Sq Mean Sq NumDF DenDF F value Pr(>F)

treat 2413.5 804.5 3 11 6.3794 0.009175 **

weight 8512.8 8512.8 1 11 67.5044 5.065e-06 ***

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

The effect of treat in the equal slopes model is significant.

LS-means values and differences thereof (for the average value of weight) can be extracted with:

library(emmeans)

(lstreat <- emmeans::emmeans(model2, ~ treat))

treat emmean SE df lower.CL upper.CL

1 1308 5.63 10.9 1295 1320

2 1292 6.19 10.4 1279 1306

3 1279 5.62 10.9 1267 1292

4 1276 6.00 10.6 1263 1289

Degrees-of-freedom method: kenward-roger

Confidence level used: 0.95

pairs(lstreat)


contrast estimate SE df t.ratio p.value

1 - 2 15.28 8.48 9.50 1.802 0.3295

1 - 3 28.36 7.94 8.46 3.569 0.0278

1 - 4 31.50 8.13 8.84 3.876 0.0167

2 - 3 13.08 8.39 9.35 1.558 0.4446

2 - 4 16.22 9.23 10.53 1.758 0.3438

3 - 4 3.15 8.18 8.95 0.385 0.9795

P value adjustment: tukey method for comparing a family of 4 estimates

8.8 R-TUTORIAL: Balanced incomplete block design

Consider the analysis of the bib data (bib.txt).

The initial plot of x versus y differentiated by trt shown in Figure 8.3 was generated with the following code:

bib <- read.table("bib.txt", sep=" ", header=TRUE)

bib$blk <- factor(bib$blk)

bib$trt <- factor(bib$trt)

with(bib, {plot(x, y, type="n", xlab = "x", ylab = "y", las=1)

points(x[trt == 1], y[trt == 1], pch = "1", col=1)

points(x[trt == 2], y[trt == 2], pch = "2", col=2)

points(x[trt == 3], y[trt == 3], pch = "3", col=3)

points(x[trt == 4], y[trt == 4], pch = "4", col=4)

})

The different slopes model cannot be reduced, as is seen from the ANOVA table:

model1 <- lmer(y ~ trt * x + (1|blk), data = bib)

anova(model1)

Type III Analysis of Variance Table with Satterthwaite’s method

Sum Sq Mean Sq NumDF DenDF F value Pr(>F)

trt 16.206 5.402 3 9.3237 4.5003 0.03291 *

x 119.613 119.613 1 9.5204 99.6440 2.368e-06 ***


trt:x 18.427 6.142 3 9.3372 5.1168 0.02327 *

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

To superimpose the estimated regression lines on the previous plot, it is convenient to fit the model using a different parameterization. The following code adds the regression lines to the plot; the result is seen in Figure 8.4.

model2 <- lmer(y ~ -1 + trt + x:trt +(1|blk), data = bib)

B <- fixef(model2)

with(bib, { # add regression lines by trt:

abline(a=B["trt1"], b=B["trt1:x"], lty=1, col=1)

abline(a=B["trt2"], b=B["trt2:x"], lty=2, col=2)

abline(a=B["trt3"], b=B["trt3:x"], lty=3, col=3)

abline(a=B["trt4"], b=B["trt4:x"], lty=4, col=4)

})

Note that B contains the parameter estimates without an intercept, which was suppressed in the model formula in the call to lmer:

B

trt1 trt2 trt3 trt4 trt1:x trt2:x

26.7973319 21.9304740 28.6464799 22.3678410 0.2187821 0.4959315

trt3:x trt4:x

0.2633705 0.4425478

The first four numbers represent the intercepts and the last four the slopes for the regression lines of the four treatment groups.

The function abline adds a straight line to a plot, and as arguments it can take an intercept and a slope, as shown here. The line type is specified using the option lty.

Actually, one should realize that the four lines added here, as taken from the mixed model with random block effects, are not the regression fits of the plotted points within each treatment. And why not? Because it makes a difference to correct for the block effects. If we wanted to simply add the four observed within-treatment lines to the plot as part of an explorative investigation, which for that purpose would be fine, it could be done as follows:


bib <- read.table("bib.txt", sep=" ", header=TRUE)

bib$blk <- factor(bib$blk)

bib$trt <- factor(bib$trt)

with(bib, {plot(x, y, type="n", xlab = "x", ylab = "y", las=1)

points(x[trt == 1], y[trt == 1], pch = "1", col=1)

points(x[trt == 2], y[trt == 2], pch = "2", col=2)

points(x[trt == 3], y[trt == 3], pch = "3", col=3)

points(x[trt == 4], y[trt == 4], pch = "4", col=4)

})

for(i in 1:4)

abline(lm(y ~ x, data = bib, subset = trt==i), lty = i, col = i)

The resulting plot is shown in Figure 8.6.

Figure 8.6: The response Y plotted against the covariate X while not controlling for the block effect.

The estimated parameter values of these four lines (not corrected for the block effects) are:


model2lm <- lm(y ~ -1 + trt + trt:x, data = bib)

coef(summary(model2lm))

Estimate Std. Error t value Pr(>|t|)

trt1 29.4407830 6.0725728 4.8481565 1.779894e-04

trt2 25.6626284 4.6583978 5.5088959 4.761930e-05

trt3 28.6268771 4.5928510 6.2329209 1.197034e-05

trt4 27.6942910 8.8722531 3.1214496 6.577494e-03

trt1:x 0.1458401 0.2070931 0.7042250 4.914162e-01

trt2:x 0.2994469 0.1860632 1.6093827 1.270830e-01

trt3:x 0.2937134 0.1829110 1.6057723 1.278773e-01

trt4:x 0.2504604 0.2840667 0.8816958 3.909948e-01

The mixed model-based estimates are the ones we should use for further analysis.

To compute the treatment story at particular x values, e.g. x = 17, x = 26 and x = 37, we can use the emmeans function from the emmeans package:

model2 <- lmer(y ~ trt*x + (1|blk), data = bib)

trtlsmns <- emmeans::emmeans(model2, "trt", by = "x",

at = list(x = c(17, 26, 37)))

plot(trtlsmns)

The resulting plot is shown in Figure 8.7.

One could also get all the pairwise differences at all the chosen x-values by:

pairs(trtlsmns)

x = 17:

contrast estimate SE df t.ratio p.value

1 - 2 0.155 0.964 9.18 0.161 0.9984

1 - 3 -2.607 0.922 9.11 -2.827 0.0772

1 - 4 0.625 1.662 9.46 0.376 0.9808

2 - 3 -2.762 0.801 9.10 -3.450 0.0301

2 - 4 0.470 1.420 9.33 0.331 0.9867

3 - 4 3.233 1.442 9.38 2.242 0.1810


Figure 8.7: Comparison of LS-means for treatment at three levels of the covariate (x = 17, x = 26 and x = 37).

x = 26:

contrast estimate SE df t.ratio p.value

1 - 2 -2.339 0.716 9.12 -3.269 0.0395

1 - 3 -3.008 0.762 9.24 -3.950 0.0139

1 - 4 -1.388 0.882 9.35 -1.575 0.4360

2 - 3 -0.669 0.737 9.15 -0.908 0.8012

2 - 4 0.951 0.831 9.22 1.144 0.6735

3 - 4 1.620 0.739 9.07 2.191 0.1969

x = 37:

contrast estimate SE df t.ratio p.value

1 - 2 -5.388 1.123 9.12 -4.799 0.0043

1 - 3 -3.499 1.220 9.30 -2.867 0.0716

1 - 4 -3.850 0.937 9.07 -4.108 0.0114

2 - 3 1.889 1.330 9.30 1.421 0.5174

2 - 4 1.538 1.086 9.10 1.417 0.5202

3 - 4 -0.351 1.305 9.41 -0.269 0.9927


P value adjustment: tukey method for comparing a family of 4 estimates

Comparison of slopes is accomplished using the lstrends function from the emmeans package. We use pairs to obtain the pairwise differences of the slopes:

pairs(lstrends(model2, ~ trt, var = "x"))

contrast estimate SE df t.ratio p.value

1 - 2 -0.2771 0.0756 9.17 -3.666 0.0216

1 - 3 -0.0446 0.0751 9.20 -0.594 0.9315

1 - 4 -0.2238 0.1072 9.39 -2.087 0.2253

2 - 3 0.2326 0.0782 9.32 2.976 0.0605

2 - 4 0.0534 0.0979 9.28 0.545 0.9456

3 - 4 -0.1792 0.1173 9.53 -1.527 0.4598

P value adjustment: tukey method for comparing a family of 4 estimates

8.9 R-TUTORIAL: Concentration of a hormone in cattle

Consider the analysis of the hormbase.txt data. The plot of initial concentration and final concentration with separate plotting symbols for the 3 levels of feed shown in Figure 8.5 was generated with the following code:

# Read the data first (the file format is assumed to match the other data sets)
# and make sure feed is a factor:
hormbase <- read.table("hormbase.txt", sep=" ", header=TRUE)

hormbase$feed <- factor(hormbase$feed)

with(hormbase, {plot(initial, final, type="n", las=1,

xlab="Initial hormone concentration",

ylab="Final hormone concentration")

for(i in levels(hormbase$feed))

points(initial[feed == i], final[feed == i], pch=i, col=i)

})

Compute the difference, D say, between the final and initial hormone concentrations, and add the new variable to the data frame hormbase as follows:


hormbase <- within(hormbase, {D <- final - initial

})

The one-way analysis of variance is performed using lm:

model1 <- lm(D ~ feed, data = hormbase)

anova(model1)

Analysis of Variance Table

Response: D

Df Sum Sq Mean Sq F value Pr(>F)

feed 2 101.5 50.74 0.1537 0.8582

Residuals 29 9574.4 330.15

The factor feed is not significant. Now consider the analysis of covariance model:

model2 <- lm(final ~ feed + initial, data = hormbase)

drop1(model2, test="F")

Single term deletions

Model:

final ~ feed + initial

Df Sum of Sq RSS AIC F value Pr(>F)

<none> 1451.0 130.06

feed 2 3377.9 4828.9 164.53 32.593 4.89e-08 ***

initial 1 23520.0 24970.9 219.11 453.878 < 2.2e-16 ***

---

Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Now the factor feed is highly significant. We get the LS-means and differences thereof for an average initial value with:


initialMean <- mean(hormbase$initial, na.rm = TRUE)

(feedlsmns <- emmeans::emmeans(model2, "feed",

at = list(initial = initialMean)))

feed emmean SE df lower.CL upper.CL

1 248 2.73 28 243 254

2 227 2.09 28 222 231

3 217 2.31 28 213 222

Confidence level used: 0.95

pairs(feedlsmns)

contrast estimate SE df t.ratio p.value

1 - 2 21.6 3.54 28 6.108 <.0001

1 - 3 30.9 3.86 28 8.001 <.0001

2 - 3 9.3 3.06 28 3.046 0.0135

P value adjustment: tukey method for comparing a family of 3 estimates

8.10 Exercises

Exercise 1 Different slopes example

Consider the bib data, the BIB example with different slopes. The data file is available in bib.txt.

a) Carry out the post hoc analysis using the suggested grouping of the slopes. Include a plot of the data reflecting this assumption of pair-wise equal slopes.

b) Carry out a model diagnostics investigation.