Introduction and basic conceptspublicifsv.sund.ku.dk/.../repeated14/intro2014-nup.pdf · 2014. 11. 27. · university of copenhagen department of biostatistics Plan Today: Basic concepts

u n i v e r s i t y o f c o p e n h a g e n d e p a r t m e n t o f b i o s t a t i s t i c s

Faculty of Health Sciences

Introduction and basic conceptsAnalysis of repeated measurements, 27th November 2014

Julie Lyng Forman & Lene Theil SkovgaardDepartment of Biostatistics, University of Copenhagen


Outline

About the course

What are repeated measurements? (FLW:2011, ch. 1-2)

Warm up: Paired data

The multivariate normal distribution (FLW:2011, ch. 4)

Linear models for longitudinal data (FLW:2011, ch. 3)

Analysis of response profiles (FLW:2011, ch. 5)

SAS proc mixed (FLW:2011, ch. 5.9)

Presenting longitudinal data

2 / 71


Aim of the course

To make the participants able to:I understand and interpret advanced statistical analysesI judge the assumptions behind the use of various methods of

analysesI perform own analyses using SASI understand output from a statistical program package

- in general, i.e. other than SASI present results from a statistical analysis - numerically and

graphically

To create a better platform for communication between ’users’ ofstatistics and statisticians, to benefit subsequent collaboration

3 / 71


We expect students to . . .

Be interested

Be motivated (from your own research project)

Have basic knowledge of statistical concepts such as:I mean, averageI variance, standard deviation, standard errorI (normal)distributionI correlation, regression, ANOVA, GLM.I t-test, χ2-test, F-test

. . . and an open mind towards mathematical ideas and modeldescriptions.

4 / 71


Topics for the course

Extensions of ANOVA and (multiple) regression models:I Variance component modelsI Linear mixed models

for correlated data / repeated measurements

Extensions of generalized linear models:I Generalized linear mixed models for binary and count data

for correlated data / repeated measurements.

Not covered:I Censored data (survival analysis)I Multivariate data (several different outcomes at once)

5 / 71


Recommended readingThe lecture notes, exercises etc at:

I http://publicifsv.sund.ku.dk/∼jufo/RepeatedMeasuresE2014

The book:I G.M. Fitzmaurice, N.M. Laird & J.H. Ware :

Applied Longitudinal Analysis (2nd edition),John Wiley & sons, 2011

We teach SAS programming.I Additional R and Stata examples can be found at:

www.biostat.harvard.edu/∼fitzmaur/ala2e

I some SPSS examples can be found at:http://www.floppybunny.org/robin/web/virtualclassroom/stats/course2.html

6 / 71


Teaching activities

Lectures:I Thursday and Friday mornings (9.15–12.00)I Copies of overheads must be downloaded in advanceI Coffee break around 10.30

Computers labs:I In the afternoon (13.00-15.45) following each lectureI Coffee, tea, and fruit will be servedI Problems and datasets must be downloaded in advanceI Solutions can be downloaded after classes

7 / 71


Course diploma

To pass the course 80% attendance is required.I It is your responsibility to sign the list each morningI and each afternoon.I Note: 5× 2 = 10 lists, 80% equals 8 half days.

There is no compulsory home work . . .I but to benefit from the course you need to work with the

material at homeI We expect you to do so!

8 / 71


Plan

Today: Basic concepts for correlated and clustered data.I Analysis of response profiles for balanced longitudinal data.

Rest of the course:I Covariance pattern models for longitudinal data.I Linear mixed models.I Variance component and multi-level models for clustered data.I Generalized linear mixed models for binary and count data.

including misc. issues e.g. baseline corrections, modelchecking,missing data, making figures for presentation.

9 / 71


Plan

All days:I Mathematical model descriptions cannot be avoided.I BUT: Emphasis is on choosing a proper statistical analysis

and and on interpreting it.I SAS-programming: proc mixed, proc genmod.I Many real data examples.

Note: Many of the datasets we use in our case studies are in facttoo small to yield interesting conclusion. But due to their smallsize they are useful for illustrative purposes.

10 / 71


Outline

About the course








11 / 71


What are repeated measurents?

Repeated measurements refer to data where the same outcome hasbeen measured in different situations (or at different spots) on thesame individuals.

I Special case: longitudinal means repeatedly over time.

12 / 71


What is clustered data?

Repeated measurements are termed clustered data when the sameoutcome is measured on groups of individuals from the samefamilies/workplaces/school classes/clinics/etc.

13 / 71


Statistical analysis

The usual assumption is that observations are independent.

If you have clustered or repeated measurements the assumption ofindependence is violated.

I Your analyses must account for the repetitions/clustering.I In this course we will teach you how to do it.

Warning: Ignoring the repetitions/clustering and doing a standardanalysis most often leads to:

I p-values that are too small or too large.I confidence intervals that are too wide or too narrow.

14 / 71


Outline

About the course








15 / 71


Paired data

The most simple example of clustered or repeated measuments.I Two replicates or two subjects per cluster

Examples of paired data:I Same person with treatment and placeboI Baseline-follow up studyI Twin studyI Comparison of two measurement methodsI Reliability of a measurement method

Quantiative outcome analysed with the paired t-test.

16 / 71


Example: Energy intakeMean daily dietary intake for 10 women recorded over 10pre-menstrual and 10 post-menstrual days.

Subject Pre-menstrual Post-menstrual Difference1 5260 3910 13502 5470 4220 12503 5640 3885 17554 6180 5160 10205 6390 5645 7456 6515 4680 18357 6805 5265 15408 7515 5975 15409 7515 6790 72510 8230 6900 133011 8770 7335 1435

Mean 6753.6 5433.2 1320.5SD 1142.1 1216.8 366.7

(D.G. Altman: Practical Statistics for Medical Research, Section 9.5)17 / 71


Example: Energy intakeStrong association between pre- and post-menstrual dietary intake:

I Correlation: 0.95 (95% CI: 0.83;0.99).

18 / 71


Covariance and correlation

Used to describe (linear) association between dependent variables.

The covariance between two measurements is:

Cov(Y1,Y2) = E{(Y1 − µ1)(Y2 − µ2)}

The correlation between two measurements is:

Cor(Y1,Y2) = Cov(Y1,Y2)sd(Y1)sd(Y2)

In case Y1 and Y2 are independent, then correlation = 0.If a strong linear association is found then |correlation| ' 1.

19 / 71


Standard error in paired testing

Estimated difference between pre- and post-mean:

Y 2 −Y 1

Statistical uncertainty is assessed by the standard error of thedifferences between the means:

s.e.(Y 2 −Y 1) =√

Var(Y 2 −Y 1)

I By standard mathematical arguments:

Var(Y 2 −Y 1) = Var(Y1)n + Var(Y2)

n − 2Cov(Y1,Y2)n

Note the covariance!l20 / 71


Example: Paired vs unpaired comparison

We want to compare pre- and post-menstrual dietary intake.I Test H0 : µ1 = µ2.I Find a confidence interval for µ1 − µ2.

Very different results from the paired t-test (correct analysis) andthe two-sample t-test (wrong analysis):

Analysis Estimate (95% CI) P-valuepaired t-test 1320 (1074;1567) 0.0000003two-sample t-test 1320 (271; 2370) 0.01625

21 / 71


Outline

About the course








22 / 71


The distribution of repeated outcomes

Repeated measurements are characterized by beingI mutually dependent or correlated.

We need to characterize their joint distribution.

Standard model for quantiative data: The multivariate normalI Location: mean-vectorI Variability: covariance-matrix

Main interest is in modeling the mean.I BUT: We also need to model the covariance in order to

account for it in the analyses.

23 / 71


Notation

Mean and variance of the n-dimensional normal distribution:

µ =

µ1µ2...µn

, Σ =

σ21 σ12 . . . σ1n

σ21 σ22 . . . σ2n

... ... ...σn1 σn2 . . . σ2

n

NOTE:I Variances σ2

1, . . . , σ2n along the diagonal

I Covariances are symmetric, i.e. σjk = σkj .

24 / 71


Notation

Let ρjk = Cor(Yj ,Yk) = Cov(Yj ,Yk)/√

Var(Yj)Var(Yk).

Correlation matrix =

1 ρ12 . . . ρ1nρ21 1 . . . ρ2n... ... ...ρn1 ρn2 . . . 1

NOTE:I 1’s along the diagonal.I Correlations are symmetric, i.e. ρij = ρji .

25 / 71


The multivariate normal distribution

Source: Wikipedia.26 / 71


The multivariate normal distribution

Source: Wikipedia.27 / 71


The likelihood function

The density of the multi-variate normal distribution yields thelikelihood function of the longitudinal data (or other repeatedmeasurements):

N∏

i=1(2π|Σ(θ)|)− n

2 exp{−(yi − µi(β))T Σ(θ)−1(yi − µi(β))

2

}

Maximize to get estimates of the model parameters:I β (mean value structure)I θ (covariance structure)

28 / 71


Residual-likelihoodRecall: The likelihood function is used for estimating the modelparameters and for likelihood ratio testing.

I With repeated measurements two different versions of thelikelihood co-exists:

The conventional (or full) likelihood (ML)I It is not often used for model fitting.

The residual-likelihood (REML) yields somewhat moreaccurate estimates.1. After an initial fit of the mean-value structure,2. the covariance is estimated from the ’likelihood’ of the

resulting residuals (a non-linear optimization problem),3. and finally the mean-value structure is re-fitted by

GLS (weighting by the estimated covariance).29 / 71


Is normality really needed?

The standard assumption is that outcomes from the samesubject follow a multivariate normal distribution.

But: the linear models for repeated outcomes are robust.I If sample size is not too small.I If the distribution of the data is not too skew.

Highly skew data should always be transformed.

30 / 71


Outline

About the course








31 / 71


Longitudinal studies

In longitudinal studies measurements are taken repeatedly on thesame subjects over time.

Advantages:I This allows us to study changes over time within subjects

and factors that influence these changes, e.g. treatment.I By comparing each individuals responses at two or more

occations we eliminate extraneuous but unavoidable sources ofvariabitlity among individuals. Thus we obtain more accurateestimates and more certain conclusions about changes overtime than in cross-sectional studies.

Disadvantages:I Correlated measurements complicate the statistical analyses.

32 / 71


Notation

Longitudinal data is described as:I Subjects i = 1, . . . ,N .I Observations Yi1, . . . ,Yini (from subject i).I Taken at occations ti1, . . . , tini (for subject i).I Possibly additional covariates Xij1, . . . ,Xijp

(for subject i at occation j).

Convention: Subscripts "i"are dropped when occations t1, . . . , tnare the same for all subjects.

33 / 71


The linear model for longitudinal data

We model the response at each occation as dependent on time andpossibly other covariates

Yij = µit + εij , j = 1, . . . ,ni

= expected value + error term.

Which model?I A linear model for the mean

µij = Xijβ = β1 + β2Xij1 + . . .+ βp+1Xijp

I A multivariate normal model for the error terms.(We need to specify the error covariance matrix).

34 / 71


The linear model for longitudinal dataAsssume: Same n occations for all subjects.

The linear model for the response-vectors:

Yi1Yi2...

Yin

=

Xi11 Xi12 . . . Xi1pXi21 Xi22 . . . Xi2p... ... ...

Xin1 Xin2 . . . Xinp

·

β1β2...βp

+

εi1εi2...εin

Compact notation:Yi = Xi · β + εi

I Where Xi is the n × p design-matrix for subject i.I Error terms are multivariate normal εi ∼ Nn(0,Σ)

For now we assume the covariance Σ is the same for all subjects35 / 71


Modeling the mean

The mean is specified just as in a conventional linear model.I including covariates that can be both cathegorical and

continuous.I E.g. treatment, gender, and age.

The time-effect is always included.I As a factorI or as a linear/polynomial trend.

Note that covariates are allowed to change with time.

36 / 71


Modeling the covarianceIn contrary to the conventional linear model (for indenpendentdata), we need to specify a model for the covariance.

Unstructured covariance (today)I No retrictions on covariance parameters, hence many of them.

Covariance pattern models (lecture 2)I Make use of the time ordering to describe covariance with

fewer parameters. Often borrowed from time series analysis.

Variance components (lectures 2–3)I Few parameters. Historical approach. Mostly too simplistic to

be realistic in longitudinal data settings, but often reasonablewith clustered data.

37 / 71


Unbalanced and incomplete data

In a planned study the times of measurements will usually be thesame for all subjects. We have a balanced design

In practice data is most often somewhat unbalanced due todrop-out, missed visits, failed measurements.

I In this case we say that data is incomplete.I But the design is still balanced.

Data from (retro-spective) observational studies are most oftenunbalanced both by design and in practice.

38 / 71


Outline

About the course








39 / 71


Analysis of response profiles

Comparison of change over n time points within g groups ofsubjects (e.g. different treatments).

I Similar to two-way ANOVA only with correlated data.I Covariates: group and timeI Balanced design, but possibly incomplete data.I An unstructured covariance is assumed.

Do the groups evolve differently with time?

40 / 71


Unstructured covariance

(aka unrestricted covariance)

Fully flexible because no assumptions are made about thecovariance as a function of time.

I One variance parameter for each time pointI One correlation parameter for each pair of time pointsI n + n(n−1)

2 parameters in total with n time points.

Becomes infeasible with many time points.

41 / 71


Hypothesis

Of primary interest:H0 : No time*group interaction, meaning that average

changes over time are similar (parallel) in all groups.

Of secondary interest (mostly in observational studies):H1: No group-effect, or exact same time-profile in all

groups.H1: No time-effect, or constant profiles with same

difference between groups at all times.Assuming there is no interaction.

42 / 71


Data example

Cadiac output from 19 patients with a cronic heart condition takenat initiation of treatment (baseline=0 months) and after 3 months,12 months, 24 months, 36 months, 48 months and 60 months.

I Do we see an improvement following treatment?I Does it last?I Two different therapies - any difference?

43 / 71


Individual response profiles

Note: After 3 months of similar initial treatment patients areassigned to either monotherapy or combination therapy.

44 / 71


Average response profilesAs individual response curves are roughly parallel (within treatmentgroups) we may look at the average against time plot:

45 / 71


Two-way ANOVA model for the means

Treatment = Mono Treatment = kombit=3 β1 β1 + β2t=12 β1 + β3 β1 + β2 + β3 + β8t=24 β1 + β4 β1 + β2 + β4 + β9t=36 β1 + β5 β1 + β2 + β5 + β10t=48 β1 + β6 β1 + β2 + β6 + β11t=60 β1 + β7 β1 + β2 + β7 + β12

I Monotherapy and 3 months is reference (intercept)I Time effect in Mono-groupI Difference between groups at 3 monthsI Interactions (i.e. difference in time effect)

46 / 71


Analysis: Tests

Overall tests of treatment effect:

But: this doesn’t tell us where the difference occurs!

47 / 71


Analysis: Parameter estimates

48 / 71


Analysis: Estimated covariance and correlation

Note: Largest variance at time point 2.49 / 71


Conclusion

I Significant differences between treatments.I Seemingly faster improvement after kombination therapy, but

equally good response at final evaluation.I Time trends are somewhat hard to interpret due to statical

uncertainty.

We have a complex covariance model and a small dataset:I Small datasets are good for illustrative purposesI Large datasets are good for statistical analyses.

50 / 71


Analysis of response profiles - drawbacks

I Can only handle balanced designs.I Is no good with many groups or time points, due to too many

model parameters.I Do not make use of a priori known data patterns, e.g.

correlation decreasing with time. Steadily increasing responseto treatment.

Next lecture: Parametric models for the mean and the covariance.

51 / 71


Outline

About the course








52 / 71


Preparing data for analysis

To fit the longitudinal (or other mixed) model with any statisticalsoftware data must bein the so-called long format:

I Each dataline contains only one observation of the outcome.I A time-variable identifies the time of measurement.I The id identifies measurements from the same person.

Contrary to the wide format where the dataset contains oneoutcome variable (column) for each occation:

id treatment co1 co3 co12 co24 co36 co48 co60A monoterapi 2.97 3.70 2.70 "." 4.36 4.06 3.70B kombination 1.14 3.30 2.23 2.87 "." 2.84 2.84

...

Exercises: transform data from the wide to the long format.53 / 71


Long formatid time co treatmentA 0 2.97 monoterapiA 3 3.70 monoterapiA 12 2.70 monoterapiA 24 "." monoterapiA 36 4.36 monoterapiA 48 4.06 monoterapiA 60 3.70 monoterapiB 0 1.14 kombinationB 3 3.30 kombinationB 12 2.23 kombinationB 24 2.87 kombinationB 36 "." kombinationB 48 2.84 kombinationB 60 2.84 kombinationC 0 1.70 kombinationC 3 1.80 kombination...54 / 71


Syntax: Analysis of response profiles

PROC MIXED DATA=cardiac;CLASS id time treatment;MODEL logco = time treatment time*treatment

/ SOLUTION CL DDFM=SATTERTHWAITE;REPEATED time / TYPE=UN SUBJECT=id R RCORR;

RUN;

I Syntax is similar to PROC GLM with a MODEL-statementspecifying the (linear) relationship between outcome andcovariates.

I Cathegorical variable must be declared with CLASS.I The model for the covariance (UN=ustructured) is specified

in a separate REPEATED-statement.I . . .55 / 71


The option DDFM=SATTERTHWAITE

(or DDFM=KENWARDROGERS).

A technical option intended to improve the statistical performanceof the F-tests.

I It has no effect on balanced data.I In unbalanced situations (i.e for almost all observational

designs and in case of missing observations) degrees offreedom are computed by a more complicated formulae.

I The computations may require a little more time,but in most cases this will not be noticable.

When in doubt, use it!56 / 71


SAS: proc mixed output

57 / 71



Always check that the numerical optimisation has converged.58 / 71



Options R and RCORR asks that estimated covariance andcorrelation matrices be printed. Note: R as in residual.59 / 71



To be used for comparison of different models (next lecture)60 / 71


Likelihood functions in proc mixed

In proc mixed the method-option is used to choose which versionof the likelihood is used in the model fitting.

I The default is the REML-likelihoodI The conventional ML-likelihood, can be obtained with

proc mixed method=ml data=... ;

Important: When testing nested submodels using the likelihoodratio test Only the conventional likelihood is valid!

61 / 71



At last: Parameter estimates and tests . . .62 / 71



Note: The adjusted F-tests (the SATTERTHWAITE-option) arepreferable to unadjusted F-tests, Wald tests, and to forming anoverall conclusion form the many t-tests for individual parameters.

63 / 71


Outline

About the course








64 / 71


Traditional presentation of dataExample: Aspirin absorption for healthy and ill subjects

(Matthews et.al.,1990)65 / 71


Problems with the traditional presentation of data

Comparison of groups for each time point separatelyI is inefficientI have a high risk of leading to chance significanceI may be inconclusive / difficult to interpret.

(Note that the tests are not independent sincethey are carried out on the same subjects!)

Assessment of changes over timeI cannot be evaluated ’by eye’

because we cannot see the pairing!

66 / 71


Problems with average curves

Do not average over individual profiles, unless they have parallelshapes

I i.e. only shifts in level are seen between individualsI (and some random variation)

Always make a picture of individual time profiles.

Average curvesI may hide important structures.I give no indication of the variation in the time profiles.I may be biased in case of non-random drop-out / missing data.

67 / 71


When curves are not parallel

Transformation might help.I E.g. logatrithm if overall change tends to increase with

increasing level.

Could be caused by a time-varying covariate:I Ideally model checking should be performed on the residuals.

Alternatives to the linear model:I Non-linear model (advanced).I Analysis of summary statistics (simple).

68 / 71


Individual time profiles

Spaghettiplot - divided into groups

Do we see parallel timeprofiles?

Are the averagesrepresentative?

Do we see an otherwisecharacteristic shape?

69 / 71


Analysis of summary statistics

Commonly used characteristics:I The response for a selected occation, e.g. endpointI AverageI Slope (perhaps for a specific period)I Peak valueI Time to peakI The area under the curve (AUC).I A measure of cyclic behaviour.

These are analysed as new observations.I One characteristic per person implies that these observations

are independent.

70 / 71


Example: Aspirin absorptionCharacteristics: time to peak and peak value.

Conclusion: P=0.02 for peak values. Remember quantifications!71 / 71

Documents

Introduction and basic conceptspublicifsv.sund.ku.dk/.../repeated14/intro2014-nup.pdf · 2014. 11. 27. · university of copenhagen department of biostatistics Plan Today: Basic concepts