ARMY RESEARCH LABORATORY
Proceedings of the First Annual U.S. Army Conference on Applied Statistics, 18-20 October 1995
Barry Bodt Proceedings Chairman
Hosted by: U.S. ARMY RESEARCH LABORATORY
Cosponsored by: U.S. ARMY RESEARCH LABORATORY
U.S. MILITARY ACADEMY U.S. ARMY RESEARCH OFFICE
WALTER REED ARMY INSTITUTE OF RESEARCH NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY
TRADOC ANALYSIS CENTER-WSMR
ARL-SR-43 August 1996
APPROVED FOR PUBLIC RELEASE; DISTRIBUTION IS UNLIMITED.
NONLINEAR MIXED EFFECTS METHODOLOGY FOR RHYTHMIC DATA
R. J. Weaver and M.N. Branden Pharmacia & Upjohn, Inc., Kalamazoo, Michigan 49001
We develop methodology for a mixed effects Cosinor model suitable for analyzing rhythmic data. None of the currently used procedures for nonlinear mixed effects models can be directly applied to the Cosinor model. Our approach combines ideas from the areas of time series analysis and mixed effects model methodology, and addresses the inherent limitations of the current procedures, as well as new problems encountered when combining the methodologies.
In many biological investigations data on an endpoint of interest is collected repeatedly over time for each of several individuals, which in turn may be part of a between individual experimental design. Biological time series of this type typically exhibit rhythmic behavior. As is common with biological data, there may be significant variation among individuals in these rhythm characteristics.
This experimental setup suggests using a random or mixed effects model, where a common functional form is assumed for each individual, but some or all of the parameters are considered to vary among the individuals. It is then of interest to estimate the group parameters (fixed effects) and their covariance matrix, and perhaps make comparisons between them. When the period of the rhythm is unknown, most commonly used models are nonlinear in their parameters.
Various methods are proposed in the literature for nonlinear mixed effects models, but there are several unique aspects to this particular problem that preclude using these methods directly. Ordinary nonlinear least squares estimation is difficult to use for these models due to multiple local minima. As an alternative, periodogram based estimators can be used.
Again due to the nature of the model, nonlinear mixed effects methodology using the usual Taylor's series approximation to the expected response does not work well. Problems also occur when the data are pooled together to estimate the fixed effects, as in Vonesh and Carter's EGLS methodology. These problems will be examined in detail, and a new two-stage methodology is proposed. This methodology is shown to perform well not only in rhythmic data models, but also in general nonlinear mixed effects models from pharmacokinetics and growth curve problems.
THE MIXED EFFECTS COSINOR MODEL
THE GENERAL MIXED EFFECTS MODEL
We will consider a general form of the mixed effects model, similar to that described by Lindstrom and Bates 1.
Assumption A1. A similar functional form is assumed for each of the N individuals, that is

$$y_i = f(\phi_i, X_i) + e_i, \quad i = 1, 2, \ldots, N,$$

where $y_i$ is an $n_i \times 1$ vector of observations, $\phi_i$ is a $p \times 1$ vector of (unknown) parameters for individual $i$, $X_i$ is the $n_i \times p$ within individual design matrix, $f(\phi_i, X_i)$ is the $n_i \times 1$ expected value of the response at $X_i$, and $e_i$ is random error, assumed to be $N(0, \Sigma_i)$.

Assumption A2. Each individual's true parameter vector can be expressed as

$$\phi_i = A_i \alpha + B_i b_i,$$

with $A_i$ a $p \times q$ between individual design matrix, $\alpha$ a $q \times 1$ vector of fixed effects, $B_i$ a $p \times r$ design matrix indicating the random parameters, and $b_i$ an $r \times 1$ vector of random effects, for which we will assume $b_i \sim N(0, T)$.
This model setup includes growth models, random coefficient regression models, population pharmacokinetic models and repeated measures models as special cases. The fixed effects $\alpha$ can be interpreted as the group mean parameters, and in the case of a single group are sometimes referred to as the population parameters. The $b_i$ are interpreted as the deviation of the $i$th individual's parameters from the group or population mean parameter vector. Our main interest is in estimating $\alpha$ and the unique elements of the covariance matrix $T$. Depending on the experimental situation, it may also be of interest to estimate the individual parameter vectors $\phi_i$ and/or $\sigma^2$.
Methodologies for the case when the within individual model is linear in its parameters have been developed by, among others, Laird and Ware 2, Jennrich and Schluchter 3, and Vonesh and Carter 4. For the nonlinear case, Steimer 5, Racine-Poon 6, Sheiner and Beal 7, Lindstrom and Bates 8, and Vonesh and Carter 8 are useful references.
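As a rough illustration of Assumptions A1 and A2, the sketch below simulates data for a single group in which every parameter is random. It is not the authors' code. It assumes the Cosinor form used later in the paper for $f$, takes $A_i = B_i = I$, and simplifies the error covariance to $\sigma^2 I$. The function names (`cosinor`, `simulate`), the seed, the sample sizes and the numerical values are all illustrative choices.

```python
import numpy as np

def cosinor(phi, t):
    """Expected response for phi = (alpha, beta, omega) at times t."""
    alpha, beta, omega = phi
    return alpha * np.cos(omega * t) + beta * np.sin(omega * t)

def simulate(alpha, T, sigma, n_individuals, t, rng):
    """Draw phi_i = A alpha + B b_i and y_i = f(phi_i, t) + e_i."""
    p = len(alpha)
    A = np.eye(p)  # assumption: single group, so A_i = I
    B = np.eye(p)  # assumption: all parameters random, so B_i = I
    # Random effects b_i ~ N(0, T), one row per individual.
    b = rng.multivariate_normal(np.zeros(p), T, size=n_individuals)
    phi = A @ alpha + b @ B.T
    # Assumption: independent errors with common variance sigma^2.
    e = rng.normal(0.0, sigma, size=(n_individuals, len(t)))
    y = np.stack([cosinor(row, t) for row in phi]) + e
    return phi, y

rng = np.random.default_rng(1995)
alpha = np.array([25.0, 25.0, 2.0 * np.pi / 24.0])  # illustrative fixed effects
T = np.array([[1.0, 0.1, 0.01],
              [0.1, 1.0, 0.01],
              [0.01, 0.01, 0.001]])                 # illustrative covariance
t = np.arange(72.0)                                  # three 24-unit periods
phi, y = simulate(alpha, T, sigma=1.0, n_individuals=8, t=t, rng=rng)
```

Each row of `phi` is one individual's parameter vector and each row of `y` is the corresponding observed series, so the between individual variation enters only through `b`.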
THE WITHIN INDIVIDUAL MODEL - THE COSINOR MODEL
For the within individual portion of the analyses, we will use a model proposed for the analysis of biological rhythms by Halberg, Tong and Johnson 9, called the Cosinor model. Consider a time series $y_t$, $t = 1, 2, \ldots, n$, where

$$y_t = \alpha_0 \cos(\omega_0 t) + \beta_0 \sin(\omega_0 t) + e_t$$
has many local minima, maxima and inflection points. This problem has been discussed in some detail by Rice and Rosenblatt 15, who state that the local minima occur with a separation in frequency of about $n^{-1}$. The main implication is that convergence to the global minimum is very sensitive to the choice of starting values.
We will illustrate this difficulty using the example by Rice and Rosenblatt 15. The model considered is the Cosinor model with $\alpha_0 = 1.0$, $\beta_0 = 0.0$ (or alternatively, $A_0 = 1.0$, $\theta_0 = 0.0$), $\omega_0 = 0.5$ and $n = 100$. To examine the problem quantitatively, a single realization of this model with Gaussian noise of mean 0 and variance 1 was randomly generated. Using this data set, the parameters were estimated by nonlinear least squares. The calculations were made using the IMSL subroutine RNLESf, which utilizes a modified Levenberg-Marquardt algorithm. The default convergence parameters of IMSL were used. To examine how strongly a good least squares fit depends on the starting values, we fit the model to this set of data 100 times, each time with different starting values. In each replication, the starting value for $A_0$ was set to 1.0 and starting values for $\theta_0$ and $\omega_0$ were randomly generated from the Uniform distributions $(-\pi, \pi)$ and $(0.3, 0.7)$, respectively. This corresponds to about the level of precision in starting values that might be obtained by "eyeballing" the data. The objective function for this set of data and range of parameters is shown graphically in Figure 1. As expected, it is quite rough and displays numerous local minima, maxima and inflection points.
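The experiment just described can be approximated with the following sketch. It is not a reproduction of the original computation: SciPy's `least_squares` with `method="lm"` stands in for the IMSL routine, the amplitude-phase form $A\cos(\omega t + \theta)$ is an assumed sign convention, and the seed is arbitrary, so the counts it produces need not match those reported in the text. The "best" fit here is only the smallest sum of squares found among the 100 starts, which is taken as a proxy for the global minimum.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)  # arbitrary seed (assumption)
n = 100
t = np.arange(1, n + 1)
# One realization: alpha_0 = 1, beta_0 = 0, omega_0 = 0.5, unit-variance noise.
y = np.cos(0.5 * t) + rng.normal(0.0, 1.0, size=n)

def residuals(params, t, y):
    """Residuals of the amplitude-phase Cosinor form (assumed convention)."""
    A, theta, omega = params
    return y - A * np.cos(omega * t + theta)

def fit_from(start):
    """Levenberg-Marquardt fit from one starting vector (A, theta, omega)."""
    return least_squares(residuals, start, args=(t, y), method="lm")

# 100 starts: A fixed at 1, theta ~ U(-pi, pi), omega ~ U(0.3, 0.7).
starts = np.column_stack([
    np.ones(100),
    rng.uniform(-np.pi, np.pi, 100),
    rng.uniform(0.3, 0.7, 100),
])
fits = [fit_from(s) for s in starts]

# least_squares reports cost = 0.5 * sum of squared residuals.
sse = np.array([2.0 * f.cost for f in fits])
best = sse.min()
n_best = int(np.sum(sse <= best * (1.0 + 1e-6)))
print(f"{n_best} of {len(fits)} starts reached the best SSE found ({best:.3f})")
```

Plotting `sse` against the starting value of `omega` (column 2 of `starts`) is one way to see how narrow the region of useful starting frequencies is.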
With these randomly generated starting values, the procedure converged to or stopped near the global minimum only 15 times out of the 100 replications. One of the more common problems was the algorithm becoming stuck in extremely "flat" regions of the objective function and failing to meet the convergence criteria. The choice of starting value for $\omega_0$ is especially critical. When the starting value was more than about 0.05 away from the true value, the algorithm would always converge to a local extremum rather than the global minimum. This is not surprising based on the shape of the objective function's surface. The starting values must fall within or near the long, narrow depression centered on
not seem reasonable for periodic functions such as the cosine. To illustrate how it can be inadequate, we will consider the basic model with all parameters considered random, i.e. when $A_i = B_i = I$. The model, suppressing the subscript $i$, can then be written as

$$y_t = (\alpha_0 + b_1)\cos[(\omega_0 + b_3)t] + (\beta_0 + b_2)\sin[(\omega_0 + b_3)t] + e_t$$

Taking the derivative of the expectation with respect to the parameters $(\alpha_0, \beta_0, \omega_0)$ and evaluating at $b = (0, 0, 0)'$ we get

$$y_t \approx (\alpha_0 + b_1)\cos(\omega_0 t) + (\beta_0 + b_2)\sin(\omega_0 t) + b_3 t\,[\beta_0 \cos(\omega_0 t) - \alpha_0 \sin(\omega_0 t)] + e_t$$
The third term of this approximation involves the value $t$, and for a nonzero realization of $b_3$, it will increase in magnitude as $t$ increases. This results in the approximation worsening as the length of the data series gets longer, which is a very undesirable property. This is shown graphically in Figure 2. The model shown has $\alpha_0 = \beta_0 = 25$ and $\omega_0 = 2\pi/24$, with the vector $b$ randomly generated from a Normal distribution with mean zero and covariance

$$\begin{pmatrix} 1 & 0.1 & 0.01 \\ 0.1 & 1 & 0.01 \\ 0.01 & 0.01 & 0.001 \end{pmatrix}$$
Figure 2. Taylor's Series Approximation (legend: Actual, Approximation; horizontal axis: Time)
The actual model was then calculated for three periods, and graphed with the Taylor's series approximation superimposed on it. The approximation diverges quickly from the true model, even for this relatively short series.
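The divergence can also be checked numerically. The sketch below compares the noise-free random-parameter Cosinor model with its first-order expansion about $b = 0$ over three 24-unit periods. The particular realization of $b$ is a hypothetical choice, roughly one standard deviation in each component under the covariance given above, and is not the realization used for Figure 2.

```python
import numpy as np

alpha0, beta0, omega0 = 25.0, 25.0, 2.0 * np.pi / 24.0
b1, b2, b3 = 1.0, -1.0, 0.03  # hypothetical realization of b (assumption)
t = np.arange(0, 72)          # three periods of length 24

def exact(t):
    """Noise-free model with the random effects inside the cosine and sine."""
    w = omega0 + b3
    return (alpha0 + b1) * np.cos(w * t) + (beta0 + b2) * np.sin(w * t)

def taylor(t):
    """First-order expansion about b = 0; the last term grows linearly in t."""
    c, s = np.cos(omega0 * t), np.sin(omega0 * t)
    return (alpha0 + b1) * c + (beta0 + b2) * s + b3 * t * (beta0 * c - alpha0 * s)

err = np.abs(exact(t) - taylor(t))
# Largest absolute error within each of the three periods.
per_period_max = [float(err[k * 24:(k + 1) * 24].max()) for k in range(3)]
print(per_period_max)
```

Because the exact model is bounded while the expansion contains a term proportional to $b_3 t$, the worst-case error in the third period should be well above that in the first for any nonzero $b_3$ of this size.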
NAIVE POOLED DATA APPROACH
The Naive Pooled Data approach has been used for estimation of population parameters, and is also the first step of the Vonesh and Carter noniterative algorithm for nonlinear mixed effects models. This procedure pools the data from all individuals and estimates the population parameters. When the underlying model is the Cosinor, the Naive Pooled Data approach can result in problems if the individuals do not all share the same true phase. This well-known problem of phase differences in biological time series has been discussed by Sollberger 16. Simply put