
Pergamon

J. Franklin Inst. Vol. 335B, No. 2, pp. 259-279, 1998
© 1997 The Franklin Institute
PII: S0016-0032(97)00004-5
Published by Elsevier Science Ltd. Printed in Great Britain
0016-0032/98 $19.00+0.00

Design of Experiments

by JOHN A. JACQUEZ*

Departments of Biostatistics and Physiology, The University of Michigan, Ann Arbor, MI, U.S.A.

(Received in final form 10 July 1996; accepted 3 January 1997)

ABSTRACT: In this article the author reviews and presents the theory involved in the design of experiments for parameter estimation, a process of importance for pharmacokinetics, the study of intermediary metabolism and other sections of biological research. The design of experiments involves both formal and informal aspects. The formal theories of identifiability of parameters and model distinguishability, and the generation of optimal sampling schedules, play major roles. After presenting these theories, both the formal and informal aspects are studied in the examination of a simple but realistic example. © 1997 The Franklin Institute. Published by Elsevier Science Ltd

Introduction

The term ‘design of experiments’ covers a large number of activities. Some are strongly dependent on experience in a particular field and are so informal as to be labeled ‘intuitive’. Others depend on formal mathematical developments such as in optimization theory. Furthermore, the term has distinctly different meanings in statistics and in systems theory.

The goal of this paper is first, to provide an overview that distinguishes between the different uses of the term ‘design of experiments’, and then to concentrate on the design of experiments in the context of systems theory, where the theories of identifiability, optimal sampling theory and model distinguishability play major roles. In the main, the paper is concerned with the design of input-output experiments in which the number of samples, though possibly large, is more usually limited.

The design of experiments - overview

In a general sense, the design of experiments involves all stages in the choice of which

experiments to use to test hypotheses. That includes the choice of experimental subjects, operations to be carried out and measurements to be made, as well as the choice of measuring instruments. All of that is strongly dependent on the state of knowledge in the field and on the technology available to make measurements. Many experiments

*To whom all correspondence should be addressed at: 490 Huntington Drive, Ann Arbor, MI 48104, U.S.A. Tel: (313) 663-4783.


are now feasible that could not be carried out even 10 years ago. As done in the day-to-day operation of a laboratory, much of this is intuitive; it comes out of laboratory meetings in which progress and problems in a project are discussed, usually at least weekly.

It is important to realize that the major early activities in the development of any field consist of gathering data and ordering and classifying it in an attempt to define the operational entities which can serve as conceptual units in forming a meaningful picture of the structure of that part of the universe. It is only when that activity is well underway, so that some knowledge and hypotheses about the structure of the field have accumulated, that one can really design experiments.

The design of experiments is the operational side of the scientific method which is based on a few principles. One principle, often labeled Popperian (1), is that one cannot prove hypotheses, one can only disprove hypotheses. However, that idea goes further back and was well stated by Fisher (2) in his book The Design of Experiments. Thus the recommended strategy for scientific investigation is to design experiments with a view to disproving a hypothesis at hand. Less well known is the method of Strong Inference (3) due to Platt. If one can formulate a complete set of alternative hypotheses about an issue, then one can design experiments to systematically disprove one after another of the hypotheses until only one non-falsifiable hypothesis remains. That one must be true.

The role of models and modeling

Models
First, let us distinguish between two different types of models because what a physiologist or biochemist calls a model is quite different from what a statistician calls a model. The physiological literature now distinguishes between two extremes in a spectrum of models. One extreme has been called ‘models of systems’, the other ‘models of data’ (4). A model of a system is one in which we know the equations that describe the action of the basic physical laws that drive the system. For that reason, I prefer the term, models of process, and will use that term. In contrast, a model of data is one that fits the data without reference to the basic processes that generate the data. However, it is possible to have a model of data that is generated by a model of process. Also, it is possible to have a model of a system which is a model of process for part of the system and a model of data for other parts of the system.

Exploring with models
For most problems there is some information on the structure and function of the system as well as some data in the literature. Given that, and the availability of simulation software, one can generate models that incorporate the available information and run them to see how they respond to different experiments designed to test hypotheses that are current in the field. Simulation allows one to explore how good candidate models are in explaining known data and how well they perform in testing hypotheses. That sort of exploratory modeling can play an important role in the planning process.

I believe that efficiency in the experimental sciences involves judiciously combining modeling and actual experimentation. Laboratory experiments are expensive in time and resources. Although one may have to run a few experiments to obtain preliminary estimates of parameters and to check techniques, it is generally inefficient to rush to carry out many experiments. It is far more efficient to formally list hypotheses and the


models of possible experiments and then to run the experiments on the models of the system. Then one can check identifiability, model distinguishability and generate optimal sampling designs so as to do fewer but more efficient experiments.

Design of experiments - pass one

First we have to distinguish between the statistical design of experiments and design of experiments in systems work.

The field of statistics has developed a body of theory, often called design of experiments, which is quite different from the subject matter of this paper. It is concerned with models of data from experiments in which one compares the effects of two or more treatments. The emphasis is on formal methods of randomization in allocating treatments so as to optimize between-treatment comparisons. It arose first in the analysis of experiments in agriculture. Major issues are to eliminate bias by proper randomization, choice of proper controls, stratification, balance in treatment comparisons and the generation of designs such as factorial designs. Statistical design of experiments is the basis for extensive developments on the analysis of variance. It is a field that has been well worked and is treated in many standard texts in statistics, as well as some specialist texts (2, 5). It owes much to the early work of Fisher (2).

This paper is not concerned with the statistical design of experiments but with design for parameter estimation of models of process. The next few sections are devoted to the presentation of the theory of a number of techniques that are components of the design of experiments. After an introduction to basic systems theory, identifiability and methods of checking identifiability, estimability and optimal sampling schedules and model distinguishability are presented and then these are used in an example as an illustration.

Basic systems theory

In the biological sciences we try to obtain models of process of the system of interest in order to analyze the outcomes of experiments. To do that, we can think in terms of two stages in the modeling process.

1. A model of the system represents the current hypotheses of the structure, rate laws and values of the parameters of the system.

2. A model of the experiment is a model of the experiment done on the model of the system.

It is important to recognize that: we do experiments on systems in the real world and then interpret the data by analysis of models of the experiments done on models of the systems. It should be noted that in engineering, the terminology is different; there, what we call the model of an experiment is called the system.

We are concerned with models of experiments. Let x be the vector of state variables of the model. The inputs in the experiment are often described as the product of a matrix B and a vector of possible inputs, u. The inputs are linear combinations of the components of the vector u, i.e. Bu. For given initial conditions and input to the model, the time course of change in the vector of state variables is usually given by a set of


differential equations:

ẋ = F(x, θ, Bu, t); x(0) = x₀, (1)

where θ is a vector of the basic kinetic parameters and x₀ gives the initial conditions. To fully specify an experiment, we also have to give the observations. The observations are usually specified by giving the vector function of the state variables that describes what is measured

y = G(x, x₀, Bu, t) = G(t, φ) = G(t, θ). (2)

We call y the observation function or the response function. Notice that G is written in two ways. G(t, θ) is the observation function as a function of time and the basic parameters, θ. G(t, φ) is the observation function as a function of time and parameters φ called the observational parameters. The observational parameters are functions of the basic parameters that are uniquely determined by the observation function. In engineering it is common to refer to input-output experiments, the response function (observation function) being the output, in the information sense.

The actual observations, z_j at time t_j, are samples of the response function at different times with added experimental errors of measurement.

z_j = y(t_j, φ) + ε_j, j = 1, …, n, (3)

where ε_j is the vector of measurement errors at sample time t_j. If the model is a compartment model (6, 7) with constant transfer coefficients, the

equation corresponding to Eq (1) is:

q̇ = Kq + Bu; q(0) = q₀. (4)

In Eq (4), q is the vector of compartment sizes and K is the matrix of transfer coefficients. The components of K are the basic kinetic or structural parameters of the model. If the observations are linear combinations of the compartments, the observation function is given by Eq (5) in which C is the observation matrix:

y = Cq. (5)

Basic parameters could also be introduced by the experimental design by way of the initial conditions, q₀, the inputs Bu and the observation matrix, C.

Compartmental models shall be used frequently in this review and in the development of the theory that follows.
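To make Eqs (4)-(5) concrete, here is a minimal simulation sketch in Python, assuming a hypothetical two-compartment model with illustrative rate constants; the unit impulsive input is realized as an initial condition and the observation is y = Cq.

```python
# Minimal sketch of Eqs (4)-(5): q' = Kq + Bu with an impulsive input
# (realized as the initial condition) and a linear observation y = Cq.
# All rate constants and times are illustrative, not from the paper.
import numpy as np
from scipy.integrate import solve_ivp

k01, k12, k21 = 0.5, 0.3, 0.2            # hypothetical transfer coefficients
K = np.array([[-(k01 + k21), k12],
              [k21, -k12]])              # matrix of transfer coefficients
C = np.array([[1.0, 0.0]])               # observe compartment 1 only

q0 = [1.0, 0.0]                          # unit impulse into compartment 1

sol = solve_ivp(lambda t, q: K @ q, (0.0, 20.0), q0, dense_output=True)
t = np.linspace(0.0, 20.0, 9)
y = (C @ sol.sol(t)).ravel()             # observation function sampled at t
print(np.round(y, 4))
```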

Identifiability

An introductory example
A simple example from enzyme kinetics illustrates the basic features of the identifiability problem (6, 8). Consider a one-substrate, one-product enzyme reaction, as shown in Fig. 1.

S + E ⇌ ES ⇌ E + P (forward rate constants k₁ and k₃; reverse rate constants k₂ and k₄)

Fig. 1. One-substrate, one-product enzyme reaction.


Suppose one measures the initial velocity of the formation of product, P, at a series of substrate concentrations, S, or the initial rate of formation of substrate at a series of concentrations of product. If the rate of formation of the intermediate complex, ES, is rapid in relation to the rate of formation of product and substrate, it is well known that the initial velocities in the forwards and backwards directions show saturation kinetics in the substrate and product concentrations respectively. Equation (6) gives the Michaelis-Menten equations for the forward and backward initial velocities

v_f = V_mf[S]/(K_mf + [S]),  v_b = V_mb[P]/(K_mb + [P]). (6)

The parameters V_mf, K_mf and V_mb, K_mb are functions of the basic kinetic parameters, k₁, k₂, k₃, k₄ and of the total enzyme concentration, E_t, as given by:

φ₁ = V_mf = k₃E_t,  φ₃ = V_mb = k₂E_t, (7)
φ₂ = K_mf = (k₂ + k₃)/k₁,  φ₄ = K_mb = (k₂ + k₃)/k₄.

From the forwards velocity experiment, one can estimate φ₁ = V_mf and φ₂ = K_mf; from the backwards velocity experiment one can estimate φ₃ = V_mb and φ₄ = K_mb.

Suppose we do only the forwards velocity experiment and estimate φ₁ and φ₂. If we know E_t, k₃ is uniquely determined by the equation for φ₁, so k₃ is identifiable. However, no matter how accurately we determine φ₁ and φ₂, k₁ and k₂ cannot be determined and k₄ does not even appear in the equations for φ₁ and φ₂. On the other hand, if we do only the backwards velocity experiment, we estimate φ₃ and φ₄. Then if we know E_t, k₂ is uniquely determined by the equation for φ₃, so k₂ is identifiable. However, no matter how accurately we determine φ₃ and φ₄, k₃ and k₄ cannot be determined and k₁ does not even appear in the equations for φ₃ and φ₄.

It is clear that only if one knows E_t, and one does both of the above experiments, can one obtain estimates of all four basic kinetic parameters.
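The uniqueness argument can be checked by computer algebra; the following sketch (using sympy, with symbol names of my choosing) solves Eq (7) for the rate constants, first with both experiments together and then with the forwards experiment alone.

```python
# Sketch of the identifiability argument above: given the observational
# parameters of Eq (7) and a known total enzyme concentration Et, solve
# for the basic rate constants k1..k4.
import sympy as sp

k1, k2, k3, k4, Et = sp.symbols('k1 k2 k3 k4 Et', positive=True)
p1, p2, p3, p4 = sp.symbols('phi1 phi2 phi3 phi4', positive=True)

eqs = [sp.Eq(p1, k3*Et),                 # phi1 = Vmf
       sp.Eq(p2, (k2 + k3)/k1),          # phi2 = Kmf
       sp.Eq(p3, k2*Et),                 # phi3 = Vmb
       sp.Eq(p4, (k2 + k3)/k4)]          # phi4 = Kmb

# Both experiments together: a unique solution for all four constants.
print(sp.solve(eqs, [k1, k2, k3, k4], dict=True))

# Forwards experiment alone (phi1, phi2 only): k3 is determined, k1 and k2
# appear only through the combination (k2 + k3)/k1, and k4 is insensible.
print(sp.solve(eqs[:2], [k1, k2, k3, k4], dict=True))
```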

Classification of parameters
It is important to distinguish between the basic parameters and the parameters that are determinable by an experiment. The latter are called observational parameters and are denoted by the symbol φ_i, i = 1, …. As can be seen from the example, the observational parameters are functions of the basic kinetic parameters. If the observational parameters are not functions of a particular basic parameter, that parameter can be changed without affecting the observations. Such a parameter is insensible in the experiment and hence is called an insensible parameter. If a basic parameter does influence the observations in an experiment, it is sensible by that experiment. However, a sensible parameter may or may not be uniquely determined (identifiable) by the experiment. In the above example, there were sensible parameters that were identifiable and others that were not.

Basic parameters may also be introduced by the experimental design. Thus the basic parameters, i.e. the θ, can be basic kinetic parameters of the system model or parameters introduced by the experimental design.

In summary, the parameters can be classified as follows:


1. Observational parameters: the observational parameters are determined by the experimental design and are functions of a basic parameter set.

2. Basic parameters: the basic parameters are the system invariants (kinetic parameters of the system) plus possibly some parameters introduced by the experimental design. For a given experiment, they may be:
1. insensible, i.e. do not influence the observations;
2. sensible, i.e. influence the observations in the experiment. In that case, they may be:
a. identifiable,
b. nonidentifiable.

Identifiability (a priori identifiability): definitions
The identifiability we have discussed so far is concerned with the question of uniqueness of solutions for the basic parameters from the observation function of a given experiment. In some of the literature that is also called a priori identifiability to distinguish it from a posteriori identifiability. Here the term identifiability is used for a priori identifiability; a posteriori identifiability is included under the term estimability which is defined later. The various types of (a priori) identifiability that have been defined in the literature are defined below in (a)-(f).

(a) Local identifiability. If the observation function for an experiment determines a finite number of values for a parameter, the parameter is locally identifiable. Additional information may be needed to decide which one of the values is appropriate for the physiological system you are working on. Local identifiability includes cases of symmetry in models in which two or more parameters play equivalent roles, so their values can be interchanged.

(b) Global identifiability. If the observation function determines exactly one solution value for a parameter in the entire parameter space, that parameter is globally identifiable for that experiment. Thus, global identifiability is a subcategory of local identifiability. The term unique identifiability is equivalent to global identifiability.

(c) Structural identifiability. A property of a parameter is structural if it holds almost everywhere in parameter space. The qualification, ‘almost everywhere’, means that the property might not hold on a special subset of measure zero. Thus a parameter could be globally identifiable almost everywhere but only locally identifiable for a few special values. The qualifier ‘structural’ applied to a property means the property is generic, i.e. it does not depend on the values of the parameters, in the almost everywhere sense (9).

(d) Model identifiability. If for an experiment, all of the parameters of a model are globally identifiable, the experiment model is globally identifiable. If all of the parameters are identifiable but at least one is not globally identifiable, the model is only locally identifiable.

(e) Conditional identifiability. If for an experiment model, a parameter is not identifiable but setting the values of one or more other parameters makes it identifiable, the parameter is identifiable conditioned on the parameters that are preset. By ‘setting a parameter’ we mean that we assign a value to it and then treat it as known, i.e. remove it from the parameter set.


(f) Interval identifiability and quasi-identifiability. The values of nonidentifiable parameters often are constrained to fall in intervals. DiStefano (10) used the term interval identifiability to describe the restriction of a parameter to a subspace by the constraints of a problem. If the interval is small enough so that a parameter is ‘identifiable for practical purposes’, DiStefano calls that quasi-identifiability.

Notice that (a priori) identifiability is concerned only with whether or not the observation function, and therefore the observational parameters, uniquely define the basic parameters. It has nothing to do with actual samples or sampling errors. In contrast, a posteriori identifiability is concerned with the estimability of parameters for particular samples. For that reason I call it estimability and will deal with it in more detail under estimability and optimal sampling design.

Methods of checking identifiability

For fairly simple problems, one can often determine identifiability by inspection of the observation function, but as soon as the models become more complex that is no longer possible. A number of methods available for checking identifiability are summarized in this section. For more details see the books by Carson et al. (11), Godfrey (12), Jacquez (6) and Walter (9).

The methods differ for linear and nonlinear systems. Before considering that, let us make clear the distinction between linear and nonlinear systems and linear and nonlinear parameters. For a linear system, the rates of change of the state variables are given by linear differential equations. Such systems have the superposition or input linearity property, which means that the response to a sum of two inputs equals the sum of the responses to the individual inputs. In contrast, the rates of change of the state variables of nonlinear systems are given by nonlinear differential equations, and superposition does not hold. When applied to the parameters of a system, the terms linear and nonlinear have entirely different meanings; they then refer to the way the parameters appear in the solutions for the state variables or the observation functions. Suppose x is a state variable and the solution of the differential equation is of the form

x = A₁e^(λ₁t) + A₂e^(λ₂t). (9)

A₁ and A₂ appear linearly and are linear parameters whereas λ₁ and λ₂ are nonlinear parameters. Even for linear systems, many of the parameters appear nonlinearly in the solutions.

Methods for linear systems with constant coefficients
Topological properties
For compartmental systems, some simple topological properties of the connection diagram should be checked first. They provide necessary but not sufficient conditions for identifiability.

1. Input and output reachability. There must be a path from some experimental input to each of the compartments of the model and there must be a path from each compartment to some observation site.

2. Condition on number of parameters. The number of unknown parameters must not exceed a number which depends on the topology of the system; see Carson et al. (11).

For checking parameter identifiability, three methods have received most attention.

The Laplace transform or transfer function method
This method is simple in theory and is the most widely used, although it becomes quite cumbersome with large models. First, note that if a linear model is identifiable with some input in an experiment, it is identifiable from impulsive inputs into the same compartments. That allows one to use impulsive inputs in checking identifiability even if the actual input in the experiment is not an impulse. Take Laplace transforms of the system differential equations and solve the resulting algebraic equations for the transforms of the state variables. Then write the Laplace transform for the observation function. That will be of the form of a ratio of polynomials in the transform variable:

Y(s) = (φ₁s^m + φ₂s^(m−1) + … + φ_(m+1))/(s^n + φ_(m+2)s^(n−1) + … + φ_(m+n+1)). (10)

Here, s is the transform variable and the coefficients, φ_i, are the observational parameters which are functions of the basic parameters. That gives a set of nonlinear algebraic equations in the basic parameters. The hard part is to determine which of the basic parameters are uniquely determined by this set of nonlinear equations.
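As an illustration of the method (a sketch of mine, not the paper's code), sympy can carry out this algebra for a hypothetical two-compartment model with an impulse into compartment 1 and observation y₁ = q₁/V₁; the printed coefficient ratios are the observational parameters.

```python
import sympy as sp

s, k01, k12, k21, V1 = sp.symbols('s k01 k12 k21 V1', positive=True)

# system matrix of a hypothetical two-compartment model
K = sp.Matrix([[-(k01 + k21), k12],
               [k21, -k12]])

# an impulsive input into compartment 1 appears as the initial condition
# (1, 0): solve (sI - K) Q = q(0) for the transforms of the state variables
Q = (s*sp.eye(2) - K).LUsolve(sp.Matrix([1, 0]))
Y1 = sp.cancel(Q[0] / V1)          # transform of the observation y1 = q1/V1

num, den = sp.fraction(Y1)
lead = sp.Poly(den, s).LC()        # normalize the denominator to be monic
print([sp.simplify(c/lead) for c in sp.Poly(num, s).all_coeffs()])
print([sp.simplify(c/lead) for c in sp.Poly(den, s).all_coeffs()])
# numerator coefficients: 1/V1, k12/V1; denominator: 1, k01+k12+k21, k01*k12
```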

The similarity transformation method
Consider a compartmental system for which the coefficient matrix K has been subjected to a similarity transformation to give a system with a coefficient matrix P⁻¹KP, where P is nonsingular. Recall that under a similarity transformation, the eigenvalues do not change. Impose on P⁻¹KP all the structural constraints on K and require that the response function of the system with matrix P⁻¹KP be the same as that of the system with matrix K. If the only P that satisfies those requirements is the identity matrix, all parameters are globally identifiable. If a P ≠ I satisfies the requirements, one can work out which parameters are identifiable and which are not.

The modal matrix method
The matrix whose columns are the eigenvectors is called the modal matrix. In this approach, one looks at the response function to see if the eigenvalues and the components of the modal matrix are identifiable; both are, of course, functions of the basic parameters. This method is used less often than the previous two.

A program called PRIDE is now available which uses the transfer function approach plus topological properties to express the coefficients from the transfer function in terms of the cycles and paths connecting the inputs and outputs of an experiment and uses that to test whether the parameters are globally or locally identifiable (13).

Methods for nonlinear systems
Although there is a large literature on identifiability for linear systems with constant coefficients, less has been done on nonlinear systems. Whereas for linear systems one can substitute impulsive inputs for the experimental inputs for the analysis of identifiability, for nonlinear systems one must analyze the input-output experiment for the actual inputs used. That is a drawback. On the other hand, experience shows that frequently the introduction of nonlinearities makes a formerly nonidentifiable model identifiable for a given experiment. Two methods are available.


Taylor series
A method (14) used widely depends on expanding the observation function in a Taylor's series around t = 0+. The coefficients of the expansion are functions of the basic parameters and are the observational parameters. Although there may be an infinite number of coefficients, only a finite number are independent. As one adds coefficients from terms of higher and higher order, eventually one reaches coefficients that are no longer independent of the preceding ones.
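A toy illustration of the method (my addition), assuming a hypothetical one-compartment washout y(t) = e^(−kt)/V:

```python
import sympy as sp

t, k, V = sp.symbols('t k V', positive=True)
y = sp.exp(-k*t) / V          # hypothetical observation function

# the Taylor coefficients around t = 0+ are the observational parameters
print(sp.series(y, t, 0, 4))
# 1/V - k*t/V + k**2*t**2/(2*V) - k**3*t**3/(6*V) + O(t**4)
# the first two coefficients already determine V and k: both are
# globally identifiable from this experiment
```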

Similarity transformation
The method of similarity transformations has been extended to nonlinear systems (15) but so far there has been little experience with the method.

Local identifiability at a point
It is natural to develop the theory of identifiability in terms of two levels of parameters, the basic parameters, θ_i, and the observational parameters, φ_i, which are identifiable functions of the basic parameters. For problems of low dimensionality it is easy to generate the φ_i explicitly as functions of the θ_i and check identifiability on the functional relations, φ_i = f_i(θ₁, …, θ_p). For problems of even moderate magnitude the algebraic work involved in finding the φ_i and solving the equations may become limiting.

An important finding is that if one has initial estimates of the basic parameters one can determine local identifiability numerically at the initial estimates directly, without having to generate the observational parameters as explicit functions of the basic parameters. The method works for linear and nonlinear systems, compartmental or noncompartmental. Furthermore, for linear systems it gives structural local identifiability. We develop the basic theory for checking identifiability at a point. There are many similarities with the corresponding theory for the estimation of parameters. Since we shall need the latter shortly for the discussion of estimability and optimal sampling design, let us develop the two in parallel.

Let x be the vector of state variables. Recall that the model describing the dynamics of the experiment is

ẋ = F(x, θ, Bu, t); x(0) = x₀, (11)

where θ is a vector of basic parameters and x₀ gives the initial conditions. The observation functions (response functions) are:

y = G(x, x₀, Bu, t) = G(t, φ) = G(t, θ). (12)

The actual observations at point j are

z_ij = G_i(t_j, φ) + ε_ij = y_i(t_j) + ε_ij, (13)

where ε_ij is the error of measurement of G_i(t_j, φ); assume it is zero mean with variance σ²_ij. To keep things simple, we assume that there is only one observation function and develop the least squares theories, assuming we have fairly good initial estimates of the parameters which we shall, for the moment, treat as though they were the correct values for the parameters.

Let there be p basic parameters and assume we have estimates, θ₁⁰, …, θ_p⁰. For the parameters set at these estimates, calculate a set of values of the observation function


at n points in time for n > p. For small deviations in the parameters, linearize the observation function in the parameters θ_k around the known values

y_j = G_j⁰ + Σ_k (∂G_j⁰/∂θ_k) Δθ_k + e_j. (14)

The superscript 0 means the term is to be evaluated at the known values (the estimates) of the parameters. Notice that the e_j are not measurement errors, they are truncation errors in the expansion. Furthermore, e_j → 0 in order (Δθ)² as Δθ → 0; since the e_j play no role in the theory, we drop them.

The two sums of squares are:

S_id = Σ_j [y_j − G_j⁰ − Σ_k (∂G_j⁰/∂θ_k) Δθ_k]², (15a)

S_est = Σ_j (1/σ_j²)[z_j − G_j⁰ − Σ_k (∂G_j⁰/∂θ_k) Δθ_k]². (15b)

Notice that y_j − G_j⁰ ≈ 0, but z_j − G_j⁰ = ε_j. Next, find the least squares estimate of Δθ_k. Take derivatives of the sums of squares with respect to the Δθ_k to obtain the normal equations:

gᵀg Δθ = 0, (16a)

for identifiability, and

gᵀΣ⁻¹g Δθ = gᵀΣ⁻¹ε = gᵀΣ⁻¹(z − G⁰), (16b)

for estimation of parameters. Here g is the sensitivity matrix

g = [∂G_j⁰/∂θ_k],  j = 1, …, n,  k = 1, …, p, (17)

and Σ⁻¹ is the diagonal matrix of the inverses of the variances of the measurement errors:

Σ⁻¹ = diag(1/σ₁², 1/σ₂², …, 1/σ_n²). (18)

From this development, one can see that,


1. for an identifiability check we only need to calculate the values of y at a series of points at the initial estimates for the θ_i. We don't need to do the experiment. Indeed, we don't want to commit resources to an experiment if the parameters we want to estimate are not identifiable!

2. if det(gᵀg) = 0, the model is not locally identifiable - some parameters may be identifiable but not all. Then

(det(gᵀg) = 0) ⇒ (det(gᵀΣ⁻¹g) = 0), (19)

so obviously one cannot estimate parameters that are not identifiable.

This method has been programmed in a series of programs called IDENT (16).
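The following numerical sketch illustrates the idea behind such a check; it is not the IDENT implementation, and the two-compartment model and all values are hypothetical. It builds the sensitivity matrix g of Eq (17) by central finite differences at the initial estimates and inspects the rank of gᵀg.

```python
import numpy as np
from scipy.integrate import solve_ivp

def response(theta, times):
    """Observation y = q1/V1 for a hypothetical two-compartment model."""
    k01, k12, k21, V1 = theta
    K = np.array([[-(k01 + k21), k12], [k21, -k12]])
    sol = solve_ivp(lambda t, q: K @ q, (0.0, times[-1]), [1.0, 0.0],
                    t_eval=times, rtol=1e-8, atol=1e-10)
    return sol.y[0] / V1

theta0 = np.array([0.5, 0.3, 0.2, 3.0])   # initial parameter estimates
times = np.linspace(0.5, 12.0, 8)         # n > p sample times

g = np.empty((len(times), len(theta0)))   # sensitivity matrix, Eq (17)
for k in range(len(theta0)):
    h = 1e-6 * max(abs(theta0[k]), 1.0)
    tp, tm = theta0.copy(), theta0.copy()
    tp[k] += h
    tm[k] -= h
    g[:, k] = (response(tp, times) - response(tm, times)) / (2*h)

gtg = g.T @ g
print('rank of g^T g:', np.linalg.matrix_rank(gtg), 'of', len(theta0))
# full rank -> locally identifiable at theta0; rank deficiency -> not
```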

Estimability

In this paper, the term estimability is used as a general term to cover the various issues involved in evaluating the quality of estimates of parameters, beyond the question of (a priori) identifiability (16, 17). It includes a posteriori identifiability, estimation of the variances of the estimates and analysis of the impact of correlations between the estimates.

There are qualitative and quantitative aspects to estimability. From the qualitative viewpoint, it should be obvious that one must take samples at at least as many points as there are parameters; otherwise, one cannot obtain estimates even though the parameters are identifiable. That is usually referred to as a posteriori identifiability. From a quantitative viewpoint, we would like estimates with small variances and with no correlations between the estimates of the parameters. Ideally one would like to obtain a diagonal covariance matrix, diagonal so that the correlations between the estimates of the parameters are zero, with small diagonal entries. To do that, one can increase the sample size, but increasing the number of samples is not enough; optimal placement of the samples turns out to be more important. That is covered in the next section. Unfortunately, parameter estimates are almost always correlated, and correlations between estimates of the parameters degrade their value, even when the variances of the estimates are relatively small.

Optimal sampling schedules

Assume we have a specific model for an input-output experiment, i.e. we have a model of a system on which we do a specific input-output experiment. For that, we are given:

1. a prior estimate, 8*, of the p-vector of parameters 0; 2. N2p samples, i.e. measurements of the observation function, are to be taken in a

sampling interval [0,7j; 3. the variances for the measurement errors.

The problem is to pick the times of the N samples to optimize the estimation of θ*; optimize in this case means to minimize some measure of the variances and covariances of the estimates of θ*.


Optimal sampling schedules for nonlinear parameters depend on the values of the parameters. Thus we need to know the values of the parameters to obtain an optimal sampling schedule. Obviously, if we really knew the values of the parameters there would be no point to doing the experiment to estimate them! Hence the need for a prior estimate which can then be used in a sequential estimation scheme, i.e. use the prior estimate to obtain an optimal sampling schedule and then do the experiment to obtain a better estimate and repeat the process (18). Fortunately, optimal sampling schedules are rarely sharply optimal; a design with points not far from the optimal design is not far from optimal. In addition, it should be stressed that an optimal design holds for the given model of an input-output experiment. If the model is misspecified, it is likely that the design will not be optimal.

For introductory reviews see Refs (19, 20, 21). For more detailed presentations of the theory of optimal sampling design see Fedorov (22), Landaw (23) and Walter and Pronzato (24).

Terminology and some background
We start with definitions of important terms.

1. Sampling design. Any choice of points t₁, …, t_N in [0, T] is a sampling design.
2. Points of support. The points t_i, i ∈ {1, …, N}, are the points of support of the design.

It has been shown that for p parameters, one needs at most p(p + 1)/2 + 1 distinct points of support in a sampling design (22). However, experience with compartmental models shows that optimal designs usually require only p points of support (23). Thus, if one takes N = kp samples, where k is an integer, the optimal design places k samples at each of the points of support.

Finally, it is worth noting and emphasizing that the points of support in an experimental design are usually far from uniformly spaced; geometric spacings often come closer to optimal designs.

Theory of parameter estimation
The theory for parameter estimation has already been presented in Eqs (11)-(19). Let us assume the parameters are identifiable. Two complications have not been explicitly introduced in order to keep the derivations simple. However, they can be handled without any basic change in the theory, although with some increase in complexity of the presentation.

1. Parameters that have to be estimated may be introduced in the observations and not be present in the system model (see Jacquez and Perry (16)). Since the theory for estimation is for estimation of parameters that appear in the observations, they are included.

2. Prior information on the system may appear in additional equations of constraint. If the equations of constraint depend on state variables, with or without the explicit appearance of some parameters, they are part of the model equations and are solved with the dynamical equations. If the equations of constraint are in terms of parameters only, they are used to reduce the parameter set.

Equation (16b) is the equation for estimation of deviations around the initial estimate


of θ. If the determinant of (gᵀΣ⁻¹g) is zero, Eq (16b) has no solution. In fact, the model is unidentifiable if the determinant of (gᵀg) is zero.

Important points to note are as follows.

If the errors of measurement are not large one expects (z − G⁰) to be small and so Δθ will be small, provided that the determinant of (gᵀΣ⁻¹g) is not small. We want the determinant of (gᵀΣ⁻¹g) to be large, because then small changes in z have very small effects on the estimates of θ. If the determinant of gᵀΣ⁻¹g is nonzero, (gᵀΣ⁻¹g)⁻¹ is proportional to the covariance matrix of the estimates of θ, and to obtain good estimates of θ* we want the covariance matrix to be small. gᵀΣ⁻¹g is the Fisher information matrix, I, and increasing I decreases the covariance matrix.

Optimal sampling designs
It is obvious that the entries in (gᵀΣ⁻¹g) depend on the times at which samples are taken, i.e. on the sampling design. One can see intuitively why (gᵀΣ⁻¹g) is so important; it uses the sensitivity matrix weighted by the inverse error variances and clearly one wants samples placed so as to increase the sensitivity of the estimates to change in the parameters. The problem then is to choose some objective function of I = gᵀΣ⁻¹g and find the sample that optimizes this objective function. The criterion used widely is to maximize the determinant of I. The determinant of (gᵀΣ⁻¹g)⁻¹ is proportional to the volume of the ellipsoids of constant uncertainty, i.e. the confidence ellipsoids, around the minimum of S. Thus minimizing the determinant of (gᵀΣ⁻¹g)⁻¹, or maximizing the determinant of gᵀΣ⁻¹g, minimizes the volumes of these ellipsoids. Such designs are called D-optimal designs. However, other design criteria have also been used. The major ones are given below.

D-optimal designs
I⁻¹ is an estimate of the covariance matrix of the estimates of θ. The determinant of I⁻¹ is proportional to the volumes of the confidence ellipsoids around the minimum of the sum of squares in parameter space. So minimizing the determinant of I⁻¹ minimizes the volumes of the confidence ellipsoids. That is what a D-optimal design does. An important property of D-optimal designs is that they are independent of the units (scales) chosen for the parameters.

A-optimal designs
Minimize the sum of the diagonal elements (the trace) of I⁻¹; that is equivalent to minimizing the average variance of the estimates of the parameters. Unfortunately, A-optimal designs are not independent of the scales used for the parameters.

C-optimal designs
These minimize the trace of CI⁻¹, where C is a diagonal matrix whose entries are the inverse squares of the values of the parameters at the minimum in S. The result is to minimize the average squared coefficient of variation, i.e. Σ_i (σ_i/θ_i)². C-optimal designs are also independent of the scales of the parameters.

E-optimal designs
Minimize the maximum eigenvalue of I⁻¹. This is equivalent to minimizing the length of the longest principal axis of the confidence ellipsoid in parameter space.
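Continuing the finite-difference sketch given earlier under local identifiability (reusing its g, times and theta0, and assuming a hypothetical constant error variance), the four criteria above can be computed directly from I = gᵀΣ⁻¹g:

```python
import numpy as np

# reuses g, times and theta0 from the identifiability sketch above;
# the 2% error standard deviation is an assumption of this illustration
sigma2 = 0.02**2 * np.ones(len(times))      # measurement error variances
I_mat = g.T @ np.diag(1.0/sigma2) @ g       # Fisher information matrix
cov = np.linalg.inv(I_mat)                  # approximate covariance matrix

print('D-criterion (maximize):', np.linalg.det(I_mat))
print('A-criterion (minimize):', np.trace(cov))
print('C-criterion (minimize):', np.trace(np.diag(1.0/theta0**2) @ cov))
print('E-criterion (minimize):', np.linalg.eigvalsh(cov).max())
```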

D-optimal designs are the most widely used, so it is worth examining their properties


in more detail. The matrix of second derivatives of G with respect to the parameters is called the Hessian. One can show that the principal axes of the sum of squares ellipsoids are along the eigenvectors of the Hessian and are intersected by the sum of squares surface at distances proportional to √(1/γ_i), where γ_i is the ith eigenvalue of the Hessian (6). Thus it is possible for the confidence ellipsoids to be long and narrow in some directions; that implies high correlations between estimates of some of the basic parameters. For some purposes it might be better to give up some of the volume of the minimum ellipsoids in exchange for ellipsoids that are closer to spheres so as to decrease the correlations between basic parameters. E-optimal designs tend to do that. Another approach, the M-optimality of Nathanson and Saidel (25), minimizes the angle between the ellipsoid axes and the reference axes in the space of the basic parameters. That allows for differences between the lengths of the principal axes of the ellipsoids but by aligning the principal axes with the parameter axes it reduces the off-diagonal terms of the Hessian, i.e. the correlations between the estimates of the parameters.

Numerical methods
Although there is an extensive and often complicated body of theory on optimal sampling designs (22, 23), with the power of modern computers a systematic search of the interval [0, T] provides a simple and direct approach to generating optimal sampling designs. The method is simplest to describe for one observation function. On the interval [0, T], place a lattice of N + 1 > 3p points; there is no problem with using 100-300 or more points. For an initial choice of p points of support, spread the p points over the lattice and calculate the determinant of gᵀΣ⁻¹g; it is better to choose an initial partition that divides the interval [0, T] in a geometrically increasing spacing rather than to divide the interval into N equal subintervals. Starting with the first point of support do the following. Place the ith point of support on each lattice point between the (i−1)th and the (i+1)th points of support, successively, and calculate gᵀΣ⁻¹g for each of the designs so obtained. Keep the one with the largest determinant. Do this for all p points of support. After sweeping through all points of support on the lattice, repeat the process. That method converges fairly rapidly.

Two programs are available that calculate D-optimal sampling designs for multiple input and multiple output experiments, OSSMIMO (26) and OPTIM. The latter is an extension of the IDENT programs (16). Both programs use variants of the numerical method just described and both handle multiple inputs and multiple outputs. OSSMIMO is written in FORTRAN whereas OPTIM is written in C.

Model distinguishability

A theoretical problem related to identifiability is concerned with constructing all models that have different structures from a given or candidate model, but have the same input-output response over some class of admissible inputs for a given input-output experiment (27-30). The idea is to find all models that could not be distinguished by the experiment under consideration.

We do not pursue that here but look at a simpler problem that is much closer to actual practice in the design of experiments. In an area such as pharmacokinetics or physiological systems modeling, there are usually only a few competing models. These


are the models that are plausible in terms of what is known of the anatomical structure and of the basic biochemical and physiological mechanisms at work in the system. Often, there are only two main models. Thus an important issue for the experimenter is, given that there are two or three competing models and a few experiments that can be done, can one or more of the feasible experiments distinguish between the models, and if more than one, which is the best? The basic idea is to compare the input-output responses of the models for a particular experiment, and do that for all feasible experiments. Finally, pick the experiment that gives the greatest difference between input-output responses of the models. That is an iterative process whose basic unit of iteration is to compare input-output responses for a particular experiment for two models.

An example

To illustrate the many aspects of the design of experiments that have been covered, a simple but realistic example will now be given (31).

Many metabolites and drugs equilibrate slowly enough between blood and the interstitial fluid in the organs so that the amount in the blood plasma acts as one compartment and the amount in the interstitial spaces acts as another compartment. Suppose we have such a material but we do not know whether or not it enters cells and is metabolized there. There are two possible compartmental models for this system which are shown in Fig. 2(a) and (b).

In Fig. 2, node 1 is the plasma compartment and node 2 is the interstitial compartment. The transfer coefficients are constants; k₀₁ is for excretion by way of the urine but k₀₂ is the coefficient for uptake by cells and metabolic conversion to some other material. We want to choose the system model in Fig. 2(a) if k₀₂ is zero or if it is so small that the rate of removal by this pathway is not detectable within the errors of measurement. On the other hand, if entry into cells is significant, we want to choose the model in Fig. 2(b) as our system model.

The system models
For Fig. 2(a), the equations for the system model are

q̇₁ = −(k₀₁ + k₂₁)q₁ + k₁₂q₂, (20)

q̇₂ = k₂₁q₁ − k₁₂q₂, (21)

where q_i is the total amount in compartment i. For Fig. 2(b), the equations for the system model are

Fig. 2. Two possible compartmental models.


q̇₁ = −(k₀₁ + k₂₁)q₁ + k₁₂q₂, (22)

q̇₂ = k₂₁q₁ − (k₀₂ + k₁₂)q₂. (23)

If we had initial conditions on q₁ and q₂, we could solve these equations to obtain the time courses of q₁ and q₂.

The experiment models
Now we define an experiment, experiment 1, which consists of putting a unit impulse into compartment 1 at t = 0, i.e. an IV injection of a bolus at t = 0, and measuring the concentration in compartment 1. Figure 3 shows the experiment models.

Fig. 3. Two experiment models.

The heavy arrows going into compartments 1 represent the inputs and the heavy arrows coming out of compartments 1 represent the observations (outputs in engineering terminology).

The equations for the experiment models are Eqs (20)-(21) and Eqs (22)-(23) plus equations for the inputs and the observations. The unit impulsive inputs are given by the initial conditions they give, i.e.

q₁(0) = 1. (24)

The observation function is the concentration in compartment 1

y₁(t) = q₁/V₁, (25)

where V₁ is the volume of distribution in the plasma. Notice that here is an example of a basic parameter, V₁, that is introduced by the experimental design, i.e. the observation function. Thus the equations for the experiment model in Fig. 3(a), which is experiment 1 on the system model in Fig. 2(a), are Eqs (20)-(21) plus Eqs (24)-(25). The equations for the experiment model in Fig. 3(b), which is experiment 1 on the system model in Fig. 2(b), are Eqs (22)-(23) plus Eqs (24)-(25).

Models of process and models of data
The equation sets {Eqs (20), (21), (24), (25)} and {Eqs (22), (23), (24), (25)} are models of process because they describe the basic processes going on in experiment 1. If we solve the equations, we find that for both experiment models the observation function is of the same form


y₁(t) = q₁(t)/V₁ = A₁e^(λ₁t) + A₂e^(λ₂t). (26)

If we try to just fit the data with an equation of the form of Eq (26), we have a model of data. In this case, the model of data is derived from models of process. If we did not know the models of process and just looked at the data, over a limited range, it looks as if it could be fitted with a polynomial in t. That too would be a model of data but one not derived from a model of process.
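As an illustration (a sketch with simulated data, parameter values and noise level all of my choosing, not from the paper), one can fit the model of data of Eq (26) to noisy samples with scipy:

```python
import numpy as np
from scipy.optimize import curve_fit

def biexp(t, A1, l1, A2, l2):
    # model of data: Eq (26), a sum of two exponentials
    return A1*np.exp(l1*t) + A2*np.exp(l2*t)

rng = np.random.default_rng(0)
t = np.geomspace(0.1, 15.0, 16)                 # geometrically spaced samples
z = biexp(t, 0.25, -0.9, 0.08, -0.12)
z = z * (1 + 0.02*rng.standard_normal(t.size))  # ~2% multiplicative error

p_hat, p_cov = curve_fit(biexp, t, z, p0=[0.2, -1.0, 0.1, -0.1])
print(np.round(p_hat, 3))                       # estimates of A1, l1, A2, l2
```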

Identifiability
Let us use the Laplace transform method. First, for the experiment model in Fig. 3(a), take Laplace transforms of Eqs (20)-(21) and solve for Q₁, the Laplace transform of q₁. That gives

Q₁ = (s + k₁₂)/(s² + (k₀₁ + k₁₂ + k₂₁)s + k₀₁k₁₂). (27)

Thus the Laplace transform of the observation function is

Y₁ = (s/V₁ + k₁₂/V₁)/(s² + (k₀₁ + k₁₂ + k₂₁)s + k₀₁k₁₂). (28)

That gives four observational parameters:

φ₁ = 1/V₁,
φ₂ = k₁₂/V₁,
φ₃ = k₀₁ + k₁₂ + k₂₁,
φ₄ = k₀₁k₁₂. (29)

The set of Eq (29) has unique solutions for V₁, k₀₁, k₁₂ and k₂₁, so all of the basic parameters are globally (uniquely) identifiable.

If we do the same for the experiment model in Fig. 3(b)

Q₁ = (s + k₀₂ + k₁₂)/(s² + (k₀₁ + k₀₂ + k₁₂ + k₂₁)s + k₀₁k₀₂ + k₀₁k₁₂ + k₀₂k₂₁), (30)

so the Laplace transform of the observation function becomes

Y₁ = (s/V₁ + (k₀₂ + k₁₂)/V₁)/(s² + (k₀₁ + k₀₂ + k₁₂ + k₂₁)s + k₀₁k₀₂ + k₀₁k₁₂ + k₀₂k₂₁). (31)

That again gives four observational parameters

φ₁ = 1/V₁,
φ₂ = (k₀₂ + k₁₂)/V₁,
φ₃ = k₀₁ + k₀₂ + k₁₂ + k₂₁,
φ₄ = k₀₁k₀₂ + k₀₁k₁₂ + k₀₂k₂₁, (32)

but these are functions of five basic parameters. Now only V₁ is uniquely identifiable, from φ₁, and none of the four basic kinetic parameters are identifiable.
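This can be confirmed by computer algebra; in the sketch below (with my shorthand c₂ = φ₂V₁, c₃ = φ₃, c₄ = φ₄), three equations in the four kinetic parameters leave a one-parameter family of solutions.

```python
import sympy as sp

k01, k02, k12, k21 = sp.symbols('k01 k02 k12 k21', positive=True)
c2, c3, c4 = sp.symbols('c2 c3 c4', positive=True)  # known observational values

sols = sp.solve([sp.Eq(c2, k02 + k12),
                 sp.Eq(c3, k01 + k02 + k12 + k21),
                 sp.Eq(c4, k01*k02 + k01*k12 + k02*k21)],
                [k01, k02, k12, k21], dict=True)
print(sols)
# the solutions are parametrized by one free rate constant: the model of
# Fig. 2(b) is not identifiable by experiment 1
```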


Fig. 4. Experiment models as in Fig. 3 but with an additional compartment 3.

The conclusion then is that if the system model in Fig. 2(b) is correct, the basic kinetic parameters cannot be estimated because they are not identifiable by experiment 1. Can one modify the experiment to make all of the kinetic parameters of the system model in Fig. 2(b) identifiable? Practically, one cannot sample compartment 2 or the outflow from it. However, if the outflow from compartment 1 is all by way of the urinary output, one can collect the urine and measure the amount excreted as a function of time. In modeling terms, that means we add a compartment that collects the outflow from compartment 1. The new experiment then is the same as experiment 1 with the addition of measurement of the amount in compartment 3, as shown in Fig. 4, the diagram for this new experiment 2.

That means the system model has another equation, the equation for q₃,

q̇₃ = k₀₁q₁, (33)

and another observation function has to be added to the equations for the experiment model

y₂ = q₃. (34)

The Laplace transform of y₂ is

Y₂ = Q₃ = (k₀₁/s)Q₁. (35)

Thus we have another observational parameter to add to Eqs (29) and (32). With

that change in the experimental design all of the kinetic parameters of the system model in Fig. 2(b) become uniquely identifiable.

Model distinguishability
For experiment 1, the observational function is a double exponential decay, Eq (26), for both system models, Fig. 2(a) and 2(b), and there is no way that experiment 1 can distinguish between the two possible system models. However, experiment 2 provides the additional measurement, y₂ = q₃, and if the impulsive input into compartment 1 at t = 0 is 1 unit of material,

lim_(t→∞) y₂(t) = 1,

for the system model in Fig. 2(a) but the limit must be less than 1 for the system model in Fig. 2(b). Thus, within the error of measurement of y₂(t), experiment 2 can distinguish between the two system models. So we choose to do experiment 2.
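A simulation sketch, with illustrative rate constants of my choosing, makes the distinguishing limit concrete: the collected amount y₂ = q₃ tends to 1 for the model of Fig. 2(a) (k₀₂ = 0) and to a value below 1 when k₀₂ > 0.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, q, k01, k12, k21, k02):
    q1, q2, q3 = q
    return [-(k01 + k21)*q1 + k12*q2,
            k21*q1 - (k02 + k12)*q2,
            k01*q1]                      # q3 collects the urinary excretion

T = 200.0                                # long enough for complete washout
for k02, label in [(0.0, 'model (a)'), (0.1, 'model (b)')]:
    sol = solve_ivp(rhs, (0.0, T), [1.0, 0.0, 0.0],
                    args=(0.5, 0.3, 0.2, k02), rtol=1e-8)
    print(label, 'y2(T) =', round(sol.y[2, -1], 4))
# model (a) -> ~1.0; model (b) -> < 1, since material also leaves via k02
```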

Which model?
It is important to realize that up to this point there has been no reason to do any experiments in the laboratory. Analyses of identifiability and model distinguishability


are done on the experiment models, but now it is time to do a real experiment, in order to decide which is the better model and also to determine how far out in time to take samples in an optimal experiment design and to obtain preliminary estimates of the parameters. Notice that the concentration in compartment 1, y₁(t), follows a double exponential decay and after some time, T, when the curve has fallen to a low level and is fairly flat, there is little point in taking more samples. In addition, model distinguishability depends on taking y₂ out far enough so one can estimate whether or not it approaches 1 in the limit, but y₂ is also a double exponential with the same exponential terms as in y₁(t) but with different coefficients. If measurement error is say 2%, one would also choose T large enough to decide whether or not y₂(t) is approaching 1 as t → ∞.

Assume that the experiment has been done and it turns out that, as near as makes no difference, the system model in Fig. 2(a) is correct and we have an estimate of T. If the model in Fig. 2(a) is correct, however, all parameters are identifiable from experiment 1, so why not fall back to the easier experiment, that of the experiment model in Fig. 3(a)? There is good reason not to do that! Experiment 2 gives independent estimates of both k₀₁ and V₁; we would find that the correlations between the estimates of the parameters would be much less for experiment 2! The better decision is to continue to do experiment 2.

Optimal sampling schedule and estimability
Qualitative considerations of estimability tell us we need to take at least four samples to be able to estimate the four parameters of the experiment model in Fig. 4(a). To decrease the estimation errors, one needs to take more. Considerations of experimental technique, e.g. how many samples can be handled without degrading the sampling technique, would set an upper limit. Suppose we decide we can easily handle 16 samples over the period T. Other information needed is the variance of the measurement error, i.e. the variances of the ε_ij of Eq (13). Previous experience with the experimental technique might provide estimates. If not, replicates plus the residuals from the fits in the preliminary experiment(s) can be used to estimate the variance.

With the necessary preliminary information available, we use OPTIM or OSSMIMO to come up with an optimal sampling design, which will be optimal for the preliminary estimates of the parameters. That design will almost always place four samples at each of the four points of support of the design. Technically that is usually not a practical design because of the difficulty of taking four independent samples simultaneously. However, optimal designs are rarely sharply optimal; a sampling design close to an optimal one is close to optimal. So the answer is to group four sample points around each optimal point of support, leaving enough time between successive samples so as not to degrade the technique by hasty work. That has an additional advantage. Suppose later work shows that there are really two peripheral compartments, one that was too small and/or equilibrated so slowly with the plasma compartment that it was not picked up in earlier work. Then the extra points in the design, i.e. not all on the points of support for the optimal design based on the experiment model in Fig. 4(a), help to obtain preliminary estimates for the enlarged model.

Finally, the experiment is run with the optimal design and the parameters are estimated. If the estimates are somewhat different from the preliminary estimates and/or


if they have unacceptably large variances, repeat the process. In updating the estimates in this sequential process, take into account all estimates obtained, using the variances of the estimates as weights. A Bayesian approach to updating could be used.

Discussion and conclusion

Many problems in pharmacokinetics and in the study of metabolism require that one estimate the parameters of the kinetic processes involved. The design of experiments to do that involves formally modeling the system under investigation and the experiments that one proposes to do on the system. Identifiability of parameters, model distinguishability and the generation of optimal sampling schedules are the keystones of this approach. It should be stressed that checking parameter identifiability and model distinguishability does not require that one do any experiments - those checks are done on the experiment models. However, the generation of optimal sampling schedules requires some preliminary estimates, for which preliminary experiments may have to be done. After that, generation of optimal sampling schedules and running the experiment are done iteratively to obtain the parameter estimates.

The gains from this approach to experimentation should be clear. Generally, far fewer experiments need to be done because the experiments that are done are more efficient. That conserves the resources required for experimentation - experiments are costly! Furthermore, it minimizes the use of animals in experiments and that is becoming an ever stronger consideration as the opposition mounts to the use of animals in research.

References

(1) Popper, K., The Logic of Scientific Discovery. Harper and Row, New York, 1968.
(2) Fisher, R. A., The Design of Experiments, 5th edn. Hafner Publishing, New York, 1949.
(3) J. R. Platt, "Strong inference", Science, Vol. 146, pp. 347-353, 1964.
(4) J. J. DiStefano III and E. M. Landaw, "Multiexponential, multicompartmental, and noncompartmental modeling. I. Methodological limitations and physiological interpretations", Am. J. Physiol., Vol. 246, pp. R651-R664, 1984.

(5) Gill, J. L., Design and Analysis of Experiments, Vols 1-3. The Iowa State University Press, Ames, IA, 1978.

(6) Jacquez, J. A., Compartmental Analysis in Biology and Medicine, 3rd edn. BioMedware, Ann Arbor, MI, 1996.

(7) J. A. Jacquez and C. P. Simon, “Qualitative theory of compartmental systems”, SIAM Rev., Vol. 35, pp. 43-79, 1993.

(8) J. A. Jacquez, “Identifiability: the first step in parameter estimation”, Fed. Proc., Vol. 46, pp. 2477-2480, 1987.

(9) Walter, E. (ed.), Identifiability of Parametric Models. Pergamon, Oxford, 1987.
(10) J. J. DiStefano III, "Complete parameter bounds and quasiidentifiability conditions for a class of unidentifiable linear systems", Math. Biosciences, Vol. 65, pp. 51-68, 1983.
(11) Carson, E., Cobelli, C. and Finkelstein, L., The Mathematical Modeling of Metabolic and Endocrine Systems. John Wiley & Sons, New York, 1983.
(12) Godfrey, K., Compartmental Models and Their Application. Academic Press, New York, 1983.


(13) Saccomani, M. P., Audoly, S., D'Angio, L., Sattier, R. and Cobelli, C., PRIDE: a program to test a priori global identifiability of linear compartmental models. In Proc. SYSID 94, 10th IFAC Symposium on System Identification, ed. M. Blanke and T. Soderstrom, Vol. 3. Danish Automation Society, Copenhagen, 1994, pp. 25-30.

(14) H. Pohjanpalo, "System identifiability based on the power series expansion of the solution", Math. Biosciences, Vol. 41, pp. 21-33, 1978.

(15) S. Vajda, K. R. Godfrey and H. Rabitz, "Similarity transformation approach to identifiability analysis of nonlinear compartmental models", Math. Biosciences, Vol. 93, pp. 217-248, 1989.

(16) J. A. Jacquez and T. Perry, “Parameter estimation: local identifiability of parameters”, Am. J. Physiol., Vol. 258, pp. E727-E736, 1990.

(17) J. A. Jacquez and P. Greif, "Numerical parameter identifiability and estimability: integrating identifiability, estimability, and optimal sampling design", Math. Biosciences, Vol. 77, pp. 201-227, 1985.

(18) J. J. DiStefano III, "Optimized blood sampling protocols and sequential design of kinetic experiments", Am. J. Physiol., Vol. 240, pp. R259-R265, 1981.

(19) D. Z. D'Argenio, "Optimal sampling times for pharmacokinetic experiments", J. Pharmacokin. Biopharm., Vol. 9, pp. 739-756, 1981.

(20) J. J. DiStefano III, “Design and optimization of tracer experiments in physiology and medicine”, Fed. Proc., Vol. 39, pp. 84-90, 1980.

(21) Landaw, E. M., Optimal design for individual parameter estimation in pharmacokinetics. In Variability in Drug Therapy: Description, Estimation, and Control, ed. M. Rowland et al. Raven Press, New York, 1985, pp. 187-200.

(22) Fedorov, V. V., Theory of Optimal Experiments. Academic Press, New York, 1972.
(23) Landaw, E. M., Optimal experimental design for biologic compartmental systems with applications to pharmacokinetics. Ph.D. Thesis, UCLA, 1980.
(24) E. Walter and L. Pronzato, "Qualitative and quantitative experiment design for phenomenological models - a survey", Automatica, Vol. 26, pp. 195-213, 1990.
(25) M. N. Nathanson and G. M. Saidel, "Multiple-objective criteria for optimal experimental design: application to ferrokinetics", Am. J. Physiol., Vol. 248, pp. R378-R386, 1985.
(26) C. Cobelli, A. Ruggeri, J. J. DiStefano III and E. M. Landaw, "Optimal design of multioutput sampling schedules - software and applications to endocrine-metabolic and pharmacokinetic models", IEEE Trans. Biomed. Engng, Vol. BME-32, pp. 249-256, 1985.
(27) M. J. Chapman and K. R. Godfrey, "A methodology for compartmental model indistinguishability", Math. Biosciences, Vol. 96, pp. 141-164, 1989.
(28) M. J. Chapman, K. R. Godfrey and S. Vajda, "Indistinguishability for a class of nonlinear compartmental models", Math. Biosciences, Vol. 119, pp. 77-95, 1994.
(29) A. Raksanyi, Y. LeCourtier, E. Walter and A. Venot, "Identifiability and distinguishability testing via computer algebra", Math. Biosciences, Vol. 77, pp. 245-266, 1985.
(30) L. Q. Zhang, J. C. Collins and P. H. King, "Indistinguishability and identifiability analysis of linear compartmental models", Math. Biosciences, Vol. 103, pp. 77-95, 1991.
(31) O. A. Linares, J. A. Jacquez, L. A. Zech, et al., "Norepinephrine metabolism in humans", J. Clin. Invest., Vol. 80, pp. 1332-1341, 1987.