11
Inferences about Hit Number in a Virological Model Author(s): J. G. Kalbfleisch and D. A. Sprott Source: Biometrics, Vol. 30, No. 1 (Mar., 1974), pp. 199-208 Published by: International Biometric Society Stable URL: http://www.jstor.org/stable/2529629 . Accessed: 25/06/2014 07:20 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp . JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. . International Biometric Society is collaborating with JSTOR to digitize, preserve and extend access to Biometrics. http://www.jstor.org This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AM All use subject to JSTOR Terms and Conditions

Inferences about Hit Number in a Virological Model

Embed Size (px)

Citation preview

Page 1: Inferences about Hit Number in a Virological Model

Inferences about Hit Number in a Virological ModelAuthor(s): J. G. Kalbfleisch and D. A. SprottSource: Biometrics, Vol. 30, No. 1 (Mar., 1974), pp. 199-208Published by: International Biometric SocietyStable URL: http://www.jstor.org/stable/2529629 .

Accessed: 25/06/2014 07:20

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at .http://www.jstor.org/page/info/about/policies/terms.jsp

.JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range ofcontent in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new formsof scholarship. For more information about JSTOR, please contact [email protected].

.

International Biometric Society is collaborating with JSTOR to digitize, preserve and extend access toBiometrics.

http://www.jstor.org

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 2: Inferences about Hit Number in a Virological Model

BIOMETRICS 30, 199-208 MARCH 1974

INFERENCES ABOUT HIT NUMBER IN A VIROLOGICAL MODEL

J. G. KALBFLEISCH AND D. A. SPROTT

Department of Statistics, University of Waterloo, Waterloo, Ontario, Canada

SUMMARY

Alling [1971] considers a dilution series model arising in virology and discusses the estimation of a parameter h called the hit number. His analysis is based on asymptotic properties of various estimators. The present paper develops other methods for making inferences about h. Tests for the homogeneity of data from several series and for the goodness of fit of the model are also considered. The analysis is based upon the use of sufficient and ancillary statistics to factor the joint likelihood into parts appropriate for the various types of problem considered, and is illustrative of a general approach to problems of inference.

1. INTRODUCTION

Alling [1971] describes an interesting dilution series model arising in virology. A liquid medium containing a suspension of virus particles is successively diluted to form a geometric series of k + 1 dilutions a', a, a2, ... a'. These are poured over replicate cell sheets, and after a period of growth, the number ni of plaques occurring at dilution level ai is observed. It is assumed that the ni's are independent, and that n, has a Poisson distribution with mean qO'. The parameter X is the expected number of plaques in the undiluted suspension (j = 0), and 0 = a-h where a is the known dilution factor and h, the hit number, is the minimum number of virus particles which must attach themselves to a cell in order to form a plaque. For a derivation and further discussion of the model, see Alling's paper and also Gart [1964].

Alling suggests methods of analysis based on asymptotic properties of various easily- calculated estimators, which he illustrates using data reported by Boeye et al. ([1966], Table 4). There are seven sets of plaque frequencies from seven dilution series, each with three dilution levels (k = 2) and dilution factor a = (10)1/2 The observed frequencies are recorded in columns 2, 3, and 4 of Table 1.

The present paper describes other methods for making inferences about the hit number h, based on the data from one series (section 2) or from several (section 3). These methods are exact in that they are based upon al] of the sample information and do not depend upon mathematical approximations or asymptotic properties. In section 4 these methods are applied to the data of Table 1, with results quite different from those reported by earlier authors. Section 5 develops tests for the homogeneity of several series with respect to the hit number, and with respect to the q's when the hit numbers are equal. Section 6 considers tests for departures from the Poisson assumption.

All of the procedures described are based upon a factorization of the joint likelihood function (section 7). In fact, the main purpose of this paper is to illustrate a general ap- proach to problems of statistical inference in which sufficient and ancillary statistics play a central role. They are used to split the likelihood function, and hence the information of the sample, into various parts, each of which is appropriate for answering a different

199

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 3: Inferences about Hit Number in a Virological Model

200 BIOMETRICS, MARCH 1974

TABLE 1 DATA REPORTED BY BOEYh et al. [1966] AND INFERENCES ABOUT THE HIT NUMBER h

Plaque Counts A Isolate n n n s t h 10% Likelihood Exact a S// Approx. a Number o 1 2 Interval (h=2) (h=2) (h=2)

1 122 10 2 134 14 2.028 1.59 - 2.57 1.000 -0.124 0.901

2 176 10 4 190 18 2.108 1.72 - 2.58 0.668 -0.545 0.586

3 170 19 2 191 23 1.912 1.56 - 2.32 0.591 0.503 0.615

4 266 40 5 311 50 1.675 1.43 - 1.95 0.00906 2.745 0.00604

5 264 38 4 306 46 1.730 1.48 - 2.01 0.0341 2.183 0.0290

6 306 42 3 351 48 1.808 1.56 - 2.08 0.114 1.586 0.113

7 186 22 2 210 26 1.889 1.56 - 2.27 0.474 0.673 0.501

All Isolates 1693 225 1.831 1.707 - 1.962 0.00358 3.015 0.00257

question. This enables inferences to be made on various questions independently of one another. For example, inferences about the hit number h will not depend upon the pa- rameter q, and hence the possibility of misinterpretation owing to ignorance of X is avoided. In addition, this approach leads to methods which are exact in samples of all sizes. These will generally require some additional computation, but ease of computation becomes increasingly less important as computers become faster and more widely accessible.

Although the virological model considered is somewhat specialized, the methods to be discussed are applicable much more generally to problems exhibiting a similar structure (as outlined in section 7). Indeed, applications of this approach, some of them classical, are in the statistical literature. Some examples are exact tests for independence in con- tingency tables, back-the-winner clinical trials (Zelen [1969]), Poisson ratios (Cox [1958]), and goodness of fit to Poisson series (Fisher [1950]). However, there seems to be little general awareness of the approach or of the logical principles involved. The interest in the present model lies in its greater complexity and structure which allows increased scope for subdividing the information of the sample and seeing how the various parts interrelate and combine to form the whole.

2. INFERENCES ABOUT THE HIT NUMBER

The joint distribution of the plaque frequencies ni from a series of k + 1 dilutions a, a,a , a a is

k

TI (nOD)ni exp (-X O')/ni! = w8Ot exp (_ ZOi)/H n;! j=o

where s = n and t = 2jni . Clearly s and t are jointly sufficient for X and 0. Their joint distribution is

f(s, t; q, 0) = c(s, t)q80t exp (-w20')/s!

where c(s, t) is the sum of s!/lln, ! over all nO , n, n,k such that 2ni = s and 2jni = t.

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 4: Inferences about Hit Number in a Virological Model

INFERENCES ABOUT HIT NUMBER 201

Since s is a sum of independent Poisson variates, its marginal distribution is Poisson with mean 4, where ) = q2lo. The conditional distribution of t given s is then

f(t; 0 I s) = c(s, t)0t/(Y0i)8. (1)

This distribution depends only on 0, and will serve as the basis for inferences about 0 when X is unknown.

The marginal distribution of s depends only on 4 = qYOi, a function of both X and 0. For every possible 0, 4 is a one-to-one function of q. Thus if 4 is known to be 40, , then for any proposed value 00 between 0 and c, there exists a value no such that flo 20oi = )0 . That is, by a suitable choice of mo, any value of 0 can be made consistent with the known 40 . In this sense, knowledge of 4 contributes nothing to knowledge of 0 when X is unknown. But the marginal distribution of s, being a function of 4 only, provides information only about 4, and therefore contributes no information about 0 when X is unknown. When X

is unknown, the sample information concerning 0 is contained wholly in (1), which is therefore proportional to the conditional likelihood function of 0 (Kalbfleisch and Sprott [1970]).

(a) Likelihood inferences. The conditional relative likelihood function of 0 is obtained by setting s and t at their observed values, and standardizing (1) to have a maximum value of unity:

CR(0) = f(t; 0 1 s)/sup f(t; 0 1 s)

Since 0 = ah with a being known, the conditional relative likelihood function of h is then CR(ah'). A graph of this function may be obtained with little computational difficulty even without using a computer (see section 4). It provides a pictorial summary of the information provided by the data concerning h, independently of '. Values of h for which CR(ah') is near unity are well supported by the data, while those for which CR(ah') is small (e.g. less than 0.1) are implausible in the light of the data. Also, inferences are exact in that no mathematical approximations are involved. For general discussions of likelihood inference, see Fisher [1956]; Barnard et al. [1962]; Sprott and Kalbfleisch [1965]; and Sprott and Kalbfleisch [1969].

The value of 0 at which the maximum of (1) occurs is called the conditional maximum likelihood (ML) estimate of 0, and may be found numerically as a root of a polynomial of degree kc:

tSIf = sIjO1

For the data of Table 1, there are only three dilutions (k = 2), and O is obtained as the positive root of a quadratic equation,

0 = {:1-b + V/4-3(b--1)2}/2b, b = 2-(t/s).

The ML estimate of h is then h = -loga 0.

Although conditional and unconditional ML estimates will generally be different, they happen to be the same in this problem, and Alling's unconditional ML estimates (which he denotes by h*) should be numerically the same as our conditional ML estimates.

(b) Tests of significance. Consider now the problem of assessing the evidence provided by the plaque frequencies from a single series against some hypothesized hit number, h = ho . A test of significance will also be based upon the conditional distribution (1),

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 5: Inferences about Hit Number in a Virological Model

202 BIOMETRICS, MARCH 1974

where now we have 0 = a-h.. An exact significance level may be obtained as the total probability of the observed value t and others as improbable. This is found by summing (1) over all values of t more probable than- the observed one, and subtracting the result from unity. A convenient method for calculating the conditional probabilities (1) is de- scribed in the Appendix.

As exact test requires considerably more computation than does the likelihood analysis described above, and hand calculation would be extremely tedious. However, the exact significance level may be obtained on a computer in only a few seconds unless s is extremely large (e.g. s = 2000). If s is large, the following approximation may be used. Let S and I denote the (conditional) score and expected information at 0 a-ho:

a log(t; Is) = t _ s (k + 1))0 k

00 0 1- L 20

7 = _E&d log2f(t; 0 s)} = s 2 1 - (0 1

Thenl S/(I)112 has asymptotically a standardized normal distribution. The approximation to the exact significance level may be slightly improved by means of a correction for con- tinuity, which involves decreasing the magnitude of S by 0.5/0. See section 4 for a numerical example.

3. COMBINING INFORMATION FROM SEVERAL SERIES

Consider r replications of a dilution series experiment. The parameters for the ith replication are denoted by qj and Oi = ahi, and the sufficient statistics by si and ti . Tests for homogeneity of the r replications will be discussed in section 5. For the present, it is assumed that there is a common hit number h, and therefore a common parameter 0 = a-h

for the r replications. The problem is then to make inferences about h independently of the qj on the basis of the data from all r replications.

As a straightforward generalization of (1), inferences about 01, r, O in the absence of knowledge of q1, r, r will be based upon

f(t1, Y tr I Sl , Sr) = Ilc(si, t)o0il/(loi )* (2)

Because the r replications are independent and involve different parameters, it can be argued (as in section 2) that the distribution of s , * * *, Sr contains no information about 01 X**2. Or when ?), ... , ? are unknown. Thus (2) contains all of the sample information about 01, ,O r when 77, I *, r are unknown.

Under the assumption 01 = ... = 0. = 0, (2) becomes

f(tl . * tr I SI X * Sr) = [llc(Si X ti)]o0/(l0') (3)

where t = 2ti and s = s . The conditional likelihood function of 0 involves the ti's only through their total t. Given the si's, t is sufficient for the common value 0. Therefore, without loss of information, inferences about 0 may be based upon the conditional distribution of t given s, , 2 Sr Y

f (t I Sl X***XSr) = [Y2uc(si X Qi]fO /(2Y O), (4)

where the sum extends over all t1, ... , tr with 2ti = t. It is easy to show that the gen- erating function for the combinatorial coefficient in (4) is the same as that obtained for c(s, t) in the Appendix. It then follows that

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 6: Inferences about Hit Number in a Virological Model

INFERENCES ABOUT HIT NUMBER 203

f(t J Si , * * S) = f(t; 0 J s) = c(s, t)0t/(20')8. (5)

Consequently, to make inferences about h on the basis of the data from all r replications, one merely applies the procedures described in section 2 to the totals s = 2si and t = 2ti

4. APPLICATIONS

The procedures described in sections 2 and 3 will now be applied to the data described earlier. The values of s, t, and Ah for the seven series are given in Table 1.

(a) Likelihood inferences. Figure 1 shows the graphs of log CR(ah ) for the 4th, 5th, and 6th isolates. The reference line corresponding to 10% relative likelihood is shown. For the 4th isolate, values of h between 1.43 and 1.95 have relative likelihoods of 0.1 or more. Values of h outside this interval are not supported by the data from the 4th isolate. The 8th column of Table I gives the 10% likelihood intervals for h resulting from each of the seven isolates.

Now suppose that the hit number h is the same for all seven isolates. The log conditional relative likelihood function h based on all seven isolates is graphed in Figure 2. Values of h less than 1.707 or greater than 1.962 have relative likelihoods of 0.1 or less, and to this extent they are not supported by the data. In particular, the theoretical value h = 2 has relative likelihood 0.0146, and is therefore quite implausible (as indeed is any other integral value of h).

The implausibility of h = 2 raises the question of whether this value is actually in- compatible with the data. Although the above likelihood analysis would seem to provide a fairly good indication that this is so, the question may more appropriately be answered by a test of significance, as outlined in section (2b).

(b) Tests of significance. Exact tests of significance were performed to investigate the compatibility of the data with the theoretical value h = 2 for each isolate individually,

log 1r4 1;5 16 17 18 1r9 2.0 2 1 relative

likelihood

2- /10% relative likelihood

3- 41

4-

6

51

FIGURE 1 LOG CONDITIONAL RELATIVE LIKELIHOOD FUNCTIONS OF h: ISOLATES 4, 5, AND 6

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 7: Inferences about Hit Number in a Virological Model

204 BIOMETRICS, MARCH 1974

1.170 1.75 1.180 1 85 1.90 1.195 2100 log 0 h relat ive likelihood

2-

10%o relative likelihood

3-

4-

5-

FIGURE 2 LOG CONDITIONAL RELATIVE LIKELIHOOD FUNCTION OF h: SEVEN ISOLATES COMBINED

and for the seven isolates combined. The computational method used was that discussed in the Appendix. The results are given in the 9th column of Table 1. For comparison, approximate significance levels based on the asymptotic normality of S/(I)1/2 (not corrected for continuity) are given in the last column of the table. As could be anticipated from the likelihood analysis, the tests for isolate 4, isolate 5, and the combined data provide quite strong evidence against h = 2.

This finding disagrees with the conclusion of Alling [1971] and also with that of Boey6 et al. [1966] who initially reported the data. However, the referee has pointed out that the mean of Alling's weighted regression estimates does in fact differ significantly from 2 (a 0.01 as compared with a = 0.0036 in Table 1).

5. TESTS FOR HOMOGENEITY

Consider first the problem of testing the significance of the data from r dilution series in relation to the hypothesis of a common hit number h. If the dilution factor a is the same for all the series, this is equivalent to testing the composite hypothesis He: 0i = 0 (i- 1, 2, ... , r). It was noted below (2) in section 3 that all of the information concerning the 02's in the absence of knowledge of the 7i's is contained in the conditional distribution of the ti's given the si's. This distribution may be factored as follows:

f (tl, * tr | s, ,S)= f(t |3' fl ,s)(t, , tr |t, ) r

Under Ho j the first factor depends only upon 0 while the second is parameter free. The first factor contains all of the information concerning the common value of 0, and was used in section 3 to make inferences about 0. The second factor contains the information about the equality of the Oi's, and so serves as the basis of a test for homogeneity.

The ratio of (3) to (5) gives

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 8: Inferences about Hit Number in a Virological Model

INFERENCES ABOUT HIT NUMBER 205

f(t1 , * tr I t, S * Sr) = [Hc(si , ti)]/c(s, t).

For computational purposes, it is more convenient to write

f(ti . . . tr I t si .*8sr) = [I[f(ti ; 0 S)]/8(t; 0 ) (6)

where now the factors on the right may be evaluated by the method described in the Appendix. The result is independent of 0, and hence any convenient value may be used in the computations. An exact significance level may now be obtained by summing (6) over all configurations { ti } with Iti = t which are at least as "unfavorable" to the hypothesis as the observed configuration.

In order to complete the specification of a test, it is necessary to choose a test criterion; that is, to rank the configurations according to their "favorability" to the hypothesis. One such criterion, which is useful when no particular type of departure from H is an- ticipated, is

D = Isi

where Si and Ii are the conditional score and expected information for the ith replication (given in section 2b) evaluated at the overall maximum O. The exact significance level is obtained by evaluating Pr (D > d) using (6), where d is the observed value of D. When all the si's are large, an approximate test of Ho may be based on the fact that D is asymp- totically X (r- 1)

For the data of Table 1, the observed value of D is 5.465, giving the approximate significance level Pr (X(6)2 > 5.465) = 0.49. Hence there is no evidence against the equality of the hit numbers in the seven isolates. To obtain an exact significance level in this case would involve the summation of about 108 terms, requiring several hours on a fast computer. However, confirmation of the above approximate result can be obtained fairly rapidly by using Monte Carlo methods applied to (6).

Suppose now that He is assumed true, and consider the hypothesis H, =- (i = 1, 2, ... , r). Given Ho , H, is equivalent to H: qi = 4(i = 1, 2, . , r), where 4,i 07.

Inferences about the Xi's can be handled in much the same way as those for the 6i's. Since the si's, which are sufficient for the fi's, are also ancillary for the 6i's, inferences concerning the pi's are based solely on

Under H,k this becomes

f(Si, , sr) = , !

so that s is sufficient for 4. Under H, the conditional distribution of the 8j's given s will be independent of all parameters, and can thus serve as the basis of a test of HV . Since s has a Poisson distribution with mean r4, it follows that

ASl, s, sI s) =s! i8! (r) (7)

which is an r-variate multinomial distribution with equal probabilities 1/r. An exact significance level may be obtained by selecting a suitable test criterion and then summing (7) over the appropriate configurations {s.i} with f8i = s.

If s/r is fairly large (say five or more), an approximate test of H" may be based on the fact that (r/s)r,(si - s/r)2 has approximately a x2 distribution with r - 1 degrees of

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 9: Inferences about Hit Number in a Virological Model

206 BIOMETRICS, MARCH 1974

freedom. For the data of Table 1, we find that (r/s)l(si - sr) = 160.14, which is to be compared with tables of X2(6). The hypothesis of equal parameters 7ji is thus inconsistent with the data. A test of this hypothesis can also be made by the method of Gart [1964]. He derived a test on the basis of h = 1, but his formulae appear to be equally applicable to the present model, irrespective of the value of h.

6. TESTS OF THE POISSON ASSUMPTION

Since si and ti are jointly sufficient for 7ji and 0i , the conditional distribution of the plaque frequencies nii given si and ti will be independent of all parameters irrespective of any restrictions or hypotheses placed upon them. Hence this distribution is available for testing the assumption of a Poisson distribution in the ith replication. This conditional distribution is

f(n,io nil * nik IsSi , tj) = si J! (8)

An exact significance level is obtained by summing (8) over all nii's at least as improbable as those observed. Alternatively, the range of summation can be determined by defining a suitable test criterion as in section 5.

In Table 1, the observed frequencies for the 3rd, 4th, and 7th isolates are the most probable under the Poisson assumption, and the significance level is 1 in each case. For the 1st, 5th, and 6th isolates the levels are 0.29, 0.78, and 0.41. Only the second isolate shows evidence of departures from the Poisson model, with a significance level of 0.0195. Various other tests are also possible. For instance, one might test the totals over the seven isolates for departures from a Poisson distribution, or test for departures in all seven isolates simultaneously. However, in the present case these tests seem unlikely to yield any useful information, and hence they have not been performed.

Although there is little evidence of departures from the Poisson model in the above data, this apparently is not generally true of plaque counts, as found by Armitage [1957] and cited by Gart [1964].

7. CONCLUSION

The analysis presented in this paper is based on the separation of the joint distribution f of the plaque frequencies from all r replications into factors, each appropriate for a dif- ferent type of problem. First one obtains

f = flf2f3

where f, is the distribution of the nii's given the si's and ti's and may be used to test de- partures from the Poisson model; f2 is the distribution of the ti's given the si's and contains all information relevant to inferences about the Oi's (and hi's) independently of the 77i's; and f3 is the marginal distribution of the si's, and contains all information about the pa- rameters pi = 77il0i2. A further factorization gives

f2 = f4f5

where fA is the distribution of the ti's given t and the si's, and fj is the distribution of t given the si's. When all 0i = 0, the former depends only on 0 and the latter is independent

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 10: Inferences about Hit Number in a Virological Model

INFERENCES ABOUT HIT NUMBER 207

of all parameters. Thus ft contains the information concerning the magnitude of the common 0, while f, contains the information about the equality of the Oi's. Similarly, f3 may be expressed as the product of two factors one of which contains information about the equality of the i's, while the other contains information about the magnitude of their common value.

The factorization of the joint distribution serves to separate from one another the various types of information provided by the data. Through considerations of sufficiency and ancillarity, certain characteristics of the data may be seen to contain the information relevant to inferences about the magnitudes of the parameters, while other characteristics provide the information necessary for tests of homogeneity and tests of fit.

This paper is intended to illustrate a general approach to problems of inference in which first priority is given to the selection of the appropriate distribution (or reference set) for each inference problem. Whether the problem is one of estimation or of testing significance, attention should first be given to identifying that part of the data (i.e. that marginal or conditional distribution) which contains the relevant information. When this is done, the traditional problems of selecting estimators and test statistics may be greatly simplified or even eliminated. When appropriate structure is present in the form of sufficient and ancillary statistics, this approach leads to methods which are suitable for dealing with nuisance parameters and exact in samples of all sizes. These methods can be im- plemented on a computer, or used as a basis for the evaluation of various approximate methods which might be proposed.

INFERENCES SUR LE NOMBRE DE "COUP" DANS UN MODELE EN VIROLOGIE

RESUME

Alling (1971) considere un modele de dilution intervenant en virologie, et discute l'estimation d'un parametre h appele nombre de "coup". Son analyse est fondee sur les proprietes asymptotiques de divers estimateurs. L'article developpe d'autres methodes pour obtenir des inferences sur h. On considere aussi des tests d'homogeneite des donnees pour plusieurs series de donnees et pour la valeur de l'ajustement au modele. On base l'analyse sur l'utilisation de statistiques suffisantes et "ancillaires" pour d6composer la vraisemblance liee en parties adapt6es aux diff6rents types de probleme consid6res; on fournit ainsi un e illustration de l'approche generale des problemes d'inference.

REFERENCES Alling, D. W. [1971]. Estimation of hit number. Biometrics 27, 605-13. Armitage, P. [1957]. Studies in the variability of pock counts. J. Hyg. Camb. 55, 564-81. Barnard, G. A., Jenkins, G. M., and Winsten, C. B. [1962]. Likelihood inference and time series (with

discussion). J. R. Statist. Soc. A, 125, 321-72. Boeye, A., Melnick, J. L., and Rapp, F. [1966]. SV40-adenovirus "hybrids": presence of two genotypes

and the requirement of their complementation for viral replication. Virology 28, 56-70. Cox, D. R. [1958]. Some problems connected with statistical inference. Ann. Math. Statist. 29, 357-72. Fisher, R. A. [1950]. The significance of deviations from expectation in a Poisson series. Biometrics 6, 17-24. Fisher, R. A. [1956]. Statistical Methods and Scientific Inference. London: Oliver and Boyd. Gart, J. G. [1964]. The analysis of Poisson regression with an application in virology. Biometrika 51, 517-21. Kalbfleisch, J. D. and Sprott, D. A. [1970]. Applications of likelihood methods to models involving large

numbers of parameters (with discussion). J. R. Statist. Soc. B, 32, 175-208. Sprott, D. A. and Kalbfleisch, J. D. [1969]. Examples of likelihoods and comparisons with point estimates

and large sample approximations. J. Amer. Statist. Ass. 64, 468-84. Sprott, D. A. and Kalbfleisch, J. G. [1965]. Use of the likelihood function in inference. Psych. Bull. 64,

15-22. Zelen, M. [1969]. Play the winner rule and the controlled clinical trial. J. Amer. Statist. Ass. 64, 131-46.

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions

Page 11: Inferences about Hit Number in a Virological Model

208 BIOMETRICS, MARCH 1974

APPENDIX

Since (1) is a probability function, its sum over t is unity for all 0 > 0:

2c(s, t)Ot = (zoi)8, 0 < 0 < o

(1 - Ok+1)8(1 - a)-s

Since this holds for all 0 > 0, it in effect defines the generating function of the c(s, t)'s (s fixed). Both terms on the right may be expanded via the binomial theorem to give

2c(s, t)t _T 2()_])@k+)(8()

Rearranging and equating coefficients of Oa on both sides gives

c(s, t) = E 1)k-(k+

One can now write

f(t; 0s) = Eui(t) (9)

where

ut(t) = (t - (k + l)i

To calculate f(t; 0 j s), one first evaluates u0(t) = ( St/(Zfl)8, and then obtains u1(t),

... , recursively using

Ui+i(t) s- i [t -(k + 1i ui (t) + - s + t - (k + 1)i-i](k+l)

Note that ui(t) = 0 for (k + 1)i > t, so that the sum (9) involves at most 1 + t/(k + 1) terms. In fact, it is not usually necessary to calculate even this many terms, because the magnitude of the ratio of successive terms in the alternating series (9) is steadily decreasing. If lui(t) I < jui -(t) j, and computation terminates with ui(t), the calculated value of f(t; 0 1 s) will be in error by less than Iui(t)I.

Received May 1972, Revised October 1972

Key Words: Virology; Conditional tests; Conditional likelihood.

This content downloaded from 195.78.109.96 on Wed, 25 Jun 2014 07:20:35 AMAll use subject to JSTOR Terms and Conditions