23
Herd-level test performance based on uncertain estimates of individual test performance, individual true prevalence and herd true prevalence David Jordan, Scott A. McEwen * Department of Population Medicine, University of Guelph, Guelph, Ontario, Canada N1G 2W1 Accepted 8 May 1998 Abstract A generalized model was derived for understanding the performance of herd-testing protocols when there is uncertainty and variability in individual-level sensitivity, specificity, prevalence of infection within infected herds, and prevalence of infected herds in the population. The model uses Monte-Carlo techniques to provide estimates of test performance for a dichotomous classification of herd-disease status. Uncertainty and variability in input assumptions are described using empirical and parametric probability distributions. The model permits both cluster-correlated behavior of inputs and sampling of animals without replacement. Disease due to obligate parasites is modeled differently from that due to organisms that persist for long periods in the environment. Dependence among model outcomes is assessed using Spearman’s rank correlation. Model output is suitable for inclusion in risk-assessment models requiring probabilistic estimates of herd-level test performance, such as those developed for food-safety decision making and import–export risk assessment. The model was demonstrated using an example scenario based on Shiga-like toxin (SLT) producing Escherichia coli O157 in Ontario beef-cattle herds. Inputs were derived from the literature and Statistics Canada agricultural census data. Where appropriate, these data were subjected to distribution-fitting techniques. Otherwise, subjective interpretation was used to select input distributions and their parameters. Simulation revealed that the distribution of herd-level sensitivity for detecting herds infected with SLT producing E. coli O157 has a large range (0.003– 0.99) and a median of 0.19. Herd-level specificity also had a large range (0.58–1) and a median of 0.94. Distributions of herd-level positive and negative predictive values exhibited similar degrees of uncertainty. In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated has limited ability to discriminate between herds Preventive Veterinary Medicine 36 (1998) 187–209 * Corresponding author. Tel.: +1 519 824 4120, ext 4751; fax: +1 519 763 3117; e-mail: [email protected] 0167-5877/98/$ – see front matter # 1998 Elsevier Science B.V. All rights reserved PII:S0167-5877(98)00087-7

Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

Herd-level test performance based on uncertain

estimates of individual test performance, individual

true prevalence and herd true prevalence

David Jordan, Scott A. McEwen*

Department of Population Medicine, University of Guelph, Guelph, Ontario, Canada N1G 2W1

Accepted 8 May 1998

Abstract

A generalized model was derived for understanding the performance of herd-testing protocols

when there is uncertainty and variability in individual-level sensitivity, specificity, prevalence of

infection within infected herds, and prevalence of infected herds in the population. The model uses

Monte-Carlo techniques to provide estimates of test performance for a dichotomous classification

of herd-disease status. Uncertainty and variability in input assumptions are described using

empirical and parametric probability distributions. The model permits both cluster-correlated

behavior of inputs and sampling of animals without replacement. Disease due to obligate parasites

is modeled differently from that due to organisms that persist for long periods in the environment.

Dependence among model outcomes is assessed using Spearman's rank correlation. Model output is

suitable for inclusion in risk-assessment models requiring probabilistic estimates of herd-level test

performance, such as those developed for food-safety decision making and import±export risk

assessment.

The model was demonstrated using an example scenario based on Shiga-like toxin (SLT)

producing Escherichia coli O157 in Ontario beef-cattle herds. Inputs were derived from the

literature and Statistics Canada agricultural census data. Where appropriate, these data were

subjected to distribution-fitting techniques. Otherwise, subjective interpretation was used to select

input distributions and their parameters. Simulation revealed that the distribution of herd-level

sensitivity for detecting herds infected with SLT producing E. coli O157 has a large range (0.003±

0.99) and a median of 0.19. Herd-level specificity also had a large range (0.58±1) and a median of

0.94. Distributions of herd-level positive and negative predictive values exhibited similar degrees of

uncertainty. In combination with poor likelihood ratios for positive and negative herd tests, results

indicate that the testing protocol investigated has limited ability to discriminate between herds

Preventive Veterinary Medicine 36 (1998) 187±209

* Corresponding author. Tel.: +1 519 824 4120, ext 4751; fax: +1 519 763 3117; e-mail:[email protected]

0167-5877/98/$ ± see front matter # 1998 Elsevier Science B.V. All rights reserved

PII: S 0 1 6 7 - 5 8 7 7 ( 9 8 ) 0 0 0 8 7 - 7

Page 2: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

infected and not infected with SLT producing E. coli O157. # 1998 Elsevier Science B.V. All

rights reserved.

Keywords: Cattle-microbial diseases; Escherichia coli; Herd testing; Modelling; Uncertainty

1. Introduction

Accurate classification of herd-disease status is important for animal-health decision

making and in the study of herd-level risk factors for disease. The disease status of each

herd is often decided by applying a diagnostic test to a sample of animals selected from

the herd at random. If the number of positive results in the sample equals, or exceeds a

pre-defined `cut-point,' then the herd is classified diseased. Traditionally, the cut-point

has been set equal to one and sample sizes defined by the formulae provided by Cannon

and Roe (1982). However, this approach can result in misclassification of herd-disease

status because diagnostic tests rarely have perfect sensitivity and specificity and because

of chance effects in the number of diseased and test-positive animals sampled. In

explaining attempts to overcome this problem we will rely on the abbreviations and

definitions shown in Table 1.

Table 1Abbreviations used in explaining herd-testing models

Abbreviation Meaning

c cut-point number of test-positive animals denoting a test-positive herd

d number of diseased animals in the herd

HNPV herd negative test predictive value

HPPV herd positive test predictive value

HSENS herd-level sensitivity

HSPEC herd-level specificity

HTP true prevalence of diseased herds

LRNHT likelihood ratio for a negative herd test

LRPHT likelihood ratio for a positive herd test

N herd size

n sample size

NT� number of test positive animals in the herd

P(D�) probability that a randomly selected animal is diseased

P(HD�) probability that a randomly selected herd is diseased

P(T�) probability that a randomly selected animal is test positive

P(T�|D-) probability that an animal is test positive given that it is not diseased

rS Spearman's correlation coefficient

SENS sensitivity at the individual-animal level

SPEC specificity at the individual-animal level

TPWH true prevalence within herds

x number of test-positive animals in the sample

n first shape parameter of the beta distribution

� mean probability of success for the beta-binomial distribution

r intracluster (intraherd) correlation coefficient

w second shape parameter of the beta distribution

188 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 3: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

1.1. Existing models of herd testing

Martin et al. (1992) developed a method of calculating HSENS and HSPEC based on

the following steps. Firstly, the probability that an animal selected at random from a herd

is test-positive (P(T�)) is calculated from the true prevalence of disease within herds

(TPWH), test sensitivity at the individual-animal level (SENS), and test specificity at the

individual-animal level (SPEC). The second and third steps calculate HSENS and HSPEC

as cumulative probabilities from the binomial distribution by setting the probability of

success equal to P(T�), the number of trials equal to the sample size (n), and by defining

the summing interval using the cut-point (c). Predictive values for herd-level positive tests

(HPPV) and predictive values for herd-level negative tests (HNPV) can be calculated

from estimates of HSENS, HSPEC and an assumption of herd-level true prevalence

(HTP) using a formula for predicting individual-level true prevalence from individual-

level apparent prevalence, SENS and SPEC (Rogan and Gladen, 1978). A recent

adaptation of this method was made to permit interpretation of two tests used in series,

with use of the second test conditional on the first test providing at least one positive

result (Garner et al., 1997).

Although the method above is computationally simple, a number of drawbacks limit its

practical use. One disadvantage is its reliance on the binomial probability distribution for

estimating the number of test-positive animals in a sample. Binomial probabilities assume

that sampling is performed with replacement. Although sampling in the field is invariably

performed without replacement, binomial probabilities are adequate when herd size (N) is

large compared to sample size (n). As a rule of thumb, n/N should be <5% for binomial

probabilities to be an acceptable approximation to exact probabilities which can be

calculated from the hypergeometric probability distribution (Freund, 1992). Unfortu-

nately, there are many herd-testing scenarios where the sampling-with-replacement

assumption is inadequate according to this rule. Nevertheless, this difficulty can be

surmounted by using the hypergeometric probability distribution (MacDiarmid, 1988).

Cameron and Baldock (1998) devised a solution for overcoming the potential error

caused by binomial approximations and incorporated their formulae in a search algorithm

for estimating appropriate sample sizes for herd tests.

A second, and more difficult, problem with all the aforementioned models lies in

dealing with variability in SENS, SPEC, and TPWH. For example, serological tests for

Mycobacterium paratuberculosis in cattle will perform differently in two different herds

of cattle if the individuals in one herd are more likely to be exposed to organisms such as

M. avium (which cross-react with M. paratuberculosis and, thus, cause a reduction in

SPEC). Similarly, we know that the seropositivity of cattle infected with M.

paratuberculosis depends on factors such as the duration of infection (Chiodini et al.,

1984), and that herds will vary in this respect. Hence, we expect SENS also to vary from

herd to herd. The case for recognising that SENS and SPEC are variable traits within

human populations is also strongly argued for by Kraemer (1992). For most diseases, it is

also reasonable to assume that TPWH varies, since it is merely a reflection of P(D�).

SENS, SPEC and P(D�) have, thus, been described as cluster-correlated binary responses

(Donald, 1993). This means that, for any one of these traits, animals within the same herd

tend to be more alike compared to any two individuals selected at random from the

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 189

Page 4: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

population of individuals. Estimates of HSENS and HSPEC, that do not account for this

phenomenon, may be inaccurate and this will flow through to any subsequent calculations

of HPPV and HNPV. Moreover, it seems undesirable to calculate HPPV, HNPV and HTP

from fixed values of HSENS and HSPEC when the latter vary from herd to herd due to

changes in n, N, SENS and SPEC.

A model described by Donald (1993) and Donald et al. (1994) is a logical progression

on the model of Martin et al. (1992). This model accounts for the clustering of SENS,

SPEC and TP values within herds with the help of the beta-binomial distribution. An

example involving TPWH best illustrates use of the beta-binomial distribution. If we let

TPWH be described by P(D�) then, if n animals are sampled from a herd, the distribution

of the number of diseased animals in the sample follows the binomial probability

distribution with parameters n and P(D�). To model extra-binomial variation in true

prevalence, we then let P(D�) vary as a random variable from a beta distribution. The

beta distribution is used because it is a flexible method of modelling random variables

constrained on the domain of zero to one (such as P(D�), SENS and SPEC). As well, the

beta distribution with parameters n and w allows the confidence in a binomial proportion

to be expressed by setting n equal to the number of binomial successes and w equal to one

plus the number of failures (Evans et al., 1993). In addition, the beta-binomial distribution

can be conveniently re-expressed in terms of two other parameters: � (the mean

probability that an animal is diseased) and r (the intracluster correlation coefficient)

(Donald, 1993). The latter formulation makes it possible to find an analytical solution

(rather than a solution through simulation) to equations for calculating HSENS or HSPEC

from parameters such as SENS, SPEC and P(D�) that can be assumed to have a beta-

binomial distribution. Consequently, the beta-binomial distribution is used extensively by

Donald et al. (1994) to provide a strong theoretical basis for studying HSENS and

HSPEC.

Despite the improvements, the model by Donald et al. (1994) does not lend itself well

to practical circumstances. One reason for this is the nature of input parameters

describing the intracluster correlation of P(D�), SENS and SPEC. While these

parameters have a precise statistical interpretation, perhaps only P(D�) has any intuitive

meaning to veterinarians. The remaining two are poorly understood parameters for which

we have found no published estimates for animal diseases. Furthermore, the model by

Donald et al. (1994) like the model by Martin et al. (1992) can only be assumed to be

accurate when sample size is small compared to herd size (because of continued reliance

on the binomial distribution in modelling the number of test-positive animals in a sample).

A procedure for overcoming some of the limitations of these earlier models was

introduced by Carpenter and Gardner (1996). An objective of their work was to judge the

ability of herd testing (based on a serological test for porcine parvovirus) to correctly

classify the prevalence of parvovirus infection within pig herds as low, medium, or high.

Although this is quite different from the usual dichotomous classification of herd-disease

status, their paper illustrates that it is not necessary to rely on complex analytical

solutions to equations in order to calculate herd-level test parameters. Carpenter and

Gardner (1996) derived a non-analytical solution using Monte Carlo simulation.

However, they appear not to have used it to accommodate variability or uncertainty in

any input parameters other than P(D�).

190 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 5: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

1.2. Uncertainty and variability in model inputs

For virtually all tests and all diseases, the available estimates of SENS, SPEC and

TPWH are affected by both, random and systematic error. In this sense, random error

accounts for the inherent variation of some trait within a population. Systematic error

represents a type of uncertainty since it encompasses all other sources of non-random

error (including measurement error, deficiencies in research procedures or, in general,

any lack of knowledge about the trait under study). When variability and uncertainty are

combined, we refer to the sum as total uncertainty (Hoffman and Hammonds, 1994). We

have already seen how quantities such as SENS, SPEC and TPWH can vary between

herds. Including the notion of total uncertainty in herd-testing models now lets us

describe our beliefs about the extent of systematic error in input quantities. This is

consistent with a `subjectivist' or Bayesian view of probability (Morgan and Henrion,

1990). Thus, for example, point estimates for SENS can be replaced by a probability

distribution that encapsulates the concept of a minimum, maximum, and the most likely

values of SENS. Such distributions can be derived from actual data (where the parameters

are determined by maximum likelihood) or they may be modified or derived wholly from

expert opinion. When distributions are derived entirely from data, variability is being

represented but uncertainty is assumed to be negligible. Distributions that are defined at

least in part by judgement or expert opinion can accommodate both variability and

uncertainty. Distributions that have a wide range and low values of kurtosis (implying

thick tails) represent greater total uncertainty than those with a narrow range and a high

value of kurtosis. With input parameters represented by probability distributions, Monte

Carlo simulation can be used to propagate variability and uncertainty through the model

so that it is represented in model output.

1.3. Motivations for a new model

In this paper, we extend the use of Monte Carlo simulation in herd-testing models by

describing a generalised method for estimating HSENS, HSPEC, HPPV and HNPV for a

dichotomous herd-disease state (diseased or not-diseased). The use and acceptance of

likelihood ratios is increasing in the field of clinical epidemiology and so we have

included the ability to estimate LRPHT (likelihood ratio for a positive herd test) and

LRNHT (likelihood ratio for a negative herd test). We then demonstrate the use of the

model by simulating various scenarios involving Shiga-like toxin (SLT) producing

Escherichia coli O157 in beef cattle.

2. Materials and methods

2.1. Model construction

The herd-testing simulation model we describe was constructed using an object-

oriented visual programming language (Delphi version 2, Borland International, Scott's

Valley, CA). Algorithms for generating random numbers and for sampling from

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 191

Page 6: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

probability distributions were obtained from several sources (Kachitvichyanukul and

Schmeiser, 1988; Press et al., 1989; Law and Kelton, 1991; Evans et al., 1993). Where

stochastic inputs were incorporated into the model, a comprehensive set of probability

distributions (including a fixed value or deterministic input) was made available for

selection. Correctness of the computer code for sampling from probability distributions

was verified by generating large numbers of random variates from each distribution and

then graphically comparing the resulting probability density estimate to a density plot of

the parent distribution. Table 2 provides descriptions and abbreviations for input

parameters and input options referred to in the remainder of the text.

The model represents the cluster-based structure of animal populations, where animals

tend to be managed within herds. The term `herd' is used loosely, and may refer to any

unit by which animals or humans may be aggregated (such as pens, flocks and geographic

regions). Herds and animals within herds are represented by objects within computer

memory rather than being summarised collectively by equations. Essentially, the model

reproduces the steps that would be required if an unlimited quantity of resources were

available to measure the performance of a herd-testing protocol in a field investigation.

Table 2Input parameters for the herd-testing model, their function and mode of sampling

Input parameter Function in the model Mode of action

Numeric inputs

INCUT input integer value of the cut-point number of test-positive results

denoting a positive herd

fixed

INDHS input distribution of herd size stochastic

INFIXPROP input proportion of animals to be sampled from herds fixed

INFIXN input integer number of animals to be sampled from herds fixed

INITERHERDS input integer number of iterations per herd fixed

INMAX input integer maximum number of animals to be sampled per herd fixed

INP(D�) input probability that an animal is diseased if it is selected at

random from a diseased herd

stochastic

INP(HD�) input probability that a herd selected at random is diseased stochastic

INCUTPROP input real value of the proportion of test-positive results in the

sample denoting the cut-point

fixed

INSAMHERDS input integer number of herds sampled per simulation fixed

INSENS input distribution of animal level test sensitivity stochastic

INSPEC input distribution of animal level test specificity stochastic

Switching option for determining cut-points

CUTMETHOD interpret the cut-point as an integer value derived from INCUT or

as an integer value derived from INCUTPROP

option

Mutually exclusive test policy options defining sample size for each herd

TALLHERD test all animals in the herd option

TFN test a fixed number (INFIXN) of animals from each herd option

TFPROP test a fixed percentage (INFIXPROP) of animals from each herd option

TGREAT test using whichever is the greater of TFPROP or TFN option

TLESS test using whichever is the lesser of TFPROP or TFN option

192 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 7: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

The model is best described by an eight-step algorithm that would enable independent

construction of an identical model.

2.2. Algorithm

Step 1. A single herd is sampled from the population of herds. Each herd is defined by

a herd size (N) by sampling from INDHS and a sample size (n) according to mutually

exclusive sampling options (TFPROP, TFN, TLESS, TGREAT or TALLHERD) that refer

to several sample-size determinants (INFIXPROP, INFIXN and INMAX). For each herd,

the probability that it is diseased (P(HD�)) is obtained by independent sampling from the

input probability distribution (INP(HD�)). Similarly, each herd is allocated a value of

P(D�), SENS and SPEC by sampling from INP(D�), INSENS and INSPEC,

respectively. Provided that the latter sampling distributions are not represented by a

fixed value, sampling from them provides the source of extra-binomial variation in

prevalence and in individual-test performance traits. Herds are also defined by a value of

c. When the Boolean option CUTMETHOD is set equal to `true', the value of INCUT is

used to define a constant integer value of c (which is then the same for all sample sizes).

When CUTMETHOD is `false,' sample-size-specific values of c are defined such that c is

equal to the smallest integer which is greater than INCUTPROP multiplied by n. For

example, if test-positive herds were defined as those with 5% or more of animals in the

sample being test-positive, then a herd with a sample of size 42 would obtain a cut-point

of 3 (42 multiplied by 0.05 rounded to the next-highest integer).

Step 2. The value of P(HD�) obtained in Step 1 is used as the parameter in a single

Bernoulli trial to decide the herd-infection status. If a herd is deemed not-diseased then

Step 2.1 is executed; otherwise, diseased herds are processed by Step 2.2. In both, steps

2.1 and 2.2, the number of test-positive individuals in the herd is derived and this

information then used to define the number of test-positive animals in the sample.

Step 2.1. The number of test-positive individuals in samples from non-diseased herds is

simulated as follows: One minus the value of SPEC allocated to the current herd is used

to define the probability of a false-positive individual test (P(T�|D-)) for that herd. Next,

the number of test-positive animals in the herd (NT�) is found by drawing a random

number from the binomial probability distribution with parameters N (herd size) and

P(T�|D-). The number of test-positive animals in the sample (x) is then obtained as a

random number from the hypergeometric probability distribution (Freund, 1992) having

parameters N, n, and NT�. Finally, if x is greater than or equal to the specified cut-point

(c) the herd is assigned a false-positive test result, otherwise it is assigned a true-negative

test result. When Step 2.1 is completed, Step 3 is executed.

Step 2.2. NT� for a diseased herd is derived from the sum of true-positive and false-

positive individuals. To define the number of true-positives, it is first necessary to

calculate the number of diseased animals in the herd (d) by one of the two methods. In the

first method (referred to as the `obligate parasite mode'), d is made equal to a random

number from the zero-truncated binomial probability distribution (Johnson and Kotz,

1969a) having parameters N and P(D�). This mode of simulation is suited to infectious

diseases, where the agent behaves as an obligate parasite because this mode ensures that

each diseased herd contains at least one diseased animal. Sampling from the zero-

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 193

Page 8: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

truncated binomial probability distribution was achieved by adding one to a random

number from the binomial distribution having parameters N-1 and P(D�). The second

method (`free-living contagion mode') is suited to conditions caused by organisms that

have a substantial capacity for free-living existence. This approach involves setting d

equal to a random number from the binomial probability distribution having parameters

N and P(D�). Note that herds of equal size with an equal value of P(D�) can have a

different number of diseased animals due to random sampling from the binomial

distribution. Once d is known, then the number of true positives in the herd is obtained as

a random number from the binomial distribution with parameters d and SENS, and the

number of false positives in the herd is obtained as a random number from the binomial

distribution with parameters N-d and P(T�|D-). NT�, for a diseased herd, is then derived

as the sum of true-positive and false-positive animals in the herd. For diseased herds, x is

then defined as a random number from the hypergeometric probability distribution with

parameters N, n, NT�. It is the application of the hypergeometric probability distribution

here, and in Step 2.1, that overcomes the difficulty found in other models arising from

variable herd size and large n relative to N. If x�c, the herd is classified as `true-positive';

otherwise, it is classified as `false-negative'. Upon completion of Step 2.2, Step 3 is

executed.

Step 3. The results for the current herd are stored and Step 2 (invoking either Step 2.1

or Step 2.2) is repeated until the total number of iterations equals INITERHERDS. Step 4

is then executed.

Step 4. For the results obtained in Step 3 for the current herd, the model sums the

number of herd results that are true-positive (a), false-positive (b), false-negative (g), and

true-negative (d). From these, HSENS is derived as a/(a�g), HSPEC as d/(b�d), HPPV

as a/(a�b), HNPV as d/(g�d), LRPHT as HSENS/(1-HSPEC), and LRNHT as HSPEC/

(1-HSENS). LRPHT expresses the probability of a positive herd-test result in a diseased

herd divided by the probability of a positive herd-test result in a non-diseased herd.

LRNHT expresses the probability of a negative herd-test result in a non-diseased herd

divided by the probability of a negative herd-test result in a diseased herd. Note that each

of the values calculated in this step is specific for the particular herd being simulated;

they are not derived as estimates of the averages of all herds in the population of herds.

Execution now passes to Step 5.

Step 5. Steps 2 to 4 are repeated INSAMHERD times and, on each occasion, the values

of input variables for each herd (SENS, SPEC, N, n, P(D�), and P(HD�)) and the

resulting outputs (HSENS, HSPEC, HPPV, HNPV, LRPHT, and LRNHT) are stored for

later analysis in Step 6.

Step 6. Values of the outputs from each herd (HSENS, HSPEC, HPPV, HNPV, LRPHT,

and LRNHT) are processed to provide descriptive statistics and graphical representation

of each outcome as a probability density or distribution function (cumulative probability).

Dependence between the defining traits of each herd (N, n, SENS, SPEC, P(D�) and

P(HD�)) and each output are assessed by calculating Spearman's rank correlation

coefficient, where tied ranks are assigned the mean rank of each element within the tie

(Freund, 1992). Significance of the departure of each rank correlation coefficient from 0

is assessed by calculation of a t-statistic (Steel and Torrie, 1980) and P-values are

calculated from the incomplete beta function (Press et al., 1990). The extent of

194 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 9: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

dependence between any pair of input and output variables can be viewed as a scatter

plot.

2.3. Example inputs: E. coli O157 in Ontario beef cattle

To demonstrate the use of the new model, we generated inputs required for a

hypothetical survey of Ontario beef-cattle herds. The objective was to examine the

performance of a herd-testing protocol for estimating the herd-level prevalence of

infection with SLT producing E. coli O157. This organism is an important human enteric

pathogen that is a commensal in the gut of cattle (Griffin and Tauxe, 1991). In this

scenario, testing of cattle for E. coli O157 is based on microbiological culture of fresh

faeces (a procedure that relies on the tendency of this organism not to ferment sorbitol),

and confirmation of the presence of genes encoding for the production of SLT (an

important virulence determinant).

In the example, the distribution of herd sizes was obtained by requesting Statistics

Canada to perform a custom search of the 1996 Agricultural Census data for Ontario.

Search criteria were specified to exclude dairy farms and to provide class intervals for

herd size that were as small as possible. Using these data, an empirical probability density

for beef cattle herd size was constructed under the assumption that the maximum herd

size in the population of Ontario herds was 5000.

A distribution representing the probability that a herd was infected was obtained by

subjective assessment of point estimates from the scientific literature such as the list cited

by Armstrong et al. (1996). Initially, a large number of estimates was considered although

many were eventually discarded because of weakness in study design (especially,

deficiencies in sampling strategy) or use of tests that are too insensitive. Emphasis was

placed on one particular study (Hancock et al., 1997b), since the target population of

herds most-resembled Ontario beef cattle herds due to their similarity in animal

management and geographic location. In their study, 63 out of 100 feedlots in the United

States were found positive for SLT producing E. coli O157. Since extrapolating from

surveys that rely on imperfect tests is likely to introduce error, we chose to represent the

uncertainty in herd prevalence as a triangular distribution having a minimum 45%, a

mode of 63%, and a maximum of 100%. The minimum value is derived from the beta

distribution with n�63 and w�38, and is the approximate value of prevalence, where the

lower tail of this distribution achieves zero probability. The choice of parameters for the

beta distribution is consistent with the relationship between the binomial distribution and

the incomplete beta function where the confidence in a binomial proportion is defined by

a beta distribution with n equal to the number of successes and w equal to the number of

failures plus one (Evans et al., 1993). We considered that the maximum value of 100%

herd prevalence could be justified because of uncertainty arising from the effect of

transitory shedding of E. coli 0157 in cattle faeces (Hancock et al., 1997a; Mechie et al.,

1997) and because the published herd prevalence estimate was itself based on samples (of

animals from each herd) subjected to an imperfect test.

Data for estimating the distribution of within-herd prevalence of infection are available

from a study of dairy heifers from 36 dairy herds (Hancock et al., 1997a) and from a

study of cattle from 100 beef feedlots (Hancock et al., 1997b). Because the histograms of

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 195

Page 10: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

within-herd prevalence from these two studies are very similar, we pooled the information

belonging to diseased herds from both studies and plotted these data as a probability

density. We then visually assessed the fit of various triangular and beta distributions,

making allowances for the need to account for uncertainty. The final choice was a beta

distribution where the parameters were derived (using maximum-likelihood estimation)

from the pooled data followed by iterative adjustment to maximise the chi-squared

goodness-of-fit statistic. Distribution fitting was performed using commercial software

(Bestfit, Palisade, NY). The chosen beta distribution (with parameters n�1.15 and

w�53.69) and the pooled data are graphically depicted in Fig. 1. Goodness-of-fit of the

data to the beta distribution was poor, as judged by the chi-squared test (c2�15.6;

P�0.03) but more acceptable as judged by the Kolmogorov±Smirnov test (P>0.15)

(which is regarded a more-robust test of fit) (Law and Kelton, 1991).

The laboratory test, we assume to be in use in this survey, begins with a swab of fresh

cattle faeces (obtained per rectum), which is enriched overnight in tryptic soy broth and

then plated onto sorbitol McConkey agar with added cefiximine and tellurite (SMACct).

Positive colonies (maximum of 10) are subjected to tests for lactose fermentation and

beta glucuronidase activity, then submitted to latex agglutination for confirmation of the

correct O-antigen type, and DNA hybridisation for the presence of genes encoding for

SLT. A full description of a similar test protocol is given elsewhere (Hancock et al.,

1997b). Estimates of test sensitivity were obtained from Experiment 2 of the study by

Sanderson et al. (1995). We restricted use of their data to aiding our subjective judgement

of an appropriate probability distribution for SENS since the available estimates were not

based on a large number of individuals, and because of the limit of 10 colonies per sample

being subjected to O-antigen typing and biochemical tests. Furthermore, estimates of

variability of SENS from herd to herd are not available. The uniform distribution is

Fig. 1. Observed (pooled) and fitted probability density estimates (f(x)) for the intra-herd prevalence of infection

with E. coli O157.

196 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 11: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

appropriate for these circumstances because it portrays the amount of ignorance about the

quantity under question (Vose, 1996). Hence, we chose the uniform distribution with a

minimum value of 0.5, and maximum value of 0.8, to represent sensitivity of detection of

infection in individual cattle. The choice of minimum and maximum values are opinion-

based and reflect conservative bounds placed on the published estimates of Sanderson

et al. (1995).

Much greater confidence is possible for the distribution of individual test specificity

because of the reliability of confirmatory tests applied to isolates. The use of a sorbitol-

based culture medium followed by several biochemical tests is designed to remove most

(if not all) isolates that could cause a false positive result. Subsequent O-antigen typing

and DNA hybridisation for detection of genes encoding for SLT are likely to be almost

100% specific. In the absence of literature quantifying the specificity of the individual-

animal, test we assumed that it was uniformly distributed with a minimum of 99% and a

maximum of 100%.

The herd-test sampling protocol was based on a compromise between cost and

convenience. We modified a protocol (previously used for sampling adult dairy cattle)

which entailed a random selection of 25% of the herd or 10 cattle (whichever was the

greater) (Wilson et al., 1996). However, we applied a maximum of 50 to the number of

cattle to be tested from any one herd so as to reflect the need to contain laboratory costs

when large herds are being tested.

2.4. Model experimentation

We now refer to the stochastic inputs for E. coli O157 when applied to all Ontario beef

herds as the base scenario. The base scenario was simulated to demonstrate the overall

effect of uncertainty in all input parameters and to assess the rank-correlation between

herd-level inputs and outputs. Initially, the base scenario was simulated with the cut-point

being one positive animal in the sample. This simulation was repeated with the cut-point

being defined as 10% of animals in the sample being test-positive.

To investigate the benefits of the model over earlier models that assume both sampling

with replacement and constant herd size, we compared the base scenario with a scenario

having identical inputs except that the model was forced to sample animals from herds

with replacement.

Output from the base scenario was produced when the model was run in the obligate-

parasite mode and compared to the earlier output when the model was run in the free-

living-contagion mode.

Due to the sparseness of data for the input assumption INP(D�), we wanted to know if

model output was biased due to the poor fit of the beta distribution to the observed data

for P(D�). To answer this, we compared the base scenario with output from the model

when INP(D�) in the base scenario was replaced by the observed data summarised as an

empirical probability density.

Unless otherwise stated, all simulations were performed using a constant integer cut-

point value of one. Simulations for each scenario were based on a sample of 1000 herds

(INSAMHERDS�1000). Within the main simulation, each herd was simulated 1000

times to provide a value for each herd-test parameter (INITERHERDS�1000).

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 197

Page 12: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

3. Results

3.1. Base scenario

For the base scenario, the distribution of HSENS has a wide range (0.003±0.99), a

median of 0.19, and a mode at �0.15 (Fig. 2(a)). The range of HSPEC was narrower

(0.58±1), with a median of 0.94 (Fig. 2(b)). HPPV had a median of 0.88 and ranged from

0.5 to 1 (Fig. 2(c)). HNPV was very uncertain, extended from 0.002 to 0.98, and had a

median of 0.35. LRPHT ranged from 0.27 to 186.4, and had a median of 2.95 (Fig. 2(e)).

LRNHT ranged from 0.96 to 51.3 although most values were close to one (median�1.15)

(Fig. 2(f)). Spearman's rank correlation coefficients in Table 3 indicate the extent of

dependence between the various herd-level inputs and herd test-performance parameters

for the base scenario.

When the base scenario was simulated with the cut-point instead defined as 10% of

sample size, the distribution of HSENS (Fig. 3(a)) was shifted to the left and HSPEC was

improved by elimination of much of the lower tail (Fig. 3(b)). However, there was no

meaningful improvement in the distributions of herd test predictive values (compare

Fig. 2(c) with Fig. 3(c), and Fig. 2(d) with Fig. 3(d)).

3.2. Sampling with replacement

Some outputs for the base scenario generated by sampling with replacement differed

from those obtained when sampling was performed without replacement. The distribution

of HSENS in the sampling-with-replacement option (Fig. 4(a)) is shifted �5% to the left

of the theoretically superior sampling without replacement option (Fig. 2(a)). The lower

tail of the distribution of HSPEC produced from sampling with replacement (Fig. 4(b)) is

of slightly lower volume than that obtained by sampling without replacement (Fig. 2(b)).

Table 3Spearman's rank correlation between herd-level inputs (column headings) and herd-level outputs (row headings)with p-values in brackets (Ho: rS�0) for the simulation of E. coli O157 detection in Ontario beef herds using thebase scenario inputs

Herd size Sample size Sensitivity Specificity P(HD�)g P(D�)h

HSENSa �0.45 (0.00) �0.48 (0.00) �0.08 (0.01) ÿ0.26 (0.00) ÿ0.04 (0.21) �0.77 (0.00)

HSPECb ÿ0.50 (0.00) ÿ0.53 (0.00) �0.01 (0.78) �0.78 (0.00) �0.02 (0.46) ÿ0.02 (0.60)

HPPVc ÿ0.12 (0.00) ÿ0.10 (0.00) �0.06 (0.04) �0.51 (0.00) �0.52 (0.00) �0.49 (0.08)

HNPVd �0.15 (0.00) �0.16 (0.00) �0.03 (0.41) ÿ0.03 (0.39) ÿ0.90 (0.00) �0.30 (0.00)

LRPHTe ÿ0.12 (0.00) ÿ0.11 (0.00) �0.07 (0.02) �0.53 (0.00) ÿ0.04 (0.21) �0.73 (0.00)

LRNHTf �0.30 (0.00) �0.33 (0.00) �0.08 (0.01) ÿ0.01 (0.67) ÿ0.05 (0.12) �0.88 (0.00)

a Herd-level sensitivity.b Herd-level specificity.c Herd-level positive predictive value.d Herd-level negative predictive value.e Likelihood ratio for a positive herd test.f Likelihood ratio for a negative herd test.g Probability that a randomly selected herd is diseased.h Probability that a randomly selected animal from a diseased herd is diseased.

198 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 13: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

Fig. 2. Simulation output for E. coli O157 testing of Ontario beef herds. The simulation used the base-scenario

assumptions, an integer cut-point of one, hypergeometric sampling, and free-living-contagion mode of simulation:

(a) herd-level sensitivity; (b) herd-level specificity; (c) herd-level positive predictive value; (d) herd-level negative

predictive value; (e) likelihood ratio for a positive herd test; and (f) likelihood ratio for a negative herd test.

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 199

Page 14: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

In this comparison, the two sampling methods do not provide differences in predictive

values that are of practical importance (compare Fig. 4(c) with Fig. 2(c), and Fig. 4(d)

with Fig. 2(d)).

3.3. Effect of obligate-parasite mode

Output for the base scenario simulated under obligate-parasite mode differed

noticeably from output obtained by simulating in free-living-contagion mode. Essentially,

the body of the distribution of HSENS produced in free-living-contagion mode (Fig. 2(a))

Fig. 3. Simulation output for E. coli O157 testing of Ontario beef herds. The simulation used the base-scenario

assumptions, 10% of sample size as the cut-point, hypergeometric sampling, and free-living-contagion mode of

simulation: (a) herd-level sensitivity; (b) herd-level specificity; (c) herd-level positive predictive value; and (d)

herd-level negative predictive value.

200 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 15: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

is shifted almost 10% to the right (improved) when simulation is conducted in obligate

parasite mode (Fig. 5(a)). As expected, there were no changes in HSPEC (Fig. 2(b) vs.

Fig. 5(b)), although the distribution of HPVPOS was improved by a reduction in volume

of the lower tail (Fig. 2(c) vs. Fig. 5(c)). HPVNEG was relatively unchanged (Fig. 2(d)

vs. Fig. 5(d)).

3.4. Effect of perfect test performance

When the base scenario was simulated with INSENS fixed at 100%, there was no

substantial change in the distributions of HSENS, HSPEC, HPPV or HNPV (output not

Fig. 4. Simulation output for E. coli O157 testing of Ontario beef herds. The simulation used the base-scenario

assumptions, an integer cut-point of one, binomial sampling (sampling with replacement), and free-living-

contagion mode of simulation: (a) herd-level sensitivity, (b) herd-level specificity, (c) herd-level positive

predictive value, and (d) herd-level negative predictive value.

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 201

Page 16: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

shown) compared to the base scenario (Fig. 2). When the base scenario was simulated

with INSPEC fixed at 100%, HSENS fell but remained highly uncertain (Fig. 6(a)),

HSPEC and HPPV were (as expected) both 100%, and there was little improvement in

HNPV or reduction in uncertainty (Fig. 6(b)).

3.5. Substituting an empirical distribution for P(D�)

When an empirical distribution representing the observed data of Hancock et al.

(1997a, b) replaced the best-fitting beta distribution in INP(D�) in the base scenario, all

Fig. 5. Simulation output for E. coli O157 testing of Ontario beef herds. The simulation used the base-scenario

assumptions, an integer cut-point of one, hypergeometric sampling, and obligate-parasite mode of simulation:

(a) herd-level sensitivity; (b) herd-level specificity; (c) herd-level positive predictive value; and (d) herd-level

negative predictive value.

202 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 17: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

output variables were virtually indistinguishable (output not shown) from those of the

base scenario (Fig. 2).

4. Discussion

The model described here provides several improvements on existing models by

incorporating all of the following features. Firstly, input parameters can be expressed as

probability distributions to account for uncertainty and variability. Secondly, the model

accommodates cluster-correlated behaviour of SPEC, SENS and TP. Thirdly, variation in

herd size is accounted for and (by default) sampling is performed without replacement.

Fourthly, the number of animals sampled from each herd can vary according to simple or

complex sampling protocols (the latter were included not because they offer a particular

advantage, but because they allow the interpretation of a wide range of existing studies).

Fifth, the classification of herds as test-positive or negative can be based on either an

integer cut-point or a proportion of the sample testing positive. Finally, diseases due to

organisms that behave as obligate parasites are modelled differently to those which are

caused by a contagion able to survive for a substantial period in the environment. We

have also given examples (based on E. coli O157 in Ontario beef herds) of how input

distributions may be arrived at and then demonstrated their use in specific simulations.

Clearly, there are many scenarios which could be further explored with this model. For

this purpose, interested readers can obtain from the first author a copy of the software

which runs on 32-bit Windows1 operating systems (HerdTest, D Jordan, Department of

Population Medicine, University of Guelph).

Fig. 6. Simulation output for E. coli O157 testing of Ontario beef herds. The simulation used the base-

scenario modified by assuming test specificity is 100%, an integer cut-point of one, hypergeometric sampling,

and free-living-contagion mode of simulation: (a) herd-level sensitivity; and (b) herd-level negative predictive

value.

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 203

Page 18: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

Each of the outputs from this model represents a herd-specific probabilistic value

rather than a population-specific proportion derived by averaging one observation per

herd. The former is preferable because it allows for flexible modelling of herd-test status

in probabilistic risk assessments and because it acknowledges that testing traits (HSENS,

HSPEC, HPPV, HNPV, LRHPT, and LRHNT) can vary from herd to herd. These

probabilistic estimates contain much more information than can be obtained by

performing a simple sensitivity analysis using any of the earlier deterministic type models

because the Monte-Carlo technique effectively weights each value of each input variable.

Furthermore, even when one assumes that all inputs are certain and fixed, herd-level

testing traits for herds of the same size will continue to exhibit variability due to binomial

and hypergeometric sampling. This phenomenon is not allowed for in earlier models that

calculate point estimates of herd testing traits, and it is likely to have some impact on

methods for calculating sample sizes for herd testing.

Spearman's correlation coefficients (rS) indicate which herd-level inputs are driving the

differences in herd-level outputs (Table 3). For example, rS between P(D�) and HSENS

is �0.77, indicating that within-herd prevalence has a strong positive influence on

HSENS. The correlation between SPEC and HSENS (rS�ÿ0.26) suggests that

information providing more certain estimates of SPEC would to some extent deliver

more certain HSENS. The moderate positive correlation between SPEC and LRPHT

(rS��0.53) also suggests that improved SPEC estimates would enhance the use of the

test protocol. Because the model independently allocates each herd a value of SENS,

SPEC, P(HD�) and P(D�), pairs of rSs involving these terms can be considered

independent. However, care is needed in interpreting pairs of rSs involving both herd size

and sample size because in some simulations (such as those presented here), sample size

is dependent on herd size. Although, theoretically, the Bonferroni correction should be

applied when interpreting the significance of multiple associations within the simulation

output, this is of little practical importance when simulations consist of a large number of

iterations which inevitably provide very precise values of rS and extremely small p-

values.

Simulation of the E. coli O157 base scenario indicates that tests based on faecal culture

that are applied to only part of the herd provide unacceptable distributions of HPPV and

HNPV. Uncertain herd-level positive predictive values and low and uncertain negative

predictive values indicate the potential for misclassification of herd-infection status,

should the herd-testing protocol be used in a survey or observational study. However, the

moderate association between P(HD�) and HPPV (rS��0.52) and the strong negative

association between P(HD�) and HNPV (rS��0.90) (which is consistent with the

performance of these traits at the individual-level) indicate that better definition of prior

estimates of herd-level prevalence of infection will improve the certainty of herd-level

predictive values. In this example, it appears as though uncertainty in P(D�) is very

important because it is either moderately or strongly associated with all herd-testing

parameters except for HSPEC. Both LRPHT and LRNHT indicate the difficulty of

interpreting herd-test results when the current test protocol is employed. An advantage of

this use of likelihood ratios is that they can be interpreted regardless of the value of herd-

level prevalence. This example highlights the need for researchers to investigate the

performance of a herd-testing protocol during the planning of studies on herd-level risk

204 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 19: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

factors for disease in order to minimise misclassification of herd-disease status. Results

for the base scenario also confirm that the use of fixed values of HSENS, HSPEC, HPPV

or HNPV in a risk-assessment model should be avoided due to the amount of uncertainty

associated with each of these outcomes.

Simulations examining the effects of perfect SENS and SPEC were performed to

demonstrate what gains in herd-testing performance might be achieved if more

research effort were directed at improving tests for E. coli O157. Improving SPEC

alone (which might occur if the E. coli O157 isolates were more definitively typed

for H-antigen and virulence determinants), causes a reduction in HSENS because a

proportion of infected herds in the base scenario are classified as infected (not because

of true-positive individuals, but, rather, because of false-positives). We have found

that in some circumstances imperfections in test specificity contribute substantially to

better HSENS. Improving SENS has little impact on HSENS because P(D�) is low

and the sample size is such that the probability of including infected animals in the

sample is slight. We conclude that if industry required a more accurate definition of

the E. coli O157 status of beef herds, then use of improved tests may need to be

accompanied by a more intensive sampling of herds. The model described here can

predict the magnitude of such improvements for a wide variety of disease scenarios.

Moreover, other methods of improving the usefulness of information derived from herd

testing need to be explored. One adjunct is to use multiple cut-points (which makes

greater use of the number of test-positives than a single cut-point interpretation) (Jordan,

1996); another adjunct is the use of supplemental tests (e.g. excluding false positive

results) (Garner et al., 1997), and a third is to attempt prevalence estimation from pooled

samples (Sacks et al., 1989).

It is important to note that tests for E. coli O157 based on faecal culture will not detect

`carrier' cattle which are infected with the organism but which do not excrete it in faeces.

For this reason, the literature often refers to prevalence of shedding rather than prevalence

of infection. If an estimate of the accuracy of classifying herd-shedding status is required,

then results obtained from simulation in obligate-parasite mode are probably more

applicable. These results are produced under the assumption that there is at least one

animal excreting the organism in an infected herd. In our simulations, we have assumed

the carrier status is not important. If a contrary belief is held, then consideration should be

given to using estimates of individual test sensitivity that are lower than those used in our

simulations. Doing so would account for the impact of failing to detect carrier

individuals.

The choice between running the model in free-living-contagion mode or in obligate-

parasite mode is sometimes important. In free-living-contagion mode, infected herds do

not necessarily contain any infected animals at the time of testing. Rather, it is possible

for the pathogen of interest to be present only in the environment (e.g. soil, water,

rodents, birds) but it is available to infect individuals at some later opportunity. The free-

living-contagion mode of simulation is better-suited to modelling of herd E. coli O157

infection status, since for this organism there is good evidence that environmental

reservoirs of infection are important (Wang et al., 1996). This is also undoubtedly true for

other agents of food-borne disease such as Salmonella. The net result of running the

model in free-living-contagion mode is that the binomial distribution simulating the

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 205

Page 20: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

number of infected animals in the herd permits herds to have nil-infected animals (which

provides for fewer test-positive animals to be detected at herd testing). Consequently, in

free-living contagion mode, HSENS is likely to be reduced in some circumstances (as

was shown in our results). Because the moments of the binomial and zero-truncated

binomial distribution are virtually the same for large numbers of trials, large probability

of success, or both (Johnson and Kotz, 1969a), there is unlikely to be any practical

difference between the two modes of simulation when P(D�) is high and when N is large.

In cases where exceptionally good data are available for P(D�) and there is no desire to

allow for uncertainty, the probability-of-success parameter for the zero-truncated

binomial distribution can be estimated by various techniques including maximum

likelihood (Johnson and Kotz, 1969b). We consider that the facility for running the model

in obligate-parasite mode is important for many diseases. Examples are bovine leukosis

and bovine tuberculosis (in the absence of wildlife reservoir of infection) where it may be

necessary to define the status of small herds following a period of suppression of

prevalence due to an attempt to eradicate infection.

In the example simulations, we demonstrated that sampling with replacement ± an

assumption present in earlier models ± can lead to biased estimates of HSENS. Although

the bias induced by sampling with replacement is well recognised, we felt it appropriate

to evaluate if it persists when there is considerable uncertainty about SENS, SPEC and

P(D�). With small herds, the proportion of the herd being sampled is larger (under the

testing protocol described) and this tends to increase the differences between the binomial

and hypergeometric sampling methods. Although the differences observed in the results

are not large, they could be expected to be much greater if (for example) pen-level test

performance were being investigated. Nevertheless, since this model has been

incorporated into software, there is only a trivial cost in computation time incurred

when sampling from the hypergeometric distribution. Because sampling from the

hypergeometric distribution removes the need to make a sometimes-invalid assumption,

we consider that it should be relied on as the default mode of sampling in all simulations

of herd testing.

A problem often encountered in stochastic simulation is lack of sufficient data to

justify a particular parametric distribution for an input variable. Parametric distributions

are often preferred to empirical distributions because they provide better definition of the

probability of extreme outcomes ± that is, provided that the chosen parametric

distribution is close to the correct one. In our study, the data available on P(D�) for E.

coli O157 in cattle were sparse and maximum-likelihood techniques provided us with a

beta distribution that did not provide a good fit to the observed data. However, our

simulation results showed that model output was insensitive to the choice between the

beta and empirical distributions. Similarly, the outputs were not responsive to variation in

SENS suggesting that improvement in the ability to detect E. coli O157 would not

improve the results of herd-testing using the protocol described here. We included this

evaluation to demonstrate that in some instances it may not be necessary to expend

resources to obtain an accurate definition of a specific input variable when that variable

has little influence on the distribution of outputs.

Stochastic simulation modelling has utility in quantitative risk assessment where it is

highly desirable to allow for variability and uncertainty in input variables. In food-safety

206 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 21: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

and import-export risk assessment, it is likely that these models will incorporate the

notion of testing of animals derived from clusters (herds, regions or pens) so that the

cluster may be assigned a disease status which has relevance to the remainder of the

model. Such risk models, therefore, require assumptions about the distribution either of

HSENS and HSPEC or, possibly, HPPV and HNPV. The model described here can supply

these assumptions and is suited to conditions where there is uncertainty about factors

such as SENS and SPEC (a circumstance which is virtually inevitable in animal disease

testing). Output from the example simulations does confirm that herd-test parameters

should be described by distributions rather than as fixed values. Moreover, this model

calculates the rank-correlation between pairs of outputs as a measure of dependence. It is

important to know if dependence exists within these pairs of random variables; if they

were subsequently used as the input in a risk model, the extent of dependence should be

accounted for to avoid bias in model outputs (Vose, 1996). The advantages of Spearman's

rank correlation coefficient as a measure of dependence is that it is a non-parametric test

and it is possible to induce a specified rank correlation amongst stochastic input variables

(Iman and Conover, 1982).

Assumptions must be made in order to construct any model. The extent to which the

assumptions can be upheld dictates the strength of inferences that may be drawn from the

output. For example, at several points, we sampled from the binomial probability

distribution which assumes that each trial is independent and the probability of success

for each trial is identical. The independence of each trial might be questionable if there

are nested clusters (e.g. litters within herds). It is also feasible that the probability of

success is affected by individual-animal traits. As previously discussed, SENS, SPEC,

and P(D�) are notable candidates for this phenomenon. We have also assumed that no

dependences exist between the input variables in this model, and this might not be true

for all scenarios. In the E. coli O157 scenario, for example, there may well be a

relationship between the size of a beef herd and the probability that it is infected (since

large herds tend to be feedlots that purchase feeder cattle from many different sources).

We have also assumed throughout that random sampling is used. Avoidance of

assumptions such as these requires new assumptions and greater model complexity

(which are difficult to justify with existing knowledge).

In summary, the model described here provides a flexible method of estimating

the performance of a wide variety of herd testing protocols and negates the need to

supply fixed estimates of SENS, SPEC, P(D�) and P(HD�) which often cannot be

justified, either because they are uncertain or of varying quantities. When a herd

testing protocol for E. coli O157 was simulated, distribution estimates of herd-testing

parameters suggest that misclassification of herd infection and shedding status would

be common.

Acknowledgements

Financial assistance for this work was provided by the Meat Research Corporation of

Australia, the Ontario Ministry of Agriculture, Food and Rural Affairs, and New South

Wales Agriculture.

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 207

Page 22: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

References

Armstrong, G.L., Hollingsworth, J., Morris, J.G., 1996. Emerging foodborne pathogens: Escherichia coli

O157:H7 as a model of entry of a new pathogen into the food supply of the developed world. Epidemiol.

Rev. 18, 29±51.

Cameron, A.R., Baldock, F.C., 1998. A new probability formula for surveys to substantiate freedom from

disease. Prev. Vet. Med. 34, 1±17.

Cannon, R.M., Roe, R.T., 1982. Livestock Disease Surveys. A Field Manual For Veterinarians. Australian

Government Publishing Service, Canberra.

Carpenter, T.E., Gardner, I.A., 1996. Simulation modeling to determine herd-level predictive values and

sensitivity based on individual-animal test sensitivity and specificity and sample size. Prev. Vet. Med. 27,

57±66.

Chiodini, R.J., Van Kruiningen, H.J., Merkel, R.S., 1984. Ruminant paratuberculosis (Johne's Disease) the

current status and future prospects. Cornell Vet. 74, 218±262.

Donald, A., 1993. Prevalence estimation using diagnostic tests when there are multiple, correlated disease states

in the same animal or farm. Prev. Vet. Med. 15, 125±145.

Donald, A.W., Gardner, I.A., Wiggins, A.D., 1994. Cut-off points for aggregate herd testing in the presence of

disease clustering and correlation of test errors. Prev. Vet. Med. 19, 167±187.

Evans, M., Hastings, H., Peacock, B., 1993. Statistical Distributions, 2nd edn. John Wiley & Sons, New York.

Freund, J.E., 1992. Mathematical Statistics, 5th edn. Prentice Hall, NJ.

Garner, M.G., Gleeson, L.J., Holyoake, P.K., Cannon, R.M., Doughty, W.J., 1997. A national serological survey

to verify Australia's freedom from porcine reproductive and respiratory syndrome. Aust. Vet. J. 75, 596±

600.

Griffin, P.M., Tauxe, R.V., 1991. The epidemiology of infections caused by Escherichia coli O157:H7, other

enterohemorrhagic E. coli, and the associated hemolytic ureamic syndrome. Epidemiol. Rev. 13, 60±98.

Hancock, D.D., Rice, D.H., Herriot, D.E., Besser, T.E., Ebel, E.D., Carpenter, L.V., 1997a. Effects of farm

manure-handling practices on Escherichia coli O157 prevalence in cattle. J. Food Prot. 60, 363±366.

Hancock, D.D., Rice, D.H., Thomas, L.A., Dargatz, D.A., Besser, T.E., 1997b. Epidemiology of Escherichia coli

O157 in feedlot cattle. J. Food Prot. 60, 462±465.

Hoffman, F.O., Hammonds, J.S., 1994. Propagation of uncertainty in risk assessments: The need to

distinguish between uncertainty due to lack of knowledge and uncertainty due to variability. Risk Anal.

14, 707±712.

Iman, R.L., Conover, W.J., 1982. A distribution-free approach to inducing rank correlation among input

variables. Commun. Statist.-Simula. Comput. 11, 311±334.

Johnson, N.L., Kotz, S., 1969a. Discrete Distributions. John Wiley & Sons, New York.

Johnson, N.L., Kotz, S., 1969b. Discrete Distributions. John Wiley & Sons, New York, p. 73.

Jordan, D., 1996. Aggregate testing for the evaluation of Johne's disease herd status. Aust. Vet. J. 73, 16±19.

Kachitvichyanukul, V., Schmeiser, B.W., 1988. Binomial random variate generation. Commun. Assoc. Comput.

Mach. 31, 216±222.

Kraemer, H.C., 1992. Evaluating Medical Tests. Sage, Newbury Park, p. 92.

Law, A.M., Kelton, W.D., 1991. Simulation modelling and analysis, 2nd edn. McGraw-Hill, New York.

MacDiarmid, S.C., 1988. Future options for brucellosis surveillance in New Zealand beef herds. N.Z. Vet. J. 36,

39±42.

Martin, S.W., Shoukri, M., Thorburn, M.A., 1992. Evaluating the health status of herds based on tests applied to

individuals. Prev. Vet. Med. 14, 33±43.

Mechie, S.C., Chapman, P.A., Siddons, C.A., 1997. A fifteen month study of Escherichia coli O157:H7 in a

dairy herd. Epidemiol. Infect. 118, 17±25.

Morgan, M.G., Henrion, M., 1990. Uncertainty, a Guide to Dealing with Uncertainty in Qualitative Risk and

Policy Analysis. Cambridge University Press, New York, p. 47.

Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterlin, W.T., 1989. Numerical Recipes in Pascal. Cambridge

University Press, Cambridge.

Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T., 1990. Numerical Recipes in C: The Art of

Scientific Computing, 2nd ed. Cambridge University Press, Cambridge.

208 D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209

Page 23: Herd-level test performance based on uncertain …...In combination with poor likelihood ratios for positive and negative herd tests, results indicate that the testing protocol investigated

Rogan, W.J., Gladen, B., 1978. Estimating the prevalence from the results of a screening test. Am. J. Epidemiol.

107, 71±76.

Sacks, J.M., Bolin, S., Crowder, S.V., 1989. Prevalence estimation from pooled samples. Am. J. Vet. Res. 50,

205±206.

Sanderson, M.W., Gay, J.M., Hancock, D.D., Gay, C.C., Fox, L.K., Besser, T.E., 1995. Sensitivity of

bacteriologic culture for detection of Escherichia coli O157:H7 in bovine feces. J. Clin. Microbiol. 33,

2616±2619.

Steel, R.G., Torrie, J.H., 1980. Principles and Procedures of Statistics, 2nd edn. McGraw Hill, New York.

Vose, D., 1996. Quantitative Risk Analysis. John Wiley, Chichester, UK.

Wang, G., Zhao, T., Doyle, M.P., 1996. Fate of enterohemorrhagic Escherichia coli O157:H7 in bovine feces.

Appl. Environ. Microbiol. 62, 2567±2570.

Wilson, J.B., Clarke, R.C., Renwick, S.A., Rahn, K., Johnson, R.P., Karmali, M.A., Lior, H., Alves, D., Gyles,

C.L., Sandhu, K.S., McEwen, S.A., Spika, J.S., 1996. Vero cytotoxigenic Escherichia coli infection in dairy

farm families. J. Infect. Dis. 174, 1021±1027.

D. Jordan, S.A. McEwen / Preventive Veterinary Medicine 36 (1998) 187±209 209