
© 1995 American Statistical Association and the American Society for Quality Control. TECHNOMETRICS, MAY 1995, VOL. 37, NO. 2

A Two-Component Model for Measurement Error in Analytical Chemistry

David M. ROCKE
Graduate School of Management
University of California, Davis, CA 95616

Stefan LORENZATO
Division of Water Quality
California State Water Resources Control Board, Sacramento, CA 94244-2130

In this article, we propose and test a new model for measurement error in analytical chemistry. Often, the standard deviation of analytical errors is assumed to increase proportionally to the concentration of the analyte, a model that cannot be used for very low concentrations. For near-zero amounts, the standard deviation is often assumed constant, which does not apply to larger quantities. Neither model applies across the full range of concentrations of an analyte. By positing two error components, one additive and one multiplicative, we obtain a model that exhibits sensible behavior at both low and high concentration levels. We use maximum likelihood estimation and apply the technique to toluene by gas-chromatography/mass-spectrometry and cadmium by atomic absorption spectroscopy.

KEY WORDS: Atomic absorption spectroscopy (AAS); Coefficient of variation; Detection limit; Gas-chromatography/mass-spectrometry (GC/MS); Maximum likelihood; Quantitation level.

1. INTRODUCTION: MEASUREMENT NEAR THE DETECTION LIMIT

Traditionally, the description of the precision of an analytical method is accomplished by applying two separate models, one for describing zero and near-zero concentrations of the analyte (the compound that the analytical method is designed to measure) and another for quantifiable amounts. This traditional approach leaves a gray area of analytical responses in which the precision of the measurements cannot be determined. The model governing quantifiable amounts assumes that the likely size of the analytical error is proportional to the concentration. If this model is applied to analytical responses in the gray area, then there is an implicit assumption that the analytical error becomes vanishingly small as the measurements approach 0. From long experience, this assumption appears to be invalid. Similarly, if the zero-quantity model is applied, there is an implicit assumption that the absolute size of the analytical error is unrelated to the amount of material being measured. Based on similar empirical information, this assumption also cannot be supported. The new model presented in this article resolves these difficulties by providing an estimate of analytical precision that varies between the two extremes described by the traditional models. The model provides a distinct advantage over existing methods by describing the precision of measurements across the entire usable range. Examples are given of two different analytical methods, an atomic absorption spectroscopy analysis for cadmium and a gas-chromatography/mass-spectrometry analysis for toluene,

both of which support the validity of the new model. The new model is applicable to a wide variety of situations, including nonlinear calibration and added-standards calibration. Discussion is provided on the application of the new model to some common issues such as determination of detection limits, characterization of single samples, and determination of sample size required for inference to given tolerances.

Many measurement technologies have errors whose size is roughly proportional to the concentration; this is often true over wide ranges of concentration (Caulcutt and Boddy 1983). One common way to describe this constant coefficient of variation (CV) model is that the measured concentration x is given by

log(x) = log(μ) + η   (1.1)

or

x = μe^η,   (1.2)

where μ is the true concentration and η is a normally distributed analytical error with mean 0 and standard deviation σ_η. This model is widely used, but it fails to make sense for very low concentrations because it implies absolute errors of vanishingly small size. On the other hand, one often considers the case of analyzing blanks (samples of zero concentration) and data near the detection limit by the model

x = μ + ε   (1.3)

with normally distributed analytical error ε. This is used for calibration with μ = 0 and for the determination of



detection limits (Massart, Vandeginste, Deming, Michotte, and Kaufman 1988). It is, however, a bad approximation over wider ranges of concentration because it implies that the absolute size of the error does not increase with the concentration, and this is rarely true of analytical methods.

The solution proposed here is a combined model that reflects two types of errors. For example, in a gas-chromatography/mass-spectrometry (GC/MS) analysis, one type of error is in the generation and measurement of peak area, which will generally have errors of size proportional to the concentration. The other type of error comes from the fact that, even when fed with a blank sample, the output is not a flat line but still retains some small variation. There are many sources of error in such an analysis; the idea here is merely to classify them into two types, additive and multiplicative. The model proposed is

x = μe^η + ε,   (1.4)

where there are two analytical errors, η ~ N(0, σ_η^2) and ε ~ N(0, σ_ε^2), each normally distributed with mean 0. Here η represents the proportional error that is exhibited at (relatively) high concentrations and ε represents the additive error that is shown primarily at small concentrations. Another way of writing this model is via a linear calibration curve as

y = α + βμe^η + ε,   (1.5)

where y is the observed measurement (such as peak area in a GC). Note that the new model approximates a constant standard deviation model for very low concentrations and approximates a constant CV model for high concentrations.

The approach proposed here should be contrasted with an alternative method, which is to model the standard deviation as a linear function of the mean concentration. The latter approach should work well over restricted ranges but suffers from some disadvantages when used over a wide range. First, the predicted standard deviation at zero concentration need bear little resemblance to the measured value when one regresses the standard deviation of replicates on the mean of the replicates. The proposed new model allows the data near 0 to determine the predicted standard deviation near 0 and the data for large concentrations to determine the standard error for large concentrations. Second, the new model allows the errors at large concentrations to be lognormal, rather than normal, which is in accord with much experience. Third, there is a more plausible physical mechanism for the existence of multiplicative and additive errors than there is for a standard deviation that is linear in the mean. A comparison of the predicted standard deviations from the two models is given in a later example.

If viewed in terms of the coefficient of variation, the picture is that large concentrations have a constant CV, whereas small concentrations have an increasing CV that tends to infinity as the concentration approaches 0. This

bears some relationship to what Hall and Selinger (1989) called the Horwitz trumpet (Horwitz 1982; Horwitz, Kamps, and Boyer 1980). There are substantial differences, however, that are discussed in Appendix A.

As an example, suppose that σ_ε = 1 part per billion (ppb) and σ_η = .1. Then the standard deviation of blanks is 1 ppb, so the detection limit (in one definition) might be set at 3σ_ε = 3 ppb. At this concentration, measurements have a standard deviation of 1.04 ppb (using the new model), barely above the value at zero concentration; this is a coefficient of variation of .35. One definition of the quantitation level is the concentration at which the coefficient of variation falls to .2, the practical quantitation level (Environmental Protection Agency 1989). Because σ_η = .1, the coefficient of variation is .1 for large concentrations. Using the new model, the critical level at which the CV is .2 can be found by solving (.2x)^2 = (.1x)^2 + (1)^2 for x, which yields x = 5.77. Thus, at 6 ppb, the Environmental Protection Agency's practical quantitation level (PQL) has been reached. This calculation is only possible because of the new model; neither of the standard models can be used to compute the PQL.

It has been stated that values below the PQL do not yield useful quantitative information about the concentration of the analyte (e.g., Massart et al. 1988). The incorrectness of this idea has been pointed out (e.g., ASTM D4210 [ASTM 1987]). The new model more clearly demonstrates the usefulness of measurements below quantitation levels. The proposed model allows determination of the error structure of an analytical method near the detection limit and so provides an easy way to give precisions for such measurements, as well as the average of several such.
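The arithmetic in this example, and the model's constant-SD/constant-CV behavior at the two extremes, can be checked with a short simulation. This is an illustrative sketch, not the authors' code; the parameter values σ_ε = 1 ppb and σ_η = .1 are taken from the example above.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
sigma_eps, sigma_eta = 1.0, 0.1   # values from the example above

def measure(mu, n=200_000):
    """Simulate model (1.4): x = mu * exp(eta) + eps."""
    return mu * np.exp(rng.normal(0, sigma_eta, n)) + rng.normal(0, sigma_eps, n)

# Roughly constant SD at low concentration, constant CV at high concentration:
for mu in (0.0, 3.0, 1000.0):
    sd = measure(mu).std()
    print(f"mu = {mu:6.1f} ppb   sd = {sd:7.3f}   cv = {sd / mu if mu else float('nan'):.3f}")

# Detection limit (one definition) and the EPA practical quantitation level:
print("detection limit:", 3 * sigma_eps, "ppb")
pql = math.sqrt(sigma_eps**2 / (0.2**2 - sigma_eta**2))   # solves (.2x)^2 = (.1x)^2 + 1
print("PQL:", round(pql, 2), "ppb")   # 5.77, i.e., about 6 ppb
```

The simulated SD at 3 ppb should come out near 1.04 ppb, and the CV near 1000 ppb should settle close to σ_η, as described in the text.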

2. ESTIMATION

The approach we used is based on maximum likelihood estimation. An observed value x differs from the theoretical value μ because of the two errors η and ε, which are not directly observed. Any combination of η and ε that satisfies ε = x - μe^η is possible. Consequently, the likelihood associated with a set (σ_η, σ_ε) of parameters given a set of n measurements x_i with known concentrations μ_i is

∏_{i=1}^{n} ∫_{-∞}^{∞} (2π σ_η σ_ε)^{-1} e^{-η^2/(2σ_η^2)} e^{-(x_i - μ_i e^η)^2/(2σ_ε^2)} dη.   (2.1)

Maximizing this likelihood leads to estimates of the necessary parameters σ_η and σ_ε. More complex models, such as the estimation of a calibration curve, can be estimated in the same fashion, using maximum likelihood. For example, the calibration model (1.5) has likelihood

∏_{i=1}^{n} ∫_{-∞}^{∞} (2π σ_η σ_ε)^{-1} e^{-η^2/(2σ_η^2)} e^{-(y_i - α - βμ_i e^η)^2/(2σ_ε^2)} dη.   (2.2)

Once estimates σ̂_η and σ̂_ε have been derived, the precision of any measured value in the form (1.4) is (using the



formula for the variance of a lognormal random variable)

σ_ε^2 + μ^2 e^{σ_η^2}(e^{σ_η^2} - 1),   (2.3)

which can be estimated by substituting estimated values for the parameters. This formula illustrates the key feature of the new model: when μ is small, the error is nearly constant in size; when μ is large, the size of the error is roughly proportional to μ. In the calibration form (1.5), the estimated concentration is

μ̂ = β̂^{-1}(y - α̂) = β̂^{-1}(α - α̂ + βμe^η + ε).   (2.4)

If the covariance matrix of (α̂, β̂) is V, then a straightforward delta-method calculation shows that the variance of μ̂ is

gᵀVg + σ_ε^2/β^2 + μ^2 e^{σ_η^2}(e^{σ_η^2} - 1),   (2.5)

where g = (-β^{-1}, -μβ^{-1}). Note that, when sample sizes are sufficiently large that α̂ and β̂ can be regarded as known, the gᵀVg term disappears, and this basically coincides with (2.3).

The computational approach used is outlined in Appendix B. From the usual maximum likelihood estimation theory, the estimates of the parameters are normal with asymptotic variances given by the negative inverse of the information matrix, which we use for the matrix V mentioned previously when needed.

2.1 Elaborations of the Basic Model

The model we are developing will be of no practical use unless it accurately describes the behavior of a wide range of analytical data. If a test data set consists of replicated measurements at a variety of levels from 0 to (say) 500 ppb, then the precision at 0 will be σ_ε and the CV at 500 ppb will be approximately σ_η, so these two parameters can be chosen to fit almost any behavior at the extremes. The model, if correct, says much more. From these two parameters, the exact way in which the standard deviation gradually rises in the transition region can be completely predicted. Thus it would make sense to compare the precision at each level implied by the model and the parameter fits to the estimated precision from the replicates. If the model in its simplest form does not fit the data, then an elaborated form may be required. One example of the possible need for an elaboration is a nonlinear calibration function. With a nonlinear calibration function w(μ; θ), the model becomes

y = α + w(μ; θ)e^η + ε.   (2.6)

The parameters can be determined by maximum likelihood as before.

Another problem that may arise is that the variance may not depend on the mean for large concentrations in the way implied by the model. An elaboration to help resolve this problem is to specify that η ~ N(0, σ_η^2 V(μ)) (Carroll and Ruppert 1988). This allows for a different behavior than implied by lognormal errors.
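Whether in the basic form (2.1) or these elaborated forms, the fitting is the same numerical exercise: integrate out η for each observation and maximize over the parameters. A minimal sketch on simulated data follows; it is not the authors' implementation (the integral over η is evaluated here with adaptive quadrature split at the conditional mode, rather than the centered Gauss-Hermite rule of Appendix B), and all parameter values are illustrative.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Simulated replicates at known concentrations; true sigma_eta = 0.1, sigma_eps = 1.0
mus = np.repeat([0.0, 2.0, 5.0, 20.0, 100.0, 200.0], 4)
xs = mus * np.exp(rng.normal(0, 0.1, mus.size)) + rng.normal(0, 1.0, mus.size)

def neg_log_lik(theta):
    """Negative log of likelihood (2.1); theta holds log(sigma_eta), log(sigma_eps)."""
    s_eta, s_eps = np.exp(theta)
    total = 0.0
    for mu, x in zip(mus, xs):
        def f(eta):
            return np.exp(-eta**2 / (2 * s_eta**2)
                          - (x - mu * np.exp(eta))**2 / (2 * s_eps**2)) \
                   / (2 * np.pi * s_eta * s_eps)
        # Split the integration at the conditional mode so the (possibly very
        # narrow) peak of the integrand is not missed by the quadrature.
        eta0 = 0.0 if mu <= 0 or x <= 0 else float(np.clip(np.log(x / mu),
                                                           -6 * s_eta, 6 * s_eta))
        val, _ = quad(f, -8 * s_eta, 8 * s_eta, points=[eta0], limit=200)
        total -= np.log(val)
    return total

fit = minimize(neg_log_lik, x0=np.log([0.05, 0.5]), method="Nelder-Mead",
               options={"xatol": 1e-3, "fatol": 1e-3})
s_eta_hat, s_eps_hat = np.exp(fit.x)
print(s_eta_hat, s_eps_hat)   # should land near the true values (0.1, 1.0)
```

Working with log-parameters keeps both standard deviations positive during the search, mirroring the numerical-stability concern the authors note about starting values.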

A third potential elaboration is needed to address the case in which a labeled standard is used to calibrate measured values, which adjusts for recovery efficiency. Consider an analytical method to determine the concentration of a volatile organic such as toluene. One method of increasing accuracy is to spike the sample with a known concentration ν of deuterated toluene and determine the estimated concentration μ̂ of toluene and μ̂_d of deuterated toluene. One estimate of the true concentration of toluene is μ̂_adj = μ̂ν/μ̂_d, which adjusts for recovery. If

y = α + βμe^{η_1} + ε_1   (2.7)

and

y_d = α + βνe^{η_2} + ε_2,   (2.8)

then the precision of μ̂_adj depends on the precision of μ̂, the precision of μ̂_d, and their covariance. Using the delta method, we derive an approximate variance for μ̂_adj.


detection limit) and to require the quantitative recording of measurements even when they are below the individual observation detection limit.

3.2 Uncertainty of a Single Measurement

There are two primary approaches to this problem, an exact solution and a normal or lognormal approximation. In the exact solution, we construct a confidence interval for the true concentration by locating the points on either side of the measured value where a hypothesis test just rejects. If a measurement x is obtained, and a 95% confidence interval is desired, then we need to find values μ_L and μ_U such that

∫_{-∞}^{∞} (2πσ_η^2)^{-1/2} e^{-η^2/(2σ_η^2)} [1 - Φ((x - μ_L e^η)/σ_ε)] dη = .025   (3.1)

and

∫_{-∞}^{∞} (2πσ_η^2)^{-1/2} e^{-η^2/(2σ_η^2)} Φ((x - μ_U e^η)/σ_ε) dη = .025,   (3.2)

where Φ(·) is the standard normal distribution function. This can be done by numerical solution of these equations, which yields a 95% confidence interval (μ_L, μ_U).

An approximate method is based on the estimated variance of x given by

V(x) = σ_ε^2 + x^2 e^{σ_η^2}(e^{σ_η^2} - 1).   (3.3)

For low levels of x (those in which the first term dominates), the distribution of x is approximately normal. An approximate 95% confidence interval is formed as

x ± 1.96√V(x).   (3.4)
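Both approximate intervals, the normal form (3.4) for low levels and the lognormal form for high levels, can be wrapped in a small helper. This is a sketch with assumed parameter values; the crossover rule used here (whichever variance term of (3.3) dominates) follows the text's description.

```python
import math

sigma_eps, sigma_eta = 1.0, 0.1   # illustrative fitted parameter values

def ci95(x):
    """Approximate 95% CI for the true concentration given one measurement x."""
    g = math.exp(sigma_eta**2)
    prop = x**2 * g * (g - 1.0)              # proportional-error term of V(x), eq. (3.3)
    if prop < sigma_eps**2:                  # low level: normal interval, eq. (3.4)
        h = 1.96 * math.sqrt(sigma_eps**2 + prop)
        return x - h, x + h
    # high level: symmetric on the log scale, asymmetric on the original scale
    return x * math.exp(-1.96 * sigma_eta), x * math.exp(1.96 * sigma_eta)

print(ci95(2.0))    # near the detection limit: symmetric interval
print(ci95(50.0))   # well-quantified level: asymmetric interval
```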

For high levels of x (those in which the second term in V(x) dominates), ln(x) is approximately normally distributed with variance σ_η^2, so that a 95% confidence interval for μ is

(exp(ln(x) - 1.96σ_η), exp(ln(x) + 1.96σ_η)).   (3.5)

Note that this interval, although symmetric on the logarithmic scale, is asymmetric on the original measurement scale.

3.3 Uncertainty of an Average of Several Measurements

Here, the exact method of confidence-interval determination would be burdensome because it would require the computation of a difficult convolution, so confidence intervals must be based on the approximate normality of x̄ (for low levels) or of ln(x) (for high levels). For low levels, the average, x̄, of n measurements will be approximately normally distributed with standard deviation √(V(x)/n). For larger values of x, it will be better to perform inference using the logarithms of the data, which will be more nearly normally distributed. The average

Table 1. Cadmium Concentrations by AAS

Cadmium concentration (ppb)    Absorption (× 100)
 0          .0     -.7    -.1    -.6
 2.7784     5.5    5.9    6.1    6.1
 9.675     21.8   22.5   23.2   23.1
22.9716    53.4   53.6   50.9   53.8
31.7741    74.1   74.0   71.2   71.5
43.2067    94.6   99.6   99.4  101.1

of the logarithms of n measurements will have approximate standard deviation σ_η/√n. Some further work will be needed to determine when inference should be based on the measurements and when on the logarithms. Another possible technique to improve inference is to use a normalizing transformation that depends on the parameter values.

3.4 Determination of Sample Size

The approximate sample size required to determine a concentration to a particular precision depends on the concentration as well as the precision desired. Suppose that we wish to choose the number of replicates r so that the chance of detecting a specified concentration that is above the safe level is sufficiently high. For example, suppose that the safe level is .1 ppb and the standard deviation parameters are σ_ε = .2 ppb and σ_η = .1. Suppose that it was held to be important to detect a concentration of .3 ppb. At that concentration, the standard deviation of an average of r replicates is

√{[σ_ε^2 + μ^2 e^{σ_η^2}(e^{σ_η^2} - 1)]/r} = √{[.04 + (.09)e^{.01}(e^{.01} - 1)]/r}   (3.6)
  = √(.0409/r)   (3.7)
  = .202/√r.   (3.8)

Using a normal approximation, the distance between the safe level .1 ppb and the concentration .3 ppb in standard-deviation units is

(.2/.202)√r = .989√r.   (3.9)
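This sample-size arithmetic can be checked with a short script; the parameter values are those of the example, and the 1.645 cutoff is the one-sided 95% normal quantile used in the detection-probability requirement discussed below.

```python
import math

# Parameter values from the example above (assumed known)
sigma_eps, sigma_eta = 0.2, 0.1   # ppb, dimensionless
safe, target = 0.1, 0.3           # ppb

g = math.exp(sigma_eta**2)
sd_one = math.sqrt(sigma_eps**2 + target**2 * g * (g - 1.0))   # per (3.6)-(3.8)
print(round(sd_one, 3))   # 0.202 ppb for a single replicate

# Smallest r with (target - safe) * sqrt(r) / sd_one >= 1.645 (95% chance of detection)
r = math.ceil((1.645 * sd_one / (target - safe))**2)
print(r)   # 3 replicates
```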

    Table 2. Standard Deviation ofAbsorption andLog-Absorption

    Cadmium Absorptionconcentration std. dev. Log-absorptionstd. dev.0 .351 -2.7784 ,283 .04aa9.675 .645 .028722.9716 1.360 .0260

    31.7741 1.564 .021543.2067 2.821 .02a9


Table 3. Initial Estimates From Pooled Data and Final Parameters by MLE for Cadmium Data

Parameter    Starting value    Final value
α      -.0963    -.3691
β      2.292     2.315
σ_η    .02511    .02507
σ_ε    .3188     .2970

For the chance of detection to be .95, we require .989√r > 1.645, or r > 2.77. In this case, the number of replicates should be at least 3.

4. EXAMPLES

In this section, we present two examples of analytical methods and examine the fit of the two-component model. The first example consists of graphite-furnace atomic absorption spectroscopy (AAS) measurements of cadmium, for known concentrations from 0 to 43 ppb, each quadruply replicated. The second example is GC/MS measurements of toluene from 4.6 picograms to 15 nanograms. For descriptions of these methods see Willard, Merritt, Dean, and Settle (1988). In both cases we use the calibration form (1.5) with likelihood (2.2).

4.1 Cadmium by Atomic Absorption Spectroscopy

The instrument used for these measurements was a Perkin-Elmer model 5500 graphite furnace atomic absorption spectrometer with a model HGA-500 furnace controller. The tubes were pyrolytically coated; injections were made onto L'vov platforms. A deuterium arc lamp was used for background correction. Table 1 shows the

Figure 1. Concentration of Cadmium Versus Absorption by AAS. The dots represent the measured values, the solid line is the calibration curve estimated by maximum likelihood using Model (1.5), and the dashed lines form an estimated prediction envelope using (3.4) and (3.5) as appropriate.

Figure 2. Actual and Predicted Standard Deviation for Cadmium by AAS. The dots represent the standard deviation of the replicates and the solid line is the predicted standard deviation using (3.4) and (3.5) as appropriate.

AAS measurements, and Table 2 gives the standard deviation of the replicate measurements of absorbance and the standard deviation of the logarithms of the replicates. Note that the measurements at the two lowest concentrations appear to have a constant standard deviation of the measured concentration and the measurements at the four highest concentrations appear to have a constant standard deviation of the measured log-concentration. Therefore, the data span the gray area of quantification. Starting values for the maximum likelihood estimator (MLE) procedure, which in this case are themselves quite good

Figure 3. Residuals for Cadmium Calibration. The dots represent the differences between measured values and the predicted values from Model (1.5) and the dashed lines form an estimated prediction envelope using (3.4) and (3.5) as appropriate.


Table 4. Toluene Amounts by GC/MS

Toluene amount (pg)    Peak area
    4.6       29.80    16.85    16.68    19.52
   23         44.60    48.13    42.27    34.78
  116        207.70   222.40   172.88   207.51
  580        894.67   821.30   773.40   936.93
 3,000      5350.65  4942.63  4315.79  3879.28
15,000     20718.14 24781.61 22405.76 24863.91

Table 6. Initial Estimates From Pooled Data and Final Parameters by MLE for Toluene Data

Parameter    Starting value    Final value
α      -1.6     11.51
β      1.546    1.524
σ_η    .10      .1032
σ_ε    6.0      5.698

estimates of the parameters, are shown in Table 3. Starting values for the regression parameters α and β were derived by ordinary linear regression of the absorbance measurements on the concentration. The starting value for the standard deviation σ_ε of zero measurements was estimated from the two lowest concentrations because there appears not to be an upward trend until the third highest concentration. A starting value for the standard deviation of the log absorbances for high concentrations, σ_η, which is approximately the coefficient of variation, was derived by pooling the highest four levels, where the CV appears constant. These initial estimates are of such high quality that they are scarcely changed by the MLE iterations. Final values are also shown in Table 3. Note that the calculation is not extremely sensitive to the starting value because the same optimum was arrived at from α = 0, β = 2, σ_η = .03, and σ_ε = .4, for example. Somewhat plausible values must be used, however, to avoid numerical instability.

The results are shown graphically in Figure 1, which plots the data, the calibration curve, and an estimated envelope; Figure 2, which shows the actual and predicted standard deviation; and Figure 3, which shows the residuals from the model fit.

If we now treat these estimated parameters as known, we can use the model for further analysis. For example, if the limit of detection is defined as the absorbance (and associated implied concentration) that falls three standard deviations above 0, we find that it lies at about .4 ppb.

Confidence intervals for concentration can be derived using the methods of Section 3.2. For example, with the parameters estimated for the cadmium data, an absorbance of 6 implies an estimated concentration of 2.75 ppb, with 95% confidence interval (2.47, 3.04). This is almost symmetric, with half-widths of .28 and .29. An absorbance of 50 implies an estimated concentration of 21.76 ppb, with 95% confidence interval (20.69, 22.88) and half-widths 1.07 and 1.12. A normal approximation using Equation (2.3) yields confidence intervals of (2.49, 3.01) for an absorbance of 6 and (21.23, 22.29) for an absorbance of 50. The approximation is quite acceptable for the low absorbance but not for the higher one. For the absorbance of 50, an acceptable approximation is gained by using the approximate lognormality of the measurements at high levels, leading to a 95% confidence interval of (20.72, 22.85). Note, however, that there is no satisfactory alternative to the exact confidence interval provided by the new model that is accurate over the entire range of measurements.

4.2 Toluene by GC/MS

The instrument used here was a Trio-2 GC/MS (VG Masslab) with electron ionization at 70 electron volts with a 30-meter DB-5 GC column. Table 4 shows an analysis of amount of toluene by GC/MS for known amounts of from 4.6 picograms to 15 nanograms in 100 μL of extract. (1 picogram in 100 μL corresponds to a concentration of .01 ppb.) The quantitation is done by peak area

Table 5. Standard Deviation of Peak Area and Log Peak Area

Toluene amount    Peak area std. dev.    Log peak area std. dev.
     4.6        6.20    .2719
    23.0        5.65    .1387
   116.0       21.02    .1080
   580.0       73.19    .0858
 3,000.0      652.98    .1427
15,000.0     2005.02    .0878

Figure 4. Concentration of Toluene Versus Peak Area by GC/MS. The dots represent the measured values, the solid line is the calibration curve estimated by maximum likelihood using Model (1.5), and the dashed lines form an estimated prediction envelope using (3.4) and (3.5) as appropriate. Note the log-log scale.


Figure 5. Actual and Predicted Standard Deviation for Toluene by GC/MS. The dots represent the standard deviation of the replicates and the solid line is the predicted standard deviation using (3.4) and (3.5) as appropriate. Note the log-log scale.

at m/z 91. The relationship between amount and peak area is satisfactorily linear, and the behavior of the errors is generally consistent with the new model, as shown in Table 5. Starting values were derived from ordinary linear regression and from an examination of the standard deviations of the raw and logged data. The MLE estimation gives apparently acceptable results, as shown in Table 6.

The results are shown graphically in Figure 4, which plots the data, the calibration curve, and an estimated envelope; Figure 5, which shows the actual and predicted

Figure 6. Residuals for Toluene Calibration. The dots represent the ratios between measured values and the predicted values from Model (1.5) and the dashed lines form an estimated prediction envelope using (3.4) and (3.5) as appropriate. Note the log-log scale.

Table 7. Predicted Standard Deviations From New Model and From Linear Regression of the Standard Deviation on the Mean

Mean peak area    Peak area std. dev.    Predicted std. dev. from (2.5)    Predicted std. dev. from linear regression
   20.7       6.20       5.74      46.60
   42.4       5.65       6.76      48.48
  202.6      21.02      19.25      62.29
  856.6      73.19      92.13     118.68
 4622.1     652.98     475.65     443.37
23192.4    2005.02    2378.08    2044.64

standard deviation; and Figure 6, which shows the residuals from the model fit. These plots are all on a log-log scale.

A comparison is given in Table 7 between the predicted standard deviation of the peak area from the model developed in this article and the predicted standard deviation from a variance function in which the standard deviation is a linear function of the mean. Both fit well for high concentrations, but the linear function is quite inaccurate for lower concentrations. For these large ranges, it appears that the new model fits the behavior of the data better.
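As a check on the ordinary-least-squares starting values quoted for the toluene fit, the regression of peak area on amount can be recomputed from the replicate data in Table 4 (a sketch, not the authors' code):

```python
import numpy as np

# Replicate peak areas from Table 4 (toluene amount in pg)
amounts = np.repeat([4.6, 23, 116, 580, 3000, 15000], 4)
areas = np.array([29.80, 16.85, 16.68, 19.52,
                  44.60, 48.13, 42.27, 34.78,
                  207.70, 222.40, 172.88, 207.51,
                  894.67, 821.30, 773.40, 936.93,
                  5350.65, 4942.63, 4315.79, 3879.28,
                  20718.14, 24781.61, 22405.76, 24863.91])

# Ordinary least squares for the calibration line y = alpha + beta * mu
beta, alpha = np.polyfit(amounts, areas, 1)
print(alpha, beta)   # about -1.6 and 1.546, matching Table 6's starting values
```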

5. CONCLUSION

In this article we have provided a new model for analytical error that behaves like a constant standard deviation model at low concentrations and like a constant CV model at high concentrations. The importance of this new model is that it provides a reliable way to estimate the precision of measurements that are near the detection limit so that they can be used in inference and regulation. Although the illustrations used in the article are for linear calibration analyses, the model itself is flexible enough to be used in a wide variety of situations, including nonlinear calibration, censored data, and added-standards methods.

ACKNOWLEDGMENTS

We are grateful to A. Russell Flegal, Institute of Marine Sciences, University of California, Santa Cruz, and A. Daniel Jones, Facility for Advanced Instrumentation, University of California, Davis, for providing the data used as examples in this article, and to the editor, an associate editor, and two referees for suggestions that substantially improved the article. The research reported in this article was supported in part by the California State Water Resources Control Board, by National Institute of Environmental Health Sciences, National Institutes of Health Grant P42 ES04699, and by National Science Foundation Grant DMS-9301344.

APPENDIX A: THE HORWITZ TRUMPET

Horwitz (1982; Horwitz et al. 1980) examined over 150 independent Association of Official Analytical Chemists interlaboratory collaborative studies covering numerous


    topics from drug preparations and pesticide formulations(at the high concentration end) to aflatoxin (at the lowend). Over these studies, there was a pattern to the inter-laboratory CV that Hall and Selinger (1989) described asthe Horwitz trumpet. The average CV of the interlabo-ratory errors increased from a few percent for concentra-tions in the double-digit percentage range, to about 50%for concentrations in the 1 ppb range. Horwitz suggestedan empirical formula relating the concentration F to thecoefficient of variation CV(p) [and therefore the precision0 (p)] as follows:

    to describe the way in which the size of the errors fromanalytical methods change as the method itself and theintended range of concentration changes. The latter, thefocus of this article, allows better estimation of quantita-tion error for a given analytical method for measurementswithin a few multiples of the detection limit.

    APPENDIX B: COMPUTATIONAL ISSUESIN ESTIMATION

    andCV(p) = ,006~~. (A.11

    a(p) = .oo6pcL.s. 64.2)After an analysis of Horwitzs data using a binomial modelwith variable apparent sample size, Hall and Selinger sug-gested a similar formula:

CV(μ) = 0.02 μ^{−0.1505}    (A.3)

and

σ(μ) = 0.02 μ^{0.8495}.    (A.4)

Both the Horwitz trumpet and the present article suggest an increase in the CV at low concentrations; however, there are several important differences. First, Horwitz was interested in the variation from laboratory to laboratory, whereas the model in this article is intended for intralaboratory precision (there is no reason why it should not apply also across laboratories, however). Second, and more important, Horwitz was describing how the precision changes when one moves from one analytical method intended for a certain range of concentrations to another method intended for a different range. The two-component error model is for a single analytical method across its useful range. There is no reason why one model cannot be used within methods and a different one between methods. In the language of this article, Horwitz had a model for how σ_η changes with the analytical method, but the two-component model describes how the precision changes within a given method as the concentration approaches the detection limit of that method.

It is of particular note that the Horwitz trumpet cannot be used to serve the purpose of this article, describing the transition of the error structure in a given analytical method as the concentration changes from high levels to those near the detection limit. This is because the Horwitz model, in both its original form and in Hall and Selinger's emendation, implies that the standard deviation of a measurement at low levels approaches 0. In particular, if used inappropriately to describe the interlaboratory error under zero concentration, this model would imply zero error. In fact, most laboratories might report the compound as not detected, but this is a far cry from zero error.

In summary, both the Horwitz trumpet and the two-component error model have a proper place in understanding the errors of analytical methods. The former is useful for describing how interlaboratory precision varies from method to method and range to range; the latter describes the precision of a single method across its usable range.

Actually performing the maximization of the likelihood requires the numerical optimization of a numerically integrated likelihood. For the numerical integration, we used a Gauss-Hermite rule with centering depending on a second-order Taylor approximation to the logarithm of the integrand. This led to highly accurate results. Specifically, we need to approximate

∫_{−∞}^{∞} [1/(2π σ_η σ_ε)] e^{−η²/(2σ_η²)} e^{−(x − μ e^η)²/(2σ_ε²)} dη.    (A.5)

Gauss-Hermite integration is a form of Gaussian integration for use with a kernel of the form w(x) = exp(−x²/2) (Davis and Rabinowitz 1984). One approximates an integral

∫_{−∞}^{∞} f(x) w(x) dx    (A.6)

by

Σ_{i=1}^{m} w_i f(x_i),    (A.7)

where the Gaussian points x_i and weights w_i can be obtained from a variety of computer programs (Davis and Rabinowitz 1984; IMSL Library 1989; NAG Library 1987). This is exact when f(x) is a polynomial of degree 2m − 1 or less.

Equation (A.5) can be transformed to resemble (A.6) by a linear transformation. In fact, we transform (A.5) so that it resembles w(x) = exp(−x²/2) by matching the first two derivatives of the logarithm of the integrand at the mode. A function like

g(x) = e^{−(x − x₀)²/(2k²)}    (A.8)

satisfies

d log(g(x))/dx = −(x − x₀)/k²    (A.9)

and

d² log(g(x))/dx² = −1/k².    (A.10)

Then x₀ is a zero of d log(g(x))/dx, and

k = [−d² log(g(x))/dx²]^{−1/2}.    (A.11)

Thus, if we match the integrand of (A.5) in this fashion and transform using η* = (η − η₀)/k, we approximate the form of the kernel.

    TECHNOMETRICS, MAY 1995, VOL. 37, NO. 2



    184 DAVID M. ROCKE AND STEFAN LORENZATO

Now the log of the integrand of (A.5) is

h(η) = −log(2π σ_η σ_ε) − η²/(2σ_η²) − (x − μ e^η)²/(2σ_ε²).    (A.12)

Then

h′(η) = −η/σ_η² + μ e^η (x − μ e^η)/σ_ε²    (A.13)

and

h″(η) = −1/σ_η² + μ e^η (x − μ e^η)/σ_ε² − μ² e^{2η}/σ_ε².    (A.14)

We find a root of Equation (A.13) numerically and use the associated transformation to approximate the integral. This procedure results in fairly quick calculation of an accurate approximation of the likelihood for use in maximum likelihood estimation. In the calculations in this article, we use a 12-point approximation. The numerical optimization was performed with a quasi-Newton method using BFGS rank-two updates and a trust-region approach (Dennis and Schnabel 1983). The implementation used was the IMSL routine DUMINF (1989). Starting values for the iterations when the model is in the simple form (1.4) can be derived from two simple calculations. First, σ_ε can be estimated by the standard deviation of zero or near-zero replicates. Then, σ_η can be estimated by the standard deviation of the logarithms of replicates at some high concentration. For the calibration form (1.5), initial estimates of α and β can be derived from linear regression of the response on the log concentration.
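The whole recipe — Newton's method on (A.13)/(A.14) to locate the mode η₀, the rescaling k from (A.10)–(A.11), and a 12-point Gauss–Hermite sum — can be sketched as follows. This is an illustrative reimplementation in Python/NumPy, not the authors' original IMSL code; the function name, argument names (`sigma_eta` for σ_η, `sigma_eps` for σ_ε), and the parameter values in the usage note are ours.

```python
import numpy as np

def two_component_density(x, mu, sigma_eta, sigma_eps, m=12):
    """Approximate the density (A.5) of the two-component model
    x = mu*exp(eta) + eps, eta ~ N(0, sigma_eta^2), eps ~ N(0, sigma_eps^2),
    by Gauss-Hermite quadrature recentered and rescaled at the mode."""
    # Log-integrand and its derivatives, equations (A.12)-(A.14)
    def h(e):
        return (-np.log(2 * np.pi * sigma_eta * sigma_eps)
                - e**2 / (2 * sigma_eta**2)
                - (x - mu * np.exp(e))**2 / (2 * sigma_eps**2))
    def h1(e):
        return -e / sigma_eta**2 + mu * np.exp(e) * (x - mu * np.exp(e)) / sigma_eps**2
    def h2(e):
        return (-1 / sigma_eta**2
                + mu * np.exp(e) * (x - mu * np.exp(e)) / sigma_eps**2
                - mu**2 * np.exp(2 * e) / sigma_eps**2)

    # Newton iteration for the mode eta0, a root of (A.13)
    eta0 = 0.0
    for _ in range(100):
        step = h1(eta0) / h2(eta0)
        eta0 -= step
        if abs(step) < 1e-12:
            break

    # Rescaling constant, equations (A.10)-(A.11)
    k = 1.0 / np.sqrt(-h2(eta0))

    # 12-point probabilists' Gauss-Hermite rule, kernel exp(-t^2/2);
    # after eta = eta0 + k*t, multiply and divide by the kernel.
    t, w = np.polynomial.hermite_e.hermegauss(m)
    return k * np.sum(w * np.exp(h(eta0 + k * t) + t**2 / 2))
```

As a sanity check: when x = μ and σ_η is very small, the model is nearly Normal(μ, σ_ε²), so the returned value should be close to 1/(√(2π) σ_ε); for moderate σ_η, the density should integrate to 1 over x.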

[Received June 1992. Revised July 1994.]

REFERENCES

ASTM (1987), "Standard 4210," in American Society for Testing and Materials 1987 Book of Standards, Philadelphia: Author.

Carroll, R. J., and Ruppert, D. (1988), Transformation and Weighting in Regression, New York: Chapman and Hall.

Caulcutt, R., and Boddy, R. (1983), Statistics for Analytical Chemists, London: Chapman and Hall.

Davis, P. J., and Rabinowitz, P. (1984), Methods of Numerical Integration (2nd ed.), Orlando, FL: Academic Press.

Dennis, J. E., and Schnabel, R. B. (1983), Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Englewood Cliffs, NJ: Prentice-Hall.

Environmental Protection Agency (1989), "Proposed Rule: Method Detection Limits and Practical Quantitation Levels," Federal Register, 54:97, 22,100-22,104.

Hall, P., and Selinger, B. (1989), "A Statistical Justification Relating Interlaboratory Coefficients of Variation With Concentration Levels," Analytical Chemistry, 61, 1465-1466.

Horwitz, W. (1982), "Evaluation of Analytical Methods Used for Regulation of Food and Drugs," Analytical Chemistry, 54, 67A-76A.

Horwitz, W., Kamps, L. R., and Boyer, K. W. (1980), "Quality Assurance in the Analysis of Foods for Trace Constituents," Journal of the Association of Official Analytical Chemists, 63, 1344-1354.

IMSL Library (1989), Houston, TX: IMSL Inc.

Massart, D. L., Vandeginste, B. G. M., Deming, S. N., Michotte, Y., and Kaufman, L. (1988), Chemometrics: A Textbook, Amsterdam: Elsevier.

NAG Library (1987), Oxford, U.K.: Numerical Algorithms Group, Ltd.

Willard, H. H., Merritt, L. L., Jr., Dean, J. A., and Settle, F. A., Jr. (1988), Instrumental Methods of Analysis (7th ed.), Belmont, CA: Wadsworth.
