
Quality of Analytical Procedures


Produced by: Natural Resources Management and Environment Department

Title: Guidelines for quality management in soil and plant laboratories. (FAO Soils ...


7 QUALITY OF ANALYTICAL PROCEDURES

7.1 Introduction
7.2 Calibration graphs
7.3 Blanks and Detection limit
7.4 Types of sample material
7.5 Validation of own procedures
7.6 Drafting an analytical procedure
7.7 Research plan
SOPs

7.1 Introduction

This chapter deals with the actual execution of the jobs for which the laboratory is intended. The most important part of this work is of course the analytical procedures, meticulously performed according to the corresponding SOPs. Relevant aspects include calibration, use of blanks, performance characteristics of the procedure, and reporting of results. An aspect of utmost importance to quality management, the quality control by inspection of the results, is discussed separately in Chapter 8.

All activities associated with these aspects are aimed at one target: the production of reliable data with a minimum of errors. In addition, it must be ensured that reliable data are produced consistently. To achieve this, an appropriate programme of quality control (QC) must be implemented. Quality control is the term used to describe the practical steps undertaken to ensure that errors in the analytical data are of a magnitude appropriate for the use to which the data will be put. This implies that the errors (which are unavoidably made) have to be quantified to enable a decision whether they are of an acceptable magnitude, and that unacceptable errors are discovered so that corrective action can be taken. Clearly, quality control must detect both random and systematic errors. The procedures for QC primarily monitor the accuracy of the work by checking the bias of data with the help of (certified) reference samples and control samples, and the precision by means of replicate analyses of test samples as well as of reference and/or control samples.

7.2 Calibration graphs

7.2.1 Principle
7.2.2 Construction and use
7.2.3 Error due to the regression line
7.2.4 Independent standards
7.2.5 Measuring a batch

7.2.1 Principle

Here, the construction and use of calibration graphs or curves in the daily practice of a laboratory will be discussed. Calibration of instruments (including adjustment) in the present context is also referred to as standardization. The confusion about these terms is mainly semantic and the terms calibration curve and standard curve are generally used interchangeably. The term "curve" implies that the line is not straight. However, the best (parts of) calibration lines are linear and, therefore, the general term "graph" is preferred.

For many measuring techniques calibration graphs have to be constructed. The technique is simple and consists of plotting the instrument response against a series of samples with known concentrations of the analyte (standards). In practice, these standards are usually pure chemicals dispersed in a matrix corresponding with that of the test samples (the "unknowns"). By convention, the calibration graph is always plotted with the concentration of the standards on the x-axis and the reading of the instrument response on the y-axis. The unknowns are determined by interpolation, not by extrapolation, so a suitable working range for the standards must be selected. In addition, in the present discussion it is assumed that the working range is limited to the linear range of the calibration graphs, that the standard deviation does not change over the range (neither of which is always the case*), and that the data are normally distributed. Non-linear graphs can sometimes be linearized in a simple way, e.g. by using a log scale (in potentiometry), but usually imply statistical problems (polynomial regression) for which the reader is referred to the relevant literature. It should be mentioned, however, that in modern instruments which make and use calibration graphs automatically these aspects sometimes go unnoticed.

* This is the so-called "unweighted" regression line. Because normally the standard deviation is not constant over the concentration range (it is usually least in the middle range), this difference in error should be taken into account. This would then yield a "weighted" regression line. The calculation of this is more complicated and information about the standard deviation of the y-readings has to be obtained. The gain in precision is usually very limited, but sometimes the extra information about the error may be useful.

Some common practices to obtain calibration graphs are:

1. The standards are made in a solution with the same composition as the extractant used for the samples (with the same dilution factor) so that all measurements are done in the same matrix. This technique is often practised when analyzing many batches where the same standards are used for some time. In this way an incorrectly prepared extractant or matrix may be detected (in blank or control sample).

2. The standards are made in the blank extract. A disadvantage of this technique is that for each batch the standards have to be pipetted. Therefore, this type of calibration is sometimes favoured when only one or a few batches are analyzed or when the extractant is unstable. A seeming advantage is that the blank can be forced to zero. However, an incorrect extractant would then more easily go undetected. The disadvantage of pipetting does not apply in case of automatic dispensing of reagents when equal volumes of different concentration are added (e.g. with flow-injection).

3. Less common, but useful in special cases, is the so-called standard additions technique. This can be practised when a matrix mismatch between samples and standards needs to be avoided: the standards are prepared from actual samples. The general procedure is to take a number of aliquots of sample or extract, add different quantities of the analyte to each aliquot (spiking) and dilute to the final volume. One aliquot is used without the addition of the analyte (blank). Thus, a standard series is obtained.
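Below is a minimal sketch, in Python, of how such a standard-additions series might be evaluated; the added concentrations and readings are invented for illustration, and the sample concentration follows from extrapolating the fitted line to zero response.

```python
# Sketch (not from the FAO text): evaluating a standard-additions series.
# The added concentrations and instrument readings below are invented.
import numpy as np

added = np.array([0.0, 0.2, 0.4, 0.6, 0.8])         # analyte added to each aliquot (mg/L)
signal = np.array([0.11, 0.19, 0.27, 0.36, 0.44])   # instrument response (e.g. absorbance)

b, a = np.polyfit(added, signal, 1)   # least-squares slope and intercept

# With standard additions the line is extrapolated to zero signal;
# the unspiked sample concentration equals intercept / slope.
c_sample = a / b
print(f"slope = {b:.3f}, intercept = {a:.3f}, sample concentration = {c_sample:.2f} mg/L")
# Any dilution applied when spiking still has to be corrected for separately.
```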

If calibration is involved in an analytical procedure, the SOP for this should include a description of the calibration sub-procedure and, if applicable, an optimization procedure (usually given in the instruction manual).

7.2.2 Construction and use

In several laboratories calibration graphs for some analyses are still adequately plotted manually and the straight line (or sometimes a curved line) is drawn with a visual "best fit", e.g. for flame atomic emission spectrometry or colorimetry. However, this practice is only legitimate when the random errors in the measurements of the standards are small: when the scattering is appreciable the line-fitting becomes subjective and unreliable. Therefore, if a calibration graph is not made automatically by a microprocessor of the instrument, the following more objective and also quantitatively more informative procedure is generally favoured.

The proper way of constructing the graph is essentially the performance of a regression analysis, i.e. the statistical establishment of a linear relationship between concentration of the analyte and the instrument response, using at least six points. This regression analysis (of reading y on concentration x) yields a correlation coefficient r as a measure for the fit of the points to a straight line (by means of least squares).

Warning. Some instruments can be calibrated with only one or two standards. Linearity is then implied but may not necessarily be true. It is useful to check this with more standards.

Regression analysis was introduced in Section 6.4.4 and the construction of a calibration graph was given as an example. The same example is taken up here (and repeated in part) but focused somewhat more on the application.

We saw that a linear calibration graph takes the general form:

y = bx + a (6.18; 7.1)

where:

a = intercept of the line with the y-axis
b = slope (tangent)


Ideally, the intercept a is zero: when the analyte is absent, no response of the instrument is to be expected. However, because of interactions, interferences, noise, contaminations and other sources of bias, this is seldom the case. Therefore, a can be considered as the signal of the blank of the standard series.

The slope b is a measure of the sensitivity of the procedure: the steeper the slope, the more sensitive the procedure, or, the stronger the instrument response yi to a concentration change in x (see also Section 7.5.3).

The correlation coefficient r can be calculated by:

r = Σ(xi - ¯x)(yi - ¯y) / √[Σ(xi - ¯x)² × Σ(yi - ¯y)²]   (6.19; 7.2)

where

xi = concentrations of standards
¯x = mean of concentrations of standards
yi = instrument responses to standards
¯y = mean of instrument responses to standards

The line parameters b and a are calculated with the following equations:

b = Σ(xi - ¯x)(yi - ¯y) / Σ(xi - ¯x)²   (6.20; 7.3)

and

a = ¯y - b¯x (6.21;7.4)

Example of calibration graph

As an example, we take the same calibration graph as discussed in Section 6.4.4.1 (Fig. 6-4): a standard series of P (0-1.0 mg/L) for the spectrophotometric determination of phosphate in a Bray-I extract ("available P"), reading in absorbance units. The data and calculated terms needed to determine the parameters of the calibration graph were given in Table 6-5. The calculations can be done on a (programmed) calculator or more conveniently on a PC using a home-made program or, even more conveniently, an available regression program. The calculations yield the equation of the calibration line (plotted in Fig. 7-1):

y = 0.626x + 0.037 (6.22; 7.5)

with a correlation coefficient r = 0.997. As stated previously (6.4.3.1), such high values are common for calibration graphs. When the value is not close to 1 (say, below 0.98) this must be taken as a warning and it might then be advisable to repeat or review the procedure. Errors may have been made (e.g. in pipetting) or the range of the graph used may not be linear. Therefore, to make sure, the calibration graph should always be plotted, either on paper or on a computer monitor.

Fig. 7-1. Calibration graph plotted from data of Table 6-5.
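A minimal sketch of this regression calculation is given below; the x and y values are illustrative only (not the actual Table 6-5 data), and any regression routine returning slope, intercept and r will do.

```python
# Minimal sketch of the regression described above, with illustrative data
# for a P standard series (not the Table 6-5 values).
import numpy as np

x = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])        # standard concentrations (mg/L)
y = np.array([0.03, 0.17, 0.29, 0.41, 0.54, 0.66])  # instrument readings (absorbance)

b, a = np.polyfit(x, y, 1)      # slope and intercept (Eq. 7.3 and 7.4)
r = np.corrcoef(x, y)[0, 1]     # correlation coefficient (Eq. 7.2)

print(f"y = {b:.3f}x + {a:.3f},  r = {r:.4f}")
if r < 0.98:                    # warning limit suggested in the text
    print("Warning: poor fit - check pipetting and linearity of the range")
```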


If linearity is in doubt the following test may be applied. Determine for two or three of the highest calibration points the relative deviation of the measured y-value from the calculated line:

deviation (%) = 100 × (yi - ŷi) / ŷi   (7.6)

- If the deviations are < 5% the curve can be accepted as linear.
- If a deviation is > 5%, the range is decreased by dropping the highest concentration.
- Recalculate the calibration line by linear regression.
- Repeat this test procedure until all deviations are < 5%.

When, as an exercise, this test is applied to the calibration curve of Fig. 7-1 (data in Table 6-3) it appears that the deviations of the three highest points are < 5%, hence the line is sufficiently linear.
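The test can be sketched as follows (illustrative readings, not the data of the example); the highest standard is dropped until the remaining top points deviate less than 5% from the fitted line.

```python
# Sketch of the linearity check described above.
import numpy as np

def linear_range(x, y, max_dev=0.05, n_check=3):
    x, y = np.asarray(x, float), np.asarray(y, float)
    while True:
        b, a = np.polyfit(x, y, 1)
        y_fit = b * x + a
        dev = np.abs(y - y_fit) / y_fit              # relative deviation (Eq. 7.6)
        if np.all(dev[-n_check:] < max_dev) or len(x) <= 3:
            return x, y, b, a
        x, y = x[:-1], y[:-1]                        # drop the highest concentration

# Invented readings whose top standard bends off:
x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
y = [0.03, 0.16, 0.29, 0.42, 0.55, 0.60]
xs, ys, b, a = linear_range(x, y)
print(f"linear up to {xs[-1]} mg/L: y = {b:.3f}x + {a:.3f}")
```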

During calculation of the line the maximum number of decimals is used; rounding off to the last significant figure is done at the end (see the instructions for rounding off in Section 8.2).

Once the calibration graph is established, its use is simple: for each y value measured for a test sample (the "unknown") the corresponding concentration x can be determined either by reading from the graph or by calculation using Equation (7.1), or x is automatically produced by the instrument.

7.2.3 Error due to the regression line

The "fitting" of the calibration graph is necessary because the actual response points

Page 6: Quality of Analytical Procedures

yi, composing the line usually do not fall exactly on the line. Hence, random errors areimplied. This is expressed by an uncertainty about the slope and intercept b and adefining the graph. A discussion of this uncertainty is given. It was explained there thatthe error is expressed by sy, the "standard error of the y-estimate" (see Eq. 6.23, aparameter automatically calculated by most regression computer programs.

This uncertainty about the ŷ-values (the fitted y-values) is transferred to the corresponding concentrations of the unknowns on the x-axis by the calculation using Eq. (7.1) and can be expressed by the standard deviation of the obtained x-value. The exact calculation is rather complex but a workable approximation can be calculated with:

sx = sy / b   (7.7)

Example

For each value of the standards x the corresponding ŷ is calculated with Eq. (7.5):

ŷi = 0.626 xi + 0.037   (7.8)

Then, sy is calculated using Eq. (6.23) or by computer:

Then, using Eq. (7.7):

Now, the confidence limits of the found results xf can be calculated with Eq. (6.9):

xf ± t·sx   (7.9)

For a two-sided interval and 95% confidence: ttab = 2.78 (see Appendix 1, df = n - 2 = 4). Hence all results in this example can be expressed as:

xf ± 0.08 mg/L

Thus, for instance, the result of a reading y = 0.22, using Eq. (7.5) to calculate xf = 0.29, can be reported as 0.29 ± 0.08 mg/L. (See also Note 2 below.)
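The calculation can be sketched as follows; sy = 0.018 is an assumed value chosen to be consistent with the ± 0.08 mg/L interval above, since the actual standard error comes from the regression output (Eq. 6.23).

```python
# Minimal sketch of the interval calculation above. s_y is an assumed value.
b, a = 0.626, 0.037      # calibration line, Eq. (7.5)
s_y = 0.018              # assumed standard error of the y-estimate (Eq. 6.23)
t_tab = 2.78             # two-sided, 95% confidence, df = n - 2 = 4 (Appendix 1)

s_x = s_y / b            # approximation of Eq. (7.7)

y_reading = 0.22
x_found = (y_reading - a) / b
print(f"x = {x_found:.2f} +/- {t_tab * s_x:.2f} mg/L")   # 0.29 +/- 0.08 mg/L
```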

The sx value used can only be approximate as it is taken as constant here, whereas in reality this is usually not the case. Yet, in practice, such an approximate estimation of the error may suffice. The general rule is that the measured signal is most precise (least standard deviation) near the centroid of the calibration graph (see Fig. 6-4). The confidence limits can be narrowed by increasing the number of calibration points. Therefore, the reverse is also true: with fewer calibration points the confidence limits of the measurements become wider. Sometimes only two or three points are used. This then usually concerns the checking and restoring of previously established calibration graphs, including those in the microprocessor or computer of instruments. In such cases it is advisable to check the graph regularly with more standards. Make a record of this in the file or journal of the method.

Note 1. Where the determination of the analyte is part of a procedure with several steps, the error in precision due to this reading is added to the errors of the other steps and as such included in the total precision error of the whole procedure. The latter is the most useful practical estimate of confidence when reporting results. As discussed in Section 6.3.4, a convenient way to do this is by using Equations (6.8) or (6.9) with the mean and standard deviation obtained from several replicate determinations (n > 10) carried out on control samples or, if available, taken from the control charts (see 8.3.2: Control Chart of the Mean). Most generally, the 95% confidence for single values x of test samples is expressed by Equation (6.10):

x ± 2s   (6.10; 7.10)

where s is the standard deviation of the mentioned large number of replicate determinations.

Note 2. The confidence interval of ± 0.08 mg/L in the present example is clearly not satisfactory and calls for inspection of the procedure. Particularly the blank seems to be (much) too high. This illustrates the usefulness of plotting the graph and calculating the parameters. Other traps to catch this error are the Control Chart of the Blank and, of course, the technician's experience.

7.2.4 Independent standards

It cannot be overemphasized that for QC a calibration should always include measurement of an independent standard or calibration verification standard at about the middle of the calibration range. If the result of this measurement deviates alarmingly from the correct or expected value (say > 5%), then inspection is indicated.

Such an independent standard can be obtained in several ways. Usually it is prepared from pure chemicals by another person than the one who prepared the actual standards. Obviously, it should never be derived from the same stock or source as the actual standards. If necessary, a bottle from another laboratory could be borrowed.

In addition, when new standards are prepared, the remainder of the old ones always has to be measured as a mutual check (include this in the SOP for the preparation of standards!).

7.2.5 Measuring a batch

After calibration of the instrument for the analyte, a batch of test samples is measured. Ideally, the response of the instrument should not change during measurement (drift or shift). In practice this is usually the case for only a limited period of time or number of measurements, and regular recalibration is necessary. The frequency of recalibration during measurement varies widely, depending on technique, instrument, analyte, solvent, temperature and humidity. In general, emission and atomizing techniques (AAS, ICP) are more sensitive to drift (or even sudden shift: by clogging) than colorimetric techniques. Also, the techniques of recalibration and possible subsequent action vary widely. The following two types are commonly practised.

1. Step-wise correction or interval correction

After calibration, at fixed places or intervals (after every 10, 15, 20, or more, test samples) a standard is measured. For this, often a standard near the middle of the working range is used (continuing calibration standard). When the drift is within acceptable limits, the measurement is continued. If the drift is unacceptable, the instrument is recalibrated ("resloped") and the previous interval of samples remeasured before continuing with the next interval. The extent of the "acceptable" drift depends on the kind of analysis but in soil and plant analysis usually does not exceed 5%. This procedure is very suitable for manual operation of measurements. When automatic sample changers are used, various options for recalibration and repeating intervals or whole batches are possible.

2. Linear correction or correction by interpolation

Here, too, standards are measured at intervals, usually together with a blank ("drift and wash"), and possible changes are processed by the computer software, which converts the past readings of the batch to the original calibration. Only in case of serious mishap are batches or intervals repeated. A disadvantage of this procedure is that drift is taken to be linear whereas this may not be so. Autoanalyzers, ICP and AAS with automatic sample changers often employ variants of this type of procedure.
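A minimal sketch of such a linear (interpolated) drift correction is given below; the drift-standard and sample readings are invented, and real instrument software will differ in detail.

```python
# Sketch (an assumed implementation, not from the text) of drift correction by
# interpolation: a drift standard measured at the start and end of an interval
# is used to rescale the sample readings in between to the original calibration.
import numpy as np

nominal = 0.50                                  # expected reading of the drift standard
drift_std = [0.50, 0.46]                        # readings at start and end of the interval
samples = np.array([0.31, 0.44, 0.27, 0.38])    # raw sample readings in between

positions = np.linspace(0, 1, len(samples) + 2)[1:-1]     # fractional position in the interval
expected = drift_std[0] + (drift_std[1] - drift_std[0]) * positions
corrected = samples * nominal / expected        # assumes the drift is linear over the interval
print(corrected.round(3))
```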

At present, instrument software is developing at a mushrooming rate. Many fancy new features with respect to resloping, correction of carryover, post-batch dilution and repeating are being introduced by manufacturers. Running ahead of this, many laboratories have developed their own interface software programs meeting their individual demands.

7.3 Blanks and Detection limit

7.3.1 Blanks
7.3.2 Detection limit

7.3.1 Blanks

A blank or blank determination is an analysis of a sample without the analyte or attribute, or an analysis without a sample, i.e. going through all steps of the procedure with the reagents only. The latter type is the most common, as samples without the analyte or attribute are often not available or do not exist.

Another type of blank is the one used for calibration of instruments, as discussed in the previous sections. Thus, we may have two types of blank within one analytical method or system:

- a blank for the whole method or system, and
- a blank for analytical subprocedures (measurements) as part of the whole procedure or system.

For instance, in the cation exchange capacity (CEC) determination of soils with the percolation method, two method or system blanks are included in each batch: two percolation tubes with cotton wool or filter pulp and sand or celite, but without sample. For the determination of the index cation (NH4 by colorimetry or Na by flame emission spectroscopy) a blank is included in the determination of the calibration graph. If NH4 is determined by distillation and subsequent titration, a blank titration is carried out for correction of test sample readings.

The proper analysis of blanks is very important because:

1. In many analyses sample results are calculated by subtracting blank readings from sample readings.

2. Blank readings can be excellent monitors in quality control of reagents, analytical processes, and proficiency.

3. They can be used to estimate several types of method detection limits.

For blanks the same rule applies as for replicate analyses: the larger the number, the greater the confidence in the mean. The widely accepted rule in routine analysis is that each batch should include at least two blanks. For special studies where individual results are critical, more blanks per batch may be required (up to eight).

For quality control, Control Charts are made of blank readings, identically to those of control samples. The between-batch variability of the blank is expressed by the standard deviation calculated from the Control Chart of the Mean of Blanks; the precision can be estimated from the Control Chart of the Range of Duplicates of Blanks. The construction and use of control charts are discussed in detail in 8.3. One of the main control rules of the control charts, for instance, prescribes that a blank value beyond the mean blank value plus 3× the standard deviation of this mean (i.e. beyond the Action Limit) must be rejected and the batch be repeated, possibly with fresh reagents.

In many laboratories, no control charts are made for blanks. Sometimes, analysts argue that 'there is never a problem with my blank, the reading is always close to zero'. Admittedly, some analyses are more prone to blank errors than others. This, however, is not a valid argument for not keeping control charts. They are made to monitor procedures and to give warning when these are out of control (shift) or tend to become out of control (drift). This can happen in any procedure in any laboratory at any time.

From the foregoing discussion it will be clear that signals of blank analyses generally are not zero. In fact, blanks may be found to be negative. This may point to an error in the procedure: e.g. for the zeroing of the instrument an incorrect or a contaminated solution was used, or the calibration graph was not linear. It may also be due to the matrix of the solution (e.g. extractant), and is then often unavoidable. For convenience, some analysts practice "forcing the blank to zero" by adjusting the instrument. Some instruments even invite or compel analysts to do so. This is equivalent to subtracting the blank value from the values of the standards before plotting the calibration graph. From the standpoint of Quality Control this practice must be discouraged. If zeroing of the instrument is necessary, the use of pure water for this is preferred. However, such general considerations may be overruled by specific instrument or method instructions. This is becoming more and more common practice with modern sophisticated hi-tech instruments. Whatever the case, a decision on how to deal with blanks must be made for each procedure and laid down in the SOP concerned.

7.3.2 Detection limit

In environmental analysis and in the analysis of trace elements there is a tendency to measure ever lower contents of analytes accurately. Modern equipment offers excellent possibilities for this. For proper judgement (validation) and selection of a procedure or instrument it is important to have information about the lower limits at which analytes can be detected or determined with sufficient confidence. Several concepts and terms are used, e.g. detection limit, lower limit of detection (LLD), and method detection limit (MDL). The latter applies to a whole method or system, whereas the former two apply to measurements as part of a method.

Note: In analytical chemistry, "lower limit of detection" is often confusedwith "sensitivity" (see 7.5.3).

Although various definitions can be found, the most widely accepted definition of the detection limit seems to be: 'the concentration of the analyte giving a signal equal to the blank plus 3× the standard deviation of the blank'. Because in the calculation of analytical results the value of the blank is subtracted (or the blank is forced to zero), the detection limit can be written as:

LLD, MDL = 3 × sbl (7.11)

At this limit it is 93% certain that the signal is not due to the blank but that the method has detected the presence of the analyte (this does not mean that below this limit the analyte is absent!).

Obviously, although generally accepted, this is an arbitrary limit and in some cases the 7% uncertainty may be too high (for 5% uncertainty the LLD = 3.3 × sbl). Moreover, the precision in that concentration range is often relatively low and the LLD must be regarded as a qualitative limit. For some purposes, therefore, a higher "limit of determination" or "limit of quantification" (LLQ) is defined as

LLQ = 2 × LLD = 6 × sbl (7.12)

or sometimes as

LLQ = 10 × sbl (7.13)

Thus, if one needs to know or report these limits of the analysis as quality characteristics, the mean of the blanks and the corresponding standard deviation must be determined (validation). The sbl can be obtained by running a statistically sufficient number of blank determinations (usually a minimum of 10, and not excluding outliers). In fact, this is an assessment of the "noise" of a determination.

Note: Noise is defined as the 'difference between the maximum and minimum values of the signal in the absence of the analyte measured during two minutes' (or otherwise according to the instrument instructions). The noise of several instrumental measurements can be displayed by using a recorder (e.g. FES, AAS, ICP, IR, GC, HPLC, XRFS). Although this is not often used to actually determine the detection limit, it is used to determine the signal-to-noise ratio (a validation parameter not discussed here) and is particularly useful to monitor noise in case of trouble shooting (e.g. suspected power fluctuations).

If the analysis concerns a one-batch exercise, 4 to 8 blanks are run in this batch. If it concerns an MDL as a validation characteristic of a test procedure used for multiple batches in the laboratory, such as a routine analysis, the blank data are collected from different batches, e.g. the means of duplicates from the control charts.

For the determination of the LLD of measurements where a calibration graph is used, such replicate blank determinations are not necessary since the value of the blank as well as the standard deviation result directly from the regression analysis (see Section 7.2.3 and Example 2 below).

Examples

1. Determination of the Method Detection Limit (MDL) of a Kjeldahl-N determination in soils

Table 7-1 gives the data obtained for the blanks (means of duplicates) in 15 successive batches of a micro-Kjeldahl N determination in soil samples. Reported are the millilitres of 0.01 M HCl necessary to titrate the ammonia distillate and the conversion to results in mg N by: reading × 0.01 × 14.

Table 7-1. Blank data of 15 batches of a Kjeldahl-N determination in soils for the calculation of the Method Detection Limit.

ml HCl mg N

0.12 0.0161

0.16 0.0217

0.11 0.0154

0.15 0.0203

0.09 0.0126

0.14 0.0189

0.12 0.0161

0.17 0.0238

0.14 0.0189

0.20 0.0273

0.16 0.0217

0.22 0.0308

0.14 0.0189

0.11 0.0154

0.15 0.0203

Mean blank: 0.0199

sbl: 0.0048

MDL = 3 × sbl =0.014 mg N
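The calculation of Table 7-1 can be reproduced with the short sketch below, using the mg N values as listed.

```python
# Sketch reproducing the MDL calculation of Table 7-1 (mg N column as listed).
import numpy as np

blanks_mg_n = np.array([0.0161, 0.0217, 0.0154, 0.0203, 0.0126,
                        0.0189, 0.0161, 0.0238, 0.0189, 0.0273,
                        0.0217, 0.0308, 0.0189, 0.0154, 0.0203])

mean_blank = blanks_mg_n.mean()
s_bl = blanks_mg_n.std(ddof=1)   # sample standard deviation of the blanks
mdl = 3 * s_bl                   # Eq. (7.11)
print(f"mean blank = {mean_blank:.4f} mg N, s_bl = {s_bl:.4f}, MDL = {mdl:.3f} mg N")
# With 1 g of sample this corresponds to 0.014 mg/g, i.e. 14 mg/kg.
```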


The MDL reported in this way is an absolute value. Results are usually reported as relative figures such as % or mg/kg (ppm). In the present case, if 1 g of sample is routinely used, then the MDL would be 0.014 mg/g or 14 mg/kg or 0.0014%.

Note that if one were to use only 0.5 g of sample (e.g. because of a high N content) the MDL as a relative figure is doubled!

When results are obtained below the MDL of this example they must be reported as '< 14 mg/kg' or '< 0.0014%'. Reporting '0 %' or '0.0 %' may be acceptable for practical purposes, but may be interpreted as the element being absent, which is not justified.

Note 1. There are no strict rules for reporting figures below the LLD or LLQ. Most important is that data can be correctly interpreted and used. For this reason uncertainties (confidence limits) and detection limits should be known and reported to clients or users (if only upon request).

The advantage of using the "<" sign for values below the LLD or LLQ is that the value 0 (zero) and negative values can be avoided as they are usually either impossible or improbable. A disadvantage of the "<" sign is that it is a non-numerical character and not suitable in spreadsheet programs for further calculation and manipulation. In such cases the actually found value will be required, but then the inherent confidence restrictions should be known to the user.

Note 2. Because a normal distribution of data is assumed, it can statistically be expected that zero and negative values for analytical results occur when blank values are subtracted from test values equal to or lower than the blank. Clearly, only in a few cases are negative values possible (e.g. for adsorption), but for concentrations such values should normally not be reported. Exceptions to this rule are studies involving surveys of attributes or effects. Then it might be necessary to report the actually obtained low results as otherwise the mean of the survey would be biased.

2. Lower Limit of Detection derived from a calibration graph

We use the calibration graph of Figure 7-1. Then, noting that sbl = sx = 0.6097 and using Equation (7.11), we obtain: LLD = 3 × 0.6097 = 1.829 mg/L.

It is noteworthy that "forcing the blank to zero" does not affect the Lower Limit of Detection. Although a (= yb, see Fig. 7-1) may become zero, the uncertainty sy of the calibration graph, and thus of sx and sbl, is not changed by this: the only change is that the "forced" calibration line has moved and now runs through the intersection of the axes (parallel to the "original" line).

7.4 Types of sample material

7.4.1 Certified reference material (CRM)
7.4.2 Reference material (RM)
7.4.3 Control sample
7.4.4 Test sample
7.4.5 Spiked sample
7.4.6 Blind sample
7.4.7 Sequence-control sample

Although several terms for different sample types have already freely been used in the previous sections, it seems appropriate to define the various types before the major Quality Control operations are discussed.

7.4.1 Certified reference material (CRM)

A primary reference material or substance, accompanied by a certificate, one or more of whose property values are accurately determined by a number of selected laboratories (with a stated method), and for which each certified value is accompanied by an uncertainty at a stated level of confidence.

These are usually very expensive materials and, particularly for soils, hard to come by or not available. For availability, a computerized databank containing information on about 10,000 reference materials can be consulted (COMAR, see Appendix 4).

7.4.2 Reference material (RM)

A secondary reference material or substance, one or more of whose property values are accurately determined by a number of laboratories (with a stated method), and whose values are accompanied by an uncertainty at a stated level of confidence. The origin of the material and the data should be traceable.

In soil and plant analysis RMs are very important since for many analytes and attributes certified reference materials (CRMs) are not (yet) available. For certain properties a "true" value cannot even be established as the result is always method-dependent, e.g. CEC and particle-size distribution of soil material. A very useful source of RMs is the interlaboratory (round robin) sample and data exchange programme. The material sent around is analyzed by a number of laboratories and the resulting data offer an excellent reference base, particularly if somehow there is a link with a primary reference material. Since this is often not the case, the data must be handled with care: it may well be that the mean or median value of 50 or more laboratories is "wrong" (e.g. because most use a method with an inadequate digestion step).

In some cases different levels of analyte may be imitated by spiking a sample with the analyte (see 7.4.5). However, this is certainly not always possible (e.g. CEC, exchangeable cations, pH, particle-size distribution).

7.4.3 Control sample

An in-house reference sample for which one or more property values have been established by the user laboratory, possibly in collaboration with other laboratories.

This is the material a laboratory needs to prepare for second-line (internal) control in each batch, the results of which are plotted on Control Charts. The sample should be sufficiently stable and homogeneous for the properties concerned. The preparation of control samples is discussed in Chapter 8.

7.4.4 Test sample


The material to be analyzed, the "unknown".

7.4.5 Spiked sample

A test material with a known addition of analyte.

The sample is analyzed with and without the spike to test recovery (see 7.5.6). It should be a realistic surrogate with respect to matrix and concentration. The mixture should be well homogenized.

The requirement "realistic surrogate" is the main problem with spikes. Often theanalyte cannot be integrated in the sample in the same manner as the original analyte,and then treatments such as digestion or extraction may not necessarily reflect thebehaviour of real samples.

7.4.6 Blind sample

A sample with known content of the analyte. This sample is inserted by the Head of Laboratory or the Quality Officer in batches at places and times unknown to the analyst. The frequency may vary, but as an indication one sample in every 10 batches is given.

Various types of sample material may serve as blind samples, such as control samples or sufficiently large leftovers of test samples (analyzed several times). In case of water analysis a solution of the pure analyte, or combination of analytes, may do. It is essential that the analyst is aware of the possible presence of a blind sample but does not recognize the material as such.

Insertion of blind samples requires some attention to administration and camouflage. The protocol will depend on the organization of the sample and data stream in the laboratory.

7.4.7 Sequence-control sample

A sample with an extreme content of the analyte (but falling within the working range of the method). It is inserted at random in a batch to verify the correct order of samples. This is particularly useful for long batches in automated analyses. Very effective is the combination of two such samples: one with a high and one with a low analyte content.

7.5 Validation of own procedures

7.5.1 Trueness (accuracy), bias
7.5.2 Precision
7.5.3 Sensitivity
7.5.4 Working range
7.5.5 Selectivity and specificity
7.5.6 Recovery
7.5.7 Ruggedness, robustness
7.5.8 Interferences
7.5.9 Practicability
7.5.10 Validation report


Validation is the process of determining the performance characteristics of a method/procedure or process. It is a prerequisite for judgement of the suitability of produced analytical data for the intended use. This implies that a method may be valid in one situation and invalid in another. Consequently, the requirements for the data may, or rather must, decide which method is to be used. When this is ill-considered, the analysis can be unnecessarily accurate (and expensive), inadequate if the method is less accurate than required, or useless if the accuracy is unknown.

Two main types of validation may be distinguished:

1. Validation of standard procedures. The validation of new or existing methods or procedures intended to be used in many laboratories, including procedures (to be) accepted by national or international standardization organizations.

2. Validation of own procedures. The in-house validation of methods or procedures by individual user-laboratories.

The first involves an interlaboratory programme of testing the method by a number (≥ 8) of selected renowned laboratories according to a protocol issued to all participants. The second involves in-house testing of a procedure to establish its performance characteristics or, more specifically, its suitability for a purpose. Since the former is a specialist task, usually (but not exclusively) performed by standardization organizations, the present discussion will be restricted to the second type of validation, which concerns every laboratory.

Validation is not only relevant when non-standard procedures are used but just as well when validated standard procedures are used (to what extent does the laboratory meet the standard validation?) and even more so when variants of standard procedures are introduced. Many laboratories use their own versions of well-established methods or change a procedure for reasons of efficiency or convenience.

Fundamentally, any change in a procedure (e.g. sample size, liquid:solid ratio in extractions, shaking time) may affect the performance characteristics and should be validated. For instance, in Section 7.3.2 we noticed that halving the sample size results in doubling the Lower Limit of Detection.

Thus, inherent in generating quality analytical data is supporting them with a quantification of the parameters of confidence. As such, this is part of quality control.

To specify the performance characteristics of a procedure, a selection (not necessarily all) of the following basic parameters is determined:

- Trueness (accuracy), bias
- Precision
- Recovery
- Sensitivity
- Specificity and selectivity
- Working range (including MDL)
- Interferences
- Ruggedness or robustness
- Practicability

Before validation can be carried out it is essential that the detailed procedure is available as a SOP.

7.5.1 Trueness (accuracy), bias

One of the first characteristics one would like to know about a method is whether the results reflect the "true" value for the analyte or property. And, if not, can the (un)trueness or bias be quantified and possibly corrected for?

There are several ways to find this out but essentially they are all based on the same principle, which is the use of an outside reference, directly or indirectly.

The direct method is by carrying out replicate analyses (n ≥ 10) with the method on a (certified) reference sample with a known content of the analyte.

The indirect method is by comparing the results of the method with those of a reference method (or otherwise generally accepted method), both applied to the same sample(s). Another indirect way to verify bias is by having (some) samples analyzed by another laboratory and by participation in interlaboratory exchange programmes. This will be discussed in Chapter 9.

It should be noted that the trueness of an analytical result may be sensitive to varying conditions (level of analyte, matrix, extract, temperature, etc.). If a method is applied to a wide range of materials, for proper validation different samples at different levels of analyte should be used.

Statistical comparison of results can be done in several ways, some of which were described in Section 6.4.

Numerically, the trueness (often less appropriately referred to as accuracy) can be expressed using the equation:

trueness (%) = 100 × ¯x / m   (7.14)

where

¯x = mean of test results obtained for the reference sample
m = "true" value given for the reference sample

Thus, the best trueness we can get is 100%.

Bias, more commonly used than trueness, can be expressed as an absolute value by:

bias = ¯x - m (7.15)

or as a relative value by:

bias (%) = 100 × (¯x - m) / m   (7.16)

Thus, the best bias we can get is 0 (in units of the analyte) or 0 % respectively.

Example

The Cu content of a reference sample is 34.0 ± 2.7 mg/kg (2.7 = s, n = 12). The results of 15 replicates with the laboratory's own method are the following: 38.0; 34.6; 29.1; 27.8; 40.4; 33.1; 40.9; 28.5; 36.1; 26.8; 30.6; 24.3; 31.6; 22.3; 29.9 mg/kg.

With Equation (6.1) we calculate ¯x = 31.6. Using Equation (7.14), the trueness is (31.6/34.0) × 100% = 93%. Using Equation (7.16), the bias is (31.6 - 34.0) × 100% / 34.0 = -7%.

These calculations suggest a systematic error. To see if this error is statistically significant, a t-test can be done. For this, with Equation (6.2) we first calculate s = 5.6. The F-test (see 6.4.2 and 7.5.2) indicates a significant difference in standard deviation and we have to use the Cochran variant of the t-test (see 6.4.3). Using Equation (6.16) we find tcal = 1.46, and with Eq. (6.17) the critical value ttab* = 2.16, indicating that the results obtained by the laboratory are not significantly different from the reference value (with 95% confidence).

Although a laboratory could be satisfied with this result, the fact remains that the mean of the test results is not equal to the "true" value but somewhat lower. As discussed in Sections 6.4.1 and 6.4.3, the one-sided t-test can be used to test if this result is statistically on one side (lower or higher) of the reference value. In the present case the one-sided critical value is 1.77 (see Appendix 1), which also exceeds the calculated value of 1.46, indicating that the laboratory mean is not systematically lower than the reference value (with 95% confidence).

At first sight a bias of -7% does not seem to be insignificant. In this case, however, the wide spread of the own data causes the uncertainty about this. If the standard deviation of the results had been the same as that of the reference sample then, using Equations (6.13) and (6.14), tcal would have been 2.58 and with ttab = 2.06 (App. 1) the difference would have been significant according to the two-sided t-test, and with ttab = 1.71 significantly lower according to the one-sided t-test (at 95% confidence).
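The example can be reproduced with the short sketch below; the unequal-variance (Cochran/Welch-type) t statistic is calculated directly rather than looked up via the tables.

```python
# Sketch of the trueness/bias calculation of the example above.
import numpy as np

ref_mean, ref_s, ref_n = 34.0, 2.7, 12
lab = np.array([38.0, 34.6, 29.1, 27.8, 40.4, 33.1, 40.9, 28.5,
                36.1, 26.8, 30.6, 24.3, 31.6, 22.3, 29.9])

x_bar, s, n = lab.mean(), lab.std(ddof=1), len(lab)
trueness = 100 * x_bar / ref_mean                   # Eq. (7.14)
rel_bias = 100 * (x_bar - ref_mean) / ref_mean      # Eq. (7.16)
t_cal = abs(x_bar - ref_mean) / np.sqrt(s**2 / n + ref_s**2 / ref_n)

print(f"mean = {x_bar:.1f}, s = {s:.1f}")
print(f"trueness = {trueness:.0f}%, bias = {rel_bias:.0f}%, t_cal = {t_cal:.2f}")
# Trueness 93% and bias -7% as in the text; t_cal comes out close to the quoted
# 1.46 (small differences are due to rounding of s).
```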

7.5.2 Precision

7.5.2.1 Reproducibility
7.5.2.2 Repeatability
7.5.2.3 Within-laboratory reproducibility

Replicate analyses performed on a reference sample, yielding a mean to determine trueness or bias as described above, also yield a standard deviation of the mean as a measure of precision. However, for precision alone, control samples and even test samples can also be used. The statistical test for comparison is done with the F-test, which compares the obtained standard deviation with the standard deviation given for the reference sample (in fact, the variances are compared: Eq. 6.11).

Numerically, precision is either expressed by the absolute value of the standard deviation or, more universally, by the relative standard deviation (RSD) or coefficient of variation (CV) (see Equations 6.5 and 6.6):

CV or RSD (%) = 100 × s / ¯x   (7.17)


where

¯x = mean of test results obtained for the reference sample
s = standard deviation of x

If the attained precision is worse than that given for the reference sample, it can still be decided that the performance is acceptable for the purpose (which has to be reported as such); otherwise it has to be investigated how the performance can be improved.

Like the bias, precision will not necessarily be the same at different concentrations of the analyte or in different kinds of materials. Comparison of precision at different levels of analyte can be done with the F-test: if the variances at a few different levels are similar, then precision is assumed to be constant over the range.

Example

The same example as above for bias is used. The standard deviation of the laboratory is 5.6 mg/kg which, according to Eq. (7.17), corresponds with a precision of (5.6/31.6) × 100% = 18%. (The precision of the reference sample can similarly be calculated as about 8%.)

According to Equation (6.11) the calculated F-value is:

F = (5.6)² / (2.7)² = 4.3

The critical value is 2.47 (App. 2, two-sided, df1 = 14, df2 = 11); hence the null hypothesis that the two standard deviations belong to the same population is rejected: there is a significant difference in precision (at the 95% confidence level).
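As a quick check, the comparison can be written out as follows, using the critical value quoted above.

```python
# Quick check of the F-test above (Eq. 6.11).
s_lab, s_ref = 5.6, 2.7

F = (s_lab / s_ref) ** 2     # ratio of variances, about 4.3
F_crit = 2.47                # critical value from App. 2 as quoted above
print(f"F = {F:.1f}; significant difference in precision: {F > F_crit}")
```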

Types of precision

The above description of precision leaves some uncertainty about the actual execution of its determination. Because precision in particular is sensitive to the way it is determined, some specific types of precision are distinguished and, therefore, the type involved should always be reported.

7.5.2.1 Reproducibility

The measure of agreement between results obtained with the same method on identical test or reference material under different conditions (execution by different persons, in different laboratories, with different equipment and at different times). The measure of reproducibility R is the standard deviation of these results sR, and for a not too small number of data (n ≥ 8) R is defined by (with 95% confidence):

R = 2.8 × sR (7.18)

(where 2.8 = 2√2 and is derived from the normal or Gaussian distribution; ISO 5725).

Thus, reproducibility is a measure of the spread of results when a sample is analyzed by different laboratories. If a method is sensitive to different ways of execution or conditions (low robustness, see 7.5.7), then the reproducibility will reflect this.

This parameter can obviously not be verified in daily practice. For that purpose the next two parameters are used (repeatability and within-laboratory reproducibility).


7.5.2.2 Repeatability

The measure of agreement between results obtained with the same method on identical test or reference material under the same conditions (job done by one person, in the same laboratory, with the same equipment, at the same time or with only a short time interval). Thus, this is the best precision a laboratory can obtain: the within-batch precision.

The measure for the repeatability r is the standard deviation of these results sr, and for a not too small number of data (n ≥ 10) r is defined by (with 95% confidence):

r = 2.8 × sr (7.19)

7.5.2.3 Within-laboratory reproducibility

The measure of agreement between results obtained with the same method on identical test material under different conditions (execution by different persons, with the same or different equipment, in the same laboratory, at different times). This is a more realistic type of precision for a method over a longer span of time when conditions are more variable than defined for repeatability.

The measure is the standard deviation of these results sL (also called between-batch precision). The within-laboratory reproducibility RL is calculated by:

RL = 2.8 × sL (7.20)

The between-batch precision can be estimated in three different ways:

1. As the standard deviation of a large number (n ≥ 50) of duplicate determinations carried out by two analysts:

sL = √(Σ si² / k) = √(Σ di² / 2k)   (7.21)

where

si = standard deviation of each pair of duplicates
k = number of pairs of duplicates
di = difference between duplicates within each pair

2. Empirically as 1.6 × sr. Then:

RL = 2.8 × 1.6 × sr

or:

RL = 1.6 × r (7.22)

where r is the repeatability as defined above.

3. The most practical and realistic expression of the within-laboratory reproducibility is the one based on the standard deviation obtained for control samples during routine work. The advantage is that no extra work is involved: control samples are analyzed in each batch, and the within-laboratory standard deviation is calculated each time a control chart is completed (or sooner if desired, say after 10 batches). The calculation is here:

RL = 2.8 × scc (7.23)

where scc is the standard deviation obtained from a Control Chart (see 8.3.2).

Clearly, the above three RL values are not identical and thus, whenever the within-laboratory reproducibility is reported, the way in which it was obtained should always be stated.
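A sketch of estimate 1 above (duplicate pairs) is given below; only eight invented pairs are used for brevity, whereas the text asks for n ≥ 50.

```python
# Sketch of the between-batch precision from duplicate pairs (Eq. 7.21).
import numpy as np

duplicates = np.array([            # one row per pair: (first result, second result)
    [10.2, 10.6], [9.8, 10.1], [10.5, 10.0], [9.9, 10.4],
    [10.3, 10.2], [10.1, 9.6],  [10.0, 10.5], [9.7, 10.0],
])

d = duplicates[:, 0] - duplicates[:, 1]        # difference within each pair
k = len(duplicates)
s_L = np.sqrt(np.sum(d**2) / (2 * k))          # Eq. (7.21)
R_L = 2.8 * s_L                                # Eq. (7.20)
print(f"s_L = {s_L:.2f}, within-laboratory reproducibility R_L = {R_L:.2f}")
```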

Note: Naturally, instead of reporting the derived validation parameters for precision R, r, or RL, one may prefer to report their primary measure: the standard deviation concerned.

7.5.3 Sensitivity

This is a measure of the response y of the instrument or of a whole method to the concentration C of the analyte or property, e.g. the slope of the analytical calibration graph (see Section 7.2.2). It is the value that is required to quantify the analyte on the basis of the analytical signal. The sensitivity for the analyte in the final sample extract may not necessarily be equal to the sensitivity for the analyte in a simple standard solution. Matrix effects may cause improper calibration of the measuring step of the analytical method. As observed earlier for calibration graphs, the sensitivity may not be constant over a long range. It usually decreases at higher concentrations by saturation of the signal. This limits the working range (see next Section 7.5.4). Some of the most typical situations are exemplified in Figure 7-2.

Fig. 7-2. Examples of some typical response graphs. 1. Constant sensitivity. 2. Sensitivity constant over the lower range, then decreasing. 3. Sensitivity decreasing over the whole range. (See also 7.5.4.)


In general, on every point of the response graph the sensitivity can be expressed by

S = dy / dC   (7.24)

The dimension of S depends on the dimensions of y and C. In atomic absorption, for example, y is expressed in absorbance units and C in mg/L. For pH and ion-selective electrodes the response of the electrode is expressed in mV and the concentration in mg/L or moles (plotted on a log scale). Often, for convenience, the signal is converted and amplified to a direct reading in arbitrary units, e.g. concentration. However, for proper expression of the sensitivity, this derived response should be converted back to the direct response. In practice, for instance, this is simply done by making a calibration graph in the absorbance mode of the instrument as exemplified in Figure 7-1, where slope b is the sensitivity of the P measurement on the spectrophotometer. If measured in the absorption (or transmission) mode, plotting should be done with a logarithmic y-axis.

7.5.4 Working range

For most analytical methods the working range is known from previous experience. When introducing a new method or measuring technique this range may have to be determined. This can be done during validation by attempting to span a (too) wide range, for instance by using several sample sizes, liquid:sample ratios, or by spiking samples (see 7.5.6, Recovery). This practice is particularly important for determining the upper limit of the working range (the lower limit of a working range corresponds with the Method Detection Limit and was discussed in Section 7.3.2). The upper limit is often determined by such factors as saturation of the extract (e.g. the "free" iron or gypsum determinations) or by depletion of a solution in the case of adsorption procedures (e.g. phosphate adsorption; cobaltihexamine or silver thiourea adsorption in single-extraction CEC methods). In such cases the liquid:sample ratio has to be adapted.

To determine the measuring range of solutions the following procedure can be applied:

- Prepare a standard solution of the analyte in the relevant matrix (e.g. extractant) at a concentration beyond the highest expected concentration.

- Measure this solution and determine the instrument response.

- Dilute this standard solution 10× with the matrix solution and measure again.

- Repeat dilution and measuring until the instrument gives no response.

- Plot the response vs. the concentration.

- Estimate the useful part of the response graph.

(If the dilution steps are too large to obtain a reliable graph, they need to be reduced, e.g. to 5×.)

In Figure 7-2 the useful parts of graphs 1 and 2 are obviously the linear parts (and for graph 2 perhaps up to concentration 8 if necessary). Sometimes a built-in curve corrector for the linearization of curved calibration plots can extend the range of application (e.g. in AAS). Graph 3 has no linear part but can, and sometimes must, still be used. Logarithmic plotting may be considered and in some cases an equation may be calculated by non-linear (polynomial) regression. It has to be decided on practical grounds what concentration can be accepted before the decreasing sensitivity renders the method inappropriate (with the knowledge that flat or even downward-bending ranges are useless in any case).

7.5.5 Selectivity and specificity

The measurement of an analyte may be disturbed by the presence of other components. The measurement is then non-specific for the analyte under investigation. An analytical method is "fully specific" when it gives an analytical signal exclusively for one particular component, but is "dead" for all other components in the sample, e.g. when a reagent forms a coloured complex with only one analyte. A method is "fully selective" when it produces correct analytical results for various components of a mixture without any mutual interaction of the components, e.g. when a reagent forms several coloured complexes with components in the matrix but with a different colour for each component. A selective method is composed of a series of specific measurements.

Mutual influences are common in analytical techniques but can often easily be overcome. An example is ionization interference reducing the specificity in flame spectrometric techniques (FES, AAS). The selectivity is no problem as the useful spectral lines can be selected exactly with a monochromator or filters. The mutual interference can be suppressed by adding an excess of an easily ionizable element, such as cesium, which keeps the electron concentration in the flame constant. In chromatographic techniques (GC, HPLC) specificity is sometimes a problem in the analysis of complex compounds.

In the validation report, selectivity and specificity are usually described rather than quantitatively expressed.

7.5.6 Recovery

To determine the effectiveness of a method (and also the working range), recovery experiments can be carried out. Recovery can be defined as the 'fraction of the analyte determined after addition of a known amount of the analyte to a sample'. In practice, control samples are most commonly used for spiking. The sample as well as the spikes are analyzed at least 10 times, the results averaged and the relative standard deviation (RSD) calculated. For in-house validation the repeatability (replicates in one batch, see 7.5.2.2) is determined, whereas for quality control the within-laboratory reproducibility (replicates in different batches, see 7.5.2.3) is determined and the data recorded on Control Charts. The concentration level of the spikes depends on the purpose: for routine control work the level(s) will largely correspond with those of the test samples (recoveries at different levels may differ); a concentration midway along the working range is a convenient choice. For the determination of a working range a wide range may be necessary, at least to start with (see 7.5.4). An example is the addition of ammonium sulphate in the Kjeldahl nitrogen determination. Recovery tests may reveal a significant bias in the method used and may prompt a correction factor to be applied to the analytical results.

The recovery is calculated with:

Recovery (%) = 100 × (¯xs - ¯x) / xadd    (7.25)

where

¯xs = mean result of spiked samples
¯x = mean result of unspiked samples
xadd = amount of added analyte

If a blank (sample) is used for spiking then the mean result of the unspiked sample will generally be close to zero. In fact, such replicate analyses could be used to determine or verify the method detection limit (MDL, see 7.3.2).
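As an illustration, a minimal sketch of the recovery calculation of Eq. (7.25), using hypothetical replicate results for an unspiked and a spiked control sample:

```python
# Minimal sketch of a recovery calculation (Eq. 7.25). The replicate results
# below are hypothetical and stand in for >= 10 analyses of an unspiked
# control sample and of the same sample spiked with x_add.
import statistics

unspiked = [4.9, 5.1, 5.0, 4.8, 5.2, 5.0, 4.9, 5.1, 5.0, 5.0]    # analyte found, mg/kg
spiked   = [9.6, 9.9, 9.8, 9.7, 10.0, 9.8, 9.9, 9.6, 9.7, 9.8]   # after adding x_add
x_add = 5.0                                                       # amount added, mg/kg

x_s = statistics.mean(spiked)       # mean result of spiked samples
x_u = statistics.mean(unspiked)     # mean result of unspiked samples

recovery_pct = 100 * (x_s - x_u) / x_add
rsd_spiked = 100 * statistics.stdev(spiked) / x_s    # relative standard deviation

print(f"Recovery: {recovery_pct:.1f}%  (RSD of spiked results: {rsd_spiked:.1f}%)")
```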

As has been mentioned before (Section 7.4.5) the recovery obtained with a spike may not be the same as that obtained with real samples since the analyte may not be integrated in the spiked sample in the same manner as in real samples. Also, the form of the analyte with which the spike is made may present a problem as different compounds and grain sizes representing the analyte may behave differently in an analysis.

7.5.7 Ruggedness, robustness

An analytical method is rugged or robust if results are not (very) sensitive to variations in the experimental conditions. Such conditions can be temperature, extraction or shaking time, shaking technique, pH, purity of reagents, moisture content of sample, sample size, etc. Usually, when a new method is proposed, the ruggedness is first tested by the initiating laboratory and subsequently in an interlaboratory trial. The ruggedness test is conveniently done with the so-called "Youden and Steiner partial factorial design" where in only eight replicate analyses seven factors can be varied and analyzed. This efficient technique can also be used for within-laboratory validation. As an example the ammonium acetate CEC determination of soil will be taken. The seven factors could be for instance:

A: With (+) and without (-) addition of 125 mg CaCO3 to the sample (corresponding with 5% CaCO3 content)
B: Concentration of saturation solution: 1 M (+) and 0.5 M (-) NH4OAc
C: Extraction time: 4 hours (-) and 8 hours (+)
D: Admixture of sea-sand (or celite): with (+) and without (-) 1 teaspoon of sand
E: Washing procedure: 2× (-) or 3× (+) with ethanol 80%
F: Concentration of washing ethanol: 70% (-) or 80% (+)
G: Purity of NH4OAc: technical grade (-) and analytical grade (+)

The matrix of the design looks as shown in Table 7-2. The eight subsamples are analyzed basically according to the SOP of the method. The variations in the SOP are indicated by the + or - signs denoting the high or low level, presence or absence of a factor or otherwise stated conditions to be investigated. The eight obtained analytical results are Yi. Thus, sample (experiment) no. 1 receives all treatments A to G indicated with (+), sample no. 2 receives treatments A, B and D indicated by (+) and C, E, F and G indicated by (-), etc.

Table 7-2. The partial factorial design (seven factors) for testing ruggedness of an analytical method

Factors

Experiment A B C D E F G Results

1 + + + + + + + Y1

2 + + - + - - - Y2

3 + - + - + - - Y3

4 + - - - - + + Y4

5 - + + - - + - Y5

6 - + - - + - + Y6

7 - - + + - - + Y7

8 - - - + + + - Y8

The absolute effect (bias) of each factor A to G can be calculated as follows:

Effect A = (Σ YA+ - Σ YA-) / 4    (7.26)

where

Σ YA+ = sum of results Yi where factor A has + sign (i.e. Y1 + Y2 + Y3 + Y4; n=4)
Σ YA- = sum of results Yi where factor A has - sign (i.e. Y5 + Y6 + Y7 + Y8; n=4)

The test for significance of the effect can be done in two ways:

1. With a t-test (6.4.3) using in principle the table with "two-sided" critical t values (App. 1, n=4). When clearly an effect in one direction is to be expected, the one-sided test is applicable.

2. By checking if the effect exceeds the precision of the original procedure (i.e. if the effect exceeds the noise of the procedure). Most realistic and practical in this case would be to use scc, the within-laboratory standard deviation taken from a control chart (see Sections 7.5.2.3 and 8.3.2). Now, the standard deviation of the mean of four measurements can be taken as scc/√4 = 0.5 scc (see 6.3.4), and the standard deviation of the difference between two such means (i.e. the standard deviation of the effect calculated with Eq. 7.26) as √(scc²/4 + scc²/4) = 0.71 scc. The effect of a factor can be considered significant if it exceeds 2× this standard deviation, i.e. 2 × 0.71 scc ≈ 1.4 scc.

Therefore, the effect is significant when:

Effect > 1.4 × scc    (7.27)

where scc is the standard deviation of the original procedure taken from the last complete control chart.

Note. Obviously, when this standard deviation is not available, such as in the case of a new method, then another type of precision has to be used, preferably the within-laboratory reproducibility (see 7.5.2).
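As an illustration, a minimal sketch of the effect calculation (Eq. 7.26) and the significance check (Eq. 7.27) for the design of Table 7-2; the eight results Y1-Y8 and the control-chart standard deviation scc below are hypothetical:

```python
# Minimal sketch of the Youden and Steiner partial factorial evaluation
# (Table 7-2, Eqs. 7.26 and 7.27). Results Y1..Y8 and s_cc are hypothetical.
import numpy as np

# Design matrix of Table 7-2: rows = experiments 1..8, columns = factors A..G
design = np.array([
    [+1, +1, +1, +1, +1, +1, +1],
    [+1, +1, -1, +1, -1, -1, -1],
    [+1, -1, +1, -1, +1, -1, -1],
    [+1, -1, -1, -1, -1, +1, +1],
    [-1, +1, +1, -1, -1, +1, -1],
    [-1, +1, -1, -1, +1, -1, +1],
    [-1, -1, +1, +1, -1, -1, +1],
    [-1, -1, -1, +1, +1, +1, -1],
])
Y = np.array([20.1, 20.4, 19.8, 20.0, 19.7, 20.2, 19.9, 20.3])  # results Y1..Y8
s_cc = 0.25   # within-laboratory standard deviation from the control chart

for j, name in enumerate("ABCDEFG"):
    sum_plus = Y[design[:, j] == +1].sum()    # sum of the four results at the + level
    sum_minus = Y[design[:, j] == -1].sum()   # sum of the four results at the - level
    effect = (sum_plus - sum_minus) / 4       # Eq. 7.26
    significant = abs(effect) > 1.4 * s_cc    # Eq. 7.27
    print(f"Factor {name}: effect = {effect:+.3f}  significant: {significant}")
```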

It is not always possible or desirable to vary seven factors. However, the discussed partial factorial design does not allow a reduction of factors. At most, one (imaginary) factor can be considered in advance to have a zero effect (e.g. the position of the moon). In that case, the design is the same as given in Table 7-2 but omitting factor G.

For studying only three factors a design is also available. This is given in Table 7-3.

Table 7-3. The partial factorial design (three factors) for testing ruggedness of an analytical method

Experiment A B C Results

1 + + + Y1

2 - + - Y2

3 + - + Y3

4 - - - Y4

The absolute effect of the factors A, B, and C can be calculated as follows:


Effect A = (Σ YA+ - Σ YA-) / 2    (7.28)

where

Σ YA+ = sum of results Yi where factor A has + sign (i.e. Y1 + Y3; n=2)

Σ YA- = sum of results Yi where factor A has - sign (i.e. Y2 + Y4; n=2)

The test for significance of the effect can be done similarly as described above for the seven-factor design, with the difference that here n = 2.

If the relative effect has to be calculated (for instance for use as a correction factor) this must be done relative to the result of the original factor. Thus, in the above example of the CEC determination, if one is interested in the effect of reducing the concentration of the saturating solution (Factor B), the "reference" values are those obtained with the 1 M solution (denoted with + in column B) and the relative effect can be calculated with:

Effect B, rel (%) = 100 × (Σ YB+ - Σ YB-) / Σ YB+    (7.29)
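A minimal sketch of this relative-effect calculation, assuming the form of Eq. (7.29) given above and hypothetical sums for factor B:

```python
# Minimal sketch of Eq. 7.29: relative effect of factor B, expressed relative
# to the summed results at the reference (+) level (here the 1 M NH4OAc
# saturating solution). The sums below are hypothetical.
sum_B_plus = 80.4    # sum of the results with factor B at the + level
sum_B_minus = 79.2   # sum of the results with factor B at the - level

relative_effect_pct = 100 * (sum_B_plus - sum_B_minus) / sum_B_plus
print(f"Relative effect of factor B: {relative_effect_pct:+.2f}%")   # about +1.5%
```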

The confidence of the results of partial factorial experiments can be increased by running duplicates or triplicates as discussed in Section 6.3.4. This is particularly useful here since possible outliers may erroneously be interpreted as a "strong effect".

Often a laboratory wants to check the influence of one factor only. Temperature is a factor which is particularly difficult to control in some laboratories or sometimes needlessly controlled at high costs simply because it is prescribed in the original method (but perhaps never properly validated). The very recently published standard procedure for determining the particle-size distribution (ISO 11277) has not been validated in an interlaboratory trial. The procedure prescribes the use of an end-over-end shaker for dispersion. If up to now a reciprocating shaker has been used and the laboratory decides to adopt the end-over-end shaker, then in-house validation is indicated and a comparison between the two shaking techniques must be made and documented. If it is decided, after all, to continue with the reciprocating shaking technique (e.g. for practical reasons), then the laboratory must be able to show the influence of this step to users of the data. Such validation must include all soil types to which the method is applied.

The effect of a single factor can simply be determined by conducting a number of replicate analyses (n ≥ 10) with and without the factor, or at two levels of the factor, and comparing the results with the F-test and t-test (see 6.4). Such a single effect may thus be expressed in terms of bias and precision.
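As an illustration, a minimal sketch of such a single-factor comparison with an F-test on the variances and a t-test on the means; the replicate results for the two shaking techniques are hypothetical:

```python
# Minimal sketch of a single-factor check (e.g. reciprocating vs. end-over-end
# shaking): F-test on the variances, then t-test on the means (cf. 6.4).
# The replicate results below are hypothetical.
import statistics
from scipy import stats

with_factor = [12.1, 12.4, 12.0, 12.3, 12.2, 12.5, 12.1, 12.3, 12.2, 12.4]
without_factor = [11.8, 11.9, 12.0, 11.7, 11.9, 12.1, 11.8, 12.0, 11.9, 11.8]

# F-test: ratio of the larger to the smaller variance, two-sided p-value
var1 = statistics.variance(with_factor)
var2 = statistics.variance(without_factor)
F = max(var1, var2) / min(var1, var2)
df = (len(with_factor) - 1, len(without_factor) - 1)
p_F = min(1.0, 2 * stats.f.sf(F, *df))

# t-test on the means; switch to Welch's test if the variances differ significantly
t, p_t = stats.ttest_ind(with_factor, without_factor, equal_var=(p_F > 0.05))

print(f"F = {F:.2f} (p = {p_F:.3f}),  t = {t:.2f} (p = {p_t:.4f})")
```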

7.5.8 Interferences

Many analytical methods are to a greater or lesser extent susceptible to interferences of various kinds. Proper validation should include documentation of such influences. Most prominent are matrix effects which may either reduce or enhance analytical results (and are thus a form of reduced selectivity). Ideally, such interferences are quantified as bias and corrected for, but often this is a tedious affair or even impossible. Matrix effects can be quantified by conducting replicate analyses at various levels and with various compositions of (spiked) samples or they can be nullified by imitating the test sample matrix in the standards, e.g. in X-ray fluorescence spectroscopy. However, the matrix of test samples is often unknown beforehand. A practical qualitative check in such a case is to measure the analyte at two levels of dilution: usually the signal of the analyte and of the interference are not proportional.

Other well-known interferences are, for example, the dark colour of extracts in the colorimetric determination of phosphate, and in the CEC determination the presence of salts, lime, or gypsum. A colour interference may be avoided by measuring at another wavelength (in the case of phosphate: try 880 nm). Sometimes the only way to avoid interference is to use another method of analysis.

If it is thought that an interference can be singled out and determined, it can be quantified as indicated for ruggedness in the previous section.

7.5.9 Practicability

When a new method is proposed or when there is a choice of methods for a determination, it may be useful if an indication or description of the ease or tediousness of the application is available. Usually the practicability can be derived from the detailed description of the procedure. The problems are in most cases related to the availability and maintenance of certain equipment and the required staff or skills. Also, the supply of required parts and reagents is not always assured, nor the uninterrupted supply of stable power. In some countries, for instance, high purity grades cannot always be obtained, some chemicals cannot be kept (e.g. sodium pyrophosphate in a hot climate) and even the supply of a seemingly common reagent such as ethanol can be a problem. If such limitations are known, it is useful if they are mentioned in the relevant SOPs or validation report.

7.5.10 Validation report

The results of validation tests should be recorded in a validation report from which the suitability of a method for a certain purpose can be deduced. If (legal) requirements for specific analyses are known (e.g. in the case of toxic compounds) then such information may be included.

Since validation is a kind of research project the report should have a comparable format. A plan is usually initiated by the head of laboratory, drafted by the technician involved and verified by the head. The general layout of the report should include:

- Parameters to be validated
- Description of the procedures (with reference to relevant SOPs)
- Results

A model for a validation SOP is given (VAL 09-2).

7.6 Drafting an analytical procedure

For drafting an analytical procedure the general instructions for drafting SOPs as given in Chapter 2 apply. An example of an analytical procedure as it can be written in the form of a SOP is METH 006. A laboratory manual of procedures, the "cookery book", can be made by simply collecting the SOPs for all procedures in a ring binder. Because analytical procedures, more than any other type of SOP, directly determine the product of a laboratory, some specific aspects relating to them are discussed here.

As was outlined in Chapter 2, instructions in SOPs should be written in such a way that no misunderstanding or ambiguity exists as to the execution of the procedure. Thus, much of the responsibility (not all) lies with the author of the procedure. Even if the author and user are one and the same person, which should normally be the case (see 2.2), such misunderstanding may be propagated since the author usually draws on the literature or documents written by someone else. Therefore, although instructions should be as brief as possible, they should at the same time be as extensive as necessary.

As an example we take the weighing of a sample, a common instruction in many analytical procedures. Such an instruction could read:

1. Weigh 5.0 g of sample into a 250 ml bottle.
2. Add 100 ml of extracting solution and close bottle.
3. Shake overnight.
4. Etc., etc.

Comment 1

According to general analytical practice the amount of 5.0 g means "an amount between and including 4.95 g and 5.05 g" (4.95 ≤ weight ≤ 5.05) since less than 4.95 would round to 4.9 and more than 5.05 would round to 5.1 (note that 5.05 rounds to 5.0 and not to 5.1).

Some analysts, particularly students and trainees, take the amount of 5.0 g too literally and set out on a lengthy process of adding and subtracting sample material until the balance reads "5.0" or perhaps even "5.00". Not only is this procedure tedious, the sample may become biased as particles of different size tend to segregate during this process. To prevent such an interpretation, often the prefixes "approximately", "approx." or "ca." (circa) are used, e.g. "approx. 5.0 g". As this, in turn, introduces a seeming contradiction between "5.0" (with a decimal, so quite accurate) and "approx." ('it doesn't matter all that much'), the desired accuracy must be stated: "weigh approx. 5.0 g (accuracy 0.01 g) into a 250 ml bottle".

The notation 5.0 g can be replaced by 5 g when the sample size is less critical (in the present case for instance if the ratio sample : liquid is not very critical). Sometimes it may even be possible to use "weigh 3 - 5 g of sample (accuracy 0.1 g)". The accuracy needs to be stated when the actual sample weight is used in the calculation of the final result, otherwise it may be omitted.

Comment 2

The "sample" needs to be specified. A convenient and correct way is to makereference to a SOP where the preparation of the sample material is described. This isthe more formal version of the common practice in many laboratories where the useof the sample is implied of which the preparation is described elsewhere in thelaboratory manual of analytical procedures. In any case, there should be no doubt

Page 29: Quality of Analytical Procedures

about the sample material to be used. When other material than the usual "laboratorysample" or "test sample" is used, the preparation must be described and the natureindicated e.g., "field-moist fine earth" or "fraction > 2 mm" or "nodules".

When drafting a new procedure or one's own version of a standard procedure, it must be considered if the moisture content of the sample used is relevant for the final result. If so, a moisture correction factor should be part of the calculation step. In certain cases where the sample contains a considerable amount of water (moist highly humic samples; andic material) this water will influence the soil : liquid ratio in certain extraction or equilibration procedures. Validation of such procedures is then indicated.

Comment 3

The "250 ml bottle" needs to be specified also. This is usually done in the section"Apparatus and glassware" of the SOP. If, in general, materials are not specified, thenit is implied that the type is unimportant for the procedure. However, in shakingprocedures, the kind, size and shape of bottles may have a significant influence on theresults. In addition the kind (composition) of glass is sometimes critical e.g., for theboron determination.

Comment 4

To the instruction "Add 100 ml of extracting solution" apply the same considerationsas discussed for the sample weighing. The accuracy needs to be specified,particularly when automatic dispensers are used. The accuracy may be implicit if theequipment to be used is stated e.g., "add 100 ml solution by graduated pipette" or"volumetric pipette" or "with a 100 ml measuring cylinder". If another means of addingthe solution is preferred its accuracy should equal or exceed that of the statedequipment.

Comment 5

The instruction "shake overnight" is ambiguous. It must be known that "overnight" isequivalent to "approximately 16 hrs.", namely from 5 p.m. till 9 a.m. the next morning.It is implied that this time-span is not critical but generally the deviation should not bemore than, say, two hours. In case of doubt, this should be validated with aruggedness test. More critical in many cases is the term "shake" as this can be donein many different ways. In the section "Apparatus" of the SOP the type of shakingmachine is stated e.g., reciprocating shaker or end-over-end shaker. For thereciprocating shaker the instruction should include the shaking frequency (in strokesper minute), the amplitude (in mm or cm) and the position of the bottles (standing up,lying length-wise or perpendicular to the shaking direction). For an end-over-endshaker usually only the frequency or speed (in rpm) is relevant.

7.7 Research plan

All laboratories, including those destined for routine work, carry out research in some form. For many laboratories it constitutes the main activity. Research may range from a simple test of an instrument or a change in procedure, to large projects involving many aspects, several departments of an institute, much staff and money, often commissioned by third parties (contract research, sponsors).

For any project of appreciable size, according to GLP the management of the institute must appoint a study director before the study is initiated. This person is responsible for the planning and execution of the job. He/she is responsible to a higher Inspecting Authority (IA) which may be the institute's management, the Quality Assurance Unit, the Head of Research or the like as established by the management.

A study project can be subdivided into four phases: preparation, execution, reporting, filing/archiving.

1. Preparation

In this phase the purpose and plan are formulated and approved by the IA. Any subsequent changes are documented and communicated to the IA. The plan must include:

- Descriptive title, purpose, and identification details
- Study director and further personnel
- Sponsor or client
- Work plan with starting date and duration
- Materials and methods to be used
- Study protocol and SOPs (including statistical treatments of data)
- Protocols for interim reporting and inspection
- Way of reporting and filing of results
- Authorization by the management (i.e. signature)

A work plan or subroutines can often be clarified by means of a flow diagram. Some of the most used symbols in flow diagrams for procedures in general, including analytical procedures, are given in Figure 7-3. An example of a flow sheet for a research plan is given in Fig. 7-4.

Fig. 7-3. Some common symbols for flow diagrams.


2. Execution of the work

The work must be carried out according to the plan, protocols and SOPs. All observations must be recorded including errors and irregularities. Changes of plan have to be reported to the IA and, if there are budgetary implications, also to the management. The study leader must have control of and be informed about the progress of the work and, particularly in larger projects, be prepared for inspection by the IA.

Fig. 7-4. Design of flow diagram for study project.


3. Reporting

As soon as possible after completion of the experimental work and verification of the quality control data the results are calculated. Together with a verification statement of the IA, possibly after corrections have been made, the results can be reported. The copyright and authorship of a possible publication would have been arranged in the plan.

The report should contain all information relevant for the correct interpretation of the results. To keep a report digestible, the procedures used may be given in abbreviated form with reference to the original protocols or SOPs. Sometimes, relevant information turns up afterwards (e.g. calculation errors). Naturally, this should be reported, even if the results have already been used.

It is useful and often rewarding if, after completion of a study project, an evaluation is carried out by the study team. In this way the next job may be performed better.

SOPs

VAL 09-2 - Validation of CEC determination with NH4OAc
METH 006 - Determination of nitrogen in soil with micro-Kjeldahl

VAL 09-2 - Validation of CEC determination with NH4OAc

LOGO STANDARD OPERATING PROCEDURE Page: 1 of 2

No.: VAL 09-2 Version: 1 Date: 96-09-19


Title: Validation of CEC determination with NH4OAc (pH 7) File:

1 PURPOSE

To determine the performance characteristics of the CEC determination with ammonium acetate (pH 7) using the mechanical extractor.

The following parameters have been considered: Bias, precision, working range, ruggedness, interferences, practicability.

2 REQUIREMENTS

See SOP METH 09-2 (Cation Exchange Capacity and Exchangeable Bases with ammonium acetate and mechanical extractor).

3 PROCEDURES

3.1 Analytical procedure

The basic procedure followed is described in SOP METH 09-2 with variations and number of replicates as indicated below. Two Control Samples have been used: LABEX 6, a Nitisol (clay ≈ 65%, CEC ≈ 20 cmolc/kg) and LABEX 2, an Acrisol (clay ≈ 25%; CEC ≈ 7 cmolc/kg); further details of these control samples in SOP RF 031 (List of Control Samples).

3.2 Bias

The CEC was determined 10× on both control samples. Reference is the mean value for the CEC obtained on these samples by 19 laboratories in an interlaboratory study.

3.3 Precision

Obtained from the replicates of 3.2.

3.4 Working range

The Method Detection Limit (MDL) was calculated from 10 blank determinations. Determination of the Upper Limit is not relevant (percolates beyond the calibration range are rare and can be brought within range by dilution).

3.5 Ruggedness

A partial factorial design with seven factors was used. The experiments were carried out in duplicate and the factors varied are as follows:

A: With (+) and without (-) addition of 125 mg CaCO3 (corresponding with 5% CaCO3 content)

B: Concentration of saturating solution: 1 M (+) and 0.5 M (-) NH4OAc

C: Extraction time: 4 hours (-) and 8 hours (+)

D: Admixture of seasand (or celite): with (+) and without (-) 1 teaspoon of sand

E: Washing procedure: 2× (-) or 3× (+) with ethanol 80%

F: Concentration of ethanol for washing free of salt: 70% (-) or 80% (+)

G: Purity of NH4OAc: technical grade (-) and analytical grade (+)

3.6 Interferences

Page 35: Quality of Analytical Procedures

Two factors particularly interfere in this determination: 1. high clay content (problems with efficiency of percolation) and 2. presence of CaCO3 (competing with the saturating index cation). The first was addressed by the difference in clay content of the two samples as well as by Factor D in the ruggedness test, the second by Factor A of the ruggedness test.

3.7 Practicability

The method is famous for its wide application and ill-famed for its limitations. Some of the most prominent aspects in this respect are considered.

4 RESULTS

As results may have to be produced as a document accompanying analytical results (e.g. on request of clients) they are presented here in a model format suiting this purpose.

In the present example where two different samples have been used the results for both samples may be given on one form, or for each sample on a separate form.

For practical reasons, abbreviated reports may be released omitting irrelevant information. (The full report should always be kept!)

LOGO METHOD VALIDATION FORM Page: 1 of 1

No.: VAL RES 09-2 Version: 1 Date: 96-11-23

Title: Validation data CEC-NH4OAc (METH 09-2) File:

1 TITLE or DESCRIPTION

Validation of cation exchange capacity determination with NH4OAc pH 7 method as described in VAL 09-2 dd. 96-09-19.

2 RESULTS

2.1 Bias (Accuracy): Result of calculation with Eq. (7.14) or (7.16) of Guidelines.

2.2 Precision

Repeatability: Result of calculation with Eq. (7.17) or (7.19).

Within-lab reproducibility: Result of calculation with Eq. (7.23) (if Control Charts are available).

2.3 Working range: Result of calculation as exemplified by Table 7-1 in Section 7.3.2 of Guidelines.

2.4 Ruggedness: Results of calculations with Eq. (7.26) or (7.29).

2.5 Interferences: In this case mainly drawn from the ruggedness test.

2.6 Practicability: Special equipment necessary: mechanical extractor. Substantial amounts of ethanol required. Washing procedures not always complete, particularly in high-clay samples, requiring a thorough check.

2.7 General observations:

Author: Sign.:

QA Officer (sign.): Date of Expiry:

METH 006 - Determination of nitrogen in soil with micro-Kjeldahl

LOGO STANDARD OPERATING PROCEDURE Page: 1 of 1

No.: METH 006 Version: 2 Date: 96-03-01

Title: Determination of nitrogen in soil with micro-Kjeldahl File:

1. SCOPE

This procedure describes the determination of nitrogen with the micro-Kjeldahl technique. It is supposed to include all soil nitrogen (including adsorbed NH4+) except that in nitrates.

2. RELATED DOCUMENTS

2.1 Normative references

The following standards contain provisions referred to in the text.

ISO 3696 Water for analytical laboratory use - Specification and test methods.
ISO 11464 Soil quality - Pretreatment of samples for physico-chemical analysis.

2.2 Related SOPs

F 001 Administration of SOPs

APP 066 Operation of Kjeltec 1009 digester

APP 067 Operation of ammonia distillation unit

APP 072 Operation of Autoburette ABU 13 and Titrator TTT 60 (facultative)

RF 008 Reagent Book

METH 002 Moisture content determination

3. PRINCIPLE

The micro-Kjeldahl procedure is followed. The sample is digested in sulphuric acid and hydrogen peroxide with selenium as catalyst, whereby organic nitrogen is converted to ammonium sulphate. The solution is then made alkaline and ammonia is distilled. The evolved ammonia is trapped in boric acid and titrated with standard acid.

4. APPARATUS AND GLASSWARE

4.1 Digester (Kjeldahl digestion tubes in heating block)
4.2 Steam-distillation unit (fitted to accept digestion tubes)
4.3 Burette 25 ml

5. REAGENTS

Use only reagents of analytical grade and deionized or distilled water (ISO 3696).


5.1 Sulphuric acid - selenium digestion mixture. Dissolve 3.5 g selenium powder in 1 L concentrated (96%, density 1.84 g/ml) sulphuric acid by mixing and heating at approx. 350°C on a hot plate. The dark colour of the suspension turns into clear light-yellow. When this is reached, continue heating for 2 hours.

5.2 Hydrogen peroxide, 30%.

5.3 Sodium hydroxide solution, 38%. Dissolve 1.90 kg NaOH pellets in 2 L water in a heavy-walled 5 L flask. Cool the solution with the flask stoppered to prevent absorption of atmospheric CO2. Make up the volume to 5 L with freshly boiled and cooled deionized water. Mix well.

5.4 Mixed indicator solution. Dissolve 0.13 g methyl red and 0.20 g bromocresol green in 200 ml ethanol.

5.5 Boric acid-indicator solution, 1%. Dissolve 10 g H3BO3 in 900 ml hot water, cool and add 20 ml mixed indicator solution. Make to 1 L with water and mix thoroughly.

5.6 Hydrochloric acid, 0.010 M standard. Dilute standard analytical concentrate ampoule according to instruction.


6. SAMPLE

Air-dry fine earth (<2 mm) obtained according to ISO 11464 (or refer to own procedure). Mill approx. 15 g of this material to pass a 0.25 mm sieve. Use part of this material for a moisture determination according to ISO 11465 and PROC 002.

7. PROCEDURE

7.1 Digestion

1. Weigh 1 g of sample (accuracy 0.01 g) into a digestion tube. Of soils rich in organic matter (>10%), 0.5 g is weighed in (see Remark 1). In each batch, include two blanks and a control sample.

2. Add 2.5 ml digestion mixture.

3. Add successively 3 aliquots of 1 ml hydrogen peroxide. The next aliquot can be added when frothing has subsided. If frothing is excessive, cool the tube in water.
Note: In Steps 2 and 3 use a measuring pipette with balloon or a dispensing pipette.

4. Place the tubes on the heater and heat for about 1 hour at moderate temperature (200°C).

5. Turn up the temperature to approx. 330°C (just below boiling temp.) and continue heating until the mixture is transparent (this should take about two hours).


6. Remove tubes from heater, allow to cool and add approx. 10 ml water with a wash bottle while swirling.

7.2 Distillation

1. Add 20 ml boric acid-indicator solution with measuring cylinder to a 250 ml beaker and place beaker on stand beneath the condenser tip.

2. Add 20 ml NaOH 38% with measuring cylinder to digestion tube and distil for about 7 minutes during which approx. 75 ml distillate is produced.

Note: the distillation time and amount of distillate may need to be increased for complete distillation (see Remark 2).

3. Remove beaker from distiller, rinse condenser tip, and titrate distillate with 0.01 M HCl until colour changes from green to pink.

Note: When using automatic titrator: set end-point pH at 4.60.

Remarks

1. The described procedure is suitable for soil samples with a nitrogen content of up to 10 mg N. This corresponds with a carbon content of roughly 10% C. Of soils with higher contents, less sample material is weighed in. Sample sizes of less than 250 mg should not be used because of sample bias.

2. The capacity of the procedure with respect to the amount of N that can be determined depends to a large extent on the efficiency of the distillation assembly. This efficiency can be checked, for instance, with a series of increasing amounts of (NH4)2SO4 or NH4Cl containing 0-50 mg N.

8. CALCULATION

N (%) = (a - b) × M × 1.4 × mcf / s

where

a = ml HCl required for titration of sample
b = ml HCl required for titration of blank
s = air-dry sample weight in gram
M = molarity of HCl
1.4 = 14 × 10⁻³ × 100% (14 = atomic weight of nitrogen)
mcf = moisture correction factor
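As an illustration, a minimal sketch of the calculation in clause 8, assuming the formula given above and using hypothetical titration figures:

```python
# Minimal sketch, assuming N(%) = (a - b) x M x 1.4 x mcf / s
# with symbols as defined above; the figures below are hypothetical.
def kjeldahl_n_percent(a_ml, b_ml, molarity, sample_g, mcf):
    """Total N (%) on a dry-sample basis from a micro-Kjeldahl titration."""
    return (a_ml - b_ml) * molarity * 1.4 * mcf / sample_g

print(round(kjeldahl_n_percent(a_ml=14.2, b_ml=0.1, molarity=0.010,
                               sample_g=1.000, mcf=1.02), 2))   # about 0.20% N
```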

9. VALIDATION PARAMETERS

9.1 Bias: -3.1% rel. (sample ISE 921, ¯x=2.80 g/kg N, n=5)

9.2 Within-lab reproducibility: RL = 2.8 × scc = 2.5% rel. (sample LABEX 38, ¯x = 2.59 g/kg N, n=30)

9.3 Method Detection Limit: 0.014 mg N or 0.0014% N

10. TEST REPORT


The report of analytical results shall contain the following information:

- the result(s) of the determination with identification of the corresponding sample(s);
- a reference to this SOP (if requested a brief outline such as given under clause 3: Principle);
- possible peculiarities observed during the test;
- all operations not mentioned in the SOP that can have affected the results.

11. REFERENCES

Hesse, P.R. (1971) A textbook of soil chemical analysis. John Murray, London.

Bremner, J.M. and C.S. Mulvaney (1982) Nitrogen - Total. In: Page, A.L., R.H. Miller & D.R. Keeney (eds.) Methods of soil analysis. Part 2. Chemical and microbiological properties, 2nd ed. Agronomy Series 9, ASA, SSSA, Madison.

ISO 11261 Soil quality - Determination of total nitrogen - Modified Kjeldahl method.