
Introduction to Error Analysis for the Physical Chemistry Laboratory

January 16, 2012

1 Introduction

In the physical chemistry laboratory you will make a variety of measurements, and then manipulate them to arrive at a numerical value for a physical property. However, without an estimate of the error of these numbers they are largely useless. Some published numbers in physics and chemistry are accurate to 10 significant figures or more, while others are only accurate to an order of magnitude (no significant figures!). The estimated error of a published number is a crucial piece of information that must be calculated.

This handout provides a practical introduction to the error analysis required in a typical physical chemistry lab. Error in an experiment is classified into two types: random error and systematic error. Random error arises from unpredictable fluctuations in taking a measurement and is the subject of this handout. Systematic error arises from errors in the equipment or procedure which cause the number measured to be consistently above or below the real number. This type of error includes simplifications in your model, biased instrumentation, impure reagents, etc. Random error measures the precision of the experiment, or the reproducibility of a given result. Systematic error measures the accuracy of a result, or how close a result is to the true value. Random error can be decreased by increasing the number of measurements you take; systematic error cannot. The distinction between accuracy and precision is illustrated schematically in figure 1.

Figure 1: Schematic illustration of accuracy and precision. The left-hand target represents a high precision but low accuracy experiment. The right-hand target represents a low precision but high accuracy experiment.

2 What is uncertainty?

Most chemists have an intuitive idea of what uncertainty is, but it is instructive to give a more rigorous definition. Suppose you perform an experiment to determine the boiling point of a liquid, and you measure 32 degrees. How confident are you in this number? This question could in principle be answered by repeating the experiment many times and collecting the results. If you took this large collection of results and counted all of the values that lie within specified intervals (e.g. between 30 and 30.1, 30.1 and 30.2, etc.) you could make a histogram plot (figure 2). You can see that the results are spread over a range of values. The width of this spread can be quantified using the standard deviation (σ) of the distribution, which is defined as

$$ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (f_i - \bar{f})^2}{N - 1}} \qquad (1) $$

where fi are the results of your individual experiments, f̄ is the average of your results, and N is the number of trials performed. The standard deviation gives limits above and below the average within which subsequent experimental results will probably lie. For normally distributed results, a measurement will lie within ±σ of the average about 68.3% of the time and within ±2σ of the average about 95.4% of the time.

The spread of a distribution can be used to determine the experimental uncertainty in your measurement. When a measurement is repeated many times, the best estimate of the true value is taken to be the average of the values obtained for the many different measurements. The error in the average is then the standard deviation of the mean, which is defined as

$$ \sigma_{\bar{f}} = \frac{\sigma}{\sqrt{N}} \qquad (2) $$

where σ is the standard deviation of the measurement and N is the number of measurements that were taken. Notice that the standard deviation of the mean decreases as the number of points taken increases (though it does so slowly, as 1/√N). Thus, for random error, increasing the number of points taken decreases the error in the mean.

Figure 2: Distribution of a series of 1000 boiling point experiment results. The uncertainty in a single trial is related to the width of the distribution, and is called the standard deviation of the distribution (σ). [Histogram of counts versus boiling point in degrees C.]

Most often in the physical chemistry lab you will only perform an experiment one or two times. In this case it is not possible to calculate the standard deviation of a large number of trials directly. Instead, the theoretical error can be used. The theoretical error is a measure of the error associated with the typical use of the glassware and equipment. For example, an instrument such as a volumetric flask will state its uncertainty in its technical specifications. If not, you can make an educated estimate of the uncertainty. If an instrument gives a digital reading, you can generally take the uncertainty to be half of the last decimal place. For example, if a digital thermometer reads 25.4 degrees, the uncertainty is 0.05 degrees. For analog instruments, first read the measurement to as many significant figures as there are marks on the gauge, and then estimate one more significant figure. The uncertainty should then be estimated based on how confident you are in the estimated significant figure. For example, if a mercury thermometer has marks at every degree, you would read the number of degrees, and estimate the tenths of a degree. The uncertainty might be plus or minus 0.2 degrees (e.g. 25.4 ± 0.2 degrees). Typically the standard deviation of the mean is larger than the theoretical error; however, if both are available, the larger of the two numbers should be used.
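Equations 1 and 2 can be checked numerically with a few lines of Python. The sketch below is only an illustration: the five boiling point readings are made-up values, not data from this handout.

```python
import math
import statistics

# Hypothetical repeated boiling point readings in degrees C (illustrative only).
readings = [31.8, 32.1, 32.4, 31.9, 32.3]

n = len(readings)
mean = statistics.mean(readings)

# Equation 1: standard deviation with N - 1 in the denominator.
sigma = math.sqrt(sum((f - mean) ** 2 for f in readings) / (n - 1))

# Equation 2: standard deviation of the mean.
sigma_mean = sigma / math.sqrt(n)

print(f"mean = {mean:.2f}, sigma = {sigma:.2f}, sigma_mean = {sigma_mean:.2f}")
```

The hand-written sum should agree with `statistics.stdev(readings)`, which also uses the N − 1 denominator.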

3 Error propagation

Often, measurements of several different quantities are used to determine a final desired value. For example, to determine the number of moles in a gas sample using the ideal gas law, we would measure the pressure, the volume, and the temperature. Each of these individual measurements would have both an experimental and a theoretical error, which would need to be considered when determining our error in the number of moles, n. To determine our error in n we need to propagate the errors in the individual measurements that led to the final result. To begin our discussion of error propagation, consider an experiment that measures some quantity x. The result we are looking for is some function f(x). The measurement of x is subject to some uncertainty bounds, and the most general case does not assume symmetric uncertainties above and below x. In this case, the true value of x lies within the range

$$ x_0 - \sigma_- < x < x_0 + \sigma_+ \qquad (3) $$

where x0 is the measured value of x, σ+ is the uncertainty above x0, and σ− is the uncertainty below x0. If f increases monotonically over this range, the desired property f is then within the range

$$ f(x_0 - \sigma_-) < f(x) < f(x_0 + \sigma_+) \qquad (4) $$

If the uncertainty in x is assumed to be small, the uncertainty in f becomes

$$ \sigma_f = \left| \frac{df}{dx} \right|_{x_0} \sigma_x \qquad (5) $$

where σx is the uncertainty in x, which we now take to be random, i.e. symmetric. This error can be the standard deviation of the mean or the theoretical error in x, depending on what is available. If we have a function of many variables, and if the errors are both small and independent, the uncertainty is

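$$ \sigma_f = \sqrt{ \left( \frac{\partial f}{\partial x} \right)^2_{x_0} \sigma_x^2 + \left( \frac{\partial f}{\partial y} \right)^2_{y_0} \sigma_y^2 + \cdots } \qquad (6) $$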
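When the partial derivatives in equation 6 are tedious to work out by hand, they can be estimated numerically. The following is a minimal sketch, not a prescribed method: the helper name `propagate`, the central-difference step size, and the ideal gas numbers are all assumptions made for illustration.

```python
import math

def propagate(f, values, sigmas, rel_step=1e-6):
    """Estimate sigma_f from equation 6 using central-difference derivatives."""
    variance = 0.0
    for i, (v, s) in enumerate(zip(values, sigmas)):
        h = rel_step * (abs(v) if v != 0 else 1.0)  # small step in variable i
        up = list(values)
        down = list(values)
        up[i] = v + h
        down[i] = v - h
        dfdx = (f(*up) - f(*down)) / (2 * h)  # partial derivative at the measured point
        variance += (dfdx * s) ** 2
    return math.sqrt(variance)

R = 0.082057  # gas constant in L atm / (mol K)

def moles(P, V, T):
    """Ideal gas law solved for n."""
    return P * V / (R * T)

# Hypothetical measurements: P in atm, V in L, T in K, each with an assumed uncertainty.
sigma_n = propagate(moles, [1.00, 0.500, 298.0], [0.01, 0.005, 0.5])
print(f"n = {moles(1.00, 0.500, 298.0):.5f} mol, sigma_n = {sigma_n:.5f} mol")
```

Because equation 6 assumes small, independent errors, this sketch inherits the same assumptions.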

We can derive some special cases from equation 6. If a function only contains addition and subtraction operations, the uncertainty is

$$ \sigma_f = \sqrt{\sigma_x^2 + \sigma_y^2 + \cdots} \qquad (7) $$

If a function only contains multiplication and division operations, the uncertainty is

$$ \sigma_f = |f| \times \sqrt{ \left( \frac{\sigma_x}{x_0} \right)^2 + \left( \frac{\sigma_y}{y_0} \right)^2 + \cdots } \qquad (8) $$

Now that we are equipped with these formulas, we can proceed to propagate our individual uncertainties. We will illustrate this procedure with an example. Suppose you want to measure the molar heat of solvation of LiCl in water. This involves (1) weighing an amount of LiCl, (2) measuring a volume of water, and (3) measuring the temperature change when the reagent is dissolved. We will first calculate the heat of solvation (H) itself, which is expressed in terms of our three measurements:

$$ H = -\frac{C \times V \times T}{m/M} \qquad (9) $$

Here m is the mass of LiCl, M is the molecular weight, C is the heat capacity of water per unit volume, V is the volume of water, and T is the temperature change. Note that the function H only contains multiplication/division operations, so we can use the error propagation rule for multiplication and division (equation 8). The variables we need to consider are m, V, and T. We do not include M and C in our list of variables, because they are assumed to be known to much higher (relative) precision than m, V and T. If they were not, we would have to include them in our error analysis, even if we didn’t measure them. Plugging our variables into equation 8, we have

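$$ \sigma_H = |H| \sqrt{ \left( \frac{\sigma_m}{m} \right)^2 + \left( \frac{\sigma_V}{V} \right)^2 + \left( \frac{\sigma_T}{T} \right)^2 } \qquad (10) $$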

Problem 1 Calculate the heat of solvation of LiCl and its associated uncertainty as discussed above, if the mass is 2.1 ± 0.05 g, the molecular weight is 42.394 ± 0.0005 g/mol, the volume is 0.10 ± 0.02 L, the temperature change is 4.0 ± 0.5 K, and the heat capacity per volume is exactly 4.184 kJ/(L·K). Don’t include the error in the molecular weight in your calculation.

(Answer: H = −33.79 kJ/mol, σH = 8.01 kJ/mol, so you would report the heat of solvation as −34 ± 8 kJ/mol.)
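The arithmetic of Problem 1 can be reproduced with a short script. This is a sketch that simply evaluates equations 9 and 10 with the numbers given in the problem; the variable names are chosen here for readability.

```python
import math

C = 4.184                 # kJ/(L K), heat capacity per volume (taken as exact)
M = 42.394                # g/mol, molecular weight of LiCl (error neglected)
m, sigma_m = 2.1, 0.05    # g
V, sigma_V = 0.10, 0.02   # L
T, sigma_T = 4.0, 0.5     # K, temperature change

# Equation 9: heat of solvation.
H = -C * V * T / (m / M)

# Equation 10: relative uncertainties added in quadrature, scaled by |H|.
relative = math.sqrt((sigma_m / m) ** 2 + (sigma_V / V) ** 2 + (sigma_T / T) ** 2)
sigma_H = abs(H) * relative

print(f"H = {H:.2f} kJ/mol, sigma_H = {sigma_H:.2f} kJ/mol")
# prints: H = -33.79 kJ/mol, sigma_H = 8.01 kJ/mol
```

Printing the three squared relative terms separately shows that the volume measurement dominates the final uncertainty.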

Problem 2 Perform the same calculation as in the last problem, but this time include the molecular weight in your list of error propagation variables.

(Answer: You should get the same answer as before, to at least 6 significant figures! This is why it is often possible to ignore variables known to high precision in your error propagation.)
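To see why the molecular weight can be ignored, it helps to compare the quadrature sum with and without its term. The sketch below uses the values from Problem 1; the dictionary layout is just one convenient way to organize the terms.

```python
import math

# Relative uncertainties (sigma / value) for each variable in Problem 1.
relative_errors = {
    "m": 0.05 / 2.1,
    "V": 0.02 / 0.10,
    "T": 0.5 / 4.0,
    "M": 0.0005 / 42.394,
}

without_M = math.sqrt(sum(r ** 2 for name, r in relative_errors.items() if name != "M"))
with_M = math.sqrt(sum(r ** 2 for r in relative_errors.values()))

# Fractional change in the propagated uncertainty caused by including M.
relative_change = (with_M - without_M) / without_M

for name, r in relative_errors.items():
    print(f"({name}) squared relative error = {r ** 2:.3e}")
print(f"fractional change from including M = {relative_change:.1e}")
```

The squared term for M is many orders of magnitude smaller than the others, so it disappears in the quadrature sum.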

Sometimes a function may contain both addition/subtraction and multiplication/division, in which case the two rules can be combined. The easiest way to do this is to break the calculation into steps. Going back to the heat of solvation experiment, suppose that two separate masses were weighed, and then both masses were added to the solvent. Now the total mass is m = m1 + m2, and the equation for the heat of solvation becomes

$$ H = -\frac{C \times V \times T}{(m_1 + m_2)/M} \qquad (11) $$

The first step is to calculate the uncertainty in m = m1 + m2 using the error propagation rule for addition/subtraction, i.e.

$$ \sigma_m = \sqrt{\sigma_{m_1}^2 + \sigma_{m_2}^2} \qquad (12) $$

Then simply use the total mass m and its calculated uncertainty, and proceed as in Problem 1.

Problem 3 Calculate the uncertainty for the previous example if the two weights were 0.5 ± 0.05 g and 1.1 ± 0.05 g, the volume is again 0.1 ± 0.02 L, and the temperature change is 3 ± 0.5 K.

(Answer: σm = 0.07 g, H = −33.26 kJ/mol, and σH = 8.78 kJ/mol. Report as H = −33 ± 9 kJ/mol.)
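The two-step procedure for Problem 3 can be sketched as follows, using the values from the problem statement and the heat capacity and molecular weight from Problem 1. With the sign convention of equation 11 and a positive temperature change, H comes out negative.

```python
import math

C = 4.184     # kJ/(L K), taken as exact
M = 42.394    # g/mol, error neglected
m1, sigma_m1 = 0.5, 0.05   # g
m2, sigma_m2 = 1.1, 0.05   # g
V, sigma_V = 0.1, 0.02     # L
T, sigma_T = 3.0, 0.5      # K, temperature change

# Step 1: addition rule (equation 12) for the total mass.
m = m1 + m2
sigma_m = math.sqrt(sigma_m1 ** 2 + sigma_m2 ** 2)

# Step 2: multiplication/division rule (equation 8) for H (equation 11).
H = -C * V * T / (m / M)
sigma_H = abs(H) * math.sqrt(
    (sigma_m / m) ** 2 + (sigma_V / V) ** 2 + (sigma_T / T) ** 2
)

print(f"sigma_m = {sigma_m:.2f} g, H = {H:.2f} kJ/mol, sigma_H = {sigma_H:.2f} kJ/mol")
# prints: sigma_m = 0.07 g, H = -33.26 kJ/mol, sigma_H = 8.78 kJ/mol
```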

Almost all of the error propagation you will do in the physical chemistry lab will only require the rules for addition/subtraction and multiplication/division. However, occasionally you might come across a more complicated function, in which case we need to use equation 6. For example, suppose you have determined ΔG for a reaction and are interested in calculating the equilibrium constant:

$$ K = \exp\left( \frac{-\Delta G}{RT} \right) \qquad (13) $$

Assuming that R and T are known to high precision, we only need to calculate the partial derivative of K with respect to ΔG:

$$ \frac{\partial K}{\partial \Delta G} = \frac{-1}{RT} \exp\left( \frac{-\Delta G}{RT} \right) \qquad (14) $$

and using equation 6, the uncertainty in K is

$$ \sigma_K = \frac{1}{RT} \exp\left( \frac{-\Delta G}{RT} \right) \sigma_{\Delta G} \qquad (15) $$
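A numerical illustration of equation 15 is given below. The values of ΔG, its uncertainty, and the temperature are hypothetical and chosen only to show the calculation. The sketch also evaluates K directly at ΔG ± σΔG to show how well the small-error assumption behind equation 6 holds.

```python
import math

R = 8.314462618e-3          # gas constant in kJ/(mol K)
T = 298.15                  # K, assumed temperature
dG, sigma_dG = -10.0, 0.5   # kJ/mol, hypothetical value and uncertainty

# Equation 13: equilibrium constant.
K = math.exp(-dG / (R * T))

# Equation 15: propagated uncertainty, equivalent to K * sigma_dG / (R * T).
sigma_K = K / (R * T) * sigma_dG

# Direct evaluation at the ends of the uncertainty range in dG.
K_low = math.exp(-(dG + sigma_dG) / (R * T))
K_high = math.exp(-(dG - sigma_dG) / (R * T))

print(f"K = {K:.1f}, sigma_K = {sigma_K:.1f}")
print(f"direct range: {K_low:.1f} to {K_high:.1f}")
```

Because K depends exponentially on ΔG, the directly evaluated range is not symmetric about K. Equation 15 is a good approximation only when σΔG is small compared to RT.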

Finally, for the small number of data points typically taken in the lab (N ≲ 20) it turns out that the standard deviation of the mean is not a good estimate of the error of the mean value. There are several ways to correct the standard deviation of the mean when only a few points are taken, but the most straightforward technique is to report a confidence interval instead of σf̄. A confidence interval is a range that is likely, at a stated confidence level, to contain the correct value. For example, if a range is reported to be the 95% confidence interval, there is a 95% chance that the true number will lie within that interval. The confidence interval has the advantage that it is not biased by the number of data points taken and therefore is a good estimate of the error when the number of points taken is small. The confidence interval is given by

$$ \mu = \bar{x} \pm \frac{\sigma}{\sqrt{N}} \, t = \bar{x} \pm \Delta \qquad (16) $$

where x̄ is the mean value, σ is the standard deviation, N is the number of points taken, and t is the t value for N − 1 degrees of freedom at the desired (usually 95%) confidence level. Note that this conversion is done after any error propagation and is not well defined for values calculated from many measurements where a different number of trials is taken for each measurement. In these cases usually the standard deviation of the mean is reported. It is up to you to decide what the correct value for the error should be and to clearly explain how your error was estimated.
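Equation 16 can be sketched in Python as shown below. The function name, the lookup table layout, and the sample data are assumptions for illustration. The t values are the standard two-sided 95% Student t values for 1 to 10 degrees of freedom; for other cases consult a full t table.

```python
import math
import statistics

# Two-sided 95% Student t values, keyed by degrees of freedom (N - 1).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def confidence_interval_95(data):
    """Return (mean, delta) so that the 95% interval is mean +/- delta."""
    n = len(data)
    if n - 1 not in T_95:
        raise ValueError("this small table only covers 2 to 11 data points")
    mean = statistics.mean(data)
    sigma = statistics.stdev(data)               # equation 1
    delta = T_95[n - 1] * sigma / math.sqrt(n)   # equation 16
    return mean, delta

# Hypothetical boiling point readings in degrees C (illustrative only).
mean, delta = confidence_interval_95([31.8, 32.1, 32.4, 31.9, 32.3])
print(f"{mean:.1f} +/- {delta:.1f} degrees C (95% confidence)")
```

For five points the t value is 2.776, so the reported interval is almost three times wider than the standard deviation of the mean alone.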


4 Graphs

In many experiments you will be required to calculate the slope or intercept of a linear function. For example, the concentration of reagent in a unimolecular reaction obeys an exponential rate law

c(t) = A exp(−kt) (17)

where c is the concentration of reagent, k is the rate constant, t is time, and A is a constant. If you want to calculate k, you will need values for c at different times. Taking the natural logarithm of equation 17 gives

ln c = −kt + ln A (18)

which is a linear equation with the familiar form y = mx + b. In this case, what interests us is the slope of ln c as a function of t, which is the negative of our desired rate constant (m = −k).

Using two data points, (x1, y1) and (x2, y2), the slope and intercept can be calculated directly:

$$ m = \frac{y_2 - y_1}{x_2 - x_1} \qquad (19) $$

b = y1 −mx1 (20)

and the uncertainties can be calculated using error propagation. If the error in x is negligible compared to the error in y, you can calculate the maximum and minimum values for the slope and intercept as illustrated in figure 3. Drawing a line through the upper bound of y1 and the lower bound for y2 gives a lower limit to the slope and an upper limit to the intercept. Similarly, drawing a line through the lower bound of y1 and the upper bound of y2 gives an upper limit to the slope and a lower limit for the intercept.
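The bounding-line construction can be written out as a short sketch. The two data points and their y uncertainties are hypothetical, and the code assumes x1 < x2 with negligible error in x.

```python
import math

def two_point_line(x1, y1, x2, y2):
    """Slope and intercept from equations 19 and 20."""
    m = (y2 - y1) / (x2 - x1)
    b = y1 - m * x1
    return m, b

# Hypothetical points (x, y, uncertainty in y).
x1, y1, s1 = 5.0, 30.0, 2.0
x2, y2, s2 = 10.0, 50.0, 2.0

m, b = two_point_line(x1, y1, x2, y2)

# Upper bound of y1 with lower bound of y2: smallest slope, largest intercept.
m_min, b_max = two_point_line(x1, y1 + s1, x2, y2 - s2)

# Lower bound of y1 with upper bound of y2: largest slope, smallest intercept.
m_max, b_min = two_point_line(x1, y1 - s1, x2, y2 + s2)

# Alternative: propagate the y errors through equation 19 (addition rule, then a constant divisor).
sigma_m = math.sqrt(s1 ** 2 + s2 ** 2) / abs(x2 - x1)

print(f"slope {m:.2f}, bounds {m_min:.2f} to {m_max:.2f}, propagated sigma {sigma_m:.2f}")
print(f"intercept {b:.2f}, bounds {b_min:.2f} to {b_max:.2f}")
```

The bounding lines give a more conservative (wider) range than the quadrature estimate, because they assume both errors act in the worst direction at once.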

If you have many data points, the “average” slope and intercept can be calculated using linear regression. In practice this is always calculated using a computer program, so the details of the procedure do not concern us. You simply need to become acquainted with a program capable of performing a linear regression with error analysis. There is a handout on the chemistry 4581/4591 web site to help you with this. The result of the calculation will give a value and an uncertainty for the slope and intercept.
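For readers curious about what such a program does, here is a minimal sketch of an unweighted least-squares fit with standard errors for the slope and intercept. It is not the procedure from the course handout; the function name and the kinetics data are hypothetical, and all points are assumed to have equal uncertainty in y.

```python
import math

def linear_fit(xs, ys):
    """Unweighted least-squares line with standard errors of slope and intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Variance of the residuals, with N - 2 degrees of freedom.
    residual_var = sum(
        (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)
    ) / (n - 2)
    sigma_slope = math.sqrt(residual_var / sxx)
    sigma_intercept = math.sqrt(residual_var * (1 / n + x_mean ** 2 / sxx))
    return slope, intercept, sigma_slope, sigma_intercept

# Hypothetical first-order kinetics data: time in s, concentration in mol/L.
times = [0.0, 10.0, 20.0, 30.0, 40.0]
conc = [1.00, 0.61, 0.36, 0.22, 0.14]
ln_c = [math.log(c) for c in conc]

slope, intercept, sigma_slope, sigma_intercept = linear_fit(times, ln_c)
k, sigma_k = -slope, sigma_slope   # equation 18: the slope of ln c versus t is -k
print(f"k = {k:.4f} +/- {sigma_k:.4f} 1/s")
```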

5 Significant Figures

All reported values must be given with the correct number of significant figures. It is assumed that you have learned the basics of assigning and reporting significant figures in your general chemistry lab. However, when reporting uncertainties, you must also consider what numbers you are certain of given the error you determined from your error analysis. Errors are generally reported to one significant figure. For example, if you calculate your error as 0.83, you are uncertain about the 8 so you are even less certain about the 3. The error then should be rounded to 0.8, a reasonable measure of how sure you are of your answer. The one exception is if the one significant figure you would keep in your error is a one, in which case you typically would keep two significant figures. For example, if your error is calculated to be 0.132, you would report it as 0.13. This is because the number following the one is a large enough fraction of the error that it should be reported.

Figure 3: How to determine the uncertainty in the slope and intercept of two data points. The bars indicate the uncertainty in the y variable, and the dashed lines give upper and lower bounds for the line. [Plot of y versus x.]

When reporting a number for which you have calculated the uncertainty, only report your number to the same decimal place as your error. For example, if you calculate that the heat capacity of something is 5.86 kJ/(mol·K) but you calculate an error of 0.4 kJ/(mol·K), you should report the value for the heat capacity as 5.9 ± 0.4 kJ/(mol·K), even if your significant figure rules suggest you should report all three digits. The error in your calculation is large enough that you cannot be certain of the 6 in this example because you are not even completely certain of the 8. So by reporting the 6 you would be claiming to have more information about your heat capacity than you possibly could have given your level of error.
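The rounding rules of this section can be expressed as a small helper. This is one possible sketch; the function name is hypothetical, and it reads the leading digit of the error from its scientific-notation string to avoid floating point surprises.

```python
def round_with_error(value, error):
    """Round error to 1 significant figure (2 if it starts with 1), value to match."""
    if error <= 0:
        raise ValueError("error must be positive")
    # Scientific notation, e.g. 0.132 -> '1.320000e-01'.
    mantissa, exponent_text = f"{error:e}".split("e")
    exponent = int(exponent_text)            # power of ten of the leading digit
    sig_figs = 2 if mantissa[0] == "1" else 1
    decimals = sig_figs - 1 - exponent       # decimal places to keep
    return round(value, decimals), round(error, decimals)

print(round_with_error(5.86, 0.4))       # heat capacity example from the text
print(round_with_error(12.345, 0.132))   # leading digit 1: keep two figures
print(round_with_error(-33.79, 8.01))    # Problem 1
```

An error such as 0.96 rounds up to 1.0 under this sketch; whether to then keep a second figure is a judgment call that the helper does not make.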

Further reading

[1] P. R. Bevington and D. K. Robinson, Data Reduction and Error Analysis for the Physical Sciences, McGraw-Hill, 2002.

[2] E. B. Wilson Jr., An Introduction to Scientific Research, Dover, 1990.


[3] J. R. Taylor, An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements, University Science Books, 1997.

[4] D. P. Shoemaker, C. W. Garland, and J. W. Nibler, Experiments in Physical Chemistry, McGraw-Hill, 1996.
