Physics 114: Lecture 16, Linear and Non-Linear Fitting
Dale E. Gary
NJIT Physics Department
Mar 29, 2010
Reminder of Previous Results
Last time we showed that it is possible to fit a straight line of the form

y(x) = a + b x

to any data by minimizing chi-square:

\chi^2 = \sum_i \left( \frac{y_i - y(x_i)}{\sigma_i} \right)^2 = \sum_i \frac{1}{\sigma_i^2} \left( y_i - a - b x_i \right)^2

We found that we could solve for the parameters a and b that minimize the difference between the fitted line and the data (with errors \sigma_i) as:

a = \frac{1}{\Delta} \left( \sum \frac{x_i^2}{\sigma_i^2} \sum \frac{y_i}{\sigma_i^2} - \sum \frac{x_i}{\sigma_i^2} \sum \frac{x_i y_i}{\sigma_i^2} \right)

b = \frac{1}{\Delta} \left( \sum \frac{1}{\sigma_i^2} \sum \frac{x_i y_i}{\sigma_i^2} - \sum \frac{x_i}{\sigma_i^2} \sum \frac{y_i}{\sigma_i^2} \right)

where

\Delta = \sum \frac{1}{\sigma_i^2} \sum \frac{x_i^2}{\sigma_i^2} - \left( \sum \frac{x_i}{\sigma_i^2} \right)^2
Note: if the errors \sigma_i are all the same, they cancel from these ratios, so they can be ignored, and each sum \sum 1/\sigma_i^2 simply becomes N.
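These sums are easy to code directly. A minimal sketch of the equal-error case in Python (the helper name linear_fit is mine, for illustration only; the course itself uses MatLAB):

```python
# Least-squares straight-line fit y = a + b*x with equal errors.
# With a common sigma, every 1/sigma^2 factor cancels and
# Sum(1/sigma_i^2) becomes N, leaving only sums over x and y.

def linear_fit(x, y):
    """Return (a, b) minimizing sum((y_i - a - b*x_i)^2)."""
    n = len(x)
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    delta = n * sxx - sx * sx          # Delta with sigma = 1
    a = (sxx * sy - sx * sxy) / delta  # intercept
    b = (n * sxy - sx * sy) / delta    # slope
    return a, b

# Noiseless points on y = 3 + 2x are recovered exactly:
x = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [3 + 2 * xi for xi in x]
a, b = linear_fit(x, y)                # a = 3.0, b = 2.0
```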
MatLAB Commands for Linear Fits
MatLAB has a "low level" routine called polyfit() that can be used to fit a linear function to a set of points (it assumes all errors are equal):

x = 0:0.1:2.5;
y = randn(1,26)*0.3 + 3 + 2*x;   % Makes a slightly noisy linear set of points y = 3 + 2x
p = polyfit(x,y,1)               % Fits straight line and returns values of b and a in p
p = 2.0661  2.8803               % in this case, the fit equation is y = 2.8803 + 2.0661x

Plot the points, and overplot the fit using the polyval() function:

plot(x,y,'.'); hold on
plot(x,polyval(p,x),'r')

Here is the result. The points have a scatter of \sigma = 0.3 around the fit, as we specified above.
[Figure: "Points and Fit": data points (dots) and fitted line (red), y vs. x]
Fits with Unequal Errors
MatLAB has another routine called glmfit() (generalized linear model regression) that can be used to specify weights (unequal errors). Say the errors are proportional to the square root of y (like Poisson):

err = sqrt(y)/3;
hold off
errorbar(x,y,err,'.')   % Plots y vs. x, with error bars

Now calculate the fit using glmfit():

p = glmfit(x,y)                          % Does the same as polyfit(x,y,1) (assumes equal errors)
p = glmfit(x,y,'normal','weights',err)   % Allows specification of errors (weights), but must
                                         % include the 'normal' distribution type
p = circshift(p,1)   % Unfortunately, glmfit returns a,b instead of b,a as polyval() wants;
                     % circshift() has the effect of swapping the order of the elements of p
hold on
plot(x,polyval(p,x),'r')
[Figure: "Points with Errors, and Fit": y vs. x with error bars and the fitted line]
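glmfit() is MatLAB-specific, but the weighted expressions for a, b, and \Delta from the opening slide can be applied directly. A hedged Python sketch (the helper name weighted_linear_fit is my own, for illustration only):

```python
def weighted_linear_fit(x, y, sig):
    """Return (a, b) minimizing chi^2 = sum(((y_i - a - b*x_i)/sig_i)^2)."""
    w = [1.0 / s**2 for s in sig]  # weights 1/sigma_i^2
    s1  = sum(w)
    sx  = sum(wi * xi for wi, xi in zip(w, x))
    sy  = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = s1 * sxx - sx * sx               # the Delta of the fit formulas
    a = (sxx * sy - sx * sxy) / delta        # intercept
    b = (s1 * sxy - sx * sy) / delta         # slope
    return a, b

# With all sigma_i equal, the weights cancel and the unweighted fit is reproduced:
x = [0.0, 1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0, 9.0]        # exactly y = 3 + 2x
a, b = weighted_linear_fit(x, y, [0.3] * 4)  # a = 3.0, b = 2.0
```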
Error Estimation
We saw that when the errors of each point in the plot are the same, they cancel from the equations for the fit parameters a and b. If we do not, in fact, know the errors, but we believe that a linear fit is a correct fit to the points, then we can use the fit itself to estimate those errors.

First, consider the residuals (the deviations of the points from the fit, y_i - y(x_i)), calculated by

resid = y - polyval(p,x)
plot(x,resid)

You can see that this is a random distribution with zero mean, as it should be. As usual, you can calculate the variance, where now there are two fewer degrees of freedom (m = 2) because we used the data to determine a and b.

Indeed, typing std(resid) gives 0.2998, which is close to the 0.3 we used.

Once we have the fitted line, either using individual weighting or assuming uniform errors, how do we know how good the fit is? That is, what are the errors in the determination of the parameters a and b?
[Figure: "Residuals vs. x": residuals scattered about zero]
s^2 = \frac{1}{N - m} \sum \left( y_i - y(x_i) \right)^2 = \frac{1}{N - 2} \sum \left( y_i - a - b x_i \right)^2
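The variance estimate above can be sketched in Python (the helper name residual_std is hypothetical, illustrating the N - m degrees-of-freedom correction):

```python
import math

def residual_std(x, y, a, b):
    """Estimate the data error from fit residuals, with N - 2 dof."""
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    n = len(x)
    s2 = sum(r * r for r in resid) / (n - 2)  # two dof used up by a and b
    return math.sqrt(s2)

# Contrived check: residuals about the line y = 0 are exactly +/-1,
# so s^2 = 4 / (4 - 2) = 2 and s = sqrt(2).
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, -1.0, 1.0, -1.0]
s = residual_std(x, y, 0.0, 0.0)
```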
Chi-Square Probability
Recall that the value of \chi^2 is

\chi^2 = \sum_i \left( \frac{y_i - y(x_i)}{\sigma_i} \right)^2 = \sum_i \frac{1}{\sigma_i^2} \left( y_i - a - b x_i \right)^2

This should be about equal to the number of degrees of freedom, \nu = N - 2 = 24 in this case. Since \sigma_i is a constant 0.3, we can bring it out of the summation and calculate

sum(resid.^2)/0.3^2
ans = 24.9594

As we mentioned last time, it is often easier to consider the reduced chi-square,

\chi_\nu^2 = \chi^2 / \nu

which is about unity for a good fit. In this case, \chi_\nu^2 = 24.9594/24 = 1.04. If we look this value up in Table C.4 of the text, we find that P ~ 0.4, which means that if we repeated the experiment multiple times, about 40% would be expected to have a larger chi-square.
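The reduced chi-square can be computed as follows (a Python illustration; the function name is mine, not from the lecture):

```python
def reduced_chi_square(x, y, sig, a, b):
    """chi^2 per degree of freedom for a straight-line fit (nu = N - 2)."""
    chi2 = sum(((yi - (a + b * xi)) / si) ** 2
               for xi, yi, si in zip(x, y, sig))
    nu = len(x) - 2
    return chi2 / nu

# Contrived check: each point lies exactly 1 sigma above the line y = 1 + x,
# so chi^2 = 4 with nu = 2, giving chi^2_nu = 2.0.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.3, 2.3, 3.3, 4.3]
chi2_nu = reduced_chi_square(x, y, [0.3] * 4, 1.0, 1.0)
```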
Uncertainties in the Parameters
We started out searching for a linear fit, y(x) = a + b x, where a and b are the parameters of the fit. What are the uncertainties in these parameters?

We saw in Lecture 14 that errors due to a series of measurements propagate to the result for, e.g., a, according to

\sigma_a^2 = \sum_i \sigma_i^2 \left( \frac{\partial a}{\partial y_i} \right)^2

Since we have the expressions for a and b as

a = \frac{1}{\Delta} \left( \sum \frac{x_i^2}{\sigma_i^2} \sum \frac{y_i}{\sigma_i^2} - \sum \frac{x_i}{\sigma_i^2} \sum \frac{x_i y_i}{\sigma_i^2} \right),
\quad
b = \frac{1}{\Delta} \left( \sum \frac{1}{\sigma_i^2} \sum \frac{x_i y_i}{\sigma_i^2} - \sum \frac{x_i}{\sigma_i^2} \sum \frac{y_i}{\sigma_i^2} \right),

the partial derivatives are (note that \Delta is independent of y_j):

\frac{\partial a}{\partial y_j} = \frac{1}{\sigma_j^2 \Delta} \left( \sum \frac{x_i^2}{\sigma_i^2} - x_j \sum \frac{x_i}{\sigma_i^2} \right),
\quad
\frac{\partial b}{\partial y_j} = \frac{1}{\sigma_j^2 \Delta} \left( x_j \sum \frac{1}{\sigma_i^2} - \sum \frac{x_i}{\sigma_i^2} \right).
Uncertainties in the Parameters
Inserting these into the expressions

\sigma_a^2 = \sum_i \sigma_i^2 \left( \frac{\partial a}{\partial y_i} \right)^2,
\quad
\sigma_b^2 = \sum_i \sigma_i^2 \left( \frac{\partial b}{\partial y_i} \right)^2,

after some algebra (see text page 109), we have

\sigma_a^2 = \frac{1}{\Delta} \sum \frac{x_i^2}{\sigma_i^2},
\quad
\sigma_b^2 = \frac{1}{\Delta} \sum \frac{1}{\sigma_i^2}.

In the case of common errors \sigma_i = \sigma we have

\sigma_a^2 = \frac{\sigma^2 \sum x_i^2}{N \sum x_i^2 - \left( \sum x_i \right)^2},
\quad
\sigma_b^2 = \frac{N \sigma^2}{N \sum x_i^2 - \left( \sum x_i \right)^2}.

For our example, this is calculated as

del = 26*sum(x.^2) - sum(x)^2
siga = sum(x.^2)*0.3^2/del   % gives 0.0131
sigb = 26*0.3^2/del          % gives 0.0062
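As a check on the common-\sigma formulas, the numbers from the MatLAB snippet can be reproduced in Python (an illustration only, assuming the same setup: N = 26 points at x = 0:0.1:2.5 and \sigma = 0.3):

```python
# Common-sigma parameter variances for a straight-line fit:
#   sigma_a^2 = sigma^2 * Sum(x^2) / D,  sigma_b^2 = N * sigma^2 / D,
# where D = N*Sum(x^2) - (Sum(x))^2.
n, sigma = 26, 0.3
x = [0.1 * i for i in range(n)]   # x = 0, 0.1, ..., 2.5
sx = sum(x)
sxx = sum(xi * xi for xi in x)
d = n * sxx - sx * sx
var_a = sxx * sigma**2 / d        # matches the slide's siga ~ 0.0131
var_b = n * sigma**2 / d          # matches the slide's sigb ~ 0.0062
```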