
Error Squares



Linear Least Squares

Is the line of best fit for a group of points.

It seeks to minimize, over all data points, the sum of the squared differences between the function value and the data value.

It is the earliest form of linear regression.


Gauss and Legendre

The method of least squares was first published by Legendre in 1805 and by Gauss in 1809.
Although Legendre’s work was published earlier, Gauss claimed to have used the method since 1795.
Both mathematicians applied the method to determine the orbits of bodies about the sun.
Gauss went on to publish further development of the method in 1821.


Example

Consider the points (1,2.1), (2,2.9), (5,6.1), and (7,8.3) with the best fit line f(x) = 0.9x + 1.4.

The squared errors are:
   x₁ = 1   f(1) = 2.3   y₁ = 2.1   e₁ = (2.3 – 2.1)² = .04
   x₂ = 2   f(2) = 3.2   y₂ = 2.9   e₂ = (3.2 – 2.9)² = .09
   x₃ = 5   f(5) = 5.9   y₃ = 6.1   e₃ = (5.9 – 6.1)² = .04
   x₄ = 7   f(7) = 7.7   y₄ = 8.3   e₄ = (7.7 – 8.3)² = .36

So the total squared error is .04 + .09 + .04 + .36 = .53

By finding better coefficients of the best fit line, we can make this error smaller…
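As a quick numeric check (a sketch of my own, not part of the original slides), a few lines of Python reproduce the squared errors above:

```python
# The line f(x) = 0.9x + 1.4 against the four example points.
points = [(1, 2.1), (2, 2.9), (5, 6.1), (7, 8.3)]

def f(x):
    return 0.9 * x + 1.4

squared_errors = [(f(x) - y) ** 2 for x, y in points]
print(squared_errors)       # [0.04, 0.09, 0.04, 0.36], up to floating point rounding
print(sum(squared_errors))  # about 0.53
```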


We want to minimize the vertical distances between the points and the line.

• E = (d₁)² + (d₂)² + (d₃)² + … + (dₙ)² for n data points
• E = [f(x₁) – y₁]² + [f(x₂) – y₂]² + … + [f(xₙ) – yₙ]²
• E = [mx₁ + b – y₁]² + [mx₂ + b – y₂]² + … + [mxₙ + b – yₙ]²
• E = ∑(mxᵢ + b – yᵢ)²


E must be MINIMIZED!

How do we do this?
E = ∑(mxᵢ + b – yᵢ)²

Treat x and y as constants, since we are trying to find m and b. So… PARTIALS!

Set ∂E/∂m = 0 and ∂E/∂b = 0.
But how do we know if this will yield maximums, minimums, or saddle points?
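A small symbolic sketch, assuming the sympy library (variable names and data are illustrative, not from the presentation), takes these partials and solves ∂E/∂m = 0, ∂E/∂b = 0 for the critical point:

```python
import sympy as sp

m, b = sp.symbols('m b')
points = [(1, 2.1), (2, 2.9), (5, 6.1), (7, 8.3)]   # the points from the earlier example

# E = sum of (m*x_i + b - y_i)^2 over the data.
E = sum((m * x + b - y) ** 2 for x, y in points)

# Partial derivatives of E with respect to m and b, set equal to zero.
dE_dm = sp.diff(E, m)
dE_db = sp.diff(E, b)
solution = sp.solve([dE_dm, dE_db], [m, b])
print(solution)   # roughly m = 1.04, b = 0.94 for these points
```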


[Surface plots illustrating a minimum point, a maximum point, and a saddle point]


Minimum!

Since the expression E is a sum of squares and is therefore nonnegative (i.e. it looks like an upward-opening paraboloid), we know the solution must be a minimum. We can prove this by using the 2nd Partials Derivative Test.


2nd Partials Test

Suppose the gradient of f at (x₀,y₀) is 0. (An instance of this is ∂E/∂m = ∂E/∂b = 0.)

We set

   A = ∂²f/∂x²,   B = ∂²f/∂y∂x,   C = ∂²f/∂y²

and form the discriminant D = AC – B².

1) If D < 0, then (x₀,y₀) is a saddle point.
2) If D > 0, then f takes on
   a local minimum at (x₀,y₀) if A > 0
   a local maximum at (x₀,y₀) if A < 0


Calculating the Discriminant

With E = ∑(mxᵢ + b – yᵢ)²:

A = ∂²E/∂m²:
   ∂E/∂m = ∑ 2xᵢ(mxᵢ + b – yᵢ)
   A = ∂²E/∂m² = ∑ 2xᵢ² = 2∑xᵢ²

B = ∂²E/∂b∂m:
   ∂E/∂m = ∑ 2xᵢ(mxᵢ + b – yᵢ)
   B = ∂²E/∂b∂m = ∑ 2xᵢ = 2∑xᵢ

C = ∂²E/∂b²:
   ∂E/∂b = ∑ 2(mxᵢ + b – yᵢ)
   C = ∂²E/∂b² = ∑ 2·1 = 2n

D = AC – B² = (2∑xᵢ²)(2n) – (2∑xᵢ)² = 4(n∑xᵢ² – (∑xᵢ)²)
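The same second partials can be checked symbolically; the sketch below (illustrative only, again assuming sympy, with n fixed at 4) reproduces A, B, C, and D:

```python
import sympy as sp

# Symbolic check of A, B, C, D for n = 4 data points (any fixed n shows the pattern).
m, b = sp.symbols('m b')
xs = sp.symbols('x1:5')   # x1, x2, x3, x4
ys = sp.symbols('y1:5')   # y1, y2, y3, y4

E = sum((m * x + b - y) ** 2 for x, y in zip(xs, ys))

A = sp.diff(E, m, 2)            # d^2E/dm^2   = 2 * sum(x_i^2)
B = sp.diff(E, m, b)            # d^2E/db dm  = 2 * sum(x_i)
C = sp.diff(E, b, 2)            # d^2E/db^2   = 2n
D = sp.expand(A * C - B ** 2)   # equals 4 * (n*sum(x_i^2) - (sum(x_i))^2)
print(A, B, C, sep='\n')

# Difference between D and the claimed closed form should simplify to 0 (here n = 4).
print(sp.simplify(D - 4 * (4 * sum(x**2 for x in xs) - sum(xs) ** 2)))
```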


1) If D < 0, then (x₀,y₀) is a saddle point.
2) If D > 0, then f takes on
   a local minimum at (x₀,y₀) if A > 0
   a local maximum at (x₀,y₀) if A < 0

Here D = AC – B² = 4(n∑xᵢ² – (∑xᵢ)²), and D > 0 by an inductive proof showing that

   n∑xᵢ² > (∑xᵢ)²   (summing i from 1 to n, provided the xᵢ do not all have the same value).

Those details are not covered in this presentation.
We know A > 0 since A = 2∑xᵢ² is always positive (provided at least one xᵢ is nonzero).


Therefore…

Setting ∂E/∂m and ∂E/∂b equal to zero will yield two equations whose simultaneous solution minimizes E, the sum of the squares of the error.

Thus, the linear least squares algorithm (as presented) is valid and we can continue.


E = ∑(mxᵢ + b – yᵢ)² is minimized (as just shown) when the partial derivatives with respect to each of the variables are zero, i.e. ∂E/∂m = 0 and ∂E/∂b = 0.

∂E/∂b = ∑ 2(mxᵢ + b – yᵢ) = 0        set equal to 0
   m∑xᵢ + nb = ∑yᵢ
   mSx + bn = Sy

∂E/∂m = ∑ 2xᵢ(mxᵢ + b – yᵢ) = 2∑(mxᵢ² + bxᵢ – xᵢyᵢ) = 0
   m∑xᵢ² + b∑xᵢ = ∑xᵢyᵢ
   mSxx + bSx = Sxy

NOTE:  ∑xᵢ = Sx   ∑yᵢ = Sy   ∑xᵢ² = Sxx   ∑xᵢyᵢ = Sxy
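These two normal equations form a 2×2 linear system in m and b; a minimal numpy sketch (my own illustration, using the earlier example points) solves it directly:

```python
import numpy as np

points = [(1, 2.1), (2, 2.9), (5, 6.1), (7, 8.3)]
xs = np.array([p[0] for p in points], dtype=float)
ys = np.array([p[1] for p in points], dtype=float)

n = len(points)
Sx, Sy = xs.sum(), ys.sum()
Sxx, Sxy = (xs * xs).sum(), (xs * ys).sum()

# The two normal equations as a 2x2 linear system in (m, b):
#   m*Sxx + b*Sx = Sxy
#   m*Sx  + b*n  = Sy
coeffs = np.array([[Sxx, Sx],
                   [Sx,  n]])
rhs = np.array([Sxy, Sy])
m, b = np.linalg.solve(coeffs, rhs)
print(m, b)   # the same m and b the symbolic solution gives
```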


Next we will solve the system of equations for the unknowns m and b:

   mSxx + bSx = Sxy
   mSx + bn = Sy

Solving for m…

   nmSxx + bnSx = nSxy        Multiply by n
   mSxSx + bnSx = SySx        Multiply by Sx
   nmSxx – mSxSx = nSxy – SySx        Subtract
   m(nSxx – SxSx) = nSxy – SySx       Factor out m

   m = (nSxy – SySx) / (nSxx – SxSx)


Next we will solve the system of equations for the unknowns m and b:

   mSxx + bSx = Sxy
   mSx + bn = Sy

Solving for b…

   mSxSxx + bSxSx = SxSxy        Multiply by Sx
   mSxSxx + bnSxx = SySxx        Multiply by Sxx
   bSxSx – bnSxx = SxySx – SySxx        Subtract
   b(SxSx – nSxx) = SxySx – SySxx       Solve for b

   b = (SxxSy – SxySx) / (nSxx – SxSx)


Example: Find the linear least squares approximation to the data: (1,1), (2,4), (3,8)

Use these formulas:

   m = (nSxy – SySx) / (nSxx – SxSx)
   b = (SxxSy – SxySx) / (nSxx – SxSx)

   Sx = 1 + 2 + 3 = 6
   Sxx = 1² + 2² + 3² = 14
   Sy = 1 + 4 + 8 = 13
   Sxy = 1(1) + 2(4) + 3(8) = 33
   n = number of points = 3

   m = (3(33) – (13)(6)) / (3(14) – (6)(6)) = 21/6 = 3.5
   b = ((14)(13) – (33)(6)) / (3(14) – (6)(6)) = –16/6 ≈ –2.667

The line of best fit is y = 3.5x – 2.667
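A direct check of this arithmetic in plain Python (illustrative, not part of the slides):

```python
points = [(1, 1), (2, 4), (3, 8)]
n = len(points)                          # 3
Sx  = sum(x for x, _ in points)          # 6
Sy  = sum(y for _, y in points)          # 13
Sxx = sum(x * x for x, _ in points)      # 14
Sxy = sum(x * y for x, y in points)      # 33

m = (n * Sxy - Sy * Sx) / (n * Sxx - Sx * Sx)    # 21/6 = 3.5
b = (Sxx * Sy - Sxy * Sx) / (n * Sxx - Sx * Sx)  # -16/6, about -2.667
print(m, b)
```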


Line of best fit: y = 3.5x – 2.667



THE ALGORITHM

in Mathematica
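The presentation implements the algorithm in Mathematica; as a rough equivalent (my own sketch, not the original notebook), the same procedure in Python, cross-checked against numpy's built-in fit, looks like this:

```python
import numpy as np

def least_squares_line(points):
    """Return (m, b) for the least squares line y = m*x + b."""
    xs = np.array([p[0] for p in points], dtype=float)
    ys = np.array([p[1] for p in points], dtype=float)
    n = len(points)
    Sx, Sy = xs.sum(), ys.sum()
    Sxx, Sxy = (xs * xs).sum(), (xs * ys).sum()
    denom = n * Sxx - Sx * Sx
    return (n * Sxy - Sy * Sx) / denom, (Sxx * Sy - Sxy * Sx) / denom

print(least_squares_line([(1, 1), (2, 4), (3, 8)]))   # (3.5, about -2.667)

# Cross-check with numpy's built-in degree-1 polynomial fit: [slope, intercept].
print(np.polyfit([1, 2, 3], [1, 4, 8], 1))
```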


Activity

For this activity we are going to use the linear least squares approximation in a real-life situation.
You are going to be given a box score from either a baseball or softball game.
With the box score you are given, write out the points, with the x coordinate being the number of at-bats that player had in the game and the y coordinate being the number of hits that player had in the game.
After doing that, use the linear least squares approximation to find the best fitting line (a sketch follows below).
The slope of the best fitting line you find approximates the team’s batting average for that game.
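As a concrete illustration of the activity, here is a sketch with a made-up box score; every number below is hypothetical:

```python
# Hypothetical box score: (at-bats, hits) for each player; these numbers are made up.
box_score = [(4, 1), (4, 2), (3, 1), (4, 0), (3, 2), (4, 1), (3, 1), (4, 2), (3, 0)]

n = len(box_score)
Sx  = sum(ab for ab, _ in box_score)
Sy  = sum(h for _, h in box_score)
Sxx = sum(ab * ab for ab, _ in box_score)
Sxy = sum(ab * h for ab, h in box_score)

m = (n * Sxy - Sy * Sx) / (n * Sxx - Sx * Sx)
print(round(m, 3))   # slope of the best-fit line: an estimate of the team's batting average
```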


In Conclusion…

E = ∑(mxᵢ + b – yᵢ)² is the sum of the squared error between the set of data points {(x₁,y₁), …, (xᵢ,yᵢ), …, (xₙ,yₙ)} and the line approximating the data, f(x) = mx + b. By minimizing the error by calculus methods, we get equations for m and b that yield the least squared error:

   m = (nSxy – SySx) / (nSxx – SxSx)
   b = (SxxSy – SxySx) / (nSxx – SxSx)


Advantages

Many common methods of approximating data seek to minimize some measure of the difference between the approximating function and the given data points.

Advantages of using the squares of the differences at each point, rather than the raw differences, absolute values of the differences, or other measures of error, include:
 – Positive differences do not cancel negative differences
 – Differentiation is not difficult
 – Small differences become smaller and large differences become larger, so large errors are penalized more heavily