Topics

Part 1 – Single measurement
1. Basic stuff (Chapters 1 and 2)
2. Propagation of uncertainties (Chapter 3)

Part 2 – Multiple measurements as independent results
1. Mean and standard deviation (Chapter 4)
2. Basics of probability distribution functions (not in text explicitly)
3. The Binomial distribution (Chapter 10)
4. The Poisson distribution (Chapter 11)
5. Normal distribution (first half of Chapter 5)
6. χ² test – how well does the data fit the distribution model? (Chapter 12)

Part 3 – Multiple measurements as one sample
1. Central limit theorem (not in text explicitly)
2. Normal distribution (second half of Chapter 5)
3. Propagation of error (Chapter 3)
4. Rejection of data (Chapter 6)
5. Merging two sets of data together (Chapter 7)

Part 4 – Dependent variables
1. Curve fitting (Chapter 8)
2. Covariance and correlation (Chapter 9)

Chapter 11 - Covariance and Correlation (kwng/fall2010/phy335/lecture/lecture 14.pdf)


Basic Idea

Given two random variables x and y, how can we show that they are independent of each other (or, in the opposite sense, correlated)?

[Figure: scatter plot titled "Independent random variables x and y", with both axes running from -1 to 1.]

Each dot is a pair of independent random numbers (x, y) in the range [-1, 1]. There are 100 dots in total.

Basic Idea

Depending on the distribution, the scatter may not be centered on the origin, but it is always centered on the mean $(\bar{x}, \bar{y})$.

Consider the product of $(x - \bar{x})$ and $(y - \bar{y})$. It is positive in quadrants I and III, and negative in quadrants II and IV. If x and y are independent of each other, the points scatter randomly over all quadrants and the sum of these products should tend to 0:

$$\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) \to 0 \quad \text{if } x \text{ and } y \text{ are independent random variables.}$$

[Figure: scatter plot titled "Independent random variables x and y", with the x and y axes drawn through the mean point $(\bar{x}, \bar{y})$.]

Covariance

The covariance between two random variables x and y is defined as

$$\sigma_{xy} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})$$

If the random variables x and y are independent of each other, |σxy| → 0. If x and y are correlated, σxy will be far from 0 (it can be positive or negative), but its magnitude will not exceed σxσy (Schwarz inequality).

Example

I have generated 15 random x and 15 random y independently.

x          y          x - x̄      y - ȳ      (x - x̄)(y - ȳ)
0.07253    0.964829   -0.39924   0.397462   -0.15868
0.711853   0.539325   0.240082   -0.02804   -0.00673
0.744859   0.077395   0.273088   -0.48997   -0.13381
0.118883   0.887473   -0.35289   0.320106   -0.11296
0.770617   0.527768   0.298846   -0.0396    -0.01183
0.08248    0.765561   -0.38929   0.198194   -0.07716
0.099049   0.992887   -0.37272   0.42552    -0.1586
0.833047   0.166098   0.361276   -0.40127   -0.14497
0.322289   0.230947   -0.14948   -0.33642   0.050289
0.864182   0.20435    0.392411   -0.36302   -0.14245
0.77726    0.415119   0.305489   -0.15225   -0.04651
0.423413   0.853539   -0.04836   0.286172   -0.01384
0.718309   0.671949   0.246538   0.104582   0.025783
0.083665   0.946898   -0.38811   0.379531   -0.1473
0.454136   0.266361   -0.01763   -0.30101   0.005308

Sum    7.076572   8.5105                          -1.07346
Mean   0.471771   0.567367                        -0.07156 ← σxy

[Figure: scatter plot of the 15 (x, y) pairs.]
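The arithmetic in this table can be checked with a short script. The sketch below copies the x and y columns from the table and applies the covariance definition with the 1/N normalization used on the slides. The function name `covariance` is only an illustrative choice.

```python
# x and y columns copied from the table of 15 independent random pairs.
xs = [0.07253, 0.711853, 0.744859, 0.118883, 0.770617,
      0.08248, 0.099049, 0.833047, 0.322289, 0.864182,
      0.77726, 0.423413, 0.718309, 0.083665, 0.454136]
ys = [0.964829, 0.539325, 0.077395, 0.887473, 0.527768,
      0.765561, 0.992887, 0.166098, 0.230947, 0.20435,
      0.415119, 0.853539, 0.671949, 0.946898, 0.266361]


def covariance(x, y):
    """Covariance with the 1/N normalization used on the slides."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    return sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / n


sigma_xy = covariance(xs, ys)
print(f"mean x = {sum(xs) / len(xs):.6f}")
print(f"mean y = {sum(ys) / len(ys):.6f}")
print(f"sigma_xy = {sigma_xy:.5f}")
```

The printed covariance should agree with the table's σxy of about -0.07156.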

Example

Now I generate y randomly about the value of x as x changes from 0 to 1.5 with an increment of 0.1.

x      y          x - x̄    y - ȳ      (x - x̄)(y - ȳ)
0      0.929658   -0.75    0.021211   -0.01591
0.1    0.17865    -0.65    -0.7298    0.474368
0.2    -0.64521   -0.55    -1.55366   0.854511
0.3    1.074947   -0.45    0.1665     -0.07492
0.4    0.455536   -0.35    -0.45291   0.158519
0.5    1.031123   -0.25    0.122676   -0.03067
0.6    1.585773   -0.15    0.677326   -0.1016
0.7    0.032196   -0.05    -0.87625   0.043813
0.8    0.261895   0.05     -0.64655   -0.03233
0.9    0.3087     0.15     -0.59975   -0.08996
1      0.830238   0.25     -0.07821   -0.01955
1.1    1.807077   0.35     0.89863    0.314521
1.2    1.543898   0.45     0.635451   0.285953
1.3    2.193795   0.55     1.285348   0.706941
1.4    0.932722   0.65     0.024275   0.015779
1.5    2.014146   0.75     1.105699   0.829274

Sum    12     14.53515                        3.318736
Mean   0.75   0.908447                        0.207421 ← σxy

[Figure: scatter plot of the 16 (x, y) pairs, with x from 0 to 2 and y from -1 to 2.5.]
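The correlated case can be checked the same way. This sketch rebuilds the x grid (0 to 1.5 in steps of 0.1), copies the y column from the table, and recomputes the per-row products and their mean. The variable names are illustrative only.

```python
# x runs from 0 to 1.5 in steps of 0.1; y values are copied from the table.
xs = [i / 10 for i in range(16)]
ys = [0.929658, 0.17865, -0.64521, 1.074947, 0.455536, 1.031123,
      1.585773, 0.032196, 0.261895, 0.3087, 0.830238, 1.807077,
      1.543898, 2.193795, 0.932722, 2.014146]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# One product (x_i - x_mean)(y_i - y_mean) per row, as in the last column.
products = [(x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)]
sigma_xy = sum(products) / n

print(f"mean x = {x_mean:.2f}, mean y = {y_mean:.6f}")
print(f"sigma_xy = {sigma_xy:.6f}")
```

Most of the products are positive, so the covariance comes out clearly positive, in contrast to the independent case.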

Schwarz Inequality

$$\sigma_x \sigma_y \ge |\sigma_{xy}|$$

Proof. Construct the function

$$A(t) = \frac{1}{N} \sum_i \left[ (x_i - \bar{x}) + t\,(y_i - \bar{y}) \right]^2 \ge 0$$

To determine $A_{min}$:

$$\frac{dA}{dt} = \frac{2}{N} \sum_i (y_i - \bar{y}) \left[ (x_i - \bar{x}) + t\,(y_i - \bar{y}) \right] = 0$$

$$\Rightarrow \sum_i (y_i - \bar{y}) \left[ (x_i - \bar{x}) + t\,(y_i - \bar{y}) \right] = 0$$

$$\Rightarrow t_{min} = -\frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (y_i - \bar{y})^2}$$

$$\therefore A_{min} = \frac{1}{N} \sum_i \left[ (x_i - \bar{x}) - \frac{\sum_j (x_j - \bar{x})(y_j - \bar{y})}{\sum_j (y_j - \bar{y})^2}\,(y_i - \bar{y}) \right]^2$$

$$= \frac{1}{N} \left[ \sum_i (x_i - \bar{x})^2 - 2\,\frac{\left( \sum_j (x_j - \bar{x})(y_j - \bar{y}) \right)^2}{\sum_j (y_j - \bar{y})^2} + \frac{\left( \sum_j (x_j - \bar{x})(y_j - \bar{y}) \right)^2}{\sum_j (y_j - \bar{y})^2} \right]$$

$$= \frac{1}{N} \left[ \sum_i (x_i - \bar{x})^2 - \frac{\left( \sum_j (x_j - \bar{x})(y_j - \bar{y}) \right)^2}{\sum_j (y_j - \bar{y})^2} \right]$$

$$A_{min} \ge 0 \Rightarrow \sum_i (x_i - \bar{x})^2 \sum_j (y_j - \bar{y})^2 \ge \left( \sum_j (x_j - \bar{x})(y_j - \bar{y}) \right)^2$$

$$\Rightarrow \frac{\sum_i (x_i - \bar{x})^2}{N} \cdot \frac{\sum_j (y_j - \bar{y})^2}{N} \ge \left( \frac{\sum_j (x_j - \bar{x})(y_j - \bar{y})}{N} \right)^2$$

$$\Rightarrow \sigma_x^2 \sigma_y^2 \ge \sigma_{xy}^2 \Rightarrow \sigma_x \sigma_y \ge |\sigma_{xy}|$$
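The steps of the proof can be checked numerically. The sketch below uses made-up, deliberately correlated random data (the seed, sample size and slope are arbitrary assumptions), evaluates A(t) at the minimizing t, and compares it with the closed form σx² - σxy²/σy² that the proof arrives at.

```python
import random

random.seed(0)  # arbitrary seed so the run is repeatable
n = 200
xs = [random.uniform(-1, 1) for _ in range(n)]
ys = [0.5 * x + random.uniform(-1, 1) for x in xs]  # correlated on purpose

x_mean = sum(xs) / n
y_mean = sum(ys) / n
dx = [x - x_mean for x in xs]
dy = [y - y_mean for y in ys]

var_x = sum(d * d for d in dx) / n
var_y = sum(d * d for d in dy) / n
cov_xy = sum(a * b for a, b in zip(dx, dy)) / n


def A(t):
    """The auxiliary function from the proof: a mean of squares, so never negative."""
    return sum((a + t * b) ** 2 for a, b in zip(dx, dy)) / n


t_min = -cov_xy / var_y                 # where dA/dt = 0
a_min = A(t_min)
closed_form = var_x - cov_xy ** 2 / var_y

print(f"A(t_min) = {a_min:.6f}, closed form = {closed_form:.6f}")
print(f"|sigma_xy| = {abs(cov_xy):.6f} <= sigma_x sigma_y = {(var_x * var_y) ** 0.5:.6f}")
```

Because A(t) is a mean of squares, its minimum cannot be negative, which is exactly what forces |σxy| ≤ σxσy.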

Covariance in Another Form

$$\sigma_{xy} = \frac{1}{N} \sum_i (x_i - \bar{x})(y_i - \bar{y})$$

$$= \frac{1}{N} \sum_i \left( x_i y_i - x_i \bar{y} - \bar{x} y_i + \bar{x}\bar{y} \right)$$

$$= \frac{1}{N} \sum_i x_i y_i - \bar{y}\,\frac{1}{N} \sum_i x_i - \bar{x}\,\frac{1}{N} \sum_i y_i + \bar{x}\bar{y}\,\frac{1}{N} \sum_i 1$$

$$= \frac{1}{N} \sum_i x_i y_i - \bar{x}\bar{y} - \bar{x}\bar{y} + \bar{x}\bar{y}$$

$$= \frac{1}{N} \sum_i x_i y_i - \bar{x}\bar{y}$$

$$\therefore \sigma_{xy} = \overline{(xy)} - \bar{x}\,\bar{y} \quad \text{or} \quad \langle xy \rangle - \langle x \rangle \langle y \rangle$$
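The two forms of the covariance can be compared numerically. The sketch below uses made-up random data (seed and sample size are arbitrary assumptions) and evaluates both the defining sum and the ⟨xy⟩ - ⟨x⟩⟨y⟩ form.

```python
import random

random.seed(1)  # arbitrary seed so the run is repeatable
xs = [random.gauss(0, 1) for _ in range(1000)]
ys = [x + random.gauss(0, 1) for x in xs]


def mean(values):
    return sum(values) / len(values)


x_mean, y_mean = mean(xs), mean(ys)

# Definition: mean of the products of deviations.
cov_definition = mean([(x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)])

# Alternative form: <xy> - <x><y>.
cov_alternative = mean([x * y for x, y in zip(xs, ys)]) - x_mean * y_mean

print(cov_definition, cov_alternative)
```

The two numbers should agree to within floating-point rounding. The alternative form needs only running sums of x, y and xy, although it can lose precision when the means are large compared with the spread.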

Error propagation

For simplicity, we assume q is a function of two variables x and y. We have learned two equations of error propagation:

First equation (Differential Sum):

$$\sigma_q = \frac{\partial q}{\partial x}\,\sigma_x + \frac{\partial q}{\partial y}\,\sigma_y$$

Second equation (Quadrature Sum):

$$\sigma_q^2 = \left( \frac{\partial q}{\partial x} \right)^2 \sigma_x^2 + \left( \frac{\partial q}{\partial y} \right)^2 \sigma_y^2$$

Is there any relation between these two equations?

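Error propagation and Covariance I

$$q_i \approx q(\bar{x}, \bar{y}) + \frac{\partial q}{\partial x}(x_i - \bar{x}) + \frac{\partial q}{\partial y}(y_i - \bar{y}) \quad \text{--- (1)}$$

$$\therefore \bar{q} = \frac{1}{N} \sum_i q_i = \frac{1}{N} \sum_i \left[ q(\bar{x}, \bar{y}) + \frac{\partial q}{\partial x}(x_i - \bar{x}) + \frac{\partial q}{\partial y}(y_i - \bar{y}) \right]$$

$$= \frac{1}{N} \sum_i q(\bar{x}, \bar{y}) + \frac{\partial q}{\partial x}\,\frac{1}{N} \sum_i (x_i - \bar{x}) + \frac{\partial q}{\partial y}\,\frac{1}{N} \sum_i (y_i - \bar{y})$$

$$= q(\bar{x}, \bar{y}) \quad \text{--- (2)}$$

Substitute (2) back into (1):

$$q_i \approx \bar{q} + \frac{\partial q}{\partial x}(x_i - \bar{x}) + \frac{\partial q}{\partial y}(y_i - \bar{y})$$

If we consider the uncertainty in q as the standard deviation in q,

$$\sigma_q^2 = \frac{1}{N} \sum_i (q_i - \bar{q})^2$$

$$\therefore \sigma_q^2 = \frac{1}{N} \sum_i \left[ \bar{q} + \frac{\partial q}{\partial x}(x_i - \bar{x}) + \frac{\partial q}{\partial y}(y_i - \bar{y}) - \bar{q} \right]^2$$

$$= \frac{1}{N} \sum_i \left[ \frac{\partial q}{\partial x}(x_i - \bar{x}) + \frac{\partial q}{\partial y}(y_i - \bar{y}) \right]^2$$

$$= \left( \frac{\partial q}{\partial x} \right)^2 \frac{1}{N} \sum_i (x_i - \bar{x})^2 + \left( \frac{\partial q}{\partial y} \right)^2 \frac{1}{N} \sum_i (y_i - \bar{y})^2 + 2 \left( \frac{\partial q}{\partial x} \right) \left( \frac{\partial q}{\partial y} \right) \frac{1}{N} \sum_i (x_i - \bar{x})(y_i - \bar{y})$$

$$= \left( \frac{\partial q}{\partial x} \right)^2 \sigma_x^2 + \left( \frac{\partial q}{\partial y} \right)^2 \sigma_y^2 + 2 \left( \frac{\partial q}{\partial x} \right) \left( \frac{\partial q}{\partial y} \right) \sigma_{xy}$$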

Error propagation and Covariance II

$$\sigma_q^2 = \left( \frac{\partial q}{\partial x} \right)^2 \sigma_x^2 + \left( \frac{\partial q}{\partial y} \right)^2 \sigma_y^2 + 2 \left( \frac{\partial q}{\partial x} \right) \left( \frac{\partial q}{\partial y} \right) \sigma_{xy}$$

This is the general case.

Case 1. x and y are independent variables, so σxy = 0:

$$\sigma_q^2 = \left( \frac{\partial q}{\partial x} \right)^2 \sigma_x^2 + \left( \frac{\partial q}{\partial y} \right)^2 \sigma_y^2 \quad \text{(Quadrature Sum)}$$

Case 2. x and y are dependent variables, so σxy ≠ 0. When x and y are highly correlated, |σxy| reaches its maximum, which is equal to σxσy (Schwarz inequality). Setting σxy = σxσy:

$$\sigma_q^2 = \left( \frac{\partial q}{\partial x} \right)^2 \sigma_x^2 + \left( \frac{\partial q}{\partial y} \right)^2 \sigma_y^2 + 2 \left( \frac{\partial q}{\partial x} \right) \left( \frac{\partial q}{\partial y} \right) \sigma_x \sigma_y$$

$$= \left( \frac{\partial q}{\partial x}\,\sigma_x + \frac{\partial q}{\partial y}\,\sigma_y \right)^2$$

$$\therefore \sigma_q = \frac{\partial q}{\partial x}\,\sigma_x + \frac{\partial q}{\partial y}\,\sigma_y \quad \text{(Differential Sum)}$$
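The general formula with the covariance term can be tested on a case where the first-order expansion is exact. The sketch below assumes a linear function q = 2x - 3y and made-up, strongly correlated data (the seed, sample size and coefficients are arbitrary assumptions). It compares the directly computed variance of q with the general formula and with the quadrature sum.

```python
import random

random.seed(2)  # arbitrary seed so the run is repeatable
n = 500
xs = [random.uniform(-1, 1) for _ in range(n)]
ys = [x + 0.2 * random.uniform(-1, 1) for x in xs]  # strongly correlated with x

a, b = 2.0, -3.0  # dq/dx and dq/dy for the assumed q = 2x - 3y
qs = [a * x + b * y for x, y in zip(xs, ys)]


def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Covariance with 1/N normalization; cov(u, u) is the variance."""
    mu, mv = mean(u), mean(v)
    return mean([(ui - mu) * (vi - mv) for ui, vi in zip(u, v)])


var_q_direct = cov(qs, qs)
var_q_general = a**2 * cov(xs, xs) + b**2 * cov(ys, ys) + 2 * a * b * cov(xs, ys)
var_q_quadrature = a**2 * cov(xs, xs) + b**2 * cov(ys, ys)

print(f"direct:     {var_q_direct:.6f}")
print(f"general:    {var_q_general:.6f}")
print(f"quadrature: {var_q_quadrature:.6f}")
```

For a linear q the general formula reproduces the direct variance up to rounding. The quadrature sum does not, because it drops the covariance term that matters for correlated data.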

Error propagation and Covariance III

In summary,

$$\sigma_q^2 = \left( \frac{\partial q}{\partial x} \right)^2 \sigma_x^2 + \left( \frac{\partial q}{\partial y} \right)^2 \sigma_y^2 \quad \text{(Quadrature Sum)}$$

corresponds to the case when x and y are completely independent of each other.

$$\sigma_q = \frac{\partial q}{\partial x}\,\sigma_x + \frac{\partial q}{\partial y}\,\sigma_y \quad \text{(Differential Sum)}$$

corresponds to the case when x and y are completely correlated.

Every other case lies between these two extremes.

Coefficient of Linear Correlation

In linear regression, we expect x and y to follow a linear relationship y = A + Bx. In other words, for our least-squares fit to work, we expect a high correlation between x and y.

To measure the correlation between x and y, we define the coefficient of linear correlation (r) as the covariance normalized by σxσy:

$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \sum_{i=1}^{N} (y_i - \bar{y})^2}}$$

Coefficient of Linear Correlation

Schwarz inequality: |σxy| ≤ σxσy

$$\therefore r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \Rightarrow -1 \le r \le 1$$

If x and y are independent (uncorrelated), r ≈ 0 and we should not try to do linear regression in this case.

[Figure: three scatter plots of y against x, labeled "Positive Correlation (r > 0)", "Negative Correlation (r < 0)" and "No Correlation (r = 0)".]

If All Points Fall Exactly on a Straight Line

If all data points fall exactly on a straight line y = A + Bx,

$$y_i = A + B x_i \Rightarrow \bar{y} = A + B \bar{x} \quad \therefore (y_i - \bar{y}) = B (x_i - \bar{x})$$

$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \sum_{i=1}^{N} (y_i - \bar{y})^2}}$$

$$= \frac{B \sum_{i=1}^{N} (x_i - \bar{x})^2}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \; B^2 \sum_{i=1}^{N} (x_i - \bar{x})^2}}$$

$$= \frac{B \sum_{i=1}^{N} (x_i - \bar{x})^2}{|B| \sum_{i=1}^{N} (x_i - \bar{x})^2} = \frac{B}{|B|}$$

So r = 1 for a positive slope B (and r = -1 for a negative slope).

Note: If you have only two data points, |r| = 1, because two points always lie exactly on a straight line. Hence correlation has no meaning if there are only two data points.
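A quick numerical check of the straight-line case is sketched below. The function name `linear_correlation` and the sample points are illustrative assumptions. It evaluates r for points lying exactly on a line with positive slope, on a line with negative slope (where r comes out as -1), and for an arbitrary pair of points.

```python
import math


def linear_correlation(x, y):
    """Coefficient of linear correlation r = sigma_xy / (sigma_x * sigma_y)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # the 1/N factors cancel


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
r_up = linear_correlation(xs, [1.0 + 2.0 * x for x in xs])    # y = 1 + 2x
r_down = linear_correlation(xs, [1.0 - 2.0 * x for x in xs])  # y = 1 - 2x
r_two = linear_correlation([0.3, 1.7], [5.0, 2.0])            # any two points

print(r_up, r_down, r_two)
```

The two-point case always gives |r| = 1 regardless of the values chosen, which is why r carries no information there.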

Example

Consider the following data points and do a linear regression:

x          y          x²         xy         [y - (A + Bx)]²
-0.04275   0.046483   0.001827   -0.00199   0.005989175
0.021185   0.003933   0.000449   8.33E-05   0.00067156
0.024486   -0.04226   0.0006     -0.00103   0.00566167
-0.03811   0.038747   0.001453   -0.00148   0.00425747
0.027062   0.002777   0.000732   7.51E-05   0.00106632
-0.04175   0.026556   0.001743   -0.00111   0.00319422
-0.0401    0.049289   0.001608   -0.00198   0.00603347
0.033305   -0.03339   0.001109   -0.00111   0.00558818
-0.01777   -0.02691   0.000316   0.000478   0.00038938
0.036418   -0.02956   0.001326   -0.00108   0.00545939
0.027726   -0.00849   0.000769   -0.00024   0.00198476
-0.00766   0.035354   5.87E-05   -0.00027   0.00108352
0.021831   0.017195   0.000477   0.000375   0.00017598
-0.04163   0.04469    0.001733   -0.00186   0.00555598
-0.00459   -0.02336   2.1E-05    0.000107   0.00082487
5.4321     5.1762     29.50771   28.11764   1.973E-05

Sum   5.389757   5.27725   29.52193   28.10662   0.04795566

Δ = 443.3014
A = 0.009715
B = 0.950285
σy = 0.058527
σA = 0.006453
σB = 0.011119

[Figure: plot of all 16 points on axes from -1 to 7, with the cluster near the origin labeled "15 data points" and the isolated point labeled "1 data point"; a second plot shows the cluster alone on axes from about -0.05 to 0.05.]
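The fit parameters can be recomputed from the column sums alone. The sketch below assumes the slide used the standard unweighted least-squares formulas (Δ = NΣx² - (Σx)², and so on). That is an assumption, though the results do reproduce the quoted Δ, A, B, σy and σB.

```python
import math

# Column sums copied from the "Sum" row of the table (N = 16 points).
n = 16
sum_x, sum_y = 5.389757, 5.27725
sum_xx, sum_xy = 29.52193, 28.10662
sum_res2 = 0.04795566  # sum of [y - (A + Bx)]^2

# Standard unweighted least-squares expressions (assumed, see lead-in).
delta = n * sum_xx - sum_x ** 2
A = (sum_xx * sum_y - sum_x * sum_xy) / delta
B = (n * sum_xy - sum_x * sum_y) / delta

# Scatter of y about the fitted line, with N - 2 degrees of freedom.
sigma_y = math.sqrt(sum_res2 / (n - 2))
sigma_B = sigma_y * math.sqrt(n / delta)

print(f"delta = {delta:.4f}, A = {A:.6f}, B = {B:.6f}")
print(f"sigma_y = {sigma_y:.6f}, sigma_B = {sigma_B:.6f}")
```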

Example

Now calculate the coefficient of linear correlation:

x          y          (x - x̄)²   (y - ȳ)²   (x - x̄)(y - ȳ)
-0.04275   0.046483   0.144101   0.080284   0.10755979
0.021185   0.003933   0.099651   0.106208   0.10287697
0.024486   -0.04226   0.097578   0.13845    0.11623079
-0.03811   0.038747   0.140604   0.084728   0.10914701
0.027062   0.002777   0.095975   0.106962   0.10131992
-0.04175   0.026556   0.143347   0.091974   0.11482235
-0.0401    0.049289   0.142095   0.078702   0.10575072
0.033305   -0.03339   0.092146   0.131927   0.11025681
-0.01777   -0.02691   0.125763   0.127259   0.12650871
0.036418   -0.02956   0.090265   0.129163   0.10797667
0.027726   -0.00849   0.095564   0.114458   0.10458502
-0.00766   0.035354   0.118693   0.086715   0.10145184
0.021831   0.017195   0.099243   0.097739   0.09848852
-0.04163   0.04469    0.143257   0.081304   0.10792297
-0.00459   -0.02336   0.116586   0.124745   0.12059609
5.4321     5.1762     25.96147   23.48732   24.6934285

Mean   0.33686   0.329828

σx = 1.31592
σy = 1.251697
σxy = 1.64555767

r = 0.999043

[Figure: the same two plots as in the regression example: all 16 points, with "15 data points" and "1 data point" labeled, and the cluster alone on axes from about -0.05 to 0.05.]

Example

Recalculate the coefficient of linear correlation without the last data point:

x          y          (x - x̄)²   (y - ȳ)²   (x - x̄)(y - ȳ)
-0.04275   0.046483   0.001594   0.00158    -0.0015869
0.021185   0.003933   0.000576   7.87E-06   -6.732E-05
0.024486   -0.04226   0.000746   0.002401   -0.0013379
-0.03811   0.038747   0.001246   0.001025   -0.0011297
0.027062   0.002777   0.000893   1.57E-05   -0.0001183
-0.04175   0.026556   0.001516   0.000393   -0.0007716
-0.0401    0.049289   0.001389   0.001811   -0.0015861
0.033305   -0.03339   0.001305   0.00161    -0.0014496
-0.01777   -0.02691   0.000224   0.001132   0.00050299
0.036418   -0.02956   0.00154    0.001318   -0.0014244
0.027726   -0.00849   0.000933   0.000232   -0.0004651
-0.00766   0.035354   2.34E-05   0.000819   -0.0001385
0.021831   0.017195   0.000608   0.000109   0.0002578
-0.04163   0.04469    0.001506   0.00144    -0.0014731
-0.00459   -0.02336   3.12E-06   0.000906   5.3169E-05

Mean   -0.00282   0.006737

σx = 0.030661
σy = 0.03141
σxy = -0.0007156

r = -0.74309

[Figure: the same two plots as in the regression example: all 16 points, with "15 data points" and "1 data point" labeled, and the cluster alone on axes from about -0.05 to 0.05.]
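Both values of r can be reproduced from the raw (x, y) pairs. The sketch below copies the 15 clustered points and the isolated 16th point from the tables and computes r with and without that last point. The function name `linear_correlation` is an illustrative choice.

```python
import math

# The 15 clustered points, copied from the x and y columns of the tables.
cluster = [(-0.04275, 0.046483), (0.021185, 0.003933), (0.024486, -0.04226),
           (-0.03811, 0.038747), (0.027062, 0.002777), (-0.04175, 0.026556),
           (-0.0401, 0.049289), (0.033305, -0.03339), (-0.01777, -0.02691),
           (0.036418, -0.02956), (0.027726, -0.00849), (-0.00766, 0.035354),
           (0.021831, 0.017195), (-0.04163, 0.04469), (-0.00459, -0.02336)]
last_point = (5.4321, 5.1762)  # the isolated 16th point


def linear_correlation(points):
    """r = sigma_xy / (sigma_x * sigma_y) for a list of (x, y) pairs."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    syy = sum((y - y_mean) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


r_all = linear_correlation(cluster + [last_point])
r_without_last = linear_correlation(cluster)

print(f"r with all 16 points:   {r_all:.6f}")
print(f"r without the last one: {r_without_last:.5f}")
```

A single distant point dominates all three sums, which is why adding it moves r from a negative value to almost exactly +1.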

Quantitative Significance of r

Even if x and y are independent, we do not expect r to be exactly 0. Quantitatively, we can calculate the probability of getting |r| greater than a certain value r0 (given that x and y are independent variables): ProbN(|r| > r0)

You can find a table of this in Appendix C.

Notes.

1. ProbN(|r| > 0) = 100% and ProbN(|r| > 1) = 0%.

2. ProbN(|r| > r0) decreases as r0 increases, since we expect small r for independent x and y.

3. ProbN(|r| > r0) falls as N increases, since a larger N makes the measurements more "truthful".
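The notes above can be illustrated numerically. The sketch below assumes that x and y are uncorrelated and normally distributed, in which case r has the density Γ((N-1)/2) / (√π Γ((N-2)/2)) · (1 - r²)^((N-4)/2). Whether the Appendix C table was produced from exactly this expression is an assumption. The function name `prob_r_exceeds` and the midpoint integration are illustrative choices.

```python
import math


def prob_r_exceeds(n, r0, steps=20000):
    """Probability that |r| > r0 for n uncorrelated, normally distributed pairs."""
    if n < 4:
        raise ValueError("this simple sketch requires n >= 4")
    if r0 >= 1.0:
        return 0.0
    r0 = max(r0, 0.0)
    # Normalization: Gamma((n-1)/2) / (sqrt(pi) * Gamma((n-2)/2)).
    log_ratio = math.lgamma((n - 1) / 2) - math.lgamma((n - 2) / 2)
    norm = math.exp(log_ratio) / math.sqrt(math.pi)
    # Midpoint rule for the integral of (1 - r^2)^((n-4)/2) from r0 to 1.
    h = (1.0 - r0) / steps
    integral = h * sum(
        (1.0 - (r0 + (k + 0.5) * h) ** 2) ** ((n - 4) / 2) for k in range(steps)
    )
    return 2.0 * norm * integral  # factor 2 covers both signs of r


print(prob_r_exceeds(16, 0.0))    # note 1: close to 1
print(prob_r_exceeds(16, 0.999))  # effectively 0
print(prob_r_exceeds(10, 0.3), prob_r_exceeds(10, 0.6))  # note 2
print(prob_r_exceeds(10, 0.5), prob_r_exceeds(20, 0.5))  # note 3
```

For N = 4 the density is flat, so the probability reduces to 1 - r0, which gives a simple sanity check on the normalization.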

Example

Now calculate the coefficient of linear correlation for all 16 data points. As in the earlier calculation with the same data (σx = 1.31592, σy = 1.251697, σxy = 1.64555767), r = 0.999043, and

Prob16(|r| > 0.999) = 0%

Example

Recalculate the coefficient of linear correlation without the last data point. As in the earlier calculation with the remaining 15 points (σx = 0.030661, σy = 0.03141, σxy = -0.0007156), r = -0.74309, and

Prob15(|r| > 0.75) ~ 40%