The shortest distance is the one that crosses at 90° the vector u Statistical Inference on correlation and regression



Statistical Inference on Correlation

Angle between two variables Relationship between two variables


The null hypothesis is that there is no correlation between the two variables in the population. In other words, we want to know whether the two variables are linearly independent. If the null hypothesis is rejected, the two variables are not independent and there is a linear relationship between them.

$$H_0: \rho_{xy} = 0$$

$$H_1: \rho_{xy} \neq 0$$

$$F_{xy} = \frac{r_{xy}^2}{1 - r_{xy}^2}\, df, \qquad df = n - 2$$


Example

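$$H_0: \rho_{xy} = 0 \qquad H_1: \rho_{xy} \neq 0$$

$$n = 5, \quad r_{xy} = 0.88, \quad r_{xy}^2 = 0.7744, \quad df = n - 2 = 5 - 2 = 3$$

$$F_{xy} = \frac{r_{xy}^2}{1 - r_{xy}^2}\, df = \frac{0.7744}{1 - 0.7744} \times 3 = 10.3$$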

In this case, we cannot use the standard normal distribution (Z distribution). We will use the F ratio distribution instead (see pdf file).


The two degrees of freedom of the F ratio are the number of variables − 1 (here 2 − 1 = 1) and the number of participants − 2 (here 5 − 2 = 3).


$$F(\alpha, 1, df) = F(0.05, 1, 3) = 10.13$$

Because $F_{xy} > F(0.05, 1, 3)$ (10.3 > 10.13), we reject $H_0$ and therefore accept $H_1$. There is a linear dependency between the two variables.
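The test above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `correlation_f_ratio` is an assumption introduced here, and the critical value 10.13 is copied from the slides rather than computed.

```python
def correlation_f_ratio(r, n):
    """Return (F, df) for testing H0: rho_xy = 0, with df = n - 2."""
    df = n - 2
    f_ratio = (r ** 2) * df / (1 - r ** 2)
    return f_ratio, df


# Values from the example: r_xy = 0.88, n = 5.
f_xy, df = correlation_f_ratio(0.88, 5)

# Critical value F(0.05, 1, 3) = 10.13, copied from the slides
# (it would normally be read from an F table).
F_CRITICAL = 10.13
reject_h0 = f_xy > F_CRITICAL

print(f"F_xy = {f_xy:.2f}, df = {df}, reject H0: {reject_h0}")
```

The decision rule is the comparison `f_xy > F_CRITICAL`; with another sample size, the critical value has to be looked up again for the new df.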


Linear regression

We want a functional relationship between two variables, not only a measure of the strength of their association.

In other words, we want to be able to predict the outcome given a predictor.

Recall: finding the slope and the constant of a line

[Figure: line plot with labels x1 and y1]


• Regression: $v = b_0 + b_1 u$

[Figure: vectors labelled b and e]


• Regression: The formula to obtain the regression coefficients can be deduced directly from geometry.

– By substitution, we can isolate the b1 coefficient.

$$
\begin{aligned}
u^T e &= 0 \\
u^T (v - b_1 u) &= 0 \\
u^T v - b_1 u^T u &= 0 \\
u^T v &= b_1 u^T u \\
(u^T u)^{-1} u^T v &= b_1 (u^T u)^{-1} (u^T u) \\
(u^T u)^{-1} u^T v &= b_1 \cdot 1 = b_1
\end{aligned}
$$
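The geometric argument above can be checked numerically. The sketch below is only an illustration: the helper `dot` is introduced here, and `u` and `v` are the deviation scores (x − x̄ and y − ȳ) from the example table in these slides, so both are centered.

```python
def dot(a, b):
    """Inner product a^T b of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Deviation scores from the example table in these slides.
u = [3, 1, -2, 0, 2, -3, -1]
v = [4, 2, -4, 0, 3, -4, -1]

# For a single predictor, (u^T u)^{-1} u^T v reduces to a ratio of scalars.
b1 = dot(u, v) / dot(u, u)

# Error vector e = v - b1 * u; it should be orthogonal to u.
e = [vi - b1 * ui for ui, vi in zip(u, v)]

print(b1)
print(abs(dot(u, e)) < 1e-9)
```

The second printed value confirms, up to floating-point error, that u^T e = 0: the error is at 90° to u.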

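If we generalize to any situation (multiple, multivariate):

$$B = (X^T X)^{-1} X^T Y$$

For centered u and v, the factors n − 1 cancel in the ratio, so:

$$b_1 = (u^T u)^{-1} u^T v = (s_u^2)^{-1} \mathrm{cov}_{uv} = \frac{\mathrm{cov}_{uv}}{s_u^2}$$

(true for 2 variables only)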
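As a minimal sketch of the general formula, the code below solves B = (XᵀX)⁻¹XᵀY by hand for the smallest case, a design matrix X = [1, x] with an intercept column and one predictor. The function name `ols_simple` is an assumption made for this illustration, and the data are the seven (x, y) pairs from the example in these slides.

```python
def ols_simple(x, y):
    """Solve B = (X^T X)^{-1} X^T Y for X = [1, x]; returns (b0, b1)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))

    # X^T X = [[n, sx], [sx, sxx]] and X^T Y = [sy, sxy].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is singular: x has no variance")

    # Apply the closed-form inverse of a 2x2 matrix to X^T Y.
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


x = [8, 6, 3, 5, 7, 2, 4]
y = [10, 8, 2, 6, 9, 2, 5]
b0, b1 = ols_simple(x, y)
print(round(b0, 2), round(b1, 2))
```

With more predictors the 2×2 inverse no longer applies, and a linear-algebra library would be the usual choice.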


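Parameters of the linear regression

$$y = b_0 + b_1 x$$

$$b_1 = \frac{Cov_{xy}}{s_x^2}$$

$$b_0 = \bar{y} - b_1 \bar{x}$$

Equation of prediction

If we replace b0:

$$
\begin{aligned}
\hat{y} &= b_0 + b_1 x \\
\hat{y} &= \bar{y} - b_1 \bar{x} + b_1 x \\
\hat{y} &= \bar{y} + (x - \bar{x})\, b_1
\end{aligned}
$$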


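We know that:

$$b_1 = \frac{Cov_{xy}}{s_x^2}$$

Note:

$$r_{xy} = \frac{Cov_{xy}}{s_x s_y} \quad\Rightarrow\quad Cov_{xy} = r_{xy}\, s_x s_y$$

If we replace the covariance, we then obtain:

$$b_1 = \frac{r_{xy}\, s_x s_y}{s_x^2} = \frac{r_{xy}\, s_x s_y}{s_x s_x} = r_{xy} \frac{s_y}{s_x}$$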
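The equivalence of the two expressions for b1 can be verified numerically. The sketch below uses small hypothetical data (not from the slides) and a helper `covariance` written for this illustration; only `mean` and `stdev` come from Python's standard `statistics` module.

```python
from statistics import mean, stdev


def covariance(x, y):
    """Sample covariance with the n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)


# Hypothetical data, chosen only to exercise the formulas.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 3.5, 8.0]

cov_xy = covariance(x, y)
s_x, s_y = stdev(x), stdev(y)
r_xy = cov_xy / (s_x * s_y)

b1_from_cov = cov_xy / s_x ** 2   # b1 = Cov_xy / s_x^2
b1_from_r = r_xy * s_y / s_x      # b1 = r_xy * s_y / s_x

print(abs(b1_from_cov - b1_from_r) < 1e-12)
```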


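Example

| Participant | $x$ | $y$ | $x - \bar{x}$ | $y - \bar{y}$ | $(x - \bar{x})^2$ | $(y - \bar{y})^2$ | $(x - \bar{x})(y - \bar{y})$ |
|---|---|---|---|---|---|---|---|
| $s_1$ | 8 | 10 | 3 | 4 | 9 | 16 | 12 |
| $s_2$ | 6 | 8 | 1 | 2 | 1 | 4 | 2 |
| $s_3$ | 3 | 2 | −2 | −4 | 4 | 16 | 8 |
| $s_4$ | 5 | 6 | 0 | 0 | 0 | 0 | 0 |
| $s_5$ | 7 | 9 | 2 | 3 | 4 | 9 | 6 |
| $s_6$ | 2 | 2 | −3 | −4 | 9 | 16 | 12 |
| $s_7$ | 4 | 5 | −1 | −1 | 1 | 1 | 1 |
| Sum | 35 | 42 | 0 | 0 | 28 | 62 | 41 |

$$\bar{x} = 5, \quad \bar{y} = 6$$

$$s_x = 2.16, \quad s_y = 3.21$$

$$\mathrm{cov}_{xy} = 6.83$$

$$r_{xy} = 0.98, \quad r_{xy}^2 = 0.96$$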
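The table and summary statistics of this example can be reproduced with a short script. This is a sketch using only the seven (x, y) pairs given in the slides; the variable names (`ss_x`, `sp_xy`, and so on) are chosen here for illustration.

```python
from math import sqrt

# The seven (x, y) pairs from the example.
x = [8, 6, 3, 5, 7, 2, 4]
y = [10, 8, 2, 6, 9, 2, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Deviations from the means (columns 4 and 5 of the table).
dx = [xi - mean_x for xi in x]
dy = [yi - mean_y for yi in y]

ss_x = sum(d * d for d in dx)                # sum of (x - mean_x)^2
ss_y = sum(d * d for d in dy)                # sum of (y - mean_y)^2
sp_xy = sum(a * b for a, b in zip(dx, dy))   # sum of cross-products

s_x = sqrt(ss_x / (n - 1))
s_y = sqrt(ss_y / (n - 1))
cov_xy = sp_xy / (n - 1)
r_xy = cov_xy / (s_x * s_y)

print(ss_x, ss_y, sp_xy)
print(round(s_x, 2), round(s_y, 2), round(cov_xy, 2), round(r_xy, 2))
```

Squaring the unrounded `r_xy` gives about 0.97; the 0.96 reported in the slides results from squaring the rounded value 0.98.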


Example

$$b_1 = r_{xy} \frac{s_y}{s_x} = 0.98 \times \frac{3.21}{2.16} = 1.46$$

$$b_0 = \bar{y} - b_1 \bar{x} = 6 - 1.46 \times 5 = -1.3$$

$$\hat{y} = b_0 + b_1 x$$

$$\hat{y} = -1.3 + 1.46\,x$$
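To close the example, here is a hedged sketch that fits the line to the seven (x, y) pairs from the slides and uses it for prediction. The function name `predict` and the new predictor value 9 are assumptions made for illustration. The code works from unrounded quantities, so its coefficients differ slightly in later decimals from the rounded slide values.

```python
# The seven (x, y) pairs from the example.
x = [8, 6, 3, 5, 7, 2, 4]
y = [10, 8, 2, 6, 9, 2, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)

# b1 = Cov_xy / s_x^2; the n - 1 factors cancel in the ratio.
b1 = sp_xy / ss_x
b0 = mean_y - b1 * mean_x


def predict(x_new):
    """Equation of prediction: y_hat = b0 + b1 * x."""
    return b0 + b1 * x_new


print(round(b0, 1), round(b1, 2))
print(predict(9))  # hypothetical new predictor value
```

The equivalent form ŷ = ȳ + (x − x̄) b1 gives the same predictions, and makes it visible that the fitted line passes through the point (x̄, ȳ).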