63
Introduction to measures of relationship: covariance, and Pearson r Ivan Jacob Agaloos Pesigan

Introduction to measures of relationship: covariance, and Pearson r

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: Introduction to measures of relationship: covariance, and Pearson r

Introduction to measures of relationship: covariance, and Pearson r Ivan Jacob Agaloos Pesigan

Page 2: Introduction to measures of relationship: covariance, and Pearson r

Outline

•  Defining correlation •  Describing relationships: Scatter plots •  Measuring relationships: Covariance •  Measuring relationships: Pearson r •  Measuring relationships: Issues •  Uses of Correlation

2

Page 3: Introduction to measures of relationship: covariance, and Pearson r

Defining correlation

Introduction to measures of relationship: covariance, and Pearson r

3

Page 4: Introduction to measures of relationship: covariance, and Pearson r

Defining correlation

•  A correlation is a statistical method used to describe and measure the relationship between two variables.

•  A relationship exists when changes in one variable tend to be accompanied by consistent and predictable changes in the other variable.

4

Page 5: Introduction to measures of relationship: covariance, and Pearson r

Defining correlation

Although you should not make too much of the distinction between relationships and differences (if treatments have different means, then means are related to treatments), the distinction is useful in terms of the interests of the experimenter and the structure of the experiment.

5

Page 6: Introduction to measures of relationship: covariance, and Pearson r

Describing Relationships: ���Scatter Plots

Introduction to measures of relationship: covariance, and Pearson r

6

Page 7: Introduction to measures of relationship: covariance, and Pearson r

7

Child Vocabulary Digit Symbol 1 5 12 2 8 15 3 7 14 4 9 18 5 10 19 6 8 18 7 6 14 8 6 17 9 10 20 10 9 17 11 7 15 12 7 16 13 9 16 14 6 13 15 8 16

Page 8: Introduction to measures of relationship: covariance, and Pearson r

8

5 6 7 8 9 10

1214

1618

20

A strong, positive linear relationship

Vocabulary

Dig

it S

ymbo

l

Page 9: Introduction to measures of relationship: covariance, and Pearson r

9

Child Age Time to complete 1 7.5 21 2 6.5 43 3 7.5 26 4 6 37 5 7 25 6 6.5 34 7 5.5 35 8 7 33 9 5.5 44 10 6.5 35 11 7 41 12 5 41 13 5 50 14 5 56 15 5.5 51 16 6 40

Page 10: Introduction to measures of relationship: covariance, and Pearson r

10

5.0 5.5 6.0 6.5 7.0 7.5

2025

3035

4045

5055

A negative linear relationship

Age

Tim

e to

com

plet

e

Page 11: Introduction to measures of relationship: covariance, and Pearson r

Describing relationships: ���Scatter plots

•  Linear relationship •  Positive linear relationship •  Negative linear relationship •  Curvilinear relationship •  No linear relationship

11

Page 12: Introduction to measures of relationship: covariance, and Pearson r

Linear relationship

12

5 6 7 8 9 10

1214

1618

20

A strong, positive linear relationship

Vocabulary

Dig

it S

ymbo

l

5.0 5.5 6.0 6.5 7.0 7.5

2025

3035

4045

5055

A negative linear relationship

Age

Tim

e to

com

plet

e

Page 13: Introduction to measures of relationship: covariance, and Pearson r

Curvilinear Relationship

13

1 2 3 4 5

510

1520

A curvilinear relationship

Motivation

Tim

e to

Com

plet

e th

e Te

st

4.0 4.5 5.0 5.5 6.0 6.5

810

1214

1618

A negative, curved line relationship

Age

Num

ber o

f Err

ors

Page 14: Introduction to measures of relationship: covariance, and Pearson r

No linear relationship

14

60 80 100 120 140

5060

7080

No linear relationship

IQ

Dep

ress

ion

Sco

re

Page 15: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

Introduction to measures of relationship: covariance, and Pearson r

15

Page 16: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

16

100 105 110 115 120 125

7075

8085

9095

100

IQ and final examination grade

IQ

Exam

Page 17: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

•  The simplest way to look at whether two variables are associated is to look at whether they covary.

•  To understand what covariance is, we first need to think back to the concept of variance.

17

Page 18: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

•  Remember that the variance of a single variable represents the average amount that the data vary from the mean.

18

variance = sX2 =

Xi−M

X( )2

∑nX−1

variance = sX2 =

Xi− X( )

2

∑nX−1

Page 19: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

•  If we are interested in whether two variables are related, then we are interested in whether changes in one variable are met with similar changes in the other variable.

•  Therefore, when one variable deviates from its mean we would expect the other variable to deviate from its mean in a similar way or the directly opposite way.

19

Page 20: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

20

2 4 6 8 10

100

105

110

115

120

125

IQ Scores per Student

Student

IQ S

core

s

2 4 6 8 10

7075

8085

9095

100

Final Exam Scores per Student

Student

Fina

l Exa

m S

core

s

Page 21: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

21

100 105 110 115 120 125

05

1015

2025

30

IQ and Errors in the Final Examination

IQ

Exam

Page 22: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

22

2 4 6 8 10

100

105

110

115

120

125

IQ Scores per Student

Student

IQ S

core

s

2 4 6 8 10

05

1015

2025

30

Final Exam Errors per Student

Student

Fina

l Exa

m E

rror

s

Page 23: Introduction to measures of relationship: covariance, and Pearson r

23

Student IQ Final Exam Score 1 97 71

2 98 71

3 108 74

4 114 84

5 115 86

6 119 86

7 120 87

8 121 90

9 122 96

10 127 99

Page 24: Introduction to measures of relationship: covariance, and Pearson r

24

Student IQ Final Exam Score 1 97 71

2 98 71

3 108 74

4 114 84

5 115 86

6 119 86

7 120 87

8 121 90

9 122 96

10 127 99

SUM 1141 844

MEAN 114.1 84.4

Page 25: Introduction to measures of relationship: covariance, and Pearson r

25

Student IQ X - Mx Final Exam

Score Y - MY

1 97 -17.1 71 -13.4

2 98 -16.1 71 -13.4

3 108 -6.1 74 -10.4

4 114 -0.1 84 -0.4

5 115 0.9 86 1.6

6 119 4.9 86 1.6

7 120 5.9 87 2.6

8 121 6.9 90 5.6

9 122 7.9 96 11.6

10 127 12.9 99 14.6

SUM 1141 844

MEAN 114.1 84.4

Page 26: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

26

2 4 6 8 10

100

105

110

115

120

125

IQ Scores per Student

Student

IQ S

core

s

-17.1-16.1

-6.1

-0.10.9

4.95.9

6.97.9

12.9

2 4 6 8 10

7075

8085

9095

100

Final Exam Scores per Student

Student

Fina

l Exa

m S

core

s

-13.4 -13.4

-10.4

-0.4

1.6 1.62.6

5.6

11.6

14.6

Page 27: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

•  When we multiply the deviations of one variable by the corresponding deviations of a second variable, we get what is known as the cross-product deviations.

•  As with the variance, if we want an average value of the combined deviations for the two variables, we must divide by the number of observations (n− 1)

27

Page 28: Introduction to measures of relationship: covariance, and Pearson r

28

Student IQ X - Mx

Final Exam Score

Y - MY (X - Mx) (Y – MY)

1 97 -17.1 71 -13.4 229.14

2 98 -16.1 71 -13.4 215.74

3 108 -6.1 74 -10.4 63.44

4 114 -0.1 84 -0.4 0.04

5 115 0.9 86 1.6 1.44

6 119 4.9 86 1.6 7.84

7 120 5.9 87 2.6 15.34

8 121 6.9 90 5.6 38.64

9 122 7.9 96 11.6 91.64

10 127 12.9 99 14.6 188.34

SUM 1141 844 851.6

MEAN 114.1 84.4 cov = 94.62

Page 29: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

•  This averaged sum of combined deviations is known as the covariance.

29

covariance = covX ,Y

=Xi−M

X( ) Yi −MY( )∑n−1

covariance = covX ,Y

=Xi− X( ) Yi −Y( )∑n−1

Page 30: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

30

covariance = covX ,Y

=Xi-M

X( ) Yi -MY( )∑n -1

covX ,Y

=851.610 -1

=851.69

≈ 94.622

Page 31: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

•  Calculating the covariance is a good way to assess whether two variables are related to each other.

•  A positive covariance indicates that as one variable deviates from the mean, the other variable deviates in the same direction.

•  On the other hand, a negative covariance indicates that as one variable deviates from the mean (e.g. increases), the other deviates from the mean in the opposite direction (e.g. decreases).

31

Page 32: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

•  There is, however, one problem with covariance as a measure of the relationship between variables and that is that it depends upon the scales of measurement used.

•  So, covariance is not a standardized measure.

32

Page 33: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

•  For example, if we use the data above and assume that they represented two variables measured in miles then the covariance is 94.62 (as calculated above). If we then convert these data into kilometers (by multiplying all values by 1.609) and calculate the covariance again then we should find that it increases to 244.97.

33

Page 34: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Covariance

•  This dependence on the scale of measurement is a problem because it means that we cannot compare covariances in an objective way – so, we cannot say whether a covariance is particularly large or small relative to another data set unless both data sets were measured in the same units.

34

Page 35: Introduction to measures of relationship: covariance, and Pearson r

Measuring Relationships: ���Pearson r

Introduction to measures of relationship: covariance, and Pearson r

35

Page 36: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

•  To overcome the problem of dependence on the measurement scale, we need to convert the covariance into a standard set of units.

•  This process is known as standardization.

36

Page 37: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

•  A very basic form of standardization would be to insist that all experiments use the same units of measurement, say meters – that way, all results could be easily compared.

•  However, what happens if you want to measure attitudes – you’d be hard pushed to measure them in meters!

37

Page 38: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

•  Therefore, we need a unit of measurement into which any scale of measurement can be converted. The unit of measurement we use is the standard deviation.

•  If we divide any distance from the mean by the standard deviation, it gives us that distance in standard deviation units.

38

Page 39: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

39

zX=Xi−M

X

sX

=Xi− X

sX

zY=Yi−M

Y

sY

=Yi−Y

sY

Page 40: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r •  Since the properties of z scores form the foundation

necessary for understanding the Pearson product moment correlation coefficient (r) they will be briefly reviewed: 1.  The sum of a set of z score (Σ z) and therefore

also the mean equal 0. 2.  The variance (s2) of the set of z scores equals 1, as

does the standard deviation (s).

3.  Neither the shape of the distribution of X, nor of its relationship to any other variable, is affected by transforming it to zX

40

Page 41: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

41

rX ,Y

=zXzY∑

n−1

zX=Xi−M

X

sX

=Xi− X

sX

zY=Yi−M

Y

sY

=Yi−Y

sY

then

rX ,Y

=Xi−M

X( ) Yi −MY( )∑sXsY( ) n−1( )

=Xi− X( ) Yi −Y( )∑sXsY( ) n−1( )

where

covX ,Y

=Xi−M

X( ) Yi −MY( )∑n−1( )

=Xi− X( ) Yi −Y( )∑n−1( )

then

rX ,Y

=cov

X ,Y

sXsY

rX ,Y

=covariability of X and Y

variability of X and Y separately

rX ,Y

=degree to which X and Y vary together

degree to which X and Y vary separately

Page 42: Introduction to measures of relationship: covariance, and Pearson r

42

Student IQ zx = (X – Mx)/sx

Final Exam Score

zy = (Y – MY)/sy

zxzy

1 97 -1.69 71 -1.37 2.31

2 98 -1.59 71 -1.37 2.18

3 108 -0.60 74 -1.06 0.64

4 114 -0.01 84 -0.04 0.00

5 115 0.09 86 0.16 0.01

6 119 0.48 86 0.16 0.08

7 120 0.58 87 0.27 0.15

8 121 0.68 90 0.57 0.39

9 122 0.78 96 1.19 0.93

10 127 1.27 99 1.49 1.90

SUM 1141 844 8.60

MEAN 114.1 s = 10.14 84.4 s = 9.77 r = 0.96

Page 43: Introduction to measures of relationship: covariance, and Pearson r

43

Student IQ X - Mx

Final Exam Score

Y - MY (X - Mx) (Y – MY)

1 97 -17.1 71 -13.4 229.14

2 98 -16.1 71 -13.4 215.74

3 108 -6.1 74 -10.4 63.44

4 114 -0.1 84 -0.4 0.04

5 115 0.9 86 1.6 1.44

6 119 4.9 86 1.6 7.84

7 120 5.9 87 2.6 15.34

8 121 6.9 90 5.6 38.64

9 122 7.9 96 11.6 91.64

10 127 12.9 99 14.6 188.34

SUM 1141 844 851.6

MEAN 114.1 s = 10.14 84.4 s = 9.77 cov = 94.62

Page 44: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

44

rX ,Y

=cov

X ,Y

sXsY

=94.622

10.14( ) 9.77( )

rX ,Y

=94.622

99.00331457= 0.96

Page 45: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

45

radj=1−

1− r2( ) n−1( )n− 2

Page 46: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r There are three general strategies for determining the size of the population effect which a research is trying to detect:

1.  To the extent that studies which have been carried out by the current investigator or others are closely related to the present investigation, the ESs found in these studies reflect the magnitude which can be expected.

2.  In some research areas an investigator may posit some minimum population effect that would have either practical of theoretical significance.

3.  A third strategy in deciding what ES values to use in determining the power of a study is to use certain suggested conventional definitions of “small”, “medium”, and “large” effects.

46

Page 47: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

47

Size of Effect / Magnitude r r2 (% of variance)

small 0.1 0.01 (1%)

medium 0.3 0.09 (9%)

large 0.5 0.25 (25%)

Page 48: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

r = +1.00 r = -1.00

48

-3 -2 -1 0 1 2 3

-3-2

-10

12

3

Perfect positive linear relationship

X

Y

-2 -1 0 1 2 3

-3-2

-10

12

Perfect negative linear relationship

X

Y

Page 49: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

r = .80 r = .30

49

-3 -2 -1 0 1 2

-2-1

01

23

Strong positive linear relationship

X

Y

-2 -1 0 1 2 3

-3-2

-10

12

Moderate positive linear relationship

X

Y

Page 50: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r

r = .10 r = 0

50

-2 -1 0 1 2 3

-3-2

-10

12

No linear relationship

X

Y

-3 -2 -1 0 1 2

-3-2

-10

12

Weak positive linear relationship

X

Y

Page 51: Introduction to measures of relationship: covariance, and Pearson r

Measuring relationships: Pearson r •  Pearson Product Moment Correlation Coefficient:

–  It is a pure number and independent of the units of measurement. –  Its absolute value varies between zero, when the variables have no

linear relationship, and 1, when each variable is perfectly predicted by the other. The absolute value thus gives the degree of relationship.

–  Its sign indicates the direction of the relationship. A positive sign indicates a tendency for high values of one variable to occur with high values of the other, and low values to occur with low. A negative sign indicates a tendency for high values of one variable to be associated with low values of the other. Reversing the direction of measurement of one of the variables will produce a coefficient of the same value bur of opposite sign. Coefficients with equal value but opposite sign (for example, +.50 and -.50) thus indicate equally strong linear relationship, but in opposite directions.

51

Page 52: Introduction to measures of relationship: covariance, and Pearson r

Measuring Relationships: Issues

Introduction to measures of relationship: covariance, and Pearson r

52

Page 53: Introduction to measures of relationship: covariance, and Pearson r

-2 -1 0 1 2

-3-2

-10

12

3

Restriction of range

X

Y

Restriction of Range

53

Page 54: Introduction to measures of relationship: covariance, and Pearson r

Heterogeneous Subsamples

62 64 66 68 70 72 74

100

120

140

160

180

200

Relationship of Height and WeightBlue Line = Males; Red Line = Females; Black Line = Overall

Height

Weight

malefemale

54

Page 55: Introduction to measures of relationship: covariance, and Pearson r

Outliers

55

Page 56: Introduction to measures of relationship: covariance, and Pearson r

Correlation and Causation

56

Page 57: Introduction to measures of relationship: covariance, and Pearson r

Uses of Correlation

Introduction to measures of relationship: covariance, and Pearson r

57

Page 58: Introduction to measures of relationship: covariance, and Pearson r

Uses of Correlation

•  Prediction. If two variables are known to be related in a systematic way, then it is possible to use one of the variables to make accurate predictions about the other.

58

Page 59: Introduction to measures of relationship: covariance, and Pearson r

Uses of Correlation

•  Validity. Suppose that a psychologist develops a new test for measuring intelligence. How could you show that this test truly measures what it claims; that is, how could you demonstrate the validity of the test? One common technique for demonstrating validity is to use a correlation.

59

Page 60: Introduction to measures of relationship: covariance, and Pearson r

Uses of Correlation

•  Reliability. In addition to evaluating the validity of a measurement procedure, correlations are used to determine reliability. A measurement procedure is considered reliable to the extent that it produces stable, consistent measurements.

60

Page 61: Introduction to measures of relationship: covariance, and Pearson r

Uses of Correlation

•  Theory Verification. Many psychological theories make specific predictions about the relationship between two variables.

61

Page 62: Introduction to measures of relationship: covariance, and Pearson r

People behind the concepts Scatter plot

Bivariate plot Pearson Product Moment

Correlation Coefficient

62

Page 63: Introduction to measures of relationship: covariance, and Pearson r

Slides Prepared by: Ivan Jacob Agaloos Pesigan Lecturer, Psychology Department

Ateneo de Manila University Assistant Professor Lecturer, Psychology Department

De La Salle University Lecturer, College of Education, Psychology Department, Mathematics Department Miriam College

63