Correlation
Correlation Coefficient
aka Pearson Product-Moment Correlation Coefficient.
The correlation coefficient summarizes the relation between two variables: both its direction and its degree (closeness). It is a one-number summary of a scattergram.
Sample value: r; population value: ρ (Greek rho). r can take values from -1 through 0 to +1. The sign (+ or -) tells direction; the magnitude (absolute value) tells closeness or degree.
Positive Correlation
[Figure: Plot of Weight by Height (x-axis: Height, y-axis: Weight)]
Example of a Positive Correlation

N     Ht. (in.)   Wt. (lbs.)
1     60          102
2     62          120
3     63          130
4     65          150
5     65          120
6     68          145
7     69          175
8     70          170
9     72          185
10    74          210
When one variable increases, the other also increases.
Negative Correlation
[Figure: Plot of Errors by Study Time (x-axis: Study Time, y-axis: Errors)]
Example of a Negative Correlation

N     Study Time (minutes)   # Errors
1     90                     25
2     100                    28
3     130                    20
4     150                    20
5     180                    15
6     200                    12
7     220                    13
8     300                    10
9     350                    8
10    400                    6
When one variable increases, the other decreases.
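As a rough check on the study-time example, the sketch below computes r from deviation scores, which is algebraically the same as averaging z-score cross-products. The function name `pearson_r` and the variable names are illustrative choices, not part of the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from deviations about each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))   # sum of cross-products
    ss_x = sum(a * a for a in dx)                 # sum of squares for X
    ss_y = sum(b * b for b in dy)                 # sum of squares for Y
    return sum_xy / sqrt(ss_x * ss_y)

# Data from the Study Time / Errors table.
study_minutes = [90, 100, 130, 150, 180, 200, 220, 300, 350, 400]
errors = [25, 28, 20, 20, 15, 12, 13, 10, 8, 6]

r = pearson_r(study_minutes, errors)
print(round(r, 2))  # -0.93
```

The negative sign matches the downward trend in the plot: more study time goes with fewer errors.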
Zero Correlation
[Figure: Plot of SAT-V by Toe Size (x-axis: Toe Size, y-axis: SAT-V)]
Example of a Zero Correlation

N     SAT-V   Toe Size
1     450     1.7
2     480     1.8
3     500     1.6
4     510     1.8
5     520     1.9
6     550     1.7
7     600     1.6
8     630     1.7
9     650     1.9
10    700     1.7

No relation*: when one variable increases, the other variable neither increases nor decreases.
(*Strictly speaking, no linear relation.)
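The footnote matters: r only measures linear relation. The sketch below computes r for the toe size data, and then for a made-up curved data set (y = x squared, not from the slides) in which Y is completely determined by X and yet r is zero. The helper `pearson_r` is an illustrative name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from deviations about each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(ss_x * ss_y)

# Data from the SAT-V / Toe Size table.
sat_v = [450, 480, 500, 510, 520, 550, 600, 630, 650, 700]
toe_size = [1.7, 1.8, 1.6, 1.8, 1.9, 1.7, 1.6, 1.7, 1.9, 1.7]
r_toe = pearson_r(toe_size, sat_v)
print(round(r_toe, 2))  # -0.01, essentially zero

# Hypothetical curved data (an assumption for illustration): y = x ** 2.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [x ** 2 for x in x_curve]
r_curve = pearson_r(x_curve, y_curve)
print(abs(r_curve) < 1e-12)  # True: no linear relation despite a perfect curve
```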
Correlation Sign & Magnitude
The sign indicates the direction of the relationship. If positive, the two variables increase together. If negative, when one goes up the other goes down.
The absolute value tells the strength of the relationship. Values close to +1 and -1 indicate very strong relations. As r → 0, the relationship gets weaker. If r = 0, there is no linear relation. If |r| = 1, the relation is perfect.
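One way to keep sign and magnitude separate is to read them off independently, as in this small sketch. The function `describe_r` and the candidate values are hypothetical, chosen only for illustration.

```python
def describe_r(r):
    """Return (direction, strength) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        # r cannot fall outside the range -1 to +1.
        raise ValueError(f"r must lie in [-1, 1], got {r}")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, abs(r)

# Strength is judged by absolute value, so the sign is ignored when ranking.
candidates = [0.30, -0.80, 0.10]
strongest = max(candidates, key=abs)
print(strongest)              # -0.8
print(describe_r(strongest))  # ('negative', 0.8)
```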
Perfect Positive Correlation (r = 1)
[Figure: Plot of Income by Cars Sold (x-axis: Cars Sold, y-axis: Income)]
N     Cars sold   Income ($)
1     10          1000
2     15          1500
3     20          2000
4     25          2500
5     30          3000
6     35          3500
7     40          4000
8     45          4500
Notice the straight line. When r=+1 or -1, all the points will fall on a line.
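A short derivation of why points on a straight line give r = +1 or -1. It assumes r is defined as the average cross-product of z scores, r = Σ z_X z_Y / N, with SD computed using N. In the table, Income = 100 × Cars sold, which is the case a = 0, b = 100 below.

```latex
% Assume Y = a + bX with b > 0 (in the table: a = 0, b = 100).
\begin{align*}
\bar{Y} &= a + b\bar{X}, \qquad SD_Y = b \, SD_X \\
z_Y &= \frac{Y - \bar{Y}}{SD_Y}
     = \frac{b\,(X - \bar{X})}{b \, SD_X} = z_X \\
r &= \frac{\sum z_X z_Y}{N} = \frac{\sum z_X^2}{N} = 1
\end{align*}
```

The last step uses the fact that z scores have variance 1. If b < 0, then SD_Y = |b| SD_X, so z_Y = -z_X and r = -1.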
Example Correlations
Variable X                 Variable Y
Salary                     Taxes paid
Shyness                    # people greeted at party
Price of car               Prestige of car
Price of quartz watch      Accuracy of time kept
Income of sales people     Number of cars sold
Anxiety                    Memory exam
Correlation
Computing the Correlation
The definition: $r = \frac{\sum z_X z_Y}{N}$
The correlation coefficient, r, is the average cross-product of z scores. $z_X$ is X in z-score form, $z_Y$ is Y in z-score form, and we multiply the two for each case. We add the products and divide by N to get the average.
$z_X = \frac{X - \bar{X}}{SD_X}$
Computational Example

N      Ht     Wt      Zht     Zwt     Zh*Zw
1      60     102     -1.58   -1.51   2.39
2      62     120     -1.11   -0.95   1.06
3      63     130     -0.88   -0.64   0.57
4      65     150     -0.42   -0.02   0.01
5      65     120     -0.42   -0.95   0.40
6      68     145     0.28    -0.18   -0.05
7      69     175     0.51    0.75    0.39
8      70     170     0.74    0.60    0.45
9      72     185     1.21    1.06    1.29
10     74     210     1.67    1.84    3.08
Mean   66.8   150.7   0       0       0.96
S      4.31   32.20   1       1               (computed with N)
SD     4.54   33.95                           (computed with N-1)

-1.58 = (60 - 66.8)/4.31     -1.51 = (102 - 150.7)/32.2
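The computational example can be reproduced with a few lines of code. This is a minimal sketch that assumes the z scores use the standard deviation computed with N (the S row), as in the worked values above. The function name `z_scores` is an illustrative choice.

```python
from math import sqrt

# Height (in.) and weight (lbs.) from the computational example.
heights = [60, 62, 63, 65, 65, 68, 69, 70, 72, 74]
weights = [102, 120, 130, 150, 120, 145, 175, 170, 185, 210]

def z_scores(values):
    """Convert raw scores to z scores using the SD computed with N."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

z_ht = z_scores(heights)
z_wt = z_scores(weights)

# r is the average cross-product of the paired z scores.
cross_products = [a * b for a, b in zip(z_ht, z_wt)]
r = sum(cross_products) / len(cross_products)

print(round(z_ht[0], 2), round(z_wt[0], 2))  # -1.58 -1.51
print(round(r, 2))                           # 0.96
```

The printed r matches the mean of the Zh*Zw column in the table.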
Some examples
X                                     Y                                   Correlation
Extroversion                          Job performance of managers         .10
Corporate social responsibility       Corporate financial performance     .20
Job performance (supervisor report)   Job performance (peer report)       .30
High school SAT                       College GPA (year 1)                .60
Weight (self-report)                  Weight (measured)                   .90
Review
What are the maximum and minimum values of r?
What does r summarize? What is the symbol for the population value of r?
How does r show direction of relations? How does r show magnitude of relations?
Test Questions
[Figure: four scatterplots, in order:
Miles per Gallon by Engine Displacement (cu. inches);
Time to Accelerate from 0 to 60 mph (sec) by Engine Displacement (cu. inches);
Time to Accelerate from 0 to 60 mph (sec) by Model Year (modulo 100);
Time to Accelerate from 0 to 60 mph (sec) by Vehicle Weight (lbs.)]
A B C D
Which figure shows the most positive correlation? A, B, C, D?
Test Questions
Which of the four choices below shows the correlation with the strongest association between X and Y?
a. -.50
b. .0
c. .25
d. 1.02