lambert-casey
Correlation
Suppose we found the age and weight of a sample of 10 adults.
Create a scatterplot of the data below.
Is there any relationship between the age and weight of these adults?
Age 24 30 41 28 50 46 49 35 20 39
Wt 256 124 320 185 158 129 103 196 110 130
Suppose we found the height and weight of a sample of 10 adults.
Create a scatterplot of the data below.
Is there any relationship between the height and weight of these adults?
Ht 74 65 77 72 68 60 62 73 61 64
Wt 256 124 320 185 158 129 103 196 110 130
Is it positive or negative? Weak or strong?
The closer the points in a scatterplot are to a straight line, the stronger the relationship.
The farther the points are from a straight line, the weaker the relationship.
Identify as having a positive association, a negative association, or no association.
1. Heights of mothers & heights of their adult daughters: positive (+)
2. Age of a car in years and its current value: negative (-)
3. Weight of a person and calories consumed: positive (+)
4. Height of a person and the person’s birth month: no association
5. Number of hours spent in safety training and the number of accidents that occur: negative (-)
Correlation Coefficient (r)
• A quantitative assessment of the strength & direction of the linear relationship between bivariate, quantitative data
• Pearson’s sample correlation is used most
• parameter: ρ (rho)
• statistic: r
\[ r = \frac{1}{n-1} \sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) \]
Calculate r. Interpret r in context.
Speed Limit (mph) 55 50 45 40 30 20
Avg. # of accidents (weekly) 28 25 21 17 11 6
There is a strong, positive, linear relationship between speed limit and average number of accidents per week.
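The calculation for the speed limit data can be sketched in a few lines of Python. This is only an illustrative sketch: the function name pearson_r and the variable names are assumptions made for this example, and the code applies the z-score form of the sample correlation (sample standard deviations with n - 1 in the denominator).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Multiply the paired z-scores and average over n - 1.
    z_products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                  for x, y in zip(xs, ys))
    return sum(z_products) / (n - 1)

speed_limit = [55, 50, 45, 40, 30, 20]   # mph
accidents = [28, 25, 21, 17, 11, 6]      # average per week

r = pearson_r(speed_limit, accidents)
print(round(r, 3))  # prints 0.996
```

A value this close to 1 matches the interpretation above: strong, positive, and linear.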
Properties of r (correlation coefficient)
• legitimate values of r lie in [-1, 1]
[Number line from -1 to 1 marked at -1, -.8, -.5, 0, .5, .8, 1 and labeled No Correlation, Weak Correlation, Moderate Correlation, Strong Correlation]
• value of r does not depend on the unit of measurement for either variable
x (in mm) 12 15 21 32 26 19 24
y 4 7 10 14 9 8 12
Find r.
Change to cm & find r.
The correlations are the same.
• value of r does not depend on which of the two variables is labeled x
x 12 15 21 32 26 19 24
y 4 7 10 14 9 8 12
Switch x & y & find r.
The correlations are the same.
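Both invariance properties (change of units, and swapping which variable is called x) can be checked numerically. The sketch below is an illustration only: pearson_r is a helper written for this example, and it uses the algebraically equivalent form r = Sxy / sqrt(Sxx * Syy), in which the n - 1 factors cancel.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation via sums of squared and cross deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

x_mm = [12, 15, 21, 32, 26, 19, 24]
y = [4, 7, 10, 14, 9, 8, 12]
x_cm = [v / 10 for v in x_mm]      # same lengths expressed in cm

r_mm = pearson_r(x_mm, y)          # original units
r_cm = pearson_r(x_cm, y)          # x converted to cm
r_swapped = pearson_r(y, x_mm)     # x and y exchanged

print(round(r_mm, 3), round(r_cm, 3), round(r_swapped, 3))
```

All three printed values agree (about 0.918), because rescaling a variable rescales its deviations and its standard deviation by the same factor, and the formula treats x and y symmetrically.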
• value of r is non-resistant
x 12 15 21 32 26 19 24
y 4 7 10 14 9 8 22
Find r.
Outliers affect the correlation coefficient.
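The effect of the outlier can be seen by computing r with and without it. This sketch assumes the last pair of values in the table reads x = 24, y = 22 (the earlier examples use y = 12 for that point); pearson_r is a helper name invented for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation via sums of squared and cross deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

x = [12, 15, 21, 32, 26, 19, 24]
y_original = [4, 7, 10, 14, 9, 8, 12]
y_outlier = [4, 7, 10, 14, 9, 8, 22]   # assumed reading: last value 12 -> 22

r_original = pearson_r(x, y_original)
r_outlier = pearson_r(x, y_outlier)

print(round(r_original, 3), round(r_outlier, 3))
```

Changing a single point drops r from about 0.92 to about 0.63, which is what "non-resistant" means here.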
• value of r is a measure of the extent to which x & y are linearly related
A value of r close to zero does not rule out a strong nonlinear relationship between x and y.
r = 0, but the data have a definite relationship!
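A small made-up data set shows how this can happen. In the sketch below the values are hypothetical (not from the slides): y is exactly x squared, so the relationship is perfect but curved, and the positive and negative deviations cancel in the numerator of r. The helper pearson_r is written just for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation via sums of squared and cross deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: a perfect parabola, symmetric about x = 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r_parabola = pearson_r(x, y)
print(r_parabola)
```

The result is zero even though y is completely determined by x, so a scatterplot should always accompany the value of r.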
Minister data: (Association vs Causation Data on Elmo)
r = .9999
So does an increase in ministers cause an increase in consumption of rum?
Correlation does not imply causation