cory-richardson
Introduction to Quantitative Data Analysis (continued)
Reading on Quantitative Data Analysis: Baxter and Babbie, 2004, Chapter 12
Recall: Correlation
Correlation is used to measure and describe a relationship between two variables. Usually these two variables are simply observed as they exist in the environment, with no attempt to control or manipulate them.
Correlation
The correlation coefficient measures three characteristics of the relationship between X and Y: (1) the direction of the relationship, (2) the form of the relationship, and (3) the degree of the relationship.
Pearson’s r is called the Pearson product-moment correlation coefficient in the textbook (p. 290).
Describing relationships: An example…
Scatter Plot
What is the relationship between level of education and lifetime earnings?
[Figure: scatter plot titled "Education Level and Lifetime Earnings". Horizontal axis: Education (Predictor Variable), 0 to 10. Vertical axis: Lifetime Earnings (Criterion Variable), 0 to 5.]

X (Education)   Y (Income)
8               3.4
7               4.4
6               2.5
5               2.1
4               1.6
3               1.5
2               1.2
1               1
Scatter Plot
1. Designate one variable X and the other Y. Although in some cases it does not matter which is which, when one variable is used to predict the other, X is the “predictor” variable (the variable you are predicting from; the independent variable in a hypothesis).
2. Draw axes of equal length for your graph.
3. Determine the range of values for each variable. Place the high values of X to the right on the horizontal axis and the high values of Y toward the top of the vertical axis. Label convenient points along each axis.
4. For each pair of scores, find the point of intersection for the X and Y values and indicate it with a dot.
5. Label each axis and give the entire graph a name.
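The plotting steps can be sketched in a few lines of Python. This is only a rough text rendering of the education and earnings pairs from the example; the helper name `text_scatter` and the grid size are illustrative assumptions, and a charting library would normally be used instead.

```python
# Education (X) and lifetime earnings (Y) pairs from the example.
education = [8, 7, 6, 5, 4, 3, 2, 1]
income = [3.4, 4.4, 2.5, 2.1, 1.6, 1.5, 1.2, 1.0]


def text_scatter(xs, ys, width=33, height=11):
    """Return text rows with one '*' per (x, y) pair.

    High X values land on the right, high Y values on the top row.
    Hypothetical helper, for illustration only.
    """
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each score onto the grid, then mark the intersection.
        col = round((x - x_lo) / (x_hi - x_lo) * (width - 1))
        row = round((y - y_lo) / (y_hi - y_lo) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    return ["".join(line) for line in grid]


rows = text_scatter(education, income)
print("Lifetime Earnings (Y) by Education (X)")
for line in rows:
    print("|" + line)
print("+" + "-" * 33)
```

The dots rise from the lower left toward the upper right, which is the pattern to look for when reading the direction of a relationship from a plot.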
Direction of Relationship
A scatter plot shows at a glance the direction of the relationship. A positive correlation appears as a cluster of data points that slopes from the lower left to the upper right.
Positive Correlation
If the higher scores on X are generally paired with the higher scores on Y, and the lower scores on X are generally paired with the lower scores on Y, then the direction of the correlation between two variables is positive.
A negative correlation appears as a cluster of data points that slopes from the upper left to the lower right.
Negative Correlation
If the higher scores on X are generally paired with the lower scores on Y, and the lower scores on X are generally paired with the higher scores on Y, then the direction of the correlation between two variables is negative.
No Correlation
In cases where there is no correlation between two variables (both high and low values of X are equally paired with both high and low values of Y), there is no direction in the pattern of the dots. They are scattered about the plot in an irregular pattern.
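One way to make the three cases concrete is to look at the sign of the sum of cross-products of deviations, which is the quantity that gives Pearson's r its sign. The sketch below is a minimal illustration: the function name `direction` is an assumption, the first two data sets are the education and reaction-time examples from these notes, and the third is a made-up set chosen so that the cross-products cancel.

```python
def direction(xs, ys):
    """Classify direction from the sum of cross-products of deviations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if abs(sp) < 1e-12:
        return "none"
    return "positive" if sp > 0 else "negative"


# Education and lifetime earnings (from the notes).
education = [8, 7, 6, 5, 4, 3, 2, 1]
income = [3.4, 4.4, 2.5, 2.1, 1.6, 1.5, 1.2, 1.0]

# Reaction time and errors (from the speed-accuracy example in the notes).
reaction_time = [184, 213, 234, 197, 189, 221, 237, 192]
errors = [10, 6, 2, 7, 13, 10, 4, 9]

# Hypothetical scores: high and low X are paired equally with high and low Y.
flat_x = [1, 2, 3, 4]
flat_y = [2, 1, 1, 2]

print(direction(education, income))      # positive
print(direction(reaction_time, errors))  # negative
print(direction(flat_x, flat_y))         # none
```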
Perfect Correlation
When there is a perfect linear relationship, every change in the X variable is accompanied by a corresponding change in the Y variable.
Form of Relationship
Pearson’s r assumes an underlying linear relationship (a relationship that can be best represented by a straight line). Not all relationships are linear.
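A small sketch of why the linearity assumption matters: in the made-up data below, Y is completely determined by X (Y is X squared), yet Pearson's r comes out at zero because the relationship is curved and symmetric rather than linear. The function name `pearson_r` and the data are illustrative assumptions, not part of the textbook example.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical U-shaped relationship: Y depends entirely on X, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.2f}")
```

A coefficient near zero therefore means "no linear relationship", not necessarily "no relationship", which is one reason to inspect the scatter plot first.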
Strength of Relationship
How can we describe the strength of the relationship in a scatter plot? With a number between -1 and +1 that indicates the relationship between two variables. The sign (- or +) indicates the direction of the relationship. The magnitude of the number indicates the strength of the relationship.
-1 ------------ 0 ------------ +1
(-1: perfect relationship; 0: no relationship; +1: perfect relationship)
The closer to -1 or +1, the stronger the relationship.
Correlation Coefficient
Pearson’s r
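Definitional formula:
\[ r = \frac{\text{degree to which } X \text{ and } Y \text{ vary together}}{\text{degree to which } X \text{ and } Y \text{ vary separately}} = \frac{COV_{XY}}{(s_x)(s_y)} \]
where
\[ COV_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n} \]
Computational formula:
\[ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left(n\sum X^2 - (\sum X)^2\right)\left(n\sum Y^2 - (\sum Y)^2\right)}} \]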
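The definitional and computational formulas should give the same coefficient. The sketch below checks this on the education and earnings data from these notes. It assumes that the standard deviations in the definitional formula use the same divisor n as the covariance; the function names are illustrative.

```python
import math


def r_definitional(xs, ys):
    """r = COV_XY / (s_x * s_y), with divisor n throughout (an assumption)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov_xy / (s_x * s_y)


def r_computational(xs, ys):
    """r from the raw sums: n, sum X, sum Y, sum XY, sum X^2, sum Y^2."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


education = [8, 7, 6, 5, 4, 3, 2, 1]
income = [3.4, 4.4, 2.5, 2.1, 1.6, 1.5, 1.2, 1.0]

r_def = r_definitional(education, income)
r_comp = r_computational(education, income)
print(f"definitional:  {r_def:.4f}")
print(f"computational: {r_comp:.4f}")
```

The computational form is convenient for hand calculation because it needs only column totals; the definitional form shows more directly what r measures.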
An Example: Correlation
What is the relationship between level of education and lifetime earnings?
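        X (Education)   Y (Income)   XY      X^2    Y^2
        8               3.4          27.2    64     11.56
        7               4.4          30.8    49     19.36
        6               2.5          15      36     6.25
        5               2.1          10.5    25     4.41
        4               1.6          6.4     16     2.56
        3               1.5          4.5     9      2.25
        2               1.2          2.4     4      1.44
        1               1            1       1      1
Sum     36              17.7         97.8    204    48.83

\[ n = 8, \quad \sum X = 36, \quad \sum Y = 17.7, \quad \sum XY = 97.8, \quad \sum X^2 = 204, \quad \sum Y^2 = 48.83 \]

\[ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left(n\sum X^2 - (\sum X)^2\right)\left(n\sum Y^2 - (\sum Y)^2\right)}} \]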
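Substituting the column totals for the education example into the computational formula gives the coefficient. The notes do not state the result for this example, so the value printed here is simply what the arithmetic yields; the variable names are illustrative.

```python
import math

# Column totals for the education and earnings example.
n = 8
sum_x, sum_y = 36, 17.7
sum_xy, sum_x2, sum_y2 = 97.8, 204, 48.83

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(f"r = {r:.2f}")  # r = 0.90
```

A value this close to +1 indicates a strong positive relationship between education and lifetime earnings in this small sample.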
An Example: Correlation
Researchers who measure reaction time for human participants often observe a relationship between the reaction time scores and the number of errors that the participants commit. This relationship is known as the speed-accuracy tradeoff. The following data are from a reaction time study where the researcher recorded the average reaction time (milliseconds) and the total number of errors for each individual in a sample of 8 participants. Calculate the correlation coefficient.

Reaction Time   Errors
184             10
213             6
234             2
197             7
189             13
221             10
237             4
192             9
[Figure: scatter plot titled "Speed Accuracy Tradeoff". Horizontal axis: Reaction Time, 150 to 250. Vertical axis: Number of Errors, 0 to 15.]
        X       X^2      Y     Y^2    XY
        184     33856    10    100    1840
        213     45369    6     36     1278
        234     54756    2     4      468
        197     38809    7     49     1379
        189     35721    13    169    2457
        221     48841    10    100    2210
        237     56169    4     16     948
        192     36864    9     81     1728
Sum     1667    350385   61    555    12308

\[ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left(n\sum X^2 - (\sum X)^2\right)\left(n\sum Y^2 - (\sum Y)^2\right)}} = \frac{8(12308) - (1667)(61)}{\sqrt{\left(8(350385) - (1667)^2\right)\left(8(555) - (61)^2\right)}} = -0.77 \]
Interpreting Pearson’s r
Values can be influenced by the range of scores.
Values can be influenced by outliers.
Correlation does not equal causation. Pearson’s r can tell you the strength and direction of a relationship between two variables, but not the nature of the relationship. Two obstacles to a causal reading are the third variable problem (an unmeasured variable may drive both X and Y) and the directionality problem (X may cause Y, or Y may cause X).
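The first two cautions can be illustrated with the education and earnings data from these notes. The sketch below recomputes r after restricting the range of X, and again after adding one extreme point. The cutoff (education of 5 or more) and the added point (1, 5.0) are hypothetical choices made only to show the effect.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r using the computational (raw-sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


education = [8, 7, 6, 5, 4, 3, 2, 1]
income = [3.4, 4.4, 2.5, 2.1, 1.6, 1.5, 1.2, 1.0]

# Full sample.
r_full = pearson_r(education, income)

# Restricted range: keep only cases with education of 5 or more (hypothetical cutoff).
kept = [(x, y) for x, y in zip(education, income) if x >= 5]
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])

# One hypothetical outlier: low education paired with very high earnings.
r_outlier = pearson_r(education + [1], income + [5.0])

print(f"full sample:      r = {r_full:.2f}")
print(f"restricted range: r = {r_restricted:.2f}")
print(f"with one outlier: r = {r_outlier:.2f}")
```

In this sample both changes lower the coefficient, and the single added point lowers it sharply. Restricting the range does not always reduce r, which is why the scatter plot should be checked alongside the number.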