Correlation Topics
Co-relationship between two variables.
Linear vs Curvilinear relationships
Positive vs Negative relationships
Strength of relationship
Mythical relationship between Baseball and Football performance
Player   Baseball skill   Football skill
Al       Very Good        Very Good
Ben      Very Poor        Very Poor
Chuck    Good             Good
David    Terrible         Terrible
Ed       Poor             Poor
Frank    Average          Average
George   Excellent        Excellent
Is this a linear relationship?
Baseball skill predicts football skill.
Football skill predicts baseball skill.
There is a strong relationship.
First we must arrange the scores in order.
Player   Baseball skill   Football skill
David    Terrible         Terrible
Ben      Very Poor        Very Poor
Ed       Poor             Poor
Frank    Average          Average
Chuck    Good             Good
Al       Very Good        Very Good
George   Excellent        Excellent
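The ordering step can be sketched in Python. This is only an illustration: the names `SKILL_ORDER`, `RANK`, `baseball`, `football`, and `ordered` are made up for the example, and the ratings are the ones from the mythical baseball and football slide.

```python
# Skill categories from the slide, listed from lowest to highest.
SKILL_ORDER = ["Terrible", "Very Poor", "Poor", "Average",
               "Good", "Very Good", "Excellent"]
RANK = {label: position for position, label in enumerate(SKILL_ORDER)}

# Ratings from the "mythical relationship" slide.
baseball = {"Al": "Very Good", "Ben": "Very Poor", "Chuck": "Good",
            "David": "Terrible", "Ed": "Poor", "Frank": "Average",
            "George": "Excellent"}
football = dict(baseball)  # in this first example the football ratings are identical

# Arrange the players by baseball skill, lowest first.
ordered = sorted(baseball, key=lambda name: RANK[baseball[name]])

for name in ordered:
    print(f"{name:<8}{baseball[name]:<11}{football[name]}")
```

Sorting on the rank rather than on the label itself matters here, because alphabetical order ("Average", "Excellent", "Good", ...) is not the order of skill.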
Then we plot the scores.
[Scatterplot of Football Skill against Baseball Skill. Both axes run from Terrible to Excellent, with one point per player: David, Ben, Ed, Frank, Chuck, Al, George.]
This is definitely a linear relationship!
Let’s get more abstract.
[The same scatterplot with the Football Skill and Baseball Skill axes relabeled X and Y, and the skill labels replaced by numbers: Terrible = -3, Very Poor = -2, Poor = -1, Average = 0, Good = 1, Very Good = 2, Excellent = 3.]
Linear or nonlinear? Let’s look at another set of values.
Player   Baseball skill   Football skill
David    Terrible         Terrible
Ben      Very Poor        Average
Ed       Poor             Average
Frank    Average          Very Good
Chuck    Good             Excellent
Al       Very Good        Good
George   Excellent        Poor
Is this a linear relationship?
[Scatterplot of Football Skill against Baseball Skill for the second set of values. Both axes run from Terrible to Excellent, with one point per player; Ben and Ed sit at the same football level.]
NO! It is best described by a curved line. It is a curvilinear relationship!
Positive vs Negative relationships
In a positive relationship, as one value increases the other value tends to increase as well.
Example: The longer a sailboat is, the more it tends to cost. As length goes up, price tends to go up.
In a negative relationship, as one value increases, the other value tends to decrease.
Example: The older a sailboat is, the less it tends to cost. As years go up, price tends to go down.
Positive vs Negative scatterplot
[Scatterplot on axes running from -3 to 3, showing one positive relationship and one negative relationship.]
The strength of a relationship tells us approximately how the dots will fall around a best fitting line.
Perfect - scores fall exactly on a straight line.
Strong - most scores fall near the line.
Moderate - some are near the line, some not.
Weak - lots of scores fall close to the line, but many fall quite far from it.
Independent - the scores are not close to the line and form a circular or square pattern.
Comparing apples to oranges? Use t scores!
You can use correlation to look for the relationship between ANY two values that you can measure on a single subject.
However, there may not be any relationship (independent).
A correlation tells us if scores are consistently similar on two measures, consistently different from each other, or have no real pattern.
To compare scores on two different variables, you transform them into tX and tY scores.
tX and tY scores can be directly compared to each other to see whether they are consistently similar, consistently quite different, or show no consistent pattern of similarity or difference.
Similar tX and tY scores = positive correlation. Dissimilar = negative correlation. No pattern = independence.
When t scores are consistently more similar than different, we have a positive correlation.
When t scores are consistently more different than similar, we have a negative correlation.
When t scores show no consistent pattern of similarity or difference, we have independence.
Comparing variables
Anxiety symptoms, e.g., heartbeat, with number of hours driving to class.
Hat size with drawing ability.
Math ability with verbal ability.
Number of children with IQ.
Turn them all into t scores.
Pearson’s Correlation Coefficient
coefficient - noun, a number that serves as a measure of some property.
The correlation coefficient indexes the consistency and direction of a correlation.
Pearson’s rho (ρ) is the parameter that characterizes the strength and direction of a linear relationship (and only a linear relationship) between two population variables.
Pearson’s r is a least squares, unbiased estimate of rho.
r and rho vary from -1.000 to +1.000.
A negative value indicates a negative relationship; a positive value indicates a positive relationship.
Values of r close to 1.000 or -1.000 indicate a strong (consistent) relationship; values close to 0.000 indicate a weak (inconsistent) or independent relationship.
r, strength and direction
Perfect, positive     +1.00
Strong, positive      + .75
Moderate, positive    + .50
Weak, positive        + .25
Independent             .00
Weak, negative        - .25
Moderate, negative    - .50
Strong, negative      - .75
Perfect, negative     -1.00
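The table of labels can be turned into a small lookup. The sketch below is only one way to do it. The function name `describe_r` is hypothetical, and picking the label whose value is closest to the size of r is an assumption, since the slides give labels only for the listed values and not for the values in between.

```python
# Label values from the slide (magnitudes only); "Perfect" is handled separately.
ANCHORS = [(0.75, "Strong"), (0.50, "Moderate"), (0.25, "Weak"), (0.00, "Independent")]

def describe_r(r):
    """Return a rough verbal label for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 1.0:
        strength = "Perfect"
    else:
        # Assumption: use the listed value closest to |r|.
        # On an exact tie, the stronger label (earlier in ANCHORS) wins.
        strength = min(ANCHORS, key=lambda anchor: abs(anchor[0] - size))[1]
    if strength == "Independent":
        return strength
    direction = "positive" if r > 0 else "negative"
    return f"{strength}, {direction}"

print(describe_r(-0.5))
print(describe_r(0.9))
```

"Perfect" is reserved for exactly +1.00 or -1.00, so a value such as .90 is labeled strong rather than perfect.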
Calculating Pearson’s r
Select a random sample from a population; obtain scores on two variables, which we will call X and Y.
Convert all the scores into t scores.
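As a rough sketch of the conversion step, the Python below computes t scores the way this deck's worked example does: each score's deviation from the sample mean is divided by the sample standard deviation s, where s is based on N - 1. The function name `t_scores` is made up for illustration, and the X values are the ones used in the deck's example.

```python
from statistics import mean, stdev

def t_scores(values):
    """Convert raw scores to t scores: (score - mean) / s."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation: sqrt(SS / (N - 1))
    return [(v - m) / s for v in values]

x = [2, 4, 6, 8, 10]  # X data from the worked example
print([round(t, 2) for t in t_scores(x)])
# prints [-1.26, -0.63, 0.0, 0.63, 1.26]
```

After the conversion every variable has a mean of 0 and a standard deviation of 1, which is what makes tX and tY directly comparable.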
First, subtract the tY score from the tX score in each pair.
Then square all of the differences and add them up, that is, Σ(tX - tY)².
Estimate the average squared distance between tX and tY by dividing the sum of squared differences by (nP - 1), where nP is the number of pairs of scores, that is,
Σ(tX - tY)² / (nP - 1)
To turn this estimate into Pearson’s r, use the formula
r = 1 - (1/2 * (Σ(tX - tY)² / (nP - 1)))
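The formula can be sketched as a short Python function. It assumes the scores have already been converted to t scores. The name `pearson_r_from_t` and the two tiny sets of t scores are hypothetical, chosen only to show the two extremes.

```python
def pearson_r_from_t(tx, ty):
    """Pearson's r from paired t scores: 1 - (1/2) * sum((tX - tY)^2) / (nP - 1)."""
    if len(tx) != len(ty):
        raise ValueError("tx and ty must contain the same number of scores")
    n_pairs = len(tx)
    if n_pairs < 2:
        raise ValueError("at least two pairs are needed")
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(tx, ty))
    return 1 - 0.5 * sum_sq_diff / (n_pairs - 1)

# Identical t scores in every pair: a perfect positive relationship.
print(pearson_r_from_t([-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]))  # prints 1.0
# Exactly opposite t scores in every pair: a perfect negative relationship.
print(pearson_r_from_t([-1.0, 0.0, 1.0], [1.0, 0.0, -1.0]))  # prints -1.0
```

When the paired t scores are identical the sum of squared differences is zero and r is +1. The further apart the paired scores are, the more is subtracted from 1.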
Note seeming exception
Usually we divide a sum of squared deviations around a mean by df to estimate the variance.
Here the sum of squares is not around a mean and we are not estimating a variance.
So you divide Σ(tX - tY)² by (nP - 1).
nP - 1 is not df for correlation and regression (dfREG = nP - 2).
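One way to see why (nP - 1) is the right divisor is to expand the squared difference. The derivation below is a supplementary sketch, not taken from the slides. It relies on two facts: t scores are built from a standard deviation that itself uses nP - 1, so the squared t scores of each variable sum to nP - 1, and the conventional definition of r is the sum of the products tX * tY divided by nP - 1.

```latex
\begin{align*}
\frac{\sum (t_X - t_Y)^2}{n_P - 1}
  &= \frac{\sum t_X^2}{n_P - 1} + \frac{\sum t_Y^2}{n_P - 1}
     - 2\,\frac{\sum t_X t_Y}{n_P - 1} \\
  &= 1 + 1 - 2r \\
  &= 2(1 - r), \\[4pt]
1 - \frac{1}{2}\cdot\frac{\sum (t_X - t_Y)^2}{n_P - 1}
  &= 1 - (1 - r) = r .
\end{align*}
```

So the average squared distance runs from 0 (when r = +1) to 4 (when r = -1), and halving it and subtracting from 1 puts the result on the -1 to +1 scale.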
Example: Calculate t scores for X
X      X - X̄    (X - X̄)²    tX = (X - X̄) / sX
2      -4       16          -1.26
4      -2        4          -0.63
6       0        0           0.00
8       2        4           0.63
10      4       16           1.26
ΣX = 30   N = 5   X̄ = 6.00
SSW = 40.00   MSW = 40.00 / (5 - 1) = 10.00   sX = 3.16
Calculate t scores for Y
Y      Y - Ȳ    (Y - Ȳ)²    tY = (Y - Ȳ) / sY
9      -2        4          -1.26
11      0        0           0.00
10     -1        1          -0.63
12      1        1           0.63
13      2        4           1.26
ΣY = 55   N = 5   Ȳ = 11.00
SSW = 10.00   MSW = 10.00 / (5 - 1) = 2.50   sY = 1.58
Calculate r
tX       tY       tX - tY   (tX - tY)²
-1.26    -1.26     0.00     0.00
-0.63     0.00    -0.63     0.40
 0.00    -0.63     0.63     0.40
 0.63     0.63     0.00     0.00
 1.26     1.26     0.00     0.00
Σ(tX - tY)² = 0.80
Σ(tX - tY)² / (nP - 1) = 0.80 / (5 - 1) = 0.200
r = 1.000 - (1/2 * (Σ(tX - tY)² / (nP - 1)))
r = 1.000 - (1/2 * .200) = 1 - .100 = .900
This is a very strong, positive relationship.
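The worked example can be checked with a short Python script. It is a sketch that uses the X and Y data from the example. The variable names are made up, and the second calculation uses the conventional sum-of-products formula for r, which does not appear on the slides and is included only as a cross-check.

```python
from math import sqrt
from statistics import mean, stdev

x = [2, 4, 6, 8, 10]     # X data from the example
y = [9, 11, 10, 12, 13]  # Y data from the example
n_pairs = len(x)

def t_scores(values):
    """(score - mean) / s, with s the sample standard deviation (N - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Method from the slides: r = 1 - (1/2) * sum((tX - tY)^2) / (nP - 1)
tx, ty = t_scores(x), t_scores(y)
sum_sq_diff = sum((a - b) ** 2 for a, b in zip(tx, ty))
r_from_t = 1 - 0.5 * sum_sq_diff / (n_pairs - 1)

# Cross-check (not from the slides): r = SP / sqrt(SSx * SSy)
mx, my = mean(x), mean(y)
sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
ss_x = sum((a - mx) ** 2 for a in x)
ss_y = sum((b - my) ** 2 for b in y)
r_conventional = sp / sqrt(ss_x * ss_y)

print(round(r_from_t, 3), round(r_conventional, 3))
# prints 0.9 0.9
```

Both routes give .900, matching the hand calculation.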