PSY 307 – Statistics for the Behavioral Sciences Chapter 6 – Correlation

Midterm Results

Top score = 45
Top score for curve = 45

Score range   Grade   Count
40-53         A       7
36-39         B       4
31-35         C       2
27-30         D       8
0-26          F       3
Total                 24

Aleks/Holcomb Hint

To Find the Cutoff Scores

If you know the mean and standard deviation, you can find which x values cut off certain percentages. Solve for k, then multiply k by the SD and add that product to, or subtract it from, the mean to get the cutoff scores.
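
The cutoff arithmetic can be sketched in Python. The slide does not say how k is obtained; the helper `chebyshev_k` assumes it comes from Chebyshev's theorem (at least 1 - 1/k² of scores fall within k SDs of the mean), which is an assumption, not something stated here. The mean and SD are made-up numbers.

```python
import math

def chebyshev_k(proportion):
    """ASSUMPTION: k is solved from Chebyshev's theorem, 1 - 1/k**2 = proportion."""
    if not 0 < proportion < 1:
        raise ValueError("proportion must be between 0 and 1")
    return 1 / math.sqrt(1 - proportion)

def cutoff_scores(mean, sd, k):
    """Return the (lower, upper) x values lying k standard deviations from the mean."""
    distance = k * sd
    return mean - distance, mean + distance

# Hypothetical numbers, not taken from the course data.
k = chebyshev_k(0.75)  # "at least 75% of scores"
low, high = cutoff_scores(mean=30, sd=5, k=k)
print(k, low, high)
```

If k comes from a different rule (for example, a z-table for a normal distribution), only `chebyshev_k` changes; `cutoff_scores` stays the same.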

Does Aleks Quiz 1 Predict Midterm Scores?

Adding a Prediction (Regression) Line Provides More Information

r = .56

Does Time Spent on Aleks Predict Quiz Grades?

r = .16

Sometimes the Relationship is Not Linear

r = .16

r = .47 (quadratic)
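
A small sketch of why a straight-line r can understate a curved relationship. The data are invented (y = x²), and correlating y with the squared term is a simplified stand-in for the quadratic fit on the slide, not a reproduction of it.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviation scores: SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Made-up data: y depends perfectly on x, but along a curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_linear = pearson_r(xs, ys)                          # straight-line relationship
r_squared_term = pearson_r([x ** 2 for x in xs], ys)  # relationship with x squared
print(round(r_linear, 2), round(r_squared_term, 2))
```

Here the linear r is zero even though y is completely determined by x, which is why the scatterplot should be inspected before trusting r.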

Lying With Statistics

This is the graph as published in a Wall Street Journal editorial (7/13), where they claimed that reducing corporate taxes results in greater revenue.

Treating Norway as an outlier, the data instead show that as taxes increase, so do revenues – the opposite conclusion.

Which is right? The correct graph is the one with the best fit – where most of the data points are close to the line drawn (right).

Describing Relationships

Positive relationship – high values tend to go with high values, low with low.

Negative relationship – high values tend to go with low values, low with high.

No relationship – no regularity appears between pairs of scores in two distributions.

Relationship Does Not Imply Causality

A relationship can exist without being a CAUSAL relationship. Correlation does not imply causation.

Third variable problem – a third variable causes both of the variables you are measuring to change, e.g., popsicles & drowning (hot weather increases both).

The direction of causality cannot be determined from the r statistic.

Chocolate and Nobel Prizes

http://www.nejm.org/doi/full/10.1056/NEJMon1211064

Scatterplots

One variable is measured on the x-axis, the other on the y-axis.

Positive relationship – a cluster of dots sloping upward from the lower left to the upper right.

Negative relationship – a cluster of dots sloping down from upper left to lower right.

No relationship – no apparent slope.

Example Positive Correlations

r=1.0

r=.85

r=.39

r=.17

Example Negative Correlations

r=-.54

r=-.94

r=-.33

Note that the line slopes in the opposite direction, from upper left to lower right.

Strength of Relationship

The more closely the dots approximate a straight line, the stronger the relationship.

A perfect relationship forms a straight line.

Dots forming a line reflect a linear relationship.

Dots forming a curved or bent line reflect a curvilinear relationship.

More Examples

http://www.stat.uiuc.edu/courses/stat100/java/GCApplet/GCAppletFrame.html

Correlation Coefficient

Pearson’s r – a measure of how well a straight line describes the cluster of dots in a plot.

Ranges from -1 to 1.

The sign indicates a positive or negative relationship.

The absolute value of r indicates the strength of the relationship.

Pearson’s r is independent of units of measure.
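
Unit independence can be checked with a short sketch. The heights and weights are hypothetical; the same scores are expressed in imperial and metric units, and r should come out the same either way.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviation scores: SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical heights (inches) and weights (pounds) for five people.
height_in = [60, 64, 66, 70, 72]
weight_lb = [110, 135, 140, 165, 180]

# Same people, different units (inches -> cm, pounds -> kg).
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.4536 for w in weight_lb]

r_imperial = pearson_r(height_in, weight_lb)
r_metric = pearson_r(height_cm, weight_kg)
print(r_imperial, r_metric)  # equal apart from floating-point rounding
```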

Interpreting Pearson’s r

The value of r needed to assert a strong relationship depends on:

The size of n.

What is being measured.

Pearson’s r is NOT the percent or proportion of a perfect relationship.

Correlation is not causation.

Experimentation is used to confirm a suspected causal relationship.

Calculating Pearson’s r

$$r = \frac{\sum z_x z_y}{n - 1}$$

This formula is most useful when the scores are already z-scores.

Computational formulas – use whichever is most convenient for the data at hand.
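
A minimal sketch of the z-score formula. It assumes the z-scores are built with the sample standard deviation (n - 1 in the denominator), which is what makes dividing by n - 1 give Pearson's r; the quiz and midterm scores are hypothetical.

```python
import math

def z_scores(values):
    """Convert raw scores to z-scores using the sample SD (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pearson_r_from_z(zx, zy):
    """r = sum of the z_x * z_y products, divided by n - 1."""
    return sum(a * b for a, b in zip(zx, zy)) / (len(zx) - 1)

# Hypothetical quiz and midterm scores for five students.
quiz = [2, 4, 6, 8, 10]
midterm = [20, 30, 25, 40, 45]

r = pearson_r_from_z(z_scores(quiz), z_scores(midterm))
print(round(r, 3))
```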

Sum of the Products (SP)

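$$r = \frac{SP_{xy}}{\sqrt{SS_x \, SS_y}}$$

where

$$SP_{xy} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n}$$

and

$$SS_x = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n}$$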
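
The computational versions of SS and SP can be sketched as follows. The function names and the scores are illustrative only; SS for the Y scores is assumed to be computed the same way as SS for the X scores.

```python
import math

def sum_of_squares(values):
    """SS = sum(X^2) - (sum X)^2 / n."""
    n = len(values)
    return sum(v * v for v in values) - sum(values) ** 2 / n

def sum_of_products(xs, ys):
    """SP = sum(XY) - (sum X)(sum Y) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

def pearson_r(xs, ys):
    """r = SP_xy / sqrt(SS_x * SS_y)."""
    return sum_of_products(xs, ys) / math.sqrt(sum_of_squares(xs) * sum_of_squares(ys))

# Hypothetical paired scores.
xs = [2, 4, 6, 8, 10]
ys = [20, 30, 25, 40, 45]

print(sum_of_squares(xs), sum_of_squares(ys), sum_of_products(xs, ys))  # SS_x = 40, SS_y = 430, SP = 120
print(round(pearson_r(xs, ys), 3))
```

Only the raw sums are needed here, which is why these forms are convenient for hand calculation.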

Computational Formulas

Outliers

An outlier near where the regression line would normally go increases the r value.

r=.457

An outlier away from the regression line decreases the r value.

r=.336

Dealing with Outliers

Outliers can dramatically change the value of the r correlation coefficient.

Always produce a scatterplot and inspect for outliers before calculating r.

Sometimes outliers can be omitted.

Sometimes r cannot be used.

http://www.stat.sc.edu/~west/javahtml/Regression.html
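
The effect of a single outlier can be sketched with invented data: one extra point is added either in line with the trend or far from it, and r is recomputed each time. These are not the data behind the slide's plots.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via the computational formulas for SP and SS."""
    n = len(xs)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical base data with a fairly strong positive relationship.
base_x = [1, 2, 3, 4, 5]
base_y = [2, 1, 4, 3, 5]

r_base = pearson_r(base_x, base_y)
# One extra point roughly where the trend line would go.
r_on_trend = pearson_r(base_x + [10], base_y + [10])
# One extra point far from the trend line.
r_off_trend = pearson_r(base_x + [10], base_y + [0])

print(round(r_base, 2), round(r_on_trend, 2), round(r_off_trend, 2))
```

With only a handful of scores, one point is enough to raise r or even flip its sign, so the scatterplot check comes first.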

Other Correlation Coefficients

Spearman’s rho (ρ) – based on ranks rather than values. Used with ordinal data (qualitative data that can be ordered least to most).

Point biserial correlation – correlations between quantitative data and two coded categories.

Cramer’s phi – correlation between two ordered qualitative categories.
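
Spearman's rho can be sketched as Pearson's r applied to ranks. The helper names and the data are illustrative; tied scores are given the average of their ranks, which is a common convention assumed here.

```python
import math

def average_ranks(values):
    """Rank from 1 (smallest) upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based ranks
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson's r from deviation scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks of the scores."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical scores: y always rises with x, but not along a straight line.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 100]
print(round(pearson_r(xs, ys), 3), spearman_rho(xs, ys))
```

Because only the ordering matters, rho reaches 1 for any consistently increasing relationship, while Pearson's r stays below 1 unless the points fall on a straight line.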