warren-griffith
Describing Relationships: Scatterplots and Correlation
Two Quantitative Variables
Plot observed data on a graph
Horizontal (X axis): independent variable (explanatory variable)
Vertical (Y axis): dependent variable (response variable)
We call the graph a scatter diagram or scatter plot

Example
X = Dosage of Drug
Y = Reduction in Blood Pressure

X     Y
100   10
200   18
300   32
400   44
500   56

Correlation – a measure of association that tests whether a relationship exists between two variables
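As a rough illustration of what the scatter plot for the drug example looks like, the sketch below draws the five (X, Y) pairs on a character grid using only the Python standard library. The function name `text_scatter` and the grid size are assumptions made for this example, and the code assumes the X values and the Y values are not all identical.

```python
def text_scatter(xs, ys, width=40, height=10):
    """Render (x, y) pairs as a crude character-grid scatter plot."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width

dosage = [100, 200, 300, 400, 500]   # X, explanatory variable
reduction = [10, 18, 32, 44, 56]     # Y, response variable

plot = text_scatter(dosage, reduction)
print(plot)
```

The points climb from the lower left to the upper right, which is the visual signature of a positive relationship.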
[Figures: scatter plots of variables C1 and C2 illustrating, in order: perfect positive linear correlation, perfect negative linear correlation, positive linear correlation, negative linear correlation, non-linear correlation, no correlation]
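We wish to quantify the strength and direction of a linear relationship (Pearson product-moment correlation coefficient, r)
\[
r = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2 \sum (y-\bar{y})^2}}
  = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}
\]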
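The deviation form and the sums form of r are the same quantity. The short derivation below sketches why, using only the definitions of the sample means, with n denoting the number of (x, y) pairs.

```latex
\begin{align*}
\sum (x-\bar{x})(y-\bar{y})
  &= \sum xy - \bar{y}\sum x - \bar{x}\sum y + n\bar{x}\bar{y}
   = \sum xy - \frac{\sum x \sum y}{n}, \\
\sum (x-\bar{x})^2 &= \sum x^2 - \frac{\left(\sum x\right)^2}{n}, \qquad
\sum (y-\bar{y})^2 = \sum y^2 - \frac{\left(\sum y\right)^2}{n}.
\end{align*}
% Substituting into the deviation form and multiplying the numerator
% and the denominator by n gives the sums form:
\[
r = \frac{\sum xy - \frac{\sum x \sum y}{n}}
         {\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}
  = \frac{n\sum xy - \sum x \sum y}
         {\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}
\]
```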
-1 <= r <= 1
r = 1 Perfect Positive Linear Correlation
r = -1 Perfect Negative Linear Correlation
r = 0 no linear relationship
General Rule: |r| >= .75 indicates a strong linear relationship
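The special values of r and the |r| >= .75 rule of thumb can be checked on small made-up data sets. The sketch below is only an illustration: the helper names `pearson_r` and `is_strong` and all of the data values are assumptions for this example, not part of the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def is_strong(r, cutoff=0.75):
    """Rule of thumb from the slide: |r| >= .75 counts as strong."""
    return abs(r) >= cutoff

xs = [0, 10, 20, 30, 40, 50]           # made-up x values
rising = [2 * x + 5 for x in xs]       # exact line with positive slope
falling = [50 - x for x in xs]         # exact line with negative slope
curved = [(x - 25) ** 2 for x in xs]   # exact parabola, symmetric about x = 25

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
r_curved = pearson_r(xs, curved)

for name, r in [("rising", r_rising), ("falling", r_falling), ("curved", r_curved)]:
    print(f"{name}: r = {r:.3f}, strong linear relationship: {is_strong(r)}")
```

The parabola case is the one to notice: Y is completely determined by X, yet r is essentially 0, because r measures only the linear part of a relationship.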
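Example
X = Dosage of Drug
Y = Reduction in Blood Pressure
\[
r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} = .99728
\]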
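The value r = .99728 for the drug example can be reproduced from the five data pairs with the sums form of the formula. The sketch below uses only the data given in the example; the variable names are chosen here for readability.

```python
from math import sqrt

dosage = [100, 200, 300, 400, 500]   # X
reduction = [10, 18, 32, 44, 56]     # Y

n = len(dosage)
sum_x = sum(dosage)                                      # 1500
sum_y = sum(reduction)                                   # 160
sum_xy = sum(x * y for x, y in zip(dosage, reduction))   # 59800
sum_x2 = sum(x * x for x in dosage)                      # 550000
sum_y2 = sum(y * y for y in reduction)                   # 6520

numerator = n * sum_xy - sum_x * sum_y                   # 59000
denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator

print(round(r, 5))  # 0.99728
```

Since |r| is well above .75, the general rule classifies this as a strong (positive) linear relationship.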
The R-squared value is the percent of the variation of Y explained by the model
0% <= R^2 <= 100%
For the Drug example, r = .99728, so R^2 = 99.5%
The higher R^2 is, the better the model
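To see where the 99.5% comes from, the sketch below computes R-squared for the drug example in two ways: as the square of r, and as one minus the ratio of unexplained to total variation in Y. It assumes the "model" is the least-squares straight line through the data, which the slides do not state explicitly.

```python
dosage = [100, 200, 300, 400, 500]   # X
reduction = [10, 18, 32, 44, 56]     # Y

n = len(dosage)
mean_x = sum(dosage) / n
mean_y = sum(reduction) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, reduction))
sxx = sum((x - mean_x) ** 2 for x in dosage)
syy = sum((y - mean_y) ** 2 for y in reduction)   # total variation in Y

# Least-squares line (assumed model): predicted Y = intercept + slope * X
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in dosage]

# Variation in Y left unexplained by the line
sse = sum((y - p) ** 2 for y, p in zip(reduction, predicted))

r_squared_from_fit = 1 - sse / syy
r_squared_from_r = sxy ** 2 / (sxx * syy)         # square of Pearson r

print(f"R^2 = {100 * r_squared_from_fit:.1f}%")
```

Both routes give the same number for a straight-line fit, so squaring r is the quicker calculation.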
“Causal” Research – When the objective is to determine if a variable causes a certain behavior (whether there is a cause and effect relationship between variables)
Note that it is never possible to prove causality just based on the relationship between two variables
There is a strong statistical correlation over months of the year between ice cream consumption and the number of assaults in the U.S.
Does this mean ice cream manufacturers are responsible for crime?
No! The correlation occurs statistically because the hot temperatures of summer increase both ice cream consumption and assaults
Thus, correlation does NOT imply causation
Other factors besides cause and effect can create an observed correlation
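The ice cream example can be mimicked with invented numbers. In the sketch below, every value is made up purely for illustration (none of it is real data): monthly temperature drives both series, and neither series appears in the formula for the other. The last step uses the first-order partial correlation, a standard technique that is not covered in these slides, to look at the association that remains once temperature is accounted for.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical monthly values, January through December (NOT real data).
temperature = [30, 34, 45, 55, 65, 75, 80, 78, 70, 58, 45, 35]
wiggle_a = [1, -1] * 6          # invented variation unrelated to temperature
wiggle_b = [1, 1, -1, -1] * 3   # a second, different invented variation

# Each series depends on temperature only, never on the other series.
ice_cream = [2 * t + 3 * a for t, a in zip(temperature, wiggle_a)]
assaults = [0.5 * t + 2 * b for t, b in zip(temperature, wiggle_b)]

r_ice_assault = pearson_r(ice_cream, assaults)
r_ice_temp = pearson_r(ice_cream, temperature)
r_assault_temp = pearson_r(assaults, temperature)

# Partial correlation of ice cream and assaults, controlling for temperature.
partial = (r_ice_assault - r_ice_temp * r_assault_temp) / sqrt(
    (1 - r_ice_temp ** 2) * (1 - r_assault_temp ** 2)
)

print(f"r(ice cream, assaults) = {r_ice_assault:.2f}")
print(f"partial r given temperature = {partial:.2f}")
```

With these invented numbers the raw correlation comes out at about 0.97, while the partial correlation is essentially zero: the strong association disappears once the common cause is held fixed.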
To establish whether two variables are causally related you must establish:
Time order - The cause must have occurred before the effect
Co-variation (statistical association) - The correlation coefficient must show a strong relationship between the dependent and independent variable
Rationale - There must be a logical and compelling explanation for why these two variables are related
Non-spuriousness - It must be established that the independent variable X, and only X, was the cause of changes in the dependent variable Y; rival explanations must be ruled out
This type of research is very complex and the researcher can never be completely certain that there are not other factors influencing the causal relationship
To help identify a relationship as cause and effect, a study is often performed many times
The study should yield the same results every time it is conducted (if this occurs it helps rule out rival explanations)