DO NOW
EXPLAIN YOUR ANSWER!
Stats: Modeling the World
Chapter 7
Scatterplots, Association, and Correlation
Making a Picture of Bivariate Data
Relationships between variables are often what we truly want to know about our data.
Visually, you can show the association between two quantitative variables using a scatterplot.
Looking at Scatterplots
After plotting two variables on a scatterplot, we describe the relationship by examining the form, direction, and strength of the association. We look for an overall pattern.
Form: linear, curved, clusters, no pattern
Association/Direction: positive, negative, no direction
Scatterplot of the size of a diamond ring in carats and the price in dollars.
Looking at Scatterplots
Strength: how closely the points fit the “form”
Outliers: deviations from the pattern
Scatterplot of the size of a diamond ring in carats and the price in dollars.
Which of the scatterplots show:
a) Little or no association?
b) A negative association?
c) A linear association?
d) A moderately strong association?
e) A very strong association?
POD #30 10/18/2011
Roles for Variables
Explanatory (or predictor) variable goes on the x-axis
Response (or predicted) variable goes on the y-axis
**If the relationship between the variables is unclear, it does not matter which one we identify as the explanatory/response variable. Always THINK about the dataset and what you are measuring!!!!
Creating a Scatterplot
By hand:
- Graph on a normal x-y plane
- Make sure to label and scale axes (including units if known!)
- You do not have to show the origin!
By TI:
- Enter data
- 2nd:Stat Plot – 1st type of graph
Quantifying Strength
When describing the strength of the association in a scatterplot, we would like a numerical value that indicates how strong the relationship is. This numerical value is called the correlation coefficient.
Correlation Coefficient (aka “r”)
The correlation coefficient (r) gives us a numerical measurement of the strength of the linear relationship between the explanatory and response variables.
Strength and Direction
Direction:
- Positive “r” indicates a positive association
- Negative “r” indicates a negative association
Strength:
- Values close to 0 indicate a weak linear relationship
- As r gets closer to −1 or +1, the relationship is stronger
- Values of exactly −1 or +1 indicate a perfect line
“r” ranges from −1 to +1
“r” quantifies the strength and direction of a linear relationship between two quantitative variables.
Strength: How closely the points follow a straight line.
Direction is positive when individuals with higher x values tend to have higher values of y.
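One common way to compute r by hand is to convert each x and y to a z-score, multiply the pairs, add the products, and divide by n − 1. The sketch below follows that recipe using only Python's standard library. The function name `correlation` and the carat/price numbers are made up for illustration; they are not data from the diamond scatterplot in these slides.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Return r as the sum of z-score products divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # One z-score product per point: z_x * z_y
    z_products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (len(xs) - 1)


# Hypothetical data: diamond size (carats) and price (dollars)
carats = [0.3, 0.5, 0.7, 1.0, 1.2, 1.5]
price = [700, 1500, 2400, 4200, 5600, 8000]

r = correlation(carats, price)
print(f"r = {r:.3f}")
```

Because larger diamonds in this made-up list always cost more, r comes out positive and close to +1, matching the "positive, strong" description we would give from the picture.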
POD #31 10/24/2011
When to use Correlation
Quantitative Variables – r cannot be applied to categorical data! Make sure you understand your variables.
Linear data – r can always be calculated, but correlation only measures strength of linear relationships, so watch for curvature!
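To see why curvature matters, here is a small sketch with made-up points that lie exactly on the curve y = x². The relationship is perfect, but it is not linear, so r comes out essentially zero. The helper `correlation` is an illustrative name, computed as the sum of z-score products divided by n − 1.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Return r as the sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


# Hypothetical data: every point sits exactly on the parabola y = x^2
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = correlation(xs, ys)
print(f"r for the curved data = {r_curved:.3f}")
```

A value of r near 0 here does not mean "no association." It only means "no linear association," which is why we always look at the scatterplot first.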
Outliers – Since r is calculated using z-scores (and hence the mean and st. dev), it is non-resistant to outliers!
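The sketch below illustrates how a single outlier can change r. It uses made-up points that follow a nearly straight line, then adds one point far from the pattern and recomputes. The data and the helper name `correlation` are assumptions for illustration only.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Return r as the sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


# Hypothetical data: eight points close to the line y = 2x
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 16.2]
r_without = correlation(xs, ys)

# Add one outlier: a large x with a very small y
xs_out = xs + [9]
ys_out = ys + [1.0]
r_with = correlation(xs_out, ys_out)

print(f"r without the outlier = {r_without:.3f}")
print(f"r with the outlier    = {r_with:.3f}")
```

One unusual point out of nine is enough to pull r well below its original value, because the outlier shifts the means and standard deviations that every z-score depends on.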
Properties of Correlation
- Sign of r gives the direction of association
- Correlation is always between −1 and +1
- Flipping x and y does NOT affect r
- r has NO units!! It has been standardized
- Changing units on x or y does not affect r
- r measures a LINEAR relationship only!
- r is non-resistant to outliers
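Two of these properties are easy to check numerically: swapping x and y, and changing units. The sketch below uses made-up carat and dollar values, then converts carats to milligrams (1 carat = 200 mg) and dollars to cents. The helper name `correlation` and all data values are illustrative assumptions.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Return r as the sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


# Hypothetical data: diamond size (carats) and price (dollars)
carats = [0.3, 0.5, 0.7, 1.0, 1.2, 1.5]
dollars = [700, 1500, 2400, 4200, 5600, 8000]

r_original = correlation(carats, dollars)

# Property: flipping x and y does not affect r
r_swapped = correlation(dollars, carats)

# Property: changing units does not affect r
milligrams = [c * 200 for c in carats]  # 1 carat = 200 mg
cents = [d * 100 for d in dollars]      # 1 dollar = 100 cents
r_rescaled = correlation(milligrams, cents)

print(f"original: {r_original:.4f}")
print(f"swapped:  {r_swapped:.4f}")
print(f"rescaled: {r_rescaled:.4f}")
```

All three values agree (up to rounding) because z-scores strip away the units, and multiplying z-scores gives the same products no matter which variable is listed first.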
Finding Correlation Using the TI
Stat: Calc: 4:LinReg(ax+b)
If your r does not show, you will need to turn DiagnosticsOn. Go to 2nd:0 (Catalog), scroll down to DiagnosticsOn and hit Enter twice.