Section 12.1
Scatter Plots and Correlation
HAWKES LEARNING SYSTEMS
math courseware specialists
Copyright © 2008 by Hawkes Learning
Systems/Quant Systems, Inc.
All rights reserved.
Definitions:
• Scatter Plot – a graph on the coordinate plane that contains one point for each pair of data values.
• Independent Variable – placed on the x-axis, this variable is used to explain or predict changes in the dependent variable. Also known as the predictor variable or the explanatory variable.
• Dependent Variable – placed on the y-axis, this variable changes in response to the independent variable. Also known as the response variable.
Regression, Inference, and Model Building
12.1 Scatter Plots and Correlation
Draw a scatter plot:
Draw a scatter plot to represent the following data:
Hours of Study | 0    0.5  0.5  1    1.5  2    2    3    4.5  5
Test Grade     | 72   84   68   85   77   81   48   90   99   88
Solution:
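The plotted solution is not reproduced here. As a rough stand-in, the following Python sketch places one marker per (hours, grade) pair on a character grid. The function name text_scatter and the grid dimensions are illustrative assumptions, not part of the original courseware.

```python
def text_scatter(xs, ys, width=41, height=11):
    """Return the rows of a character-grid scatter plot, one '*' per (x, y) pair.

    Assumes xs and ys each contain at least two distinct values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale x to a column (left to right) and y to a row (top is the largest y).
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y_max - y) / (y_max - y_min) * (height - 1))
        grid[row][col] = "*"
    return ["".join(cells) for cells in grid]


hours = [0, 0.5, 0.5, 1, 1.5, 2, 2, 3, 4.5, 5]       # x: hours of study
grades = [72, 84, 68, 85, 77, 81, 48, 90, 99, 88]    # y: test grade

plot_lines = text_scatter(hours, grades)
for line in plot_lines:
    print("|" + line)
print("+" + "-" * 41)
```

Hours of study goes on the horizontal axis because it is the independent variable, and test grade goes on the vertical axis as the dependent variable.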
Scatter Plots:
• We are interested in the relationship between the two variables being studied. Some relationships are stronger than others.
• Linear Relationship – a relationship in which the points appear to follow a straight line.
• Positive Slope – indicates that as the values of one variable increase, so do the values of the other variable.
• Negative Slope – indicates that as the values of one variable increase, the values of the other variable decrease.
Types of Slopes:
[Figure: two scatter plots, each with x on the horizontal axis, illustrating the types of slopes]
Determine the pattern of the scatter plot:
Determine whether the pattern of a scatter plot between the two variables would have a positive slope, a negative slope, or not follow a straight-line pattern.
a. The price of a used car and the number of miles it was driven.
   Negative slope
b. The pressure on a gas pedal and the speed of the car.
   Positive slope
c. Shoe size and IQ for adults.
   No apparent linear relationship
Correlation:
• Correlation is the mathematical term for the relationship between two variables.
• Strong Linear Correlation – when the points closely follow a straight line.
• Weak Linear Correlation – when the relationship seems to follow a straight line, but the points are more scattered.
• No Correlation – no apparent relationship between the variables.
Types of Relationships:
Correlation coefficient:
• Pearson Correlation Coefficient, ρ – the parameter that measures the strength of a linear relationship for the population.
• Correlation Coefficient, r – measures how strongly one variable is linearly dependent upon the other for a sample.
r = (n∑xy – ∑x∑y) / (√(n∑x² – (∑x)²) · √(n∑y² – (∑y)²))
When calculating the correlation coefficient, round your answers to three decimal places.
• –1 ≤ r ≤ 1
• Close to –1 means a strong negative correlation.
• Close to 0 means little or no linear correlation.
• Close to 1 means a strong positive correlation.
TI-84 Plus Instructions:
1. To make sure r and r² are displayed, turn on diagnostics. These steps only have to be performed once:
   a. Press 2ND, then 0 (CATALOG)
   b. Select DiagnosticOn
   c. Press ENTER
2. Press STAT, then EDIT
3. Type the x-variable values into L1
4. Type the y-variable values into L2
5. Press STAT, then CALC
6. Choose 4: LinReg(ax+b)
7. Press ENTER
Find the value of r:
Calculate the correlation coefficient, r, for the data shown below.
Hours of Study | 0    0.5  0.5  1    1.5  2    2    3    4.5  5
Test Grade     | 72   84   68   85   77   81   48   90   99   88
Solution:
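n = 10, ∑x = 20, ∑y = 792, ∑xy = 1690, ∑x² = 66, ∑y² = 64528
r = (10(1690) – (20)(792)) / √((10(66) – 20²)(10(64528) – 792²)) = 1060 / √4684160 ≈ 0.490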
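As a cross-check on the hand calculation, here is a minimal Python sketch that builds the same sums from the raw data and applies the computational formula for r. The function name correlation_coefficient is an illustrative choice, not something defined by the courseware.

```python
import math


def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r computed from the raw sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


hours = [0, 0.5, 0.5, 1, 1.5, 2, 2, 3, 4.5, 5]
grades = [72, 84, 68, 85, 77, 81, 48, 90, 99, 88]

r = correlation_coefficient(hours, grades)
print(f"r = {r:.3f}")  # r = 0.490
```

The value agrees with the result above when rounded to three decimal places.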
Null and Alternative Hypotheses Testing for Linear Relationships:
• Significant Linear Relationship (Two-Tailed Test):
  H0: ρ = 0 (Implies there is no significant linear relationship.)
  Ha: ρ ≠ 0 (Implies there is a significant linear relationship.)
• Negative Linear Relationship (Left-Tailed Test):
  H0: ρ ≥ 0
  Ha: ρ < 0
• Positive Linear Relationship (Right-Tailed Test):
  H0: ρ ≤ 0
  Ha: ρ > 0
Test Statistic:
t = r / √((1 – r²) / (n – 2)), with d.f. = n – 2
To determine whether the test statistic calculated from the sample is statistically significant, compare it to the critical value. Alternatively, when using technology such as a TI-84 Plus, compare the p-value to the level of significance.
Rejection Rule for Testing Linear Relationships:
• Significant Linear Relationship (Two-Tailed Test): Reject H0 if |t| ≥ tα/2
• Negative Linear Relationship (Left-Tailed Test): Reject H0 if t ≤ –tα
• Positive Linear Relationship (Right-Tailed Test): Reject H0 if t ≥ tα
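The three rejection rules can be written as one small decision function. This is a sketch under stated assumptions: the function name reject_null and the tail labels "two", "left", and "right" are hypothetical, and the critical value is assumed to be looked up separately (tα/2 for the two-tailed test, tα for the one-tailed tests) and passed in as a positive number.

```python
def reject_null(t, critical_value, tail):
    """Apply the rejection rule for a test of a linear relationship.

    t              -- test statistic computed from the sample
    critical_value -- positive critical t-value from a table or software
    tail           -- "two", "left", or "right" (labels chosen for this sketch)
    """
    if tail == "two":
        return abs(t) >= critical_value
    if tail == "left":
        return t <= -critical_value
    if tail == "right":
        return t >= critical_value
    raise ValueError("tail must be 'two', 'left', or 'right'")


# Values taken from the parking ticket example in this section:
# t = -2.614 and a two-tailed critical value of 2.160.
print(reject_null(-2.614, 2.160, "two"))    # True
print(reject_null(-2.614, 2.160, "right"))  # False
```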
Steps for Hypothesis Testing:
1. State the null and alternative hypotheses.
2. Set up the hypothesis test by choosing the test statistic and determining the values of the test statistic that would lead to rejecting the null hypothesis.
3. Gather data and calculate the necessary sample statistics.
4. Draw a conclusion.
Draw a conclusion:
Use a hypothesis test to determine if the linear relationship between the number of parking tickets a student receives during a semester and their GPA during the same semester is statistically significant at the 0.05 level of significance.
# of Tickets | 0    0    0    0    1    1    1    2    2    2    3    3    5    7    8
GPA          | 3.6  3.9  2.4  3.1  3.5  4.0  3.6  2.8  3.0  2.2  3.9  3.1  2.1  2.8  1.7
Solution:
First, state the hypotheses:
H0: ρ = 0
Ha: ρ ≠ 0
Next, set up the hypothesis test and determine the critical value:
d.f. = 13, α = 0.05
tα/2 = t0.05/2 = 2.160
Reject H0 if |t| ≥ tα/2, that is, if |t| ≥ 2.160.
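Solution (continued):
Gather the data and calculate the necessary sample statistics:
n = 15, r = –0.587
t = r / √((1 – r²) / (n – 2)) = –0.587 / √((1 – (–0.587)²) / (15 – 2)) ≈ –2.614
Finally, draw a conclusion:
Since |t| = 2.614 is greater than tα/2 = 2.160, we reject the null hypothesis. At the 0.05 level of significance, there is sufficient evidence of a significant linear relationship between the number of parking tickets and GPA.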
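The test statistic in this solution can be reproduced with a few lines of Python. This is a minimal sketch that takes the rounded r = –0.587 and n = 15 from the solution and the critical value 2.160 from the setup step. The function name t_statistic is an illustrative assumption.

```python
import math


def t_statistic(r, n):
    """Test statistic for a linear relationship: t = r / sqrt((1 - r^2) / (n - 2))."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


t = t_statistic(-0.587, 15)
print(f"t = {t:.3f}")  # t = -2.614

# Two-tailed rejection rule with the critical value 2.160 (d.f. = 13, alpha = 0.05).
reject = abs(t) >= 2.160
print(reject)  # True
```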
Using Critical Values to Determine Statistical Significance:
The correlation coefficient, r, is statistically significant if the absolute value of the correlation is greater than the critical value in the Pearson Correlation Coefficient Table.
|r| > rα
Determine the significance:
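Determine whether the following values of r are statistically significant.
a. r = 0.52, n = 19, α = 0.05
   rα = 0.456; yes, since 0.52 > 0.456.
b. r = 0.52, n = 19, α = 0.01
   rα = 0.575; no, since 0.52 < 0.575.
c. r = –0.44, n = 35, α = 0.01
   rα = 0.430; yes, since |–0.44| = 0.44 > 0.430.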
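The three comparisons in this exercise can be checked with a short Python sketch. The critical values are the ones quoted in the exercise from the Pearson Correlation Coefficient Table; the helper name is_significant is an illustrative assumption.

```python
def is_significant(r, critical_r):
    """r is statistically significant when |r| exceeds the table's critical value."""
    return abs(r) > critical_r


# (label, r, critical value from the table for the given n and alpha)
examples = [
    ("a", 0.52, 0.456),   # n = 19, alpha = 0.05
    ("b", 0.52, 0.575),   # n = 19, alpha = 0.01
    ("c", -0.44, 0.430),  # n = 35, alpha = 0.01
]

results = {label: is_significant(r, crit) for label, r, crit in examples}
print(results)  # {'a': True, 'b': False, 'c': True}
```

Taking the absolute value first is what lets the same rule handle the negative correlation in part c.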
Determine the significance:
Determine if r is statistically significant at the 0.05 level.
# of Tickets | 0    0    0    0    1    1    1    2    2    2    3    3    5    7    8
GPA          | 3.6  3.9  2.4  3.1  3.5  4.0  3.6  2.8  3.0  2.2  3.9  3.1  2.1  2.8  1.7
Solution:
n = 15, r = –0.587, α = 0.05
rα = 0.514
r is statistically significant if |r| > rα.
Since |–0.587| = 0.587 > 0.514, r is statistically significant at the 0.05 level.
Coefficient of Determination:
The coefficient of determination, r², is a measure of the proportion of the variation in y that is explained by the variation in x.
Coefficient of Determination:
If the correlation between the number of rooms in a house and its price is r = 0.65, how much of the variation in price can be explained by the relationship between the two variables?
Solution:
r = 0.65
r² = (0.65)² = 0.4225 ≈ 0.423
So about 42.3% of the variation in the price of a house can be explained by the relationship between the two variables.
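The same arithmetic as a minimal Python sketch, using r = 0.65 from the example. The function name coefficient_of_determination is an illustrative assumption.

```python
def coefficient_of_determination(r):
    """Proportion of the variation in y explained by x, given the correlation r."""
    return r ** 2


r = 0.65
r_squared = coefficient_of_determination(r)
percent_explained = 100 * r_squared
print(f"r^2 is about {r_squared:.4f}, or roughly {percent_explained:.1f}% of the variation")
```

Because r is squared, a correlation of –0.65 would give the same coefficient of determination.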