Introduction to Correlation and Regression
Ginger Holmes Rowell, Ph.D.
Associate Professor of Mathematics
Middle Tennessee State University
Outline
  Introduction
  Linear Correlation
  Regression
  Simple Linear Regression
  Using the TI-83
  Model/Formulas
Outline continued
  Applications
  Real-life Applications
  Practice Problems
  Internet Resources
  Applets
  Data Sources
Correlation
Example (positive correlation)
Specific Example
For seven random summer days, a person recorded the temperature and their water consumption during a three-hour period spent outside.
Temperature (F)  Water Consumption (ounces)
75 16
83 20
85 25
85 27
92 32
97 48
99 48
How would you describe the graph?
How “strong” is the linear relationship?
Measuring the Relationship
Pearson’s Sample Correlation Coefficient, r
Direction of Association
_______ Correlation    _______ Correlation
Strength of Linear Association
r value  Interpretation
1
0
-1
Other Strengths of Association
r value  Interpretation
0.9
0.5
0.25
Formula
Σ = the sum
n = number of paired items
x_i = input variable
y_i = output variable
x̄ = x-bar = mean of x’s
ȳ = y-bar = mean of y’s
s_x = standard deviation of x’s
s_y = standard deviation of y’s
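The symbols in the legend above are consistent with the standardized-product form of Pearson's r, namely r = (1/(n-1)) Σ ((x_i - x̄)/s_x)((y_i - ȳ)/s_y). The slide's own formula did not survive in this copy, so treating it as this form is an assumption. A minimal Python sketch, using the temperature and water data from the example and a made-up function name `pearson_r`:

```python
from statistics import mean, stdev

# Data from the temperature / water consumption example.
temperature = [75, 83, 85, 85, 92, 97, 99]   # x values (degrees F)
water = [16, 20, 25, 27, 32, 48, 48]         # y values (ounces)

def pearson_r(xs, ys):
    """Sample correlation coefficient (assumed standardized-product form)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)          # sample standard deviations (n - 1)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

r = pearson_r(temperature, water)
print(f"r = {r:.3f}")
```

A value close to 1 indicates a strong positive linear association, which matches the upward trend in the table.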
Internet Resources: Correlation
Guessing Correlations - An interactive site that allows you to try to match correlation coefficients to scatterplots. University of Illinois, Urbana-Champaign Statistics Program. http://www.stat.uiuc.edu/~stat100/java/guess/GCApplet.html
Regression
Specific statistical methods for finding the “line of best fit” for one response (dependent) numerical variable based on one or more explanatory (independent) variables.
Curve Fitting vs. Regression
Regression
Includes using statistical methods to assess the "goodness of fit" of the model (e.g., the correlation coefficient).
Regression: 3 Main Purposes
  To describe (or model)
  To predict (or estimate)
  To control (or administer)
Simple Linear Regression
Statistical method for finding the “line of best fit” for one response (dependent) numerical variable based on one explanatory (independent) variable.
Least Squares Regression: Example
Least Squares Regression
GOAL -
This minimizes the ________________
Need to find a mean square error applet
Allan/Beth maybe Kyle
Internet Resources: Regression
Estimate the Regression Line. Compare the mean square error from different regression lines. Can you find the minimum mean square error? Rice University Virtual Statistics Lab. http://www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/index.html
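The comparison the applet invites can also be tried offline. The sketch below computes the mean square error of a few candidate lines on the temperature and water data. The three candidate lines and the function name `mean_square_error` are hypothetical choices made for illustration; they are not taken from the applet.

```python
# Data from the temperature / water consumption example.
temperature = [75, 83, 85, 85, 92, 97, 99]
water = [16, 20, 25, 27, 32, 48, 48]

def mean_square_error(slope, intercept, xs, ys):
    """Average of the squared vertical distances from the points to the line."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return sum(e * e for e in residuals) / len(xs)

# Hypothetical "by eye" guesses, written as (slope, intercept).
candidates = {
    "too flat": (1.0, -57.0),
    "too steep": (2.0, -145.0),
    "close fit": (1.45, -96.8),
}

results = {label: mean_square_error(m, b, temperature, water)
           for label, (m, b) in candidates.items()}

for label, mse in results.items():
    print(f"{label:10s} MSE = {mse:.2f}")
```

The candidate with the smallest mean square error is the best of the three; the least squares line is the one that no other straight line can beat on this measure.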
Example
Plan an outdoor party.
Estimate the number of soft drinks to buy per person, based on how hot the weather is.
Use Temperature/Water data and regression.
Steps to Reaching a Solution
  Draw a scatterplot of the data.
  Visually, consider the strength of the linear relationship.
  If the relationship appears relatively strong, find the correlation coefficient as a numerical verification.
  If the correlation is still relatively strong, then find the simple linear regression line.
Our Next Steps
  Estimate the line using algebra (i.e., practice finding equations of lines).
  Learn to use the TI-83/84 for correlation and regression.
  Interpret the results (in the context of the problem).
Example
Temperature (F)  Water Consumption (ounces)
75 16
83 20
85 25
85 27
92 32
97 48
99 48
Using Pencil and Paper
  Draw a scatterplot.
  Draw your estimate of the line of best fit on the scatterplot.
  Find the equation of YOUR line.
Finding the Solution: TI-83/84
Using the TI-83/84 calculator
  Turn on the calculator diagnostics.
  Enter the data.
  Graph a scatterplot of the data.
  Find the equation of the regression line and the correlation coefficient.
  Graph the regression line on a graph with the scatterplot.
Preliminary Step
Turn the Diagnostics On.
  Press 2nd 0 (for Catalog).
  Scroll down to DiagnosticOn. The marker points to the right of the words.
  Press ENTER. Press ENTER again.
  The word Done should appear on the right hand side of the screen.
Example
Temperature (F)  Water Consumption (ounces)
75 16
83 20
85 25
85 27
92 32
97 48
99 48
Estimate the Line Using Algebra
  Draw a scatter plot.
  Visualize the line of best fit.
  Find the equation of that line.
Point-Slope Form
Using Two Points on a Line
We will use graph paper for this.
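Once two points on the hand-drawn line have been read off the graph paper, the algebra is mechanical. The sketch below finds the slope from the two points and then solves the point-slope form y - y1 = m(x - x1) for the intercept. The two points used here, (75, 16) and (99, 48), are only an example choice (the first and last rows of the table), and `line_through` is a made-up helper name.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no slope-intercept form")
    slope = (y2 - y1) / (x2 - x1)      # rise over run
    intercept = y1 - slope * x1        # from y - y1 = slope * (x - x1) at x = 0
    return slope, intercept

# Example choice of two points on an eyeballed line.
slope, intercept = line_through((75, 16), (99, 48))
print(f"y = {slope:.3f}x + ({intercept:.1f})")
```

A different pair of points gives a different line, which is why an estimate made by eye will usually not agree exactly with the calculator's regression line.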
1. Enter the Data into Lists
  Press STAT.
  Under EDIT, select 1: Edit.
  Enter x-values (input) into L1.
  Enter y-values (output) into L2.
  After data is entered in the lists, press 2nd MODE to quit and return to the home screen.
  Note: If you need to clear out a list, for example list 1, place the cursor on L1, then hit CLEAR and ENTER.
2. Set up the Scatterplot.
  Press 2nd Y= (STAT PLOTS).
  Select 1: PLOT 1 and hit ENTER.
  Use the arrow keys to move the cursor down to On and hit ENTER.
  Arrow down to Type: and select the first graph under Type.
  Under Xlist: Enter L1.
  Under Ylist: Enter L2.
  Under Mark: select any of these.
3. View the Scatterplot
  Press 2nd MODE to quit and return to the home screen.
  To plot the points, press ZOOM and select 9: ZoomStat.
  The scatterplot will then be graphed.
4. Find the regression line.
  Press STAT.
  Press CALC.
  Select 4: LinReg(ax + b).
  Press 2nd 1 (for List 1).
  Press the comma key.
  Press 2nd 2 (for List 2).
  Press ENTER.
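For readers without a calculator at hand, the quantities that LinReg(ax + b) reports can be approximated with a few lines of Python. The sketch below uses the usual least squares formulas a = Sxy/Sxx and b = ȳ - a·x̄. The function name `lin_reg` is made up, and this is an illustration under those textbook formulas, not a description of how the TI-83/84 computes its result internally.

```python
from statistics import mean

# L1 and L2 from step 1: temperature (x) and water consumption (y).
list1 = [75, 83, 85, 85, 92, 97, 99]
list2 = [16, 20, 25, 27, 32, 48, 48]

def lin_reg(xs, ys):
    """Least squares fit y = a*x + b, plus r and r squared."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = s_xy / s_xx                    # slope
    b = y_bar - a * x_bar              # y-intercept
    r = s_xy / (s_xx * s_yy) ** 0.5    # correlation coefficient
    return a, b, r, r * r

a, b, r, r2 = lin_reg(list1, list2)
print(f"a = {a:.4f}, b = {b:.4f}, r = {r:.4f}, r^2 = {r2:.4f}")
```

The values of r and r² only appear on the calculator when the diagnostics have been turned on, as in the preliminary step.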
5. Interpreting and Visualizing
Interpreting the result: y = ax + b
  The value of a is the __________
  The value of b is the __________
  r is the ____________________
  r² is the ____________________
5. Interpreting and Visualizing (continued)
  Write down the equation of the line in slope-intercept form.
  Press Y= and enter the equation under Y1. (Clear all other equations.)
  Press GRAPH and the line will be graphed through the data points.
Questions ???
Interpretation in Context
Regression Equation: y = 1.5*x - 96.9
Water Consumption = 1.5*Temperature - 96.9
Interpretation in Context
Slope = ____________________ (don’t forget units)
Interpretation in context of problem:
Interpretation in Context
y-intercept = _______
Interpretation (general):
Interpretation (problem context):
Prediction Example
Predict the amount of water a person would drink when the temperature is 95 degrees F.
Method:
Solution: If x = 95, y = _______________________
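The prediction amounts to substituting the temperature into the regression equation. The sketch below does this with the rounded coefficients shown on the slide (1.5 and -96.9); the function name `predict_water` is made up for illustration. Because the slide's coefficients are rounded, the result can differ by a few ounces from what unrounded least squares coefficients would give.

```python
def predict_water(temperature_f, slope=1.5, intercept=-96.9):
    """Predicted water consumption (ounces) from the slide's rounded equation."""
    return slope * temperature_f + intercept

prediction = predict_water(95)
print(f"Predicted water consumption at 95 F: {prediction:.1f} ounces")
```

The equation is only trustworthy for temperatures near the observed range of 75 to 99 degrees F; far outside that range the line can give impossible values, such as negative consumption.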
Strength of the Association: r²
Coefficient of Determination = r²
General Interpretation:
Interpretation of r²
Example: r² = 92.7%.
Interpretation (problem context):
Note:
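One common reading of r² is the proportion of the variation in the response that the fitted line accounts for. The sketch below checks this on the temperature and water data by comparing the variation left over after fitting (SSE) with the total variation around the mean (SST). The variable names are illustrative, and the least squares fit is recomputed inside the block so that it stands alone.

```python
from statistics import mean

temperature = [75, 83, 85, 85, 92, 97, 99]
water = [16, 20, 25, 27, 32, 48, 48]

x_bar, y_bar = mean(temperature), mean(water)

# Least squares slope and intercept.
s_xx = sum((x - x_bar) ** 2 for x in temperature)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temperature, water))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Variation left unexplained by the line, and total variation around the mean.
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(temperature, water))
sst = sum((y - y_bar) ** 2 for y in water)

r_squared = 1 - sse / sst
print(f"r^2 = {r_squared:.1%}")
```

In the problem context, this says that most of the day-to-day variation in water consumption is associated with the linear relationship to temperature; it does not by itself show that temperature causes the change.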
Questions ???
Simple Linear Regression Model
The model for simple linear regression is
There are mathematical assumptions behind the concepts that we are covering today.
Formulas
Prediction Equation:
Real Life Applications
Cost Estimating for Future Space Flight Vehicles (Multiple Regression)
Nonlinear Application
Predicting when Solar Maximum Will Occur
http://science.msfc.nasa.gov/ssl/pad/solar/predict.htm
Real Life Applications
Estimating Seasonal Sales for Department Stores (Periodic)
Real Life Applications
Predicting Student Grades Based on Time Spent Studying
Real Life Applications
. . .
What ideas can you think of?
What ideas can you think of that your students will relate to?
Practice Problems
  Measure Height vs. Arm Span
  Find line of best fit for height.
  Predict height for one student not in data set. Check predictability of model.
Practice Problems
  Is there any correlation between shoe size and height?
  Does gender make a difference in this analysis?
Practice Problems
  Can the number of points scored in a basketball game be predicted by
    The time a player plays in the game?
    By the player’s height?
(Idea modified from Steven King, Aiken, SC. NCTM presentation 1997.)
Resources
Data Analysis and Statistics. Curriculum and Evaluation Standards for School Mathematics. Addenda Series, Grades 9-12. NCTM. 1992.
Data and Story Library. Internet Website. http://lib.stat.cmu.edu/DASL/ 2001.
Internet Resources: Regression
Effects of adding an Outlier. W. West, University of South Carolina. http://www.stat.sc.edu/~west/javahtml/Regression.html
Internet Resources: Data Sets
Data and Story Library. Excellent source for small data sets. Search for specific statistical methods (e.g. boxplots, regression) or for data concerning a specific field of interest (e.g. health, environment, sports). http://lib.stat.cmu.edu/DASL/
Internet Resources: Data Sets
FEDSTATS. "The gateway to statistics from over 100 U.S. Federal agencies" http://www.fedstats.gov/
"Kid's Pages." (not all related to statistics) http://www.fedstats.gov/kids.html
Internet Resources: Other
Statistics Applets. Using Web Applets to Assist in Statistics Instruction. Robin Lock, St. Lawrence University. http://it.stlawu.edu/~rlock/maa99/
Internet Resources: Other
Ten Websites Every Statistics Instructor Should Bookmark. Robin Lock, St. Lawrence University. http://it.stlawu.edu/~rlock/10sites.html
CAUSEweb – digital library for undergraduate statistics education
For More Information…
On-line version of this presentation: http://www.mtsu.edu/~stats/corregpres/index.html
More information about regression: Visit the STATS @ MTSU web site, http://www.mtsu.edu/~stats