anand-satsangi
Relationships Among Variables
Correlation and Regression
KNES 510 Research Methods in Kinesiology
Correlation
Correlation is “a statistical technique used to determine the relationship between two or more variables”
We use two different techniques to determine score relationships:
1. graphing technique
2. mathematical technique called correlation
Graphs of the Relationship Between Variables
Types of Relationships
The scattergram can indicate a positive relationship, a negative relationship, or a zero relationship
What are the characteristics of positive, negative, and zero relationships?
Mathematical Technique: The Correlation Coefficient (r)
The correlation coefficient, r,* represents the relationship between the z-scores of the subjects on two different variables (usually designated X and Y)
This can be stated mathematically as the mean of the z-score products for all subjects
*A more complete name for this statistic is Pearson’s product-moment correlation coefficient
Formula for the Correlation Coefficient
The correlation coefficient can be calculated as follows:
$$r = \frac{\sum (Z_X Z_Y)}{N}$$
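The formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the four pairs of scores are hypothetical, and the z-scores use the population standard deviation (dividing by N) so that the mean of the z-score products equals r.

```python
import math

def z_scores(values):
    """Convert raw scores to z-scores using the population SD (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """r = sum(Z_X * Z_Y) / N, the mean of the z-score products."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Hypothetical scores for four subjects (illustrative only)
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]
r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

Correlating a variable with itself gives r = +1.00, and correlating it with its own negative gives r = -1.00, which matches the limits of the coefficient.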
The values of the coefficient will always range from +1.00 to -1.00
A correlation coefficient near 0.00 indicates no relationship
SPSS Bivariate Correlation Output
Correlations
                           X       Y
X   Pearson Correlation    1       .947
    Sig. (2-tailed)                .053
    N                      4       4
Y   Pearson Correlation    .947    1
    Sig. (2-tailed)        .053
    N                      4       4
Interpreting the Correlation Coefficient
Because the relationship between two sets of data is seldom perfect, the majority of correlation coefficients are fractions (0.92, -0.80, and the like)
When interpreting correlation coefficients it is sometimes difficult to determine what is high, low, and average
The Correlation Coefficient and Cause-and-Effect
There is a high correlation between a person's shoe size and their math skills in grades K through 6
Is this an example of cause-and-effect?
Can we predict math skill based on shoe size in grade K through 6 students?
Coefficient of Determination
The coefficient of determination is the amount of variability in one measure that is explained by the other measure
The coefficient of determination is the square of the correlation coefficient ($r^2$).
For example, if the correlation coefficient between two variables is r = 0.90, the coefficient of determination is $(0.90)^2 = 0.81$
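As a small illustration of how quickly explained variability falls as r shrinks, the sketch below squares a few correlation coefficients. Only r = 0.90 comes from the slide; the other values and the function name are hypothetical.

```python
def coefficient_of_determination(r):
    """Proportion of variability in one measure explained by the other (r squared)."""
    return r ** 2

# 0.90 is the slide's example; 0.50 and -0.80 are illustrative values
for r in (0.90, 0.50, -0.80):
    r2 = coefficient_of_determination(r)
    print(f"r = {r:+.2f} -> r^2 = {r2:.2f}")
```

Note that the sign of r drops out: a correlation of -0.80 explains the same share of variability as +0.80.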
Regression
When two variables are related (correlated), it is possible to predict a person’s score on one variable (Y) by knowing their score on the second variable (X)
This scatterplot illustrates that there is a strong, positive relationship between fat-free body mass and daily energy expenditure
Correlations
                                                   Fat-Free Mass (kg)    Energy Expenditure (kcal)
Fat-Free Mass (kg)          Pearson Correlation    1                     .981**
                            Sig. (2-tailed)                              .000
                            N                      7                     7
Energy Expenditure (kcal)   Pearson Correlation    .981**                1
                            Sig. (2-tailed)        .000
                            N                      7                     7
**. Correlation is significant at the 0.01 level (2-tailed).
Regression Line (Line of Best Fit)
The regression line is a line that best describes the trend in the data
This line is as close as possible to the data points
The equation for this line is:
Y' = bX + C
b = slope of the line, C = Y intercept
Fitting a Regression Line
Simple Prediction
Tests have been developed to predict VO2 max from the time it takes a person to run 1.5 miles
A person's VO2 max can thus be predicted from their 1.5 mile run time because a prediction or regression equation has been developed
The simple linear prediction or regression equation takes the following form:
Y' = a + bX
Y' = predicted value
a = intercept of the regression line (Y intercept)
b = slope of the regression line (change in Y with each change in X)
X = score on the predictor variable
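The terms defined above can be estimated from data. The sketch below assumes the usual least-squares criterion for "best fit" (the slides do not name the fitting method), and both the function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y' = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope: change in Y per unit change in X
    a = mean_y - b * mean_x    # intercept: predicted Y when X = 0
    return a, b

def predict(a, b, x_new):
    """Predicted value Y' for a new score on the predictor variable."""
    return a + b * x_new

# Hypothetical scores (illustrative only)
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]
a, b = fit_line(x, y)
print(f"Y' = {a:.2f} + {b:.2f}X")
print(f"Predicted Y for X = 5: {predict(a, b, 5):.2f}")
```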
Determining Error in Prediction
Unless two variables are perfectly related (-1.00 or +1.00) there will always be error associated with a prediction equation
We find the standard deviation of this error, the standard error of prediction ($s_{y \cdot x}$), using the following formula:
$$s_{y \cdot x} = s_y \sqrt{1 - r^2}$$
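To see how the standard error of prediction shrinks as the correlation strengthens, the formula can be evaluated for a few values of r. The standard deviation of Y used here (10.0) and the function name are hypothetical, chosen only for illustration.

```python
import math

def standard_error_of_prediction(s_y, r):
    """s_y.x = s_y * sqrt(1 - r^2)."""
    return s_y * math.sqrt(1 - r ** 2)

s_y = 10.0  # hypothetical standard deviation of the Y scores
for r in (0.0, 0.5, 0.9, 1.0):
    see = standard_error_of_prediction(s_y, r)
    print(f"r = {r:.1f} -> standard error of prediction = {see:.2f}")
```

With r = 0 the error is as large as the standard deviation of Y itself, and with a perfect correlation it falls to zero.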
Prediction and Residuals
A predicted score (Y’) ± $s_{y \cdot x}$ yields a range of scores within which a person’s true score on the predicted variable lies
If the standard error of prediction may be interpreted as the standard deviation of residuals, what are the odds that a person’s true score lies between Y’ ± $s_{y \cdot x}$?
The standard error of prediction for percent body fat estimated using the skinfold method is ±3.5%
If a person has their percent body fat estimated at 12%, between what two values does their true body fat lie (95% probability)?
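One way to work through a question like this is sketched below. It assumes the residuals are normally distributed, so that about 95% of true scores fall within roughly 1.96 standard errors of the predicted score; the function name is hypothetical.

```python
from statistics import NormalDist

def prediction_interval(predicted, see, confidence=0.95):
    """Range predicted +/- z * see, assuming normally distributed residuals."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return predicted - z * see, predicted + z * see

# Values from the slide: estimate of 12% body fat, standard error of 3.5%
low, high = prediction_interval(12.0, 3.5)
print(f"95% range: {low:.2f}% to {high:.2f}%")
```

Under the same assumption, a range of one standard error either side of the prediction (8.5% to 15.5% here) covers roughly 68% of cases.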
Which of the following will more precisely predict job performance?
A: r = 0.168
B: r = 0.686
Sample SPSS Output
Here is the SPSS output for regressing Work Simulation Job Performance (Dependent Variable) against Supervisor Ratings (Independent Variable)
Coefficients (a)
                            Unstandardized Coefficients    Standardized Coefficients
Model                       B          Std. Error          Beta                         t          Sig.
1   (Constant)              -1.156     .675                                             -1.712     .089
    Supervisor Ratings      .033       .016                .168                         2.053      .042
a. Dependent Variable: Work Simulation Job Performance
This information can be used to create a prediction (regression) equation for predicting work performance of future applicants from supervisor ratings
Y' = -1.156 + 0.033X
Work Simulation Job Performance may also be predicted from Arm Strength
Here is the SPSS output:
Coefficients (a)
                            Unstandardized Coefficients    Standardized Coefficients
Model                       B          Std. Error          Beta                         t          Sig.
1   (Constant)              -4.095     .392                                             -10.454    .000
    Arm Strength (lbs)      .055       .005                .686                         11.353     .000
a. Dependent Variable: Work Simulation Job Performance
This information can be used to create a prediction (regression) equation for predicting work performance of future applicants from arm strength
Y' = -4.095 + 0.055X
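The two prediction equations taken from the SPSS coefficients can be written as small functions. The constants and slopes come from the output shown in the slides; the function names and the applicant scores (a supervisor rating of 50 and an arm strength of 80 lbs) are hypothetical and serve only to show the arithmetic.

```python
def predict_from_supervisor_rating(rating):
    """Y' = -1.156 + 0.033X, with X = supervisor rating."""
    return -1.156 + 0.033 * rating

def predict_from_arm_strength(strength_lbs):
    """Y' = -4.095 + 0.055X, with X = arm strength in lbs."""
    return -4.095 + 0.055 * strength_lbs

# Hypothetical applicant scores (illustrative only)
y_rating = predict_from_supervisor_rating(50)
y_strength = predict_from_arm_strength(80)
print(f"Predicted from supervisor rating: {y_rating:.3f}")
print(f"Predicted from arm strength:      {y_strength:.3f}")
```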
We now have two regression equations for predicting Work Simulation Job Performance
Which is the better equation for accurate prediction?
To determine this, we must examine the standard error of prediction for each equation
Standard error of prediction using Supervisor Ratings:
Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .168 (a)   .028       .022                1.66078
a. Predictors: (Constant), Supervisor Ratings
Standard error of prediction using Arm Strength:
Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .686 (a)   .471       .467                1.22582
a. Predictors: (Constant), Arm Strength (lbs)
Which is the better equation?
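The comparison can be made explicit with the values printed in the two Model Summary tables. The sketch below checks that each R Square is the square of R (within rounding) and then picks the predictor with the smaller standard error of the estimate; the dictionary layout and variable names are hypothetical.

```python
# Values copied from the two Model Summary tables
models = {
    "Supervisor Ratings": {"r": 0.168, "r_square": 0.028, "see": 1.66078},
    "Arm Strength (lbs)": {"r": 0.686, "r_square": 0.471, "see": 1.22582},
}

for name, m in models.items():
    # R Square should equal r squared, up to rounding in the printed output
    assert abs(m["r"] ** 2 - m["r_square"]) < 0.001
    print(f"{name}: r^2 = {m['r'] ** 2:.3f}, standard error = {m['see']:.5f}")

# The smaller the standard error of the estimate, the more precise the prediction
best = min(models, key=lambda name: models[name]["see"])
print(f"Smaller standard error: {best}")
```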
Multiple Prediction
A prediction formula using a single measure X is usually not very accurate for predicting a person's score on measure Y
Multiple correlation-regression techniques allow us to predict score Y using several X scores
The general form of a two predictor multiple regression equation is:
$Y' = a + b_1 X_1 + b_2 X_2$
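A two-predictor equation of this form can be fitted by least squares. The sketch below is one possible approach, not the method used in the slides: it solves the two normal equations directly, the function name is hypothetical, and the data are constructed so that Y = 1 + 2X1 + 3X2 exactly, which lets the recovered coefficients be checked by eye.

```python
def fit_two_predictors(x1, x2, y):
    """Return (a, b1, b2) for the least-squares equation Y' = a + b1*X1 + b2*X2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Sums of squares and cross-products around the means
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero if the two predictors are perfectly correlated
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2

# Hypothetical data built from Y = 1 + 2*X1 + 3*X2 (illustrative only)
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]
a, b1, b2 = fit_two_predictors(x1, x2, y)
print(f"Y' = {a:.2f} + {b1:.2f}*X1 + {b2:.2f}*X2")
```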
An example of multiple correlation-regression is the prediction of percent body fat from multiple skinfold measurements
DB (g/cc) = 1.0994921 - 0.0009929 (Σ3SKF) + 0.0000023 (Σ3SKF)^2 - 0.0001392 (age)
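The body density equation above translates directly into a function. The coefficients are those given on the slide; the function name, the units (skinfold sum in millimetres, age in years), and the example inputs are assumptions made for illustration.

```python
def body_density(sum3_skinfolds, age):
    """DB (g/cc) from the sum of three skinfolds and age, per the slide's equation.

    Units are assumed here: skinfold sum in mm, age in years.
    """
    s = sum3_skinfolds
    return 1.0994921 - 0.0009929 * s + 0.0000023 * s ** 2 - 0.0001392 * age

# Hypothetical person: skinfold sum of 60 and age 25 (illustrative only)
db = body_density(60, 25)
print(f"DB = {db:.4f} g/cc")
```

The equation returns body density rather than percent body fat; converting one to the other needs a further equation that is not given in the slides.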
Next Class
Chapters 9 & 11
Mock Proposals in class!