Faculty of Economics • Gazdaságelméleti és Módszertani Intézet (Institute of Economic Theory and Methodology)
Correlation & Linear Regression in SPSS
Petra Petrovics
4th seminar
Types of dependence
• association – between two nominal variables
• mixed – between a nominal and a ratio variable
• correlation – between ratio variables
• X (or X1, X2, …, Xp): known variable(s) / independent variable(s) / predictor(s)
• Y: unknown variable / dependent variable
• causal relationship: X "causes" Y to change
• Correlation: describes the strength of a relationship, the degree to which one variable is linearly related to another
• Regression: shows us how to determine the nature of a relationship between two or more variables
Correlation Measures
1. Covariance
2. Coefficient of correlation
3. Coefficient of determination
4. Coefficient of rank correlation
1. Covariance
• A measure of the joint variation of the two variables;
• An average value of the product of the deviations of observations on 2 random variables from their sample means.
– ranges from -∞ to +∞;
– C = 0 when X and Y are uncorrelated;
– its sign shows the direction of correlation;
– it doesn't measure the degree of relationship!!!
$C_{x,y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n - 1}$
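The covariance can be checked by hand on a few numbers. The sketch below assumes the n - 1 divisor (the one SPSS uses in its output); the function name and the toy data are hypothetical and do not come from Employee data.sav. It also illustrates why the covariance shows direction only: rescaling one variable rescales C.

```python
def sample_covariance(xs, ys):
    """Sum of the products of deviations from the means, divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical toy data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)
# Changing the unit of y (e.g. dollars -> cents) multiplies the covariance by 100,
# although the relationship itself is exactly as strong as before.
cov_scaled = sample_covariance(x, [100 * v for v in y])
print(cov_xy, cov_scaled)
```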
2. Coefficient of correlation
• Pearson correlation
• A measure of how closely related two data series are.
• Its sign shows the direction of correlation
• It measures the strength of correlation
– r = 0: X and Y are uncorrelated
– 0 < |r| < 1: statistical dependence
– r = -1: perfect negative linear relationship
– r = 1: perfect positive linear relationship
• Use it only for linear relationships!
$r = \frac{C_{x,y}}{s_x s_y} = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \sum d_y^2}}$, where $d_x = x - \bar{x}$ and $d_y = y - \bar{y}$
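A minimal sketch of the deviation form of the Pearson coefficient, r = Σ dx·dy / sqrt(Σ dx² · Σ dy²). The function name and the toy data are hypothetical. Unlike the covariance, r does not change when a variable is rescaled, which is why it can measure strength.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation from deviations d_x = x - mean(x), d_y = y - mean(y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)


# Hypothetical toy data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_rescaled = pearson_r(x, [100 * v for v in y])  # same strength, different unit
print(round(r, 3), round(r_rescaled, 3))
```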
3. Coefficient of determination
• r²
• The square of the sample correlation coefficient between the outcomes and their predicted values.
• Measures the strength of the relationship as a percentage (%)
• It provides a measure of how well future outcomes are likely to be predicted by the model.
• Ranges from 0 to 1.
$r^2 = \frac{S_{\hat{y}}}{S_y} = 1 - \frac{S_e}{S_y}$, where $S_y$ is the total, $S_{\hat{y}}$ the explained and $S_e$ the residual sum of squares
Exercise 1 – Correlation
File / Open / Employee data.sav
Is there any relation between
- current salary and
- beginning salary?
CORRELATION
Analyze / Correlate / Bivariate…
• r (Pearson correlation): shows direction (+ / -) and strength
• C (covariance): just direction!
• 0 < |r| < 0.3: weak dependence
• 0.3 ≤ |r| < 0.7: medium-strong dependence
• 0.7 ≤ |r| < 1: strong dependence
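The rule of thumb on this slide can be written as a small helper. This is a sketch only: the function name is hypothetical, and the treatment of the exact boundaries (0.3 and 0.7 counted in the upper band) is an assumption, since the slide does not say which side they belong to. The two sample values are the coefficients reported in the SPSS outputs of this seminar.

```python
def describe_correlation(r):
    """Translate r into the direction / strength wording used in the seminar."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "uncorrelated"
    direction = "direct" if r > 0 else "inverse"
    strength = abs(r)
    if strength < 0.3:          # assumption: 0.3 itself counts as medium-strong
        label = "weak"
    elif strength < 0.7:        # assumption: 0.7 itself counts as strong
        label = "medium-strong"
    else:
        label = "strong"
    return f"{direction} relationship, {label} dependence"


print(describe_correlation(0.880))   # current salary vs. beginning salary
print(describe_correlation(-0.097))  # current salary vs. previous experience
```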
Output

Descriptive Statistics
                    Mean          Std. Deviation   N
Current Salary      $34,419.57    $17,075.661      474
Beginning Salary    $17,016.09    $7,870.638       474

Correlations
                                                        Current Salary      Beginning Salary
Current Salary     Pearson Correlation                  1                   .880**
                   Sig. (2-tailed)                                          .000
                   Sum of Squares and Cross-products    137916495436.340    55948605047.73
                   Covariance                           291578214.45        118284577.27
                   N                                    474                 474
Beginning Salary   Pearson Correlation                  .880**              1
                   Sig. (2-tailed)                      .000
                   Sum of Squares and Cross-products    55948605047.73      29300904965.45
                   Covariance                           118284577.27        61946944.96
                   N                                    474                 474
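The numbers in this output are consistent with the formulas from the earlier slides, which can be verified with a few lines of arithmetic. The sketch below only reuses values printed in the output; the variable names are hypothetical.

```python
# Values copied from the SPSS output above
cov_xy = 118284577.27          # covariance of current and beginning salary
sd_current = 17075.661         # std. deviation of current salary
sd_beginning = 7870.638        # std. deviation of beginning salary
cross_products = 55948605047.73
n = 474

# r = C / (s_x * s_y): reproduces the reported Pearson correlation of .880
r = cov_xy / (sd_current * sd_beginning)

# Coefficient of determination: the share of variation the two salaries have in common
r_squared = r ** 2

# Covariance = sum of cross-products / (n - 1)
cov_from_cross_products = cross_products / (n - 1)

print(round(r, 3), round(r_squared, 3), round(cov_from_cross_products, 2))
```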
Exercise 2 – Multiple Correlation
Is there any relation between
• the current salary
• previous experience (months)
• months since hire
• beginning salary?
MULTIPLE CORRELATION
Analyze / Correlate / Bivariate…
• r (Pearson correlation): shows direction (+ / -) and strength
• C (covariance): just direction!
• 0 < |r| < 0.3: weak dependence
• 0.3 ≤ |r| < 0.7: medium-strong dependence
• 0.7 ≤ |r| < 1: strong dependence
Output View

Correlations (matrix)
                                                             Current         Previous Experience   Months         Beginning
                                                             Salary          (months)              since Hire     Salary
Current Salary        Pearson Correlation                    1               -.097*                .084           .880**
                      Sig. (2-tailed)                                        .034                  .067           .000
                      Sum of Squares and Cross-products      1.379E+011      -82332343.5           6833347.5      5.59E+010
                      Covariance                             291578214.5     -174064.151           14446.823      118284577
                      N                                      474             474                   474            474
Previous Experience   Pearson Correlation                    -.097*          1                     .003           .045
(months)              Sig. (2-tailed)                        .034                                  .948           .327
                      Sum of Squares and Cross-products      -82332343.54    5173806.810           1482.241       17573777
                      Covariance                             -174064.151     10938.281             3.134          37153.862
                      N                                      474             474                   474            474
Months since Hire     Pearson Correlation                    .084            .003                  1              -.020
                      Sig. (2-tailed)                        .067            .948                                 .668
                      Sum of Squares and Cross-products      6833347.489     1482.241              47878.295      -739866.50
                      Covariance                             14446.823       3.134                 101.223        -1564.200
                      N                                      474             474                   474            474
Beginning Salary      Pearson Correlation                    .880**          .045                  -.020          1
                      Sig. (2-tailed)                        .000            .327                  .668
                      Sum of Squares and Cross-products      55948605048     17573776.7            -739866.5      2.93E+010
                      Covariance                             118284577.3     37153.862             -1564.200      61946945
                      N                                      474             474                   474            474
*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).

• C (Covariance rows): negative value = inverse relationship, positive value = direct relationship
• r (Pearson Correlation rows): negative and close to 0 (e.g. -.097) = inverse relationship & weak dependence
• r (Pearson Correlation rows): positive and close to 1 (e.g. .880) = direct relationship & strong dependence
Linear regression
ŷ = b0 + b1x
(Chart: regression line, y plotted against x)
• b0 (intercept): the value of ŷ when x = 0
• b1 (slope): for every 1-unit increase in x we expect y to change by b1 units on average
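To make b0 and b1 concrete, the sketch below estimates them by ordinary least squares, which is the method a linear regression procedure uses. The function name and the toy data are hypothetical and are not taken from the seminar's data file.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) of the line y-hat = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_dxdy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_dx2 = sum((x - mean_x) ** 2 for x in xs)
    b1 = sum_dxdy / sum_dx2        # slope: average change in y per unit of x
    b0 = mean_y - b1 * mean_x      # intercept: fitted value at x = 0
    return b0, b1


# Hypothetical toy data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
print(round(b0, 3), round(b1, 3))
```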
Exercise 3 – Linear Regression
File / Open / Employee data.sav
Determine a linear relationship between the salary and the age of the employees!
Create a new variable!
Create a new variable: age = this year – date of birth (in years)
Transform / Compute Variable…
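Outside SPSS, the same derived variable can be sketched as follows. The function name, the reference year and the birth date are hypothetical placeholders; the slide does not state which year "this year" refers to.

```python
from datetime import date


def age_in_years(birth_date, this_year):
    """age = this year - year of birth, as defined on the slide."""
    return this_year - birth_date.year


# Hypothetical example: an employee born on 3 February 1952, evaluated in 2010
example_age = age_in_years(date(1952, 2, 3), 2010)
print(example_age)
```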
Regression
Analyze / Regression / Linear…
Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .146(a)   .021       .019                $16,928.804
a. Predictors: (Constant), age

• R: multiple correlation coefficient. It expresses the combined effect of all the variables acting on the dependent variable. Here R = .146: weak dependence. For two independent variables:
$R = \sqrt{\frac{r_{y1}^2 + r_{y2}^2 - 2 r_{y1} r_{y2} r_{12}}{1 - r_{12}^2}}$
• R Square: multiple determination coefficient. It shows what percentage of the variation of the dependent variable is explained by the variation of all the independent variables. Here the regression model explains 2.1% of the variation of the dependent variable (current salary).
• Adjusted R Square: adjusted multiple determination coefficient.
$\bar{R}^2 = 1 - \frac{n - 1}{n - p - 1}(1 - R^2)$
It makes the multiple determination coefficient comparable across populations / samples of different sizes and with different numbers of independent variables, because it controls for the sample / population size (n) and the number of independent variables (p).
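Both formulas on this slide are easy to check numerically. The sketch below uses hypothetical function names; the inputs for the adjusted coefficient are the rounded values from the Model Summary (R² = .021, n = 474, p = 1), and the two-predictor example borrows three correlations from the Exercise 2 matrix purely as an illustration.

```python
import math


def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p independent variables."""
    return 1 - (n - 1) / (n - p - 1) * (1 - r2)


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of y with two predictors, from pairwise correlations."""
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12
    return math.sqrt(numerator / (1 - r_12 ** 2))


# Model Summary above: reproduces the reported Adjusted R Square of .019
adj = adjusted_r2(0.021, 474, 1)

# Illustration: current salary (y) with beginning salary (1) and previous
# experience (2); r_y1 = .880, r_y2 = -.097, r_12 = .045 from the Exercise 2 matrix
multiple_r = multiple_r_two_predictors(0.880, -0.097, 0.045)

print(round(adj, 3), round(multiple_r, 3))
```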
F-test: for model testing
The F ratio (in the Analysis of Variance table) is 10.241 and is significant at p = .001. This provides evidence of a linear relationship between the variables.
We can accept the model at every conventional significance level.
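The F ratio is tied to the other statistics of the model. The sketch below uses the standard relationship F = (R² / p) / ((1 − R²) / (n − p − 1)), which is not stated on the slides, and the fact that with a single predictor F equals the square of that predictor's t statistic (t = −3.200 in the Coefficients table of this exercise). Function and variable names are hypothetical; the small gap between the two results comes from R² being rounded to three decimals.

```python
def f_statistic(r2, n, p):
    """F ratio of a regression with p predictors fitted on n observations."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


# From the rounded R Square (.021), n = 474 employees, p = 1 predictor (age)
f_from_rounded_r2 = f_statistic(0.021, 474, 1)

# With one predictor, F is the square of the slope's t statistic
f_from_t = (-3.200) ** 2

print(round(f_from_rounded_r2, 2), round(f_from_t, 2))
```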
Coefficients(a)

                   Unstandardized Coefficients    Standardized Coefficients
Model 1            B             Std. Error       Beta                         t         Sig.
(Constant)  [b0]   41543.805     2358.686                                      17.613    .000
age         [b1]   -211.609      66.124           -.146                        -3.200    .001
a. Dependent Variable: Current Salary

The regression line: ŷ = b0 + b1x = 41543.805 - 211.609x
• b0: if x is 0, what is the value of y? A 0-year-old employee would earn $41,543.805. (This has no meaningful interpretation.)
• b1: if x increases by 1 unit, how much does y change? An employee who is 1 year older earns $211.609 less on average.
• We can accept the parameters at every conventional significance level.
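The fitted line can be used directly for prediction. This is a minimal sketch with the coefficients from the table above; the function name and the example age of 40 are hypothetical. It also shows that each t value is simply B divided by its standard error.

```python
B0 = 41543.805     # (Constant), from the Coefficients table
B1 = -211.609      # age coefficient, from the Coefficients table
SE_B1 = 66.124     # standard error of the age coefficient


def predict_salary(age):
    """Predicted current salary in dollars for an employee of the given age."""
    return B0 + B1 * age


salary_at_40 = predict_salary(40)                           # hypothetical example age
one_year_difference = predict_salary(41) - predict_salary(40)
t_age = B1 / SE_B1                                          # matches the reported t of -3.200

print(round(salary_at_40, 3), round(one_year_difference, 3), round(t_age, 3))
```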
Exercise 4 – Curve Estimation
File / Open / Employee data.sav
Determine the relationship between the salary and the age of the employees! Which regression model fits best?
Analyze / Regression / Curve Estimation…
Models:
• Linear
• Compound
• Power
To get a chart
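The three models differ only in their functional form: linear ŷ = a + b·x, compound ŷ = a·b^x, power ŷ = a·x^b. One common way to estimate the compound and power models, sketched below, is to take logarithms and fit a straight line to the transformed data. The function names and toy data are hypothetical, and the sketch is not meant to reproduce SPSS's exact procedure.

```python
import math


def fit_linear(xs, ys):
    """Least-squares (a, b) for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b


def fit_compound(xs, ys):
    """(a, b) for y = a * b**x, via ln(y) = ln(a) + x * ln(b). Requires y > 0."""
    ln_a, ln_b = fit_linear(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), math.exp(ln_b)


def fit_power(xs, ys):
    """(a, b) for y = a * x**b, via ln(y) = ln(a) + b * ln(x). Requires x, y > 0."""
    ln_a, b = fit_linear([math.log(x) for x in xs], [math.log(y) for y in ys])
    return math.exp(ln_a), b


# Hypothetical toy data generated from known curves, so the fit can be checked
x = [1, 2, 3, 4]
compound_a, compound_b = fit_compound(x, [3 * 2 ** v for v in x])
power_a, power_b = fit_power(x, [5 * v ** 1.5 for v in x])
print(compound_a, compound_b, power_a, power_b)
```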
Output View

Model Summary (the independent variable is age)

Model      R      R Square   Adjusted R Square   Std. Error of the Estimate
Linear     .146   .021       .019                16928.804
Compound   .215   .046       .044                .389
Power      .156   .024       .022                .393

The highest R²: Compound
Also in the Output View…
• The model is significant.
• Weak dependence.
• Age explains 4.6% of the variation in current salary.
ŷ = a · b^x = 40482.362 · 0.993^x
• a: not interpreted
• b: when an employee is 1 year older, the current salary is 0.993 times as high on average, i.e. about 0.7% lower.
• The parameters are significant.
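The multiplicative meaning of b can be checked with the fitted compound model. A minimal sketch using the two coefficients reported on this slide; the function name and the example ages are hypothetical.

```python
A = 40482.362   # a, from the fitted compound model
B = 0.993       # b, from the fitted compound model


def predict_salary_compound(age):
    """Predicted current salary from the compound model y-hat = a * b**x."""
    return A * B ** age


# Being one year older multiplies the predicted salary by b, whatever the age
ratio = predict_salary_compound(41) / predict_salary_compound(40)

# The same effect expressed as a percentage change per year of age
percent_change = (B - 1) * 100

print(round(ratio, 3), round(percent_change, 1))
```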
Thank You for Your Attention!