Regression Review Regression and Pearson’s R SPSS Demo


Regression

Review Regression and Pearson’s R
SPSS Demo


The Regression line

• Properties:
1. The sum of positive and negative vertical distances from it is zero
2. The standard deviation of the points from the line is at a minimum
3. The line passes through the point (mean x, mean y)
• Bivariate Regression Applet
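The three properties can be checked numerically. The following is a minimal sketch in plain Python; the data values and the helper names `fit_line` and `sse` are made up for illustration and do not come from the slides.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope: sum of cross-products of deviations over sum of squared X deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

def sse(a_, b_):
    """Sum of squared vertical distances from the line Y = a_ + b_ X."""
    return sum((y - (a_ + b_ * x)) ** 2 for x, y in zip(xs, ys))

# Property 1: positive and negative vertical distances cancel out.
print(abs(sum(residuals)) < 1e-9)
# Property 2: nudging the intercept or the slope makes the squared error larger.
print(sse(a, b) < sse(a + 0.1, b), sse(a, b) < sse(a, b + 0.1))
# Property 3: the line passes through (mean x, mean y).
print(abs((a + b * mean(xs)) - mean(ys)) < 1e-9)
```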


Regression Line Formula

Y = a + bX
Y = score on the dependent variable
X = the score on the independent variable
a = the Y intercept: the point where the regression line crosses the Y axis
b = the slope of the regression line
– SLOPE – the amount of change produced in Y by a unit change in X; or,
– a measure of the effect of the X variable on the Y


Regression Line Formula

Y = a + bX
y-intercept (a) = 102
slope (b) = .9
Y = 102 + (.9)X

• This information can be used to predict weight from height.
• Example: What is the predicted weight of a male who is 70” tall (5’10”)?
– Y = 102 + (.9)(70) = 102 + 63 = 165 pounds

[Figure: weight (pounds) versus height (inches); height axis 40 to 80, weight axis 100 to 260]
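The prediction for a 70-inch male can be reproduced in a few lines. This is a small sketch using the intercept and slope given on the slide; the function name `predict_weight` is only an illustrative choice.

```python
def predict_weight(height_in, a=102.0, b=0.9):
    """Predicted weight in pounds from height in inches, using Y = a + bX."""
    return a + b * height_in

# The slide's example: a male who is 70 inches tall.
weight_70 = predict_weight(70)
print(round(weight_70, 1))

# The slope is the change in predicted Y for a one-unit change in X.
change_per_inch = predict_weight(71) - predict_weight(70)
print(round(change_per_inch, 1))
```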


The Slope (b) – A Strength & A Weakness

– Strength: b indicates the change in Y for a unit change in X
– Weakness: b is not really a good measure of the strength of a relationship
• It is unbounded (can be >1 or <-1), making it hard to interpret
• The size of b is influenced by the scale on which each variable is measured
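To see the scale problem concretely, the sketch below fits the same made-up data twice, once with height in inches and once in centimeters. The data and the helper name `slope_and_r` are assumptions for illustration; r is computed with the slope-to-correlation conversion r = b(s_x/s_y).

```python
from statistics import mean, stdev

def slope_and_r(xs, ys):
    """Return the least-squares slope b and the correlation r = b * (s_x / s_y)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    r = b * (stdev(xs) / stdev(ys))
    return b, r

# Hypothetical heights and weights, for illustration only.
heights_in = [60, 64, 66, 70, 75]
weights_lb = [115, 140, 150, 170, 200]
heights_cm = [h * 2.54 for h in heights_in]  # same people, different unit

b_in, r_in = slope_and_r(heights_in, weights_lb)
b_cm, r_cm = slope_and_r(heights_cm, weights_lb)

# The slope shrinks by a factor of 2.54 just because the unit changed ...
print(round(b_in / b_cm, 2))
# ... while r is the same in both units.
print(abs(r_in - r_cm) < 1e-9)
```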


Pearson’s r Correlation Coefficient

• By contrast, Pearson’s r is bounded
– a value of 0.0 indicates no linear relationship, and
– a value of +/-1.00 indicates a perfect linear relationship


Pearson’s r

Y = 0.7 + .99x
s_x = 1.51
s_y = 2.24

• Converting the slope to a Pearson’s r correlation coefficient:
– Formula: r = b(s_x/s_y)
r = .99 (1.51/2.24)
r = .67
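The conversion is a one-line calculation. A minimal sketch with the slide's numbers follows; the function name `slope_to_r` is an illustrative choice.

```python
def slope_to_r(b, s_x, s_y):
    """Convert a regression slope to Pearson's r using r = b * (s_x / s_y)."""
    return b * (s_x / s_y)

# Values from the slide: b = .99, s_x = 1.51, s_y = 2.24
r = slope_to_r(0.99, 1.51, 2.24)
print(round(r, 2))
```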


Coefficient of Determination

• Conceptually, the formula for r² is:
r² = Explained variation / Total variation
“The proportion of the total variation in Y that is attributable or explained by X.”
• The remaining proportion of the variation (1 - r²) is called the unexplained variation
– Usually attributed to measurement error, random chance, or some combination of other variables
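The ratio of explained to total variation can be computed directly from a fitted line. The sketch below uses made-up data (the variable names `tv_hours` and `sodas` are only illustrative) and checks that explained plus unexplained variation adds up to the total.

```python
from statistics import mean

# Hypothetical data, for illustration only.
tv_hours = [1, 2, 3, 4, 5]   # X
sodas = [2, 4, 5, 4, 5]      # Y

x_bar, y_bar = mean(tv_hours), mean(sodas)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv_hours, sodas))
sxx = sum((x - x_bar) ** 2 for x in tv_hours)
b = sxy / sxx
a = y_bar - b * x_bar
predicted = [a + b * x for x in tv_hours]

# Total variation: how far the observed Y values are from the mean of Y.
total = sum((y - y_bar) ** 2 for y in sodas)
# Explained variation: how far the predicted values are from the mean of Y.
explained = sum((p - y_bar) ** 2 for p in predicted)
# Unexplained variation: how far the observed values are from the line.
unexplained = sum((y - p) ** 2 for y, p in zip(sodas, predicted))

r_squared = explained / total
print(round(r_squared, 2))
print(abs(explained + unexplained - total) < 1e-9)
```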


Coefficient of Determination

– Interpreting the meaning of the coefficient of determination in the example:

• Squaring Pearson’s r (.67) gives us an r² of .45
• Interpretation:
– The # of hours of daily TV watching (X) explains 45% of the total variation in soda consumed (Y)


Another Example: Relationship between Mobility Rate (x) & Divorce rate (y)

• The formula for this regression line is: Y = -2.5 + (.17)X
– 1) What is this slope telling you?
– 2) Using this formula, if the mobility rate for a given state was 45, what would you predict the divorce rate to be?
– 3) The standard deviation (s) for x = 6.57 & the s for y = 1.29. Use this info to calculate Pearson’s r. How would you interpret this correlation?
– 4) Calculate & interpret the coefficient of determination (r²)

[Figure: Divorce Rate versus Mobility Rate; mobility axis 0 to 60, divorce axis -3 to 8]


Another Example: Relationship between Mobility Rate (x) & Divorce rate (y)

• The formula for this regression line is: Y = -2.5 + (.17)X
– 1) What is this slope telling you?
• For every one unit increase in x (mobility rate), divorce rate (y) goes up .17
– 2) Using this formula, if the mobility rate for a given state was 45, what would you predict the divorce rate to be?
• Y = -2.5 + (.17) 45 = 5.15
– 3) The standard deviation (s) for x = 6.57 & the s for y = 1.29. Use this info to calculate Pearson’s r. How would you interpret this correlation?
• r = .17 (6.57/1.29) = .17(5.093) = .866
• There is a strong positive association between mobility rate & divorce rate.
– 4) Calculate & interpret the coefficient of determination (r²)
• r² = (.866)² = .75
• A state’s mobility rate explains 75% of the variation in its divorce rate.
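The answers to questions 2 through 4 can be double-checked with a short script. This is a sketch using only the numbers given on the slide; the variable names are illustrative.

```python
# Values from the slide.
a, b = -2.5, 0.17        # intercept and slope
s_x, s_y = 6.57, 1.29    # standard deviations of mobility rate and divorce rate

# 2) Predicted divorce rate for a state with a mobility rate of 45.
predicted = a + b * 45

# 3) Pearson's r from the slope and the two standard deviations.
r = b * (s_x / s_y)

# 4) Coefficient of determination.
r_squared = r ** 2

print(round(predicted, 2), round(r, 3), round(r_squared, 2))
```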


PEARSON’S r IN SPSS

– Steps for running Pearson’s r in SPSS:
1. Click Analyze → Correlate → Bivariate
2. Highlight the 2(+) variables you wish to examine
3. Click OK


Pearson’s r Output

• Note that the table reports each correlation coefficient twice
– (3 bivariate relationships, 6 correlation coefficients reported)
• Example interpretation:
– There is a weak to moderate negative relationship (r = -.260) between age at which one’s first child is born (AGEKDBRN) and the number of children one has (CHILDS).

Correlations

                                  AGEKDBRN    CHILDS     CHLDIDEL
AGEKDBRN   Pearson Correlation    1           -.260**    -.119*
           Sig. (2-tailed)        .           .000       .040
           N                      1035        1015       297
CHILDS     Pearson Correlation    -.260**     1          .276**
           Sig. (2-tailed)        .000        .          .000
           N                      1015        1438       442
CHLDIDEL   Pearson Correlation    -.119*      .276**     1
           Sig. (2-tailed)        .040        .000       .
           N                      297         442        447

AGEKDBRN = R'S AGE WHEN 1ST CHILD BORN
CHILDS = NUMBER OF CHILDREN
CHLDIDEL = IDEAL NUMBER OF CHILDREN

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
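Each coefficient appears twice because a correlation matrix is symmetric: the correlation of A with B equals the correlation of B with A. The sketch below builds such a matrix in plain Python. The variable names are borrowed from the output, but the data values are made up and the helper `pearson_r` is an illustrative implementation, not SPSS's procedure.

```python
from itertools import combinations
from statistics import mean

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products of deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical values; only the variable names come from the SPSS output.
data = {
    "AGEKDBRN": [18, 21, 24, 27, 30, 33],
    "CHILDS":   [4, 3, 3, 2, 2, 1],
    "CHLDIDEL": [3, 3, 2, 3, 2, 2],
}
names = list(data)

# Full matrix: every variable correlated with every variable.
matrix = {row: {col: pearson_r(data[row], data[col]) for col in names}
          for row in names}

unique_pairs = list(combinations(names, 2))                        # distinct relationships
off_diagonal = [(r, c) for r in names for c in names if r != c]    # coefficients printed

print(len(unique_pairs), len(off_diagonal))  # prints: 3 6
```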


Measures of Association

Level of Measurement    Measures of       “Bounded”?    PRE interpretation?
(both variables)        Association
NOMINAL                 Phi               NO*           NO
                        Cramer’s V        YES           NO
                        Lambda            YES           YES
ORDINAL                 Gamma             YES           YES
INTERVAL-RATIO          b (slope)         NO            NO
                        Pearson’s r       YES           NO
                        r²                YES           YES

* But, has an upper limit of 1 when dealing with a 2x2 table.


Significance Testing for 2 IR Variables

• When both variables are interval-ratio level, strength and association are tested together
– The slope or “r” will have a “sig value”
• Sig = the probability of getting a slope at least this far from zero, assuming the null is correct
• The null in this case is that there is no relationship between the two variables in the population
– In other words, that the slope (or “r”) in the population is zero
– What are the odds of getting this slope if, in the population, the slope is zero?
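One common way to test whether a correlation differs from zero is the t statistic t = r·sqrt(n - 2)/sqrt(1 - r²). The slides do not say which test sits behind the “sig value”, so treat the sketch below as an assumption. It uses r = -.260 and N = 1015 from the AGEKDBRN/CHILDS output, and the 1.96 cutoff is the large-sample approximation for a two-tailed test at the .05 level.

```python
import math

def t_for_r(r, n):
    """t statistic for testing the null hypothesis that the population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# r and N for AGEKDBRN with CHILDS, as reported in the SPSS output.
t = t_for_r(-0.260, 1015)

# Assumption: with a sample this large, |t| above roughly 1.96
# corresponds to a two-tailed sig value below .05.
significant_at_05 = abs(t) > 1.96

print(round(t, 2), significant_at_05)
```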


SPSS DEMO

• Are individuals with stronger moral values more likely to engage in criminal activity?
– Sample = 484 UMD Students
– Null hypothesis?
• How to test null?
• Both are I/R variables (or close enough)
– Can test the significance of the measure of strength
– E.g., is the slope/correlation significantly different from zero? Or, what are the odds of finding this slope/correlation if, in the population, the slope is zero?


• Is there a relationship here?
• If so, what direction?
• What (ballpark) would the constant, or “y-intercept,” be in the regression equation?


“Model” Regression Output

• The same generic output is given for all regression models
– The “model” stuff is relevant for models with more than one independent variable
• How do all the independent variables together predict the dependent variable?
– For our purposes, the “Model R” will have the same value as Pearson’s r
• HOWEVER, the model R cannot tell you direction (compute separate correlation in SPSS)
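Why the model R loses direction can be seen by computing it as the square root of explained over total variation, which is never negative. The sketch below uses made-up data with a negative relationship; the helper names `pearson_r` and `model_r` are illustrative and this is not SPSS's own routine.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson's r, which carries the sign of the relationship."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def model_r(xs, ys):
    """Model R for one predictor: square root of explained / total variation."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    explained = sum(((a + b * x) - y_bar) ** 2 for x in xs)
    total = sum((y - y_bar) ** 2 for y in ys)
    return (explained / total) ** 0.5

# Hypothetical data with a negative relationship, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [10, 8, 7, 5, 2]

r = pearson_r(xs, ys)
big_r = model_r(xs, ys)

# r is negative, but the model R is the same number without the sign.
print(r < 0, abs(big_r - abs(r)) < 1e-9)
```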