
Transcript
Page 1: Regression and Correlation

Regression and Correlation

Jake Blanchard, Fall 2010

Page 2: Regression and Correlation

Introduction

We can use regression to find relationships between random variables.

This does not necessarily imply causation.

Correlation can be used to measure predictability.

Page 3: Regression and Correlation

Regression with Constant Variance

Linear regression: $E(Y \mid X=x) = \alpha + \beta x$

In general, the variance is a function of x.

If we assume the variance is constant, the analysis is simplified.

Define the total error as the sum of the squares of the errors.

Page 4: Regression and Correlation

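Linear Regression

$$\Delta^2 = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$$

$$\frac{\partial \Delta^2}{\partial \alpha} = \sum_{i=1}^{n} 2(\alpha + \beta x_i - y_i) = 0$$

$$\frac{\partial \Delta^2}{\partial \beta} = \sum_{i=1}^{n} 2(\alpha + \beta x_i - y_i)\,x_i = 0$$

Solve:

$$\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2}$$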
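A minimal Python sketch of the least-squares estimates for a straight-line fit $y = \alpha + \beta x$. The function name `fit_line` and the data are made up for illustration; they are not from the slides.

```python
def fit_line(x, y):
    """Least-squares estimates (alpha, beta) for y = alpha + beta * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross products about the means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("all x values are identical")
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical data lying exactly on y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
alpha, beta = fit_line(x, y)
print(alpha, beta)
```

Working with deviations from the means is algebraically the same as the form with raw sums, and tends to lose less precision when the x values are large.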

Page 5: Regression and Correlation

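Variance in Regression Analysis

Relevant variance is conditional: Var(Y|X=x)

$$s_{Y|X}^2 = \frac{1}{n-2}\sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i)^2 = \frac{1}{n-2}\left[\sum_{i=1}^{n} (y_i - \bar{y})^2 - \hat{\beta}^2 \sum_{i=1}^{n} (x_i - \bar{x})^2\right] = \frac{\Delta^2}{n-2}$$

$$r^2 = 1 - \frac{s_{Y|X}^2}{s_Y^2}$$

where $s_Y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$ is the sample variance of Y.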
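The conditional variance and $r^2$ for a straight-line fit can be sketched in Python as follows. The function name `regression_variance` and the data are hypothetical; the computation uses the form with $n-2$ degrees of freedom for the conditional variance and $n-1$ for the sample variance of Y.

```python
def regression_variance(x, y):
    """Return (s2_cond, s2_y, r2) for a straight-line fit of y on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    # Conditional variance about the regression line (n - 2 dof)
    s2_cond = (syy - beta ** 2 * sxx) / (n - 2)
    # Unconditional sample variance of y (n - 1 dof)
    s2_y = syy / (n - 1)
    r2 = 1.0 - s2_cond / s2_y
    return s2_cond, s2_y, r2


# Hypothetical data, roughly linear
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
s2_cond, s2_y, r2 = regression_variance(x, y)
print(s2_cond, s2_y, r2)
```

A value of `r2` close to 1 means the regression line removes most of the variance in y.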

Page 6: Regression and Correlation

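Confidence Intervals

Regression coefficients are t-distributed with n-2 degrees of freedom (dof).

The statistic below is thus t-distributed with n-2 dof:

$$T = \frac{\bar{Y}_i - \mu_{Y|x_i}}{s_{Y|X}\sqrt{\dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}}$$

And the confidence interval is

$$\langle \mu_{Y|x_i} \rangle_{1-\alpha} = \bar{y}_i \pm t_{1-\alpha/2,\,n-2}\; s_{Y|X}\sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}$$

where $\bar{y}_i = \hat{\alpha} + \hat{\beta} x_i$ is the regression estimate at $x_i$.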

Page 7: Regression and Correlation

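Example

Example 8.1

Data for compressive strength (q) of stiff clay as a function of "blow counts" (N)

$$\bar{N} = \frac{1}{n}\sum N_i = 18.7, \qquad \bar{q} = \frac{1}{n}\sum q_i = 2.123$$

$$s_N^2 = \frac{1}{9}\left(\sum N_i^2 - n\bar{N}^2\right) = 95.12$$

$$s_q^2 = \frac{1}{9}\left(\sum q_i^2 - n\bar{q}^2\right) = 1.22$$

$$\hat{\beta} = \frac{\sum N_i q_i - n\bar{N}\bar{q}}{\sum N_i^2 - n\bar{N}^2} = 0.112$$

$$\hat{\alpha} = \bar{q} - \hat{\beta}\bar{N} = 0.029$$

$$s_{q|N}^2 = \frac{\Delta^2}{n-2} = \frac{0.305}{8} = 0.038$$

95% confidence interval at N = 4:

$$\bar{y}_i = 0.029 + 0.112 \cdot 4 = 0.477, \qquad t_{0.975,\,8} = 2.306$$

$$\langle \mu_{q|N} \rangle_{0.95} = 0.477 \pm 2.306\,\sqrt{0.038}\,\sqrt{\frac{1}{10} + \frac{(4 - 18.7)^2}{4353 - 10 \cdot 18.7^2}} = (0.21,\ 0.744)$$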
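The interval at N = 4 can be checked numerically from the summary values on this slide. The sketch below assumes the garbled numbers read as n = 10, mean blow count 18.7, sum of squared blow counts 4353, intercept 0.029, slope 0.112, conditional variance 0.038, and t = 2.306; the function name `mean_ci` is made up.

```python
import math

# Summary values as read from the slide (assumed reading of garbled text)
n = 10
n_bar = 18.7          # mean blow count
sum_n_sq = 4353.0     # sum of N_i squared
alpha_hat = 0.029     # intercept
beta_hat = 0.112      # slope
s2_cond = 0.038       # conditional variance of q given N
t_crit = 2.306        # t quantile at 0.975 with 8 dof, as given on the slide


def mean_ci(x0):
    """Confidence interval for the mean strength at blow count x0."""
    y_hat = alpha_hat + beta_hat * x0
    sxx = sum_n_sq - n * n_bar ** 2
    half_width = t_crit * math.sqrt(s2_cond) * math.sqrt(
        1.0 / n + (x0 - n_bar) ** 2 / sxx
    )
    return y_hat - half_width, y_hat + half_width


low, high = mean_ci(4)
print(round(low, 3), round(high, 3))
```

The half-width grows as the blow count moves away from the mean of 18.7, so the interval is narrowest near the center of the data.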

Page 8: Regression and Correlation

Plot

Page 9: Regression and Correlation

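Correlation Estimate

$$\hat{\rho} = \frac{1}{n-1}\,\frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{s_X s_Y} = \frac{1}{n-1}\,\frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{s_X s_Y}$$

$$\hat{\rho} = \hat{\beta}\,\frac{s_X}{s_Y}$$

$$\hat{\rho}^2 = 1 - \frac{n-2}{n-1}\,\frac{s_{Y|X}^2}{s_Y^2} \approx r^2$$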
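A short Python sketch of the sample correlation estimate, together with a check that it equals the regression slope times the ratio of sample standard deviations. The function name `correlation` and the data are hypothetical.

```python
import math


def correlation(x, y):
    """Sample correlation coefficient of paired data x, y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    return cov / (s_x * s_y)


# Hypothetical data, roughly linear
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
rho = correlation(x, y)

# Cross-check: rho should equal beta * s_x / s_y
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
rho_from_slope = (sxy / sxx) * math.sqrt(sxx / syy)
print(rho, rho_from_slope)
```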

Page 10: Regression and Correlation

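Regression with Non-Constant Variance

Now relax the assumption of constant variance.

Assume that data in regions with large conditional variance are weighted less.

$$Var(Y \mid X=x) = \sigma^2 g^2(x), \qquad E(Y \mid X=x) = \alpha + \beta x$$

Weights:

$$w_i = \frac{1}{Var(Y \mid X=x_i)} = \frac{1}{\sigma^2 g^2(x_i)}$$

$$\Delta^2 = \sum_{i=1}^{n} w_i (y_i - \alpha - \beta x_i)^2$$

$$\hat{\alpha} = \frac{\sum_{i=1}^{n} w_i y_i - \hat{\beta} \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i y_i - \sum_{i=1}^{n} w_i y_i \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i^2 - \left(\sum_{i=1}^{n} w_i x_i\right)^2}$$

$$w_i' = \sigma^2 w_i = \frac{1}{g^2(x_i)}$$

$$s^2 = \frac{\sum_{i=1}^{n} w_i' (y_i - \hat{\alpha} - \hat{\beta} x_i)^2}{n-2}$$

$$s_{Y|x}^2 = s^2 g^2(x)$$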
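A minimal weighted least-squares sketch in Python, assuming the conditional standard deviation is proportional to a known function g(x) so that the weights are 1/g(x)^2. Because the unknown factor sigma^2 cancels in the estimates of the intercept and slope, the code uses these scaled weights throughout. The function name `fit_weighted_line` and the data are made up.

```python
def fit_weighted_line(x, y, g):
    """Weighted fit of y = alpha + beta * x with weights 1 / g(x)**2.

    Returns (alpha, beta, s2), where s2 estimates sigma**2 so that the
    conditional variance at x is s2 * g(x)**2.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    w = [1.0 / g(xi) ** 2 for xi in x]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    denom = sw * swxx - swx ** 2
    if denom == 0:
        raise ValueError("x values do not determine a line")
    beta = (sw * swxy - swy * swx) / denom
    alpha = (swy - beta * swx) / sw
    s2 = sum(
        wi * (yi - alpha - beta * xi) ** 2 for wi, xi, yi in zip(w, x, y)
    ) / (n - 2)
    return alpha, beta, s2


# Hypothetical data whose scatter grows with x, so g(x) = x is assumed
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.8, 1.2, 2.4, 2.5, 4.1]
alpha, beta, s2 = fit_weighted_line(x, y, lambda v: v)
print(alpha, beta, s2)
```

With g(x) = 1 every weight equals 1 and the result reduces to the ordinary constant-variance fit.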

Page 11: Regression and Correlation

Example (8.2)

Data for maximum settlement (x) of storage tanks and maximum differential settlement (y)

From looking at the data, assume g(x) = x (that is, the standard deviation of y increases linearly with x).

$$Var(Y \mid X=x) = \sigma^2 x^2$$

$$w_i' = \frac{1}{x_i^2}$$

Page 12: Regression and Correlation

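Example (8.2) continued

$$\bar{x} = 1.65, \qquad \bar{y} = 1.11, \qquad s_x = 0.923, \qquad s_y = 0.627$$

$$\hat{y} = 0.045 + 0.65\,x$$

$$s^2 = 0.0589, \qquad s = 0.243$$

$$s_{Y|x} = 0.243\,x$$

$$\hat{\rho} = 0.96$$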
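To show how the fitted result would be used, the sketch below evaluates the mean differential settlement and its conditional standard deviation at a chosen maximum settlement. It assumes the garbled values on this slide read as intercept 0.045, slope 0.65, and s^2 = 0.0589 with g(x) = x; the function name `predict` and the input x = 2.0 are hypothetical.

```python
import math

# Values as read from the slide (assumed reading of garbled text)
ALPHA_HAT = 0.045   # intercept
BETA_HAT = 0.65     # slope
S_SQUARED = 0.0589  # estimate of sigma**2


def predict(x):
    """Mean and conditional standard deviation of y at settlement x."""
    mean = ALPHA_HAT + BETA_HAT * x
    # Standard deviation grows linearly with x because g(x) = x
    std = math.sqrt(S_SQUARED) * x
    return mean, std


mean, std = predict(2.0)  # hypothetical maximum settlement
print(round(mean, 3), round(std, 3))
```

The square root of 0.0589 is about 0.243, which is the coefficient in the conditional standard deviation.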

Page 13: Regression and Correlation

Multiple Regression

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik}$$

"Nonlinear" Regression

$$E(Y \mid x) = \alpha + \beta\,g(x)$$

Use LINEST in Excel
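For readers working outside Excel, the same two fits can be sketched in plain Python by solving the normal equations. This is an illustrative alternative, not the method from the slides; the function names `solve_linear` and `fit_multiple`, the choice g(x) = ln x, and all data are made up.

```python
import math


def solve_linear(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented copy so the inputs are left unchanged
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular normal equations")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def fit_multiple(rows, y):
    """Least-squares [beta_0, ..., beta_k] for y = beta_0 + sum_j beta_j x_j."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear(xtx, xty)


# Multiple regression on hypothetical data generated from y = 1 + 2*x1 - 3*x2
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0, 3.0, -2.0, 0.0, 2.0]
betas = fit_multiple(rows, y)
print(betas)

# "Nonlinear" regression: transform x with g, then fit a straight line.
# Hypothetical data generated from y = 2 + 3 * ln(x)
x = [1.0, 2.0, 4.0, 8.0]
y_log = [2.0 + 3.0 * math.log(v) for v in x]
alpha, beta = fit_multiple([(math.log(v),) for v in x], y_log)
print(alpha, beta)
```

The "nonlinear" case stays linear in the coefficients, so it needs nothing beyond replacing x with g(x) before fitting.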

