Regression and Correlation Jake Blanchard Fall 2010

Page 1: Regression and Correlation

Regression and Correlation

Jake Blanchard

Fall 2010

Page 2: Regression and Correlation

Introduction

  • We can use regression to find relationships between random variables
  • This does not necessarily imply causation
  • Correlation can be used to measure predictability

Page 3: Regression and Correlation

Regression with Constant Variance

  • Linear regression: $E(Y \mid X = x) = \alpha + \beta x$
  • In general, the variance is a function of x
  • If we assume the variance is constant, the analysis is simplified
  • Define the total error as the sum of the squares of the errors (the differences between each observed $y_i$ and the value predicted by the line)

Page 4: Regression and Correlation

Linear Regression

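$$\Delta^2 = \sum_{i=1}^{n} (y_i - y_i')^2 = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$$

$$\frac{\partial \Delta^2}{\partial \alpha} = \sum_{i=1}^{n} 2\,(y_i - \alpha - \beta x_i)(-1) = 0$$

$$\frac{\partial \Delta^2}{\partial \beta} = \sum_{i=1}^{n} 2\,(y_i - \alpha - \beta x_i)(-x_i) = 0$$

Solve:

$$\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$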
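The closed-form solution of the two normal equations can be sketched in a few lines of plain Python. This is only an illustration: the function name `fit_line` and the sample data are made up for the example and do not come from the slides.

```python
def fit_line(xs, ys):
    """Least-squares estimates (alpha_hat, beta_hat) for y = alpha + beta * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x and sum of cross products
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical data, roughly following y = 2x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
alpha_hat, beta_hat = fit_line(xs, ys)
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

Working with deviations from the means, rather than raw sums of squares, tends to lose less precision when the x values are large relative to their spread.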

Page 5: Regression and Correlation

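Variance in Regression Analysis

  • Relevant variance is conditional: Var(Y|X=x)

$$s_{Y|x}^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i)^2$$

$$s_{Y|x}^2 = \frac{1}{n-2} \left[ \sum_{i=1}^{n} (y_i - \bar{y})^2 - \hat{\beta}^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]$$

$$s_{Y|x}^2 = \frac{\Delta^2}{n-2}$$

$$r^2 = 1 - \frac{s_{Y|x}^2}{s_Y^2}$$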
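As a rough sketch of how the conditional variance and $r^2$ could be computed from data, the snippet below uses the second (deviation) form of the variance formula. The function name `conditional_variance` and the data are assumptions made for the example.

```python
def conditional_variance(xs, ys):
    """Return (s2_cond, r2): the estimate of Var(Y|X=x) and r^2 = 1 - s2_cond / s2_y."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations (n - 2 dof)")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta_hat = sxy / sxx
    # Residual sum of squares via syy - beta^2 * sxx, divided by n - 2
    s2_cond = (syy - beta_hat ** 2 * sxx) / (n - 2)
    s2_y = syy / (n - 1)  # unconditional sample variance of y
    r2 = 1.0 - s2_cond / s2_y
    return s2_cond, r2


# Hypothetical data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
s2_cond, r2 = conditional_variance(xs, ys)
print(f"s2_cond = {s2_cond:.4f}, r2 = {r2:.4f}")
```

A value of $r^2$ close to 1 means the regression line removes most of the original variance of Y.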

Page 6: Regression and Correlation

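Confidence Intervals

  • Regression coefficients are t-distributed with n-2 dof
  • The statistic below is thus t-distributed with n-2 dof, where $\bar{Y}_i$ is the regression estimate at $x_i$

$$\frac{\bar{Y}_i - \mu_{Y|x_i}}{s_{Y|x} \sqrt{\dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}}$$

  • And the confidence interval is

$$\langle \mu_{Y|x_i} \rangle_{1-\alpha} = \bar{y}_i \pm t_{1-\alpha/2,\,n-2}\; s_{Y|x} \sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}$$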
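The interval for the mean response could be coded along these lines. The sketch assumes the critical value $t_{1-\alpha/2,\,n-2}$ is read from a t-table and passed in as `t_crit`, since Python's standard library has no Student-t quantile function. The function name and data are hypothetical.

```python
import math


def mean_response_interval(xs, ys, x0, t_crit):
    """Confidence interval for E(Y|X=x0); t_crit is the tabulated t value for n - 2 dof."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    # Conditional standard deviation from the residuals
    resid_ss = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    s_cond = math.sqrt(resid_ss / (n - 2))
    y0 = alpha_hat + beta_hat * x0
    half_width = t_crit * s_cond * math.sqrt(1.0 / n + (x0 - x_bar) ** 2 / sxx)
    return y0 - half_width, y0 + half_width


# Hypothetical data; 3.182 is the tabulated t value for 95% confidence and 3 dof
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
lo_mid, hi_mid = mean_response_interval(xs, ys, 3.0, 3.182)
lo_end, hi_end = mean_response_interval(xs, ys, 5.0, 3.182)
print(f"at x = 3: ({lo_mid:.3f}, {hi_mid:.3f})")
print(f"at x = 5: ({lo_end:.3f}, {hi_end:.3f})")
```

The interval is narrowest at the mean of the x values and widens as `x0` moves away from it, which is what the $(x_i - \bar{x})^2$ term in the formula expresses.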

Page 7: Regression and Correlation

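Example

  • Example 8.1
  • Data for compressive strength (q) of stiff clay as a function of “blow counts” (N)

$$\bar{N} = 18.7, \qquad \bar{q} = 2.123$$

$$s_N^2 = \frac{1}{9} \left( \sum N_i^2 - n \bar{N}^2 \right) = 95.12$$

$$s_q^2 = \frac{1}{9} \left( \sum q_i^2 - n \bar{q}^2 \right) = 1.22$$

$$\hat{\beta} = \frac{\sum N_i q_i - n \bar{N} \bar{q}}{\sum N_i^2 - n \bar{N}^2} = 0.112$$

$$\hat{\alpha} = \bar{q} - \hat{\beta} \bar{N} = 0.029$$

$$s_{q|N}^2 = \frac{\Delta^2}{n-2} = \frac{0.305}{8} = 0.038$$

Confidence interval at N = 4:

$$\langle \mu_{Y|x_i} \rangle_{1-\alpha} = \bar{y}_i \pm t_{1-\alpha/2,\,n-2}\; s_{Y|x} \sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}$$

$$t_{0.975,\,8} = 2.306$$

$$\bar{y}_i = 0.029 + 0.112 \times 4 = 0.477$$

$$\langle \mu_{q|N} \rangle_{0.95} = 0.477 \pm 2.306 \sqrt{0.038} \sqrt{\frac{1}{10} + \frac{(4 - 18.7)^2}{4353 - 10 \times 18.7^2}} = (0.21,\ 0.744)$$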
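The arithmetic of the interval at N = 4 can be checked with a short script. It uses the summary values as read from the example (n = 10, mean blow count 18.7, sum of squared blow counts 4353, intercept 0.029, slope 0.112, conditional variance 0.038, tabulated t value 2.306); the variable names are chosen for this sketch only.

```python
import math

# Summary values as read from Example 8.1
n = 10
n_bar = 18.7          # mean blow count
sum_n_sq = 4353.0     # sum of squared blow counts
alpha_hat = 0.029
beta_hat = 0.112
s2_cond = 0.038       # estimated conditional variance of q given N
t_crit = 2.306        # t value for 0.975 and 8 dof

blow_count = 4
q_hat = alpha_hat + beta_hat * blow_count       # regression estimate at N = 4
sxx = sum_n_sq - n * n_bar ** 2                 # sum of squared deviations of N
half_width = t_crit * math.sqrt(s2_cond) * math.sqrt(
    1.0 / n + (blow_count - n_bar) ** 2 / sxx
)
lower = q_hat - half_width
upper = q_hat + half_width
print(f"95% interval at N = 4: ({lower:.3f}, {upper:.3f})")
```

The result agrees with the interval (0.21, 0.744) quoted in the example to the precision shown there.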

Page 8: Regression and Correlation

Plot

Page 9: Regression and Correlation

Correlation Estimate

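$$\hat{\rho}_{x,y} = \frac{1}{n-1} \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{s_X s_Y}$$

$$\hat{\rho}_{x,y} = \frac{1}{n-1} \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{s_X s_Y}$$

$$\hat{\rho}_{x,y} = \hat{\beta}\, \frac{s_X}{s_Y}$$

$$\hat{\rho}_{x,y}^2 = 1 - \frac{n-2}{n-1} \frac{s_{Y|x}^2}{s_Y^2}$$

For large n:

$$\hat{\rho}_{x,y}^2 \approx 1 - \frac{s_{Y|x}^2}{s_Y^2} = r^2$$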
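A minimal sketch of the correlation estimate follows, together with a numerical check that it equals the slope times the ratio of the standard deviations. The function name `correlation` and the data are assumptions for illustration.

```python
import math


def correlation(xs, ys):
    """Sample correlation coefficient: sum of cross deviations / ((n - 1) * s_x * s_y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_x = math.sqrt(sxx / (n - 1))
    s_y = math.sqrt(syy / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / ((n - 1) * s_x * s_y)


# Hypothetical data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
rho_hat = correlation(xs, ys)

# Cross-check against rho = beta * s_x / s_y
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
beta_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
rho_from_slope = beta_hat * math.sqrt(sxx / syy)
print(f"rho_hat = {rho_hat:.4f}, beta * s_x / s_y = {rho_from_slope:.4f}")
```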

Page 10: Regression and Correlation

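Regression with Non-Constant Variance

  • Now relax the assumption of constant variance
  • Assume that regions with large conditional variance are weighted less

$$\mathrm{Var}(Y \mid X = x) = \sigma^2 g^2(x)$$

$$E(Y \mid X = x) = \alpha + \beta x$$

Weights:

$$w_i' = \frac{1}{\mathrm{Var}(Y \mid X = x_i)} = \frac{1}{\sigma^2 g^2(x_i)}$$

$$\Delta^2 = \sum_{i=1}^{n} w_i' (y_i - \alpha - \beta x_i)^2$$

$$\hat{\alpha} = \frac{\sum_{i=1}^{n} w_i y_i - \hat{\beta} \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i y_i - \sum_{i=1}^{n} w_i y_i \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i^2 - \left( \sum_{i=1}^{n} w_i x_i \right)^2}$$

$$w_i = \sigma^2 w_i' = \frac{1}{g^2(x_i)}$$

$$s^2 = \frac{\sum_{i=1}^{n} w_i (y_i - \hat{\alpha} - \hat{\beta} x_i)^2}{n-2}$$

$$s_{Y|x}^2 = s^2 g^2(x)$$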
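The weighted estimates can be sketched as follows, with weights of 1 over g(x) squared. The function name `fit_weighted_line`, the callable argument `g`, and the data are assumptions made for this example, not part of the slides.

```python
def fit_weighted_line(xs, ys, g):
    """Weighted least squares for y = alpha + beta * x with Var(Y|x) = sigma^2 * g(x)^2.

    Returns (alpha_hat, beta_hat, s2), where s2 estimates sigma^2.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    ws = [1.0 / g(x) ** 2 for x in xs]   # g(x) must be nonzero at every data point
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    denom = sw * swxx - swx ** 2
    if denom == 0:
        raise ValueError("slope is undefined for these data and weights")
    beta_hat = (sw * swxy - swy * swx) / denom
    alpha_hat = (swy - beta_hat * swx) / sw
    s2 = sum(
        w * (y - alpha_hat - beta_hat * x) ** 2 for w, x, y in zip(ws, xs, ys)
    ) / (n - 2)
    return alpha_hat, beta_hat, s2


# Hypothetical data; standard deviation assumed proportional to x, so g(x) = x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
alpha_w, beta_w, s2_w = fit_weighted_line(xs, ys, lambda x: x)
print(f"weighted: alpha = {alpha_w:.3f}, beta = {beta_w:.3f}, s2 = {s2_w:.4f}")

# With g(x) = 1 the weights are all equal and the ordinary fit is recovered
alpha_u, beta_u, s2_u = fit_weighted_line(xs, ys, lambda x: 1.0)
print(f"unweighted: alpha = {alpha_u:.3f}, beta = {beta_u:.3f}")
```

The estimated conditional variance at a given x is then `s2 * g(x) ** 2`.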

Page 11: Regression and Correlation

Example (8.2)

  • Data for maximum settlement (x) of storage tanks and maximum differential settlement (y)
  • From looking at the data, assume g(x)=x (that is, the standard deviation of y increases linearly with x)

$$\mathrm{Var}(Y \mid X = x) = \sigma^2 x^2$$

$$w_i = \frac{1}{x_i^2}$$

Page 12: Regression and Correlation

Example (8.2) continued

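$$\bar{x} = 1.65, \qquad \bar{y} = 1.11, \qquad s_x = 0.923, \qquad s_y = 0.627$$

$$y = 0.045 + 0.65x$$

$$s^2 = 0.0589, \qquad s_{Y|x} = 0.243x$$

$$\hat{\rho} = 0.96$$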

Page 13: Regression and Correlation

Multiple Regression

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik}$$
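For readers working outside a spreadsheet, a multiple regression fit can be sketched by forming and solving the normal equations. This is a small, unoptimized illustration in plain Python: the function name `fit_multiple`, the elimination routine, and the data are all assumptions for the example, and a numerical library would normally be preferred for real work.

```python
def fit_multiple(rows, ys):
    """Least-squares coefficients [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk.

    rows is a list of predictor lists, one [x_i1, ..., x_ik] per observation.
    """
    design = [[1.0] + list(r) for r in rows]   # prepend the intercept column
    p = len(design[0])
    # Normal equations: (X^T X) b = X^T y, stored as an augmented matrix
    aug = []
    for i in range(p):
        row = [sum(d[i] * d[j] for d in design) for j in range(p)]
        row.append(sum(d[i] * y for d, y in zip(design, ys)))
        aug.append(row)
    # Gaussian elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear or data are insufficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, p + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution
    coeffs = [0.0] * p
    for i in range(p - 1, -1, -1):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, p))
        coeffs[i] = (aug[i][p] - tail) / aug[i][i]
    return coeffs


# Hypothetical data generated exactly from y = 1 + 2*x1 - 0.5*x2
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
ys = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in rows]
coeffs = fit_multiple(rows, ys)
print([round(c, 6) for c in coeffs])
```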

“Nonlinear” Regression

$$E(Y \mid x) = \alpha + \beta\, g(x)$$

Use LINEST in Excel
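A model of the form "constant plus slope times g(x)" is still linear in its coefficients, so it can be fitted by transforming x and applying ordinary linear regression. The sketch below assumes g is the natural logarithm purely as an example; the function name `fit_transformed` and the data are hypothetical.

```python
import math


def fit_transformed(xs, ys, g):
    """Fit y = alpha + beta * g(x) by ordinary least squares on z = g(x)."""
    zs = [g(x) for x in xs]
    n = len(zs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    z_bar = sum(zs) / n
    y_bar = sum(ys) / n
    szz = sum((z - z_bar) ** 2 for z in zs)
    szy = sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys))
    if szz == 0:
        raise ValueError("g(x) is constant over the data; slope is undefined")
    beta_hat = szy / szz
    alpha_hat = y_bar - beta_hat * z_bar
    return alpha_hat, beta_hat


# Hypothetical data generated exactly from y = 2 + 3 * ln(x)
xs = [1.0, 2.0, 4.0, 8.0]
ys = [2.0 + 3.0 * math.log(x) for x in xs]
alpha_hat, beta_hat = fit_transformed(xs, ys, math.log)
print(f"alpha_hat = {alpha_hat:.4f}, beta_hat = {beta_hat:.4f}")
```

The same idea applies in a spreadsheet: compute a column of g(x) values and pass it to LINEST in place of x.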