Bivariate Normal Distribution and Regression: Application to Galton’s Heights of Adult Children and Parents. Sources: Galton, Francis (1889). Natural Inheritance, MacMillan, London. Galton, F.; J.D. Hamilton Dickson (1886). “Family Likeness in Stature”, Proceedings of the Royal Society of London, Vol. 40, pp. 42-73.

Page 1: Bivariate Normal Distribution and Regression

Bivariate Normal Distribution and Regression

Application to Galton’s Heights of Adult Children and Parents

Sources:
Galton, Francis (1889). Natural Inheritance, MacMillan, London.
Galton, F.; J.D. Hamilton Dickson (1886). “Family Likeness in Stature”, Proceedings of the Royal Society of London, Vol. 40, pp. 42-73.

Page 2: Bivariate Normal Distribution and Regression

Data – Heights of Adult Children and Parents

• Adult children’s heights are reported in one-inch groups, with each group represented by its median value (Galton reports 62.2”, …, 73.2”).
  – He adjusts female heights by multiplying them by 1.08
  – We use 61.2” for his “Below” group
  – We use 74.2” for his “Above” group

• Mid-parent heights are the average of the two parents’ heights (after the female adjustment), also grouped and represented by the group median (Galton reports 64.5”, …, 72.5”).
  – We use 63.5” for “Below”
  – We use 73.5” for “Above”
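As a small illustration of the adjustment described above, the sketch below computes a mid-parent height from a father's and a mother's height. The function names and the example family are hypothetical; only the 1.08 multiplier and the averaging rule come from the slide.

```python
FEMALE_SCALE = 1.08  # multiplier applied to female heights, as stated on the slide


def adjusted_height(height_in, is_female):
    """Return the height in inches, scaled by 1.08 when the person is female."""
    return height_in * FEMALE_SCALE if is_female else height_in


def mid_parent_height(father_in, mother_in):
    """Average of the father's height and the adjusted mother's height."""
    return (father_in + adjusted_height(mother_in, is_female=True)) / 2.0


# Hypothetical family, not taken from Galton's table.
example = mid_parent_height(father_in=70.0, mother_in=64.0)
print(round(example, 2))
```

The same `adjusted_height` helper would apply to adult daughters before their heights are grouped.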

Page 3: Bivariate Normal Distribution and Regression

Adult Child vs Mid-Parent Height

[Figure: plot of adult child height (vertical axis, 60 to 75 inches) against mid-parent height (horizontal axis, 63 to 73 inches).]

Page 4: Bivariate Normal Distribution and Regression

Mid-Parent Height

[Figure: frequency chart of mid-parent heights. Horizontal axis: Height, 63.5 to 72.5 inches; vertical axis: Frequency, 0 to 250.]

Page 5: Bivariate Normal Distribution and Regression

Adult Child Heights

[Figure: frequency chart of adult child heights. Horizontal axis: Height, 61.2 to 74.2 inches; vertical axis: Frequency, 0 to 180.]

Page 6: Bivariate Normal Distribution and Regression

Joint Density Function

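$$
f(y_1,y_2)=\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(y_1-\mu_1)^2}{\sigma_1^2}+\frac{(y_2-\mu_2)^2}{\sigma_2^2}-\frac{2\rho(y_1-\mu_1)(y_2-\mu_2)}{\sigma_1\sigma_2}\right]\right\},\qquad -\infty<y_1,y_2<\infty
$$

where:

$$
E(Y_1)=\mu_1,\quad V(Y_1)=\sigma_1^2,\quad E(Y_2)=\mu_2,\quad V(Y_2)=\sigma_2^2,\quad \rho=\frac{E\left[(Y_1-\mu_1)(Y_2-\mu_2)\right]}{\sigma_1\sigma_2}
$$

[Figure: Bivariate Normal Density surface. Axis x1 from -3 to 3; density shaded in bands 0-0.05, 0.05-0.1, 0.1-0.15, 0.15-0.2.]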
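A minimal sketch of evaluating the joint density in Python, using only the standard library. The function name `bvn_pdf` and its argument names are illustrative and not from the slides; it assumes the usual parameterization by two means, two variances, and a correlation rho with |rho| < 1.

```python
import math


def bvn_pdf(y1, y2, mu1, mu2, var1, var2, rho):
    """Bivariate normal density with means mu1, mu2, variances var1, var2, correlation rho."""
    s1, s2 = math.sqrt(var1), math.sqrt(var2)
    z1 = (y1 - mu1) / s1  # standardized y1
    z2 = (y2 - mu2) / s2  # standardized y2
    q = z1 * z1 + z2 * z2 - 2.0 * rho * z1 * z2  # quadratic form in the exponent
    norm = 2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho)
    return math.exp(-q / (2.0 * (1.0 - rho * rho))) / norm


# Standard, uncorrelated case: the peak at the origin equals 1 / (2 * pi).
peak = bvn_pdf(0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0)
print(peak, 1.0 / (2.0 * math.pi))
```

Because the quadratic form treats the two variables symmetrically, swapping the roles of (y1, mu1, var1) and (y2, mu2, var2) leaves the density unchanged.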

Page 7: Bivariate Normal Distribution and Regression

Marginal Distribution of Y1 (P. 1)

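$$
\begin{aligned}
f_1(y_1) &= \int_{-\infty}^{\infty} f(y_1,y_2)\,dy_2 \\
&= \int_{-\infty}^{\infty}\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(y_1-\mu_1)^2}{\sigma_1^2}+\frac{(y_2-\mu_2)^2}{\sigma_2^2}-\frac{2\rho(y_1-\mu_1)(y_2-\mu_2)}{\sigma_1\sigma_2}\right]\right\}dy_2
\end{aligned}
$$

Bringing the constant out (temporarily) and forming a common denominator in the exponent:

$$
f_1(y_1)=\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\int_{-\infty}^{\infty}\exp\left\{-\frac{\sigma_2^2(y_1-\mu_1)^2+\sigma_1^2(y_2-\mu_2)^2-2\rho\sigma_1\sigma_2(y_1-\mu_1)(y_2-\mu_2)}{2\sigma_1^2\sigma_2^2(1-\rho^2)}\right\}dy_2
$$

Completing the square in the exponent by adding and subtracting $\rho^2\sigma_2^2(y_1-\mu_1)^2$ in the square brackets:

$$
\begin{aligned}
f_1(y_1) &= \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\int_{-\infty}^{\infty}\exp\left\{-\frac{1}{2\sigma_1^2\sigma_2^2(1-\rho^2)}\left[\sigma_2^2(y_1-\mu_1)^2+\sigma_1^2(y_2-\mu_2)^2-2\rho\sigma_1\sigma_2(y_1-\mu_1)(y_2-\mu_2)+\rho^2\sigma_2^2(y_1-\mu_1)^2-\rho^2\sigma_2^2(y_1-\mu_1)^2\right]\right\}dy_2 \\
&= \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\int_{-\infty}^{\infty}\exp\left\{-\frac{1}{2\sigma_1^2\sigma_2^2(1-\rho^2)}\left[\sigma_1^2(y_2-\mu_2)^2-2\rho\sigma_1\sigma_2(y_1-\mu_1)(y_2-\mu_2)+\rho^2\sigma_2^2(y_1-\mu_1)^2+\sigma_2^2(1-\rho^2)(y_1-\mu_1)^2\right]\right\}dy_2
\end{aligned}
$$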

Page 8: Bivariate Normal Distribution and Regression

Marginal Distribution of Y1 (P. 2)

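Pulling out the term not involving $y_2$ and cleaning up the exponents:

$$
\begin{aligned}
f_1(y_1) &= \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{\sigma_2^2(1-\rho^2)(y_1-\mu_1)^2}{2\sigma_1^2\sigma_2^2(1-\rho^2)}\right\}\int_{-\infty}^{\infty}\exp\left\{-\frac{\left[\sigma_1(y_2-\mu_2)-\rho\sigma_2(y_1-\mu_1)\right]^2}{2\sigma_1^2\sigma_2^2(1-\rho^2)}\right\}dy_2 \\
&= \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{(y_1-\mu_1)^2}{2\sigma_1^2}\right\}\int_{-\infty}^{\infty}\exp\left\{-\frac{\left[(y_2-\mu_2)-\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1)\right]^2}{2\sigma_2^2(1-\rho^2)}\right\}dy_2 \\
&= \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{(y_1-\mu_1)^2}{2\sigma_1^2}\right\}\int_{-\infty}^{\infty}\exp\left\{-\frac{\left[y_2-\left(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1)\right)\right]^2}{2\sigma_2^2(1-\rho^2)}\right\}dy_2
\end{aligned}
$$

The integrand is proportional to a normal density with:

$$
E(Y_2)=\mu_2+\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1),\qquad V(Y_2)=\sigma_2^2(1-\rho^2)
$$

Taking the normalizing constant from the constant in front gives us:

$$
\begin{aligned}
f_1(y_1) &= \frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\left\{-\frac{(y_1-\mu_1)^2}{2\sigma_1^2}\right\}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi\sigma_2^2(1-\rho^2)}}\exp\left\{-\frac{\left[y_2-\left(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1)\right)\right]^2}{2\sigma_2^2(1-\rho^2)}\right\}dy_2 \\
&= \frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\left\{-\frac{(y_1-\mu_1)^2}{2\sigma_1^2}\right\}
\qquad\Rightarrow\qquad Y_1\sim N(\mu_1,\sigma_1^2)
\end{aligned}
$$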
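The marginal result can be checked numerically: integrating the joint density over y2 at a fixed y1 should reproduce the N(mu1, var1) density at that y1. The sketch below does this with a simple midpoint rule. The helper names, the parameter values, and the integration settings are all illustrative assumptions rather than anything from the slides.

```python
import math


def bvn_pdf(y1, y2, mu1, mu2, var1, var2, rho):
    """Joint bivariate normal density at (y1, y2)."""
    z1 = (y1 - mu1) / math.sqrt(var1)
    z2 = (y2 - mu2) / math.sqrt(var2)
    q = z1 * z1 + z2 * z2 - 2.0 * rho * z1 * z2
    norm = 2.0 * math.pi * math.sqrt(var1 * var2 * (1.0 - rho * rho))
    return math.exp(-q / (2.0 * (1.0 - rho * rho))) / norm


def normal_pdf(y, mean, var):
    """Univariate normal density with the given mean and variance."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def marginal_y1_numeric(y1, mu1, mu2, var1, var2, rho, half_width=12.0, steps=6000):
    """Midpoint-rule approximation of the integral of the joint density over y2."""
    s2 = math.sqrt(var2)
    lower = mu2 - half_width * s2          # integrate over mu2 +/- half_width sd
    h = 2.0 * half_width * s2 / steps      # width of each sub-interval
    return h * sum(
        bvn_pdf(y1, lower + (k + 0.5) * h, mu1, mu2, var1, var2, rho)
        for k in range(steps)
    )


# Illustrative parameters, not Galton's.
params = dict(mu1=0.0, mu2=1.0, var1=4.0, var2=9.0, rho=0.6)
numeric = marginal_y1_numeric(1.5, **params)
closed_form = normal_pdf(1.5, params["mu1"], params["var1"])
print(numeric, closed_form)
```

The two printed numbers should agree to many decimal places, and the agreement should not depend on rho, since the correlation drops out of the marginal.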

Page 9: Bivariate Normal Distribution and Regression

Conditional Distribution of Y2 Given Y1=y1 (P. 1)

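$$
\begin{aligned}
f(y_2|y_1) &= \frac{f(y_1,y_2)}{f_1(y_1)} \\
&= \frac{\dfrac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\dfrac{1}{2(1-\rho^2)}\left[\dfrac{(y_1-\mu_1)^2}{\sigma_1^2}+\dfrac{(y_2-\mu_2)^2}{\sigma_2^2}-\dfrac{2\rho(y_1-\mu_1)(y_2-\mu_2)}{\sigma_1\sigma_2}\right]\right\}}{\dfrac{1}{\sqrt{2\pi\sigma_1^2}}\exp\left\{-\dfrac{(y_1-\mu_1)^2}{2\sigma_1^2}\right\}} \\
&= \frac{1}{\sqrt{2\pi\sigma_2^2(1-\rho^2)}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(y_1-\mu_1)^2}{\sigma_1^2}+\frac{(y_2-\mu_2)^2}{\sigma_2^2}-\frac{2\rho(y_1-\mu_1)(y_2-\mu_2)}{\sigma_1\sigma_2}\right]+\frac{(y_1-\mu_1)^2}{2\sigma_1^2}\right\}
\end{aligned}
$$

Putting the terms involving $(y_1-\mu_1)^2$ together by multiplying and dividing the last term by $1-\rho^2$:

$$
\begin{aligned}
f(y_2|y_1) &= \frac{1}{\sqrt{2\pi\sigma_2^2(1-\rho^2)}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(y_1-\mu_1)^2}{\sigma_1^2}-\frac{(1-\rho^2)(y_1-\mu_1)^2}{\sigma_1^2}+\frac{(y_2-\mu_2)^2}{\sigma_2^2}-\frac{2\rho(y_1-\mu_1)(y_2-\mu_2)}{\sigma_1\sigma_2}\right]\right\} \\
&= \frac{1}{\sqrt{2\pi\sigma_2^2(1-\rho^2)}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(y_2-\mu_2)^2}{\sigma_2^2}-\frac{2\rho(y_1-\mu_1)(y_2-\mu_2)}{\sigma_1\sigma_2}+\frac{\rho^2(y_1-\mu_1)^2}{\sigma_1^2}\right]\right\}
\end{aligned}
$$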

Page 10: Bivariate Normal Distribution and Regression

Conditional Distribution of Y2 Given Y1=y1 (P. 2)

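Pulling out the $\sigma_2^2$ in the denominator of the exponent, then forming the "perfect square", then a function of $y_2$:

$$
\begin{aligned}
f(y_2|y_1) &= \frac{1}{\sqrt{2\pi\sigma_2^2(1-\rho^2)}}\exp\left\{-\frac{1}{2\sigma_2^2(1-\rho^2)}\left[(y_2-\mu_2)^2-2\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1)(y_2-\mu_2)+\rho^2\frac{\sigma_2^2}{\sigma_1^2}(y_1-\mu_1)^2\right]\right\} \\
&= \frac{1}{\sqrt{2\pi\sigma_2^2(1-\rho^2)}}\exp\left\{-\frac{1}{2\sigma_2^2(1-\rho^2)}\left[(y_2-\mu_2)-\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1)\right]^2\right\} \\
&= \frac{1}{\sqrt{2\pi\sigma_2^2(1-\rho^2)}}\exp\left\{-\frac{1}{2\sigma_2^2(1-\rho^2)}\left[y_2-\left(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1)\right)\right]^2\right\}
\end{aligned}
$$

$$
\Rightarrow\quad Y_2|Y_1=y_1\sim N\left(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1),\ \sigma_2^2(1-\rho^2)\right)
$$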

This is referred to as the REGRESSION of Y2 on Y1
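As a sanity check on the conditional result, the ratio of the joint density to the marginal density of Y1 should equal a normal density in y2 with mean mu2 + rho*(sd2/sd1)*(y1 - mu1) and variance var2*(1 - rho^2). The sketch below compares the two at a single point. The function names and parameter values are illustrative assumptions.

```python
import math


def bvn_pdf(y1, y2, mu1, mu2, var1, var2, rho):
    """Joint bivariate normal density at (y1, y2)."""
    z1 = (y1 - mu1) / math.sqrt(var1)
    z2 = (y2 - mu2) / math.sqrt(var2)
    q = z1 * z1 + z2 * z2 - 2.0 * rho * z1 * z2
    norm = 2.0 * math.pi * math.sqrt(var1 * var2 * (1.0 - rho * rho))
    return math.exp(-q / (2.0 * (1.0 - rho * rho))) / norm


def normal_pdf(y, mean, var):
    """Univariate normal density with the given mean and variance."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def conditional_params(y1, mu1, mu2, var1, var2, rho):
    """Mean and variance of Y2 given Y1 = y1 (the regression of Y2 on Y1)."""
    mean = mu2 + rho * math.sqrt(var2 / var1) * (y1 - mu1)
    var = var2 * (1.0 - rho * rho)
    return mean, var


# Illustrative parameters and evaluation point, not Galton's.
params = dict(mu1=0.0, mu2=1.0, var1=4.0, var2=9.0, rho=0.6)
y1, y2 = 1.5, 3.0
ratio = bvn_pdf(y1, y2, **params) / normal_pdf(y1, params["mu1"], params["var1"])
cond_mean, cond_var = conditional_params(y1, **params)
direct = normal_pdf(y2, cond_mean, cond_var)
print(ratio, direct)
```

Note that the conditional variance does not depend on y1; only the conditional mean moves as y1 changes.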

Page 11: Bivariate Normal Distribution and Regression

Summary of Results

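Joint Distribution:

$$
f(y_1,y_2)=\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(y_1-\mu_1)^2}{\sigma_1^2}+\frac{(y_2-\mu_2)^2}{\sigma_2^2}-\frac{2\rho(y_1-\mu_1)(y_2-\mu_2)}{\sigma_1\sigma_2}\right]\right\},\qquad -\infty<y_1,y_2<\infty
$$

Marginal (aka Unconditional) Distributions:

$$
f_1(y_1)=\frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\left\{-\frac{(y_1-\mu_1)^2}{2\sigma_1^2}\right\}
\qquad
f_2(y_2)=\frac{1}{\sqrt{2\pi\sigma_2^2}}\exp\left\{-\frac{(y_2-\mu_2)^2}{2\sigma_2^2}\right\}
$$

$$
Y_1\sim N(\mu_1,\sigma_1^2)\qquad Y_2\sim N(\mu_2,\sigma_2^2)
$$

Conditional Distributions:

$$
f(y_2|y_1)=\frac{1}{\sqrt{2\pi\sigma_2^2(1-\rho^2)}}\exp\left\{-\frac{1}{2\sigma_2^2(1-\rho^2)}\left[y_2-\left(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1)\right)\right]^2\right\}
$$

$$
f(y_1|y_2)=\frac{1}{\sqrt{2\pi\sigma_1^2(1-\rho^2)}}\exp\left\{-\frac{1}{2\sigma_1^2(1-\rho^2)}\left[y_1-\left(\mu_1+\rho\frac{\sigma_1}{\sigma_2}(y_2-\mu_2)\right)\right]^2\right\}
$$

$$
Y_2|Y_1=y_1\sim N\left(\mu_2+\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1),\ \sigma_2^2(1-\rho^2)\right)
\qquad
Y_1|Y_2=y_2\sim N\left(\mu_1+\rho\frac{\sigma_1}{\sigma_2}(y_2-\mu_2),\ \sigma_1^2(1-\rho^2)\right)
$$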
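One practical use of the marginal and conditional results together is simulation: draw Y1 from its marginal normal, then draw Y2 from the conditional normal given that Y1. The sketch below does this with the standard library and checks that the sample correlation lands near the rho that was requested. The function names, seed, sample size, and parameter values are all illustrative assumptions.

```python
import math
import random


def sample_bvn(n, mu1, mu2, var1, var2, rho, seed=0):
    """Draw n (y1, y2) pairs: y1 from its marginal, then y2 from its conditional."""
    rng = random.Random(seed)
    s1 = math.sqrt(var1)
    slope = rho * math.sqrt(var2 / var1)            # rho * sd2 / sd1
    cond_sd = math.sqrt(var2 * (1.0 - rho * rho))   # sd of Y2 given Y1
    pairs = []
    for _ in range(n):
        y1 = rng.gauss(mu1, s1)
        y2 = rng.gauss(mu2 + slope * (y1 - mu1), cond_sd)
        pairs.append((y1, y2))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation of a list of (a, b) pairs."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    s11 = sum((a - m1) ** 2 for a, _ in pairs)
    s22 = sum((b - m2) ** 2 for _, b in pairs)
    s12 = sum((a - m1) * (b - m2) for a, b in pairs)
    return s12 / math.sqrt(s11 * s22)


# Illustrative parameters, not Galton's.
pairs = sample_bvn(20000, mu1=0.0, mu2=1.0, var1=4.0, var2=9.0, rho=0.6)
print(round(sample_correlation(pairs), 2))
```

The order of the two draws could be reversed (Y2 first, then Y1 given Y2) using the other conditional distribution, with the same joint result.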

Page 12: Bivariate Normal Distribution and Regression

Heights of Adult Children and Parents

• Empirical Data Based on 924 pairs (F. Galton)

• Y2 = Adult Child’s Height

– Y2 ~ N(68.1, 6.39), σ2 = 2.53

• Y1 = Mid-Parent’s Height

– Y1 ~ N(68.3, 3.18), σ1 = 1.78

• COV(Y1,Y2) = 2.02, ρ² = 0.20

• Y2|Y1=y1 is Normal with conditional mean and variance:

$$
E(Y_2|Y_1=y_1)=\mu_2+\rho\frac{\sigma_2}{\sigma_1}(y_1-\mu_1)=68.1+(0.45)\sqrt{\frac{6.39}{3.18}}\,(y_1-68.3)=68.1+0.638y_1-43.6=24.5+0.638y_1
$$

$$
V(Y_2|Y_1=y_1)=\sigma_2^2(1-\rho^2)=6.39(1-0.20)=5.11\qquad \sigma_{Y_2|y_1}=\sqrt{5.11}=2.26
$$

y1            Unconditional   63.5   66.5   69.5   72.5
E[Y2|y1]      68.1            65.0   66.9   68.8   70.8
σ(Y2|y1)      2.53            2.26   2.26   2.26   2.26
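The arithmetic on this slide can be reproduced in a few lines. The sketch below takes the rounded values shown on the slide (rho = 0.45 in the mean formula, rho squared = 0.20 in the variance formula) as inputs, rounds the slope and intercept the same way the slide does, and rebuilds the table of conditional means. Variable names are illustrative. Working from the unrounded covariance instead would give slightly different coefficients.

```python
import math

# Values as reported on the slide.
mu1, var1 = 68.3, 3.18   # mid-parent mean and variance
mu2, var2 = 68.1, 6.39   # adult-child mean and variance
rho = 0.45               # rounded correlation used in the slide's mean formula
rho_sq = 0.20            # rounded squared correlation used in the variance formula

# Regression of Y2 on Y1, rounded to the slide's precision.
slope = round(rho * math.sqrt(var2 / var1), 3)
intercept = round(mu2 - slope * mu1, 1)

# Conditional variance and standard deviation (the same for every y1).
cond_var = var2 * (1.0 - rho_sq)
cond_sd = math.sqrt(cond_var)

print("slope", slope, "intercept", intercept)
print("conditional variance", round(cond_var, 2), "sd", round(cond_sd, 2))
for y1 in (63.5, 66.5, 69.5, 72.5):
    print(y1, round(intercept + slope * y1, 1))
```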

Page 13: Bivariate Normal Distribution and Regression

Joint Density Function

[Figure: joint density surface for the height data. Axis y1 from 62.96 to 72.928; density from 0 to 0.04, shaded in bands of width 0.005.]

Page 14: Bivariate Normal Distribution and Regression

Joint Density Function

[Figure: joint density surface for the height data. Axis y1 from 62.96 to 72.928; density from 0 to 0.04, shaded in bands of width 0.005.]

Page 15: Bivariate Normal Distribution and Regression

Distributions of Heights of Adult Children

[Figure: densities f(y2) of adult child height. Horizontal axis: y2, 59.5 to 76.5 inches; vertical axis: f(y2), 0 to 0.2. Curves: unconditional, and conditional on y1 = 63.5, 66.5, 69.5, and 72.5.]

Page 16: Bivariate Normal Distribution and Regression

Regression to the Mean

[Figure: E(Y2) plotted against y1, both axes from 63.5 to 72.5 inches, with three lines:]

• E(Child) = Parent + constant: E(Y2|y1) = -0.21 + y1

• Galton’s Finding: E(Y2|y1) = 24.5 + 0.638y1

• E(Child) independent of parent: E(Y2|y1) = E(Y2)
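To make the comparison concrete, the sketch below evaluates the three candidate rules for the expected child height at a short, an average, and a tall mid-parent height. The function names are illustrative. The constant in the "parent plus constant" rule is taken here as the difference of the two reported means (68.1 - 68.3), which is an assumption made for the sketch.

```python
MU1, MU2 = 68.3, 68.1            # mean mid-parent and adult-child heights from the slides
INTERCEPT, SLOPE = 24.5, 0.638   # rounded regression line from the slides


def regression_line(y1):
    """Galton's finding: the fitted conditional mean of child height."""
    return INTERCEPT + SLOPE * y1


def parent_plus_constant(y1):
    """Child expected to match the parent, shifted by the difference in means (assumed)."""
    return y1 + (MU2 - MU1)


def independent_of_parent(y1):
    """Child expected at the overall child mean regardless of the parent."""
    return MU2


for y1 in (63.5, 68.3, 72.5):
    print(
        y1,
        round(regression_line(y1), 1),
        round(parent_plus_constant(y1), 1),
        independent_of_parent(y1),
    )
```

For tall and short mid-parents alike, the regression prediction falls between the other two rules: children are expected to deviate from the mean in the same direction as their parents, but by less.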

Page 17: Bivariate Normal Distribution and Regression

Expectations and Variances

• E(Y1) = 68.3 V(Y1) = 3.18

• E(Y2) = 68.1 V(Y2) = 6.39

• E(Y2|Y1=y1) = 24.5+0.638y1

• E_{Y1}[E(Y2|Y1=y1)] = E_{Y1}[24.5+0.638Y1] = 24.5+0.638(68.3) = 68.1 = E(Y2)

• V(Y2|Y1=y1) = 5.11, so E_{Y1}[V(Y2|Y1=y1)] = 5.11

• V_{Y1}[E(Y2|Y1=y1)] = V_{Y1}[24.5+0.638Y1] = (0.638)^2 V(Y1) = (0.407)(3.18) = 1.29

• E_{Y1}[V(Y2|Y1=y1)] + V_{Y1}[E(Y2|Y1=y1)] = 5.11+1.29 = 6.40 = V(Y2) (with round-off)
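The two decompositions above (iterated expectation, and total variance as expected conditional variance plus variance of the conditional mean) can be checked with the slide's rounded numbers. This is only a restatement of the arithmetic in code; the variable names are illustrative.

```python
# Rounded values as reported on the slides.
mu1, var1 = 68.3, 3.18           # E(Y1), V(Y1)
mu2, var2 = 68.1, 6.39           # E(Y2), V(Y2)
intercept, slope = 24.5, 0.638   # E(Y2|Y1=y1) = intercept + slope * y1
cond_var = 5.11                  # V(Y2|Y1=y1), constant in y1

# Iterated expectation: E[E(Y2|Y1)] = intercept + slope * E(Y1).
mean_of_cond_mean = intercept + slope * mu1

# Variance of the conditional mean: V[intercept + slope * Y1] = slope^2 * V(Y1).
var_of_cond_mean = slope ** 2 * var1

# Total variance: E[V(Y2|Y1)] + V[E(Y2|Y1)].
total_var = cond_var + var_of_cond_mean

print(round(mean_of_cond_mean, 1), "vs E(Y2) =", mu2)
print(round(var_of_cond_mean, 2), round(total_var, 2), "vs V(Y2) =", var2)
```

The small gap between the reconstructed total variance and 6.39 comes from the rounding of the inputs.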