
Probability
STAT 416
Spring 2007

4 Jointly distributed random variables

1. Introduction

2. Independent Random Variables

3. Transformations

4. Covariance, Correlation

5. Conditional Distributions

6. Bivariate Normal Distribution


4.1 Introduction

Computation of probabilities with more than one random variable

two random variables ... bivariate

two or more random variables ... multivariate

Concepts:

• Joint distribution function

• purely discrete: Joint probability function

• purely continuous: Joint density


Joint distribution function

Bivariate case: Random variables X and Y

Define joint distribution function as

F (x, y) := P (X ≤ x, Y ≤ y), −∞ < x, y < ∞

Bivariate distribution thus fully specified

P (x1<X≤x2, y1<Y≤y2) = F (x2, y2)−F (x1, y2)−F (x2, y1)+F (x1, y1)

for x1 < x2 and y1 < y2

Marginal distribution: FX(x) := P (X ≤ x) = F (x,∞)

Idea: P(X ≤ x) = P(X ≤ x, Y < ∞) = lim_{y→∞} F(x, y)

Analogously, FY(y) := P(Y ≤ y) = F(∞, y)


Bivariate continuous random variable

X and Y jointly continuous if there exists joint density function:

f(x, y) = ∂²F(x, y) / (∂x ∂y)

joint distribution function then obtained by integration

F(a, b) = ∫_{y=−∞}^{b} ∫_{x=−∞}^{a} f(x, y) dx dy

Density of the marginal distribution of X obtained by integration over ΩY:

fX(x) = ∫_{y=−∞}^{∞} f(x, y) dy


Example: Bivariate uniform distribution

X and Y uniformly distributed on [0, 1]× [0, 1] ⇒ density

f(x, y) = 1, 0 ≤ x, y ≤ 1.

Joint distribution function

F(a, b) = ∫_{y=0}^{b} ∫_{x=0}^{a} f(x, y) dx dy = a·b, 0 ≤ a, b ≤ 1.

Density of marginal distribution:

fX(x) = ∫_{y=−∞}^{∞} f(x, y) dy = 1, 0 ≤ x ≤ 1

i.e. density of univariate uniform distribution
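A quick way to see this concretely: the following Python sketch estimates F(a, b) by Monte Carlo and compares it with the closed form a·b (the seed, sample size, and test points are arbitrary choices; numpy assumed available).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=1_000_000)  # X ~ U[0, 1]
y = rng.uniform(size=1_000_000)  # Y ~ U[0, 1], independent of X

for a, b in [(0.3, 0.8), (0.5, 0.5), (0.9, 0.2)]:
    est = np.mean((x <= a) & (y <= b))  # empirical F(a, b)
    print(f"F({a}, {b}): MC = {est:.4f}, exact = {a * b:.4f}")
```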


Exercise: Bivariate continuous distribution

Let joint density of X and Y be given as

f(x, y) = ax²y² for 0 < x < y < 1, and f(x, y) = 0 else

1. Determine the constant a

2. Determine the CDF F (x, y)

3. Compute the probability P (X < 1/2, Y > 2/3)

4. Compute the probability P (X < 2, Y > 2/3)

5. Compute the probability P (X < 2, Y > 3)

Hint: Be careful with the domain of integration


S.R.: Example 1c

Let joint density of X and Y be given as

f(x, y) = e^{−(x+y)} for 0 < x < ∞, 0 < y < ∞, and f(x, y) = 0 else

Find the density function of the random variable X/Y

Again it is most important to consider the correct domain of integration:

P(X/Y ≤ a) = ∫_{y=0}^{∞} ∫_{x=0}^{ay} e^{−(x+y)} dx dy = 1 − 1/(a + 1)

(More details are given in the book)
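A Monte Carlo sketch of this result (seed, sample size, and test values of a are arbitrary choices; numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(size=1_000_000)  # X ~ Exp(1), density e^{-x}
y = rng.exponential(size=1_000_000)  # Y ~ Exp(1), independent

for a in [0.5, 1.0, 3.0]:
    est = np.mean(x / y <= a)  # empirical P(X/Y <= a)
    print(f"a = {a}: MC = {est:.4f}, exact = {1 - 1 / (a + 1):.4f}")
```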


Bivariate discrete random variable

X and Y both discrete

Define joint probability function

p(x, y) = P (X = x, Y = y)

Naturally we have p(x, y) = F(x, y) − F(x−, y) − F(x, y−) + F(x−, y−)

Obtain probability function of X by summation over ΩY :

pX(x) = P(X = x) = Σ_{y∈ΩY} p(x, y)


Example

Bowl with 3 red, 4 white and 5 blue balls; draw 3 balls randomly without replacement

X . . . number of red drawn balls

Y . . . number of white drawn balls

e.g.: p(0, 1) = P(0R, 1W, 2B) = C(3,0)·C(4,1)·C(5,2) / C(12,3) = 40/220

 i\j       0        1        2       3       pX
  0     10/220   40/220   30/220   4/220    84/220
  1     30/220   60/220   18/220     0     108/220
  2     15/220   12/220      0       0      27/220
  3      1/220      0        0       0       1/220
 pY     56/220  112/220   48/220   4/220       1
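The whole table can be reproduced in a few lines of Python with math.comb (a sketch; the loop bounds follow from drawing 3 of the 12 balls):

```python
from math import comb

total = comb(12, 3)  # 220 ways to draw 3 of the 12 balls
for i in range(4):            # i red balls drawn
    row = []
    for j in range(4):        # j white balls drawn
        k = 3 - i - j         # remaining draws are blue
        n = comb(3, i) * comb(4, j) * comb(5, k) if k >= 0 else 0
        row.append(f"{n:3d}/220")
    print(f"i = {i}:", "  ".join(row))
```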


Multivariate random variable

More than two random variables

joint distribution function for n random variables

F (x1, . . . , xn) = P (X1 ≤ x1, . . . , Xn ≤ xn)

Discrete: Joint probability function:

p(x1, . . . , xn) = P (X1 = x1, . . . , Xn = xn)

Marginals again by summation over all components that are not of interest, e.g.

pX1(x1) = Σ_{x2∈Ω2} ··· Σ_{xn∈Ωn} p(x1, . . . , xn)


Multinomial distribution

one of the most important multivariate discrete distributions

n independent experiments with r possible results, occurring with probabilities p1, . . . , pr

Xi . . . number of experiments with ith result, then

P(X1 = n1, . . . , Xr = nr) = (n! / (n1! ··· nr!)) p1^{n1} ··· pr^{nr} whenever Σ_{i=1}^{r} ni = n

Generalization of binomial distribution (r = 2)

Exercise: Poker dice (throw 5 dice)

Compute the probability of a (high) straight, four of a kind, and a full house
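A possible starting point, as a Python sketch applying the multinomial formula above with r = 6 and n = 5; the hand definitions (high straight = one each of 2 through 6, etc.) are one reading of the exercise, not taken from the slides.

```python
from math import factorial

def pattern_prob(counts):
    """Multinomial probability of one fixed assignment of die values:
    n!/(n1!...nr!) * (1/6)^n, with the nonzero counts given."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)
    return coef * (1 / 6) ** n

p_straight = pattern_prob([1, 1, 1, 1, 1])   # values 2..6 once each: 120/7776
p_four     = 6 * 5 * pattern_prob([4, 1])    # choose quadruple value, then single value
p_full     = 6 * 5 * pattern_prob([3, 2])    # choose triple value, then pair value
print(p_straight, p_four, p_full)
```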


4.2 Independent random variables

Two random variables X and Y are called independent if for all events A and B

P (X ∈ A, Y ∈ B) = P (X ∈ A)P (Y ∈ B)

Information on X does not give any information on Y

X and Y independent if and only if

P (X ≤ a, Y ≤ b) = P (X ≤ a)P (Y ≤ b)

i.e. F (a, b) = FX(a) FY (b) for all a, b.

Also equivalent to f(x, y) = fX(x) fY(y) in the continuous case and to p(x, y) = pX(x) pY(y) in the discrete case, for all x, y


Example: continuous

Looking back at the example

f(x, y) = ax²y² for 0 < x < y < 1, and f(x, y) = 0 else

Are the random variables X and Y with joint density as specified above independent?

What changes if we look at

f(x, y) = ax²y² for 0 < x < 1, 0 < y < 1, and f(x, y) = 0 else


Example

Z ∼ P(λ) ... number of persons who enter a bar
p ... percentage of male visitors
X, Y ... number of male and female visitors, respectively

Claim: X, Y are independently Poisson distributed with parameters pλ and qλ respectively, where q = 1 − p

Solution: Law of total probability:

P (X = i, Y =j) = P (X = i, Y =j|X + Y = i + j)P (X + Y = i + j)

By definition: P(X = i, Y = j | X + Y = i + j) = C(i+j, i) p^i q^j

P(X + Y = i + j) = e^{−λ} λ^{i+j} / (i + j)!

Together: P(X = i, Y = j) = e^{−λ} (λp)^i (λq)^j / (i! j!) = [e^{−λp} (λp)^i / i!] · [e^{−λq} (λq)^j / j!]
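The thinning argument can also be checked by simulation; in this Python sketch (λ, p, seed, and sample size are arbitrary choices) X is generated by binomial splitting of Z:

```python
import numpy as np

rng = np.random.default_rng(2)
lam, p = 4.0, 0.3
z = rng.poisson(lam, size=1_000_000)  # visitors entering the bar
x = rng.binomial(z, p)                # male visitors: binomial split of Z
y = z - x                             # female visitors

print("E(X), Var(X) vs p*lam:    ", x.mean(), x.var(), p * lam)
print("E(Y), Var(Y) vs (1-p)*lam:", y.mean(), y.var(), (1 - p) * lam)
print("empirical Cov(X, Y), ~0:  ", np.cov(x, y)[0, 1])
```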


Example: Two dice

X, Y ... uniformly distributed on {1, . . . , 6}
Due to independence we have p(x, y) = pX(x) pY(y) = 1/36

Distribution function:
FX(x) = FY(x) = i/6 for i ≤ x < i + 1, i = 1, . . . , 6

F(x, y) = FX(x) FY(y) = ij/36 for i ≤ x < i + 1 and j ≤ y < j + 1

Which distribution does X + Y have?

P(X + Y = 2) = p(1, 1) = 1/36
P(X + Y = 3) = p(1, 2) + p(2, 1) = 2/36
P(X + Y = 4) = p(1, 3) + p(2, 2) + p(3, 1) = 3/36

P (X + Y = k) = p(1, k − 1) + p(2, k − 2) + · · ·+ p(k − 1, 1)
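This convolution is easy to evaluate exactly, e.g. with the following Python sketch using exact fractions:

```python
from fractions import Fraction

die = {k: Fraction(1, 6) for k in range(1, 7)}  # pmf of one fair die

sum_pmf = {}
for x, px in die.items():
    for y, py in die.items():
        sum_pmf[x + y] = sum_pmf.get(x + y, Fraction(0)) + px * py

for k in range(2, 13):
    print(k, sum_pmf[k])  # 1/36, 2/36, ..., 6/36, ..., 1/36
```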


Sum of two independent random variables

Sum of random variables itself a random variable

Computation of distribution via convolution

Discrete random variables:

P(X + Y = k) = Σ_{x+y=k} pX(x) pY(y) = Σ_{y∈ΩY} pX(k − y) pY(y)

Continuous random variables:

fX+Y(x) = ∫_{y=−∞}^{∞} fX(x − y) fY(y) dy

Exercise: X ∼ P(λ1) and Y ∼ P(λ2) independent

⇒ X + Y ∼ P(λ1 + λ2)
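A numerical check of the exercise's claim (λ1, λ2 and the range of k are arbitrary choices): the discrete convolution of the two Poisson probability functions is compared with the probability function of P(λ1 + λ2).

```python
from math import exp, factorial

def pois(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

lam1, lam2 = 1.5, 2.5
for k in range(6):
    conv = sum(pois(j, lam1) * pois(k - j, lam2) for j in range(k + 1))
    print(k, round(conv, 6), round(pois(k, lam1 + lam2), 6))  # should agree
```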


Example of convolution: Continuous case

X, Y independent, uniformly distributed on [0, 1],
i.e. f(x, y) = 1 for (x, y) ∈ [0, 1] × [0, 1],
fX(x) = 1 for 0 ≤ x ≤ 1, fY(y) = 1 for 0 ≤ y ≤ 1

Computation of density of Z := X + Y

fZ(x) = ∫_{y=−∞}^{∞} fX(x − y) fY(y) dy

      = ∫_{y=0}^{x} dy = x,          0 < x ≤ 1

      = ∫_{y=x−1}^{1} dy = 2 − x,    1 < x ≤ 2

Reason: fY(y) = 1 if 0 ≤ y ≤ 1,

fX(x − y) = 1 if 0 ≤ x − y ≤ 1, i.e. x − 1 ≤ y ≤ x
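A Monte Carlo sketch comparing a crude density estimate of Z = X + Y with the triangular density just derived (seed, sample size, bin width, and evaluation points are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.uniform(size=1_000_000) + rng.uniform(size=1_000_000)  # Z = X + Y

h = 0.02  # bin width for the density estimate
for x0 in [0.25, 0.75, 1.25, 1.75]:
    est = np.mean(np.abs(z - x0) < h / 2) / h
    exact = x0 if x0 <= 1 else 2 - x0  # triangular density
    print(f"fZ({x0}): MC = {est:.3f}, exact = {exact:.3f}")
```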


Another example of convolution

X, Y indep., Γ-distributed with parameters t1, t2 and the same λ:

fX(x) = λe^{−λx}(λx)^{t1−1} / Γ(t1),  fY(y) = λe^{−λy}(λy)^{t2−1} / Γ(t2),  x, y ≥ 0

fZ(x) = ∫_{y=−∞}^{∞} fX(x − y) fY(y) dy

      = ∫_{y=0}^{x} [λe^{−λ(x−y)}(λ(x − y))^{t1−1} / Γ(t1)] · [λe^{−λy}(λy)^{t2−1} / Γ(t2)] dy

      = (λ^{t1+t2} e^{−λx} / (Γ(t1)Γ(t2))) ∫_{y=0}^{x} (x − y)^{t1−1} y^{t2−1} dy

Substituting y = xz, dy = x dz, the remaining integral becomes x^{t1+t2−1} ∫_{z=0}^{1} (1 − z)^{t1−1} z^{t2−1} dz = x^{t1+t2−1} Γ(t1)Γ(t2)/Γ(t1 + t2), so

fZ(x) = λe^{−λx}(λx)^{t1+t2−1} / Γ(t1 + t2),  i.e. Z ∼ Γ(t1 + t2, λ)
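A quick simulation check of Z ∼ Γ(t1 + t2, λ) via its mean and variance (parameter values, seed, and sample size are arbitrary; note numpy's gamma sampler takes shape and scale = 1/λ):

```python
import numpy as np

rng = np.random.default_rng(4)
t1, t2, lam = 2.0, 3.5, 1.5
z = rng.gamma(t1, 1 / lam, 1_000_000) + rng.gamma(t2, 1 / lam, 1_000_000)

# Gamma(t1 + t2, lam) has mean (t1 + t2)/lam and variance (t1 + t2)/lam^2
print("mean:", z.mean(), "vs", (t1 + t2) / lam)
print("var: ", z.var(), "vs", (t1 + t2) / lam ** 2)
```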


4.3 Transformations

As in the univariate case, one also considers transformations of random variables in the multivariate case

We restrict ourselves to the bivariate case

X1 and X2 jointly distributed with density fX1,X2

Y1 = g1(X1, X2) and Y2 = g2(X1, X2)

Major question: What is the joint density fY1,Y2?


Density of Transformation

Assumptions:

• y1 = g1(x1, x2) and y2 = g2(x1, x2) can be uniquely solved for x1 and x2, say x1 = h1(y1, y2) and x2 = h2(y1, y2)

• g1 and g2 are C¹ such that

  J(x1, x2) = | ∂g1/∂x1   ∂g1/∂x2 |
              | ∂g2/∂x1   ∂g2/∂x2 |  ≠ 0

Under these conditions we have

fY1,Y2(y1, y2) = fX1,X2(x1, x2) |J(x1, x2)|^{−1}, evaluated at x1 = h1(y1, y2), x2 = h2(y1, y2)

Proof: Calculus

Note the similarity to the one-dimensional case


Examples

• Sheldon Ross: Example 7a

Y1 = X1 + X2

Y2 = X1 −X2

• Sheldon Ross: Example 7c

X and Y independent gamma random variables

U = X + Y

V = X/(X + Y )
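For Example 7a the computation can be sketched symbolically; this Python/sympy snippet (sympy assumed available) computes the Jacobian determinant and the inverse map:

```python
import sympy as sp

x1, x2, y1, y2 = sp.symbols("x1 x2 y1 y2")

# Example 7a: y1 = x1 + x2, y2 = x1 - x2
J = sp.Matrix([x1 + x2, x1 - x2]).jacobian([x1, x2])
print(sp.det(J))  # -2, hence |J| = 2 and fY1,Y2 = fX1,X2 / 2

# inverse map: x1 = (y1 + y2)/2, x2 = (y1 - y2)/2
print(sp.solve([sp.Eq(y1, x1 + x2), sp.Eq(y2, x1 - x2)], [x1, x2]))
```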


Expectation of bivariate RV

X and Y discrete with joint probability p(x, y).

Like in the one-dimensional case:

E(g(X, Y)) = Σ_{x∈ΩX} Σ_{y∈ΩY} g(x, y) p(x, y)

Exercise:

Let X and Y be the eyes of two fair dice, thrown independently

Compute the expectation of the difference |X − Y|
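The exercise can be solved by direct enumeration over the 36 equally likely outcomes; a short Python sketch with exact fractions:

```python
from fractions import Fraction

# E|X - Y| by enumeration over all 36 outcomes
e = sum(Fraction(abs(i - j), 36) for i in range(1, 7) for j in range(1, 7))
print(e, float(e))  # 35/18 ~ 1.944
```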


Expectation of bivariate RV

X and Y continuous with joint density f(x, y).

Like in the one-dimensional case:

E(g(X, Y)) = ∫_{x∈ℝ} ∫_{y∈ℝ} g(x, y) f(x, y) dy dx

Exercise:

An accident occurs at point X on a road of length L. At that time an ambulance is at point Y. Assuming both X and Y are uniformly distributed on [0, L] and independent, calculate the average distance between them. This is again E(|X − Y|)
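A Monte Carlo sketch of this exercise (L, seed, and sample size are arbitrary choices); evaluating the double integral by hand gives the closed form L/3 for comparison:

```python
import numpy as np

rng = np.random.default_rng(5)
L = 10.0
x = rng.uniform(0, L, 1_000_000)  # accident location
y = rng.uniform(0, L, 1_000_000)  # ambulance location
print(np.mean(np.abs(x - y)), "vs L/3 =", L / 3)
```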


Expectation of the sum of two RV

X and Y discrete with joint probability p(x, y)

For g(x, y) = x + y we obtain

E(X + Y) = Σ_{x∈ΩX} Σ_{y∈ΩY} (x + y) p(x, y) = E(X) + E(Y)

Analogous in the continuous case:

E(X + Y) = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} (x + y) f(x, y) dx dy = E(X) + E(Y)

Be aware: such additivity does not hold for variances in general!


Expectation of functions under independence

If X and Y are independent random variables, then for any functions h and g we obtain

E (g(X)h(Y )) = E (g(X)) E (h(Y ))

Proof:

E(g(X)h(Y)) = ∫∫ g(x)h(y) f(x, y) dx dy

            = ∫∫ g(x)h(y) fX(x) fY(y) dx dy

            = ∫ g(x) fX(x) dx · ∫ h(y) fY(y) dy

            = E(g(X)) E(h(Y))

Second equality uses independence


Expectation of random samples

X1, . . . , Xn i.i.d. like X, (independent, identically distributed)

Definition: X̄ := (1/n) Σ_{i=1}^{n} Xi

For E(X) = µ and Var(X) = σ² we obtain:

E(X̄) = µ,  Var(X̄) = σ²/n

Proof:

E(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} E(Xi) = nµ, hence E(X̄) = µ

Var(X̄) = E((1/n) Σ_{i=1}^{n} Xi − µ)² = (1/n²) E(Σ_{i=1}^{n} (Xi − µ))²

        = (1/n²) E(Σ_{i=1}^{n} (Xi − µ)²) = σ²/n

The cross terms vanish in the second-to-last equality due to the independence of the Xi


4.4 Covariance and Correlation

Describe the relation between two random variables

Definition Covariance:

Cov(X, Y) = E[(X − E(X))(Y − E(Y))]

Usual notation: σXY := Cov (X,Y )

Just like for variances we have

σXY = E(XY )− E(X)E(Y )

Definition of correlation:

ρ(X, Y) := σXY / (σX σY)


Example Correlation

[Four scatter plots of bivariate samples illustrating correlations ρ = 0.9, ρ = −0.6, ρ = 0.3 and ρ = 0.0]


Example Covariance

Discrete bivariate distribution (ΩX = ΩY = {0, 1, 2, 3}) given by

 i\j      0       1      2      3      pX
  0     1/20    4/20   3/20   2/20   10/20
  1     3/20    2/20   2/20    0      7/20
  2     1/20    1/20    0      0      2/20
  3     1/20     0      0      0      1/20
 pY     6/20    7/20   5/20   2/20   20/20

Compute Cov (X,Y )

Solution: Cov(X, Y) = E(XY) − E(X)E(Y) = 8/20 − (14/20)·(23/20) = −162/400
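The computation of E(XY), E(X) and E(Y) from the table can be reproduced in a few lines of Python with exact fractions:

```python
from fractions import Fraction

# joint probabilities p(i, j) from the table, as multiples of 1/20
p = {(0, 0): 1, (0, 1): 4, (0, 2): 3, (0, 3): 2,
     (1, 0): 3, (1, 1): 2, (1, 2): 2,
     (2, 0): 1, (2, 1): 1,
     (3, 0): 1}

exy = sum(i * j * Fraction(c, 20) for (i, j), c in p.items())  # 8/20
ex  = sum(i * Fraction(c, 20) for (i, j), c in p.items())      # 14/20
ey  = sum(j * Fraction(c, 20) for (i, j), c in p.items())      # 23/20
print(exy - ex * ey)  # -81/200 = -162/400
```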


Example 2: Covariance

Compute covariance between X and Y for

f(x, y) = 24xy for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, x + y ≤ 1, and f(x, y) = 0 else

First compute the CDF of X:

P(X ≤ a) = ∫_{y=0}^{1−a} ∫_{x=0}^{a} 24xy dx dy + ∫_{y=1−a}^{1} ∫_{x=0}^{1−y} 24xy dx dy

         = 6(1 − a)²a² + 1 − 6(1 − a)² + 8(1 − a)³ − 3(1 − a)⁴

and by differentiation

fX(x) = 12(1 − x)²x − 12(1 − x)x² + 12(1 − x) − 24(1 − x)² + 12(1 − x)³

      = 12(1 − x)²x


Example 2: Covariance continued

Because of symmetry we have fY (x) = fX(x) and we get

E(X) = E(Y) = ∫_{x=0}^{1} 12(1 − x)²x² dx = 2/5

Furthermore

E(XY) = ∫_{y=0}^{1} ∫_{x=0}^{1−y} 24x²y² dx dy = 2/15

and finally

Cov(X, Y) = 2/15 − (2/5)·(2/5) = (10 − 12)/75 = −2/75
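The three integrals can be verified symbolically; a Python/sympy sketch (sympy assumed available; integrating over x first matches the triangle bounds):

```python
import sympy as sp

x, y = sp.symbols("x y", nonnegative=True)
f = 24 * x * y  # density on the triangle 0 <= x, 0 <= y, x + y <= 1

ex  = sp.integrate(x * f,     (x, 0, 1 - y), (y, 0, 1))  # 2/5
exy = sp.integrate(x * y * f, (x, 0, 1 - y), (y, 0, 1))  # 2/15
print(exy - ex ** 2)  # -2/75
```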


Covariance for independent RV

X and Y independent ⇒ σXY = 0

follows immediately from σXY = E(XY )− E(X)E(Y ) and

E(XY) = Σ_x Σ_y x·y·p(x, y) = Σ_x x·pX(x) · Σ_y y·pY(y)

Converse not true:

X uniformly distributed on {−1, 0, 1} and Y = 0 if X ≠ 0, Y = 1 if X = 0

E(X) = 0
XY = 0 ⇒ E(XY) = 0

thus Cov(X, Y) = 0, although X and Y are not independent:

e.g. P(X = 1, Y = 0) = P(X = 1) = 1/3, but P(X = 1)P(Y = 0) = (1/3)·(2/3) = 2/9


Properties of Covariance

Obviously

Cov (X, Y ) = Cov (Y,X), and Cov (X, X) = Var (X)

Covariance is a bilinear form:

Cov (aX, Y ) = a Cov (X,Y ), a ∈ R

and

Cov(Σ_{i=1}^{n} Xi, Σ_{j=1}^{m} Yj) = Σ_{i=1}^{n} Σ_{j=1}^{m} Cov(Xi, Yj)

Proof by simple computation . . .


Variance of a sum of RV

Due to the properties shown above

Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Σ_{j=1}^{n} Cov(Xi, Xj)

                    = Σ_{i=1}^{n} Var(Xi) + Σ_{i=1}^{n} Σ_{j≠i} Cov(Xi, Xj)

Extreme cases:

• independent RV: Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Var(Xi)

• X1 = X2 = · · · = Xn: Var(Σ_{i=1}^{n} Xi) = n² Var(X1)


Correlation

Definition: ρ(X, Y) := σXY / (σX σY)

We have:

−1 ≤ ρ(X, Y ) ≤ 1

Proof:

0 ≤ Var(X/σX + Y/σY) = Var(X)/σX² + Var(Y)/σY² + 2 Cov(X, Y)/(σX σY) = 2[1 + ρ(X, Y)]

0 ≤ Var(X/σX − Y/σY) = Var(X)/σX² + Var(Y)/σY² − 2 Cov(X, Y)/(σX σY) = 2[1 − ρ(X, Y)]


Exercise Correlation

Let X and Y be independently uniform on [0, 1]

Compute correlation between X and Z for

1. Z = X + Y

2. Z = X2 + Y 2

3. Z = (X + Y )2
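One way to explore the exercise before computing by hand: a Monte Carlo sketch estimating the three correlations (seed and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(size=1_000_000)
y = rng.uniform(size=1_000_000)

for name, z in [("X + Y", x + y),
                ("X^2 + Y^2", x ** 2 + y ** 2),
                ("(X + Y)^2", (x + y) ** 2)]:
    print(name, np.corrcoef(x, z)[0, 1])
```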


4.5 Conditional Distribution

Conditional probability for two events E and F:

P(E|F) = P(EF) / P(F)

Corresponding definition for random variables X and Y

Discrete: pX|Y(x|y) := P(X = x|Y = y) = p(x, y) / pY(y)

Exercise: Let p(x, y) be given by

p(0, 0) = 0.4, p(0, 1) = 0.2, p(1, 0) = 0.1, p(1, 1) = 0.3,

Compute conditional probability function of X for Y = 1


Discrete conditional distribution

Conditional CDF:

FX|Y(x|y) := P(X ≤ x|Y = y) = Σ_{k≤x} pX|Y(k|y)

If X and Y are independent, then pX|Y(x|y) = pX(x)

Proof: Easy computation

Exercise: Let X ∼ P(λ1) and Y ∼ P(λ2) be independent.

Compute conditional distribution of X, for X + Y = n

P(X = k|X + Y = n) = P(X = k)P(Y = n − k) / P(X + Y = n),

X + Y ∼ P(λ1 + λ2) ⇒ X|(X + Y = n) ∼ B(n, λ1/(λ1 + λ2))
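A numerical check of this result (λ1, λ2 and n are arbitrary choices): the conditional probabilities are compared with the B(n, λ1/(λ1 + λ2)) probability function.

```python
from math import comb, exp, factorial

def pois(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

lam1, lam2, n = 2.0, 3.0, 6
q = lam1 / (lam1 + lam2)  # success probability of the binomial

for k in range(n + 1):
    cond  = pois(k, lam1) * pois(n - k, lam2) / pois(n, lam1 + lam2)
    binom = comb(n, k) * q ** k * (1 - q) ** (n - k)
    print(k, round(cond, 6), round(binom, 6))  # should agree
```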


Continuous conditional distribution

Continuous: fX|Y(x|y) := f(x, y) / fY(y) for fY(y) > 0

The definition in the continuous case can be motivated by the discrete case (probabilities of small neighborhoods of x and y)

Computation of conditional probabilities:

P(X ∈ A|Y = y) = ∫_A fX|Y(x|y) dx

Conditional CDF:

FX|Y(a|y) := P(X ∈ (−∞, a]|Y = y) = ∫_{x=−∞}^{a} fX|Y(x|y) dx


Example

Joint density of X and Y given by

f(x, y) = c·x(2 − x − y) for x ∈ [0, 1], y ∈ [0, 1], and f(x, y) = 0 otherwise.

Compute c, fX|Y (x|y) and P (X < 1/2|Y = 1/3)

Solution: fY(y) = c ∫_{x=0}^{1} x(2 − x − y) dx = c(2/3 − y/2)

1 = ∫_{y=0}^{1} fY(y) dy = c(2/3 − 1/4) ⇒ c = 12/5

fX|Y(x|y) = f(x, y)/fY(y) = x(2 − x − y)/(2/3 − y/2) = 6x(2 − x − y)/(4 − 3y)

P(X < 1/2|Y = 1/3) = ∫_{x=0}^{1/2} 6x(2 − x − 1/3)/(4 − 3·(1/3)) dx = · · · = 1/3


Conditional expectation and variance

Computation in continuous case with conditional density:

E(X|Y = y) = ∫_{x=−∞}^{∞} x fX|Y(x|y) dx

Var(X|Y = y) = ∫_{x=−∞}^{∞} (x − E(X|Y = y))² fX|Y(x|y) dx

Example continued:

E(X|Y = y) = ∫_{x=0}^{1} 6x²(2 − x − y)/(4 − 3y) dx = (5/2 − 2y)/(4 − 3y)

Specifically: E(X|Y = 1/3) = (5/2 − 2/3)/3 = 11/18

Var (X|Y = y) = . . .


Conditional expectation and variance

Computation in discrete case with conditional probability function:

E(X|Y = y) = Σ_{x∈ΩX} x pX|Y(x|y)

Var(X|Y = y) = Σ_{x∈ΩX} (x − E(X|Y = y))² pX|Y(x|y)

Exercise: Compute expectation and variance for X given Y = j

 i\j      0       1      2      3      pX
  0     1/20    4/20   3/20   2/20   10/20
  1     3/20    2/20   2/20    0      7/20
  2     1/20    1/20    0      0      2/20
  3     1/20     0      0      0      1/20
 pY     6/20    7/20   5/20   2/20   20/20


Computation of expectation by conditioning

E(X|Y = y) can be seen as a function of y, and therefore the expectation of this function can be computed

It holds: E(X) = E(E(X|Y))

Proof: (for the continuous case)

E(E(X|Y)) = ∫_{y=−∞}^{∞} E(X|Y = y) fY(y) dy

          = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} x fX|Y(x|y) fY(y) dx dy

          = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} x (f(x, y)/fY(y)) fY(y) dx dy = E(X)

Verify formula for previous examples


The conditional variance formula

Var (X) = E(Var (X|Y )) + Var (E(X|Y ))

Proof: From Var(X|Y) = E(X²|Y) − (E(X|Y))² we get

E(Var(X|Y)) = E(E(X²|Y)) − E((E(X|Y))²) = E(X²) − E((E(X|Y))²)

On the other hand

Var(E(X|Y)) = E((E(X|Y))²) − (E(E(X|Y)))² = E((E(X|Y))²) − E(X)²

The sum of both expressions gives the result.

Formula important in the theory of linear regression!


4.6 Bivariate normal distribution

Univariate normal distribution: f(x) = (1/(√(2π) σ)) e^{−(x−µ)²/(2σ²)}

Standard normal distribution: φ(x) = (1/√(2π)) e^{−x²/2}

Let X1 and X2 be independent, both normally distributed N(µi, σi²), i = 1, 2

⇒ f(x1, x2) = (1/(2π σ1σ2)) e^{−(x1−µ1)²/(2σ1²) − (x2−µ2)²/(2σ2²)}

            = (1/(2π |Σ|^{1/2})) e^{−(x−µ)ᵀ Σ⁻¹ (x−µ)/2}

where x = (x1, x2)ᵀ, µ = (µ1, µ2)ᵀ, Σ = | σ1²   0   |
                                        | 0     σ2² |


Density function in general

X = (X1, X2) bivariate normal distributed if the joint density has the form

f(x) = (1/(2π |Σ|^{1/2})) e^{−(x−µ)ᵀ Σ⁻¹ (x−µ)/2}

Covariance matrix: Σ = | σ1²   σ12 |
                       | σ12   σ2² |

Notation: ρ := σ12/(σ1σ2)

• |Σ| = σ1²σ2² − σ12² = σ1²σ2²(1 − ρ²)

• Σ⁻¹ = (1/(σ1²σ2²(1 − ρ²))) | σ2²      −ρσ1σ2 |
                             | −ρσ1σ2   σ1²    |


Density in general

Finally we obtain

f(x1, x2) = (1/(2πσ1σ2√(1 − ρ²))) exp{−(z1² − 2ρz1z2 + z2²)/(2(1 − ρ²))}

where z1 = (x1 − µ1)/σ1 and z2 = (x2 − µ2)/σ2 (compare standardization)

Notation suggests that µi and σi² are the expectation and variance of the marginals Xi, and that ρ is the correlation between X1 and X2

Important formula:

f(x1, x2) = (1/(√(2π)σ1)) e^{−z1²/2} · (1/(√(2π(1 − ρ²))σ2)) e^{−(ρz1−z2)²/(2(1 − ρ²))}

Completion of square in the exponent


Bivariate normal distribution plot

Marginals standard normal distributed N(0, 1), ρ = 0 [surface plot of the joint density]


Example density

Let (X, Y) be bivariate normally distributed with µ1 = µ2 = 0, σ1 = σ2 = 1 and ρ = 1/2; compute the joint density

Solution: µ = (0, 0)ᵀ, Σ = | 1     1/2 |
                           | 1/2   1   |

|Σ| = 1 − 1/4 = 3/4,  Σ⁻¹ = (4/3) | 1      −1/2 |
                                  | −1/2   1    |

(x, y) Σ⁻¹ (x, y)ᵀ = (2/3)(x, y)·(2x − y, −x + 2y)ᵀ = (4/3)(x² − xy + y²)

f(x, y) = (1/(√3 π)) e^{−(2/3)(x² − xy + y²)}

Density after completion of square in exponent:

f(x, y) = (1/√(2π)) e^{−x²/2} · (1/√(2π·3/4)) e^{−(y−x/2)²/(2·3/4)}
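The closed form can be checked against a library implementation; a Python sketch using scipy.stats.multivariate_normal (scipy assumed available; the evaluation points are arbitrary):

```python
import numpy as np
from scipy.stats import multivariate_normal

rv = multivariate_normal(mean=[0, 0], cov=[[1, 0.5], [0.5, 1]])

def f_closed(x, y):
    return np.exp(-2 / 3 * (x ** 2 - x * y + y ** 2)) / (np.sqrt(3) * np.pi)

for x, y in [(0.0, 0.0), (1.0, -0.5), (0.3, 2.0)]:
    print(rv.pdf([x, y]), f_closed(x, y))  # should agree
```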


Example continued

f(x, y) = (1/√(2π)) e^{−x²/2} · (1/√(2π·3/4)) e^{−(y−x/2)²/(2·3/4)}

The joint density is the product of a standard normal (in x) and a normal distribution (in y) with mean x/2 and variance 3/4.

Compute density of X:

fX(x) = (1/√(2π)) e^{−x²/2} ∫_{y=−∞}^{∞} (1/√(2π·3/4)) e^{−(y−x/2)²/(2·3/4)} dy = (1/√(2π)) e^{−x²/2}

fX(x) is density of a standard normal distribution

The integral equals 1, because we integrate a density function!


Interpretation of µi, σi² and ρ

In general we have for a bivariate normal distribution

1. X1 ∼ N(µ1, σ1²) and X2 ∼ N(µ2, σ2²)

2. Correlation coefficient ρ(X1, X2) = σ12/(σ1σ2)

Proof: 1. Use the formula with completed square in the exponent and integrate:

fX1(x1) = (1/(√(2π)σ1)) e^{−z1²/2} ∫_{x2=−∞}^{∞} (1/(√(2π(1 − ρ²))σ2)) e^{−(ρz1−z2)²/(2(1 − ρ²))} dx2

        = (1/(√(2π)σ1)) e^{−z1²/2} ∫_{s=−∞}^{∞} (1/√(2π)) e^{−(ρz1/√(1 − ρ²) − s)²/2} ds = (1/(√(2π)σ1)) e^{−z1²/2}

with the substitution s ← z2/√(1 − ρ²) = (x2 − µ2)/(√(1 − ρ²) σ2)


Proof continued

2. Again the formula with completed square in the exponent, and substitutions z1 ← (x1 − µ1)/σ1, z2 ← (x2 − µ2)/σ2:

Cov(X1, X2) = ∫_{x1=−∞}^{∞} ∫_{x2=−∞}^{∞} (x1 − µ1)(x2 − µ2) f(x1, x2) dx2 dx1

            = ∫_{x1=−∞}^{∞} ((x1 − µ1)/(√(2π)σ1)) e^{−z1²/2} ∫_{x2=−∞}^{∞} ((x2 − µ2)/(√(2π(1 − ρ²))σ2)) e^{−(ρz1−z2)²/(2(1 − ρ²))} dx2 dx1

            = ∫_{z1} z1 φ(z1) [∫_{z2} (z2/√(1 − ρ²)) φ((ρz1 − z2)/√(1 − ρ²)) σ2 dz2] σ1 dz1

            = σ1σ2 ∫_{z1} z1 φ(z1) ρz1 dz1 = σ1σ2 ρ = σ12


Conditional distribution

Interpretation of the formula

f(x1, x2) = (1/(√(2π)σ1)) e^{−z1²/2} · (1/(√(2π(1 − ρ²))σ2)) e^{−(ρz1−z2)²/(2(1 − ρ²))}

f(x1, x2) = f1(x1)f2|1(x2|x1)

From (ρz1 − z2)²/(1 − ρ²) = (µ2 + σ2ρz1 − x2)²/(σ2²(1 − ρ²)) we conclude:

we conclude:

The conditional distribution is again a normal distribution, with µ2|1 = µ2 + ρ(x1 − µ1)σ2/σ1 and σ²2|1 = σ2²(1 − ρ²)

For the bivariate normal distribution: ρ = 0 ⇒ independence

For general distributions this implication does not hold!


Sum of bivariate normal distributions

Let X1, X2 be bivariate normal with µ1, µ2, σ1², σ2², σ12

Then the random variable Z = X1 + X2 is again normal, with

X1 + X2 ∼ N(µ1 + µ2, σ1² + σ2² + 2σ12)

Proof: For the density of the sum we have

fZ(z) = ∫_{x2=−∞}^{∞} f(z − x2, x2) dx2

Get the result again by completion of square (lengthy calculation)

Intuition: the mean and variance of Z are obtained from the formulas for general random variables
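A simulation sketch of this result (the values of µ, Σ, the seed, and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(7)
mu = [1.0, -2.0]
Sigma = [[2.0, 0.6], [0.6, 1.0]]  # sigma1^2 = 2, sigma2^2 = 1, sigma12 = 0.6

xs = rng.multivariate_normal(mu, Sigma, size=1_000_000)
z = xs[:, 0] + xs[:, 1]

print("mean:", z.mean(), "vs", mu[0] + mu[1])       # mu1 + mu2
print("var: ", z.var(), "vs", 2.0 + 1.0 + 2 * 0.6)  # sigma1^2 + sigma2^2 + 2*sigma12
```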
