
Probability
STAT 416
Spring 2007

4 Jointly distributed random variables

1. Introduction

2. Independent Random Variables

3. Transformations

4. Covariance, Correlation

5. Conditional Distributions

6. Bivariate Normal Distribution


4.1 Introduction

Computation of probabilities with more than one random variable

two random variables ... bivariate

two or more random variables ... multivariate

Concepts:

• Joint distribution function

• purely discrete: Joint probability function

• purely continuous: Joint density


Joint distribution function

Bivariate case: Random variables X and Y

Define joint distribution function as

F (x, y) := P (X ≤ x, Y ≤ y), −∞ < x, y < ∞

Bivariate distribution thus fully specified

P (x1<X≤x2, y1<Y≤y2) = F (x2, y2)−F (x1, y2)−F (x2, y1)+F (x1, y1)

for x1 < x2 and y1 < y2

Marginal distribution: FX(x) := P (X ≤ x) = F (x,∞)

Idea: P(X ≤ x) = P(X ≤ x, Y < ∞) = lim_{y→∞} F(x, y)

Analogously, FY(y) := P(Y ≤ y) = F(∞, y)


Bivariate continuous random variable

X and Y jointly continuous if there exists joint density function:

f(x, y) = ∂²F(x, y) / (∂x ∂y)

joint distribution function then obtained by integration

F(a, b) = ∫_{y=−∞}^{b} ∫_{x=−∞}^{a} f(x, y) dx dy

Density of the marginal distribution of X obtained by integration over ΩY:

fX(x) = ∫_{y=−∞}^{∞} f(x, y) dy


Example: Bivariate uniform distribution

X and Y uniformly distributed on [0, 1]× [0, 1] ⇒ density

f(x, y) = 1, 0 ≤ x, y ≤ 1.

Joint distribution function

F(a, b) = ∫_{y=0}^{b} ∫_{x=0}^{a} f(x, y) dx dy = a·b, 0 ≤ a, b ≤ 1.

Density of marginal distribution:

fX(x) = ∫_{y=−∞}^{∞} f(x, y) dy = 1, 0 ≤ x ≤ 1

i.e. density of univariate uniform distribution
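A quick way to see this concretely: the following Python sketch estimates F(a, b) by Monte Carlo and compares it with the closed form a·b (the seed, sample size, and test points are arbitrary choices; numpy assumed available).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=1_000_000)  # X ~ U[0, 1]
y = rng.uniform(size=1_000_000)  # Y ~ U[0, 1], independent of X

for a, b in [(0.3, 0.8), (0.5, 0.5), (0.9, 0.2)]:
    est = np.mean((x <= a) & (y <= b))  # empirical F(a, b)
    print(f"F({a}, {b}): MC = {est:.4f}, exact = {a * b:.4f}")
```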


Exercise: Bivariate continuous distribution

Let joint density of X and Y be given as

f(x, y) = ax²y² for 0 < x < y < 1, and f(x, y) = 0 else

1. Determine the constant a

2. Determine the CDF F (x, y)

3. Compute the probability P (X < 1/2, Y > 2/3)

4. Compute the probability P (X < 2, Y > 2/3)

5. Compute the probability P (X < 2, Y > 3)

Hint: Be careful with the domain of integration


S.R.: Example 1c

Let joint density of X and Y be given as

f(x, y) = e^{−(x+y)} for 0 < x < ∞, 0 < y < ∞, and f(x, y) = 0 else

Find the density function of the random variable X/Y

Again it is most important to consider the correct domain of integration:

P(X/Y ≤ a) = ∫_{y=0}^{∞} ∫_{x=0}^{ay} e^{−(x+y)} dx dy = 1 − 1/(a + 1)

(More details are given in the book)
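A Monte Carlo sketch of this result (seed, sample size, and test values of a are arbitrary choices; numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(size=1_000_000)  # X ~ Exp(1), density e^{-x}
y = rng.exponential(size=1_000_000)  # Y ~ Exp(1), independent

for a in [0.5, 1.0, 3.0]:
    est = np.mean(x / y <= a)  # empirical P(X/Y <= a)
    print(f"a = {a}: MC = {est:.4f}, exact = {1 - 1 / (a + 1):.4f}")
```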


Bivariate discrete random variable

X and Y both discrete

Define joint probability function

p(x, y) = P (X = x, Y = y)

Naturally we have p(x, y) = F(x, y) − F(x−, y) − F(x, y−) + F(x−, y−)

Obtain probability function of X by summation over ΩY :

pX(x) = P(X = x) = Σ_{y∈ΩY} p(x, y)


Example

Bowl with 3 red, 4 white and 5 blue balls; draw 3 balls randomly without replacement

X . . . number of red drawn balls

Y . . . number of white drawn balls

e.g.: p(0, 1) = P(0R, 1W, 2B) = C(3,0)·C(4,1)·C(5,2) / C(12,3) = 40/220

 i\j       0        1        2       3       pX
  0     10/220   40/220   30/220   4/220    84/220
  1     30/220   60/220   18/220     0     108/220
  2     15/220   12/220      0       0      27/220
  3      1/220      0        0       0       1/220
 pY     56/220  112/220   48/220   4/220       1
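The whole table can be reproduced in a few lines of Python with math.comb (a sketch; the loop bounds follow from drawing 3 of the 12 balls):

```python
from math import comb

total = comb(12, 3)  # 220 ways to draw 3 of the 12 balls
for i in range(4):            # i red balls drawn
    row = []
    for j in range(4):        # j white balls drawn
        k = 3 - i - j         # remaining draws are blue
        n = comb(3, i) * comb(4, j) * comb(5, k) if k >= 0 else 0
        row.append(f"{n:3d}/220")
    print(f"i = {i}:", "  ".join(row))
```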


Multivariate random variable

More than two random variables

joint distribution function for n random variables

F (x1, . . . , xn) = P (X1 ≤ x1, . . . , Xn ≤ xn)

Discrete: Joint probability function:

p(x1, . . . , xn) = P (X1 = x1, . . . , Xn = xn)

Marginals again by summation over all components that are not of interest, e.g.

pX1(x1) = Σ_{x2∈Ω2} ··· Σ_{xn∈Ωn} p(x1, . . . , xn)


Multinomial distribution

one of the most important multivariate discrete distributions

n independent experiments with r possible results, occurring with probabilities p1, . . . , pr

Xi . . . number of experiments with ith result, then

P(X1 = n1, . . . , Xr = nr) = (n! / (n1! ··· nr!)) p1^{n1} ··· pr^{nr} whenever Σ_{i=1}^{r} ni = n

Generalization of binomial distribution (r = 2)

Exercise: Poker dice (throw 5 dice)

Compute the probability of a (high) straight, four of a kind, and a full house
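A possible starting point, as a Python sketch applying the multinomial formula above with r = 6 and n = 5; the hand definitions (high straight = one each of 2 through 6, etc.) are one reading of the exercise, not taken from the slides.

```python
from math import factorial

def pattern_prob(counts):
    """Multinomial probability of one fixed assignment of die values:
    n!/(n1!...nr!) * (1/6)^n, with the nonzero counts given."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)
    return coef * (1 / 6) ** n

p_straight = pattern_prob([1, 1, 1, 1, 1])   # values 2..6 once each: 120/7776
p_four     = 6 * 5 * pattern_prob([4, 1])    # choose quadruple value, then single value
p_full     = 6 * 5 * pattern_prob([3, 2])    # choose triple value, then pair value
print(p_straight, p_four, p_full)
```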


4.2 Independent random variables

Two random variables X and Y are called independent if for all events A and B

P (X ∈ A, Y ∈ B) = P (X ∈ A)P (Y ∈ B)

Information on X does not give any information on Y

X and Y independent if and only if

P (X ≤ a, Y ≤ b) = P (X ≤ a)P (Y ≤ b)

i.e. F (a, b) = FX(a) FY (b) for all a, b.

Also equivalent to f(x, y) = fX(x) fY(y) in the continuous case and to p(x, y) = pX(x) pY(y) in the discrete case, for all x, y


Example: continuous

Looking back at the example

f(x, y) = ax²y² for 0 < x < y < 1, and f(x, y) = 0 else

Are the random variables X and Y with joint density as specified above independent?

What changes if we look at

f(x, y) = ax²y² for 0 < x < 1, 0 < y < 1, and f(x, y) = 0 else


Example

Z ∼ P(λ) ... number of persons who enter a bar
p ... percentage of male visitors
X, Y ... number of male and female visitors, respectively

Claim: X, Y are independently Poisson distributed with parameters pλ and qλ respectively, where q = 1 − p

Solution: Law of total probability:

P (X = i, Y =j) = P (X = i, Y =j|X + Y = i + j)P (X + Y = i + j)

By definition: P(X = i, Y = j | X + Y = i + j) = C(i+j, i) p^i q^j

P(X + Y = i + j) = e^{−λ} λ^{i+j} / (i + j)!

Together: P(X = i, Y = j) = e^{−λ} (λp)^i (λq)^j / (i! j!) = [e^{−λp} (λp)^i / i!] · [e^{−λq} (λq)^j / j!]
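The thinning argument can also be checked by simulation; in this Python sketch (λ, p, seed, and sample size are arbitrary choices) X is generated by binomial splitting of Z:

```python
import numpy as np

rng = np.random.default_rng(2)
lam, p = 4.0, 0.3
z = rng.poisson(lam, size=1_000_000)  # visitors entering the bar
x = rng.binomial(z, p)                # male visitors: binomial split of Z
y = z - x                             # female visitors

print("E(X), Var(X) vs p*lam:    ", x.mean(), x.var(), p * lam)
print("E(Y), Var(Y) vs (1-p)*lam:", y.mean(), y.var(), (1 - p) * lam)
print("empirical Cov(X, Y), ~0:  ", np.cov(x, y)[0, 1])
```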


Example: Two dice

X, Y ... uniformly distributed on {1, . . . , 6}
Due to independence we have p(x, y) = pX(x) pY(y) = 1/36

Distribution function:
FX(x) = FY(x) = i/6 for i ≤ x < i + 1, i = 1, . . . , 6

F(x, y) = FX(x) FY(y) = ij/36 for i ≤ x < i + 1 and j ≤ y < j + 1

Which distribution does X + Y have?

P(X + Y = 2) = p(1, 1) = 1/36
P(X + Y = 3) = p(1, 2) + p(2, 1) = 2/36
P(X + Y = 4) = p(1, 3) + p(2, 2) + p(3, 1) = 3/36

P (X + Y = k) = p(1, k − 1) + p(2, k − 2) + · · ·+ p(k − 1, 1)
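This convolution is easy to evaluate exactly, e.g. with the following Python sketch using exact fractions:

```python
from fractions import Fraction

die = {k: Fraction(1, 6) for k in range(1, 7)}  # pmf of one fair die

sum_pmf = {}
for x, px in die.items():
    for y, py in die.items():
        sum_pmf[x + y] = sum_pmf.get(x + y, Fraction(0)) + px * py

for k in range(2, 13):
    print(k, sum_pmf[k])  # 1/36, 2/36, ..., 6/36, ..., 1/36
```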


Sum of two independent random variables

Sum of random variables itself a random variable

Computation of distribution via convolution

Discrete random variables:

P(X + Y = k) = Σ_{x+y=k} pX(x) pY(y) = Σ_{y∈ΩY} pX(k − y) pY(y)

Continuous random variables:

fX+Y(x) = ∫_{y=−∞}^{∞} fX(x − y) fY(y) dy

Exercise: X ∼ P(λ1) and Y ∼ P(λ2) independent

⇒ X + Y ∼ P(λ1 + λ2)
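A numerical check of the exercise's claim (λ1, λ2 and the range of k are arbitrary choices): the discrete convolution of the two Poisson probability functions is compared with the probability function of P(λ1 + λ2).

```python
from math import exp, factorial

def pois(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

lam1, lam2 = 1.5, 2.5
for k in range(6):
    conv = sum(pois(j, lam1) * pois(k - j, lam2) for j in range(k + 1))
    print(k, round(conv, 6), round(pois(k, lam1 + lam2), 6))  # should agree
```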


Example of convolution: Continuous case

X, Y independent, uniformly distributed on [0, 1],
i.e. f(x, y) = 1 for (x, y) ∈ [0, 1] × [0, 1],
fX(x) = 1 for 0 ≤ x ≤ 1, fY(y) = 1 for 0 ≤ y ≤ 1

Computation of density of Z := X + Y

fZ(x) = ∫_{y=−∞}^{∞} fX(x − y) fY(y) dy

      = ∫_{y=0}^{x} dy = x,          0 < x ≤ 1

      = ∫_{y=x−1}^{1} dy = 2 − x,    1 < x ≤ 2

Reason: fY(y) = 1 if 0 ≤ y ≤ 1,

fX(x − y) = 1 if 0 ≤ x − y ≤ 1, i.e. x − 1 ≤ y ≤ x
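A Monte Carlo sketch comparing a crude density estimate of Z = X + Y with the triangular density just derived (seed, sample size, bin width, and evaluation points are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.uniform(size=1_000_000) + rng.uniform(size=1_000_000)  # Z = X + Y

h = 0.02  # bin width for the density estimate
for x0 in [0.25, 0.75, 1.25, 1.75]:
    est = np.mean(np.abs(z - x0) < h / 2) / h
    exact = x0 if x0 <= 1 else 2 - x0  # triangular density
    print(f"fZ({x0}): MC = {est:.3f}, exact = {exact:.3f}")
```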


Another example of convolution

X, Y indep., Γ-distributed with parameters t1, t2 and the same λ:

fX(x) = λe^{−λx}(λx)^{t1−1} / Γ(t1),  fY(y) = λe^{−λy}(λy)^{t2−1} / Γ(t2),  x, y ≥ 0

fZ(x) = ∫_{y=−∞}^{∞} fX(x − y) fY(y) dy

      = ∫_{y=0}^{x} [λe^{−λ(x−y)}(λ(x − y))^{t1−1} / Γ(t1)] · [λe^{−λy}(λy)^{t2−1} / Γ(t2)] dy

      = (λ^{t1+t2} e^{−λx} / (Γ(t1)Γ(t2))) ∫_{y=0}^{x} (x − y)^{t1−1} y^{t2−1} dy

Substituting y = xz, dy = x dz, the remaining integral becomes x^{t1+t2−1} ∫_{z=0}^{1} (1 − z)^{t1−1} z^{t2−1} dz = x^{t1+t2−1} Γ(t1)Γ(t2)/Γ(t1 + t2), so

fZ(x) = λe^{−λx}(λx)^{t1+t2−1} / Γ(t1 + t2),  i.e. Z ∼ Γ(t1 + t2, λ)
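A quick simulation check of Z ∼ Γ(t1 + t2, λ) via its mean and variance (parameter values, seed, and sample size are arbitrary; note numpy's gamma sampler takes shape and scale = 1/λ):

```python
import numpy as np

rng = np.random.default_rng(4)
t1, t2, lam = 2.0, 3.5, 1.5
z = rng.gamma(t1, 1 / lam, 1_000_000) + rng.gamma(t2, 1 / lam, 1_000_000)

# Gamma(t1 + t2, lam) has mean (t1 + t2)/lam and variance (t1 + t2)/lam^2
print("mean:", z.mean(), "vs", (t1 + t2) / lam)
print("var: ", z.var(), "vs", (t1 + t2) / lam ** 2)
```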


4.3 Transformations

As in the univariate case, one also considers transformations of random variables in the multivariate case

We restrict ourselves to the bivariate case

X1 and X2 jointly distributed with density fX1,X2

Y1 = g1(X1, X2) and Y2 = g2(X1, X2)

Major question: What is the joint density fY1,Y2?


Density of Transformation

Assumptions:

• y1 = g1(x1, x2) and y2 = g2(x1, x2) can be uniquely solved for x1 and x2, say x1 = h1(y1, y2) and x2 = h2(y1, y2)

• g1 and g2 are C¹ such that

  J(x1, x2) = | ∂g1/∂x1   ∂g1/∂x2 |
              | ∂g2/∂x1   ∂g2/∂x2 |  ≠ 0

Under these conditions we have

fY1,Y2(y1, y2) = fX1,X2(x1, x2) |J(x1, x2)|^{−1}, evaluated at x1 = h1(y1, y2), x2 = h2(y1, y2)

Proof: Calculus

Note the similarity to the one-dimensional case


Examples

• Sheldon Ross: Example 7a

Y1 = X1 + X2

Y2 = X1 −X2

• Sheldon Ross: Example 7c

X and Y independent gamma random variables

U = X + Y

V = X/(X + Y )
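For Example 7a the computation can be sketched symbolically; this Python/sympy snippet (sympy assumed available) computes the Jacobian determinant and the inverse map:

```python
import sympy as sp

x1, x2, y1, y2 = sp.symbols("x1 x2 y1 y2")

# Example 7a: y1 = x1 + x2, y2 = x1 - x2
J = sp.Matrix([x1 + x2, x1 - x2]).jacobian([x1, x2])
print(sp.det(J))  # -2, hence |J| = 2 and fY1,Y2 = fX1,X2 / 2

# inverse map: x1 = (y1 + y2)/2, x2 = (y1 - y2)/2
print(sp.solve([sp.Eq(y1, x1 + x2), sp.Eq(y2, x1 - x2)], [x1, x2]))
```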


Expectation of bivariate RV

X and Y discrete with joint probability p(x, y).

Like in the one-dimensional case:

E(g(X, Y)) = Σ_{x∈ΩX} Σ_{y∈ΩY} g(x, y) p(x, y)

Exercise:

Let X and Y be the eyes of two fair dice, thrown independently

Compute the expectation of the difference |X − Y|
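The exercise can be solved by direct enumeration over the 36 equally likely outcomes; a short Python sketch with exact fractions:

```python
from fractions import Fraction

# E|X - Y| by enumeration over all 36 outcomes
e = sum(Fraction(abs(i - j), 36) for i in range(1, 7) for j in range(1, 7))
print(e, float(e))  # 35/18 ~ 1.944
```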


Expectation of bivariate RV

X and Y continuous with joint density f(x, y).

Like in the one-dimensional case:

E(g(X, Y)) = ∫_{x∈ℝ} ∫_{y∈ℝ} g(x, y) f(x, y) dy dx

Exercise:

An accident occurs at point X on a road of length L. At that time an ambulance is at point Y. Assuming both X and Y are uniformly distributed on [0, L] and independent, calculate the average distance between them. This is again E(|X − Y|)
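A Monte Carlo sketch of this exercise (L, seed, and sample size are arbitrary choices); evaluating the double integral by hand gives the closed form L/3 for comparison:

```python
import numpy as np

rng = np.random.default_rng(5)
L = 10.0
x = rng.uniform(0, L, 1_000_000)  # accident location
y = rng.uniform(0, L, 1_000_000)  # ambulance location
print(np.mean(np.abs(x - y)), "vs L/3 =", L / 3)
```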


Expectation of the sum of two RV

X and Y discrete with joint probability p(x, y)

For g(x, y) = x + y we obtain

E(X + Y) = Σ_{x∈ΩX} Σ_{y∈ΩY} (x + y) p(x, y) = E(X) + E(Y)

Analogous in the continuous case:

E(X + Y) = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} (x + y) f(x, y) dx dy = E(X) + E(Y)

Be aware: such additivity does not hold for variances in general!


Expectation of functions under independence

If X and Y are independent random variables, then for any functions h and g we obtain

E (g(X)h(Y )) = E (g(X)) E (h(Y ))

Proof:

E(g(X)h(Y)) = ∫∫ g(x)h(y) f(x, y) dx dy

            = ∫∫ g(x)h(y) fX(x) fY(y) dx dy

            = ∫ g(x) fX(x) dx · ∫ h(y) fY(y) dy

            = E(g(X)) E(h(Y))

Second equality uses independence


Expectation of random samples

X1, . . . , Xn i.i.d. like X, (independent, identically distributed)

Definition: X̄ := (1/n) Σ_{i=1}^{n} Xi

For E(X) = µ and Var(X) = σ² we obtain:

E(X̄) = µ,  Var(X̄) = σ²/n

Proof:

E(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} E(Xi) = nµ, hence E(X̄) = µ

Var(X̄) = E((1/n) Σ_{i=1}^{n} Xi − µ)² = (1/n²) E(Σ_{i=1}^{n} (Xi − µ))²

        = (1/n²) E(Σ_{i=1}^{n} (Xi − µ)²) = σ²/n

The cross terms vanish in the second-to-last equality due to the independence of the Xi


4.4 Covariance and Correlation

Describe the relation between two random variables

Definition Covariance:

Cov(X, Y) = E[(X − E(X))(Y − E(Y))]

Usual notation: σXY := Cov (X,Y )

Just like for variances we have

σXY = E(XY )− E(X)E(Y )

Definition of correlation:

ρ(X, Y) := σXY / (σX σY)


Example Correlation

[Four scatter plots of bivariate samples illustrating correlations ρ = 0.9, ρ = −0.6, ρ = 0.3 and ρ = 0.0]


Example Covariance

Discrete bivariate distribution (ΩX = ΩY = {0, 1, 2, 3}) given by

 i\j      0       1      2      3      pX
  0     1/20    4/20   3/20   2/20   10/20
  1     3/20    2/20   2/20    0      7/20
  2     1/20    1/20    0      0      2/20
  3     1/20     0      0      0      1/20
 pY     6/20    7/20   5/20   2/20   20/20

Compute Cov (X,Y )

Solution: Cov(X, Y) = E(XY) − E(X)E(Y) = 8/20 − (14/20)·(23/20) = −162/400
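The computation of E(XY), E(X) and E(Y) from the table can be reproduced in a few lines of Python with exact fractions:

```python
from fractions import Fraction

# joint probabilities p(i, j) from the table, as multiples of 1/20
p = {(0, 0): 1, (0, 1): 4, (0, 2): 3, (0, 3): 2,
     (1, 0): 3, (1, 1): 2, (1, 2): 2,
     (2, 0): 1, (2, 1): 1,
     (3, 0): 1}

exy = sum(i * j * Fraction(c, 20) for (i, j), c in p.items())  # 8/20
ex  = sum(i * Fraction(c, 20) for (i, j), c in p.items())      # 14/20
ey  = sum(j * Fraction(c, 20) for (i, j), c in p.items())      # 23/20
print(exy - ex * ey)  # -81/200 = -162/400
```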


Example 2: Covariance

Compute covariance between X and Y for

f(x, y) = 24xy for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, x + y ≤ 1, and f(x, y) = 0 else

First compute the CDF of X:

P(X ≤ a) = ∫_{y=0}^{1−a} ∫_{x=0}^{a} 24xy dx dy + ∫_{y=1−a}^{1} ∫_{x=0}^{1−y} 24xy dx dy

         = 6(1 − a)²a² + 1 − 6(1 − a)² + 8(1 − a)³ − 3(1 − a)⁴

and by differentiation

fX(x) = 12(1 − x)²x − 12(1 − x)x² + 12(1 − x) − 24(1 − x)² + 12(1 − x)³

      = 12(1 − x)²x


Example 2: Covariance continued

Because of symmetry we have fY (x) = fX(x) and we get

E(X) = E(Y) = ∫_{x=0}^{1} 12(1 − x)²x² dx = 2/5

Furthermore

E(XY) = ∫_{y=0}^{1} ∫_{x=0}^{1−y} 24x²y² dx dy = 2/15

and finally

Cov(X, Y) = 2/15 − (2/5)·(2/5) = (10 − 12)/75 = −2/75
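The three integrals can be verified symbolically; a Python/sympy sketch (sympy assumed available; integrating over x first matches the triangle bounds):

```python
import sympy as sp

x, y = sp.symbols("x y", nonnegative=True)
f = 24 * x * y  # density on the triangle 0 <= x, 0 <= y, x + y <= 1

ex  = sp.integrate(x * f,     (x, 0, 1 - y), (y, 0, 1))  # 2/5
exy = sp.integrate(x * y * f, (x, 0, 1 - y), (y, 0, 1))  # 2/15
print(exy - ex ** 2)  # -2/75
```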


Covariance for independent RV

X and Y independent ⇒ σXY = 0

follows immediately from σXY = E(XY )− E(X)E(Y ) and

E(XY) = Σ_x Σ_y x·y·p(x, y) = Σ_x x·pX(x) · Σ_y y·pY(y)

Converse not true:

X uniformly distributed on {−1, 0, 1} and Y = 0 if X ≠ 0, Y = 1 if X = 0

E(X) = 0
XY = 0 ⇒ E(XY) = 0

thus Cov(X, Y) = 0, although X and Y are not independent:

e.g. P(X = 1, Y = 0) = P(X = 1) = 1/3, but P(X = 1)P(Y = 0) = (1/3)·(2/3) = 2/9


Properties of Covariance

Obviously

Cov (X, Y ) = Cov (Y,X), and Cov (X, X) = Var (X)

Covariance is a bilinear form:

Cov (aX, Y ) = a Cov (X,Y ), a ∈ R

and

Cov(Σ_{i=1}^{n} Xi, Σ_{j=1}^{m} Yj) = Σ_{i=1}^{n} Σ_{j=1}^{m} Cov(Xi, Yj)

Proof by simple computation . . .


Variance of a sum of RV

Due to the properties shown above

Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Σ_{j=1}^{n} Cov(Xi, Xj)

                    = Σ_{i=1}^{n} Var(Xi) + Σ_{i=1}^{n} Σ_{j≠i} Cov(Xi, Xj)

Extreme cases:

• independent RV: Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Var(Xi)

• X1 = X2 = · · · = Xn: Var(Σ_{i=1}^{n} Xi) = n² Var(X1)


Correlation

Definition: ρ(X, Y) := σXY / (σX σY)

We have:

−1 ≤ ρ(X, Y ) ≤ 1

Proof:

0 ≤ Var(X/σX + Y/σY) = Var(X)/σX² + Var(Y)/σY² + 2 Cov(X, Y)/(σX σY) = 2[1 + ρ(X, Y)]

0 ≤ Var(X/σX − Y/σY) = Var(X)/σX² + Var(Y)/σY² − 2 Cov(X, Y)/(σX σY) = 2[1 − ρ(X, Y)]


Exercise Correlation

Let X and Y be independently uniform on [0, 1]

Compute correlation between X and Z for

1. Z = X + Y

2. Z = X2 + Y 2

3. Z = (X + Y )2
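One way to explore the exercise before computing by hand: a Monte Carlo sketch estimating the three correlations (seed and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(size=1_000_000)
y = rng.uniform(size=1_000_000)

for name, z in [("X + Y", x + y),
                ("X^2 + Y^2", x ** 2 + y ** 2),
                ("(X + Y)^2", (x + y) ** 2)]:
    print(name, np.corrcoef(x, z)[0, 1])
```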


4.5 Conditional Distribution

Conditional probability for two events E and F:

P(E|F) = P(EF) / P(F)

Corresponding definition for random variables X and Y

Discrete: pX|Y(x|y) := P(X = x|Y = y) = p(x, y) / pY(y)

Exercise: Let p(x, y) be given by

p(0, 0) = 0.4, p(0, 1) = 0.2, p(1, 0) = 0.1, p(1, 1) = 0.3,

Compute conditional probability function of X for Y = 1


Discrete conditional distribution

Conditional CDF:

FX|Y(x|y) := P(X ≤ x|Y = y) = Σ_{k≤x} pX|Y(k|y)

If X and Y are independent, then pX|Y(x|y) = pX(x)

Proof: Easy computation

Exercise: Let X ∼ P(λ1) and Y ∼ P(λ2) be independent.

Compute conditional distribution of X, for X + Y = n

P(X = k|X + Y = n) = P(X = k)P(Y = n − k) / P(X + Y = n),

X + Y ∼ P(λ1 + λ2) ⇒ X|(X + Y = n) ∼ B(n, λ1/(λ1 + λ2))
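A numerical check of this result (λ1, λ2 and n are arbitrary choices): the conditional probabilities are compared with the B(n, λ1/(λ1 + λ2)) probability function.

```python
from math import comb, exp, factorial

def pois(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

lam1, lam2, n = 2.0, 3.0, 6
q = lam1 / (lam1 + lam2)  # success probability of the binomial

for k in range(n + 1):
    cond  = pois(k, lam1) * pois(n - k, lam2) / pois(n, lam1 + lam2)
    binom = comb(n, k) * q ** k * (1 - q) ** (n - k)
    print(k, round(cond, 6), round(binom, 6))  # should agree
```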


Continuous conditional distribution

Continuous: fX|Y(x|y) := f(x, y) / fY(y) for fY(y) > 0

The definition in the continuous case can be motivated by the discrete case (probabilities of small neighborhoods of x and y)

Computation of conditional probabilities:

P(X ∈ A|Y = y) = ∫_A fX|Y(x|y) dx

Conditional CDF:

FX|Y(a|y) := P(X ∈ (−∞, a]|Y = y) = ∫_{x=−∞}^{a} fX|Y(x|y) dx


Example

Joint density of X and Y given by

f(x, y) = c·x(2 − x − y) for x ∈ [0, 1], y ∈ [0, 1], and f(x, y) = 0 otherwise.

Compute c, fX|Y (x|y) and P (X < 1/2|Y = 1/3)

Solution: fY(y) = c ∫_{x=0}^{1} x(2 − x − y) dx = c(2/3 − y/2)

1 = ∫_{y=0}^{1} fY(y) dy = c(2/3 − 1/4) ⇒ c = 12/5

fX|Y(x|y) = f(x, y)/fY(y) = x(2 − x − y)/(2/3 − y/2) = 6x(2 − x − y)/(4 − 3y)

P(X < 1/2|Y = 1/3) = ∫_{x=0}^{1/2} 6x(2 − x − 1/3)/(4 − 3·(1/3)) dx = · · · = 1/3


Conditional expectation and variance

Computation in continuous case with conditional density:

E(X|Y = y) = ∫_{x=−∞}^{∞} x fX|Y(x|y) dx

Var(X|Y = y) = ∫_{x=−∞}^{∞} (x − E(X|Y = y))² fX|Y(x|y) dx

Example continued:

E(X|Y = y) = ∫_{x=0}^{1} 6x²(2 − x − y)/(4 − 3y) dx = (5/2 − 2y)/(4 − 3y)

Specifically: E(X|Y = 1/3) = (5/2 − 2/3)/3 = 11/18

Var (X|Y = y) = . . .


Conditional expectation and variance

Computation in discrete case with conditional probability function:

E(X|Y = y) = Σ_{x∈ΩX} x pX|Y(x|y)

Var(X|Y = y) = Σ_{x∈ΩX} (x − E(X|Y = y))² pX|Y(x|y)

Exercise: Compute expectation and variance for X given Y = j

 i\j      0       1      2      3      pX
  0     1/20    4/20   3/20   2/20   10/20
  1     3/20    2/20   2/20    0      7/20
  2     1/20    1/20    0      0      2/20
  3     1/20     0      0      0      1/20
 pY     6/20    7/20   5/20   2/20   20/20


Computation of expectation by conditioning

E(X|Y = y) can be seen as a function of y, and therefore the expectation of this function can be computed

It holds: E(X) = E(E(X|Y))

Proof: (for the continuous case)

E(E(X|Y)) = ∫_{y=−∞}^{∞} E(X|Y = y) fY(y) dy

          = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} x fX|Y(x|y) fY(y) dx dy

          = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{∞} x (f(x, y)/fY(y)) fY(y) dx dy = E(X)

Verify formula for previous examples


The conditional variance formula

Var (X) = E(Var (X|Y )) + Var (E(X|Y ))

Proof: From Var(X|Y) = E(X²|Y) − (E(X|Y))² we get

E(Var(X|Y)) = E(E(X²|Y)) − E((E(X|Y))²) = E(X²) − E((E(X|Y))²)

On the other hand

Var(E(X|Y)) = E((E(X|Y))²) − (E(E(X|Y)))² = E((E(X|Y))²) − E(X)²

The sum of both expressions gives the result.

Formula important in the theory of linear regression!


4.6 Bivariate normal distribution

Univariate normal distribution: f(x) = (1/(√(2π) σ)) e^{−(x−µ)²/(2σ²)}

Standard normal distribution: φ(x) = (1/√(2π)) e^{−x²/2}

Let X1 and X2 be independent, both normally distributed N(µi, σi²), i = 1, 2

⇒ f(x1, x2) = (1/(2π σ1σ2)) e^{−(x1−µ1)²/(2σ1²) − (x2−µ2)²/(2σ2²)}

            = (1/(2π |Σ|^{1/2})) e^{−(x−µ)ᵀ Σ⁻¹ (x−µ)/2}

where x = (x1, x2)ᵀ, µ = (µ1, µ2)ᵀ, Σ = | σ1²   0   |
                                        | 0     σ2² |


Density function in general

X = (X1, X2) bivariate normal distributed if the joint density has the form

f(x) = (1/(2π |Σ|^{1/2})) e^{−(x−µ)ᵀ Σ⁻¹ (x−µ)/2}

Covariance matrix: Σ = | σ1²   σ12 |
                       | σ12   σ2² |

Notation: ρ := σ12/(σ1σ2)

• |Σ| = σ1²σ2² − σ12² = σ1²σ2²(1 − ρ²)

• Σ⁻¹ = (1/(σ1²σ2²(1 − ρ²))) | σ2²      −ρσ1σ2 |
                             | −ρσ1σ2   σ1²    |


Density in general

Finally we obtain

f(x1, x2) = (1/(2πσ1σ2√(1 − ρ²))) exp{−(z1² − 2ρz1z2 + z2²)/(2(1 − ρ²))}

where z1 = (x1 − µ1)/σ1 and z2 = (x2 − µ2)/σ2 (compare standardization)

Notation suggests that µi and σi² are the expectation and variance of the marginals Xi, and that ρ is the correlation between X1 and X2

Important formula:

f(x1, x2) = (1/(√(2π)σ1)) e^{−z1²/2} · (1/(√(2π(1 − ρ²))σ2)) e^{−(ρz1−z2)²/(2(1 − ρ²))}

Completion of square in the exponent


Bivariate normal distribution plot

Marginals standard normal distributed N(0, 1), ρ = 0 [surface plot of the joint density]


Example density

Let (X, Y) be bivariate normally distributed with µ1 = µ2 = 0, σ1 = σ2 = 1 and ρ = 1/2; compute the joint density

Solution: µ = (0, 0)ᵀ, Σ = | 1     1/2 |
                           | 1/2   1   |

|Σ| = 1 − 1/4 = 3/4,  Σ⁻¹ = (4/3) | 1      −1/2 |
                                  | −1/2   1    |

(x, y) Σ⁻¹ (x, y)ᵀ = (2/3)(x, y)·(2x − y, −x + 2y)ᵀ = (4/3)(x² − xy + y²)

f(x, y) = (1/(√3 π)) e^{−(2/3)(x² − xy + y²)}

Density after completion of square in exponent:

f(x, y) = (1/√(2π)) e^{−x²/2} · (1/√(2π·3/4)) e^{−(y−x/2)²/(2·3/4)}
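The closed form can be checked against a library implementation; a Python sketch using scipy.stats.multivariate_normal (scipy assumed available; the evaluation points are arbitrary):

```python
import numpy as np
from scipy.stats import multivariate_normal

rv = multivariate_normal(mean=[0, 0], cov=[[1, 0.5], [0.5, 1]])

def f_closed(x, y):
    return np.exp(-2 / 3 * (x ** 2 - x * y + y ** 2)) / (np.sqrt(3) * np.pi)

for x, y in [(0.0, 0.0), (1.0, -0.5), (0.3, 2.0)]:
    print(rv.pdf([x, y]), f_closed(x, y))  # should agree
```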


Example continued

f(x, y) = (1/√(2π)) e^{−x²/2} · (1/√(2π·3/4)) e^{−(y−x/2)²/(2·3/4)}

The joint density is the product of a standard normal (in x) and a normal distribution (in y) with mean x/2 and variance 3/4.

Compute density of X:

fX(x) = (1/√(2π)) e^{−x²/2} ∫_{y=−∞}^{∞} (1/√(2π·3/4)) e^{−(y−x/2)²/(2·3/4)} dy = (1/√(2π)) e^{−x²/2}

fX(x) is density of a standard normal distribution

The integral equals 1, because we integrate a density function!


Interpretation of µi, σi² and ρ

In general we have for a bivariate normal distribution

1. X1 ∼ N(µ1, σ1²) and X2 ∼ N(µ2, σ2²)

2. Correlation coefficient ρ(X1, X2) = σ12/(σ1σ2)

Proof: 1. Use the formula with completed square in the exponent and integrate:

fX1(x1) = (1/(√(2π)σ1)) e^{−z1²/2} ∫_{x2=−∞}^{∞} (1/(√(2π(1 − ρ²))σ2)) e^{−(ρz1−z2)²/(2(1 − ρ²))} dx2

        = (1/(√(2π)σ1)) e^{−z1²/2} ∫_{s=−∞}^{∞} (1/√(2π)) e^{−(ρz1/√(1 − ρ²) − s)²/2} ds = (1/(√(2π)σ1)) e^{−z1²/2}

with the substitution s ← z2/√(1 − ρ²) = (x2 − µ2)/(√(1 − ρ²) σ2)


Proof continued

2. Again the formula with completed square in the exponent, and substitutions z1 ← (x1 − µ1)/σ1, z2 ← (x2 − µ2)/σ2:

Cov(X1, X2) = ∫_{x1=−∞}^{∞} ∫_{x2=−∞}^{∞} (x1 − µ1)(x2 − µ2) f(x1, x2) dx2 dx1

            = ∫_{x1=−∞}^{∞} ((x1 − µ1)/(√(2π)σ1)) e^{−z1²/2} ∫_{x2=−∞}^{∞} ((x2 − µ2)/(√(2π(1 − ρ²))σ2)) e^{−(ρz1−z2)²/(2(1 − ρ²))} dx2 dx1

            = ∫_{z1} z1 φ(z1) [∫_{z2} (z2/√(1 − ρ²)) φ((ρz1 − z2)/√(1 − ρ²)) σ2 dz2] σ1 dz1

            = σ1σ2 ∫_{z1} z1 φ(z1) ρz1 dz1 = σ1σ2 ρ = σ12


Conditional distribution

Interpretation of the formula

f(x1, x2) = (1/(√(2π)σ1)) e^{−z1²/2} · (1/(√(2π(1 − ρ²))σ2)) e^{−(ρz1−z2)²/(2(1 − ρ²))}

f(x1, x2) = f1(x1)f2|1(x2|x1)

From (ρz1 − z2)²/(1 − ρ²) = (µ2 + σ2ρz1 − x2)²/(σ2²(1 − ρ²)) we conclude:

we conclude:

The conditional distribution is again a normal distribution, with µ2|1 = µ2 + ρ(x1 − µ1)σ2/σ1 and σ²2|1 = σ2²(1 − ρ²)

For the bivariate normal distribution: ρ = 0 ⇒ independence

For general distributions this implication does not hold!


Sum of bivariate normal distributions

Let X1, X2 be bivariate normal with µ1, µ2, σ1², σ2², σ12

Then the random variable Z = X1 + X2 is again normal, with

X1 + X2 ∼ N(µ1 + µ2, σ1² + σ2² + 2σ12)

Proof: For the density of the sum we have

fZ(z) = ∫_{x2=−∞}^{∞} f(z − x2, x2) dx2

Get the result again by completion of square (lengthy calculation)

Intuition: the mean and variance of Z are obtained from the formulas for general random variables
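A simulation sketch of this result (the values of µ, Σ, the seed, and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(7)
mu = [1.0, -2.0]
Sigma = [[2.0, 0.6], [0.6, 1.0]]  # sigma1^2 = 2, sigma2^2 = 1, sigma12 = 0.6

xs = rng.multivariate_normal(mu, Sigma, size=1_000_000)
z = xs[:, 0] + xs[:, 1]

print("mean:", z.mean(), "vs", mu[0] + mu[1])       # mu1 + mu2
print("var: ", z.var(), "vs", 2.0 + 1.0 + 2 * 0.6)  # sigma1^2 + sigma2^2 + 2*sigma12
```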
