Chapter 4: Joint and Conditional Distributions
[email protected]
http://www.mysmu.edu/faculty/zlyang/
Yang Zhenlin
STAT306, Term II, 09/10
Chapter Contents
Joint Distribution
Special Joint Distributions:
Multinomial and Bivariate Normal
Covariance and Correlation Coefficient
Conditional Distribution
Conditional Expectation
Conditional Variance
Introduction
In many applications, more than one variable is needed to describe a quantity or phenomenon of interest, e.g.,
To describe the size of a man, one needs at least height (X) and weight (Y).
To describe a point in a rectangle, one needs X coordinate and Y coordinate.
In general, a set of k r.v.s corresponds to the same “unit”: the variables are defined on the same sample space and take values in a k-dimensional Euclidean space.
In this chapter, we focus mainly on the case of two r.v.s, and deal separately with two cases:
• both X and Y are discrete
• both X and Y are continuous
Joint Distributions
Definition 4.1. (Joint CDF) The joint cumulative distribution function of r.v.s X and Y is the function defined by
F(x, y) = P(X ≤ x, Y ≤ y).

Definition 4.2. Let X and Y be two discrete random variables defined on the same sample space. The joint probability mass function of X and Y is defined to be
p(x, y) = P(X = x, Y = y)
for all possible values of X and Y.
Definition 4.1 extends naturally to cases of more than two r.v.s; it applies to both discrete and continuous r.v.s.
Definition 4.2 extends directly to cases of more than two r.v.s.
Example 4.1. Xavier and Yvette are two real estate agents. Let X and Y denote the number of houses that Xavier and Yvette will sell next week, respectively. Suppose that there are only four houses for sale next week. The joint probability mass function and its graph are presented below. Find P(X ≥ 1, Y ≥ 1) and P(Y ≥ 1).
[Figure: three-dimensional bar chart of the joint pmf p(x, y) over x = 0, 1, 2 and y = 0, 1, 2.]
p(x, y)    x = 0    x = 1    x = 2
y = 0       .12      .42      .06
y = 1       .21      .06      .03
y = 2       .07      .02      .01
Answer: P(X ≥ 1, Y ≥ 1) = .06 + .03 + .02 + .01 = 0.12; P(Y ≥ 1) = .21 + .06 + .03 + .07 + .02 + .01 = 0.40.
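These sums can be checked mechanically; the Python sketch below (not part of the original slides) stores the table as a dictionary and sums over the relevant events.

```python
p = {(0, 0): .12, (1, 0): .42, (2, 0): .06,   # (x, y): p(x, y) from the table
     (0, 1): .21, (1, 1): .06, (2, 1): .03,
     (0, 2): .07, (1, 2): .02, (2, 2): .01}

print(sum(v for (x, y), v in p.items() if x >= 1 and y >= 1))  # ≈ 0.12
print(sum(v for (x, y), v in p.items() if y >= 1))             # ≈ 0.40
```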
Example 4.2. A bin contains 1000 flower seeds, of which 400 are red, 400 are white and 200 are pink. Ten seeds are selected at random without replacement. Let X be the number of red flower seeds and Y be the number of white flower seeds being selected.
(a) Find the joint pmf of X and Y. (b) Calculate P(X = 2, Y = 3) and P(X = Y).
Solution: (a) From the counting techniques in Chapter 1, we obtain
$$p(x, y) = \frac{\binom{400}{x}\binom{400}{y}\binom{200}{10-x-y}}{\binom{1000}{10}}, \qquad x \ge 0,\; y \ge 0,\; x + y \le 10.$$

(b) $$P(X = 2, Y = 3) = \frac{\binom{400}{2}\binom{400}{3}\binom{200}{5}}{\binom{1000}{10}} = 0.0081,$$

$$P(X = Y) = \sum_{i=0}^{5} P(X = i, Y = i) = \sum_{i=0}^{5} \frac{\binom{400}{i}\binom{400}{i}\binom{200}{10-2i}}{\binom{1000}{10}} = 0.0263.$$
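Both values can be reproduced with exact integer arithmetic; a minimal cross-check using Python's math.comb (an illustration, not part of the original slides):

```python
from math import comb

def p_joint(x, y):
    """Joint pmf of Example 4.2: x red and y white seeds among the 10 drawn."""
    if x < 0 or y < 0 or x + y > 10:
        return 0.0
    return comb(400, x) * comb(400, y) * comb(200, 10 - x - y) / comb(1000, 10)

print(round(p_joint(2, 3), 4))                         # 0.0081
print(round(sum(p_joint(i, i) for i in range(6)), 4))  # 0.0263
```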
A function p(x, y) is said to be the joint pmf of discrete r.v.s X and Y if and only if, for all possible values (x, y),
(i) p(x, y) ≥ 0, and (ii) $\sum_x \sum_y p(x, y) = 1.$
Example 4.3. Let the joint pmf of X and Y be given by
$$p(x, y) = \begin{cases} k(x^2 + y^2), & (x, y) \in \{(1,1),\, (1,2),\, (2,3),\, (3,3)\},\\ 0, & \text{otherwise.} \end{cases}$$
(a) Find the value of the constant k. (b) Calculate P(X > Y), P(X + Y ≤ 4), and P(Y ≥ X).
Solution:
(a) $1 = \sum_x \sum_y p(x, y) = k[(1^2 + 1^2) + (1^2 + 2^2) + (2^2 + 3^2) + (3^2 + 3^2)] = 38k$, so k = 1/38.
(b) P(X > Y) = 0, P(X + Y ≤ 4) = 7/38, and P(Y ≥ X) = 1.
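Enumerating the four support points reproduces k and all three probabilities; a short sketch (not from the slides):

```python
# Enumerate the support of Example 4.3 (illustration only).
support = [(1, 1), (1, 2), (2, 3), (3, 3)]
k = 1 / sum(x**2 + y**2 for x, y in support)              # k = 1/38
p = {(x, y): k * (x**2 + y**2) for x, y in support}

print(sum(v for (x, y), v in p.items() if x > y))         # P(X > Y) = 0
print(sum(v for (x, y), v in p.items() if x + y <= 4))    # P(X + Y <= 4) = 7/38
print(sum(v for (x, y), v in p.items() if y >= x))        # P(Y >= X) = 1
```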
Definition 4.3. A function f(x, y) is said to be the joint probability density function of the continuous r.v.s X and Y if the joint CDF of X and Y can be written as
$$F(x, y) = \int_{-\infty}^{x}\!\int_{-\infty}^{y} f(u, v)\, dv\, du, \qquad \text{for all } x \text{ and } y.$$
A function f(x, y) is said to be the joint pdf of continuous r.v.s X and Y if and only if, for all possible values (x, y),
(i) f(x, y) ≥ 0, and (ii) $\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1.$
Marginal pmf: $p_X(x) = \sum_y p(x, y); \qquad p_Y(y) = \sum_x p(x, y).$
Marginal pdf: $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy; \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx.$
In Example 4.3, the marginal pmfs of X and Y are given below:
x         1       2       3            y         1       2       3
p_X(x)   7/38   13/38   18/38          p_Y(y)   2/38    5/38   31/38
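In code, the marginals are just row and column sums of the joint pmf; a self-contained sketch (illustration only):

```python
from collections import defaultdict

p = {(1, 1): 2/38, (1, 2): 5/38, (2, 3): 13/38, (3, 3): 18/38}  # Example 4.3
pX, pY = defaultdict(float), defaultdict(float)
for (x, y), v in p.items():
    pX[x] += v            # p_X(x) = sum over y
    pY[y] += v            # p_Y(y) = sum over x

print({x: round(v * 38) for x, v in sorted(pX.items())})  # {1: 7, 2: 13, 3: 18}, in 38ths
print({y: round(v * 38) for y, v in sorted(pY.items())})  # {1: 2, 2: 5, 3: 31}, in 38ths
```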
Example 4.4. Let the joint pdf be given by
$$f(x, y) = \begin{cases} k\,x y^2, & 0 \le x \le y \le 1,\\ 0, & \text{otherwise.} \end{cases}$$
(a) Find the value of the constant k. (b) Find the marginal pdfs of X and Y. (c) Calculate P(X + Y < 1), P(2X < Y), and P(X = Y).
Solution: Some points to note:
Finding the constant k and the desired probabilities is a matter of double integration;
It is important to draw regions on which integrations are desired, so that the integration limits can be determined.
(a) $1 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x, y)\, dx\, dy = \int_0^1\!\int_0^y k\,x y^2\, dx\, dy = \int_0^1 \tfrac{1}{2}k\,y^4\, dy = \tfrac{k}{10}$, so k = 10.

[Figure (a): the support of f, the triangle 0 ≤ x ≤ y ≤ 1 above the line y = x in the unit square.]

(b) The marginal pdfs are
$f_X(x) = \int_x^1 10\,x y^2\, dy = \tfrac{10}{3}\,x(1 - x^3)$, 0 ≤ x ≤ 1;
$f_Y(y) = \int_0^y 10\,x y^2\, dx = 5y^4$, 0 ≤ y ≤ 1.

(c) $P(X + Y < 1) = \int_0^{0.5}\!\int_x^{1-x} 10\,x y^2\, dy\, dx = \int_0^{0.5} \tfrac{10}{3}\,x\big[(1 - x)^3 - x^3\big]\, dx = \int_0^{0.5} \tfrac{10}{3}\,(x - 3x^2 + 3x^3 - 2x^4)\, dx = 0.1146.$

[Figure (b): the integration region x ≤ y, x + y < 1, bounded by the lines X = Y and X + Y = 1.]
$P(2X < Y) = \int_0^{0.5}\!\int_{2x}^{1} 10\,x y^2\, dy\, dx = \int_0^{0.5} \tfrac{10}{3}\,x(1 - 8x^3)\, dx = 1/4.$

[Figure: the integration region 2x ≤ y, bounded by the lines 2X = Y and X = Y.]

Finally, $P(X = Y) = \int_0^1\!\int_x^x 10\,x y^2\, dy\, dx = 0$, since the line X = Y has zero area.
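Both non-trivial probabilities can be cross-checked numerically; a sketch using scipy.integrate.dblquad, which integrates the inner variable first (assumed available; not part of the original slides):

```python
from scipy.integrate import dblquad

f = lambda y, x: 10 * x * y**2     # joint pdf of Example 4.4; dblquad expects f(y, x)

# P(X + Y < 1): x runs over [0, 0.5], y over [x, 1 - x] (inside the support x <= y).
p1, _ = dblquad(f, 0, 0.5, lambda x: x, lambda x: 1 - x)
# P(2X < Y): x runs over [0, 0.5], y over [2x, 1].
p2, _ = dblquad(f, 0, 0.5, lambda x: 2 * x, lambda x: 1)
print(round(p1, 4), round(p2, 4))  # 0.1146 0.25
```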
Definition 4.4. Two random variables X and Y are said to be independent if and only if
P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y)
for all possible values (x, y) of (X, Y).
Note:
This definition states that X and Y are independent if and only if their joint CDF can be written as the product of their marginal CDFs, i.e., F(x, y) = F_X(x) F_Y(y).
When X and Y are both discrete, the independence condition can be written as P(X = x, Y = y) = P(X = x) P(Y = y) for all x and y, i.e., the joint pmf is the product of the marginal pmfs.
When X and Y are both continuous, the independence condition can be written as f(x, y) = f_X(x) f_Y(y), i.e., the joint pdf is the product of the marginal pdfs.
Definition 4.4 extends naturally to the case of more than two random variables.
Example 4.5. Stores A and B, which belong to the same owner, are located in two different towns. If the probability density function of the weekly profit of each store, in thousands of dollars, is given by
$$f(x) = \begin{cases} x/4, & 1 \le x \le 3,\\ 0, & \text{otherwise,} \end{cases}$$
and the profit of one store is independent of the other, what is the probability that next week one store makes at least $500 more than the other store?
Solution: Let X and Y denote, respectively, next week’s profits of stores A and B. The desired probability is
P(X ≥ Y + 1/2) + P(Y ≥ X + 1/2).
Since X and Y are independent and identically distributed, by symmetry,
P(X ≥ Y + 1/2) + P(Y ≥ X + 1/2) = 2 P(X ≥ Y + 1/2).
To calculate P(X ≥ Y + 1/2), we need the joint pdf of X and Y. Since X and Y are independent, we have
$$f(x, y) = f_X(x) f_Y(y) = \frac{xy}{16}, \qquad 1 \le x \le 3,\; 1 \le y \le 3,$$
and 0 otherwise.
To find P(X ≥ Y + 1/2), one needs to integrate f(x, y) over the region defined by the conditions 1 ≤ x ≤ 3, 1 ≤ y ≤ 3, and x ≥ y + 1/2:
$$2P(X \ge Y + 1/2) = 2\int_{3/2}^{3}\!\int_{1}^{x - 1/2} \frac{xy}{16}\, dy\, dx = \int_{3/2}^{3} \frac{x}{16}\left[(x - 1/2)^2 - 1\right] dx = \frac{1}{16}\int_{3/2}^{3} \left(x^3 - x^2 - \tfrac{3}{4}x\right) dx = 0.54.$$

[Figure: the integration region x ≥ y + 1/2 inside the square 1 ≤ x ≤ 3, 1 ≤ y ≤ 3, cut off by the line X = Y + 1/2.]
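As a numerical cross-check of the 0.54 (a sketch under the same independence assumption, not from the slides):

```python
from scipy.integrate import dblquad

f = lambda y, x: x * y / 16        # joint pdf of Example 4.5 on [1, 3] x [1, 3]
# P(X >= Y + 1/2): x over [1.5, 3], y over [1, x - 0.5]; double it by symmetry.
p, _ = dblquad(f, 1.5, 3, lambda x: 1, lambda x: x - 0.5)
print(round(2 * p, 2))             # 0.54
```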
Example 4.6. Prove that the two random variables X and Y with the following joint probability density function are not independent:
$$f(x, y) = \begin{cases} 8xy, & 0 \le x \le y \le 1,\\ 0, & \text{otherwise.} \end{cases}$$
Solution:
$f_X(x) = \int_x^1 8xy\, dy = 4x(1 - x^2)$, 0 ≤ x ≤ 1;
$f_Y(y) = \int_0^y 8xy\, dx = 4y^3$, 0 ≤ y ≤ 1.
Since f(x, y) ≠ f_X(x) f_Y(y), X and Y are NOT independent.
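A single point of the support is enough to exhibit the failure of factorization; a tiny illustrative check:

```python
# Compare f(x, y) with fX(x) * fY(y) at one support point, e.g. (0.2, 0.8).
x, y = 0.2, 0.8
f_joint = 8 * x * y                            # = 1.28
f_product = (4 * x * (1 - x**2)) * (4 * y**3)  # ≈ 1.573
print(f_joint, f_product)                      # unequal, so X, Y are not independent
```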
Special Joint Distributions
Certain special joint distributions such as multinomial and bivariate normal deserve some detailed attention.
Multinomial is a direct generalization of the binomial. An experiment has k possible outcomes with probabilities $\theta_1, \theta_2, \dots, \theta_k$. Let $X_i$ be the number of times that the ith outcome occurs among a total of n independent trials of such an experiment, i = 1, 2, …, k. Then the joint distribution of $X_1, X_2, \dots, X_k$ is called the Multinomial Distribution, with joint pmf of the following form:
$$p(x_1, x_2, \dots, x_k) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\; \theta_1^{x_1}\theta_2^{x_2}\cdots\theta_k^{x_k},$$
where $\theta_1 + \theta_2 + \cdots + \theta_k = 1$ and $x_1 + x_2 + \cdots + x_k = n$.
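For illustration, the pmf can be evaluated both from this formula and with scipy.stats.multinomial; the trial count n = 10 and probabilities 0.4, 0.4, 0.2 below are assumed example values, chosen to echo the seed proportions of Example 4.2:

```python
from math import factorial
from scipy.stats import multinomial

n, theta = 10, [0.4, 0.4, 0.2]     # assumed example parameters
x = [2, 3, 5]                      # one outcome vector with x1 + x2 + x3 = n

by_formula = factorial(n) // (factorial(2) * factorial(3) * factorial(5)) \
             * theta[0]**2 * theta[1]**3 * theta[2]**5
print(by_formula)                          # ≈ 0.00826
print(multinomial.pmf(x, n=n, p=theta))    # same value from scipy
```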
A Bivariate Normal distribution has the following joint pdf:
$$f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right\}.$$
[Figure: surface plots of the bivariate normal pdf with µ1 = µ2 = 0, σ1 = σ2 = 1, for ρ = 0.1 and ρ = 0.9.]
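The same density is available in scipy as multivariate_normal; a minimal sketch (not part of the slides) evaluating the peak height for the two plotted values of ρ:

```python
from scipy.stats import multivariate_normal

for rho in (0.1, 0.9):
    cov = [[1.0, rho], [rho, 1.0]]         # sigma1 = sigma2 = 1, means 0
    rv = multivariate_normal(mean=[0.0, 0.0], cov=cov)
    # Peak height at the mean is 1 / (2*pi*sqrt(1 - rho^2)): about 0.16 and 0.37.
    print(rho, rv.pdf([0.0, 0.0]))
```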
It can be shown that ρ is the correlation coefficient between X1 and X2. When ρ = 0, we have
$$f(x_1, x_2) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left\{-\frac{1}{2}\left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2\right\} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_2}\exp\left\{-\frac{1}{2}\left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2\right\} = f_{X_1}(x_1)\, f_{X_2}(x_2).$$
So, in this case, X1 and X2 are independent.
For two jointly normal random variables, if they are uncorrelated (i.e., their covariance is zero), then they are independent. This conclusion may not hold for random variables that are not jointly normal.
Covariance and Correlation Coefficient
Definition 4.5. The covariance between any two jointly distributed r.v.s X and Y, denoted by Cov(X, Y), is defined by
Cov(X, Y) = E[(X − µX)(Y − µY)] = E[XY] − µX µY,
where µX = E[X] and µY = E[Y].
Properties of Covariance:
For any two r.v.s X and Y, and constants a, b, c and d,
Cov(X, X) = Var(X)
Cov(X, Y) = Cov(Y, X)
Cov(aX+b, cY+d) = ac Cov(X, Y)
If X and Y are independent then Cov(X, Y) = 0.
Definition 4.6. The correlation coefficient between any two jointly distributed r.v.s X and Y, denoted by ρ(X, Y), is defined by
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}.$$
It measures the degree of linear association between X and Y, and takes values in [−1, 1].
Properties of Correlation Coefficient:
For any two r.v.s X and Y, and constants a, b, c and d,
ρ(aX + b, cY + d) = ρ(X, Y) if ac > 0, and
ρ(aX + b, cY + d) = −ρ(X, Y) if ac < 0.
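Definitions 4.5 and 4.6 are easy to exercise on the joint pmf of Example 4.1; the sketch below (illustration only, not from the slides) computes Cov(X, Y) and ρ(X, Y) by direct summation:

```python
from math import sqrt

p = {(0, 0): .12, (1, 0): .42, (2, 0): .06,   # joint pmf of Example 4.1
     (0, 1): .21, (1, 1): .06, (2, 1): .03,
     (0, 2): .07, (1, 2): .02, (2, 2): .01}

EX  = sum(x * v for (x, y), v in p.items())
EY  = sum(y * v for (x, y), v in p.items())
EXY = sum(x * y * v for (x, y), v in p.items())
VX  = sum(x**2 * v for (x, y), v in p.items()) - EX**2
VY  = sum(y**2 * v for (x, y), v in p.items()) - EY**2

cov = EXY - EX * EY
rho = cov / sqrt(VX * VY)
print(cov, rho)   # cov = -0.15, rho ≈ -0.35: a negative association
```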
Conditional Distributions
Among the most useful concepts in probability theory are conditional probability and conditional expectation, because:
In practice, some partial information is often available, and hence calculations of probabilities and expectations should be conditional upon the given information;
In calculating a desired probability or expectation it is often extremely useful to first “condition” on some appropriate random variables.
The concept of conditional probability, P(A|B) = P(A ∩ B)/P(B), can be extended directly to give a definition of the conditional distribution of X given Y = y, where X and Y are two r.v.s, discrete or continuous.
Definition 4.7. For two discrete r.v.s X and Y, the conditional pmf of X given Y = y is
$$p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{p(x, y)}{p_Y(y)},$$
where $p_Y(y) > 0$.
Clearly, when X is independent of Y, $p_{X|Y}(x \mid y) = p_X(x)$.
The conditional expectation of X given Y = y is defined as
$$E[X \mid Y = y] = \sum_x x\, p_{X|Y}(x \mid y).$$
Definition 4.8. For two continuous r.v.s X and Y, the conditional pdf of X given Y = y is
$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)},$$
where $f_Y(y) > 0$.
The conditional expectation of X given Y = y is defined as
$$E[X \mid Y = y] = \int_{-\infty}^{\infty} x\, f_{X|Y}(x \mid y)\, dx.$$
Example 4.7. Roll a fair die successively. Let X be the number of rolls until the first 4 appears, and Y the number of rolls until the first 5 appears.
(a) Find the conditional pmf of X given Y = 4.
(b) Calculate P(X > 2 | Y = 4).
(c) Calculate E[X | Y = 4].
Solution:
(a) $p_{X|Y}(x \mid 4) = \dfrac{p(x, 4)}{p_Y(4)}$, where
$p_Y(4) = P(Y = 4) = (5/6)^3 (1/6) = 125/1296$,
$p(1, 4) = P(X = 1, Y = 4) = (1/6)(5/6)^2 (1/6) = 25/1296$,
$p(2, 4) = P(X = 2, Y = 4) = (4/6)(1/6)(5/6)(1/6) = 20/1296$,
$p(3, 4) = P(X = 3, Y = 4) = (4/6)^2 (1/6)(1/6) = 16/1296$,
$p(4, 4) = P(X = 4, Y = 4) = 0$, and
$p(x, 4) = (4/6)^3 (1/6)(5/6)^{x-5}(1/6)$, for x = 5, 6, 7, ….
Therefore, $p_{X|Y}(1 \mid 4) = p(1, 4)/p_Y(4) = 25/125$,
$p_{X|Y}(2 \mid 4) = p(2, 4)/p_Y(4) = 20/125$,
$p_{X|Y}(3 \mid 4) = p(3, 4)/p_Y(4) = 16/125$,
$p_{X|Y}(4 \mid 4) = p(4, 4)/p_Y(4) = 0$, and
$p_{X|Y}(x \mid 4) = (4/5)^3 (5/6)^{x-5}(1/6)$, for x = 5, 6, 7, ….

(b) P(X > 2 | Y = 4) = 1 − P(X = 1 | Y = 4) − P(X = 2 | Y = 4) = 80/125.
(c) $E(X \mid Y = 4) = (1)\tfrac{25}{125} + (2)\tfrac{20}{125} + (3)\tfrac{16}{125} + (4)(0) + \sum_{x=5}^{\infty} x \left(\tfrac{4}{5}\right)^3 \left(\tfrac{5}{6}\right)^{x-5} \tfrac{1}{6}$
$= \tfrac{113}{125} + \left(\tfrac{4}{5}\right)^3 \sum_{y=1}^{\infty} (y + 4) \left(\tfrac{5}{6}\right)^{y-1} \tfrac{1}{6} = \tfrac{113}{125} + \left(\tfrac{4}{5}\right)^3 (6 + 4) = 6.024$
(in the above, y = x − 4).
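Truncating the infinite sum at a large cutoff reproduces the expectation numerically; a short illustrative check:

```python
# Conditional pmf of X given Y = 4, as derived above; tail truncated at x = 500.
pmf = {1: 25/125, 2: 20/125, 3: 16/125, 4: 0.0}
for x in range(5, 500):
    pmf[x] = (4/5)**3 * (5/6)**(x - 5) * (1/6)

print(round(sum(pmf.values()), 6))                   # ≈ 1.0 (sanity check)
print(round(sum(x * q for x, q in pmf.items()), 3))  # ≈ 6.024
```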
Example 4.8. The joint pdf of X and Y is given by
$$f(x, y) = \begin{cases} \tfrac{12}{5}\, x(2 - x - y), & 0 \le x \le 1,\; 0 \le y \le 1,\\ 0, & \text{otherwise.} \end{cases}$$
(a) Compute the conditional pdf of X given Y = y, where 0 ≤ y ≤ 1.
(b) Calculate P(X > 0.5 | Y = 0.5).
(c) Calculate E(X | Y = 0.5).
Solution:
(a) $f_{X|Y}(x \mid y) = \dfrac{f(x, y)}{f_Y(y)} = \dfrac{f(x, y)}{\int_0^1 f(x, y)\, dx} = \dfrac{x(2 - x - y)}{\int_0^1 x(2 - x - y)\, dx} = \dfrac{6x(2 - x - y)}{4 - 3y}$, for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.
(b) $P(X > 0.5 \mid Y = 0.5) = \int_{0.5}^{1} f_{X|Y}(x \mid 0.5)\, dx = \int_{0.5}^{1} \tfrac{6}{5}\, x(3 - 2x)\, dx = 0.65.$

(c) With $f_{X|Y}(x \mid 0.5) = \tfrac{6}{5}\, x(3 - 2x)$,
$E(X \mid Y = 0.5) = \int_0^1 x\, f_{X|Y}(x \mid 0.5)\, dx = \int_0^1 \tfrac{6}{5}\, x^2(3 - 2x)\, dx = 0.6.$
Definition 4.9. Let X and Y be jointly distributed r.v.s. The conditional variance of X given Y = y is given by
Var(X | Y = y) = E[(X − µX|Y)² | Y = y],
where µX|Y = E(X | Y = y).
Continuing Example 4.8, we now find Var(X | Y = 0.5):
$E(X^2 \mid Y = 0.5) = \int_0^1 x^2 f_{X|Y}(x \mid 0.5)\, dx = \int_0^1 \tfrac{6}{5}\, x^3(3 - 2x)\, dx = 0.42,$
$\mathrm{Var}(X \mid Y = 0.5) = E(X^2 \mid Y = 0.5) - [E(X \mid Y = 0.5)]^2 = 0.42 - 0.6^2 = 0.06.$
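All three numbers in Example 4.8 come from one-dimensional integrals of the conditional pdf; a sketch with scipy.integrate.quad (an illustration, not part of the original slides):

```python
from scipy.integrate import quad

f_cond = lambda x: (6/5) * x * (3 - 2 * x)   # f_{X|Y}(x | 0.5) on [0, 1]

prob, _ = quad(f_cond, 0.5, 1)                     # P(X > 0.5 | Y = 0.5)
EX, _   = quad(lambda x: x * f_cond(x), 0, 1)      # E(X | Y = 0.5)
EX2, _  = quad(lambda x: x * x * f_cond(x), 0, 1)  # E(X^2 | Y = 0.5)
print(round(prob, 2), round(EX, 2), round(EX2 - EX**2, 2))  # 0.65 0.6 0.06
```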