Chapter 4: Joint and Conditional Distributions
[email protected]
http://www.mysmu.edu/faculty/zlyang/
Yang Zhenlin
STAT306, Term II, 09/10
Chapter Contents
Joint Distribution
Special Joint Distributions:
Multinomial and Bivariate Normal
Covariance and Correlation Coefficient
Conditional Distribution
Conditional Expectation
Conditional Variance
Introduction
In many applications, more than one variable is needed to describe a quantity or phenomenon of interest, e.g.,
To describe the size of a man, one needs at least height (X) and weight (Y).
To describe a point in a rectangle, one needs X coordinate and Y coordinate.
In general, a set of k r.v.s corresponds to the same “unit”: the variables are defined on the same sample space and take values in a k-dimensional Euclidean space.
In this chapter, we focus mainly on the case of two r.v.s, and deal separately with two cases:
• both X and Y are discrete
• both X and Y are continuous
Joint Distributions
Definition 4.1. (Joint CDF) The joint cumulative distribution function of r.v.s X and Y is the function defined by
F(x, y) = P(X ≤ x, Y ≤ y).

Definition 4.2. Let X and Y be two discrete random variables defined on the same sample space. The joint probability mass function of X and Y is defined to be
p(x, y) = P(X = x, Y = y)
for all possible values of X and Y.
Definition 4.1 extends naturally to cases of more than two r.v.s; it applies to both discrete and continuous r.v.s.
Definition 4.2 extends directly to cases of more than two r.v.s.
Example 4.1. Xavier and Yvette are two real estate agents. Let X and Y denote the number of houses that Xavier and Yvette will sell next week, respectively. Suppose that there are only four houses for sale next week. The joint probability mass function and its graph are presented below. Find P(X ≥ 1, Y ≥ 1) and P(Y ≥ 1).
[Figure: three-dimensional bar chart of the joint pmf p(x, y) over x = 0, 1, 2 and y = 0, 1, 2.]
p(x, y)    x = 0    x = 1    x = 2
y = 0       .12      .42      .06
y = 1       .21      .06      .03
y = 2       .07      .02      .01
Answer: P(X ≥ 1, Y ≥ 1) = .06 + .03 + .02 + .01 = 0.12; P(Y ≥ 1) = .21 + .06 + .03 + .07 + .02 + .01 = 0.40.
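These sums can be checked mechanically; the Python sketch below (not part of the original slides) stores the table as a dictionary and sums over the relevant events.

```python
p = {(0, 0): .12, (1, 0): .42, (2, 0): .06,   # (x, y): p(x, y) from the table
     (0, 1): .21, (1, 1): .06, (2, 1): .03,
     (0, 2): .07, (1, 2): .02, (2, 2): .01}

print(sum(v for (x, y), v in p.items() if x >= 1 and y >= 1))  # ≈ 0.12
print(sum(v for (x, y), v in p.items() if y >= 1))             # ≈ 0.40
```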
Example 4.2. A bin contains 1000 flower seeds, of which 400 are red, 400 are white and 200 are pink. Ten seeds are selected at random without replacement. Let X be the number of red flower seeds and Y be the number of white flower seeds being selected.
(a) Find the joint pmf of X and Y. (b) Calculate P(X = 2, Y = 3) and P(X = Y).
Solution: (a) From the counting techniques in Chapter 1, we obtain
$$p(x, y) = \frac{\binom{400}{x}\binom{400}{y}\binom{200}{10-x-y}}{\binom{1000}{10}}, \qquad x \ge 0,\; y \ge 0,\; x + y \le 10.$$

(b) $$P(X = 2, Y = 3) = \frac{\binom{400}{2}\binom{400}{3}\binom{200}{5}}{\binom{1000}{10}} = 0.0081,$$

$$P(X = Y) = \sum_{i=0}^{5} P(X = i, Y = i) = \sum_{i=0}^{5} \frac{\binom{400}{i}\binom{400}{i}\binom{200}{10-2i}}{\binom{1000}{10}} = 0.0263.$$
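Both values can be reproduced with exact integer arithmetic; a minimal cross-check using Python's math.comb (an illustration, not part of the original slides):

```python
from math import comb

def p_joint(x, y):
    """Joint pmf of Example 4.2: x red and y white seeds among the 10 drawn."""
    if x < 0 or y < 0 or x + y > 10:
        return 0.0
    return comb(400, x) * comb(400, y) * comb(200, 10 - x - y) / comb(1000, 10)

print(round(p_joint(2, 3), 4))                         # 0.0081
print(round(sum(p_joint(i, i) for i in range(6)), 4))  # 0.0263
```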
A function p(x, y) is said to be the joint pmf of discrete r.v.s X and Y if and only if, for all possible values (x, y),
(i) p(x, y) ≥ 0, and (ii) $\sum_x \sum_y p(x, y) = 1.$
Example 4.3. Let the joint pmf of X and Y be given by
$$p(x, y) = \begin{cases} k(x^2 + y^2), & (x, y) \in \{(1,1),\, (1,2),\, (2,3),\, (3,3)\},\\ 0, & \text{otherwise.} \end{cases}$$
(a) Find the value of the constant k. (b) Calculate P(X > Y), P(X + Y ≤ 4), and P(Y ≥ X).
Solution:
(a) $1 = \sum_x \sum_y p(x, y) = k[(1^2 + 1^2) + (1^2 + 2^2) + (2^2 + 3^2) + (3^2 + 3^2)] = 38k$, so k = 1/38.
(b) P(X > Y) = 0, P(X + Y ≤ 4) = 7/38, and P(Y ≥ X) = 1.
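Enumerating the four support points reproduces k and all three probabilities; a short sketch (not from the slides):

```python
# Enumerate the support of Example 4.3 (illustration only).
support = [(1, 1), (1, 2), (2, 3), (3, 3)]
k = 1 / sum(x**2 + y**2 for x, y in support)              # k = 1/38
p = {(x, y): k * (x**2 + y**2) for x, y in support}

print(sum(v for (x, y), v in p.items() if x > y))         # P(X > Y) = 0
print(sum(v for (x, y), v in p.items() if x + y <= 4))    # P(X + Y <= 4) = 7/38
print(sum(v for (x, y), v in p.items() if y >= x))        # P(Y >= X) = 1
```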
Definition 4.3. A function f(x, y) is said to be the joint probability density function of the continuous r.v.s X and Y if the joint CDF of X and Y can be written as
$$F(x, y) = \int_{-\infty}^{x}\!\int_{-\infty}^{y} f(u, v)\, dv\, du, \qquad \text{for all } x \text{ and } y.$$
A function f(x, y) is said to be the joint pdf of continuous r.v.s X and Y if and only if, for all possible values (x, y),
(i) f(x, y) ≥ 0, and (ii) $\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1.$
Marginal pmf: $p_X(x) = \sum_y p(x, y); \qquad p_Y(y) = \sum_x p(x, y).$
Marginal pdf: $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy; \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx.$
In Example 4.3, the marginal pmfs of X and Y are given below:
x         1       2       3            y         1       2       3
p_X(x)   7/38   13/38   18/38          p_Y(y)   2/38    5/38   31/38
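In code, the marginals are just row and column sums of the joint pmf; a self-contained sketch (illustration only):

```python
from collections import defaultdict

p = {(1, 1): 2/38, (1, 2): 5/38, (2, 3): 13/38, (3, 3): 18/38}  # Example 4.3
pX, pY = defaultdict(float), defaultdict(float)
for (x, y), v in p.items():
    pX[x] += v            # p_X(x) = sum over y
    pY[y] += v            # p_Y(y) = sum over x

print({x: round(v * 38) for x, v in sorted(pX.items())})  # {1: 7, 2: 13, 3: 18}, in 38ths
print({y: round(v * 38) for y, v in sorted(pY.items())})  # {1: 2, 2: 5, 3: 31}, in 38ths
```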
Example 4.4. Let the joint pdf be given by
$$f(x, y) = \begin{cases} k\,x y^2, & 0 \le x \le y \le 1,\\ 0, & \text{otherwise.} \end{cases}$$
(a) Find the value of the constant k. (b) Find the marginal pdfs of X and Y. (c) Calculate P(X + Y < 1), P(2X < Y), and P(X = Y).
Solution: Some points to note:
Finding the constant k and the desired probabilities is a matter of double integration;
It is important to draw regions on which integrations are desired, so that the integration limits can be determined.
(a) $1 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x, y)\, dx\, dy = \int_0^1\!\int_0^y k\,x y^2\, dx\, dy = \int_0^1 \tfrac{1}{2}k\,y^4\, dy = \tfrac{k}{10}$, so k = 10.

[Figure (a): the support of f, the triangle 0 ≤ x ≤ y ≤ 1 above the line y = x in the unit square.]

(b) The marginal pdfs are
$f_X(x) = \int_x^1 10\,x y^2\, dy = \tfrac{10}{3}\,x(1 - x^3)$, 0 ≤ x ≤ 1;
$f_Y(y) = \int_0^y 10\,x y^2\, dx = 5y^4$, 0 ≤ y ≤ 1.

(c) $P(X + Y < 1) = \int_0^{0.5}\!\int_x^{1-x} 10\,x y^2\, dy\, dx = \int_0^{0.5} \tfrac{10}{3}\,x\big[(1 - x)^3 - x^3\big]\, dx = \int_0^{0.5} \tfrac{10}{3}\,(x - 3x^2 + 3x^3 - 2x^4)\, dx = 0.1146.$

[Figure (b): the integration region x ≤ y, x + y < 1, bounded by the lines X = Y and X + Y = 1.]
$P(2X < Y) = \int_0^{0.5}\!\int_{2x}^{1} 10\,x y^2\, dy\, dx = \int_0^{0.5} \tfrac{10}{3}\,x(1 - 8x^3)\, dx = 1/4.$

[Figure: the integration region 2x ≤ y, bounded by the lines 2X = Y and X = Y.]

Finally, $P(X = Y) = \int_0^1\!\int_x^x 10\,x y^2\, dy\, dx = 0$, since the line X = Y has zero area.
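Both non-trivial probabilities can be cross-checked numerically; a sketch using scipy.integrate.dblquad, which integrates the inner variable first (assumed available; not part of the original slides):

```python
from scipy.integrate import dblquad

f = lambda y, x: 10 * x * y**2     # joint pdf of Example 4.4; dblquad expects f(y, x)

# P(X + Y < 1): x runs over [0, 0.5], y over [x, 1 - x] (inside the support x <= y).
p1, _ = dblquad(f, 0, 0.5, lambda x: x, lambda x: 1 - x)
# P(2X < Y): x runs over [0, 0.5], y over [2x, 1].
p2, _ = dblquad(f, 0, 0.5, lambda x: 2 * x, lambda x: 1)
print(round(p1, 4), round(p2, 4))  # 0.1146 0.25
```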
Definition 4.4. Two random variables X and Y are said to be independent if and only if
P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y)
for all possible values (x, y) of (X, Y).
Note:
This definition states that X and Y are independent if and only if their joint CDF can be written as the product of their marginal CDFs, i.e., F(x, y) = F_X(x) F_Y(y).
When X and Y are both discrete, the independence condition can be written as P(X = x, Y = y) = P(X = x) P(Y = y) for all x and y, i.e., the joint pmf is the product of the marginal pmfs.
When X and Y are both continuous, the independence condition can be written as f(x, y) = f_X(x) f_Y(y), i.e., the joint pdf is the product of the marginal pdfs.
Definition 4.4 extends naturally to the case of more than two random variables.
Example 4.5. Stores A and B, which belong to the same owner, are located in two different towns. If the probability density function of the weekly profit of each store, in thousands of dollars, is given by
$$f(x) = \begin{cases} x/4, & 1 \le x \le 3,\\ 0, & \text{otherwise,} \end{cases}$$
and the profit of one store is independent of the other, what is the probability that next week one store makes at least $500 more than the other store?
Solution: Let X and Y denote, respectively, next week’s profits of stores A and B. The desired probability is
P(X ≥ Y + 1/2) + P(Y ≥ X + 1/2).
Since X and Y are independent and identically distributed, by symmetry,
P(X ≥ Y + 1/2) + P(Y ≥ X + 1/2) = 2 P(X ≥ Y + 1/2).
To calculate P(X ≥ Y + 1/2), we need the joint pdf of X and Y. Since X and Y are independent, we have
$$f(x, y) = f_X(x) f_Y(y) = \frac{xy}{16}, \qquad 1 \le x \le 3,\; 1 \le y \le 3,$$
and 0 otherwise.
To find P(X ≥ Y + 1/2), one needs to integrate f(x, y) over the region defined by the conditions 1 ≤ x ≤ 3, 1 ≤ y ≤ 3, and x ≥ y + 1/2:
$$2P(X \ge Y + 1/2) = 2\int_{3/2}^{3}\!\int_{1}^{x - 1/2} \frac{xy}{16}\, dy\, dx = \int_{3/2}^{3} \frac{x}{16}\left[(x - 1/2)^2 - 1\right] dx = \frac{1}{16}\int_{3/2}^{3} \left(x^3 - x^2 - \tfrac{3}{4}x\right) dx = 0.54.$$

[Figure: the integration region x ≥ y + 1/2 inside the square 1 ≤ x ≤ 3, 1 ≤ y ≤ 3, cut off by the line X = Y + 1/2.]
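As a numerical cross-check of the 0.54 (a sketch under the same independence assumption, not from the slides):

```python
from scipy.integrate import dblquad

f = lambda y, x: x * y / 16        # joint pdf of Example 4.5 on [1, 3] x [1, 3]
# P(X >= Y + 1/2): x over [1.5, 3], y over [1, x - 0.5]; double it by symmetry.
p, _ = dblquad(f, 1.5, 3, lambda x: 1, lambda x: x - 0.5)
print(round(2 * p, 2))             # 0.54
```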
Example 4.6. Prove that the two random variables X and Y with the following joint probability density function are not independent:
$$f(x, y) = \begin{cases} 8xy, & 0 \le x \le y \le 1,\\ 0, & \text{otherwise.} \end{cases}$$
Solution:
$f_X(x) = \int_x^1 8xy\, dy = 4x(1 - x^2)$, 0 ≤ x ≤ 1;
$f_Y(y) = \int_0^y 8xy\, dx = 4y^3$, 0 ≤ y ≤ 1.
Since f(x, y) ≠ f_X(x) f_Y(y), X and Y are NOT independent.
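A single point of the support is enough to exhibit the failure of factorization; a tiny illustrative check:

```python
# Compare f(x, y) with fX(x) * fY(y) at one support point, e.g. (0.2, 0.8).
x, y = 0.2, 0.8
f_joint = 8 * x * y                            # = 1.28
f_product = (4 * x * (1 - x**2)) * (4 * y**3)  # ≈ 1.573
print(f_joint, f_product)                      # unequal, so X, Y are not independent
```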
Special Joint Distributions
Certain special joint distributions such as multinomial and bivariate normal deserve some detailed attention.
Multinomial is a direct generalization of the binomial. An experiment has k possible outcomes with probabilities $\theta_1, \theta_2, \dots, \theta_k$. Let $X_i$ be the number of times that the ith outcome occurs among a total of n independent trials of such an experiment, i = 1, 2, …, k. Then the joint distribution of $X_1, X_2, \dots, X_k$ is called the Multinomial Distribution, with joint pmf of the following form:
$$p(x_1, x_2, \dots, x_k) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\; \theta_1^{x_1}\theta_2^{x_2}\cdots\theta_k^{x_k},$$
where $\theta_1 + \theta_2 + \cdots + \theta_k = 1$ and $x_1 + x_2 + \cdots + x_k = n$.
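For illustration, the pmf can be evaluated both from this formula and with scipy.stats.multinomial; the trial count n = 10 and probabilities 0.4, 0.4, 0.2 below are assumed example values, chosen to echo the seed proportions of Example 4.2:

```python
from math import factorial
from scipy.stats import multinomial

n, theta = 10, [0.4, 0.4, 0.2]     # assumed example parameters
x = [2, 3, 5]                      # one outcome vector with x1 + x2 + x3 = n

by_formula = factorial(n) // (factorial(2) * factorial(3) * factorial(5)) \
             * theta[0]**2 * theta[1]**3 * theta[2]**5
print(by_formula)                          # ≈ 0.00826
print(multinomial.pmf(x, n=n, p=theta))    # same value from scipy
```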
A Bivariate Normal distribution has the following joint pdf:
$$f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right\}.$$
[Figure: surface plots of the bivariate normal pdf with µ1 = µ2 = 0, σ1 = σ2 = 1, for ρ = 0.1 and ρ = 0.9.]
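The same density is available in scipy as multivariate_normal; a minimal sketch (not part of the slides) evaluating the peak height for the two plotted values of ρ:

```python
from scipy.stats import multivariate_normal

for rho in (0.1, 0.9):
    cov = [[1.0, rho], [rho, 1.0]]         # sigma1 = sigma2 = 1, means 0
    rv = multivariate_normal(mean=[0.0, 0.0], cov=cov)
    # Peak height at the mean is 1 / (2*pi*sqrt(1 - rho^2)): about 0.16 and 0.37.
    print(rho, rv.pdf([0.0, 0.0]))
```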
It can be shown that ρ is the correlation coefficient between X1 and X2. When ρ = 0, we have
$$f(x_1, x_2) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left\{-\frac{1}{2}\left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2\right\} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_2}\exp\left\{-\frac{1}{2}\left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2\right\} = f_{X_1}(x_1)\, f_{X_2}(x_2).$$
So, in this case, X1 and X2 are independent.
For two jointly normal random variables, if they are uncorrelated (i.e., their covariance is zero), then they are independent. This conclusion may not hold for random variables that are not jointly normal.
Covariance and Correlation Coefficient
Definition 4.5. The covariance between any two jointly distributed r.v.s X and Y, denoted by Cov(X, Y), is defined by
Cov(X, Y) = E[(X − µX)(Y − µY)] = E[XY] − µX µY,
where µX = E[X] and µY = E[Y].
Properties of Covariance:
For any two r.v.s X and Y, and constants a, b, c and d,
Cov(X, X) = Var(X)
Cov(X, Y) = Cov(Y, X)
Cov(aX+b, cY+d) = ac Cov(X, Y)
If X and Y are independent then Cov(X, Y) = 0.
Definition 4.6. The correlation coefficient between any two jointly distributed r.v.s X and Y, denoted by ρ(X, Y), is defined by
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}.$$
It measures the degree of linear association between X and Y, and takes values in [−1, 1].
Properties of Correlation Coefficient:
For any two r.v.s X and Y, and constants a, b, c and d,
ρ(aX + b, cY + d) = ρ(X, Y) if ac > 0, and
ρ(aX + b, cY + d) = −ρ(X, Y) if ac < 0.
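Definitions 4.5 and 4.6 are easy to exercise on the joint pmf of Example 4.1; the sketch below (illustration only, not from the slides) computes Cov(X, Y) and ρ(X, Y) by direct summation:

```python
from math import sqrt

p = {(0, 0): .12, (1, 0): .42, (2, 0): .06,   # joint pmf of Example 4.1
     (0, 1): .21, (1, 1): .06, (2, 1): .03,
     (0, 2): .07, (1, 2): .02, (2, 2): .01}

EX  = sum(x * v for (x, y), v in p.items())
EY  = sum(y * v for (x, y), v in p.items())
EXY = sum(x * y * v for (x, y), v in p.items())
VX  = sum(x**2 * v for (x, y), v in p.items()) - EX**2
VY  = sum(y**2 * v for (x, y), v in p.items()) - EY**2

cov = EXY - EX * EY
rho = cov / sqrt(VX * VY)
print(cov, rho)   # cov = -0.15, rho ≈ -0.35: a negative association
```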
Conditional Distributions
Among the most useful concepts in probability theory are conditional probability and conditional expectation, because:
In practice, some partial information is often available, and hence calculations of probabilities and expectations should be conditional upon the given information;
In calculating a desired probability or expectation it is often extremely useful to first “condition” on some appropriate random variables.
The concept of conditional probability, P(A|B) = P(A ∩ B)/P(B), can be extended directly to give a definition of the conditional distribution of X given Y = y, where X and Y are two r.v.s, discrete or continuous.
Definition 4.7. For two discrete r.v.s X and Y, the conditional pmf of X given Y = y is
$$p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{p(x, y)}{p_Y(y)},$$
where $p_Y(y) > 0$.
Clearly, when X is independent of Y, $p_{X|Y}(x \mid y) = p_X(x)$.
The conditional expectation of X given Y = y is defined as
$$E[X \mid Y = y] = \sum_x x\, p_{X|Y}(x \mid y).$$
Definition 4.8. For two continuous r.v.s X and Y, the conditional pdf of X given Y = y is
$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)},$$
where $f_Y(y) > 0$.
The conditional expectation of X given Y = y is defined as
$$E[X \mid Y = y] = \int_{-\infty}^{\infty} x\, f_{X|Y}(x \mid y)\, dx.$$
Example 4.7. Roll a fair die successively. Let X be the number of rolls until the first 4 appears, and Y the number of rolls until the first 5 appears.
(a) Find the conditional pmf of X given Y = 4.
(b) Calculate P(X > 2 | Y = 4).
(c) Calculate E[X | Y = 4].
Solution:
(a) $p_{X|Y}(x \mid 4) = \dfrac{p(x, 4)}{p_Y(4)}$, where
$p_Y(4) = P(Y = 4) = (5/6)^3 (1/6) = 125/1296$,
$p(1, 4) = P(X = 1, Y = 4) = (1/6)(5/6)^2 (1/6) = 25/1296$,
$p(2, 4) = P(X = 2, Y = 4) = (4/6)(1/6)(5/6)(1/6) = 20/1296$,
$p(3, 4) = P(X = 3, Y = 4) = (4/6)^2 (1/6)(1/6) = 16/1296$,
$p(4, 4) = P(X = 4, Y = 4) = 0$, and
$p(x, 4) = (4/6)^3 (1/6)(5/6)^{x-5}(1/6)$, for x = 5, 6, 7, ….
Therefore, $p_{X|Y}(1 \mid 4) = p(1, 4)/p_Y(4) = 25/125$,
$p_{X|Y}(2 \mid 4) = p(2, 4)/p_Y(4) = 20/125$,
$p_{X|Y}(3 \mid 4) = p(3, 4)/p_Y(4) = 16/125$,
$p_{X|Y}(4 \mid 4) = p(4, 4)/p_Y(4) = 0$, and
$p_{X|Y}(x \mid 4) = (4/5)^3 (5/6)^{x-5}(1/6)$, for x = 5, 6, 7, ….

(b) P(X > 2 | Y = 4) = 1 − P(X = 1 | Y = 4) − P(X = 2 | Y = 4) = 80/125.
(c) $E(X \mid Y = 4) = (1)\tfrac{25}{125} + (2)\tfrac{20}{125} + (3)\tfrac{16}{125} + (4)(0) + \sum_{x=5}^{\infty} x \left(\tfrac{4}{5}\right)^3 \left(\tfrac{5}{6}\right)^{x-5} \tfrac{1}{6}$
$= \tfrac{113}{125} + \left(\tfrac{4}{5}\right)^3 \sum_{y=1}^{\infty} (y + 4) \left(\tfrac{5}{6}\right)^{y-1} \tfrac{1}{6} = \tfrac{113}{125} + \left(\tfrac{4}{5}\right)^3 (6 + 4) = 6.024$
(in the above, y = x − 4).
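Truncating the infinite sum at a large cutoff reproduces the expectation numerically; a short illustrative check:

```python
# Conditional pmf of X given Y = 4, as derived above; tail truncated at x = 500.
pmf = {1: 25/125, 2: 20/125, 3: 16/125, 4: 0.0}
for x in range(5, 500):
    pmf[x] = (4/5)**3 * (5/6)**(x - 5) * (1/6)

print(round(sum(pmf.values()), 6))                   # ≈ 1.0 (sanity check)
print(round(sum(x * q for x, q in pmf.items()), 3))  # ≈ 6.024
```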
Example 4.8. The joint pdf of X and Y is given by
$$f(x, y) = \begin{cases} \tfrac{12}{5}\, x(2 - x - y), & 0 \le x \le 1,\; 0 \le y \le 1,\\ 0, & \text{otherwise.} \end{cases}$$
(a) Compute the conditional pdf of X given Y = y, where 0 ≤ y ≤ 1.
(b) Calculate P(X > 0.5 | Y = 0.5).
(c) Calculate E(X | Y = 0.5).
Solution:
(a) $f_{X|Y}(x \mid y) = \dfrac{f(x, y)}{f_Y(y)} = \dfrac{f(x, y)}{\int_0^1 f(x, y)\, dx} = \dfrac{x(2 - x - y)}{\int_0^1 x(2 - x - y)\, dx} = \dfrac{6x(2 - x - y)}{4 - 3y}$, for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.
(b) $P(X > 0.5 \mid Y = 0.5) = \int_{0.5}^{1} f_{X|Y}(x \mid 0.5)\, dx = \int_{0.5}^{1} \tfrac{6}{5}\, x(3 - 2x)\, dx = 0.65.$

(c) With $f_{X|Y}(x \mid 0.5) = \tfrac{6}{5}\, x(3 - 2x)$,
$E(X \mid Y = 0.5) = \int_0^1 x\, f_{X|Y}(x \mid 0.5)\, dx = \int_0^1 \tfrac{6}{5}\, x^2(3 - 2x)\, dx = 0.6.$
Definition 4.9. Let X and Y be jointly distributed r.v.s. The conditional variance of X given Y = y is given by
Var(X | Y = y) = E[(X − µX|Y)² | Y = y],
where µX|Y = E(X | Y = y).
Continuing Example 4.8, we now find Var(X | Y = 0.5):
$E(X^2 \mid Y = 0.5) = \int_0^1 x^2 f_{X|Y}(x \mid 0.5)\, dx = \int_0^1 \tfrac{6}{5}\, x^3(3 - 2x)\, dx = 0.42,$
$\mathrm{Var}(X \mid Y = 0.5) = E(X^2 \mid Y = 0.5) - [E(X \mid Y = 0.5)]^2 = 0.42 - 0.6^2 = 0.06.$
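All three numbers in Example 4.8 come from one-dimensional integrals of the conditional pdf; a sketch with scipy.integrate.quad (an illustration, not part of the original slides):

```python
from scipy.integrate import quad

f_cond = lambda x: (6/5) * x * (3 - 2 * x)   # f_{X|Y}(x | 0.5) on [0, 1]

prob, _ = quad(f_cond, 0.5, 1)                     # P(X > 0.5 | Y = 0.5)
EX, _   = quad(lambda x: x * f_cond(x), 0, 1)      # E(X | Y = 0.5)
EX2, _  = quad(lambda x: x * x * f_cond(x), 0, 1)  # E(X^2 | Y = 0.5)
print(round(prob, 2), round(EX, 2), round(EX2 - EX**2, 2))  # 0.65 0.6 0.06
```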