10. Joint Moments and Joint Characteristic Functions
Following section 6, in this section we shall introduce various parameters to compactly represent the information contained in the joint p.d.f of two r.vs. Given two r.vs X and Y and a function $g(x,y)$, define the r.v

$$Z = g(X, Y). \tag{10-1}$$

Using (6-2), we can define the mean of Z to be

$$E(Z) = \eta_Z = \int_{-\infty}^{\infty} z\, f_Z(z)\, dz. \tag{10-2}$$
However, the situation here is similar to that in (6-13), and it is possible to express the mean of $Z = g(X,Y)$ in terms of $f_{XY}(x,y)$ without computing $f_Z(z)$. To see this, recall from (5-26) and (7-10) that

$$P(z < Z \le z + \Delta z) = f_Z(z)\,\Delta z = P(z < g(X,Y) \le z + \Delta z) = \sum_{(x,y)\in D_z} f_{XY}(x,y)\,\Delta x\,\Delta y, \tag{10-3}$$

where $D_z$ is the region in the xy plane satisfying the above inequality. From (10-3), we get

$$z\, f_Z(z)\,\Delta z = \sum_{(x,y)\in D_z} g(x,y)\, f_{XY}(x,y)\,\Delta x\,\Delta y. \tag{10-4}$$

As $\Delta z$ covers the entire z axis, the corresponding regions $D_z$ are nonoverlapping, and they cover the entire xy plane.
By integrating (10-4), we obtain the useful formula

$$E(Z) = \int_{-\infty}^{\infty} z\, f_Z(z)\, dz = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x,y)\, f_{XY}(x,y)\, dx\, dy \tag{10-5}$$

or

$$E[g(X,Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x,y)\, f_{XY}(x,y)\, dx\, dy. \tag{10-6}$$

If X and Y are discrete-type r.vs, then

$$E[g(X,Y)] = \sum_i \sum_j g(x_i, y_j)\, P(X = x_i, Y = y_j). \tag{10-7}$$

Since expectation is a linear operator, we also get

$$E\Big[\sum_k a_k\, g_k(X,Y)\Big] = \sum_k a_k\, E[g_k(X,Y)]. \tag{10-8}$$
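As a quick numerical illustration of (10-6) (an addition, not part of the original lecture), the sketch below estimates $E[g(X,Y)]$ by averaging $g$ over samples of $(X,Y)$, bypassing the density of $Z = g(X,Y)$ entirely. It assumes NumPy is available; the choice $g(x,y) = xy^2$ with independent uniforms is arbitrary.

```python
import numpy as np

# Monte Carlo view of (10-6): E[g(X,Y)] is a sample average of g(X, Y),
# with no need to first derive f_Z(z) for Z = g(X, Y).
rng = np.random.default_rng(0)
n = 1_000_000
x = rng.uniform(0.0, 1.0, n)
y = rng.uniform(0.0, 1.0, n)

estimate = np.mean(x * y**2)       # sample average of g(X, Y) = X * Y^2
exact = (1.0 / 2.0) * (1.0 / 3.0)  # E[X] E[Y^2] by independence
print(estimate, exact)             # ~0.1667 vs 0.1666...
```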
If X and Y are independent r.vs, it is easy to see that $Z = g(X)$ and $W = h(Y)$ are always independent of each other. In that case, using (10-6), we get the interesting result

$$E[g(X)h(Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x)h(y)\, f_X(x) f_Y(y)\, dx\, dy = \int_{-\infty}^{\infty} g(x) f_X(x)\, dx \int_{-\infty}^{\infty} h(y) f_Y(y)\, dy = E[g(X)]\, E[h(Y)]. \tag{10-9}$$

However (10-9) is in general not true (if X and Y are not independent).

In the case of one random variable, we defined the parameters mean and variance to represent its average behavior. How does one parametrically represent similar cross-behavior between two random variables? Towards this, we can generalize the variance definition given in (6-16) as shown below.
Covariance: Given any two r.vs X and Y, define

$$\text{Cov}(X,Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big]. \tag{10-10}$$

By expanding and simplifying the right side of (10-10), we also get

$$\text{Cov}(X,Y) = E(XY) - \mu_X \mu_Y = E(XY) - E(X)E(Y) = \overline{XY} - \overline{X}\,\overline{Y}. \tag{10-11}$$

It is easy to see that

$$\big[\text{Cov}(X,Y)\big]^2 \le \text{Var}(X)\,\text{Var}(Y). \tag{10-12}$$

To see (10-12), let $U = aX + Y$, so that

$$\text{Var}(U) = E\big[\big(a(X - \mu_X) + (Y - \mu_Y)\big)^2\big] = a^2\,\text{Var}(X) + 2a\,\text{Cov}(X,Y) + \text{Var}(Y) \ge 0. \tag{10-13}$$
The right side of (10-13) represents a quadratic in the variable a that has no distinct real roots (Fig. 10.1). Thus the roots are imaginary (or double) and hence the discriminant

$$\big[\text{Cov}(X,Y)\big]^2 - \text{Var}(X)\,\text{Var}(Y)$$

must be non-positive, and that gives (10-12). Using (10-12), we may define the normalized parameter

$$\rho_{XY} = \frac{\text{Cov}(X,Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}} = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y}, \qquad -1 \le \rho_{XY} \le 1, \tag{10-14}$$

or

$$\text{Cov}(X,Y) = \rho_{XY}\, \sigma_X \sigma_Y, \tag{10-15}$$

and it represents the correlation coefficient between X and Y.

[Fig. 10.1: Var(U) as a function of a, a parabola that never dips below zero.]
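As a numerical aside (an added illustration, not from the lecture), (10-10) and (10-14) translate directly into sample estimates. The following sketch, assuming NumPy and an arbitrary construction of correlated X and Y, checks that the estimated $\rho_{XY}$ indeed falls in $[-1, 1]$:

```python
import numpy as np

# Sample versions of Cov(X,Y) in (10-10) and rho_XY in (10-14).
# Y = X + small noise, so X and Y are strongly positively correlated.
rng = np.random.default_rng(1)
n = 500_000
x = rng.uniform(0.0, 1.0, n)
y = x + 0.1 * rng.standard_normal(n)

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))  # E[(X - mu_X)(Y - mu_Y)]
rho_xy = cov_xy / (x.std() * y.std())              # normalize per (10-14)
print(cov_xy, rho_xy)   # rho_xy lies in [-1, 1], close to +1 here
```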
Uncorrelated r.vs: If $\rho_{XY} = 0$, then X and Y are said to be uncorrelated r.vs. From (10-11), if X and Y are uncorrelated, then

$$E(XY) = E(X)E(Y). \tag{10-16}$$

Orthogonality: X and Y are said to be orthogonal if

$$E(XY) = 0. \tag{10-17}$$

From (10-16) - (10-17), if either X or Y has zero mean, then orthogonality implies uncorrelatedness and vice-versa.

Suppose X and Y are independent r.vs. Then from (10-9) with $g(X) = X$, $h(Y) = Y$, we get

$$E(XY) = E(X)E(Y),$$

and together with (10-16), we conclude that the random variables are uncorrelated, thus justifying the original definition in (10-10). Thus independence implies uncorrelatedness.
Naturally, if two random variables are statistically independent, then there cannot be any correlation between them $(\rho_{XY} = 0)$. However, the converse is in general not true. As the next example shows, random variables can be uncorrelated without being independent.

Example 10.1: Let $X \sim U(0,1)$, $Y \sim U(0,1)$. Suppose X and Y are independent. Define $Z = X + Y$, $W = X - Y$. Show that Z and W are dependent, but uncorrelated r.vs.

Solution: $z = x + y$, $w = x - y$ gives the only solution set to be

$$x = \frac{z + w}{2}, \qquad y = \frac{z - w}{2}.$$

Moreover $0 < z < 2$, $-1 < w < 1$, $z + w \le 2$, $z - w \le 2$, $|w| < z$, and $|J(z,w)| = \tfrac{1}{2}$.
Thus (see the shaded region in Fig. 10.2)

$$f_{ZW}(z,w) = \begin{cases} \tfrac{1}{2}, & 0 < z < 2,\ -1 < w < 1,\ z + w \le 2,\ z - w \le 2,\ |w| < z, \\ 0, & \text{otherwise}, \end{cases} \tag{10-18}$$

[Fig. 10.2: the shaded support region of $f_{ZW}(z,w)$ in the (z, w) plane.]

and hence

$$f_Z(z) = \int_{-\infty}^{\infty} f_{ZW}(z,w)\, dw = \begin{cases} \int_{-z}^{z} \tfrac{1}{2}\, dw = z, & 0 < z < 1, \\ \int_{z-2}^{2-z} \tfrac{1}{2}\, dw = 2 - z, & 1 < z < 2, \end{cases}$$

or by direct computation ($Z = X + Y$)
$$f_Z(z) = f_X(z) * f_Y(z) = \begin{cases} z, & 0 < z < 1, \\ 2 - z, & 1 < z < 2, \\ 0, & \text{otherwise}, \end{cases} \tag{10-19}$$

and

$$f_W(w) = \int_{-\infty}^{\infty} f_{ZW}(z,w)\, dz = \int_{|w|}^{2 - |w|} \tfrac{1}{2}\, dz = \begin{cases} 1 - |w|, & -1 < w < 1, \\ 0, & \text{otherwise}. \end{cases} \tag{10-20}$$

Clearly $f_{ZW}(z,w) \ne f_Z(z)\, f_W(w)$. Thus Z and W are not independent. However

$$E(ZW) = E\big[(X + Y)(X - Y)\big] = E(X^2) - E(Y^2) = 0 \tag{10-21}$$

and

$$E(W) = E(X - Y) = 0,$$

and hence

$$\text{Cov}(Z,W) = E(ZW) - E(Z)E(W) = 0, \tag{10-22}$$

implying that Z and W are uncorrelated random variables.
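A numerical check of Example 10.1 (an added illustration, assuming NumPy) makes the conclusion concrete: the sample covariance of Z and W is essentially zero, while the support constraint $|w| < z$ shows the two are nevertheless dependent.

```python
import numpy as np

# Example 10.1 numerically: Z = X + Y and W = X - Y for i.i.d. U(0,1) X, Y.
rng = np.random.default_rng(2)
n = 1_000_000
x = rng.uniform(0.0, 1.0, n)
y = rng.uniform(0.0, 1.0, n)
z, w = x + y, x - y

cov_zw = np.mean(z * w) - z.mean() * w.mean()
print(cov_zw)                    # ~0, matching (10-22): uncorrelated
print(np.all(np.abs(w) <= z))    # True: |W| never exceeds Z, so knowing Z
                                 # constrains W, i.e., they are dependent
```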
Example 10.2: Let $Z = aX + bY$. Determine the variance of Z in terms of $\sigma_X$, $\sigma_Y$ and $\rho_{XY}$.

Solution:

$$\mu_Z = E(Z) = E(aX + bY) = a\,\mu_X + b\,\mu_Y,$$

and using (10-15),

$$\text{Var}(Z) = \sigma_Z^2 = E\big[(Z - \mu_Z)^2\big] = E\big[\big(a(X - \mu_X) + b(Y - \mu_Y)\big)^2\big] = a^2 E(X - \mu_X)^2 + 2ab\, E\big[(X - \mu_X)(Y - \mu_Y)\big] + b^2 E(Y - \mu_Y)^2 = a^2 \sigma_X^2 + 2ab\, \rho_{XY}\, \sigma_X \sigma_Y + b^2 \sigma_Y^2. \tag{10-23}$$

In particular if X and Y are independent, then $\rho_{XY} = 0$, and (10-23) reduces to

$$\sigma_Z^2 = a^2 \sigma_X^2 + b^2 \sigma_Y^2. \tag{10-24}$$

Thus the variance of the sum of independent r.vs is the sum of their variances $(a = b = 1)$.
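The following sketch (added for illustration, assuming NumPy; the coefficients and the construction of correlated X, Y are arbitrary) verifies (10-23) against a direct sample variance:

```python
import numpy as np

# Check (10-23): Var(aX + bY) = a^2 Var(X) + 2ab Cov(X,Y) + b^2 Var(Y).
# Correlated X, Y are built from two independent standard normals.
rng = np.random.default_rng(3)
n = 1_000_000
g1, g2 = rng.standard_normal(n), rng.standard_normal(n)
x = g1
y = 0.6 * g1 + 0.8 * g2          # unit variance, Corr(X, Y) = 0.6
a, b = 2.0, -1.0

z = a * x + b * y
lhs = z.var()
rhs = a**2 * x.var() + 2 * a * b * np.cov(x, y)[0, 1] + b**2 * y.var()
print(lhs, rhs)                  # the two agree up to sampling error
```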
Moments:

$$E[X^k Y^m] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^k y^m\, f_{XY}(x,y)\, dx\, dy \tag{10-25}$$

represents the joint moment of order (k, m) for X and Y.

Following the one random variable case, we can define the joint characteristic function between two random variables, which will turn out to be useful for moment calculations.

Joint characteristic functions:

The joint characteristic function between X and Y is defined as

$$\Phi_{XY}(u,v) = E\big(e^{j(Xu + Yv)}\big) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{j(xu + yv)}\, f_{XY}(x,y)\, dx\, dy. \tag{10-26}$$

Note that $|\Phi_{XY}(u,v)| \le \Phi_{XY}(0,0) = 1.$
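Since (10-26) is just an expectation, it can also be estimated as a sample mean. The sketch below (an added illustration, assuming NumPy; independent standard normals are chosen so the exact answer $e^{-(u^2+v^2)/2}$ is known) demonstrates this, along with the bound $|\Phi_{XY}(u,v)| \le \Phi_{XY}(0,0) = 1$:

```python
import numpy as np

# Monte Carlo estimate of the joint characteristic function (10-26).
rng = np.random.default_rng(4)
n = 1_000_000
x = rng.standard_normal(n)
y = rng.standard_normal(n)

def phi_xy(u, v):
    """Sample-mean estimate of E[exp(j(Xu + Yv))]."""
    return np.mean(np.exp(1j * (x * u + y * v)))

u, v = 0.7, -0.3
print(phi_xy(u, v), np.exp(-(u**2 + v**2) / 2))  # close agreement
print(abs(phi_xy(0.0, 0.0)))                     # exactly 1, as noted above
```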
It is easy to show that

$$E(XY) = \frac{1}{j^2} \left. \frac{\partial^2 \Phi_{XY}(u,v)}{\partial u\, \partial v} \right|_{u=0,\, v=0}. \tag{10-27}$$

If X and Y are independent r.vs, then from (10-26), we obtain

$$\Phi_{XY}(u,v) = E\big(e^{juX}\big)\, E\big(e^{jvY}\big) = \Phi_X(u)\, \Phi_Y(v). \tag{10-28}$$

Also

$$\Phi_X(u) = \Phi_{XY}(u, 0), \qquad \Phi_Y(v) = \Phi_{XY}(0, v). \tag{10-29}$$

More on Gaussian r.vs:

From Lecture 7, X and Y are said to be jointly Gaussian as $N(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho)$ if their joint p.d.f has the form in (7-23). In that case, by direct substitution and simplification, we obtain the joint characteristic function of two jointly Gaussian r.vs to be
$$\Phi_{XY}(u,v) = E\big(e^{j(Xu + Yv)}\big) = e^{\,j(\mu_X u + \mu_Y v) - \frac{1}{2}\left(\sigma_X^2 u^2 + 2\rho\,\sigma_X \sigma_Y\, uv + \sigma_Y^2 v^2\right)}. \tag{10-30}$$

Equation (10-30) can be used to make various conclusions. Letting $v = 0$ in (10-30), we get

$$\Phi_X(u) = \Phi_{XY}(u, 0) = e^{\,j\mu_X u - \frac{1}{2}\sigma_X^2 u^2}, \tag{10-31}$$

and it agrees with (6-47).

From (7-23), by direct computation using (10-11), it is easy to show that for two jointly Gaussian random variables

$$\text{Cov}(X,Y) = \rho\, \sigma_X \sigma_Y.$$

Hence from (10-14), $\rho$ in $N(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho)$ represents the actual correlation coefficient of the two jointly Gaussian r.vs in (7-23). Notice that $\rho = 0$ implies $f_{XY}(x,y) = f_X(x)\, f_Y(y)$.
Thus if X and Y are jointly Gaussian, uncorrelatedness does imply independence between the two random variables. The Gaussian case is the only exception where the two concepts imply each other.

Example 10.3: Let X and Y be jointly Gaussian r.vs with parameters $N(\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2, \rho)$. Define $Z = aX + bY$. Determine $f_Z(z)$.

Solution: In this case we can make use of the characteristic function to solve this problem:

$$\Phi_Z(u) = E\big(e^{jZu}\big) = E\big(e^{j(aX + bY)u}\big) = E\big(e^{jauX + jbuY}\big) = \Phi_{XY}(au, bu). \tag{10-32}$$
From (10-30) with u and v replaced by au and bu respectively, we get

$$\Phi_Z(u) = e^{\,j(a\mu_X + b\mu_Y)u - \frac{1}{2}\left(a^2\sigma_X^2 + 2\rho ab\,\sigma_X \sigma_Y + b^2\sigma_Y^2\right)u^2} = e^{\,j\mu_Z u - \frac{1}{2}\sigma_Z^2 u^2}, \tag{10-33}$$

where

$$\mu_Z = a\mu_X + b\mu_Y \tag{10-34}$$

and

$$\sigma_Z^2 = a^2\sigma_X^2 + 2\rho ab\,\sigma_X \sigma_Y + b^2\sigma_Y^2. \tag{10-35}$$

Notice that (10-33) has the same form as (10-31), and hence we conclude that $Z = aX + bY$ is also Gaussian with mean and variance as in (10-34) - (10-35), which also agrees with (10-23).

From the previous example, we conclude that any linear combination of jointly Gaussian r.vs generates a Gaussian r.v.
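As an added numerical check of (10-34) - (10-35) (assuming NumPy; all parameter values are arbitrary), sampling jointly Gaussian X, Y and forming $Z = aX + bY$ reproduces the predicted mean and variance:

```python
import numpy as np

# Example 10.3 numerically: Z = aX + bY for jointly Gaussian (X, Y).
rng = np.random.default_rng(5)
mu = np.array([1.0, -2.0])
s_x, s_y, rho = 1.5, 0.5, 0.4
cov = np.array([[s_x**2,          rho * s_x * s_y],
                [rho * s_x * s_y, s_y**2         ]])
xy = rng.multivariate_normal(mu, cov, size=1_000_000)
a, b = 3.0, 2.0

z = a * xy[:, 0] + b * xy[:, 1]
print(z.mean(), a * mu[0] + b * mu[1])                              # (10-34)
print(z.var(),  a**2 * s_x**2 + 2*a*b*rho*s_x*s_y + b**2 * s_y**2)  # (10-35)
```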
In other words, linearity preserves Gaussianity. We can use the characteristic function relation to conclude an even more general result.

Example 10.4: Suppose X and Y are jointly Gaussian r.vs as in the previous example. Define two linear combinations

$$Z = aX + bY, \qquad W = cX + dY. \tag{10-36}$$

What can we say about their joint distribution?

Solution: The characteristic function of Z and W is given by

$$\Phi_{ZW}(u,v) = E\big(e^{j(Zu + Wv)}\big) = E\big(e^{j(aX + bY)u + j(cX + dY)v}\big) = E\big(e^{jX(au + cv) + jY(bu + dv)}\big) = \Phi_{XY}(au + cv,\ bu + dv). \tag{10-37}$$

As before, substituting (10-30) into (10-37) with u and v replaced by $au + cv$ and $bu + dv$ respectively, we get
$$\Phi_{ZW}(u,v) = e^{\,j(\mu_Z u + \mu_W v) - \frac{1}{2}\left(\sigma_Z^2 u^2 + 2\rho_{ZW}\,\sigma_Z \sigma_W\, uv + \sigma_W^2 v^2\right)}, \tag{10-38}$$

where

$$\mu_Z = a\mu_X + b\mu_Y, \tag{10-39}$$

$$\mu_W = c\mu_X + d\mu_Y, \tag{10-40}$$

$$\sigma_Z^2 = a^2\sigma_X^2 + 2\rho ab\,\sigma_X \sigma_Y + b^2\sigma_Y^2, \tag{10-41}$$

$$\sigma_W^2 = c^2\sigma_X^2 + 2\rho cd\,\sigma_X \sigma_Y + d^2\sigma_Y^2, \tag{10-42}$$

and

$$\rho_{ZW} = \frac{ac\,\sigma_X^2 + (ad + bc)\rho\,\sigma_X \sigma_Y + bd\,\sigma_Y^2}{\sigma_Z \sigma_W}. \tag{10-43}$$

From (10-38), we conclude that Z and W are also jointly distributed Gaussian r.vs with means, variances and correlation coefficient as in (10-39) - (10-43).
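Example 10.4's formulas also lend themselves to a direct check. The sketch below (added for illustration, assuming NumPy; the coefficients are arbitrary) compares the closed-form $\rho_{ZW}$ of (10-43) with the sample correlation coefficient:

```python
import numpy as np

# Check (10-43) for Z = aX + bY, W = cX + dY with jointly Gaussian X, Y.
rng = np.random.default_rng(6)
s_x, s_y, rho = 2.0, 1.0, -0.3
cov = np.array([[s_x**2,          rho * s_x * s_y],
                [rho * s_x * s_y, s_y**2         ]])
xy = rng.multivariate_normal([0.0, 0.0], cov, size=1_000_000)
a, b, c, d = 1.0, 2.0, 3.0, -1.0

z = a * xy[:, 0] + b * xy[:, 1]
w = c * xy[:, 0] + d * xy[:, 1]
num = a*c*s_x**2 + (a*d + b*c)*rho*s_x*s_y + b*d*s_y**2       # numerator of (10-43)
s_z = np.sqrt(a**2*s_x**2 + 2*a*b*rho*s_x*s_y + b**2*s_y**2)  # (10-41)
s_w = np.sqrt(c**2*s_x**2 + 2*c*d*rho*s_x*s_y + d**2*s_y**2)  # (10-42)
print(num / (s_z * s_w), np.corrcoef(z, w)[0, 1])             # agree closely
```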
To summarize, any two linear combinations of jointly Gaussian random variables (independent or dependent) are also jointly Gaussian r.vs.

[Fig. 10.3: a linear operator driven by a Gaussian input produces a Gaussian output.]

Of course, we could have reached the same conclusion by deriving the joint p.d.f $f_{ZW}(z,w)$ using the technique developed in section 9 (refer (7-29)).

Gaussian random variables are also interesting because of the following result:

Central Limit Theorem: Suppose $X_1, X_2, \ldots, X_n$ are a set of zero mean independent, identically distributed (i.i.d) random
variables with some common distribution. Consider their scaled sum

$$Y = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}. \tag{10-44}$$

Then asymptotically (as $n \to \infty$)

$$Y \to N(0, \sigma^2). \tag{10-45}$$

Proof: Although the theorem is true under even more general conditions, we shall prove it here under the independence assumption. Let $\sigma^2$ represent their common variance. Since

$$E(X_i) = 0, \tag{10-46}$$

we have

$$\text{Var}(X_i) = E(X_i^2) = \sigma^2. \tag{10-47}$$
Consider

$$\Phi_Y(u) = E\big(e^{juY}\big) = E\big(e^{j(X_1 + X_2 + \cdots + X_n)u/\sqrt{n}}\big) = \prod_{i=1}^{n} E\big(e^{jX_i u/\sqrt{n}}\big), \tag{10-48}$$

where we have made use of the independence of the r.vs $X_1, X_2, \ldots, X_n$. But

$$E\big(e^{jX_i u/\sqrt{n}}\big) = E\left[1 + \frac{jX_i u}{\sqrt{n}} + \frac{j^2 X_i^2 u^2}{2!\, n} + \frac{j^3 X_i^3 u^3}{3!\, n^{3/2}} + \cdots\right] = 1 - \frac{\sigma^2 u^2}{2n} + o\big(n^{-3/2}\big), \tag{10-49}$$

where we have made use of (10-46) - (10-47). Substituting (10-49) into (10-48), we obtain

$$\Phi_Y(u) = \left(1 - \frac{\sigma^2 u^2}{2n} + o\big(n^{-3/2}\big)\right)^n, \tag{10-50}$$

and as $n \to \infty$,

$$\lim_{n \to \infty} \Phi_Y(u) = e^{-\sigma^2 u^2 / 2}, \tag{10-51}$$
since

$$\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x. \tag{10-52}$$

[Note that the $o(1/n^{3/2})$ terms in (10-50) decay faster than $1/n^{3/2}$.] But (10-51) represents the characteristic function of a zero mean normal r.v with variance $\sigma^2$, and (10-45) follows.

The central limit theorem states that a large sum of independent random variables, each with finite variance, tends to behave like a normal random variable. Thus the individual p.d.fs become unimportant when analyzing the collective sum behavior. If we model a noise phenomenon as the sum of a large number of independent random variables (e.g., electron motion in resistor components), then this theorem allows us to conclude that noise behaves like a Gaussian r.v.
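A small simulation (an added illustration, assuming NumPy; zero-mean uniforms are an arbitrary choice of common distribution) shows (10-44) - (10-45) in action: the empirical CDF of the scaled sum matches the $N(0, \sigma^2)$ CDF even though the summands are far from Gaussian.

```python
import math
import numpy as np

# CLT per (10-44)-(10-45): Y = (X_1 + ... + X_n)/sqrt(n) for i.i.d. U(-1,1),
# which has zero mean and sigma^2 = 1/3.
rng = np.random.default_rng(7)
n, trials = 50, 200_000
x = rng.uniform(-1.0, 1.0, size=(trials, n))
y = x.sum(axis=1) / np.sqrt(n)

sigma = math.sqrt(1.0 / 3.0)
for t in (-1.0, 0.0, 1.0):
    empirical = np.mean(y <= t)                                    # estimated P(Y <= t)
    normal = 0.5 * (1.0 + math.erf(t / (sigma * math.sqrt(2.0))))  # N(0, sigma^2) CDF
    print(t, empirical, normal)                                    # columns agree closely
```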
It may be remarked that the finite variance assumption is necessary for the theorem to hold. To see its importance, consider the r.vs to be Cauchy distributed, and let

$$Y = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}, \tag{10-53}$$

where each $X_i \sim C(\alpha)$. Then since

$$\Phi_{X_i}(u) = e^{-\alpha |u|}, \tag{10-54}$$

substituting this into (10-48), we get

$$\Phi_Y(u) = \Phi_{X_i}^n\big(u/\sqrt{n}\big) = e^{-\alpha \sqrt{n}\,|u|} \sim C\big(\alpha\sqrt{n}\big), \tag{10-55}$$

which shows that Y is still Cauchy with parameter $\alpha\sqrt{n}$. In other words, the central limit theorem doesn't hold for a set of Cauchy r.vs, as their variances are undefined.
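The failure mode is easy to see numerically as well. In the sketch below (added for illustration, assuming NumPy; standard Cauchy means $\alpha = 1$), the half-interquartile range of the scaled sum, which estimates the Cauchy parameter, grows like $\sqrt{n}$ exactly as (10-55) predicts, instead of stabilizing:

```python
import numpy as np

# Per (10-55): (X_1 + ... + X_n)/sqrt(n) for i.i.d. C(1) r.vs is C(sqrt(n)).
# A C(alpha) r.v. has quartiles at -alpha and +alpha, so IQR/2 estimates alpha.
rng = np.random.default_rng(8)
trials = 100_000
for n in (10, 100, 1000):
    y = np.zeros(trials)
    for _ in range(n):                 # accumulate the sum without a huge matrix
        y += rng.standard_cauchy(trials)
    y /= np.sqrt(n)
    q25, q75 = np.percentile(y, [25, 75])
    print(n, (q75 - q25) / 2, np.sqrt(n))   # estimated alpha tracks sqrt(n)
```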
Joint characteristic functions are useful in determining the p.d.f of linear combinations of r.vs. For example, with X and Y as independent Poisson r.vs with parameters $\lambda_1$ and $\lambda_2$ respectively, let

$$Z = X + Y. \tag{10-56}$$

Then

$$\Phi_Z(u) = \Phi_X(u)\, \Phi_Y(u). \tag{10-57}$$

But from (6-33)

$$\Phi_X(u) = e^{\lambda_1 (e^{ju} - 1)}, \qquad \Phi_Y(u) = e^{\lambda_2 (e^{ju} - 1)}, \tag{10-58}$$

so that

$$\Phi_Z(u) = e^{(\lambda_1 + \lambda_2)(e^{ju} - 1)} \sim P(\lambda_1 + \lambda_2), \tag{10-59}$$

i.e., the sum of independent Poisson r.vs is also a Poisson random variable.
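As a final added illustration (assuming NumPy; the rates are arbitrary), (10-59) can be checked by comparing the empirical distribution of $X + Y$ with direct samples from $P(\lambda_1 + \lambda_2)$:

```python
import numpy as np

# Check (10-59): a sum of independent Poisson(l1) and Poisson(l2) samples
# should be distributed as Poisson(l1 + l2).
rng = np.random.default_rng(9)
n, l1, l2 = 1_000_000, 2.0, 3.0
z = rng.poisson(l1, n) + rng.poisson(l2, n)
ref = rng.poisson(l1 + l2, n)

for k in range(8):
    print(k, np.mean(z == k), np.mean(ref == k))   # matching frequencies
```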