§1 Entropy and mutual information
§1.1 Discrete random variables
§1.1.1 Discrete memoryless source and entropy
§1.1.2 Discrete memoryless channel and mutual information
§1.2 Discrete random vectors

§1.1.1 Discrete memoryless source and entropy
Example 1.1.1
Let X represent the outcome of a single roll of a fair die.
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \end{pmatrix}
\]

1. DMS (Discrete memoryless source)
Probability Space:
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 & \cdots & a_q \\ P(a_1) & P(a_2) & \cdots & P(a_q) \end{pmatrix},
\qquad \sum_{i=1}^{q} P(a_i) = 1
\]
2. self information
Example 1.1.2
§1.1.1 Discrete memoryless source and entropy
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 \\ 0.5 & 0.5 \end{pmatrix}
\ (\text{red, white}),
\qquad
\begin{pmatrix} Y \\ P(y) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 & a_3 & a_4 \\ 0.25 & 0.25 & 0.25 & 0.25 \end{pmatrix}
\ (\text{red, white, blue, black})
\]
Analyse the uncertainty of drawing a red ball from X and from Y.
2. self information
The self information I(ai) = f[p(ai)] should satisfy:
1) I(ai) is a monotone decreasing function of p(ai): if p(a1) > p(a2), then I(a1) < I(a2);
2) if p(ai) = 1, then I(ai) = 0;
3) if p(ai) = 0, then I(ai) → ∞;
4) if p(ai aj) = p(ai) p(aj), then I(ai aj) = I(ai) + I(aj).
§1.1.1 Discrete memoryless source and entropy
self information:
\[
I(a_i) = \log_r \frac{1}{p(a_i)} = -\log_r p(a_i)
\]
[Figure: I(a_i) is a decreasing function of p(a_i) on (0, 1].]
\[
p(a_1) > p(a_2) > 0 \Rightarrow I(a_1) < I(a_2); \qquad
p(a_i) \to 0 \Rightarrow I(a_i) \to \infty; \qquad
p(a_i) = 1 \Rightarrow I(a_i) = 0
\]
\[
I(ab) = I(a) + I(b) \quad \text{if } a \text{ and } b \text{ are statistically independent}
\]
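As a quick numeric check of the definition (a minimal sketch, assuming base-2 logarithms so the unit is the bit), the self information of drawing the red ball in Example 1.1.2 is 1 bit from X and 2 bits from Y:

```python
from math import log2

def self_information(p):
    """Self information I(a) = -log2 p(a), in bits."""
    return -log2(p)

# Example 1.1.2: probability of drawing a red ball
print(self_information(0.5))   # from X: 1.0 bit
print(self_information(0.25))  # from Y: 2.0 bits
```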
§1.1.1 Discrete memoryless source and entropy
Remark:
I(ai) is the measure of uncertainty of the outcome ai, and equally the measure of information that observing ai provides.
Units: bit (r = 2), nat (r = e), hartley (r = 10).
3. Entropy
Definition: Suppose X is a discrete random variable whose range R = {a1, a2, …} is finite or countable. Let p(ai) = P{X = ai}. The entropy of X is defined by
\[
H(X) = E[I(a_i)] = -\sum_{i=1}^{q} p(a_i) \log p(a_i)
\]
H(X) is a measure of the average amount of information provided by X, and of the uncertainty (or randomness) about X.
§1.1.1 Discrete memoryless source and entropy
Entropy: the average amount of "information" provided by an observation of X.
Example 1.1.3 100 balls in a bag, 80% is red, and remain is white. Now , we fetch out a ball. How about the information of every fetching?
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 \\ 0.8 & 0.2 \end{pmatrix}
\]
In N fetches there are about n_1 = N p(a_1) red results and n_2 = N p(a_2) white results, so the average information per fetch is
\[
\frac{I}{N} = \frac{n_1 I(a_1) + n_2 I(a_2)}{N}
= p(a_1) I(a_1) + p(a_2) I(a_2)
= -\sum_{i=1}^{2} p(a_i) \log p(a_i)
= 0.722 \ \text{bit/sig}
\]
§1.1.1 Discrete memoryless source and entropy
\[
H(X) = H(0.8, 0.2) = 0.722 \ \text{bit/sig}
\]
Entropy: the "uncertainty" or "randomness" about X.
Example 1.1.4
\[
\begin{pmatrix} X_1 \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 \\ 0.5 & 0.5 \end{pmatrix}, \qquad
\begin{pmatrix} X_2 \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 \\ 0.7 & 0.3 \end{pmatrix}, \qquad
\begin{pmatrix} X_3 \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 \\ 0.99 & 0.01 \end{pmatrix}
\]
\[
H(X_1) = 1 \ \text{bit/sig}, \qquad
H(X_2) = 0.88 \ \text{bit/sig}, \qquad
H(X_3) = 0.08 \ \text{bit/sig}
\]
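A minimal sketch that recomputes the entropies quoted in Examples 1.1.3 and 1.1.4 (base-2 logarithm, so the unit is bit/sig; the function name is our own):

```python
from math import log2

def entropy(probs):
    """H(X) = -sum p log2 p, skipping zero-probability terms (bit/sig)."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(round(entropy([0.8, 0.2]), 3))    # Example 1.1.3: 0.722
print(round(entropy([0.5, 0.5]), 3))    # Example 1.1.4: 1.0
print(round(entropy([0.7, 0.3]), 3))    # 0.881
print(round(entropy([0.99, 0.01]), 3))  # 0.081
```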
§1.1.1 Discrete memoryless source and entropy
3. Entropy
\[
H(X) = E[I(a_i)] = -\sum_{i=1}^{q} p(a_i) \log p(a_i)
\]
Note:
1) units: bit/sig, nat/sig, hart/sig
2) if p(ai) = 0, the term p(ai) log p(ai)^{-1} is taken to be 0
3) if R is infinite, H(X) may be +∞
§1.1.1 Discrete memoryless source and entropy
Example 1.1.5 entropy of BS
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} 0 & 1 \\ p & q \end{pmatrix}, \qquad q = 1 - p
\]
\[
H(X) = -p \log p - (1-p) \log (1-p)
\]
3. Entropy
\[
H(p_1, p_2, \ldots, p_q) = \sum_{i=1}^{q} p_i \log \frac{1}{p_i}
\]
H(\mathbf{p}) is called the entropy function; \mathbf{p} = (p_1, p_2, \ldots, p_q) is the probability vector.
§1.1.1 Discrete memoryless source and entropy
4. The properties of entropy
Theorem 1.1 Let X assume values in R = {x1, x2, …, xr}.
1) H(X) ≥ 0
2) H(X) = 0 iff pi = 1 for some i
3) H(X) ≤ log r, with equality iff pi = 1/r for all i
(This bound is the basis of data compression.)
§1.1.1 Discrete memoryless source and entropy
Proof:
(Theorem 1.1 in textbook)
Lemma: for x > 0, log x ≤ (x − 1) log e, i.e. ln x ≤ x − 1, with equality iff x = 1.
4) Symmetry: H(p_{i_1}, p_{i_2}, \ldots, p_{i_r}) = H(p_1, p_2, \ldots, p_r) for any permutation (i_1, i_2, \ldots, i_r) of (1, 2, \ldots, r).

Example 1.1.6  Let X, Y, Z be discrete random variables whose probability vectors are permutations of one another:
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 & a_3 \\ 1/3 & 1/6 & 1/2 \end{pmatrix}, \qquad
\begin{pmatrix} Y \\ P(y) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 & a_3 \\ 1/6 & 1/2 & 1/3 \end{pmatrix}, \qquad
\begin{pmatrix} Z \\ P(z) \end{pmatrix} =
\begin{pmatrix} b_1 & b_2 & b_3 \\ 1/3 & 1/2 & 1/6 \end{pmatrix}
\]
All three have the same entropy: H = H(1/2, 1/3, 1/6) = 1.46 bit/sig.
§1.1.1 Discrete memoryless source and entropy
5) If X,Y are independent , then H(XY) = H(X) + H(Y)
4. The properties of entropy
Proof:
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 & \cdots & a_q \\ p(a_1) & p(a_2) & \cdots & p(a_q) \end{pmatrix},
\quad \sum_{i} p(a_i) = 1;
\qquad
\begin{pmatrix} Y \\ P(y) \end{pmatrix} =
\begin{pmatrix} b_1 & b_2 & \cdots & b_r \\ p(b_1) & p(b_2) & \cdots & p(b_r) \end{pmatrix},
\quad \sum_{j} p(b_j) = 1
\]
\[
H(X) = \sum_{i=1}^{q} p(a_i) \log \frac{1}{p(a_i)}, \qquad
H(Y) = \sum_{j=1}^{r} p(b_j) \log \frac{1}{p(b_j)}
\]
§1.1.1 Discrete memoryless source and entropy
Proof:
Joint source:
\[
\begin{pmatrix} XY \\ P(xy) \end{pmatrix} =
\begin{pmatrix} a_1 b_1 & \cdots & a_i b_j & \cdots & a_q b_r \\ p(a_1 b_1) & \cdots & p(a_i b_j) & \cdots & p(a_q b_r) \end{pmatrix},
\qquad P_k = p(a_i b_j) = p(a_i)\, p(b_j)
\]
\[
\sum_{k=1}^{qr} P_k = \sum_{i=1}^{q} \sum_{j=1}^{r} p(a_i)\, p(b_j) = 1
\]
\[
H(XY) = \sum_{k=1}^{qr} P_k \log \frac{1}{P_k}
= \sum_{i=1}^{q} \sum_{j=1}^{r} p(a_i b_j) \log \frac{1}{p(a_i)\, p(b_j)}
\]
\[
= \sum_{i}\sum_{j} p(a_i)\, p(b_j) \log \frac{1}{p(a_i)}
+ \sum_{i}\sum_{j} p(a_i)\, p(b_j) \log \frac{1}{p(b_j)}
= H(X) + H(Y)
\]
§1.1.1 Discrete memoryless source and entropy
Theorem 1.2 The entropy function H(p1, p2, …, pr) is a concave (∩-convex) function of the probability vector (p1, p2, …, pr).
6) Convex properties
4. The properties of entropy
§1.1.1 Discrete memoryless source and entropy
Example 1.1.5 (continued): entropy of the BS
\[
H(X) = H(p) = -p \log p - (1-p) \log (1-p)
\]
[Figure: H(p) versus p; H(p) rises from 0 at p = 0 to its maximum of 1 bit at p = 1/2 and falls back to 0 at p = 1.]
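A minimal numeric sketch of the binary entropy function (base-2 logarithms assumed), tabulating H(p) and checking the concavity stated in Theorem 1.2 at one illustrative pair of points:

```python
from math import log2

def h(p):
    """Binary entropy H(p) = -p log2 p - (1-p) log2 (1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

for i in range(11):
    p = i / 10
    print(f"p = {p:.1f}  H(p) = {h(p):.3f}")   # maximum of 1 bit at p = 0.5

# concavity: H lies above the chord between any two points
p1, p2 = 0.1, 0.6
assert h((p1 + p2) / 2) >= (h(p1) + h(p2)) / 2
```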
5. conditional entropy
X, Y are a pair of random variables, if (X,Y)~p(x,y)
Then the conditional entropy of X , given Y is defined by
Definition:
\[
H(X|Y) = E\!\left[\log \frac{1}{p(x|y)}\right]
= \sum_{X,Y} p(x,y) \log \frac{1}{p(x|y)}
\]
§1.1.1 Discrete memoryless source and entropy
Analysis:
\[
H(X | Y = y) = \sum_{X} p(x|y) \log \frac{1}{p(x|y)}
\]
\[
H(X|Y) = \sum_{Y} p(y)\, H(X | Y = y)
= \sum_{Y} \sum_{X} p(y)\, p(x|y) \log \frac{1}{p(x|y)}
= \sum_{X,Y} p(x,y) \log \frac{1}{p(x|y)}
\]
§1.1.1 Discrete memoryless source and entropy
Example 1.1.7
A binary source X with pX(0) = 2/3, pX(1) = 1/3 is observed through a channel with output alphabet Y = {0, ?, 1} and transition probabilities
p(0|0) = 3/4, p(?|0) = 1/4, p(1|1) = 1/2, p(?|1) = 1/2.
Find H(X), H(X|Y=0), H(X|Y=1), H(X|Y=?) and H(X|Y).

5. conditional entropy
H(X) = H(2/3, 1/3) = 0.9183 bit/sig
H(X|Y=0) = 0
H(X|Y=1) = 0
H(X|Y=?) = H(1/2, 1/2) = 1 bit/sig
H(X|Y) = 1/3 bit/sig
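The same numbers can be checked with a short sketch (base-2 logarithms assumed; the dictionaries are our own encoding of the channel of Example 1.1.7):

```python
from math import log2

# Example 1.1.7: p(x, y) = p(x) * p(y|x)
p_x = {0: 2/3, 1: 1/3}
p_y_given_x = {0: {'0': 3/4, '?': 1/4}, 1: {'1': 1/2, '?': 1/2}}

p_xy = {(x, y): p_x[x] * p for x, row in p_y_given_x.items() for y, p in row.items()}
p_y = {}
for (x, y), p in p_xy.items():
    p_y[y] = p_y.get(y, 0.0) + p

# H(X|Y) = sum_{x,y} p(x,y) log2 [ p(y) / p(x,y) ]
h_x_given_y = sum(p * log2(p_y[y] / p) for (x, y), p in p_xy.items() if p > 0)
h_x = -sum(p * log2(p) for p in p_x.values())

print(round(h_x, 4))          # 0.9183 bit/sig
print(round(h_x_given_y, 4))  # 0.3333 bit/sig
```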
§1.1.1 Discrete memoryless source and entropy
5. conditional entropy
§1.1.1 Discrete memoryless source and entropy
Theorem 1.3 (conditioning reduces entropy)
\[
H(X|Y) \le H(X)
\]
with equality iff X and Y are independent.
Proof:
Review
Keywords: Measure of information
self information
entropy
properties of entropy
conditional entropy
Homework
1. P44: T1.1
2. P44: T1.4
3. P44: T1.6
4. Let X be a random variable taking on a finite number of values. What is the relationship between H(X) and H(Y) if (1) Y = 2X? (2) Y = cos X?
Homework
5. Let X be an ensemble of M points a_1, …, a_M, and let P_X(a_M) = λ. Prove that
\[
H(X) = \lambda \log \frac{1}{\lambda} + (1-\lambda) \log \frac{1}{1-\lambda} + (1-\lambda) H(Y)
\]
where Y is an ensemble of M − 1 points a_1, …, a_{M-1} with probabilities P_Y(a_j) = P_X(a_j)/(1−λ), 1 ≤ j ≤ M − 1. Prove also that
\[
H(X) \le \lambda \log \frac{1}{\lambda} + (1-\lambda) \log \frac{1}{1-\lambda} + (1-\lambda) \log (M-1)
\]
and determine the condition for equality.
Homework
6. Given a chessboard with 8×8 = 64 squares, a chessman is put randomly in a square and we guess its location. Find the uncertainty of the result.
If we mark every square by its row and column number and already know the row number of the chessman, what is the remaining uncertainty?
Coin flip. A fair coin is flipped until the first head occurs. Let X denote the number of flips required. Find the entropy H(X) in bits.
Thinking:
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} 1 & 2 & 3 & \cdots & n & \cdots \\ \tfrac{1}{2} & \tfrac{1}{2^2} & \tfrac{1}{2^3} & \cdots & \tfrac{1}{2^n} & \cdots \end{pmatrix},
\qquad
\sum_{n=1}^{\infty} n r^{n-1} = \frac{1}{(1-r)^2} \quad (0 < r < 1)
\]
§1 Entropy and mutual information
§1.1 Discrete random variables
§1.1.1 Discrete memoryless source and entropy
§1.1.2 Discrete memoryless channel and mutual information
§1.2 Discrete random vectors
Channel model:
input x ∈ X, X = {a_1, a_2, …, a_r}  →  [channel p(y|x)]  →  output y ∈ Y, Y = {b_1, b_2, …, b_s}
\[
p(y|x) = p(b_j | a_i), \qquad \sum_{j=1}^{s} p(b_j | a_i) = 1
\]
§1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel)
The model of DMC
[Figure: DMC with r input symbols 0, 1, …, r−1 and s output symbols 0, 1, …, s−1, connected by transition probabilities p(y|x).]
r input symbols, s output symbols
representation of DMC: x → p(y|x) → y
\[
p(y|x) \ge 0 \ \text{for all } x, y; \qquad \sum_{y} p(y|x) = 1 \ \text{for all } x
\]
§1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel)
transition probabilities
graph
representation of DMC
§1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel)
matrix
\[
P =
\begin{pmatrix}
p(b_1|a_1) & p(b_2|a_1) & \cdots & p(b_s|a_1) \\
p(b_1|a_2) & p(b_2|a_2) & \cdots & p(b_s|a_2) \\
\vdots & \vdots & & \vdots \\
p(b_1|a_r) & p(b_2|a_r) & \cdots & p(b_s|a_r)
\end{pmatrix}
=
\begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1s} \\
p_{21} & p_{22} & \cdots & p_{2s} \\
\vdots & \vdots & & \vdots \\
p_{r1} & p_{r2} & \cdots & p_{rs}
\end{pmatrix},
\qquad \sum_{j=1}^{s} p(b_j | a_i) = 1
\]
transition probability matrix
representation of DMC
§1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel)
formula
\[
p(y|x) = p(b_j | a_i)
\]
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 & \cdots & a_r \\ p(a_1) & p(a_2) & \cdots & p(a_r) \end{pmatrix},
\quad \sum_{i=1}^{r} p(a_i) = 1;
\qquad
\begin{pmatrix} Y \\ P(y) \end{pmatrix} =
\begin{pmatrix} b_1 & b_2 & \cdots & b_s \\ p(b_1) & p(b_2) & \cdots & p(b_s) \end{pmatrix},
\quad \sum_{j=1}^{s} p(b_j) = 1
\]
\[
\sum_{j=1}^{s} p(b_j | a_i) = 1
\]
Example 1.1.8: BSC (Binary Symmetric Channel)
r = s = 2
\[
P = \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix},
\qquad p(0|0) = p(1|1) = 1-p, \quad p(0|1) = p(1|0) = p
\]
[Figure: BSC transition graph; 0→0 and 1→1 with probability 1−p, 0→1 and 1→0 with probability p.]
§1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel)
§1.1.2 Discrete memoryless channel and mutual information
Example 1.1.9: BEC (Binary Erasure Channel)
1. DMC (Discrete Memoryless Channel)
[Illustration: some transmitted bits are received as erasures "?".]
r = 2, s = 3
p(0|0) = p, p(?|0) = 1-p
p(1|1) = q, p(?|1) = 1-q
\[
P = \begin{pmatrix} p & 1-p & 0 \\ 0 & 1-q & q \end{pmatrix}
\qquad \text{(outputs ordered } 0, ?, 1\text{)}
\]
[Figure: BEC transition graph; 0→0 with probability p, 0→? with probability 1−p, 1→1 with probability q, 1→? with probability 1−q.]
§1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel)
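A minimal sketch of the matrix representation (variable names are our own; the BEC of Example 1.1.9 is used with illustrative values p = 0.9 and q = 0.8 and a uniform input): every row of P must sum to 1, and the output distribution follows from P(y) = Σ_x P(x) p(y|x).

```python
# BEC of Example 1.1.9 with illustrative values p = 0.9, q = 0.8.
# Rows: inputs 0, 1.  Columns: outputs ordered (0, ?, 1).
p, q = 0.9, 0.8
P = [
    [p, 1 - p, 0.0],   # transitions from input 0
    [0.0, 1 - q, q],   # transitions from input 1
]

# every row of a transition probability matrix sums to 1
assert all(abs(sum(row) - 1.0) < 1e-12 for row in P)

# output distribution P(y) = sum_x P(x) p(y|x), here for a uniform input
p_x = [0.5, 0.5]
p_y = [sum(p_x[i] * P[i][j] for i in range(2)) for j in range(3)]
print(p_y)  # [0.45, 0.15, 0.4]
```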
2. average mutual information
definition
I(X;Y) = H(X) – H(X|Y)
§1.1.2 Discrete memoryless channel and mutual information
[Channel: input x ∈ X, X = {a_1, a_2, …, a_r}; output y ∈ Y, Y = {b_1, b_2, …, b_s}; transition probabilities p(y|x) = p(b_j | a_i), with Σ_{j=1}^{s} p(b_j | a_i) = 1.]
H(X): entropy.  H(X|Y): equivocation.  I(X;Y): average mutual information.
The reduction in uncertainty about X conveyed by the observations Y;
The information about X from Y.
2. average mutual information
definition
\[
I(X;Y) = H(X) - H(X|Y)
= \sum_{X} p(x) \log \frac{1}{p(x)} - \sum_{XY} p(x,y) \log \frac{1}{p(x|y)}
\]
\[
= \sum_{XY} p(x,y) \log \frac{p(x|y)}{p(x)}
= \sum_{XY} p(x,y) \log \frac{p(x,y)}{p(x)\, p(y)}
= \sum_{XY} p(x,y) \log \frac{p(y|x)}{p(y)}
\]
2. average mutual information
definition
§1.1.2 Discrete memoryless channel and mutual information
I(X;Y) and I(x;y)
mutual information:
\[
I(x;y) = \log \frac{P(x|y)}{P(x)}, \qquad I(X;Y) = E_{XY}[I(x;y)]
\]
I(X;Y) and H(X)
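A minimal sketch showing that the two forms of the definition agree numerically (base-2 logarithms; the joint distribution used here is an illustrative one of our own, not from the slides):

```python
from math import log2

def marginals(p_xy):
    """Marginal distributions of X and Y from a joint distribution {(x, y): prob}."""
    p_x, p_y = {}, {}
    for (x, y), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y

def mutual_information(p_xy):
    """I(X;Y) = sum_{x,y} p(x,y) log2 [ p(x,y) / (p(x) p(y)) ]."""
    p_x, p_y = marginals(p_xy)
    return sum(p * log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items() if p > 0)

def entropy(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def cond_entropy_x_given_y(p_xy):
    """H(X|Y) = sum_{x,y} p(x,y) log2 [ p(y) / p(x,y) ]."""
    _, p_y = marginals(p_xy)
    return sum(p * log2(p_y[y] / p) for (x, y), p in p_xy.items() if p > 0)

# illustrative joint distribution (our own, not from the slides)
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p_x, _ = marginals(p_xy)

i_direct = mutual_information(p_xy)                    # sum form
i_diff = entropy(p_x) - cond_entropy_x_given_y(p_xy)   # H(X) - H(X|Y)
print(round(i_direct, 4), round(i_diff, 4))            # both give the same value
```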
properties
1) Non-negativity of average mutual information
Theorem 1.4 For any discrete random variables X and Y, I(X;Y) ≥ 0. Moreover, I(X;Y) = 0 iff X and Y are independent.
§1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information
Proof:(Theorem 1.3 in textbook)
We do not expect to be misled, on average, by observing the output of the channel.
properties
§1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information
A cryptosystem
[Figure: source S → encrypt (Key) → channel → decrypt → D; a listener-in observes Y′ on the channel between X and Y ("total loss" for the listener-in).]
message: arrive at four
ciphertext: duulyh dw irxu
I(X;Y) = I(Y;X)
I(X;Y) = H(Y) – H(Y|X)
I(X;Y) = H(X) – H(X|Y)
I(X;Y) = H(X) + H(Y) – H(XY)
Joint entropy
§1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information
3) relationship between entropy and average mutual information
2) symmetry
[Mnemonic Venn diagram: two overlapping circles H(X) and H(Y); the overlap is I(X;Y), the remaining parts are H(X|Y) and H(Y|X), and the union is H(XY).]
properties
§1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information
Recognising channels
[Figure: three channel graphs: (a) inputs a_1, …, a_r to outputs b_1, …, b_r; (b) inputs a_1, a_2 to outputs b_1, …, b_5; (c) inputs a_1, a_2, a_3 to outputs b_1, b_2.]
properties
4) Convex property
§1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information
I(X;Y) = f[P(x), P(y|x)]
\[
I(X;Y) = \sum_{X,Y} P(xy) \log \frac{P(x|y)}{P(x)},
\qquad
P(y) = \sum_{X} P(xy) = \sum_{X} P(y|x) P(x),
\qquad
P(xy) = P(y|x) P(x)
\]
so I(X;Y) is determined entirely by the input probabilities P(x) and the transition probabilities P(y|x).
properties
4) Convex properties
Theorem 1.5 I(X;Y) is a concave (∩-convex) function of the input probabilities P(x).
§1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information
I(X;Y) = f[P(x), P(y|x)]
Theorem 1.6 I(X;Y) is a convex (∪-convex) function of the transition probabilities P(y|x).
(Theorems 1.6 and 1.7 in textbook)
properties
Example 1.1.10 analyse the I(X;Y) of BSC
§1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information
source (write P(X = 0) = ω):
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} 0 & 1 \\ \omega & 1-\omega \end{pmatrix}
\]
channel: BSC with crossover probability p (0→0 and 1→1 with probability 1−p; 0→1 and 1→0 with probability p).
\[
I(X;Y) = \sum_{X,Y} P(xy) \log \frac{P(x|y)}{P(x)}
= \sum_{X,Y} P(xy) \log \frac{P(y|x)}{P(y)}
= H(Y) - H(Y|X)
\]
Example 1.1.10 analyse the I(X;Y) of BSC
§1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information
\[
H(Y|X) = \sum_{X,Y} p(xy) \log \frac{1}{p(y|x)}
= \bar{p} \log \frac{1}{\bar{p}} + p \log \frac{1}{p} = H(p), \qquad \bar{p} = 1 - p
\]
\[
H(Y) = H(\omega + p - 2\omega p)
\]
\[
I(X;Y) = H(\omega + p - 2\omega p) - H(p)
\]
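A numeric sketch checking the closed form against a direct computation from the joint distribution (ω is the symbol used above for P(X = 0); the test values ω = 0.3, p = 0.1 are illustrative):

```python
from math import log2

def h(p):
    """Binary entropy function (bits)."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_mutual_information(w, p):
    """I(X;Y) for a BSC computed directly from the joint distribution.
    w = P(X = 0), p = crossover probability."""
    p_xy = {(0, 0): w * (1 - p), (0, 1): w * p,
            (1, 0): (1 - w) * p, (1, 1): (1 - w) * (1 - p)}
    p_x = {0: w, 1: 1 - w}
    p_y = {0: p_xy[(0, 0)] + p_xy[(1, 0)], 1: p_xy[(0, 1)] + p_xy[(1, 1)]}
    return sum(q * log2(q / (p_x[x] * p_y[y])) for (x, y), q in p_xy.items() if q > 0)

w, p = 0.3, 0.1   # illustrative values
direct = bsc_mutual_information(w, p)
closed_form = h(w + p - 2 * w * p) - h(p)   # H(Y) - H(Y|X)
print(round(direct, 6), round(closed_form, 6))  # the two agree
```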
Review
Keywords: Channel and its information measures
channel model
equivocation
average mutual information
mutual information
properties of average mutual information
Thinking
Is I(X;Y) = 0 equivalent to Cov(X, Y) = 0?
Compare I(X;Y) with H(X) and H(Y).
§1.1.2 Discrete memoryless channel and mutual information
§1.1.2 Discrete memoryless channel and mutual information
Example 1.1.11
Let the source have alphabet A = {0, 1} with p0 = p1 = 0.5. Let encoder C have alphabet B = {0, 1, …, 7} and let the elements of B have binary representation b = (b_1 b_2 b_3)^T. The encoder is shown below. Find the entropy of the coded output, and find the output sequence if the input sequence is a(t) = {101001011000001100111011} and the initial contents of the registers are b(t = 0) = 5 = (101).
[Figure: the encoder is a three-stage shift register of D flip-flops b0, b1, b2 driven by a(t), shown together with the state-transition diagram between Y_t ∈ {0, …, 7} and Y_{t+1} ∈ {0, …, 7}.]

§1.1.2 Discrete memoryless channel and mutual information
a(t) = {101001011000001100111011}
b = {001242425124366675013666}
Homework
1. P45: T1.10
2. P46: T1.19 (except c)
3. Let the DMS
\[
\begin{pmatrix} X \\ p(x) \end{pmatrix} =
\begin{pmatrix} 0 & 1 \\ 0.6 & 0.4 \end{pmatrix}
\]
convey messages through the channel
\[
P = \begin{pmatrix} p(0|0) & p(1|0) \\ p(0|1) & p(1|1) \end{pmatrix}
= \begin{pmatrix} 5/6 & 1/6 \\ 3/4 & 1/4 \end{pmatrix}
\]
Calculate: (1) H(X) and H(Y); (2) the mutual information of xi and yj (i, j = 1, 2); (3) the equivocation H(X|Y) and the average mutual information.
Homework
4. Suppose that I(X;Y) = 0. Does this imply that I(X;Z) = I(X;Z|Y)?
5. In a joint ensemble XY, the mutual information I(x;y) is a random variable. In this problem we are concerned with the variance of that random variable, VAR[I(x;y)].
(1) Prove that VAR[I(x;y)] = 0 iff there is a constant α such that, for all x, y with P(xy) > 0, P(xy) = αP(x)P(y);
(2) Express I(X;Y) in terms of α and interpret the special case α = 1. (continued)
Homework
5. (3) For each of the channels in the figure below, find a probability assignment P(x) such that I(X;Y) > 0 and VAR[I(x;y)] = 0. Calculate I(X;Y).
[Figure: two channels. (a) Inputs a_1, a_2, a_3 and outputs b_1, b_2; each input goes to a single output with probability 1. (b) Inputs a_1, a_2, a_3 and outputs b_1, b_2, b_3; each input goes to two of the outputs with probability 1/2 each.]
§1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.2.1 Extended source and joint entropy
§1.2.2 Extended channel and mutual information
§1.2.1 Extended source and joint entropy
1. Extended source
Source model
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 & \cdots & a_q \\ p_1 & p_2 & \cdots & p_q \end{pmatrix},
\qquad \sum_{i=1}^{q} p_i = 1
\]
N-times extended source:
\[
X^N = (X_1 X_2 \cdots X_N), \qquad X_i \in \{a_1, a_2, \ldots, a_q\}
\]
\[
\begin{pmatrix} X^N \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 a_1 \cdots a_1 & \cdots & a_q a_q \cdots a_q \\ p(a_1 a_1 \cdots a_1) & \cdots & p(a_q a_q \cdots a_q) \end{pmatrix},
\qquad \sum_{i=1}^{q^N} p_i = 1
\]
Example 1.2.1
2. Joint entropy
Definition:
The joint entropy H(XY) of a pair of discrete random variables (X, Y) with a joint distribution p(x, y) is defined as
\[
H(XY) = -\sum_{x \in X} \sum_{y \in Y} p(x,y) \log p(x,y)
\]
which can also be expressed as
\[
H(XY) = -E[\log p(x,y)]
\]
§1.2.1 Extended source and joint entropy
2. Joint entropy Extended DMS
\[
H(X^N) = -\sum_{i=1}^{q^N} p_i \log p_i
= -\sum_{i_1=1}^{q} \sum_{i_2=1}^{q} \cdots \sum_{i_N=1}^{q}
p(a_{i_1} a_{i_2} \cdots a_{i_N}) \log p(a_{i_1} a_{i_2} \cdots a_{i_N})
\]
\[
= -\sum_{i_1} \cdots \sum_{i_N} p(a_{i_1}) p(a_{i_2}) \cdots p(a_{i_N})
\big[ \log p(a_{i_1}) + \log p(a_{i_2}) + \cdots + \log p(a_{i_N}) \big]
\]
\[
= H(X) + H(X) + \cdots + H(X) = N\,H(X)
\]
§1.2.1 Extended source and joint entropy
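A small numeric check of H(X^N) = N·H(X) for a memoryless source (the three-symbol probabilities and N = 3 are illustrative choices of our own):

```python
from math import log2, prod
from itertools import product

def entropy(probs):
    """H = -sum p log2 p (bit/sig)."""
    return -sum(q * log2(q) for q in probs if q > 0)

# a memoryless source with q = 3 symbols (illustrative probabilities) and N = 3
p = [0.5, 0.3, 0.2]
N = 3

# each of the q**N sequences has probability p(a_i1) * p(a_i2) * ... * p(a_iN)
p_seq = [prod(seq) for seq in product(p, repeat=N)]

print(round(entropy(p_seq), 6))  # H(X^N)
print(round(N * entropy(p), 6))  # N * H(X) -- equal for a memoryless source
```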
2. Joint entropy for a source with memory
§1.2.1 Extended source and joint entropy
1) Conditional entropy
\[
H(X_2 | X_1) = -\sum_{i=1}^{q} \sum_{j=1}^{q} p(a_i a_j) \log p(a_j | a_i)
\]
2) Joint entropy
\[
H(X_1 X_2) = -\sum_{i=1}^{q} \sum_{j=1}^{q} p(a_i a_j) \log p(a_i a_j)
\]
3) (Per symbol) entropy
\[
H_2(X) = \frac{1}{2} H(X_1 X_2) \quad \text{bit/sig}
\]
3. Properties of joint entropy
Theorem 1.7 (Chain rule): H(XY) = H(X) + H(Y|X)
Proof:
\[
H(XY) = -\sum_{x \in X} \sum_{y \in Y} p(x,y) \log p(x,y)
= -\sum_{x}\sum_{y} p(x,y) \log \big[ p(x)\, p(y|x) \big]
\]
\[
= -\sum_{x}\sum_{y} p(x)\, p(y|x) \log p(x) - \sum_{x}\sum_{y} p(x,y) \log p(y|x)
= -\sum_{x} p(x) \log p(x) - \sum_{x}\sum_{y} p(x,y) \log p(y|x)
\]
\[
= H(X) + H(Y|X)
\]
§1.2.1 Extended source and joint entropy
Example 1.2.3
Let X be a random variable whose probability space is
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} 0 & 1 & 2 \\ 1/2 & 1/3 & 1/6 \end{pmatrix}
\]
Its joint probability P(a_i a_j):

P(a_i a_j)     a_j = 0    a_j = 1    a_j = 2
a_i = 0         1/4        1/4        0
a_i = 1         1/4        1/24       1/24
a_i = 2         0          1/24       1/8

and its conditional probability P(a_j | a_i):

P(a_j | a_i)   a_j = 0    a_j = 1    a_j = 2
a_i = 0         1/2        1/2        0
a_i = 1         3/4        1/8        1/8
a_i = 2         0          1/4        3/4

H(X) = ?
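A sketch that recomputes H(X1), H(X2|X1) and H(X1X2) from the joint table of Example 1.2.3 and checks the chain rule of Theorem 1.7 (base-2 logarithms; variable names are our own):

```python
from math import log2
from fractions import Fraction as F

# Example 1.2.3: joint distribution P(a_i a_j) over {0, 1, 2} x {0, 1, 2}
P = {
    (0, 0): F(1, 4), (0, 1): F(1, 4),  (0, 2): F(0),
    (1, 0): F(1, 4), (1, 1): F(1, 24), (1, 2): F(1, 24),
    (2, 0): F(0),    (2, 1): F(1, 24), (2, 2): F(1, 8),
}

p1 = {i: sum(P[(i, j)] for j in range(3)) for i in range(3)}  # marginal of X1

def H(values):
    return -sum(float(p) * log2(float(p)) for p in values if p > 0)

H_X1 = H(p1.values())                                     # H(X1) = H(1/2, 1/3, 1/6)
H_X1X2 = H(P.values())                                    # joint entropy H(X1 X2)
H_X2_given_X1 = sum(float(p) * log2(float(p1[i] / p))     # H(X2|X1)
                    for (i, j), p in P.items() if p > 0)

print(round(H_X1, 4), round(H_X2_given_X1, 4), round(H_X1X2, 4))
# chain rule (Theorem 1.7): H(X1 X2) = H(X1) + H(X2|X1)
assert abs(H_X1X2 - (H_X1 + H_X2_given_X1)) < 1e-9
```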
Relationship:
\[
H(X_1 X_2) = H(X_1) + H(X_2 | X_1)
\]
\[
H(X_2) \ge H(X_2 | X_1), \qquad H(X_1 X_2) \le 2 H(X_1), \qquad H_2(X) \le H(X)
\]

§1.2.1 Extended source and joint entropy
3. Properties of joint entropy
General stationary source
\[
\begin{pmatrix} X \\ P(x) \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 & \cdots & a_q \\ p(a_1) & p(a_2) & \cdots & p(a_q) \end{pmatrix},
\qquad \sum_{i=1}^{q} p(a_i) = 1
\]
Let X1, X2, …, XN be dependent; the joint probability is
\[
P(X_1 X_2 \cdots X_N) = p(a_{i_1} a_{i_2} \cdots a_{i_N}),
\qquad i_1, i_2, \ldots, i_N \in \{1, 2, \ldots, q\}
\]
\[
P(x_1 x_2 \cdots x_N) = P(x_1)\, P(x_2 | x_1) \cdots P(x_N | x_1 x_2 \cdots x_{N-1})
\]
3. Properties of joint entropy
• Joint entropy
\[
H(X_1 X_2 \cdots X_N) = -\sum_{i_1 \cdots i_N} p(a_{i_1} a_{i_2} \cdots a_{i_N}) \log p(a_{i_1} a_{i_2} \cdots a_{i_N})
\]
§1.2.1 Extended source and joint entropy
Definition of entropies
3. Properties of joint entropy
• Conditional entropy
\[
H(X_N | X_1 X_2 \cdots X_{N-1}) = -\sum_{i_1 \cdots i_N} p(a_{i_1} \cdots a_{i_N}) \log p(a_{i_N} | a_{i_1} \cdots a_{i_{N-1}})
\]
• (Per symbol) entropy
\[
H_N(X) = \frac{1}{N} H(X_1 X_2 \cdots X_N)
\]
Theorem 1.8 (Chain rule for entropy):
Let X1, X2, …, Xn be drawn according to p(x1, x2, …, xn). Then
\[
H(X_1 X_2 \cdots X_n) = \sum_{i=1}^{n} H(X_i | X_1 X_2 \cdots X_{i-1})
\]
Proof (do it by yourself)
§1.2.1 Extended source and joint entropy
3. Properties of joint entropy
Relation of entropies. If H(X) < ∞, then for a stationary source:
\[
H(X_N | X_1 X_2 \cdots X_{N-1}) \le H(X_{N-1} | X_1 \cdots X_{N-2})
\]
\[
H_N(X) \ge H(X_N | X_1 \cdots X_{N-1}), \qquad H_N(X) \le H_{N-1}(X)
\]
Entropy rate:
\[
H_\infty = \lim_{N \to \infty} H_N(X) = \lim_{N \to \infty} H(X_N | X_1 X_2 \cdots X_{N-1})
\]
(This limit is the basis of data compression.)
Theorem 1.9 (Independence bound on entropy):
Let X1, X2, …, Xn be drawn according to p(x1, x2, …, xn). Then
\[
H(X_1 X_2 \cdots X_n) \le \sum_{i=1}^{n} H(X_i)
\]
with equality iff the Xi are independent.
§1.2.1 Extended source and joint entropy
(P37(corollary) in textbook)
3. Properties of joint entropy
§1.2.1 Extended source and joint entropy
Example 1.2.4
Suppose a memoryless source with A={0,1} having equal probabilities emits a sequence of six symbols. Following the sixth symbol, suppose a seventh symbol is transmitted which is the sum modulo 2 of the six previous symbols. What is the entropy of the seven-symbol sequence?
3. Properties of joint entropy
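For Example 1.2.4, a brute-force sketch (our own construction, base-2 logarithms) enumerates the 2^6 equally likely prefixes, appends the modulo-2 sum, and evaluates the entropy of the resulting seven-symbol sequence:

```python
from math import log2
from itertools import product

# Example 1.2.4: six equiprobable binary symbols followed by their modulo-2 sum.
# Each of the 2^6 prefixes is equally likely and determines the seventh symbol,
# so every admissible seven-symbol sequence has probability 1/64.
sequences = set()
for prefix in product((0, 1), repeat=6):
    sequences.add(prefix + (sum(prefix) % 2,))

p = 1 / 64
H_seq = -sum(p * log2(p) for _ in sequences)
print(len(sequences), H_seq)  # 64 sequences, 6.0 bits: the parity symbol adds no entropy
```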
§1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.2.1 Extended source and joint entropy
§1.2.2 Extended channel and mutual information
1. The model of extended channel
A general communication system:
[Figure: source (U1, U2, …, Uk) → encoder → X^N = (X1, X2, …, XN) → channel → Y^N = (Y1, Y2, …, YN) → decoder → (V1, V2, …, Vk).]
§1.2.2 Extended channel and mutual information
Extended channel
\[
X^N = (X_1 X_2 \cdots X_N), \qquad Y^N = (Y_1 Y_2 \cdots Y_N)
\]
\[
X_i \in \{a_1, a_2, \ldots, a_r\}, \qquad Y_j \in \{b_1, b_2, \ldots, b_s\}, \qquad P(y|x)
\]
For a DMC:
\[
P(y|x) = P(y_1 y_2 \cdots y_N | x_1 x_2 \cdots x_N) = \prod_{i=1}^{N} P(y_i | x_i)
\]
1. The model of extended channel
§1.2.2 Extended channel and mutual information
\[
P(Y^N | X^N) =
\begin{pmatrix}
p(b_1 \cdots b_1 | a_1 \cdots a_1) & \cdots & p(b_s \cdots b_s | a_1 \cdots a_1) \\
\vdots & & \vdots \\
p(b_1 \cdots b_1 | a_r \cdots a_r) & \cdots & p(b_s \cdots b_s | a_r \cdots a_r)
\end{pmatrix}
\]
an r^N × s^N matrix whose (k, h) entry is p(b_{h_1} b_{h_2} \cdots b_{h_N} | a_{k_1} a_{k_2} \cdots a_{k_N}).
2. Average mutual information
\[
I(X^N; Y^N) = H(X^N) - H(X^N | Y^N) = H(Y^N) - H(Y^N | X^N)
\]
\[
I(X^N; Y^N) = \sum_{x \in X^N} \sum_{y \in Y^N} P(xy) \log \frac{P(y|x)}{P(y)}
= \sum_{k=1}^{r^N} \sum_{h=1}^{s^N} P(\alpha_k \beta_h) \log \frac{p(\beta_h | \alpha_k)}{p(\beta_h)}
\]
where \alpha_k = (a_{k_1} \cdots a_{k_N}) and \beta_h = (b_{h_1} \cdots b_{h_N}).
§1.2.2 Extended channel and mutual information
Example 1.2.5
3. The properties
Theorem 1.11 If the components (X1, X2, …, XN) of X^N are independent, then
\[
I(X^N; Y^N) \ge \sum_{i=1}^{N} I(X_i; Y_i)
\]
§1.2.2 Extended channel and mutual information
(Theorem 1.8 in textbook)
3. The properties
Theorem 1.12 If X^N = (X1, X2, …, XN) and Y^N = (Y1, Y2, …, YN) are random vectors and the channel is memoryless, that is
\[
P(y_1, \ldots, y_N | x_1, \ldots, x_N) = \prod_{i=1}^{N} P(y_i | x_i)
\]
then
\[
I(X^N; Y^N) \le \sum_{i=1}^{N} I(X_i; Y_i)
\]
§1.2.2 Extended channel and mutual information
(Theorem 1.9 in textbook)
Example 1.2.6
Let X1, X2, …, X5 be independent identically distributed random variables with common entropy H. Also let T be a permutation of the set {1, 2, 3, 4, 5}, and let Yi = X_{T(i)}, where
T: (1, 2, 3, 4, 5) → (3, 2, 5, 1, 4).
Compare Σ_{i=1}^{5} I(X_i; Y_i) with I(X^5; Y^5).
§1.2.2 Extended channel and mutual information
Review
Keywords: Measures of information for random vectors
extended source, joint entropy
extended channel, stationary source
(per symbol) entropy, conditional entropy, entropy rate
Review
Conclusions:
chain rule for entropy
independence bound on entropy
conditioning reduces entropy
properties of I(X;Y)
Homework
1. P47: T1.23
2. P47: T1.24
3. Let X1, X2, …, X_{n-1} be i.i.d. random variables taking values in {0, 1}, with Pr{Xi = 1} = 1/2. Let Xn = 1 if Σ_{i=1}^{n-1} Xi is odd and Xn = 0 otherwise. Let n ≥ 3.
(1) Show that Xi and Xj are independent, for i ≠ j, i, j ∈ {1, 2, …, n};
(2) Find H(Xi Xj), for i ≠ j;
(3) Find H(X1 X2 … Xn); is this equal to nH(X1)?
4. Let X1, X2 be identically distributed random variables and let
\[
\rho = 1 - \frac{H(X_2 | X_1)}{H(X_1)}
\]
1) Show that 0 ≤ ρ ≤ 1.
2) When is ρ = 0?
3) When is ρ = 1?
Homework
5. Shuffles increase entropy. Argue that, for any distribution on shuffles T and any distribution on card positions X, H(TX) ≥ H(TX|T) if X and T are independent.
Homework
Thinking :