
§1 Entropy and mutual information

§1.1 Discrete random variables

§1.1.1 Discrete memoryless source and entropy

§1.1.2 Discrete memoryless channel and mutual information

§1.2 Discrete random vectors

§1.1.1 Discrete memoryless source and entropy

Example 1.1.1

Let X represent the outcome of a single roll of a fair die.

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \end{bmatrix}$$

1. DMS (Discrete Memoryless Source)

Probability space:

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & \cdots & a_q \\ P(a_1) & P(a_2) & \cdots & P(a_q) \end{bmatrix}, \qquad \sum_{i=1}^{q} P(a_i) = 1$$

2. self information

Example 1.1.2

§1.1.1 Discrete memoryless source and entropy

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ 0.5 & 0.5 \end{bmatrix} \;(\text{red, white}), \qquad \begin{bmatrix} Y \\ P(y) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ 0.25 & 0.25 & 0.25 & 0.25 \end{bmatrix} \;(\text{red, white, blue, black})$$

Analyse the uncertainty of drawing a red ball from X and from Y.

2. self information

I(a_i) = f[p(a_i)] should satisfy:

1) I(a_i) is a monotone decreasing function of p(a_i): if p(a_1) > p(a_2), then I(a_1) < I(a_2);
2) if p(a_i) = 1, then I(a_i) = 0;
3) if p(a_i) = 0, then I(a_i) → ∞;
4) if p(a_i a_j) = p(a_i) p(a_j), then I(a_i a_j) = I(a_i) + I(a_j).

§1.1.1 Discrete memoryless source and entropy

Self information:

$$I(a_i) = \log_r \frac{1}{p(a_i)} = -\log_r p(a_i)$$

[Figure: I(a_i) as a decreasing function of p(a_i) on (0, 1].]

This definition satisfies the requirements above:
if 1 ≥ p(a_i) > p(a_j) ≥ 0, then I(a_i) < I(a_j);
if p(a_i) → 0, then I(a_i) → ∞;
if p(a_i) = 1, then I(a_i) = 0;
I(ab) = I(a) + I(b) when a and b are statistically independent.

§1.1.1 Discrete memoryless source and entropy

Remark:

I(a_i) measures the uncertainty about the outcome a_i before observation, and equally the amount of information provided when a_i occurs.

Units: bit (base-2 logarithm), nat (base e), hartley (base 10).
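As a quick numerical companion (a minimal Python sketch added here, not part of the original notes; the function name is my own), the self information of drawing a red ball in Example 1.1.2 is 1 bit from X and 2 bits from Y:

```python
import math

def self_information(p, base=2):
    """Self information I(a) = -log(p(a)), in units set by the logarithm base."""
    return -math.log(p, base)

print(self_information(0.5))          # red ball from X: 1.0 bit
print(self_information(0.25))         # red ball from Y: 2.0 bits
print(self_information(0.5, math.e))  # the same event in nats: ~0.693
```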

3. Entropy

Definition: Suppose X is a discrete random variable whose range R = {a_1, a_2, …} is finite or countable, and let p(a_i) = P{X = a_i}. The entropy of X is defined by

$$H(X) = E[I(a_i)] = -\sum_{i=1}^{q} p(a_i)\log p(a_i)$$

Entropy is a measure of the average amount of information provided by an observation of X, and equivalently of the average uncertainty (or randomness) about X.

Entropy — the amount of "information" provided by an observation of X.

Example 1.1.3 A bag contains 100 balls, 80% red and the rest white. A ball is drawn at random. How much information does each drawing provide on average?

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ 0.8 & 0.2 \end{bmatrix}$$

In N independent drawings, about n_1 = N p(a_1) balls are red and n_2 = N p(a_2) are white, so the total information obtained is approximately n_1 I(a_1) + n_2 I(a_2). The average information per drawing is therefore

$$\bar I = \frac{n_1 I(a_1) + n_2 I(a_2)}{N} = p(a_1)I(a_1) + p(a_2)I(a_2) = -\sum_{i=1}^{2} p(a_i)\log p(a_i) = 0.722 \text{ bit/sig},$$

that is,

$$H(X) = H(0.8, 0.2) = 0.722 \text{ bit/sig}$$
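The value 0.722 bit/sig can be reproduced with a few lines of Python (a sketch; `entropy` is my own helper, not a function from the notes):

```python
import math

def entropy(probs, base=2):
    """H(X) = -sum p*log(p), with the convention 0*log(0) = 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(round(entropy([0.8, 0.2]), 3))   # 0.722 bit/sig, as in Example 1.1.3
```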

Entropy-the “uncertainty” or “randomness” about X

Example 1.1.4 Consider three binary sources:

$$\begin{bmatrix} X_1 \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ 0.5 & 0.5 \end{bmatrix}, \qquad \begin{bmatrix} X_2 \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ 0.7 & 0.3 \end{bmatrix}, \qquad \begin{bmatrix} X_3 \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ 0.99 & 0.01 \end{bmatrix}$$

$$H(X_1) = 1 \text{ bit/sig}, \qquad H(X_2) = 0.88 \text{ bit/sig}, \qquad H(X_3) = 0.08 \text{ bit/sig}$$

The closer the distribution is to uniform, the larger the average uncertainty.
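The same comparison can be reproduced with scipy.stats.entropy (a sketch that assumes SciPy is available; the probabilities are those of Example 1.1.4):

```python
from scipy.stats import entropy   # entropy(pk, base=...) = -sum(pk * log(pk))

for name, p in [("X1", [0.5, 0.5]), ("X2", [0.7, 0.3]), ("X3", [0.99, 0.01])]:
    print(name, round(entropy(p, base=2), 2))
# X1 1.0, X2 0.88, X3 0.08 -- the closer to uniform, the larger the entropy
```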

§1.1.1 Discrete memoryless source and entropy

3. Entropy

$$H(X) = E[I(a_i)] = -\sum_{i=1}^{q} p(a_i)\log p(a_i)$$

Note:
1) Units: bit/sig, nat/sig, hart/sig.
2) If p(a_i) = 0, the term p(a_i) log p(a_i)^{-1} is taken to be 0.
3) If R is infinite, H(X) may be +∞.

§1.1.1 Discrete memoryless source and entropy

Example 1.1.5 Entropy of a binary source (BS)

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ p & q \end{bmatrix}, \qquad q = 1 - p$$

$$H(X) = -p\log p - (1-p)\log(1-p)$$

3. Entropy: the entropy function

$$H(p_1, p_2, \ldots, p_q) = \sum_{i=1}^{q} p_i \log\frac{1}{p_i}$$

H(\mathbf{p}) is called the entropy function of the probability vector \mathbf{p} = (p_1, p_2, \ldots, p_q).

§1.1.1 Discrete memoryless source and entropy

4. The properties of entropy

Theorem 1.1 Let X assume values in R = {x_1, x_2, …, x_r}. Then

1) H(X) ≥ 0;
2) H(X) = 0 iff p_i = 1 for some i;
3) H(X) ≤ log r, with equality iff p_i = 1/r for all i — the basis of data compression.

§1.1.1 Discrete memoryless source and entropy

Proof:

(Theorem 1.1 in textbook)

Lemma: For x > 0, log x ≤ (x − 1) log e (equivalently, ln x ≤ x − 1), with equality iff x = 1.
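A short sketch of how the lemma gives property 3 (filled in here; the textbook proof may differ in detail):

$$H(X) - \log r = \sum_{i:\,p_i>0} p_i\log\frac{1}{r p_i} \le \log e\sum_{i:\,p_i>0} p_i\left(\frac{1}{r p_i} - 1\right) = \log e\left(\sum_{i:\,p_i>0}\frac{1}{r} - 1\right) \le 0,$$

with equality iff r p_i = 1 for every i, i.e. p_i = 1/r for all i.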

4) Symmetry: H(p_{i_1}, p_{i_2}, \ldots, p_{i_r}) = H(p_1, p_2, \ldots, p_r) for any permutation (i_1, i_2, \ldots, i_r) of (1, 2, \ldots, r).

Example 1.1.6 Let X, Y, Z be discrete random variables with

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 \\ 1/3 & 1/6 & 1/2 \end{bmatrix}, \qquad \begin{bmatrix} Y \\ P(y) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 \\ 1/6 & 1/2 & 1/3 \end{bmatrix}, \qquad \begin{bmatrix} Z \\ P(z) \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 \\ 1/3 & 1/2 & 1/6 \end{bmatrix}$$

All three have the same entropy: H = H(1/3, 1/6, 1/2) ≈ 1.46 bit/sig.

§1.1.1 Discrete memoryless source and entropy

5) If X, Y are independent, then H(XY) = H(X) + H(Y).

Proof: Let

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & \cdots & a_q \\ p(a_1) & p(a_2) & \cdots & p(a_q) \end{bmatrix}, \; \sum_{i} p(a_i) = 1, \qquad \begin{bmatrix} Y \\ P(y) \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & \cdots & b_r \\ p(b_1) & p(b_2) & \cdots & p(b_r) \end{bmatrix}, \; \sum_{j} p(b_j) = 1,$$

so that

$$H(X) = \sum_{i=1}^{q} p(a_i)\log\frac{1}{p(a_i)}, \qquad H(Y) = \sum_{j=1}^{r} p(b_j)\log\frac{1}{p(b_j)}.$$

Joint source:

$$\begin{bmatrix} XY \\ P(xy) \end{bmatrix} = \begin{bmatrix} a_1b_1 & \cdots & a_ib_j & \cdots & a_qb_r \\ P_1 & \cdots & P_k & \cdots & P_{qr} \end{bmatrix}, \qquad P_k = p(a_ib_j) = p(a_i)p(b_j), \quad \sum_{k=1}^{qr} P_k = \sum_{i=1}^{q}\sum_{j=1}^{r} p(a_i)p(b_j) = 1.$$

Then

$$H(XY) = \sum_{k=1}^{qr} P_k\log\frac{1}{P_k} = \sum_{i=1}^{q}\sum_{j=1}^{r} p(a_i)p(b_j)\log\frac{1}{p(a_i)p(b_j)} = \sum_{i=1}^{q}\sum_{j=1}^{r} p(a_i)p(b_j)\left[\log\frac{1}{p(a_i)} + \log\frac{1}{p(b_j)}\right] = H(X) + H(Y).$$

§1.1.1 Discrete memoryless source and entropy

Theorem 1.2 The entropy function H(p_1, p_2, …, p_r) is a concave (convex-∩) function of the probability vector (p_1, p_2, …, p_r).

6) Convex properties

4. The properties of entropy

§1.1.1 Discrete memoryless source and entropy

Example 1.1.5 (continued) Entropy of the binary source:

$$H(X) = H(p) = -p\log p - (1-p)\log(1-p)$$

[Figure: the binary entropy function H(p) versus p, equal to 0 at p = 0 and p = 1 and reaching its maximum value 1 at p = 1/2.]
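A small Python sketch of the binary entropy function (the helper name is mine) reproduces the shape of the curve — zero at the endpoints and a maximum of 1 bit at p = 1/2:

```python
import math

def h_binary(p):
    """Binary entropy H(p) = -p*log2(p) - (1-p)*log2(1-p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(p, round(h_binary(p), 3))
# 0.0 0.0, 0.1 0.469, 0.25 0.811, 0.5 1.0, 0.75 0.811, 1.0 0.0
```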

5. Conditional entropy

Definition: Let X, Y be a pair of random variables with (X, Y) ~ p(x, y). The conditional entropy of X given Y is defined by

$$H(X|Y) = E\left[\log\frac{1}{p(x|y)}\right] = \sum_{x,y} p(x,y)\log\frac{1}{p(x|y)}$$

§1.1.1 Discrete memoryless source and entropy

Analysis: for a particular value y,

$$H(X|Y=y) = \sum_{x} p(x|y)\log\frac{1}{p(x|y)}$$

Averaging over Y,

$$H(X|Y) = \sum_{y} p(y)\,H(X|Y=y) = \sum_{y} p(y)\sum_{x} p(x|y)\log\frac{1}{p(x|y)} = \sum_{x,y} p(x,y)\log\frac{1}{p(x|y)}$$

§1.1.1 Discrete memoryless source and entropy

Example 1.1.7

[Figure: binary input X ∈ {0, 1}, ternary output Y ∈ {0, ?, 1}; transitions 0 → 0 with probability 3/4, 0 → ? with probability 1/4, 1 → 1 with probability 1/2, 1 → ? with probability 1/2.]

p_X(0) = 2/3, p_X(1) = 1/3.

Find H(X), H(X|Y=0), H(X|Y=?), and H(X|Y).

H(X) = H(2/3, 1/3) = 0.9183 bit/sig
H(X|Y=0) = 0
H(X|Y=1) = 0
H(X|Y=?) = H(1/2, 1/2) = 1 bit/sig
H(X|Y) = 1/3 bit/sig
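These values follow directly from the joint distribution p(x, y) = p_X(x) p(y|x); a Python sketch (variable names are mine) reproduces H(X|Y) = 1/3 bit/sig:

```python
import math

# Example 1.1.7: p_X = (2/3, 1/3); channel 0 -> {0: 3/4, ?: 1/4}, 1 -> {1: 1/2, ?: 1/2}
p_x = {0: 2/3, 1: 1/3}
p_y_given_x = {0: {'0': 3/4, '?': 1/4}, 1: {'1': 1/2, '?': 1/2}}

p_xy = {(x, y): p_x[x] * p for x, row in p_y_given_x.items() for y, p in row.items()}
p_y = {}
for (x, y), p in p_xy.items():
    p_y[y] = p_y.get(y, 0.0) + p

# H(X|Y) = sum_{x,y} p(x,y) * log2( p(y) / p(x,y) ), since p(x|y) = p(x,y)/p(y)
h_x_given_y = sum(p * math.log2(p_y[y] / p) for (x, y), p in p_xy.items() if p > 0)
print(round(h_x_given_y, 4))   # 0.3333 bit/sig
```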


Theorem 1.3 (conditioning reduces entropy)

$$H(X|Y) \le H(X),$$

with equality iff X and Y are independent.

Proof:
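One way to fill in the argument, using the same lemma as in Theorem 1.1 (a sketch added here; the textbook proof may differ in detail): summing over pairs with p(x, y) > 0,

$$H(X|Y) - H(X) = \sum_{x,y} p(x,y)\log\frac{p(x)}{p(x|y)} \le \log e\sum_{x,y} p(x,y)\left(\frac{p(x)}{p(x|y)} - 1\right) = \log e\left(\sum_{x,y} p(x)p(y) - 1\right) \le 0,$$

with equality iff p(x|y) = p(x) whenever p(x, y) > 0, i.e. iff X and Y are independent.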

Review

Keywords: Measure of information

self information

entropy

properties of entropy

conditional entropy

Homework

1. P44: T1.1
2. P44: T1.4
3. P44: T1.6
4. Let X be a random variable taking on a finite number of values. What is the relationship between H(X) and H(Y) if (1) Y = 2X? (2) Y = cos X?

Homework

5. Let X be an ensemble of M points a_1, \ldots, a_M, and let P(a_M) = \alpha. Prove that

$$H(X) = \alpha\log\frac{1}{\alpha} + (1-\alpha)\log\frac{1}{1-\alpha} + (1-\alpha)H(Y),$$

where Y is an ensemble of M − 1 points a_1, \ldots, a_{M-1} with probabilities P_Y(a_j) = P(a_j)/(1-\alpha), j ≤ M − 1. Prove further that

$$H(X) \le \alpha\log\frac{1}{\alpha} + (1-\alpha)\log\frac{1}{1-\alpha} + (1-\alpha)\log(M-1),$$

and determine the condition for equality.

Homework

6. A chessboard has 8×8 = 64 squares, and a chessman is placed on a square at random. We want to guess the location of the chessman. Find the uncertainty of the result.

If every square is identified by its row and column number and the row of the chessman is already known, what is the remaining uncertainty?

Coin flip. A fair coin is flipped until the first head occurs. Let X denote the number of flips required. Find the entropy H(X) in bits.

Thinking: the probability space of X is

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 & \cdots & n & \cdots \\ 1/2 & 1/2^2 & 1/2^3 & \cdots & 1/2^n & \cdots \end{bmatrix}$$

and the identity

$$\sum_{n=1}^{\infty} n\,r^n = \frac{r}{(1-r)^2}, \qquad |r| < 1,$$

can be used to evaluate the entropy.
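A quick numeric check of the hinted series identity (a Python sketch; r = 1/2 is the value relevant to the coin-flip problem):

```python
# partial sums of sum_{n>=1} n * r**n converge to r / (1 - r)**2
r = 0.5
partial = sum(n * r**n for n in range(1, 60))
print(round(partial, 6), r / (1 - r) ** 2)   # 2.0  2.0
```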

§1 Entropy and mutual information

§1.1 Discrete random variables

§1.1.2 Discrete memoryless channel and mutual information

The channel takes an input x ∈ X = {a_1, a_2, \ldots, a_r} and produces an output y ∈ Y = {b_1, b_2, \ldots, b_s} according to the transition probabilities p(y|x), i.e. p(b_j|a_i), with

$$\sum_{j=1}^{s} p(b_j|a_i) = 1.$$

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

The model of DMC

[Figure: DMC with input symbols 0, 1, …, r−1 and output symbols 0, 1, …, s−1 connected through p(y|x).]

r input symbols, s output symbols.

Representation of the DMC by its transition probabilities p(y|x):

p(y|x) ≥ 0 for all x, y;   $\sum_{y} p(y|x) = 1$ for all x.

§1.1.2 Discrete memoryless channel and mutual information

A DMC can be represented by a graph of its transition probabilities, or by the transition probability matrix:

$$P = \begin{bmatrix} p(b_1|a_1) & p(b_2|a_1) & \cdots & p(b_s|a_1) \\ p(b_1|a_2) & p(b_2|a_2) & \cdots & p(b_s|a_2) \\ \vdots & \vdots & & \vdots \\ p(b_1|a_r) & p(b_2|a_r) & \cdots & p(b_s|a_r) \end{bmatrix} = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1s} \\ p_{21} & p_{22} & \cdots & p_{2s} \\ \vdots & \vdots & & \vdots \\ p_{r1} & p_{r2} & \cdots & p_{rs} \end{bmatrix}, \qquad \sum_{j=1}^{s} p(b_j|a_i) = 1$$

(the transition probability matrix)

Representation of the DMC by formula: p(y|x) = p(b_j|a_i), together with the input and output probability spaces

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & \cdots & a_r \\ p(a_1) & p(a_2) & \cdots & p(a_r) \end{bmatrix}, \; \sum_{i=1}^{r} p(a_i) = 1, \qquad \begin{bmatrix} Y \\ P(y) \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & \cdots & b_s \\ p(b_1) & p(b_2) & \cdots & p(b_s) \end{bmatrix}, \; \sum_{j=1}^{s} p(b_j) = 1,$$

and

$$\sum_{j=1}^{s} p(b_j|a_i) = 1 \quad \text{for every } i.$$

Example 1.1.8: BSC (Binary Symmetric Channel)

r = s = 2;  p(0|0) = p(1|1) = 1 − p,  p(0|1) = p(1|0) = p.

$$P = \begin{bmatrix} 1-p & p \\ p & 1-p \end{bmatrix}$$

[Figure: BSC — 0 → 0 and 1 → 1 with probability 1 − p, crossovers 0 → 1 and 1 → 0 with probability p.]

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

§1.1.2 Discrete memoryless channel and mutual information

Example 1.1.9: BEC (Binary Erasure Channel)

[Illustration: a transmitted binary sequence is received with some symbols erased and marked "?".]

r = 2, s = 3;  p(0|0) = p, p(?|0) = 1 − p;  p(1|1) = q, p(?|1) = 1 − q.

$$P = \begin{bmatrix} p & 1-p & 0 \\ 0 & 1-q & q \end{bmatrix}$$

[Figure: input 0 goes to output 0 with probability p and to the erasure symbol ? with probability 1 − p; input 1 goes to output 1 with probability q and to ? with probability 1 − q.]
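In code, a DMC is just its transition probability matrix. A numpy sketch of the two channels above (the numeric values of eps, p and q are illustrative, not from the notes) checks that every row sums to 1:

```python
import numpy as np

eps = 0.1          # BSC crossover probability (illustrative)
p, q = 0.9, 0.8    # BEC retention probabilities p(0|0) and p(1|1) (illustrative)

# BSC: rows = inputs {0, 1}, columns = outputs {0, 1}
P_bsc = np.array([[1 - eps, eps],
                  [eps, 1 - eps]])

# BEC: rows = inputs {0, 1}, columns = outputs {0, ?, 1}
P_bec = np.array([[p, 1 - p, 0.0],
                  [0.0, 1 - q, q]])

print(P_bsc.sum(axis=1), P_bec.sum(axis=1))   # [1. 1.] [1. 1.]
```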

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

2. average mutual information

definition

I(X;Y) = H(X) – H(X|Y)

§1.1.2 Discrete memoryless channel and mutual information

The channel: input x ∈ X = {a_1, a_2, \ldots, a_r}, output y ∈ Y = {b_1, b_2, \ldots, b_s}, transition probabilities p(y|x) = p(b_j|a_i) with $\sum_{j=1}^{s} p(b_j|a_i) = 1$.

H(X) is the entropy of the input; H(X|Y) is the equivocation. Their difference, the average mutual information

I(X;Y) = H(X) − H(X|Y),

is the reduction in uncertainty about X conveyed by the observation of Y — the information about X obtained from Y.

2. Average mutual information — definition

$$I(X;Y) = H(X) - H(X|Y) = \sum_{x} p(x)\log\frac{1}{p(x)} - \sum_{x,y} p(x,y)\log\frac{1}{p(x|y)} = \sum_{x,y} p(x,y)\log\frac{1}{p(x)} - \sum_{x,y} p(x,y)\log\frac{1}{p(x|y)}$$

Equivalently,

$$I(X;Y) = \sum_{x,y} p(x,y)\log\frac{p(x|y)}{p(x)} = \sum_{x,y} p(x,y)\log\frac{p(x,y)}{p(x)p(y)} = \sum_{x,y} p(x,y)\log\frac{p(y|x)}{p(y)}$$
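The equivalence of these expressions is easy to check numerically. A Python sketch (names are mine) computes I(X;Y) two ways for the joint distribution of Example 1.1.7:

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# joint distribution p(x, y) of Example 1.1.7 (x in {0, 1}, y in {'0', '?', '1'})
p_xy = {(0, '0'): 1/2, (0, '?'): 1/6, (1, '?'): 1/6, (1, '1'): 1/6}
p_x = {0: 2/3, 1: 1/3}
p_y = {'0': 1/2, '?': 1/3, '1': 1/6}

# I(X;Y) = sum p(x,y) * log2( p(x,y) / (p(x) p(y)) )
direct = sum(v * math.log2(v / (p_x[x] * p_y[y])) for (x, y), v in p_xy.items())

# I(X;Y) = H(X) - H(X|Y), using H(X|Y) = H(XY) - H(Y)
via_entropies = entropy(p_x.values()) - (entropy(p_xy.values()) - entropy(p_y.values()))

print(round(direct, 4), round(via_entropies, 4))   # both ~0.585 bit
```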

I(X;Y) and I(x;y): the mutual information of a particular pair (x, y) is

$$I(x;y) = \log\frac{P(x|y)}{P(x)},$$

and the average mutual information is its expectation, I(X;Y) = E_{XY}[I(x;y)].

I(X;Y) and H(X): since H(X|Y) ≥ 0, I(X;Y) = H(X) − H(X|Y) ≤ H(X).

properties

1) Non-negativity of average mutual information

Theorem 1.4 For any discrete random variables X and Y, I(X;Y) ≥ 0; moreover, I(X;Y) = 0 iff X and Y are independent.

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Proof: (Theorem 1.3 in textbook)

Interpretation: we do not expect to be misled, on average, by observing the output of the channel.

properties

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

X Y’ Y

listener-in

S encrypt

Key

channel decrypt D

total loss

message : arrive at four

ciphertext : duulyh dw irxu

A cryptosystem

2) Symmetry: I(X;Y) = I(Y;X).

3) Relationship between entropy and average mutual information:

I(X;Y) = H(X) − H(X|Y)
I(X;Y) = H(Y) − H(Y|X)
I(X;Y) = H(X) + H(Y) − H(XY)   (H(XY) is the joint entropy)

[Figure: mnemonic Venn diagram relating H(X), H(Y), H(X|Y), H(Y|X), I(X;Y), and H(XY).]

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Recognising channels:

[Figure: three example channels — a one-to-one channel a_i → b_i (i = 1, …, r); a channel with two inputs a_1, a_2 and five outputs b_1, …, b_5; and a channel with three inputs a_1, a_2, a_3 and two outputs b_1, b_2.]

4) Convexity

With P(xy) = P(y|x)P(x) and P(y) = Σ_x P(y|x)P(x),

$$I(X;Y) = \sum_{X,Y} P(xy)\log\frac{P(x|y)}{P(x)},$$

so I(X;Y) = f[P(x), P(y|x)]: it depends only on the input distribution P(x) and the transition probabilities P(y|x).

Theorem 1.5 For a fixed channel P(y|x), I(X;Y) is a concave (convex-∩) function of the input probabilities P(x). (Theorem 1.6 in textbook)

Theorem 1.6 For a fixed input distribution P(x), I(X;Y) is a convex (convex-∪) function of the transition probabilities P(y|x). (Theorem 1.7 in textbook)

Example 1.1.10 Analyse I(X;Y) for the BSC.

Source: $\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ \omega & 1-\omega \end{bmatrix}$; channel: BSC with crossover probability p (0 → 0 and 1 → 1 with probability 1 − p, 0 → 1 and 1 → 0 with probability p).

$$I(X;Y) = \sum_{X,Y} P(xy)\log\frac{P(y|x)}{P(y)} = H(Y) - H(Y|X)$$

$$H(Y|X) = \sum_{X,Y} p(xy)\log\frac{1}{p(y|x)} = p\log\frac{1}{p} + (1-p)\log\frac{1}{1-p} = H(p)$$

$$H(Y) = H(\omega + p - 2\omega p)$$

$$I(X;Y) = H(\omega + p - 2\omega p) - H(p)$$
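A numeric sanity check of the closed form against the direct definition (a Python sketch; the values of ω and p are arbitrary choices of mine):

```python
import math

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

w, p = 0.3, 0.1   # source P(X=0) = w, BSC crossover probability p

# closed form I(X;Y) = H(w + p - 2wp) - H(p)
closed = h(w + p - 2 * w * p) - h(p)

# direct computation from the joint distribution
p_xy = {(0, 0): w * (1 - p), (0, 1): w * p, (1, 0): (1 - w) * p, (1, 1): (1 - w) * (1 - p)}
p_x = {0: w, 1: 1 - w}
p_y = {y: sum(v for (xx, yy), v in p_xy.items() if yy == y) for y in (0, 1)}
direct = sum(v * math.log2(v / (p_x[x] * p_y[y])) for (x, y), v in p_xy.items())

print(round(closed, 6), round(direct, 6))   # the two values agree
```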

Review

KeyWords: Channel and it’s information measure

channel model

equivocation

average mutual information

mutual information

properties of average mutual information

Thinking:

Compare I(X;Y) = 0 with Cov(X, Y) = 0.

Compare I(X;Y) with H(X) and H(Y).

§1.1.2 Discrete memoryless channel and mutual information

Example 1.1.11

Let the source have alphabet A = {0, 1} with p_0 = p_1 = 0.5. Let the encoder C have alphabet B = {0, 1, …, 7}, and let the elements of B have the binary representation b = (b_0 b_1 b_2). The encoder is shown below. Find the entropy of the coded output, and find the output sequence if the input sequence is a(t) = {101001011000001100111011} and the initial contents of the registers are b(t = 0) = 5 = (101).

[Figure: the encoder — a three-stage shift register of D flip-flops driven by a(t), with register contents b_0, b_1, b_2; state-transition diagram between states Y_t → Y_{t+1} ∈ {0, 1, …, 7}.]

§1.1.2 Discrete memoryless channel and mutual information

a(t)={101001011000001100111011}

b = {001242425124366675013666}

Homework

1. P45: T1.10,

2. P46: T1.19(except c)

3. Let the DMS

$$\begin{bmatrix} X \\ p(x) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0.6 & 0.4 \end{bmatrix}$$

convey messages through the channel

$$P = \begin{bmatrix} p(0|0) & p(1|0) \\ p(0|1) & p(1|1) \end{bmatrix} = \begin{bmatrix} 5/6 & 1/6 \\ 3/4 & 1/4 \end{bmatrix}.$$

Calculate: (1) H(X) and H(Y); (2) the mutual information of x_i and y_j (i, j = 1, 2); (3) the equivocation H(X|Y) and the average mutual information I(X;Y).

Homework

4. Suppose that I(X;Y) = 0. Does this imply that I(X;Z) = I(X;Z|Y)?

5. In a joint ensemble XY, the mutual information I(x;y) is a random variable. In this problem we are concerned with the variance of that random variable, VAR[I(x;y)].

(1) Prove that VAR[I(x;y)] = 0 iff there is a constant α such that, for all x, y with P(xy) > 0, P(xy) = αP(x)P(y).
(2) Express I(X;Y) in terms of α and interpret the special case α = 1. (continued)

Homework

5. (3) For each of the channels in Fig. 5, find a probability assignment P(x) such that I(X;Y) > 0 and VAR[I(x;y)] = 0, and calculate I(X;Y).

[Fig. 5: two channels — left: inputs a_1, a_2, a_3 are mapped to outputs b_1, b_2 with probability 1 (deterministic transitions); right: inputs a_1, a_2, a_3 and outputs b_1, b_2, b_3, each input going to two of the outputs with probability 1/2 each.]

§1 Entropy and mutual information

§1.2 Discrete random vectors

§1.2.1 Extended source and joint entropy

§1.2.2 Extended channel and mutual information

§1.2.1 Extended source and joint entropy

1. Extended source

Source model:

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & \cdots & a_q \\ p_1 & p_2 & \cdots & p_q \end{bmatrix}, \qquad \sum_{i=1}^{q} p_i = 1$$

N-times extended source: X^N = (X_1 X_2 \cdots X_N), with each X_i \in \{a_1, a_2, \ldots, a_q\}, and

$$\begin{bmatrix} X^N \\ P(x) \end{bmatrix} = \begin{bmatrix} \alpha_1 = (a_1 a_1 \cdots a_1) & \cdots & \alpha_{q^N} = (a_q a_q \cdots a_q) \\ p(\alpha_1) = p(a_1)p(a_1)\cdots p(a_1) & \cdots & p(\alpha_{q^N}) = p(a_q)p(a_q)\cdots p(a_q) \end{bmatrix}, \qquad \sum_{i=1}^{q^N} p(\alpha_i) = 1,$$

where each sequence α_i = (a_{i_1} a_{i_2} \cdots a_{i_N}) has probability p(\alpha_i) = p(a_{i_1})p(a_{i_2})\cdots p(a_{i_N}).

Example 1.2.1

2. Joint entropy

Definition:

The joint entropy H(XY) of a pair of discrete random variables (X, Y) with joint distribution p(x, y) is defined as

$$H(XY) = -\sum_{x\in X}\sum_{y\in Y} p(x,y)\log p(x,y),$$

which can also be expressed as

$$H(XY) = -E[\log p(x,y)].$$

§1.2.1 Extended source and joint entropy

2. Joint entropy of the extended DMS

$$H(X^N) = -\sum_{i=1}^{q^N} p(\alpha_i)\log p(\alpha_i) = -\sum_{i_1=1}^{q}\sum_{i_2=1}^{q}\cdots\sum_{i_N=1}^{q} p(a_{i_1}a_{i_2}\cdots a_{i_N})\log p(a_{i_1}a_{i_2}\cdots a_{i_N})$$

Since the source is memoryless, p(a_{i_1}a_{i_2}\cdots a_{i_N}) = p(a_{i_1})p(a_{i_2})\cdots p(a_{i_N}), so

$$H(X^N) = -\sum_{i_1=1}^{q}\cdots\sum_{i_N=1}^{q} p(a_{i_1})\cdots p(a_{i_N})\big[\log p(a_{i_1}) + \log p(a_{i_2}) + \cdots + \log p(a_{i_N})\big] = H(X) + H(X) + \cdots + H(X) = N\,H(X).$$

§1.2.1 Extended source and joint entropy

2. Joint entropy of a source with memory

1) Conditional entropy:

$$H(X_2|X_1) = -\sum_{i=1}^{q}\sum_{j=1}^{q} p(a_i a_j)\log p(a_j|a_i)$$

2) Joint entropy:

$$H(X_1X_2) = -\sum_{i=1}^{q}\sum_{j=1}^{q} p(a_i a_j)\log p(a_i a_j)$$

3) (Per-symbol) entropy:

$$H_2(X) = \frac{1}{2}H(X_1X_2) \quad \text{bit/sig}$$

3. Properties of joint entropy

Theorem 1.7 (Chain rule):

H(XY) = H(X) + H(Y|X)

Proof:

$$H(XY) = -\sum_{x\in X}\sum_{y\in Y} p(x,y)\log p(x,y) = -\sum_{x}\sum_{y} p(x,y)\log\big[p(x)p(y|x)\big]$$

$$= -\sum_{x}\sum_{y} p(x,y)\log p(x) - \sum_{x}\sum_{y} p(x,y)\log p(y|x) = -\sum_{x} p(x)\log p(x) - \sum_{x}\sum_{y} p(x,y)\log p(y|x)$$

$$= H(X) + H(Y|X).$$

§1.2.1 Extended source and joint entropy

Example 1.2.3 Let X be a random variable with probability space

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 2 \\ 1/2 & 1/3 & 1/6 \end{bmatrix}$$

and let the joint probabilities P(a_i a_j) of two successive symbols be

P(a_i a_j):      a_j = 0    a_j = 1    a_j = 2
  a_i = 0         1/4        1/4        0
  a_i = 1         1/4        1/24       1/24
  a_i = 2         0          1/24       1/8

The corresponding conditional probabilities P(a_j|a_i) are

P(a_j|a_i):      a_j = 0    a_j = 1    a_j = 2
  a_i = 0         1/2        1/2        0
  a_i = 1         3/4        1/8        1/8
  a_i = 2         0          1/4        3/4

Find H(X).

Relationship:

H(X_1X_2) = H(X_1) + H(X_2|X_1)
H(X_2|X_1) ≤ H(X_2)
H(X_1X_2) ≤ 2H(X_1)
H_2(X) ≤ H(X)
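These relations can be verified numerically on Example 1.2.3 (a Python sketch; the table is the joint distribution given above):

```python
import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# joint probabilities p(a_i, a_j) from Example 1.2.3
p_joint = [[1/4, 1/4, 0],
           [1/4, 1/24, 1/24],
           [0, 1/24, 1/8]]

p1 = [sum(row) for row in p_joint]                              # marginal of X1: (1/2, 1/3, 1/6)
p2 = [sum(p_joint[i][j] for i in range(3)) for j in range(3)]   # marginal of X2

h1 = H(p1)
h12 = H([p for row in p_joint for p in row])
h2_given_1 = h12 - h1                                           # chain rule

print(round(h1, 3), round(h12, 3), round(h2_given_1, 3))        # 1.459  2.448  0.989
print(h2_given_1 <= H(p2), h12 <= 2 * h1, h12 / 2 <= h1)        # True True True
```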

General stationary source

$$\begin{bmatrix} X \\ P(x) \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & \cdots & a_q \\ p(a_1) & p(a_2) & \cdots & p(a_q) \end{bmatrix}, \qquad \sum_{i=1}^{q} p(a_i) = 1$$

Let X_1, X_2, \ldots, X_N be dependent; the joint probability is

$$P(X_1X_2\cdots X_N) = p(a_{i_1}a_{i_2}\cdots a_{i_N}), \qquad i_1, i_2, \ldots, i_N \in \{1, 2, \ldots, q\}$$

§1.2.1 Extended source and joint entropy

$$P(x_1x_2\cdots x_N) = P(x_1)P(x_2|x_1)\cdots P(x_N|x_1x_2\cdots x_{N-1})$$

3. Properties of joint entropy — definition of the entropies

• Joint entropy:

$$H(X_1X_2\cdots X_N) = -\sum_{i_1}\sum_{i_2}\cdots\sum_{i_N} p(a_{i_1}a_{i_2}\cdots a_{i_N})\log p(a_{i_1}a_{i_2}\cdots a_{i_N})$$

• Conditional entropy:

$$H(X_N|X_1X_2\cdots X_{N-1}) = -\sum_{i_1}\cdots\sum_{i_N} p(a_{i_1}\cdots a_{i_N})\log p(a_{i_N}|a_{i_1}\cdots a_{i_{N-1}})$$

• (Per-symbol) entropy:

$$H_N(X) = \frac{1}{N}H(X_1X_2\cdots X_N)$$

Theorem 1.8 (Chain rule for entropy):

Let X_1, X_2, \ldots, X_n be drawn according to p(x_1, x_2, \ldots, x_n). Then

$$H(X_1X_2\cdots X_n) = \sum_{i=1}^{n} H(X_i|X_{i-1}\cdots X_1)$$

Proof (do it by yourself)

§1.2.1 Extended source and joint entropy

3. Properties of joint entropy

Relations of the entropies (if H(X) < ∞):

$$H(X_N|X_1X_2\cdots X_{N-1}) \le H(X_{N-1}|X_1\cdots X_{N-2})$$
$$H(X_N) \ge H(X_N|X_1\cdots X_{N-1})$$
$$H_N(X) \le H_{N-1}(X)$$

Entropy rate:

$$H_\infty = \lim_{N\to\infty} H_N(X) = \lim_{N\to\infty} H(X_N|X_1X_2\cdots X_{N-1})$$

— the basis of data compression.

Theorem 1.9 (Independence bound on entropy):

Let X_1, X_2, \ldots, X_n be drawn according to p(x_1, x_2, \ldots, x_n). Then

$$H(X_1X_2\cdots X_n) \le \sum_{i=1}^{n} H(X_i)$$

with equality iff the Xi are independent

§1.2.1 Extended source and joint entropy

(P37(corollary) in textbook)

3. Properties of joint entropy

§1.2.1 Extended source and joint entropy

Example 1.2.4

Suppose a memoryless source with A={0,1} having equal probabilities emits a sequence of six symbols. Following the sixth symbol, suppose a seventh symbol is transmitted which is the sum modulo 2 of the six previous symbols. What is the entropy of the seven-symbol sequence?
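A brute-force check of this example (a Python sketch): the seventh symbol is a deterministic function of the first six, so the seven-symbol sequence takes 2^6 = 64 equally likely values and its entropy is 6 bits.

```python
import math
from itertools import product

# all 2^6 equally likely source blocks, each extended by its mod-2 parity symbol
sequences = [bits + (sum(bits) % 2,) for bits in product((0, 1), repeat=6)]

p = 1 / len(sequences)                        # each 7-symbol sequence has probability 1/64
H7 = -sum(p * math.log2(p) for _ in sequences)
print(len(sequences), H7)                     # 64 6.0
```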


§1 Entropy and mutual information

§1.2 Discrete random vectors

§1.2.2 Extended channel and mutual information

1. The model of the extended channel

[Figure: a general communication system — the source output (U_1, U_2, \ldots, U_k) is encoded into X^N = (X_1, X_2, \ldots, X_N), transmitted over the channel as Y^N = (Y_1, Y_2, \ldots, Y_N), and decoded into (V_1, V_2, \ldots, V_k).]

§1.2.2 Extended channel and mutual information

Extended channel: X^N = (X_1 X_2 \cdots X_N) and Y^N = (Y_1 Y_2 \cdots Y_N), where each X_i = x_i \in \{a_1, a_2, \ldots, a_r\}, each Y_i = y_i \in \{b_1, b_2, \ldots, b_s\}, and the channel is described by P(y|x). For a DMC,

$$P(\mathbf{y}|\mathbf{x}) = P(y_1y_2\cdots y_N|x_1x_2\cdots x_N) = \prod_{i=1}^{N} P(y_i|x_i).$$
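As a concrete illustration (a Python sketch; the BSC parameters and the sequences are arbitrary choices of mine), the block transition probability of a memoryless channel is just the product of the per-symbol transition probabilities:

```python
import numpy as np

P = np.array([[0.9, 0.1],    # BSC transition matrix p(y|x) with crossover 0.1
              [0.1, 0.9]])

x = [0, 1, 1, 0, 1]          # transmitted block x^N
y = [0, 1, 0, 0, 1]          # received block y^N (one symbol flipped)

# memoryless channel: P(y^N | x^N) = prod_i p(y_i | x_i)
p_block = np.prod([P[xi, yi] for xi, yi in zip(x, y)])
print(p_block)               # 0.9**4 * 0.1 = 0.06561
```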

The N-th extension of the channel is described by an r^N × s^N transition matrix with entries

$$\pi_{kh} = P(\beta_h|\alpha_k) = P(b_{h_1}b_{h_2}\cdots b_{h_N}\,|\,a_{k_1}a_{k_2}\cdots a_{k_N}),$$

where α_k = (a_{k_1}\cdots a_{k_N}) ranges over the r^N input sequences of X^N and β_h = (b_{h_1}\cdots b_{h_N}) over the s^N output sequences of Y^N.

2. Average mutual information

$$I(X^N;Y^N) = H(X^N) - H(X^N|Y^N) = H(Y^N) - H(Y^N|X^N) = \sum_{\mathbf{x}\in X^N,\,\mathbf{y}\in Y^N} P(\mathbf{x}\mathbf{y})\log\frac{P(\mathbf{y}|\mathbf{x})}{P(\mathbf{y})} = \sum_{k=1}^{r^N}\sum_{h=1}^{s^N} p(\alpha_k\beta_h)\log\frac{p(\beta_h|\alpha_k)}{p(\beta_h)}$$

§1.2.2 Extended channel and mutual information

example 1.2.5

3. The properties

Theorem 1.11 If the components (X_1, X_2, \ldots, X_N) of X^N are independent, then

$$I(X^N;Y^N) \ge \sum_{i=1}^{N} I(X_i;Y_i).$$

§1.2.2 Extended channel and mutual information

(Theorem 1.8 in textbook)

3. The properties

Theorem 1.12 If X^N = (X_1, X_2, \ldots, X_N) and Y^N = (Y_1, Y_2, \ldots, Y_N) are random vectors and the channel is memoryless, that is,

$$P(y_1,\ldots,y_N|x_1,\ldots,x_N) = \prod_{i=1}^{N} P(y_i|x_i),$$

then

$$I(X^N;Y^N) \le \sum_{i=1}^{N} I(X_i;Y_i).$$

§1.2.2 Extended channel and mutual information

(Theorem 1.9 in textbook)

Example 1.2.6

Let X_1, X_2, \ldots, X_5 be independent identically distributed random variables with common entropy H. Also let T be the permutation of the set {1, 2, 3, 4, 5} given by

T: (1 2 3 4 5) → (3 2 5 1 4),

and let Y_i = X_{T(i)}. Show that

$$\sum_{i=1}^{5} I(X_i;Y_i) \le I(X^5;Y^5).$$

§1.2.2 Extended channel and mutual information

Review

Keywords: measure of information for random vectors — extended source, extended channel, stationary source, joint entropy, conditional entropy, (per-symbol) entropy, entropy rate.

Conclusions: chain rule for entropy; independence bound on entropy; conditioning reduces entropy; properties of I(X;Y).

Homework

1. P47: T1.23
2. P47: T1.24
3. Let X_1, X_2, \ldots, X_{n-1} be i.i.d. random variables taking values in {0, 1}, with Pr{X_i = 1} = 1/2. Let X_n = 1 if X_1 + X_2 + \cdots + X_{n-1} is odd and X_n = 0 otherwise, with n ≥ 3.
(1) Show that X_i and X_j are independent, for i ≠ j, i, j ∈ {1, 2, \ldots, n};
(2) Find H(X_i X_j), for i ≠ j;
(3) Find H(X_1 X_2 \cdots X_n); is this equal to nH(X_1)?

4. Let X_1, X_2 be identically distributed random variables, and let

$$\rho = 1 - \frac{H(X_2|X_1)}{H(X_1)}.$$

(1) Show that 0 ≤ ρ ≤ 1;
(2) When is ρ = 0?
(3) When is ρ = 1?

Homework

5. Shuffles increase entropy. Argue that for any distribution on shuffles T and any distribution on card positions X that

H(TX) ≥ H(TX|T) , if X and T are independent.

Homework

Thinking :

Recommended