
Page 1:

§1 Entropy and mutual information

§1.1 Discrete random variables

§1.1.1 Discrete memoryless source and entropy
§1.1.2 Discrete memoryless channel and mutual information

§1.2 Discrete random vectors

Page 2:

§1.1.1 Discrete memoryless source and entropy

1. DMS (Discrete memoryless source)

Probability space:
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & \cdots & a_q \\ P(a_1) & P(a_2) & \cdots & P(a_q) \end{pmatrix}, \qquad \sum_{i=1}^{q} P(a_i) = 1$$

Example 1.1.1 Let X represent the outcome of a single roll of a fair die.
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \end{pmatrix}$$

Page 3:

§1.1.1 Discrete memoryless source and entropy

2. Self-information

Example 1.1.2
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1(\text{red}) & a_2(\text{white}) \\ 0.5 & 0.5 \end{pmatrix}, \qquad \begin{pmatrix} Y \\ P(y) \end{pmatrix} = \begin{pmatrix} a_1(\text{red}) & a_2(\text{white}) & a_3(\text{blue}) & a_4(\text{black}) \\ 0.25 & 0.25 & 0.25 & 0.25 \end{pmatrix}$$

Analyse the uncertainty that a red ball is selected from X and from Y.

Page 4:

§1.1.1 Discrete memoryless source and entropy

2. Self-information

I(a_i) = f[p(a_i)] should satisfy:
1) I(a_i) is a monotone decreasing function of p(a_i): if p(a_1) > p(a_2), then I(a_1) < I(a_2);
2) if p(a_i) = 1, then I(a_i) = 0;
3) if p(a_i) = 0, then I(a_i) → ∞;
4) if p(a_i a_j) = p(a_i)p(a_j), then I(a_i a_j) = I(a_i) + I(a_j).

Page 5:

§1.1.1 Discrete memoryless source and entropy

Self-information:
$$I(a_i) = \log_r \frac{1}{p(a_i)} = -\log_r p(a_i)$$

[Figure: I(a_i) as a decreasing function of p(a_i) on (0, 1].]

If 1 ≥ p(a_1) > p(a_2) ≥ 0, then I(a_1) < I(a_2); if p(a_i) → 0, then I(a_i) → ∞; if p(a_i) = 1, then I(a_i) = 0; and I(ab) = I(a) + I(b) when a and b are statistically independent.

Remark:
- the measure of the uncertainty of the event a_i;
- the measure of the information the event a_i provides.

Units: bit (log base 2), nat (base e), hart (base 10).

Page 6:

§1.1.1 Discrete memoryless source and entropy

3. Entropy

Definition: Suppose X is a discrete random variable whose range R = {a_1, a_2, …} is finite or countable. Let p(a_i) = P{X = a_i}. The entropy of X is defined by
$$H(X) = E[I(a_i)] = -\sum_{i=1}^{q} p(a_i)\log p(a_i)$$

A measure of
- the average amount of information provided by X;
- the average uncertainty (or randomness) about X.
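Below is a minimal numeric sketch of the entropy formula (Python with NumPy; added for illustration, not part of the original slides), checked against Example 1.1.1:

```python
import numpy as np

def entropy(p, base=2):
    """H(X) = -sum_i p_i log p_i, with the convention 0 log 0 = 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(base))

print(entropy([1/6] * 6))   # fair die of Example 1.1.1: log2(6) ≈ 2.585 bit/sig
```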

Page 7:

§1.1.1 Discrete memoryless source and entropy

Entropy: the amount of "information" provided by an observation of X.

Example 1.1.3 A bag holds 100 balls; 80% are red and the rest are white. One ball is drawn at random. How much information does each draw provide on average?
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ 0.8 & 0.2 \end{pmatrix}$$
In N independent draws, a_1 occurs about n_1 = N p(a_1) times and a_2 about n_2 = N p(a_2) times, so the average information per draw is
$$I = \frac{n_1 I(a_1) + n_2 I(a_2)}{N} \approx p(a_1)I(a_1) + p(a_2)I(a_2) = -\sum_{i=1}^{2} p(a_i)\log p(a_i) = 0.722 \ \text{bit/sig}$$
that is, H(X) = H(0.8, 0.2) = 0.722 bit/sig.

Page 8:

§1.1.1 Discrete memoryless source and entropy

Entropy: the average "uncertainty" or "randomness" about X.

Example 1.1.4
$$\begin{pmatrix} X_1 \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ 0.5 & 0.5 \end{pmatrix}, \quad \begin{pmatrix} X_2 \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ 0.7 & 0.3 \end{pmatrix}, \quad \begin{pmatrix} X_3 \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ 0.99 & 0.01 \end{pmatrix}$$

H(X_1) = 1 bit/sig, H(X_2) = 0.88 bit/sig, H(X_3) = 0.08 bit/sig
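As a usage note (illustrative Python, not from the slides), the three entropies of Example 1.1.4 can be checked numerically:

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

for w in (0.5, 0.7, 0.99):
    print(w, round(entropy([w, 1 - w]), 2))   # 1.0, 0.88, 0.08 bit/sig
```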

Page 9:

§1.1.1 Discrete memoryless source and entropy

3. Entropy
$$H(X) = E[I(a_i)] = -\sum_{i=1}^{q} p(a_i)\log p(a_i)$$

Note:
1) units: bit/sig, nat/sig, hart/sig;
2) if p(a_i) = 0, we take p(a_i) log p(a_i)^{-1} = 0;
3) if R is infinite, H(X) may be +∞.

Page 10:

§1.1.1 Discrete memoryless source and entropy

3. Entropy

Example 1.1.5 Entropy of a binary source (BS)
$$\begin{pmatrix} X \\ P \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ p & q \end{pmatrix}, \qquad 0 \le p \le 1, \ q = 1 - p$$
$$H(X) = -p\log p - (1-p)\log(1-p)$$

Entropy function of a probability vector (p_1, p_2, …, p_q):
$$H(\mathbf{p}) = H(p_1, p_2, \ldots, p_q) = \sum_{i=1}^{q} p_i \log\frac{1}{p_i}$$
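A small sketch of the binary entropy function H(p) from Example 1.1.5 (illustrative Python; the function name is mine), which reappears later for the BSC:

```python
import numpy as np

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2(1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

print(binary_entropy(0.5))   # 1.0 bit/sig, the maximum
print(binary_entropy(0.11))  # ≈ 0.5 bit/sig
```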

Page 11:

§1.1.1 Discrete memoryless source and entropy

4. The properties of entropy

Theorem 1.1 Let X assume values in R = {x_1, x_2, …, x_r}. Then
1) H(X) ≥ 0;
2) H(X) = 0 iff p_i = 1 for some i;
3) H(X) ≤ log r, with equality iff p_i = 1/r for all i —— the basis of data compression.

Proof: (Theorem 1.1 in textbook)
Lemma: for x > 0, log x ≤ (x - 1) log e, i.e. ln x ≤ x - 1, with equality iff x = 1.

Page 12:

§1.1.1 Discrete memoryless source and entropy

4. The properties of entropy

4) Symmetry: $H(p_1, p_2, \ldots, p_r) = H(p_{i_1}, p_{i_2}, \ldots, p_{i_r})$ for any permutation $(i_1, \ldots, i_r)$ of $(1, \ldots, r)$.

Example 1.1.6 Let X, Y, Z be discrete random variables:
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & a_3 \\ 1/3 & 1/6 & 1/2 \end{pmatrix}, \quad \begin{pmatrix} Y \\ P(y) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & a_3 \\ 1/6 & 1/2 & 1/3 \end{pmatrix}, \quad \begin{pmatrix} Z \\ P(z) \end{pmatrix} = \begin{pmatrix} b_1 & b_2 & b_3 \\ 1/3 & 1/2 & 1/6 \end{pmatrix}$$

All three have the same entropy: H = H(1/3, 1/6, 1/2) ≈ 1.46 bit/sig.

Page 13:

§1.1.1 Discrete memoryless source and entropy

4. The properties of entropy

5) If X, Y are independent, then H(XY) = H(X) + H(Y).

Proof:
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & \cdots & a_q \\ & p(a_i) & \end{pmatrix}, \ \sum_i p(a_i) = 1; \qquad \begin{pmatrix} Y \\ P(y) \end{pmatrix} = \begin{pmatrix} b_1 & b_2 & \cdots & b_r \\ & p(b_j) & \end{pmatrix}, \ \sum_j p(b_j) = 1$$
$$H(X) = \sum_{i=1}^{q} p(a_i)\log\frac{1}{p(a_i)}, \qquad H(Y) = \sum_{j=1}^{r} p(b_j)\log\frac{1}{p(b_j)}$$

Page 14:

§1.1.1 Discrete memoryless source and entropy

Proof (continued). Joint source:
$$\begin{pmatrix} XY \\ P(xy) \end{pmatrix} = \begin{pmatrix} a_1b_1 & \cdots & a_ib_j & \cdots & a_qb_r \\ & & p(a_ib_j) & & \end{pmatrix}, \qquad P_k = p(a_ib_j) = p(a_i)p(b_j), \quad \sum_{k=1}^{qr} P_k = \sum_{i=1}^{q}\sum_{j=1}^{r} p(a_i)p(b_j) = 1$$
$$H(XY) = \sum_{k=1}^{qr} P_k \log\frac{1}{P_k} = \sum_{i=1}^{q}\sum_{j=1}^{r} p(a_ib_j)\log\frac{1}{p(a_i)p(b_j)} = \sum_{i}\sum_{j} p(a_i)p(b_j)\log\frac{1}{p(a_i)} + \sum_{i}\sum_{j} p(a_i)p(b_j)\log\frac{1}{p(b_j)} = H(X) + H(Y)$$

Page 15:

§1.1.1 Discrete memoryless source and entropy

4. The properties of entropy

6) Concavity

Theorem 1.2 The entropy function H(p_1, p_2, …, p_r) is a concave (∩-convex) function of the probability vector (p_1, p_2, …, p_r).

Example 1.1.5 (continued) Entropy of the binary source:
$$H(X) = -p\log p - (1-p)\log(1-p) = H(p)$$
[Figure: H(p) versus p on [0, 1], rising from 0 to its maximum of 1 bit at p = 1/2 and falling back to 0.]

Page 16:

§1.1.1 Discrete memoryless source and entropy

5. Conditional entropy

Definition: Let X, Y be a pair of random variables with (X, Y) ~ p(x, y). The conditional entropy of X given Y is defined by
$$H(X|Y) = E\left[\log\frac{1}{p(x|y)}\right] = \sum_{X,Y} p(x,y)\log\frac{1}{p(x|y)}$$

Page 17:

§1.1.1 Discrete memoryless source and entropy

5. Conditional entropy

Analysis:
$$H(X|Y=y) = \sum_{X} p(x|y)\log\frac{1}{p(x|y)}$$
$$H(X|Y) = \sum_{Y} p(y)H(X|Y=y) = \sum_{Y}\sum_{X} p(y)p(x|y)\log\frac{1}{p(x|y)} = \sum_{X,Y} p(x,y)\log\frac{1}{p(x|y)}$$

Page 18:

§1.1.1 Discrete memoryless source and entropy

5. Conditional entropy

Example 1.1.7 Binary input X ∈ {0, 1} with pX(0) = 2/3, pX(1) = 1/3; output Y ∈ {0, ?, 1} with transition probabilities
p(0|0) = 3/4, p(?|0) = 1/4, p(?|1) = 1/2, p(1|1) = 1/2.

Find H(X), H(X|Y=0), H(X|Y=?), H(X|Y).

H(X) = H(2/3, 1/3) = 0.9183 bit/sig
H(X|Y=0) = 0
H(X|Y=1) = 0
H(X|Y=?) = H(1/2, 1/2) = 1 bit/sig
H(X|Y) = 1/3 bit/sig
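A numeric check of Example 1.1.7 (illustrative Python, not part of the slides), working from the joint distribution p(x, y):

```python
import numpy as np

p_x = np.array([2/3, 1/3])               # P(X=0), P(X=1)
trans = np.array([[3/4, 1/4, 0.0],       # p(y|x=0) over outputs (0, ?, 1)
                  [0.0, 1/2, 1/2]])      # p(y|x=1)
joint = p_x[:, None] * trans             # p(x, y)

h_x_given_y = 0.0
for j, py in enumerate(joint.sum(axis=0)):
    cond = joint[:, j] / py              # p(x | y = j-th output)
    cond = cond[cond > 0]
    h_x_given_y += py * -(cond * np.log2(cond)).sum()

print(round(h_x_given_y, 4))             # 0.3333 bit/sig = 1/3, as on the slide
```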

Page 19:

§1.1.1 Discrete memoryless source and entropy

5. Conditional entropy

Theorem 1.3 (conditioning reduces entropy)
$$H(X|Y) \le H(X)$$
with equality iff X and Y are independent.

Proof:

Page 20:

Review

KeyWords: Measure of information

self information

entropy

properties of entropy

conditional entropy

Page 21:

Homework

1. P44: T1.1
2. P44: T1.4
3. P44: T1.6
4. Let X be a random variable taking on a finite number of values. What is the relationship between H(X) and H(Y) if (1) Y = 2X? (2) Y = cos X?

Page 22:

Homework

5. Let X be an ensemble of M points a_1, …, a_M, and let P_X(a_M) = α. Prove that
$$H(X) = \alpha\log\frac{1}{\alpha} + (1-\alpha)\log\frac{1}{1-\alpha} + (1-\alpha)H(Y)$$
where Y is an ensemble of M-1 points a_1, …, a_{M-1} with probabilities P_Y(a_j) = P_X(a_j)/(1-α), 1 ≤ j ≤ M-1. Then prove that
$$H(X) \le \alpha\log\frac{1}{\alpha} + (1-\alpha)\log\frac{1}{1-\alpha} + (1-\alpha)\log(M-1)$$
and determine the condition for equality.

Page 23:

Homework

6. Given a chessboard with 8×8 = 64 squares, a chessman is put randomly in a square. Guess the location of the chessman. Find the uncertainty of the result.

If we label every square by its row and column number and already know the row number of the chessman, what is the uncertainty then?

Page 24:

Homework

Coin flip. A fair coin is flipped until the first head occurs. Let X denote the number of flips required. Find the entropy H(X) in bits.

Hint:
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & \cdots & n & \cdots \\ 1/2 & 1/2^2 & 1/2^3 & \cdots & 1/2^n & \cdots \end{pmatrix}$$
and use the identity
$$\sum_{n=1}^{\infty} n r^{n-1} = \frac{1}{(1-r)^2}, \qquad |r| < 1.$$
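A quick numerical check of the series identity used in the hint (illustrative Python, not part of the slides), with r = 1/2:

```python
import numpy as np

r = 0.5
n = np.arange(1, 60)
print((n * r**(n - 1)).sum(), 1 / (1 - r)**2)   # both ≈ 4.0
```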

Page 25:

§1 Entropy and mutual information

§1.1 Discrete random variables

§1.1.1 Discrete memoryless source and entropy
§1.1.2 Discrete memoryless channel and mutual information

§1.2 Discrete random vectors

Page 26:

§1.1.2 Discrete memoryless channel and mutual information

Channel model:
$$x \in X = \{a_1, a_2, \ldots, a_r\} \ \longrightarrow \ \text{Channel } p(y|x) \ \longrightarrow \ y \in Y = \{b_1, b_2, \ldots, b_s\}$$
$$p(y|x) \ \text{or} \ p(b_j|a_i), \qquad \sum_{j=1}^{s} p(b_j|a_i) = 1$$

Page 27:

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

The model of a DMC:
[Figure: inputs 0, 1, …, r-1 mapped to outputs 0, 1, …, s-1 through the transition probabilities p(y|x).]

r input symbols, s output symbols.

Page 28:

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

Representation of a DMC — graph: each input x is joined to each output y by an edge labelled with the transition probability p(y|x), where
$$p(y|x) \ge 0 \ \text{for all } x, y, \qquad \sum_{y} p(y|x) = 1 \ \text{for all } x.$$

Page 29:

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

Representation of a DMC — transition probability matrix:
$$P = \begin{pmatrix} p(b_1|a_1) & p(b_2|a_1) & \cdots & p(b_s|a_1) \\ p(b_1|a_2) & p(b_2|a_2) & \cdots & p(b_s|a_2) \\ \vdots & \vdots & & \vdots \\ p(b_1|a_r) & p(b_2|a_r) & \cdots & p(b_s|a_r) \end{pmatrix} = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1s} \\ p_{21} & p_{22} & \cdots & p_{2s} \\ \vdots & \vdots & & \vdots \\ p_{r1} & p_{r2} & \cdots & p_{rs} \end{pmatrix}, \qquad \sum_{j=1}^{s} p(b_j|a_i) = 1.$$
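A minimal sketch of a DMC stored as its transition probability matrix (illustrative Python; the helper name is mine), with a row-stochastic check; the test matrix is a BSC-style channel with crossover probability 0.1:

```python
import numpy as np

def make_dmc(rows):
    """Validate a transition matrix: non-negative entries, each row summing to 1."""
    P = np.asarray(rows, dtype=float)
    assert (P >= 0).all() and np.allclose(P.sum(axis=1), 1.0)
    return P

bsc = make_dmc([[0.9, 0.1],   # p(y|x=0)
                [0.1, 0.9]])  # p(y|x=1)
```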

Page 30:

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

Representation of a DMC — formula: p(y|x) = p(b_j|a_i), with
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & \cdots & a_r \\ & p(a_i) & \end{pmatrix}, \ \sum_{i=1}^{r} p(a_i) = 1; \qquad \begin{pmatrix} Y \\ P(y) \end{pmatrix} = \begin{pmatrix} b_1 & b_2 & \cdots & b_s \\ & p(b_j) & \end{pmatrix}, \ \sum_{j=1}^{s} p(b_j) = 1; \qquad \sum_{j=1}^{s} p(b_j|a_i) = 1.$$

Page 31:

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

Example 1.1.8: BSC (Binary Symmetric Channel)

r = s = 2
p(0|0) = p(1|1) = 1-p, p(0|1) = p(1|0) = p
$$P = \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}$$
[Figure: 0→0 and 1→1 with probability 1-p; 0→1 and 1→0 with probability p.]

Page 32:

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

Example 1.1.9: BEC (Binary Erasure Channel)

[Illustration: a transmitted binary sequence and the received sequence, with erased positions shown as "?".]

Page 33:

§1.1.2 Discrete memoryless channel and mutual information

1. DMC (Discrete Memoryless Channel)

Example 1.1.9: BEC (Binary Erasure Channel)

r = 2, s = 3
p(0|0) = p, p(?|0) = 1-p
p(1|1) = q, p(?|1) = 1-q
$$P = \begin{pmatrix} p & 1-p & 0 \\ 0 & 1-q & q \end{pmatrix}$$
[Figure: 0→0 with probability p, 0→? with 1-p; 1→1 with probability q, 1→? with 1-q.]

Page 34:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Definition:
I(X;Y) = H(X) – H(X|Y)

Channel model: x ∈ X = {a_1, …, a_r} → channel with p(y|x) = p(b_j|a_i), Σ_{j=1}^{s} p(b_j|a_i) = 1 → y ∈ Y = {b_1, …, b_s}.

H(X): entropy; H(X|Y): equivocation; I(X;Y): average mutual information —
- the reduction in uncertainty about X conveyed by the observation of Y;
- the information about X obtained from Y.

Page 35:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Definition:
$$I(X;Y) = H(X) - H(X|Y) = \sum_{X} p(x)\log\frac{1}{p(x)} - \sum_{XY} p(x,y)\log\frac{1}{p(x|y)}$$
$$= \sum_{XY} p(x,y)\log\frac{p(x|y)}{p(x)} = \sum_{XY} p(x,y)\log\frac{p(x,y)}{p(x)p(y)} = \sum_{XY} p(x,y)\log\frac{p(y|x)}{p(y)}$$

Page 36:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Definition:

I(X;Y) and I(x;y): the mutual information of a pair of outcomes is
$$I(x;y) = \log\frac{P(x|y)}{P(x)}$$
and I(X;Y) = E_{XY}[I(x;y)].

I(X;Y) and H(X)

Page 37:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Properties

1) Non-negativity of average mutual information

Theorem 1.4 For any discrete random variables X and Y, I(X;Y) ≥ 0. Moreover, I(X;Y) = 0 iff X and Y are independent.

Proof: (Theorem 1.3 in textbook)

We do not expect to be misled, on average, by observing the output of the channel.

Page 38:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Properties

A cryptosystem:
[Figure: source S → encrypt (with Key) → channel → decrypt → destination D; a listener-in taps the channel between X and Y′, and for the listener-in the information is a "total loss".]

message: arrive at four
ciphertext: duulyh dw irxu

Page 39:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Properties

2) Symmetry: I(X;Y) = I(Y;X)

3) Relationship between entropy and average mutual information:
I(X;Y) = H(X) – H(X|Y)
I(X;Y) = H(Y) – H(Y|X)
I(X;Y) = H(X) + H(Y) – H(XY)   (H(XY) is the joint entropy)

[Mnemonic Venn diagram: H(XY) is the union of the circles H(X) and H(Y); H(X|Y) and H(Y|X) are the non-overlapping parts; I(X;Y) is the overlap.]

Page 40:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Properties

Recognising channels:
[Figure: three channel graphs — (i) inputs a_1, …, a_r each connected to a distinct output b_1, …, b_r; (ii) inputs a_1, a_2 connected to outputs b_1, …, b_5; (iii) inputs a_1, a_2, a_3 connected to outputs b_1, b_2.]

Page 41:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Properties

4) Convexity: I(X;Y) = f [P(x), P(y|x)]
$$I(X;Y) = \sum_{XY} P(xy)\log\frac{P(x|y)}{P(x)}, \qquad P(y) = \sum_{X} P(xy) = \sum_{X} P(y|x)P(x), \qquad P(xy) = P(y|x)P(x)$$

Page 42:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Properties

4) Convexity: I(X;Y) = f [P(x), P(y|x)]

Theorem 1.5 For fixed P(y|x), I(X;Y) is a concave (∩-convex) function of the input probabilities P(x). (Theorem 1.6 in textbook)

Theorem 1.6 For fixed P(x), I(X;Y) is a convex (∪-convex) function of the transition probabilities P(y|x). (Theorem 1.7 in textbook)

Page 43:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Example 1.1.10 Analyse I(X;Y) of the BSC.

Source: $\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ \omega & 1-\omega \end{pmatrix}$; channel: BSC with crossover probability p (0→0 and 1→1 with probability 1-p, 0→1 and 1→0 with probability p).

$$I(X;Y) = \sum_{XY} P(xy)\log\frac{P(x|y)}{P(x)} = \sum_{XY} P(xy)\log\frac{P(y|x)}{P(y)} = H(Y) - H(Y|X)$$

Page 44:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Example 1.1.10 Analyse I(X;Y) of the BSC.
$$H(Y|X) = \sum_{XY} p(xy)\log\frac{1}{p(y|x)} = -p\log p - (1-p)\log(1-p) = H(p)$$
$$H(Y) = H(\omega + p - 2\omega p)$$
$$I(X;Y) = H(\omega + p - 2\omega p) - H(p)$$
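A short sketch of the closed form just derived, I(X;Y) = H(ω + p − 2ωp) − H(p) (illustrative Python; the symbol and function names are mine):

```python
import numpy as np

def H(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def bsc_mutual_information(omega, p):
    """I(X;Y) for a BSC with crossover p and input distribution P(X=0) = omega."""
    return H(omega + p - 2 * omega * p) - H(p)

print(bsc_mutual_information(0.5, 0.1))   # ≈ 0.531 bit = 1 - H(0.1), the maximum over omega
```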

Page 45:

§1.1.2 Discrete memoryless channel and mutual information

2. Average mutual information

Example 1.1.10 (continued)
$$I(X;Y) = H(\omega + p - 2\omega p) - H(p)$$

Page 46:

Review

Keywords: the channel and its information measures

channel model

equivocation

average mutual information

mutual information

properties of average mutual information

Page 47:

§1.1.2 Discrete memoryless channel and mutual information

Thinking:
- What is the relation between I(X;Y) = 0 and Cov(X;Y) = 0?
- Compare I(X;Y), H(X) and H(Y).

Page 48:

§1.1.2 Discrete memoryless channel and mutual information

Example 1.1.11

Let the source have alphabet A = {0, 1} with p0 = p1 = 0.5. Let encoder C have alphabet B = {0, 1, …, 7} and let the elements of B have the binary representation b = (b_0 b_1 b_2)^T. The encoder is shown below. Find the entropy of the coded output, and find the output sequence if the input sequence is a(t) = {101001011000001100111011} and the initial contents of the registers are b(t=0) = 5 = (101).

[Figure: a shift register of three D flip-flops driven by a(t); their outputs are b_0, b_1, b_2.]

Page 49:

§1.1.2 Discrete memoryless channel and mutual information

[Figure: state transition diagram between Y_t ∈ {0, …, 7} and Y_{t+1} ∈ {0, …, 7}.]

a(t) = {101001011000001100111011}
b = {001242425124366675013666}

Page 50:

Homework

1. P45: T1.10
2. P46: T1.19 (except c)
3. Let the DMS
$$\begin{pmatrix} X \\ p(x) \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0.6 & 0.4 \end{pmatrix}$$
convey messages through the channel
$$P = \begin{pmatrix} p(0|0) & p(1|0) \\ p(0|1) & p(1|1) \end{pmatrix} = \begin{pmatrix} 5/6 & 1/6 \\ 3/4 & 1/4 \end{pmatrix}$$
Calculate: (1) H(X) and H(Y); (2) the mutual information of x_i and y_j (i, j = 1, 2); (3) the equivocation H(X|Y) and the average mutual information.

Page 51:

Homework

4. Suppose that I(X;Y) = 0. Does this imply that I(X;Z) = I(X;Z|Y)?

5. In a joint ensemble XY, the mutual information I(x;y) is a random variable. In this problem we are concerned with the variance of that random variable, VAR[I(x;y)].
(1) Prove that VAR[I(x;y)] = 0 iff there is a constant α such that, for all x, y with P(xy) > 0, P(xy) = αP(x)P(y).
(2) Express I(X;Y) in terms of α and interpret the special case α = 1. (continued)

Page 52:

Homework

5. (3) For each of the channels in Fig. 5 (below), find a probability assignment P(x) such that I(X;Y) > 0 and VAR[I(x;y)] = 0, and calculate I(X;Y).

[Fig. 5: two channels — (i) inputs a_1, a_2, a_3 mapped to outputs b_1, b_2 with transition probabilities equal to 1; (ii) inputs a_1, a_2, a_3 mapped to outputs b_1, b_2, b_3 with transition probabilities equal to 1/2.]

Page 53:

§1 Entropy and mutual information

§1.1 Discrete random variables

§1.2 Discrete random vectors

§1.2.1 Extended source and joint entropy
§1.2.2 Extended channel and mutual information

Page 54:

§1.2.1 Extended source and joint entropy

1. Extended source

Source model:
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & \cdots & a_q \\ p_1 & p_2 & \cdots & p_q \end{pmatrix}, \qquad \sum_{i=1}^{q} p_i = 1$$

N-times extended source: $X^N = (X_1 X_2 \cdots X_N)$, with each $X_i \in \{a_1, a_2, \ldots, a_q\}$:
$$\begin{pmatrix} X^N \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1a_1\cdots a_1 & \cdots & a_qa_q\cdots a_q \\ p_1 & \cdots & p_{q^N} \end{pmatrix}, \qquad p_i = p(a_{i_1})p(a_{i_2})\cdots p(a_{i_N}), \quad \sum_{i=1}^{q^N} p_i = 1$$

Example 1.2.1

Page 55:

§1.2.1 Extended source and joint entropy

2. Joint entropy

Definition: The joint entropy H(XY) of a pair of discrete random variables (X, Y) with a joint distribution p(x, y) is defined as
$$H(XY) = -\sum_{x\in X}\sum_{y\in Y} p(x,y)\log p(x,y)$$
which can also be expressed as
$$H(XY) = -E[\log p(x,y)]$$
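A small sketch of the joint-entropy definition (illustrative Python, not from the slides); for an independent pair it reproduces property 5 of §1.1.1, H(XY) = H(X) + H(Y):

```python
import numpy as np

def joint_entropy(joint):
    """H(XY) = -sum_{x,y} p(x,y) log2 p(x,y), skipping zero entries."""
    p = np.asarray(joint, dtype=float).ravel()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

px = np.array([0.5, 0.5])
py = np.array([0.25, 0.25, 0.25, 0.25])
print(joint_entropy(np.outer(px, py)))   # 3.0 bit = 1 + 2
```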

Page 56:

§1.2.1 Extended source and joint entropy

2. Joint entropy — extended DMS
$$H(\mathbf{X}) = H(X^N) = -\sum_{i=1}^{q^N} p_i \log p_i = -\sum_{i_1=1}^{q}\cdots\sum_{i_N=1}^{q} p(a_{i_1}a_{i_2}\cdots a_{i_N})\log p(a_{i_1}a_{i_2}\cdots a_{i_N})$$
$$= -\sum_{i_1=1}^{q}\cdots\sum_{i_N=1}^{q} p(a_{i_1})\cdots p(a_{i_N})\log\left[p(a_{i_1})p(a_{i_2})\cdots p(a_{i_N})\right] = H(X) + H(X) + \cdots + H(X) = NH(X)$$

Page 57:

§1.2.1 Extended source and joint entropy

2. Joint entropy — source with memory

1) Conditional entropy
$$H(X_2|X_1) = -\sum_{i=1}^{q}\sum_{j=1}^{q} p(a_i a_j)\log p(a_j|a_i)$$

2) Joint entropy
$$H(X_1X_2) = -\sum_{i=1}^{q}\sum_{j=1}^{q} p(a_i a_j)\log p(a_i a_j)$$

3) (Per symbol) entropy
$$H_2(X) = \frac{1}{2}H(X_1X_2) \quad \text{bit/sig}$$

Page 58:

§1.2.1 Extended source and joint entropy

3. Properties of joint entropy

Theorem 1.7 (Chain rule): H(XY) = H(X) + H(Y|X)

Proof:
$$H(XY) = -\sum_{x\in X}\sum_{y\in Y} p(x,y)\log p(x,y) = -\sum_{x\in X}\sum_{y\in Y} p(x,y)\log\left[p(x)p(y|x)\right]$$
$$= -\sum_{x\in X}\sum_{y\in Y} p(x,y)\log p(x) - \sum_{x\in X}\sum_{y\in Y} p(x,y)\log p(y|x) = -\sum_{x\in X} p(x)\log p(x) - \sum_{x\in X}\sum_{y\in Y} p(x,y)\log p(y|x)$$
$$= H(X) + H(Y|X)$$

Page 59:

§1.2.1 Extended source and joint entropy

3. Properties of joint entropy

Example 1.2.3 Let X be a random variable with probability space
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} 0 & 1 & 2 \\ 1/2 & 1/3 & 1/6 \end{pmatrix}$$

Its joint probabilities P(a_i a_j) (rows a_i, columns a_j):

P(a_i a_j)    a_j=0    a_j=1    a_j=2
  a_i=0        1/4      1/4      0
  a_i=1        1/4      1/24     1/24
  a_i=2        0        1/24     1/8

Conditional probabilities P(a_j | a_i):

P(a_j|a_i)    a_j=0    a_j=1    a_j=2
  a_i=0        1/2      1/2      0
  a_i=1        3/4      1/8      1/8
  a_i=2        0        1/4      3/4

H(X) = ?
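A numeric sketch for Example 1.2.3 (illustrative Python, not part of the slides), checking the chain rule H(X1X2) = H(X1) + H(X2|X1) on the tables above:

```python
import numpy as np

def H(p):
    """Entropy in bits of a probability array, skipping zeros."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

joint = np.array([[1/4, 1/4,  0.0 ],
                  [1/4, 1/24, 1/24],
                  [0.0, 1/24, 1/8 ]])           # P(a_i a_j) from the table above

p1 = joint.sum(axis=1)                          # marginal P(a_i) = (1/2, 1/3, 1/6)
cond = joint / p1[:, None]                      # P(a_j | a_i)
h_cond = -(joint[joint > 0] * np.log2(cond[joint > 0])).sum()

print(H(p1), h_cond, H(joint))                  # ≈ 1.46, 0.99, 2.45: chain rule holds
```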

Page 60:

§1.2.1 Extended source and joint entropy

3. Properties of joint entropy

Relationships:
H(X1X2) = H(X1) + H(X2|X1)
H(X2) ≥ H(X2|X1)
H(X1X2) ≤ 2H(X1)
H_2(X) ≤ H(X)

Page 61:

§1.2.1 Extended source and joint entropy

3. Properties of joint entropy

General stationary source:
$$\begin{pmatrix} X \\ P(x) \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & \cdots & a_q \\ p(a_1) & p(a_2) & \cdots & p(a_q) \end{pmatrix}, \qquad \sum_{i=1}^{q} p(a_i) = 1$$

Let X_1, X_2, …, X_N be dependent; the joint probability is
$$P(X_1X_2\cdots X_N) = p(a_{i_1}a_{i_2}\cdots a_{i_N}), \qquad i_1, i_2, \ldots, i_N \in \{1, 2, \ldots, q\}$$
$$P(x_1x_2\cdots x_N) = P(x_1)P(x_2|x_1)\cdots P(x_N|x_1x_2\cdots x_{N-1})$$

Page 62:

§1.2.1 Extended source and joint entropy

3. Properties of joint entropy

Definition of entropies:

• Joint entropy
$$H(X_1X_2\cdots X_N) = -\sum_{i_1\cdots i_N} p(a_{i_1}\cdots a_{i_N})\log p(a_{i_1}\cdots a_{i_N})$$

• Conditional entropy
$$H(X_N|X_1\cdots X_{N-1}) = -\sum_{i_1\cdots i_N} p(a_{i_1}\cdots a_{i_N})\log p(a_{i_N}|a_{i_1}\cdots a_{i_{N-1}})$$

• (Per symbol) entropy
$$H_N(X) = \frac{1}{N}H(X_1X_2\cdots X_N)$$

Page 63:

§1.2.1 Extended source and joint entropy

3. Properties of joint entropy

Theorem 1.8 (Chain rule for entropy): Let X1, X2, …, Xn be drawn according to p(x1, x2, …, xn). Then
$$H(X_1X_2\cdots X_n) = \sum_{i=1}^{n} H(X_i|X_{i-1}\cdots X_1)$$

Proof: (do it by yourself)

Page 64:

§1.2.1 Extended source and joint entropy

3. Properties of joint entropy

Relations of entropies. If H(X) < ∞, then:
$$H(X_N|X_1X_2\cdots X_{N-1}) \le H(X_{N-1}|X_1X_2\cdots X_{N-2})$$
$$H_N(X) \ge H(X_N|X_1\cdots X_{N-1})$$
$$H_N(X) \le H_{N-1}(X)$$

Entropy rate:
$$H_\infty = \lim_{N\to\infty} H_N(X) = \lim_{N\to\infty} H(X_N|X_1X_2\cdots X_{N-1})$$
—— the basis of data compression

Page 65:

§1.2.1 Extended source and joint entropy

3. Properties of joint entropy

Theorem 1.9 (Independence bound on entropy): Let X1, X2, …, Xn be drawn according to p(x1, x2, …, xn). Then
$$H(X_1X_2\cdots X_n) \le \sum_{i=1}^{n} H(X_i)$$
with equality iff the Xi are independent. (P37 (corollary) in textbook)

Page 66:

§1.2.1 Extended source and joint entropy

Example 1.2.4

Suppose a memoryless source with A={0,1} having equal probabilities emits a sequence of six symbols. Following the sixth symbol, suppose a seventh symbol is transmitted which is the sum modulo 2 of the six previous symbols. What is the entropy of the seven-symbol sequence?

3. Properties of joint entropy
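A brute-force check of Example 1.2.4 (illustrative Python, not part of the slides): the seventh symbol is a deterministic function of the first six, so the joint entropy stays at 6 bits:

```python
from itertools import product
import numpy as np

# every 7-symbol word: 6 equiprobable bits followed by their modulo-2 sum
words = {bits + (sum(bits) % 2,) for bits in product((0, 1), repeat=6)}
p = np.full(len(words), 1 / 64)          # 64 equally likely words
print(-(p * np.log2(p)).sum())           # 6.0 bits
```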

Page 67:

§1 Entropy and mutual information

§1.1 Discrete random variables

§1.2 Discrete random vectors

§1.2.1 Extended source and joint entropy
§1.2.2 Extended channel and mutual information

Page 68:

§1.2.2 Extended channel and mutual information

1. The model of the extended channel

A general communication system:
[Figure: source → encoder → channel → decoder; (U1, U2, …, Uk) → (X1, X2, …, XN) = X^N → channel → (Y1, Y2, …, YN) = Y^N → (V1, V2, …, Vk).]

Page 69:

§1.2.2 Extended channel and mutual information

1. The model of the extended channel

Extended channel: X^N = (X_1X_2 ⋯ X_N), Y^N = (Y_1Y_2 ⋯ Y_N), with each X_i ∈ {a_1, …, a_r}, each Y_j ∈ {b_1, …, b_s}, and transition probabilities P(y|x).

For a DMC:
$$P(\mathbf{y}|\mathbf{x}) = P(y_1y_2\cdots y_N|x_1x_2\cdots x_N) = \prod_{i=1}^{N} P(y_i|x_i)$$

Page 70:

§1.2.2 Extended channel and mutual information

1. The model of the extended channel

Transition matrix of the extended channel: the rows are indexed by the r^N input vectors $\mathbf{a}_h = (a_{h_1}\cdots a_{h_N}) \in X^N$ and the columns by the s^N output vectors $\mathbf{b}_k = (b_{k_1}\cdots b_{k_N}) \in Y^N$; the (h, k) entry is $p(\mathbf{b}_k|\mathbf{a}_h) = p(b_{k_1}\cdots b_{k_N}|a_{h_1}\cdots a_{h_N})$.

Page 71:

§1.2.2 Extended channel and mutual information

2. Average mutual information
$$I(\mathbf{X};\mathbf{Y}) = I(X^N;Y^N) = H(X^N) - H(X^N|Y^N) = H(Y^N) - H(Y^N|X^N)$$
$$= \sum_{\mathbf{x}\in X^N}\sum_{\mathbf{y}\in Y^N} P(\mathbf{xy})\log\frac{P(\mathbf{y}|\mathbf{x})}{P(\mathbf{y})} = \sum_{k=1}^{r^N}\sum_{h=1}^{s^N} p(\mathbf{a}_k\mathbf{b}_h)\log\frac{p(\mathbf{b}_h|\mathbf{a}_k)}{p(\mathbf{b}_h)}, \qquad \mathbf{a}_k = (a_{k_1}\cdots a_{k_N}), \ \mathbf{b}_h = (b_{h_1}\cdots b_{h_N})$$

Example 1.2.5

Page 72:

§1.2.2 Extended channel and mutual information

3. The properties

Theorem 1.11 If the components (X1, X2, …, XN) of X^N are independent, then
$$I(\mathbf{X};\mathbf{Y}) \ge \sum_{i=1}^{N} I(X_i;Y_i)$$
(Theorem 1.8 in textbook)

Page 73:

§1.2.2 Extended channel and mutual information

3. The properties

Theorem 1.12 If X^N = (X1, X2, …, XN) and Y^N = (Y1, Y2, …, YN) are random vectors and the channel is memoryless, that is,
$$P(y_1,\ldots,y_N|x_1,\ldots,x_N) = \prod_{i=1}^{N} P(y_i|x_i)$$
then
$$I(\mathbf{X};\mathbf{Y}) \le \sum_{i=1}^{N} I(X_i;Y_i)$$
(Theorem 1.9 in textbook)

Page 74:

§1.2.2 Extended channel and mutual information

Example 1.2.6

Let X1, X2, …, X5 be independent identically distributed random variables with common entropy H. Also let T be a permutation of the set {1, 2, 3, 4, 5}, say
$$T = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 5 & 1 & 4 \end{pmatrix}$$
and let Y_i = X_{T(i)}. Show that
$$\sum_{i=1}^{5} I(X_i;Y_i) \le I(X^5;Y^5).$$

Page 75:

Review

Keywords: measure of information for vectors — extended source, joint entropy, extended channel, stationary source, (per symbol) entropy, conditional entropy, entropy rate.

Page 76:

Review

Conclusions:
- chain rule for entropy
- independence bound on entropy
- conditioning reduces entropy
- properties of I(X;Y)

Page 77:

Homework

1. P47: T1.23
2. P47: T1.24
3. Let X1, X2, …, X_{n-1} be i.i.d. random variables taking values in {0, 1}, with Pr{X_i = 1} = 1/2. Let X_n = 1 if Σ_{i=1}^{n-1} X_i is odd and X_n = 0 otherwise, and let n ≥ 3.
(1) Show that X_i and X_j are independent, for i ≠ j, i, j ∈ {1, 2, …, n};
(2) Find H(X_i X_j), for i ≠ j;
(3) Find H(X_1 X_2 … X_n); is this equal to nH(X_1)?

Page 78:

Homework

4. Let X1, X2 be identically distributed random variables. Let
$$\rho = 1 - \frac{H(X_2|X_1)}{H(X_1)}$$
1) Show that 0 ≤ ρ ≤ 1;
2) When is ρ = 0?
3) When is ρ = 1?

Page 79:

Homework

5. Shuffles increase entropy. Argue that for any distribution on shuffles T and any distribution on card positions X,
H(TX) ≥ H(TX|T), if X and T are independent.

Thinking: