Information Theory & Coding Techniques

Subject: DCOM, Unit 5
By Ajinkya C. Kulkarni ([email protected])
Contents at a Glance
- Introduction
- Basics of information theory
- Entropy concept
- Data compression
- Variable length coding
- Shannon-Fano coding
- Huffman coding
Digital Communication
Information Theory
Information is basically a set of symbols, each symbol having its own probability of occurrence: if s1, s2, ... are the symbols, their probabilities are given by P(si) = pi.
Self-information:

I(p) = log(1/p) = -log(p)

measures the amount of information in an event of occurrence probability p.

Properties of I(p):
- I(p) >= 0 (a real, nonnegative measure)
- I(p1 p2) = I(p1) + I(p2) for independent events
- I(p) is a continuous function of p
Entropy
Entropy is the minimum average bit length needed to code the symbols without distortion (the lower bound on the size of compacted data that can be recovered back exactly). The entropy function was introduced by Shannon and is denoted H.
Let S = {s1, s2, ..., sN} be the set of symbols (alphabet), N the number of symbols in the alphabet, and P = {p1, p2, ..., pN} the probability distribution of the symbols. According to Shannon, the entropy H of an information source S is defined as:

H = -SUM(i = 1..N) pi log2(pi)
Terms in the entropy formula H = -SUM(i = 1..N) pi log2(pi):
- pi is the probability that symbol si will occur in S.
- log2(1/pi) indicates the amount of information (the self-information as defined by Shannon) contained in si, which corresponds to the number of bits needed to encode si.
Entropy of a binary source (N = 2, S = {0, 1}, p0 = p, p1 = 1 - p):

H = -[p log2(p) + (1 - p) log2(1 - p)]
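The binary-source formula can be checked numerically. A minimal Python sketch (the function name `binary_entropy` is my own, not from the slides):

```python
from math import log2

def binary_entropy(p):
    """H(p) = -[p*log2(p) + (1-p)*log2(1-p)] for a binary source."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return -(p * log2(p) + (1 - p) * log2(1 - p))

# H peaks at 1 bit when p = 0.5 and falls to 0 as p approaches 0 or 1.
print(binary_entropy(0.5))  # 1.0
```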
Entropy for a uniform distribution (pi = 1/N):

H = -SUM(i = 1..N) (1/N) log2(1/N) = log2(N)

Examples:
- N = 2: pi = 0.5, H = log2(2) = 1 bit
- N = 256: pi = 1/256, H = log2(256) = 8 bits

Entropy gives the minimum number of bits required for coding.
Entropy example
- X is sampled from {a, b, c, d}
- P = {1/2, 1/4, 1/8, 1/8}
- Find the entropy.
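The exercise can be checked by computing H directly from the definition. A minimal Python sketch (the helper name `entropy` is my own):

```python
from math import log2

def entropy(probs):
    """H = -sum(p_i * log2(p_i)): minimum average bits per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Exercise: X sampled from {a, b, c, d}, P = {1/2, 1/4, 1/8, 1/8}
# H = 1/2*1 + 1/4*2 + 1/8*3 + 1/8*3 = 1.75 bits/symbol
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75
```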
Data Compression
Data compression is the process of coding that effectively reduces the total number of bits needed to represent certain information.

Pipeline: input -> data compression -> network -> data decompression -> recovered data (output)
Lossless vs. Lossy Compression
If the compression and decompression processes induce no information loss, the compression scheme is lossless; otherwise it is lossy.

Compression ratio = B0 / B1, where B0 is the number of bits before compression and B1 is the number of bits after compression.
Compression codes: variable length coding
- Shannon-Fano code
- Huffman's code
Shannon-Fano Coding
A top-down approach.

Shannon-Fano algorithm:
1) Sort the symbols in decreasing order of probability: p1 >= p2 >= ... >= pN.
2) Recursively divide the list into two parts, each with approximately the same total count (probability), assigning bit 0 to one part and bit 1 to the other.
Example: calculate the entropy and the Shannon-Fano code for the given information.

Symbol counts: A = 15, B = 7, C = 6, D = 6, E = 5 (39 total), so
pi = 15/39, 7/39, 6/39, 6/39, 5/39.

Step 1: Split {A, B, C, D, E} (counts 15, 7, 6, 6, 5) into
{A, B} (15 + 7 = 22, bit 0) and {C, D, E} (6 + 6 + 5 = 17, bit 1).

Step 2: Split {A, B} into A (15, bit 0) and B (7, bit 1);
split {C, D, E} into C (6, bit 0) and {D, E} (6 + 5 = 11, bit 1).

Step 3: Split {D, E} into D (6, bit 0) and E (5, bit 1).

Result (codes read off the binary tree):

Symbol  pi     -log2(pi)  Code  Subtotal
A       15/39  1.38       00    2 * 15
B       7/39   2.48       01    2 * 7
C       6/39   2.70       10    2 * 6
D       6/39   2.70       110   3 * 6
E       5/39   2.96       111   3 * 5
                          Total: 89 bits

Average code length L = 89/39 = 2.28 bits/symbol (the entropy of this source is H = 2.19 bits/symbol, so the code is close to but not at the lower bound).
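The worked example can be reproduced with a short recursive implementation. This is a sketch assuming the usual split-at-nearest-half rule; the function and variable names are my own:

```python
def shannon_fano(counts):
    """Assign Shannon-Fano codes: sort symbols by decreasing count, then
    recursively split the list into two parts of roughly equal total
    count, prefixing '0' to one part and '1' to the other."""
    codes = {}

    def split(items, prefix):
        if len(items) == 1:
            codes[items[0][0]] = prefix or "0"
            return
        total = sum(c for _, c in items)
        # choose the split point whose running sum is closest to half
        best_k, best_diff, acc = 1, float("inf"), 0
        for k in range(1, len(items)):
            acc += items[k - 1][1]
            diff = abs(2 * acc - total)
            if diff < best_diff:
                best_k, best_diff = k, diff
        split(items[:best_k], prefix + "0")
        split(items[best_k:], prefix + "1")

    split(sorted(counts.items(), key=lambda kv: -kv[1]), "")
    return codes

codes = shannon_fano({"A": 15, "B": 7, "C": 6, "D": 6, "E": 5})
# codes == {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
```

With the slide's counts this reproduces the table above: 2-bit codes for A, B, C and 3-bit codes for D, E, totalling 89 bits.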
Huffman Coding
A bottom-up approach.

Huffman coding algorithm:
1) Arrange the symbols in decreasing order of probability.
2) Pick the two lowest-probability symbols and combine them to form a parent node.
3) Assign the parent the sum of its children's probabilities and reinsert it into the list, maintaining the order.
4) Delete the child symbols from the list.
5) Repeat the procedure until only one symbol (the root) remains.
6) Assign each symbol a codeword based on its path from the root.
Huffman coding example:
Source alphabet A = {a, b, c, d, e} with probability distribution {0.2, 0.4, 0.2, 0.1, 0.1}, i.e. b (0.4), a (0.2), c (0.2), d (0.1), e (0.1).
Huffman coding result:
Entropy: H(S) = -[2 * 0.2 * log2(0.2) + 0.4 * log2(0.4) + 2 * 0.1 * log2(0.1)] = 2.122 bits/symbol
Average Huffman codeword length: L = 0.2*2 + 0.4*1 + 0.2*3 + 0.1*4 + 0.1*4 = 2.2 bits/symbol

In general: H(S) <= L < H(S) + 1
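The merge procedure can be sketched with a priority queue. A minimal Python sketch (names are my own); note that a Huffman code table is not unique, so the individual codewords may differ from the slide's tree, but the average length still comes out to the 2.2 bits/symbol computed above:

```python
import heapq

def huffman(probs):
    """Repeatedly merge the two least-probable nodes; each symbol's
    codeword is its 0/1 path from the root down to its leaf."""
    # heap entries: (probability, tie-breaker, {symbol: partial code})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        p0, _, low = heapq.heappop(heap)    # lowest probability
        p1, _, high = heapq.heappop(heap)   # second lowest
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (p0 + p1, tie, merged))
        tie += 1
    return heap[0][2]

probs = {"a": 0.2, "b": 0.4, "c": 0.2, "d": 0.1, "e": 0.1}
codes = huffman(probs)
avg = sum(probs[s] * len(codes[s]) for s in probs)  # 2.2 bits/symbol (up to rounding)
```

The integer tie-breaker keeps heap comparisons well-defined when two nodes have equal probability.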
Properties of Huffman codes
- Unique prefix property: no codeword is a prefix of another, which precludes any ambiguity in decoding (though the code table itself is not unique).
- Optimality: a minimum-redundancy code, proved optimal for a given data model.
- The two least probable symbols have Huffman codes of the same length, differing only in the last bit.
- Symbols that occur more frequently have shorter Huffman codes than symbols that occur less frequently.
- The average code length for an information source S is strictly less than entropy + 1.
Shannon's Theorem (Source Coding Theorem)
Given a discrete memoryless source of entropy H, the average codeword length L for any source encoding satisfies:

H <= L

Considering the Shannon-Fano code, if M is the number of messages to be transmitted and N is the number of bits per word, then M = 2^N.
Shannon-Hartley Theorem (Channel Capacity Theorem)
The channel capacity of a band-limited white Gaussian channel is given by:

C = B log2(1 + P/N) bits/sec

where B is the channel bandwidth, P is the signal power, and N is the noise power within the channel bandwidth.
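The capacity formula can be evaluated directly. A minimal sketch; the telephone-channel numbers below are an illustrative assumption, not from the slides:

```python
from math import log2

def channel_capacity(bandwidth_hz, signal_power, noise_power):
    """Shannon-Hartley: C = B * log2(1 + P/N) bits/second."""
    return bandwidth_hz * log2(1 + signal_power / noise_power)

# Illustrative: a 3.1 kHz channel at 30 dB SNR (P/N = 1000)
c = channel_capacity(3100, 1000, 1)  # roughly 31 kbit/s
```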
Shannon-Hartley Theorem: explanation
[Figure: a discrete-time memoryless white Gaussian channel, in which Gaussian noise is added to each sample of the transmitted signal to produce the corresponding sample of the received signal.]
Trade-off between bandwidth and SNR
- Effect of SNR
- Effect of bandwidth
- Trade-off between the two
Assignment
1. A discrete memoryless source has S = {a, b, c, d, e} and P = {0.4, 0.19, 0.16, 0.15, 0.1}. Explain the Shannon-Fano algorithm by constructing a code for this source.
2. Explain the Huffman coding algorithm for the same source. Also find the entropy and the average code length.
3. Why do we use variable length coding?
4. State and prove the information capacity theorem.

Date of submission: 27th Nov 2010
End of Unit 5