Information Theory & Coding Techniques

Subject: DCOM, Unit 5
By Ajinkya C. Kulkarni ([email protected])
Contents at a Glance
- Introduction
- Basics of information theory
- Entropy concept
- Data compression
- Variable length coding
- Shannon-Fano coding
- Huffman coding
Digital Communication
Information Theory
Information is basically a set of symbols, each symbol having its own probability of occurrence: if s1, s2, ... are the symbols, their probabilities are given by P(si) = pi.
Self-information:

I(p) = log(1/p) = -log(p)

measures the amount of information in an event of occurrence probability p.

Properties of I(p):
- I(p) >= 0 (a real, nonnegative measure)
- I(p1 p2) = I(p1) + I(p2) for independent events
- I(p) is a continuous function of p
Entropy
Entropy is the minimum average bit length needed to code the symbols without distortion (the lower bound on the size of compacted data that can be recovered back exactly). The entropy function was introduced by Shannon and is denoted H.
Let S = {s1, s2, ..., sN} be the set of symbols (alphabet), N the number of symbols in the alphabet, and P = {p1, p2, ..., pN} the probability distribution of the symbols. According to Shannon, the entropy H of an information source S is defined as:

H = -SUM(i = 1..N) pi log2(pi)
Terms in the entropy formula H = -SUM(i = 1..N) pi log2(pi):
- pi is the probability that symbol si will occur in S.
- log2(1/pi) indicates the amount of information (the self-information as defined by Shannon) contained in si, which corresponds to the number of bits needed to encode si.
Entropy of a binary source (N = 2, S = {0, 1}, p0 = p, p1 = 1 - p):

H = -[p log2(p) + (1 - p) log2(1 - p)]
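The binary-source formula can be checked numerically. A minimal Python sketch (the function name `binary_entropy` is my own, not from the slides):

```python
from math import log2

def binary_entropy(p):
    """H(p) = -[p*log2(p) + (1-p)*log2(1-p)] for a binary source."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return -(p * log2(p) + (1 - p) * log2(1 - p))

# H peaks at 1 bit when p = 0.5 and falls to 0 as p approaches 0 or 1.
print(binary_entropy(0.5))  # 1.0
```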
Entropy for a uniform distribution (pi = 1/N):

H = -SUM(i = 1..N) (1/N) log2(1/N) = log2(N)

Examples:
- N = 2: pi = 0.5, H = log2(2) = 1 bit
- N = 256: pi = 1/256, H = log2(256) = 8 bits

Entropy gives the minimum number of bits required for coding.
Entropy example
- X is sampled from {a, b, c, d}
- P = {1/2, 1/4, 1/8, 1/8}
- Find the entropy.
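The exercise can be checked by computing H directly from the definition. A minimal Python sketch (the helper name `entropy` is my own):

```python
from math import log2

def entropy(probs):
    """H = -sum(p_i * log2(p_i)): minimum average bits per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Exercise: X sampled from {a, b, c, d}, P = {1/2, 1/4, 1/8, 1/8}
# H = 1/2*1 + 1/4*2 + 1/8*3 + 1/8*3 = 1.75 bits/symbol
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75
```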
Data Compression
Data compression is the process of coding that effectively reduces the total number of bits needed to represent certain information.

Pipeline: input -> data compression -> network -> data decompression -> recovered data (output)
Lossless vs. Lossy Compression
If the compression and decompression processes induce no information loss, the compression scheme is lossless; otherwise it is lossy.

Compression ratio = B0 / B1, where B0 is the number of bits before compression and B1 is the number of bits after compression.
Compression codes: variable length coding
- Shannon-Fano code
- Huffman's code
Shannon-Fano Coding
A top-down approach.

Shannon-Fano algorithm:
1) Sort the symbols in decreasing order of probability: p1 >= p2 >= ... >= pN.
2) Recursively divide the list into two parts, each with approximately the same total count (probability), assigning bit 0 to one part and bit 1 to the other.
Example: calculate the entropy and the Shannon-Fano code for the given information.

Symbol counts: A = 15, B = 7, C = 6, D = 6, E = 5 (39 total), so
pi = 15/39, 7/39, 6/39, 6/39, 5/39.

Step 1: Split {A, B, C, D, E} (counts 15, 7, 6, 6, 5) into
{A, B} (15 + 7 = 22, bit 0) and {C, D, E} (6 + 6 + 5 = 17, bit 1).

Step 2: Split {A, B} into A (15, bit 0) and B (7, bit 1);
split {C, D, E} into C (6, bit 0) and {D, E} (6 + 5 = 11, bit 1).

Step 3: Split {D, E} into D (6, bit 0) and E (5, bit 1).

Result (codes read off the binary tree):

Symbol  pi     -log2(pi)  Code  Subtotal
A       15/39  1.38       00    2 * 15
B       7/39   2.48       01    2 * 7
C       6/39   2.70       10    2 * 6
D       6/39   2.70       110   3 * 6
E       5/39   2.96       111   3 * 5
                          Total: 89 bits

Average code length L = 89/39 = 2.28 bits/symbol (the entropy of this source is H = 2.19 bits/symbol, so the code is close to but not at the lower bound).
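The worked example can be reproduced with a short recursive implementation. This is a sketch assuming the usual split-at-nearest-half rule; the function and variable names are my own:

```python
def shannon_fano(counts):
    """Assign Shannon-Fano codes: sort symbols by decreasing count, then
    recursively split the list into two parts of roughly equal total
    count, prefixing '0' to one part and '1' to the other."""
    codes = {}

    def split(items, prefix):
        if len(items) == 1:
            codes[items[0][0]] = prefix or "0"
            return
        total = sum(c for _, c in items)
        # choose the split point whose running sum is closest to half
        best_k, best_diff, acc = 1, float("inf"), 0
        for k in range(1, len(items)):
            acc += items[k - 1][1]
            diff = abs(2 * acc - total)
            if diff < best_diff:
                best_k, best_diff = k, diff
        split(items[:best_k], prefix + "0")
        split(items[best_k:], prefix + "1")

    split(sorted(counts.items(), key=lambda kv: -kv[1]), "")
    return codes

codes = shannon_fano({"A": 15, "B": 7, "C": 6, "D": 6, "E": 5})
# codes == {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
```

With the slide's counts this reproduces the table above: 2-bit codes for A, B, C and 3-bit codes for D, E, totalling 89 bits.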
Huffman Coding
A bottom-up approach.

Huffman coding algorithm:
1) Arrange the symbols in decreasing order of probability.
2) Pick the two lowest-probability symbols and combine them to form a parent node.
3) Assign the parent the sum of its children's probabilities and reinsert it into the list, maintaining the order.
4) Delete the child symbols from the list.
5) Repeat the procedure until only one symbol (the root) remains.
6) Assign each symbol a codeword based on its path from the root.
Huffman coding example:
Source alphabet A = {a, b, c, d, e} with probability distribution {0.2, 0.4, 0.2, 0.1, 0.1}, i.e. b (0.4), a (0.2), c (0.2), d (0.1), e (0.1).
Huffman coding result:
Entropy: H(S) = -[2 * 0.2 * log2(0.2) + 0.4 * log2(0.4) + 2 * 0.1 * log2(0.1)] = 2.122 bits/symbol
Average Huffman codeword length: L = 0.2*2 + 0.4*1 + 0.2*3 + 0.1*4 + 0.1*4 = 2.2 bits/symbol

In general: H(S) <= L < H(S) + 1
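The merge procedure can be sketched with a priority queue. A minimal Python sketch (names are my own); note that a Huffman code table is not unique, so the individual codewords may differ from the slide's tree, but the average length still comes out to the 2.2 bits/symbol computed above:

```python
import heapq

def huffman(probs):
    """Repeatedly merge the two least-probable nodes; each symbol's
    codeword is its 0/1 path from the root down to its leaf."""
    # heap entries: (probability, tie-breaker, {symbol: partial code})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        p0, _, low = heapq.heappop(heap)    # lowest probability
        p1, _, high = heapq.heappop(heap)   # second lowest
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (p0 + p1, tie, merged))
        tie += 1
    return heap[0][2]

probs = {"a": 0.2, "b": 0.4, "c": 0.2, "d": 0.1, "e": 0.1}
codes = huffman(probs)
avg = sum(probs[s] * len(codes[s]) for s in probs)  # 2.2 bits/symbol (up to rounding)
```

The integer tie-breaker keeps heap comparisons well-defined when two nodes have equal probability.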
Properties of Huffman codes
- Unique prefix property: no codeword is a prefix of another, which precludes any ambiguity in decoding (though the code table itself is not unique).
- Optimality: a minimum-redundancy code, proved optimal for a given data model.
- The two least probable symbols have Huffman codes of the same length, differing only in the last bit.
- Symbols that occur more frequently have shorter Huffman codes than symbols that occur less frequently.
- The average code length for an information source S is strictly less than entropy + 1.
Shannon's Theorem (Source Coding Theorem)
Given a discrete memoryless source of entropy H, the average codeword length L for any source encoding satisfies:

H <= L

Considering the Shannon-Fano code, if M is the number of messages to be transmitted and N is the number of bits per word, then M = 2^N.
Shannon-Hartley Theorem (Channel Capacity Theorem)
The channel capacity of a band-limited white Gaussian channel is given by:

C = B log2(1 + P/N) bits/sec

where B is the channel bandwidth, P is the signal power, and N is the noise power within the channel bandwidth.
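The capacity formula can be evaluated directly. A minimal sketch; the telephone-channel numbers below are an illustrative assumption, not from the slides:

```python
from math import log2

def channel_capacity(bandwidth_hz, signal_power, noise_power):
    """Shannon-Hartley: C = B * log2(1 + P/N) bits/second."""
    return bandwidth_hz * log2(1 + signal_power / noise_power)

# Illustrative: a 3.1 kHz channel at 30 dB SNR (P/N = 1000)
c = channel_capacity(3100, 1000, 1)  # roughly 31 kbit/s
```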
Shannon-Hartley Theorem: explanation
[Figure: a discrete-time memoryless white Gaussian channel, in which Gaussian noise is added to each sample of the transmitted signal to produce the corresponding sample of the received signal.]
Trade-off between bandwidth and SNR
- Effect of SNR
- Effect of bandwidth
- Trade-off between the two
Assignment
1. A discrete memoryless source has S = {a, b, c, d, e} and P = {0.4, 0.19, 0.16, 0.15, 0.1}. Explain the Shannon-Fano algorithm by constructing a code for this source.
2. Explain the Huffman coding algorithm for the same source. Also find the entropy and the average code length.
3. Why do we use variable length coding?
4. State and prove the information capacity theorem.

Date of submission: 27th Nov 2010
End of Unit 5