NH – 67, Karur – Trichy Highways, Puliyur C.F, 639 114 Karur District
Internal Examination – III Answer Key
DEPARTMENT OF CSE & IT
Branch & Section: II CSE Date & Time: 22.09.16 & 3 Hours
Semester: III Max.Marks: 100
Subject: CS6304 Analog and Digital Communication Staff Incharge: M.Kumar, AP/IT
PART A (10 × 2 = 20 Marks)
1. A source is emitting symbols x1, x2, x3 with probabilities 0.6, 0.3 and 0.1 respectively.
What is the entropy of the source?
Entropy, H = Σ (k = 1 to M) Pk log2(1/Pk)
= 0.6 log2(1/0.6) + 0.3 log2(1/0.3) + 0.1 log2(1/0.1)
= 0.4422 + 0.5211 + 0.3322 = 1.2955 bits/symbol
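The result can be checked numerically; a minimal Python sketch (the helper name is illustrative):

import math

def entropy(probs):
    """Source entropy H = sum of P_k * log2(1/P_k), in bits per symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(round(entropy([0.6, 0.3, 0.1]), 4))   # -> 1.2955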
2. Find Hamming distance between the following codes C1 = [1000111] and C2 =
[0001011].
C1 ⊕ C2 = 1001100, so the Hamming distance between C1 and C2 = 3.
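A quick check of this result in Python (illustrative sketch):

def hamming_distance(c1, c2):
    """Count the bit positions in which two equal-length codewords differ."""
    return sum(b1 != b2 for b1, b2 in zip(c1, c2))

print(hamming_distance("1000111", "0001011"))   # -> 3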
3. State channel coding theorem or Shannon’s second theorem. (Dec 2012) (May
2007,2008,2009)
Given a source of M equally likely symbols with M >> 1, generating information at a rate R, and a channel of capacity C: if R ≤ C, there exists a coding technique such that the output of the source may be transmitted over the channel with a probability of error in the received message that can be made arbitrarily small.
4. Differentiate between block codes and convolutional codes.
S.No | Linear block codes | Convolutional codes
1 | Block codes are generated by X = M·G. | Convolutional codes are generated by convolution between the message sequence and the generating sequences.
2 | For a block of message bits, one code vector is generated. | Each message bit is encoded separately; every message bit generates two or more encoded bits.
3 | Coding is block by block. | Coding is bit by bit.
4 | Syndrome decoding is used for maximum-likelihood decoding. | Viterbi decoding is used for maximum-likelihood decoding.
5 | Generator matrix, parity check matrix and syndrome vectors are used for analysis. | Code tree, code trellis and state diagram are used for analysis.
5. What is entropy? State any two properties of entropy. (Dec 2007)
Entropy is also called average information per message. It is the ratio of total
information to number of messages.
Entropy, H = Σ (k = 1 to M) Pk log2(1/Pk)
Properties:
(1) For p(xi) = 1 for any one symbol and all other p(xi) = 0, the entropy is zero.
(2) For M equally likely symbols the entropy is maximum, H = log2 M.
(3) Entropy is lower bound on average number of bits per symbol.
6. Define channel capacity. (Dec 2015)
The channel capacity of the discrete memoryless channel is given as maximum
average mutual information. The maximization is taken with respect to input probabilities
P(xi):
C = max over P(xi) of I(X; Y)
7. Define mutual information and state its properties. (Dec 2009)
The mutual information is defined as the amount of information transferred when xi is
transmitted and yj is received. It is represented by I(xi,yj) and given as,
I(xi, yj) = log2 [ p(xi | yj) / p(xi) ]
Properties:
(1) The mutual information is symmetric. I(X,Y) = I(Y,X)
(2) The mutual information is always positive. I(X,Y)≥0
8. What is prefix coding? (Dec 2003)
Prefix coding is a variable-length coding algorithm. It assigns binary digits to the messages according to their probabilities of occurrence. A prefix of a codeword is any sequence that forms the initial part of that codeword. In a prefix code, no codeword is the prefix of any other codeword; if 01 is one of the codewords, then no other codeword begins with 01.
9. State source coding theorem. (Dec 2012)
Given a discrete memoryless source of entropy H, the average codeword length N for
any distortionless source encoding scheme is bounded as N ≥ H.
10. State Information capacity theorem. (Dec 2011)
The channel capacity of the white bandlimited Gaussian channel is,
C = B log2(1 + S/N)
Here, B is the channel bandwidth and S/N is the signal-to-noise ratio.
PART B (5 × 13 = 65 Marks)
11. (a) Define PWM and explain one method of generating PWM. (13 marks)
Pulse Width Modulated Waveform (3 Marks)
Generation of PWM signal: (8 Marks)
This simple circuit based around the familiar NE555 or 7555 timer chip is used to produce
the required pulse width modulation signal at a fixed frequency output. The timing capacitor
C is charged and discharged by current flowing through the timing networks RA and RB.
The output signal at pin 3 of the 555 is equal to the supply voltage switching the transistors
fully “ON”. The time taken for C to charge or discharge depends upon the values of RA, RB.
The capacitor charges up through the network RA but is diverted around the resistive
network RB and through diode D1. As soon as the capacitor is charged, it is immediately
discharged through diode D2 and network RB into pin 7. During the discharging process the
output at pin 3 is at 0 V and the transistor is switched “OFF”.
The time taken for the capacitor C to go through one complete charge-discharge cycle
depends on the values of RA, RB and C, with the time T for one complete cycle given as:
The time, TH, for which the output is “ON” is: TH = 0.693(RA).C
The time, TL, for which the output is “OFF” is: TL = 0.693(RB).C
Total “ON”-“OFF” cycle time given as: T = TH + TL with the output frequency being ƒ = 1/T
With the component values shown, the duty cycle of the waveform can be adjusted from
about 8.3% (0.5 V) to about 91.7% (5.5 V) using a 6.0 V power supply. The astable frequency
is constant at about 256 Hz and the motor is switched “ON” and “OFF” at this rate.
Resistor R1 plus the “top” part of the potentiometer, VR1 represent the resistive network of
RA. While the “bottom” part of the potentiometer plus R2 represent the resistive network of
RB above.
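The timing relations above can be evaluated numerically. The sketch below assumes illustrative component values (RA = RB = 28.2 kΩ with the potentiometer at mid-travel, C = 100 nF), chosen only to reproduce the quoted ~256 Hz figure; the actual values depend on the circuit:

def pwm_555(ra_ohms, rb_ohms, c_farads):
    """555 astable PWM timing: TH = 0.693*RA*C, TL = 0.693*RB*C, T = TH + TL."""
    t_high = 0.693 * ra_ohms * c_farads
    t_low = 0.693 * rb_ohms * c_farads
    period = t_high + t_low
    return 1.0 / period, t_high / period      # (frequency in Hz, duty cycle)

# Assumed values: RA = RB = 28.2 kohm (potentiometer at mid-travel), C = 100 nF
freq, duty = pwm_555(28.2e3, 28.2e3, 100e-9)
print(f"f = {freq:.0f} Hz, duty cycle = {duty:.0%}")   # -> f = 256 Hz, duty cycle = 50%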
(b) Explain the method of demodulating PWM signal. (13 Marks)
A square-wave generator and a monostable multivibrator circuit were used to generate the PWM signal. To recover the original audio signal from a PWM signal, a decoder or demodulator is needed in the receiver circuit.
There are two common techniques used for pulse-width demodulation. In one method, the PWM signal is first converted to a pulse-amplitude modulated (PAM) signal and then passed through a low-pass filter. The PWM signal is applied to an integrate-and-hold circuit. When the positive edge of a pulse appears, the integrator generates a ramp output whose magnitude is proportional to the pulse width. After the negative edge, the hold circuit maintains the peak ramp voltage for a given period and then forces the output voltage to zero. The resulting waveform is a sequence of constant-width pulses generated by the demodulator. This signal is then applied to the input of a clipping circuit, which cuts off the portion of the signal below the threshold voltage and outputs the remainder. Therefore the output of the clipping circuit is a PAM signal whose amplitude is proportional to the width of the corresponding PWM pulse. Finally, the PAM signal passes through a simple low-pass filter and the original audio signal is obtained.
12. (a) Consider a (7, 4) linear block code with parity check matrix given by
      [ 1 0 0 1 1 1 0 ]
H =   [ 0 1 0 1 0 1 1 ]
      [ 0 0 1 1 1 0 1 ]
(i) Generate the codebook. (8 Marks)
(ii) Show that this code is a Hamming code. (2 marks)
(iii) Illustrate the relation between the minimum distance and structure of the
parity check matrix by considering the codeword 0101100. (3 marks)
From the given parity check matrix, written in the form H = [I | P^T], the parity matrix P can be identified as

      [ 1 1 1 ]
P =   [ 1 0 1 ]
      [ 1 1 0 ]
      [ 0 1 1 ]

The codebook is generated by computing the parity bits C = M·P with M = [M1 M2 M3 M4]:
C1 = M1+M3+M4
C2 = M1+M2+M4
C3 = M2+M3+M4
Message bits   Parity bits   Codeword   Weight
M1 M2 M3 M4    C1 C2 C3
0  0  0  0     0  0  0       0000000    0
0  0  0  1     1  1  1       0001111    4
0  0  1  0     1  0  1       0010101    3
0  0  1  1     0  1  0       0011010    3
0  1  0  0     0  1  1       0100011    3
0  1  0  1     1  0  0       0101100    3
0  1  1  0     1  1  0       0110110    4
0  1  1  1     0  0  1       0111001    4
1  0  0  0     1  1  0       1000110    3
1  0  0  1     0  0  1       1001001    3
1  0  1  0     1  1  1       1010111    5
1  0  1  1     1  0  0       1011100    4
1  1  0  0     1  0  1       1100101    4
1  1  0  1     0  1  0       1101010    4
1  1  1  0     0  0  0       1110000    3
1  1  1  1     1  1  1       1111111    7
The minimum non-zero weight of the code is dmin = 3. Since the code is linear, this equals the minimum distance, and a (7, 4) single-error-correcting code with dmin = 3 is a Hamming code.
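The codebook and dmin can be reproduced programmatically. A minimal sketch using the parity equations above (message bits followed by parity bits, all sums modulo 2):

from itertools import product

def encode(m1, m2, m3, m4):
    """Systematic (7,4) encoding with the parity equations above (modulo-2 sums)."""
    c1 = (m1 + m3 + m4) % 2
    c2 = (m1 + m2 + m4) % 2
    c3 = (m2 + m3 + m4) % 2
    return (m1, m2, m3, m4, c1, c2, c3)

codebook = [encode(*m) for m in product((0, 1), repeat=4)]
d_min = min(sum(cw) for cw in codebook if any(cw))   # minimum non-zero weight
print(d_min)   # -> 3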
(b) Consider a (7, 4) cyclic code whose generator polynomial is g(X) = 1 + X^2 + X^3. (13 Marks)
(i) Encode the message 1001 using the encoder and the algorithm.
(ii) Decode the received word if an error has occurred at the middle bit, using both the
syndrome calculator circuit and the algorithm.
Here q = n − k = 3 and the message polynomial is M(X) = X^3 + 1.
In a systematic code, the parity bits of the given message are found from
C(X) = rem [ X^q M(X) / g(X) ]
     = rem [ X^3 (X^3 + 1) / (X^3 + X^2 + 1) ]
     = rem [ (X^6 + X^3) / (X^3 + X^2 + 1) ]
     = X + 1
Therefore the parity bits are 011.
The transmitted codeword is 1001011.
Encoder Circuit:
Given g(X) = 1 + X^2 + X^3 = 1 + g1·X + g2·X^2 + X^3.
On comparison, g1 = 0 and g2 = 1.
Syndrome Calculator Circuit:
If the error is introduced in the middle bit, then received word is = 1000011.
The syndrome is calculated as
S(X) = rem [ R(X) / g(X) ]
     = rem [ (X^6 + X + 1) / (X^3 + X^2 + 1) ]
     = X^2 + 1
So the Syndrome of the received word is = 101. Since the syndrome is non-zero, the received
word contains error.
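Both the encoding and the syndrome computation are polynomial divisions over GF(2), which can be sketched as below (polynomials are held as integers, with bit i holding the coefficient of X^i; the names are illustrative):

def gf2_rem(dividend, divisor):
    """Remainder of polynomial division over GF(2)."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend

g = 0b1101                              # g(X) = X^3 + X^2 + 1
msg = 0b1001                            # M(X) = X^3 + 1  (message 1001)
parity = gf2_rem(msg << 3, g)           # rem[X^3 M(X) / g(X)]
print(format(parity, '03b'))            # -> 011, so the codeword is 1001011

received = 0b1000011                    # middle bit in error
print(format(gf2_rem(received, g), '03b'))   # syndrome -> 101 (non-zero: error detected)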
13. (a) Explain the term mutual information in a discrete memory less channel. State and
prove its properties. (13 Marks)
Mutual information measures the amount of information that can be obtained about
one random variable by observing another. It is important in communication where it can be
used to maximize the amount of information shared between sent and received signals. The
mutual information of X relative to Y is given by:
I(X; Y) = Σ over x, y of p(x, y) log2 [ p(x, y) / (p(x) p(y)) ]
where each term log2 [ p(x, y) / (p(x) p(y)) ] is the pointwise (specific) mutual information.
A basic property of the mutual information is that
I(X; Y) = H(X) − H(X | Y).
That is, knowing Y, we can save an average of I(X; Y) bits in encoding X compared to not
knowing Y.
Mutual information is symmetric: I(X; Y) = I(Y; X).
Mutual information can be expressed as the average Kullback–Leibler divergence
(information gain) of the posterior probability distribution of X given the value of Y from the
prior distribution on X:
I(X; Y) = E_Y [ D_KL( p(X | Y) || p(X) ) ]
In other words, this is a measure of how much, on average, the probability distribution on
X changes when we are given the value of Y. This is often recalculated as the divergence from
the product of the marginal distributions to the actual joint distribution:
I(X; Y) = D_KL( p(X, Y) || p(X) p(Y) )
Mutual information is closely related to the log-likelihood ratio test in the context of
contingency tables and the multinomial distribution and to Pearson's χ2 test: mutual
information can be considered a statistic for assessing independence between a pair of
variables, and has a well-specified asymptotic distribution.
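As an illustration of the definition, mutual information can be computed directly from a joint distribution. The sketch below uses a hypothetical joint distribution (a binary symmetric channel with crossover probability 0.1 and equiprobable inputs), not one taken from the question:

import numpy as np

def mutual_information(p_xy):
    """I(X;Y) = sum over x,y of p(x,y) * log2[ p(x,y) / (p(x) p(y)) ]."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float((p_xy[mask] * np.log2(p_xy[mask] / (p_x @ p_y)[mask])).sum())

p_joint = np.array([[0.45, 0.05],       # rows: X, columns: Y
                    [0.05, 0.45]])
print(round(mutual_information(p_joint), 4))   # -> 0.531 bits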
(b) A channel has the following channel matrix.
             [ 1-p   p    0  ]
P(Y | X) =   [  0    p   1-p ]
(i) Draw the channel diagram. (5 Marks)
(ii) If the source has equally likely symbols, compute the probabilities associated
with the channel output for p=0.2. (8 Marks)
p(y1) = p(y1|x1) p(x1) + p(y1|x2) p(x2) = 0.8 × 0.5 + 0 = 0.4
p(y2) = p(y2|x1) p(x1) + p(y2|x2) p(x2) = 0.2 × 0.5 + 0.2 × 0.5 = 0.2
p(y3) = p(y3|x1) p(x1) + p(y3|x2) p(x2) = 0 + 0.8 × 0.5 = 0.4
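The same computation as a short matrix-vector sketch (the variable names are illustrative):

import numpy as np

p = 0.2
P_y_given_x = np.array([[1 - p, p, 0.0],    # rows: x1, x2; columns: y1, y2, y3
                        [0.0,   p, 1 - p]])
p_x = np.array([0.5, 0.5])                  # equally likely source symbols

p_y = p_x @ P_y_given_x                     # p(yj) = sum_i p(yj | xi) p(xi)
print(p_y)                                  # -> [0.4 0.2 0.4]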
14. (a) What is source coding theorem? Explain with suitable examples. (13 Marks)
The main concepts of information theory can be grasped by considering the most
widespread means of human communication: language. Two important aspects of a good
language are as follows: First, the most common words (e.g., "a", "the", "I") should be shorter
than less common words (e.g., "benefit", "generation", "mediocre"), so that sentences will not
be too long. Such a tradeoff in word length is analogous to data compression and is the
essential aspect of source coding. Second, if part of a sentence is unheard or misheard due to
noise — e.g., a passing car — the listener should still be able to glean the meaning of the
underlying message. Such robustness is as essential for an electronic communication system
as it is for a language; properly building such robustness into communications is done by
channel coding. Source coding and channel coding are the fundamental concerns of
information theory.
Information theory is generally considered to have been founded in 1948 by Claude
Shannon in his seminal work, "A Mathematical Theory of Communication." The central
paradigm of classical information theory is the engineering problem of the transmission of
information over a noisy channel. The most fundamental results of this theory are Shannon's
source coding theorem, which establishes that, on average, the number of bits needed to
represent the result of an uncertain event is given by its entropy; and Shannon's noisy-
channel coding theorem, which states that reliable communication is possible over noisy
channels provided that the rate of communication is below a certain threshold called the
channel capacity. The channel capacity can be approached in practice by using appropriate
encoding and decoding systems.
Information theory is closely associated with a collection of pure and applied
disciplines that have been investigated and reduced to engineering practice under a variety
of rubrics throughout the world over the past half century or more: adaptive systems,
anticipatory systems, artificial intelligence, complex systems, complexity science,
cybernetics, informatics, machine learning, along with systems sciences of many
descriptions. Information theory is a broad and deep mathematical theory, with equally
broad and deep applications, amongst which is the vital field of coding theory.
Coding theory is concerned with finding explicit methods, called codes, of increasing
the efficiency and reducing the net error rate of data communication over a noisy channel to
near the limit that Shannon proved is the maximum possible for that channel. These codes
can be roughly subdivided into data compression (source coding) and error-correction
(channel coding) techniques. In the latter case, it took many years to find the methods
Shannon's work proved were possible. A third class of information theory codes are
cryptographic algorithms (both codes and ciphers). Concepts, methods and results from
coding theory and information theory are widely used in cryptography and cryptanalysis.
See the article ban (information) for a historical application.
Information theory is also used in information retrieval, intelligence gathering,
gambling, statistics, and even in musical composition.
As another example, consider information that needs to be transferred from point A
to point B. Assume the information to be a jazz band playing a particular composition. The same
composition can be used as a ring tone in a mobile phone or can be played on a keyboard to a
live audience. Even though the end results for the mobile phone and the keyboard are
different, the underlying composition is the same. Information theory deals with this kind of
information that is to be transferred from point A to point B.
(b) (i) Explain channel coding theorem in detail. (6 Marks)
Theorem: For any DMC, if R < C, then R is achievable. Conversely, if R > C, it is not achievable.
Proof: We start by proving that, if R < C, then R is achievable. To do so, it is
enough to construct a particular sequence of codes with rate R such that the maximal error
probability λ(n) → 0. Our strategy is to consider randomly generated codes and evaluate
their average error probability under joint-typicality decoding, i.e. decoding the channel
output as the codeword that is jointly typical with it. From the average error probability of
these randomly generated codes we obtain the existence of a code with maximal probability
of error going to zero as n → ∞.
Consider the random code generated by the following algorithm:
1. Fix p(x) such that I(X; Y) = C and generate M = 2^(nR) codewords X^n(i), i = 1, …, M,
independently according to p(x^n) = Π (i = 1 to n) p(x_i). Call the collection of codewords,
C, the codebook.
2. Assign each message W in {1, …, M} at random to a codeword X^n(W) in C.
3. Assume the codebook C and p(y|x) are known beforehand to the decoder.
4. Declare Ŵ = c for c ∈ {1, …, M} if c is the only message such that (X^n(c), Y^n) is jointly
typical; otherwise declare a decoding error.
(ii) Alphanumeric data are entered into a computer from a remote terminal through a
voice grade telephone channel. The channel has a bandwidth of 3.4 KHz and output
signal to noise ratio of 20dB. The terminal has a total of 128 symbols. Determine the
Shannon’s limit for information capacity. (7 Marks)
Given B = 3.4 kHz = 3400 Hz.
(S/N)dB = 20 dB, so S/N = 10^(20/10) = 10^2 = 100.
As per Shannon's limit for channel capacity, C = B log2(1 + S/N)
= 3400 × log2(101) = 22637.9 bits/sec
15. (a) State channel capacity theorem. Find Shannon’s limit for an AWGN channel and
explain the significance of it. (13 Marks)
Shannon's Theorem gives an upper bound to the capacity of a link, in bits per second
(bps), as a function of the available bandwidth and the signal-to-noise ratio of the link.
The Theorem can be stated as:
C = B * log2(1+ S/N)
where C is the achievable channel capacity, B is the bandwidth of the line, S is the average
signal power and N is the average noise power.
The signal-to-noise ratio (S/N) is usually expressed in decibels (dB) given by the formula:
10 * log10(S/N)
so for example a signal-to-noise ratio of 1000 is commonly expressed as 10 * log10(1000) =
30 dB.
A graph of C/B against S/N (in dB) illustrates this relationship.
Examples
Here are two examples of the use of Shannon's Theorem.
Modem
For a typical telephone line with a signal-to-noise ratio of 30dB and an audio bandwidth of
3kHz, we get a maximum data rate of:
C = 3000 * log2(1001)
which is a little less than 30 kbps.
Satellite TV Channel
For a satellite TV channel with a signal-to noise ratio of 20 dB and a video bandwidth of
10MHz, we get a maximum data rate of:
C=10000000 * log2(101)
which is about 66 Mbps.
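The capacity figures above (and the telephone-terminal figure of question 14(b)(ii)) follow from the same formula; a small sketch:

import math

def shannon_capacity(bandwidth_hz, snr_db):
    """C = B * log2(1 + S/N), with S/N converted from decibels."""
    snr = 10 ** (snr_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + snr)

print(shannon_capacity(3400, 20))    # telephone terminal of 14(b)(ii): ~22638 bits/s
print(shannon_capacity(3000, 30))    # modem example: ~29902 bits/s, a little under 30 kbps
print(shannon_capacity(10e6, 20))    # satellite example: ~6.66e7 bits/s, about 66 Mbps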
(b) Explain Viterbi decoding algorithm for decoding of convolutional encoder.
(13 Marks)
The decoding algorithm uses two metrics: the branch metric (BM) and the path metric
(PM). The branch metric is a measure of the “distance” between what was transmitted and
what was received, and is defined for each arc in the trellis. In hard decision decoding, where
we are given a sequence of digitized parity bits, the branch metric is the Hamming Distance
between the expected parity bits and the received ones. An example is shown in the figure,
where the received bits are 00. For each state transition, the number on the arc shows the
branch metric for that transition. Two of the branch metrics are 0, corresponding to the only
states and transitions where the corresponding Hamming distance is 0. The other non-zero
branch metrics correspond to cases when there are bit errors. The path metric is a value
associated with a state in the trellis (i.e., a value associated with each node). For hard
decision decoding, it corresponds to the Hamming distance over the most likely path from
the initial state to the current state in the trellis. By “most likely”, we mean the path with
smallest Hamming distance between the initial state and the current state, measured over all
possible paths between the two states. The path with the smallest Hamming distance
minimizes the total number of bit errors, and is most likely when the BER is low.
Computing the Path Metric
Suppose the receiver has computed the path metric PM[s, i] for each state s (of which
there are 2^(k−1), where k is the constraint length) at time step i. The value of PM[s, i] is the total
number of bit errors detected when comparing the received parity bits to the most likely
transmitted message, considering all messages that could have been sent by the transmitter
until time step i (starting from state “00”, which we will take by convention to always be the
starting state). Among all the possible states at time step i, the most likely state is the one with
the smallest path metric. If there is more than one such state, they are all equally good
possibilities.
Finding the Most Likely Path
We can now describe how the decoder finds the most likely path. Initially, state ‘00’
has a cost of 0 and the other 2^(k−1) − 1 states have a cost of ∞.
The main loop of the algorithm consists of two main steps: calculating the branch
metric for the next set of parity bits, and computing the path metric for the next column. The
path metric computation may be thought of as an add-compare-select procedure:
1. Add the branch metric to the path metric for the old state.
2. Compare the sums for paths arriving at the new state (there are only two such
paths to compare at each new state because there are only two incoming arcs from
the previous column).
3. Select the path with the smallest value, breaking ties arbitrarily. This path corresponds to
the one with fewest errors.
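The add-compare-select procedure can be illustrated with a short hard-decision Viterbi sketch. The code below assumes a hypothetical rate-1/2, constraint length K = 3 convolutional code with generators (7, 5) in octal; it is not the specific encoder of any question here.

def encode_bit(state, bit):
    """Return (parity pair, next state) for one input bit; state = last two input bits."""
    s1, s2 = state
    return (bit ^ s1 ^ s2, bit ^ s2), (bit, s1)       # generators 111 and 101

def viterbi_decode(received):
    """Hard-decision Viterbi: branch metric = Hamming distance, add-compare-select."""
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]
    pm = {s: 0 if s == (0, 0) else float("inf") for s in states}    # path metrics
    paths = {s: [] for s in states}
    for r in received:                                # r = received parity pair
        new_pm, new_paths = {}, {}
        for s in states:
            best = (float("inf"), None, None)
            for prev in states:
                for bit in (0, 1):
                    out, nxt = encode_bit(prev, bit)
                    if nxt != s:
                        continue
                    bm = (out[0] != r[0]) + (out[1] != r[1])        # branch metric (add)
                    if pm[prev] + bm < best[0]:                     # compare
                        best = (pm[prev] + bm, prev, bit)           # select
            new_pm[s] = best[0]
            new_paths[s] = paths[best[1]] + [best[2]] if best[1] is not None else []
        pm, paths = new_pm, new_paths
    return paths[min(pm, key=pm.get)]                 # survivor of the best final state

# Encode a message, flip one transmitted bit, and decode it back.
message, state, tx = [1, 0, 1, 1], (0, 0), []
for b in message:
    out, state = encode_bit(state, b)
    tx.append(out)
tx[1] = (tx[1][0] ^ 1, tx[1][1])                      # inject a single bit error
print(viterbi_decode(tx))                             # -> [1, 0, 1, 1]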
PART C (1 × 15 = 15 Marks)
16. (a) A discrete memory less source has five symbols x1, x2, x3, x4, x5 with probabilities
0.4, 0.19, 0.16, 0.15, 0.1 respectively. (i) Construct Shannon Fano code for the source
and calculate code efficiency. (ii) Construct Huffman code for the source and calculate
code efficiency. Compare the two algorithms based on the result obtained. (15 Marks)
Probabilities of the symbols:
P(x1) = 0.4 P(x2) = 0.19 P(x3) = 0.16 P(x4) = 0.15 P(x5) = 0.1
Entropy or average information:
H = Σ (k = 1 to M) Pk log2(1/Pk)
= 0.4 log2(1/0.4) + 0.19 log2(1/0.19) + 0.16 log2(1/0.16) + 0.15 log2(1/0.15) + 0.1 log2(1/0.1)
= 0.5288 + 0.4552 + 0.4230 + 0.4105 + 0.3322
= 2.1497 bits/symbol
Huffman Coding:
(From the Huffman tree, x1 is assigned a 1-bit codeword and x2, x3, x4, x5 are assigned 3-bit codewords.)
Average codeword length:
N = Σ (k = 1 to M) Pk nk = 0.4×1 + 0.19×3 + 0.16×3 + 0.15×3 + 0.1×3 = 2.2
Coding efficiency = H / N = 2.1497 / 2.2 = 97.71%
Shannon – Fano Coding:
X P(X) I II III Codeword
X1 0.4 0 0 00
X2 0.19 0 1 01
X3 0.16 1 0 10
X4 0.15 1 1 0 110
X5 0.1 1 1 1 111
Average codeword length:
N = Σ (k = 1 to M) Pk nk = 0.4×2 + 0.19×2 + 0.16×2 + 0.15×3 + 0.1×3 = 2.25
Coding efficiency = H / N = 2.1497 / 2.25 = 95.54%
Comparing the two results, Huffman coding (97.71%) is more efficient than Shannon-Fano coding (95.54%) for this source.
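The Huffman code lengths, average length and efficiency used above can be reproduced with a short sketch (merging the two least probable nodes is the standard Huffman step; function names are illustrative):

import heapq, itertools, math

def huffman_lengths(probs):
    """Codeword lengths from Huffman's algorithm: repeatedly merge the two least probable nodes."""
    counter = itertools.count()                       # tie-breaker for the heap
    heap = [(p, next(counter), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1                           # each merge adds one bit to every member
        heapq.heappush(heap, (p1 + p2, next(counter), s1 + s2))
    return lengths

probs = [0.4, 0.19, 0.16, 0.15, 0.1]
H = sum(p * math.log2(1 / p) for p in probs)
N = sum(p * n for p, n in zip(probs, huffman_lengths(probs)))
print(huffman_lengths(probs), f"{N:.2f}", f"{H / N:.1%}")   # -> [1, 3, 3, 3, 3] 2.20 97.7%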
(b) In a message, each letter occurs the following percentage of times:
Letter: A B C D E F
% of occurrence: 23 20 11 9 15 22
(i) Calculate the entropy of this alphabet of symbols. (4 Marks)
(ii) Devise a codebook using Huffman technique and find average codeword length.
(4 Marks)
(iii) Devise a codebook using Shannon-Fano technique and find average codeword
length. (4 Marks)
(iv) Compare and comment on the result of both techniques. (3 Marks)
Probabilities of the symbols:
P(A) = 23/100 = 0.23 P(B) = 20/100=0.2 P(C) = 11/100=0.11 P(D) = 9/100 = 0.09
P(E) = 15/100 = 0.15 P(F) = 22/100 = 0.22
Entropy or average information:
H = Σ (k = 1 to M) Pk log2(1/Pk)
= 0.23 log2(1/0.23) + 0.2 log2(1/0.2) + 0.11 log2(1/0.11) + 0.09 log2(1/0.09) + 0.15 log2(1/0.15) + 0.22 log2(1/0.22)
= 2.50611 bits/symbol
Huffman Coding:
Average codeword length:
N = Σ (k = 1 to M) Pk nk = 0.23×2 + 0.22×2 + 0.2×2 + 0.15×3 + 0.11×4 + 0.09×4 = 2.55
Coding efficiency = H / N = 2.50611 / 2.55 = 98.28%
Shannon – Fano Coding:
X P(X) I II III Codeword
A 0.23 0 0 00
B 0.2 0 1 0 010
C 0.11 0 1 1 011
D 0.09 1 0 0 100
E 0.15 1 0 1 101
F 0.22 1 1 11
Average codeword length:
N = Σ (k = 1 to M) Pk nk = 0.23×2 + 0.2×3 + 0.11×3 + 0.09×3 + 0.15×3 + 0.22×2 = 2.55
Coding efficiency = H / N = 2.50611 / 2.55 = 98.28%
So both Huffman coding and Shannon-Fano coding produced the same coding
efficiency.
Faculty Incharge HoD/CSE&IT