
JANUARY 1983

FORWARD ERROR CORRECTION SCHEMES FOR DIGITAL COMMUNICATIONS

VIJAY K. BHARGAVA

The utility of coding was demonstrated by the work of Shannon in 1948. Shannon's work established that the ultimate limit of performance set by the noise on the channel is not the accuracy, but the rate at which data can be reliably transmitted.

A block diagram which describes the digital communication process using forward error correction (FEC) is shown in Fig. 1. This paper shall not be concerned with error control coding schemes which involve some type of detection and retransmission, the so-called automatic repeat request (ARQ) procedures [1].

Encoder-Decoder (CODEC)

Historically, coding systems have been separated into block and convolutional error-correcting techniques. In an (n,k) linear block code, a sequence of k information bits is algebraically related to n − k parity bits to give an overall encoded block of n bits. Usually modulo-2 arithmetic is used, which is simply the EXCLUSIVE-OR operation in logic. In this arithmetic 1 ⊕ 1 = 0 and there are never any "carries"; hence, an odd number of 1's sums to 1. Linear codes form a linear vector space and have the very important property that two code words can be added to produce a third code word. The code rate is r = k/n, and n is called the block length. Note that the introduction of error-control coding requires more capacity; this can take the form of wider bandwidth, longer bursts in time-division multiple-access (TDMA) systems, or a higher "chip" rate (and hence a higher bandwidth) in spread-spectrum systems, if the same processing gain is needed.

Fig. 1. Digital communication process using forward error correction.

The Hamming weight of a code word c, denoted w(c), is defined to be the number of nonzero components of c. For example, if c = (110101), then w(c) = 4. The Hamming distance between two code words c1 and c2, denoted d(c1, c2), is the number of positions in which they differ. For example, if c1 = (110101) and c2 = (111000), then d(c1, c2) = 3. Clearly, d(c1, c2) = w(c1 ⊕ c2) = w(c3), where c3, for linear codes, is a code word. Therefore, the distance between any two code words equals the weight of one of the code words, and the minimum distance d for a linear block code equals the minimum weight of its nonzero code words.
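To make these definitions concrete, here is a minimal Python sketch computing the Hamming weight and distance for the code words above.

```python
def hamming_weight(c):
    """w(c): the number of nonzero components of the code word c."""
    return sum(1 for bit in c if bit != 0)

def hamming_distance(c1, c2):
    """d(c1, c2): the number of positions in which c1 and c2 differ."""
    return sum(1 for a, b in zip(c1, c2) if a != b)

c1 = (1, 1, 0, 1, 0, 1)
c2 = (1, 1, 1, 0, 0, 0)
c3 = tuple(a ^ b for a, b in zip(c1, c2))   # modulo-2 sum; a code word for linear codes

print(hamming_weight(c1))        # 4
print(hamming_distance(c1, c2))  # 3
print(hamming_weight(c3))        # 3 = d(c1, c2)
```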

A code can correct all patterns of t or fewer random errors and detect all patterns having no more than s errors, provided that s + t + 1 ≤ d. If the code is used only for error correction, it can correct all patterns of t or fewer random errors, provided that 2t + 1 ≤ d.

Convolutional codes are a subset of the so-called tree codes. A convolutional code of rate 1/v may be generated by a K-stage shift register and v modulo-2 adders. A simple example is the rate-1/2 convolutional encoder shown in Fig. 2.

Information bits are shifted in at the left, and for each information bit the outputs of the modulo-2 adders provide two channel bits. The constraint length of the code, expressed in information bits, is defined as the number of shifts over which a single information bit can influence the encoder output. For the simple binary convolutional code, the constraint length is equal to K, the length of the shift register.
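A minimal Python sketch of such an encoder follows. Fig. 2 is not reproduced in this transcript, so the tap connections are an assumption: the octal generator pair (5, 7) is used here because it reproduces the input/output example traced later in the text (input 10110, output 11 01 00 10 10).

```python
def conv_encode(bits):
    """Rate-1/2, K = 3 convolutional encoder (assumed octal generators 5 and 7).

    The register holds the two previous input bits (s1, s2); each input bit u
    produces the channel-bit pair (u ^ s2, u ^ s1 ^ s2).
    """
    s1 = s2 = 0
    out = []
    for u in bits:
        out.append(u ^ s2)        # first modulo-2 adder (taps 101)
        out.append(u ^ s1 ^ s2)   # second modulo-2 adder (taps 111)
        s1, s2 = u, s1            # shift the register
    return out

print(conv_encode([1, 0, 1, 1, 0]))
# [1, 1, 0, 1, 0, 0, 1, 0, 1, 0], i.e. 11 01 00 10 10
```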

The decoder uses the redundancy introduced in the process of encoding, and sometimes the reliability (defined below) of the received information, to decide which information bit was actually sent.

Fig. 2. Encoder for a rate-1/2, constraint length K = 3 convolutional code.


Fig. 3. Binary symmetric channel.

Modulator-Demodulator (MODEM)

The encoded sequence is suitably modulated and transmitted over the noisy channel. In systems where coherent demodulation is possible (that is, where a carrier reference can be obtained), phase-shift keying (PSK) is often used. In binary PSK an encoded 1 is represented by the waveform s1(t) = A cos ωct, while an encoded 0 is represented by the antipodal signal s0(t) = −s1(t) = A cos(ωct + π), the waveforms changing at discrete times Ts seconds (symbol duration) apart.

The physical channel or the waveform channel consists of all the hardware (for example, filtering and amplification) and the physical media that the waveform passes through in going from the output of the modulator to the input of the demodulator.

The demodulator estimates which of the possible symbols was transmitted, based upon an observation of the received signal. For PSK with white Gaussian noise and perfect phase tracking, the optimum receiver is a correlator or matched-filter receiver which is sampled every Ts seconds to determine its polarity. It is easily shown that the voltage z at the matched-filter output at the sample time is a Gaussian random variable with mean ±√Es (depending upon whether a 1 or 0 was transmitted) and variance σ² = N0/2. In the above, Es is the energy per symbol (what we pay) and N0 denotes the one-sided noise spectral density (what we must combat). When a symbol differs from a bit (that is, when we use coding) we will denote the energy per bit by Eb.

Hard Decisions, Soft Decisions

In practical communication systems, we rarely have the ability to process the actual analog voltages zi (the values taken by the random variable z). The normal practice is to quantize these voltages. If a binary quantization is used, we say that a hard decision has been made on the correlator output as to which level was actually sent. In this case, we have the so-called binary symmetric channel (BSC) with probability of error Pe, shown in Fig. 3. For example, in coherent PSK with equally likely transmitted symbols, the optimum threshold is at zero. Then the demodulator output is a zero if the voltage z at the matched-filter output is negative. Otherwise, the output is a one.
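A minimal simulation sketch of this hard-decision channel, assuming unit symbol energy and the matched-filter statistics given above (mean ±√Es, variance N0/2); the measured crossover probability should approach Pe = Q(√(2Es/N0)).

```python
import math, random

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

random.seed(1)
es, es_over_n0_db = 1.0, 4.0
n0 = es / (10 ** (es_over_n0_db / 10.0))
sigma = math.sqrt(n0 / 2.0)                 # matched-filter noise variance N0/2

n_sym, errors = 200_000, 0
for _ in range(n_sym):
    bit = random.randint(0, 1)
    mean = math.sqrt(es) if bit == 1 else -math.sqrt(es)
    z = random.gauss(mean, sigma)           # matched-filter output voltage
    errors += ((1 if z > 0 else 0) != bit)  # hard decision: threshold at zero

print("measured Pe:", errors / n_sym)
print("theory   Pe:", q_func(math.sqrt(2.0 * es / n0)))
```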

With coding, it is desirable to keep an indication of how reliable the decision was. A soft-decision demodulator first decides whether the output voltage is above or below the decision threshold, and then computes a "confidence" number which specifies how far from the decision threshold the demodulator output is. This number could in theory be an analog quantity, but in most practical applications a three-bit (eight-level) quantization is used.

An example of three-bit quantization is shown in Fig. 4. The input to the demodulator is binary, while the output is 8-ary, delineated by one decision threshold and three pairs of confidence thresholds. The information available to the decoder is increased considerably and translates into an additional gain of 2 dB in most instances [2]. The receiver complexity is increased, as an AGC will probably be needed and three bits will have to be manipulated for every channel bit. The channel resulting from three-bit quantization on a Gaussian channel is called the binary-input, 8-ary-output discrete memoryless channel (DMC), and is shown in Fig. 5.
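A minimal sketch of such a quantizer, assuming uniformly spaced confidence thresholds (the spacing is illustrative; the article does not specify it):

```python
def soft_quantize_3bit(z, step=0.5):
    """Map the matched-filter voltage z to one of eight levels (0..7).

    One decision threshold at zero plus three pairs of confidence thresholds,
    here assumed uniformly spaced at multiples of `step`.
    Level 0 is a most-confident 0; level 7 is a most-confident 1.
    """
    thresholds = (-3 * step, -2 * step, -step, 0.0, step, 2 * step, 3 * step)
    return sum(1 for t in thresholds if z > t)

for z in (-1.7, -0.3, 0.1, 1.7):
    print(f"z = {z:+.1f}  ->  level {soft_quantize_3bit(z)}")
# levels 0, 3, 4, 7 respectively
```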

Coding Gain

Before we start our study of codes, consider a Gaussian memoryless channel with one-sided noise spectral density N0 and under no bandwidth limitation. Let Eb denote the received energy per bit. Then it can be shown that for Eb/N0 greater than −1.6 dB there exists some coding scheme which allows us to communicate with zero error, while reliable communication is not generally possible at lower signal-to-noise ratios. On the other hand, it is well known that uncoded PSK over the same channel will require about 9.6 dB to achieve a bit error rate of 10⁻⁵. Thus, as shown in Fig. 6, a potential coding gain of 11.2 dB is theoretically possible. Coding gain is defined as the difference in the values of Eb/N0 required to attain a particular error rate without coding and with coding.
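A quick numerical check of this figure, using the uncoded PSK error rate Pb = Q(√(2Eb/N0)) and bisection to find the Eb/N0 needed for a 10⁻⁵ bit error rate:

```python
import math

def q_func(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def uncoded_psk_ber(ebno_db):
    """Pb = Q(sqrt(2 Eb/N0)) for uncoded coherent PSK."""
    return q_func(math.sqrt(2.0 * 10 ** (ebno_db / 10.0)))

# Bisect for the Eb/N0 (dB) at which uncoded PSK reaches Pb = 1e-5.
lo, hi = 0.0, 20.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if uncoded_psk_ber(mid) > 1e-5:
        lo = mid
    else:
        hi = mid

print(f"uncoded PSK at 1e-5 : {hi:5.2f} dB")           # about 9.6 dB
print(f"potential gain      : {hi - (-1.6):5.2f} dB")   # about 11.2 dB
```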

It must be stressed that this coding gain is obtained at the expense of an increase in the necessary transmission bandwidth. The bandwidth expansion is the reciprocal of the coding rate. Thus, for a rate-1/2 code, the transmitted symbol energy Es is 3 dB less than Eb. We also point out that coding gain is a useful concept only when one can obtain performance improvements by increasing the power. In certain communication links at high signal-to-noise ratios, there is a floor on performance that cannot be overcome by simply increasing the power. The use of coding might considerably reduce the floor or make it disappear altogether. In such a situation one might be tempted to say that the coding gain is infinite, but this tends to be a meaningless statement. The fact is that without coding the desired performance could never have been obtained [2].

While another revolution in coding may be needed to deliver the theoretically possible coding gain of 11.2 dB, it is safe to say that coding systems (delivering 2-6 dB) will be used routinely in digital communication links as hardware costs decrease and system complexity increases. There are several reasons for this [9]:

1. Phenomenal decrease in the cost of digital electronics.
2. Significant improvement in various decoding algorithms.
3. Much slower (or no) decrease in the cost of analog components, such as power amplifiers, antennas, and so on.

Asymptotic coding gain, a figure of merit for a particular code, depends only on the code rate and the minimum distance. To define it, consider a t-error-correcting code with rate r and minimum distance d ≥ 2t + 1. If we use the code with a hard-decision PSK demodulator, it can be shown that the bit error rate Pb is asymptotically [2]

Pb ≈ Q(√(2r(t + 1)Eb/N0))

Fig. 5. Eight-level soft-quantized DMC produced by a three-bit quantizer on a Gaussian channel.

With a soft-quantized PSK demodulator, we have

Pb,s ≈ Q(√(2rdEb/N0))

Recall that for uncoded PSK

Pb = Q(√(2Eb/N0))

Thus, the asymptotic coding gain Ca for the two cases is:

Ca = r(t + 1), that is, 10 log r(t + 1) dB (hard decision)
Ca = rd, that is, 10 log rd dB (soft decision)

The above indicates that soft-decision decoding is about 3 dB more efficient than hard-decision decoding at very high Eb/N0, since with d = 2t + 1 the ratio rd/r(t + 1) = (2t + 1)/(t + 1) approaches 2 for large t. A figure of 2 dB is more likely at realistic values of Eb/N0.
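As a worked example (the choice of code is illustrative), the asymptotic gains of the (23,12) Golay code, with r = 12/23, t = 3, and d = 7:

```python
import math

def asymptotic_gain_db(r, factor):
    """10 log10 of r(t+1) (hard decisions) or rd (soft decisions)."""
    return 10.0 * math.log10(r * factor)

r, t, d = 12 / 23, 3, 7                                  # the (23,12) Golay code
print(f"hard: {asymptotic_gain_db(r, t + 1):.1f} dB")    # ~3.2 dB
print(f"soft: {asymptotic_gain_db(r, d):.1f} dB")        # ~5.6 dB
# The soft/hard ratio d/(t+1) = 7/4 gives about 2.4 dB of the ideal 3 dB here.
```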

Block Codes and Their Decoding

We shall illustrate the idea of a block code by the following example:

Fig. 6. Coding gain: soft and hard quantization.


Example 1

Consider the set of code words (000000), (001101), (010011), (011110), (100110), (101011), (110101), and (111000). These 2³ = 8 code words form a vector space of dimension three and thus a (6,3) code. The minimum weight (of nonzero code words) is 3, and hence the minimum distance is 3. Thus, the code is single-error correcting.

Block codes are also called parity check codes, for if c = (c1, c2, c3, c4, c5, c6) is a code word in the code of Example 1, then knowing c1, c2, c3 allows one to solve for the other three bits by using the following parity check equations:

c1 ⊕ c3 = c4
c1 ⊕ c2 = c5
c2 ⊕ c3 = c6

The code of this example is said to be in systematic form; the first three bits in any code word can be considered as the message bits, while the last three bits, which are calculated from the first three bits, are the redundant or parity bits.
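A minimal sketch that generates all eight code words from these parity check equations and confirms the minimum distance of 3:

```python
from itertools import product

def encode63(c1, c2, c3):
    """Systematic (6,3) encoder built from the parity check equations above."""
    return (c1, c2, c3, c1 ^ c3, c1 ^ c2, c2 ^ c3)

codewords = [encode63(*msg) for msg in product((0, 1), repeat=3)]
for cw in codewords:
    print("".join(map(str, cw)))

# For a linear code, minimum distance = minimum weight of the nonzero code words.
print("d =", min(sum(cw) for cw in codewords if any(cw)))   # d = 3
```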

Cyclic Codes

It is perhaps a remarkable fact that many of the important block codes found to date can be reformulated to be cyclic codes or closely related to cyclic codes. For such codes, if an n-tuple c = (c0, c1, ..., cn−1) is a code word, then the n-tuple c' = (cn−1, c0, c1, ..., cn−2), obtained by shifting c cyclically one place to the right, is also a code word. This class of codes can be easily encoded using linear shift registers with feedback. Further, because of their inherent algebraic structure, the decoding has been greatly simplified, both conceptually and in practice.

Examples of cyclic and related codes include the Bose-Chaudhuri-Hocquenghem (BCH), Reed-Solomon, Hamming, maximal-length, Reed-Muller, Golay, quadratic residue, projective geometry, Euclidean geometry, difference set, Goppa, and quasi-cyclic codes. The classes form overlapping sets, so that a particular code may be a BCH code and also a quadratic residue code. Recent applications of codes from this family to digital communication include a (31,15) Reed-Solomon code for the joint tactical information distribution system (JTIDS), a (127,112) BCH code for the INTELSAT V system, and a (7,2) Reed-Solomon code for the air force satellite communications (AFSATCOM) wideband channels [1].

Example

Consider the encoder of Fig. 7, which generates a (7,3) cyclic code. Suppose we wish to encode (010). With the gate turned on and the switch in position 2, the information bits shift into the register and into the channel sequentially, and the contents of the shift register are as follows:

              r0 r1 r2 r3
Initial state  0  0  0  0
First shift    0  0  0  0
Second shift   1  1  1  0
Third shift    0  1  1  1

The gate is then turned off, the switch is thrown to position 1, and the four parity bits (0111) are shifted out to obtain the encoded word

(0111 010)
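A minimal sketch of this encoder; the generator polynomial is an assumption, since Fig. 7 is not reproduced, and g(x) = 1 + x + x² + x⁴ is used because it reproduces the register contents tabulated above:

```python
def encode_7_3_cyclic(info_bits):
    """Systematic (7,3) cyclic encoding with a 4-stage feedback shift register.

    Assumed generator g(x) = 1 + x + x^2 + x^4 (feedback taps into r0, r1, r2);
    this choice reproduces the register contents tabulated above.
    """
    r = [0, 0, 0, 0]                          # stages r0 r1 r2 r3
    for u in info_bits:                       # gate on, switch in position 2
        f = u ^ r[3]                          # feedback bit
        r = [f, r[0] ^ f, r[1] ^ f, r[2]]     # shift with feedback taps
        print("after shift:", r)
    return r                                  # parity bits left in the register

print("parity:", encode_7_3_cyclic([0, 1, 0]))   # [0, 1, 1, 1], i.e. (0111)
```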

The Concept of Syndrome and Error Detection

The basic element of the decoding procedure consists of computing the syndrome defined according to the following operation: re-encode the received information bits to compute a parity sequence in exactly the same fashion as the encoder; compare these parity bits to the corresponding parity bits actually received using a modulo-2 adder whose output forms the syndrome.

Clearly, when no errors have occurred, the parity bits computed at the decoder will be identical to those actually received, and the syndrome bits will be zero. If the syndrome bits are not zero, then errors have been detected.

For error correction the syndrome is processed further. Thus, error correction is substantially more involved than error detection.

Summary of Important Classes of Block Codes

In this section we discuss the characteristics of some important classes of block codes. Most of them are cyclic (or related to cyclic codes). Further, we limit ourselves to binary codes only.

Bose-Chaudhuri-Hocquenghem (BCH) Codes

The BCH codes are the best constructive codes for channels in which errors affect successive symbols independently. These codes are cyclic and have the following parameters:

Block length: n = 2^m − 1, m = 3, 4, 5, ...
Number of information bits: k ≥ n − mt
Minimum distance: d ≥ 2t + 1

Reed-Solomon Codes

Each symbol here can be represented as m bits. These codes have the parameters:

Symbols: m bits per symbol
Block length: n = 2^m − 1 symbols = m(2^m − 1) bits
Number of parity symbols: n − k = 2t symbols = 2mt bits
Minimum distance: d = 2t + 1 symbols

Example 2

Let t = 1 and m = 2. Denoting the symbols as 0, 1, 2, and 3, we can write their binary representation as

0 = 00
1 = 01
2 = 10
3 = 11

and we have a code with the following parameters:

n = 2² − 1 = 3 symbols = 6 bits
n − k = 2 symbols = 4 bits

This code can correct any in-phase burst (that is, a burst spanning a single symbol) of length 2.

For example, suppose the code word (1,2,3) was transmitted. We write it as (01 10 11). Since the code is one-symbol error correcting, it will decode any in-phase burst error of length 2. In general, a t-symbol error correcting Reed-Solomon code can correct t in-phase bursts of length m bits in each code word.
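A small sketch of why an in-phase burst of length 2 costs this code only one symbol error:

```python
def to_symbols(bits, m=2):
    """Group bits into symbols of m bits each."""
    return [bits[i:i + m] for i in range(0, len(bits), m)]

tx = (0, 1, 1, 0, 1, 1)    # code word (1,2,3) written as 01 10 11
rx = (0, 1, 0, 1, 1, 1)    # an in-phase burst of length 2 hits the middle symbol

bit_errors = sum(a != b for a, b in zip(tx, rx))
sym_errors = sum(a != b for a, b in zip(to_symbols(tx), to_symbols(rx)))
print(bit_errors, "bit errors but only", sym_errors, "symbol error")   # 2, 1
```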

For any (n,k) code with minimum distance d, it can be shown that d ≤ n − k + 1. Since d = n − k + 1 for Reed-Solomon codes, they are called maximum distance separable. Reed-Solomon codes are now commercially available, with development spurred by military tactical communications.

Reed-Solomon codes are extremely well suited for burst-error correction and for use as outer codes in a powerful coding system known as the concatenated coding system [2]. The basic idea of concatenation is to factor the channel encoder and decoder in the way shown in Fig. 8. By choosing an inner code (block or convolutional) appropriately and taking a Reed-Solomon code as the outer code, lower decoding complexity and larger coding gains are possible compared to an unfactored system.

Golay Code

This is a very special three-error-correcting (23,12) cyclic code with minimum distance 7, and is based on the following tantalizing number-theoretic fact: 1 + C(23,1) + C(23,2) + C(23,3) = 2048 = 2¹¹, which makes the code a "perfect" code. The code has been widely used as a (24,12) code with minimum distance 8, obtained by adding an extra parity bit which is a parity check over the other 23 bits. Unfortunately, the Golay code does not generalize to other combinations of n and k.
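This fact is easy to verify; the Hamming spheres of radius 3 around the 2¹² code words exactly fill the space of 2²³ binary 23-tuples:

```python
import math

# Sphere of radius 3 around a 23-bit word: 1 + C(23,1) + C(23,2) + C(23,3)
sphere = sum(math.comb(23, i) for i in range(4))
print(sphere)                      # 2048 = 2**11
print(sphere * 2**12 == 2**23)     # True: the (23,12) Golay code is perfect
```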

Hamming Codes

These are cyclic codes having the following parameters:

Block length: n = 2^m − 1
Number of parity bits: n − k = m
Minimum distance: d = 3

Maximal-Length Codes

These are cyclic codes with the following parameters:

Block length: n = 2^m − 1
Number of information bits: k = m
Minimum distance: d = 2^(m−1)

These codes are related to the maximal-length sequences used extensively in spread-spectrum communications and for closed-loop time-division multiple-access synchronization, to name two examples [1]. They are also called simplex codes, an intriguing contact between algebraic coding theory and the geometry of n dimensions [3].

Quadratic Residue Codes

The minimum distances of codes in this family are typically comparable to those of BCH codes of comparable lengths. The quadratic residue codes are cyclic codes with the following parameters:

Block length: n = p, a prime number of the form 8m ± 1
Number of information bits: k = (p + 1)/2
Minimum distance: d > √n

The above list is by no means exhaustive. For example, we have not mentioned codes based on the combinatorial configurations of finite geometries, Goppa codes, and quasi-cyclic codes, to name a few [5].

Decoding of Block Codes

The algebraic structure imposed on block codes has produced a number of decoding techniques for these codes, and the theory is quite well developed. Thus, the various schemes will be touched upon only briefly. To use many of these techniques requires the use of binary quantization (hard decisions) at the demodulator output. The first step is to form the syndrome. In medical parlance, a syndrome is a pattern of symptoms that aids in the diagnosis of a disease. Here the "disease" is the error pattern and a "symptom" is a parity check failure. This felicitous coinage is due to Hagelbarger [3]. To correct errors, the syndrome is processed further using any one of the following methods:

Table look-up decoding: It can be shown that there is a unique correspondence between the 2^(n−k) distinct syndromes and the correctable error patterns. Thus, for codes with small redundancy we can store the error patterns in a read-only memory (ROM), with the syndrome of the received word forming the address. The error pattern is then added modulo-2 to the received sequence to produce the transmitted code word.

Example 3

For the code of Example 1, the following correspondence can be established:

Fig. 8. Concatenated coding system.


Correctable error pattern    Syndrome
000000                       000
000001                       001
000010                       010
000100                       100
001000                       101
010000                       011
100000                       110
100001                       111

Suppose that the code word c = (110101) is transmitted and r = (010101) is received. We calculate the syndrome of r as follows:

received parity bits = 101
parity bits obtained by re-encoding the received information bits = 011
syndrome = 101 ⊕ 011 = 110

From the table we note that (110) is the syndrome corresponding to the correctable error pattern e = (100000). Thus r ⊕ e = (010101) ⊕ (100000) = (110101) is identified as the transmitted code word. Now suppose (011110) is transmitted and (101110) is received. The syndrome can be computed as before to obtain (101), which corresponds to (001000). The decoded word is identified as (101110) ⊕ (001000) = (100110). This is an incorrect decoding, since the error pattern caused by the channel, (110000), is not a correctable error pattern. This code corrects single errors in any position and one pattern of double errors. Thus, as noted earlier, it is a single-error correcting code.
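A minimal sketch of this table look-up decoder, reproducing both cases of Example 3:

```python
def parity(c1, c2, c3):
    """Parity bits of the (6,3) code of Example 1."""
    return (c1 ^ c3, c1 ^ c2, c2 ^ c3)

def syndrome(word):
    """Received parity bits added, modulo 2, to re-encoded information bits."""
    return tuple(a ^ b for a, b in zip(word[3:], parity(*word[:3])))

# Build the ROM: the syndrome of each correctable error pattern is its address.
patterns = [(0,0,0,0,0,0), (0,0,0,0,0,1), (0,0,0,0,1,0), (0,0,0,1,0,0),
            (0,0,1,0,0,0), (0,1,0,0,0,0), (1,0,0,0,0,0), (1,0,0,0,0,1)]
rom = {syndrome(e): e for e in patterns}

def decode(r):
    e = rom[syndrome(r)]                       # look up the error pattern
    return tuple(a ^ b for a, b in zip(r, e))  # add it back, modulo 2

print(decode((0, 1, 0, 1, 0, 1)))   # (1, 1, 0, 1, 0, 1): decoded correctly
print(decode((1, 0, 1, 1, 1, 0)))   # (1, 0, 0, 1, 1, 0): the incorrect decoding
```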

Algebraic techniques: The most prominent among these is the iterative decoding algorithm for BCH codes due to Berlekamp. It is perhaps the deepest and most impressive theoretical result in coding theory (block or convolutional). A systems engineer who wishes to minimize the complexity of a BCH decoder is still well advised to use Berlekamp's procedure [3]. The algorithm was interpreted in terms of the design of a minimum-length shift register to produce a given sequence by Massey [2]. The key idea is to compute a so-called error-locator polynomial and solve for its roots. The complexity of this algorithm increases only as the square of the number of errors to be corrected. Thus, it is feasible to decode powerful codes. The use of Fourier-like transforms has recently been proposed as a vehicle for reducing decoder complexity [2].

The BCH decoder could be implemented at moderate data rates in a special-purpose processor with an associated finite-field arithmetic unit and memory. Highly parallel realizations have been used to achieve very high data rates (40 Mbps) [2].

The standard BCH decoding algorithm is a bounded-distance algorithm. That is, no error patterns of more than t errors can be corrected. This technique does not generalize easily to utilize soft decisions. At present, soft decisions can only be utilized via some other techniques in combination with the standard hard-decision BCH decoding algorithm. Two such schemes are Forney's generalized minimum distance decoding and Chase's algorithm [2].

Permutation decoding is another example of algebraic decoding, and so-called error trapping is a special case of it [4]. This technique is based on the fact that if the weight of the syndrome for an (n,k) t-error correcting code is at most t, then the information bits are correct. If the weight of the syndrome is greater than t, then at least one information bit is incorrect.

Majority logic decoding: There are codes that, because of the special form of their parity check equations, are majority-logic decodable. Majority-logic decoding is the simplest form of threshold decoding that is applicable to both block and convolutional codes. Recall that any syndrome bit is a linear combination of error bits. Thus, a syndrome bit represents a known sum of error bits. Further, any linear combination of syndrome bits is also a known sum of error bits. Hence, all 2^(n−k) possible combinations of syndrome bits are known sums of error bits available at the receiver.

In the simplest case, decoding for these codes is performed on a bit-by-bit basis. For every received bit, several parity check equations are checked, each giving rise to a particular value. The element, 1 or 0, receiving the majority of votes is taken to be the correct value for that bit. Many examples of this type of decoding procedure are given in [1].

Convolutional Codes and Their Decoding

Convolutional codes using either Viterbi or sequential decoding have the ability to utilize whatever soft-decision information might be available to the decoder. It is not surprising that they have been used widely, even though their theory is not as mathematically profound as that of block codes. Most good convolutional codes have been found by computer search rather than by algebraic construction.

Convolutional codes can be studied from many different approaches. For the purpose of illustrating their decoding methods, it is necessary to outline both the tree and the trellis approaches.

The convolutional encoder of Fig. 2 can be described by the code tree of Fig. 9. Each branch of the tree represents a single input bit; an input zero corresponds to the upper branch and an input one corresponds to the lower branch. Clearly, any sequence of input bits traces out a particular path through the tree. Specifically, the input sequence 10110 traces out the output sequence 11 01 00 10 10.

In Fig. 9 we have labeled each node of the tree with a member from the following set of binary pairs: (00, 01, 10, 11) corresponding to the contents of the two left-most positions of the encoder register at that point in the tree. This number is called the state of the encoder.

We see that the tree contains redundant information which can be eliminated by merging, at any given level, all nodes corresponding to the same encoder state. The redrawing of the tree with merging paths has been called a trellis by Forney. Figure 10 represents a trellis for the convolutional encoder of Fig. 2. As before, an input 0 corresponds to the selection of the upper branch and an input 1 to the lower branch. Each possible input sequence corresponds to a particular path in the trellis.

Unlike block codes, several distance measures have been proposed for convolutional codes, and each one is important and useful for particular decoding techniques.

The nth-order column distance function dc(n) of a convolutional code is the minimum Hamming distance between all pairs of code words of length n branches which differ in their first branch of the code tree. The column distance function is a nondecreasing function of n, and assumes two particular values of special interest: d, the minimum distance of the code, when n = K, the constraint length of the code; and df, the free distance of the code, when n → ∞.

The minimum distance of a convolutional code is the important parameter for determining the error probability of the code when used with threshold decoding. The free distance is useful in determining the code performance with Viterbi decoding and sequential decoding.

Decoding of Convolutional Codes

The problem of decoding a convolutional code can be thought of as attempting to find a path through the trellis or the tree by making use of some decoding rule.

Viterbi Decoding Algorithm

This algorithm leads to a maximum likelihood decoder for convolutional codes. In fact, it applies to any trellis code, not just convolutional codes. The significance of the trellis viewpoint is that the number of nodes in the trellis does not continue to grow as the number of input bits increases, but remains at 2^(K−1). The Viterbi algorithm computes a "metric" for every possible path through the trellis. It then discards a number of paths at every node that exactly balances the number of new paths that are created. Thus, it is possible to maintain a relatively small list of paths that is always guaranteed to contain the maximum-likelihood choice. The decoding algorithm can easily operate on soft-decisioned data. This is a major advantage of Viterbi decoding.
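A minimal hard-decision sketch of the algorithm for the 4-state trellis of the encoder in Fig. 2 (again assuming the octal generator pair (5, 7)); one channel bit of the example sequence is corrupted and recovered:

```python
def branch_output(u, state):
    """Channel-bit pair for input u from state (s1, s2); assumed generators 5, 7."""
    s1, s2 = state
    return (u ^ s2, u ^ s1 ^ s2)

def viterbi_decode(pairs):
    """Hard-decision Viterbi decoding over the 2**(K-1) = 4 state trellis."""
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]
    metric = {s: 0 if s == (0, 0) else float("inf") for s in states}
    survivor = {s: [] for s in states}          # decoded bits of the best path
    for pair in pairs:
        new_metric = {s: float("inf") for s in states}
        new_survivor = {s: [] for s in states}
        for s in states:
            if metric[s] == float("inf"):
                continue
            for u in (0, 1):
                out = branch_output(u, s)
                dist = sum(a != b for a, b in zip(out, pair))  # Hamming metric
                nxt = (u, s[0])                                # next register state
                if metric[s] + dist < new_metric[nxt]:         # keep only the best
                    new_metric[nxt] = metric[s] + dist         # path into each node
                    new_survivor[nxt] = survivor[s] + [u]
        metric, survivor = new_metric, new_survivor
    best = min(states, key=lambda s: metric[s])
    return survivor[best]

# Encoding 10110 gives 11 01 00 10 10; corrupt one bit (second pair 01 -> 00).
rx = [(1, 1), (0, 0), (0, 0), (1, 0), (1, 0)]
print(viterbi_decode(rx))   # [1, 0, 1, 1, 0]: the single channel error is corrected
```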

Viterbi decoding is presently the most important decoding technique for providing coding gain for a variety of channels. Unfortunately, it has been well over a decade since there have been any fundamentally new ideas in Viterbi decoding, and that technology appears to be near the asymptote of its learning-curve [9].

We note that the complexity of the Viterbi algorithm is an exponential function of the code's constraint length K; unfortunately, the larger K is, the better the code is likely to be (that is, the larger are the coding gains that can be obtained). We are motivated to consider decoding algorithms that will work on convolutional codes with very large values of K, say K >> 10, the present limit of Viterbi decoders.

One last point worth mentioning is that Viterbi decoding does not perform very well in a bursty channel. In such channels, interleaving of data may thus have to be considered to obtain low correlation between noise samples. However, interleaving requires a significant increase in the encoding delay, which may not be acceptable in certain applications.

Sequential Decoding

The complexity of sequential decoders is relatively independent of constraint length, so that codes with much larger constraint lengths can be used. A more rapid rate of change of error probability is achieved with increasing Eb/N0. This technique is more suitable than Viterbi decoding when low bit error rates (below 10⁻⁵) are required.

Sequential decoding was first introduced by Wozencraft but the most widely used algorithm to date is due to Fano. It is an efficient method for finding the most probable code word, given the received sequence, without searching the entire tree. The explored path is probably only local; that is, the procedure is suboptimum. The search is performed in a sequential manner, always operating on a single path, but the decoder can back up and change previous decisions. Each time the decoder moves forward, a “tentative” decision is made. If an incorrect decision is made, subsequent extensions of the path will be wrong. The decoder is eventually able to recognize this situation. When this happens, a substantial amount of computation is needed to recover the correct path. Backtracking and trying alternate paths continues until it finally decodes successfully.

A major problem with sequential decoding schemes is that the number of computations required in advancing one node deeper into the code tree is so ill-behaved a random variable that even with very fast decoding circuitry and very large buffers, performance is limited by the probability of buffer overflow [5].

Threshold Decoding

Some convolutional codes are threshold decodable in that several parity checks are calculated for each message bit and, if they exceed a threshold, a decision on the correctness of the bit is made. Moderate values of coding gain (1 to 3 dB) can be obtained with relatively inexpensive decoders and a limited amount of redundancy.

Diffuse threshold decoding and the Gallager adaptive burst-finding scheme are two important variations of threshold decoding. These algorithms can also deal with burst errors [1].

Comparison of Block and Convolutional Codes

The theory of block codes is much older and richer than the theory of convolutional codes, and the discussion on block codes here is accordingly longer than the discussion on convolutional codes. However, until recently this imbalance did not apply to practical applications. The discussion presented here is applicable to an additive white Gaussian noise channel. For spread-spectrum systems with jamming and fading channels, the benefits of using codes (either block or convolutional) are even more spectacular.

We will take as our basis of comparison an uncoded BPSK system employing coherent detection. The same can be extended to QPSK, since four-phase modulation may be considered as the superposition of two BPSK systems, each acting upon the orthogonal sine and cosine components of the carrier signal. We will assume that the information rate is fixed for all coded systems. The coded system will require more RF bandwidth. Comparisons among different techniques will be made at bit error rates of 10⁻⁵ and 10⁻⁸. Table 1 has been adapted from [2]. Here the column labelled "data rate capability" is taken to be the following: low (less than 10 kbps), moderate (10 kbps to 1 Mbps), high (1 Mbps to 20 Mbps), and very high (greater than 20 Mbps).

At moderate and high data rates, for a given level of complexity, convolutional codes with Viterbi decoding appear to be the most attractive technique. This assumes that there is no appreciable interference other than Gaussian noise; it also assumes that a decoded bit error rate of 10⁻⁵ is satisfactory and that the overall system transmits long sequences of bit streams. This advantage to the Viterbi algorithm follows because, in order to apply an algebraic decoding algorithm to a block code, it is necessary to use hard decisions, whereas the Viterbi algorithm can be adapted to accept soft decisions with relative ease. However, if more efficient algorithms for decoding long block codes with soft decisions are developed, they will undoubtedly be quite competitive.

At very high data rates, concatenated Reed-Solomon and short block code systems can provide roughly the same gain with less complexity than Viterbi decoding.

For larger coding gains at high speeds, sequential decoding with hard decisions appears to be the most attractive choice. At moderate data rates a better case can be made for using sequential decoding with soft decisions.

In situations where the system protocols require the transmission of blocks of data (such as TDMA), all convolutional systems require flushouts and restarts; that is, the CODEC must be set to the all-zero state before processing the next block. For such systems, block codes appear to be more attractive.

Threshold decoding is a very attractive technique for systems operating at very high speeds with small complexity. In digital satellite systems a potential use of this technique will be in digital telephony, where the user requires a smaller bit rate than that of the uncoded system but desires extremely high bandwidth efficiency.

For mobile terminals operating in the presence of large Doppler offset and Doppler rate, multipath, and fading, the use of Reed-Solomon codes with soft-decision decoding appears extremely attractive to combat the bursty nature of the channel.

Fig. 10. Trellis for the convolutional encoder of Fig. 2.


TABLE 1. COMPARISON OF MAJOR CODING TECHNIQUES WITH BPSK OR QPSK MODULATION ON A GAUSSIAN CHANNEL

Coding technique                         Gain (dB) at 10⁻⁵   Gain (dB) at 10⁻⁸   Data rate capability
Concatenated (RS and Viterbi)            6.5-7.5             8.5-9.5             Moderate
Sequential decoding (soft decisions)     6.0-7.0             8.0-9.0             Moderate
Block codes (soft decisions)             5.0-6.0             6.5-7.5             Moderate
Concatenated (RS and short block)        4.5-5.5             6.5-7.5             Very high
Viterbi decoding                         4.0-5.5             5.0-6.5             High
Sequential decoding (hard decisions)     4.0-5.0             6.0-7.0             High
Block codes (hard decisions)             3.0-4.0             4.5-5.5             High
Block codes, threshold decoding          2.0-4.0             3.5-5.5             High
Convolutional codes, threshold decoding  1.5-3.0             2.5-4.0             Very high

Alternative applicable coding techniques are threshold decoding with interleaving or soft-decision Viterbi decoding with interleaving. However, interleaving requires a significant increase in the encoding delay. In packet networks where decoding, re-encoding, and retransmission at several intermediate nodes are required, the delay associated with interleaved convolutional codes may not be acceptable. In such situations the solution might be to avoid interleaving by using one of the block codes suited for this purpose which, rather than dispersing the bursts by interleaving, exploits them for improved error performance [9].

It should be stressed that a major part of these comparisons is influenced by today's digital integrated circuit technology. Advances in this technology could modify relative comparisons of complexity and achievable data rates.

Conclusion

The purpose of this paper was to introduce forward error correction schemes for digital communications. Various families of codes and their decoding methods were outlined. The performance of these codes over an additive white Gaussian noise channel was discussed.

An enormous amount of literature is now available on coding. For the interested reader, we have narrowed it down to a brief bibliography. More comprehensive lists of references are available in [1,2,4].

The theory of error-correcting codes is an active area of research. Some critics claim that coding will be eliminated to conserve spectrum and space. This may not be the final answer, since guard spaces, guard times, and minimum antenna separations are themselves users of spectrum, time, and space, and yet do not fully eliminate mutual interference and error [10]. Indeed, what Solomon Golomb wrote well over a decade ago is still very much true:

A message with content and clarity
Has gotten to be quite a rarity.
To combat the terror
Of serious error,
Use bits of appropriate parity.

Acknowledgment

The support of the author's research by the Natural Sciences and Engineering Research Council of Canada and by le Programme de Formation de Chercheurs et d'Action Concertée du Gouvernement du Québec is gratefully acknowledged. The author would also like to thank Drs. Jean Conan, David Haccoun, and Gerald Seguin for helpful discussions.

Bibliography

[1] V. K. Bhargava, D. Haccoun, R. Matyas, and P. Nuspl, Digital Communications by Satellite: Modulation, Multiple Access and Coding, NY: Wiley, 1981.
[2] G. C. Clark, Jr. and J. B. Cain, Error Correction Coding for Digital Communications, NY: Plenum Press, 1981.
[3] R. J. McEliece, The Theory of Information and Coding, Reading, MA: Addison-Wesley, 1977.
[4] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes, Amsterdam: North-Holland, and NY: Elsevier/North-Holland, 1977.
[5] I. F. Blake and R. C. Mullin, The Mathematical Theory of Coding, NY: Academic Press, 1975.
[6] W. W. Peterson and E. J. Weldon, Jr., Error Correcting Codes, Second Edition, Cambridge, MA: MIT Press, 1972.
[7] S. Lin, An Introduction to Error-Correcting Codes, Englewood Cliffs, NJ: Prentice-Hall, 1970.
[8] E. R. Berlekamp, Algebraic Coding Theory, NY: McGraw-Hill, 1968.
[9] E. R. Berlekamp, "The technology of error-correcting codes," Proc. IEEE, vol. 68, pp. 564-593, May 1980.
[10] I. M. Jacobs, "Practical applications of coding," IEEE Trans. Inf. Theory, IT-20, pp. 305-310, May 1974.
[11] J. K. Wolf, "A survey of coding theory, 1967-1972," IEEE Trans. Inf. Theory, IT-19, pp. 381-389, July 1973.
[12] H. O. Burton and D. D. Sullivan, "Errors and error control," Proc. IEEE, vol. 60, pp. 1293-1301, Nov. 1972.
[13] R. T. Chien, "Block coding techniques for reliable data transmission," IEEE Trans. on Commun. Technol., COM-19, pp. 743-751, Oct. 1971.
[14] A. J. Viterbi, "Convolutional codes and their performance in communication systems," IEEE Trans. on Commun. Technol., COM-19, pp. 751-772, Oct. 1971.
[15] G. D. Forney, Jr., "Burst correcting codes for the classic bursty channel," IEEE Trans. on Commun. Technol., COM-19, pp. 772-781, Oct. 1971.
[16] G. D. Forney, Jr., "Coding and its application in space communications," IEEE Spectrum, pp. 47-58, June 1970.
[17] J. F. Hayes, "The Viterbi algorithm applied to digital data transmission," Communications Magazine, vol. 13, no. 12, pp. 15-20, March 1975.

Vijay K. Bhargava was born in Beawar, India, on September 22, 1948. He received the B.Sc. (Math. and Eng.), M.Sc. (EE), and Ph.D. (EE) from Queen's University, Kingston, Ontario, in 1970, 1972, and 1974, respectively.

He is currently an Associate Professor of Electrical Engineering at Concordia University in Montreal. His research interest is in the area of digital communications, with special emphasis on error control coding techniques, cryptography, and spread spectrum communications, and he has been a consultant to government agencies and industries in these areas. He is a coauthor of the book Digital Communications by Satellite, published by John Wiley & Sons in December 1981. For the year 1982-1983 he is on sabbatical leave at Ecole Polytechnique de Montreal.

Vijay Bhargava is the junior past chairman of the IEEE Montreal Section and is a director of IEEE Conferences Montreal, Inc. He is a co-vice chairman and the chairman (local arrangements) of the 1983 IEEE International Symposium on Information Theory. He was instrumental in the formation of the Information Theory Chapter in the Montreal Section and currently serves as its chairman.
