Fixed-point error analysis of two-channel perfect reconstruction filter banks with perfect alias cancellation


IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 46, NO. 11, NOVEMBER 1999 1437

Transactions Briefs

Fixed-Point Error Analysis of Two-Channel Perfect Reconstruction Filter Banks with Perfect Alias Cancellation

Bryan E. Usevitch and Carlos L. Betancourt

Abstract—This paper studies the effects of fixed-point arithmetic in two-channel perfect reconstruction (PR) filter banks. Practical implementations of filter banks often require scaling of coefficients and differing binary word sizes to maintain dynamic range. When scaling is used with fixed precision arithmetic, the perfect alias cancellation (PAC) and PR constraints no longer hold. The main contribution of this paper is the derivation of constraints whereby PAC is maintained, even when coefficients are scaled and when the analysis and synthesis filter banks use different binary word lengths. Once PAC is established, the fixed-point effects on PR properties can be analyzed using standard methods. The theory is verified by comparing predicted and actual reconstruction signal-to-noise ratios resulting from simulating a symmetric wavelet transform.

Index Terms—Finite wordlength effects, wavelet transforms.

I. INTRODUCTION

This paper studies the errors introduced by the use of fixed-point arithmetic in perfect reconstruction (PR) filter banks. The main problem focused on is determining the required filter word lengths to obtain "reasonable" reconstruction signal-to-noise ratios (SNR's). This study includes the case where the filter coefficients need to be scaled to maintain dynamic range and where different binary word lengths are used in the analysis and synthesis filters. Filter banks which use scaling can no longer maintain perfect alias cancellation (PAC) in general. However, this paper derives a constraint on the scaling factors that guarantees PAC. Although PAC is not proven to maximize the SNR, it simplifies the error analysis by eliminating the aliased signal in the reconstructed output. The theory is verified through simulations on finite impulse response (FIR) PR filter banks using a symmetric wavelet transform (SWT) [1]. It is shown that on average, the reconstruction SNR's can be predicted to within an error of 1.5%. The SNR predictions given are based on output signals that have not been requantized to the same number of bits as the original input signal. Also studied is the case where the output is quantized to the same number of bits as the input. For this case, a statistical approach is proposed for selecting the filter word lengths which will produce PR to a desired degree of probability.

The two sources of error in fixed-precision arithmetic are filter-coefficient quantization and arithmetic roundoff [2]. Section II-A analyzes the filter coefficient quantization in PR filter banks. It also derives conditions whereby PAC is preserved in spite of coefficient scaling and differing binary word lengths. Section II-B analyzes the arithmetic roundoff in PR filter banks. Section III gives simulation results and Section IV gives conclusions.

Manuscript received July 25, 1999. This work was supported in part by NASA FAR under Contract 961119. This paper was recommended by Associate Editor M. Simaan.

The authors are with the Department of Electrical and Computer Engineering, the University of Texas at El Paso, El Paso, TX 79968-0523 USA.

Publisher Item Identifier S 1057-7130(99)09222-8.

Fig. 1. Two-channel filter bank.

II. FIXED-POINT ERROR ANALYSIS

A. Filter-Coefficient Quantization Error

The approach used in this paper for determining fixed-point arithmetic errors is to model the overall filter bank with an equivalent transfer function. It is well known that the input–output relationship of the two-channel filter bank (see Fig. 1) is given by the following:

\[
\hat{X}(z) = \frac{1}{2}\left[H_0(z)G_0(z) + H_1(z)G_1(z)\right]X(z) + \frac{1}{2}\left[H_0(-z)G_0(z) + H_1(-z)G_1(z)\right]X(-z) = T(z)X(z) + S(z)X(-z) \qquad (1)
\]

where

\[
S(z) = \frac{1}{2}\left[H_0(-z)G_0(z) + H_1(-z)G_1(z)\right]. \qquad (2)
\]

PR filter banks are designed such that S(z) = 0, which completely removes the aliased signal X(-z) from the output. It is readily verified from (2) that PAC is achieved when [3]

\[
G_0(z) = H_1(-z) \quad \text{and} \quad G_1(z) = -H_0(-z). \qquad (3)
\]
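The alias-cancellation conditions in (3) are easy to check numerically. The sketch below uses the Haar filter pair (an assumed example for illustration, not a filter bank from this paper), forms G_0 and G_1 from (3), and verifies that every coefficient of 2S(z) in (2) vanishes:

```python
import numpy as np

# Haar analysis filters (assumed example pair)
h0 = np.array([1.0, 1.0]) / np.sqrt(2)
h1 = np.array([1.0, -1.0]) / np.sqrt(2)

def alternate(h):
    # H(-z): multiply the coefficient of z^{-n} by (-1)^n
    return h * (-1.0) ** np.arange(len(h))

g0 = alternate(h1)        # G0(z) = H1(-z)
g1 = -alternate(h0)       # G1(z) = -H0(-z)

# 2S(z) = H0(-z)G0(z) + H1(-z)G1(z); polynomial products are convolutions
s2 = np.convolve(alternate(h0), g0) + np.convolve(alternate(h1), g1)
print(np.allclose(s2, 0))  # alias term vanishes: True
```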

Once aliasing is eliminated, the overall transfer function of the filter bank is given by T(z) in (1). Once T(z) is determined, the filter coefficient quantization error (for either FIR or infinite impulse response (IIR) filter banks) is determined readily from standard techniques [2]. For example, the Z-transform of the coefficient error for an FIR filter bank can be expressed as

\[
F(z) = \sum_{n=-\infty}^{\infty}\left[t_Q(n) - t(n)\right]z^{-n} = T_Q(z) - T(z).
\]

Assuming no overflow, the mean-square error (MSE) due to coefficient quantization is given by [2]

\[
E\!\left[e_{cq}^2(n)\right] = \frac{1}{2\pi}\int_{-\pi}^{\pi} \left|F(e^{j\omega})\right|^2 S_{XX}(e^{j\omega})\, d\omega
\]

where S_{XX} is the input power spectral density (PSD).

When implementing PR filter banks with finite precision arithmetic, the filter constraints given in (3) no longer guarantee PAC. Specifically, (3) assures PAC only if no scaling is used and the analysis and synthesis filters use the same binary word precision (equivalently, in error-analysis terminology, the same quantization step size). In many practical applications, both scaling and word-precision adjustment are required to maintain dynamic range. A constraint is now derived so that PAC is still maintained under these conditions.

1057–7130/99$10.00 1999 IEEE


Suppose that the analysis and synthesis filter coefficients are represented with different binary word sizes, corresponding to quantization step sizes of Q_1 and Q_2, respectively. For a fixed scaling factor k, the expression for S(z) can be rewritten as

\[
2S(z) = \left[\frac{1}{k}H_0(-z)\right]_{Q_1}\left[kH_1(-z)\right]_{Q_2} - \left[\frac{1}{k}H_1(-z)\right]_{Q_1}\left[kH_0(-z)\right]_{Q_2} \qquad (4)
\]

where [x]_Q denotes the value of x quantized to a step size of Q. Defining the quantization error in the filter H_i(z) as E_{Q_j}^{(i)}(z) for i \in \{0, 1\} and j \in \{1, 2\}, (4) can be expressed as

\[
2S(z) = \left[\frac{1}{k}H_0(-z) + E_{Q_1}^{(0)}(-z)\right]\left[kH_1(-z) + E_{Q_2}^{(1)}(-z)\right] - \left[\frac{1}{k}H_1(-z) + E_{Q_1}^{(1)}(-z)\right]\left[kH_0(-z) + E_{Q_2}^{(0)}(-z)\right]. \qquad (5)
\]

Simplifying (5) shows that a necessary and sufficient condition for PAC is

\[
k^2 = \frac{E_{Q_2}^{(i)}(z)}{E_{Q_1}^{(i)}(z)}. \qquad (6)
\]

An equivalent condition, which can be derived from (6), is that each coefficient \alpha of the filters h_0(n) and h_1(n) satisfies

\[
k^2 = \frac{[k\alpha]_{Q_2}}{[\alpha/k]_{Q_1}}. \qquad (7)
\]

Equation (7) is now used to derive a theorem for determining valid scaling values k which maintain PAC.

Theorem 1: A necessary and sufficient condition for perfect alias cancellation in two-band filter banks, when scaling by k and when the binary representations of the analysis and synthesis filter coefficients differ by c bits, is

\[
k^2 = 2^c \qquad (8)
\]

where it is assumed that the binary number representations are large enough so that overflow does not occur.

The proof of Theorem 1 for the case c = 0 (k = 1) follows immediately from (4), since the quantization step sizes Q_1 and Q_2 are identical. The wavelet transform tends to compact energy [4]. Therefore, it is often necessary to scale down (k > 1) the analysis filter outputs to represent the signal efficiently without overflow. The proof of Theorem 1 is given for this more practical case (k > 1, c > 0), noting that the proof for the k < 1, c < 0 case follows in a similar manner.

Proof:

Sufficiency: Assume that k > 1 and c > 0. Let the smallest quantization levels for the two binary representations be Q_1 and Q_2, where 2^c Q_1 = Q_2. Consider a filter coefficient \alpha that is scaled to give (1/k)\alpha = n_1 and k\alpha = n_2. Note that
\[
n_2 = k\alpha = k^2 \frac{1}{k}\alpha = k^2 n_1 = 2^c n_1.
\]

The quantized value of n_2 (by rounding) is given by
\[
[n_2]_{Q_2} = \langle n_2 2^b \rangle 2^{-b} \qquad (9)
\]

where b is the exponent needed to align the point of rounding with the standard binary point, and \langle x \rangle denotes the value of x rounded to the nearest integer. Using a similar argument, the quantized value of n_1 is

Fig. 2. Additive model for roundoff error in a two-channel filter bank.

\[
[n_1]_{Q_1} = \langle n_1 2^{b+c} \rangle 2^{-(b+c)} = \left(\langle n_2 2^b \rangle 2^{-b}\right) 2^{-c} = [n_2]_{Q_2}\, 2^{-c} \qquad (10)
\]
from which it follows that [n_2]_{Q_2}/[n_1]_{Q_1} = 2^c = k^2. Repeating this argument for each filter coefficient shows that (7) is satisfied and PAC is achieved.

Necessity: Assume PAC is satisfied and k^2 \neq 2^c. The quantized value of n_2 is still given by (9), and the quantized value of n_1 given in (10) becomes
\[
[n_1]_{Q_1} = \left\langle n_2 2^b \frac{2^c}{k^2} \right\rangle 2^{-(b+c)}.
\]

Since PAC is satisfied, it follows that [n_2]_{Q_2} = k^2 [n_1]_{Q_1}, i.e.,
\[
\langle n_2 2^b \rangle 2^{-b} = \left\langle n_2 2^b \frac{2^c}{k^2} \right\rangle \frac{k^2}{2^c}\, 2^{-b}. \qquad (11)
\]
For equality in (11), both the integer part and the exponent must agree, which requires k^2 = 2^c (a contradiction). QED

Theorem 1 shows that as long as (8) is satisfied, the error due to aliasing can be eliminated, even when filter coefficients are scaled and when different binary word sizes are used in the analysis and synthesis filter banks. Once PAC is established, the filter coefficient quantization error is determined readily from T(z) using standard techniques [2].
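Theorem 1 can be sanity-checked in a few lines. The sketch below (the coefficient value and the fraction lengths are illustrative assumptions) scales a coefficient down by k before quantizing at step Q_1 and up by k before quantizing at the coarser step Q_2 = 2^c Q_1, and confirms that the ratio of the two quantized values is exactly k^2 = 2^c:

```python
import math

k, c = math.sqrt(2), 1        # k^2 = 2^c, as required by (8)
Q1 = 2.0 ** -12               # analysis quantization step (12-bit fraction, assumed)
Q2 = Q1 * 2 ** c              # synthesis step, coarser by c bits

def quantize(x, Q):
    # round x to the nearest multiple of the step size Q
    return round(x / Q) * Q

alpha = 0.482962913144690     # an arbitrary filter coefficient (illustrative)
n1 = quantize(alpha / k, Q1)  # scaled-down, finely quantized value
n2 = quantize(alpha * k, Q2)  # scaled-up, coarsely quantized value
print(n2 / n1)                # exactly k^2 = 2.0
```

The ratio is exact (not merely close) because n_2 2^b and n_1 2^{b+c} round to the same integer, mirroring the sufficiency proof above.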

B. Roundoff Error

The roundoff error introduced by an FIR filter depends on the number of nontrivial multiplications per output and on the quantization step size Q given by the word length after the multiplier. For signals with sufficiently large bandwidth and sufficient dynamic range, the quantization errors can be modeled as white noise with variance Q^2/12 [5]. Therefore, assuming independence and no overflow in the additions, the roundoff MSE at the output of an FIR filter is given by

\[
E\!\left[e_{ro}^2(n)\right] = \nu q^2
\]

where \nu is the number of nontrivial multiplications per output, and q^2 = Q^2/12 is the variance of a random variable uniformly distributed on (-Q/2, Q/2).
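The white-noise roundoff model can be exercised directly. The following sketch (tap values, step size, and input statistics are all illustrative assumptions) rounds each product of an FIR filter to step Q and compares the measured error variance with the prediction \nu Q^2/12:

```python
import numpy as np

rng = np.random.default_rng(0)
Q = 2.0 ** -10                           # step size after the multiplier (assumed)
h = np.array([0.30, -0.41, 0.22, 0.57])  # arbitrary FIR taps; nu = 4 multiplies
x = rng.uniform(-1.0, 1.0, 200_000)      # wideband, large-dynamic-range input

shifted = [np.roll(x, k) for k in range(len(h))]
exact = sum(hk * s for hk, s in zip(h, shifted))
rounded = sum(np.round(hk * s / Q) * Q for hk, s in zip(h, shifted))

err = rounded - exact                    # pure multiplier roundoff error
predicted = len(h) * Q ** 2 / 12         # nu * q^2
print(err.var() / predicted)             # close to 1.0
```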

When this model is applied to a two-channel filter bank, the total roundoff MSE at the output is determined by the independent sequences e_{h_i}(n) and e_{g_i}(n) for i \in \{0, 1\}, as shown in Fig. 2. To find the contribution of the error sequences e_{h_i}(n), consider a zero-mean uncorrelated sequence with variance \sigma^2 that is upsampled by M and filtered by u(n) to give r(n). Since upsampling is a time-varying operation, the mean-square output E[r^2(n)] is periodic with period M. Averaging over M samples, the average power of r(n) is given by
\[
\overline{E[r^2(n)]} = \frac{\sigma^2}{M} \sum_{n=-\infty}^{\infty} u^2(n).
\]

Fig. 3. Prediction percent errors where the reconstructed output word length is identical to the filter output word length.

Using this result, the total average roundoff MSE can be computed at the filter-bank output.
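The average-power result for upsampled noise can be verified with a short Monte Carlo experiment (the impulse response, noise variance, and sample count below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
M = 2                                   # upsampling factor
u = np.array([0.25, 0.5, 0.25, 0.10])   # arbitrary synthesis impulse response
sigma2 = 1.0 / 12                       # roundoff-noise variance for Q = 1

e = rng.uniform(-0.5, 0.5, 500_000)     # white, zero-mean error sequence
up = np.zeros(M * len(e))
up[::M] = e                             # upsample by M
r = np.convolve(up, u)                  # filter by u(n)

predicted = sigma2 / M * np.sum(u ** 2) # (sigma^2 / M) * sum u^2(n)
print(np.mean(r ** 2) / predicted)      # close to 1.0
```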

III. SIMULATION RESULTS AND DISCUSSION

The above fixed-point error analysis was applied to the hardware implementation of a SWT for word lengths between 8 and 16 bits (for higher word lengths, the error effect was found to be negligible, producing SNR's above 70 dB). Twelve one-dimensional input signals obtained from a raster scan of 8-bit synthetic aperture radar (SAR) images were used for the simulations. A scaling factor of k = \sqrt{2} was used to maintain the dynamic range. Since the constraint in (8) required one bit of difference between the analysis and synthesis filter coefficient word lengths, 12 bits and 11 bits were used, respectively. The input PSD was estimated with an averaged AR(3) process over the 12 input signals. The actual SNR's were measured after fixed-point simulation using a CAD tool.

Fig. 3 shows the average and peak percent errors obtained in the predictions. As shown, the average percent error is less than 1.5% for all word lengths, and the maximum peak error is 3%. Thus, the results agree well with the theory. Note that the estimates become more accurate for large word lengths and that the estimates depend on how well the AR process models the input PSD. These results can be used by a system designer to pick the word length that achieves a desired system SNR.

The percent errors shown in Fig. 3 are based on predictions where the reconstructed output word length is identical to the filter output word length. In other words, the output signal is not quantized back to 8 bits, which was the original word length of the input. Now consider the case where requantization takes place. Fig. 4 shows that above 10 bits, the theoretical and actual SNR's diverge. This is because the error magnitude at the output becomes small with respect to the quantization step size of the input signal.

For the higher word-length case, the analysis is modified as follows. Since integer input is used in these simulations, the total output reconstruction error must be less than one-half so that the rounded output values exactly equal the input values. The approach taken here to give PR is to choose the error standard deviation to be small enough that the error is less than one-half (to a desired degree of probability).

Fig. 4. Estimated and measured SNR's; PR is obtained for word lengths of 12 bits or more after quantization to 8 bits.

Fig. 5. Estimated standard deviation of the output error signal.

Fig. 5 shows that by choosing a word length of 12 bits (or more), the estimated standard deviation of the output error is less than 10% of the input quantization step size (Q = 1). Using 12 bits resulted in PR in the simulations. On the other hand, 11-bit word lengths, which correspond to an estimated error standard deviation of less than 20% of the input quantization step size, did not yield PR, but the actual SNR measured above 60 dB (see Fig. 4).
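The statistical selection rule can be made concrete. Assuming the output reconstruction error is approximately zero-mean Gaussian (a modeling assumption, not a claim from the paper), the per-sample probability of exact reconstruction is P(|e| < 1/2) = erf(0.5 / (\sigma \sqrt{2})):

```python
import math

def prob_pr(sigma):
    # P(|e| < 0.5) for a zero-mean Gaussian error with standard deviation sigma
    return math.erf(0.5 / (sigma * math.sqrt(2.0)))

# Error std as a fraction of the input step Q = 1 (levels suggested by Fig. 5)
print(prob_pr(0.10))  # 12-bit case: per-sample PR probability ~ 0.9999994
print(prob_pr(0.20))  # 11-bit case: ~ 0.988, so PR is not guaranteed
```

Under this model, the 10% error level is effectively certain per sample while the 20% level misses roughly one sample in eighty, consistent with the observed simulation behavior.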

IV. CONCLUSION

The effect of fixed-point arithmetic was studied for a two-channel PR filter bank. The analysis was simplified by enforcing PAC which, although not proven to maximize SNR, eliminates the aliasing error in the reconstructed output signal. Reconstruction SNR's were predicted by performing an error analysis on the hardware implementation of a SWT. Simulations showed that the results agree well with the theory. In the case where the output signal is requantized to the same number of bits as the input, a statistical approach was used to obtain PR to a desired degree of probability.

REFERENCES

[1] C. Brislawn, "Classification of nonexpansive symmetric extension transforms for multirate filter banks," Los Alamos National Laboratory, Los Alamos, NM, Mar. 1996.


[2] B. Liu, "Effect of finite word length on the accuracy of digital filters—A review," IEEE Trans. Circuit Theory, vol. CT-18, pp. 670–677, Nov. 1971.

[3] M. J. Smith and T. P. Barnwell, "Exact reconstruction techniques for tree-structured subband coders," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, pp. 434–441, June 1986.

[4] B. Usevitch and M. Orchard, "Smooth wavelets, transform coding, and Markov-1 processes," IEEE Trans. Signal Processing, vol. 42, pp. 2561–2569, Nov. 1995.

[5] C. W. Barnes, B. N. Tran, and S. H. Leung, "On the statistics of fixed-point roundoff error," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-33, pp. 595–606, June 1985.

On the Properties of the Reduction-by-Composition LMS Algorithm

Sau-Gee Chen, Yung-An Kao, and Ching-Yeu Chen

Abstract—The recently proposed low-complexity reduction-by-composition least-mean-square (LMS) algorithm (RCLMS) costs only half the multiplications of the conventional direct-form LMS algorithm (DLMS). This work characterizes its properties and its conditions for mean and mean-square convergence. A closed-form mean-square error (MSE), as a function of the LMS step-size \mu and an extra compensation step-size \alpha, is derived; it is slightly larger than that of the DLMS algorithm. It is shown that, when \mu is small enough and \alpha is properly chosen, the RCLMS algorithm has performance comparable to that of the DLMS algorithm. Simple working rules and ranges for \mu and \alpha that achieve this comparability are provided. For the algorithm to converge, a tight bound for \mu is also derived. The derived properties and conditions are verified by simulations.

Index Terms—Adaptive signal processing, convergence, LMS algo-rithm.

I. INTRODUCTION

The direct-form least-mean-square (DLMS) algorithm is the most popular temporal-domain adaptive filtering algorithm due to its simplicity and robustness. Among temporal-domain approaches, there exist many least-mean-square (LMS) variants that reduce the coefficient update complexity, such as the well-known sign-error, sign-input, and zero-forcing algorithms.

However, few improvements have been made in reducing the filtering complexity. Recently, a so-called fast exact LMS (FELMS) algorithm [4] was proposed that retains the same convergence properties as DLMS while reducing the multiplication complexity of both the filtering and the coefficient update by as much as 25%, with a small increase in the number of additions.

More recently, Chen et al. proposed a new reduction-by-composition LMS (RCLMS) adaptive filtering algorithm [1]. In simulations, the algorithm showed performance comparable to that of the DLMS algorithm while costing 50% fewer multiplications, at the expense of 50% more additions. The algorithm can be combined with the FELMS algorithm to reduce its coefficient update complexity. However, the algorithm's properties have not yet been fully addressed.

Manuscript received December 10, 1998; revised August 1999. This work was supported by the National Science Council, Republic of China, under Grant NSC84-2213-E009-083. This paper was recommended by Associate Editor P. Diniz.

The authors are with the Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University, Hsinchu, Taiwan, R.O.C.

Publisher Item Identifier S 1057-7130(99)09219-8.

Here, the properties of convergence, both in the mean and in the mean square, are investigated in detail and verified by simulations. It is shown that, when the common step-size \mu is very small and an extra compensation step-size \alpha is properly chosen, the RCLMS algorithm has performance comparable to that of the DLMS algorithm. Due to the extra step constant \alpha, the excess mean-square error (MSE) is shown to be slightly higher than that of the DLMS algorithm for zero-mean input signals; the excess MSE is proportional to \alpha. It is also shown that the allowable bound for the step-size \mu is a function of the step-size \alpha: the larger \alpha is, the narrower the bound on \mu.

The paper is organized as follows. In Section II, the RCLMS algorithm is reviewed. Section III covers weight convergence in the mean and mean-square senses, the convergence bound for \mu, and the excess MSE; it also suggests simple working rules for the RCLMS algorithm that yield performance comparable to the DLMS algorithm. The derived properties, bounds, and working rules are verified by simulations in Section IV. The final section draws conclusions.

II. THE RCLMS ALGORITHM

For real-number systems, given an adaptive filter with input sequence x(n) and coefficients w_k(n), the RCLMS algorithm is described below. For the filtering part,

\[
y(n) = \sum_{k=0}^{N-1} w_k(n)\, x(n-k) = \sum_{k=0}^{N/2-1} \left\{\left[x(n-2k) + w_{2k+1}(n)\right]\left[x(n-2k-1) + w_{2k}(n)\right]\right\} - C(n) - P(n) \qquad (1)
\]

where

\[
C(n) = \sum_{k=0}^{N/2-1} w_{2k}(n)\, w_{2k+1}(n) \qquad (2)
\]
\[
P(n) = \sum_{k=0}^{N/2-1} x(n-2k)\, x(n-2k-1) = P(n-2) + x(n)x(n-1) - x(n-N)x(n-N-1). \qquad (3)
\]

N is an even number, and x(n) = 0 and P(n) = 0 for n < 0. Note that P(n) costs only one multiplication and two additions. The time-varying, complicated C(n) can be replaced by a simpler scalar h_N(n), which costs only one extra multiplication, as depicted in (6). Therefore, for the filtering part,

\[
y'(n) = \sum_{k=0}^{N/2-1} \left\{\left[x(n-2k) + w_{2k+1}(n)\right]\left[x(n-2k-1) + w_{2k}(n)\right]\right\} - h_N(n) - P(n) = y(n) - \left[h_N(n) - C(n)\right]. \qquad (4)
\]

For the weight update part

\[
w_k(n+1) = w_k(n) + \mu\, e'(n)\, x(n-k), \qquad k = 0, 1, \ldots, N-1 \qquad (5)
\]
