




2004 IEEE 11th Digital Signal Processing Workshop & IEEE Signal Processing Education Workshop

BAYESIAN ESTIMATION OF NON-STATIONARY AR MODEL PARAMETERS VIA AN UNKNOWN FORGETTING FACTOR

Václav Šmídl
UTIA, Czech Academy of Sciences,
Prague, Czech Republic
smidl@utia.cas.cz

Anthony Quinn
Trinity College Dublin, Ireland
[email protected]

ABSTRACT

We study Bayesian estimation of the time-varying parameters of a non-stationary AR process. This is traditionally achieved via exponential forgetting. A numerically tractable solution is available if the forgetting factor is known a priori. This assumption is now relaxed. Instead, we propose joint Bayesian estimation of the AR parameters and the unknown forgetting factor. The posterior distribution is intractable, and is approximated using the Variational-Bayes (VB) method. Improved parameter tracking is revealed in simulation.

1. INTRODUCTION

The standard Bayesian approach to on-line estimation of AutoRegressive (AR) model parameters leads to the classical solution in terms of the normal equations [1]. These results, reviewed in Section 2, can be extended to cope with non-stationary AR processes, by using a known forgetting factor [2]. The resulting recursive updates involve the same dyadic form (Section 3) as in the stationary case. This paper confronts the difficulty associated with selecting a value for the forgetting factor. Using Bayesian principles, the forgetting factor is included as a random variable, to be estimated at each step (Section 4). A Variational Bayes (VB) approximation of the intractable posterior distribution restores the dyadic update structure of the conventional algorithm, with the forgetting factor replaced by its posterior expectation. The overhead implied by the procedure is the need to evaluate this expected value via an iterative algorithm, as explained in Section 4.2. In Section 5, a simulation is presented, involving estimation of time-varying AR parameters. The proposed approach appears to offer significant improvements in tracking, compared to a fixed known forgetting factor. In Section 6, the approach is discussed in the context of currently available solutions in the literature.



2. BAYESIAN ESTIMATION OF STATIONARY AR MODEL PARAMETERS

A scalar univariate AR process is first studied:

$x_n = -\sum_{k=1}^{p} a_k x_{n-k} + \sigma e_n$. (1)

Here, $x_n$ denotes the AR process observed at times $n = 1, 2, 3, \ldots$, and $\sigma e_n$ is the modelling error. The unknown parameters are $\theta = [a', \sigma]'$, where $a = [a_1, \ldots, a_p]'$. The model order, $p$, is assumed known in this paper. The classical solution to estimation of $\theta$ is based on the Wiener Minimum Mean Squared Error (MMSE) criterion [1], leading to the so-called normal equations. The standard Bayesian approach is based on the assumption that $e_n$ in (1) is white noise with Gaussian distribution; i.e. $f(e_n) = \mathcal{N}(0, 1)$. Then:

$f(x_n \mid a, \sigma, \mathbf{x}_n) = \mathcal{N}(-a'\mathbf{x}_n, \sigma^2)$, (2)

where $n > p$, and $\mathbf{x}_n = [x_{n-1}, \ldots, x_{n-p}]'$ is the vector of regressors. On-line estimation requires the posterior distribution of $\theta$ to be elicited at each time $n$. Bayes' rule, in recursive form, is used to update the posterior distribution of parameters from time $n-1$ to $n$:

$f(\theta \mid X_n) \propto f(x_n \mid \theta, X_{n-1})\, f(\theta \mid X_{n-1})$, (3)

where $X_n = [x_1, \ldots, x_n]$ denotes the data history at time $n$, and $X_0 = \emptyset$ by assignment.
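For illustration, data obeying (1) can be generated directly. The following minimal Python sketch assumes zero initial conditions; the function name and its arguments are illustrative, not part of the paper:

```python
import numpy as np

def simulate_ar(a, sigma, n_samples, rng=None):
    """Draw n_samples from model (1): x_n = -sum_k a_k x_{n-k} + sigma * e_n,
    with e_n ~ N(0, 1) and zero initial conditions (an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a, dtype=float)
    p = len(a)
    x = np.zeros(n_samples + p)
    for n in range(p, n_samples + p):
        past = x[n - p:n][::-1]                 # [x_{n-1}, ..., x_{n-p}]
        x[n] = -a @ past + sigma * rng.standard_normal()
    return x[p:]

x = simulate_ar(a=[-1.8, 0.98], sigma=1.0, n_samples=100)  # a stable AR(2) case
```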

If the required prior, $f(\theta)$, is chosen to be conjugate to the observation model (2) [3], then the functional forms of the prior and posterior are identical under update (3). Since the model (2) belongs to the exponential family, both a conjugate prior and sufficient statistics are available, of the Normal-inverse-Gamma ($\mathcal{N}i\mathcal{G}$) type [3]:


$\mathcal{N}i\mathcal{G}_{a,\sigma}(V, \nu) = \frac{\sigma^{-\nu}}{\zeta_{\mathcal{N}i\mathcal{G}}(V, \nu)} \exp\left\{-\frac{1}{2}\sigma^{-2}\,[-1, a']\,V\,[-1, a']'\right\}$, (4)


$\zeta_{\mathcal{N}i\mathcal{G}}(V, \nu) = \Gamma(0.5\nu)\,\lambda^{-0.5\nu}\,|V_{aa}|^{-0.5}\,(2\pi)^{0.5p}$, (5)

$V = \begin{bmatrix} V_{11} & V_{a1}' \\ V_{a1} & V_{aa} \end{bmatrix}$, (6)

where (6) denotes the partitioning of $V \in \mathbb{R}^{(p+1)\times(p+1)}$ into blocks, with $V_{11}$ being the (1,1) element, and $\lambda = V_{11} - V_{a1}' V_{aa}^{-1} V_{a1}$. $V, \nu$ are the sufficient statistics of $\mathcal{N}i\mathcal{G}_{a,\sigma}(\cdot)$; $|\cdot|$ denotes the matrix determinant.

The statistics of the conjugate prior distribution, $V_0, \nu_0$, are chosen to reflect our initial knowledge of the parameters. If we do not have any preference, we use a diffuse distribution. Typically $V_0 = \epsilon_1 I_{p+1}$, $\nu_0 = \epsilon_2$, where $I_{p+1}$ is the $(p+1) \times (p+1)$ identity matrix, and $\epsilon_1, \epsilon_2$ are small positive scalars. Substituting (2) into (3) and invoking (4) at time $n-1$, the posterior distribution at time $n > p$ is

$f(\theta \mid X_n) = \mathcal{N}i\mathcal{G}_{a,\sigma}(V_n, \nu_n)$, (7)

$V_n = V_{n-1} + \bar{x}_n \bar{x}_n'$, (8)

$\nu_n = \nu_{n-1} + 1$. (9)

Here, $\bar{x}_n = [x_n, \mathbf{x}_n']'$ is the extended regression vector. (8) will be called a dyadic update in this paper. Since the recursion begins at $n = p + 1$, we choose $V_p = V_0$ and $\nu_p = \nu_0$. This is equivalent to choosing the distribution on parameters to be invariant for $n \le p$.

The following moments of (7) will be required later in the paper:

$\hat{a}_n = V_{aa;n}^{-1} V_{a1;n}$, (10)

$\widehat{\sigma^{-2}}_n = \nu_n / \lambda_n$, (11)

$\widehat{\ln \sigma^{-1}}_n = \tfrac{1}{2}\left[\psi(0.5\nu_n) - \ln(0.5\lambda_n)\right]$, (12)

where, for example, $\hat{a}_n$ denotes the expected value of $a$ with respect to the distribution $f(a \mid X_n)$, and $\lambda_n$ is the Schur complement of (6) evaluated at $V_n$. In (12), $\psi(\cdot)$ denotes the Digamma function [4].
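For illustration, the dyadic update (8,9) and the moments (10)-(12) can be coded directly. The sketch below follows the moment formulae as reconstructed above, so the exact constants inherit that caveat; all identifiers are illustrative:

```python
import numpy as np
from scipy.special import digamma

def dyadic_update(V, nu, x_n, x_reg):
    """Stationary update (8,9): V_n = V_{n-1} + xbar xbar', nu_n = nu_{n-1} + 1,
    with extended regression vector xbar = [x_n, x_{n-1}, ..., x_{n-p}]'."""
    xbar = np.concatenate(([x_n], x_reg))
    return V + np.outer(xbar, xbar), nu + 1.0

def nig_moments(V, nu):
    """Moments (10)-(12) of the NiG posterior, using the partitioning (6)."""
    Va1, Vaa = V[1:, 0], V[1:, 1:]
    a_hat = np.linalg.solve(Vaa, Va1)                    # (10)
    lam = V[0, 0] - Va1 @ a_hat                          # Schur complement, lambda_n
    inv_sigma2_hat = nu / lam                            # (11)
    ln_inv_sigma_hat = 0.5 * (digamma(0.5 * nu) - np.log(0.5 * lam))  # (12)
    return a_hat, lam, inv_sigma2_hat, ln_inv_sigma_hat
```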

3. ESTIMATION FOR A NON-STATIONARY AR MODEL USING A KNOWN FORGETTING FACTOR

The stationary assumption, above, is rarely met in practice. Ideally, then, a model of parameter variations is required, using, for example, a hidden state variable modelling approach [5] and the Kalman filter. When not available, the estimation problem is under-determined, obviating the full Bayesian solution. The standard batch (off-line) algorithm uses windowing. Alternatively, the concept of forgetting [2] is used in adaptive signal processing [6], recursive estimation [7], and in particle filtering approaches [8].

A Bayesian treatment of forgetting was developed in [9]. There, the missing model of parameter evolution is optimally approximated (in the sense of minimum Kullback-Leibler distance) via a probabilistic operator:

$f(\theta_n \mid X_{n-1}, \phi_n) \propto \left[f(\theta_{n-1} \mid X_{n-1})\right]^{\phi_n}\big|_{\theta_n}\; \bar{f}(\theta_n \mid X_{n-1})^{1-\phi_n}$. (13)

The notation $f(\cdot)\big|_{\theta_n}$ indicates the replacement of the argument of $f(\cdot)$ by $\theta_n$, where $\theta_n$ is the time-varying unknown parameter set at time $n$. $\bar{f}(\cdot)$ is a pre-selected alternative distribution, expressing auxiliary knowledge about $\theta_n$ at time $n$. The coefficient $\phi_n$, $0 \le \phi_n \le 1$, is known as a forgetting factor. Note that the dependence of (13) on $\theta_{n-1}$ is replaced by dependence on the sequence, $\phi_i$, $i = 1, \ldots, n$, a fact suppressed in the notation of (13).

In this paper, the $\mathcal{N}i\mathcal{G}$ distribution with parameters $\bar{V}, \bar{\nu}$ is used as the alternative. It is typically chosen as a diffuse distribution, with the same parameter values as the prior: $\bar{V} = V_0$, $\bar{\nu} = \nu_0$. The $\mathcal{N}i\mathcal{G}$ conjugate family (4) is closed under the convex combining (i.e. geometric mean) in (13), yielding another member of the same family. Substituting (13) and (2) into the time-varying form of (3), the following recursive update of the $\mathcal{N}i\mathcal{G}$ statistics is revealed:

$V_n = \phi_n V_{n-1} + \bar{x}_n \bar{x}_n' + (1 - \phi_n)\bar{V}$, (14)

$\nu_n = \phi_n \nu_{n-1} + 1 + (1 - \phi_n)\bar{\nu}$. (15)

When $\phi_n = 1$, the update is identical to the stationary equations (8,9).
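In code, the known-forgetting update (14,15) is a small modification of the stationary sketch above (again a hedged illustration, with illustrative names):

```python
import numpy as np

def forgetting_update(V, nu, xbar, phi, V_bar, nu_bar):
    """Update (14,15); phi = 1 recovers the stationary update (8,9)."""
    V_new = phi * V + np.outer(xbar, xbar) + (1.0 - phi) * V_bar
    nu_new = phi * nu + 1.0 + (1.0 - phi) * nu_bar
    return V_new, nu_new
```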

4. EXTENSION TO AN UNKNOWN FORGETTING FACTOR

A time-varying sequence of forgetting factors, $\phi_n$, can be accommodated by (13), permitting variable non-stationary dynamics to be modelled. No guidance exists for its choice, however, and so a known constant is used in practice. In this Section, we derive a Bayesian technique for posterior inference of the forgetting-factor sequence, $\phi_n$, in tandem with the AR model parameters, $\theta_n$. Specifically, the task is to elicit the posterior distribution, $f(\theta_n, \phi_n \mid X_n)$. Using Bayes' rule:

$f(\theta_n, \phi_n \mid X_n) \propto f(x_n \mid \theta_n, X_{n-1})\, f(\theta_n \mid X_{n-1}, \phi_n)\, f(\phi_n)$. (16)

In the first term on the right-hand side (given by (2)), it is assumed that the observations, $x_n$, are conditionally independent of $\phi_n$, given the model parameters, $\theta_n$. The second term on the right-hand side is given by (13). The third term on the right-hand side follows from the assumption that $\phi_n$ is independent of the previous data, $X_{n-1}$, $\forall n$. $f(\phi_n)$ is chosen conservatively, being uniform on the interval $[0, 1]$;



i.e. $f(\phi_n) = \mathcal{U}([0, 1])$. The resulting posterior distribution (16) is not analytically tractable. Therefore, we seek a suitable approximation. In this paper, the approximation is forced to exhibit posterior independence between $\theta_n$ and $\phi_n$:

$f(\theta_n, \phi_n \mid X_n) \approx \tilde{f}(\theta_n \mid X_n)\,\tilde{f}(\phi_n \mid X_n)$, (17)

where the distributions, $\tilde{f}(\cdot)$, are chosen appropriately. This is the approximating function class used in the Variational Bayes procedure, reviewed next.

4.1. Variational Bayes (VB) approximation

Theorem 1 (Variational Bayes) Let $f(\theta_1, \theta_2 \mid X)$ be a posterior pdf of $\theta_1, \theta_2$, given data $X$. Let $\tilde{f}(\theta_1, \theta_2 \mid X)$ be an approximate pdf restricted to the set of conditionally independent distributions on $\theta_1, \theta_2$:

$\tilde{f}(\theta_1, \theta_2 \mid X) = \tilde{f}_1(\theta_1 \mid X)\,\tilde{f}_2(\theta_2 \mid X)$. (18)

Then, the minimum of the Kullback-Leibler (KL) distance [10],

$\{\tilde{f}_1(\theta_1 \mid X), \tilde{f}_2(\theta_2 \mid X)\} = \arg\min_{\tilde{f}_1, \tilde{f}_2} \mathrm{KL}\left(\tilde{f}(\theta_1, \theta_2 \mid X)\,\middle\|\,f(\theta_1, \theta_2 \mid X)\right)$, (19)

is reached for

$\tilde{f}_1(\theta_1 \mid X) \propto \exp\left(E_{\theta_2 \mid X}\left(\ln f(\theta_1, \theta_2, X)\right)\right)$, (20)

$\tilde{f}_2(\theta_2 \mid X) \propto \exp\left(E_{\theta_1 \mid X}\left(\ln f(\theta_1, \theta_2, X)\right)\right)$. (21)

We will refer to (20,21) as the VB-optimal posteriors. Functions of $X$ arising in (20,21) will be called VB-statistics. $E_{(\cdot)}(\cdot)$ denotes expectation of the argument with respect to the other VB-posterior.

The VB-statistics, which parameterize $\tilde{f}_1(\cdot)$ (20), are needed for evaluation of $\tilde{f}_2(\cdot)$ (21), and vice-versa. Hence, the VB solution (20,21) is usually not available in closed form. It is found by iteration to convergence of the following algorithm, which is a stochastic generalization of the classical Expectation-Maximization (EM) algorithm.

Algorithm 4.1 (Variational EM (VEM)) Cyclic iteration of the following steps converges to a solution of (20,21):

E-step: compute the approximate distribution of $\theta_2$ at iteration $i$:

$\tilde{f}_2^{(i)}(\theta_2 \mid X) \propto \exp\left(\int_{\theta_1} \tilde{f}_1^{(i-1)}(\theta_1 \mid X) \ln f(\theta_1, \theta_2, X)\,d\theta_1\right)$. (22)

M-step: using the approximate distribution from the $i$th E-step, compute the approximate distribution of $\theta_1$ at iteration $i$:

$\tilde{f}_1^{(i)}(\theta_1 \mid X) \propto \exp\left(\int_{\theta_2} \tilde{f}_2^{(i)}(\theta_2 \mid X) \ln f(\theta_1, \theta_2, X)\,d\theta_2\right)$. (23)

Convergence of the algorithm was proven in [11].
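To make the alternation concrete, consider a standard toy instance of Theorem 1 and Algorithm 4.1 that is not the model of this paper: i.i.d. Gaussian data with unknown mean $\mu$ and precision $\tau$, a Normal-Gamma prior, and the factorization $\tilde{f}(\mu)\tilde{f}(\tau)$. The two steps then reduce to exchanging closed-form moments (the prior constants below are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(2.0, 0.5, size=200)       # synthetic data
N, ybar = len(y), y.mean()

mu0, beta0, a0, b0 = 0.0, 1.0, 1.0, 1.0  # Normal-Gamma prior (illustrative)

E_tau = a0 / b0                          # initial guess for E[tau]
for i in range(100):
    # Step analogous to (20): q(mu) = N(mu_N, 1/lam_N).
    mu_N = (beta0 * mu0 + N * ybar) / (beta0 + N)
    lam_N = (beta0 + N) * E_tau
    # Step analogous to (21): q(tau) = Gamma(a_N, b_N).
    a_N = a0 + 0.5 * (N + 1)
    b_N = b0 + 0.5 * (np.sum((y - mu_N) ** 2) + N / lam_N
                      + beta0 * ((mu_N - mu0) ** 2 + 1.0 / lam_N))
    E_tau_prev, E_tau = E_tau, a_N / b_N
    if abs(E_tau - E_tau_prev) < 1e-9:   # iterate to convergence
        break
```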

4.2. VB-conjugate on-line inference

We intend to use the VB approximation (Theorem 1) at each step of the on-line update (16). VB-optimal posteriors (20,21) remain functionally invariant during the update from $n-1$ to $n$ if they are drawn from a family of distributions closed under the combination of (i) exponential forgetting, (ii) Bayes updating (16), and (iii) the VB approximation. This requirement, called VB-conjugacy [12], ensures a tractable on-line update of sufficient statistics. It may be shown [12] that the following choices satisfy VB-conjugacy:

$\tilde{f}(\theta_{n-1} \mid X_{n-1}) = \mathcal{N}i\mathcal{G}(V_{n-1}, \nu_{n-1})$, (24)

$\bar{f}(\theta_n \mid X_{n-1}) = \mathcal{N}i\mathcal{G}(\bar{V}, \bar{\nu})$. (25)

In Section 4, the choice $f(\phi_n \mid X_{n-1}) = f(\phi_n) = \mathcal{U}([0, 1])$ was made, and so VB-conjugacy is not required for $\phi_n$.

Substituting (24) and (25) into (13), and the result, along with (2), into (16), the VB-optimal posterior for $\theta_n$, using Theorem 1, is:

$\tilde{f}(\theta_n \mid X_n) = \mathcal{N}i\mathcal{G}(V_n, \nu_n)$, (26)

$V_n = \hat{\phi}_n V_{n-1} + \bar{x}_n \bar{x}_n' + (1 - \hat{\phi}_n)\bar{V}$, (27)

$\nu_n = \hat{\phi}_n \nu_{n-1} + 1 + (1 - \hat{\phi}_n)\bar{\nu}$. (28)

Hence, the updates of the VB-statistics for $\theta_n$ are identical to the sufficient-statistic updates in the case of a known forgetting factor (14,15), $\phi_n$, but with the latter replaced by $\hat{\phi}_n$. This is the expected value of $\phi_n$ with respect to the VB-optimal posterior, $\tilde{f}(\phi_n \mid X_n)$, at time $n$.

4.3. Evaluation of the VB-statistic, $\hat{\phi}_n$

$\tilde{f}(\phi_n \mid X_n)$ is analytically intractable, since it is normalized with respect to the term

$\zeta(\phi_n) = \zeta_{\mathcal{N}i\mathcal{G}}\left(V(\phi_n), \nu(\phi_n)\right)$, (29)

where

$V(\phi_n) = \phi_n V_{n-1} + (1 - \phi_n)\bar{V}$, (30)

$\nu(\phi_n) = \phi_n \nu_{n-1} + (1 - \phi_n)\bar{\nu}$, (31)

with $\zeta_{\mathcal{N}i\mathcal{G}}(\cdot, \cdot)$ given by (5). Note that the recursion in (30) is onto the VB-statistic, $V_{n-1}$, given by (27), and ditto for (31). From (30,31), the values of $\zeta(\phi_n)$ at the extrema of $\phi_n$ are:

$\zeta(0) = \zeta_{\mathcal{N}i\mathcal{G}}(\bar{V}, \bar{\nu})$, (32)

$\zeta(1) = \zeta_{\mathcal{N}i\mathcal{G}}(V_{n-1}, \nu_{n-1})$, (33)

with $V_{n-1}$ and $\nu_{n-1}$, the VB-statistics at the last step, being given by (27) and (28) respectively. We now propose a tractable approximation of (29) which matches these endpoints.



Proposition 1 (Approximation of Normalizing Constant)

$\zeta(\phi_n) \approx \exp(h_1 + h_2 \phi_n)$. (34)

Matching the extrema (32,33), then:

$h_1 = \ln \zeta_{\mathcal{N}i\mathcal{G}}(\bar{V}, \bar{\nu})$, (35)

$h_2 = \ln \zeta_{\mathcal{N}i\mathcal{G}}(V_{n-1}, \nu_{n-1}) - \ln \zeta_{\mathcal{N}i\mathcal{G}}(\bar{V}, \bar{\nu})$. (36)

Now, the VB-optimal distribution for $\phi_n$ is greatly simplified:

$\tilde{f}(\phi_n \mid X_n) \approx \mathcal{E}xp(a)\,\mathcal{U}([0, 1])$, (37)

being the truncated exponential distribution of parameter $a$, on support $[0, 1]$. Its parameter is

$a = (\nu_{n-1} - \bar{\nu})\,\widehat{\ln \sigma^{-1}}_n - \tfrac{1}{2}\mathrm{tr}\left((V_{aa;n-1} - \bar{V}_{aa})\,V_{aa;n}^{-1}\right) - \ln \zeta_{\mathcal{N}i\mathcal{G}}(V_{n-1}, \nu_{n-1}) + \ln \zeta_{\mathcal{N}i\mathcal{G}}(\bar{V}, \bar{\nu}) - \tfrac{1}{2}\widehat{\sigma^{-2}}_n\,[-1, \hat{a}_n']\,(V_{n-1} - \bar{V})\,[-1, \hat{a}_n']'$. (38)

The VB-moments, $\hat{a}_n$, $\widehat{\sigma^{-2}}_n$, $\widehat{\ln \sigma^{-1}}_n$, are given by (10,11,12), using the VB-statistics (27,28). $\mathrm{tr}(\cdot)$ denotes the trace of the matrix. The required mean of (37) is given by:

$\hat{\phi}_n = \dfrac{\exp(a)(1 - a) - 1}{a\,(1 - \exp(a))}$. (39)

$\hat{\phi}_n$ (39) is a function of $V_n$ (27) and $\nu_n$ (28), via (10,11,12). Meanwhile, $V_n$ and $\nu_n$ are functions of $\hat{\phi}_n$. Hence, $\hat{\phi}_n$ is obtained using the iterative VEM algorithm (Algorithm 4.1), yielding the VB-optimal AR parameter inference at time $n$ (26). Note that, under approximation (34), this VB-conjugate distribution for $\theta_n$ is undisturbed.

5. TRACKING AN AR PROCESS WITH ABRUPT PARAMETER CHANGEPOINTS

A univariate, second-order (i.e. $\bar{x}_n = [x_n, x_{n-1}, x_{n-2}]'$), stable AR model is simulated, with parameters $\sigma = 1$ and

$a = [-1.8, 0.98]'$ or $a = [0.29, 0.98]'$.

Abrupt switching between these two cases occurs every 30 samples. The alternative distribution (25) parameters are chosen to be

$\bar{V} = \mathrm{diag}([1, 0.001, 0.001])$, $\bar{\nu} = 10$. (40)

The prior distribution, $\tilde{f}(\theta_p \mid X_p)$ (26), is chosen equal to the alternative distribution. At each time-step, the VEM algorithm is initialized with $\hat{\phi}_n^{(1)} = 0.7$, and stopped when $|\hat{\phi}_n^{(m)} - \hat{\phi}_n^{(m-1)}| < 0.001$.

[Figure 1 appears here; one panel is labelled 'on-line VB'.]

Fig. 1. Estimation of a non-stationary AR process using time-varying forgetting. In sub-figures 2-4, full lines denote simulated values of parameters, dashed lines denote VB-posterior expected values, and dotted lines denote uncertainty bounds.

The required number of iterations, $m$, to reach convergence is recorded. As a variant of this scheme, an 'on-line' VB algorithm allows only two iterations of the VEM algorithm per time-step. Results of parameter estimation are displayed in Figure 1.
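The simulated data of this Section can be reproduced (up to the random seed) by the short sketch below; the estimation loop would then call a per-time-step routine such as the vem_step sketch of Section 4.3, initialized with $\hat{\phi}_n^{(1)} = 0.7$:

```python
import numpy as np

rng = np.random.default_rng(1)
a1, a2 = np.array([-1.8, 0.98]), np.array([0.29, 0.98])
sigma, n_total, seg = 1.0, 120, 30

x = np.zeros(n_total + 2)
for n in range(2, n_total + 2):
    a = a1 if ((n - 2) // seg) % 2 == 0 else a2   # abrupt switch every 30 samples
    x[n] = -a @ x[n - 2:n][::-1] + sigma * rng.standard_normal()
x = x[2:]

V_bar = np.diag([1.0, 0.001, 0.001])   # alternative statistics (40)
nu_bar = 10.0
V, nu = V_bar.copy(), nu_bar           # prior chosen equal to the alternative
# for n in range(2, n_total):
#     xbar = np.array([x[n], x[n - 1], x[n - 2]])
#     V, nu, phi_hat = vem_step(V, nu, xbar, V_bar, nu_bar, phi_init=0.7)
```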

The algorithm promptly detects the changepoints. It achieves this by switching the estimated forgetting factor automatically to a value close to zero ($\hat{\phi}_n = 0.05$ when $n = 33$). This causes an abrupt 'statistics dump' of the accumulated VB-statistics, $V_n, \nu_n$, and their re-initialization with the alternative (i.e. prior) values, $\bar{V}, \bar{\nu}$. Thus, the estimation process is effectively restarted. Note that the required number of iterations of the VEM algorithm is significantly higher at the changepoints. Therefore, at these points, $\hat{\phi}_n$, obtained using the on-line VEM variant, is higher than the value found by allowing the algorithm to iterate to convergence (Figure 1). For comparison, estimates obtained using a fixed, known forgetting factor (14,15), $\phi_n = 0.9$, $\forall n$, are displayed in the 4th sub-figure. Clearly, parameter tracking



is greatly improved using the on-line estimation of $\hat{\phi}_n$, even when the VEM iterations are stopped before convergence. This should be the case whenever parameter variations are not too rapid.

6. DISCUSSION

The main drawback of the proposed VEM algorithm is that convergence to a solution at each time-step is not assured within a bounded number of iterations. This can present problems for on-line processing. The issue was addressed in [11] for stationary data. Using only one iteration of the VEM algorithm at each time-step, it was proved that the algorithm converges asymptotically to the true time-invariant posterior. However, no such result is available for non-stationary models.

A known, time-invariant forgetting factor is used in many estimation methods for non-stationary processes [7]. Previous attempts have been made to relax the assumption of an a priori known forgetting factor, particularly in Recursive Least Squares (RLS) algorithms. The method presented in [13] is the closest to our approach. It uses a gradient-based approach for estimation of the forgetting factor. However, the criterion of asymptotic MSE, minimized in [13], can only cope with slow parameter variations.

Clearly, $\hat{\phi}_n$ plays a critical role in the detection of non-stationarities in the data and, consequently, in the quality of parameter estimation. In this paper, the simple approximation (Proposition 1), which yielded a closed-form expression for $\hat{\phi}_n$ (39), influences the quality of the tracking results. Further research may suggest better approximations for critical applications.

Finally, in on-line processing, the maximum number of VEM iterations per time-step should be determined by the sampling period of the data. If this number is too low, performance of the algorithm will again be degraded.

7. CONCLUSION

The principle of VB-conjugacy has led to a method for approximate joint Bayesian estimation of non-stationary AR parameters, $\theta_n$, in tandem with a time-varying forgetting factor, $\phi_n$. On-line estimation, $\hat{\phi}_n$, of the latter proved to be of significance, since (i) $\hat{\phi}_n$ responds sensitively to changepoints in the data, and (ii) it weights the accumulated statistics against alternative values. During non-stationary behaviour, $\hat{\phi}_n$ falls, allowing the statistics to be partially re-initialized. Detection of changepoints, and improved parameter tracking, are the direct consequences, as compared with traditional stationary forgetting. An iterative VEM procedure is required at each time-step.

8. REFERENCES

[1] J. Makhoul, "Linear prediction: A tutorial review," Proceedings of the IEEE, vol. 63, no. 4, pp. 561-580, 1975.

[2] A. H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press, New York, 1970.

[3] J. M. Bernardo and A. F. M. Smith, Bayesian Theory, John Wiley & Sons, Chichester, New York, Brisbane, Toronto, Singapore, 1997, 2nd edition.

[4] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Dover Publications, Inc., New York, 1972.

[5] M. Cassidy and W. D. Penny, "Bayesian nonstationary autoregressive models for biomedical signal analysis," IEEE Transactions on Biomedical Engineering, vol. 49, no. 10, 2002.

[6] G. V. Moustakides, "Locally optimum adaptive signal processing algorithms," IEEE Transactions on Signal Processing, vol. 46, no. 12, pp. 3315-3325, 1998.

[7] L. Ljung and T. Söderström, Theory and Practice of Recursive Identification, MIT Press, Cambridge; London, 1983.

[8] P. M. Djurić, J. H. Kotecha, F. Esteve, and E. Perret, "Sequential parameter estimation of time-varying non-Gaussian autoregressive processes," EURASIP Journal on Applied Signal Processing, vol. 2002, no. 8, 2002.

[9] R. Kulhavý and M. B. Zarrop, "On a general concept of forgetting," International Journal of Control, vol. 58, no. 4, pp. 905-924, 1993.

[10] S. J. Roberts and W. D. Penny, "Variational Bayes for generalized autoregressive models," IEEE Transactions on Signal Processing, vol. 50, no. 9, pp. 2245-2257, 2002.

[11] M. Sato, "Online model selection based on the Variational Bayes," Neural Computation, vol. 13, pp. 1649-1681, 2001.

[12] V. Šmídl, The Variational Bayes Approach in Signal Processing, Ph.D. Thesis, Trinity College Dublin, Dublin, Ireland, 2004.

[13] C.-F. So, S. C. Ng, and S. H. Leung, "Gradient based variable forgetting factor RLS algorithm," Signal Processing, vol. 83, pp. 1163-1175, 2003.
