
1106 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 42, NO. 8, AUGUST 1997

Polynomial Filtering of Discrete-Time Stochastic Linear Systems with Multiplicative State Noise

Francesco Carravetta, Alfredo Germani, and Massimo Raimondi

Abstract: In this paper, the problem of finding an optimal polynomial state estimate for the class of stochastic linear models with a multiplicative state noise term is studied. For such models, a technique of state augmentation is used, leading to the definition of a general polynomial filter. The theory is developed for time-varying systems with nonstationary and non-Gaussian noises. Moreover, the steady-state polynomial filter for stationary systems is also studied. Numerical simulations show the high performance of the proposed method with respect to the classical linear filtering techniques.

Index Terms: Kalman filter, Kronecker algebra, polynomial filter, stochastic bilinear systems, stochastic stability.

I. INTRODUCTION

SYSTEMS with multiplicative state noise, also known in the literature as bilinear stochastic systems (BLSS’s), have been widely studied since the 1960’s because, from an engineering point of view, they constitute a more adequate mathematical model for the analysis and control of some important physical processes. In particular, we stress that bilinear models are often derived from basic principles in chemistry, biology, ecology, economics, physics, and engineering [3]. Moreover, the well-known bilinear systems (BLS’s) become BLSS’s when the input is affected by additive noise.

In control engineering, BLS’s are appealing for their better controllability with respect to the linear ones [2]. In this framework, considerable importance is devoted to control and stabilization problems, as shown in [5]–[11]. The problem of parameter estimation for BLS’s and BLSS’s was considered in [12]–[15].

The state estimation problem for BLSS’s constitutes an important topic in all those cases in which the state itself is not available directly. In [4], the filtering problem for linear control systems is considered. In [16], the same problem, for a class of nonlinear systems including the bilinear ones, is studied, and a linear filter is obtained by considering the nonlinear term as an additive noise. BLSS’s can be considered as linear systems whose dynamic matrices are a random process and vice versa

Manuscript received August 11, 1995; revised October 29, 1996. Recommended by Associate Editor, E. Yaz. This work was supported in part by MURST and by Progetto Finalizzato Trasporti 2 under Grant 4.2.1.

F. Carravetta and M. Raimondi are with the Istituto di Analisi dei Sistemi e Informatica del CNR, Viale Manzoni 30, 00185 Roma, Italy.

A. Germani is with the Dipartimento di Ingegneria Elettrica, Università dell’Aquila, 67100 Monteluco (L’Aquila), Italy, and Istituto di Analisi dei Sistemi e Informatica del CNR, Viale Manzoni 30, 00185 Roma, Italy (e-mail: [email protected]).

Publisher Item Identifier S 0018-9286(97)05068-X.

[26]–[31]. In [17] and [18], following this interpretation, a linear filtering technique for the state estimation of BLSS’s is proposed.

In this paper we consider the following class of BLSS’s:

(1)

(2)

where and are white sequences (not necessarily Gaussian) in and , respectively, and are matrices of suitable dimensions, whereas is a bilinear map. Moreover, we will assume the independence of and .

The problem we would like to face is the filtering of the state , given the measurement process . It is well known that when , this problem is solved by the famous Kalman filter, which yields the linear minimum variance optimal state estimate (actually optimal among all filters in the Gaussian case) [32]. The general case is, until now, unsolved. As mentioned above, a suboptimal solution can be obtained by substituting the stochastic forcing term in (1), namely

(3)

by a process having the same first- and second-order properties. Indeed, it is readily proved that is a white sequence so that the Kalman filter can be implemented in order to have the optimal linear estimate. Of course, the stochastic sequence given by (3) is not Gaussian so that the Kalman filter does not give the optimal estimate. Recently, the problem of finding nonlinear filters for non-Gaussian linear models has been considered. In particular, a quadratic filter is proposed in [19], and its extension to a more general polynomial case is considered in [20].

In this paper, we are able to define a filter for a BLSS such as (1) and (2), which is optimal in a class of polynomial transformations. We also stress that a Gaussian-noise setting is meaningful in the present case. The theory developed here includes, as a particular case, the one described in [20], which can be simply obtained by setting to zero the bilinear form in (1). It should also be noted that a converse point of view could be adopted in that a way of constructing a polynomial filter for BLSS’s could be to compute all moments of the stochastic forcing term (3) and then use

the polynomial filter for linear non-Gaussian systems defined in [20]. However, this way is not convenient at all. Indeed, the computation of the moments of (3) requires the computation of the state moments. The application of the procedure described in [20] for state-moments computation leads, in this case, to a very cumbersome nonlinear equation, giving very hard implementation problems that are difficult to analyze as far as its convergence properties are concerned. In the general polynomial case, it is much more convenient to assume, as a starting point for the development of the theory, the representation (just used in [17] and [18]) of the BLSS (1), (2) as a linear system with a stochastic dynamical matrix. In this framework, in order to obtain a self-contained general solution of the polynomial filtering problem for the class of the BLSS, here we will adopt just the basic strategy described in [20]. The resulting algorithm will be sufficiently general to include as a very particular case the polynomial filter for the linear non-Gaussian systems.

Roughly speaking, the method used here consists of defining a linear system whose state and output processes include Kronecker powers and products of the original state and output processes so that it is amenable to be treated with Kalman filtering theory. For this purpose, the main tool is the Kronecker algebra. Some important formulas about this subject are also deduced (e.g., the expression of the Kronecker power of a vector polynomial).
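As a rough illustration of the Kronecker powers mentioned above, the following Python sketch builds the powers of a small vector and stacks them, together with a unit element, into a single longer vector. The function names (`kron`, `kron_power`, `augmented_vector`) and the stacking order are assumptions made here for illustration; they are not taken from the paper, whose exact definitions are given in Section V and Appendix B.

```python
def kron(u, v):
    """Kronecker product of two vectors stored as flat lists."""
    return [a * b for a in u for b in v]


def kron_power(v, i):
    """i-th Kronecker power of v; by convention the 0-th power is the scalar 1."""
    out = [1]
    for _ in range(i):
        out = kron(out, v)
    return out


def augmented_vector(v, degree):
    """Stack 1, v, v^[2], ..., v^[degree] into one vector (assumed ordering)."""
    out = []
    for i in range(degree + 1):
        out.extend(kron_power(v, i))
    return out


y = [2, 3]
print(kron_power(y, 2))        # [4, 6, 6, 9]
print(augmented_vector(y, 2))  # [1, 2, 3, 4, 6, 6, 9]
```

For an n-dimensional vector the stacked vector has 1 + n + n^2 + ... + n^degree entries, which is why the dimension of the augmented system grows quickly with the polynomial degree.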

We stress that, in the present case, the existence of a stable solution for the polynomial filter is not guaranteed simply by the stability of the dynamic matrix as in the linear case.

The paper is organized as follows: in Section II, we recall some notions in estimation theory which are essential to better understand the meaning of polynomial estimate. In this framework, we define the class of polynomial estimators and recursive algorithms which we will use later. In Section III, we make precise the problem statement, and Sections IV and V explain how to build up the augmented system. In Section VI, the way to implement the filter on the augmented system is described. In Section VII, we present the stationary case and the steady-state theory. Section VIII contains some remarks about the computer implementation of the algorithm. In Section IX, numerical simulations are presented showing the high performance of polynomial filtering with respect to the standard linear methods. Two appendixes are included: Appendix A, containing the proof of the main theorem of the paper defining the augmented system, and Appendix B, where the main definitions and properties about Kronecker algebra are reported together with some new results.

II. POLYNOMIAL ESTIMATES

Our aim is to improve the performance of standard linear filtering for the class of the BLSS (1), (2). For this purpose we will look for the optimal filter among the class of estimators constituted by all the fixed-degree causal polynomial transformations of the measurements. We now clarify this point by giving some definitions which will be useful in the following.

Let be a probability space. For any given sub-algebra of and integer , let us denote by

the Banach space of the -dimensional -measurable random variables with finite th moment as

measurable,

where is the Euclidean norm in . Moreover, when is the -algebra generated by a random variable , that is , we will use the notation to indicate

. Finally, if is a closed subspace of , we will use the symbol to indicate the orthogonal projection of onto .

As is well known, the optimal minimum variance estimate of a random variable with respect to a random variable , that is , is given by the conditional expectation (C.E.) . If and are jointly Gaussian, then the C.E. is the following affine transformation of :

(4)

where . Moreover, defining

(4) also can be interpreted as the projection on the subspace

such that

Unfortunately, in the non-Gaussian case, no simple characterization of the C.E. can be achieved. Consequently, it is worthwhile to consider suboptimal estimates which have a simpler mathematical structure that allows the treatment of real data. The simplest suboptimal estimate is the optimal affine one, that is , which is still given by the right-hand side (RHS) of (4). In the following, such an estimate will be denoted with and shortly called the optimal linear estimate. Suboptimal estimates comprised between the optimal linear and the C.E. can be considered by projecting onto subspaces, greater than , like subspaces of polynomial transformations of Y. We define the th degree space of the polynomial transformations of as the following (closed) subspace of :

where the symbol denotes the Kronecker power (see Appendix B). By defining the vector

(5)

we have that


We define the th-order polynomial estimate as the random variable . Since

the polynomial estimate improves (in terms of error variance) the linear one. Let be the closure in of

=

since, in general, for we cannot assert that the polynomial estimate “approaches” the optimal one for increasing polynomial degrees. Nevertheless, the C.E. of

can be decomposed as

where is the orthogonal subspace of . From the previous relation we infer that the polynomial estimate can be considered as an approximation of the optimal one only when

is suitably small. However, the polynomial estimate always yields an improvement with respect to the performance of a linear estimator. Moreover, we can calculate it by suitably modifying the space of observed random variables and using (4)

(6)

where

Now, let us consider a sequence of random variables in and another of observed ones in .

The problem of estimating , given , can be solved by defining the vector

and applying (4) with so that the optimal linear estimate of is obtained. When the joint sequence is Gaussian, (4) yields the optimal estimate

. Similarly, if the moments are finite and known, the th-order

polynomial estimate can be obtained by extending the vector as in (5). However, such a method is highly inefficient,

because it leads to a fast growth of the dimensions of the involved matrices so that it is not very useful from an application point of view. A more realistic approach should consist of searching for a recursive algorithm able to yield the above estimates. For this purpose, we give the following definition.

Definition 2.1: We say that the estimate of (not necessarily optimal) is recursive of order if there exists a sequence of random variables and transformations

such that the following equations hold:

(7)

(8)

As is well known, in the Gaussian case and when the sequences are the state and output evolutions of a linear discrete-time dynamic system, the optimal estimator of the state satisfies a recursive equation as in (7), (8), with given by a suitable linear transformation, ( identity matrix) and (the Kalman filter). The same equations give, in the non-Gaussian case, the optimal linear estimate.

In the next section, we will prove that when and are the state and output processes, respectively, for a BLSS as in (1), (2), it is possible to find a structure (7), (8) where has the form

with linear and polynomial such that (7) and (8) yield the sequence of optimal estimates in a certain subclass of all the polynomial transformations of fixed and finite degree. In order to define more precisely this subclass of polynomial estimators, we need to give some preliminary definitions.

Consider the above-defined vector and let be a fixed integer; we define the subspace

as

where the ’s are suitably dimensioned matrices. Since the subspace is finite dimensional, and therefore closed, we have that for any it is possible to orthoproject there the random variable . Then, we can give the following definition.

Definition 2.2: The random variable , given by

is said to be the -order polynomial estimate of .

The random variable represents the optimal estimate of among all the -degree polynomials, including cross products between observations which lie in a time window of width . Since

and

the result is that the estimate quality can only improve for increasing and/or .
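The gain of a polynomial estimate over the optimal linear one can be checked on a toy scalar case. The sketch below is an illustration under assumptions that are not in the paper: the observation Y is uniform on {-1, 0, 1}, the quantity to estimate is X = Y^2, and the helper names (`solve`, `poly_estimate_error`) are hypothetical. The best polynomial estimate of a given degree is obtained by projecting X onto the span of 1, Y, ..., Y^degree, i.e., by solving the normal equations built from the moments of Y.

```python
from fractions import Fraction as F


def solve(M, b):
    """Solve M c = b by Gauss-Jordan elimination over exact fractions."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [e / aug[col][col] for e in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [e - f * p for e, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def poly_estimate_error(support, probs, target, degree):
    """Mean square error of the best degree-`degree` polynomial estimate of target(Y)."""
    def E(g):
        return sum(p * g(y) for y, p in zip(support, probs))

    n = degree + 1
    # Gram matrix and cross moments of the monomials 1, Y, ..., Y^degree.
    gram = [[E(lambda y: F(y) ** (i + j)) for j in range(n)] for i in range(n)]
    cross = [E(lambda y: F(y) ** i * target(y)) for i in range(n)]
    coeffs = solve(gram, cross)

    def estimate(y):
        return sum(c * F(y) ** i for i, c in enumerate(coeffs))

    return E(lambda y: (target(y) - estimate(y)) ** 2)


support = [-1, 0, 1]
probs = [F(1, 3)] * 3
square = lambda y: F(y) ** 2

print(poly_estimate_error(support, probs, square, 1))  # 2/9 (optimal affine estimate)
print(poly_estimate_error(support, probs, square, 2))  # 0   (quadratic estimate)
```

In this example Y and X are uncorrelated, so the optimal affine estimate reduces to the mean of X, while the quadratic estimate recovers X exactly.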

III. THE PROBLEM

The problem we are faced with is the filtering one for the following class of stochastic discrete-time bilinear systems:

(9)

(10)

where, for any

Moreover, , whereas is a bilinear form in . The random variable (the initial condition) and the random sequences satisfy the following conditions for any .

1) There exists an integer such that

2) The initial state forms, together with the sequences , a family of independent random variables.

3) All random sequences are white.

It should be noted that the vector , in (9), due to the bilinearity hypothesis, can be written in the form

where is a suitable matrix and denotes the th entry of the vector . Then, system (9), (10) can be rewritten as

(11)

(12)

where

(13)

System (11), (12) is a linear system with a stochastic dynamic matrix. It is equivalent to the original bilinear system because it generates exactly the same state and output processes. Hence, in order to obtain a state estimate, we can consider this latter system in place of (9), (10).
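The equivalence between the bilinear form and the linear form with a stochastic dynamic matrix rests on a simple identity: a bilinear term of the type sum_i v_i B_i x can be absorbed into the dynamic matrix as (A + sum_i v_i B_i) x. The Python sketch below checks this identity numerically. The names `A`, `B_list`, `x`, `v` and the exact way the noise enters are assumptions made for illustration, since the explicit expressions of (9) and (13) are not reproduced here.

```python
def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def bilinear_term(B_list, x, v):
    """Bilinear map b(x, v) = sum_i v_i * (B_i x)."""
    out = [0] * len(B_list[0])
    for B_i, v_i in zip(B_list, v):
        out = [o + v_i * t for o, t in zip(out, matvec(B_i, x))]
    return out


def stochastic_dynamic_matrix(A, B_list, v):
    """Matrix A + sum_i v_i * B_i obtained for one realization v of the noise."""
    M = [row[:] for row in A]
    for B_i, v_i in zip(B_list, v):
        M = [[m + v_i * b for m, b in zip(mr, br)] for mr, br in zip(M, B_i)]
    return M


A = [[1, 2], [0, 1]]
B_list = [[[0, 1], [1, 0]], [[2, 0], [0, 3]]]
x = [3, -1]
v = [2, 5]

bilinear_form = [a + b for a, b in zip(matvec(A, x), bilinear_term(B_list, x, v))]
linear_form = matvec(stochastic_dynamic_matrix(A, B_list, v), x)
print(bilinear_form, linear_form)  # both are [29, -10]
```

Any additive noise or deterministic input terms are unaffected by this rewriting and are therefore omitted from the sketch.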

Our goal is the determination of a discrete-time filter, that is a recursive algorithm in the form (7), (8), which gives at any time the optimal polynomial state estimate of -order (see Definition 2.1) for the system (9), (10), given all the available observations at time .

In the next sections, it will be shown how to obtain such a polynomial filter. Moreover, in the constant parameter case, conditions will be defined assuring the existence of a stationary polynomial filter.

The approach that follows goes along the same line as in [20], consisting essentially of the transposition of the original problem into a linear filtering one, solvable by means of the Kalman filter. In order to define a polynomial estimator which also takes into account cross products between observations at different times, we need to introduce the following so-called “extended memory system.”

IV. THE EXTENDED MEMORY SYSTEM

Given the system (9), (10), and having chosen an integer , let us define the following vectors:

(14)

with and . Taking into account the equivalent equations (11), (12), we have that satisfy the following relations:

(15)

(16)

where

(17)

(18)

We call (15), (16) an extended memory system.

In the next section, which contains the main result of this paper, we will be able to derive the evolution of the Kronecker powers of the above-defined extended state and output.
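The reason why Kronecker powers of a linearly evolving state again evolve linearly is the mixed-product property of the Kronecker product: (A x) ⊗ (A x) = (A ⊗ A)(x ⊗ x), and similarly for higher powers. The Python sketch below verifies this on a small example; it only illustrates this well-known property, not the full derivation of the augmented system, and the helper names are hypothetical.

```python
def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def kron_vec(u, v):
    """Kronecker product of two vectors."""
    return [a * b for a in u for b in v]


def kron_mat(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


A = [[1, 2], [3, 4]]
x = [1, -1]

# Second Kronecker power: propagate x first, or propagate x^[2] with A^[2].
Ax = matvec(A, x)
left2 = kron_vec(Ax, Ax)
right2 = matvec(kron_mat(A, A), kron_vec(x, x))

# Third Kronecker power, built by one more Kronecker multiplication.
left3 = kron_vec(left2, Ax)
right3 = matvec(kron_mat(kron_mat(A, A), A), kron_vec(kron_vec(x, x), x))

print(left2 == right2, left3 == right3)  # True True
```

When the dynamic matrix is random and additive noise is present, extra terms appear in the evolution of the powers; handling them is what Theorem 5.1 and Appendix A are about.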


V. THE AUGMENTED SYSTEM

Let us consider the integer for which Property 1) of Section III holds. We define the augmented observation as the vector

(19)

Moreover, we define the augmented state as the vector , where

(20)

Now, for a bilinear system such as (9), (10), satisfying Properties 1) and 2) of Section III, let us build up the extended memory system (15), (16), the augmented observations (19), and the augmented state (20). Let and be the identity matrix in and in , respectively. Then, the following theorem holds.

Theorem 5.1: The processes and defined in (19) and (20) satisfy the following equations:

(21)

(22)

where

Moreover, are zero mean uncorrelated sequences such that

(23)

whose auto- and cross-covariance matrices

have the following block structure:

where, for , are matrices, respectively, given by the following formulas:

(24)

(25)

(26)


(27)

(28)

Proof: See Appendix A.

We call system (21), (22) an augmented system. It is a classical time-varying stochastic linear system. Its state and observation noises are zero mean uncorrelated sequences and are also mutually uncorrelated at different times. For these noises we are able to calculate their auto- and cross covariances. Hence, for the augmented system the optimal linear state estimate can be calculated by means of the Kalman filter equations. In order to proceed along this way, we first need to determine the quantities and for which appear in the augmented system matrices and in (23)–(28).

The matrices can be recursively calculated from , as stated in the following theorem.

Theorem 5.2: Let, for

and then we have

(29)

where are given by the following recursive equations:

(30)

(31)

with initial conditions

(32)

(33)

Proof: First of all, note that the matrix , defined in (17), can be rewritten in the compact form

(34)

where the null blocks are suitably dimensioned, is defined in (13), and denotes as usual the identity matrix in (we conventionally assume that it vanishes for ). From (34), and for , (35) follows, as shown at the bottom of the page. Moreover, for any pair of integers using Theorem B.3 and Property (93c), we have

(36)

where

(35)


Taking into account (34), we have the resulting (37), as shown at the bottom of the page, where

(38)

Equation (37) substituted in (36) yields, by exploiting (38) and (35), (30) and (31). From (38) we have

and substituting this in (35), we obtain (29). Finally, note that from (13) and taking into account (34), the initial condition (32) follows. Moreover, from (29)–(31) we infer that to compute it is enough to know the matrices , for , which are given by (33), as immediately follows from (38).

Theorem 5.2 allows us to compute recursively the matrices for from the initial conditions (32), (33). Condition (32) is immediately given from the data, whereas to obtain (33) we use the following result.

Theorem 5.3: The matrices are given by the following formula:

(39)

Proof: By applying (106) and Corollary B.8, and by exploiting (13), we have

As far as , the vectors appearing in the expressions of the augmented noise covariances, are concerned, the following theorem shows that their calculation is possible by means of a recursive algorithm.

Theorem 5.4: The vector of the expected values , defined as

satisfies the following recursive equation:

(40)

where

...

Proof: See Appendix A.

(37)


VI. POLYNOMIAL FILTERING

Now we are able to apply the Kalman filter to system (21), (22). It should be highlighted that since the samples of the augmented state and output noises are in general correlated at the same time, it is necessary to use the Kalman filter in a version given by [22], which takes into account this nonzero correlation. The equations to use are the following:

(41)

(42)

(43)

(44)

(45)

(46)

where is the filter gain, are the filtering and one-step prediction error covariances, respectively, and the other symbols are defined as in Theorem 5.1. If the matrix

is singular, it is possible to use the Moore–Penrose pseudo-inverse.
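To give a concrete feeling for a Kalman recursion that accounts for correlation between state and output noises at the same time, the sketch below implements the standard textbook one-step predictor form for a scalar system x(k+1) = a x(k) + w(k), y(k) = c x(k) + n(k), with Var[w] = q, Var[n] = r, and Cov[w(k), n(k)] = s. It is not a transcription of (41)–(46), whose explicit expressions are not reproduced here, and all names are assumptions. In the scalar case the Moore–Penrose pseudo-inverse of a zero innovation variance is simply zero.

```python
def predictor_step(p, a, c, q, r, s):
    """One Riccati step of the one-step predictor with cross-covariance s.

    p is the current one-step prediction error variance.
    Returns (gain, next prediction error variance).
    """
    innov = c * c * p + r                       # innovation variance
    # Scalar pseudo-inverse: 1/innov if innov != 0, else 0.
    gain = (a * p * c + s) / innov if innov != 0 else 0.0
    p_next = a * a * p + q - gain * gain * innov
    return gain, p_next


def steady_state(a, c, q, r, s, p0=1.0, iters=500):
    """Iterate the Riccati step and return the (approximately) stationary gain and variance."""
    p = p0
    for _ in range(iters):
        _, p = predictor_step(p, a, c, q, r, s)
    gain, _ = predictor_step(p, a, c, q, r, s)
    return gain, p


# Uncorrelated noises versus noises with cross-covariance 0.5.
print(steady_state(0.5, 1.0, 1.0, 1.0, 0.0))
print(steady_state(0.5, 1.0, 1.0, 1.0, 0.5))
```

The state estimate itself would then be propagated as xhat(k+1) = a xhat(k) + gain (y(k) - c xhat(k)). Once the gain has converged, it can be computed offline and reused, which is the situation discussed in Section VII.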

Equation (41) yields recursively the vector , that is the optimal linear estimate of with respect to the aggregate vector of all the augmented observations up to time :

(we remind readers that here the unit element allows us to reduce an affine estimation problem to a strictly linear one). From Definition (20) of and (14) of , it follows that the original state, , is the aggregate of the first entries of the vector . Since the optimal linear estimate with respect to is the projection of the random vector

on the subspace linearly spanned by , it follows that we can obtain the optimal linear estimate of with respect to , i.e., , by extracting in the first entries

(47)

Equation (47) implies that the error covariance of the original state, namely , is given by

(48)

where is given by (45) and hence is the top left block of . By remembering the structure of the extended observation (14) and of the augmented one (19), from Definition 2.2 we infer that is the -optimal polynomial estimate with respect to the original measurements . As in [20], we call a polynomial filter the whole set of operations constituted by the recursive equations (41), (42) and by the extraction of the first entries in , resulting in an algorithm having the form (7), (8).

VII. STATIONARITY AND STEADY-STATE BEHAVIOR

Equations (41), (47) allow the recursive calculation of the state polynomial estimate for the time-varying bilinear system (9), (10). However, in the time-varying case the result will be in general dependent on the initial conditions, whose statistics are often unknown. Moreover, the gain equations (43)–(46) need to be implemented simultaneously with the filter equations (41), (42).

Due to the high complexity of this filter, it is of great importance, from a practical point of view, to know when a steady-state version of (44)–(46) exists. Here we will limit ourselves to examining some important subclasses of bilinear systems for which we will be able to give necessary conditions under which a stationary behavior can be achieved.

First of all, let us consider the case when the system matrices and the bilinear form of system (9), (10) are time independent: , and . Moreover, let us assume the noises are weakly stationary sequences (that is, their moments are time invariant). This case is modeled by the following stationary bilinear system:

(49)

(50)

which can be rewritten, as in the time-varying case, in the linear form with stochastic dynamic matrix

(51)

(52)

where

(53)

The corresponding augmented system is

(54)

(55)

As is well known, the Kalman filter implemented on a time-invariant system such as (54), (55), having second-order weakly stationary noises, admits a steady-state gain under the additional hypotheses of stabilizability and detectability [22]. Moreover, from Theorem 5.4, it follows that the extended state moments, , given by (40), converge if and only if the matrix is asymptotically stable. By observing the structure of , we infer that it is asymptotically stable if and only if the eigenvalues of all the matrices , for , lie strictly inside the unit circle of the complex plane. It also follows that the stability of the matrices , for , implies the asymptotic stationarity of the augmented noises. Such a condition is then sufficient to assure the existence of a stationary filter. Now, the main problem is to give sufficient conditions for the stability of .

We will see that for a strictly bilinear system, even time-invariant and with stationary noises, the possibility to implement a stationary polynomial filter is not, in general, assured. Indeed, we are able to find a counterexample in a particular but important case, that is when the noises are Gaussian, as shown in the following theorem.

Theorem 7.1: For the matrices

(56)

with given by

(57)

(58)

and under the hypotheses that is Gaussian and for , there exists

such that (56) is unstable for all .

Proof: Let us suppose, for the sake of simplicity, that the entries of have unit variance and are mutually independent. By using Property (93h) and taking into account the structure of (57) and that of (58), we have

where . Hence

Since are Gaussian and independent, we have

odd, otherwise;

and hence

Note that all the terms of the summation on the right side have the same sign. Finally, we have

where the right side is obtained by calculating the sum for and taking into account that

Hence, we have , for , faster than . Since

where are the eigenvalues of , this implies the existence of at least one eigenvalue with modulus greater than one.

The circumstance that the availability of the steady-state moments of any order is not assured for a bilinear system represents a limitation in designing stationary polynomial filters. In order to be more precise about this limitation, let us introduce the following definition.

Definition 7.2: For a stochastic bilinear system such as (49), (50), we define the stochastic stability degree as the maximum order for which the extended state moments converge to a finite value for , for any initial condition . We set when the first moment is not convergent.

For a stochastic time-invariant linear system having finite noise moments of all orders, can assume only the values zero or ; that is, if the dynamic matrix is stable (unstable), . This fact is a trivial reformulation of the theory developed in [20]. For a bilinear system such as (49), (50), it is possible to implement a stationary polynomial filter of order , for any IN and (here denotes integer part). The determination of the stochastic stability degree is hence useful for stating in advance the maximum order for which the state polynomial estimate is computable by means of a stationary filter. For this purpose, some results, useful for the determination of the stochastic stability degree, can be found in [17], [18], [24], and [25]. Here we specialize the above-mentioned results in order to study the stochastic stability of the Kronecker powers, up to the th degree of the extended state or, equivalently, the stationarity of the extended state moments.
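The loss of stability of high-order moments under Gaussian multiplicative noise, and the resulting finite stochastic stability degree, can be seen on a hypothetical scalar system x(k+1) = (a + b v(k)) x(k) + w(k), with v(k) a standard Gaussian white sequence independent of the state. In this scalar setting the moment of order m evolves through a triangular linear recursion whose diagonal entries are E[(a + b v)^i], i = 1, ..., m, so the moments up to order m converge when all these quantities have modulus less than one. The Python sketch below computes them exactly; the system and the function names are assumptions made for illustration and do not reproduce the matrices (56)–(58).

```python
from fractions import Fraction
from math import comb


def gaussian_moment(j):
    """E[v^j] for a standard Gaussian v: 0 for odd j, (j-1)!! for even j."""
    if j % 2:
        return 0
    out = 1
    for t in range(j - 1, 0, -2):
        out *= t
    return out


def power_moment(a, b, m):
    """E[(a + b v)^m], expanded with the binomial theorem."""
    return sum(comb(m, j) * a ** (m - j) * b ** j * gaussian_moment(j)
               for j in range(m + 1))


def stability_degree(a, b, max_order=50):
    """Largest m <= max_order with |E[(a + b v)^i]| < 1 for all i = 1..m (0 if none)."""
    m = 0
    while m < max_order and abs(power_moment(a, b, m + 1)) < 1:
        m += 1
    return m


half = Fraction(1, 2)
for m in range(1, 7):
    print(m, power_moment(half, half, m))
print(stability_degree(half, half))  # 5
```

With a = b = 1/2 the quantities E[(a + b v)^m] stay below one up to m = 5 and reach 19/16 at m = 6, so in this sketch moments converge only up to order five. Setting b = 0 recovers the linear case, where the degree is either zero or unbounded (here capped by `max_order`).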


Lemma 7.3: The stochastic system

(59)

where is a sequence of independent identically distributed stochastic matrices, has its th moment asymptotically stable if

where denotes the maximum eigenvalue of matrix .

Proof: Taking the th Kronecker power in (59) we have

(60)

From the hypotheses it follows that

hence

The thesis follows by applying [18, Lemma 3.2] to (60).

It is now possible to determine a sufficient condition for the stability of (40). In fact, the following theorem holds.

Theorem 7.4: If

(61)

then is stable.

Proof: Observe that the function is convex on the set of symmetric nonnegative matrices. This easily follows by the property [21]:

; hence, using the Hölder and Jensen inequalities and (61),

which, using Lemma 7.3, proves the thesis.

Corollary 7.5: A sufficient condition for the stability of (40) relative to (49), (50) is

Proof: The thesis follows from the inequality:

applying Theorem 7.4 with and taking into account the block-triangular structure of .

VIII. IMPLEMENTATION REMARKS

Some numerical simulations have been carried out on a Digital “alpha” workstation by implementing the polynomial filter equations in order to produce for any pair of integers the -order optimal polynomial state estimate of a BLSS.

For this purpose, we have written a C-language program whose main part is devoted to the efficient implementation of the algorithms, described in Sections V and VI and Appendix B, for the computation of the filter parameters. By observing the formulas which define the augmented system parameters, in the statement of Theorem 5.1, it becomes evident that the computational effort of the whole polynomial filter algorithm quickly grows for increasing and/or . Nevertheless, we point out that even low-order polynomial filters (quadratic or cubic filters) which do not require a particularly sophisticated implementation show very high performance with respect to the classical linear filter. Indeed, as shown in some numerical simulations of the polynomial filter for linear systems [20], the error variance of a cubic filter may be 80% smaller than that of the Kalman filter. As we will see later, this high performance is confirmed by low-order polynomial filters for a BLSS. In the case presented here, the second-degree polynomial filter yields a signal error variance 54% less than that of the linear filter. In the same case we have been able to compute the fourth-degree polynomial filter (indeed, a high-order filter, in that it requires a state space of dimension 30 for two-state variables of the system) which yields an improvement of 75% with respect to the linear filter. As shown in some pictures, the restoration of the noisy signal is very impressive.

We would like to stress that the high dimensionality of the filter is not by itself a true limitation for the implementation. In fact, by using an efficient implementation scheme for those data structures which appear in the formulas as matrices of prohibitive dimension, it is possible to overcome such difficulties. It should be noted that the computational effort is mostly due to the calculation of the augmented system parameters. In many cases that are relevant from an application point of view, that is, time-invariance of system parameters, stationarity of noises, polynomial degree less than the stochastic stability degree (see Section VII), we can separate the augmented system matrix computation from the filter equations (41)–(46) that do not show relevant computational troubles. In all of these cases, polynomial filtering is amenable to real-time applications. The numerical simulations presented here concern the filtering of time-invariant BLSS’s with stationary noises and degree less than the stochastic stability degree so that the stationary polynomial filter is implemented using the steady-state gain, and the augmented matrices are calculated before filtering.

Among all the algorithms which are necessary for the computation of the augmented system parameters, the most burdensome are those involved in the computation of the extended state moments , that appear in the augmented noise covariance (24)–(28). These are obtained by running (40) until convergence is achieved. The dynamical matrix of (40) may be very large and exceed the available computer memory space. We think that for large degrees (i.e., three or more) many tricks can be conceived, when a larger computer memory is not available, in order to save memory space (for instance, to save and use only suitably small blocks of the matrix).

In order to calculate the matrix and the augmented noises covariances, the computation of the matrix moments is needed. These are obtained by means of the algorithm defined by Theorem 5.2, which in turn requires the matrices given by (39). The matrices , which appear in (39) (and are defined in Corollary B.8), are dimensioned; that is, they may be too large. In our example, for they have 2 entries! Nevertheless, this very high dimensionality is only apparent. In fact, by considering (104) we realize that can be viewed as an operator which simply permutes the entries of a vector (permutation matrix). A permutation matrix is a zero–one square matrix with one (and only one) unity on each row and column so that it can be simply implemented as a string of 2 integers, each one representing the column index of the unity in a row. Also note that the commutation matrices, given by B.6, which appear in many formulas, are permutation matrices.
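The storage trick described above, keeping a permutation matrix as an array of column indices, can be sketched in a few lines of Python. The example uses the permutation that maps x ⊗ y into y ⊗ x for x of dimension m and y of dimension n, which is the usual role of a commutation matrix; the function names and index convention are assumptions made here and are not taken from Appendix B.

```python
def kron(u, v):
    """Kronecker product of two vectors stored as flat lists."""
    return [a * b for a in u for b in v]


def commutation_indices(m, n):
    """Index array p such that (y kron x)[r] = (x kron y)[p[r]].

    Entry p[r] is the column index of the single unity in row r of the
    corresponding (m*n) x (m*n) permutation matrix.
    """
    return [i * n + j for j in range(n) for i in range(m)]


def apply_permutation(p, vec):
    """Multiply the permutation matrix encoded by p with vec, in O(len(vec)) time."""
    return [vec[c] for c in p]


x = [1, 2, 3]
y = [10, 20]
p = commutation_indices(len(x), len(y))
print(p)                                 # [0, 2, 4, 1, 3, 5]
print(apply_permutation(p, kron(x, y)))  # [10, 20, 30, 20, 40, 60]
print(kron(y, x))                        # [10, 20, 30, 20, 40, 60]
```

The index array needs one integer per row instead of a full square matrix, and applying it costs a single pass over the vector.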

Finally, the last kind of matrices widely used in the wholealgorithm, which can easily grow toward huge dimensions,is the binomial matrix , defined in Theorem B.6, andthe generalization defined in Theorem B.9. Theseare integer matrices with many null entries; for this reasonwe have implemented them as integer sparse matrices. Inspite of this expediency, we have observed that the matrices

used in (39) can still exceed the computer memory.Anytime this happens we adopt the method, mentioned above,consisting of calculating only small blocks of the matricesand removing them after their utilization. Thus, we can avoidovercoming space memory availability, in spite of a growth ofthe CPU time. This method surely can be adopted for higher-polynomial degrees and system orders and always assures thatthe computation will be made with the same memory usage.

It should be underlined that, in the most important stationary case, all the above-mentioned expedients are useful, and sometimes necessary, in order to treat efficiently the major critical parts of the whole algorithm, even if they can produce a great growth of the CPU time needed for the filter parameter computations. However, they do not affect filter measurement processing.

IX. SIMULATION EXAMPLE

The example we consider belongs to the class of the so-called switching systems, widely used in many research areas such as failure detection, speech recognition, and, more generally, in the modeling of physical systems affected by abrupt changes in the parameters [26]–[31]. In particular, we are interested in the class of systems described by the following partially observed equation defined on , evolving in :

(62)

where are white sequences and is a white random matrix sequence taking values in

the finite set with probabilities. System (62) can be easily represented as

a BLSS in the following way. Let be the canonical base in , and let us define the white sequence

assuming values in with. Then

(63)

From the above hypotheses, it follows that, and using this in (63) results in

(64)

By combining (62) and (64), we obtain the BLSS (11), (13) with .
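The idea behind this representation can be checked numerically. The sketch below rests on an assumption, since the displayed equations are not reproduced here: the random matrix is written as its mean plus a sum of the mode matrices weighted by a zero-mean indicator noise. The mode matrices `A_modes`, the probabilities `p`, and all function names are hypothetical.

```python
def mat_add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_scale(c, X):
    return [[c * a for a in row] for row in X]

def mean_matrix(modes, probs):
    """Expected value of the switching matrix: sum_i p_i * A_i."""
    acc = [[0.0] * len(modes[0][0]) for _ in modes[0]]
    for A_i, p_i in zip(modes, probs):
        acc = mat_add(acc, mat_scale(p_i, A_i))
    return acc

def centered_indicator(mode, probs):
    """Zero-mean noise: indicator of the active mode minus its probability."""
    return [(1.0 if i == mode else 0.0) - p_i for i, p_i in enumerate(probs)]

def reconstruct(A_bar, modes, xi):
    """A_bar + sum_i xi_i * A_i, the assumed bilinear (multiplicative-noise) form."""
    acc = A_bar
    for A_i, xi_i in zip(modes, xi):
        acc = mat_add(acc, mat_scale(xi_i, A_i))
    return acc

# Hypothetical two-mode example.
A_modes = [[[0.9, 0.1], [0.0, 0.5]],
           [[0.2, 0.0], [0.3, 0.7]]]
p = [0.3, 0.7]
A_bar = mean_matrix(A_modes, p)

def max_abs_diff(X, Y):
    return max(abs(a - b) for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

# For every mode the decomposition returns exactly the active matrix.
errors = [max_abs_diff(reconstruct(A_bar, A_modes, centered_indicator(m, p)),
                       A_modes[m]) for m in range(len(p))]
# The noise has zero mean componentwise.
noise_mean = [sum(p[m] * centered_indicator(m, p)[i] for m in range(len(p)))
              for i in range(len(p))]
```

Under this assumption the switching dynamics become a deterministic matrix plus a linear combination of fixed matrices driven by a white, zero-mean, non-Gaussian sequence, which is the bilinear structure the text refers to.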

Now, in order to test the filter, let us consider the switching system (62), with

and . Moreover, let the white random sequences be defined as

where denotes the characteristic function of and the disjoint events and have probability

Following the above described procedure, such a switching system is represented by the following BLSS:

(65)

where and is a white sequence defined as with .

Fig. 1. True and measured signal.

For this system, we have built the steady-state augmented system for the polynomial degrees with

and the quadratic and cubic also with . To each one of these augmented systems we have applied (43)–(46) in order to obtain the steady-state gains and error covariances. Then, for all these cases, we have used the gains in the filter equations (41), (42), starting from initial condition

. The corresponding estimates for the signal are readily obtained by using the relation . Moreover, the signal error variance, namely, is given by the relation , where is the steady-state value of the state error covariance given by (48). By denoting with the a priori state error covariances given by the 1, 2, 3, 4th-degree polynomial filters, respectively (all with ), and with the covariances relative to the quadratic and cubic cases, respectively, the obtained values are the following:

...

where we report the 2 × 2 matrix blocks on the top left side because they contain on the main diagonal the steady-state estimation error variance of each component of the state. The corresponding values for the signal error variances are

As implied by the overall theory described in Section II, we can see that both the signal-error variances and the state-error variances of each component of the state decrease as the polynomial degree increases. In the case, the signal-error variance is 75% less than in the linear filtering case. Also for the error-variance values relative to the quadratic and cubic cases with , we observe, as expected, an improvement with respect to the same cases with . However, in our experience, the contribution of the increasing

is less effective than the increasing of the polynomial degree.

In Table I, the sampled variances of the state and signal, obtained with a number iterations, are reported. As expected, these values are close to the above a priori variance values. In the same table are also reported the signal sampled variances for the Monte Carlo run of 60 iterations relative to Figs. 1–5. Fig. 1 displays the sample paths obtained for the observed and true signal, whereas Figs. 2–4 display the same path of the true signal with different polynomial estimates.

X. CONCLUSIONS

Fig. 2. True and filtered signal with the (1, 0)-order filter.

TABLE I

The -optimal polynomial filter for the BLSS (1), (2), giving the -order polynomial estimate (see Definition 2.2) of the state by means of recursive equations in the form (7), (8), has been defined for any pair of integers. In particular, the polynomial filter equations are (41), (42), and (47). These need to use, at each step, only powers of the last observations so that the computational burden remains constant over time. The polynomial filter is obtained by means of the following steps:

1) construction of the extended memory system (15), (16) (if this step is skipped);

2) construction of the augmented system;

3) application of the Kalman filter equations to the augmented system.
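To make step 2) concrete, the following sketch stacks the Kronecker powers of a vector up to a chosen degree. It is only an illustration of the stacking idea: the paper's augmented state and observation are defined by its own equations, which are not reproduced here, and the names `kron_vec`, `kron_power`, and `augmented_vector` are hypothetical.

```python
def kron_vec(a, b):
    """Kronecker product of two vectors stored as flat lists."""
    return [x * y for x in a for y in b]

def kron_power(v, k):
    """k-th Kronecker power of v (for k = 0 this is the scalar 1)."""
    out = [1.0]
    for _ in range(k):
        out = kron_vec(out, v)
    return out

def augmented_vector(v, degree):
    """Stack v, v^[2], ..., v^[degree] into one long vector."""
    aug = []
    for k in range(1, degree + 1):
        aug.extend(kron_power(v, k))
    return aug

# Hypothetical two-dimensional observation.
y = [2.0, 3.0]
y_aug2 = augmented_vector(y, 2)
y_aug3 = augmented_vector(y, 3)
```

A linear filter acting on such a stacked vector is a polynomial function of the original data, which is why step 3) can reuse the ordinary Kalman equations. The length grows as n + n^2 + ... + n^degree, which explains the memory concerns raised in Section VIII.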

Equations (43) and (46) allow the computation of filter parameters. These need, in general, to be implemented simultaneously with the filter equations (41) and (43). Nevertheless, if the BLSS is time invariant, the noises are stationary sequences and the matrix (defined in the statement of Theorem 5.4) is asymptotically stable, then we can adopt the steady-state approximation of the Kalman filter, thus obtaining a great reduction of computational effort.
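The steady-state approximation can be illustrated on a scalar linear system. The sketch below uses the standard Kalman (Riccati) recursion for the a priori error variance, not the paper's equations (43)–(46), and all parameter values are hypothetical.

```python
def steady_state_gain(a, c, q, r, tol=1e-12, max_iter=10_000):
    """Iterate the scalar Riccati recursion until the variance stops changing.

    Assumed model (illustrative): x(k+1) = a x(k) + w(k), y(k) = c x(k) + v(k),
    with noise variances q and r. Returns the limiting a priori error
    variance P and the corresponding filter gain K.
    """
    P = q
    for _ in range(max_iter):
        P_next = a * a * P - (a * c * P) ** 2 / (c * c * P + r) + q
        if abs(P_next - P) < tol:
            P = P_next
            break
        P = P_next
    K = P * c / (c * c * P + r)
    return P, K

# Hypothetical stable system: |a| < 1 plays the role of asymptotic stability.
P_inf, K_inf = steady_state_gain(a=0.5, c=1.0, q=1.0, r=1.0)
```

Once the limiting gain is computed offline, the online filter only performs the state update, which is the reduction of computational effort mentioned above. For these parameter values the fixed point satisfies P^2 - 0.25 P - 1 = 0.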

In Section VII, it is shown that the stability of the matrix , or equivalently the stability of all the matrices

is not implied by the stability of so that in general the steady-state polynomial filter can be implemented only up to a certain finite degree. Corollary 7.6 gives a sufficient condition for the stability of the matrix .

Even if the computational burden of polynomial filtering grows when and/or increase, many tricks (e.g., as in Section VIII) can be conceived in order to considerably reduce computer memory and CPU time utilization. Numerical simulations presented in Section IX show the high performance of polynomial filtering with respect to standard linear filtering. For a second-order BLSS, we have observed, for the (4, 0)-order filter, an error-variance reduction of 75%. It should be stressed that the (2, 0)-order (quadratic) filter also shows a high performance (54%). In this case, computer time for executing steps 1), 2), and 3) has been less than 1 s and practically all devoted to filter parameter computations.

We think that future research work on polynomial filtering should concern the following points:

1) reducing the computational burden of the algorithm in order to actually make very high-order filters implementable;

2) investigation of the possible convergence of polynomial estimators, with respect to and increasing, toward C.E. and evaluation of the convergence error;

3) analysis of the influence of values on the polynomial filter performance. We conjecture that, for a stable BLSS, this influence tends to vanish when increases because the observations tend to be uncorrelated when their mutual distance in time grows;

4) extension of the polynomial filtering to the class of linear systems with a multiplicative state noise modeled as a Markov chain or, more in general, as a colored stochastic sequence.




To conclude, this paper represents a first tentative attack upon nonlinear filtering problems via a polynomial algorithm. We feel that this could be a way of constructing suboptimal filters for a more general class of nonlinear systems.

APPENDIX A
AUGMENTED SYSTEM CONSTRUCTION

In this Appendix, the proof of Theorem 5.1, which defines the structure and the main properties of the augmented system, is reported. For this purpose we need to state some preliminary results (Lemma A.1 and Lemma A.2). In particular, Lemma A.2 will allow us to readily prove Theorem 5.4.

Let denote positive integers, , sequences of random matrices in and , respectively, and , sequences of random vectors in

and , respectively. For any , let the following properties be satisfied.

1)

where denotes the Euclidean norm.


2) and the set are mutually independent.

3) and the set are mutually independent.

In the following the binomial matrices (101), (102) will be used often, which will be denoted as , highlighting the dependence on the dimension of the vectors involved in the Kronecker power; moreover, the symbol will denote the identity matrix in .

In order to simplify the notation, let us introduce the following symbols:

(66)

Obviously for we have

(67)

With the above notations, it is now possible to state the following two lemmas.

Lemma A.1: Let be a sequence of stochastic matrices in and be a deterministic matrix in . Moreover, let us define, for the following functions:

(68)

with

(69)

(70)

Then, for any couple of (deterministic) matrices in and , respectively, we have that

(71)

and furthermore, for we have that

(72)

(73)

(74)

(75)

(76)

where

(77)

(78)

Proof: From 3) it follows that for any , , , and are mutually independent; hence taking into account (66), (67) we have

As far as (72) is concerned, taking , we have

where Condition 2) and (67) have been exploited.


Similarly, from 2) and (67) again, the result is

In order to prove (73), consider (92); the result is

(79)

where is given by

By applying Properties (93c) and (93e) it follows that

(80)

By applying Corollary B.4 we obtain

(81)

(82)

By substituting (82) and (81) in (80) and then the result in (79), using Property 3) and taking into account (66), we obtain (73).

Equations (74) and (75) easily follow by applying Property 3):

It remains to prove (76). For this purpose, note that is shown to be formally equal to

with the substitution of with and with . As a consequence,

noting that with these substitutions, and taking into account (66), (78) becomes equal to (77), it follows that (76) holds true too.

Lemma A.2: Let be the vector

(83)

where is a deterministic matrix and an integer. Let us consider the augmented vectors

......

...


and the matrix

where

(84)

then there exists the following representation:

(85)

with defined as

...(86)

where are defined as (68).

Proof: Let us consider the th Kronecker power of both sides of (83)

Using Theorem B.6 and Property (93c) we have

and by adding and subtracting to their expected values, we obtain

By adding and subtracting to and their expected values, in the first two terms of the right side of the previous expression the result is

which can be rewritten as

(87)

where the are defined by (68)–(70). By aggregating in a vector the given by (87) and taking into account (66), we obtain (85).

Proof of Theorem 5.1: Let us apply Lemmas A.1 and A.2 by setting

and with these choices, from Conditions 1) and 2) of Section III, it follows that Properties 1)–3) are satisfied; moreover, we have

, and then (21) holds true, with given by

...(88)

From (88), (72) it follows that the sequence is uncorrelated. Moreover, from (88) and (73)–(76) follows (24) with

from which (25) and (26) follow, taking into account (77), (78).

Now, let us apply Lemmas A.1 and A.2 by setting

Then, Properties 1)–3) are again verified and we have, , , . Hence, (21) holds


with given by

...(89)

From (89) and (72) the uncorrelation of the sequence follows. Moreover, since , from (78) we have that

and then from (88) and (89), (28) follows.

From (51), (52) and applying Lemmas A.1 and A.2 with

(hence , , , ) (23) follows. With the same assignments, from (31) we have

and then from (73)–(77), (28) follows, giving the cross-correlation matrix between augmented noises.

Now, we can also prove Theorem 5.4.

Proof of Theorem 5.4: Let us apply Lemma A.2 by setting

These choices yield and , hence (85) has in this case the following form:

(90)

where

...

...

By applying Lemma A.1, from (71) it follows that and

then . Hence, taking the expectations on both sides of (90), (40) follows.

APPENDIX B
KRONECKER ALGEBRA

Throughout this paper, we have widely used Kronecker algebra [21]. Here, for the sake of completeness, we recall some definitions and properties and also give some new results on this subject.

Definition B.1: Let and be matrices of dimension and , respectively. Then the Kronecker product is defined as the matrix

where the are the entries of . Of course, this kind of product is not commutative.

Definition B.2: Let be the matrix

(91)

where denotes the th column of , then the stack of is the vector

(92)

Observe that a vector such as in (92) can be reduced to a matrix as in (91) by considering the inverse operation of the stack denoted by . With reference to the Kronecker product and the stack operation, the following properties hold [21]:

(93a)

(93b)

(93c)

(93d)

(93e)

(93f)

(93g)

where are suitably dimensioned matrices, are vectors, and denotes the trace of a square matrix.

The Kronecker power of the matrix is defined as

As an easy consequence of (93b) and (93g), it follows that

(93h)

It is easy to verify that for , , the th entry of is given by

(94)

where and denote the integer part and -modulo, respectively. Even if the Kronecker product is not commutative, the following property holds [20], [23].


Theorem B.3: For any given pair of matrices , , we have

(95)

where the commutation matrix is the matrix such that its entry is given by

if ;

otherwise. (96)

Observe that , hence in the vector case when and , (95) becomes

(97)
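The vector case of Theorem B.3 can be checked numerically. In the sketch below the commutation matrix is stored as a permutation (one source index per output entry), in the spirit of Section VIII. The index convention and the names `commutation_perm` and `apply_perm` are assumptions made for illustration, since the entries in (96) are not reproduced here.

```python
def kron_vec(a, b):
    """Kronecker product of two vectors stored as flat lists."""
    return [x * y for x in a for y in b]

def commutation_perm(m, n):
    """Permutation C with C (a kron b) = b kron a, for len(a) = m, len(b) = n.

    Entry i*n + j of a kron b equals a[i]*b[j], and must be moved to
    position j*m + i of b kron a.
    """
    cols = [0] * (m * n)
    for i in range(m):
        for j in range(n):
            cols[j * m + i] = i * n + j
    return cols

def apply_perm(cols, x):
    """Apply the permutation matrix encoded by cols to the vector x."""
    return [x[c] for c in cols]

# Hypothetical vectors for illustration.
a = [1, 2, 3]
b = [10, 20]
C = commutation_perm(len(a), len(b))
swapped = apply_perm(C, kron_vec(a, b))
```

Applying the permutation for the swapped dimensions undoes the reordering, so the two commutation matrices are inverses of each other.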

Corollary B.4: For any given matrices having dimensions respectively, denoted with the identity matrix in we have

Proof: By applying Properties (93b) and (93c) and Theorem B.3 we have

Moreover, let us recall the following recursive formula [20].

Lemma B.5: For any and for any

let be the matrix such that

(98)

Then the sequence satisfies the following equations:

(99)

where is the identity matrix in .

In [20] can be found the proof of a binomial formula for the Kronecker power, which generalizes the classical Newton one, as is asserted by the following theorem.
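Before the general statement, the degree-two case of a Kronecker binomial expansion can be verified directly. The sketch assumes only the elementary identity (a + b) ⊗ (a + b) = a ⊗ a + a ⊗ b + b ⊗ a + b ⊗ b together with the commutation property, so that the two mixed terms combine as (I + C)(a ⊗ b). The helper names are hypothetical, and the general coefficient matrices of the theorem are not reproduced.

```python
def kron_vec(a, b):
    """Kronecker product of two vectors stored as flat lists."""
    return [x * y for x in a for y in b]

def vec_add(*vs):
    """Entrywise sum of equally long vectors."""
    return [sum(t) for t in zip(*vs)]

def commute(v, m, n):
    """Map a kron b (len(a) = m, len(b) = n) to b kron a by reindexing."""
    out = [0] * (m * n)
    for i in range(m):
        for j in range(n):
            out[j * m + i] = v[i * n + j]
    return out

# Hypothetical integer vectors, so the comparison is exact.
a = [1, 2]
b = [3, 5]
n = len(a)

lhs = kron_vec(vec_add(a, b), vec_add(a, b))      # (a + b)^[2]
ab = kron_vec(a, b)
mixed = vec_add(ab, commute(ab, n, n))            # (I + C)(a kron b)
rhs = vec_add(kron_vec(a, a), mixed, kron_vec(b, b))
```

In the scalar case the mixed coefficient I + C reduces to the familiar factor 2 of the Newton binomial.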

Theorem B.6: For any integer the matrix coefficients of the following binomial power formula:

(100)

constitute a set of matrices such that for

(101)

(102)

where and are as in Lemma B.5.

Lemmas B.7 and B.9 and Corollary B.8 constitute new results about Kronecker algebra.

Lemma B.7: Given , , there exists a matrix such that

where

and is the identity in IN.

Proof: Let us express the vector as

(103)

where are the th column of and , respectively. Using Theorem B.3, (103) can be rewritten as

...

so that the proof is completed.

Corollary B.8: Given a matrix IN there exists a matrix such that

(104)

where

if ;

if .

(105)


Proof: Equation (104) is obviously true for . Let ; by supposing (104) true for with as in (105), we obtain

from which the thesis follows.

We can also generalize formula (100) to the polynomial case. Obviously, given any polynomial IN, its th Kronecker power admits a representation as

(106)

where are suitable matrices. We extend the definition of symbol , with when at least one of the ’s is negative, such as

(107)

Moreover, we can prove the following statement.

Lemma B.9: The matrices in (106) satisfy the recursive formula

for (108)

for

(109)

Proof: Equation (108) is obvious. In order to prove (109), let us consider the polynomial power

Now, let us consider the term

If , it is equal to

If , then

then, taking into account (106) we can write

(110)

Now, by considering the generic term of the summation on the left-hand side of (110), that is

, we must look at the RHS for those terms which are characterized by the indexes . They are of the form

with , for , whenever , and

with , . Then, taking into account (107), (109) is proved.

REFERENCES

[1] R. R. Mohler and C. N. Shen, Optimal Control of Nuclear Reactors. New York: Academic, 1970.
[2] R. R. Mohler, Bilinear Control Processes. New York: Academic, 1970.
[3] C. Bruni, G. Di Pillo, and G. Koch, “Bilinear systems: An appealing class of ‘nearly linear’ systems in theory and applications,” IEEE Trans. Automat. Contr., vol. AC-19, Aug. 1974.
[4] C. Bruni and G. Koch, “Suboptimal filtering for linear-in-control systems with small observation noise,” in Proc. IV IFAC Symp. Identification Syst. Parameter Estimation, Tbilisi, Sept. 1976.
[5] X. Yang, R. R. Mohler, and R. M. Burton, “Adaptive suboptimal filtering of bilinear systems,” Int. J. Contr., vol. 52, no. 1, 1990.
[6] R. Luesink and H. Nijmeijer, “On the stabilization of bilinear systems via constant feedback,” in Linear Algebra and Its Applications. New York: Elsevier, 1989.
[7] L. K. Chen, X. Yang, and R. R. Mohler, “Stability analysis of bilinear systems,” IEEE Trans. Automat. Contr., vol. 36, Nov. 1991.
[8] X. Yang and L. K. Chen, “Stability of discrete bilinear systems with time delayed feedback functions,” IEEE Trans. Automat. Contr., vol. 38, Jan. 1993.
[9] X. Yang and L. K. Chen, “Stability of discrete bilinear systems with time-delayed feedback functions,” IEEE Trans. Automat. Contr., vol. 38, Jan. 1993.
[10] S. P. Banks and M. K. Yew, “On the optimal control of bilinear systems and its relation to Lie algebras,” Int. J. Contr., vol. 43, no. 3, 1986.
[11] J. L. Willems and J. C. Willems, “Robust stabilization of uncertain systems,” SIAM J. Contr. Opt., vol. 21, no. 3, May 1983.
[12] M. S. Ahmed, “Parameter estimation in bilinear systems by instrumental variable method,” Int. J. Contr., vol. 44, no. 4, 1986.
[13] M. M. Gabr, “A recursive (on-line) identification of bilinear systems,” Int. J. Contr., vol. 44, no. 4, pp. 911–917, 1986.
[14] H. Dai, N. K. Sinha, and S. C. Puthenpura, “Robust combined estimation of states and parameters of bilinear systems,” Automatica, vol. 25, no. 4, 1989.
[15] R. R. Mohler and W. J. Kolodzej, “An overview of stochastic bilinear control processes,” IEEE Trans. Syst., Man Cybern., vol. SMC-10, Dec. 1980.
[16] E. Yaz, “Robust, exponentially fast state estimator for some nonlinear stochastic systems,” Int. J. Syst. Sci., vol. 23, no. 4, 1992.
[17] E. Yaz, “Full and reduced order observer design for discrete stochastic bilinear systems,” IEEE Trans. Automat. Contr., vol. 37, Apr. 1992.
[18] W. L. De Koning, “Optimal estimation of linear discrete-time systems with stochastic parameters,” Automatica, vol. 20, no. 1, 1984.
[19] A. De Santis, A. Germani, and M. Raimondi, “Optimal quadratic filtering of linear discrete time non-Gaussian systems,” IEEE Trans. Automat. Contr., vol. 40, June 1995.
[20] F. Carravetta, A. Germani, and M. Raimondi, “Polynomial filtering for linear discrete time non-Gaussian systems,” SIAM J. Contr. Opt., vol. 34, Sept. 1996.
[21] R. Bellman, Introduction to Matrix Analysis. New York: McGraw-Hill, 1970.
[22] A. V. Balakrishnan, Kalman Filtering Theory. New York: Optimization Software, 1984.
[23] G. S. Rodgers, Matrix Derivatives. New York: Marcel Dekker, 1980.
[24] X. Feng, K. A. Loparo, Y. Ji, and H. J. Chizeck, “Stochastic stability properties of jump linear systems,” IEEE Trans. Automat. Contr., vol. 37, Jan. 1992.
[25] Y. Ji and H. J. Chizeck, “Bounded sample path control of discrete time jump linear system,” IEEE Trans. Syst., Man Cybern., vol. 19, Mar. 1989.
[26] J. K. Tugnait, “Detection and estimation for abruptly changing systems,” Automatica, vol. 18, no. 5, 1982.
[27] P. Andersson, “Adaptive forgetting in recursive identification through multiple models,” Int. J. Contr., vol. 42, no. 5, 1985.
[28] H. A. P. Blom and Y. Bar-Shalom, “Time-reversion of a hybrid state stochastic difference system with a jump-linear smoothing application,” IEEE Trans. Inform. Theory, vol. 36, July 1990.
[29] R. E. Helmick and W. D. Blair, “Fixed interval smoothing for Markovian switching systems,” IEEE Trans. Inform. Theory, vol. 41, Nov. 1995.
[30] A. S. Willsky, E. Y. Chow, S. B. Gershwin, C. S. Greene, P. K. Houpt, and A. L. Kurkjian, “Dynamic model-based techniques for the detection of incidents on freeways,” IEEE Trans. Automat. Contr., vol. 25, June 1980.
[31] A. S. Willsky, “A survey of design methods for failure detection in dynamic systems,” Automatica, vol. 12, p. 601, 1976.
[32] R. E. Kalman, “A new approach to linear filtering and prediction problems,” ASME J. Basic Eng., series D, vol. 82, 1960.

Francesco Carravetta received the Laurea degree in electrical engineering from the University of Calabria, Italy, in 1988 and the Ph.D. degree in systems engineering from the University “La Sapienza,” Rome, in 1995. After graduating, he served as an Engineer on the staff of the Research Laboratory of Alcatel-Face Company, Pomezia, Italy. Since 1990, he has worked as a Researcher at the Institute of Systems Analysis and Computer Science (IASI) of the National Research Council (CNR) in Rome. His research interests include theory and applications of filtering, prediction, detection, identification, and control of stochastic dynamical systems.

Alfredo Germani received the Laurea degree in physics in 1972 and the Postgraduate Diploma in systems and control theory in 1974, both from the University “La Sapienza,” Rome. From 1975 to 1986 he worked as a Researcher at the Institute of Systems Analysis and Computer Science (IASI) of the National Research Council (CNR) in Rome. In 1978 and 1979, he was at the Department of System Science at the University of California, Los Angeles, as a Visiting Scholar. From 1981 to 1983, he was a Contract Professor of Systems and Identification at the University of Ancona, Italy. From 1986 to 1987, he was a Professor of Automatic Control at the University of Calabria, Italy. Since 1987, he has been a Full Professor of System Theory at the University of L’Aquila, Italy. He is now teaching courses on System Theory, Systems Identification, and Data Analysis and Nonlinear, Stochastic, and Optimal Control Theory. Since 1986, he has been a Coordinator of the System and Control Theory Research Group of the IASI-CNR, Rome. He has published more than 80 research papers in the fields of nonlinear systems, distributed systems, finitely additive white noise theory, approximation theory, nonlinear filtering, image processing, and mathematical modeling for biological processes. Dr. Germani is a member of the SIAM Society, the American Mathematical Society, and of the International Federation of Nonlinear Analysts.

Massimo Raimondi received the Laurea degree in 1990 and the Ph.D. in 1995, both in mathematics, from the University “La Sapienza,” Rome. He is currently a Teacher of Mathematics in a high school in Rome and has a research appointment at the Institute of Systems Analysis and Computer Science (IASI) of the National Research Council (CNR) in Rome. His interests include the probability theory and its applications.
