New Kalman filter and smoother consistency tests

Automatica 49 (2013) 3141–3144


Technical communique

New Kalman filter and smoother consistency tests✩

Richard G. Gibbs 1

Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Road, Laurel, MD 20723, United States

Article info

Article history: Received 8 March 2013; received in revised form 10 April 2013; accepted 17 June 2013; available online 6 August 2013.

Keywords: Kalman filters; Optimal estimation; Smoothing

Abstract

We derive three new tests that can be applied to a Kalman filter to check for inconsistencies. The Filter Residual Test can detect observations that are outliers but would be missed by a basic residual test because the uncertainty of the expected observation is large relative to the uncertainty of the observation. The Smoother Residual Test uses the output from a Modified Bryson–Frazier (MBF) smoother to detect observations that are outliers. The Smoother State Test compares the state estimates from the filter and MBF smoother to detect model inconsistencies, in particular insufficient process noise.

© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

This paper derives three new tests that can be applied to the output from a Kalman filter to detect observation outliers and filter model inconsistencies. Two of these tests involve the use of the Modified Bryson–Frazier (MBF) Fixed Interval Smoother.

A discussion of the practical considerations for implementing Kalman filters and, in particular, detecting and correcting anomalous behavior, can be found in Grewal and Andrews (2008). The sources of inconsistency in a Kalman filter that are relevant to this paper can be summarized as:

• Measurement errors are inconsistent with the filter’s standard deviation for the error. Too small a standard deviation for the measurement error results in too much weight being given to the measurement.

• Some measurements may be outliers because of gross errors that are not part of the measurement error model.

• Insufficient process noise causes the filter to converge too tightly to a solution that progressively results in too much weight being given to the filter estimate and too little weight to the measurements.

In Section 4 we discuss the standard residual test, which tests for outliers by comparing the filter residual with its standard deviation. In Section 5 we derive a new test on the filter residuals that can detect outliers in situations where the standard residual test does not. In Section 6 we give a simple analytical example of this new test that illustrates how it detects outliers missed by the standard residual test. In Section 7 we derive a test on the MBF residuals that also checks for outliers. In Section 8 we derive a test that compares the filter and smoother estimates of the state. This test can detect errors caused by insufficient process noise or filter mismodeling. In practice it is assumed for this consistency checking that the underlying probability distributions are reasonably approximated as Gaussian.

✩ The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Tongwen Chen under the direction of Editor André L. Tits.

E-mail address: [email protected]. Tel.: +1 443 778 7422; fax: +1 443 778 6519.

0005-1098/$ – see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.automatica.2013.07.013

2. Notation and Kalman filter equations

We assume that the filter and smoother are applied to the time interval $[t_0, t_N]$ with $N$ observations at times $t_1, t_2, \ldots, t_N$. The state space model is given by

$$x_{j+1} = \phi_j x_j + w_j, \tag{1}$$

$$y_j = H_j x_j + e_j, \tag{2}$$

where $x_j$ and $y_j$ are the true state and measurement respectively at time $t_j$. The dimension of the measurement vector, $y_j$, may vary with $j$. The vectors $w_j$, $e_j$, and $x_0$ are independent zero-mean Gaussian random variables with covariances $Q_j$, $R_j$, and $P_0$ respectively, and $\phi_j$ is the state transition matrix from time $t_j$ to $t_{j+1}$. The propagation equations for the Kalman filter estimate and covariance are

$$\hat{x}_{j+1|j} = \phi_j \hat{x}_{j|j}, \tag{3}$$

$$P_{j+1|j} = \phi_j P_{j|j} \phi_j^T + Q_j, \tag{4}$$

where $\hat{x}_{j|k}$ is the state estimate at time $t_j$ given the first $k$ observations and $P_{j|k}$ is the covariance of $\hat{x}_{j|k}$. The Kalman filter update at time $t_j$ is given by

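$$\hat{x}_{j|j} = \hat{x}_{j|j-1} + K_j z_j, \tag{5}$$

$$z_j = y_j - H_j \hat{x}_{j|j-1}, \tag{6}$$

$$P_{j|j} = B_j P_{j|j-1}, \tag{7}$$

$$B_j = I - K_j H_j, \tag{8}$$

$$K_j = P_{j|j-1} H_j^T N_j^{-1}, \tag{9}$$

$$N_j = H_j P_{j|j-1} H_j^T + R_j. \tag{10}$$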

The covariance matrices $Q_j$, $R_j$, and $P_0$ and the transition matrix $\phi_j$ may be singular. The only requirement is that $N_j$ is non-singular. In the following the estimation error is defined and some easily derived results are given.

$$\tilde{x}_{i|j} = \hat{x}_{i|j} - x_i, \tag{11}$$

$$z_i = e_i - H_i \left( \hat{x}_{i|i-1} - x_i \right) = e_i - H_i \tilde{x}_{i|i-1}, \tag{12}$$

$$\tilde{x}_{j+1|j} = \phi_j \tilde{x}_{j|j} - w_j, \tag{13}$$

$$\tilde{x}_{j|j} = B_j \tilde{x}_{j|j-1} + K_j e_j. \tag{14}$$

Two other identities are required. The first is derived directly from (9) and (10), the second by substituting (7) through (10) in order into $P_{j|j} H_j^T$.

$$I - H_j K_j = R_j N_j^{-1}, \tag{15}$$

$$P_{j|j} H_j^T = K_j R_j. \tag{16}$$
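The update equations (5)–(10) and the identities (15) and (16) can be checked numerically. The following is a minimal sketch for the scalar case only; the function name `kalman_update` and all numerical values are hypothetical and chosen purely for illustration.

```python
def kalman_update(x_pred, P_pred, y, H, R):
    """Scalar measurement update following Eqs. (5)-(10)."""
    N = H * P_pred * H + R      # (10) residual variance
    K = P_pred * H / N          # (9)  Kalman gain
    B = 1.0 - K * H             # (8)
    z = y - H * x_pred          # (6)  filter residual
    x_post = x_pred + K * z     # (5)
    P_post = B * P_pred         # (7)
    return x_post, P_post, z, N, K

# Hypothetical numbers, chosen only for illustration.
H, R = 0.5, 0.25
x_post, P_post, z, N, K = kalman_update(x_pred=1.0, P_pred=4.0, y=2.5, H=H, R=R)

gap_15 = abs((1.0 - H * K) - R / N)   # identity (15): I - H K = R N^-1
gap_16 = abs(P_post * H - K * R)      # identity (16): P_{j|j} H^T = K R
print(x_post, P_post, gap_15, gap_16)
```

Both gaps should vanish up to floating-point rounding.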

3. MBF equations

After completing the Kalman filter forward pass, the following equations, which are equivalent to the equations from Bierman (1977, pp. 223–224) (after some notation changes and typographical corrections), are solved recursively in the backward pass using data saved from the Kalman filter.

$$\tilde{\Lambda}_j = H_j^T N_j^{-1} H_j + B_j^T \hat{\Lambda}_j B_j, \tag{17}$$

$$\hat{\Lambda}_j = \phi_j^T \tilde{\Lambda}_{j+1} \phi_j, \tag{18}$$

$$\hat{\Lambda}_N = 0, \tag{19}$$

$$\tilde{\lambda}_j = -H_j^T N_j^{-1} z_j + B_j^T \hat{\lambda}_j, \tag{20}$$

$$\hat{\lambda}_j = \phi_j^T \tilde{\lambda}_{j+1}, \tag{21}$$

$$\hat{\lambda}_N = 0. \tag{22}$$

The smoothed state and covariance can then be found by substitution in the equations

$$P_{j|N} = P_{j|j} - P_{j|j} \hat{\Lambda}_j P_{j|j}, \tag{23}$$

$$\hat{x}_{j|N} = \hat{x}_{j|j} - P_{j|j} \hat{\lambda}_j. \tag{24}$$

Bierman (1977) derived the MBF smoother from the Rauch–Tung–Striebel (RTS) smoother (Rauch, Tung, & Striebel, 1965), the derivation of which requires that the covariance and transition matrices be non-singular. A proof that does not require this was given by Gibbs (2011).
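The forward and backward passes can be sketched together for a scalar system. This is an illustrative sketch only, assuming time-invariant scalar values for the transition, measurement, and noise terms; the function names and the toy inputs are hypothetical and not taken from the paper.

```python
def kalman_filter(ys, phi, H, Q, R, x0, P0):
    """Forward pass; saves the quantities the backward pass needs."""
    x, P = x0, P0
    saved = []
    for y in ys:
        x, P = phi * x, phi * P * phi + Q   # (3), (4) propagate to next time
        N = H * P * H + R                   # (10)
        K = P * H / N                       # (9)
        B = 1.0 - K * H                     # (8)
        z = y - H * x                       # (6)
        x, P = x + K * z, B * P             # (5), (7)
        saved.append((x, P, z, N, B))
    return saved

def mbf_smoother(saved, phi, H):
    """Backward pass of Eqs. (17)-(24); returns (x_smooth, P_smooth) pairs."""
    Lam_hat, lam_hat = 0.0, 0.0             # (19), (22)
    out = []
    for x, P, z, N, B in reversed(saved):
        out.append((x - P * lam_hat,        # (24)
                    P - P * Lam_hat * P))   # (23)
        Lam_til = H * H / N + B * Lam_hat * B   # (17)
        lam_til = -H * z / N + B * lam_hat      # (20)
        Lam_hat = phi * Lam_til * phi           # (18), now for the previous time
        lam_hat = phi * lam_til                 # (21)
    return out[::-1]

# Toy inputs (assumed values, for illustration only).
saved = kalman_filter(ys=[1.0, 2.0], phi=1.0, H=1.0, Q=1.0, R=1.0, x0=0.0, P0=1.0)
smoothed = mbf_smoother(saved, phi=1.0, H=1.0)
print(smoothed)
```

At the final time the smoothed pair equals the filtered pair, as (19) and (22) require, and no covariance inverse is needed anywhere in the backward pass.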

4. Standard Filter Residual Test

The standard residual test consists of comparing the magnitude of each component of the residual to its standard deviation; that is, the test on the $i$th component of $z_j$ is $t_{j,i}$, where

$$t_{j,i} = \frac{\left| (z_j)_i \right|}{(N_j)_{i,i}^{1/2}}, \tag{25}$$

where $(z_j)_i$ means the $i$th component of $z_j$ and $(N_j)_{i,k}$ means the $k$th component of the $i$th row of the matrix $N_j$. An observation is rejected if this test quantity exceeds a chosen threshold, which is based on the total number of observations for the system. The probability of any particular residual exceeding $3\sigma$, that is $t_{j,i} > 3$, is 0.0027. However, as the number of observations in the filter run grows it becomes more likely that some residuals will exceed $3\sigma$. If there are 100 total observations then the probability of at least one residual exceeding $3\sigma$ is 0.2369. For 1000 observations the probability is 0.9330. If we consider $4\sigma$ then the probability of a single residual exceeding this is 0.00006. The probabilities for 100 and 1000 observations are 0.0063 and 0.0614 respectively, rising to 0.4692 for 10,000 observations.
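The exceedance probabilities used to choose a threshold can be reproduced with the two-sided Gaussian tail probability. The sketch below assumes independent, unit-variance normalized residuals; the function name `prob_any_exceeds` is hypothetical.

```python
import math

def prob_any_exceeds(k_sigma, n_obs):
    """Probability that at least one of n_obs independent normalized
    Gaussian residuals exceeds k_sigma in magnitude."""
    p_single = math.erfc(k_sigma / math.sqrt(2.0))   # two-sided tail, P(|Z| > k)
    return 1.0 - (1.0 - p_single) ** n_obs

for k in (3.0, 4.0):
    for n in (1, 100, 1000, 10000):
        print(f"{k:.0f} sigma, {n:5d} observations: {prob_any_exceeds(k, n):.4f}")
```

A table of this kind is one way to pick the threshold so that the chance of rejecting a valid observation stays acceptably small for the expected number of observations.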

The limitation of this test is that if the covariance of the expected observation, $H_j P_{j|j-1} H_j^T$, is large relative to the covariance of the observation error, $R_j$, then it is unlikely that an observation error that is large relative to its standard deviation will be detected.

The new test described in the next section can be used in this situation if there are sufficient filter measurements such that there is some redundancy in the quantity being observed by the measurements.

5. New Filter Residual Test

Whereas the standard residual test only uses the filter’s a priori information, the test described here tests each observation against a combination of the a priori information and a weighted average of all the observations at that time. The test is defined by

$$z_j^+ = y_j - H_j \hat{x}_{j|j}. \tag{26}$$

The difference between this test and the standard residual test is that it uses $\hat{x}_{j|j}$ instead of $\hat{x}_{j|j-1}$. Whereas $z_j$ is the inconsistency between the observations and the a priori filter estimate, $z_j^+$ is the inconsistency between the observations and the a posteriori filter estimate. However, we shall show that this test can be performed before actually doing the filter update. First (26) is written as

$$z_j^+ = z_j - H_j \left( \hat{x}_{j|j} - \hat{x}_{j|j-1} \right). \tag{27}$$

Substituting (5) in (27) and then using (15) gives

$$z_j^+ = \left( I - H_j K_j \right) z_j = R_j N_j^{-1} z_j. \tag{28}$$

Treating $z_j^+$ as a linear transformation of $z_j$, its expectation is zero and its covariance is denoted by $N_j^+$, where

$$N_j^+ = E\left[ z_j^+ z_j^{+T} \right] = R_j N_j^{-1} R_j. \tag{29}$$

The quantities needed for the New Filter Residual Test are defined by (28) and (29). Although the residual $z_j^+$ was originally defined in terms of the a posteriori filter state estimate, both $z_j^+$ and its covariance, $N_j^+$, can easily be computed without doing the actual filter update. It is only necessary to compute the inverse of $N_j$. The quantities that are evaluated for the test are given by

$$t_{j,i}^+ = \frac{\left| (z_j^+)_i \right|}{(N_j^+)_{i,i}^{1/2}}. \tag{30}$$

The following procedure is suggested for using this test: (i) compute the quantities defined by (28)–(30), (ii) if none of the $t_{j,i}^+$ exceeds the selected threshold (based on the total observations) then all the observations are acceptable and the test is complete, otherwise (iii) discard the observation with the largest $t_{j,i}^+$ and repeat the process with the observations remaining at this time, starting with step (i).


6. Example of the New Filter Residual Test

We present here a simple example where the New Filter Residual Test could catch measurement outliers that the Standard Filter Residual Test would not catch. The main conditions for this to happen are that the state uncertainty is much larger than the observation uncertainties and that there is redundancy in the observations, that is the dimension of the observation vector exceeds the dimension of the part of the state vector observed.

Suppose we have a 1-dimensional state, $x$, with an a priori variance at time $t_j$ given by

$$P_{j|j-1} = \sigma_X^2, \tag{31}$$

and suppose that we make at this time a set of $n$ identically distributed observations, that is

$$y_i = x + e_i, \quad \mathrm{Var}(e_i) = \sigma_R^2, \quad 1 \le i \le n. \tag{32}$$

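The measurement matrix, $H_j$, is a single column where all the entries are 1. The measurement noise matrix is given by $R_j = \sigma_R^2 I$. It is straightforward to show that the matrix $N_j$, as defined by (10), is given by

$$N_j = \begin{pmatrix} \sigma_R^2 + \sigma_X^2 & \sigma_X^2 & \cdots & \sigma_X^2 \\ \sigma_X^2 & \sigma_R^2 + \sigma_X^2 & \cdots & \sigma_X^2 \\ \vdots & & \ddots & \vdots \\ \sigma_X^2 & \sigma_X^2 & \cdots & \sigma_R^2 + \sigma_X^2 \end{pmatrix}. \tag{33}$$

We now define $m$ as the ratio of $\sigma_X$ to $\sigma_R$, that is

$$\sigma_X = m \sigma_R, \tag{34}$$

and substitute in (33), which gives

$$N_j = \sigma_R^2 \begin{pmatrix} 1 + m^2 & m^2 & \cdots & m^2 \\ m^2 & 1 + m^2 & \cdots & m^2 \\ \vdots & & \ddots & \vdots \\ m^2 & m^2 & \cdots & 1 + m^2 \end{pmatrix}. \tag{35}$$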

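It is straightforward to verify that

$$N_j^{-1} = \sigma_R^{-2} \left( 1 + n m^2 \right)^{-1} \begin{pmatrix} p & \cdots & -m^2 \\ -m^2 & \cdots & -m^2 \\ \vdots & \ddots & \vdots \\ -m^2 & \cdots & p \end{pmatrix}, \tag{36}$$

$$p = 1 + (n - 1) m^2. \tag{37}$$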
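The closed-form inverse can be confirmed numerically by multiplying it against the matrix of (35). The sketch below uses plain Python lists; the helper names and the values of `n`, `m`, and `sigma_r` are hypothetical, chosen only for illustration.

```python
def innovation_cov(n, m, sigma_r):
    """N_j of Eq. (35): sigma_r^2 * (I + m^2 * ones)."""
    return [[sigma_r**2 * (m**2 + (1.0 if i == k else 0.0)) for k in range(n)]
            for i in range(n)]

def innovation_cov_inverse(n, m, sigma_r):
    """Closed-form inverse of Eqs. (36)-(37)."""
    p = 1.0 + (n - 1) * m**2
    scale = 1.0 / (sigma_r**2 * (1.0 + n * m**2))
    return [[scale * (p if i == k else -m**2) for k in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][l] * b[l][k] for l in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]

n, m, sigma_r = 4, 3.0, 0.5
product = matmul(innovation_cov(n, m, sigma_r),
                 innovation_cov_inverse(n, m, sigma_r))
# Largest deviation of the product from the identity matrix.
max_err = max(abs(product[i][k] - (1.0 if i == k else 0.0))
              for i in range(n) for k in range(n))
print(max_err)
```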

Substituting (36) in (28) gives

$$(z_j^+)_i = (z_j)_i - \frac{m^2}{1 + n m^2} \sum_{k=1}^{n} (z_j)_k. \tag{38}$$

The components of the residual for this example are

$$(z_j)_i = e_i + x_j - \hat{x}_{j|j-1}. \tag{39}$$

Substituting in (38) gives

$$(z_j^+)_i = \frac{x_j - \hat{x}_{j|j-1}}{1 + n m^2} + e_i - \frac{m^2}{1 + n m^2} \sum_{k=1}^{n} e_k. \tag{40}$$

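Using (29) and (36) it is easily shown that

$$(N_j^+)_{i,i} = \frac{p \, \sigma_R^2}{1 + n m^2}. \tag{41}$$

The equation for the test quantity $t_{j,i}^+$ defined in (30) is found by combining (40) and (41).

$$\frac{(z_j^+)_i}{(N_j^+)_{i,i}^{1/2}} = \frac{\left( 1 + n m^2 \right)^{1/2}}{p^{1/2}} \frac{e_i}{\sigma_R} + \frac{m^2}{p^{1/2} \left( 1 + n m^2 \right)^{1/2}} \times \left( \frac{x_j - \hat{x}_{j|j-1}}{m \sigma_X} - \sum_{k=1}^{n} \frac{e_k}{\sigma_R} \right). \tag{42}$$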

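If $n m^2 \gg 1$ then $1 + n m^2 \approx n m^2$ and $p \approx (n - 1) m^2$. Substituting these approximations into (42) gives

$$\frac{(z_j^+)_i}{(N_j^+)_{i,i}^{1/2}} \approx \left( \frac{n}{n - 1} \right)^{1/2} \frac{e_i}{\sigma_R} + \left( \frac{n}{n - 1} \right)^{1/2} \times \left( \frac{x_j - \hat{x}_{j|j-1}}{n m \sigma_X} - \frac{1}{n} \sum_{k=1}^{n} \frac{e_k}{\sigma_R} \right). \tag{43}$$

The term in (43) involving the state is a Gaussian random variable with a standard deviation of approximately $(nm)^{-1}$. This term can be neglected because $nm$ is assumed to be large. Thus (43) can be approximated as

$$\frac{(z_j^+)_i}{(N_j^+)_{i,i}^{1/2}} \approx \left( \frac{n - 1}{n} \right)^{1/2} \times \left( \frac{e_i}{\sigma_R} - \frac{1}{n - 1} \sum_{k=1}^{i-1} \frac{e_k}{\sigma_R} - \frac{1}{n - 1} \sum_{k=i+1}^{n} \frac{e_k}{\sigma_R} \right). \tag{44}$$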

Thus the terms on the right of (44) approximate to the difference between the measurement error of interest and the average of all the other measurement errors. The test quantity, $t_{j,i}^+$, as defined in (30), is just the absolute value of (44). It is easy to see that if measurement $i$ is an outlier then it will make a large contribution to $t_{j,i}^+$, and the test will probably exceed the threshold, but a smaller contribution (by a factor $(n - 1)^{-1}$) to $t_{j,k}^+$ for $k \ne i$, and the test is not likely to exceed the threshold. Even if the outlier causes several of the $t_{j,k}^+$ to exceed the test threshold, measurement $i$ is still likely to be rejected because it makes its largest contribution to $t_{j,i}^+$.
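The behavior described in this example can be illustrated with a small numerical sketch that applies the discard-and-repeat procedure of Section 5 using the closed forms (38) and (41). All numbers are hypothetical: the prior error, the measurement errors (one of them a deliberate 6σ outlier), and the threshold of 3 are assumptions made only for illustration.

```python
import math

def standard_test(y, x_pred, sigma_x, sigma_r):
    """t_{j,i} of Eq. (25); here (N_j)_{i,i} = sigma_r^2 + sigma_x^2."""
    s = math.sqrt(sigma_r**2 + sigma_x**2)
    return [abs(yi - x_pred) / s for yi in y]

def new_filter_residual_test(y, x_pred, sigma_x, sigma_r, threshold=3.0):
    """Discard-and-repeat procedure for n direct observations of a scalar state."""
    m2 = (sigma_x / sigma_r) ** 2
    active, rejected = list(range(len(y))), []
    while active:
        n = len(active)
        z = {i: y[i] - x_pred for i in active}                  # (6) with H = 1
        shift = m2 / (1.0 + n * m2) * sum(z.values())
        z_plus = {i: z[i] - shift for i in active}              # (38)
        n_plus = (1.0 + (n - 1) * m2) * sigma_r**2 / (1.0 + n * m2)   # (41)
        t_plus = {i: abs(z_plus[i]) / math.sqrt(n_plus) for i in active}  # (30)
        worst = max(active, key=lambda i: t_plus[i])
        if t_plus[worst] <= threshold:
            break                                               # step (ii)
        active.remove(worst)                                    # step (iii)
        rejected.append(worst)
    return active, rejected

# Hypothetical scenario: true state 0, prior estimate off by 50 (0.5 sigma_X).
sigma_x, sigma_r, x_pred = 100.0, 1.0, 50.0
y = [0.3, -0.5, 0.1, 6.0, -0.2]          # observation 3 has a 6 sigma_R error
t_std = standard_test(y, x_pred, sigma_x, sigma_r)
accepted, rejected = new_filter_residual_test(y, x_pred, sigma_x, sigma_r)
print(max(t_std), accepted, rejected)
```

With these inputs every standard test value stays well below the threshold because the prior uncertainty dominates, while the new test singles out the observation with the large error.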

7. Smoother Residual Test

The Smoother Residual Test is similar to the Filter Residual Test, but is based on all the observations by using the smoother’s state estimate. It is defined by

$$z_j^N = y_j - H_j \hat{x}_{j|N}. \tag{45}$$

Using (26) and (24) this can be written as

$$z_j^N = z_j^+ - H_j \left( \hat{x}_{j|N} - \hat{x}_{j|j} \right) = z_j^+ + H_j P_{j|j} \hat{\lambda}_j. \tag{46}$$

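Eqs. (61), (63), and (57) in Gibbs (2011) are

$$\hat{\Lambda}_i = \sum_{k=i}^{N-1} U_{k,i}^T \phi_k^T H_{k+1}^T N_{k+1}^{-1} H_{k+1} \phi_k U_{k,i}, \tag{47}$$

$$\hat{\lambda}_i = -\sum_{k=i}^{N-1} U_{k,i}^T \phi_k^T H_{k+1}^T N_{k+1}^{-1} z_{k+1}, \tag{48}$$

$$U_{k,j} = \begin{cases} I & \text{for } j = k, \\ B_k \phi_{k-1} \cdots B_{j+1} \phi_j & \text{for } j < k. \end{cases} \tag{49}$$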

We show in Section 9 that the filter residuals are uncorrelated. Since the residuals are zero-mean this is equivalent to showing

$$E\left[ z_i z_j^T \right] = 0 \quad \text{for } i \ne j. \tag{50}$$

Using this result it is easily shown that the covariance of $\hat{\lambda}_i$ is $\hat{\Lambda}_i$. In (46) the first term depends on the residuals $z_i$ where $i \le j$ and the second term depends on the residuals $z_i$ where $i > j$. Thus it follows that these two terms are uncorrelated. Thus the covariance of $z_j^N$, denoted by $N_j^N$, is

$$N_j^N = N_j^+ + H_j P_{j|j} \hat{\Lambda}_j P_{j|j} H_j^T. \tag{51}$$

Substituting (15) into (29) and using (16) gives

$$N_j^+ = R_j - H_j P_{j|j} H_j^T. \tag{52}$$

From (23) the second term in (51) can be written as

$$H_j P_{j|j} \hat{\Lambda}_j P_{j|j} H_j^T = H_j \left( P_{j|j} - P_{j|N} \right) H_j^T. \tag{53}$$

Substituting (52) and (53) into (51) gives

$$N_j^N = R_j - H_j P_{j|N} H_j^T. \tag{54}$$

The quantities evaluated for the test are given by

$$t_{j,i}^N = \frac{\left| (z_j^N)_i \right|}{(N_j^N)_{i,i}^{1/2}}. \tag{55}$$

To use the test, the quantities defined by (55) are computed from the smoother results and for any $t_{j,i}^N$ that exceeds the selected threshold the corresponding observation is discarded. After this is completed, the filter would be reprocessed using the reduced set of observations.

In practice, if this test detects a significant number of residuals exceeding the test threshold, only those observations where the residual exceeds the threshold by a large amount should be initially discarded, and the filter and smoother should be rerun with the reduced set of observations. After this, all the observations where the residual exceeds the threshold should be discarded. This two pass process limits the possibility of eliminating observations where the residual exceeded the threshold as a result of the effect that another observation with a large residual had on the observation. After this second pass the filter would be executed a third time with the reduced set of observations.

After the final execution of the Smoother Residual Test, the filter is processed again with the reduced set of observations. The Filter Residual Test could also be used as a final test for outliers in this execution.
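One pass of the Smoother Residual Test can be sketched for scalar observations as follows. The function name, the default threshold, and the smoothed values fed to it are hypothetical; in practice the smoothed estimates and variances would come from an MBF backward pass rather than being typed in by hand.

```python
import math

def smoother_residual_test(ys, x_smooth, P_smooth, H, R, threshold=3.0):
    """Return indices j whose smoother residual test t^N_j exceeds the threshold."""
    flagged = []
    for j, (y, xs, Ps) in enumerate(zip(ys, x_smooth, P_smooth)):
        z_N = y - H * xs                  # (45) smoother residual
        N_N = R - H * Ps * H              # (54) its variance
        if N_N <= 0.0:
            continue                      # degenerate case, e.g. R = 0
        t_N = abs(z_N) / math.sqrt(N_N)   # (55)
        if t_N > threshold:
            flagged.append(j)
    return flagged

# Made-up smoother output for four observation times (illustration only).
ys = [1.0, 1.2, 9.0, 0.9]
x_smooth = [1.0, 1.1, 1.3, 1.0]
P_smooth = [0.2, 0.15, 0.15, 0.2]
flagged = smoother_residual_test(ys, x_smooth, P_smooth, H=1.0, R=1.0)
print(flagged)
```

For the two pass process described above, the same function could be called first with a deliberately larger threshold and then, after rerunning the filter and smoother, with the nominal one.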

8. Smoother State Test

The Smoother State Test defined here can be used as a consistency check on the filter performance for the whole time interval $[t_0, t_N]$. The test is based on comparing the difference between the filter state and the smoother state with the covariance of this difference. This has the potential to detect insufficient process noise in the filter model. When the process noise is too small the state covariance becomes too small as the filter progresses, which results in the filter applying too much weight to the current estimate and not enough to the observations. As a result the filter effectively gives relatively too much weight to the earlier observations and not enough to the later ones. However, because the smoother uses all observations for the state estimate at each time, these estimates are not biased towards the earlier observations. Thus the difference between the filter and smoother state estimates may diverge with time. The test derivation is straightforward. From (24)

$$\hat{x}_{j|j} - \hat{x}_{j|N} = P_{j|j} \hat{\lambda}_j. \tag{56}$$

As stated in Section 7, the covariance of $\hat{\lambda}_i$ is $\hat{\Lambda}_i$. So from (56) and (23) the covariance of this state difference is

$$\mathrm{Cov}\left( \hat{x}_{j|j} - \hat{x}_{j|N} \right) = P_{j|j} \hat{\Lambda}_j P_{j|j} = P_{j|j} - P_{j|N}. \tag{57}$$

The quantities evaluated for this test are

$$t_{j,i}^S = \frac{\left| \left( \hat{x}_{j|j} - \hat{x}_{j|N} \right)_i \right|}{\left( P_{j|j} - P_{j|N} \right)_{i,i}^{1/2}}. \tag{58}$$

To use the test, the quantities in (58) would be evaluated for all the filter update times. A significant number exceeding the selected threshold would indicate an issue with the filter modeling. Changing the filter process noise and rerunning the filter and smoother could then be tried to see if this eliminates the problem. Inspecting which states failed the test may give an indication of how the process noise should be increased, but the increases will basically be based on using different values and inspecting the results.
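A scalar sketch of the Smoother State Test is given below. The function name and the tolerance `eps` are assumptions; the tolerance skips times where the filtered and smoothed variances coincide (as at the final time, where the smoothed and filtered estimates are identical), since the test quantity is undefined there. The inputs are toy values for a two-step scalar problem.

```python
import math

def smoother_state_test(x_filt, P_filt, x_smooth, P_smooth, eps=1e-12):
    """Return (j, t^S_j) for every time where the variance difference is positive."""
    results = []
    for j in range(len(x_filt)):
        var = P_filt[j] - P_smooth[j]                        # (57)
        if var <= eps:
            continue                                         # e.g. j = N
        t = abs(x_filt[j] - x_smooth[j]) / math.sqrt(var)    # (58)
        results.append((j, t))
    return results

# Toy filter and smoother output for two update times (illustration only).
x_filt, P_filt = [2.0 / 3.0, 1.5], [2.0 / 3.0, 0.625]
x_smooth, P_smooth = [1.0, 1.5], [0.5, 0.625]
results = smoother_state_test(x_filt, P_filt, x_smooth, P_smooth)
exceed = [j for j, t in results if t > 3.0]   # threshold of 3 is an assumption
print(results, exceed)
```

Counting how many entries exceed the threshold, and how that count changes as the process noise is varied, is one way to apply the trial-and-error tuning described above.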

9. Proof that filter residuals are uncorrelated

It has been shown that the filter residuals are uncorrelated, for example by Anderson and Moore (1979, pp. 100–103), but the proof is based on using the conditional probabilities of the residuals after filter updates. The residual tests developed here treat the filter state and residuals as random variables that are linear combinations of the initial state and the process and measurement noises, so a proof of (50) for this situation is given here. From (12) the equation for $E[z_i z_k^T]$ has four terms, of which $E[e_i e_k^T]$ is 0, and, assuming $k < i$, the term involving $E[e_i \tilde{x}_{k|k-1}^T]$ is 0. Thus

$$E\left[ z_i z_k^T \right] = H_i E\left[ \tilde{x}_{i|i-1} \tilde{x}_{k|k-1}^T \right] H_k^T - H_i E\left[ \tilde{x}_{i|i-1} e_k^T \right]. \tag{59}$$

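Substituting (13) into (14), then substituting from (49), gives

$$\tilde{x}_{i|i} = U_{i,i-1} \tilde{x}_{i-1|i-1} + U_{i,i} \left( K_i e_i - B_i w_{i-1} \right). \tag{60}$$

By induction we get for any $k < i$

$$\tilde{x}_{i|i} = U_{i,k} \tilde{x}_{k|k} + \sum_{j=k+1}^{i} U_{i,j} \left( K_j e_j - B_j w_{j-1} \right). \tag{61}$$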

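If we change the index in (61) from $i$ to $i - 1$, then substitute in (13) after changing the index from $j$ to $i - 1$, then use (14) to substitute for $\tilde{x}_{k|k}$, we get, for $k < i - 1$,

$$\tilde{x}_{i|i-1} = \phi_{i-1} U_{i-1,k} \left( B_k \tilde{x}_{k|k-1} + K_k e_k \right) - w_{i-1} + \phi_{i-1} \sum_{j=k+1}^{i-1} U_{i-1,j} \left( K_j e_j - B_j w_{j-1} \right). \tag{62}$$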

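Since $\tilde{x}_{k|k-1}$ is independent of all the $e$ and $w$ terms because they occur after $\tilde{x}_{k|k-1}$, and the covariance of $\tilde{x}_{i|i-1}$ is $P_{i|i-1}$, we get from (62) for the terms in (59)

$$E\left[ \tilde{x}_{i|i-1} \tilde{x}_{k|k-1}^T \right] = \phi_{i-1} U_{i-1,k} P_{k|k}, \tag{63}$$

$$E\left[ \tilde{x}_{i|i-1} e_k^T \right] = \phi_{i-1} U_{i-1,k} K_k R_k. \tag{64}$$

Substituting these two terms into (59) and substituting from (16) gives the desired result.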
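Written out, the final substitution is short. The following worked step is a sketch that uses only (7), (16), (59), (63), and (64); note that (63) relies on $B_k P_{k|k-1} = P_{k|k}$ from (7).

```latex
\begin{aligned}
E\left[ z_i z_k^T \right]
  &= H_i \phi_{i-1} U_{i-1,k} P_{k|k} H_k^T - H_i \phi_{i-1} U_{i-1,k} K_k R_k \\
  &= H_i \phi_{i-1} U_{i-1,k} \left( P_{k|k} H_k^T - K_k R_k \right) \\
  &= 0 \quad \text{by (16), for } k < i .
\end{aligned}
```

The case $k > i$ follows by transposing, which gives (50) for all $i \ne k$.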

References

Anderson, Brian D. O., & Moore, John B. (1979). Optimal filtering. Englewood Cliffs, NJ: Prentice-Hall.

Bierman, Gerald J. (1977). Factorization methods for discrete sequential estimation. Academic Press.

Gibbs, Richard G. (2011). Square root Modified Bryson–Frazier smoother. IEEE Transactions on Automatic Control, 56(2), 452–456.

Grewal, M. S., & Andrews, A. P. (2008). Kalman filtering theory and practice using MATLAB. Wiley.

Rauch, H. E., Tung, F., & Striebel, C. T. (1965). Maximum likelihood estimates of linear dynamic systems. AIAA Journal, 3(8), 1445–1450.
