
2008 42nd Asilomar Conference on Signals, Systems and Computers (IEEE), Pacific Grove, CA, USA, October 26-29, 2008


COMPLEX SYSTEMS AND PARTICLE FILTERING

Monica F. Bugallo and Petar M. Djuric

Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY 11794, USA

e-mail: {monica,djuric}@ece.sunysb.edu

ABSTRACT

In this paper we address the problem of applying particle filtering to complex systems. In general, we consider complex systems as ones with nonlinearities and high dimensionality of the state space. We examine strategies for filtering where the state space is partitioned into subspaces and where each subspace is explored by its own particle filter. These particle filters are interconnected and communicate essential information that is necessary for their accurate operation. We demonstrate the proposed approach on a simulated complex system.

Index Terms: recursive estimation, filtering, dynamic systems

1. INTRODUCTION

Complex systems can be defined as structured systems composed of many interconnected components that contribute to an overall behavior which cannot be predicted from the behavior of the individual components [2, 18]. These systems constantly evolve with time and are characterized by nonlinearities that are usually difficult to understand. They are of great interest in engineering and in many sciences including biology, chemistry, meteorology, economics, social sciences, and neurosciences.

Complex systems abound in nature. For example, atmospheric sciences study many complex systems that are of meteorological importance. Systems that represent phenomena like thunderstorms, tornadoes, cyclones, and jet streams are highly complex. Objectives in studying them include accurate weather forecasting, prediction of seasonal and interannual climate fluctuations, and understanding the implications of human-induced perturbations on the global climate. Much of the work in this field is to process data according to models that are built on the principles of physics [19].

In engineering, as in other fields, understanding of complex systems is also of great importance. Examples include control and monitoring of industrial processes [12], communications [6], sensor and power networks [20], and robotics [4]. The objectives of these disciplines are identical to those already mentioned, that is, to comprehend and characterize the subsystems of the studied systems and the interactions among them so that we can build better systems and control them more easily.

This work has been supported by the National Science Foundation under CCF-0515246 and the Office of Naval Research under Award N00014-06-1-0012. The work has been carried out while the second author held the Chair of Excellence of Universidad Carlos III de Madrid-Banco de Santander.

Traditionally, in engineering the systems are assumed linear because that makes their analysis much simpler. However, reality is often different, and many systems, including those characterized as complex, are simply nonlinear. The nonlinearities may arise for a variety of reasons including the physical nature of the building components of the system and/or the interactions among them.

Several methods have been developed to deal with nonlinear systems including extended Kalman filtering (EKF) [1, 11, 14, 16], Gaussian sum filtering [21], approximations of the first two moments of posterior densities [10, 17], and unscented Kalman filtering (UKF) [13, 15]. Most of these techniques do not perform well in the presence of nonlinearities. If in addition the noises in the system are non-Gaussian, the performance of the methods may degrade even more.

In the past decade and a half, particle filters (PFs) have gained considerable popularity in dealing with nonlinear and/or non-Gaussian systems [9]. This is due to their simplicity, generality and success over a wide range of challenging applications. However, in problems related to complex systems, the dimensions of the state spaces may be very large. Recall that PFs track densities of interest by approximating the densities with a set of particles and weights associated with the particles. When the dimension of the state space is high, a very large number of particles is usually required for satisfactory performance of the PF. We should also keep in mind that in this case, straightforward implementation of particle filtering may lead to a collapse of the algorithm. The reason is the difficulty in drawing particles in parts of the state space that contain non-negligible probability masses. If we add to all this that particle filtering is a computationally expensive methodology and that its computational complexity grows with the number of particles, the role of particle filtering in studying complex systems becomes highly questionable.

978-1-4244-2941-7/08/$25.00 ©2008 IEEE, Asilomar 2008


In this paper, we present an approach for using PFs in the challenging cases of complex systems. In particular, we develop a methodology that avoids the collapse of traditional particle filtering by building an interconnected network of particle filters, each of them working on a low-dimensional space. Thereby, the general problem of sequential estimation of the evolution of high-dimensional state vectors is broken into simpler problems. Since the states of the system are interdependent and the computation of the particle weights may require knowledge of the complete state, the PFs must communicate essential information for their proper operation.

The paper is organized as follows. In the next section we formulate the problem. In Section 3 we present the basic idea and in Section 4 we elaborate on its specifics. We demonstrate the proposed method on simulated data in Section 5 and conclude the paper with a few final remarks in Section 6.

2. PROBLEM FORMULATION

We study particle filtering methods for complex systems that can be represented by the following state-space model:

$$x_t = f_x(x_{t-1}, u_t) \qquad \text{(state equation)} \tag{1}$$

$$y_t = f_y(x_t, v_t) \qquad \text{(observation equation)} \tag{2}$$

where $t = 1, 2, \ldots$ is the time index, $x_t$ is the state of the system at time instant $t$, $y_t$ are the observations made about the system at time instant $t$, $f_x(\cdot)$ and $f_y(\cdot)$ are functions that can be nonlinear in their arguments, and $u_t$ and $v_t$ are noises in the state and observation equations, respectively. Based on the given model, the general objective is to estimate the unobserved state $x_t$ from the observations $y_t$. We note here that the state vector $x_t$ may also include constants.
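For orientation, a standard (single) PF for the model (1)-(2) can be sketched as below. This is a minimal bootstrap filter with multinomial resampling, which is a common textbook choice and not necessarily the variant used by the authors. The callables `f_x` and `likelihood`, and the scalar random-walk example at the end, are hypothetical and introduced only for illustration.

```python
import numpy as np

def bootstrap_pf_step(particles, y, f_x, likelihood, rng):
    """One bootstrap particle filter step for the model (1)-(2).

    particles  : (M, d) array of equally weighted particles of x_{t-1}
    y          : observation y_t
    f_x        : hypothetical callable that propagates particles through
                 the state equation and draws the state noise u_t itself
    likelihood : hypothetical callable returning p(y_t | x_t) per particle
    """
    M = particles.shape[0]
    # 1. Propagate each particle through the state equation (1).
    proposed = f_x(particles, rng)
    # 2. Weight by the likelihood implied by the observation equation (2).
    w = likelihood(y, proposed)
    w = w / np.sum(w)
    # 3. Point estimate: weighted mean of the propagated particles.
    estimate = w @ proposed
    # 4. Multinomial resampling back to equal weights.
    idx = rng.choice(M, size=M, p=w)
    return proposed[idx], estimate

# Illustrative scalar example (an assumption, not the paper's model):
# x_t = x_{t-1} + u_t, y_t = x_t + v_t, with unit-variance Gaussian noises.
rng = np.random.default_rng(0)
f_x = lambda x, rng: x + rng.standard_normal(x.shape)
lik = lambda y, x: np.exp(-0.5 * (y - x[:, 0]) ** 2)
particles = rng.standard_normal((500, 1))
particles, est = bootstrap_pf_step(particles, 1.0, f_x, lik, rng)
```

The cost of each step is linear in the number of particles M, so the concern raised in the introduction is how quickly M must grow with the state dimension d for the weights to remain informative.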

3. THE BASIC IDEA

In this section, we first explain the basic idea and in the next section we elaborate on its specifics. We propose particle filtering for complex systems by using a set of interconnected PFs operating on partitioned subspaces of the complete state space. To that end, we decompose the state space into separate subspaces of lower dimensionality which form a partition of the original space, and we assume that the interest is in finding the marginal posterior densities of the state vectors that span these subspaces. The marginal posterior densities are approximated by random measures generated by PFs that run in each of the subspaces. For example, if we have a problem where the state space is 15-dimensional, we could run a single PF whose state vector has a dimension of 15. With the proposed method, we instead first partition the state space, say, into five subspaces of dimension three each, and run five separate PFs, each of them tracking a three-dimensional state vector. To make the overall scheme work, some of the PFs will have to communicate information to other PFs in order to carry out the necessary steps of particle filtering. Fig. 1 gives a pictorial description of the proposed idea for a system with five PFs. There, the arrows symbolize flow of information from one filter to another. Note that the exchange of information does not have to involve all the filters, and the communication between two filters is not necessarily bidirectional.

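[Figure: five particle filters, PF1 to PF5, connected by arrows that indicate the flow of information between filters.]

Fig. 1. The basic idea.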
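The partition and the communication pattern can be written down as simple data structures. The sketch below follows the 15-dimensional example with five three-dimensional blocks. The particular links in `sends_to` are hypothetical and do not reproduce the arrows of Fig. 1; they only illustrate that a link may be one-directional or bidirectional.

```python
import numpy as np

# Partition a 15-dimensional state into five 3-dimensional blocks.
state_dim, n_filters = 15, 5
blocks = np.array_split(np.arange(state_dim), n_filters)

# Hypothetical directed links: sends_to[k] lists the filters that receive
# information from filter k (0-based indices).
sends_to = {0: [1, 3], 1: [2], 2: [1], 3: [4], 4: []}

def receives_from(k):
    """Filters whose summaries filter k needs before it computes weights."""
    return [j for j, targets in sends_to.items() if k in targets]

x = np.arange(state_dim, dtype=float)   # a full state vector
sub_states = [x[b] for b in blocks]     # the part tracked by each PF
```

Here filter 0 sends to filter 1 without receiving anything back, while filters 1 and 2 exchange information in both directions.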

4. SPECIFICS OF THE APPROACH

Suppose that the state vector $x_t$ is partitioned into $x_{1,t}, x_{2,t}, \ldots, x_{K,t}$, where each of these vectors spans a subspace of the original state space. We distinguish the following cases:

1. The state vectors $x_k$ are separable both in the state and observation equations, that is,

$$\begin{aligned} x_{k,t} &= f_{x_k}(x_{k,t-1}, u_{k,t}), \quad k = 1, 2, \ldots, K \\ y_{k,t} &= f_{y_k}(x_{l,t}, v_{k,t}), \quad k, l = 1, 2, \ldots, K \end{aligned} \tag{3}$$

but there is uncertainty about which state vector $x_l$ contributes to the observation $y_k$. Without this uncertainty, the system would be completely separable and there would be no interplay among the state vectors $x_k$.

This scenario is typical in target tracking with data association [3]. The states of the $k$-th target span the subspace of that target. The observation equation represents the measurements of a target, but it is not known with certainty which target it is, and therefore data association is needed.

2. The state vectors $x_k$ are separable in the state equation, but not in the observation equation, that is,

$$\begin{aligned} x_{k,t} &= f_{x_k}(x_{k,t-1}, u_{k,t}), \quad k = 1, 2, \ldots, K \\ y_t &= f_y(x_{1,t}, x_{2,t}, \ldots, x_{K,t}, v_t) \end{aligned} \tag{4}$$

This case is also typical in target tracking where measurements represent signal strength [8, 20].

3. The state vectors $x_k$ are not separable in the state equation but are separable in the observation equation, that is,

$$\begin{aligned} x_{k,t} &= f_{x_k}(x_{1,t-1}, x_{2,t-1}, \ldots, x_{K,t-1}, u_{k,t}) \\ y_{k,t} &= f_y(x_{k,t}, v_{k,t}) \end{aligned} \tag{5}$$

where $k = 1, 2, \ldots, K$. An example of this setup is the representation of an inter- or intra-cellular biosystem. We measure specific molecular species, but they interact among themselves through the state equation.

4. The state vectors $x_k$ are separable neither in the state equation nor in the observation equation, that is,

$$\begin{aligned} x_{k,t} &= f_{x_k}(x_{1,t-1}, x_{2,t-1}, \ldots, x_{K,t-1}, u_{k,t}) \\ y_t &= f_y(x_{1,t}, x_{2,t}, \ldots, x_{K,t}, v_t) \end{aligned} \tag{6}$$

where $k = 1, 2, \ldots, K$.

This is the most general of the four cases. Quite often, the state vector of each subspace is a function of the same state vector in the previous time step and only of the "neighboring" state vectors. Similarly, the elements of the observation vector $y_t$ may be functions of some and not all state vectors.

As already pointed out, we divide the state space into subspaces and each of them is analyzed by a different PF. The partitioning of the state space depends on which of the above cases we have. If necessary, the PFs communicate and exchange information with other PFs. Whether there is communication depends on the functional relationships that arise from the state and observation equations. The information that is exchanged can also vary. The transmitted information of a particular PF represents compressed information about the random measure generated by the PF.

For example, consider the simple case when there are only two PFs and the state space is two-dimensional, i.e., $x_t = [x_{1,t} \; x_{2,t}]^\top$. The two PFs track the posterior distributions of $x_{1,t}$ and $x_{2,t}$, respectively. Let the state equation of $x_{1,t}$ be given by

$$x_{1,t} = f_{x_1}(x_{1,t-1}, x_{2,t-1}) + u_{1,t} \tag{7}$$

and the observation equation by

$$y_{i,t} = f_{y_i}(x_{1,t}, x_{2,t}) + v_{i,t} \tag{8}$$

where $i$ is the index of the observation. The PF that takes care of $x_{1,t}$ has particles of the state given by $x_{1,t}^{(m)}$, where $m = 1, 2, \ldots, M$, with $M$ being the total number of particles. This PF does not know the particles of $x_{2,t}$, which are generated by the other PF. If we apply a first-order Taylor series expansion to the above equations around $\mu_{2,t-1}$ and $\mu_{2,t}$, where $\mu_{2,t-1}$ and $\mu_{2,t}$ are the means of $x_{2,t-1}$ and $x_{2,t}$, respectively, we get

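$$x_{1,t} \approx f_{x_1}(x_{1,t-1}, \mu_{2,t-1}) + \left.\frac{\partial f_{x_1}(x_{1,t-1}, x_{2,t-1})}{\partial x_{2,t-1}}\right|_{\mu_{2,t-1}} (x_{2,t-1} - \mu_{2,t-1}) + u_{1,t} \tag{9}$$

$$y_{i,t} \approx f_{y_i}(x_{1,t}, \mu_{2,t}) + \left.\frac{\partial f_{y_i}(x_{1,t}, x_{2,t})}{\partial x_{2,t}}\right|_{\mu_{2,t}} (x_{2,t} - \mu_{2,t}) + v_{i,t}. \tag{10}$$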

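If we take expectations on both sides of (9) and (10) with respect to $x_{2,t-1}$ and $x_{2,t}$, respectively, the first-order terms vanish and we obtain

$$x_{1,t} \approx f_{x_1}(x_{1,t-1}, \mu_{2,t-1}) + u_{1,t} \tag{11}$$

$$y_{i,t} \approx f_{y_i}(x_{1,t}, \mu_{2,t}) + v_{i,t}. \tag{12}$$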

Thus, if the compressed information about the particles from the second subspace is given by the mean, the first PF can continue with its operation unimpeded. The second PF is in an equivalent situation. It gets from the first PF the mean of the particles of that filter and proceeds by using equations analogous to (11) and (12).
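The mean-exchange scheme can be sketched for two scalar PFs as follows. This is an illustrative sketch under several assumptions that are not taken from the paper: the state equation is a random walk, the noises are Gaussian with variances `q` and `r`, the exchanged quantity is the mean of the propagated particles, and the observation function `f_y` in the example is a hypothetical linear one.

```python
import numpy as np

def multiple_pf_step(p1, p2, y, f_y, rng, q=1.0, r=1.0):
    """One step of two scalar PFs that exchange only their means.

    p1, p2 : (M,) equally weighted particles of x_{1,t-1} and x_{2,t-1}
    y      : (n,) observation vector y_t
    f_y    : hypothetical callable f_y(x1, x2) -> (n, M) noise-free
             observations, evaluated on arrays of particles
    q, r   : assumed state and observation noise variances
    """
    M = p1.size
    # Propagate with a random-walk state equation (an assumption; with a
    # coupled state equation, (11) would use the other filter's mean here).
    p1 = p1 + np.sqrt(q) * rng.standard_normal(M)
    p2 = p2 + np.sqrt(q) * rng.standard_normal(M)
    # Compressed information exchanged between the filters: the means.
    mu1, mu2 = p1.mean(), p2.mean()

    def weights(pred):
        # Gaussian likelihood of y given predicted observations, as in (12).
        logw = -0.5 * np.sum((y[:, None] - pred) ** 2, axis=0) / r
        w = np.exp(logw - logw.max())
        return w / w.sum()

    w1 = weights(f_y(p1, np.full(M, mu2)))   # PF 1 plugs in the mean of PF 2
    w2 = weights(f_y(np.full(M, mu1), p2))   # PF 2 plugs in the mean of PF 1
    est = np.array([w1 @ p1, w2 @ p2])
    # Resample each filter independently.
    p1 = p1[rng.choice(M, size=M, p=w1)]
    p2 = p2[rng.choice(M, size=M, p=w2)]
    return p1, p2, est

rng = np.random.default_rng(0)
f_y = lambda x1, x2: np.stack([x1 + x2, x1 - x2])   # hypothetical linear f_y
p1 = rng.standard_normal(1000)
p2 = rng.standard_normal(1000)
p1, p2, est = multiple_pf_step(p1, p2, np.array([3.0, 1.0]), f_y, rng)
```

Each filter only ever evaluates the observation function on its own particles paired with a single number from the other filter, so the cost per filter stays linear in its own particle count.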

[Figure: two panels over the horizontal range −3 to 7. Top panel: weights (scale ×10⁻³) versus particles. Bottom panel: superweights versus superparticles.]

Fig. 2. Top: Random measure. Bottom: Approximation of the random measure.

We can view the information that is transmitted to other PFs as "projections" of the random measures generated by the PFs in their subspaces. It is clear that projections using a single mean can be extremely crude, in particular when the particle distributions are multimodal or when the covariances of the particles are large. To improve on this approximation, one can communicate not one mean but several means, which correspond to different clusters of particles. The cluster means may be interpreted as the means of the various modes of the random measures. In addition to the means, the PFs also report the total weights of the clusters. In other words, we replace the original random measure with a much simpler one composed of "superparticles" and their associated "superweights." This is depicted in Fig. 2, where the top figure shows the original random measure and the bottom figure its projection with three superparticles and superweights.
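One way to obtain superparticles and superweights from a one-dimensional random measure is to cluster the weighted particles. The sketch below uses a weighted k-means (Lloyd) iteration with a quantile-based initialization. The clustering method, the function name `superparticles`, and the number of clusters are assumptions made for illustration, since the text does not specify how the clusters are formed.

```python
import numpy as np

def superparticles(x, w, n_clusters=3, n_iter=20):
    """Compress a 1-D random measure {x, w} into cluster means and weights.

    Weighted k-means is one possible clustering choice and is not
    necessarily the one used by the authors.
    """
    # Initialise the centres at evenly spaced quantiles of the particles.
    centres = np.quantile(x, (np.arange(n_clusters) + 0.5) / n_clusters)
    for _ in range(n_iter):
        # Assign each particle to its nearest centre.
        labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)
        for c in range(n_clusters):
            mass = w[labels == c].sum()
            if mass > 0:  # keep the old centre if a cluster is empty
                centres[c] = np.sum(w[labels == c] * x[labels == c]) / mass
    labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)
    superweights = np.array([w[labels == c].sum() for c in range(n_clusters)])
    return centres, superweights

# Synthetic three-mode random measure with equal weights.
rng = np.random.default_rng(0)
x = np.concatenate([m + 0.3 * rng.standard_normal(200) for m in (-1.0, 2.0, 5.0)])
w = np.full(x.size, 1.0 / x.size)
centres, sw = superparticles(x, w)
```

Only `n_clusters` pairs of numbers are transmitted instead of the full set of particles and weights, which is the compression described above.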

We can improve the above approximations by using second-order Taylor expansions. Following a similar line of reasoning we get the following expressions:

$$x_{1,t} \approx f_{x_1}(x_{1,t-1}, \mu_{2,t-1}) + \frac{1}{2} \left.\frac{\partial^2 f_{x_1}(x_{1,t-1}, x_{2,t-1})}{\partial x_{2,t-1}^2}\right|_{\mu_{2,t-1}} \sigma_{2,t-1}^2 + u_{1,t} \tag{13}$$

$$y_{i,t} \approx f_{y_i}(x_{1,t}, \mu_{2,t}) + \frac{1}{2} \left.\frac{\partial^2 f_{y_i}(x_{1,t}, x_{2,t})}{\partial x_{2,t}^2}\right|_{\mu_{2,t}} \sigma_{2,t}^2 + v_{i,t} \tag{14}$$

where $\sigma_{2,t-1}^2$ and $\sigma_{2,t}^2$ are the variances of the particles from the random measure of the second PF at time instants $t-1$ and $t$, respectively. In this case, the PF communicates to the other PF not only means but also variances. If the random measure is approximated by more than one mode, then each mode is represented by its mean and variance.
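The benefit of also transmitting the variance can be checked numerically on a toy function. In the sketch below, the function `f`, its second derivative `d2f`, and the values of the mean and standard deviation are all hypothetical; the point is only to compare the mean-only approximation in (12) and the mean-and-variance approximation in (14) against a Monte Carlo average over $x_{2,t}$.

```python
import numpy as np

rng = np.random.default_rng(1)
mu2, sd2 = 0.5, 0.3             # assumed mean and standard deviation of x_{2,t}
x1 = 1.0                        # one fixed particle of the first PF
x2 = mu2 + sd2 * rng.standard_normal(200_000)

f = lambda a, b: a * np.exp(-b)       # hypothetical f_{y_i}(x1, x2)
d2f = lambda a, b: a * np.exp(-b)     # its second derivative with respect to x2

exact = f(x1, x2).mean()                          # Monte Carlo E[f(x1, x2)]
first = f(x1, mu2)                                # mean only, as in (12)
second = first + 0.5 * d2f(x1, mu2) * sd2 ** 2    # mean and variance, as in (14)
```

For this particular choice the second-order value is closer to the Monte Carlo average than the first-order one; how much is gained in general depends on the curvature of the function and the spread of the particles.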

These are only a few of many possibilities. Often, the best choice for exchanging information is problem dependent.

5. SIMULATION RESULTS

We demonstrate the idea on a simple example. We chose the state dynamic model as a random walk

$$x_t = x_{t-1} + u_t \tag{15}$$

where the system state, $x_t \in \mathbb{R}^2$, is defined by $x_t = [x_{1,t} \; x_{2,t}]^\top$, and the state noise, $u_t \in \mathbb{R}^2$, was Gaussian, i.e., $u_t \sim \mathcal{N}(0, I_2)$. The observation vector $y_t^\top = [y_{1,t} \; y_{2,t}]$ was modeled by

$$\begin{aligned} y_{1,t} &= x_{1,t}^2 + x_{2,t}^2 + v_{1,t} \\ y_{2,t} &= x_{1,t}^2 - x_{2,t}^2 + v_{2,t} \end{aligned} \tag{16}$$

where the observation noise, $v_t = [v_{1,t} \; v_{2,t}]^\top \in \mathbb{R}^2$, was independent of $u_t$ and was also Gaussian, $v_t \sim \mathcal{N}(0, I_2)$. Based on these assumptions and the observations, the objective was to estimate $x_t$ as accurately as possible.
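The model (15)-(16) is straightforward to simulate. The sketch below generates one trajectory; the default of 50 time steps and the initial state at the origin follow the experiment described later in this section, while the function name `simulate` and the array layout are illustrative choices.

```python
import numpy as np

def simulate(T=50, rng=None):
    """Simulate the random walk (15) and the observations (16)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros((T + 1, 2))       # x[0] is the initial state (0, 0)
    y = np.zeros((T + 1, 2))       # y[0] is unused
    for t in range(1, T + 1):
        x[t] = x[t - 1] + rng.standard_normal(2)      # u_t ~ N(0, I_2)
        v = rng.standard_normal(2)                    # v_t ~ N(0, I_2)
        y[t, 0] = x[t, 0] ** 2 + x[t, 1] ** 2 + v[0]
        y[t, 1] = x[t, 0] ** 2 - x[t, 1] ** 2 + v[1]
    return x, y

x, y = simulate(50, np.random.default_rng(0))
```

Because the states enter the observations only through their squares, flipping the sign of either state component leaves the observations unchanged.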

We make two remarks about the model. First, the posterior of $x_t$ has four modes. In Fig. 3 (left), we see the particles of the states clearly grouped in four different clusters and the reconstructed marginal distributions with two modes each. Second, the model belongs to the second class of models described in Section 4. We quickly see that the model can be modified so that we have two separate models for $x_{1,t}$ and $x_{2,t}$, i.e.,

$$\begin{aligned} z_{1,t} &= y_{1,t} + y_{2,t} = 2x_{1,t}^2 + v_{1,t} + v_{2,t} \\ z_{2,t} &= y_{1,t} - y_{2,t} = 2x_{2,t}^2 + v_{1,t} - v_{2,t}. \end{aligned} \tag{17}$$

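The noises in the two equations, $v_{1,t} + v_{2,t}$ and $v_{1,t} - v_{2,t}$, are uncorrelated, since their covariance equals $\mathrm{Var}(v_{1,t}) - \mathrm{Var}(v_{2,t}) = 0$; being jointly Gaussian, they are also independent, each with variance 2. This new formulation therefore allows us to run two separate PFs that do not need to communicate.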

We designed a multiple PF which is composed of two PFs that exchange information of estimates about the modes (superparticles) and weights of the modes (superweights). For comparison and benchmarking purposes, we also implemented the standard PF and two separate PFs, which estimated $x_{1,t}$ and $x_{2,t}$ from $z_{1,t}$ and $z_{2,t}$, respectively.

We simulated evolutions of the system for 50 time steps. The state started at $(0, 0)$. In the implementation of the PFs, we used a total of $M = 200$ particles. In other words, when we implemented the multiple PF, each filter used 100 particles. We computed the mean square error (MSE) as a performance figure of merit, where

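$$\mathrm{MSE}_t = \frac{1}{J} \sum_{j=1}^{J} \left[ \left( |x_{1,t}^{j}| - |\hat{x}_{1,t}^{j}| \right)^2 + \left( |x_{2,t}^{j}| - |\hat{x}_{2,t}^{j}| \right)^2 \right]$$

where $x_{i,t}^{j}$, $i = 1, 2$, was the true value of the states at time instant $t$ in the $j$-th run, and $\hat{x}_{i,t}^{j}$, $i = 1, 2$, was the corresponding estimate obtained by the filter. Absolute values are compared because the observations (16) depend on the states only through their squares, so the signs of the states cannot be determined. The MSE plots were obtained by averaging $J = 50$ independent simulations.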
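The figure of merit translates directly into a few lines of array code. The sketch below assumes the true states and the estimates are stored in arrays of shape (J, T, 2); this layout and the function name `mse_per_step` are illustrative choices.

```python
import numpy as np

def mse_per_step(x_true, x_est):
    """MSE_t averaged over J runs.

    x_true, x_est : arrays of shape (J, T, 2) holding the true states and
                    the filter estimates for J runs and T time steps
    """
    err = np.abs(x_true) - np.abs(x_est)               # compare magnitudes only
    return np.mean(np.sum(err ** 2, axis=2), axis=0)   # shape (T,)

# Tiny check: an estimate with the wrong sign but the right magnitude
# contributes zero error under this measure.
x_true = np.ones((2, 3, 2))
mse_sign_flip = mse_per_step(x_true, -x_true)
mse_zero_est = mse_per_step(x_true, np.zeros((2, 3, 2)))
```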

The results are shown in Fig. 3 on the right. The curves are labeled Multiple PF for the multiple PF, PF for the standard PF, and Separated PF for the two particle filters that used the observations as in (17). We see that the multiple PF outperformed the standard PF even in the two-dimensional state space [5, 7].

6. CONCLUSIONS

In this paper we presented ideas for implementing particle filtering in problems where the state spaces have high dimensions. These ideas are based on modifying the particle filter to a set of communicating particle filters. Each particle filter tracks a marginal density of the state space and communicates essential information about it to the remaining particle filters. By partitioning the state space into subspaces of much lower dimensions, we make the exploration of these subspaces with particles much easier and more efficient than that of the complete space. Some initial simulation results that demonstrate the ideas were also provided.

7. REFERENCES

[1] B. D. Anderson and J. B. Moore, Optimal Filtering, Prentice-Hall, New Jersey, 1979.

[2] S. Y. Auyang, Foundation of Complex System Theories, Cambridge University Press, New York, 1998.

[3] Y. Bar-Shalom and X.-R. Li, Multitarget-Multisensor Tracking: Principles and Techniques, YBS, 1995.

[4] H. J. Chang, C. S. G. Lee, Y.-H. Lu, and Y. C. Hu, "Simultaneous localization and mapping with environmental-structure prediction," IEEE Transactions on Robotics, vol. 23, pp. 281–293, 2007.

[5] P. M. Djuric, M. F. Bugallo, and J. Míguez, "Multiple particle filtering with fusion," in the Proceedings of International Analysis for Bayesian Analysis (ISBA'2004), Viña del Mar (Chile), 2004.


[Figure: two panels. Left panel: scatter of particles in the (x1, x2) plane, both axes from −5 to 5, with marginal distributions. Right panel: MSE (logarithmic axis, 10⁻² to 10¹) versus t (0 to 50) for PF, Separated PF, and Multiple PF.]

Fig. 3. Left: An instance of the joint posterior represented by the dots (particles) and the corresponding marginal distributions (solid lines). Right: MSE performance comparison.

[6] P. M. Djuric, J. H. Kotecha, J. Zhang, Y. Huang, T. Ghirmai, M. F. Bugallo, and J. Míguez, "Particle filtering," IEEE Signal Processing Magazine, vol. 20, no. 5, pp. 19–38, 2003.

[7] P. M. Djuric, T. Lu, and M. F. Bugallo, "Multiple particle filtering," in the Proceedings of the IEEE 32nd International Conference on Acoustics, Speech and Signal Processing (ICASSP'2007), Honolulu (Hawaii), 2007.

[8] P. M. Djuric, M. Vemula, and M. F. Bugallo, "Target tracking by particle filtering in binary sensor networks," IEEE Transactions on Signal Processing, vol. 56, no. 6, pp. 2229–2238, 2008.

[9] A. Doucet, N. de Freitas, and N. Gordon, Eds., Sequential Monte Carlo Methods in Practice, Springer, New York, 2001.

[10] S. Frühwirth-Schnatter, "Data augmentation and dynamic linear models," Journal of Time Series Analysis, vol. 15, pp. 183–202, 1994.

[11] A. C. Harvey, Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press, Cambridge, UK, 1989.

[12] M. W. Hofbaur and B. C. Williams, "Hybrid estimation of complex systems," IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, vol. 34, no. 5, pp. 2178–2191, 2004.

[13] K. Ito and K. Xiong, "Gaussian filters for nonlinear filtering problems," IEEE Transactions on Automatic Control, vol. 45, no. 5, pp. 910–927, 2000.

[14] A. H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press, New York, 1970.

[15] S. Julier, J. Uhlmann, and H. F. Durrant-Whyte, "A new method for the nonlinear transformation and covariances in filters and estimator," IEEE Transactions on Automatic Control, no. 3, pp. 477–482, 2000.

[16] L. Ljung and T. Söderström, Theory and Practice of Recursive Identification, The MIT Press, Cambridge, MA, 1983.

[17] C. J. Masreliez, "Approximate non-Gaussian filtering with linear state and observation relations," IEEE Transactions on Automatic Control, vol. 20, pp. 107–110, 1975.

[18] G. Nicolis and C. Nicolis, Foundations of Complex Systems, World Scientific, London, UK, 2007.

[19] E. Ott, B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, E. Kalnay, D. J. Patil, and J. A. Yorke, "A local ensemble Kalman filter for atmospheric data assimilation," Tellus, vol. 56, no. 5, pp. 415–428(14), 2004.

[20] X. Sheng and Y.-H. Hu, "Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks," IEEE Transactions on Signal Processing, vol. 53, pp. 44–53, 2005.

[21] H. W. Sorenson and D. L. Alspach, "Recursive Bayesian estimation using Gaussian sums," Automatica, vol. 7, pp. 465–479, 1971.
