COMPLEX SYSTEMS AND PARTICLE FILTERING
Monica F. Bugallo and Petar M. Djuric
Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY 11794, USA
e-mail: {monica,djuric}@ece.sunysb.edu
ABSTRACT
In this paper we address the problem of applying particle filtering to complex systems. In general, we consider complex systems as ones with nonlinearities and high dimensionality of the state space. We examine strategies for filtering where the state space is partitioned into subspaces and where each subspace is explored by its own particle filter. These particle filters are interconnected and communicate essential information that is necessary for accurate operation of the particle filters. We demonstrate the proposed approach on a simulated complex system.
Index Terms— recursive estimation, filtering, dynamic systems
1. INTRODUCTION
Complex systems can be defined as structured systems composed of many interconnected components that contribute to an overall behavior which cannot be predicted from the behavior of the individual components [2, 18]. These systems constantly evolve with time and are characterized by nonlinearities that are usually difficult to understand. They are of great interest in engineering and in many sciences including biology, chemistry, meteorology, economics, social sciences, and neurosciences.
Complex systems abound in nature. For example, atmospheric sciences study many complex systems that are of meteorological importance. Systems that represent phenomena like thunderstorms, tornadoes, cyclones, and jet streams are highly complex. Objectives in studying them include accurate weather forecasting, prediction of seasonal and interannual climate fluctuations, and understanding the implications of human-induced perturbations on the global climate. Much of the work in this field is to process data according to models that are built on the principles of physics [19].
In engineering, as in other fields, understanding of complex systems is also of great importance. Examples
This work has been supported by the National Science Foundation under Award CCF-0515246 and the Office of Naval Research under Award N00014-06-1-0012. The work has been carried out while the second author held the Chair of Excellence of Universidad Carlos III de Madrid-Banco de Santander.
include control and monitoring of industrial processes [12], communications [6], sensor and power networks [20], and robotics [4]. The objectives of these disciplines are identical to those already mentioned, that is, to comprehend and characterize the subsystems of the studied systems and the interactions among them so that we can build better systems and can control them more easily.
Traditionally, in engineering the systems are assumed linear because that makes their analysis much simpler. However, reality is often different, and many systems, including those characterized as complex, are simply nonlinear. The nonlinearities may arise from a variety of reasons, including the physical nature of the building components of the system and/or the interactions among them.
Several methods have been developed to deal with nonlinear systems, including extended Kalman filtering (EKF) [1, 11, 14, 16], Gaussian sum filtering [21], approximations of the first two moments of posterior densities [10, 17], and unscented Kalman filtering (UKF) [13, 15]. Most of these techniques do not perform well in the presence of strong nonlinearities. If, in addition, the noises in the system are non-Gaussian, the performance of the methods may degrade even more.
In the past decade and a half, particle filters (PFs) have gained considerable popularity in dealing with nonlinear and/or non-Gaussian systems [9]. This is due to their simplicity, generality, and success over a wide range of challenging applications. However, in problems related to complex systems, the dimensions of the state spaces may be very large. Recall that PFs track densities of interest by approximating the densities with a set of particles and weights associated to the particles. When the dimension of the state space is high, a very large number of particles is usually required for satisfactory performance of the PF. We should also keep in mind that in this case, straightforward implementation of particle filtering may lead to a collapse of the algorithm. The reason is the difficulty in drawing particles in parts of the state spaces that contain non-negligible probability masses. If we add to all this that particle filtering is a computationally expensive methodology and that its computational complexity grows with the number of particles, the role of particle filtering in studying complex systems becomes highly questionable.
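The propagate-weight-resample cycle whose cost grows with the number of particles can be seen in a minimal bootstrap PF. The sketch below is a generic scalar filter, not the method of this paper; the function names, noise model, and parameters are our own illustrative assumptions (Gaussian noises, prior used as proposal, multinomial resampling).

```python
import numpy as np

def bootstrap_pf(y, f, h, sigma_u, sigma_v, M=500, x0=0.0, seed=0):
    """Minimal bootstrap PF for a scalar model x_t = f(x_{t-1}) + u_t,
    y_t = h(x_t) + v_t with Gaussian noises (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    x = x0 + sigma_u * rng.standard_normal(M)    # initial particle cloud
    estimates = []
    for yt in y:
        x = f(x) + sigma_u * rng.standard_normal(M)    # propagate particles
        logw = -0.5 * ((yt - h(x)) / sigma_v) ** 2     # Gaussian log-likelihood
        w = np.exp(logw - logw.max())
        w /= w.sum()                                   # normalize weights
        estimates.append(np.sum(w * x))                # posterior-mean estimate
        x = x[rng.choice(M, size=M, p=w)]              # multinomial resampling
    return np.array(estimates)
```

In high dimensions the likelihood concentrates on a vanishing fraction of the particles, so the normalized weights degenerate and the resampling step collapses the cloud, which is exactly the failure mode discussed above.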
978-1-4244-2941-7/08/$25.00 ©2008 IEEE. Asilomar 2008
In this paper, we present an approach for using PFs in the challenging cases of complex systems. In particular, we develop a methodology that avoids the collapse of traditional particle filtering by building an interconnected network of particle filters, each of them working on a small-dimensional space. Thereby, the general problem of sequential estimation of the evolution of high-dimensional state vectors is broken into simpler problems. Since the states of the system are interdependent and the computation of the particle weights may require knowledge of the complete state, the PFs must communicate essential information for their proper operation.
The paper is organized as follows. In the next section we formulate the problem. In Section 3 we present the basic idea, and in Section 4 we elaborate on its specifics. We demonstrate the proposed method on simulated data in Section 5 and conclude the paper with a few final remarks in Section 6.
2. PROBLEM FORMULATION
We study particle filtering methods for complex systems that can be represented by the following state-space model:
x_t = f_x(x_{t-1}, u_t),   (state equation)   (1)

y_t = f_y(x_t, v_t),   (observation equation)   (2)
where t = 1, 2, ... represents the time index, x_t is the state of the system at time instant t, y_t are observations made about the system at time instant t, f_x(·) and f_y(·) are functions that can be nonlinear in their arguments, and u_t and v_t are noises in the state and observation equations, respectively. Based on the given model, the general objective is to estimate the unobserved state x_t from the observations y_t. We note here that the state vector x_t may also include constants.
3. THE BASIC IDEA
In this section, we first explain the basic idea; in the next section we elaborate on its specifics. We propose particle filtering for complex systems by using a set of interconnected PFs operating on partitioned subspaces of the complete state space. To that end, we decompose the state space into separate subspaces of lower dimensionality which form a partition of the original space, and we assume that the interest is in finding the marginal posterior densities of the state vectors that span these subspaces. The marginal posterior densities are approximated by random measures generated by PFs that run in each of the subspaces. For example, if we have a problem where the state space is 15-dimensional, we could run a PF where the state vector has a dimension of 15. With the proposed method instead, we first partition the state space, say, into five subspaces of dimension three each, and run five separate PFs, each of them tracking a three-dimensional state vector. To make the overall scheme work, some of the PFs will have to communicate information with other PFs in order to carry out the necessary steps of particle filtering. Fig. 1 represents a pictorial description of the proposed idea for a system with five PFs. There, the arrows symbolize the flow of information from one filter to another. Note that the exchange of information does not have to involve all the filters, and between two filters the communication is not necessarily bidirectional.
Fig. 1. The basic idea: five interconnected PFs (PF1–PF5) exchanging information.
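The bookkeeping behind Fig. 1 amounts to an index partition plus a directed communication graph. A minimal sketch, where the particular edge set is hypothetical (the paper's figure does not specify it):

```python
import numpy as np

def partition_state(dim, n_subspaces):
    """Split the state indices 0..dim-1 into contiguous subspaces."""
    return np.array_split(np.arange(dim), n_subspaces)

# A 15-dimensional state split into five 3-dimensional subspaces,
# one PF per subspace, as in the example above.
subspaces = partition_state(15, 5)

# Directed, not necessarily bidirectional, communication:
# PF k sends its compressed information to the PFs in links[k].
links = {1: [2], 2: [3, 4], 4: [5], 5: [1]}   # hypothetical edge set
```

Each PF then filters only the coordinates in its own block and consults `links` to decide where to send its compressed random-measure summary.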
4. SPECIFICS OF THE APPROACH
Suppose that the state vector x_t is partitioned into x_{1,t}, x_{2,t}, ..., x_{K,t}, where each of these vectors spans a subspace of the original state space. We distinguish the following cases:
1. The state vectors x_k are separable both in the state and observation equations, that is,

x_{k,t} = f_{x_k}(x_{k,t-1}, u_{k,t}),   k = 1, 2, ..., K
y_{k,t} = f_{y_k}(x_{l,t}, v_{k,t}),   k, l = 1, 2, ..., K   (3)

but there is uncertainty about which state vector x_l contributes to the observation y_k. Otherwise, the system would be completely separable and there would be no interplay among the state vectors x_k.
This scenario is typical in target tracking with data association [3]. The states of the k-th target span the subspace of that target. The observation equation represents the measurements of a target, but it is not known with certainty which target it is, and therefore data association is needed.
2. The state vectors x_k are separable in the state equation but not in the observation equation, that is,

x_{k,t} = f_{x_k}(x_{k,t-1}, u_{k,t}),   k = 1, 2, ..., K
y_t = f_y(x_{1,t}, x_{2,t}, ..., x_{K,t}, v_t)   (4)

This case is also typical in target tracking, where the measurements represent signal strength [8, 20].
3. The state vectors x_k are not separable in the state equation but are separable in the observation equation, that is,

x_{k,t} = f_{x_k}(x_{1,t-1}, x_{2,t-1}, ..., x_{K,t-1}, u_{k,t})
y_{k,t} = f_{y_k}(x_{k,t}, v_{k,t})   (5)

where k = 1, 2, ..., K. An example of this setup is the representation of an inter- or intra-cellular biosystem. We measure specific molecular species, but they interact among themselves through the state equation.
4. The state vectors x_k are neither separable in the state equation nor in the observation equation, that is,

x_{k,t} = f_{x_k}(x_{1,t-1}, x_{2,t-1}, ..., x_{K,t-1}, u_{k,t})
y_t = f_y(x_{1,t}, x_{2,t}, ..., x_{K,t}, v_t)   (6)

where k = 1, 2, ..., K.

This is the most general of the four cases. Quite often, the state vector of each subspace is a function of the same state vector in the previous time step and only of the "neighboring" state vectors. Similarly, the elements of the observation vector y_t may be functions of some and not all state vectors.
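To make the neighbor-coupled structure concrete, here is a sketch of a hypothetical case-3 model of the form (5): the state of each subspace is driven by its neighbors on a ring, while each observation sees only its own subspace. The diffusion-style coupling, the ring topology, and the noise levels are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def step_case3(x_prev, rng, alpha=0.1, sigma_u=0.1, sigma_v=0.1):
    """One time step of a hypothetical case-3 model, eq. (5):
    coupled state equation, separable observation equation."""
    # x_{k,t} depends on x_{k-1,t-1}, x_{k,t-1}, x_{k+1,t-1} (ring topology)
    coupled = x_prev + alpha * (np.roll(x_prev, 1) - 2 * x_prev
                                + np.roll(x_prev, -1))
    x = coupled + sigma_u * rng.standard_normal(x_prev.shape)
    # y_{k,t} depends on x_{k,t} only, so the observations are separable
    y = x + sigma_v * rng.standard_normal(x_prev.shape)
    return x, y
```

In this structure each of the K PFs can weight its particles using only its own observation y_{k,t}, but the propagation step needs information about the neighboring subspaces, which is what the filters must communicate.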
As already pointed out, we divide the state space into subspaces, and each of them is analyzed by a different PF. The partitioning of the state space depends on which of the above cases we have. If necessary, the PFs communicate and exchange information with other PFs. Whether there is communication depends on the functional relationships that arise from the state and observation equations. The information that is exchanged can also vary. The transmitted information of a particular PF represents compressed information about the random measure generated by that PF.
For example, consider the simple case when there are only two PFs and the state space is two-dimensional, i.e., x_t = [x_{1,t} x_{2,t}]^T. The PFs track the posterior distributions of x_{1,t} and x_{2,t}, respectively. Let the state equation of x_{1,t} be given by
x_{1,t} = f_{x_1}(x_{1,t-1}, x_{2,t-1}) + u_{1,t}   (7)
and the observation equation by
y_{i,t} = f_{y_i}(x_{1,t}, x_{2,t}) + v_{i,t}   (8)
where i is the index of the observation. The PF that takes care of x_{1,t} has particles of the state given by x_{1,t}^{(m)}, where m = 1, 2, ..., M, with M being the total number of particles. This PF does not know the particles of x_{2,t}, which are generated by the other PF. If we implement a first-order Taylor series expansion of the above equations around μ_{2,t-1} and μ_{2,t}, where μ_{2,t-1} and μ_{2,t} are the means of x_{2,t-1} and x_{2,t}, respectively, we get
x_{1,t} ≈ f_{x_1}(x_{1,t-1}, μ_{2,t-1}) + ∂f_{x_1}(x_{1,t-1}, x_{2,t-1})/∂x_{2,t-1} |_{μ_{2,t-1}} × (x_{2,t-1} − μ_{2,t-1}) + u_{1,t}   (9)

y_{i,t} ≈ f_{y_i}(x_{1,t}, μ_{2,t}) + ∂f_{y_i}(x_{1,t}, x_{2,t})/∂x_{2,t} |_{μ_{2,t}} × (x_{2,t} − μ_{2,t}) + v_{i,t}   (10)
If we take expectations on both sides of (9) and (10) with respect to x_{2,t-1} and x_{2,t}, respectively, we obtain
x_{1,t} ≈ f_{x_1}(x_{1,t-1}, μ_{2,t-1}) + u_{1,t}   (11)

y_{i,t} ≈ f_{y_i}(x_{1,t}, μ_{2,t}) + v_{i,t}   (12)
Thus, if the compressed information about the particles from the second subspace is given by their mean, the first PF can continue with its operation unimpeded. The second PF is in an equivalent situation: it receives from the first PF the mean of that filter's particles and proceeds by using equations analogous to (11) and (12).
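The mean-exchange scheme of (11)-(12) can be sketched as a single time step of two communicating filters sharing a scalar observation. The model functions fx1, fx2, fy and the noise levels below are placeholders standing in for (7)-(8); this is an illustration of the approximation, not the authors' implementation.

```python
import numpy as np

def two_pf_step(p1, p2, y, fx1, fx2, fy, su, sv, rng):
    """One time step of two communicating PFs for a model like (7)-(8),
    using the mean-exchange approximation (11)-(12). p1, p2: particle
    arrays of the two filters; fx1, fx2, fy: placeholder model functions;
    su, sv: noise standard deviations. Illustrative sketch only."""
    M = len(p1)
    mu1, mu2 = p1.mean(), p2.mean()    # compressed information exchanged
    # Propagation: each PF replaces the other state by its mean, as in (11)
    p1 = fx1(p1, mu2) + su * rng.standard_normal(M)
    p2 = fx2(mu1, p2) + su * rng.standard_normal(M)
    mu1, mu2 = p1.mean(), p2.mean()    # refreshed means for the update step
    # Weighting with the approximate likelihood, as in (12)
    lw1 = -0.5 * ((y - fy(p1, mu2)) / sv) ** 2
    lw2 = -0.5 * ((y - fy(mu1, p2)) / sv) ** 2
    w1 = np.exp(lw1 - lw1.max()); w1 /= w1.sum()
    w2 = np.exp(lw2 - lw2.max()); w2 /= w2.sum()
    est = (np.sum(w1 * p1), np.sum(w2 * p2))   # marginal posterior means
    # Independent resampling in each filter
    p1 = p1[rng.choice(M, size=M, p=w1)]
    p2 = p2[rng.choice(M, size=M, p=w2)]
    return p1, p2, est
```

Each filter thus works with particles of its own marginal only; the joint dependence enters solely through the exchanged means.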
Fig. 2. Top: Random measure (particles and their weights). Bottom: Approximation of the random measure by superparticles and superweights.
We can view the information that is transmitted to other PFs as "projections" of the random measures generated by the PFs in their subspaces. It is clear that projections using a single mean can be extremely crude, in particular when the particle distributions are multimodal or when the covariances of the particles are large. To improve on this approximation, one can communicate not one mean but several means, which correspond to different clusters of particles. These clusters may be interpreted as means of the various modes of the random measures. In addition to the means, the PFs also report the total weights of the clusters. In other words, we replace the original random measure with a much simpler one composed of "superparticles" and their associated "superweights." This is depicted in Fig. 2, where the top figure shows the original random measure and the bottom figure its projection with three superparticles and superweights.
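One possible way to form superparticles and superweights (the paper does not prescribe a particular clustering algorithm, so the weighted K-means pass below is our own choice, written for a one-dimensional subspace for simplicity):

```python
import numpy as np

def compress(particles, weights, K=3, iters=20):
    """Compress a random measure {particles, weights} into K
    superparticles and superweights via weighted K-means (1-D sketch)."""
    # Initialize the centers at weighted quantiles of the particle cloud
    order = np.argsort(particles)
    cdf = np.cumsum(weights[order])
    centers = np.interp((np.arange(K) + 0.5) / K, cdf, particles[order])
    for _ in range(iters):
        # Assign each particle to its nearest center
        labels = np.argmin(np.abs(particles[:, None] - centers[None, :]),
                           axis=1)
        # Move each center to the weighted mean of its cluster
        for k in range(K):
            mask = labels == k
            if weights[mask].sum() > 0:
                centers[k] = np.average(particles[mask],
                                        weights=weights[mask])
    # Superweight of a cluster = total weight of its particles
    superweights = np.array([weights[labels == k].sum() for k in range(K)])
    return centers, superweights
```

The K pairs (center, superweight) are then what the PF transmits, instead of the full random measure of M weighted particles.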
We can improve the above approximations by using second-order Taylor expansions. Following a similar line of reasoning, we get the following expressions:
x_{1,t} ≈ f_{x_1}(x_{1,t-1}, μ_{2,t-1}) + (1/2) ∂²f_{x_1}(x_{1,t-1}, x_{2,t-1})/∂x²_{2,t-1} |_{μ_{2,t-1}} σ²_{2,t-1} + u_{1,t}   (13)

y_{i,t} ≈ f_{y_i}(x_{1,t}, μ_{2,t}) + (1/2) ∂²f_{y_i}(x_{1,t}, x_{2,t})/∂x²_{2,t} |_{μ_{2,t}} σ²_{2,t} + v_{i,t}   (14)
where σ²_{2,t-1} and σ²_{2,t} are the variances of the particles from the random measure of the second PF at time instants t−1 and t, respectively. In this case, the PF communicates to the other PF not only means but also variances. If the random measure is approximated by more than one mode, then each mode is represented by its mean and variance.
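A quick numerical illustration of the benefit of the second-order term in (13)-(14), with arbitrary values of μ₂ and σ₂ chosen by us: for the quadratic nonlinearity f(x) = x², the correction (1/2) f''(μ₂) σ₂² = σ₂² makes the approximation of E[f(x₂)] exact, since E[x₂²] = μ₂² + σ₂², which a Monte Carlo estimate confirms.

```python
import numpy as np

rng = np.random.default_rng(0)

# x2 ~ N(mu2, s2^2); approximate E[f(x2)] for f(x) = x^2
mu2, s2 = 1.5, 0.7
x2 = mu2 + s2 * rng.standard_normal(200_000)

first_order = mu2 ** 2                          # mean-only projection, as in (11)-(12)
second_order = mu2 ** 2 + 0.5 * 2.0 * s2 ** 2   # adds (1/2) f''(mu2) * sigma2^2
monte_carlo = np.mean(x2 ** 2)                  # sample estimate of E[x2^2]
```

The mean-only projection is off by σ₂² (here 0.49), while the second-order approximation matches the Monte Carlo value up to sampling noise, which is why transmitting variances alongside means pays off when the coupling functions are strongly curved.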
These are only a few of many possibilities. Often, the best choice for exchanging information is problem dependent.
5. SIMULATION RESULTS
We demonstrate the idea on a simple example. We chose the state dynamic model to be a random walk

x_t = x_{t-1} + u_t   (15)
where the system state, x_t ∈ R², is defined by x_t = [x_{1,t} x_{2,t}]^T, and the state noise, u_t ∈ R², was Gaussian, i.e., u_t ∼ N(0, I₂). The observation vector y_t = [y_{1,t} y_{2,t}]^T was modeled by
y_{1,t} = x²_{1,t} + x²_{2,t} + v_{1,t}
y_{2,t} = x²_{1,t} − x²_{2,t} + v_{2,t}   (16)
where the observation noise, v_t = [v_{1,t} v_{2,t}]^T ∈ R², was independent of u_t and was also Gaussian, v_t ∼ N(0, I₂). Based on these assumptions and the observations, the objective was to estimate x_t as accurately as possible.
We make two remarks about the model. First, the posterior of x_t has four modes. In Fig. 3, we see the particles of the states clearly grouped in four different clusters and the reconstructed marginal distributions with two modes each. Second, the model belongs to the second class of models described in Section 4. We quickly see that the model can be modified so that we have two separate models for x_{1,t} and x_{2,t}, i.e.,
z_{1,t} = y_{1,t} + y_{2,t} = 2x²_{1,t} + v_{1,t} + v_{2,t}
z_{2,t} = y_{1,t} − y_{2,t} = 2x²_{2,t} + v_{1,t} − v_{2,t}   (17)
However, the noises in the two equations are built from the same noise sources v_{1,t} and v_{2,t} and have doubled variance. This new formulation allows us to run two separate PFs that do not need to communicate.
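The simulation setup of (15)-(17) can be reproduced in a few lines; the seed and code below are our own, following the model as stated.

```python
import numpy as np

rng = np.random.default_rng(4)

T = 50
x = np.zeros((T + 1, 2))    # the state starts at (0, 0)
y = np.zeros((T, 2))
for t in range(1, T + 1):
    x[t] = x[t - 1] + rng.standard_normal(2)    # random walk, eq. (15)
    v = rng.standard_normal(2)                  # observation noise
    y[t - 1, 0] = x[t, 0] ** 2 + x[t, 1] ** 2 + v[0]   # eq. (16)
    y[t - 1, 1] = x[t, 0] ** 2 - x[t, 1] ** 2 + v[1]

# Decoupled observations, eq. (17)
z1 = y[:, 0] + y[:, 1]    # = 2*x1^2 + (v1 + v2)
z2 = y[:, 0] - y[:, 1]    # = 2*x2^2 + (v1 - v2)
```

Since the observations depend on the states only through their squares, the sign of each component is unidentifiable, which produces the four posterior modes noted above.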
We designed a multiple PF composed of two PFs that exchange information about the estimates of the modes (superparticles) and the weights of the modes (superweights). For comparison and benchmarking purposes, we also implemented the standard PF and two separate PFs, which estimated x_{1,t} and x_{2,t} from z_{1,t} and z_{2,t}, respectively.
We simulated evolutions of the system for 50 time steps. The state started at (0, 0). In the implementation of the PFs, we used a total of M = 200 particles. In other words, when we implemented the multiple PF, each filter used 100 particles. We computed the mean square error (MSE) as a performance figure of merit, where
MSE_t = (1/J) Σ_{j=1}^{J} [ (|x^j_{1,t}| − |x̂^j_{1,t}|)² + (|x^j_{2,t}| − |x̂^j_{2,t}|)² ]

where x^j_{i,t}, i = 1, 2, was the true value of the states at time instant t in the j-th run, and x̂^j_{i,t}, i = 1, 2, was the corresponding estimate obtained by the filter. (The absolute values are used because the observations depend on the states only through their squares, so the signs of the states are not identifiable.) The MSE plots were obtained by averaging J = 50 independent simulations.
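The figure of merit can be written compactly; the helper below is our own rendering of MSE_t for a single time instant across J runs, with the absolute values removing the sign ambiguity.

```python
import numpy as np

def mse_t(x_true, x_hat):
    """MSE_t over J runs: x_true and x_hat have shape (J, 2), holding
    the true and estimated states [x1, x2] at a fixed time t per run."""
    d = (np.abs(x_true) - np.abs(x_hat)) ** 2   # per-run, per-component error
    return d.sum(axis=1).mean()                 # sum components, average runs
```

For example, an estimate that differs from the truth only in sign incurs zero error under this metric, consistent with the unidentifiable signs of the states.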
The results are shown in Fig. 3 on the right. The curves corresponding to the different filters are labeled as Multiple PF for the multiple PF, PF for the standard PF, and Separated PF for the two particle filters that used the observations as in (17). We see that the multiple PF outperformed the standard PF even in the two-dimensional state space [5, 7].
6. CONCLUSIONS
In this paper we presented ideas for implementing particle filtering in problems where the state spaces have high dimensions. These ideas are based on modifying the particle filter to a set of communicating particle filters. Each particle filter tracks a marginal density of the state space and communicates essential information about it to the remaining particle filters. By partitioning the state space into subspaces of much lower dimensions, we make the exploration of these subspaces with particles much easier and more efficient than that of the complete space. Some initial simulation results that demonstrate the ideas were also provided.
7. REFERENCES
[1] B. D. O. Anderson and J. B. Moore, Optimal Filtering, Prentice-Hall, New Jersey, 1979.

[2] S. Y. Auyang, Foundations of Complex System Theories, Cambridge University Press, New York, 1998.

[3] Y. Bar-Shalom and X.-R. Li, Multitarget-Multisensor Tracking: Principles and Techniques, YBS Publishing, 1995.

[4] H. J. Chang, C. S. G. Lee, Y.-H. Lu, and Y. C. Hu, "Simultaneous localization and mapping with environmental-structure prediction," IEEE Transactions on Robotics, vol. 23, pp. 281–293, 2007.

[5] P. M. Djuric, M. F. Bugallo, and J. Míguez, "Multiple particle filtering with fusion," in Proceedings of the International Society for Bayesian Analysis World Meeting (ISBA'2004), Viña del Mar, Chile, 2004.
Fig. 3. Left: An instance of the joint posterior represented by the dots (particles) and the corresponding marginal distributions (solid lines), plotted in the (x1, x2) plane. Right: MSE performance comparison of the PF, Separated PF, and Multiple PF over the 50 time steps.
[6] P. M. Djuric, J. H. Kotecha, J. Zhang, Y. Huang, T. Ghirmai, M. F. Bugallo, and J. Míguez, "Particle filtering," IEEE Signal Processing Magazine, vol. 20, no. 5, pp. 19–38, 2003.

[7] P. M. Djuric, T. Lu, and M. F. Bugallo, "Multiple particle filtering," in Proceedings of the IEEE 32nd International Conference on Acoustics, Speech and Signal Processing (ICASSP'2007), Honolulu, Hawaii, 2007.

[8] P. M. Djuric, M. Vemula, and M. F. Bugallo, "Target tracking by particle filtering in binary sensor networks," IEEE Transactions on Signal Processing, vol. 56, no. 6, pp. 2229–2238, 2008.

[9] A. Doucet, N. de Freitas, and N. Gordon, Eds., Sequential Monte Carlo Methods in Practice, Springer, New York, 2001.

[10] S. Frühwirth-Schnatter, "Data augmentation and dynamic linear models," Journal of Time Series Analysis, vol. 15, pp. 183–202, 1994.

[11] A. C. Harvey, Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press, Cambridge, UK, 1989.

[12] M. W. Hofbaur and B. C. Williams, "Hybrid estimation of complex systems," IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, vol. 34, no. 5, pp. 2178–2191, 2004.

[13] K. Ito and K. Xiong, "Gaussian filters for nonlinear filtering problems," IEEE Transactions on Automatic Control, vol. 45, no. 5, pp. 910–927, 2000.

[14] A. H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press, New York, 1970.

[15] S. Julier, J. Uhlmann, and H. F. Durrant-Whyte, "A new method for the nonlinear transformation of means and covariances in filters and estimators," IEEE Transactions on Automatic Control, vol. 45, no. 3, pp. 477–482, 2000.

[16] L. Ljung and T. Söderström, Theory and Practice of Recursive Identification, The MIT Press, Cambridge, MA, 1983.

[17] C. J. Masreliez, "Approximate non-Gaussian filtering with linear state and observation relations," IEEE Transactions on Automatic Control, vol. 20, pp. 107–110, 1975.

[18] G. Nicolis and C. Nicolis, Foundations of Complex Systems, World Scientific, London, UK, 2007.

[19] E. Ott, B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, E. Kalnay, D. J. Patil, and J. A. Yorke, "A local ensemble Kalman filter for atmospheric data assimilation," Tellus, vol. 56, no. 5, pp. 415–428, 2004.

[20] X. Sheng and Y.-H. Hu, "Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks," IEEE Transactions on Signal Processing, vol. 53, pp. 44–53, 2005.

[21] H. W. Sorenson and D. L. Alspach, "Recursive Bayesian estimation using Gaussian sums," Automatica, vol. 7, pp. 465–479, 1971.