
The Fokker Planck Equation in Stochastic Dynamics

Mrinal Kumar, Puneet Singla1

Ph.D. Students, Department of Aerospace Engineering, Texas A&M University

June, 2005

Introduction

The study of stochastic dynamic systems is mathematically a complex subject that has captured the attention of researchers for over a century. Albert Einstein was the first to formally study a stochastic process in his investigations of the Brownian motion of pollen grains suspended in a liquid. In the most general sense, a stochastic process can be defined as a family of random variables that depend on a real parameter, usually time. Alternatively, a stochastic process can be loosely described as a set of random variables 'indexed' by the real parameter t (time). This would be an exact description when talking of a discrete process, in which case the countable set of random variables may be indexed by a parameter. For continuous processes however, the set of random variables is not countable and hence not subject to indexing. In this document, we shall be concerned only with continuous stochastic processes. In the context of Mechanics, the random variable in question is the state of the dynamic system, x ∈ ℝ^N. In conventional, deterministic systems, the system state assumes a fixed value at any given instant of time. However, in stochastic dynamics it is a random variable, usually characterized by its time-parameterized probability density function (PDF), W(x, t). In essence, the study of stochastic systems reduces to finding the nature of such time-evolution of the system-state PDF. In this study, when we say stochastic, we mainly have two different types of dynamic systems in view: those with deterministic governing equations but random initial conditions (S1), and those having both random excitation and random initial conditions (S2). Type S1 is a special case of type S2.

One may classify stochastic processes on the basis of their memory, i.e. the nature of dependence of the current PDF on the PDFs at previous time instances. A zero memory process is one in which the current PDF is completely independent of the PDFs at all past time instances. In the strict sense, such a process is not possible in the real world, but it is nevertheless a useful mathematical model. White noise is a zero memory stochastic process. The most common stochastic process in the real world scenario is the Markovian process, in which the current PDF depends only on the PDF at the previous time step. The conditional probability density of a Markov process is therefore:

p_c(x_n, t_n \mid x_{n-1}, t_{n-1};\, \ldots;\, x_2, t_2;\, x_1, t_1) = p_c(x_n, t_n \mid x_{n-1}, t_{n-1}) \qquad (1)

By virtue of its role in the above equation, the right hand side is also called the transition probability density of the process. Homogeneous Markov processes are those in which the transition probability depends only on the time duration between the initial and final times.

1 © Mrinal Kumar and Puneet Singla, 2004

The transition density of a Markov process obeys the following chain rule, formally known as the Chapman-Kolmogorov equation:

p_{tr}(x_3, t_3 \mid x_1, t_1) = \int_{-\infty}^{\infty} p_{tr}(x_3, t_3 \mid x_2, t_2)\, p_{tr}(x_2, t_2 \mid x_1, t_1)\, dx_2 \qquad (2)

The above equation is very general in scope, and not of particular practical significance. For a special class of homogeneous Markov processes known as Diffusion processes, Eq.2 can be reduced to a partial differential equation in the probability density function of the random variable. This is the well known Fokker-Planck equation (also known as the forward Kolmogorov equation), given by:

\frac{\partial W(x,t)}{\partial t} = \left[ -\sum_{i=1}^{N} \frac{\partial}{\partial x_i} D_i^{(1)}(x,t) + \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{\partial^2}{\partial x_i\, \partial x_j} D_{ij}^{(2)}(x,t) \right] W(x,t) \qquad (3)

Or,

\frac{\partial W(x,t)}{\partial t} = L_{FP}(W(x,t))

where L_{FP}(·) is the so called Fokker-Planck operator. Notice that the operator is linear. In Eq.3, W(x, t) is the joint probability density function of the N-dimensional system state, x, at time t. D^{(1)} is known as the Drift Coefficient and D^{(2)} is called the Diffusion Coefficient. The governing dynamics of the corresponding stochastic (nonlinear) system is given by the following equation and initial conditions:

\dot{x} = f(x, t) + g(x, t)\, \Gamma(t), \qquad x(t_0) = ? \qquad (4)

W(x, t = t_0) = W(x, t_0) \qquad (5)

The above represents a general dynamic system of type S2, with δ-correlated random excitation Γ(t) (called the Langevin force) with the correlation function E[\Gamma_i(t_1)\Gamma_j(t_2)] = Q\, \delta_{ij}\, \delta(t_1 - t_2). The initial probability distribution is described by the density function W(x, t_0). The drift and diffusion coefficients in Eq.3 are related to the system dynamics in the following manner:

D^{(1)}(x,t) = f(x,t) + \frac{1}{2}\, \frac{\partial g(x,t)}{\partial x}\, Q\, g(x,t) \qquad (6)

D^{(2)}(x,t) = \frac{1}{2}\, g(x,t)\, Q\, g^{T}(x,t) \qquad (7)

Or, in indicial notation,

D_i^{(1)}(x,t) = f_i(x,t) + \frac{1}{2}\, \frac{\partial g_{ij}(x,t)}{\partial x_k}\, Q\, g_{kj}(x,t) \qquad (8)

D_{ij}^{(2)}(x,t) = \frac{1}{2}\, g_{ik}(x,t)\, Q\, g_{jk}(x,t) \qquad (9)
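As an aside, Eqs.6-9 translate directly into code. The following Python sketch (our own illustration, not part of the original text) computes the drift and diffusion coefficients of Eqs.8-9 for given f and g, assuming a scalar noise strength Q and approximating the Jacobian of g by central differences:

```python
import numpy as np

def drift_diffusion(f, g, Q, x, t, eps=1e-6):
    """Drift D1 (Eq.8) and diffusion D2 (Eq.9) for
    xdot = f(x,t) + g(x,t)*Gamma(t), E[Gamma_i Gamma_j] = Q*delta_ij*delta(t1-t2)."""
    x = np.asarray(x, dtype=float)
    N = x.size
    G = g(x, t)                                   # N x N noise-coupling matrix
    dG = np.zeros((N, N, N))                      # dG[i,j,k] = d g_ij / d x_k
    for k in range(N):
        h = np.zeros(N); h[k] = eps
        dG[:, :, k] = (g(x + h, t) - g(x - h, t)) / (2 * eps)
    # Eq.8: D1_i = f_i + (1/2) * (dg_ij/dx_k) * Q * g_kj   (sum over j and k)
    D1 = f(x, t) + 0.5 * Q * np.einsum('ijk,kj->i', dG, G)
    # Eq.9: D2_ij = (1/2) * g_ik * Q * g_jk
    D2 = 0.5 * Q * G @ G.T
    return D1, D2

# Scalar linear example: xdot = -2x + 3*Gamma(t), Q = 0.5
D1, D2 = drift_diffusion(lambda x, t: -2.0 * x,
                         lambda x, t: np.array([[3.0]]), 0.5, [1.0], 0.0)
# g is state-independent here, so the correction term vanishes: D1 = [-2.0], D2 = [[2.25]]
```

Note that the correction term in Eq.8 is active only when g depends on the state; for additive noise D^{(1)} reduces to f.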


It is possible to write a more general form of Eq.3; for instance, by including higher order derivatives. This leads us to the Kramers-Moyal expansion of the Fokker-Planck equation:

\frac{\partial W(x,t)}{\partial t} = \sum_{i=1}^{\infty} \left( -\frac{\partial}{\partial x} \right)^{i} \left[ D^{(i)}(x,t)\, W(x,t) \right] \qquad (10)

However, when the system dynamics follows Eq.4, with a Gaussian δ-correlated Langevin forcing term, all the coefficients D^{(i)}, i ≥ 3, vanish and Eq.3 is obtained. This simplification does not take place for discrete random variables and non-Markovian processes. For completeness, we note that the Kramers-Moyal expansion coefficients, D^{(k)}(x,t), are given by the following relation:

D^{(k)}(x,t) = \frac{1}{k!}\, \lim_{\tau \to 0} \frac{1}{\tau} \left\langle \left[ \xi(t+\tau) - x \right]^{k} \right\rangle \Big|_{\xi(t)=x} \qquad (11)
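The limit in Eq.11 can also be approached empirically: simulate many one-step increments of a process and divide the sampled moments by τ. A minimal Python sketch for a scalar Ornstein-Uhlenbeck-type model dx = −a x dt + b dW (an illustrative choice of ours, not a system from the text; the finite τ introduces an O(τ) bias):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 1.0, 0.5                     # dx = -a*x dt + b dW
x0, tau, n = 2.0, 1e-3, 400_000

# n independent Euler-Maruyama steps, all starting from x0
dW = rng.normal(0.0, np.sqrt(tau), n)
dx = -a * x0 * tau + b * dW

# Eq.11 with k = 1 and k = 2, evaluated at finite tau
D1 = dx.mean() / tau                # theory: -a*x0   = -2
D2 = (dx ** 2).mean() / (2 * tau)   # theory:  b**2/2 = 0.125 (plus O(tau) bias)
```

This moment-based estimation is exactly how drift and diffusion coefficients are identified from measured time series in practice.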

It is worth noting what has been illustrated through the above developments. We have replaced an awkward stochastic differential equation (Eq.4) with a completely deterministic partial differential equation (Eq.3) with deterministic initial conditions (Eq.5). In other words, we have circumvented the problem of solving the equation of motion of a random variable x, by instead solving the equation of motion of the probability density, W(x,t), of the random variable. Furthermore, while Eq.4 is in general nonlinear, the revised problem to solve, i.e. Eq.3, is always a linear equation. A brief derivation of this remarkable equation is in order. We shall follow the developments outlined by Ref:Fuller(1969), for general nonlinear N-dimensional systems. Considering only white noise excitation, let us write the ith equation of the governing dynamics in Eq.4 as follows:

\frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_N, t) + g_i(t)\,; \qquad i = 1, 2, \ldots, N \qquad (12)

By writing the integral form of the above state equations, it can be shown that x(t) is a Markov process (i.e. any future state of x depends only on its current state and increments of the white noise forcing term). From the discussion above on Markov processes, the Chapman-Kolmogorov equation can be applied to x:

p_{tr}(x, t+\delta t \mid x_0, t_0) = \int_{-\infty}^{\infty} p_{tr}(x, t+\delta t \mid x', t)\, p_{tr}(x', t \mid x_0, t_0)\, dx' \qquad (13)

Fuller drops the initial time t0 out of the above equation for simplicity, leading to:

p(x, t+\delta t) = \int_{-\infty}^{\infty} p_{tr}(x, t+\delta t \mid x', t)\, p(x', t)\, dx' \qquad (14)

Now, assuming the process to be homogeneous, the transition probability p_{tr}(x, t+\delta t \mid x', t) can be replaced with q(z, \delta t \mid x-z, t), where z is the transition vector x − x'. Eq.14 reduces to:

p(x, t+\delta t) = \int_{-\infty}^{\infty} q(z, \delta t \mid x-z, t)\, p(x-z, t)\, dz \qquad (15)


So far, we have made the Chapman-Kolmogorov equation specific for homogeneous systems with white noise excitation. We shall now approximate the right hand side of the above equation by its Taylor series expansion about the state at time t, i.e. x − z. Fuller combines the probability functions q and p on the right hand side in Eq.15 into a single function r as:

r(x-z, t, z, \delta t) = q(z, \delta t \mid x-z, t)\, p(x-z, t) \qquad (16)

Therefore, expanding r about its first argument, i.e. x − z, we get:

r(x-z) = r(x) - \left( z_1 \frac{\partial}{\partial x_1} + \ldots + z_N \frac{\partial}{\partial x_N} \right) r(x) + \frac{1}{2!} \left( z_1 \frac{\partial}{\partial x_1} + \ldots + z_N \frac{\partial}{\partial x_N} \right)^{2} r(x) - \ldots \qquad (17)

Hence, Eq.15 becomes:

p(x, t+\delta t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} r(x)\, dz_1 \ldots dz_N - \sum_{i=1}^{N} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} z_i\, \frac{\partial r(x)}{\partial x_i}\, dz_1 \ldots dz_N

+ \frac{1}{2!} \sum_{i=1}^{N} \sum_{j=1}^{N} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} z_i z_j\, \frac{\partial^2 r(x)}{\partial x_i\, \partial x_j}\, dz_1 \ldots dz_N - \ldots \qquad (18)

Now interchanging the order of differentiations and integrations, we obtain:

p(x, t+\delta t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} r(x)\, dz_1 \ldots dz_N - \sum_{i=1}^{N} \frac{\partial}{\partial x_i} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} z_i\, r(x)\, dz_1 \ldots dz_N

+ \frac{1}{2!} \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{\partial^2}{\partial x_i\, \partial x_j} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} z_i z_j\, r(x)\, dz_1 \ldots dz_N - \ldots \qquad (19)

We now replace r(x) in Eq.19 using Eq.16; i.e. substitute r(x, t, z, δt) = q(z, δt|x, t) p(x, t):

p(x, t+\delta t) = p(x,t) \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} q(z, \delta t \mid x, t)\, dz_1 \ldots dz_N - \sum_{i=1}^{N} \frac{\partial}{\partial x_i} \left[ p(x,t) \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} z_i\, q(z, \delta t \mid x, t)\, dz_1 \ldots dz_N \right]

+ \frac{1}{2!} \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{\partial^2}{\partial x_i\, \partial x_j} \left[ p(x,t) \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} z_i z_j\, q(z, \delta t \mid x, t)\, dz_1 \ldots dz_N \right] - \ldots \qquad (20)


Notice that p(x, t) comes outside the integration, which is over the differential increment dz, because the state x(t) is independent of the increment at t. We now identify the integrals in Eq.20 as various moments of the transition probability density function q(z, \delta t \mid x, t). The first integral is simply unity, because it is the probability mass under q. The second term is the summation of the first moments \bar{z}_i = \bar{z}_i(\delta t, x, t), the third term is the summation of the second moments \overline{z_i z_j}(\delta t, x, t), and so on. We now assume that the first two moments of z are much greater than all its higher moments, because the probability of z assuming a large value is very small (it is an increment over δt). Rearranging terms, and expanding the left hand side using a first order Taylor series expansion about t:

\frac{\partial p(x,t)}{\partial t}\, \delta t = -\sum_{i=1}^{N} \frac{\partial}{\partial x_i} \left[ p(x,t)\, \bar{z}_i(\delta t, x, t) \right] + \frac{1}{2!} \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{\partial^2}{\partial x_i\, \partial x_j} \left[ p(x,t)\, \overline{z_i z_j}(\delta t, x, t) \right] \qquad (21)

As the final step, we take the limit δt → 0, and postulate that the following limiting values are obtained:

\lim_{\delta t \to 0} \frac{\bar{z}_i(\delta t, x, t)}{\delta t} = D_i(x,t) \qquad (22)

\lim_{\delta t \to 0} \frac{\overline{z_i z_j}(\delta t, x, t)}{\delta t} = D_{ij}(x,t) \qquad (23)

We have the Kolmogorov version of the Fokker-Planck equation (equivalent to Eq.3):

\frac{\partial p(x,t)}{\partial t} = -\sum_{i=1}^{N} \frac{\partial}{\partial x_i} \left( D_i\, p(x,t) \right) + \frac{1}{2!} \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{\partial^2}{\partial x_i\, \partial x_j} \left( D_{ij}\, p(x,t) \right) \qquad (24)

Reduction of drift and diffusion coefficients to system dynamics coming soon...

For systems of type S1, the Langevin force term is zero, i.e. g(x, t) = 0, and the uncertainty in initial conditions brings about the stochastic nature of the problem. In this special case, the Fokker-Planck equation reduces to the so called Liouville equation, given by:

\frac{\partial W(x,t)}{\partial t} = -\sum_{i=1}^{N} \frac{\partial}{\partial x_i} \left[ D_i^{(1)}(x,t)\, W(x,t) \right]; \qquad W(x, t = t_0) = W(x, t_0) \qquad (25)

An example of an S1 system is the error propagation problem in celestial mechanics. In this system, the governing dynamics consists exclusively of rather 'clean' gravitational forces, which are very well understood. The source of randomness is the uncertainty in the initial state of the object, due to the limited accuracy of measurement devices.
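Such S1 problems can also be examined by brute force: draw samples from the initial density and push each one through the deterministic flow, which is the ensemble picture underlying the Liouville equation. A minimal Python sketch for the hypothetical scalar system ẋ = −x, whose flow map x(t) = x₀e^{−t} is available in closed form (our own toy example, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

# S1 system: deterministic dynamics xdot = -x, Gaussian random initial condition
mu0, sigma0, t = 1.0, 0.3, 1.0
x0 = rng.normal(mu0, sigma0, 200_000)

xt = x0 * np.exp(-t)       # exact flow map applied to every sample

mean_t = xt.mean()         # theory: mu0 * exp(-t)
std_t = xt.std()           # theory: sigma0 * exp(-t) -- pure transport, no diffusion
```

Because there is no process noise, the density is only transported and deformed by the flow; both moments simply contract by the factor e^{−t} for this linear example.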


Connections with Linear Stochastic Dynamics

Stochastic dynamics of linear systems can be rigorously extracted from the Fokker-Planck equation as a special case. Linearized propagation of system covariance finds widespread application in filtering theory and error propagation. Even for systems with nonlinear governing dynamics, it is a routine procedure to apply Gaussian closure and approximate the probability density function with a relatively small number of parameters, which are usually the elements of the mean vector and covariance matrix. Such approximation is a major drawback, especially for systems with a high degree of nonlinearity. Even for moderately nonlinear systems, these approximations break down after sufficiently long durations of propagation, because of error accumulation from the dropped second and higher order terms. This is especially relevant in error propagation into the future, because repeated measurements are not available to update the current mean and covariance predictions. An example is the problem of propagation of the state PDF of an asteroid several years into the future for the purpose of prediction of its probability of collision with Earth.

In filtering theory, the extended Kalman filter has however performed well despite these obvious shortcomings, because of the availability of measurements to correct the error resulting from linearized propagation. The reliability of such an approach is not guaranteed, and divergence issues have led to the development of alternate nonlinear techniques like the Unscented Kalman filter.

In this section, we show that the equations of propagation of the mean and covariance of linearized systems can be obtained by applying the Fokker-Planck equation to the basic definitions of these quantities. For simplicity, let us first consider a one-dimensional linear system; these results can be extended to multidimensional systems:

\dot{x} = a x + g(t)\, \Gamma(t)\,; \qquad E[\Gamma(t_1)\Gamma(t_2)] = Q\, \delta(t_2 - t_1) \qquad (26)

W(x, t_0) = \mathcal{N}(\mu_0, \sigma_0)

We have the following definitions:

\mu(t) = E[x] = \int_{-\infty}^{\infty} x\, W(x,t)\, dx \qquad (27)

\nu(t) = E[(x - \mu(t))^2] = \int_{-\infty}^{\infty} (x - \mu(t))^2\, W(x,t)\, dx \qquad (28)

In the equations to follow, it is to be understood that W = W(x, t), and for linear dynamics, W = \mathcal{N}(x;\, \mu(t), \sigma(t)). Taking the time derivative of the above relations, and using Eq.3:

\dot{\mu}(t) = \int_{-\infty}^{\infty} x\, \frac{\partial W}{\partial t}\, dx = \int_{-\infty}^{\infty} x\, L_{FP}(W)\, dx \qquad (29)

\dot{\nu}(t) = \int_{-\infty}^{\infty} -2(x-\mu)\dot{\mu}\, W\, dx + \int_{-\infty}^{\infty} (x-\mu)^2\, \frac{\partial W}{\partial t}\, dx

= \int_{-\infty}^{\infty} \left[ -2(x-\mu)\dot{\mu}\, W + (x-\mu)^2\, L_{FP}(W) \right] dx \qquad (30)


Let us expand Eq.29:

\dot{\mu}(t) = \int_{-\infty}^{\infty} x \left[ -\frac{\partial (a x W)}{\partial x} + \frac{g^2 Q}{2}\, \frac{\partial^2 W}{\partial x^2} \right] dx

= -a \int_{-\infty}^{\infty} x W\, dx - a \int_{-\infty}^{\infty} x^2\, \frac{\partial W}{\partial x}\, dx + \frac{g^2 Q}{2} \int_{-\infty}^{\infty} x\, \frac{\partial^2 W}{\partial x^2}\, dx

= -a \int_{-\infty}^{\infty} x W\, dx + \frac{a}{\sigma^2} \int_{-\infty}^{\infty} x^2 (x-\mu)\, W\, dx + \frac{g^2 Q}{2} \int_{-\infty}^{\infty} x \left[ -\frac{1}{\sigma^2} + \frac{(x-\mu)^2}{\sigma^4} \right] W\, dx

= -a\mu + \frac{a}{\sigma^2} \int_{-\infty}^{\infty} \left[ (x-\mu)^3 + 2\mu(x-\mu)^2 + \mu^2(x-\mu) \right] W\, dx + \frac{g^2 Q}{2\sigma^2} \left\{ -\mu + \frac{1}{\sigma^2} \int_{-\infty}^{\infty} \left[ (x-\mu)^3 + \mu(x-\mu)^2 \right] W\, dx \right\}

= -a\mu + \frac{a}{\sigma^2}\, 2\mu\sigma^2 + \frac{g^2 Q}{2\sigma^2} \left( -\mu + \frac{1}{\sigma^2}\, \mu\sigma^2 \right)

= a\, \mu(t)

We follow a similar development for the variance, ν (t), starting from Eq.30:

\dot{\nu}(t) = -2a\mu \int_{-\infty}^{\infty} (x-\mu)\, W\, dx - a \int_{-\infty}^{\infty} (x-\mu)^2\, W\, dx + \frac{a}{\sigma^2} \int_{-\infty}^{\infty} \left[ (x-\mu)^4 + \mu(x-\mu)^3 \right] W\, dx

+ \frac{g^2 Q}{2\sigma^2} \int_{-\infty}^{\infty} (x-\mu)^2 \left[ -1 + \frac{(x-\mu)^2}{\sigma^2} \right] W\, dx

= -a\sigma^2 + \frac{a}{\sigma^2}\, (3\sigma^4) + \frac{g^2 Q}{2\sigma^2} \left( -\sigma^2 + \frac{3\sigma^4}{\sigma^2} \right)

= 2a\, \nu(t) + g^2 Q

which is the Riccati equation for the linear scalar system.
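The pair µ̇ = aµ and ν̇ = 2aν + g²Q is easy to check numerically against the closed-form solutions. A small Python sketch (Euler integration, with illustrative parameter values of our own choosing):

```python
import numpy as np

a, g2Q = -1.0, 2.0                  # a < 0: stable; g2Q denotes g^2 * Q
mu, nu = 1.0, 0.5                   # initial mean and variance
dt, T = 1e-4, 2.0

# Euler integration of mu_dot = a*mu and nu_dot = 2*a*nu + g^2*Q
for _ in range(int(T / dt)):
    mu, nu = mu + dt * a * mu, nu + dt * (2 * a * nu + g2Q)

# Closed-form solutions for comparison
mu_exact = 1.0 * np.exp(a * T)
nu_ss = -g2Q / (2 * a)              # steady-state variance of the Riccati equation
nu_exact = (0.5 - nu_ss) * np.exp(2 * a * T) + nu_ss
```

For a < 0 the variance relaxes to the steady-state value −g²Q/(2a), a balance between dissipation and diffusion.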

Global Weak Form Formulation of the Fokker Planck Equation

Consider the 2-dimensional nonlinear dynamic system:

\ddot{x} + f(x, \dot{x}) = g(t)\, G(t) \qquad (31)

where G(t) is a zero-mean white noise process with strength Q. The corresponding Fokker-Planck equation for the probability density function is:

\frac{\partial W}{\partial t} + \dot{x}\, \frac{\partial W}{\partial x} - f\, \frac{\partial W}{\partial \dot{x}} - \frac{\partial f}{\partial \dot{x}}\, W - \frac{g^2 Q}{2}\, \frac{\partial^2 W}{\partial \dot{x}^2} = 0 \qquad (32)

We seek a solution for W(x, \dot{x}, t). Besides satisfying Eq.32, the obtained solution should also fulfill the following constraints, so that it may be a valid probability density function:


1. Positivity: W(x, t) > 0, ∀ x, t.
2. Normality: \int_{-\infty}^{\infty} W(x,t)\, dx = 1.

The second constraint can be satisfied by appropriate normalization of the obtained solution as a post-processing operation. In order to enforce the first constraint, we transform Eq.32 such that W is replaced with log(W). Letting W = e^{\beta}, we note the following transformations:

\frac{\partial W}{\partial \chi} = W\, \frac{\partial \beta}{\partial \chi} \qquad (33)

\frac{\partial^2 W}{\partial \chi\, \partial \xi} = W \left[ \frac{\partial^2 \beta}{\partial \chi\, \partial \xi} + \frac{\partial \beta}{\partial \chi}\, \frac{\partial \beta}{\partial \xi} \right] \qquad (34)

An important observation here is that in thus enforcing the positivity constraint, we end up trading a linear equation for a nonlinear one (Eq.34). Since the approach to solution in either case is numerical, we shall keep our fingers crossed. The following modified form of the Fokker-Planck equation, for the log-pdf (β), results:

\frac{\partial \beta}{\partial t} + \dot{x}\, \frac{\partial \beta}{\partial x} - f\, \frac{\partial \beta}{\partial \dot{x}} - \frac{g^2 Q}{2}\, \frac{\partial^2 \beta}{\partial \dot{x}^2} - \frac{g^2 Q}{2} \left( \frac{\partial \beta}{\partial \dot{x}} \right)^{2} = \frac{\partial f}{\partial \dot{x}} \qquad (35)

Or,

\left( \frac{\partial}{\partial t} - L_{FP}^{\log} \right) \beta = \frac{\partial f}{\partial \dot{x}} \qquad (36)
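The substitution W = e^β used in Eqs.33-35 can be verified symbolically. A small check with sympy (one-dimensional case, χ = ξ = x; our own verification, not part of the original development):

```python
import sympy as sp

x, t = sp.symbols('x t')
beta = sp.Function('beta')(x, t)
W = sp.exp(beta)                    # positivity-enforcing substitution

# Eq.33: dW/dx = W * dbeta/dx
lhs1 = sp.diff(W, x) - W * sp.diff(beta, x)

# Eq.34 with chi = xi = x: d2W/dx2 = W * (d2beta/dx2 + (dbeta/dx)**2)
lhs2 = sp.diff(W, x, 2) - W * (sp.diff(beta, x, 2) + sp.diff(beta, x) ** 2)

ok = sp.simplify(lhs1) == 0 and sp.simplify(lhs2) == 0
```

The quadratic term in Eq.34 is the source of the nonlinearity noted above: it couples first derivatives of β even though the original operator on W was linear.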

The general idea in the global weak form formulation is to approximate the solution for the (log-)pdf with a series sum of N basis functions, leading to an N-th order non-Gaussian closure:

\beta = \sum_{i=1}^{N} \gamma_i(t)\, \phi_i(x, b) = \Upsilon^{T}(t)\, \Phi(x, b) \qquad (37)

where b is a vector of time dependent parameters, usually used for desired scaling of the basis functions. The basis functions may be a set of polynomials, radial basis functions or some other suitable type. Substitution of Eq.37 into Eq.35 leads to the following expression for the residual error:

R(x,t) = \sum_{i=1}^{N} \dot{\gamma}_i(t)\, \phi_i(x,b) + \sum_{i=1}^{N} \gamma_i(t) \sum_{l=1}^{L} \frac{\partial \phi_i(x,b)}{\partial b_l}\, \frac{db_l}{dt} + \dot{x} \sum_{i=1}^{N} \gamma_i(t)\, \frac{\partial \phi_i(x,b)}{\partial x} - f \sum_{i=1}^{N} \gamma_i(t)\, \frac{\partial \phi_i(x,b)}{\partial \dot{x}}

- \frac{g^2 Q}{2} \sum_{i=1}^{N} \gamma_i(t)\, \frac{\partial^2 \phi_i(x,b)}{\partial \dot{x}^2} - \frac{g^2 Q}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \gamma_i(t)\, \gamma_j(t)\, \frac{\partial \phi_i(x,b)}{\partial \dot{x}}\, \frac{\partial \phi_j(x,b)}{\partial \dot{x}} - \frac{\partial f}{\partial \dot{x}} \qquad (38)


The residual error, R(x,t), is projected onto a space spanned by N discretely chosen independent weight functions (test functions), V_p(x, b^{*}) (p = 1 to N). This projection results in N coupled ordinary differential equations, one for each of the undetermined coefficients γ(t):

\int_{\Omega} R(x, b, t)\, V_p(x, b^{*})\, d\Omega = 0\,; \qquad p = 1, 2, \ldots, N \qquad (39)

The weight functions represent the space over which the projection of the residual error is minimized. It is therefore important to select these functions such that they completely span the regions of significance in the solution space, i.e., they give the requisite weightage to all the regions over which the solution is active. This may require some heuristics for the determination of such 'important' regions from the point of view of obtaining the solution. The approach wherein the weight functions are the same as the basis functions is called the Galerkin global weak form method.
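The projection in Eq.39 is easiest to see on a deliberately simple stand-in problem. The sketch below applies the Galerkin choice (test functions equal to basis functions) to the toy boundary value problem u'' + 1 = 0, u(0) = u(1) = 0, with a single polynomial basis function; this toy problem is our own illustration, not the Fokker-Planck equation itself:

```python
import sympy as sp

xs, c = sp.symbols('x c')
phi = xs * (1 - xs)                 # basis/test function satisfying phi(0)=phi(1)=0
u = c * phi                         # one-term trial solution

residual = sp.diff(u, xs, 2) + 1    # residual of u'' + 1 = 0
# Galerkin projection: force the residual to be orthogonal to phi
eq = sp.integrate(residual * phi, (xs, 0, 1))
c_sol = sp.solve(eq, c)[0]          # -> 1/2, so u = x(1-x)/2, the exact solution here
```

Exact recovery is special to this toy problem; in general the projected equations only make the residual orthogonal to the span of the test functions, which is why their placement matters so much in the Fokker-Planck setting.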

Systems with Polynomial Nonlinearity

For a large class of nonlinear systems, the function f(x, \dot{x}) in Eq.31 can be modelled with a polynomial function. The various nonlinear oscillators, like the Duffing oscillator, the Van der Pol oscillator, the Rayleigh oscillator etc., are of this type. A set of independent polynomials is a good choice for the basis functions for these systems. Furthermore, if the weight functions are taken to be the same as the basis functions, multiplied with a carefully chosen Gaussian pdf, it can be shown that all the integrals involved in Eq.39 can be computed rather easily. In particular, if we choose Hermite polynomials to approximate the log-pdf (Eq.37), all the integrals in Eq.39 reduce to moments of various order of the Gaussian pdf used in the weight functions. All these moments can be evaluated analytically. We have:

\beta(x, \dot{x}, t) = \sum_{i=0}^{N} \sum_{j=0}^{M} \gamma_{ij}(t)\, H_i(\bar{x})\, H_j(\bar{\dot{x}}) \qquad (40)

V_{pq}(x, \dot{x}) = H_p(\bar{x})\, H_q(\bar{\dot{x}})\, e^{-\bar{x}^2}\, e^{-\bar{\dot{x}}^2} \qquad (41)

where \bar{x} is shorthand for (x - \mu_x)/\sigma_x, and \bar{\dot{x}} for (\dot{x} - \mu_{\dot{x}})/\sigma_{\dot{x}}. The coefficients of the basis functions, \gamma_{ij}(t), are the unknowns to be solved for. The time varying parameters \mu_x(t), \sigma_x(t), \mu_{\dot{x}}(t) and \sigma_{\dot{x}}(t) are pre-defined, and they are used for the scaling of the basis polynomials.

There are several reasons for choosing the particular form of the basis and weight functions shown in Eqs.40-41. Hermite polynomials form an orthogonal basis with respect to the exponential weight over the domain (−∞, ∞). This fact gives us an immediate advantage, because (−∞, ∞) is the exact theoretical domain of the solution space along each dimension (Eq.39). Therefore, we can exploit the orthogonality properties of the Hermite polynomials while evaluating the integrals in Eq.39 analytically, over the exact solution domain. The presence of the Gaussian pdf in the weight function gives us a twofold advantage. Firstly, by wisely selecting (heuristically or otherwise) the profiles of the mean and standard deviation parameters, we can give greater weight to regions where the actual solution has its dominant presence, while theoretically integrating over the infinite domain in all the dimensions. This leads us to the second benefit: polynomial functions have a tendency to blow up as they move away from their 'center'. By giving weightage only to the regions around the center of these polynomials, we reduce the chance of divergence of the method due to ballooning integrals.
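The orthogonality being invoked can be confirmed numerically with Gauss-Hermite quadrature, which integrates polynomials against the weight e^{−x²} exactly up to high degree. A small check using numpy (physicists' Hermite polynomials; this verification is our own aside):

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval

nodes, weights = hermgauss(20)      # exact for polynomial integrands up to degree 39

def H(n, x):
    """Physicists' Hermite polynomial H_n evaluated at x."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return hermval(x, c)

# <H_i, H_j> with weight exp(-x^2) equals 2^i * i! * sqrt(pi) * delta_ij
ip_22 = np.sum(weights * H(2, nodes) * H(2, nodes))   # -> 8*sqrt(pi)
ip_23 = np.sum(weights * H(2, nodes) * H(3, nodes))   # -> 0
```

The same quadrature rule is the natural numerical fallback whenever one of the weak form integrals cannot be reduced to tabulated Gaussian moments.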

Selection of Scaling Parameters: The parameters b^{*} (used in the weight functions) play a crucial role in assigning appropriate weights to the significant regions in the infinite domain solution space. As is obvious from Eq.41, the parameters \mu_x and \mu_{\dot{x}} determine the 'center' of the Hermite polynomials (i.e. the polynomials are symmetric about a line parallel to the y-axis passing through this point). The point (\mu_x, \mu_{\dot{x}}) is also where the mean of the Gaussian pdf used in the weight function lies. The parameters \sigma_x and \sigma_{\dot{x}} define the 'region of dominance' of the polynomials, because over 95% weightage is given to the 3σ ellipsoid (in 2D) centered at (\mu_x, \mu_{\dot{x}}). The regions outside this patch are given very little significance, although they are theoretically included in the integration. Therefore, if the weight functions are located in a region where the actual solution is not present, or their 'domain of dominance' does not cover sufficient space, regions of significance will get 'weighted out' of the solution, leading to errors. Therefore it is important to know beforehand where to look for the solution (\mu_x, \mu_{\dot{x}}) and how much region to focus on (\sigma_x, \sigma_{\dot{x}}).

An immediate estimate for these parameters is available from the linear analysis of the system. The nominal trajectory starting from the mean initial conditions, propagated through the nonlinear dynamics, can be used as the profiles for \mu_x and \mu_{\dot{x}}. Similarly, the profiles for \sigma_x and \sigma_{\dot{x}} can be obtained by solving the Riccati equation for the linearized model. One could use a conservative estimate by taking an integral multiple of the linear standard deviations obtained from the Riccati equation, to give weightage to a greater domain than the simple 'linear region of dominance'. However, this method will not work in general for long durations of integration, because the propagated mean (from the initial pdf) tends to drift away from the actual mean of the propagated pdf, because of the nature of diffusion processes. Therefore, the domain of dominance of the weight functions will simultaneously shift away from the region where the actual nonlinear pdf has its presence. Moreover, the standard deviation profiles obtained from the Riccati equation can in general be very different from the actual nonlinear standard deviation profiles. Consequently, the linear standard deviations may not be able to capture the regions of significance for long durations. The degree of nonlinearity of the system will play a crucial role in determining how long the parameters obtained from linearized analysis shall work. Nevertheless, some very good solutions can be obtained using this approach, especially if the system being considered has a single isolated steady state solution.

This leads us to the next possible method for selecting the parameter profiles. If the nonlinear system being studied has a steady state probability density function, the parameter profiles can be designed such that they start from the initial pdf statistics, and evolve to finally attain the statistics of the steady state pdf. This method requires that the steady state probability density function of the system be known before the solution is attempted. This may not always be possible, even when the system does have a steady state pdf. Authors' note: A reference to Fuller's treatment of stationary solutions of FPK is highly recommended.


There is another method of linearization, called stochastic linearization, which is generally expected to give better results than simple linearization. In this approach, an equivalent linear model is designed for the actual nonlinear system by studying the response of the actual system to input signals. The parameters of the designed system are selected in a manner such that the linear model follows the output of the actual nonlinear system in an optimal sense.

An appealing idea for scaling parameter selection is to use the statistics of the probability density function obtained from the solution itself at the current time step. A conservative Gaussian surface can be constructed using the first two moments of the actual pdf at the current time step, which in turn would determine the center and domain of the weight functions for the next time step. This approach can be expected to capture the correct regions of significance for as long as the solution is sought. However, it is a numerically intensive approach, and it is difficult to obtain the time rates of change of these parameters at the current time step.

Application

In this section, we apply the method of global weak form formulation to solve the Fokker-Planck equation for a 2-D system with polynomial nonlinearity (Ref:Muscolino):

\ddot{x} + \eta \dot{x} + x + \omega (x^2 + \dot{x}^2)\, \dot{x} = g\, \Gamma(t)\,; \qquad E[\Gamma(t_1)\Gamma(t_2)] = 2\pi S_0\, \delta(t_2 - t_1) \qquad (42)

This system is known to have the following stationary probability density function:

W_s(x, \dot{x}) = k\, \exp\left\{ -\frac{1}{2\pi S_0 g^2} \left[ \eta\, (x^2 + \dot{x}^2) + \frac{\omega}{2}\, (x^2 + \dot{x}^2)^2 \right] \right\} \qquad (43)

where k is a normalizing constant. Comparing with Eq.31, we see that f(x, \dot{x}) = \eta \dot{x} + x + \omega (x^2 + \dot{x}^2)\, \dot{x}. For reasons described above, scaled and orthonormalized Hermite polynomials (see appendix) are chosen as the basis functions. The system dynamics (f(x, \dot{x}) and its derivatives) can also be represented by Hermite polynomial series sums, which will be of great use, as will be shown below. We obtain the following weak form equation:

\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left[ \frac{\partial \beta}{\partial t} + \dot{x}\, \frac{\partial \beta}{\partial x} - f\, \frac{\partial \beta}{\partial \dot{x}} - \pi S_0 g^2\, \frac{\partial^2 \beta}{\partial \dot{x}^2} - \pi S_0 g^2 \left( \frac{\partial \beta}{\partial \dot{x}} \right)^2 - \frac{\partial f}{\partial \dot{x}} \right] V_{pq}\, dx\, d\dot{x} = 0 \qquad (44)


where,

\beta = \sum_{i=0}^{N} \sum_{j=0}^{M} \gamma_{ij}(t)\, H_i(\bar{x})\, H_j(\bar{\dot{x}}) \qquad (45)

V_{pq} = H_p(\bar{x})\, H_q(\bar{\dot{x}})\, e^{-\bar{x}^2}\, e^{-\bar{\dot{x}}^2} \qquad (46)

f(x, \dot{x}) = \sum_{k=0}^{K} \sum_{l=0}^{L} \zeta_{kl}\, H_k(\bar{x})\, H_l(\bar{\dot{x}}) \qquad (47)

\frac{\partial f}{\partial \dot{x}} = \sum_{k=0}^{K} \sum_{l=0}^{L} \bar{\zeta}_{kl}\, H_k(\bar{x})\, H_l(\bar{\dot{x}}) \qquad (48)

The mean and standard deviation parameters to be used for scaling are obtained by solving the linearized system, as described above. Muscolino et al. have used a similar approach, but with stochastic linearization to obtain the relevant parameters. They have solved for the zero-mean process, i.e., \mu_x and \mu_{\dot{x}} do not appear in the scaling, and the pdf is stationed at the origin at all times, which is also the equilibrium point of the nonlinear system. This has been done for simplicity of calculation. Obviously, the issues discussed above regarding attaching appropriate weights to significant regions in the solution space are rather easy to address in this example: firstly, the center of the solution pdf does not change at any time (it is a zero-mean process), so we always know where to look for the solution; secondly, the system has a known stationary pdf, meaning that the domain of significance is easy to determine even for large times, because the final answer is known beforehand. The authors have mentioned that the precision with which the standard deviation parameters are determined plays a crucial role in the convergence of the solution for a prescribed level of accuracy. This is true in general, for the reasons described in the previous section.

For this particular example, we show in this paper that a very good solution can be obtained even with simple linearization (as opposed to stochastic linearization). The reason for such reduced sensitivity to parameter tuning is attributed to improved normalization of the basis functions, suited to the particular problem. Furthermore, the formulation shown in this paper is applicable to cases in which the process is not zero-mean.
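Since this example has a known stationary density, Eq.43 can be tabulated on a grid and used as a reference when judging a computed solution. A minimal Python sketch (illustrative parameter values of our own choosing, set so that 2πS₀g² = 1; plain Riemann-sum quadrature):

```python
import numpy as np

eta, omega, g, S0 = 1.0, 1.0, 1.0, 1.0 / (2 * np.pi)   # gives 2*pi*S0*g**2 = 1

v = np.linspace(-4.0, 4.0, 801)
X, V = np.meshgrid(v, v)                               # V plays the role of xdot
r2 = X ** 2 + V ** 2
Wu = np.exp(-(eta * r2 + 0.5 * omega * r2 ** 2))       # unnormalized stationary pdf

h = v[1] - v[0]
k = 1.0 / (Wu.sum() * h * h)                           # normalizing constant
Ws = k * Wu

mean_x = (X * Ws).sum() * h * h                        # zero by symmetry
```

The grid limits can be generous here because the quartic term in the exponent makes the tails decay extremely fast.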

We now take a closer look at each of the terms in the weak form equation (Eq.44) using Hermite polynomials:


Term 1:

\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\partial \beta}{\partial t}\, V_{pq}\, d\bar{x}\, d\bar{\dot{x}} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \sum_{i=0}^{N} \sum_{j=0}^{M} \dot{\gamma}_{ij}\, H_i(\bar{x}) H_j(\bar{\dot{x}})\, H_p(\bar{x}) H_q(\bar{\dot{x}})\, e^{-\bar{x}^2} e^{-\bar{\dot{x}}^2}\, d\bar{x}\, d\bar{\dot{x}}

+ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \sum_{i=0}^{N} \sum_{j=0}^{M} \gamma_{ij} \left[ \frac{\partial H_i(\bar{x})}{\partial \mu_x}\, \dot{\mu}_x + \frac{\partial H_i(\bar{x})}{\partial \sigma_x}\, \dot{\sigma}_x \right] H_j(\bar{\dot{x}})\, H_p(\bar{x}) H_q(\bar{\dot{x}})\, e^{-\bar{x}^2} e^{-\bar{\dot{x}}^2}\, d\bar{x}\, d\bar{\dot{x}}

+ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \sum_{i=0}^{N} \sum_{j=0}^{M} \gamma_{ij}\, H_i(\bar{x}) \left[ \frac{\partial H_j(\bar{\dot{x}})}{\partial \mu_{\dot{x}}}\, \dot{\mu}_{\dot{x}} + \frac{\partial H_j(\bar{\dot{x}})}{\partial \sigma_{\dot{x}}}\, \dot{\sigma}_{\dot{x}} \right] H_p(\bar{x}) H_q(\bar{\dot{x}})\, e^{-\bar{x}^2} e^{-\bar{\dot{x}}^2}\, d\bar{x}\, d\bar{\dot{x}} \qquad (49)

Looking at the first of these three terms, we see that the summations and integrals can be rearranged in the following manner:

\sum_{i=0}^{N} \left[ \int_{-\infty}^{\infty} H_i(\bar{x})\, H_p(\bar{x})\, e^{-\bar{x}^2}\, d\bar{x} \right] \sum_{j=0}^{M} \dot{\gamma}_{ij} \left[ \int_{-\infty}^{\infty} H_j(\bar{\dot{x}})\, H_q(\bar{\dot{x}})\, e^{-\bar{\dot{x}}^2}\, d\bar{\dot{x}} \right] \qquad (50)

Using the orthonormal properties of the Hermite polynomials, Eq.50 collapses to:

\sum_{i=0}^{N} \delta_{ip} \left[ \sum_{j=0}^{M} \dot{\gamma}_{ij}\, \delta_{jq} \right] = \dot{\gamma}_{pq} \qquad (51)

In the second and third integral expressions in Eq.49, we shall use the differentiation-recursion relations among the Hermite polynomials. For instance, note the following developments for the bracketed terms in the second integral in Eq.49:

$$\frac{\partial\bar{H}_i(x)}{\partial\mu_x}\dot{\mu}_x+\frac{\partial\bar{H}_i(x)}{\partial\sigma_x}\dot{\sigma}_x=\frac{\partial\bar{H}_i(x)}{\partial x}\frac{\partial x}{\partial\mu_x}\dot{\mu}_x+\frac{\partial\bar{H}_i(x)}{\partial x}\frac{\partial x}{\partial\sigma_x}\dot{\sigma}_x\tag{52}$$
$$=-\frac{2ik_{i-1}}{k_i\sigma_x}\left\{\dot{\mu}_x+x\dot{\sigma}_x\right\}\bar{H}_{i-1}(x)\quad\text{(using recursion)}\tag{53}$$
$$=-\frac{2ik_{i-1}}{k_i\sigma_x}\left\{k_0\dot{\mu}_x\bar{H}_0(x)+\frac{k_1\dot{\sigma}_x}{2}\bar{H}_1(x)\right\}\bar{H}_{i-1}(x)\tag{54}$$

The expression in the curly brackets of Eq. 53 has been represented in terms of Hermite polynomials in Eq. 54 so as to simplify the evaluation of integrals. The normalization factors k_i are defined in the appendix. Carrying out a rearrangement of terms similar to that for the first integral, the second integral in Eq. 49 reduces to:

$$-\sum_{i=1}^{N}\frac{2ik_{i-1}}{k_i\sigma_x}\gamma_{iq}\left\{k_0\dot{\mu}_x\delta_{i-1,p}+\frac{k_1\dot{\sigma}_x}{2}\Delta_{1,i-1,p}\right\}\tag{55}$$


The expression ∆_{αβγ} is explained in the appendix. The third integral in Eq. 49 takes exactly the same form as Eq. 55, with i and p interchanged with j and q respectively. Therefore, we have the following expression for term 1:

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\partial\beta}{\partial t}V_{pq}\,dx\,d\dot{x}=\dot{\gamma}_{pq}-\sum_{i=1}^{N}\frac{2ik_{i-1}}{k_i\sigma_x}\gamma_{iq}\left\{k_0\dot{\mu}_x\delta_{i-1,p}+\frac{k_1\dot{\sigma}_x}{2}\Delta_{1,i-1,p}\right\}$$
$$-\sum_{j=1}^{M}\frac{2jk_{j-1}}{k_j\sigma_{\dot{x}}}\gamma_{pj}\left\{k_0\dot{\mu}_{\dot{x}}\delta_{j-1,q}+\frac{k_1\dot{\sigma}_{\dot{x}}}{2}\Delta_{1,j-1,q}\right\}\tag{56}$$

The advantage of orthonormalization is now clear. We see that Eq. 56 contains the time derivative of only γ_pq. Therefore, from each of the N × M weak form equations, we obtain a differential equation for exactly one of the undetermined coefficients, without having to invert a matrix for this purpose. In general, with non-orthogonal polynomials, a set of differential equations of the type $A\dot{\gamma}+B\gamma+C\gamma^2+f=0$ would have been obtained, which would require a matrix inversion to separate out the time derivatives of the individual coefficients. No such inversion is required here. Continuing with the weak form in Eq. 44, we look at the remaining terms (after following similar steps of simplification as for term 1):
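To make the inversion-free property concrete, the following short numerical sketch (Python with numpy; the helper names are ours, not from the paper) builds the Galerkin "mass matrix" M_pi = ∫ H̄_i(x) H̄_p(x) e^{−x²} dx by Gauss–Hermite quadrature and checks that it is the identity for the orthonormalized Hermite basis, while a monomial basis produces a non-identity matrix that would have to be inverted:

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval

def hbar(n, x):
    """Orthonormalized physicists' Hermite polynomial H_n(x)/k_n (cf. Eq. 65)."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    k_n = math.sqrt(2.0**n * math.factorial(n) * math.sqrt(math.pi))
    return hermval(x, c) / k_n

# Gauss-Hermite nodes/weights: sum(w * f(x)) approximates the integral
# of f(x) * exp(-x^2) over the real line (exact for our polynomial case).
x, w = hermgauss(30)
N = 5

M_herm = np.array([[np.sum(w * hbar(i, x) * hbar(p, x)) for i in range(N)]
                   for p in range(N)])
M_mono = np.array([[np.sum(w * x**i * x**p) for i in range(N)]
                   for p in range(N)])

# Orthonormal basis: the mass matrix is the identity, so each weak-form
# equation isolates a single d(gamma_pq)/dt and no inversion is needed.
print(np.allclose(M_herm, np.eye(N)))   # True
# Monomial basis: the mass matrix is not the identity.
print(np.allclose(M_mono, np.eye(N)))   # False
```

The first check is exactly the statement of Eq. 51; the second illustrates the matrix inversion that a non-orthogonal basis would force.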

Term 2:
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\dot{x}\,\frac{\partial\beta}{\partial x}V_{pq}\,dx\,d\dot{x}=\sum_{i=1}^{N}\frac{2ik_{i-1}}{k_i\sigma_x}\left[\int_{-\infty}^{\infty}\bar{H}_{i-1}(x)\bar{H}_p(x)e^{-x^2}\,dx\right]\times\sum_{j=0}^{M}\gamma_{ij}\int_{-\infty}^{\infty}\left\{k_0\mu_{\dot{x}}\bar{H}_0(\dot{x})+\frac{k_1\sigma_{\dot{x}}}{2}\bar{H}_1(\dot{x})\right\}\bar{H}_j(\dot{x})\bar{H}_q(\dot{x})e^{-\dot{x}^2}\,d\dot{x}$$
$$=\sum_{j=0}^{M}\frac{2(p+1)k_p}{k_{p+1}\sigma_x}\gamma_{p+1,j}\left\{\mu_{\dot{x}}\delta_{jq}+\frac{k_1\sigma_{\dot{x}}}{2}\Delta_{1,j,q}\right\};\quad\text{for }p<N;\ \text{Term 2}=0\text{ for }p=N\tag{57}$$

Term 3:
$$-\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f\,\frac{\partial\beta}{\partial\dot{x}}V_{pq}\,dx\,d\dot{x}=-\sum_{i=0}^{N}\sum_{j=1}^{M}\frac{2jk_{j-1}}{k_j\sigma_{\dot{x}}}\gamma_{ij}\sum_{k=0}^{K}\left[\int_{-\infty}^{\infty}\bar{H}_i(x)\bar{H}_p(x)\bar{H}_k(x)e^{-x^2}\,dx\right]\sum_{l=0}^{L}\zeta_{kl}\int_{-\infty}^{\infty}\bar{H}_{j-1}(\dot{x})\bar{H}_q(\dot{x})\bar{H}_l(\dot{x})e^{-\dot{x}^2}\,d\dot{x}$$
$$=-\sum_{i=0}^{N}\sum_{j=1}^{M}\frac{2jk_{j-1}}{k_j\sigma_{\dot{x}}}\gamma_{ij}\sum_{k=0}^{K}\Delta_{ipk}\sum_{l=0}^{L}\zeta_{kl}\Delta_{j-1,q,l}\tag{58}$$


Term 4:
$$-\pi S_0 g^2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\partial^2\beta}{\partial\dot{x}^2}V_{pq}\,dx\,d\dot{x}=-\pi S_0 g^2\sum_{i=0}^{N}\left[\int_{-\infty}^{\infty}\bar{H}_i(x)\bar{H}_p(x)e^{-x^2}\,dx\right]\sum_{j=2}^{M}\frac{4j(j-1)k_{j-2}}{k_j\sigma_{\dot{x}}^2}\gamma_{ij}\int_{-\infty}^{\infty}\bar{H}_{j-2}(\dot{x})\bar{H}_q(\dot{x})e^{-\dot{x}^2}\,d\dot{x}$$
$$=-\pi S_0 g^2\,\frac{4(q+1)(q+2)k_q}{k_{q+2}\sigma_{\dot{x}}^2}\gamma_{p,q+2};\quad\text{for }q<M-1;\ \text{Term 4}=0\text{ otherwise}\tag{59}$$

Term 5:
$$-\pi S_0 g^2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left(\frac{\partial\beta}{\partial\dot{x}}\right)^2 V_{pq}\,dx\,d\dot{x}=-\pi S_0 g^2\sum_{i=0}^{N}\sum_{k=0}^{N}\left[\int_{-\infty}^{\infty}\bar{H}_i(x)\bar{H}_k(x)\bar{H}_p(x)e^{-x^2}\,dx\right]\sum_{j=1}^{M}\gamma_{ij}\times\sum_{l=1}^{M}\gamma_{kl}\,\frac{4jlk_{j-1}k_{l-1}}{k_jk_l\sigma_{\dot{x}}^2}\int_{-\infty}^{\infty}\bar{H}_{j-1}(\dot{x})\bar{H}_{l-1}(\dot{x})\bar{H}_q(\dot{x})e^{-\dot{x}^2}\,d\dot{x}$$
$$=-\pi S_0 g^2\sum_{i=0}^{N}\sum_{k=0}^{N}\Delta_{ikp}\sum_{j=1}^{M}\gamma_{ij}\sum_{l=1}^{M}\gamma_{kl}\,\frac{4jlk_{j-1}k_{l-1}}{k_jk_l\sigma_{\dot{x}}^2}\Delta_{j-1,l-1,q}\tag{60}$$

Term 6:
$$-\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\partial f}{\partial\dot{x}}V_{pq}\,dx\,d\dot{x}=-\sum_{k=0}^{K}\left[\int_{-\infty}^{\infty}\bar{H}_k(x)\bar{H}_p(x)e^{-x^2}\,dx\right]\sum_{l=0}^{L}\zeta_{kl}\int_{-\infty}^{\infty}\bar{H}_l(\dot{x})\bar{H}_q(\dot{x})e^{-\dot{x}^2}\,d\dot{x}$$
$$=-\sum_{k=0}^{K}\delta_{kp}\sum_{l=0}^{L}\zeta_{kl}\delta_{lq}\tag{61}$$

The final resulting differential equation for γ_pq is:

$$\dot{\gamma}_{pq}=\sum_{i=1}^{N}\frac{2ik_{i-1}}{k_i\sigma_x}\gamma_{iq}\left\{k_0\dot{\mu}_x\delta_{i-1,p}+\frac{k_1\dot{\sigma}_x}{2}\Delta_{1,i-1,p}\right\}+\sum_{j=1}^{M}\frac{2jk_{j-1}}{k_j\sigma_{\dot{x}}}\gamma_{pj}\left\{k_0\dot{\mu}_{\dot{x}}\delta_{j-1,q}+\frac{k_1\dot{\sigma}_{\dot{x}}}{2}\Delta_{1,j-1,q}\right\}$$
$$-\sum_{j=0}^{M}\frac{2(p+1)k_p}{k_{p+1}\sigma_x}\gamma_{p+1,j}\left\{\mu_{\dot{x}}\delta_{jq}+\frac{k_1\sigma_{\dot{x}}}{2}\Delta_{1,j,q}\right\}+\sum_{i=0}^{N}\sum_{j=1}^{M}\frac{2jk_{j-1}}{k_j\sigma_{\dot{x}}}\gamma_{ij}\sum_{k=0}^{K}\Delta_{ipk}\sum_{l=0}^{L}\zeta_{kl}\Delta_{j-1,q,l}$$
$$+\pi S_0 g^2\,\frac{4(q+1)(q+2)k_q}{k_{q+2}\sigma_{\dot{x}}^2}\gamma_{p,q+2}+\pi S_0 g^2\sum_{i=0}^{N}\sum_{k=0}^{N}\Delta_{ikp}\sum_{j=1}^{M}\gamma_{ij}\sum_{l=1}^{M}\gamma_{kl}\,\frac{4jlk_{j-1}k_{l-1}}{k_jk_l\sigma_{\dot{x}}^2}\Delta_{j-1,l-1,q}+\sum_{k=0}^{K}\delta_{kp}\sum_{l=0}^{L}\zeta_{kl}\delta_{lq}\tag{62}$$


The Meshless Local Weak Form Formulation

The Meshless Local Petrov-Galerkin (MLPG) method has emerged as a promising numerical technique that offers a simplified methodology compared to the conventional FEM and to other meshless techniques that require a background mesh for integration. The MLPG scheme is a truly meshless method, in which grid generation from the node distribution is not required at any stage.


Appendix: Normalized Hermite Polynomials

Standard Hermite polynomials are defined in the following manner:

$$H_n(\chi)=(-1)^n e^{\chi^2}\frac{d^n}{d\chi^n}e^{-\chi^2}\tag{63}$$

These polynomials satisfy the orthogonality property with respect to the weighting function $e^{-\chi^2}$ over the domain $]-\infty,\infty[$:

$$\int_{-\infty}^{\infty}H_\alpha(\chi)H_\beta(\chi)e^{-\chi^2}\,d\chi=2^\alpha\alpha!\sqrt{\pi}\,\delta_{\alpha\beta}\tag{64}$$

The polynomials used in the approximation in Eq. 40 have been obtained by normalizing the polynomials in Eq. 63 such that the integral in Eq. 64 gives unity when α = β. In other words, the following orthonormal polynomials have been used in Eq. 40:

$$\bar{H}_n(\chi)=\frac{H_n(\chi)}{\sqrt{2^n n!\sqrt{\pi}}}\equiv\frac{H_n}{k_n}\tag{65}$$

These orthonormalized Hermite polynomials have the following useful interrelationships:

$$\int_{-\infty}^{\infty}\bar{H}_\alpha(\chi)\bar{H}_\beta(\chi)e^{-\chi^2}\,d\chi=\delta_{\alpha\beta}\tag{66}$$
$$\int_{-\infty}^{\infty}\bar{H}_\alpha(\chi)\bar{H}_\beta(\chi)\bar{H}_\gamma(\chi)e^{-\chi^2}\,d\chi=\Delta_{\alpha\beta\gamma}\tag{67}$$
$$\frac{d\bar{H}_n(\chi)}{d\chi}=2n\,\frac{k_{n-1}}{k_n}\bar{H}_{n-1}(\chi)\tag{68}$$

where

$$\Delta_{\alpha\beta\gamma}=\begin{cases}\sqrt{\dfrac{\alpha!\,\beta!\,\gamma!}{\sqrt{\pi}}}\,\dfrac{1}{(s-\alpha)!\,(s-\beta)!\,(s-\gamma)!} & \text{if }\alpha+\beta+\gamma\text{ is even and }s\ge\max(\alpha,\beta,\gamma),\\[2ex] 0 & \text{otherwise}\end{cases}\tag{69}$$

with $s=\frac{\alpha+\beta+\gamma}{2}$, and $k_n$ as defined in Eq. 65.
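Since the triple-product coefficient ∆_{αβγ} drives most of the integrals in the Galerkin projection, it is worth sanity-checking the closed form in Eq. 69 numerically. The sketch below (Python with numpy; the function names are ours, not from the paper) compares Eq. 69 against direct Gauss–Hermite quadrature of the integral in Eq. 67 for a few index triples:

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval

def hbar(n, x):
    """Orthonormalized physicists' Hermite polynomial (Eq. 65)."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return hermval(x, c) / math.sqrt(2.0**n * math.factorial(n) * math.sqrt(math.pi))

def delta_closed(a, b, g):
    """Closed-form triple-product coefficient of Eq. 69."""
    if (a + b + g) % 2:          # s must be an integer
        return 0.0
    s = (a + b + g) // 2
    if s < max(a, b, g):
        return 0.0
    num = math.sqrt(math.factorial(a) * math.factorial(b) * math.factorial(g)
                    / math.sqrt(math.pi))
    den = math.factorial(s - a) * math.factorial(s - b) * math.factorial(s - g)
    return num / den

# Gauss-Hermite quadrature: sum(w * f(x)) ~= integral of f(x) e^{-x^2} dx.
x, w = hermgauss(40)
for a, b, g in [(0, 0, 0), (1, 1, 0), (2, 1, 1), (3, 2, 1), (2, 2, 2), (3, 1, 1)]:
    quad = np.sum(w * hbar(a, x) * hbar(b, x) * hbar(g, x))
    assert abs(quad - delta_closed(a, b, g)) < 1e-10
```

The odd-sum triple (3, 1, 1) exercises the "0 otherwise" branch, since its integrand is an odd function.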

The first few orthonormalized Hermite polynomials are:


$$\bar{H}_0(\chi)=\frac{1}{\sqrt{\sqrt{\pi}}}\tag{70}$$
$$\bar{H}_1(\chi)=\frac{2\chi}{\sqrt{2\sqrt{\pi}}}\tag{71}$$
$$\bar{H}_2(\chi)=\frac{4\chi^2-2}{\sqrt{8\sqrt{\pi}}}\tag{72}$$
$$\bar{H}_3(\chi)=\frac{8\chi^3-12\chi}{\sqrt{48\sqrt{\pi}}}\tag{73}$$
$$\bar{H}_4(\chi)=\frac{16\chi^4-48\chi^2+12}{\sqrt{384\sqrt{\pi}}}\tag{74}$$
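As a quick consistency check of these listed closed forms against the general definition in Eq. 65, one can evaluate both at a few sample points (a minimal Python sketch; nothing here is specific to the paper beyond Eqs. 65 and 70-74):

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval

rt_pi = math.sqrt(math.pi)
chi = np.array([-1.5, 0.0, 0.7, 2.0])   # arbitrary sample points

# Explicit closed forms of Eqs. 70-74.
explicit = [
    np.full_like(chi, 1.0 / math.sqrt(rt_pi)),
    2 * chi / math.sqrt(2 * rt_pi),
    (4 * chi**2 - 2) / math.sqrt(8 * rt_pi),
    (8 * chi**3 - 12 * chi) / math.sqrt(48 * rt_pi),
    (16 * chi**4 - 48 * chi**2 + 12) / math.sqrt(384 * rt_pi),
]

# General definition of Eq. 65: H_n(chi) / sqrt(2^n n! sqrt(pi)).
for n, vals in enumerate(explicit):
    c = np.zeros(n + 1)
    c[n] = 1.0
    k_n = math.sqrt(2.0**n * math.factorial(n) * rt_pi)
    assert np.allclose(vals, hermval(chi, c) / k_n)
```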

Theoretical Aspects of the FPE

Brownian Motion

In the deterministic world, the simplest expression for the dynamics of a particle immersed in a fluid is given by:

$$m\dot{v}+\alpha v=0\tag{75}$$
$$\dot{v}+\gamma v=0\tag{76}$$

where αv is the frictional force, and γ = α/m = 1/τ. So, an initial velocity v(0) decays to zero exponentially with relaxation time τ = 1/γ. The physics behind the frictional force is the collision of the particle with fluid particles, and the deterministic model Eq. 75 is a good approximation when the mass of the particle is large, so that the velocity due to thermal fluctuations is negligible. Now, from the kinetic theory of gases and the equipartition law, we have

$$\frac{1}{2}m\langle v^2\rangle=\frac{1}{2}kT\tag{77}$$

where k is the Boltzmann constant. So, the thermal velocity $v_{th}=\sqrt{\langle v^2\rangle}=\sqrt{kT/m}$ is negligible for large m. But for small particles, Eq. 75 needs to be modified to lead to the correct thermal energy. To this effect, the net force on the particle is decomposed into a continuous damping force F_c(t) and a fluctuating force F_f(t):

$$F(t)=F_c(t)+F_f(t)=-\alpha v(t)+F_f(t)\tag{78}$$

The properties of F_f(t) are given only on average due to its stochastic nature. The reason behind its stochastic nature is the following: if we were to solve the motion of the particle exactly, we would need to consider its coupling with all the fluid particles, which are of the order of 10^23. Since it is not practical to solve this enormous coupled system, we encapsulate all these coupling terms in one stochastic force, as given above, and specify its average characteristics. We get:

$$\dot{v}+\gamma v=\Gamma(t)\tag{79}$$


Some of the properties of the Langevin force Γ(t) are:

$$\langle\Gamma(t)\rangle=0\tag{80}$$
$$\langle\Gamma(t)\Gamma(t')\rangle=0\quad\forall\ |t-t'|\ge\tau_0\tag{81}$$

where τ_0 is the mean duration of a collision. Eq. 81 is reasonable because it assumes that the collisions of different molecules are independent. Furthermore, τ_0 ≪ τ (= 1/γ), hence we take the reasonable limit τ_0 → 0 to get:

$$\langle\Gamma(t)\Gamma(t')\rangle=q\,\delta(t-t')\tag{82}$$

where q is the noise strength, q = 2γkT/m.
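A simple way to see Eqs. 79-82 at work is to integrate the Langevin equation with an Euler-Maruyama scheme and verify that the stationary velocity variance reproduces the equipartition value ⟨v²⟩ = kT/m of Eq. 77. The sketch below (Python with numpy; units are chosen so that kT/m = 1, and all parameter values are illustrative assumptions) does exactly this:

```python
import numpy as np

# Euler-Maruyama integration of the Langevin equation (Eq. 79) with
# delta-correlated noise of strength q = 2*gamma*kT/m (Eq. 82).
rng = np.random.default_rng(0)
gamma = 1.0                  # friction rate gamma = 1/tau (illustrative)
kT_over_m = 1.0              # thermal energy per unit mass (nondimensional)
q = 2.0 * gamma * kT_over_m  # noise strength, Eq. 82

dt, n_steps, n_paths = 1e-3, 20_000, 2_000   # integrate over 20 relaxation times
v = np.zeros(n_paths)                        # ensemble of particle velocities
for _ in range(n_steps):
    # Over a step dt, the white noise contributes sqrt(q*dt) * N(0, 1).
    v += -gamma * v * dt + np.sqrt(q * dt) * rng.standard_normal(n_paths)

# Equipartition (Eq. 77): the stationary variance should approach kT/m.
print(np.mean(v**2))   # close to 1.0, up to a few percent of sampling error
```

The ensemble forgets its (deterministic) initial condition after a few relaxation times τ and settles into the Maxwellian stationary distribution, which is exactly the balance between dissipation (γ) and fluctuation (q) that the FPE formalizes.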
