
Computational Physics

Advanced Monte Carlo Methods

notes by E. Carlon – Academic year 2012/2013

Contents

1 Calculation of integrals by Monte Carlo

2 Monte Carlo methods for systems in equilibrium
  2.1 Markov chains
  2.2 Ergodicity
  2.3 Detailed balance
  2.4 The Metropolis algorithm
  2.5 Equilibration
  2.6 Correlation time
  2.7 Critical slowing down: the dynamical exponent
  2.8 Cluster algorithms: the Wolff algorithm for the Ising model
  2.9 Other single flip algorithms: the heat bath algorithm
  2.10 Conserved order parameter Ising model
  2.11 Monte Carlo in continuous time
  2.12 Continuous systems: Lennard-Jones fluids

3 Out of equilibrium systems
  3.1 Coupled chemical reactions
    3.1.1 Rate equations
    3.1.2 Stochastic simulations: The Gillespie algorithm

4 Problems
  4.1 Non-uniform random numbers
  4.2 Gaussian RNG
  4.3 Monte Carlo integration (I)
  4.4 Monte Carlo integration (II)
  4.5 Stochastic matrices
  4.6 Detailed balance
  4.7 Ising Model: Uniform sampling
  4.8 Metropolis algorithm for the Ising model
  4.9 The heat bath algorithm for the Potts model
  4.10 Kawasaki algorithm: The local version
  4.11 Kawasaki algorithm: The non-local version
  4.12 Hard Disks
  4.13 Monte Carlo in continuous time for a persistent random walk
  4.14 Birth-annihilation process
  4.15 Lotka-Volterra Model
  4.16 Brusselator

A The Ising and Potts models

B The Master equation


1 Calculation of integrals by Monte Carlo

The term “Monte Carlo” first appeared in an article of 1949 by Metropolis and Ulam1. These authors suggested a method to solve a class of problems in Physics and Mathematics using a statistical approach. Similar ideas, under the name of statistical sampling, had actually been used much earlier, well before the invention of computers.

A classical example is the so-called Buffon's needle problem, named after Georges-Louis Leclerc, Comte de Buffon (1707-1788), a French naturalist and mathematician. Buffon considered a needle of length l thrown at random on a plane on which long parallel lines are drawn, separated by a distance t ≥ l (see Fig. 1). Defining x as the distance of the center of the needle from the nearest line (0 ≤ x ≤ t/2) and θ as the angle formed by the needle with the axis perpendicular to the lines (−π/2 ≤ θ ≤ π/2), one can easily find the probability that the needle crosses a line:

P = \frac{2}{t\pi}\int_0^{l/2} dx \int_{-\arccos(2x/l)}^{\arccos(2x/l)} d\theta = \frac{2l}{t\pi}\int_0^1 dy\, \arccos(y) = \frac{2l}{t\pi}    (1)

Imagine now throwing the needle N times, and let N∗ be the number of times it crosses the lines. We could estimate the probability as P ≈ N∗/N. During the 19th and 20th centuries several people performed the needle experiment to obtain an estimate of the number π as

\pi \approx \frac{2l}{t}\,\frac{N}{N^*}    (2)

which follows from Eq. (1). Obviously the result becomes more and more accurate asN increases2.
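As a quick illustration, here is a minimal sketch of the needle experiment in Python (assuming NumPy; the values l = 0.8 and t = 1 are arbitrary illustration choices, not taken from the text):

```python
import numpy as np

def buffon_pi(N, l=0.8, t=1.0, seed=0):
    """Estimate pi from Buffon's needle experiment, Eq. (2); requires l <= t."""
    rng = np.random.default_rng(seed)
    # x: distance of the needle center from the nearest line, uniform in [0, t/2]
    x = rng.uniform(0.0, t / 2.0, N)
    # theta: angle with the axis perpendicular to the lines, uniform in [-pi/2, pi/2]
    theta = rng.uniform(-np.pi / 2.0, np.pi / 2.0, N)
    # the needle crosses a line when x <= (l/2) cos(theta)
    n_cross = np.count_nonzero(x <= 0.5 * l * np.cos(theta))
    return 2.0 * l * N / (t * n_cross)

print(buffon_pi(10**6))   # fluctuates around 3.14..., error ~ 1/sqrt(N)
```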

This calculation is an example of the use of statistical sampling to estimate thedouble integral appearing in Eq. (1). Usually, up to three dimensions, integrals can becalculated more efficiently using quadrature methods. For high dimensional integrals,however, the Monte Carlo method becomes much more efficient than quadratures. To

1 N. Metropolis and S. Ulam, The Monte Carlo method, J. Am. Stat. Assoc. 44, 335 (1949).
2 An interactive Java applet for the calculation of π by sampling the ratio of needles crossing the lines over the total number of needles can be found at http://mste.illinois.edu/reese/buffon/bufjava.html

Figure 1: Buffon's needle: needle a crosses a line, while needle b does not.


illustrate how Monte Carlo integration works we restrict ourselves in this section tosome simple one-dimensional examples.

Consider a function of one variable defined in the interval [0, 1]. We can estimate itsintegral by generating N random numbers x1, x2 . . .xN uniformly in [0, 1] and write3:

I = \int_0^1 f(x)\, dx \approx \frac{1}{N}\sum_{n=1}^{N} f(x_n)    (4)

If f(x) is a slowly varying function the approximant converges very rapidly to the exact value (if f(x) is a constant it is sufficient to sample it at one point). In some cases the function may be negligibly small in a (large) fraction of the integration domain. In that case a uniform random number distribution is not very efficient: most of the time random numbers are generated in a part of the integration domain where the function is small. Despite a considerable computational effort we may end up with an estimate of the integral with a large statistical error. It is then more convenient to use a random number generator which samples more often the part of the integration domain where the function f(x) has a relatively large value. This procedure is called importance sampling.

A biased random number generator in the interval [0, 1] is characterized by a density function W(x), which has the following properties: W(x) ≥ 0 in [0, 1] and

\int_0^1 W(x)\, dx = 1    (5)

W(x) dx is equal to the fraction of random numbers generated in the interval [x, x+dx]. If N random numbers are generated, the typical distance between two of them at position x is 1/(N W(x)). As a consequence

I = \int_0^1 f(x)\, dx \approx \frac{1}{N}\sum_{n=1}^{N} \frac{f(x_n)}{W(x_n)}    (6)

where xn are generated according to the distribution W (x).
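A minimal sketch of Eqs. (4) and (6) in Python (assuming NumPy; the integrand and the biased density W(x) are hypothetical, chosen only to illustrate the variance reduction):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# hypothetical integrand, strongly peaked near x = 0
f = lambda x: np.exp(-20.0 * x)

# plain uniform sampling, Eq. (4)
x_uni = rng.uniform(0.0, 1.0, N)
I_uniform = np.mean(f(x_uni))

# importance sampling, Eq. (6), with W(x) = 10 exp(-10x)/(1 - exp(-10)) on [0, 1],
# sampled by inverting its cumulative distribution
lam = 10.0
norm = 1.0 - np.exp(-lam)
W = lambda x: lam * np.exp(-lam * x) / norm
u = rng.uniform(0.0, 1.0, N)
x_imp = -np.log(1.0 - norm * u) / lam
I_importance = np.mean(f(x_imp) / W(x_imp))

exact = (1.0 - np.exp(-20.0)) / 20.0
print(I_uniform, I_importance, exact)   # both converge to 0.05, the second with smaller variance
```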

3 More generally, if the domain of integration is [a, b], the integral is approximated as

I = \int_a^b f(x)\, dx \approx \frac{b-a}{N}\sum_{n=1}^{N} f(x_n)    (3)

where xn is a random number selected uniformly from [a, b].


Example 1

Let us consider the function f(x) = e^{-x^2} g(x) in [0, +∞), where g(x) is slowly varying. For the Monte Carlo computation of its integral it is convenient to use a Gaussian random number generator, characterized by the density W(x) = \sqrt{2/\pi}\, e^{-x^2}. One has:

I = \int_0^{+\infty} f(x)\, dx \approx \frac{1}{N}\sum_{n=1}^{N} \frac{f(x_n)}{W(x_n)} = \frac{1}{N}\sqrt{\frac{\pi}{2}}\sum_{n=1}^{N} g(x_n)    (7)

Note that the domain of integration is unbounded here. The normalization factor of W, which in this example is \sqrt{2/\pi}, should not be forgotten in the calculation!

As seen in the previous example it is convenient to choose W(x) so that f(x)/W(x) becomes a slowly varying function. This statement can be made more precise by computing the variance:

\sigma_I^2 = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\left(\frac{f(x_i)}{W(x_i)} - \langle f/W\rangle\right)\left(\frac{f(x_j)}{W(x_j)} - \langle f/W\rangle\right)    (8)

Here ⟨f/W⟩ denotes the exact value of the integral, obtained from sampling with an infinite number of points. Using the fact that different samples (i ≠ j) are independent one obtains in the limit N → ∞

\sigma_I^2 = \frac{1}{N}\left[\langle (f/W)^2\rangle - \langle f/W\rangle^2\right]    (9)

Thus the error decreases as the inverse square root of the number of sampled points (σ_I ∼ 1/√N), but it can also be minimized by choosing W such that f/W is a slowly varying function, so that the term between brackets in Eq. (9) becomes small. In the optimal case one would reduce f/W to a constant, so that the term between brackets in Eq. (9) vanishes. In general, however, this is impossible: it would mean W(x) = αf(x), and since W(x) is normalized (Eq. (5)) we would have I = 1/α, i.e. we would already know the value of the integral.

The choice of W(x) is obvious for functions which are modulated by Gaussians, exponentials, etc.


2 Monte Carlo methods for systems in equilibrium

Consider a system in thermal equilibrium, in contact with a reservoir at a temperature T. The system is characterized by some discrete4 number of microscopic configurations, labeled by µ. Let Eµ be the energy of the configuration µ. According to one of the basic principles of statistical mechanics, for a system in thermodynamic equilibrium the probability of finding it in the state µ is proportional to e^{−βEµ}, where β = 1/k_BT and k_B is the Boltzmann constant.

Let Aµ now be an observable of the system. This could be the energy itself (Aµ = Eµ) or any other physical quantity which characterizes a given microstate µ. For a system in equilibrium at a temperature T the average value of the observable is given by:

\langle A\rangle = \frac{1}{Z}\sum_\mu A_\mu e^{-\beta E_\mu}    (10)

where we have introduced the partition function:

Z = \sum_\mu e^{-\beta E_\mu}    (11)

Can we use the Monte Carlo method to get an approximate, but accurate, estimate of ⟨A⟩? One possibility is to generate N configurations of our system at random. Let us denote this subset by {µ}. In analogy with what was seen in the previous section we can approximate the thermal average by

\langle A\rangle \approx \frac{\sum_{\{\mu\}} A_\mu e^{-\beta E_\mu}}{\sum_{\{\mu\}} e^{-\beta E_\mu}}    (12)

where the sum is restricted to the subset {µ} of sampled configurations. How can we select the configurations in {µ}? One possibility is to perform a random sampling. As an example consider an Ising5 square L × L lattice, containing thus L^2 spins. Random sampling consists in generating configurations by choosing each of the L^2 spins either up or down with equal probability. Each time a new configuration is generated, it is uncorrelated with the previous ones. However, this is a very inefficient strategy for the calculation of ⟨A⟩. The reason is the following. Even a "small" system, such as an Ising square lattice with L = 20, has a huge number of possible configurations. Each spin can take two values and there are L^2 spins, hence L = 20 has a configuration space of dimension 2^{400} ≈ 10^{120}. Considering that with modern computers we could sample about 10^{10} configurations within a few hours of CPU time, this remains a negligible fraction of the configuration space. An additional problem is the case in which the sum in Eq. (10) is dominated by very few states, as in the Ising model at low temperatures in the ferromagnetic phase, where the large majority of the spins point in the same direction. Random sampling most likely generates configurations where the total magnetization is small, as the spins are chosen independently from each other.

4 For the sake of simplicity we consider the discrete case, but everything we discuss can easily be generalized to continuous systems, such as an interacting gas or a liquid.

5 The Ising model is briefly introduced in Appendix A.


Following what we have learnt in the previous section we should generate configurations of the system non-uniformly. If a given state µ is generated with a probability Wµ then Eq. (10) takes the form

\langle A\rangle \approx \frac{\sum_{\{\mu\}} A_\mu e^{-\beta E_\mu} W_\mu^{-1}}{\sum_{\{\mu\}} e^{-\beta E_\mu} W_\mu^{-1}}    (13)

which is analogous to Eq. (6). The most natural choice is W_\mu = e^{-\beta E_\mu}/Z, which yields:

\langle A\rangle \approx \frac{\sum_{\{\mu\}} A_\mu}{\sum_{\{\mu\}} 1} = \frac{1}{N}\sum_{\{\mu\}} A_\mu    (14)

where N is the number of terms in the sum of Eq. (14). How is it possible to generate states according to the Boltzmann factor exp(−βEµ)?

This is explained in the following sections. First we have to introduce the concept ofMarkov chain.

2.1 Markov chains

A Markov chain is a discrete stochastic process which, starting from a configuration µ at time t, generates a new configuration at time t + ∆t:

µ→ γ → ν . . . (15)

Each transition depends only on the current state of the system and not on the previous history (Markov property). The evolution is stochastic, i.e. it is governed by transition probabilities P(µ → ν) ≥ 0. The following normalization condition holds:

\sum_\nu P(\mu\to\nu) = 1    (16)

The system at a given time is characterized by the probabilities pµ of being in the stateµ. We can write the probability of finding the system at a later time in a state ν as:

p_\nu(t+\Delta t) = \sum_\mu p_\mu(t)\, P(\mu\to\nu)    (17)

If the system has N states, we can arrange the pµ’s so to form a vector of dimensionN and rewrite Eq. (17) in the form of a vector-matrix product

p(t+ ∆t) = P p(t) (18)

where the matrix P has elements Pµν = P (ν → µ). Equation (18) can be rewrittenalso in the form of a differential equation in the limit ∆t → 0, as shown in AppendixB. In that form it is usually referred to as the Master equation.

Note that the condition (16) implies that the sum of each column of the matrix is equal to 1. We will call p a stochastic vector if 1) it has non-negative elements and 2) \sum_\mu p_\mu = 1. A square matrix with non-negative elements and such that the sum of each column is equal to 1 is called a stochastic matrix. Note that in general a stochastic matrix is not symmetric, i.e. P(µ → ν) ≠ P(ν → µ).

The following proposition will be proven during the exercise session.

Proposition
If p is a stochastic vector and P a stochastic matrix, then Pp is a stochastic vector. P has at least one eigenvalue λ = 1. All other eigenvalues satisfy |λk| ≤ 1.

Example 2
Consider a system composed of 3 states µ, ν and γ. The stochastic dynamics is defined by the probabilities P(µ → ν) = a, P(ν → γ) = P(γ → ν) = b (a, b < 1). All other probabilities, for the processes µ → γ, γ → µ and ν → µ, are zero. The associated stochastic matrix is

P = \begin{pmatrix} 1-a & 0 & 0 \\ a & 1-b & b \\ 0 & b & 1-b \end{pmatrix}    (19)

The eigenvalues are λ1 = 1, λ2 = 1 − a and λ3 = 1 − 2b. Note that, in agreement with the previous proposition, one eigenvalue is equal to 1, while the other two satisfy |λ2|, |λ3| < 1.
Particularly interesting is the eigenvector associated to λ1 = 1, which is given by ω = (0, 1/2, 1/2). This is a stochastic vector.

We are interested in the eigenvector ω associated to the eigenvalue 1 of the stochastic matrix, since this eigenvector describes a stationary state of the system: the probability of finding the system in any of the possible configurations does not evolve with time, as follows from Eq. (18),

ω(t+ ∆t) = Pω(t) = ω(t) (20)

Example 3
In Example 2 we found that ω = (0, 1/2, 1/2) is the stationary state. This means that, given an initial state, the system will evolve towards a stationary state in which it spends half of the time in ν and half in γ. In this example ωµ = 0, i.e. we have probability zero of finding the system in the state µ. Can you understand why?
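The eigenvalues and the stationary vector of Example 2 can be checked numerically; a minimal sketch with NumPy, using arbitrarily chosen values of a and b:

```python
import numpy as np

a, b = 0.4, 0.25                       # arbitrary values with a, b < 1
P = np.array([[1 - a, 0.0,   0.0  ],
              [a,     1 - b, b    ],
              [0.0,   b,     1 - b]])

# every column of a stochastic matrix sums to one
assert np.allclose(P.sum(axis=0), 1.0)

eigvals, eigvecs = np.linalg.eig(P)
print(np.sort(eigvals.real))           # 1 - 2b, 1 - a, 1  ->  0.5, 0.6, 1.0

# stationary state: eigenvector of eigenvalue 1, normalized to a stochastic vector
k = np.argmin(np.abs(eigvals - 1.0))
omega = eigvecs[:, k].real
omega /= omega.sum()
print(omega)                           # (0, 0.5, 0.5), as in Example 2
```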


2.2 Ergodicity

An important property which we wish our system to have is ergodicity. Ergodicity means that any state of the system should be reachable from any other state. In fact many of the transition rates from and to a given state may be zero, but given two arbitrary states µ and ν there should exist at least one path of transitions with non-zero rates connecting them

µ→ γ → δ → ρ→ . . .→ ν (21)

Example 4
The stochastic process given in Example 2 is non-ergodic. Take γ and µ. There is no way that the defined dynamics connects them: as P(γ → µ) = P(ν → µ) = 0, it is not possible to reach the state µ from any other state of the system.

2.3 Detailed balance

We consider now the "inverse" problem. We would like to generate a stochastic dynamics which has a given vector ω as its stationary state. Particularly interesting, in view of what was discussed before, is the case of convergence to equilibrium, ωµ = exp(−βEµ)/Z. A given stochastic vector ω is stationary if it is an eigenvector of the generator of the dynamics P with eigenvalue equal to 1 (Pω = ω), which means

\omega_\nu = \sum_\mu \omega_\mu P(\mu\to\nu)    (22)

Using Eq. (16) we can rewrite this condition as

\sum_\mu \omega_\nu P(\nu\to\mu) = \sum_\mu \omega_\mu P(\mu\to\nu)    (23)

We refer to Eq. (23) as the global balance condition. The interpretation is that ω is stationary if the total probability flow from a configuration ν to any configuration of the system (\sum_\mu \omega_\nu P(\nu\to\mu)) is equal to the flow from any other configuration to ν (\sum_\mu \omega_\mu P(\mu\to\nu)).
Note that if the probabilities satisfy the so-called detailed balance condition

ωνP (ν → µ) = ωµP (µ→ ν) (24)

then Eq. (23) necessarily holds. The opposite is not true: Eq. (23) may be valid while Eq. (24) is violated (see the example in the exercise session). Detailed balance is a condition which is relatively simple to implement in practice, as can be seen in the following example.


Example 5
Consider a system with three states. We wish to construct a stochastic matrix which has ω = (1/6)(1, 4, 1) as its stationary state. We can do this by making use of the detailed balance condition (Eq. (24)) and write:

\frac{P(1\to 2)}{P(2\to 1)} = \frac{\omega_2}{\omega_1} = 4, \qquad \frac{P(3\to 2)}{P(2\to 3)} = \frac{\omega_2}{\omega_3} = 4, \qquad \frac{P(1\to 3)}{P(3\to 1)} = \frac{\omega_3}{\omega_1} = 1    (25)

The detailed balance condition does not impose any constraint on the probabilities of remaining in the same state, P(µ → µ). We need anyhow to have a stochastic matrix, so the elements of each column must add up to one. One possible solution is

P = \frac{1}{5}\begin{pmatrix} 0 & 1 & 1 \\ 4 & 3 & 4 \\ 1 & 1 & 0 \end{pmatrix}    (26)

You can verify explicitly that Pω = ω, so ω is indeed an eigenvector of P with eigenvalue equal to 1.
Can you find other examples of matrices having the above ω as stationary state?

If we wish our dynamics to converge to the Boltzmann distribution ωµ = exp(−βEµ)/Z, the detailed balance condition becomes

\frac{P(\nu\to\mu)}{P(\mu\to\nu)} = e^{-\beta(E_\mu - E_\nu)}    (27)

As this condition only fixes ratios of transition probabilities, there is quite some freedom in the specific choice of the P(ν → µ) satisfying detailed balance.

2.4 The Metropolis algorithm

Ergodicity and detailed balance are the two important conditions that our stochastic Markov chain has to fulfill in order to converge to thermodynamic equilibrium. The rates must also fulfill the normalization condition given in Eq. (16). This normalization includes the rate P(µ → µ), which is associated to the absence of a transition. When setting ν = µ in Eq. (24) we find a trivial identity, i.e. the detailed balance condition is always satisfied for any choice of P(µ → µ). Hence we can set it to any value we wish.

As the purpose is to have quick convergence to equilibrium, the optimal choice is to take P(µ → ν) as large as possible, while remaining consistent with Eqs. (24) and (16). In practice we can split the rates as follows:

P (µ→ ν) = g(µ→ ν)A(µ→ ν) (28)

10

Page 11: Monte Carlo 2014Course

where g(µ → ν) is called selection probability and A(µ → ν) the acceptance ratio.The selection probability tells us which states can be generated by the algorithm froma given initial state. The acceptance ratio is the fraction of times that the actualtransition takes place.

Example 6
The simplest example of a selection probability is that for an Ising model with single spin flip update. In this case a new state is generated by selecting a spin at random out of a total of M spins in the lattice. This means that the selection probability is g(µ → ν) = 1/M if the two states differ by a single spin, while g(µ → ν) = 0 otherwise.

The Metropolis algorithm consists in the choice of the following acceptance ratio:

A(\mu\to\nu) = \begin{cases} e^{-\beta(E_\nu - E_\mu)} & \text{if } E_\nu > E_\mu \\ 1 & \text{otherwise} \end{cases}    (29)

Hence, if an update selected according to the rule defined by g(µ → ν) leads to a lowering of the energy, it is always accepted. Otherwise it is accepted with a probability equal to the Boltzmann factor associated to the energy difference between the final and initial states.

For any symmetric choice of selection probability (g(µ → ν) = g(ν → µ)), it iseasy to show that the Metropolis acceptance ratio (Eq. (29)) satisfies detailed balance(Eq. (24)). The proof of this claim is left to the reader.
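A minimal sketch of the single spin flip Metropolis algorithm for the 2d Ising model (Python with NumPy; periodic boundary conditions assumed, lattice size, temperature and number of sweeps are arbitrary illustration values):

```python
import numpy as np

def metropolis_sweep(s, beta, J=1.0, rng=np.random.default_rng()):
    """One sweep (L*L attempted single spin flips) of the Metropolis algorithm
    for the 2d Ising model with periodic boundary conditions.
    s is an L x L array of spins +/-1, modified in place."""
    L = s.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)          # selection probability g = 1/M
        nn = (s[(i + 1) % L, j] + s[(i - 1) % L, j] +
              s[i, (j + 1) % L] + s[i, (j - 1) % L])
        dE = 2.0 * J * s[i, j] * nn                # energy change of the proposed flip
        # Metropolis acceptance ratio, Eq. (29)
        if dE <= 0.0 or rng.random() < np.exp(-beta * dE):
            s[i, j] = -s[i, j]

rng = np.random.default_rng(1)
L, T = 20, 2.0                                     # illustration values
s = rng.choice(np.array([-1, 1]), size=(L, L))
for sweep in range(500):
    metropolis_sweep(s, beta=1.0 / T, rng=rng)
print(abs(s.mean()))                               # |magnetization per spin|
```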

2.5 Equilibration

The ergodicity and detailed balance conditions imply that the dynamics will converge to the equilibrium state. Given that the initial state is usually chosen at random, it may take a while before the equilibrium distribution is reached. Figure 2 shows snapshots of configurations of the Ising model at different simulation times. The top left figure is the random initial configuration. As the temperature is below the critical one, the spins tend to align (ferromagnetic phase). During the simulation, larger and larger domains of aligned spins are formed in the course of time, until these domains reach a characteristic size which no longer changes in the course of the simulation.

To determine the equilibration time τeq in a simulation, one can compute some physical quantity (like the total magnetization of an Ising lattice) and plot it as a function of time. When it starts to fluctuate around a constant value, thermal equilibrium has been reached. Figure 3 shows a plot of the magnetization per spin of an Ising system versus the simulation time. As can be seen, the magnetization reaches its average value after about t = 3000 Monte Carlo steps per lattice site6.

6 Time in a simulation is usually measured in terms of attempted flips per lattice site. This is done to have a time unit which allows the comparison of simulations of lattices of different sizes.


Figure 2: Configurations of a 100 × 100 Ising lattice at T = 0.88Tc obtained from the Metropolis algorithm at different time steps. Top left: initial configuration, Bottom left: 10^4 Monte Carlo steps, Top right: 10^5 MC steps, Bottom right: 10^6 MC steps. If we measure the time in sweeps, i.e. MC steps per spin, the times shown correspond to 1, 10 and 100 sweeps.

2.6 Correlation time

A basic requirement for an efficient simulation is that our Markov chain goes through a sufficiently large series of independent states, so as to make the averages in Eq. (14) sufficiently robust. In the Ising model with the Metropolis algorithm discussed above at most one spin is flipped at each time step. The Ising magnetization for a lattice with N spins

m = \frac{1}{N}\sum_i s_i    (30)

changes by an amount O(1/N) at each Monte Carlo step. The consequence is that the new configuration is strongly correlated with the previous one, and therefore many steps are needed before this correlation is lost.

To quantify this we can measure the correlation time. Let us suppose that we have let the simulation run long enough to have reached equilibrium. All the physical quantities fluctuate around their average equilibrium values. Given, for instance, the magnetization m(t) of the Ising model at time t, an interesting quantity


Figure 3: Magnetization of a 100 × 100 Ising model in the ferromagnetic phase as a function of the simulation time. In the main frame two different random initial configurations were simulated at T = 0.98Tc. After about 3000 Monte Carlo sweeps the two curves converge to the same average value. The solid line is the exact value of the magnetization of the 2d Ising model. Inset: a similar run for a lower temperature T = 0.8333Tc. Note that the equilibrium magnetization is reached faster (τeq ≈ 1000) compared to the T = 0.98Tc case. Fluctuations around the average magnetization are also much smaller.


is the autocorrelation function7:

\chi(t) = \int dt'\, [m(t') - \langle m\rangle][m(t'+t) - \langle m\rangle]    (31)

where ⟨m⟩ is the average magnetization, i.e. the average over all values measured during the Monte Carlo run. Consider a very small value of t in Eq. (31); in this case m(t'+t) ≈ m(t') and the integrand has a positive value, hence χ(t) > 0 for t sufficiently small. For sufficiently long times χ(t) is expected to decay to zero. This is because m(t'+t) is expected to become uncorrelated with m(t'), so if m(t') > ⟨m⟩ we expect equally likely a fluctuation of m above or below its average value at time t'+t. When integrated over dt' the positive and negative contributions cancel and the integral in Eq. (31) vanishes. Indeed it does so with a leading asymptotic behavior of the type:

χ(t) ∼ e−t/τ (32)

This exponential behavior can be justified on the basis of the dynamics described by the Markov chain, as in Eq. (17). Consider a vector p(0) describing the initial configuration at time t = 0. We can write it as a linear combination of eigenvectors of P as p(0) = \sum_k c_k v^{(k)}. The state vector at a later time t = n∆t is obtained by multiplication with P^n:

p(t) = P^n p(0) = \sum_k c_k \lambda_k^n v^{(k)}    (33)

where λk are the associated eigenvalues. As we know the eigenvalue associated to thestationary state v(1) is λ1 = 1.

Given an observable m (as the magnetization in the Ising model), its average overa state given by the vector p can be written as

m = \sum_\mu m_\mu p_\mu = M \cdot p    (34)

where mµ is the value of the observable in the state µ, which we formally wrote as ascalar product with M . Now using (33) we find

m(t) = M\cdot v^{(1)} + \sum_{k>1} c_k \lambda_k^n\, M\cdot v^{(k)} = \langle m\rangle + \sum_{k>1} c_k \lambda_k^n M_k    (35)

where we have used the fact that the magnetization in the stationary state equals 〈m〉and we defined Mk = M · v(k). Recalling that n = t/∆t the previous relation identifiesthe relaxation times

\tau_k = -\frac{\Delta t}{\log|\lambda_k|}    (36)

from which it follows that

m(t) - \langle m\rangle = \sum_{k>1} c_k M_k\, e^{-t/\tau_k}    (37)

7 Although we discuss here the Ising model and the properties of the magnetization, Eq. (31) is more general and can be defined for any physical quantity.


Using this expression we find for the autocorrelation function

\chi(t) = A\, e^{-t/\tau_2} + B\, e^{-t/\tau_3} + \ldots    (38)

which, in its leading behavior, reduces to the exponential decay anticipated in Eq. (32). The form given in Eq. (38) may be a more appropriate fitting form for typical simulation results.

Example 7
To complete the calculation of Example 2 we add the two other eigenvectors, v^{(2)} = (a − 2b, b − a, b) (associated to λ2 = 1 − a) and v^{(3)} = (0, 1, −1) (associated to λ3 = 1 − 2b). The stationary state was v^{(1)} = (0, 1/2, 1/2). Note that the sum of the elements of the vectors v^{(2)} and v^{(3)} is zero. This is a general property of any stochastic matrix, can you prove it?
Suppose now that the system is initially in p = (1/3, 1/3, 1/3). The linear combination is

p = \sum_{k=1}^{3} c_k v^{(k)} = v^{(1)} + \frac{1}{3(a-2b)}\, v^{(2)} + \frac{a}{6(a-2b)}\, v^{(3)}    (39)

The fact that c_1 = 1 is a general property of any stochastic matrix. Can you prove it?

In principle, in the determination of averages as in Eq. (14) one should use only data taken at time intervals larger than the autocorrelation time τ. In practice this is not always possible, as it would considerably reduce the number of data points. A good practice is to use data taken at time intervals ∆T; the correlation then implies that the error estimates of the data averages depend on the correlation time τ. For uncorrelated data, of say magnetizations, the standard deviation of the mean of n measured points is given by8

\sigma = \sqrt{\frac{1}{n-1}\left(\overline{m^2} - \overline{m}^2\right)}    (40)

It can be shown9 that for measurements taken at intervals ∆T one has

\sigma = \sqrt{\frac{1 + 2\tau/\Delta T}{n-1}\left(\overline{m^2} - \overline{m}^2\right)}    (41)

which reduces to Eq. (40) in the limit ∆T ≫ τ.
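A minimal sketch of how χ(t) of Eq. (31) and the corrected error of Eq. (41) could be estimated from a recorded time series (Python with NumPy; the direct estimator and the 1/e criterion for reading off τ are common choices, not prescriptions from the text):

```python
import numpy as np

def autocorrelation(m, tmax):
    """Direct estimate of the normalized autocorrelation chi(t)/chi(0), Eq. (31),
    from an equilibrated time series m."""
    m = np.asarray(m, dtype=float) - np.mean(m)
    chi = np.array([np.mean(m[: len(m) - t] * m[t:]) for t in range(tmax)])
    return chi / chi[0]

def error_on_mean(m, tau, dT=1.0):
    """Standard deviation of the mean including the correlation correction, Eq. (41)."""
    m = np.asarray(m, dtype=float)
    n = len(m)
    return np.sqrt((1.0 + 2.0 * tau / dT) / (n - 1) *
                   (np.mean(m**2) - np.mean(m)**2))

# usage sketch: tau can be read off from where chi(t) has dropped to 1/e
# chi = autocorrelation(mag_series, tmax=500)
# tau = np.argmax(chi < np.exp(-1.0))
# err = error_on_mean(mag_series, tau)
```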

8For more details see: M. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics,Oxford University Press (1999).

9Muller-Krumbhaar and Binder, J. Stat. Phys. 8, 1 (1973)


2.7 Critical slowing down: the dynamical exponent

In the vicinity of the critical point of a system in equilibrium, various quantities are characterized by power-law singularities governed by critical exponents. For instance, in the Ising model the magnetization vanishes on approaching the critical temperature from below as:

m \sim \left(\frac{T_c - T}{T_c}\right)^{\beta}    (42)

The specific heat is also diverging as

c \sim \left|\frac{T - T_c}{T_c}\right|^{-\alpha}    (43)

Two spins at a distance ~r are correlated for temperatures above and below Tc as follows:

\langle s_{\vec\delta}\, s_{\vec\delta+\vec r}\rangle - \langle s_{\vec\delta}\rangle\langle s_{\vec\delta+\vec r}\rangle \sim e^{-|\vec r|/\xi}    (44)

where ξ is the correlation length. By approaching Tc this quantity diverges as:

\xi \sim \left|\frac{T - T_c}{T_c}\right|^{-\nu}    (45)

The values of α, β, ν and γ (the latter for the magnetic susceptibility) are universal, i.e. they depend only on the dimensionality and not, for instance, on the type of lattice considered (triangular, square, . . . ). In the 2d Ising model one has α = 0 (which corresponds to a logarithmic singularity), β = 1/8, ν = 1 and γ = 7/4.

The correlation length ξ is the typical size of the clusters of spins pointing in the same direction (see Fig. 4). As these regions grow in size in the vicinity of Tc, the Metropolis Monte Carlo algorithm suffers from the so-called critical slowing down: it becomes increasingly difficult to flip a spin at random, as it is likely to be coupled to neighboring spins pointing in the same direction. Hence the correlation time also diverges when approaching the critical point, as follows:

\tau \sim \left|\frac{T - T_c}{T_c}\right|^{-z\nu}    (46)

a relation which defines a new exponent z called the dynamical exponent. This exponentis algorithm dependent. It tells us how fast/slow the correlation time grows in thevicinity of the critical point. An algorithm with optimal performance near Tc wouldhave a small value of z.

In order to calculate z from simulation one usually sets the temperature to thecritical point T = Tc (note Tc is known exactly for the 2d Ising model). We note thatin a finite system of size L, the correlation length cannot diverge but is bounded bythe system size i.e. ξ ∼ L. Using this result and combining Eq. (45) with Eq. (46) wefind

\tau \sim \xi^z \sim L^z    (47)


Figure 4: Typical spin configurations (top) and magnetization vs. time plots (bottom)for the Metropolis algorithm for the Ising model. Left: T = 2.5, Right: T = 5. Spinsarrange in clusters which are bigger close to the critical temperature Tc ≈ 2.26. Notealso an increase of the autocorrelation time in the vicinity of Tc.

Numerical simulations of the Metropolis algorithm for the 2d Ising model give z ≈ 2.17. Note that the critical slowing down effect is somewhat visible also in the run shown in Fig. 3: fluctuations around the average magnetization value are correlated over longer times at T = 0.98Tc compared with the case T = 0.8333Tc.

2.8 Cluster algorithms: the Wolff algorithm for the Ising model

We present here an algorithm, introduced by U. Wolff in 198910, which basically solves the problem of critical slowing down. Differently from the Metropolis case seen above, the Wolff algorithm is a cluster algorithm, i.e. a whole cluster of spins is flipped in each move instead of a single one. There are other cluster algorithms, which however will not be discussed here. For more details see the book by Newman and Barkema (see footnote 8).

In the Wolff algorithm a cluster is constructed as follows: take a seed spin at random and look at its neighboring spins. Those with the same sign as the seed spin are added to the cluster with a probability Padd, while they are excluded from the cluster with probability 1 − Padd (spins opposite to the seed spin are ignored). The same rule is then applied to the neighbors of every newly added spin, until no new spins are added. Our main preoccupation is the fulfillment of the detailed balance condition. Let us

10U. Wolff, Phys. Rev. Lett. 62, 361 (1989).



Figure 5: Cluster flipping in the Wolff algorithm. A cluster is composed of a connected set of identical spins; its boundary is indicated as a dashed line in the example. The spins encircled in both configurations are not included in the cluster.

consider two states µ and ν which differ by the flip of a single cluster. Apart from common prefactors we have for the selection probabilities g(µ → ν) ∝ (1 − Padd)^m and g(ν → µ) ∝ (1 − Padd)^n, where m and n count the number of bonds between a spin in the cluster and an equal spin outside the cluster (in the example of Fig. 5 one has m = 1 and n = 11).

The detailed balance condition reads:

\frac{P(\nu\to\mu)}{P(\mu\to\nu)} = \frac{g(\nu\to\mu)\, A(\nu\to\mu)}{g(\mu\to\nu)\, A(\mu\to\nu)} = (1-P_{\rm add})^{n-m}\, \frac{A(\nu\to\mu)}{A(\mu\to\nu)} = e^{-\beta(E_\mu - E_\nu)}    (48)

For the Ising model the energy difference is Eµ − Eν = 2J(n − m), hence one can rewrite the detailed balance condition as

\frac{A(\nu\to\mu)}{A(\mu\to\nu)} = \left[(1-P_{\rm add})\, e^{2\beta J}\right]^{m-n}    (49)

Now if we choose Padd as follows:

Padd = 1− e−2βJ (50)

then the right hand side of Eq. (49) becomes identically equal to 1. In this case we canchoose the acceptance ratios always equal to one for any transition A(µ→ ν) = 1, i.e.the cluster flipping is always accepted. Note that Eq. (50) implies that spins are addedto the cluster with a probability that is temperature dependent. At high temperaturesβ → 0 one has Padd → 0 i.e. clusters are very small, comprising typically only theseed spin. In this limit the Wolff algorithm becomes a single spin flipping algorithm.In the limit of very low temperatures β → ∞ one has Padd → 1. Clusters become aslarge as the whole lattice. At each time step the system flips almost all spins swappingbetween states with positive and negative magnetizations. Both in the low and hightemperature phases the Wolff algorithm is not very efficient and the simple Metropolisalgorithm works better.

A final remark concerning the Wolff algorithm is the unit of time. One could takea cluster flip τflip as a unit of time, but since the size of the Wolff cluster is different


Figure 6: Acceptance ratios as functions of β∆E, where β is the inverse temperatureand ∆E the energy difference, for the Metropolis (solid line) and Heat Bath algorithms(dashed line).

at low and high temperatures, this is not a good time unit. In a d-dimensional lattice model of linear size L, which then contains L^d spins, a more appropriate time unit is

\tau = \tau_{\rm flip}\, \frac{\langle n\rangle}{L^d}    (51)

where 〈n〉 is the average number of spins in the cluster. This time unit is directlyrelated to that of the Metropolis algorithm. For instance, at low temperatures theclusters have the whole lattice size, i.e. 〈n〉 → Ld. Hence τ ≈ τflip, as a single clusterflip corresponds to a sweep of the whole lattice. When the time unit of Eq. (51) isused one finds that the correlation time at the critical point scales as a power-law asa function of the lattice size, as in Eq. (47). For the two-dimensional Ising model onefinds z ≈ 0.25. This shows that the Wolff algorithm solves the problem of the criticalslowing down.
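A minimal sketch of a single Wolff cluster move for the 2d Ising model (Python with NumPy, periodic boundaries assumed), growing the cluster with the Padd of Eq. (50) and always accepting the flip:

```python
import numpy as np

def wolff_step(s, beta, J=1.0, rng=np.random.default_rng()):
    """Grow and flip a single Wolff cluster on an L x L Ising lattice with
    periodic boundaries; returns the cluster size."""
    L = s.shape[0]
    p_add = 1.0 - np.exp(-2.0 * beta * J)          # Eq. (50)
    i, j = rng.integers(0, L, size=2)              # random seed spin
    seed = s[i, j]
    cluster = {(i, j)}
    stack = [(i, j)]
    while stack:                                   # repeat for every newly added spin
        x, y = stack.pop()
        neighbours = (((x + 1) % L, y), ((x - 1) % L, y),
                      (x, (y + 1) % L), (x, (y - 1) % L))
        for site in neighbours:
            # parallel neighbours join the cluster with probability p_add
            if s[site] == seed and site not in cluster and rng.random() < p_add:
                cluster.add(site)
                stack.append(site)
    for site in cluster:                           # the cluster flip is always accepted
        s[site] = -s[site]
    return len(cluster)
```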

2.9 Other single flip algorithms: the heat bath algorithm

The Metropolis algorithm is not the only single spin flip algorithm possible. We canstill impose the condition of detailed balance (Eq. (24)) in many other possible ways.One possibility is given by the heat bath algorithm which for the Ising model has thefollowing form:

A(\mu\to\nu) = \frac{e^{-\beta E_\nu}}{e^{-\beta E_\nu} + e^{-\beta E_\mu}} = \frac{e^{-\beta\Delta E}}{e^{-\beta\Delta E} + 1}    (52)

where we have defined ∆E = Eν − Eµ. This acceptance ratio matches that of the Metropolis algorithm (Eq. (29)) in the two limiting cases: β∆E → −∞, where A → 1, and β∆E large and positive, which gives A ≈ e^{−β∆E}. The difference between heat bath and Metropolis is clearest at ∆E = 0. In the Metropolis case A = 1, while in the heat bath case A = 1/2.


Figure 7: (Top) Configurations of a 20×20 10-state Potts model evolving according to the Metropolis algorithm (left column) and Heat Bath algorithm (right column), after 10^3 Monte Carlo sweeps. The Heat Bath algorithm is more efficient than Metropolis for large values of q (faster equilibration), as seen from the energy vs. time plots (bottom). Here T = 0.5 corresponds to a very low temperature.

In other words, a flip that does not change the energy is always accepted in the Metropolis move, while it is accepted only half of the time in the heat bath algorithm.

The heat bath algorithm is not very efficient and should not be used in Monte Carlo simulations of the Ising model; however, it turns out to be a good algorithm for other models. An example is the q-state Potts model. This model (see Appendix A) is a generalization of the Ising model in which the spins can take q different values, as opposed to the Ising case in which si = ±1. We do not enter into the details of how to define the heat bath algorithm for the q-state Potts model (see the problems session); it is a generalization of Eq. (52) to the q-state case. Figure 7 shows some snapshots of configurations of the q = 10 case evolving according to the Metropolis algorithm and to the heat bath algorithm. The simulated temperature is very low, so in the equilibrium configuration almost all spins are aligned along one of the q values. Figure 7 shows that the heat bath algorithm evolves more rapidly towards the configuration with all aligned spins.
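A minimal sketch of a heat bath sweep for the q-state Potts model with energy E = −J Σ_{⟨ij⟩} δ(s_i, s_j) (Python with NumPy; this is one possible implementation of the generalization of Eq. (52), not necessarily the one worked out in the problems session):

```python
import numpy as np

def heat_bath_sweep(s, beta, q, J=1.0, rng=np.random.default_rng()):
    """One heat bath sweep for the q-state Potts model on an L x L periodic
    lattice: the selected spin is redrawn from the local Boltzmann weights
    of all q possible values, independently of its current value."""
    L = s.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        nn = np.array([s[(i + 1) % L, j], s[(i - 1) % L, j],
                       s[i, (j + 1) % L], s[i, (j - 1) % L]])
        # Boltzmann weight of each of the q values, given the neighbours
        w = np.exp(beta * J * np.array([np.count_nonzero(nn == k) for k in range(q)]))
        s[i, j] = rng.choice(q, p=w / w.sum())
```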


Figure 8: Typical interface configurations in the Ising model with non-local Kawasakidynamics at two different temperatures: left T = 0.5 Tc, right T = 0.8 Tc.

2.10 Conserved order parameter Ising model

There are some situations in which one would like to study the properties of interfaces between two phases. The Ising model can be useful for that purpose as well. Below the critical temperature T < Tc it has two ferromagnetic phases with average magnetization ±m(T). In the limit T → 0 one has m(T) → 1. In the example shown in Fig. 3 the system is prepared in a random configuration with average magnetization zero and then evolves towards a positive magnetization. The Metropolis algorithm could actually have chosen the negative magnetization state, with 50% probability, but the system ends up being dominated by a single phase. If we divide the lattice into two identical symmetric parts and place up spins on one side and down spins on the other, the single spin flip algorithm will eventually select one of the two phases.

To have a stable interface we need something different from the spin flip algorithm. For this purpose the Kawasaki algorithm is better suited. The principles are very similar to those of the spin flip algorithm, the only difference being that the Kawasaki algorithm selects pairs of neighboring spins and swaps them. The acceptance ratio is taken to follow the standard Metropolis rule of Eq. (29). As a move consists in interchanging two spins with each other, the total magnetization

M = \sum_i s_i    (53)

is conserved. Under the Kawasaki dynamics spins move diffusively through the lattice. Hence, it may take a long time before equilibration is reached. A more efficient version of the algorithm is the non-local version, in which spins far away from each other are swapped. In this implementation one creates a list of the locations of up spins and a list of the locations of down spins. One then selects a spin from the up list and a spin from the down list, with a


probability independent of their physical positions in the lattice. The acceptance ratio is still given by Eq. (29). The non-local moves imply a faster equilibration. Moreover, by selecting at each step a plus and a minus spin one avoids selecting pairs of equal spins, which would only lead to a waste of computer time.
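A minimal sketch of one non-local Kawasaki move (Python with NumPy; for clarity the up/down lists are rebuilt at every call rather than kept updated, and the energy difference formula assumes the two selected spins are not nearest neighbours):

```python
import numpy as np

def nonlocal_kawasaki_move(s, beta, J=1.0, rng=np.random.default_rng()):
    """One non-local Kawasaki move for the 2d Ising model: pick a random up spin
    and a random down spin and try to exchange them with the Metropolis rule of
    Eq. (29). The total magnetization M is conserved."""
    L = s.shape[0]
    up = np.argwhere(s == 1)
    down = np.argwhere(s == -1)
    if len(up) == 0 or len(down) == 0:
        return
    i1, j1 = up[rng.integers(len(up))]
    i2, j2 = down[rng.integers(len(down))]

    def field(i, j):                    # sum of the four neighbouring spins
        return (s[(i + 1) % L, j] + s[(i - 1) % L, j] +
                s[i, (j + 1) % L] + s[i, (j - 1) % L])

    # energy change of the swap (valid when the two spins are not nearest neighbours)
    dE = 2.0 * J * (s[i1, j1] * field(i1, j1) + s[i2, j2] * field(i2, j2))
    if dE <= 0.0 or rng.random() < np.exp(-beta * dE):
        s[i1, j1], s[i2, j2] = s[i2, j2], s[i1, j1]
```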

2.11 Monte Carlo in continuous time

In some systems we may have defined a Markov chain with the property that the system remains locked into certain states for many time steps:

µ→ γ → γ → . . .→ γ → ν . . . (54)

which gives a very slow dynamics to the system. We could circumvent this problem by computing Tγ, the average time spent in a certain state γ, and imposing that the system evolve into a different state. This is what the continuous time Monte Carlo method does. How do we calculate Tγ? Let p = P(γ → γ) and q = 1 − p = \sum_{\nu\neq\gamma} P(\gamma\to\nu)

(we have used the normalization of Eq. (16)). One then has:

T_\gamma = \Delta t \sum_{n=0}^{+\infty} n\, p^n q = \Delta t\, \frac{p}{q} \approx \frac{\Delta t}{\sum_{\nu\neq\gamma} P(\gamma\to\nu)}    (55)

where ∆t is the discrete time unit of the Markov chain and where we have used q ≪ p ≈ 1. The second step is to select the new state, which is chosen according to the probability

g(\gamma\to\mu) = \frac{P(\gamma\to\mu)}{\sum_{\nu\neq\gamma} P(\gamma\to\nu)}    (56)

The disadvantage of this method is that for each given state γ one needs to compute thetransition probabilities P (γ → ν) for all possible ν, in order to compute the quantitiesin Eqs. (55) and (56). In some cases, with an appropriate choice of the transitionprobabilities, the continuous time method can be implemented in an efficient way. Anexample of an implementation for the conserved order parameter Ising model can befound in the book of Newman and Barkema (see footnote 8).
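A minimal sketch of a single continuous time step (Python with NumPy; the dictionary interface for the rates is hypothetical, times are measured in units of ∆t, and the waiting time is drawn from an exponential distribution with the mean of Eq. (55), as in the Gillespie algorithm of Sec. 3.1.2):

```python
import numpy as np

def continuous_time_step(rates, rng=np.random.default_rng()):
    """One continuous time Monte Carlo step. `rates` maps every reachable state nu
    to the transition probability P(gamma -> nu) out of the current state gamma.
    Returns the waiting time (in units of dt) and the new state, following
    Eqs. (55) and (56)."""
    states = list(rates)
    p = np.array([rates[nu] for nu in states], dtype=float)
    total = p.sum()
    # mean residence time of Eq. (55); here drawn from an exponential with that mean
    tau = rng.exponential(1.0 / total)
    # the new state is chosen with probability proportional to its rate, Eq. (56)
    new_state = states[rng.choice(len(states), p=p / total)]
    return tau, new_state

# usage sketch (hypothetical rates out of the current state):
# dt, state = continuous_time_step({"A": 0.01, "B": 0.002})
```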

2.12 Continuous systems: Lennard-Jones fluids

All we have discussed for the Ising model can be applied to other types of systems. We discuss briefly here Monte Carlo simulations of classical fluids, in particular Lennard-Jones fluids. These are systems of N particles occupying a volume V at temperature T, subject to a pairwise interaction potential of the type:

V_{\rm LJ}(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]    (57)

where ε is the depth of the minimum of the potential and σ a characteristic distance at which the potential vanishes. This potential contains two terms: a short-range repulsion diverging at short distances as 1/r^{12} and a long-range attraction decaying as 1/r^6.


Figure 9: In the Monte Carlo algorithm for a continuous fluid system a particle isselected at random and a move to a new position is attempted. The new configurationis accepted according to the Metropolis rule.

The latter describes dispersive van der Waals forces due to fluctuating dipoles. The short-range term is due to the Pauli exclusion principle. The 1/r^{12} form was chosen simply for ease of computation (it is the square of the attraction term).

The partition function of the system is expressed as an integral over positions ~riand momenta ~pi of all particles i = 1, 2, . . . N as:

Z = \int d\vec r_1 \ldots d\vec r_N\, d\vec p_1 \ldots d\vec p_N\; e^{-\beta\left(\sum_i \frac{p_i^2}{2m} + V(\vec r_1 \ldots \vec r_N)\right)}    (58)

where the quantity between parentheses in the exponential is the total energy of the system, the sum of kinetic and potential energies. The average value of an observable A, which is a function of positions and momenta, A(\vec r_1 \ldots \vec r_N, \vec p_1 \ldots \vec p_N), takes the form:

\langle A\rangle = \frac{1}{Z}\int d\vec r_1 \ldots d\vec r_N\, d\vec p_1 \ldots d\vec p_N\; A(\vec r_1 \ldots \vec r_N, \vec p_1 \ldots \vec p_N)\; e^{-\beta\left(\sum_i \frac{p_i^2}{2m} + V(\vec r_1 \ldots \vec r_N)\right)}    (59)

Note that this equation is the continuous counterpart of Eq. (10). We consider observables which are functions of either positions or momenta. The functions of momenta are easy to deal with, since they can be integrated analytically (these are Gaussian integrals). The computation of position-dependent quantities cannot be done analytically, and this is where numerical simulations such as the Monte Carlo method are of interest.

In the case of a Lennard-Jones fluid the potential is a sum of pair potentials

V(\vec r_1 \ldots \vec r_N) = \sum_{i<j} V_{\rm LJ}(r_{ij})    (60)

where r_{ij} = |\vec r_i − \vec r_j|. This, however, does not make the computation easier. For instance, to compute the average distance between two given particles, ⟨|\vec r_1 − \vec r_2|⟩, one has to integrate over terms containing exponentials of the type exp(−(V_{LJ}(r_{12}) + V_{LJ}(r_{13}) + V_{LJ}(r_{14}) + . . .)/k_BT), which cannot be computed exactly.


A Monte Carlo algorithm for this system works as follows. A particle is selected at random and a move \vec r_i → \vec r_i' is attempted (see Fig. 9). This is obtained by selecting shifts of the coordinates δx, δy and δz from a uniform distribution on [−∆, ∆]. The value of ∆ is chosen to be of the order of the interparticle distance. To compute the acceptance ratio one needs the energy difference between the new and old configurations. If particle i is moved the energy difference is

\Delta E = \sum_{j\neq i}\left[V_{\rm LJ}(r'_{ij}) - V_{\rm LJ}(r_{ij})\right]    (61)

where r'_{ij} = |\vec r_i' − \vec r_j|. The acceptance ratio is as in the standard Metropolis case: A(\vec r_i → \vec r_i') = 1 if ∆E ≤ 0 and A(\vec r_i → \vec r_i') = exp(−∆E/k_BT) if ∆E > 0.

One of the difficulties of this procedure is that the interaction is long range. Thetotal potential energy is obtained by summing over all pairs of particles, which meansa sum of N(N − 1)/2 terms. This is very large in a simulation with a large number ofparticles. This is why the range of the interaction is often truncated. One simulatesfor instance the modified potential

V_{\rm trunc}(r) = \begin{cases} V_{\rm LJ}(r) & \text{if } r < r_c \\ 0 & \text{otherwise} \end{cases}    (62)

where r_c is the truncation radius (typically r_c ≈ 2–3σ). So, given a particle, only those at a distance smaller than r_c from it need to be

considered. This can be implemented in practice by forming a list of all particles which are within a radius r_c of the given particle. There is one list for every particle, and the lists need to be updated after every accepted move. Apart from these technical differences, the algorithm is therefore very similar to that introduced for the Ising model.
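A minimal sketch of a single particle move for the truncated Lennard-Jones fluid (Python with NumPy; periodic boundaries with the minimum image convention are assumed, and for simplicity the energy is recomputed over all particles instead of using neighbour lists):

```python
import numpy as np

def particle_energy(pos, i, box, rc, eps=1.0, sigma=1.0):
    """Interaction energy of particle i with all others, using the truncated
    potential of Eq. (62) and the minimum image convention."""
    d = pos - pos[i]
    d -= box * np.round(d / box)                 # periodic boundary conditions
    r2 = np.sum(d * d, axis=1)
    r2[i] = np.inf                               # exclude the self-interaction
    r2 = r2[r2 < rc * rc]
    sr6 = (sigma * sigma / r2) ** 3
    return np.sum(4.0 * eps * (sr6 * sr6 - sr6))

def metropolis_move(pos, beta, box, rc, delta, rng=np.random.default_rng()):
    """Attempt to displace one randomly chosen particle by a uniform shift in
    [-delta, delta]^3 and accept or reject it with the Metropolis rule."""
    i = rng.integers(len(pos))
    e_old = particle_energy(pos, i, box, rc)
    old = pos[i].copy()
    pos[i] = (pos[i] + rng.uniform(-delta, delta, 3)) % box
    dE = particle_energy(pos, i, box, rc) - e_old
    if dE > 0.0 and rng.random() >= np.exp(-beta * dE):
        pos[i] = old                             # rejected move: restore the old position
```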

3 Out of equilibrium systems

In Monte Carlo simulations of equilibrium processes the two basic requirements that an algorithm has to fulfill are (1) ergodicity and (2) detailed balance. As seen for the Ising model there are many algorithms for which these conditions are both satisfied; the algorithms can be of single spin flip type (Metropolis, heat bath, . . . ) or of cluster type (Wolff algorithm, . . . ). In non-equilibrium simulations there is unfortunately not such a large choice of algorithms. First of all we need to build up a Markov chain characterized by transition probabilities P(µ → ν) per time step ∆t, where ∆t now has the significance of a real time and no longer of an algorithmic time. For instance in the Ising model the cluster algorithms are of no use when studying dynamical properties: a dynamics which swaps large clusters at low temperatures and very small ones at high temperatures is not realistic. The non-local Kawasaki dynamics is equally non-physical, as it involves the swapping of opposite spins with rates which do not depend on their physical distance. In conclusion: when studying dynamical properties one has to restrict oneself to local algorithms. Differently from equilibrium statistical mechanics, non-equilibrium systems do not have such a well developed formalism relying on partition


functions, free energies etc . . . , hence each system has to be studied separately. In whatfollows we present an example of Monte Carlo simulations applied to non-equilibriumsystems.

3.1 Coupled chemical reactions

In many processes in physics and chemistry one deals with systems which can bedescribed as coupled chemical reactions.

In a most general framework the chemical reactions can be written as

m_1 A_1 + m_2 A_2 + m_3 A_3 + \ldots + m_r A_r \;\xrightarrow{k_i}\; n_1 A_1 + n_2 A_2 + n_3 A_3 + \ldots + n_p A_p    (63)

with m_i, n_i stoichiometric coefficients and A_i the different types of particles. The reaction rate is defined as follows: k_i dt is the probability that a given r-tuple of particles will react according to reaction Eq. (63) in a time interval [t, t + dt]. Note that a given reaction can be reversible, with a forward rate and a possibly different backward rate. In general one need not attach to the "particles" the meaning of chemical molecules or atoms. The next example, for instance, is a basic model for population dynamics, where A_1 and A_2 can be viewed as prey and predators.

Example 8
The Lotka-Volterra model, also known as the predator-prey model, is defined as follows. Consider two species A_1 (the prey) and A_2 (the predator) undergoing the following reactions:

A_1 \xrightarrow{k_1} 2A_1    (64)

A_1 + A_2 \xrightarrow{k_2} 2A_2    (65)

A_2 \xrightarrow{k_3} \emptyset    (66)

where ki, with i = 1, 2, 3 are the rates for each reaction.

Let us suppose we have in our system N different types of particles A_1, A_2, . . . , A_N. We assume the system to be homogeneous at all times, so that spatial effects can be neglected. This happens if diffusion is sufficiently fast to efficiently mix the particles. Under this assumption a state of the system at time t is given by the N non-negative integers (X_1, X_2, . . . , X_N). Let us also suppose that there are in total M possible chemical reactions occurring in the system, each characterized by a rate k_i with i = 1, 2, . . . , M. Let P(X_1, X_2, . . . , X_N, t) be the probability of finding the system in the state (X_1, X_2, . . . , X_N) at time t.

Our aim is to set up a Master equation for P. Consider a reaction like Eq. (64): if at time t there are X_1 particles of type A_1, the total probability that reaction 1 occurs in [t, t + dt] is equal to X_1 k_1 dt. For the reaction of Eq. (65) one has X_1 X_2 k_2 dt. In general, the actual rate of a reaction is

a_i = k_i \{\text{number of reagent combinations for reaction } i\}    (67)


Hence the Master equation takes the form

\partial_t P(X_1, X_2, \ldots, X_N, t) = -\sum_{i=1}^{M} a_i\, P(X_1, X_2, \ldots, X_N, t) + \sum_{i=1}^{M} B_i    (68)

where, as discussed in Appendix B, the two terms on the right hand side of the equation are the loss and gain terms, respectively. In order to simplify the equation somewhat we have introduced the shorthand B_i for the gain terms. If reaction l is of the type A_j → A_i one has:

B_l = k_l (X_j + 1)\, P(X_1, X_2, \ldots, X_j + 1, \ldots, X_i - 1, \ldots, X_N, t)    (69)

3.1.1 Rate equations

The rate equations describe the time evolution of the average number of particles. Starting from the Master equation one can derive an equation for the averages:

\langle X_l\rangle = \sum_{\{X\}} X_l\, P(X_1, X_2, \ldots, X_N, t)    (70)

with l = 1, 2 . . . N . The problem is that in general one cannot deduce equations inclosed form. Due to the presence of products of the particle numbers Xk in the ratesai, see Eq. (67), one obtains equations of the type

\frac{d}{dt}\langle X_l\rangle = g_l\left(\langle X_1\rangle, \langle X_2\rangle, \ldots, \langle X_N\rangle, \ldots, \langle X_i X_j\rangle, \ldots, \langle X_k X_p X_q\rangle, \ldots\right)    (71)

Neglecting correlations, i.e. approximating ⟨X_i X_j . . . X_k⟩ ≈ ⟨X_i⟩⟨X_j⟩ . . . ⟨X_k⟩, one obtains differential equations in closed form. These equations are not exact: they describe only the evolution of the average number of particles, in a deterministic way, and do not describe fluctuations around these averages. If the number of particles of each species is large, i.e. ⟨X_i⟩ ≫ 1, the rate equations are expected to provide an accurate description of the evolution of the system. If the number of particles is low, the fluctuations may dominate and the rate equation description is no longer accurate.

Example 9
The rate equations associated to the Lotka-Volterra system are:

\frac{d}{dt} X_1 = k_1 X_1 - k_2 X_1 X_2    (72)

\frac{d}{dt} X_2 = k_2 X_1 X_2 - k_3 X_2    (73)

where to simplify the notation we have omitted the brackets ⟨.⟩.
There are two stationary (i.e. time independent) solutions: X_1 = X_2 = 0 and X_1 = k_3/k_2, X_2 = k_1/k_2. The system has oscillatory solutions. To see this we expand around the second stationary point, X_1 = ε + k_3/k_2 and X_2 = δ + k_1/k_2, to linear order in ε and δ. We find ε(t) = ε_0 cos(ωt + φ) and δ(t) = ε_0 \sqrt{k_1/k_3}\, sin(ωt + φ), with ω = \sqrt{k_1 k_3}.
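A minimal sketch integrating the rate equations (72)-(73) with a simple Euler scheme (Python with NumPy; the rate constants and initial values are hypothetical illustration values):

```python
import numpy as np

def lv_rates(x, k1, k2, k3):
    """Right hand side of the rate equations (72)-(73)."""
    x1, x2 = x
    return np.array([k1 * x1 - k2 * x1 * x2,
                     k2 * x1 * x2 - k3 * x2])

k1, k2, k3 = 1.0, 0.01, 1.0             # hypothetical rate constants
x = np.array([100.0, 50.0])             # initial averages <X1>, <X2>
dt, steps = 0.001, 20000
traj = np.empty((steps, 2))
for n in range(steps):                  # simple Euler integration
    traj[n] = x
    x = x + dt * lv_rates(x, k1, k2, k3)
# traj oscillates around the fixed point (k3/k2, k1/k2) = (100, 100)
```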


3.1.2 Stochastic simulations: The Gillespie algorithm

The Gillespie algorithm11 provides a method to simulate the Master equation exactly.

Let us consider a configuration given as (X_1, X_2, . . . , X_N) at a time t. We ask ourselves what is the probability that the system remains in this state until a time t + τ, that the next reaction occurs in [t + τ, t + τ + dτ], and that the reaction is of type j. We denote this probability as P(τ, j). We can write:

P (τ, j) = P0(τ) ajdτ (74)

where a_j dτ is the probability that reaction j occurs in the interval [t + τ, t + τ + dτ], while P_0(τ) is the probability that no reaction occurs in [t, t + τ].

It is simple to set up a differential equation for P0(τ). We have

P_0(\tau + d\tau) = P_0(\tau)\left[1 - \sum_{i=1}^{M} a_i\, d\tau\right]    (75)

which can be easily integrated and with the initial condition P0(0) = 1 yields:

P0(τ) = e−a0τ (76)

where a_0 = \sum_{i=1}^{M} a_i. This quantity is the total rate at which any of the M reactions occurs.

Gillespie Algorithm

1) Given that at time t the system is in the state (X_1, X_2, . . . , X_N), calculate the rates a_i for all M chemical reactions.

2) Generate a random waiting time τ from the exponential distribution with density a_0 e^{-a_0\tau} and update the time to t = t + τ.

3) Choose one of the M reactions using as probabilities ai/a0.

4) If the reaction chosen is j update the particle numbers(X1, X2 . . . XN) accordingly and go to step 1)
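A minimal sketch of the algorithm applied to the Lotka-Volterra reactions (64)-(66) (Python with NumPy; the parameter values in the usage line are hypothetical):

```python
import numpy as np

def gillespie_lv(x1, x2, k1, k2, k3, t_max, rng=np.random.default_rng()):
    """Gillespie simulation of the Lotka-Volterra reactions (64)-(66)."""
    t, history = 0.0, [(0.0, x1, x2)]
    while t < t_max:
        # step 1: rates of the three reactions in the current state
        a = np.array([k1 * x1, k2 * x1 * x2, k3 * x2])
        a0 = a.sum()
        if a0 == 0.0:                    # no reaction can occur any more
            break
        # step 2: exponentially distributed waiting time
        t += rng.exponential(1.0 / a0)
        # step 3: choose a reaction with probability a_i / a0
        j = rng.choice(3, p=a / a0)
        # step 4: update the particle numbers
        if j == 0:
            x1 += 1                      # A1 -> 2 A1
        elif j == 1:
            x1 -= 1; x2 += 1             # A1 + A2 -> 2 A2
        else:
            x2 -= 1                      # A2 -> 0
        history.append((t, x1, x2))
    return history

# hist = gillespie_lv(100, 50, 1.0, 0.01, 1.0, t_max=20.0)   # hypothetical parameters
```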

11Although it is 30 years old it is still worth reading the original paper: D. T. Gillespie, ExactStochastic Simulation of Coupled Chemical Reactions, J. Phys. Chem. 81, 2340 (1977).


Figure 10: Simulation of the Lotka-Volterra model with the Gillespie algorithm, withk1 =, k2 = and k3 =. Left: trajectory in the plane Y1, Y2. Right: plot of Y1(t) andY2(t).

4 Problems

4.1 Non-uniform random numbers

Monte Carlo simulations are based on random sampling of the quantities to compute. This sampling relies upon a random number generator (RNG). Most high level languages (Java, C++, Matlab, ...) contain RNGs that generate (pseudo-)random numbers uniformly between 0 and 1. For instance in Matlab (or Octave) this is done by the function rand(). Matlab also has randn(), which generates random numbers from a Gaussian distribution12. In many Monte Carlo problems it is important to generate non-uniform random numbers. This exercise presents two different ways to do so.

Let us suppose that we want to generate random numbers with a given distributionf(x). Here f(x) ≥ 0 and it is normalized

∫ +∞−∞ f(x)dx = 1. Examples are f(x) = e−x

in the interval [0,+∞) (and zero otherwise), or f(x) = 1− x2 in [−1, 1]. We considerthe cumulative distribution function:

F (x) =

∫ x

−∞dy f(y) (77)

for which 0 < F (x) ≤ 1. If we now draw random numbers xi from the uniformdistribution in [0, 1] then the random numbers yi = F−1(xi) will be distributed withprobability density f(x). To show this is easy and left to you as exercise.

12Try hist(rand(1,10000)) and hist(randn(1,10000)) to plot the histograms of the randomnumbers generated by the two functions

28

Page 29: Monte Carlo 2014Course

This first approach requires that the function f(x) can be integrated analytically and that F(x) can be inverted. This is not the case for instance for13 f(x) = e^{−x²}/√π.

Another way to generate random numbers with a given distribution f(x) on a finite interval (meaning that f(x) ≠ 0 only in some interval [a, b]) is the hit-and-miss method. Let f(x) be defined in [a, b] and let M be such that M > f(x) for every x in [a, b]. We now generate two random numbers t and s, uniformly distributed in [a, b] and [0, M], respectively. The case s > f(t) corresponds to a miss and the generated numbers are discarded. If instead s ≤ f(t), we keep and store the value of t, which is the output of our RNG algorithm. Note that s is only a checking variable and its value is not returned.

Note: the hit-and-miss method has a drawback: all the misses are generated and not actually used, which is a waste of computer resources.
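As a concrete illustration, the following Octave/Matlab sketch generates exponential random numbers by inverting the cumulative distribution, and linearly distributed numbers in [−1, 1] with the hit-and-miss method (both distributions appear in the questions below; the number of samples is an arbitrary choice):

    n = 1e5;

    % (a) Inverse transform: f(x) = exp(-x) on [0,inf), F(x) = 1 - exp(-x),
    %     so x = F^{-1}(u) = -log(1-u) with u uniform in [0,1]
    u = rand(1, n);
    x_exp = -log(1 - u);
    hist(x_exp, 50);

    % (b) Hit-and-miss: f(x) = (x+1)/2 on [-1,1], bounded by M = 1
    a = -1; b = 1; M = 1;
    x_lin = zeros(1, n);
    count = 0;
    while count < n
      t = a + (b - a)*rand();       % candidate, uniform in [a,b]
      s = M*rand();                 % checking variable, uniform in [0,M]
      if s <= (t + 1)/2             % a hit: keep t
        count = count + 1;
        x_lin(count) = t;
      end
    end
    hist(x_lin, 50);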

Questions

• Using uniform random numbers in the interval [0, 1] and the inverse cumulative distribution function as defined in Eq. (77), generate random numbers (plot the histograms) which are distributed

– uniformly in [−2, 1]

– exponentially in R+: f(x) = e^{−x} for x ≥ 0 and f(x) = 0 for x < 0

– linearly in the interval [−1, 1]: f(x) = (x + 1)/2 and zero elsewhere.

• Using the hit-and-miss method generate random numbers (plot the histograms) which are distributed

– linearly in the interval [−1, 1]: f(x) = (x + 1)/2 and zero elsewhere.

– as f(x) = −x(1 − x) e^{−x²} log(1 − x) in [0, 1]

• Using the hit-and-miss method generate random points in a two-dimensional plane uniformly distributed inside a circle of radius R = 1.

• Generate the same uniform distribution (without the hit-and-miss method) using an appropriate distribution function f(r, θ), where r and θ are polar coordinates.

4.2 Gaussian RNG

A way to generate random numbers from a Gaussian distribution is by the so-called Box-Muller transformation. Suppose that we have two numbers x and y drawn from a standard Gaussian distribution. Their combined density function then becomes

    f(x, y) dx dy = (1/2π) exp(−(x² + y²)/2) dx dy    (78)

13A generator of Gaussian random numbers can be obtained using an appropriate transformation(the Box-Muller transformation).

29

Page 30: Monte Carlo 2014Course

Transforming this into polar coordinates gives

    f(r, θ) dr dθ = (1/2π) exp(−r²/2) r dr dθ    (79)

So generating random numbers r and θ according to the above distribution and transforming them back to cartesian coordinates would yield two random numbers x and y which are standard Gaussian distributed. The angle θ is drawn from a uniform distribution in [0, 2π]. The r-dependent part is solvable as before, since here we can calculate and invert the cumulative distribution function:

    F(r) = ∫_{0}^{r} exp(−z²/2) z dz    (80)

So generating a uniform θ, and an r by inverting F(r), allows us to compute two independent Gaussian random numbers x and y by making the transformation from polar to cartesian coordinates.
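A minimal Octave/Matlab sketch of this procedure (the number of samples is arbitrary): since F(r) = 1 − e^{−r²/2}, inverting gives r = √(−2 log(1 − u)).

    n = 1e5;                             % number of pairs to generate
    u1 = rand(1, n);
    u2 = rand(1, n);
    theta = 2*pi*u1;                     % uniform angle in [0, 2*pi]
    r = sqrt(-2*log(1 - u2));            % invert F(r) = 1 - exp(-r^2/2)
    x = r.*cos(theta);                   % back to cartesian coordinates:
    y = r.*sin(theta);                   % x and y are independent standard Gaussians
    hist([x y], 100);                    % check the histogram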

Questions

• Program the above algorithm and show that it indeed generates couples of normally distributed random variables.

• How do we need to adapt the algorithm when we want the random numbers to be distributed according to a Gaussian distribution with average µ and variance σ²?

4.3 Monte Carlo integration (I)

In this exercise we will estimate the value of the following integral with Monte Carlo methods:

    I = ∫_{−∞}^{+∞} exp(−r²/2) (x + y + z)² dx dy dz    (81)

where r² = x² + y² + z². To perform the calculation we take the factor e^{−r²/2} as a density function on the domain of the integral. When doing so, one has to be careful with the normalization factor. This integral can also be calculated analytically, and also using another type of Monte Carlo sampling based on the Metropolis algorithm (see next exercise). We can thus compare these different methods.
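A minimal Octave/Matlab sketch of this sampling strategy, using a Gaussian generator (here randn for brevity): writing exp(−r²/2) = (2π)^{3/2} w(x, y, z), with w the normalized three-dimensional standard Gaussian density, the integral becomes (2π)^{3/2} ⟨(x + y + z)²⟩_w.

    N = 1e6;                                   % number of sample points
    x = randn(1, N); y = randn(1, N); z = randn(1, N);   % points drawn from w
    f = (x + y + z).^2;                        % remaining factor of the integrand
    I_est   = (2*pi)^(3/2) * mean(f);          % Monte Carlo estimate of I
    I_error = (2*pi)^(3/2) * std(f)/sqrt(N);   % statistical error, ~ 1/sqrt(N)
    fprintf('I = %f +/- %f\n', I_est, I_error);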

Questions

• Compute I analytically.

• Use a Gaussian RNG to generate x, y and z and estimate the integral with this method. Compare this result with the analytical result.

30

Page 31: Monte Carlo 2014Course

• Calculate the variance σ² of the Monte Carlo estimate for different numbers of sampled points N, and show that σ² ∼ 1/N.

• Why can’t we turn the two factors around: use (x + y + z)² as density, and e^{−r²/2} as integrand?

4.4 Monte Carlo integration (II)

Here we repeat the calculation of the same integral using a different Monte Carlo method: the Metropolis algorithm. We need for this purpose to define a Markov chain characterized by some transition probabilities P(~r → ~r ′). We would like the “dynamics” so defined to converge to the distribution w(~r) = (2π)^{−3/2} exp(−r²/2).

We know that if we require detailed balance, i.e. that the rates for any pair of points ~r and ~r ′ fulfill

q(~r)P (~r → ~r ′) = q(~r ′)P (~r ′ → ~r) (82)

then the dynamics converges to the distribution q(~r). Therefore we choose the transition probabilities such that

    P(~r → ~r ′) / P(~r ′ → ~r) = w(~r ′) / w(~r) = exp(−(r′² − r²)/2)    (83)

The transition probabilities are usually split as P(~r → ~r ′) = g(~r → ~r ′) A(~r → ~r ′), where g is the selection probability and A the acceptance ratio.

The selection step is generated as follows: starting from an initial point ~r we select a new point ~r ′ = ~r + ~δ, where ~δ = (δx, δy, δz) is a random shift and the three components are independently chosen from Wh(δ), a uniform distribution centered in 0 and of width h:

    Wh(δ) = 1/h if |δ| < h/2, and 0 otherwise    (84)

The acceptance ratio for the Metropolis algorithm is given by:

    A(δ) = 1 if r′² − r² < 0, and exp(−(r′² − r²)/2) otherwise    (85)

In short: we jump to a new site and accept this jump if the new site is closer to the origin. If we move further away from the origin we accept the jump with a probability exp(−(r′² − r²)/2) < 1.


Questions

• Using the Metropolis algorithm plot the trajectory followed by the x, y-coordinates of the points using N = 10^4 steps, starting from an initial position at the origin ~r = (0, 0, 0) and at the point ~r = (10, 10, 10). Repeat the calculation for h = 1 and h = 0.1. By plotting the histogram of the positions for a run with N = 10^6 points, verify that the dynamics indeed generates an equilibrium distribution of points proportional to exp(−r²/2).

• Using the Metropolis algorithm estimate numerically the integral and compare its value to that of the previous exercise.

• Depending on the starting point ~r a given number of Metropolis steps Neq are necessary for “equilibration”. Estimate Neq for an initial point at ~r = (10, 10, 10), for h = 1 and h = 0.1.

• Which algorithm do you expect is better for this calculation: the Gaussian random number generator of the first exercise session or the Metropolis algorithm discussed here?

4.5 Stochastic matrices

An N×N matrix P is called stochastic if a) all its elements are non-negative (Pij ≥ 0 ∀i, j) and b) the sum of each column is equal to 1, i.e. for every j:

    Σ_{i=1}^{N} Pij = 1    (86)

Given a system with N states, the stochastic matrix generates the dynamics of the system. The element Pij is the probability that, in a time interval ∆t, the system makes a transition from the state j to the state i. One also uses the notation Pij = p(j → i).

Therefore if w(t) is a vector whose elements wi(t) are the probabilities of finding the system in a state i at time t, then at time t + ∆t the probabilities are the elements of the vector obtained by multiplication with the matrix P:

w(t+ ∆t) = Pw(t) (87)

Note that a state of the system is described by a vector of non-negative entries wi ≥ 0. We will call a vector non-negative if its elements are all non-negative.


Questions

• Show that given a non-negative initial vector w the stochastic matrix generates non-negative vectors at all later times.

• Show that the product of two stochastic matrices is a stochastic matrix

• We introduce now the norm ‖w‖1 = Σ_j |wj|. Consider an arbitrary vector y (with entries which can also be negative). Show that the following inequality holds

    ‖Py‖1 ≤ ‖y‖1    (88)

and deduce that if λ is an eigenvalue of P then |λ| ≤ 1.

• Show that a stochastic matrix has at least one eigenvalue λ = 1. (Hint: think of a suitable left eigenvector of the stochastic matrix.)

4.6 Detailed balance

Consider a system with three states with energies E1 < E2 < E3. The dynamics is such that the system can make the transitions 1 → 2, 2 → 3 and 3 → 1, or stay in its current state.

1) Assigning arbitrary rates to these transitions, write the 3 × 3 stochastic matrix which describes the evolution of the process.

2) Does this dynamics satisfy detailed balance? Is it ergodic?

3) Show that the rates can be chosen such that the equilibrium distribution is the Boltzmann distribution, i.e. that the probability of finding the system in the state i is

    pi = e^{−βEi} / Z    (89)

where Z = Σ_{i=1}^{3} e^{−βEi} is the partition function.

Note: this problem is an example of the fact that “detailed balance” is not a necessary condition to get the Boltzmann distribution as the equilibrium distribution of the dynamics.

4.7 Ising Model: Uniform sampling

In the two dimensional Ising model spins si = ±1 are located at the vertices of a square lattice. The energy of a configuration is given by

    E({sk}) = −J Σ_{〈ij〉} si sj    (90)


where the sum is extended over pairs of nearest neighboring spins. Each spin has four nearest neighbors, two in the vertical and two in the horizontal directions. The positive constant J > 0 represents the strength of the interaction. We set it here equal to J = 1.

We consider now a Monte Carlo algorithm which samples uniformly all configurations of the model, just to show that this is a very inefficient way of sampling.

Generating a lattice configuration and boundary conditions

The first thing we have to do is to generate a square lattice, which is not very difficult in most programming languages. We could use e.g. a matrix in Matlab, or a double array in C. One of the things we now must decide is what boundary conditions we will take.

The most used boundary condition is the periodic boundary condition. This is easy to program with e.g. the modulo operation (the symbol % in C, and the function mod in Matlab). We indicate in what follows with s(i, j) the value of the spin at lattice position (i, j). With periodic boundary conditions, the neighboring spins are given by14:

• The spin above and below are s(mod(i± 1 +N,N), j)

• The spin right and left are s(i,mod(j ± 1 +N,N))

There is however a second type of boundary conditions that is numerically much faster: helical boundary conditions. We now label the N² spins as a vector s(i) and fill up the lattice from bottom left to top right as in the example in Fig. 12(b). The N × N lattice is repeated in all directions, but shifted by one unit vertically. The consequences are:

• The spin above and below are s(mod(i±N +N2, N2))

• The spin left and right are s(mod(i± 1 +N2, N2))

These are the boundary conditions that we will use.
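With 1-based Matlab/Octave indexing the helical neighbours can be written as in the sketch below (the shift by −1/+1 adapts the formulas above, which assume labels 0 . . . N²−1); the last loop computes the total energy of Eq. (90) by visiting each bond exactly once.

    N = 16;                                   % linear lattice size (example value)
    s = 2*randi(2, 1, N*N) - 3;               % random spins +1/-1 stored as a vector s(1..N^2)

    % neighbours with helical boundary conditions (1-based indices)
    up    = @(i) mod(i-1+N, N*N) + 1;
    down  = @(i) mod(i-1-N, N*N) + 1;
    right = @(i) mod(i,     N*N) + 1;
    left  = @(i) mod(i-2,   N*N) + 1;

    % total energy of Eq. (90), J = 1: sum each spin with its right and up neighbour
    E = 0;
    for i = 1:N*N
      E = E - s(i)*( s(right(i)) + s(up(i)) );
    end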

Uniform sampling

We now go back to the problem of uniform sampling. The algorithm works as follows

Let ω be an empty vector

1) Generate a random configuration of the N2 spins.

2) Calculate the energy E of the generated configuration from Eq. (90) and update the vector by adding an extra element equal to E to it (in Matlab ω = [ω E]).

3) Go back to 1.


Figure 11: Periodic boundary conditions.

If we run this algorithm k times we generate a vector containing all the energies of the sampled configurations ω = (E1, E2, . . . Ek). Instead of writing the total energy in the vector we could also write the energy per bond e = E/(2N²). This quantity is probably more informative15 since it varies in the interval [−1, 1], where −1 is the energy of one of the two ground states where spins are either all s = +1 or all s = −1. Can you draw the configuration with energy e = 1? Note: instead of storing the vector with all the values of the energy you can also generate a histogram by dividing [−1, 1] in M equally spaced intervals and by binning the values of e in those intervals.

Boltzmann sampling with hit-and-miss method

We use now the hit-and-miss method to generate energies distributed according to exp(−βE), where β = 1/kBT is the inverse temperature. Select a value of β first

14Be careful with the mod operation on negative numbers.
15e is an intensive variable, its range does not change if we change the lattice size.


Figure 12: Helical boundary conditions.

and generate an empty vector ω. We indicate with E0 the ground state energy (E0 =−2N2).

1) Generate a random configuration of the N2 spins.

2) Calculate the energy E of the generated configuration from Eq. (90).

3) Generate a uniform random number r in [0, 1].

4) If r ≤ exp(−β(E − E0)) update the vector ω as above, by adding the energy E (this is a hit!).

5) Go back to 1.

Note that the hit-and-miss method at an infinite temperature (β = 0) reduces to the uniform sampling discussed above.

Questions

• For a given N (e.g. N = 16) generate the histogram of energies per bond (e = E/(2N²)) using the uniform sampling method, running the algorithm a large number of times (say 10^4 − 10^5).

• Use the hit-and-miss method to generate configurations with temperatures T = 10^5, T = 10^3, T = 10^2 and T = 10. (Note: we will always take kB = 1 throughout this course). What is for each of the four temperatures the fraction of the number of hits over the total number of configurations generated? Conclude that the hit-and-miss method performs very poorly.

• Another possibility is to use the so-called reweighting technique. From the histogram of the total energy generated by the uniform sampling, P(E), we can generate that for a given temperature T by PT(E) = exp(−E/T) P(E). Plot the histograms for T = 10^5, T = 10^3, T = 10 and T = 2. Does this look like a reasonable method to you? We will later compare these with those obtained from the Metropolis algorithm.

4.8 Metropolis algorithm for the Ising model

We discuss here the Metropolis algorithm for the Ising model. In this algorithm spin configurations are sampled by a Markov chain, which is characterized by a transition probability P(µ → ν), where µ and ν are two spin configurations. In the algorithm P(µ → ν) ≠ 0 only if the two configurations differ by a single spin. We initialize the system by selecting a random configuration µ of an N × N lattice with helical boundary conditions. The practical implementation of the algorithm is as follows:

1) Select one of the N² spins at random, say spin s(i).

2) Calculate ∆E = Efin − Ein, where Ein is the energy of the initial configuration and Efin the energy of the configuration obtained by flipping the spin s(i) → −s(i).

3) Flip the spin s(i) → −s(i) if ∆E ≤ 0 or if r ≤ exp(−∆E/kBT), where r is a uniform random number in [0, 1].

4) Goto 1).
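A minimal Octave/Matlab sketch of one possible implementation (using the helical neighbours introduced in Problem 4.7 and precomputing the two Boltzmann factors needed for ∆E > 0, as suggested in the tricks below; lattice size, temperature and run length are example values):

    N = 50; T = 2.0; nsweeps = 1000;
    M = N*N;
    s = 2*randi(2, 1, M) - 3;                 % random initial configuration
    w = exp(-[4 8]/T);                        % only possible Boltzmann factors for dE = 4, 8 (J = 1)
    m = zeros(1, nsweeps);                    % magnetization per spin after every sweep
    for sweep = 1:nsweeps
      for step = 1:M                          % one sweep = M attempted flips
        i  = randi(M);                        % pick a random spin
        nb = s(mod(i-1+N, M)+1) + s(mod(i-1-N, M)+1) ...   % up + down neighbours
           + s(mod(i,     M)+1) + s(mod(i-2,  M)+1);       % right + left neighbours
        dE = 2*s(i)*nb;                       % energy change if s(i) is flipped
        if dE <= 0 || rand() < w(dE/4)        % Metropolis rule
          s(i) = -s(i);
        end
      end
      m(sweep) = abs(sum(s))/M;               % magnetization per spin (absolute value)
    end
    plot(1:nsweeps, m);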

This cycle is repeated a large number of times (e.g. ∼ 10^6 − 10^8). Usually in Monte Carlo simulations we define the unit of time in sweeps. One sweep corresponds to one attempted step per spin, thus in a lattice with N² spins one sweep corresponds to N² Monte Carlo steps. Typical observables one computes in the Ising model are the energy, or the magnetization per spin defined as

    m = (1/N²) Σ_i s(i)    (91)

or spin-spin correlation functions. These quantities reach their equilibrium value after some equilibration time τeq. Another quantity we compute is the (temporal) autocorrelation function, which for a system in which time is a discrete variable is defined as

    χ(t) = (1/tf) Σ_{s=τeq}^{τeq+tf} [m(s) − 〈m〉][m(s + t) − 〈m〉]    (92)


where m(t) is the magnetization at timestep t, and 〈m〉 is the average equilibrium magnetization. This equation is only valid when the Monte Carlo run has reached an equilibrium configuration (t > τeq). The sum in Eq. (92) runs from τeq, a timestep where the simulation has definitely reached equilibrium, until tf timesteps later, with tf a reasonably large number of timesteps (e.g. all timesteps between τeq and the end of the simulation).

This autocorrelation function decreases exponentially like

    χ(t) ∼ e^{−t/τ}    (93)

where τ defines the autocorrelation time of the simulation. This is the timescale of the simulation, and it indicates how much time it takes to arrive at a new, uncorrelated configuration of the model.
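Given the recorded magnetizations m(1), . . . (one value per sweep), a simple Octave/Matlab sketch of Eq. (92) is the following (m, the estimated equilibration time teq and the maximum lag tmaxlag are assumed to be provided):

    % m       : vector of magnetizations, one entry per sweep
    % teq     : estimated equilibration time (in sweeps)
    % tmaxlag : largest lag t at which chi(t) is computed
    meq  = m(teq+1:end);                 % keep only the equilibrated part of the run
    mavg = mean(meq);
    tf   = length(meq) - tmaxlag;        % number of terms in the sum of Eq. (92)
    chi  = zeros(1, tmaxlag+1);
    for t = 0:tmaxlag
      chi(t+1) = sum( (meq(1:tf) - mavg).*(meq(1+t:tf+t) - mavg) )/tf;
    end
    chi = chi/chi(1);                    % normalize so that chi(0) = 1
    semilogy(0:tmaxlag, chi);            % exponential decay appears as a straight line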

Questions

• Use the Metropolis algorithm to compute the magnetization per spin defined as

    m = (1/N²) |Σ_i s(i)|    (94)

for a temperature T = 2 and N = 50 as a function of the Monte Carlo time step (you should obtain a plot similar to that shown in Fig. 2 of the lecture notes). Compare the average magnetization with the exact value:

    m(T) = [1 − sinh^{−4}(2J/T)]^{1/8}    (95)

• From your plot estimate the equilibration time τeq.

• Compute the energy histogram by running the Metropolis algorithm at T = 2. To generate the histogram start sampling from times > τeq. Compare your histogram with that obtained from the reweighting method in the previous exercise.

• For N = 50, calculate the autocorrelation function as a function of time, if the temperature is taken to be T = 2.2. Do this by iterating for a long time, and keep track of the magnetizations at each timestep (sweep). Normalize this autocorrelation function by dividing by χ(0).

• Determine the correlation time τ from these data using Eq. (93).

• Facultative - Fix the temperature to the critical value16 T = Tc = 2/log(1 + √2) = 2.269 . . ., and calculate the correlation time τ for lattices of sizes N × N, with N = 5, 10, 25. Plot these correlation times as a function of the lattice size. Show that these correlation times grow with N, and that they are much larger than in the case of T = 2.0 (this is called critical slowing down). Fit these T = Tc correlation times using the formula

    τ ∼ N^z    (96)

and estimate the dynamical exponent z.

16This is the exact value of the phase transition temperature.

Some interesting tricks

A Monte Carlo code consists of a central core which is repeated a large number of times. Therefore we should try to speed up this core to have a very efficient program (with a fast program we can sample a large number of different configurations and thus get an accurate result).

• A common mistake is to compute the exponentials at each Metropolis step. The function exp(x) requires a polynomial approximation on a computer and it is relatively slow. Since in the Ising model there are only a few discrete values of ∆E > 0 (how many?) it is much faster to calculate these exponentials once and store them in a small vector.

• To calculate ∆E we only need to know the 4 spins up, down, right and left of the given spin, because all the other spins are unchanged.

• Often one performs simulations at different temperatures, say T1 < T2 < T3 . . .. To avoid long runs to reach equilibration one can take as initial configuration for the simulation at T2 the last spin configuration of the run at T1, and so on.

4.9 The heat bath algorithm for the Potts model

The Potts model is a generalization of the Ising model, where a spin can have q values: si = 0, 1, 2 . . . q − 1. The energy of a configuration is given by:

    E = −J Σ_{〈ij〉} δ_{si sj}    (97)

where the sum is extended over pairs of nearest neighboring spins, J > 0 is the coupling strength and δ_{si sj} is a discrete δ-function17. The Ising model corresponds to the case q = 2 (with a simple variable transformation you can show this). For a general value of q > 2 the model has a phase transition from a “ferromagnetic” phase, where the majority of the spins assume one of the q possible values, to a high temperature phase, where the spins take each of the q possible values with equal probability.

In this exercise we compare the performance of the standard Metropolis algorithm18 with that of the heat bath algorithm. The heat bath algorithm is a single spin flip algorithm: the selection probability of two states µ and ν that differ by more than a single spin is zero, g(µ → ν) = 0. If sk is the spin to be flipped, we indicate with Em the energy of the state in which sk = m, calculated according to the Hamiltonian (97). Let us consider two configurations µ and ν differing by the spin sk, and let sk = n in ν; then the heat bath algorithm is defined by

17The definition is δmn = 1 if m = n and δmn = 0 otherwise.
18The Metropolis algorithm for the q state Potts model is a generalization of that of the Ising model.

    g(µ → ν) = (1/(qM)) e^{−βEn} / Σ_{m=0}^{q−1} e^{−βEm}    (98)

where M is the total number of spins in the lattice. The heat bath algorithm will thus select a new value for the spin depending on the energy of the new configuration. In this algorithm the acceptance ratio is A(µ → ν) = 1 for all states.
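One common way to implement a heat-bath update in practice is to pick a spin at random and then draw its new value directly from the normalized Boltzmann weights of the q possible local energies. A minimal Octave/Matlab sketch of a single such update (helical boundary conditions, J = 1; the loop over Monte Carlo steps is omitted):

    q = 10; T = 0.5; beta = 1/T;
    N = 20; M = N*N;
    s = randi(q, 1, M) - 1;                            % random initial configuration, values 0..q-1
    i  = randi(M);                                     % pick a random spin
    nb = [ s(mod(i-1+N, M)+1), s(mod(i-1-N, M)+1), ...
           s(mod(i,     M)+1), s(mod(i-2,  M)+1) ];    % its four neighbours
    E  = zeros(1, q);
    for m = 0:q-1
      E(m+1) = -sum(nb == m);                          % energy of the bonds of site i if s(i) = m
    end
    p = exp(-beta*E); p = p/sum(p);                    % probabilities proportional to exp(-beta*E_m)
    s(i) = find(cumsum(p) >= rand(), 1) - 1;           % draw the new value of the spin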

Questions

• Show that the transition probabilities defining the heat bath algorithm obey detailed balance and ergodicity.

• For q = 10 and starting from a random initial configuration, determine the equilibration time for the heat bath algorithm and the Metropolis algorithm for T = 0.5 by plotting the internal energy as a function of the number of iterations. Take the size of the lattice 20 × 20, and use periodic or helical boundary conditions. You can experiment with other lattice sizes, temperatures and values for q if time allows.

• Which of the algorithms converges the fastest, and how much faster?

We again apply a trick like in the Ising simulations, so that we only need to calculate a few exponentials beforehand, and store them in a small array, thus improving the speed of the algorithm.

4.10 Kawasaki algorithm: The local version

In this exercise we use the Kawasaki algorithm to study the conserved order parameter Ising model. Differently from the single spin flip Metropolis algorithm, the Kawasaki algorithm conserves the total magnetization of the system:

    M = Σ_i si    (99)

The idea is very simple: pick two neighboring spins at random. If these are equal do nothing. If they are different, try to swap them according to the standard Metropolis rule. We consider an Ising lattice of size Lx = 200 and Ly = 50. The boundary conditions are fixed in the y-direction19 and helical in the x-direction as illustrated in Fig. 13.

19Fix initially the values of the spins in the top and bottom rows and never select these spins for flipping.
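A minimal Octave/Matlab sketch of a single local Kawasaki move. For simplicity it uses a plain N × N lattice with helical boundary conditions everywhere (the fixed top and bottom rows of the geometry of Fig. 13 are left out); ∆E is obtained by comparing the energy of all bonds touching the two selected sites before and after the swap.

    N = 50; M = N*N; T = 0.9*2.269;                    % example size and temperature (0.9 Tc)
    s = 2*randi(2, 1, M) - 3;                          % random +1/-1 configuration
    nbrs = @(i) [mod(i-1+N, M)+1, mod(i-1-N, M)+1, mod(i, M)+1, mod(i-2, M)+1];

    i = randi(M);                                      % pick a random spin
    nb = nbrs(i);
    j = nb(randi(4));                                  % and one of its four neighbours
    if s(i) ~= s(j)                                    % only opposite spins can be swapped
      Eloc = @(v) -( v(i)*sum(v(nbrs(i))) + v(j)*sum(v(nbrs(j))) );   % bonds touching i or j
      v = s;  v([i j]) = s([j i]);                     % trial configuration with i and j swapped
      dE = Eloc(v) - Eloc(s);
      if dE <= 0 || rand() < exp(-dE/T)                % Metropolis rule
        s = v;
      end
    end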


Figure 13: Left: lattice configuration used in the problem. We use helical boundary conditions in the x direction and open boundary conditions in the y direction. Right: start configurations for the simulation: (a) half of the lattice filled with spins “plus” and half with spins “minus”, separated by a straight horizontal interface, and (b) a central square of spins “plus” surrounded by spins “minus”.

Questions

• Consider the start configuration of Fig. 13(a) and a temperature equal to 0.9 Tc, where Tc is the critical temperature for the two dimensional Ising model20. Run a simulation with the Kawasaki algorithm for N Monte Carlo steps and determine Nacc, the number of times a swap is actually performed.

• Determine the ratio Nacc/N at T = 0.5Tc, 1.2Tc and 2.5Tc. Does the ratio increase or decrease with temperature? Does this behavior match your expectations?

• Using the initial configuration of Fig. 13(a), plot the energy per spin as a function of the number of sweeps at T = 0.9Tc. From the run determine approximately the equilibration time.

4.11 Kawasaki algorithm: The non-local version

A more efficient simulation algorithm contains non-local moves, in which non-neighboring spins can be swapped. The most efficient way to do this is to create two vectors containing the list of plus and minus spins, for instance ~u = (1, 2, 3, 5, 7, 10 . . .) and ~d = (4, 6, 8, 9 . . .). This means that the spins 1, 2, 3, 5, 7 . . ., labeled as in Fig. 13(left), are plus. We also keep the configuration of spins stored in a vector of length Lx × Ly, as done for the Ising model, i.e. ~v = (1, 1, 1, −1, 1, −1, 1, −1, −1, 1 . . .). We select a random spin from the “plus” list and one from the “minus” list, say ui and dj, and calculate ∆E, the energy difference between the initial configuration and the one where the spins are swapped. This swap is then accepted according to the Metropolis rule. If the swap is accepted we update the vectors to u′i = dj and d′j = ui, as well as v′(ui) = −v(ui) and v′(dj) = −v(dj).

20The critical temperature is Tc = 2/log(1 + √2) ≈ 2.269.


Questions

• Using the non-local algorithm estimate once again the acceptance probabilities Nacc/N for the three temperatures given above, starting from the initial configuration of Fig. 13(a). Is this lower or higher compared to the local algorithm?

• What is the equilibration time for T = 0.9Tc? Compare it with that obtained in the previous exercise.

We use now as initial condition the configuration in Fig. 13(b).

• Run the Kawasaki algorithm with this initial condition with a square of size 50 × 50, for T = 0.25Tc and T = 0.5Tc. Compare the evolution of the shape of the square at these two temperatures, by plotting the configuration after a few selected numbers of timesteps.

• Consider now a temperature T = 0.8 Tc and initial squares of sizes 10×10, 20×20 and 30×30. Show that as the simulation is started the smallest square rapidly dissolves, while the largest one persists after a large number of time steps.

4.12 Hard Disks

In this problem we consider a Monte Carlo algorithm for hard spheres in two dimensions (hard disks). The interaction between particles is described by the pair potential

    VHS(r) = +∞ if r ≤ 2σ, and 0 otherwise    (100)

where r is the distance between two particles and σ is the disk radius. A Metropolis algorithm for this system consists of the following steps:

1) Pick up a particle at random, say particle i.

2) Perform a move ~ri → ~r ′i = ~ri + ~∆, where ~∆ is a random shift.

3) Check that the new position of the particle does not overlap with that of any other particle, i.e. that |~rj − ~r ′i| ≥ 2σ for every j ≠ i. If so accept the move, otherwise reject it.

4) Goto 1).

This is precisely the Metropolis algorithm we have seen for the Ising model. If a move brings two particles to overlap, ∆E = +∞ and the move is rejected; otherwise ∆E = 0 and the move is always accepted. As a practical implementation we choose the random shift as

∆α = (2u− 1)q (101)


Figure 14: Cell list method: the overlap of the new position is checked against all particles of some neighboring cells.

with α ∈ {x, y}, u a uniform random number in [0, 1] and q a characteristic length. With the choice (101) the algorithm is ergodic.

We consider an L × L square with periodic boundary conditions. To implement the boundary conditions we perform each move as follows:

    ~r ′i = ~ri + ~∆ − L int((~ri + ~∆)/L)    (102)

where int(x) denotes the integer part21 of x (use the same type of approach to compute distances between particles).

The N particles can be placed initially on a regular grid of points or at random, but such that the interparticle distance is always ≥ 2σ.

We choose next the width of a move q (Eq. (101)). If q is very small almost every move is accepted but the update is very slow. We choose q as the typical interparticle distance, thus for N particles in the L × L square q ≈ L/√N.
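A minimal Octave/Matlab sketch of a single Metropolis move with the periodic wrapping of Eq. (102) and a brute-force overlap check (no cell list; sizes are example values and the random initialization shown should, in a real program, be replaced by one that avoids overlaps):

    L = 100; sigma = 2; Npart = 100;
    q = L/sqrt(Npart);                                 % width of the trial moves
    pos = L*rand(Npart, 2);                            % particle positions (illustrative only)
    i = randi(Npart);                                  % pick a particle at random
    rnew = pos(i,:) + q*(2*rand(1,2) - 1);             % trial move, Eq. (101)
    rnew = rnew - L*floor(rnew/L);                     % periodic wrapping, Eq. (102)
    d = pos - repmat(rnew, Npart, 1);                  % vectors to all other particles
    d = d - L*round(d/L);                              % nearest periodic image
    dist2 = sum(d.^2, 2);  dist2(i) = inf;             % ignore the particle itself
    if all(dist2 >= (2*sigma)^2)                       % no overlap: accept the move
      pos(i,:) = rnew;
    end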

The slowest part of the algorithm is the check that the updated particle position does not overlap with any of the other N − 1 particles. One can alleviate this problem by constructing a “cell list”. We divide our total area L² in squares of size rc × rc. We generate a list which associates each particle to a given cell. If the position update brings the particle into the l-th cell, we check that it does not overlap with any of the particles in the cell l plus, possibly, some of its neighboring cells (see Fig. 14). In practice, rc can be chosen to be somewhat bigger than 2σ, the diameter of the hard disks.

21In Matlab/Octave use the function floor()


Here our interest is in the computation of the pair correlation function defined as

    g(r) = (L²/(2πrN)) 〈 Σ_i Σ_{j≠i} δ(r − rij) 〉    (103)

Here rij is the distance between particles i and j. g(r) is proportional to the probability of finding a particle at a distance r, given that there is a particle at the origin. Note that for a hard disk fluid g(r) = 0 for r < 2σ. For a low density system (as a gas), g(r) is approximately constant for r > 2σ. For a fluid at higher densities g(r) has a non-trivial shape, which provides information on the local arrangement of the molecules.

One can compute it by building up a histogram of distances for all pairs of particles as

    g(r) ≈ 〈N(r, ∆r)〉 / (ρ 2πr ∆r)    (104)

where ∆r is some discrete distance, 〈N(r, ∆r)〉 is the average number of pairs with distance in the interval [r, r + ∆r] and ρ = N/L² is the particle density. It is also possible to use the cell list method to compute this quantity. Given a particle in the l-th cell we only have to check distances with particles in the same and in neighboring cells, if we are interested in g(r) up to a certain distance rmax.
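A minimal Octave/Matlab sketch of the histogram of Eq. (104) for a single configuration, using the pos array, L and Npart of the previous sketch. Here the pair count is averaged per particle, so that g(r) → 1 at large r for a dilute system; in a production run one would also average the counts over many equilibrated configurations.

    dr = 0.1; rmax = L/2;
    edges = 0:dr:rmax;
    counts = zeros(1, length(edges)-1);
    for i = 1:Npart
      for j = 1:Npart
        if j ~= i
          d = pos(i,:) - pos(j,:);
          d = d - L*round(d/L);                  % minimum-image convention
          r = sqrt(sum(d.^2));
          k = floor(r/dr) + 1;                   % histogram bin of this pair
          if k <= length(counts)
            counts(k) = counts(k) + 1;
          end
        end
      end
    end
    rho  = Npart/L^2;
    r    = edges(1:end-1) + dr/2;                % bin centres
    gofr = counts./(Npart*rho*2*pi*r*dr);        % pair count per particle, normalized as in Eq. (104)
    plot(r, gofr);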

Questions

• It is customary to define the packing fraction of the fluid as η = NπR²/L², which is the total area occupied by the disks divided by the area of the system. What is the maximal value of η one can have for hard disks? (analytical computation)

• Consider a system with L = 100 and R = 2. Compute the average acceptance ratio of your Monte Carlo moves for some values of η, and some given q.

• Compute g(r) for some values of η. Show that g(r) becomes oscillatory at high η.

• Facultative - Set up a program which makes use of the cell list method and compare it with the standard algorithm.

4.13 Monte Carlo in continuous time for a persistent random walk

We consider a one dimensional random walk. In this model a particle can occupy the sites of a one-dimensional infinite lattice x = {. . . − 2, −1, 0, 1, 2 . . .}. Initially the particle is placed at position x = 0. At each time step (say ∆t = 1) the particle either stays in the same position or it jumps to one of the two neighboring sites. We choose as probabilities P(x → x) = 1 − 2α and P(x → x ± 1) = α, with α a parameter. If α


becomes small the particle will spend most of its time on a site and will rarely jump to neighboring sites. Let us consider the case α = 0.01.

In the case of the continuous time algorithm we first estimate analytically ∆t′, the average time the walker spends on the same lattice point. We choose with equal probability a jump to the right or to the left. At every jump we update the time as t → t + ∆t′.
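A minimal Octave/Matlab sketch of the two versions of the first-passage measurement asked for below; for the continuous-time version we use the mean waiting time ∆t′ = 1/(2α), since the walker leaves its site with probability 2α per unit time step.

    alpha = 0.01; target = 20;

    % (a) brute-force simulation: one attempted move per time step dt = 1
    x = 0; t_brute = 0;
    while x < target
      u = rand();
      if u < alpha
        x = x + 1;
      elseif u < 2*alpha
        x = x - 1;
      end
      t_brute = t_brute + 1;
    end

    % (b) continuous-time version: advance the clock by dt' = 1/(2*alpha) per jump,
    %     then jump left or right with equal probability
    dtp = 1/(2*alpha);
    x = 0; t_cont = 0;
    while x < target
      if rand() < 0.5
        x = x + 1;
      else
        x = x - 1;
      end
      t_cont = t_cont + dtp;
    end
    fprintf('first-passage times: %d (brute force), %.1f (continuous time)\n', t_brute, t_cont);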

Questions

• Given the initial position of the particle at x = 0 compute the time it takes to reach x = 20 for the first time. Average over many different simulations. Compare the result obtained from the direct (brute force) simulation with that obtained from the continuous time algorithm explained above.

• What is the difference in run time of the two different codes to reach times t = 10^6 ∆t?

4.14 Birth-annihilation process

Consider a system made of a single particle type, evolving in time according to the two reactions:

    A −λ→ ∅    (105)
    A −µ→ A + A    (106)

where λ and µ are the rates. These reactions define the birth-annihilation process.

• Write down and solve the deterministic differential equation which describes the evolution of the average number of particles.

• Simulate the evolution of the system using the Gillespie algorithm, starting from N = 10,000 and N = 100 particles, using λ = 1, µ = 0.2. Compare the simulation results with the solutions of the deterministic differential equation.

• Starting initially from N = 100 particles and using λ = 1 and µ = 0.2, trace a histogram of extinction times (the extinction time is the time it takes to remove all particles from the system), by repeating your simulations sufficiently many times.

4.15 Lotka-Volterra Model

The Lotka-Volterra model describes the evolution of a predator-prey population. The prey is Y1 and the predator Y2. The reactions are:

    Y1 −k1→ 2 Y1    (107)
    Y1 + Y2 −k2→ 2 Y2    (108)
    Y2 −k3→ ∅    (109)


where ki with i = 1, 2 and 3 are the rates. Let us choose the rates k1 = 1, k2 = 0.005 and k3 = 0.6 and start from an initial state with 100 prey and 20 predators.

• Using the Gillespie algorithm produce plots of the trajectory in the plane X1, X2 and of X1(t) together with X2(t).

• Estimate the average number of oscillations performed in the system, before ending up in one of the two absorbing states (i.e. X1 = X2 = 0 or X1 → ∞, X2 = 0).

• How often is the system ending in X1 = X2 = 0 compared to X1 → ∞, X2 = 0?

• Determine the distribution of extinction times (here we define the extinction time as the time it takes to reach a state in which either X1 = 0 or X2 = 0).

4.16 Brusselator

We consider here a model of autocatalytic reaction known as the Brusselator. This is characterized by the following reactions22

    ∅ −k1→ Y1    (110)
    Y1 −k2→ Y2    (111)
    2 Y1 + Y2 −k3→ 3 Y1    (112)
    Y1 −k4→ ∅    (113)

where ki with i = 1, 2, 3 and 4 are the rates. We fix them as in the original paper by Gillespie to the values k1 = 5·10^3, k2 = 50, k3 = 5·10^{−5} and k4 = 5.

• Using the Gillespie algorithm produce plots of the trajectory in the X1-X2 plane and of X1(t) and X2(t), using as initial condition X1(0) = 1000 and X2(0) = 2000.

• Repeat the simulations with the three different initial conditions (a) X1(0) = 1000, X2(0) = 4000, (b) X1(0) = 5000, X2(0) = 5000 and (c) X1(0) = 500, X2(0) = 500. Show that the system approaches a “limit cycle” type of oscillation. Compare your results with those reported in Fig. 15 of Gillespie's paper.

22This is a slightly different notation from that used in the original paper by Gillespie, see http://www.caam.rice.edu/~cox/gillespie.pdf, Eqs. 44(a-d).


A The Ising and Potts models

The Ising model is a cornerstone of classical statistical mechanics. It is sufficiently simple that it can be handled analytically, through exact computations (in one and two dimensions), or with approximate methods. In particular in two and higher dimensions the model has a phase transition. The Ising model is defined on a lattice in d dimensions, whose sites are occupied by a spin taking two values s = ±1. The energy of a configuration is given by

    E = −J Σ_{〈ij〉} si sj    (114)

where J > 0 is the coupling constant and where the previous sum is extended to nearest neighbors. This model has a transition from a low temperature ferromagnetic phase to a high temperature phase which is paramagnetic. The transition temperature for the two dimensional square lattice is known exactly and it is equal to

    Tc = 2J / ln(1 + √2)    (115)

In the ferromagnetic phase the magnetization m = 〈si〉 ≠ 0 (the average value of a spin). It can be computed exactly for an infinite two dimensional square lattice and has the form

    m(T) = [1 − sinh^{−4}(2J/T)]^{1/8}    (116)

The magnetization vanishes on approaching Tc from below with a power-law singularity,

    m ∼ (Tc − T)^β    (117)

Other quantities also show some power-law behavior. For instance the spin-spin correlation function for two spins separated by a distance r in the lattice is given by

    〈si sj〉 ∼ e^{−r/ξ}    (118)

where ξ defines the correlation length, a quantity which diverges when approaching Tc as follows:

    ξ ∼ |T − Tc|^{−ν}    (119)

The Ising model has been solved exactly in two dimensions and the critical exponents are given in terms of this exact solution. In three dimensions the value of the exponents is known approximately. Table 1 summarizes their values. The exponents are universal: they do not depend on the type of lattice (e.g. square, triangular . . . ) but only on the dimensionality of the system.

The Potts models are generalizations of the Ising model. In the q-state model each spin can take q values si = 0, 1, 2 . . . q − 1, with an energy given by

    E = −J Σ_{〈ij〉} δ_{si sj}    (120)


    Dimension    α      β      γ      ν
    2            0      1/8    7/4    1
    3            0.12   0.31   1.25   0.64

Table 1: Summary table of the critical exponents of the Ising model in two and three dimensions. The exponents are associated with the singular behavior of the specific heat (α), magnetization (β), susceptibility (γ) and correlation length (ν).

and J > 0. Here δ_{si sj} is the discrete (Kronecker) delta function which is defined as δmn = 1 if m = n and δmn = 0 otherwise. It is easy to show that the q = 2 model corresponds to the Ising model. We can show this by using the identity:

    δmn = (1/2)(2m − 1)(2n − 1) + 1/2    (121)

which is valid for m, n = 0, 1. The transformation s̃ = 2s − 1 takes us from Potts spins s = 0, 1 to Ising spins s̃ = ±1. Using the identity we find E_Potts({si}) = (1/2) E_Ising({s̃i}) − (J/2) Σ_{〈ij〉} 1, where the last sum simply counts the number of bonds: the two models differ only by a constant and by a rescaling of the coupling. The Potts model in two and higher dimensions has a phase transition from a high temperature phase where the spins point in random directions, to a “ferromagnetic” phase in which they predominantly align along one of the q values of the spins.

B The Master equation

Eq. (18) describes the stochastic evolution of the system evolving at discrete times ∆t. By taking the limit ∆t → 0 we can derive the so-called Master equation in its standard form. This is a differential equation that can be deduced as follows:

p(t+ ∆t)− p(t) = (P∆t − 1) p(t) (122)

where 1 is the identity matrix. Dividing by ∆t and taking the limit ∆t→ 0 we obtain

    (d/dt) p(t) = −H p(t)    (123)

where we have introduced the matrix H as

    H = lim_{∆t→0} (1 − P_{∆t}) / ∆t    (124)

As we have seen, the sum of the elements in each column of P is equal to one. Hence the columns of H add up to zero. Moreover the non-diagonal elements of H are:

Hµν = −h(ν → µ) (125)

where h(ν → µ) are transition rates from ν to µ, i.e. transition probabilities per unit time. We now write Eq. (123) explicitly in terms of matrix elements, separating the diagonal and non-diagonal terms:

    (d/dt) pµ(t) = −Σ_{ν≠µ} Hµν pν(t) − Hµµ pµ(t)    (126)


Using the fact that the sum of the elements on each column equals zero and using Eq. (125) we can rewrite it as

    (d/dt) pµ(t) = Σ_{ν≠µ} h(ν → µ) pν(t) − Σ_{ν≠µ} h(µ → ν) pµ(t)    (127)

which is the basic form of the Master equation. The two terms on the right hand side of this equation have an obvious interpretation as “gain” and “loss” terms.
