Nonequilibrium candidate Monte Carlo is an efficient … · Nonequilibrium candidate Monte Carlo is an efficient tool for equilibrium simulation ... favored tools for sampling from

Corrections

CHEMISTRY, BIOPHYSICS AND COMPUTATIONAL BIOLOGYCorrection for “Nonequilibrium candidate Monte Carlo is anefficient tool for equilibrium simulation,” by Jerome P. Nilmeier,Gavin E. Crooks, David D. L. Minh, and John D. Chodera,which appeared in issue 45, November 8, 2011, of Proc Natl AcadSci USA (108:E1009–E1018; first published October 24, 2011;10.1073/pnas.1106094108).The authors note that Reference 62 appeared incorrectly. The

corrected reference appears below.Additionally, the authors note that Equations 39, 41, and 42

appeared incorrectly. The corrected equations appear below.

xt ¼ x*t þΔtγm

Ftðx*t Þ þffiffiffi2

p �Δtγm

�1=2ξt; [39]

xt ¼ x*t þΔtγm

Ftðx*t Þ þffiffiffi2

p �Δtγm

�1=2ξt

x*t ¼ xt þ Δtγm

FtðxtÞ þffiffiffi2

p �Δtγm

�1=2~ξ t;

[41]

~ξ t ¼−1ffiffiffi2

p�Δtγm

�1=2½FtðxtÞ þ Ftðx*t Þ�− ξt: [42]

62. Athènes M, Marinica M-C (2010) Free energy reconstruction from steered dynamics

without post-processing. J Comput Phys 229:7129–7146.

www.pnas.org/cgi/doi/10.1073/pnas.1207617109

BIOPHYSICS AND COMPUTATIONAL BIOLOGYCorrection for “Successful prediction of the intra- and extra-cellular loops of four G-protein-coupled receptors,” by Dahlia A.Goldfeld, Kai Zhu, Thijs Beuming, and Richard A. Friesner,which appeared in issue 20, May 17, 2011, of Proc Natl Acad SciUSA (108:8275–8280; first published May 2, 2011; 10.1073/pnas.1016951108).The authors note that their conflict of interest statement was

omitted during publication. The authors declare that R.A.F. isa founder of Schrodinger, Inc.


www.pnas.org PNAS | June 12, 2012 | vol. 109 | no. 24 | 9665

CORR

ECTIONS



Nonequilibrium candidate Monte Carlo is anefficient tool for equilibrium simulationJerome P. Nilmeiera, Gavin E. Crooksb, David D. L. Minhc, and John D. Choderad,1

aBiosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA 94550; bPhysicalBiosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720; cBiosciences Division, Argonne National Laboratory, Argonne,IL 60439; and dCalifornia Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA 94720

Edited by Michele Parrinello, ETH Zurich, Lugano, Switzerland, and approved August 30, 2011 (received for review April 21, 2011)

Metropolis Monte Carlo simulation is a powerful tool for studyingthe equilibrium properties of matter. In complex condensed-phasesystems, however, it is difficult to design Monte Carlo moves withhigh acceptance probabilities that also rapidly sample uncorrelatedconfigurations. Here, we introduce a new class of moves based onnonequilibrium dynamics: Candidate configurations are generatedthrough a finite-time process in which a system is actively drivenout of equilibrium, and accepted with criteria that preserve theequilibrium distribution. The acceptance rule is similar to the Me-tropolis acceptance probability, but related to the nonequilibriumwork rather than the instantaneous energy difference. Ourmethodis applicable to sampling from both a single thermodynamic stateor a mixture of thermodynamic states, and allows both coordinatesand thermodynamic parameters to be driven in nonequilibriumproposals. Whereas generating finite-time switching trajectoriesincurs an additional cost, driving some degrees of freedom whileallowing others to evolve naturally can lead to large enhancementsin acceptance probabilities, greatly reducing structural correlationtimes. Using nonequilibrium driven processes vastly expands therepertoire of useful Monte Carlo proposals in simulations of densesolvated systems.

expanded ensembles ∣ Markov chain Monte Carlo ∣ Metropolis–Hastings ∣molecular dynamics

In this paper, we describe a new technique for constructing effi-cient Markov chain Monte Carlo (MCMC) (1) moves that both

have high acceptance rates and also allow rapid transit throughconfiguration space, greatly enhancing convergence rates insimulations of dense solvated systems. The Metropolis MonteCarlo (MC) (2, 3) sampling procedure is generalized by usingnonequilibrium processes to generate candidates for equilibriumsimulations. Within this framework, moves that are efficient foran isolated part of a system but lead to near-universal rejectionin standard Monte Carlo simulations of dense mixtures can beconverted to nonequilibrium processes that generate candidateswith higher acceptance probabilities. In this new procedure, theacceptance criteria is related to the nonequilibrium work, ratherthan the potential energy difference used in traditional MonteCarlo moves.

Since their introduction in the mid-twentieth century, MC(2, 3) and molecular dynamics (MD) (4) simulations have becomefavored tools for sampling from complex multidimensional distri-butions, such as configurations of microscopic physical systemsin thermodynamic ensembles. However, these methods can pro-duce highly correlated samples, leading to slow convergence ofestimated expectations. Whereas MD requires the use of smalltimesteps for numerical stability and to approximate samplingfrom the desired distribution, MC simulations can, in principle,make use of nonlocal moves that accelerate mixing of the Markovchain. Indeed, vast improvements in efficiency have been ob-tained by applying cleverly constructed move sets that exploitphysical intuition about the system under study, such as clustermoves in Potts and Ising model simulations (5, 6).

Designing efficient moves requires striking a balance betweenrapid traversal of phase space and ensuring reasonable accep-tance probabilities. For complex heterogeneous systems suchas solvated biomolecules, achieving this balance remains challen-ging. Typically, efficient moves exploit physical insight into kine-tically slow processes and energetically favorable configurations.Often, the experimenter may possess physical insight about onecomponent in the system (e.g., a biomolecule) that permits thedesign of moves that would be efficient in the absence of othercomponents (e.g., solvent), but encounter energetically unfavor-able interactions in their presence, reducing acceptance rates tolevels where standard MC provides no benefit. As an illustrativeexample, consider a bistable dimer—a pair of particles interact-ing with a potential with minima in compact or extended config-urations, separated by a high barrier (see Fig. 1). For simulationsof this system in a vacuum, a simple and effective standard MCmove is to instantaneously increase the interparticle distancefrom a compact to extended configuration (or conversely, to de-crease the distance from an extended to compact configuration).When the dimer is immersed in a dense solvent, however, thismove is met with near-universal rejection because solvent mole-cules overlap with proposed configurations.

One approach that can allow unperturbed degrees of freedomto relax, and hence maintain a reasonable acceptance rate, is touse a nonequilibrium process to generate candidate configura-tions. Using the appropriate acceptance criterion for the finalconfiguration will preserve the equilibrium distribution. In thecase of the bistable dimer immersed in dense solvent, the exten-sion (or contraction) may be carried out over a finite number ofincrements interleaved with standard Metropolis Monte Carlo ormolecular dynamics steps that allow the solvent to reorganize toavoid overlap with the dimer particles.

The basic idea of using nonequilibrium driven processes asMonte Carlo moves has precedents in both the statistical (7, 8)and chemical (9–12) literature. Among the latter, Athènes devel-oped “work-bias Monte Carlo” to enhance acceptance rates ingrand canonical Monte Carlo simulations (9), Stern presenteda scheme to sample an equilibrium mixture of protonation statesat constant pH in explicit solvent (11) [although an inexact var-iant was proposed earlier (10)], and Nilmeier et al. (12) proposedthe driving of a subset of degrees of freedom to enhance accep-tance rates (using an approximate acceptance criterion). Non-

Author contributions: J.P.N., G.E.C., D.D.L.M., and J.D.C. designed research; D.D.L.M. andJ.D.C. performed research; J.P.N., G.E.C., D.D.L.M., and J.D.C. contributed new analytictools; J.D.C. analyzed data; and J.P.N., G.E.C., D.D.L.M., and J.D.C. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: Simulation and analysis scripts, as well as simulation data generatedfor this paper, are available online at http://simtk.org/home/ncmc/; simtk.org is an onlinerepository for exchanging scientific data produced by SimBios, a National Institutes ofHealth Center for Biomedical Computing.1To whom correspondence should be addressed. E-mail: [email protected].

See Author Summary on page 18199.

www.pnas.org/cgi/doi/10.1073/pnas.1106094108 PNAS ∣ November 8, 2011 ∣ vol. 108 ∣ no. 45 ∣ E1009–E1018

BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

CHEM

ISTR

YPN

ASPL

US

http://simtk.org/home/ncmc/

http://simtk.org/home/ncmc/

http://www.pnas.org/content/108/45/E1009/1

equilibrium processes have also been used to generate configura-tions for parallel tempering simulations (13–15).

Here, we unify these ideas and significantly extend the appli-cation of nonequilibrium moves in physical simulations. We pre-sent a theoretical framework, nonequilibrium candidate MonteCarlo (NCMC), that is applicable to both single thermodynamicstates (e.g., NVT, NpT, μVT ensembles) as well as mixtures ofthermodynamic states [e.g., expanded ensemble (16, 17) simula-tions]. Nonequilibrium proposals may drive a subset of degreesof freedom, the thermodynamic parameters characterizing theequilibrium distribution, or both, significantly expanding the re-pertoire of Monte Carlo moves that lead to high acceptance andefficient mixing in dense condensed-phase systems.

Equilibrium and Expanded Thermodynamic EnsemblesFor physical systems in equilibrium, the probability of observing amicrostate is given by the Boltzmann distribution,

πλðxÞ ¼ Z−1λ e−uλðxÞ; Zλ ¼

ZΓdx e−uλðxÞ; [1]

where x ∈ Γ denotes a microstate of the system (which may in-clude coordinates, momenta, and other dynamical variables, suchas simulation box dimensions), λ denotes a set of thermodynamicparameters whose values define a thermodynamic state, and Zλ isa normalizing constant known as the partition function.

The reduced potential uλðxÞ depends on the thermodynamicensemble under consideration (18). For instance, in an isother-mal-isobaric (NpT) ensemble, the reduced potential will assumethe form,

uλðxÞ ¼ β½HðxÞ þ pV ðxÞ�; [2]

which depends on the Hamiltonian HðxÞ (which may include anexternal biasing potential, and is presumed to be invariant undermomentum inversion) and the system volume V ðxÞ. In this en-semble, the vector of controllable thermodynamic parametersλ≡ fβ; H; pg includes the inverse temperature β, the Hamilto-nianHðxÞ, and external pressure p. Other thermodynamic param-

eters and their conjugate variables can be included or excluded togenerate alternative physical (or unphysical) ensembles.

To allow sampling from multiple thermodynamic states withina single simulation, we also define an expanded ensemble (16,17), which specifies a joint distribution for ðx; λÞ in a weightedmixture of thermodynamic states,

πðx; λÞ ¼ ZλπλðxÞωλ

∑ν∈G

RΓ dy ZνπνðyÞων

; [3]

where ωλ > 0 specifies an externally imposed weight for state λ.Here, λ ∈ G may assume values in a discrete or continuous spaceG. If the set G consists of a single value of λ, a single thermody-namic state is sampled, and πðx; λÞ ¼ πλðxÞ. These thermody-namic states may correspond to a variety of different states ofinterest, such as temperatures in a simulated tempering simula-tion (19), alchemical states in a simulated scaling simulation (20),or protonation states in a constant-pH simulation (21).

Nonequilibrium Candidate Monte CarloWe first describe the general form of NCMC. At the start of aniteration, the current sample in the Markov chain, ðxðnÞ; λðnÞÞ,which is assumed to be drawn from πðx; λÞ, is used to initializea trajectory, ðx0; λ0Þ ¼ ðxðnÞ; λðnÞÞ. A candidate configurationðxT; λTÞ is then proposed through a nonequilibrium process inwhich a set of degrees of freedom and/or thermodynamic param-eters may be driven according to some protocol (22) selected witha probability dependent only on ðx0; λ0Þ. Even if we only wish tosample from a single thermodynamic state λ, we may use a pro-tocol that transiently drives the thermodynamic parameters awayfrom λ and back again (as in ref. 14). Finally, an acceptance prob-ability is computed and used to decide whether the next sample inthe Markov chain, ðxðnþ1Þ; λðnþ1ÞÞ, is the candidate, ðxT; λTÞ, or themomentum reversal of the initial sample, ð~xðnÞ; λðnÞÞ.

An NCMC move begins by selecting a protocol Λ from a set ofpossible protocols with probability PðΛjx0; λ0Þ, such that there ex-ists a reverse protocol labeled as ~Λ (to be defined momentarily)with Pð ~Λj~xT; λTÞ > 0. A protocol Λ specifies both a series of Tperturbation kernels αtðx; yÞ and propagation kernels Ktðx; yÞ,arranged in an alternating pattern such that Λ≡ fα1; K1; α2; K2;…; αT; KTg. Both αtðx; yÞ and Ktðx; yÞ are conditional probabil-ities of y ∈ Γ given any x ∈ Γ, and must satisfy the requirementthat if pðx; yÞ > 0, then pðy; xÞ > 0, for p substituted by αt and Kt.

Each perturbation kernel αt drives some or all of the degreesof freedom x in a stochastic or deterministic way (e.g., by drivinga torsion angle, a distance between two atoms, or the volume ofthe simulation cell). Similarly, each propagation kernel Kt propa-gates some or all of the coordinates of the system at fixed λtaccording to some form of MCMC or MD [e.g., MetropolisMonte Carlo (2, 3), velocity Verlet (23) deterministic dynamics,or overdamped Langevin stochastic dynamics (24, 25)] that mayalso depend on the time index t. Interleaving perturbation andpropagation allows for energetically unfavorable interactionsintroduced by perturbation to be relaxed during propagation,potentially increasing acceptance rates relative to the instanta-neous perturbations of standard Metropolis Monte Carlo.

The procedure by which a trajectory X ≡ ðx0; x1;…; xTÞ is gen-erated from an initial microstate x0 according to a protocol Λ canbe illustrated by the scheme,

x0 ��!α1 x�1 ��!K1 x1 ��!⋯ ��! xT−1 ��!αT x�T ��!KT xT: [4]

Application of the perturbation αt to xt−1 generates a perturbedconfiguration x�t , which is then propagated by the kernel Kt toobtain xt.

The reverse protocol ~Λ≡ fKT; αT;…; K0; α0g reverses the or-der in which the perturbation and propagation steps are applied,

Fig. 1. Bistable dimer potential and instantaneous MC moves in WCA fluid.An extension move increases the dimer extension by Δr ¼ þr0, whereas acompaction move decreases the dimer extension by Δr ¼ −r0. Both movetypes meet with near-universal rejection when implemented as instanta-neous MC moves in a dense WCA fluid. Note that the lower panel is onlya cartoon—the actual described simulation is of a 3D system.

E1010 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1106094108 Nilmeier et al.

generating the time-reversed trajectory ~X ≡ f~xT;…; ~x0g, where ~xdenotes x with inverted momenta,

~xT ��!KT~x�T ��!αT ~xT−1 ��!⋯ ��! ~x1 ��!K1

~x�1 ��!α1 ~x0: [5]

The next step in NCMC is to accept or reject ðxT; λTÞ as thenext sample in the chain. To ensure that the stationary distribu-tion πðx; λÞ is preserved, we enforce a strict pathwise form of de-tailed balance*,

AðX jΛÞΠðX jx0; ΛÞPðΛjx0; λ0Þπðx0; λ0Þ¼ Að ~X j ~ΛÞΠð ~X j~xT; ~ΛÞPð ~Λj~xT; λTÞπð~xT; λTÞ: [6]

The quantity AðX jΛÞ is the NCMC acceptance probability,whereasΠðX jx0; ΛÞ andΠð ~X j~xT; ~ΛÞ denote the probability of gen-erating trajectory X from initial configuration x0 using protocolΛ,or ~X from final configuration ~xT with protocol ~Λ, respectively,

ΠðX jx0; ΛÞ ¼Y

1≤t≤T

αtðxt−1; x�t ÞKtðx�t ; xtÞ [7]

Πð ~X j~xT; ~ΛÞ ¼Y

T≥t≥1αtð~x�t ; ~xt−1ÞKtð~xt; ~x�t Þ: [8]

Summation of Eq. 6 over all trajectories starting with x0 andending with xT recovers the standard detailed balance condition(see Appendix for proof).

We define the ratio of proposal kernels as

αð ~X j ~ΛÞαðX jΛÞ≡

YTt¼1

αtð~x�t ; ~xt−1Þαtðxt−1; x�t Þ

; [9]

and the ratio of propagation kernels as the exponentiated differ-ence in forward and backward conditional path actions as

e−ΔSðX jΛÞ ≡YTt¼1

Ktð~xt; ~x�t ÞKtðx�t ; xtÞ

: [10]

Using the above expressions and the momentum invarianceproperty πðx; λÞ ¼ πð~x; λÞ, we may write the ratio of acceptanceprobabilities as

AðX jΛÞAð ~X j ~ΛÞ ¼

πð~xT; λTÞπðx0; λ0Þ

Pð ~Λj~xT; λTÞPðΛjx0; λ0Þ

ΠðX j~xT; ~ΛÞΠðX jx0; ΛÞ

¼ πðxT; λTÞπðx0; λ0Þ


YTt¼1

αtð~x�t; ~xt−1Þαtðxt−1; x�t Þ

Ktð~xt; ~x�t ÞKtðx�t; xtÞ

≡ ωT

ω0


αð ~X j ~ΛÞαðX jΛÞ e

−ΔSðX jΛÞ−ΔuðX jΛÞ; [11]

where ΔuðX jΛÞ≡ uTðxTÞ − u0ðx0Þ is the energy difference. Eq. 11is the main result of this paper, and is highly general with regardto both the choice of protocol for perturbation and propagation.In subsequent sections, we discuss specific choices for these pro-tocols that lead to particularly simple acceptance criteria.

Many choices of acceptance probabilities AðX jΛÞ that satisfyEq. 11 are possible, including the well-known Metropolis–Hastings criterion (2, 3)

AðX jΛÞ ¼ min�1;ωT

ω0



−ΔSðX jΛÞ−ΔuðX jΛÞ�:

[12]

After generating ðxT; λTÞ and evaluating AðX jΛÞ, we generate auniform random variate U. If AðX jΛÞ > U, then the candidatebecomes the next value in the chain, ðxðnþ1Þ; λðnþ1ÞÞ ¼ ðxT; λTÞ.Otherwise, it is rejected, we perform a momentum flip, andthe next value becomes ðxðnþ1Þ; λðnþ1ÞÞ ¼ ð~x0; λ0Þ. Alternately, wemay perform a momentum flip upon acceptance, ðxðnþ1Þ; λðnþ1ÞÞ ¼ð~xT; λTÞ and preserve the momentum upon rejection, ðxðnþ1Þ;λðnþ1ÞÞ ¼ ðx0; λ0Þ. We cannot, however, ignore the momentumflip completely; as explained in Appendix, this flip is necessaryto preserve the equilibrium distribution.

We note that NCMC need not be used exclusively to samplefrom πðx; λÞ, but can be mixed with other MCMC moves or withMD (1). For example, one may reinitialize velocities from theMaxwell-Boltzmann distribution after each NCMC step; this isa Gibbs sampling MCMC move using the marginal distributionfor velocities.

Perturbation KernelsA large variety of choices are available for the perturbationkernels αtðx; yÞ. Through judicious selection of these kernels, apractitioner can design nonequilibrium proposals that carry somecomponent of the system from one high-probability region toanother with high acceptance rates. We briefly describe a fewpossibilities.

Stochastically Driven Degrees of Freedom. Suppose we wish to drivea torsion angle ϕ (an angle subtended by four bonded atoms)stochastically by rotating it to a new torsion angle ϕ0 (holding an-gles and bonds fixed) according to some probability, such as thevon Mises circular distribution centered on ϕ,

ηðϕ0jϕÞ ¼ ½2πI0ðκÞ�−1eκ cosðϕ0−ϕÞ; [13]

with I0ðκÞ denoting the modified Bessel function of order zeroand κ > 0 taking the role of a dimensionless force constant.Because the stochastic perturbation is made in a non-Cartesiancoordinate, a Jacobian JðϕÞ must be included to compute αðx; yÞin Cartesian coordinates, resulting in the ratio,

αtð~y; ~xÞαtðx; yÞ

¼ ηðϕjϕ0ÞJðϕÞηðϕ0jϕÞJðϕ0Þ ¼ 1; [14]

where Jðϕ0Þ ¼ JðϕÞ ¼ 1 because the transformation (a rotationabout a bond vector) preserves the Cartesian phase space volume.

Deterministically Driven Degrees of Freedom. Instead of perturbingthe torsion angle stochastically, we can deterministically drive itin small, fixed increments Δϕ. In this case, we effectively definean invertible map M that takes x → y, such that y ¼ Mx differsfrom x only in the rotation of the specified torsion ϕ by Δϕ.To implement this, we may choose a perturbation Δϕ from adistribution where �Δϕ have equal probability, and drive ϕðxÞfrom its current value ϕ0 to a final value ϕT ¼ ϕ0 þ Δϕ over Tsteps in equal increments, such that ϕðxtÞ is constrained toϕt ≡ ð1 − t∕TÞϕ0 þ ðt∕TÞϕT . In this case, αtðx; yÞ ¼ δðy −MxÞJðxÞ, where the Jacobian JðxÞ represents the factor by whichCartesian phase space is compressed on the application of themap M, which is again unity for rotation about a torsion angleby Δϕ, and, due to the invertibility of the map, the ratioαtð~y; ~xÞ∕αtðx; yÞ ¼ 1.

Simulation Box Scaling. Another possible deterministic perturba-tion kernel is simulation box scaling. A barostat can be implemen-ted by proposing propagation kernels that scale the molecular

*The described pathwise detailed balance condition is closely related to “super-detailedbalance” (e.g., ref. 26), but also accounts for momentum reversal to extend its definitionto include molecular dynamics integrators.

Nilmeier et al. PNAS ∣ November 8, 2011 ∣ vol. 108 ∣ no. 45 ∣ E1011

BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

CHEM

ISTR

YPN

ASPL

US

centers and box geometry by a factor s ¼ ½ðV ðxÞ þ ΔV Þ∕V ðxÞ�1∕3with ΔV chosen uniformly from ½V − ΔV 0; V þ ΔV 0� applied asa factor of s1∕T over the course of T steps. In this case, the per-turbation kernel αtðx; yÞ is a delta function that compresses orexpands the molecular centers and box geometry. Because theJacobian is the ratio of infinitesimal volumes upon scaling, theratio of perturbation kernels is αð ~X j ~ΛÞ∕αðX jΛÞ ¼ s3N , where Ndenotes the number of molecular centers.

Thermodynamic Perturbation. In many driven nonequilibriumprocesses, there is no direct perturbation to the coordinates, suchthat αtðx; yÞ ¼ δðx − yÞ and the ratio αð ~X j ~ΛÞ∕αðX jΛÞ ¼ 1. Instead,only the thermodynamic parameters λ are varied in time, carryingthe system out of equilibrium through action of the Kt propaga-tion kernels. We recover Neal’s method (7) if the reduced poten-tial ut is a simple linear interpolation such that utðxÞ ¼ð1 − t∕TÞu0ðxÞ þ ðt∕TÞuTðxÞ, the probability of choosing protocolΛ is symmetric with ~Λ, and MC (2, 3) is used for the propagationkernel Kt.

Propagation KernelsThe choice of propagation kernels available is also very broad. Ifstrong driving is performed in selection of α, one may elect tochoose a time-independent propagation kernel Ktðx; yÞ≡ Kðx; yÞthat samples from a stationary distribution πðxÞ of interest.Alternatively, a strongly time-dependent Kt could be selectedto transiently drive the system out of equilibrium, or from theequilibrium distribution at one thermodynamic state to another.Some possible choices are described below.

Reversible Markov Chain Monte Carlo.One may propagate some orall of a system’s degrees of freedom (e.g., those not affectedby the perturbation kernel αt) by a method that satisfies detailedbalance in πt,

πtðxÞKtðx; yÞ ¼ πtð~yÞKtð~y; ~xÞ; [15]

where πt xð Þ≡ Z−1t e−utðxÞ for a specified utðxÞ. Many MCMCmeth-

ods (1), including Metropolis (2, 3) and various hybrid MonteCarlo (HMC) algorithms that combine discrete-timestep integra-tors with Monte Carlo acceptance/rejection steps (27, 28), obeydetailed balance and are commonly used for the simulation ofphysical systems.

By analogy with Crooks (29), we define a work w and heat q forthe nonequilibrium driven process,

wðX jΛÞ ¼ ∑T

t¼1

½utðx�t Þ − ut−1ðxt−1Þ� [16]

qðX jΛÞ ¼ ∑T

t¼1

½utðxtÞ − utðx�t Þ�; [17]

such that wðX jΛÞ þ qðX jΛÞ ¼ ΔuðX jΛÞ, a restatement of the firstlaw of thermodynamics.

The conditional path action difference can then be written interms of the heat of the process, qðX jΛÞ,

ΔSðX jΛÞ ¼ − lnYTt¼1

πtðx�t ÞπtðxtÞ

¼ −qðX jΛÞ; [18]

leading to an acceptance probability similar to standard MC, ex-cept that the work, wðX jΛÞ, replaces the instantaneous potentialenergy difference,


ωT

ω0



−wðX jΛÞ: [19]

Deterministic Dynamics.When an isolated system is propagated bya symplectic integrator—a reversible, deterministic integratorthat preserves phase space volume—the propagation kernels fol-low Ktðx; yÞ ¼ Ktð~y; ~xÞ. Hence, ΔSðX jΛÞ ¼ 0 and the acceptanceratio is,


ωT

ω0



−ΔuðX jΛÞ; [20]

where ΔuðX jΛÞ≡ uTðxTÞ − u0ðx0Þ is the energy difference. Theequivalence of the work and energy difference for volume-preser-ving integrators was previously recognized in the context offluctuation theorem calculations (30, 31).

Symplectic integrators include velocity Verlet (23). These in-tegrators are also symplectic when utilizing constraints throughthe application of algorithms such as RATTLE (32), providedthat the constraints are iterated to convergence each time-step (33).

Stochastic Dynamics. Stochastic integrators such as velocity Verletdiscretizations of Langevin dynamics (34, 35) sample a modifieddistribution that differs from the desired equilibrium distributionπt in a timestep-dependent manner (36). Whereas this modifieddistribution may be difficult or impossible to compute to recoverequilibrium properties by reweighting, computation of the rela-tive action ΔSðX jΛÞ is relatively straightforward, and the NCMCacceptance criteria ensures that the NCMC-sampled configura-tions are distributed according to the desired equilibrium ensem-ble†. As examples, we compute ΔSðX jΛÞ for the overdampedLangevin (Brownian) dynamics integrator of Ermak and Yeh(24, 25) and the Brünger–Brooks–Karplus (BBK) Langevin inte-grator (38–40) in Appendix.

Illustrative Application: Bistable Dimer in a WCA FluidTo demonstrate NCMC, we ran simulations of a bistable dimer(adapted from section 1.3.2.4 of ref. 36) in vacuum as well asa dense fluid. The dimer consists of a pair of “bonded” particlesinteracting via a double-well potential, with minima at r ¼ r0(compact) and r ¼ 2r0 (extended), and a 5kBT barrier (see Fig. 1).In the solvated simulations, the dimer was immersed in a densebath (reduced density ρσ3 ¼ 0.96) of particles that interact withthe bonded particles and each other via the Weeks–Chandler–Andersen (WCA) soft repulsive potential (41). Each simulation“iteration” consisted of velocity reassignment from the Maxwell–Boltzmann distribution, 500 steps of generalized hybrid MonteCarlo (GHMC) dynamics (1, 28, 36, 37) (essentially, a Metropo-lis-corrected form of Langevin dynamics, henceforth referred tohere as MD), optionally followed by either an instantaneous MCmove or an NCMC move.

The rate at which effectively uncorrelated samples are gener-ated can be quantified in terms of the correlation time τ for thedimer extension rðtÞ (shown in Fig. 2). This time represents theasymptotic decay time constant for the correlation functionCðtÞ ¼ hrð0ÞrðtÞi, which will behave like

CðtÞ ¼ C∞ þ ðC0 − C∞Þe−t∕τ [21]

for large t, where C0 ¼ hr2i and C∞ ¼ hri2. The correlation timeis related to the statistical inefficiency, g ¼ 1þ 2τ, a factor thatdescribes the number of iterations necessary to generate an effec-tively uncorrelated sample (42).

For the MD simulation in vacuum (Fig. 2, top trace), weobserve slow hopping between compact and extended configura-

†Alternatives to using NCMC to correct stochastic integration include introducing aMetropolization correction after each step, as in the generalized hybrid Monte Carlo(GHMC) integrator we use in the example (1, 28, 36, 37).


tions with a correlation time τ ¼ 59.2 iterations, resulting in slowconvergence of the histogram. Introducing instantaneous MCdimer extension/contraction moves that modify the dimer exten-sion byΔr ¼ �r0 reduces the correlation time to τ ≈ 0.0, such thatan uncorrelated configuration is generated after each iteration of500 MD steps and one instantaneous MC step (Fig. 2, secondtrace from top).

When the dimer is immersed in a dense fluid ofWCA particles,however, iterations consisting of 500 MD steps alone result inextremely slow barrier crossings, requiring g ≈ 600 iterations toproduce an uncorrelated sample (Fig. 2, middle trace). Unlike invacuum, the introduction of instantaneous MC moves does notsignificantly reduce the correlation time τ (Fig. 2, second tracefrom bottom). However, performing these same dimer expansionand contraction moves over 2,048-step NCMC moves (Fig. 2,

bottom trace) allows the system to rapidly mix between both com-pact and extended states with a correlation time of τ ¼ 4.0 itera-tions. Whereas each iteration requires a fivefold increase incomputational effort (500 MD steps + 2,048 NCMC switchingsteps ¼ 2;548 force evaluations, versus 500 force evaluations forMD alone), a 67-fold reduction in correlation time is achieved,resulting in a remarkable order-of-magnitude gain in overallefficiency.

The length of the NCMC switching process is a free parameterthat may be tuned to further improve efficiency. Toward this end,we estimated the acceptance probability of the extension/contrac-tion moves in dense solvent as a function of switching length(Fig. 3). Whereas instantaneous MC proposals of �r0 are onlyaccepted with probability ≈10−27 (the error is this quantity islikely underestimated due to its extremity), dividing the move intosmaller steps boosts the acceptance rate to a level useful in con-densed-phase simulation. If the move is divided into a small num-ber of steps (1–8), there is little to no increase in acceptance rate,but for an intermediate number of steps (16–1,024), there is asuperlinear boost in the acceptance probability relative to thelength of the switching process. The acceptance probability finallyreaches useful levels around 2,000 steps, achieving an acceptancerate of 12% using nonequilibrium proposal trajectories of 2,048steps, or 38% for 8,192 steps.

In general, there is no direct relationship between acceptanceprobability and efficiency. Under certain assumptions relevantto the bistable dimer, however, it is possible to link the NCMCacceptance probability to τeff , an indirect estimate of the correla-tion time,

τeff ¼ τMD

�τNCMC

τMD þ τNCMC

�; [22]

where τMD and τNCMC are correlation times for iterations con-sisting solely of MD or NCMC moves, respectively. The lattercorrelation time may be estimated from the average acceptanceprobability γ by τNCMC ≈ −1∕ lnð1 − 2γÞ (see Appendix for deri-vation).

As shown in Fig. 4, the effective correlation time τeff is onlydiminished when the NCMC acceptance probability is largeenough such that τNCMC ≈ τMD, which occurs when γ ≥ 0.13%(about 256 switching steps or more). For shorter switching times,even though the acceptance probability is high relative to instan-taneous MC, it is still too small to significantly reduce τeff .

When comparing efficiency, we are most interested in the rateof generating uncorrelated configurations for a given amount ofcomputational effort. Relative to MD alone, this rate is describedby the efficiency gain,

Fig. 2. Trajectories of WCA dimer system in vacuum and solvent. (Left) Thedimer extension r as a function of simulation iteration. The dotted horizontalline denotes division between compact and extended configurations. Thequantity τ printed above each plot indicates the estimated integrated auto-correlation time for the dimer extension r. (Right) Histogram accumulatedover trajectory (black), with true equilibrium distribution (red). Plot titlesdenote whether simulation was run in vacuum (vacuum) or dense WCA fluid(solvent), and whether the simulation utilized only 500 GHMC steps per itera-tion (MD) or included instantaneous MC (MC) or 2,048-step NCMC moves(NCMC) following each iteration.

1 2 4 8 16 32 64 128 256 512 1,024 2,048 4,096 8,192

10−30

10−20

10−10

100

NCMC switching steps

acce

ptan

ce p

roba

bilit

y

0

0.1

0.2

0.3

0.4

0.5

Fig. 3. Acceptance probabilities of NCMC proposals. (Top) Acceptance probability of NCMC proposals as a function of length of nonequilibrium proposaltrajectory (black dots), compared with instantaneous MC proposal (red line). (Inset) Enlarged region with acceptance probability shown on linear scale.Estimated 95% confidence intervals are shown as vertical lines.


BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

CHEM

ISTR

YPN

ASPL

US

E≡ gMDTMD

gNCMCðTMD þ TNCMCÞ: [23]

Here, TMD ¼ 500 steps per iteration, and TNCMC is again variedfrom 1–8,192 steps. The results are shown in the bottom panel ofFig. 4. Surprisingly, there is actually a slight loss in efficiencyat short switching times—dropping to a minimum of 86.9%the efficiency of MD alone at 128 steps—followed by a rapid gainin efficiency, plateauing at an efficiency gain of approximately13× the efficiency of MD alone for 2,048–4,096-step NCMC pro-posals. (A similar plateau behavior is observed in the modifiedparallel tempering protocol of ref. 15.) After this point, longerswitching times do not achieve as high of an efficiency gain; eventhough the acceptance rate continues to increase as the numberof NCMC switching steps is doubled again to 8,192 steps, thereduction in correlation time is not sufficient to offset the addi-tional cost of these moves.

EpilogueWe have described a procedure—nonequilibrium candidateMonte Carlo (NCMC)—by which nonequilibrium proposals canbe used within MCMC simulations to enhance acceptance rates.In our illustration, we have demonstrated how its use can leadto large improvements in statistical efficiency—the rate at whichuncorrelated samples are generated for a fixed amount of com-putational effort. In other applications, whether similarly largeefficiency gains are achieved will depend on the precise nature

of the system under study and the nonequilibrium proposalsintroduced. The most straightforward approach—borrowingMetropolis Monte Carlo proposals that are reasonable for onecomponent of the system in isolation, and converting them tononequilibrium proposals—is likely to be a fruitful avenue forgenerating efficient schemes, as was demonstrated here for a sim-ple system.

More generally, the problem of selecting efficient nonequili-brium proposals is similar to the problem of choosing good reac-tion coordinates, in that it is desirable to drive the system along(possibly complex) slow collective coordinates where orthogonaldegrees of freedom relax quickly. The search for such collectivecoordinates is a topic of active research (43–49). Given an initialnonequilibrium protocol, the issue of optimizing such a protocolto minimize dissipation (and maximize acceptance) is also a topicof active study, led by forays into the world of single-moleculemeasurement (50–52). Recent work has also suggested that esti-mating the thermodynamic metric tensor along the nonequili-brium parameter switching path (53–56), could prove useful inadaptively optimizing the switching protocol (57).

We note that switching trajectories contain potentially usefulinformation. Indeed, several methods (56, 58, 59) have recentlybeen developed to estimate equilibrium properties from nonequi-librium samples through the application of statistical estimatortheory to nonequilibrium fluctuation theorems (30, 60, 61); theseare particularly relevant to switching between multiple thermo-dynamic states. Though the development of efficient estimatorsthat utilize both nonequilibrium switching trials and sampledequilibrium data generated in NCMC simulations remains anopen challenge, it is at least straightforward to incorporate infor-mation from rejected NCMC proposals in the estimation of equi-librium averages (26, 62).

Materials and MethodsThe dimer system considered here consists of two particles that interact via adouble-well bonded potential in the interatomic distance r,

UbondðrÞ ¼ h�1 −

ðr − r0 − sÞ2s2

�2

; [24]

1 2 4 8 16 32 64 128 256 512 1,024 2,048 4,096 8,1920

100

200

300

NCMC switching stepsco

rrel

atio

n tim

e τ ef

f

1 2 4 8 16 32 64 128 256 512 1,024 2,048 4,096 8,1920

5

10

15

NCMC switching steps

rela

tive

effic

ienc

y

0

1

2

Fig. 4. Statistical efficiency gain of NCMC proposals relative to instantaneousMC proposals. (Upper) Effective correlation time τeff, in iterations, forMD+NCMC(black dots) compared toMD alone (red line). (Lower) Relative statistical efficiency of MD+NCMC, in terms of number of uncorrelated configurations generatedfor a fixed amount of computational effort, for MD+NCMC (black dots) relative to MD alone (red line).

Fig. 5. Umbrella sampling simulation of the dimer in WCA solvent. (Left)The dimer extension r as a function of simulation iteration. (Right) Thehistogram accumulated over the trajectory, with the observed histogramin black and the reweighted histogram (corrected for the applied umbrellabias potential) in red.


with h ¼ 5kBT , r0 ¼ rWCA, and s ¼ rWCA∕2, where rWCA ≡ 21∕6σ. Simulationsdenoted as “vacuum” contain only these two particles, whereas simulationsdenoted as “solvated” also interact with a dense bath of particles via theWCA nonbonded potential (41),

UWCAðrÞ ¼

8><>:

4ϵ

��σr

�12

−�

σr

�6�þ ϵ; r < rWCA

0 r ≥ rWCA

; [25]

with mass m ¼ 39.9 amu, σ ¼ 3.4 Å, and ϵ ¼ 120 kBT . The nonbonded WCAinteraction is excluded between the two bonded particles. The solvatedsystem contains a total of 216 WCA particles at a reduced density ofρσ3 ¼ 0.96. For all simulations, the reduced temperature is kBT∕ϵ ¼ 0.824.A custom Python code making use of the graphics processing unit (GPU)-accelerated OpenMM package (63–65) and the PyOpenMM Python wrapper(66) was used to conduct the simulations. All scripts are available for down-load from http://simtk.org/home/ncmc.

To ensure that observed differences were not due to changes in the sta-tionary distribution of the integrator, we used GHMC (1, 28, 36, 37) for all oursimulations. GHMC is based on a velocity Verlet discretization (23) of Lange-vin dynamics—the two are equivalent in the limit of small timesteps—butincludes an acceptance/rejection step to correct for errors introduced byfinite timesteps so that the stationary distribution is the exact equilibriumdistribution. We used a timestep of 0.002τ, where τ ¼

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiσ2m∕ϵ

p, and the colli-

sion rate was set to τ−1. With this timestep, the acceptance probabilityis 99.929� 0.001%; the resulting dynamics closely approximates Langevindynamics.

In simulations employing instantaneous Monte Carlo moves, a perturba-tion Δr to the interatomic distance r of the dimer was chosen according to

Δr ¼8<:

þr0 if r < 1.5r0−r0 if 1.5r0 ≤ r ≤ 3r00 otherwise

: [26]

The dimer was contracted or expanded about the bondmidpoint to generatea new configuration xnew with dimer extension rnew from the old configura-tion xold with dimer extension rold, and the move accepted or rejected withthe Metropolis–Hastings criterion (3),

AðxnewjxoldÞ ¼ minf1; e−β½UðxnewÞ−UðxoldÞ�Jrðxold; xnewÞg; [27]

where the Jacobian ratio term Jrðxold; xnewÞ ¼ ðrnew∕roldÞ2 accounts for theexpansion and contraction of phase space due to the Monte Carlo proposals.

For simulations employing T -step NCMC moves, proposals were made byselecting a new velocity vector from the Maxwell–Boltzmann distribution,integrating T steps of velocity Verlet dynamics (23) for all bath atoms asthe dimer extension was driven from rold to rnew in equal steps of sizeΔr∕T , and accepting or rejecting based on the modified Metropolis criteriafor symplectic integrators (Eqs. 12–20),

AðXÞ ¼ minf1; e−β½HðxT Þ−Hðx0Þ�Jrðx0; xTÞg: [28]

The Jacobian ratio is also ðrnew∕roldÞ2. This scheme corresponds to a choice forthe perturbation kernel of

αtðx; yÞ ¼�rðyÞrðxÞ

�2

δð½rðyÞ − rðxÞ� − ½Δr∕T�Þ; [29]

where rðxÞ denotes the dimer separation of configuration x. The propagationkernel Ktðx; yÞ corresponds to velocity Verlet dynamics where the dimeratoms are held fixed in space. The final configuration after the MC or NCMCrejection procedure was stored and plotted to generate Fig. 2.

The mean acceptance probabilities for each switching time τ can be esti-mated via the sample mean

hAiτ ≈1

N∑N

n¼1

AðXnÞ: [30]

For numerical stability, logarithms of AðXnÞwere stored, as an ≡ lnAðXnÞ. Wethen estimated lnhAiτ (shown in Fig. 4) as

lnhAiτ ≈ − lnN þ ln bþ∑N

n¼1

ean−b; [31]

where b≡maxn an.Integrated autocorrelation times were estimated using the rapid scheme

described in section 5.2 of ref. 42.The acceptance probabilities plotted in Fig. 4 were estimated from a

trajectory consisting of 10,000 iterations of 2,048-step NCMC, with 500 stepsof GHMC dynamics in between each NCMC trial, to ensure equilibrium sam-pling. Prior to each 2,048-step NCMC iteration, a velocity from the Maxwell–Boltzmann distribution was selected, and NCMC trial moves with varyingswitching times were conducted solely to accumulate statistics. The statisticalerror in the estimate of lnhAiτ and the computed relative efficiency over in-stantaneous Monte Carlo was estimated by 1,000 bootstrap trials, in whichthe dataset of 10,000 work samples was resampled with replacement in eachbootstrap trial and 95% confidence intervals computed from the distributionover bootstrap replicates.

The reference distribution for the interparticle distribution PðrÞ plottedin red on the right side of Fig. 2 was computed analytically for the vacuumsystem,

PvacðrÞ ∝ 4πr2e−βUbondðrÞ: [32]

For the solvated system, this distribution was estimated from an umbrellasampling simulation (Fig. 5) employing a modified bonded potential in-tended to remove the barrier in between compact and extended states,

UumbrellaðrÞ ¼ kBT ln r2 þ θðrmin − rÞðK∕2Þ½r − rmin�2þ θðr − rmaxÞðK∕2Þ½r − rmax�2; [33]

where rmin ¼ r0, rmax ¼ 2.05r0, and K ¼ kBT∕η2, with η ¼ 0.3 Å, and θðrÞ is theHeaviside function that assumes a value of unity for r ≥ 0 and zero otherwise.The true solvated interparticle distribution pðrÞ was estimated by reweight-ing the data produced from this simulation, using the relationship

PsolðrÞ ∝ ∑N

n¼1δðr − rnÞe−β½UbondðrnÞ−UumbrellaðrnÞ�

∑N

n¼1e−β½UbondðrnÞ−UumbrellaðrnÞ� ; [34]

where rn denotes the bond separation for sample n, and a finite-width his-togram bin was used instead of the delta function δðrÞ.

AppendixProof that NCMC Preserves the Equilibrium Distribution. Followingthe proof for GHMC in ref. 36, here we show that NCMC pre-serves the equilibrium distribution. The expected acceptance ratefor NCMC moves initiated from ðx; λÞ is

αðx; λÞ≡Z

dΛZ

dXρðX; Λjx; λÞAðX jΛÞ: [35]

Suppose that we have a variate ðxðnÞ; λðnÞÞ drawn from the equi-librium distribution πðx; λÞ. The probability density of the nextvalue in the chain, pðxðnþ1Þ; λðnþ1ÞÞ, has contributions from twoscenarios: when the candidate variate is rejected and when itis accepted. The contribution from rejecting the candidate andflipping the momentum such that ðxðnþ1Þ; λðnþ1ÞÞ ¼ ð~xðnÞ; λðnÞÞ is

Zdx∑

λ

πðx; λÞ½1 − αðx; λÞ�δð~x − xðnþ1ÞÞδðλ − λðnþ1ÞÞ ¼ πð~xðnþ1Þ; λðnþ1ÞÞ½1 − αð~xðnþ1Þ; λðnþ1ÞÞ�: [36]


BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

CHEM

ISTR

YPN

ASPL

US

http://simtk.org/home/ncmc

http://simtk.org/home/ncmc

The latter contribution from accepting the candidate such that ðxðnþ1Þ; λðnþ1ÞÞ ¼ ðxT; λTÞ is,

Zdx∑

λ

πðx; λÞZ

dXZ

dΛρðX; Λjx; λÞAðX jΛÞδðxT − xðnþ1ÞÞδðλT − λðnþ1ÞÞ

¼Z

dx0∑λ0

ZdX

ZdΛ½πðx0; λ0ÞρðX; Λjx0; λ0ÞAðX jΛÞ�δðxT − xðnþ1ÞÞδðλT − λðnþ1ÞÞ

¼Z

dxT∑λT

Zd ~X

Zd ~Λ½πð~xT; λTÞρð ~X; ~Λj~xT; λTÞAð ~X j ~ΛÞ�δðxT − xðnþ1ÞÞδðλT − λðnþ1ÞÞ

¼ πð~xðnþ1Þ; λðnþ1ÞÞαð~xðnþ1Þ; λðnþ1ÞÞ; [37]

where ρðX; Λjx0; λ0Þ≡ ΠðX jx0; ΛÞPðΛjx0; λ0Þ is the probability ofgenerating the trajectory-protocol pair ðX; ΛÞ from ðx0; λ0Þ, andthe pathwise detailed balance condition (Eq. 6) is used to pro-duce the quantity in brackets.

Taking the sum of Eqs. 36 and 37, we find that the equilibriumdistribution is preserved

pðxðnþ1Þ; λðnþ1ÞÞ ¼ πðxðnþ1Þ; λðnþ1ÞÞ: [38]

By analogous reasoning, maintaining the momentum uponrejection, ðxðnþ1Þ; λðnþ1ÞÞ ¼ ðxðnÞ; λðnÞÞ, and flipping it upon accep-tance, ðxðnþ1Þ; λðnþ1ÞÞ ¼ ð~xT; λTÞ will also preserve the equilibriumdistribution.

Acceptance Criteria for Overdamped Langevin (Brownian) Integratorof Ermak and Yeh. A common integrator for Brownian dynamics(the overdamped regime of Langevin dynamics), in which onlycoordinates x are explicitly integrated, is given by Ermak andYeh (24, 25). In our notation, where the perturbed coordinatesx�t are propagated by one step of the stochastic integrator toobtain xt, application of the propagation kernel Kðx�t ; xtÞ canbe written,

xt ¼ x�t −Δtγm

Ftðx�t Þ þffiffiffi2

p �Δtγm

�1∕2

ξt; [39]

where m is the particle mass, FtðxÞ≡ −ð∂∕∂xÞHtðxÞ is the (poten-tially time-dependent) systematic force, and γ is an effective col-lision frequency or friction coefficient with units of inverse time.The noise history ξt for each degree of freedom is a normal ran-dom variate with zero mean and variance β−1, drawn from thedistribution

ϕðξtÞ ¼1ffiffiffiffiffiffiffiffiffiffiffiffi2πβ−1

p exp�−β

2ξ2t

�: [40]

In NCMC, every application of the propagation kernel Kt pro-duces a transition x�t → xt determined by the noise history variableξt, there is a corresponding ~ξt that generates the opposite step,xt → x�t . By noting

xt ¼ x�t −Δtγm

Ftðx�t Þ þffiffiffi2

p �Δtγm

�1∕2

ξt

x�t ¼ xt −Δtγm

FtðxtÞ þffiffiffi2

p �Δtγm

�1∕2

~ξt; [41]

we can compute the relationship

~ξt ¼1ffiffiffi2

p�Δtγm

�1∕2

½FtðxtÞ þ Ftðx�t Þ� − ξt: [42]

Then, the ratio of transition kernels appearing in Eq. 10 can bewritten in terms of noise history ξt and the computed reversenoise history ξ�t

ΔSðXÞ ¼ − lnYTt¼1

Ktðxt; x�t ÞKtðx�t ; xtÞ

¼ − lnYTt¼1

ϕð~ξtÞj ∂x�t

∂~ξtj

ϕðξtÞj ∂xt∂ξtj ¼ − ln

YTt¼1

ϕð~ξtÞϕðξtÞ

¼ − lnYTt¼1

exp�−β

2ð~ξ2t − ξ2t Þ

�¼ β

2∑T

t¼1

ð~ξ2t − ξ2t Þ; [43]

where the tildes are dropped because the microstate x contains nomomenta. The quantity j∂xt∕∂ξtj represents the Jacobian for thechange of variables from the ξt to xt, and the Jacobians in thenumerator and denominator cancel. The quantity in Eq. 43can easily be computed during integration.

Acceptance Criteria for Langevin Integrator of Brooks, Brünger, andKarplus (BBK). The Brünger–Brooks–Karplus (BBK) stochastic in-tegrator (38, 39) is a popular integrator for simulating Langevindynamics. In our notation, where the perturbed coordinates x�tare propagated by one step of the stochastic integrator to obtainxt, application of the propagation kernel Kðx�t ; xtÞ can be written

v0t ¼ v�t þΔt2m

�Ftðr�t Þ − γmv�t þ

ffiffiffiffiffiffiffiffiffi2γmΔt

rξt

�

rt ¼ r�t þ Δtv0t

vt ¼1

1þ γΔt2

�v0t þ

Δt2m

�FtðrtÞ þ

ffiffiffiffiffiffiffiffiffi2γmΔt

rξ0t

��; [44]

where we have used a velocity Verlet discretization of the BBKintegrator (36, 40). Here rt and vt denote the respective Cartesianposition and velocity components of the microstate xt, γ the ef-fective collision frequency with units of inverse time, and m theparticle mass. v0t is an auxiliary variable used only in simplifyingthe mathematical representation of the integration scheme. Notethat we require two random variates, ξt and ξ0t, per degree of free-dom per timestep in order for this scheme to be able to generateboth the forward trajectory X and its time-reverse ~X (e.g., section2.2.3.2 of ref. 36).

The noise history terms ξt and ξ0t are normal random variateswith zero mean and variance β−1. Their joint distribution cantherefore be written

ψðξt; ξ0tÞ ¼1

2πβ−1exp

�−β

2ðξ2t þ ξ02t Þ

�: [45]

For every step x�t → xt, the positions and velocities undergo a tran-sition ðr�t ; v�t Þ → ðrt; vtÞ determined by the noise variables ðξt; ξ0tÞ.A corresponding choice of noise variables ð~ξt; ~ξ0tÞ will generate the


reverse step, ~xt → ~x�t ; carrying ðrt; −vtÞ → ðr�t ; −v�t ÞWith a little al-gebra, it is seen that

~ξt ¼ ξ0t −ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi2γmΔtvt

p~ξ0t ¼ ξt −

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi2γmΔtv�t

p: [46]

To write the ratio of transition kernels appearing in Eq. 10 interms of noise variables ðξt; ξ0tÞ and the computed reverse noisevariables ð~ξt; ~ξ0tÞ, we must first compute the Jacobian Jðξt; ξ0tÞ be-cause the random variates are not in Cartesian space

Jðξt;ξ0tÞ≡ det∂rt∂ξt

∂vt∂ξt

∂rt∂ξ0t

∂vt∂ξ0t

" #; [47]

which can be shown to be independent of ξt and ξ0t. The condi-tional path action difference can now be computed

ΔSðXÞ ¼ − lnYTt¼1

Ktð~xt; ~x�t ÞKtðx�t ; xtÞ

¼ − lnYTt¼1

ψð~ξt; ~ξ0tÞJð~ξt; ~ξ0tÞψðξt; ξ0tÞJðξt; ξ0tÞ

¼ β

2∑T

t¼1

½ð~ξ2t þ ~ξ02t Þ − ðξ2t þ ξ02t Þ�; [48]

where the ratio of Jacobians Jð~ξt; ~ξ0tÞ∕Jðξt; ξ0tÞ cancels because theJacobians are independent of the noise variates.

Derivation of Effective Correlation Time for Mixed MD/NCMC Sam-pling. For simplicity, we make the assumption that the systemof interest has two long-lived conformational states of equalpopulation with dimer extensions rc and re. This assumption holdsto good approximation for the WCA dimer example consideredhere, and may apply to other systems of interest as well. We as-sume that the correlation time for a fixed number of MD simula-tion steps per iteration is given by τMD, and describe theprobability of finding the system ends up in a given conforma-tional state after one iteration by a 2 × 2 column-stochastic tran-sition matrix TMD

TMD ¼ 1 − α αα 1 − α

� �: [49]

For a 2 × 2 system whose time evolution is governed by thecolumn stochastic transition matrix T, we can write the autocor-relation function for the dimer extension r as

CðnΔtÞ ¼ hrð0ÞrðnΔtÞi ¼ ½ rc re �Tn 1∕2 0

0 1∕2

� �rcre

� �

¼ ½ rc re �U 1 0

0 μn

� �U−1 1∕2 0

0 1∕2

� �rcre

� �

¼ ðC0 − C∞Þμn þ C∞; [50]

where the transition matrix T has unitary eigenvalue decomposi-tion UMU−1, and μ is the nonunit eigenvalue of T. The constantsare C0 ¼ ð1∕2Þðr2c þ r2eÞ and C∞ ¼ ð1∕4Þðrc þ reÞ2.

Relating this to the autocorrelation time τ estimated from atimeseries, intended to reflect the fit to

CðtÞ ¼ ðC0 − C∞Þe−t∕τ þ C∞; [51]

we can see that τ ¼ −1∕ ln μ. We then determine that the correla-tion time τMD ¼ −1∕ ln μMD, with μMD ¼ 1 − 2α.

Similarly, we can write the probability that the NCMC stepwill carry the system from one conformational state to another interms of the acceptance probability γ, which we assume to be sym-metric,

TNCMC ¼ 1 − γ γγ 1 − γ

� �; [52]

where we have correlation time τNCMC ¼ −1∕ ln μNCMC andμNCMC ¼ 1 − 2γ.

The effective transition matrix Teff for iterations consisting ofMD simulation steps followed by an NCMC trial is given by

Teff ¼ TMDTNCMC ¼ 1 − α αα 1 − α

� �1 − γ γγ 1 − γ

� �

¼ 1 − ðαþ γÞ ðαþ γÞ − 2αγðαþ γÞ − 2αγ 1 − ðαþ γÞ

� �; [53]

where the effective correlation time τeff ¼ −1∕ ln μeff , withμeff ¼ 1 − 2½ðαþ γÞ − 2αγ�. Substituting in α ¼ ð1 − e−1∕τMDÞ∕2and γ ¼ ð1 − e−1∕τNCMCÞ∕2, we obtain

τeff ¼ −1

ln½1 − ð1 − e−1∕τMDÞ − ð1 − e−1∕τNCMCÞ þ ð1 − e−1∕τMD Þð1 − e−1∕τNCMCÞ�¼ −

1

ln½e−1∕τMDe−1∕τNCMC � ¼1

τ−1MD þ τ−1NCMC¼ τMDτNCMC

τMD þ τNCMC: [54]

As a check of the accuracy of Eq. 54, we note that for MD with2,048-step NCMC switching, we compute τeff ≈ 4.0 iterations,using only τMD ¼ 299.8 iterations and the NCMC acceptanceprobability γ ¼ 12.1%. The actual correlation time measuredfrom a 10,000 iteration simulation is computed to be τeff ¼ 4.0.

ACKNOWLEDGMENTS. We thank Gabriel Stoltz (CERMICS, École des PontsParisTech); Jed W. Pitera (IBM Almaden Research Center); Manuel Athenès(CEA Saclay); Firas Hamze (D-Wave Systems); Yael Elmatad, Anna Schneider,Paul Nerenberg, Todd Gingrich, David Chandler, David Sivak, Phillip Geissler,Michael Grünwald, and Ulf Rörbæck Pederson (University of California,Berkeley); Vijay S. Pande (Stanford University); and SuriyanarayananVaikuntanathan, Andrew J. Ballard, and Christopher Jarzynski (Universityof Maryland), and Huafeng Xu (D. E. Shaw Research) for enlightening discus-sions on this topic and constructive feedback on this manuscript, as well as

the two anonymous referees for their helpful suggestions for improvingclarity. J.P.N. was supported by the US Department of Energy (DOE) by Lawr-ence Livermore National Laboratory under Contract DE-AC52-07NA27344.G.E.C. was funded by the Helios Solar Energy Research Center, which is sup-ported by the Director, Office of Science, Office of Basic Energy Sciences ofthe DOE under Contract DE-AC02-05CH11231. D.D.L.M. was funded by aDirector’s Postdoctoral Fellowship from Argonne National Laboratory.J.D.C. was supported through a distinguished postdoctoral fellowship fromthe California Institute for Quantitative Biosciences (QB3) at the Universityof California, Berkeley. Additionally, the authors are grateful to OpenMMdevelopers Peter Eastman, Mark Friedrichs, Randy Radmer, and ChristopherBruns for their generous help with the OpenMMGPU-accelerated computingplatform and associated PyOpenMM Python wrappers. This research wasperformed under the auspices of the DOE by Lawrence Livermore NationalLaboratory under Contract DE-AC52-07NA27344 and by University ofChicago Argonne, LLC, Operator of Argonne National Laboratory (Argonne).Argonne, a USDE Office of Science laboratory, is operated under ContractDE-AC02-06CH11357.


BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

CHEM

ISTR

YPN

ASPL

US

1. Liu JS (2002) Monte Carlo Strategies in Scientific Computing (Springer, New York),2nd Ed.

2. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation ofstate calculations by fast computing machines. J Chem Phys 21:1087–1092.

3. Hastings WK (1970) Monte Carlo sampling methods using Markov chains and theirapplications. Biometrika 57:97–109.

4. Rahman A (1964) Correlations in the motion of atoms in liquid argon. Phys Rev 136:A405–A411.

5. Swendsen RH, Wang J-S (1987) Nonuniversal critical dynamics in Monte Carlo simula-tions. Phys Rev Lett 58:86–88.

6. Wolff U (1989) Collective Monte Carlo updating for spin systems. Phys Rev Lett62:361–364.

7. Neal RM (2004) Taking bigger Metropolis steps by dragging fast variables. (Depart-ment of Statistics, University of Toronto, Toronto), Technical Report 0411.

8. Andrieu C, Doucet A, Holenstein R (2010) Particle Markov chain Monte Carlo methods.J R Stat Soc Series B Stat Methodol 72:269–342.

9. Athènes M (2002) Computation of a chemical potential using a residence weightalgorithm. Phys Rev E 66:046705.

10. Bürgi R, Kollman PA, van Gunsteren WF (2002) Simulating proteins at constant pH:An approach combining molecular dynamics and Monte Carlo simulation. Proteins47:469–480.

11. Stern HA (2007) Molecular simulation with variable protonation states at constant pH.J Chem Phys 126:164112.

12. Nilmeier J, Jacobson MP (2009) Monte Carlo sampling with hierarchical move sets:POSH Monte Carlo. J Chem Theor Comput 1968–1984.

13. Opps SB, Schofield J (2001) Extended state-space Monte Carlo methods. Phys Rev E63:56701.

14. Brown S, Head-Gordon T (2003) Cool walking: A new Markov chain Monte Carlosampling method. J Comput Chem 24:68–76.

15. Ballard AJ, Jarzynski C (2009) Replica exchange with nonequilibrium switches. ProcNatl Acad Sci USA 106:12224–12229.

16. Lyubartsev AP, Martsinovski AA, Shevkunov SV, Vorontsov-Velyaminov PN (1992)New approach to Monte Carlo calculation of the free energy: Method of expandedensembles. J Chem Phys 96:1776–1783.

17. Park S (2008) Comparison of the serial and parallel algorithms of generalized ensem-ble simulations: An analytical approach. Phys Rev E 77:016709.

18. Shirts MR, Chodera JD (2008) Statistically optimal analysis of samples from multipleequilibrium states. J Chem Phys 129:124105.

19. Marinari E, Parisi G (1992) Simulated tempering: A newMonte Carlo scheme. EurophysLett 19:451–458.

20. Li H, Fajer M, Yang W (2007) Simulated scaling method for localized enhancedsampling and simultaneous “alchemical” free energy simulations: A general methodfor molecular mechanical, quantum mechanical, and quantum mechanical/molecularmechanical simulations. J Chem Phys 126:024106.

21. Mongan J, Case DA, McCammon JA (2004) Constant pH molecular dynamics ingeneralized Born implicit solvent. J Comput Chem 25:2038–2048.

22. Jarzynski C (2000) Hamiltonian derivation of a detailed fluctuation theorem. J StatPhys 98:77–102.

23. SwopeWC, Andersen HC, Berens PH, Wilson KR (1982) A computer simulation methodfor the calculation of equilibrium constants for the formation of physical clusters ofmolecules: Application to small water clusters. J Chem Phys 76:637–649.

24. Ermak DL, Yeh Y (1974) Equilibrium electrostatic effects on the behavior of polyions insolution: Polyion-mobile ion interaction. Chem Phys Lett 24:243–248.

25. Ermak DL (1975) A computer simulation of charged particles in solution. I. Techniqueand equilibrium properties. J Chem Phys 62:4189–4196.

26. Frenkel D (2004) Speed-up of Monte Carlo simulations by sampling of rejected states.Proc Natl Acad Sci USA 101:17571–17575.

27. Duane S, Kennedy AD, Pendleton BJ, Roweth D (1987) Hybrid Monte Carlo. Phys Lett B195:216–222.

28. Lelièvre T, Stoltz G, Rousset M (2010) Langevin dynamics with constraints and compu-tation of free energy differences. (Imperial College Press, London).

29. Crooks GE (1998) Nonequilibrium measurements of free energy differences for micro-scopically reversible markovian systems. J Stat Phys 90:1481–1487.

30. Jarzynski C (1997) Nonequilibrium equality for free energy differences. Phys Rev Lett78:2690–2693.

31. Lechner W, Oberhofer H, Dellago C, Geissler PL (2006) Equilibrium free energies fromfast-switching trajectories with large time steps. J Chem Phys 124:044113.

32. Andersen HC (1983) Rattle: A “velocity” version of the Shake algorithm for moleculardynamics calculations. J Comput Phys 52:24–34.

33. Leimkuhler B, Skeel RD (1994) Symplectic numerical integrators in constrainedHamiltonian systems. J Comput Phys 112:117–125.

34. Paterlini MG, Ferguson DM (1998) Constant temperature simulations using theLangevin equation with velocity Verlet integration. Chem Phys 236:243–252.

35. Izaguirre JA, Sweet CR, Pande VS (2010) Multiscale dynamics of macromolecules usingnormal mode Langevin. Pac Symp Biocomput 15:240–251.

36. Lelièvre T, Stoltz G, Rousset M (2010) Free Energy Computations: A MathematicalPerspective (Imperial College Press, London), 1st Ed.

37. Gustafson P (1998) A guided walk Metropolis algorithm. Stat Comput 8:357–364.38. Brünger A, Brooks CL, III, Karplus M (1984) Stochastic boundary conditions for mole-

cular dynamics simulations of ST2 water. Chem Phys Lett 105:495–500.39. Pastor RW, Brooks Br, Szabo A (1988) An analysis of the accuracy of Langevin and

molecular dynamics algorithms. Mol Phys 65:1409–1419.40. Schlick T (2002) Molecular Modeling and Simulation: An Interdisciplinary Guide

(Springer, New York), 1st Ed.41. Weeks JD, Chandler D, Andersen HC (1971) Role of repulsive forces in determining the

equilibrium structure of simple liquids. J Chem Phys 54:5237–5247.42. Chodera JD, Swope WC, Pitera JW, Seok C, Dill KA (2007) Use of the weighted

histogram analysis method for the analysis of simulated and parallel tempering simu-lations. J Chem Theor Comput 3:26–41.

43. Bolhuis PG, Chandler D, Dellago C, Geissler PL (2002) Transition path sampling: Throw-ing ropes over rough mountain passes, in the dark. Annu Rev Phys Chem 53:291–318.

44. Best RB, Hummer G (2005) Reaction coordinates and rates from transition paths. ProcNatl Acad Sci USA 102:6732–6737.

45. Ma A, Dinner AR (2005) Automatic method for identifying reaction coordinates incomplex systems. J Phys Chem B 109:6769–6779.

46. E W, Ren W, Vanden-Eijnden E (2005) Finite temperature string method for the studyof rare events. J Phys Chem B 109:6688–6693.

47. Berezhkovskii A, Szabo A (2005) One-dimensional reaction coordinates for diffusiveactivated rate processes in many dimensions. J Chem Phys 122:014503.

48. Peters B, Trout BL (2006) Obtaining reaction coordinates by likelihood maximization.J Chem Phys 125:054108.

49. Ensing B, de Vivo M, Liu Z, Moore P, Klein ML (2006) Metadynamics as a tool forexploring free energy landscapes of chemical reactions. Acc Chem Res 39:73–81.

50. Schmiedl T, Seifert U (2007) Optimal finite-time processes in stochastic thermody-namics. Phys Rev Lett 98:108301.

51. Then H, Engel A (2008) Computing the optimal protocol for finite-time processes instochastic thermodynamics. Phys Rev E 77:041105.

52. Gomez-Marin A, Schmiedl T, Seifert U (2008) Optimal protocols for minimal workprocesses in underdamped statistical thermodynamics. J Chem Phys 129:024114.

53. Salamon P, Berry RS (1983) Thermodynamic length and dissipated availability. Phys RevLett 51:1127–1130.

54. Crooks GE (2007) Measuring thermodynamic length. Phys Rev Lett 99:100602.55. Feng EH, Crooks GE (2009) Far-from-equilibrium measurements of thermodynamic

length. Phys Rev E 79:012104.56. Minh DDL, Chodera JD (2011) Multiple-timeslice nonequilibrium estimators: Estimat-

ing equilibrium ensemble averages from nonequilibrium experiments using multipletimeslices, with application to thermodynamic length. J Chem Phys 134:024111.

57. Shenfeld DK, Xu H, Eastwood MP, Dror RO, Shaw DE (2009) Minimizing thermody-namic length to select intermediate states for free-energy calculations and replica-exchange simulations. Phys Rev E 80:046705.

58. Minh DDL, Adib AB (2008) Optimized free energies from bidirectional single-moleculeforce spectroscopy. Phys Rev Lett 100:180602.

59. Minh DDL, Chodera JD (2009) Optimal estimators and asymptotic variances for none-quilibrium path-ensemble averages. J Chem Phys 131:134110.

60. Crooks GE (1999) Entropy production fluctuation theorem and the nonequilibriumwork relation for free energy differences. Phys Rev E 60:2721–2726.

61. Crooks GE (2000) Path-ensemble averages in systems driven far from equilibrium. PhysRev E 61:2361–2366.

62. Athènes M, Marinica M-C (2010) Free energy reconstruction from steered dynamicswithout post-processing. J Chem Phys 229:7129–7146.

63. Friedrichs MS, et al. (2009) Accelerating molecular dynamic simulation on graphicsprocessing units. J Comput Chem 30:864–872.

64. Eastman P, Pande VS (2010) OpenMM: A hardware-independent framework formolecular simulations. Comput Sci Eng 12:34–39.

65. Eastman P, Pande VS (2010) Efficient nonbonded interactions for molecular dynamicsof a graphics processing unit. J Comput Chem 31:1268–1272.

66. Bruns CM, Radmer RA, Chodera JD, Pande VS (2010) PyOpenMM. http://simtk.org/home/pyopenmm.


http://simtk.org/home/pyopenmm



Documents

Nonequilibrium candidate Monte Carlo is an efficient … · Nonequilibrium candidate Monte Carlo is an efficient tool for equilibrium simulation ... favored tools for sampling from