54
Basics of molecular dynamics simulations Reduction of the quantum problem to a classical one Parameterization of force fields Boundaries, energy minimization, integration algorithms Analysis of trajectory data Lecture notes are at : www.physics.usyd.edu.au/~serdar/ws 1

Basics of molecular dynamics simulations

Embed Size (px)

DESCRIPTION

Reduction of the quantum problem to a classical one Parameterization of force fields Boundaries, energy minimization, integration algorithms Analysis of trajectory data Lecture notes are at : www.physics.usyd.edu.au/~serdar/ws. Basics of molecular dynamics simulations. - PowerPoint PPT Presentation

Citation preview

Page 1: Basics of molecular dynamics simulations

Basics of molecular dynamics simulations

Reduction of the quantum problem to a

classical one

Parameterization of force fields

Boundaries, energy minimization, integration

algorithms

Analysis of trajectory data

Lecture notes are at :

www.physics.usyd.edu.au/~serdar/ws1

Page 2: Basics of molecular dynamics simulations

Quantum, classical and stochastic description of molecular systems

Quantum mechanics (Schroedinger equation)Quantum mechanics (Schroedinger equation)

The most fundamental approach but feasible only for few atoms (~10).The most fundamental approach but feasible only for few atoms (~10).

Approximate methods (e.g. density functional theory) allows treatment Approximate methods (e.g. density functional theory) allows treatment

of larger systems (~1000) and dynamic simulations for picoseconds.of larger systems (~1000) and dynamic simulations for picoseconds.

Classical mechanics (Newton’s equation of motion)Classical mechanics (Newton’s equation of motion)

Most atoms are heavy enough to justify a classical treatment (except Most atoms are heavy enough to justify a classical treatment (except

H). The main problem is finding accurate potential functions (force H). The main problem is finding accurate potential functions (force

fields). fields).

MD simulation of 100,000 atoms for many nanoseconds is now feasible.MD simulation of 100,000 atoms for many nanoseconds is now feasible.

Stochastic mechanics (Langevin equation)Stochastic mechanics (Langevin equation)

Most biological processes occur in the range of microsec to millisec. Most biological processes occur in the range of microsec to millisec.

Thus to describe such processes, a simpler (coarse-grained) Thus to describe such processes, a simpler (coarse-grained)

representation of atomic system is essential (e.g. Brownian dynamics).representation of atomic system is essential (e.g. Brownian dynamics).2

Page 3: Basics of molecular dynamics simulations

,

2

22

2

22

2

2

2

,,

i i

ine

e

ji ji

ji

ii

in

iineen

ezU

em

H

ezz

MH

EUHH

rR

rr

RR

rRrR

Many-body Schroedinger eq. for a molecular system

Here m and Mi are the mass of the electrons and nuclei,

r and Ri denote the electronic and nuclear coordinates,

and and i denote the respective gradients.

nuclear hamiltonian

electronic hamiltonian

elect-nucl. interact.

(1)

3

Page 4: Basics of molecular dynamics simulations

rRRrR ,, ieini

rRRrRR ,, ieinieinneen EUHH

Separation of the electronic wave function

Nuclei are much heavier and hence move much slower than

electrons. This allows decoupling of their motion from those of

electrons. Introduce the product wave function:

Substituting this ansatz in the Schroedinger eq. gives

For fixed nuclei, the electronic part gives

rRRrR ,, ieieienee EUH (2)

4

Page 5: Basics of molecular dynamics simulations

rRRrRRR ,, ieinieinien EEH

Substitute the electronic part back in the Schroedinger eq.

Eqs. (2, 3) need to be solved simultaneously, which is a

formidable

problem for most systems. For two nuclei, there is only one

coordinate

for R (the distance), so it is feasible. But for three-nuclei, there

are

4 coordinates (in general for N nuclei, 3N-5 coordinates are

required),

which makes numerical solution very difficult.

Born-Oppenheimer (adiabatic) approximation consists of neglecting

the cross terms arising from

(which are of order m/M), so that the nuclear part becomes

rR ,ienH

ininien EEH RRR (3)

5

Page 6: Basics of molecular dynamics simulations

ieji ji

jii

iii

i

Eezz

U

NiUdt

dM

RRR

R

RR

2

2

2

,,1

Classical approximation for nuclear motion

Nuclei are heavy so their motion can be described classically, that

is,

instead of solving the Schroedinger Eq. (3), we solve the

corresponding

Newton’s eq. of motion

At zero temperature, the potential can be minimized with respect to

the

nuclear coordinates to find the equilibrium conformation of

molecules.

At finite temperature, Eqs. (2) and (4) form the basis of ab initio

MD

(ignores quantum effects in nuclear motion and electronic exc. at

finite T)

(4)

6

Page 7: Basics of molecular dynamics simulations

Methods of solution for the electronic equation

Two basic methods of solution:

1. Hartree-Fock (HF) based methods: HF is a mean field theory.

One finds the average, self-consistent potential in which electrons move.

Electron correlations are taken into account using various methods.

2. Density functional theory: Solves for the density of electrons.

Better scaling than HF (which is limited to ~10 atoms); 100’s of atoms.

Car-Parrinello MD (DFT+MD) has become popular in recent years.

(2)

rRR

rRrRrr

,

,2 ,

222

2

ieie

iei i

i

i

E

ezem

Electronic part of the Schroedinger eq. (2) has the form

7

Page 8: Basics of molecular dynamics simulations

een

)()( rrr

)]([''

)'()(

2)]([

)(

)(

332

32

,

2

rrr

rrr

rrR

rrrR

nErrddnne

nTH

rdnez

ezU

UHE

xce

i i

i

ei i

iene

eneeee

Density functional theory (DFT)

In DFT one uses the density function of electrons, n(r) instead

of their wavefunction

Expectation value of the electronic energy in the ground state becomes

8

Page 9: Basics of molecular dynamics simulations

Nidt

dM

N

ijiji

ii ,,1,

2

2

FF

r

Classical mechanics

Molecular dynamics (MD) is the most popular method for

simulation studies of biomolecules. It is based on Newton’s

equation of motion.

For N interacting atoms, one needs to solve N coupled Diff. Eq’s:

Force fields are determined from experiments and ab initio

methods.

Analytically this is an intractable problem for N>2,

but we can solve it easily on a computer using numerical

methods.

Current computers can handle N ~ 105-6 particles, which is large

enough for description of most biomolecules.

Integration time, however is still a bottleneck (106 steps @ 1 fs =

1 ns)

9

Page 10: Basics of molecular dynamics simulations

iiiiii

i mdt

dm RvF

r

2

2

Stochastic mechanics

In order to deal with the time bottle-neck in MD, one has to

simplify the simulation system (coarse graining). This is

achieved by describing parts of the system as continuum with

dielectric constants.

Examples:

• transport of ions in electrolyte solutions (water →

continuum)

• protein folding (water → continuum)

• function of transmembrane proteins (lipid, water →

continuum)

To include the effect of the atoms in the continuum, modify

Newton’s eq. of motion by adding frictional and random forces:

Langevin

equation10

Page 11: Basics of molecular dynamics simulations

Molecular dynamics (MD) simulations

MD programs consist of over 100,000 lines of coding. To

understand

what is going on in an MD simulation, we need to consider

several

topics:

1. Force fields (potential functions among atoms)

2. Boundaries and computation of long-range interactions

3. Molecular mechanics (T=0, energy minimization)

4. MD simulations (integration algorithms, ensembles)

5. Analysis of trajectory data to characterize structure and

function of proteins11

Page 12: Basics of molecular dynamics simulations

Empirical force fields (potential functions)

Problem: Reduce the ab initio potential energy surface among the atoms

ji kji

kjijii UUU ),,(),( 32 RRRRRR

To a classical many-body interaction in the form

ieji ji

jii E

ezzU R

RRR

2

Such a program has been tried for water but so far it has failed to

produce a classical potential that works. In strongly interacting systems,

it is difficult to describe the collective effects by summing up the

many-body terms.

Practical solution: Truncate the series at the two-body level and assume

(hope!) that the effects of higher order terms can be absorbed in U2 12

Page 13: Basics of molecular dynamics simulations

Non-bonded interactions

Interaction of two atoms at a distance R = |Ri - Rj| can be

decomposed

into 4 pieces

1. Coulomb potential

2. Induced polarization

3. Attractive dispersion (van der Waals) (1/R6)

4. Short range repulsion (eR/a)

The last two terms are combined into a 6-12 Lennard-Jones

potential

Combination rules for different atoms:

612

4RR

ULJ

2/)( jiijjiij 13

R

qqU ji

04Coul

Page 14: Basics of molecular dynamics simulations

12-6 Lennard-Jones potential (U is in kT, r in Å)

3 3.5 4 4.5 5-0.5

0

0.5

1

1.5

2

2.5

r

U(r)

AkT 3,41

14

Page 15: Basics of molecular dynamics simulations

Because the polarization interaction is many-body and requires

iterations,

it has been neglected in current generations of force fields.

The non-bonded interactions are thus represented by the Coulomb

and

12-6 LJ potentials.Model RO-H (Å) HOH qH (e) (kT) (Å) (D) eps (T=298)

SPC 1.0 109.5 0.410 0.262 3.166 2.27 65±5

TIP3P 0.957 104.5 0.417 0.257 3.151 2.35 97±7

Exp. 1.86 80

SPC: simple point chargeTIP3P: transferable intermolecular potential with 3 points

Popular water models (rigid)

15

Page 16: Basics of molecular dynamics simulations

Covalent bonds

In molecules, atoms are bonded together with covalent bonds,

which arise from sharing of electrons in partially occupied

orbitals. If the bonds are very strong, the molecule can be

treated as rigid as in water molecule. In most large molecules,

however, the structure is quite flexible and this must be taken

into account for a proper description of the molecules. This is

literally done by replacing the bonds by springs. The nearest

neighbour interactions involving 2, 3 and 4 atoms are described

by harmonic and periodic potentials

02

02

0 cos122

ijklijkl

ijklijk

ijkijk

ijij

rij

bond nVk

rrk

U

bond stretching bending torsion (not very good)16

Page 17: Basics of molecular dynamics simulations

Interaction of two H atoms in the ground (1s) state can be described

using Linear Combinations of Atomic Orbitals (LCAO)

)1()1( 21 scsc BA

)1()1(2

1ss BA

From symmetry, two solutions with lowest and highest energies are:

+ Symmetric

- Anti-symmetric

17

Page 18: Basics of molecular dynamics simulations

The R dependence of the potential energy is approximately

given by the Morse potential

2)( 01 RRe eDU

Morse

20 )(

21

RRkUbond

where

De: dissociation energy

R0: equilibrium bond distance

Controls the width of the potential

Classical representation:

eDk 2/

k

18

Page 19: Basics of molecular dynamics simulations

An in-depth look at the force fields

Three force fields, constructed in the eighties, have come to

dominate

the MD simulations

1. CHARMM (Karplus @ Harvard)

Optimized for proteins, works well also for lipids and nucleic

acids

2. AMBER (Kollman & Case @ UCSF)

Optimized for nucleic acids, otherwise quite similar to CHARMM

3. GROMOS (Berendsen & van Gunsteren @ Groningen)

Optimized for lipids and does not work very well for proteins

(due to smaller partial charges in the carbonyl and amide

groups)

The first two use the TIP3P water model and the last one, SPC

model.

They all ignore the polarization interaction (polarizable versions

are under construction)

19

Page 20: Basics of molecular dynamics simulations

Charm parameters for alanine

Partial charge (e)

ATOM N -0.47 |

ATOM HN 0.31 HN—N

ATOM CA 0.07 | HB1

ATOM HA 0.09 | /

GROUP HA—CA—CB—HB2

ATOM CB -0.27 | \

ATOM HB1 0.09 | HB3

ATOM HB2 0.09 O==C

ATOM HB3 0.09 |

ATOM C 0.51

ATOM O -0.51

Total charge: 0.0020

Page 21: Basics of molecular dynamics simulations

Bond lengths :

kr (kcal/mol/Å2) r0 (Å)

N CA 320. 1.430 (1)

CA C 250. 1.490 (2)

C N 370. 1.345 (2) (peptide bond)

O C 620. 1.230 (2) (double bond)

N H 440. 0.997 (2)

HA CA 330. 1.080 (2)

CB CA 222. 1.538 (3)

HB CB 322. 1.111 (3)

1. NMA (N methyl acetamide) vibrational spectra

2. Alanine dipeptide ab initio calculations

3. From alkanes

20 )( rrkU r bond

21

Page 22: Basics of molecular dynamics simulations

Bond angles :

k (kcal/mol/rad2) 0 (deg)

C N CA 50. 120. (1)

C N H 34. 123. (1)

H N CA 35. 117. (1)

N CA C 50. 107. (2)

N CA CB 70. 113.5 (2)

N CA HA 48. 108. (2)

HA CA CB 35. 111. (2)

HA CA C 50. 109.5 (2)

CB CA C 52. 108. (2)

N C CA 80. 116.5 (1)

O C CA 80. 121. (2)

O C N 80. 122.5 (1)

20 )( kUangle

Total 360 deg.

Total 360 deg.

22

Page 23: Basics of molecular dynamics simulations

Basic dihedral configurations trans cis

Definition of the dihedral

angle for 4 atoms A-B-C-D

)cos(12 0 nV

U ndihedDihedrals:

23

Page 24: Basics of molecular dynamics simulations

Dihedral parameters:

Vn (kcal/mol) n 0 (deg) Name

C N CA C 0.2 1 180.

N CA C N 0.6 1 0.

CA C N CA 1.6 1 0.

H N CA CB 0.0 1 0.

H N CA HA 0.0 1 0.

C N CA CB 1.8 1 0.

CA C N H 2.5 2 180.

O C N H 2.5 2 180.

O C N CA 2.5 2 180.

)cos(12 0 nV

U ndihed

24

Page 25: Basics of molecular dynamics simulations

BoundariesIn macroscopic systems, the effect of boundaries on the

dynamics of biomolecules is minimal. In MD simulations,

however, the system size is much smaller and one has to

worry about the boundary effects.

• Using nothing is not realistic for bulk simulations.

• Minimum requirement: water beyond the simulation box must

be treated using a continuum representation

• Most common solution: periodic boundary conditions.

The simulation box is replicated in all directions just like in a

crystal. The cube and rectangular prism are the obvious

choices for a box shape.

25

Page 26: Basics of molecular dynamics simulations

Periodic boundary conditions in two dimensions: 8 nearest neighbours

Particles in the box freely move to the next box, which means

when one moves out another appears from the opposite side of

the same box. In 3-D, there are 26 nearest neighbours.26

Page 27: Basics of molecular dynamics simulations

Treatment of long-range interactions

Problem: the number of non-bonded interactions grows as N2

where N is the number of particles. This is the main

computational bottle neck that limits the system size.

Because the Coulomb potential between two unit charges, has

long range [U=560 kT/r (Å)], use of any cutoff is problematic and

should be avoided.

All the modern MD codes use the Ewald sum method to treat the

long-range interactions without cutoffs. Here one replaces the

point charges with Gaussian distributions, which leads to a much

faster convergence of the sum in the reciprocal (Fourier) space.

The remaining part (point particle – Gaussian) falls exponentially in

real space, hence can be computed easily.27

Page 28: Basics of molecular dynamics simulations

Molecular mechanics

Molecular mechanics deals with the static features of

biomolecular systems at T=0 K, that is, the energy is given

solely by the potential energy of the system.

Two important applications of molecular mechanics are:

1. Energy minimization (geometry optimization):

Find the coordinates for the configuration that minimizes the

potential energy.

2. Normal mode analysis:

Find the vibrational excitations of the system built on the

absolute minimum using the harmonic approximation.

28

Page 29: Basics of molecular dynamics simulations

Molecular dynamics

In MD simulations, one follows the trajectories of N particles

according to Newton’s equation of motion:

where U(r1,…, rN) is the potential energy function

consisting of bonded and non-bonded interactions.

We need to consider:

• Integration algorithms, e.g., Verlet, Leap-frog

• Initial conditions and choice of the time step

• MD simulations at constant temperature and pressure

• Constrained dynamics for rigid molecules (SHAKE)

),,(,2

2

Niiiiii Udt

dm rrFFr

29

Page 30: Basics of molecular dynamics simulations

2

2

)()()()( t

m

tttttt

i

iiii

Fvrr

2

2)(

)()()(

)()()(

tmt

ttttt

tmt

ttt

i

iiii

i

iii

Fvrr

Fvv

Integration algorithms

Given the position and velocities of N particles at time t,

integration of Newton’s equation of motion yields at t+t

In the popular Verlet algorithm, one eliminates velocities using

the positions at t-t,

Adding eq. (*) yields: 2)()()(2)( t

m

tttttt

i

iiii

Frrr

(*)

30

Page 31: Basics of molecular dynamics simulations

tttttt

tm

ttttttt

iii

i

iiii

)2/()()(

)()2/()()( 2

vrr

Fvrr

tttttt

tm

ttttt

iii

i

iii

)2/()()(

)()2/()2/(

vrr

Fvv

In the Leap-frog algorithm, the positions and velocities are

calculated at different times separated by t/2

To show its equivalence to the Verlet algorithm, consider

Subtracting the two equations yields the Verlet result.

If required, velocity at t is obtained from:

)]2/()2/([21

)( ttttt iii vvv31

Page 32: Basics of molecular dynamics simulations

To iterate these equations, we need to specify the initial

conditions.

• The initial configuration of biomolecules can be taken from the

Protein Data Bank (PDB) (if available).

• In addition, membrane proteins need to be embedded in a

lipid bilayer.

• All the MD codes have facilities to hydrate a biomolecule, i.e.,

fill the void in the simulation box with water molecules at the

correct density.

• Ions can be added at random positions.

After energy minimization, these coordinates provide the

positions at t=0.

Initial velocities are sampled from a Maxwell-Boltzmann

distribution: kTvm

kT

mvP ixi

iix 2exp

2)( 2

2/1

32

Page 33: Basics of molecular dynamics simulations

MD simulations at constant temperature and

pressure

MD simulations can be performed in the NVE ensemble, where

all 3 quantities are constant. Due to truncation errors, keeping

the energy constant in long simulations can be problematic. To

avoid this problem, the alternative NVT and NPT ensembles are

employed. The temperature of the system can be obtained

from the average K. E.Thus an obvious way to keep the temperature constant at T0 is

to scale the velocities as:

)(),()( 0 tTTtvtv ii

NkTK23

Because K. E. has considerable fluctuations, this is a rather crude method. 33

Page 34: Basics of molecular dynamics simulations

A better method which achieves the same result more smoothly

is the Berendsen method, where the atoms are weakly coupled

an external heat bath with the desired temperature T0

If T(t) > T0 , the coupling term is negative, which invokes a

viscous force slowing the velocity, and vice-versa for T(t) < T0

Similarly in the NPT ensemble, the pressure can be kept

constant by simply scaling the volume. Again a better method

(Langevin piston), is to weakly couple the pressure difference

to atoms using a force as in above, which will maintain the

pressure at the desired value (~1 atm).

dt

d

tT

Tm

dt

dm i

iiiiir

Fr

1)(

02

2

34

Page 35: Basics of molecular dynamics simulations

Analysis of trajectory data

1. Fundamental quantities

Total energy, temperature, pressure, volume (or density)

2. Structural quantities

Root Mean Square Deviation (RMSD)

Distribution functions (e.g., pair and radial)

Conformational analysis (e.g., Ramachandran plots,

rotamers)

3. Dynamical quantities

Time correlation functions & transport coefficients

Free energy calculations using perturbation, umbrella

sampling and steered MD methods 35

Page 36: Basics of molecular dynamics simulations

Fundamental quantities

• Total energy: strictly conserved but due to numerical errors it

may drift.

• Temperature: would be constant in a macroscopic system

(fluctuations are proportional to 1/N, hence negligible). But

in a small system, they will fluctuate around a base line.

Temperature coupling ensures that the base line does not drift

from the set temperature.

• Pressure: similar to temperature, though fluctuations are

much larger compared to the set value of 1 atm.

• Volume (or density): fixed in the NVT ensemble but varies in

NPT. Therefore, it needs to be monitored during the initial

equilibration to make sure that the system has converged to

the correct volume. Relatively quick for globular proteins but

may take a long time (~1 ns) for systems involving lipids

(membrane proteins).36

Page 37: Basics of molecular dynamics simulations

Root Mean Square Deviation

For an N-atom system, variance and RMSD at time t are defined

as

Where ri(0) are the reference coordinates. Usually they are taken

from the first frame in an MD simulation, though they can also be

taken from the PDB or a target structure. Because side chains in

proteins are different, it is common to use a restricted set of atoms

such as backbone or C. RMSD is very useful in monitoring the

approach to equilibrium (typically, signalled by the appearance of

a broad shoulder).

It is a good practice to keep monitoring RMSD during production

runs

to ensure that the system stays near equilibrium.

RMSD,)0()(1

)(1

2N

iii tN

t rr

37

Page 38: Basics of molecular dynamics simulations

Evolution of the RMSD for the backbone atoms of ubiquitin

38

Page 39: Basics of molecular dynamics simulations

rr

rNrg waterion

24

)()(

Distribution functions

Pair distribution function gij(r) gives the probability of finding a

pair of atoms (i, j) at a distance r. It is obtained by sampling the

distance rij in MD simulations and placing each distance in an

appropriate bin, [r, r+r]. Pair distribution functions are used in

characterizing correlations between pair of atoms, e.g.,

hydrogen bonds, cation-carbonyl oxygen. The peak gives the

average distance and the width, the strength of the interaction.

When the distribution is isotropic (e.g., ion-water in an

electrolyte), one samples all the atoms in the spherical shell, [r,

r+r]. Thus to obtain the radial distribution function (RDF), the

sampled number needs to be normalized by the volume ~4r2r

(as well as density)39

Page 40: Basics of molecular dynamics simulations

Conformational analysisIn proteins, the bond lengths and bond angles are more or less

fixed.

Thus we are mainly interested in conformations of the torsional

angles. As the shape of a protein is determined by the backbone

atoms, the torsional angles, (N−C) and (C−C), are of

particular interest. These are conveniently analyzed using the

Ramachandran plots. Conformational changes in a protein

during MD simulations can be most easily revealed by plotting

these torsional angles as a function of time.

Also of interest are the torsional angles of the side chains,

etc.

Each side chain has several energy minima (called rotamers),

which are separated by low-energy barriers (~ kT). Thus

transitions between different rotamers is readily achievable, and

they could play functional roles, especially for charged side

chains.

40

Page 41: Basics of molecular dynamics simulations

Time correlation functions & transport coefficients

At room temperature, all the atoms in the simulation system are

in a constant motion characterized by their average kinetic

energy: (3/2)kT. Free atoms or molecules diffuse according to

the equation Dtt2 6)( r

dttD

0

)()0(31

vv

While this relationship can be used to determine the diffusion

coefficient, more robust results can be obtained using correlation

functions, e.g., D is related to the velocity autocorrelation function

as

Similarly, conductance of charged particles is given by the Kubo formula

ii

iqdttVkT vJJJ

,)()0(3

1

0

(current)

41

Page 42: Basics of molecular dynamics simulations

2)0(

)()0()(

ttC

Bound atoms or groups of atoms fluctuate around a mean value.

Most of the high-frequency fluctuations (e.g. H atoms), do not

directly

contribute to the protein function. Nevertheless they serve as

“lubricant” that enables large scale motions in proteins (e.g.

domain motions) that do play a significant role in their function.

As mentioned earlier, large scale motions occur in a ~ ms to s

time scale, hence are beyond the present MD simulations.

(Normal mode analysis provides an alternative.) But torsional

fluctuations occur in the ns time domain, and can be studied in

MD simulations using time correlation functions, e.g.

These typically decay exponentially and such fluctuations can be

described using the Langevin equation. 42

Page 43: Basics of molecular dynamics simulations

Free energy calculations

Free energy is the most important quantity that characterizes a

dynamical process. Calculation of the absolute free energies is

difficult in MD simulations. However free energy differences can

be estimated more easily and several methods have been

developed for this purpose. The starting point for most

approaches is Zwanzig’s perturbation formula for the free energy

difference between two states A and B:

If the two states are not similar enough, there is a large hysteresis

effect and the forward and backward results are not equal.

)()(

/)](exp[ln)(

/)](exp[ln)(

ABGBAG

kTHHkTABG

kTHHkTBAG

BBA

AAB

43

Page 44: Basics of molecular dynamics simulations

Alchemical transformation and free energy

perturbation

To obtain accurate results with the perturbation formula, the

energy difference between the states should be < 2 kT, which is

not satisfied for most biomolecular processes. To deal with this

problem, one introduces a hybrid Hamiltonian

and performs the transformation from A to B gradually by

changing the parameter from 0 to 1 in small steps. That is, one

divides [0,1] into n subintervals with {i, i = 0, n}, and for each i

value, calculates the free energy difference from the ensemble

average

BA HHH )1()(

ikTHHkTG iiii /))]()((exp[ln)( 11

44

Page 45: Basics of molecular dynamics simulations

1

01)(

n

iiiGG

The total free energy change is then obtained by summing the

contributions from each subinterval

The number of subintervals is chosen such that the free energy

change at each step is < 2 kT, otherwise the method may lose

its validity.

1

0

)(

dH

G

Thermodynamic integration : Another way to obtain the free energy

difference is to integrate the derivative of the hybrid Hamiltonian:

This integral is evaluated most efficiently using a Gaussian quadrature.45

Page 46: Basics of molecular dynamics simulations

Absolute free energies from umbrella samplingHere one samples the densities along a reaction coordinate and

determines the potential of mean force (PMF) from the Boltzmann

eq.

)(

)(ln)()()()(

00

/)]()([0

0

z

zkTzWzWezz kTzWzW

Here z0 is a reference point, e.g. a point in bulk where W vanishes.

In general, a particle cannot be adequately sampled at high-

energy

points. To counter that, one introduces harmonic potentials, which

restrain the particle at desired points, and then unbias its effect.

For convenience, one introduces umbrella potential at regular

intervals along the reaction coordinate (e.g. ~0.5 Å). The PMF’s

obtained in each interval are unbiased and optimally combined

using the Weighted Histogram Analysis Method (WHAM).

46

Page 47: Basics of molecular dynamics simulations

Steered MD (SMD) simulations and Jarzynski’s

equationSteered MD is a more recent method where a harmonic force is

applied to an atom on a peptide and the reference point of this

force is pulled with a constant velocity. It has been used to

study unfolding of proteins and binding of ligands. The discovery

of Jarzynski’s equation in 1997 enabled determination of PMF

from SMD, which has boosted its applications.

)]([,. 0

//

tkW

ee

f

i

kTWkTF

vrrFdsF

Jarzynski’s equation:

Work done by the harmonic force

This method seems to work well in simple systems and when F is large

but beware of its applications in complex systems! 47

Page 48: Basics of molecular dynamics simulations

A critical look at simulation methods

Molecular dynamics is the “standard model” of biomolecules.

• A higher level (quantum) description is simply not feasible for

any biomolecular system.

• A lower level description sacrifices the atomic detail and

cannot be expected to succeed without guidance from MD.

Nevertheless, MD simulations alone cannot provide a complete

description of biomolecules, and hence we need to appeal to

both

higher and lower level theories to make progress.

• Quantum description (in a mixed QM/MM scheme) is required

for

1. Development and testing of force fields

2. Enzyme reactions that involve making or breaking of

bonds

3. Transport of light particles (e.g., protons and electrons)

48

Page 49: Basics of molecular dynamics simulations

• Stochastic description and coarse-graining are required for any

process that is beyond the current capabilities of MD simulations,

that is, more than 105-6 atoms and longer than microseconds.

1. Protein-protein interactions, protein aggregation (water in

continuum, protein may be rigid, coarse-grained, semi-flexible

or flexible)

2. Aggregation of membrane proteins (water and lipid in

continuum, protein as above)

3. Ion channels and transporters: modelling of ion permeation

(water and lipid in continuum, protein may be rigid or semi-

flexible)

4. Protein folding (water in continuum, protein coarse-grained)

5. Self assembly of lipid bilayers and micelles (water and lipid

molecules coarse-grained)

6. DNA condensation (water in continuum, DNA coarse-grained)

49

Page 50: Basics of molecular dynamics simulations

The guiding principle: Occam’s razor

Use the simplest method that yields the answer, provided:

• The method is justified (domain of validity)

• The parameters employed can be derived from a more

fundamental theory.

Most of the people who use molecular dynamics or other

simulation

methods treat them as black boxes and apply them to

biomolecules

without worrying too much whether they are valid or relevant.

You also need to watch out for experts who publish only the

good

results!

A cautionary tale: Continuum description of ion channels using

the

Poisson-Nernst-Planck (drift-diffusion) equations.

50

Page 51: Basics of molecular dynamics simulations

Here, : potential, r: charge density, J: flux, n: concentration of ions

PNP equations are solved simultaneously, yielding potential and flux

Assumption: Ions are represented by an average charge density, zvenv

which invokes a mean-field approximation.

This is alright in a bulk electrolyte, but in ion channels with radius

smaller than the Debye length (8 Å for 0.15 M), one should worry about

the validity of PNP equations.

Poisson-Nernst-Planck (PNP) Equations

kT

enznDJ

enzext0 Poisson equation

Nernst-Planck equation

51

Page 52: Basics of molecular dynamics simulations

+

+

+

+

+

+

+

= 2 = 80

Induced chargesat the water-protein interface

Image force on an ion

In application of PNP to Ca channels, diffusion coeff. is reduced by 10-4 !

The problem: Ions induce like-charges at a water-protein interface, which

repel them. When ions are represented as a continuous charge density,

this image force (or dielectric self-energy) is severely underestimated.

water protein

52

Page 53: Basics of molecular dynamics simulations

A simple test: PNP vs Brownian dynamics (BD)

Control study:

Set artificially ε = 80 in the protein. No induced charges on the

boundary, hence no discrepancy between the two methods

r = 4 Å

Na +

Cl -

53

Page 54: Basics of molecular dynamics simulations

Realistic case:

In the realistic case (εε = 2 = 2 in the protein), ions do not enter

the channel in BD due to the dielectric self-energy barrier.

Only in large pores (r > 10 Å), validity of PNP is restored.

r = 4 Å

PNP

BD

54