Computational Modeling of Protein-Ligand Interactions
Steven R. Gwaltney
Department of Chemistry
Mississippi State University
Mississippi State, MS 39762

Page 1: Computational Modeling of Protein-Ligand Interactions Steven R. Gwaltney Department of Chemistry Mississippi State University Mississippi State, MS 39762

Page 2: Auguste Comte, 1830

“Every attempt to refer chemical questions to mathematical doctrines must be considered, now and always, profoundly irrational, as being contrary to the nature of the phenomena. . . . but if the employment of mathematical analysis should ever become so preponderant in chemistry (an aberration which is happily almost impossible) it would occasion vast and rapid retrogradation, by substituting vague conceptions for positive ideas, and an easy algebraic verbiage for a laborious investigation of facts.”

Page 3: P. A. M. Dirac, 1929

“The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.”

Page 4: Why the Change?

Quantum Mechanics

Postulated by Schrödinger in 1926
Time dependent version: iħ ∂Ψ/∂t = HΨ
Time independent version: Hψ = Eψ
Partial differential equations
No exact solutions for real systems

Page 5: Approximate

We can’t solve the Schrödinger equation for molecules.
The trick is to choose appropriate approximations – tradeoff of time versus accuracy
“The right answer for the right reason”

Page 6: Theory’s Family Tree

Theoretical Chemistry
– Electronic Structure Theory: Semiempirical, Density Functional Theory, Ab Initio
– Dynamics: Quantum Dynamics, Molecular Dynamics
– Statistical Mechanics

[Diagram label: quantum chemistry]

Page 7: The Three Main Branches

Electronic Structure Theory
– Uses the time independent Schrödinger equation to describe the molecule’s electron configuration
    Can calculate energies, geometries, vibrational frequencies, dipole moments, NMR spectra, etc.
Dynamics
– Studies how the system changes over time
    Uses either quantum mechanics or Newtonian mechanics
Statistical Mechanics
– Studies the average behavior of complex ensembles
    Often used for liquids, polymer melts, similar systems

Page 8: Theory’s Family Tree

Theoretical Chemistry
– Electronic Structure Theory: Semiempirical, Density Functional Theory, Ab Initio
– Dynamics: Quantum Dynamics, Molecular Dynamics
– Statistical Mechanics

Page 9: The Dynamics Siblings

Quantum Dynamics uses the time dependent Schrödinger equation
– Can only handle up to four degrees of freedom
Classical Dynamics moves atoms by F=ma
– Describes systems of several thousand atoms
– Uses molecular mechanics force fields
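The idea of moving atoms by F=ma can be illustrated with a minimal sketch. It assumes a single particle on a spring in one dimension and uses the velocity Verlet integrator, a common choice in molecular dynamics codes; the slides do not say which integrator is used. The force constant, mass, and time step are hypothetical values in arbitrary units.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle in 1D with the velocity Verlet scheme."""
    trajectory = [(x, v)]
    a = force(x) / mass                      # a = F/m at the starting point
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
        trajectory.append((x, v))
    return trajectory

# Hypothetical parameters (arbitrary units), chosen only for illustration.
K_SPRING = 1.0
MASS = 1.0
DT = 0.01
N_STEPS = 1000

def spring_force(x):
    """Force from a harmonic spring centered at x = 0."""
    return -K_SPRING * x

def total_energy(x, v):
    """Kinetic plus potential energy of the particle on the spring."""
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x

traj = velocity_verlet(1.0, 0.0, spring_force, MASS, DT, N_STEPS)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
# Analytic position for comparison: x(t) = cos(omega * t) with omega = sqrt(k/m)
x_exact = math.cos(math.sqrt(K_SPRING / MASS) * DT * N_STEPS)
print(f"energy drift: {e_end - e_start:.2e}")
print(f"final x: {traj[-1][0]:.4f} (analytic {x_exact:.4f})")
```

In a real simulation the force comes from a molecular mechanics force field acting on thousands of atoms in three dimensions, but the update loop has the same shape.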

Page 10: Molecular Mechanics

Describes bond lengths and bond angles as springs
Also includes terms for out of plane bends, torsions, electrostatics, hydrogen bonds, and van der Waals interactions
Very fast
Parameters chosen to fit certain classes of molecules
Can’t break bonds
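A few of the terms listed above can be sketched as simple functions. This is an illustration under stated assumptions, not any particular force field: the harmonic, Lennard-Jones, and Coulomb forms are common textbook choices, and every parameter value below is hypothetical. The last lines show why such a model cannot break bonds: the spring energy keeps rising as the bond is stretched instead of leveling off.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2), for charges in units of e

def bond_energy(r, r0, k):
    """Bond stretch as a harmonic spring: E = 1/2 k (r - r0)^2."""
    return 0.5 * k * (r - r0) ** 2

def angle_energy(theta, theta0, k):
    """Angle bend as a harmonic spring in the angle (radians)."""
    return 0.5 * k * (theta - theta0) ** 2

def lennard_jones(r, epsilon, sigma):
    """van der Waals term in the 12-6 Lennard-Jones form."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(r, q1, q2):
    """Electrostatic term between two point charges."""
    return COULOMB_CONSTANT * q1 * q2 / r

# Hypothetical parameters for a single bond (kcal/mol/Angstrom^2, Angstrom).
K_BOND = 600.0
R0 = 1.5

for stretch in (0.0, 0.5, 1.0, 2.0):
    energy = bond_energy(R0 + stretch, R0, K_BOND)
    print(f"stretch {stretch:.1f} A -> {energy:.1f} kcal/mol")
```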

Page 11: An Example

Page 12: SN2 Reaction

[Figure labels: Reactant, Transition State, Product]

Page 13: Theory’s Family Tree

Theoretical Chemistry
– Electronic Structure Theory: Semiempirical, Density Functional Theory, Ab Initio
– Dynamics: Quantum Dynamics, Molecular Dynamics
– Statistical Mechanics

Page 14: Semiempirical Methods

Molecular Hamiltonian consists of 4 terms:
– Kinetic energy of the electrons
– Nuclear-nuclear repulsion
– Electron-nuclear attraction
– Electron-electron repulsion (the expensive term)
Semiempirical methods throw out most of the two-electron integrals and parameterize the rest of the terms.
– Different parameters for different properties
Speed advantage is diminishing.
Importance of methods is decreasing.

Page 15: Ab Initio Methods

No experimental data used to fit results
Simplest method is Hartree-Fock
– Electrons move in the average electric field produced by the other electrons
– Origin of the molecular orbital picture
– Formally scales as system size to the fourth, in practice much cheaper
– Neglects the instantaneous correlation of electron motions

Page 16: Correlated Methods

Add in missing correlation energy
Equations look like either a large system of nonlinear equations (coupled cluster, CC) or a large eigenvalue/eigenvector problem (configuration interaction, CI)
Best methods are very accurate and very costly
– Errors as low as 0.2 kcal/mol for atomization energies and 0.004 Å for bond lengths
– Cost scales as system size to the seventh power
– Limited to fewer than 20 atoms
We know how to converge to the exact solution
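To see why seventh-power scaling limits these methods to small molecules, the arithmetic can be sketched as follows. The exponents are the formal scalings quoted in the slides; the function name and the size ratios are chosen only for illustration, and real timings depend on prefactors that this ignores.

```python
def relative_cost(size_ratio, exponent):
    """Factor by which the cost grows when the system grows by size_ratio."""
    return size_ratio ** exponent

# Formal scaling exponents quoted in the slides.
METHODS = [
    ("linear scaling DFT", 1),
    ("Hartree-Fock or DFT (formal)", 4),
    ("best correlated methods", 7),
]

for size_ratio in (2, 10):
    for name, exponent in METHODS:
        factor = relative_cost(size_ratio, exponent)
        print(f"{size_ratio}x larger system, {name}: {factor:,}x the cost")
```

Doubling the system multiplies the cost of a seventh-power method by 128, while a fourth-power method pays a factor of 16.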

Page 17: Density Functional Theory

Describe system via electron density (3 variables) instead of wave function (3n variables)
Existence proof for exact form
Practical methods use a few parameters and fit to experimental data
Errors of around 3 kcal/mol for atomization energies

Page 18: DFT Continued

Solved self consistently
Formally scales as system size to the fourth, but linear scaling versions have been developed
Can handle up to a couple hundred atoms
Rapidly becoming the workhorse method of computational chemistry

Page 19: DFT, Part 3

Form of functional: E[ρ] = T_s[ρ] + E_J[ρ] + E_xc[ρ]
No one knows how to get the exact E_xc[ρ].
– Instead, approximations must be used.
A veritable plethora of exchange-correlation functionals exist.
– Often difficult to tell which one works best
– No way to converge to the exact answer

Page 20: A Note On Basis Sets

The wave function (or density) is expanded in terms of Gaussian-shaped orbitals centered on each atom.
Standard basis sets exist.
– These vary primarily by the number of basis functions on each atom.
Bigger basis sets equal:
– Better answers
– Longer calculations

Page 21: SN2 Revisited

A quantum treatment can break the bond.

Page 22: Chemistry and Toxicology

“Usually, a poison has a specific molecule with which it interacts and it is that interaction that causes the toxicity.”

Russell Carr

Page 23: Organophosphate Insecticides

Very heavily used, especially in agricultural areas
Act by reacting with the active site of the enzyme acetylcholinesterase
Acute exposure to OP agents can lead to vomiting, muscle twitches, convulsions, and even death.
Closely related to nerve gases, both in structure and in mode of action

Page 24: Chlorpyrifos

Page 25: Acetylcholinesterase

The neurotransmitter acetylcholine (ACh) is the primary signal carrier in cholinergic nerve/nerve and nerve/muscle junctions.
Acetylcholinesterase (AChE) breaks down ACh, causing the nerve signal to terminate.
AChE exists in vivo as a membrane bound monomer, a dimer, and a tetramer.

Page 26: Structure of AChE?

The chemical structure of the toxicant before it enters your body is often well known.
– However, is the parent or a metabolite the active species in vivo?
The structure of a protein is much harder to determine.
No general method exists to go from the sequence to the tertiary structure of a protein.
– Nobel Prize is waiting!

Page 27: The Protein Data Bank

The two primary ways of experimentally determining the structure of a protein are X-ray crystallography and NMR studies.
Journals require authors to submit solved structures to a central repository, the Protein Data Bank (PDB).
Structures from the PDB are available free of charge.
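Once a structure file is in hand, counting its non-hydrogen atoms is a short exercise. The sketch below assumes the legacy fixed-column PDB text format (atom name in columns 13-16, residue number in columns 23-26, coordinates in columns 31-54, element symbol in columns 77-78). The sample records are generated by a hypothetical helper, `make_atom_line`, with made-up coordinates; they are not taken from a real PDB entry.

```python
def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z, element):
    """Format one ATOM record in the fixed-column layout (hypothetical sample data)."""
    return (
        f"{'ATOM':<6}{serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{20.0:6.2f}" + " " * 10 + f"{element:>2}"
    )

def parse_atoms(lines):
    """Extract atom records from PDB-format lines, skipping everything else."""
    atoms = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atoms.append({
            "name": line[12:16].strip(),
            "res_name": line[17:20].strip(),
            "chain": line[21],
            "res_seq": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            "element": line[76:78].strip().upper(),
        })
    return atoms

def count_heavy_atoms(atoms):
    """Count atoms whose element is not hydrogen."""
    return sum(1 for atom in atoms if atom["element"] != "H")

# Made-up coordinates for part of one serine residue; illustration only.
sample_lines = [
    "REMARK hypothetical sample, not a real PDB entry",
    make_atom_line(1, "N", "SER", "A", 203, 11.104, 13.207, 2.100, "N"),
    make_atom_line(2, "CA", "SER", "A", 203, 12.560, 13.100, 2.300, "C"),
    make_atom_line(3, "OG", "SER", "A", 203, 13.400, 14.900, 3.100, "O"),
    make_atom_line(4, "HG", "SER", "A", 203, 14.100, 15.300, 3.500, "H"),
]

atoms = parse_atoms(sample_lines)
print(f"{len(atoms)} atoms, {count_heavy_atoms(atoms)} non-hydrogen")
```

For real work an established structure-handling library is a safer choice than hand-written column slicing, since real files contain alternate locations, insertion codes, and other details this sketch ignores.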

Page 28: Mouse AChE

Tetramer with 17,000 non-hydrogen atoms

Page 29: Single Monomer

547 amino acids, 4,300 non-hydrogen atoms

Page 30: What Do We Want to Know?

Once we have structures, we need to decide what information we want to learn.
This determines what methods we should use for our calculations.

Page 31: A Little Physical Chemistry

E + S →(K_A) ES →(k_p) EP
K_A is the equilibrium constant for enzyme/substrate association
– K_A = exp(−ΔG_b/RT)
k_p is the rate constant for product formation
– k_p = A exp(−E_a/RT)
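Both expressions are exponentials in an energy divided by RT, so small energy errors become large errors in K_A and k_p. The sketch below evaluates them with the gas constant in kcal/(mol K); the binding free energy, activation energy, and prefactor are hypothetical numbers chosen only to exercise the formulas.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def association_constant(delta_g_bind, temperature=298.15):
    """K_A = exp(-dG_b / RT), with dG_b in kcal/mol."""
    return math.exp(-delta_g_bind / (R_KCAL * temperature))

def arrhenius_rate(prefactor, e_act, temperature=298.15):
    """k_p = A exp(-E_a / RT), with E_a in kcal/mol."""
    return prefactor * math.exp(-e_act / (R_KCAL * temperature))

def energy_for_factor(factor, temperature=298.15):
    """Energy change (kcal/mol) that alters K_A or k_p by the given factor."""
    return R_KCAL * temperature * math.log(factor)

# Hypothetical inputs, for illustration only.
DELTA_G_BIND = -8.0   # kcal/mol
E_ACT = 15.0          # kcal/mol
PREFACTOR = 1.0e13    # 1/s

print(f"K_A = {association_constant(DELTA_G_BIND):.3e}")
print(f"k_p = {arrhenius_rate(PREFACTOR, E_ACT):.3e} 1/s")
print(f"a factor of 10 corresponds to {energy_for_factor(10):.2f} kcal/mol")
```

At room temperature a tenfold change corresponds to about 1.4 kcal/mol, which puts the error bars quoted earlier for the different electronic structure methods into perspective.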

Page 32: Reaction Diagram

[Figure: energy diagram with points labeled E+S, ES, Transition state, and EP; the energy differences ΔG_b and E_a are marked.]

Need three points

Page 33: The Problem

1. Enzymes are too big to study with quantum mechanics.
2. Molecular mechanics can’t break bonds.
3. How do we bridge the gap?

Page 34: Combine the Two

“For every problem there is a solution which is simple, obvious, and wrong”

Albert Einstein

Page 35: QM/MM

Problems
– How do you define the border?
– How do you couple the two regions together?

[Figure labels: QM Region, MM Region]

Page 36: Make the Enzyme Smaller

Can we cut out a piece of the enzyme?
– The piece must be small enough to calculate.
– The piece must be able to describe the chemistry.

Page 37: AChE Active Site

6 amino acids, 42 non-hydrogen atoms

[Figure labels: Ser 203, His 447, Glu 334, oxyanion hole]

Page 38: The Role of the Rest

Active site is 6 out of 547 amino acids.
The rest of a protein serves to hold the active site and the substrate in an optimal configuration.
It also provides a polarized environment, allosteric interactions, and gross conformational changes.

Page 39: A Bigger Piece

26 amino acids, 214 non-hydrogen atoms
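One plausible way to choose such a piece is to keep every residue that has an atom within some cutoff distance of the active-site residues. The slides do not say how the 26 residues were selected, so the rule below is an assumption, and the coordinates and the neighbor residue numbers (100 and 500) are made up for the example.

```python
import math

def residues_within(atoms, center_residues, cutoff):
    """Residue numbers with at least one atom within cutoff of any center-residue atom."""
    core_atoms = [a for a in atoms if a["res_seq"] in center_residues]
    selected = {a["res_seq"] for a in core_atoms}
    for atom in atoms:
        if atom["res_seq"] in selected:
            continue
        if any(math.dist(atom["xyz"], c["xyz"]) <= cutoff for c in core_atoms):
            selected.add(atom["res_seq"])
    return sorted(selected)

# Toy data: one or two atoms per residue, coordinates in Angstrom (hypothetical).
toy_atoms = [
    {"res_seq": 203, "xyz": (0.0, 0.0, 0.0)},
    {"res_seq": 447, "xyz": (3.0, 0.0, 0.0)},
    {"res_seq": 334, "xyz": (6.0, 0.0, 0.0)},
    {"res_seq": 100, "xyz": (0.0, 4.0, 0.0)},
    {"res_seq": 100, "xyz": (0.0, 9.0, 0.0)},
    {"res_seq": 500, "xyz": (30.0, 0.0, 0.0)},
]

CATALYTIC_RESIDUES = {203, 447, 334}  # Ser 203, His 447, Glu 334 from the slides

small_piece = residues_within(toy_atoms, CATALYTIC_RESIDUES, cutoff=5.0)
print(small_piece)
```

Raising the cutoff grows the piece, which is the tradeoff the slides describe: small enough to calculate, large enough to describe the chemistry.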

Page 40: So, what do we do?

Use linear scaling DFT calculations to calculate a “chunk” of the enzyme
Big basis set in the middle – small basis set at the edge
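The mixed basis set idea can be sketched as a distance rule: atoms close to the reaction center get the larger basis, and atoms farther out get the smaller one. The slides do not name the basis sets or the radius used, so the labels "6-31G*" and "STO-3G" and the 6 Å radius below are placeholder assumptions, as are the coordinates.

```python
import math

def assign_basis(atoms, center, inner_radius, big="6-31G*", small="STO-3G"):
    """Pick a basis set label per atom from its distance to the reaction center."""
    labels = []
    for atom in atoms:
        distance = math.dist(atom["xyz"], center)
        labels.append(big if distance <= inner_radius else small)
    return labels

# Hypothetical atoms at increasing distance from the center (Angstrom).
chunk_atoms = [
    {"element": "O", "xyz": (0.5, 0.0, 0.0)},
    {"element": "C", "xyz": (4.0, 0.0, 0.0)},
    {"element": "N", "xyz": (9.0, 0.0, 0.0)},
]

basis_labels = assign_basis(chunk_atoms, center=(0.0, 0.0, 0.0), inner_radius=6.0)
for atom, label in zip(chunk_atoms, basis_labels):
    print(atom["element"], label)
```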

Page 41: Not Quite so Simple

1. The multiple minimum problem
2. How does the substrate fit in?
3. Where are the waters?

Page 42: Back to Molecular Dynamics

Use MD simulations to provide initial geometries for DFT studies
– Easy to add water molecules to the simulation
    Can then put them into the DFT calculations in the right places
– Allow the enzyme to relax in the presence of the substrate
– Can give us multiple starting structures if multiple important structures exist

Page 43: One Final Quote

“In theory, there is no difference between theory and practice; In practice, there is.”

Chuck Reid