1
Polar/Aromatic Polar/Aromatic Evaluating Molecular Mechanics Force Fields with a Quantum Chemical Approach David Ryan Koes and John K. Vries Department of Computational and Systems Biology, University of Pittsburgh Introduction The chemical shifts of a molecular dynamics trajectory can be obtained from de novo quantum mechanical (QM) calculations performed on localized regions surrounding the atoms of interest. The ensemble average of these calculated chemical shifts can be compared to the chemical shifts obtained from NMR spectroscopy to get an estimate of the accuracy of the simulation [1]. We quantify the errors associated with the backbone atoms in MD simulations of seven model proteins. A library of regional conformers containing 261,587 members was constructed from molecular dynamics simulations of the model proteins. The chemical shifts associated with the backbone atoms in each of these conformers was obtained from QM calculations using density functional theory at the B3LYP level with a 6-311+G(2d,p) basis set. Chemical shifts were assigned to each backbone atom in each MD simulation frame using a template matching approach. The ensemble average of these chemical shifts was then compared to chemical shifts from NMR spectroscopy. All force fields exhibit errors, and 1H atoms display a particularly prominent systematic error. Dierent force fields exhibit consistent dierences in errors that map directly to residue specific dierences in conformational sampling. We find that recent force fields developed with the implicitly polarized charge model generate trajectories with the lowest average error, but no single force field is clearly best. http://github.com/dkoes/MD2NMR Conclusion Force fields are typically validated on larger systems using statistical analysis of secondary structure, NMR relaxation times, and J-couplings. Ab initio chemical shift prediction provides an additional metric for validation that provides detailed information about atomic environments. 14ipq and 15ipq perform best in our evaluation. Neither is clearly superior to the other, but both generally outperform the other considered force fields. Background Chemical Shifts. The chemical shift of an atom is the change in resonant frequency of the nuclei with respect to a standard. As the resonant frequency is determined by the local electronic environment, chemical shift values provide an experimentally measurable indicator of the local chemical environment of individual atoms of a protein. Molecular dynamics. Molecular dynamics simulations numerically integrate Newton’s equations of motion to derive a trajectory of atomic coordinates. Force fields. Molecular mechanics force fields determine the potential energy of a molecular system and usually consist of covalent (bond/angle/dihedral) and noncovalent (electrostatic/van der Waals) terms. If a force field accurately estimates the potential energy of a system, then molecular dynamics simulations computed using the force field will have realistic dynamics that correspond to experimental observables. In this work we evaluate force fields from the Amber family of force fields (in approximate chronological order): 94[2], 96[3], 03[4], 99SB[5], 99SBildn[6], 99SBnmr[7], 14SB[8], 14ipq[9], and 15ipq[10]. References [1] Koes, David R., and John K. Vries. "Error assessment in molecular dynamics trajectories using computed NMR chemical shifts." Computational and Theoretical Chemistry 1099 (2017): 152-166. [2] Cornell, Wendy D., et al. "A second generation force field for the simulation of proteins, nucleic acids, and organic molecules." Journal of the American Chemical Society 117.19 (1995): 5179-5197.[3] Kollman, Peter, et al. "The development/application of a ‘minimalist’organic/ biochemical molecular mechanic force field using a combination of ab initio calculations and experimental data." Computer simulation of biomolecular systems. Springer Netherlands, 1997. 83-96. [4] Duan, Yong, et al. "A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations." Journal of computational chemistry 24.16 (2003). [5] Hornak, Viktor, et al. "Comparison of multiple Amber force fields and development of improved protein backbone parameters." Proteins: Structure, Function, and Bioinformatics 65.3 (2006): 712-725. [6] Lindorff-Larsen, Kresten, et al. "Improved side-chain torsion potentials for the Amber ff99SB protein force field." Proteins: Structure, Function, and Bioinformatics 78.8 (2010): 1950-1958. [7] Li, Da-Wei, and Rafael Brüschweiler. "NMR-based protein potentials." Angewandte Chemie International Edition 49.38 (2010): 6778-6780. [8] Maier, James A., et al. "ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB." Journal of chemical theory and computation 11.8 (2015): 3696-3713. [9] Cerutti, David S., et al. "ff14ipq: a self-consistent force field for condensed-phase simulations of proteins." Journal of chemical theory and computation 10.10 (2014): 4515-4534. [10] Debiec, Karl T., et al. "Further along the Road Less Traveled: AMBER ff15ipq, an Original Protein Force Field Built on a Self-Consistent Physical Model." Journal of Chemical Theory and Computation 12.8 (2016): 3926-3947. Molecular Dynamics Amber 16 TIP3P water 1ps sampling 30 10ns trajectories 3 100ns trajectories Extract Regions of Interest (ROIs) 5Å Radius, Full Residues Select Novel Representative ROIs 0.2Å Matching Cutoff Calculate Chemical Shifts Gaussian 09 B3LYP 6-311+G(2d,p) Match Closest Template Minimum Average Absolute Difference Template Database Ensemble Average A greedy algorithm for the dominating set problem is used to select a minimal set of ROI conformations that “cover” the full simulation. ROI conformations are reduced to a set of distances between atoms of interest for efficient storage and matching. QM calculations closely match experiment in reference organic molecules. Errors due to matching are normally distributed, allowing larger match tolerances. Force Fields Compared GLY59 MP = n i |d i | n 1H Error as a Function of MP A B C 1enh 1hik 1igd 1qzm 1ubq 2a3d 3obl The computed chemical shift can serve as a sensitive indicator variable of the local chemical environment. 3OBL Correlation With Experiment 3OBL: ff14SB vs. ff14ipq 14ipq better 14SB better 14SB Water: 57% Acid: 20% 14ipq Water: 7% Acid: 82% GLU18 TYR52 14ipq Water: 22% Oxy: 13% 14SB Water: 97% Oxy: 0% 75% of residues display the same dominant interaction pattern across all force fields (Helix, Sheet, Coil, Water, Ring, or Acid). The remaining residues display different dominant interaction patterns in different force fields (Differ). These arise from significant changes in conformational sampling Acknowledgements This research was supported in part by the University of Pittsburgh Center for Simulation and Modeling through the supercomputing resources provided and by R01GM108340 from the National Institute of General Medical Sciences.

Evaluating Molecular Mechanics Force Fields with a ...bits.csb.pitt.edu/files/MD2NMR_SAM_2017.pdfdynamics trajectories using computed NMR chemical shifts." Computational and Theoretical

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Evaluating Molecular Mechanics Force Fields with a ...bits.csb.pitt.edu/files/MD2NMR_SAM_2017.pdfdynamics trajectories using computed NMR chemical shifts." Computational and Theoretical

Polar/Aromatic

Polar/Aromatic

Evaluating Molecular Mechanics Force Fields with a Quantum Chemical Approach

David Ryan Koes and John K. VriesDepartment of Computational and Systems Biology, University of Pittsburgh

IntroductionThe chemical shifts of a molecular dynamics trajectory can be obtained from de novo quantum mechanical (QM) calculations performed on localized regions surrounding the atoms of interest. The ensemble average of these calculated chemical shifts can be compared to the chemical shifts obtained from NMR spectroscopy to get an estimate of the accuracy of the simulation [1].

We quantify the errors associated with the backbone atoms in MD simulations of seven model proteins. A library of regional conformers containing 261,587 members was constructed from molecular dynamics simulations of the model proteins. The chemical shifts associated with the backbone atoms in each of these conformers was obtained from QM calculations using density functional theory at the B3LYP level with a 6-311+G(2d,p) basis set. Chemical shifts were assigned to each backbone atom in each MD simulation frame using a template matching approach. The ensemble average of these chemical shifts was then compared to chemical shifts from NMR spectroscopy.

All force fields exhibit errors, and 1H atoms display a particularly prominent systematic error. Different force fields exhibit consistent differences in errors that map directly to residue specific differences in conformational sampling. We find that recent force fields developed with the implicitly polarized charge model generate trajectories with the lowest average error, but no single force field is clearly best.

http://github.com/dkoes/MD2NMR

ConclusionForce fields are typically validated on larger systems using statistical analysis of secondary structure, NMR relaxation times, and J-couplings. Ab initio chemical shift prediction provides an additional metric for validation that provides detailed information about atomic environments.

ff14ipq and ff15ipq perform best in our evaluation. Neither is clearly superior to the other, but both generally outperform the other considered force fields.

BackgroundChemical Shifts. The chemical shift of an atom is the change in resonant frequency of the nuclei with respect to a standard. As the resonant frequency is determined by the local electronic environment, chemical shift values provide an experimentally measurable indicator of the local chemical environment of individual atoms of a protein.

Molecular dynamics. Molecular dynamics simulations numerically integrate Newton’s equations of motion to derive a trajectory of atomic coordinates.

Force fields. Molecular mechanics force fields determine the potential energy of a molecular system and usually consist of covalent (bond/angle/dihedral) and noncovalent (electrostatic/van der Waals) terms. If a force field accurately estimates the potential energy of a system, then molecular dynamics simulations computed using the force field will have realistic dynamics that correspond to experimental observables. In this work we evaluate force fields from the Amber family of force fields (in approximate chronological order): ff94[2], ff96[3], ff03[4], ff99SB[5], ff99SBildn[6], ff99SBnmr[7], ff14SB[8], ff14ipq[9], and ff15ipq[10].

References[1] Koes, David R., and John K. Vries. "Error assessment in molecular dynamics trajectories using computed NMR chemical shifts." Computational and Theoretical Chemistry 1099 (2017): 152-166.[2] Cornell, Wendy D., et al. "A second generation force field for the simulation of proteins, nucleic acids, and organic molecules." Journal of the American Chemical Society 117.19 (1995): 5179-5197.[3] Kollman, Peter, et al. "The development/application of a ‘minimalist’organic/biochemical molecular mechanic force field using a combination of ab initio calculations and experimental data." Computer simulation of biomolecular systems. Springer Netherlands, 1997. 83-96.[4] Duan, Yong, et al. "A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations." Journal of computational chemistry 24.16 (2003).[5] Hornak, Viktor, et al. "Comparison of multiple Amber force fields and development of improved protein backbone parameters." Proteins: Structure, Function, and Bioinformatics 65.3 (2006): 712-725.

[6] Lindorff-Larsen, Kresten, et al. "Improved side-chain torsion potentials for the Amber ff99SB protein force field." Proteins: Structure, Function, and Bioinformatics 78.8 (2010): 1950-1958.[7] Li, Da-Wei, and Rafael Brüschweiler. "NMR-based protein potentials." Angewandte Chemie International Edition 49.38 (2010): 6778-6780.[8] Maier, James A., et al. "ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB." Journal of chemical theory and computation 11.8 (2015): 3696-3713.[9] Cerutti, David S., et al. "ff14ipq: a self-consistent force field for condensed-phase simulations of proteins." Journal of chemical theory and computation 10.10 (2014): 4515-4534.[10] Debiec, Karl T., et al. "Further along the Road Less Traveled: AMBER ff15ipq, an Original Protein Force Field Built on a Self-Consistent Physical Model." Journal of Chemical Theory and Computation 12.8 (2016): 3926-3947.

Molecular DynamicsAmber 16 TIP3P water

1ps sampling30 10ns trajectories3 100ns trajectories

Extract Regions of Interest (ROIs)

5Å Radius, Full Residues

Select Novel Representative ROIs0.2Å Matching Cutoff

Calculate Chemical ShiftsGaussian 09

B3LYP 6-311+G(2d,p)

Match Closest TemplateMinimum Average Absolute

Difference

Template Database

Ensemble Average

A greedy algorithm for the dominating set problem is used to select a minimal set of ROI conformations that “cover” the full simulation.

ROI conformations are reduced to a set of distances between atoms of interest for efficient storage and matching.

QM calculations closely match experiment in reference organic molecules.

Errors due to matching are normally distributed, allowing larger match tolerances.

Force Fields Compared

GLY59

MP =

n!

i

|di|

n

1H Error as a Function of MP

A B C

1enh

1hik

1igd

1qzm

1ubq

2a3d

3obl

The computed chemical shift can serve as a sensitive indicator variable of the local chemical environment.

3OBL

Correlation With Experiment

3OBL: ff14SB vs. ff14ipqff14ipq better

ff14SB better

ff14SBWater: 57%Acid: 20%

ff14ipqWater: 7%Acid: 82%

GLU18 TYR52ff14ipqWater: 22%Oxy: 13%

ff14SBWater: 97%Oxy: 0%

75% of residues display the same dominant interaction pattern across all force fields (Helix, Sheet, Coil, Water, Ring, or Acid). The remaining residues display different

dominant interaction patterns in different force fields (Differ). These arise from significant changes in conformational sampling

AcknowledgementsThis research was supported in part by the University of Pittsburgh Center for Simulation and Modeling through the supercomputing resources provided and by R01GM108340 from the National Institute of General Medical Sciences.