

Modeling Protein Flexibility with Spatial and Energetic Constraints

Yi-Chieh Wu 1, Amarda Shehu 2, Lydia Kavraki 2,3

Conclusions

• Provided an approach to generating physical conformations of a protein
• Modeled flexibility of the binding site

Future work

• Investigate other modes of motion
• Incorporate multiple motion vectors

1 Dept. of Electrical and Computer Engineering, Rice University
2 Dept. of Computer Science, Rice University
3 Dept. of Bioengineering, Rice University

Acknowledgements

Computing Research Association's Committee on the Status of Women in Computing Research Distributed Mentor Project

W. M. Keck Center Undergraduate Research Training Program

Physical and Biological Computing Group, Rice University

For questions, comments, and preprint requests: Yi-Chieh Wu [email protected]

References

• A. A. Canutescu and R. L. Dunbrack. Cyclic coordinate descent: A robotics algorithm for protein loop closure. Protein Science, 12: 963-972, 2003.
• A. Shehu. (2004). Sampling Biomolecular Conformations with Spatial and Energetic Constraints. MS Thesis, Rice University.

Discussion

• Modeled flap movement of HIV-1 protease using first PCA
• Opened and closed flaps but kept protein stable
• Movement concentrated in flaps
• Open-flap conformations are less constrained: recovered conformations with higher RMSDs

Spatial and Energetic Constraints

Spatial Constraints
• Inverse kinematics solved with CCD
• Features defined along backbone, so sidechains kept rigid
• Displacement only valid in a small neighborhood

Energetic Constraints
• Full conjugate gradient minimization of CHARMM energy
• Energy cutoff of 600 kcal/mol
• “Rewind” to previous conformation if high-energy barrier encountered

Problem Statement

Problem Definition
• Generate a set of conformations that capture the most important motions
• Follow along collective modes of motion starting from an initial structure
• Limited by local search: analysis fails far from the native

Motivation
• Most current methods consider proteins as rigid structures
• Models incorporating protein flexibility provide better representations

Model System

HIV-1 protease
• A viral protein that assists in HIV replication
• Target of drug design: single point of failure
• Native structure: fully minimized structure of 4HVP from the Protein Data Bank

Principal component analysis (PCA)
• Identifies major modes of motion
• Direct physical interpretations
• HIV-1 protease: first eigenvector corresponds to opening and closing of the flaps surrounding the binding site
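To illustrate how a first principal component can be extracted and then followed, here is a minimal sketch in plain Python. It uses power iteration on the covariance of a few coordinate vectors. The function names `first_principal_component` and `displace_along_mode` are hypothetical, and nothing here is taken from the poster's implementation. A real analysis would work on flattened atomic coordinates from many conformations and would normally use a linear algebra library.

```python
import math


def first_principal_component(samples, iterations=200):
    """Return a unit vector along the direction of largest variance.

    samples: list of equal-length coordinate vectors, one per conformation.
    Uses power iteration, which is an assumption of this sketch.
    """
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    centered = [[s[i] - mean[i] for i in range(dim)] for s in samples]

    # Deterministic start vector. Power iteration fails if this happens to be
    # orthogonal to the dominant component; a random start avoids that.
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        # Covariance times v, computed as X^T (X v) / (n - 1) without
        # building the covariance matrix explicitly.
        proj = [sum(row[i] * v[i] for i in range(dim)) for row in centered]
        w = [sum(proj[k] * centered[k][i] for k in range(n)) / (n - 1)
             for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def displace_along_mode(coords, mode, step_size):
    """Move a flat coordinate vector by step_size along a unit mode vector."""
    return [c + step_size * m for c, m in zip(coords, mode)]


# Toy data: four "conformations" that vary only along the direction (1, 2).
samples = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mode = first_principal_component(samples)
opened = displace_along_mode(samples[0], mode, 0.5)    # positive step
closed = displace_along_mode(samples[0], mode, -0.5)   # negative step
```

A positive or negative step along the same vector corresponds to the two directions of a mode, which is how opening and closing can share one eigenvector.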

Figure 4. Backbone representation of HIV-1 protease bound to an inhibitor (orange).

Feature Definition

Features: residues with constrained positions
• Choose atoms with the largest displacements (Figure 5)
• Internal features moved along the PCA: capture flap movement
• End features unmoved: keep rest of protein native-like to maintain low energy

Figure 5: Atom displacements along the first PCA. Red circles mark the indices of our chosen features.
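The choice of internal features by largest displacement can be sketched as follows. This assumes the mode is stored as a flat vector with three components per atom, which is a common layout but is not stated on the poster. The helper names are hypothetical.

```python
import math


def atom_displacements(mode):
    """Per-atom displacement magnitude for a flat mode [x0, y0, z0, x1, ...]."""
    if len(mode) % 3 != 0:
        raise ValueError("mode length must be a multiple of 3")
    return [math.sqrt(mode[i] ** 2 + mode[i + 1] ** 2 + mode[i + 2] ** 2)
            for i in range(0, len(mode), 3)]


def choose_internal_features(mode, count):
    """Return indices of the `count` most displaced atoms, in chain order."""
    magnitudes = atom_displacements(mode)
    ranked = sorted(range(len(magnitudes)),
                    key=lambda i: magnitudes[i], reverse=True)
    return sorted(ranked[:count])


# Toy mode over four atoms; atoms 1 and 3 move the most.
toy_mode = [0.0, 0.0, 0.0,
            3.0, 4.0, 0.0,
            0.0, 0.0, 1.0,
            0.0, 0.0, 10.0]
features = choose_internal_features(toy_mode, 2)
```

End features, which stay fixed, would be picked separately, for example from atoms with near-zero displacement on either side of the moving region.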

Method

Results

RMSDs of Recovered Conformations along the First PCA
(The highest RMSD as measured against the native structure is given. RMSD is measured in angstroms.)

Direction  Step Size  Flap All-Atom  Flap Backbone  Flap Sidechain  Rest All-Atom  Total All-Atom
                      RMSD           RMSD           RMSD            RMSD           RMSD
Close      0.1        2.125          2.856          3.188           0.483          1.117
Close      0.25       2.235          2.104          2.266           0.359          0.804
Close      0.5        2.097          2.032          2.113           0.337          0.7552
Close      1.0        2.289          2.166          2.319           0.298          0.798
Close      2.5        2.159          1.993          2.198           0.312          0.764
Open       0.1        4.668          4.027          4.814           0.517          1.599
Open       0.25       3.421          3.111          3.494           0.356          1.166
Open       0.5        3.340          3.171          3.454           0.375          1.164
Open       1.0        2.351          2.030          2.424           0.247          0.802
Open       2.5        1.643          1.434          1.691           0.240          0.582
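For readers who want to reproduce this kind of measurement, the sketch below computes RMSD over a chosen subset of atoms and reports the highest value across a set of conformations. It assumes the structures are already in a common frame. The poster does not say whether a superposition step was applied, so none is included. The function names and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b, atom_indices=None):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    atom_indices optionally restricts the calculation to a subset of atoms,
    such as flap atoms or backbone atoms. No superposition is performed.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    indices = range(len(coords_a)) if atom_indices is None else atom_indices
    indices = list(indices)
    if not indices:
        raise ValueError("no atoms selected")
    total = 0.0
    for i in indices:
        ax, ay, az = coords_a[i]
        bx, by, bz = coords_b[i]
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(indices))


def highest_rmsd(native, conformations, atom_indices=None):
    """Largest RMSD to the native structure over a set of conformations."""
    return max(rmsd(native, conf, atom_indices) for conf in conformations)


# Toy two-atom structures; atom 1 plays the role of a "flap" atom.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
recovered = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],
]
flap_value = highest_rmsd(native, recovered, atom_indices=[1])
rest_value = highest_rmsd(native, recovered, atom_indices=[0])
```

Passing different index sets gives the per-region columns of the table (flap, rest, total) from the same routine.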

Figure 6: Backbone representation of flap movement along the first PCA. Features used are shown as gray spheres.

Algorithm

Proteins as Robotic Manipulators

Rigid geometry model
• Dihedrals are the only degrees of freedom
• Reduce problem dimensionality

Figure 1†: Rigid geometry model. Only dihedral angles are used as degrees of freedom. Backbone dihedrals (phi and psi) are depicted.

Figure 2: A protein modeled as an articulated mechanism.

Figure 3†: Using CCD to satisfy spatial constraints. One joint (circled in green) is rotated at a time to bring the end-effector (blue) closer to the target position (red).

†Figures adapted from: I. Lotan. (2004). Algorithms exploiting the chain structure of proteins. PhD Thesis, Stanford University.
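Because dihedral angles are the only degrees of freedom in the rigid geometry model, it helps to see how one is measured from four consecutive atom positions. The sketch below uses the standard projection-and-atan2 formula. The function name and the test points are illustrative, not taken from the poster.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle about the p1-p2 bond, in degrees."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)

    # Unit vector along the central bond (the rotation axis).
    length = math.sqrt(_dot(b1, b1))
    axis = (b1[0] / length, b1[1] / length, b1[2] / length)

    # Components of the outer bonds perpendicular to the axis.
    d0 = _dot(b0, axis)
    d2 = _dot(b2, axis)
    v = (b0[0] - d0 * axis[0], b0[1] - d0 * axis[1], b0[2] - d0 * axis[2])
    w = (b2[0] - d2 * axis[0], b2[1] - d2 * axis[1], b2[2] - d2 * axis[2])

    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


# Four points forming a right-angle twist about the z axis.
angle = dihedral_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                         (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

For a backbone, phi and psi are this same quantity evaluated on the atom quadruples C-N-CA-C and N-CA-C-N respectively.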

INPUT
• Protein (Native)
• PCA Vector, Step Size
• Features
• Energy Cutoff
• Randomization, CCD, and Minimization Parameters

PROGRAM
• Initialize Protein and Features
• Move Features by PCA
• Use CCD to Satisfy Features
• Minimize Energy
• Check Energy
  within cutoff → Move Features by PCA
  outside cutoff → Rewind to Previous Conformation

OUTPUT
• Conformations
• Time Analysis
• Closure Satisfaction, Energies, RMSD
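The control flow of the program steps can be sketched as a small driver loop. Everything here is a stand-in: the four callables, the `max_rewinds` stopping rule, and the one-dimensional toy "conformation" are assumptions made for illustration. The real steps would be CCD closure, CHARMM conjugate gradient minimization, and an energy evaluation. The poster does not specify what happens after a rewind, so this sketch simply retries and gives up after a few failures.

```python
def sample_along_mode(native, n_steps, energy_cutoff,
                      move_features, satisfy_features, minimize, energy,
                      max_rewinds=3):
    """Walk away from `native`, keeping only conformations under the cutoff.

    Returns (accepted_conformations, number_of_rewinds).
    """
    accepted = []
    current = native
    rewinds = 0
    for _ in range(n_steps):
        targets = move_features(current)                # move features by PCA
        candidate = satisfy_features(current, targets)  # CCD in the poster
        candidate = minimize(candidate)                 # energy minimization
        if energy(candidate) <= energy_cutoff:
            current = candidate                         # within cutoff: keep going
            accepted.append(candidate)
        else:
            rewinds += 1                                # outside cutoff: rewind,
            if rewinds >= max_rewinds:                  # i.e. keep `current` as is
                break
    return accepted, rewinds


# Toy stand-ins: a conformation is one number, energy grows quadratically.
accepted, rewinds = sample_along_mode(
    native=0.0,
    n_steps=6,
    energy_cutoff=600.0,
    move_features=lambda x: x + 0.5,
    satisfy_features=lambda current, target: target,
    minimize=lambda x: x,
    energy=lambda x: 100.0 * x * x,
)
```

With deterministic stand-ins a rejected step fails again on every retry, which is why a randomized closure or minimization step matters in practice.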

Protein Flexibility

Applications
• Protein native state behavior
• Molecular interactions
• Drug design and discovery

Energy landscape
• Funnel-shaped → thermodynamically stable native structure
• Varying energetic constraints → non-symmetric for open- and closed-flap conformations
• More conformations around the native

Cyclic Coordinate Descent (CCD)
• Iterative, heuristic approach to solving inverse kinematics
• Adjusts one dihedral at a time to move an atom to its constrained position
• Computationally fast and analytically simple
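The one-joint-at-a-time idea behind CCD is easiest to see on a planar chain. The sketch below is a deliberate simplification: it uses a 2D arm with revolute joints rather than dihedral rotations about bond axes in 3D, and all names and parameters are hypothetical. Each update rotates a single joint so that the end effector lines up with the target as seen from that joint.

```python
import math


def forward_kinematics(angles, lengths):
    """Joint positions of a planar chain anchored at the origin.

    angles are relative joint angles in radians; the last point returned
    is the end effector.
    """
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for angle, length in zip(angles, lengths):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points


def ccd(angles, lengths, target, tolerance=1e-3, max_sweeps=100):
    """Cyclic coordinate descent; returns (angles, converged)."""
    angles = list(angles)
    for _ in range(max_sweeps):
        # Sweep from the joint nearest the end effector back to the base.
        for j in reversed(range(len(angles))):
            points = forward_kinematics(angles, lengths)
            end_x, end_y = points[-1]
            if math.hypot(end_x - target[0], end_y - target[1]) < tolerance:
                return angles, True
            pivot_x, pivot_y = points[j]
            to_end = math.atan2(end_y - pivot_y, end_x - pivot_x)
            to_target = math.atan2(target[1] - pivot_y, target[0] - pivot_x)
            # Rotating joint j turns everything downstream of it.
            angles[j] += to_target - to_end
    end_x, end_y = forward_kinematics(angles, lengths)[-1]
    return angles, math.hypot(end_x - target[0], end_y - target[1]) < tolerance


# Three unit links, initially stretched along x, reaching for an interior point.
solution, converged = ccd([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], (1.0, 1.5),
                          tolerance=1e-2, max_sweeps=500)
```

Each single-joint update has a closed-form solution and never moves the end effector farther from the target, which is what makes the method fast and analytically simple.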

Robotic Representation
• Atoms ≡ joints
• Bonds ≡ links
• Apply robotic techniques