Protein Tertiary Structure Prediction

Protein Tertiary Structure Prediction. Protein Structure Prediction & Alignment Protein structure Secondary structure Tertiary structure Structure prediction

Page 1

Protein Tertiary Structure Prediction

Page 2

Protein Structure Prediction & Alignment
  • Protein structure
      • Secondary structure
      • Tertiary structure
  • Structure prediction
      • Secondary structure
      • 3D structure: ab initio, comparative modeling, threading
  • Structure alignment
      • 3D structure alignment
      • Protein docking

Page 3

Predicting Protein 3D Structure

Goal: Find the best fit of a sequence to a 3D structure
  • Ab initio methods
      • Attempt to calculate 3D structure "from scratch"
      • Lattice models, off-lattice models, energy minimization, molecular dynamics
  • Comparative (homology) modeling
      • Construct 3D model from alignment to protein sequences with known structure
  • Threading (fold recognition / reverse folding)
      • Pick best fit to sequences of known 2D/3D structures (folds)

Page 4

How do proteins interact?
  • It is believed that hydrophobic collapse is a key driving force for protein folding
      • Hydrophobic core!
      • Analog: water and oil separation
  • Model: A chain of twenty kinds of beads

Page 5

"Elementary school kid model"
  • Different assemblies (shapes)
  • Frustrated system
  • Lots of local minima

Jose Onuchic, UCSD

Page 6

Classes of Amino Acids

Page 7

Cubic lattice model

Page 8

Hydrophobic packing models

Dill's HP model
  • Two classes of amino acids, hydrophobic (H) and polar (P)
  • Lattice model for position of amino acids
  • Thread chain of H's and P's through lattice to maximize number of H-H contacts

[Figure panels: 2D, 3D]

Page 9

Hydrophobic Zipper

Page 10

Most Designable Structures

Page 11

All the chains here are 21 beads long. The upper panel shows some of the 107 exceptionally stable foldings of 80 sequences that maximize the number of H-H contacts. In the lower panel are a few of the other 117,676,504,514,560 combinations of sequences and foldings, selected at random. (Brian Hayes, American Scientist, 1998)

Page 12

HP Lattice Model

Simplifications in the model:
  • All amino acids are classified as hydrophobic (H) or polar (P). A protein is represented as a string of H's and P's, e.g. HHHHHPPPHHHPP.
  • Space is discretized. Each amino acid is embedded in a single lattice point. A protein fold corresponds to a self-avoiding walk over the lattice.
  • The energy function is defined as

E = -(# of H-H contacts, not including covalent interactions).
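The energy function above can be sketched in a few lines. This is a minimal illustration that assumes a 2D square lattice and the sign convention E = -(number of non-bonded H-H contacts); the function name `hp_energy` and the four-residue `square` fold are hypothetical and do not come from the slides.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def hp_energy(sequence: str, fold: List[Coord]) -> int:
    """Return E = -(number of non-bonded H-H contacts) on a 2D square lattice."""
    if len(sequence) != len(fold):
        raise ValueError("sequence and fold must have the same length")
    if len(set(fold)) != len(fold):
        raise ValueError("fold is not a self-avoiding walk")
    for a, b in zip(fold, fold[1:]):
        if abs(a[0] - b[0]) + abs(a[1] - b[1]) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    contacts = 0
    for i in range(len(fold)):
        # Start at i + 2 so that covalently bonded neighbours are skipped.
        for j in range(i + 2, len(fold)):
            if sequence[i] == "H" and sequence[j] == "H":
                dist = abs(fold[i][0] - fold[j][0]) + abs(fold[i][1] - fold[j][1])
                if dist == 1:
                    contacts += 1
    return -contacts


# Hypothetical fold: four residues around a unit square, so residues 0 and 3 touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy("HPPH", square))  # -1
print(hp_energy("HPPP", square))  # 0
```

The double loop is quadratic in chain length, which is fine for the short chains used in lattice exercises.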

Page 13

Example of HP lattice model

[Figure legend: hydrophobic amino acid, polar amino acid, peptide bond, H-H contacts]

E = -(number of H-H contacts, except for peptide bonds) = -7

Page 14

HP Lattice Model
  • Other lattices
      • 2D triangular lattice, 3D diamond lattice
  • Other energy functions
      • HP=0, HH=-1, PP=1
  • Lattice models can be used to
      • Study qualitative features of protein folding
      • Reduce search space in structure prediction methods
      • Study potential effectiveness of the methods for structure prediction (inverse folding problem)
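The alternative energy function on this slide (HP=0, HH=-1, PP=1) can be read as a lookup table over contact types. The sketch below assumes a 2D square lattice and sums the table entry for every pair of non-bonded residues on adjacent lattice points; `PAIR_ENERGY`, `contact_energy` and the `square` fold are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]

# Contact energies quoted on the slide: HH = -1, HP = 0, PP = 1.
PAIR_ENERGY: Dict[str, int] = {"HH": -1, "HP": 0, "PP": 1}


def contact_energy(sequence: str, fold: List[Coord],
                   table: Dict[str, int] = PAIR_ENERGY) -> int:
    """Sum table energies over all non-bonded lattice contacts (2D square lattice)."""
    total = 0
    for i in range(len(fold)):
        for j in range(i + 2, len(fold)):  # skip bonded neighbours
            dist = abs(fold[i][0] - fold[j][0]) + abs(fold[i][1] - fold[j][1])
            if dist == 1:
                # Sort the two letters so that "PH" and "HP" share one table entry.
                key = "".join(sorted(sequence[i] + sequence[j]))
                total += table[key]
    return total


# Hypothetical fold: residues 0 and 3 are the only non-bonded contact.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(contact_energy("HPPH", square))  # -1 (one H-H contact)
print(contact_energy("PHHP", square))  # 1  (one P-P contact)
print(contact_energy("HPPP", square))  # 0  (one H-P contact)
```

Passing a different `table` lets the same routine score other two-letter energy schemes.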

Page 15

Inverse Folding Problem

Example: Can we find all protein sequences in GenBank with the globin fold?

Claim: There exist two native sequences S_i, S_j such that

E(S(S_i), S_i) > E(S(S_i), S_j)

where S(S_i) and S(S_j) are the native structures of S_i and S_j,

i.e. the sequence S_j "scores" better (has lower energy) on S_i's native structure than S_i itself.

NO.

Page 16

Exercise
  • Find native structures of S_1 and S_2
      S_1 = HHPPPPHPPPH
      S_2 = HHPHPPHPHPH
  • Thread S_2 on to the structure of S_1 and find the energy associated with that fold

Page 17

Exercise
  • Find native structures of S_1 and S_2
      S_1 = HHPPPPHPPPH
      S_2 = HHPHPPHPHPH
  • Thread S_2 on to the structure of S_1 and find the energy associated with that fold

E(S(S_1), S_1) = -2; E(S(S_1), S_2) = -3; E(S(S_2), S_2) = -4.

[Figure: lattice folds for the three energies above, with H and P residues labeled]
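The exercise can be checked by brute force. The sketch below assumes a 2D square lattice, enumerates every self-avoiding walk of the right length, and keeps one lowest-energy fold; `best_fold` and `hp_energy` are hypothetical helper names. Native structures are generally not unique in this model, so the threading energy of S_2 on S_1's structure depends on which optimal fold of S_1 is used, and the value -3 quoted on the slide refers to the particular fold drawn there.

```python
from typing import List, Tuple

Coord = Tuple[int, int]
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def hp_energy(sequence: str, fold: List[Coord]) -> int:
    """E = -(number of non-bonded H-H contacts) on a 2D square lattice."""
    index_at = {xy: i for i, xy in enumerate(fold)}
    contacts = 0
    for i, (x, y) in enumerate(fold):
        if sequence[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 counts each pair once and skips bonded neighbours.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return -contacts


def best_fold(sequence: str) -> Tuple[int, List[Coord]]:
    """Exhaustively search self-avoiding walks (sequence length >= 2 assumed)."""
    best_energy, best = 1, []

    def extend(fold: List[Coord], occupied: set) -> None:
        nonlocal best_energy, best
        if len(fold) == len(sequence):
            energy = hp_energy(sequence, fold)
            if energy < best_energy:
                best_energy, best = energy, list(fold)
            return
        x, y = fold[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                fold.append(nxt)
                occupied.add(nxt)
                extend(fold, occupied)
                fold.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]  # fix the first bond to remove rotational copies
    extend(start, set(start))
    return best_energy, best


S1 = "HHPPPPHPPPH"
S2 = "HHPHPPHPHPH"
e1, fold1 = best_fold(S1)
e2, fold2 = best_fold(S2)
print(e1, e2)                # -2 -4
print(hp_energy(S2, fold1))  # S2 threaded onto one optimal fold of S1
```

Because every H position of S_1 is also an H position of S_2, S_2 threaded onto any fold of S_1 can never score worse than S_1 does on that fold. Exhaustive enumeration is only practical for short chains like these.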

Page 18

Summary
  • Approach
      • Reduce computation by limiting degrees of freedom
      • Limit α-carbon (Cα) atoms to positions on 2D or 3D lattice
      • Protein sequence → represented as path through lattice points
  • H-P (hydrophobic-polar) cost model
      • Each residue → hydrophobic (H) or hydrophilic (P)
      • Score position of sequence → maximize H-H contacts
  • Problem
      • Still NP-hard
      • Greatly simplified problem
      • Emphasis on forming hydrophobic core
      • Need more accurate cost models

Page 19

Off-Lattice Models
  • Approach
      • Compromise between lattice model and molecular dynamics
      • Backbone placement → allowed by Ramachandran plot
      • Represent as phi & psi angles of α-carbon atoms
  • Degree of precision
      • α-carbon only
      • All backbone atoms
      • All backbone atoms + side chains (residues)
      • Common conformation (positions) of side chain = rotamer
  • Problem
      • Still simplified problem
      • Increased computation cost

Page 20

Molecular Dynamics
  • Goal
      • Provides a way to observe the motion of large molecules such as proteins at the atomic level (dynamic simulation)
  • Approach
      • Model all interatomic forces acting on atoms in protein
      • Potential energy function (Newtonian mechanics)
      • Perform numerical simulations to predict fold
  • Repeat for each atom at each time step
      • Calculate & add up all (pairwise) forces: bonded, and non-bonded (electrostatic and van der Waals)
      • Apply force, move atom to new position (Newton's 2nd law, F = ma)
  • Obtain trajectories of motion of molecule

Page 21

MD

Problem with MD
  • Smaller time step → more accurate simulation
  • Modeling folding is computationally intensive
  • Current models require tiny (10^-15 second) time steps
  • Simulations reported for at most 10^-6 seconds
  • Folding requires 1 second or more
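The gap between these time scales is easier to see as a step count. Taking only the figures quoted above, with a fixed time step Δt and simulated time T (the symbols N, T and Δt are introduced here for illustration):

```latex
N = \frac{T}{\Delta t}, \qquad
N_{\text{reported}} = \frac{10^{-6}\,\text{s}}{10^{-15}\,\text{s}} = 10^{9}, \qquad
N_{\text{folding}} = \frac{1\,\text{s}}{10^{-15}\,\text{s}} = 10^{15}
```

Under these assumptions, a full folding event needs about a million times more integration steps than the longest simulation mentioned.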

Demo (12 nanosecond MD simulation)

Page 22

Types of Inter-atomic Forces

Page 23

Molecular Dynamics

Page 24

Potential Energy Components

(1) bond length
Bonds behave like springs, with an equilibrium bond length that depends on bond type. Increasing or decreasing the length from equilibrium requires higher energy.
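A spring-like bond is usually written as a harmonic term. The sketch below assumes the form E = ½·k·(r − r0)²; some force fields fold the factor ½ into the constant. The parameter values are illustrative placeholders, not taken from the slides or from any specific force field.

```python
def bond_energy(r: float, r0: float, k: float) -> float:
    """Harmonic bond-stretch energy, E = 0.5 * k * (r - r0)**2."""
    return 0.5 * k * (r - r0) ** 2


def bond_force(r: float, r0: float, k: float) -> float:
    """Force along the bond, F = -dE/dr; it always points back toward r0."""
    return -k * (r - r0)


# Illustrative (assumed) parameters: lengths in Angstrom, k in kcal/(mol*A^2).
R0, K = 1.53, 600.0
for r in (1.43, 1.53, 1.63):
    print(r, bond_energy(r, R0, K), bond_force(r, R0, K))
```

The energy is zero at the equilibrium length and rises symmetrically for stretching and compression, matching the description on the slide.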

Page 25

Potential Energy

(2) bond angle
Bond angles have an equilibrium value, e.g. 108° for H-C-H, and behave as if sprung. Increasing or decreasing the angle requires higher energy.

Page 26

Potential Energy

(3) torsion angle
Rotation can occur about the single bond B-C in A-B-C-D, but the energy depends on the torsion angle (the angle between CD and AB viewed along BC). Staggered conformations (angle +60°, -60° or 180°) are preferred.
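The torsion angle and a simple periodic torsion energy can be sketched as follows. The angle uses the standard atan2 formula on the three bond vectors; the energy assumes a threefold cosine term E = (V/2)·(1 + cos 3φ), a common textbook choice whose minima fall at the staggered angles. The function names, the barrier height `v` and the test coordinates are illustrative assumptions.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(a: Vec, b: Vec, c: Vec, d: Vec) -> float:
    """Torsion angle A-B-C-D in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def torsion_energy(phi_deg: float, v: float = 3.0) -> float:
    """Assumed threefold term: zero at +60, -60, 180; maximal at 0, +120, -120."""
    return 0.5 * v * (1.0 + math.cos(3.0 * math.radians(phi_deg)))


# A, B, C fixed; D placed to give eclipsed (cis) and anti (trans) geometries.
A, B, C = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
print(torsion_angle(A, B, C, (1.0, 1.0, 0.0)))   # cis: about 0 degrees
print(torsion_angle(A, B, C, (-1.0, 1.0, 0.0)))  # trans: magnitude about 180 degrees
print(torsion_energy(60.0), torsion_energy(0.0))
```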

Page 27

Potential Energy

(4) van der Waals interactions
Interactions between atoms that are not near neighbours are expressed by the Lennard-Jones potential. There is a very high repulsive force if atoms are closer than the sum of their van der Waals radii, and an attractive force if the distance is greater. Because of the strong distance dependence, van der Waals interactions become negligible at distances over 15 Å.
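The Lennard-Jones potential is commonly written as E(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The sketch below uses that 12-6 form with illustrative, roughly argon-like parameters; these values are assumptions and not parameters from the slides.

```python
def lennard_jones(r: float, epsilon: float, sigma: float) -> float:
    """12-6 Lennard-Jones energy: steep repulsion at short range, weak attraction beyond."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Illustrative (assumed) parameters: epsilon in kcal/mol, sigma in Angstrom.
EPSILON, SIGMA = 0.24, 3.4
R_MIN = 2.0 ** (1.0 / 6.0) * SIGMA  # separation at which the energy is lowest

for r in (3.0, SIGMA, R_MIN, 6.0, 15.0):
    print(round(r, 3), lennard_jones(r, EPSILON, SIGMA))
```

With these numbers the energy at 15 Å is well under a thousandth of the well depth, which illustrates why a cutoff near that distance is a reasonable approximation.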

Page 28

Potential Energy

(5) Electrostatic interactions
All atoms have a partial charge, e.g. in C=O the C atom has a partial positive charge and the O atom a partial negative charge. Two atoms with like charges repel one another; those with unlike charges attract.

Electrostatic energy falls off much less quickly than van der Waals energy and may not be negligible even at 30 Å.
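The slower decay can be made concrete with Coulomb's law, E = k·q1·q2 / (ε_r·r). The sketch below assumes charges in units of the electron charge, distances in Å, and the usual conversion constant of about 332.06 kcal·Å/(mol·e²); the partial charges of ±0.5 are hypothetical values chosen only for illustration.

```python
COULOMB_CONSTANT = 332.0637  # kcal*Angstrom/(mol*e^2), standard unit conversion


def coulomb_energy(q1: float, q2: float, r: float, dielectric: float = 1.0) -> float:
    """Coulomb energy in kcal/mol; negative for unlike charges, positive for like charges."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


# Hypothetical partial charges on a C=O-like pair.
Q_C, Q_O = 0.5, -0.5

near = coulomb_energy(Q_C, Q_O, 3.0)
far = coulomb_energy(Q_C, Q_O, 30.0)
print(near, far, far / near)

# For comparison, the attractive r^-6 van der Waals term shrinks by (3/30)^6.
print((3.0 / 30.0) ** 6)
```

Going from 3 Å to 30 Å reduces the Coulomb energy by a factor of 10, while an r^-6 term shrinks by a factor of a million.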

Page 29

Potential Energy

Potential energy is given by the sum of contributions (1) to (5) above.

Hydrogen bonds are usually supposed to arise from electrostatic interactions, but occasionally a small extra term is added.
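One commonly used way to write this sum is given below. It is a generic textbook form, assumed here for illustration and not necessarily the exact expression shown on the original slide; the symbols (force constants k_b and k_θ, barrier V_n, periodicity n, phase γ, Lennard-Jones ε_ij and σ_ij, partial charges q_i) are introduced for this sketch.

```latex
E_{\text{total}} =
    \sum_{\text{bonds}} \tfrac{1}{2} k_b (r - r_0)^2
  + \sum_{\text{angles}} \tfrac{1}{2} k_\theta (\theta - \theta_0)^2
  + \sum_{\text{torsions}} \tfrac{V_n}{2} \left[ 1 + \cos(n\phi - \gamma) \right]
  + \sum_{i<j} 4\varepsilon_{ij} \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12}
        - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right]
  + \sum_{i<j} \frac{q_i q_j}{4 \pi \varepsilon_0 r_{ij}}
```

The first three sums run over bonded terms (1) to (3); the last two run over non-bonded atom pairs and correspond to terms (4) and (5).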

Page 30

Force fields
  • A force field is the description of how potential energy depends on parameters
  • Several force fields are available
      • AMBER, used for proteins and nucleic acids (UCSF)
      • CHARMM (Harvard)
      • …
  • Force fields differ:
      • in the precise form of the equations
      • in values of the constants for each atom type

Page 31

Obtain Trajectory
  • Start with an initial structure (e.g. a structure from PDB)
  • Assign random starting velocities to the atoms
  • Calculate the forces acting on each atom
      • Bonds, non-bonded (electrostatic and van der Waals)
  • Numerically integrate Newton's equations of motion
      • Verlet method
      • Leapfrog method
  • After equilibrating the system, record the positions and momenta of the atoms as a function of time
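The integration step can be sketched on a toy system. The code below uses velocity Verlet, a widely used variant of the Verlet family named on the slide, applied to a single particle on a one-dimensional harmonic spring. The spring force stands in for a real force field, and all names, units and parameter values are illustrative assumptions.

```python
import math
from typing import List, Tuple


def force(x: float, k: float = 1.0) -> float:
    """Toy force: a harmonic spring pulling the particle toward x = 0."""
    return -k * x


def velocity_verlet(x: float, v: float, dt: float, steps: int,
                    mass: float = 1.0, k: float = 1.0) -> List[Tuple[float, float]]:
    """Integrate Newton's equation F = m*a and return the (position, velocity) trajectory."""
    trajectory = [(x, v)]
    f = force(x, k)
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x, k)                    # recompute force at the new position
        v = v_half + 0.5 * dt * f / mass   # finish the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x: float, v: float, mass: float = 1.0, k: float = 1.0) -> float:
    return 0.5 * mass * v * v + 0.5 * k * x * x


traj = velocity_verlet(x=1.0, v=0.0, dt=0.01, steps=1000)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
print(abs(e_end - e_start) / e_start)   # relative energy drift, should be small
print(traj[-1][0], math.cos(10.0))      # numerical vs exact position at t = 10
```

Near-conservation of total energy over many steps is the main reason this family of integrators is preferred in MD over a plain Euler update.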

Page 32

Molecular Dynamics
  • Energy minimization gives a local minimum, not necessarily the global minimum.
  • Give the molecule thermal energy so it can explore conformational space and overcome energy barriers.
  • Give atoms an initial velocity with random value and direction. Scale velocities so total kinetic energy = 3/2 kT × number of atoms.
  • Solve the equations of motion to work out the positions of the atoms at 1 fs.
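The velocity-scaling step can be sketched as follows. The code draws random velocity components, then rescales all of them by one factor so that the total kinetic energy equals (3/2)·N·kT. It assumes an internally consistent unit system (energies in kcal/mol); `init_velocities`, the carbon-like masses and the 300 K temperature are illustrative choices, not values from the slides.

```python
import math
import random
from typing import List

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def kinetic_energy(masses: List[float], velocities: List[List[float]]) -> float:
    return sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, velocities))


def init_velocities(masses: List[float], temperature: float,
                    seed: int = 0) -> List[List[float]]:
    """Random velocities rescaled so that total KE = 1.5 * N * k_B * T."""
    rng = random.Random(seed)
    # Random magnitude and direction: independent Gaussian components per atom.
    velocities = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in masses]
    target = 1.5 * len(masses) * K_B * temperature
    scale = math.sqrt(target / kinetic_energy(masses, velocities))
    return [[c * scale for c in v] for v in velocities]


masses = [12.0] * 100   # hypothetical system of 100 carbon-like atoms
vels = init_velocities(masses, temperature=300.0)
target_ke = 1.5 * len(masses) * K_B * 300.0
print(kinetic_energy(masses, vels), target_ke)
```

Because kinetic energy is quadratic in velocity, scaling every component by the square root of the energy ratio hits the target exactly.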