Protein Structure prediction Alejandro Giorgetti

Protein Structure prediciton - Bioinformaticsmolsim.sci.univr.it/2013_bioinfo2/Protein_Structure_prediction.pdf · Protein Structure • The simplest arrangements of aa is the alpha-helix,


Protein Structure prediction

Alejandro Giorgetti

Sequence, function and structure relationships

• Life is the ability to metabolize nutrients, respond to external stimuli, grow, reproduce and evolve.

• Chemically, proteins are linear polymers of amino acids (aa).

• Proteins assume a 3D shape, which is usually responsible for their function.

• The tight link between structure, function and evolutionary pressure distinguishes proteins from ordinary polymers.

Protein Structure

• The sequence of amino acids is called the primary structure.

• Secondary structure refers to local folding.

• Tertiary structure is the arrangement of secondary structure elements in 3D.

• Quaternary structure describes the arrangement of a protein's subunits.

• The peptide bond is planar, and the dihedral angle it defines (omega) is almost always 180°.

Protein Structure

• What is a dihedral angle?

– It is the angle between two planes. In practice, if you have four connected atoms (A, B, C, D) and you want to measure the dihedral angle around the central bond (B–C), you orient the system so that the two central atoms are superimposed and measure the resulting angle between the first and last atoms.

[Figure: atoms A, B, C and D illustrating the dihedral angle around the central B–C bond.]
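The definition above can be turned into a short calculation. The sketch below is an illustration, not part of the original slides: the function name `dihedral` and the example coordinates are assumptions. It computes the angle between the plane through A, B, C and the plane through B, C, D, with a sign obtained by looking along the B–C bond.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(a, b, c, d):
    """Signed dihedral angle (degrees) of atoms A-B-C-D around the B-C bond."""
    ab, bc, cd = _sub(b, a), _sub(c, b), _sub(d, c)
    n1 = _cross(ab, bc)  # normal to the plane through A, B, C
    n2 = _cross(bc, cd)  # normal to the plane through B, C, D
    bc_len = math.sqrt(_dot(bc, bc))
    x = _dot(n1, n2)                       # proportional to cos(angle)
    y = _dot(_cross(n1, n2), bc) / bc_len  # proportional to sin(angle)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: A and D on opposite sides (trans) or the same side (cis).
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
print(trans, cis)
```

A planar trans arrangement, like the one usually found for the peptide bond, gives an angle of magnitude 180°, and a cis arrangement gives 0°.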

Protein Structure

• The simplest arrangement of amino acids is the alpha-helix, a right-handed spiral conformation. The structure repeats itself every 5.4 Å along the helix axis, with 3.6 residues per turn.

• The beta sheet: the R groups of neighboring residues in a strand point in opposite directions. Sheets can be parallel or antiparallel.

• Ramachandran plot: the pairs of angles (phi, psi) that do not cause the atoms of a dipeptide to collide.
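The two helix parameters quoted above fix the geometry per residue. The short sketch below works out the arithmetic; the constant and function names are illustrative assumptions.

```python
PITCH_ANGSTROM = 5.4      # repeat distance along the helix axis (from the slide)
RESIDUES_PER_TURN = 3.6   # residues per turn (from the slide)

# Each residue advances the helix by pitch / residues-per-turn along the axis
# and rotates it by 360 degrees / residues-per-turn around the axis.
rise_per_residue = PITCH_ANGSTROM / RESIDUES_PER_TURN   # in angstrom
rotation_per_residue = 360.0 / RESIDUES_PER_TURN        # in degrees

def helix_length(n_residues):
    """Approximate axial length (angstrom) of an ideal alpha-helix."""
    return n_residues * rise_per_residue

print(rise_per_residue, rotation_per_residue, helix_length(20))
```

With these values each residue rises about 1.5 Å and turns about 100°, so a 20-residue helix is roughly 30 Å long.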

• In experimental structures we can observe amino acids in disallowed regions.

These combinations are rarely observed because they are energetically disfavoured, but they are not mathematically impossible.

The energetic cost can be compensated by other interactions within the protein.

• Loops: regions without repetitive structure that connect secondary structure elements.

• Supersecondary elements (sometimes called motifs): arrangements of two or three consecutive secondary structure elements that are present in many different protein structures, even with completely different sequences. Examples: alpha-alpha unit, beta-beta unit, beta-alpha-beta unit.

• Larger arrangements: four-helix bundle; beta-alpha-beta-alpha-beta (Rossmann fold); TIM barrel fold (several beta-alpha-beta units).

• Domain: a portion of the polypeptide chain that folds into a compact, semi-independent unit.

Domains (CATH hierarchy)

• Class (C)

– derived from secondary structure content; assigned automatically

• Architecture (A)

– describes the gross orientation of secondary structures, independent of connectivity

• Topology (T)

– clusters structures according to their topological connections and numbers of secondary structures

• Homologous superfamily (H)

Ala: transient interactions.

Thr, Ser: phosphorylation targets; protein kinases attach a phosphate group to the side chain. Thr is beta-branched and more often found in beta-sheets.

Gly: unusual Ramachandran distribution, often found in turns.

Cys: very reactive; coordinates metals.

The folding problem

• Second law of thermodynamics, in its Gibbs free energy form: ∆G = ∆H – T∆S, which gives the stability of a conformation.

• Enthalpy: electrostatics, dispersion, van der Waals, hydrogen bonds.

• Entropy: water forms 'ordered cages' around hydrophobic amino acids. Folding breaks this order.

• The free energy of a protein in the folded state is only a few kcal/mol lower than in the unfolded state (equivalent to a few hydrogen bonds).

• Anfinsen: all the 3D information is contained in the sequence (urea experiment).

• Levinthal paradox: a polypeptide chain has an astronomically large number of possible conformations, far too many to be explored by random search.

• Each protein has a specific folding pathway: funnel theory.
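The Levinthal paradox can be made concrete with a back-of-the-envelope estimate. The numbers below (3 conformations per residue, a 100-residue chain, 10^13 conformations sampled per second) are illustrative assumptions commonly used for this argument, not values from the slides.

```python
CONFORMATIONS_PER_RESIDUE = 3   # assumed, illustrative
N_RESIDUES = 100                # assumed, illustrative
SAMPLES_PER_SECOND = 1e13       # assumed, illustrative
SECONDS_PER_YEAR = 3600 * 24 * 365

# Number of conformations if every residue is independent of the others.
n_conformations = CONFORMATIONS_PER_RESIDUE ** N_RESIDUES

# Time needed to visit every conformation once by exhaustive search.
search_time_years = n_conformations / SAMPLES_PER_SECOND / SECONDS_PER_YEAR

print(f"{n_conformations:.2e} conformations, about {search_time_years:.1e} years")
```

Even under these modest assumptions an exhaustive search would take many orders of magnitude longer than the age of the universe, which is why folding must follow a guided pathway (the funnel picture) rather than a random search.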


Hydrophobic effect

The folding problem

Structural alignment

Evolution of protein structure

• If a base-substitution event occurs in a protein-coding region, there are two possible outcomes:

– The fine balance between the gain and loss of free energy of folding is compromised: there is no single energy minimum, so the protein does NOT FOLD.

– The energy landscape of the protein changes, but there is still a global energy minimum, so the function is the same or similar. The perturbations are local and do not affect the general shape or topology.

Evolution of sequence vs evolution of structure

[Figure: sequence identity scale (10 %, 30 %, 50 %, 70 %, 90 %) with the labels "Drug design?", "Biochemistry?", "Molecular Biology?" and "X-ray crystallography, NMR".]

[ Chothia & Lesk (1986) ]

Evolutionary-based methods for protein structure prediction: homology modelling

Idea: proteins evolving from a common ancestor have maintained similar core 3D structures.

Proteins of known structure are used as 'templates' to model a sequence for which no 3D structural information is available.

Target and template must be evolutionarily related.

First time: 1970, by Tom Blundell

Comparative Modeling

Known structures (templates) + target sequence → Template(s) selection → Sequence alignment → Structure modeling → Structure evaluation → Final structural models

Target sequence:

>hTEII
MSSPQAPEDGQGCGDRGDPPGDLRSVLVTTVLNLEPLDEDLFRGRHYWVPAKRLFGGQIVGQALVAAAKSVSEDVHVHSLHCYFVRAGDPKLP

Comparative Modeling: known structures (templates)

Protein Data Bank (PDB): http://www.pdb.org

Template database.

Split the entries into single chains.

Check the quality of the structures.

Comparative Modeling: template(s) selection

Sequence similarity / fold recognition.

Analysis of the structure (resolution, experimental method).

Are there other atoms and/or compounds? Are they bound?

Comparative Modeling: sequence alignment

Fundamental for homology modelling.

Global alignment.

A small error in the alignment can be fatal for the model.

Remember: pairwise alignments whisper, multiple alignments speak out loud.

Do we know anything else? Are there experiments?

Comparative Modeling: structure modeling

Fragment assembly (template-based fragment assembly: SwissModel).

Minimization of the deviation from spatial restraints (satisfaction of spatial restraints: MODELLER).

Comparative Modeling: structure evaluation

Errors in template selection.

Iterative cycles of alignment, modelling and evaluation.

Hidden Markov Models (HMM)

• Representation of multiple alignments through 'transition' probabilities.

• Example: we can use an alignment to compute, at each position, the probability that it is followed by an insertion, a deletion or a 'match'.

• An HMM represents an alignment in probabilistic terms and can be used to estimate whether a sequence belongs to a family.

Example alignment (gaps shown as "-"):

Seq1: A C C - E
Seq2: E C E - A
Seq3: A C E A A
Seq4: C - E - E

Counts (observed + 1 pseudocount)

Transition      Start-1   1-2     2-3     3-4     4-5     5-End
Match-match     4 + 1     3 + 1   3 + 1   1 + 1   1 + 1   4 + 1
Match-del       0 + 1     1 + 1   0 + 1   3 + 1   0 + 1   0 + 1
Ins-del         0 + 1     0 + 1   0 + 1   0 + 1   0 + 1   0 + 1
Del-match       0 + 1     0 + 1   1 + 1   0 + 1   3 + 1   0 + 1

Frequencies

Transition      Start-1   1-2     2-3     3-4     4-5     5-End
Match-match     0.45      0.36    0.36    0.18    0.18    0.45
Match-del       0.09      0.18    0.09    0.36    0.09    0.09
Ins-del         0.09      0.09    0.09    0.09    0.09    0.09
Del-match       0.09      0.09    0.18    0.09    0.36    0.09

There are 7 transition types (the four above plus match-ins, ins-match and ins-ins). We add one count for each, so the total count per column is 11 (4 sequences + 7).

[Figure: HMM diagram with Start, match, insert and delete states, labelled with the transition frequencies above.]

Emission probabilities of the match states, in the order shown on the slide:

A 0.43, C 0.29, E 0.29
A 0.17, C 0.67, E 0.17
A 0.14, C 0.28, E 0.57
A 0.43, C 0.29, E 0.29
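The counting scheme of this slide can be reproduced in a few lines. This is a minimal sketch under stated assumptions: every alignment column is treated as a match position (as the slide does), Start and End are treated as match states, only match-match, match-del and del-match transitions are counted from the data, and one pseudocount is added for each of the 7 transition types. The function and variable names are hypothetical.

```python
ALIGNMENT = ["ACC-E", "ECE-A", "ACEAA", "C-E-E"]  # alignment from the slide

def transition_frequencies(alignment, n_types=7, pseudocount=1):
    """Return, for each step Start-1, 1-2, ..., last-End, a dict of frequencies."""
    n_cols = len(alignment[0])
    # "^" and "$" mark Start and End; both behave like match states.
    padded = ["^" + row + "$" for row in alignment]
    total = len(alignment) + pseudocount * n_types  # 4 + 7 = 11 on the slide
    table = []
    for i in range(n_cols + 1):
        counts = {"match-match": 0, "match-del": 0, "del-match": 0}
        for row in padded:
            from_gap, to_gap = row[i] == "-", row[i + 1] == "-"
            if not from_gap and not to_gap:
                counts["match-match"] += 1
            elif not from_gap and to_gap:
                counts["match-del"] += 1
            elif from_gap and not to_gap:
                counts["del-match"] += 1
            # gap-to-gap steps do not occur in this alignment and are ignored
        table.append({k: (v + pseudocount) / total for k, v in counts.items()})
    return table

table = transition_frequencies(ALIGNMENT)
for label, row in zip(["Start-1", "1-2", "2-3", "3-4", "4-5", "5-End"], table):
    print(label, {k: round(v, 2) for k, v in row.items()})
```

The pseudocounts keep every transition probability above zero, so a sequence using a transition never seen in the training alignment is penalized but not ruled out.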

• Model of hTEII (O14734) using:

• Sp3: http://theory.med.buffalo.edu/

• Ffas03: http://ffas.ljcrf.edu/ffas-cgi/cgi/ffas.pl?ses=

• HHpred (Toolkit): http://protevo.eb.tuebingen.mpg.de/toolkit/index.php?view=hhpred

• mgenThreader: http://bioinf.cs.ucl.ac.uk/psipred/

• Questions: Analysing the alignments, are there important differences? What can we say about the templates? Structural analysis: can we say a priori which model is the best? Which region is the most reliable?

I. Template-based fragment assembly (SwissModel)

[ http://www.expasy.org/spdbv/ ]

I. Template-based fragment assembly

a) Build the conserved core (structurally conserved regions, SCRs)

[ http://www.expasy.org/spdbv/ ]

These correspond to the most rigid regions, with high sequence conservation and fewer gaps. In general: secondary structure elements.

I. Template-based fragment assembly

b) Modelling of loops (structurally variable regions, SVRs) and of missing backbone regions

The most flexible regions.

High probability of finding gaps.

Correspond to loops and turns.

Loop databases.

"Ab initio" reconstruction of loops (Monte Carlo, molecular dynamics, genetic algorithms, etc.).

[ http://www.expasy.org/spdbv/ ]

I. Template-based fragment assembly

c) Modelling of side chains

Find the most probable conformation for the side chains using:

Homologous structures.

Rotamer libraries.

Energy minimization algorithms.

[ http://www.expasy.org/spdbv/ ]

I. Template-based fragment assembly

d) Energy minimization

$$V = \sum_{\text{bonds}} k_b (x - x_0)^2 + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} V_n \left[1 + \cos(n\varphi - \gamma)\right] + \sum_{i<j} \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} + \sum_{i<j} 4\varepsilon \left[\left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^{6}\right]$$

The modelling process will produce close contacts between atoms and unfavourable bond lengths.

⇒ The aim is to obtain correct geometries.

Too extensive an energy minimization can move us away from the 'true' structure.

SwissModel uses the GROMOS 96 force field.

Interactions and energies

• When protein conformational energy is discussed, people talk almost interchangeably about forces and energies: the force is the negative derivative (gradient) of the energy with respect to position.

• The Schrödinger equation describes the behavior of a molecule, but it is impossible to solve exactly for complex systems.

• We need a function that approximately describes the energy of the interactions that occur in a protein, using a simplified representation of both the system and the energetic contribution of each interaction in the protein: covalent and non-bonded.

Covalent interactions

• A covalent bond is formed when atoms share electrons, but the effect is not localized: the increase in electron density affects the rest of the molecule.

• Approximation: treat the bond as a spring between two atoms. The energy is described by Hooke's law.

• The use of this equation is justified by the observation that bonds between chemically similar atoms have similar lengths: we assume that the observed equilibrium value is the minimum of the potential energy.

• The same approximation is used for the energy variation of bond angles.

• Dihedral angles do not have a single energy minimum, so a periodic (cosine) term is used. In practice this potential is not enough to represent the energy of a dihedral angle, and it is often combined with a non-bonded interaction term between the first and last atoms of the quadruplet.

$$V = \sum_{\text{bonds}} k_b (x - x_0)^2 + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} V_n \left[1 + \cos(n\varphi - \gamma)\right] + \sum_{i<j} \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} + \sum_{i<j} 4\varepsilon \left[\left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^{6}\right]$$
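The spring approximation for a bond, and the link between energy and force, can be sketched as follows. The bond term is written as k_b (x - x_0)^2, and the force is its negative derivative. The parameter values (equilibrium length 1.53 Å, force constant 300 kcal/mol/Å²) are illustrative assumptions, not values taken from a specific force field.

```python
def bond_energy(x, x0, k_b):
    """Harmonic (Hooke's law) bond energy: k_b * (x - x0)^2."""
    return k_b * (x - x0) ** 2

def bond_force(x, x0, k_b):
    """Analytical force: the negative derivative of the energy with respect to x."""
    return -2.0 * k_b * (x - x0)

def numerical_force(x, x0, k_b, h=1e-6):
    """Central finite-difference estimate of -dE/dx, used as a cross-check."""
    return -(bond_energy(x + h, x0, k_b) - bond_energy(x - h, x0, k_b)) / (2 * h)

X0, K_B = 1.53, 300.0   # hypothetical parameters, for illustration only
stretched = 1.60        # a bond stretched beyond its equilibrium length

print(bond_energy(stretched, X0, K_B))
print(bond_force(stretched, X0, K_B), numerical_force(stretched, X0, K_B))
```

The force is negative for a stretched bond, meaning that it pulls the atoms back toward the equilibrium length, and it vanishes at the energy minimum.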

Electrostatic interactions

• A nucleus and its electrons interact according to Coulomb's law.

• We assign a formal charge to all the atoms. Types of interactions: salt bridges between charged groups; groups that carry no formal charge can also be polarized (electronegative atoms attract electrons while others lose them). In a water molecule, the electronegative oxygen attracts the electrons and leaves the H atoms with a net positive charge, so two waters can form a strong electrostatic interaction: the hydrogen bond. Hydrogen bonds are fundamental in protein structure.

• Partial charges are computed by quantum mechanical calculations on model systems.

• The dielectric constant is a macroscopic quantity derived from the average microscopic effect of polarization.

• If we place two charges in a polar medium, the molecules of the medium will tend to line up with the electric field. Their dipoles oppose the electric field, reducing its strength.

$$V = \sum_{\text{bonds}} k_b (x - x_0)^2 + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} V_n \left[1 + \cos(n\varphi - \gamma)\right] + \sum_{i<j} \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} + \sum_{i<j} 4\varepsilon \left[\left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^{6}\right]$$
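The screening effect of a polar medium can be illustrated with the Coulomb term. In the sketch below, the relative dielectric constant `eps_r` is an addition to the vacuum form of the equation, the prefactor 332.06 kcal·Å/(mol·e²) is the usual conversion of 1/(4πε0) to these units, and the charges, the distance and the value 80 for water are illustrative assumptions.

```python
COULOMB_CONSTANT = 332.06  # 1/(4*pi*eps0) in kcal*angstrom/(mol*e^2)

def coulomb_energy(q_i, q_j, r_ij, eps_r=1.0):
    """Coulomb energy (kcal/mol) of two point charges in a uniform medium.

    q_i, q_j: charges in units of the elementary charge
    r_ij:     distance in angstrom
    eps_r:    relative dielectric constant of the medium (assumed uniform)
    """
    return COULOMB_CONSTANT * q_i * q_j / (eps_r * r_ij)

# A hypothetical salt bridge: charges +1 and -1 at a distance of 4 angstrom.
in_vacuum = coulomb_energy(+1.0, -1.0, 4.0)
in_water = coulomb_energy(+1.0, -1.0, 4.0, eps_r=80.0)
print(in_vacuum, in_water)
```

The same pair of charges interacts 80 times more weakly in the high-dielectric medium, which is the macroscopic expression of the dipoles of the medium opposing the field.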

Van der Waals interactions

• Electromagnetic interactions can also affect uncharged atoms: their electron clouds fluctuate, producing a transient dipole moment that interacts with the similarly generated dipoles of the surrounding atoms. This produces an attractive interaction (dispersion).

• The other effect is that the orbitals of the atoms cannot overlap because of the Pauli exclusion principle: two electrons cannot occupy the same quantum state. This produces a repulsion at short distances.

$$V = \sum_{\text{bonds}} k_b (x - x_0)^2 + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} V_n \left[1 + \cos(n\varphi - \gamma)\right] + \sum_{i<j} \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} + \sum_{i<j} 4\varepsilon \left[\left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^{6}\right]$$

In the last sum, the $(\sigma/r_{ij})^{12}$ term models the repulsion and the $(\sigma/r_{ij})^{6}$ term models the dispersion.
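The balance between repulsion and dispersion can be explored numerically with the 12-6 (Lennard-Jones) form of the van der Waals term. The parameter values below are illustrative assumptions, not taken from a specific force field.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 potential: 4*epsilon*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)  # repulsion minus dispersion

EPSILON, SIGMA = 0.1, 3.4        # hypothetical well depth (kcal/mol) and size (angstrom)
R_MIN = 2 ** (1 / 6) * SIGMA     # distance at which the energy is lowest

for r in (0.9 * SIGMA, SIGMA, R_MIN, 2.0 * SIGMA):
    print(round(r, 2), round(lennard_jones(r, EPSILON, SIGMA), 4))
```

The energy is zero at r = sigma, reaches its minimum of -epsilon at r = 2^(1/6) sigma, rises steeply at shorter distances (repulsion) and decays toward zero at longer ones (dispersion).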

E_electrostatic. The electrostatic energy is evaluated using the Restrained Electrostatic Potential (RESP) partial charges. These charges have the property of accurately reproducing the electrostatic potential multipoles outside the molecule, and they are calculated in the following way: ab initio quantum chemical calculations are performed on small molecules, and the electrostatic potential values {V_j} are calculated on M grid points outside the molecule.

II. Modeling by satisfaction of spatial restraints

Find the most probable structure starting from an alignment. Uses probability density functions. Minimizes deviations from the restraints.

Comparative protein modeling by satisfaction of spatial restraints. A. Šali and T.L. Blundell. J. Mol. Biol. 234, 779-815

Restraints:

• Homology-derived: obtained from the alignment.

• Stereochemical: CHARMM parameter set (MacKerell et al., 1998).

• Van der Waals and Coulomb energies: from the CHARMM force field.

• 'External': external distance restraints.

Model evaluation

To be analysed:

Correct fold

Model coverage (%)

Cα deviation (RMSD)

Alignment accuracy (%)

Side chains

Structure Analysis and Verification Server:

http://nihserver.mbi.ucla.edu/SAVS/
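The Cα deviation listed above is usually reported as a root-mean-square deviation (RMSD). The sketch below assumes that the model and the reference structure have already been superimposed and that their Cα atoms are paired one to one; the function name and the coordinates are hypothetical.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """RMSD (angstrom) between two equally long lists of (x, y, z) coordinates.

    Assumes the two structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical C-alpha traces: the model is the reference shifted by 1 angstrom along x.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(x + 1.0, y, z) for x, y, z in reference]
print(ca_rmsd(reference, model))
```

In practice an optimal superposition is computed first, so that the RMSD measures differences in shape rather than in position or orientation.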

Evaluation of model accuracy

EVA

Evaluation of Automatic protein structure prediction

[ Burkhard Rost, Andrej Sali, http://maple.bioc.columbia.edu/eva ]

Protein Structure Resources

PDB: http://www.pdb.org
Protein Data Bank of experimentally solved structures (RCSB)

CATH: http://www.biochem.ucl.ac.uk/bsm/cath
Hierarchical classification of protein domain structures

SCOP: http://scop.mrc-lmb.cam.ac.uk/scop
Alexey Murzin's Structural Classification of Proteins

DALI: http://www2.ebi.ac.uk/dali
Liisa Holm and Chris Sander's protein structure comparison server

SS-Prediction and Fold Recognition

PHD: http://cubic.bioc.columbia.edu/predictprotein
Burkhard Rost's secondary structure and solvent accessibility prediction server

PSIPRED: http://bioinf.cs.ucl.ac.uk/psipred/
L.J. McGuffin, K. Bryson and David T. Jones' secondary structure prediction server

3DPSSM: http://www.sbg.bio.ic.ac.uk/~3dpss
Fold recognition server using coupled 1D and 3D sequence profiles

THREADER: http://bioinf.cs.ucl.ac.uk/threader/threader.html
David T. Jones' threading program

Fold recognition: profile methods

Principle: find a compatible fold.

For each amino acid we can compute the relative frequency with which it is:

• present in a given secondary structure

• present on the surface

• in a hydrophobic environment

Each amino acid is then replaced by a letter (property).

From a protein structure we can analyse each position in terms of:

• the secondary structure element it belongs to

• the percentage of the residue's surface that is exposed to the solvent

• whether it lies in a hydrophobic or a polar environment

Each structure is then replaced by a linear sequence of 'properties'.

The PDB becomes a database of sequences, which can be searched with the methods you already know.

>Target Sequence
XYMSTLYEKLGGTTAVDLAVAAVAGAPAHKRDVLNQ

Build a model of the target protein based on each template structure.

Rank the models according to SCORE or ENERGY.

Fold recognition: threading

Empirical energy calculation. Example: the frequency with which each pair of amino acids is found at a certain distance in PDB structures. If the number of observations is sufficiently high, this gives the probability of finding the pair at that distance.

[Figure: residues of the target sequence (M, A, T, E, A, F, T, S, G, Q) threaded onto a template structure.]

Boltzmann equation: the probability of observing something depends on its energy (up to a normalization constant):

$$P(x) = e^{-E(x)/kT}$$

We can invert this relation: the energy of an event that has probability P(x) is, for example for an Ala-Ala pair,

$$\Delta E(\mathrm{Ala{-}Ala}) = -kT \ln\left(\frac{P_{\mathrm{folded}}(\mathrm{Ala{-}Ala})}{P_{\mathrm{unfolded}}(\mathrm{Ala{-}Ala})}\right)$$

Frozen approximation: the interaction energy of each target amino acid is computed with the amino acids of the template. Idea: the final positions (for any alignment) will be occupied by very similar amino acids.
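The inversion of the Boltzmann relation can be sketched as a small function that turns observed and reference probabilities into an empirical energy. The value of kT (about 0.593 kcal/mol near room temperature) and the example probabilities are illustrative assumptions; real knowledge-based potentials derive these probabilities from pair counts in the PDB.

```python
import math

KT = 0.593  # kcal/mol at about 298 K (assumed)

def pair_energy(p_observed, p_reference, kt=KT):
    """Empirical energy from inverse Boltzmann: -kT * ln(P_observed / P_reference)."""
    return -kt * math.log(p_observed / p_reference)

# Hypothetical probabilities of finding a residue pair at a given distance.
favourable = pair_energy(0.20, 0.10)    # seen twice as often as in the reference
neutral = pair_energy(0.10, 0.10)       # seen as often as in the reference
unfavourable = pair_energy(0.05, 0.10)  # seen half as often as in the reference
print(favourable, neutral, unfavourable)
```

Pairs observed more often than expected receive a negative (favourable) energy and pairs observed less often receive a positive one, which is what allows alternative threadings to be ranked.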

New folds

Are 'new folds' really folds never seen before in nature?

They are structures that share structural motifs at the level of 'fragments' or of supersecondary structures.

Fragment assembly

The relationship between local sequence and local structure is highly degenerate. Local, sequence-dependent interactions can bias the local structure of the segments.

Idea:

The distribution of possible conformations for a local segment of a polypeptide chain can be approximated by the distribution of structures adopted by that sequence, and by evolutionarily close sequences, in proteins of known structure.

The mapping between local sequences and common local structures (helices, helix caps, turns) is less degenerate than for generic structural fragments.

Fragment assembly method

Divide the sequence into fragments:

MSSPQAPEDGQGCGDRGDPPGDLRSVLVTTV

FRAGFOLD

• Fragments: supersecondary structure elements, plus tri-, tetra- and pentapeptides.

• Each fragment is evaluated energetically (knowledge-based potential).

• Random combinations of fragments; simulated annealing using the small fragments.

• Optimization and assembly (knowledge-based potential).

ROSETTA

• Fragments of 9 amino acids.

• Selects the structures of the 25 closest sequences.

• Same configuration: the dihedral angles are replaced at random (simulated annealing).
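Both methods rely on simulated annealing, whose core is the Metropolis acceptance rule applied while a temperature parameter is lowered. The sketch below is a generic toy version, not the FRAGFOLD or ROSETTA implementation: the energy function, the move set (replacing one dihedral angle at random), the cooling schedule and all names are assumptions made for illustration.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng=random):
    """Accept downhill moves always, uphill moves with probability exp(-dE/T)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def anneal(angles, energy, n_steps=2000, t_start=2.0, t_end=0.01, seed=0):
    """Toy simulated annealing over a list of dihedral angles (degrees)."""
    rng = random.Random(seed)
    current, e_current = list(angles), energy(angles)
    best, e_best = list(current), e_current
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        trial = list(current)
        trial[rng.randrange(len(trial))] = rng.uniform(-180.0, 180.0)
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e_current, t, rng):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best

def toy_energy(angles, target=-60.0):
    """Hypothetical energy: lowest when every angle equals the target value."""
    return sum(1.0 - math.cos(math.radians(a - target)) for a in angles)

start = [180.0] * 5
best_angles, best_energy = anneal(start, toy_energy)
print(best_energy, toy_energy(start))
```

At high temperature uphill moves are accepted often, which lets the search escape local minima; as the temperature drops, the search settles into a low-energy combination.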