Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence
Matthew Perella January 31, 2013
Proteins
• Abundance
• 20 Amino Acids
• Role in nearly all cellular processes
• Enzymes, hormones, signaling, immune system, muscle fibers, transporters [1]
1. Nelson, D. L.; Cox, M. M., Principles of Biochemistry. 5 ed.; W.H. Freeman and Company: New York, 2008.
2. Image obtained from: primary protein structure | protein-pdb.com. http://protein-pdb.com/2011/10/04/primary-protein-structure/.
Levels of Protein Structure [2]
Understanding Structure and Function
• Proteomics
• Characterize structures
– Whole-genome sequencing (<1%)
– Experimentally
• X-Ray Crystallography
• NMR Spectroscopy
– Computational Prediction
• Bioinformatics
Research
• Wine Spoilage
• Brettanomyces bruxellensis
• Vinylphenol Reductase [3]
– Vinylphenols
– Ethylphenols
• How??
Vinylphenol Reductase Sequence [3]
MPLMTISDSVKDSLTKSEVVPTVIHDKSFLPKGFLTIQYDSGKEVALGNNIRPADSKNLPRIDFTLNLPSDASSTFNISKDDRFTLIVTDPDAPTRNDEKWSEYLHYLAVDVQLNTFNAENASSNDQLSTADLKGRTLYPYIGPGPPPKTGKHRYVFLLYKQTPGVTPEAPKDRPNWGTGIRGAGAAEYAEKYKLTPYAVNFFYAQNDQQ
3. Tchobanov, I.; Gal, L.; Guilloux-Benatier, M. l.; Remiz, F.; Nardi, T.; Guzzo, J.; Serpaggi, V.; Alexandre, H., Partial vinylphenol reductase purification and characterization from Brettanomyces bruxellensis. European Federation of Microbiological Studies 2008.
Sequence Databases
• Protein Data Bank (PDB)
– As of Wednesday, January 30, 2013, there are 81,306 characterized structures in the PDB database [4]
• UniProtKB/Swiss-Prot
– 538,849 reviewed sequences and 29,266,939 unreviewed sequences [5]
– Only 77,110 have experimentally solved structures
4. RCSB PDB - Holdings Report. http://www.rcsb.org/pdb/statistics/holdings.do.
5. UniProtKB/Swiss-Prot. Available at: http://ca.expasy.org/sprot/relnotes/relstat.html
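The gap between known sequences and solved structures can be made concrete with a quick calculation. The sketch below uses only the counts quoted on this slide. It assumes, since the slide does not say so explicitly, that the 77,110 figure is measured against the combined reviewed and unreviewed sequence total.

```python
# Counts quoted on the slide (January 2013).
reviewed = 538_849        # reviewed sequences
unreviewed = 29_266_939   # unreviewed sequences
with_structure = 77_110   # sequences with an experimentally solved structure

# Assumption: coverage is taken over reviewed + unreviewed sequences.
total_sequences = reviewed + unreviewed
coverage = with_structure / total_sequences

print(f"Total sequences: {total_sequences:,}")
print(f"Structural coverage: {coverage:.2%}")
```

Under that assumption the coverage is well below 1%, which is the motivation for computational prediction.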
Classification Schemes
1. Gene Ontology (GO)
2. Secondary Structure
3. Structural Motifs
4. Family
CATH & SCOP, PROSITE, InterPro, Pfam
Shenoy, S. R.; Jayaram, B., Proteins: Sequence to Structure and Function – Current Status. Current Protein and Peptide Science 2010, 11 (7), 498-514.
Resources
• Similar Sequence Searching
• Multiple Sequence Alignments
• Prediction
– Secondary Structure
– 3-D Model
• Viewing and Editing Software
Watson, J. D.; Laskowski, R. A.; Thornton, J. M., Predicting protein function from sequence and structural data. Current Opinion in Structural Biology 2005, 15 (3), 275-284.
Columns: Resource Name; 1. Similarity Search (Sequence Alignments); 2. Predictions; 3. Viewer
BLAST α
COBALT α
Jpred α α
Phyre 2 α α α
SWISS-Model α α
MPI Bioinformatics Toolkit α α
PsiPred α
ClustalW α
DNASTAR Lasergene 9 Core Suite Software α α α
CLC Protein Workbench 5.7.1 software α α α
Cn3D viewer (Java) α α
PyMOL Molecular Graphics System (Java) α
UCSF Chimera Molecular Visualization Software v. 1.6 (Python) α α
Table 1: Bioinformatics Resource Function Analysis
Methods of Prediction
1. Pattern Recognition
• Pattern recognition techniques find sequences with high similarity to the query in order to infer related structures and functions.
2. Ab Initio
• A prediction method that builds a 3-D model from the sequence alone in order to determine structural and functional information.
Lee, D.; Redfern, O.; Orengo, C., Predicting protein function from sequence and structure. Nat Rev Mol Cell Biol 2007, 8 (12), 995-1005.
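Similarity-based inference usually starts from a number such as percent identity between two aligned sequences. The following is a minimal sketch of that calculation, not the method of any particular tool. The function name `percent_identity` and the `hit` sequence are hypothetical. The `query` string is the first ten residues of the vinylphenol reductase sequence shown earlier.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: the hit is invented for illustration only.
query = "MPLMTISDSV"
hit = "MPLMSISD-V"
print(f"{percent_identity(query, hit):.1f}% identity")
```

Conventions for handling gap columns differ between tools. This sketch ignores them, which is one choice among several.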
Sequence Similarity Searches
• BLAST
• PSI-BLAST
Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25:3389-3402.
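BLAST-style searches begin by locating short words shared between the query and a database sequence, then extend those hits into alignments. The sketch below illustrates only a simplified version of the seeding step, using exact three-residue matches. Real BLAST also scores similar "neighborhood" words with a substitution matrix and extends the seeds, which is not shown here. The function names and the `subject` sequence are hypothetical.

```python
from collections import defaultdict


def index_words(sequence: str, k: int = 3) -> dict:
    """Map every k-residue word in the sequence to its start positions."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index


def find_seeds(query: str, subject: str, k: int = 3) -> list:
    """Return (query_pos, subject_pos, word) for each exact shared word."""
    subject_index = index_words(subject, k)
    seeds = []
    for q_pos in range(len(query) - k + 1):
        word = query[q_pos:q_pos + k]
        for s_pos in subject_index.get(word, []):
            seeds.append((q_pos, s_pos, word))
    return seeds


query = "MPLMTISDSV"    # start of the vinylphenol reductase sequence
subject = "AAPLMTIKK"   # invented database sequence
for seed in find_seeds(query, subject):
    print(seed)
```

Seeds that fall on the same diagonal (equal difference between query and subject positions) are the natural candidates for extension into a longer alignment.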
Multiple Alignment
• MUSCLE
• CLUSTALW
• COBALT
Edgar, R. C., MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32 (5), 1792-1797.
Papadopoulos JS and Agarwala R (2007) COBALT: constraint-based alignment tool for multiple protein sequences, Bioinformatics 23:1073-79.
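Multiple alignment tools build on pairwise comparison of sequences. As a hedged illustration of that building block only, here is a minimal Needleman-Wunsch global alignment score. It is not the algorithm used by MUSCLE, CLUSTALW, or COBALT as a whole, and the scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions rather than a real substitution matrix.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -2) -> int:
    """Needleman-Wunsch score of the best global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two residues, or gap in either one.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    return score[-1][-1]


print(global_alignment_score("MPLMTI", "MPLTI"))
```

A progressive multiple aligner would compute such scores for every pair of sequences, derive a guide tree from them, and then merge alignments along that tree.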
Template Secondary Structure Annotation
Secondary Structure Prediction
Secondary Structure Annotation
3-D Model Prediction with Template
• PHYRE-2
– PSI-BLAST
– Psi-pred and Diso-pred
– Hidden Markov Model (HMM)
– HMM alignment
– 3-D models from known structures
– Maximizing Thermodynamic Stability
• Modelling insertions and deletions with a loop library
• Modelling of AA side chains using a rotamer library to minimize steric interferences
Kelley, L. A.; Sternberg, M. J. E., Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 2009, 4, 364-371.
Phyre2 Model Alignment Results
Kelley, L. A.; Sternberg, M. J. E., Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 2009, 4, 364-371.
3-D Model Prediction
Superimposed Structural alignment
• Alignment of α-helices and β-sheets
• Motif conservation
• Infer similar function from homologues
Kelley, L. A.; Sternberg, M. J. E., Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 2009, 4, 364-371.
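Once a model has been superimposed on its template, a common way to quantify how well they agree is the root-mean-square deviation (RMSD) of matched atom positions. The slides do not name a specific measure, so treating RMSD as the comparison metric is an assumption. The sketch below also assumes the two coordinate sets are already superimposed and matched atom for atom, and the coordinates are invented.

```python
import math


def rmsd(coords_a, coords_b) -> float:
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and that the i-th
    point in each list refers to the same (equivalent) atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Invented coordinates in angstroms, for illustration only.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(f"RMSD: {rmsd(model, template):.2f}")
```

Finding the optimal superposition is a separate step that viewers such as those listed in Table 1 perform before reporting a deviation.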
Prediction Analysis
• QMEAN and SWISS-MODEL used to assess
Models Superimposed on Template
Resources
1. BLAST References. http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=References.
2. COBALT: Multiple Alignment Tool. http://www.ncbi.nlm.nih.gov/tools/cobalt/cobalt.cgi?CMD=Doc.
3. primary protein structure | protein-pdb.com. http://protein-pdb.com/2011/10/04/primary-protein-structure/.
4. RCSB PDB - Holdings Report. http://www.rcsb.org/pdb/statistics/holdings.do.
5. Kelley, L. A.; Sternberg, M. J. E., Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 2009, 4, 364-371.
6. Lambert, C.; Léonard, N.; De Bolle, X.; Depiereux, E., ESyPred3D submitting form. 2012.
7. Lee, D.; Redfern, O.; Orengo, C., Predicting protein function from sequence and structure. Nat Rev Mol Cell Biol 2007, 8 (12), 995-1005.
8. Linding, R.; et al., Protein disorder prediction: Implications for structural proteomics. EMBL - Biocomputing unit: 2012.
9. Nelson, D. L.; Cox, M. M., Principles of Biochemistry. 5 ed.; W.H. Freeman and Company: New York, 2008.
10. Edgar, R. C., MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32 (5), 1792-1797.
11. Shenoy, S. R.; Jayaram, B., Proteins: sequence to structure and function - current status. Curr Protein Pept Sci 2010, 11 (7), 498-514.
12. Tchobanov, I.; Gal, L.; Guilloux-Benatier, M. l.; Remiz, F.; Nardi, T.; Guzzo, J.; Serpaggi, V.; Alexandre, H., Partial vinylphenol reductase purification and characterization from Brettanomyces bruxellensis. European Federation of Microbiological Studies 2008.
13. Watson, J. D.; Laskowski, R. A.; Thornton, J. M., Predicting protein function from sequence and structural data. Current Opinion in Structural Biology 2005, 15 (3), 275-284.