Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence Matthew Perella January 31, 2013


Page 1: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Matthew Perella January 31, 2013

Page 2: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Proteins

• Abundance
• 20 Amino Acids
• Role in nearly all cellular processes
• Enzymes, hormones, signaling, immune system, muscle fibers, transporters¹

Levels of Protein Structure²

1. Nelson, D. L.; Cox, M. M., Principles of Biochemistry. 5 ed.; W.H. Freeman and Company: New York, 2008.
2. Image obtained from: primary protein structure | protein-pdb.com. http://protein-pdb.com/2011/10/04/primary-protein-structure/.

Page 3: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Understanding Structure and Function

• Proteomics
• Characterize structures
  – Whole-genome sequencing (<1%)
  – Experimentally
    • X-Ray Crystallography
    • NMR Spectroscopy
  – Computational Prediction
    • Bioinformatics

Page 4: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Research

• Wine Spoilage
• Brettanomyces bruxellensis
• Vinylphenol Reductase³
  – Vinylphenols
  – Ethylphenols
• How?

Vinylphenol Reductase Sequence³
MPLMTISDSVKDSLTKSEVVPTVIHDKSFLPKGFLTIQYDSGKEVALGNNIRPADSKNLPRIDFTLNLPSDASSTFNISKDDRFTLIVTDPDAPTRNDEKWSEYLHYLAVDVQLNTFNAENASSNDQLSTADLKGRTLYPYIGPGPPPKTGKHRYVFLLYKQTPGVTPEAPKDRPNWGTGIRGAGAAEYAEKYKLTPYAVNFFYAQNDQQ

3. Tchobanov, I.; Gal, L.; Guilloux-Benatier, M. l.; Remiz, F.; Nardi, T.; Guzzo, J.; Serpaggi, V.; Alexandre, H., Partial vinylphenol reductase purification and characterization from Brettanomyces bruxellensis. European Federation of Microbiological Studies 2008.
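Before a sequence like the one above is submitted to a search or prediction server, it is usually checked and put into FASTA format. The following is a minimal Python sketch of that step, not a tool used in the presentation; the helper names `to_fasta` and `composition` and the header text are illustrative assumptions.

```python
from collections import Counter

# Vinylphenol reductase sequence as printed on the slide (reference 3).
VPR_SEQUENCE = (
    "MPLMTISDSVKDSLTKSEVVPTVIHDKSFLPKGFLTIQYDSGKEVALGNNIRPADSKNLP"
    "RIDFTLNLPSDASSTFNISKDDRFTLIVTDPDAPTRNDEKWSEYLHYLAVDVQLNTFNAE"
    "NASSNDQLSTADLKGRTLYPYIGPGPPPKTGKHRYVFLLYKQTPGVTPEAPKDRPNWGTG"
    "IRGAGAAEYAEKYKLTPYAVNFFYAQNDQQ"
)

# One-letter codes of the 20 standard amino acids.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def to_fasta(header, seq, width=60):
    """Format a sequence as FASTA text with fixed-width lines."""
    lines = [">" + header]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines)


def composition(seq):
    """Return each residue's fraction of the sequence."""
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in sorted(counts)}


# Hypothetical header; the slide gives no accession number.
fasta_text = to_fasta("vinylphenol_reductase_B_bruxellensis", VPR_SEQUENCE)
fractions = composition(VPR_SEQUENCE)

print(fasta_text)
for aa, frac in fractions.items():
    print(f"{aa}: {frac:.3f}")
```

Checking that every character is one of the 20 standard one-letter codes catches copy-and-paste damage, such as a trailing footnote digit, before a server rejects the query.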

Page 5: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Sequence Databases

• Protein Data Bank (PDB)
  – As of Wednesday, January 30th, 2013, there are 81,306 characterized structures in the PDB database⁴
• UniProtKB/Swiss-Prot
  – 538,849 reviewed sequences; 29,266,939 unreviewed sequences⁵
  – Only 77,110 have experimentally solved structures

4. RCSB PDB - Holdings Report. http://www.rcsb.org/pdb/statistics/holdings.do.
5. UniProtKB/Swiss-Prot. Available at: http://ca.expasy.org/sprot/relnotes/relstat.html
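The gap between known sequences and solved structures can be made concrete with a little arithmetic on the counts quoted above. The slide does not say whether the 77,110 solved structures are measured against all sequences or only the reviewed ones, so this Python sketch computes both; the variable names are illustrative.

```python
# Counts as quoted on the slide.
REVIEWED = 538_849
UNREVIEWED = 29_266_939
SOLVED = 77_110


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


total_sequences = REVIEWED + UNREVIEWED

# Assumption: two possible denominators, since the slide does not specify one.
pct_of_all = percent(SOLVED, total_sequences)
pct_of_reviewed = percent(SOLVED, REVIEWED)

print(f"Total sequences: {total_sequences:,}")
print(f"Solved structures vs. all sequences: {pct_of_all:.2f}%")
print(f"Solved structures vs. reviewed sequences: {pct_of_reviewed:.2f}%")
```

Measured against all sequences, the solved fraction is well under 1%.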

Page 6: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Classification Schemes
1. Gene Ontology (GO)
2. Secondary Structure
3. Structural Motifs
4. Family

CATH & SCOP, PROSITE, InterPro, Pfam

Shenoy, S. R.; Jayaram, B., Proteins: Sequence to Structure and Function – Current Status. Current Protein and Peptide Science 2010, 11 (7), 498-514.

Page 7: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Resources

• Similar Sequence Searching
• Multiple Sequence Alignments
• Prediction
  – Secondary Structure
  – 3-D Model
• Viewing and Editing Software

Watson, J. D.; Laskowski, R. A.; Thornton, J. M., Predicting protein function from sequence and structural data. Current Opinion in Structural Biology 2005, 15 (3), 275-284.

Page 8: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence
Page 9: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Resource Name | 1. Similarity Search (Sequence Alignments) | 2. Predictions | 3. Viewer

BLAST α
COBALT α
Jpred α α
Phyre2 α α α
SWISS-MODEL α α
MPI Bioinformatics Toolkit α α
PSIPRED α
ClustalW α
DNASTAR Lasergene 9 Core Suite Software α α α
CLC Protein Workbench 5.7.1 software α α α
Cn3D viewer (Java) α α
PyMOL Molecular Graphics System (Java) α
UCSF Chimera Molecular Visualization Software v. 1.6 (Python) α α

Table 1: Bioinformatics Resource Function Analysis

Page 10: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Methods of Prediction

1. Pattern Recognition
   • Pattern recognition techniques are used to find sequences with high similarity in order to infer related structures and functions.
2. Ab Initio
   • A prediction method that builds a 3-D model, and from it structural and functional information, using only the sequence.

Lee, D.; Redfern, O.; Orengo, C., Predicting protein function from sequence and structure. Nat Rev Mol Cell Biol 2007, 8 (12), 995-1005.
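The "high similarity" used in pattern recognition is often summarized as percent identity between two aligned sequences. The Python sketch below shows one common definition, identical residues divided by aligned, non-gap positions. Other definitions use a different denominator, and both the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of non-gap aligned positions where the residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
query = "MPLMT-ISD"
hit = "MPLLTAISE"
identity = percent_identity(query, hit)
print(f"Identity: {identity:.1f}%")
```

Tools such as BLAST report this kind of figure alongside statistical significance, and high identity alone does not establish shared function.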

Page 11: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Sequence Similarity Searches
• BLAST
• PSI-BLAST

Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25:3389-3402.

Page 12: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Multiple Alignment

• MUSCLE
• CLUSTALW
• COBALT

Edgar, R. C., MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32 (5), 1792-1797.
Papadopoulos JS and Agarwala R (2007) COBALT: constraint-based alignment tool for multiple protein sequences, Bioinformatics 23:1073-79.

Page 13: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Template Secondary Structure Annotation

Page 14: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Secondary Structure Prediction

Page 15: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Secondary Structure Annotation

Page 16: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

3-D Model Prediction with Template
• Phyre2
  – PSI-BLAST
  – PSIPRED and DISOPRED
  – Hidden Markov Model (HMM)
  – HMM alignment
  – 3-D models from known structures
  – Maximizing Thermodynamic Stability
    • Modelling insertions and deletions with a loop library
    • Modelling of AA side chains using a rotamer library to minimize steric interferences

Kelley, L. A.; Sternberg, M. J. E., Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 2009, 4, 364-371.
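The profile-building idea behind the PSI-BLAST and HMM steps can be illustrated with a much simpler stand-in: per-column residue frequencies taken from a multiple alignment. This Python sketch is not Phyre2's actual HMM procedure; the function names and the toy alignment are assumptions for illustration.

```python
from collections import Counter


def column_profile(alignment, gap="-"):
    """Return per-column residue frequencies for an alignment, ignoring gaps."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] != gap)
        total = sum(counts.values())
        # An all-gap column yields an empty frequency table.
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


def consensus(profile, unknown="X"):
    """Pick the most frequent residue in each column."""
    return "".join(max(col, key=col.get) if col else unknown for col in profile)


# Toy alignment of three homologous fragments (hypothetical).
toy_alignment = ["MKLV", "MRLV", "MK-I"]
profile = column_profile(toy_alignment)
print(consensus(profile))
for position, freqs in enumerate(profile, start=1):
    print(position, freqs)
```

A real profile HMM additionally models insertions, deletions, and background residue frequencies, which is what lets it detect more distant homologues than a plain frequency table.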

Page 17: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Phyre2 Model Alignment Results

Kelley, L. A.; Sternberg, M. J. E., Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 2009, 4, 364-371.

Page 18: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

3-D Model Prediction

Page 19: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Superimposed Structural alignment

• Alignment of α-helices and β-sheets

• Motif conservation

• Infer similar function from homologues

Kelley, L. A.; Sternberg, M. J. E., Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 2009, 4, 364-371.
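How closely a model's α-helices and β-sheets overlay a template is commonly quantified as the root-mean-square deviation (RMSD) between matched atoms. The Python sketch below assumes the two coordinate sets are already superimposed and paired one-to-one, so it does not perform the optimal superposition step itself. The coordinates are made-up values, not data from the presentation.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical C-alpha coordinates in angstroms, offset by 1 along z.
model_ca = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
template_ca = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

deviation = rmsd(model_ca, template_ca)
print(f"RMSD: {deviation:.2f} angstroms")
```

Structure viewers such as UCSF Chimera and PyMOL report an RMSD after performing the superposition, so a hand calculation like this is mainly useful for checking exported coordinates.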

Page 20: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Prediction Analysis

• QMEAN and SWISS-MODEL used to assess

Page 21: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Models Superimposed on Template

Page 22: Building an Invisible Puzzle: Predicting Protein Structure and Function from Sequence

Resources
• 1. BLAST References. http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=References.
• 2. COBALT: Multiple Alignment Tool. http://www.ncbi.nlm.nih.gov/tools/cobalt/cobalt.cgi?CMD=Doc.
• 3. primary protein structure | protein-pdb.com. http://protein-pdb.com/2011/10/04/primary-protein-structure/.
• 4. RCSB PDB - Holdings Report. http://www.rcsb.org/pdb/statistics/holdings.do.
• 5. Kelley, L. A.; Sternberg, M. J. E., Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 2009, 4, 364-371.
• 6. Lambert, C.; Léonard, N.; De Bolle, X.; Depiereux, E., ESyPred3D submitting form. 2012.
• 7. Lee, D.; Redfern, O.; Orengo, C., Predicting protein function from sequence and structure. Nat Rev Mol Cell Biol 2007, 8 (12), 995-1005.
• 8. Linding, R.; et al., Protein disorder prediction: Implications for structural proteomics. EMBL - Biocomputing unit: 2012.
• 9. Nelson, D. L.; Cox, M. M., Principles of Biochemistry. 5 ed.; W.H. Freeman and Company: New York, 2008.
• 10. Edgar, R. C., MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32 (5), 1792-1797.
• 11. Shenoy, S. R.; Jayaram, B., Proteins: sequence to structure and function--current status. Curr Protein Pept Sci 2010, 11 (7), 498-514.
• 12. Tchobanov, I.; Gal, L.; Guilloux-Benatier, M. l.; Remiz, F.; Nardi, T.; Guzzo, J.; Serpaggi, V.; Alexandre, H., Partial vinylphenol reductase purification and characterization from Brettanomyces bruxellensis. European Federation of Microbiological Studies 2008.
• 13. Watson, J. D.; Laskowski, R. A.; Thornton, J. M., Predicting protein function from sequence and structural data. Current Opinion in Structural Biology 2005, 15 (3), 275-284.