MULTICOM – A Combination Pipeline for Protein Structure Prediction
Jianlin Cheng
Computer Science Department & Informatics Institute, University of Missouri, Columbia, MO, USA
MULTICOM Structure Prediction Pipeline
[Pipeline diagram: Query Sequence, Server Predictor, Human Predictor, Output]
Query-template alignments:
• PSI-BLAST
• HHSearch
• COMPASS
• FOLDpro + SPEM
Find a set of good templates / fragments; generate alternative query-template alignments.
Combination
1. Combine the top-ranked query-template alignment (QTA) with other significant QTAs
2. Take fragments from less significant QTAs (template-free)
Don’t try to find the best template; instead, combine multiple good templates / fragments.
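The combination rule above can be sketched in a few lines. This is only an illustration, not the MULTICOM implementation: the `Alignment` record, the use of an E-value as the significance measure, and the `significance_cutoff` value are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical record for one query-template alignment (QTA)."""
    template_id: str
    e_value: float  # assumed significance measure; lower is more significant


def select_alignments(qtas, significance_cutoff=1e-3):
    """Split QTAs into those to combine and those used only for fragments.

    Returns (combined, fragment_sources):
      combined         - the top-ranked QTA plus every other QTA that
                         passes the (assumed) significance cutoff
      fragment_sources - the remaining, less significant QTAs
    """
    if not qtas:
        return [], []
    ranked = sorted(qtas, key=lambda a: a.e_value)
    top, rest = ranked[0], ranked[1:]
    combined = [top] + [a for a in rest if a.e_value <= significance_cutoff]
    fragment_sources = [a for a in rest if a.e_value > significance_cutoff]
    return combined, fragment_sources
```

The top-ranked alignment is always kept, so the function still returns a usable starting point when no other alignment passes the cutoff.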
Integrative Model Generation
1. Modeller
2. Rosetta for template-free small domains
Domain-level combination of template-based and template-free approaches
Model Ranking by ModelEvaluator
ModelEvaluator: comparison of a 3D model with ab initio, sequence-based structural feature predictions
Input features:
• Secondary structure (e.g., EEEECCEEEHHHHHHHHHHHHEEEECCEEEHHHH)
• Relative solvent accessibility (e.g., eeee-----eeeee----------eeeee------eeeee---eeeeeeee)
• Contact map
• Beta-sheet pairing
Output: predicted GDT-TS score
Good models are ranked at the top. Very effective for template-free models.
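The comparison step can be illustrated with a small sketch that scores how well the per-residue labels predicted from sequence agree with the labels derived from a 3D model. The function names and the choice of simple fractional agreement are assumptions for illustration. The slide does not say how ModelEvaluator turns such features into a predicted GDT-TS score, so that step is not sketched here.

```python
def agreement(predicted, observed):
    """Fraction of positions where two equal-length label strings match."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    if not predicted:
        return 0.0
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)


def comparison_features(pred_ss, model_ss, pred_rsa, model_rsa):
    """Build a two-element feature vector from per-residue agreement.

    pred_ss / pred_rsa   - labels predicted from the sequence alone
    model_ss / model_rsa - labels computed from the 3D model
    (Contact-map and beta-sheet pairing features are omitted here.)
    """
    return [agreement(pred_ss, model_ss), agreement(pred_rsa, model_rsa)]


# Toy six-residue example with made-up labels.
features = comparison_features("EEEECC", "EEEHCC", "ee--ee", "ee--e-")
```

A model whose own secondary structure and solvent exposure match the sequence-based predictions yields features close to 1.0 and can therefore be scored without knowing the native structure.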
Global-Local Model Combination
1. Start from a top-ranked model
2. Combine it with other models having global similarity (80%, 4Å)
3. Combine it with the longest similar model fragments
[Diagram labels: Average Model; Modeller Iterative Modeling]
Don’t try to find the best model. Instead, combine multiple good models / fragments (2-3% improvement).
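The global part of this combination might look like the sketch below. It rests on several assumptions that the slide does not confirm: "(80%, 4Å)" is read as "at least 80% of residues lie within 4 Å of the top model", models are plain lists of C-alpha coordinates that have already been superposed, and the average model is a per-residue coordinate mean. The local fragment step and the Modeller iterations are not shown.

```python
import math


def fraction_within(model_a, model_b, cutoff=4.0):
    """Fraction of residues whose paired atoms lie within `cutoff` Å.

    Both models are equal-length lists of (x, y, z) tuples and are
    assumed to be superposed already.
    """
    if len(model_a) != len(model_b) or not model_a:
        raise ValueError("models must be non-empty and of equal length")
    close = sum(math.dist(a, b) <= cutoff for a, b in zip(model_a, model_b))
    return close / len(model_a)


def globally_similar(top_model, others, min_fraction=0.8, cutoff=4.0):
    """Keep the models that pass the assumed (80%, 4 Å) similarity test."""
    return [
        m for m in others
        if fraction_within(top_model, m, cutoff) >= min_fraction
    ]


def average_model(models):
    """Per-residue mean of the coordinates across all given models."""
    n = len(models)
    return [
        tuple(sum(axis) / n for axis in zip(*residue_copies))
        for residue_copies in zip(*models)
    ]
```

In use, the top-ranked model and the models returned by `globally_similar` would be passed together to `average_model`; the slide indicates that the averaged result then feeds iterative modeling with Modeller.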
Good Template-Free Example: T0416_2
[Figure: structure; MULTICOM model (GDT = 0.66, RMSD = 2.5); superposition (red: model), courtesy of Prof. Joel Sussman]
Combination of 20 models: Zhang-Server, Robetta, TASSER, MULTICOM, YASARA, forecast
Success: very good models are ranked at the top.
Good Template-Free Example: T0513_2
[Figure: structure; MULTICOM model (GDT = 0.73, RMSD = 2.1); superposition (blue: model)]
Combines Robetta models; better than each one of them.
Success: very good models are ranked at the top, and combination improves modeling.
Not Good Template-Free Example: T0405_1
[Figure: structure (helix bundle); MULTICOM model (GDT = 0.41); superposition by Prof. Sussman (gray: structure, yellow: best model, green: MULTICOM model)]
Failure: ModelEvaluator fails to identify the correct helix orientations.
Concluding Remarks
• The CASP community can sometimes generate good template-free models (e.g., Rosetta-based tools)
• ModelEvaluator can rank good template-free models at the top
• Iterative global-local combination of models can improve template-free modeling
• Blending of template-free and template-based modeling
Blending of Template-Free and Template-Based Modeling
[Figure: Protein Modeling Spectrum, from 100% TBM through 50% TBM + 50% FM to 100% FM]
Acknowledgements
• CASP8 organizers and assessors
• CASP8 participants
• MU colleagues: Dong Xu, Toni Kazic
• My group: Zheng Wang, Allison Tegge, Xin Deng