19
Case study (ref) PSI meta-server used Computer drive drug design process

Session ii g2 overview metabolic network modeling mcc

Embed Size (px)

Citation preview

Page 1: Session ii g2 overview metabolic network modeling mcc

Case study (ref)

PSI meta-server used

Computer drive drug design process

Page 2: Session ii g2 overview metabolic network modeling mcc

Step Input Tools Output / result

Selection of ligand compoundsequence

PubChem structure

structure Open Babel required file format

ADMET studies sequence Pre-ADMET Provide drug-likeliness, ADME profile and toxicity analysis for the ligand.

Receptor characterization

sequence GOR IV Text data result

Select Template sequence BLAST Identity, similarity, expectation value & alignment scores

Homology modeling

protein sequences Modeller pdb file

Visualizing pdb file PyMol visualized model

Model validation pdb file

PROCHECK Parameters(covalent bond distances and angles, stereochemicalvalidation and atom nomenclature)

ERRAT The overall quality factor of non-bonded interactions between different atoms types

DaliLite mean square deviation (RMSD) between the set of targets and template protein to check deviation of modelled protein from the template protein structure.

Docking studies PDB, PDBQ, PDBQT, SYBYL mol2 or PQR format

AutoDock 4.2 PDBQT format and include special keywords establishing the torsional flexibility

Page 3: Session ii g2 overview metabolic network modeling mcc
Page 4: Session ii g2 overview metabolic network modeling mcc

Data Preparation and Analysis

The world wide Protein Data Bank: The single archive of experimental marcomolecular structural data. [RCSB PDB] (USA); [PDBe] (Europe); [PDBj] (Japan)CATH: A manually curated hierarchical domain classification of protein structures in the Protein Data Bank.

UniProt: Protein Knowledgebase. A comprehensive, high-quality and freely accessible database of protein sequence and functional information.RefSeq: NCBI Reference Sequence. A collection of curated, non-redundant genomic DNA, transcript (RNA), and protein sequences produced by NCBI.

SBKB: Structural Biology Knowledgebase. A portal to protein structures, sequences, functions and methods.

Experimental Data

Acquisition

Amino Acid Sequences

Structure Template libraries

Page 5: Session ii g2 overview metabolic network modeling mcc

Structure Modelling

Template Search/Fold Recognition

BLAST/PSI-BLAST: Local alignment search tools.

HHpred: Server for homology detection and structure prediction by HMM-HMM comparison.

Homology Modeling

New prediction

yes

no

Page 6: Session ii g2 overview metabolic network modeling mcc

HHpred: Server for homology detection and structure prediction by HMM-HMM comparison.

I-Tasser: I-TASSER is a server for protein structure and function predictions. 3D models are built based on multiple-threading alignments by LOMETS and iterative TASSER assembly simulations.

M4T: Comparative Modelling using a combination of multiple templates and iterative optimization of alternative alignments.

ModWeb: A web server for automated comparative modeling that relies on PSI-BLAST, IMPALA and MODELLER.

SWISS-MODEL: Fully automated protein structure homology-modeling server accessible via the ExPASy web server, or from the program DeepView (Swiss Pdb-Viewer).

Homology Modeling

Structure Modeling

Page 7: Session ii g2 overview metabolic network modeling mcc

Swiss-model

identification of

structural template

alignment of target sequence

& template structure

model building

model quality

evaluation

InterPro Domain Scan:InterPro, Pfam, TIGRFAMs, PROSITE, SUPERFAMILY, ProDom,PRINTS, SMART, PROSITEPsiPred Secondary Structure Prediction:PSIPREDDISOPRED Disorder Prediction:DISOPREDMEMSAT:MEMSAT

using BLAST query against the ExPDB template library extracted from PDB.When no suitable templates are identified, using Iterative Profile Blast. Which is the template library is searched with PSI-BLAST (Altschul et al.)  using an iteratively generated sequence profile based on NR (Wheeler et al.).HHSearch: To detect distantly related template structures (Söding et al.) Display of template identification results

DeepView:http://www.expasy.org/spdbv/

Protein Structure & Model Assessment Tools: ANOLEAQMEAN, The global QMEAN4 scoring function ( Benkert et al. 2008), The global QMEAN6 scoring function (Benkert et al. 2008), The local version of the QMEAN scoring function (Benkert et al. 2009),DFIRE GROMOSWhat CheckPROCHECKPROMOTIFDSSP

QMEAN4 global scoresLocal Model Quality Estimation: Anolea / QMEAN / Gromos:Alignment  Modelling Log  Template Selection Log Quaternary Structure Modeling LogLigand Modeling Log

Outputs:

Structure Modelling

Page 8: Session ii g2 overview metabolic network modeling mcc

The "automated mode" is suited for cases where the target-template similarity is sufficiently high to allow for fully automated modeling. 

This submission requires only the amino acid sequence or the UniProt accession code of the target protein as input data. 

Depending on the planned model application, it can be necessary to select a different structural template than the one ranked first in the automated process. Please make sure that this file contains only a single protein chain, and does not contain chemically modified amino acids, hereto atoms, ligands, etc.

Automated Mode

Page 9: Session ii g2 overview metabolic network modeling mcc

 If the three-dimensional structure is known for at least one of the members, this alignment can be used as starting point for comparative modelling using the "alignment mode". 

The "alignment mode" allows the user to test several alternative alignments and evaluate the quality of the resulting models in order to achieve an optimal result.

1. Prepare a multiple sequence alignment.2. Submit your alignment to the Workspace Alignment Mode.3. Select Target and Template.4. Check Alignment and Submit.

The server pipeline will build the model purely based on this alignment. During the modeling process, implemented as rigid fragment assembly in the SWISS-MODEL (Schwede et al.) pipeline, the modeling engine might introduce minor heuristic modifications to the placement of insertions and deletions. 

Alignment Mode

Page 10: Session ii g2 overview metabolic network modeling mcc

In difficult modeling situations, where the correct alignment between target and template cannot be clearly determined by sequence based methods, visual inspection and manual manipulation of the alignment can significantly help improving the quality of the resulting model. 

Project files contain the superposed template structures, and the alignment between the target and template. Project files can be generated inside the program DeepView (Swiss-PdbViewer Guex et al.), by the workspace template selection tools, and are also the default output format of the modeling pipeline. This allows analyzing and iteratively improving the models generated by the "Automated mode" and "Alignment mode" modeling approaches. 

Project Mode

Page 11: Session ii g2 overview metabolic network modeling mcc

M4T

Output: pdb filepir fileana fileEnergy profile

Structure Modelling

Page 12: Session ii g2 overview metabolic network modeling mcc

I-TASSER

 tries retrieve template proteins

of similar folds from the PDB library

by LOMETS

Structure assembly continuous fragments

by replica-exchange Monte Carlo simulations

with the threading unaligned regions

(mainly loops) built by ab initio modeling

low free-energy states are identified by SPICKER

 

Structure Re-assembly

Retrieve from cluster LOMETS & TM-align

lowest energy

structures are

selected.

final full-atomic models

(Remo H-Bond optimization)

function predictions

TM-alignsearch

TM-score

Outputs:Predicted Secondary StructurePredicted Solvent Accessibilitypdb fileTop 10 templatesProteins with highly similar structure in PDB Function PredictionPredicted GO termsPredicted Binding Site

Structure Modellingsubmit an amino acid sequence

Page 13: Session ii g2 overview metabolic network modeling mcc

HHpred

Select input format

MSA Generation Method

More Options

Max. MSA Generation iterations

Score secondary structure

Realign with MAC algorithm

Alignment mode

Select HMM databases

Entering a single query sequenceEntering a multiple alignment

ProteomesPdb70,Scop70,CDD,InterPro,PfamA,SMART,

PANTHER, TIGRFAMs, PIRSF, SUPERFAMILY, CATH/Gene3D,

COG/KOG, PfamB

HBlits : Download pdf filePSI-Blast : View Article

E-value threshold for MSA Generation             

Min. coverage of MSA hitsMin. sequence identity of MSA hits

with query         MAC realignment threshold

(0.0:global, >=0.1:local)        Compositional bias correction

Show sequences per HMM         Width of alignments

Min. probability in hit listMax. number of hits in hit list

Output: pdb file

Structure Modelling

Page 14: Session ii g2 overview metabolic network modeling mcc

Modeler

Searching for structures related to TvLDH

Selecting a template

Model evaluation Model building

Aligning TvLDF with the template

 target TvLDH sequence

profile.build() 

automodel  Output: pdb file

Structure Modeling

Page 15: Session ii g2 overview metabolic network modeling mcc

New predictionWhen no suitable template structure can be identified, de novo (a.k.a. ab initio) structure prediction methods can be used to generate three-dimensional protein models without relying on a homologus template structure:

Robetta: Full-chain protein structure prediction server based on the Rosetta method.Rosetta: De novo protein structure prediction software.

Structure Modeling

Page 16: Session ii g2 overview metabolic network modeling mcc

Hybrid techniquesThe goal of hybrid techniques is to contribute to a comprehensive structural characterization of biomolecules ranging in size and complexity from small peptides to large macromolecular assemblies. Detailed structural characterization of assemblies is generally impossible by any single existing experimental or computational method. This barrier can be overcome by hybrid approaches that integrate data from diverse biochemical and biophysical experiments:CS-ROSETTA: System for chemical shifts based protein structure prediction using ROSETTA.IMP: software for a comprehensive structural characterization of biomolecules.

Structure Modelling

Page 17: Session ii g2 overview metabolic network modeling mcc

Confidence EstimationValidation and Quality estimation

Example: Verify3D

Page 18: Session ii g2 overview metabolic network modeling mcc

Tools Input Output

Verify 3D pdb 3D-1D average score, raw data, raw average data,

Whatcheck

pdb Detial text report, TeX file

Prove pdb An pdf file with Z–score and analysis of residues, and an text file

Errat pdb An pdf file with overall quality factor, error value in each residue

PROCHECK

pdb comprise a number of plots, in PostScript format

MolProbity

pdb can view in pdf or KiNG, can choose lots of output

ProSA pdb Z-Score, knowledge based energy, sequence position

Confidence Estimation

Page 19: Session ii g2 overview metabolic network modeling mcc

Application

Structure Visualization & Analysis

PyMol: A Python based open-source viewer for visualization of macromolecular structures.

AutoDock: A suite of automated docking tools.

Molecular Interactions

Molecular Motions DynDom: Protein Domain Motion Analysis.molmovdb.org: Gallery of morphs.molmovdb.org: Molecular Movements Database.