EUROPEAN MULTIMEDIA BIOINFORMATICS EDUCATIONAL RESOURCE (EMBER): a new tutorial on sequence analysis and biocomputing

EMBER is a new tutorial on sequence analysis and biocomputing developed by several EMBnet teams within an EC framework. The course can be used by independent learners or as teaching material in academic courses, and is organised into chapters of gradually increasing difficulty. Each chapter has several sections: AIM, INFO (theoretical background on the subject), INSTRUCTIONS (practical online exercises), QUIZ and REFERENCES.

EMBER EMBnet teams: University of Manchester (United Kingdom), Swiss Institute of Bioinformatics (Switzerland), University of Nijmegen (The Netherlands), University of the Western Cape (South Africa), European Bioinformatics Institute (United Kingdom), Instituto Gulbenkian de Ciência (Portugal), Université Libre de Bruxelles, ULB (Belgium), Canada Institute for Marine Biosciences (Canada), Research Institute for Genetic Engineering and Biotechnology (Turkey), Expert Center for Taxonomic Identification (The Netherlands).

The project coordinator is Professor Terri Attwood from the University of Manchester; the principal authors include Ioannis Selimas, from the Manchester group, and Marc Brugman, from the Expert Center for Taxonomic Identification.

Figure 1. EMBER presentation page, showing chapter 3 of the tutorial, which gives a detailed presentation of the most important secondary databases: PROSITE, eMOTIF, PRINTS, BLOCKS, Pfam and InterPro. The information is supported by multiple web links, illustrative animations and practical exercises.
The tutorial is aimed at a wide range of researchers (Master's and PhD students, post-docs, junior and senior researchers) in molecular biology and bioinformatics departments, and covers broad areas of analysis:

• DNA analysis: DNA translation (chapter 1); similarity searches (chapter 2); multiple alignments (chapter 4); restriction mapping (chapter 13); determination of gene structure through intron/exon prediction (chapter 10); inference of protein-coding sequence through open reading frame (ORF) analysis (chapter 10).
• Protein analysis: retrieving protein sequences from databases (chapter 1); classifying proteins into families (chapter 3); searching primary and secondary protein databases (chapter 3); finding the best alignment between two or more proteins (chapter 4); computing amino acid composition, molecular weight, isoelectric point and other parameters (chapter 5); computing hydrophobicity/hydrophilicity profiles and locating membrane-spanning segments (chapter 5); predicting elements of secondary structure (chapter 5); visualising protein structure in 3D (chapter 6); predicting a protein's 3D structure from its sequence (chapters 7 and 8); finding evolutionary relationships between proteins (chapter 12).
• Genome analysis: analysing genomic sequences; locating genes in a genome; displaying genomes; parsing a eukaryotic genome sequence with GENSCAN (chapter 10), etc.
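As a small illustration of one of the protein analyses listed above (hydrophobicity profiles and the location of membrane-spanning segments, chapter 5), the following Python sketch computes a sliding-window hydropathy profile with the Kyte-Doolittle scale. It is not taken from the tutorial and does not reproduce any particular tool: the function name, the window size and the toy sequence are assumptions chosen only for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every full window along seq.

    Hypothetical helper for illustration; raises KeyError on
    non-standard residue letters.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Toy sequence (an assumption, not a real protein): a hydrophobic
    # stretch flanked by charged residues.
    toy = "KKDEERLLIVVALLIAVLGGKRDEE"
    profile = hydropathy_profile(toy, window=9)
    peak = max(range(len(profile)), key=profile.__getitem__)
    print(
        f"most hydrophobic window starts at residue {peak + 1}: "
        f"{toy[peak:peak + 9]} (mean {profile[peak]:.2f})"
    )
```

Windows with a high mean score are candidate membrane-spanning segments; dedicated predictors use longer windows and additional rules, so a profile like this is only a first look.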
The tutorial presents a wide variety of tools and websites for many types of analysis: similarity search tools (BLAST, PSI-BLAST); protein family analysis through database searches (PROSITE, eMOTIF, BLOCKS, PRINTS, Pfam); multiple alignment tools (Clustal, DIALIGN, T-Coffee, CINEMA, Jalview); physicochemical parameter and profile prediction (ProtParam and ProtScale); transmembrane helix prediction (MEMSAT, TMpred); secondary structure prediction (Jpred, NNPREDICT); 3D prediction, comparison and visualisation (RasMol, QuickPDB, Cn3D); homology modelling (SWISS-MODEL, Geno3D); fold recognition (GenTHREADER, 3D-PSSM); phylogenetic analysis (PHYLIP); sequence retrieval (SRS), etc.

Viorica Ghita*, Valérie Ledent*, Robert Herzog*, Terri Attwood#, Ioannis Selimas#, Marc Brugman$
* Belgian EMBnet Node (BEN), Laboratoire de Bioinformatique, Université Libre de Bruxelles, Campus de la Plaine, Bât. NO, Bd du Triomphe, 1050 Bruxelles.
# UMBER, the University of Manchester Specialist Node of EMBnet, School of Biological Sciences, Oxford Road, Manchester M13 9PL.
$ University of Amsterdam, Mauritskade 61, 1092 AD Amsterdam, The Netherlands.

Figure 2. The tutorial presents the most important tools for multiple sequence alignment, with detailed information on manual and automatic alignment tools, exercises, and links to various software and alignment databases (chapter 4).

Figure 3. Chapter 5 presents tools for computing physicochemical parameters (molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, hydropathy, chain flexibility, solvent-accessible surface area, etc.), software for predicting the transmembrane topology of proteins, and some secondary structure prediction software.

Figure 4. Chapter 6, “Fold classification”, gives a detailed presentation of the Protein Data Bank, the principal repository of biological macromolecule structures, and of some structure classification resources (CATH, SCOP, EC->PDB). It also covers visualisation and comparison of protein 3D structures with various molecular structure viewers: RasMol, QuickPDB, DeepView and Cn3D.

Figure 5. Different protein structure viewers presented in the tutorial, displaying the ubiquitin-like signalling protein Nedd8 (PDB ID: 1NND): (A) DeepView, classical ball-and-stick mode; (B) RasMol, cartoon mode; (C) QuickPDB, a wireframe α-carbon trace with a small section of the structure highlighted in blue; (D) Cn3D, a hybrid display with amino acid chains in cartoon mode and non-amino-acid atoms in space-filling mode.

Figure 6. In the “Sickle cell haemoglobin” case study chapter, users compare sickle cell and normal β-globin sequences to reveal the nature of the sickle cell mutation. The exercise integrates several database searches and multiple tools (SRS, ClustalW, a restriction map) as well as an advanced, script-driven RasMol session to visualise the mutant haemoglobin and the interaction between the mutant β chains and the amino acid side chains in the vicinity of the mutated Val6 residue. In this representation, the two central mutant β chains are highlighted as white and orange wireframes. Also highlighted are the side chains of the central Val6 mutation and the porphyrin prosthetic group (as CPK-coloured space-filling models). Both the porphyrin prosthetic groups (blue) and the mutant Val6 residues (red) are represented as space-filling models. Highlighted in yellow are the side chains in the vicinity of Val6 at the interface of the two haemoglobin molecules.

Figure 7. The “Human Genome” case study chapter proposes a complex analysis that applies advanced bioinformatics tools to a concrete research problem. Using a genomic fragment of human chromosome 6, students are invited to find potential genes in the fragment with the GeneMark and GENSCAN software. They can then compare the results and assess their reliability using GeneQuiz, an integrated system for large-scale biological sequence analysis, and the current annotation of the human genome in Ensembl.
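The sequence comparison at the heart of the sickle cell case study (Figure 6) can be sketched in a few lines of Python. This is an illustration only, not the tutorial's own exercise, which relies on SRS, ClustalW and a restriction-mapping tool. The helper names are hypothetical and the codon table is deliberately limited to the codons present in the fragments. The two fragments are the first eight codons of the mature β-globin chain (initiator Met omitted) in normal and sickle form, typed in for illustration, and should be checked against a database entry before reuse. The restriction site tested, CCTNAGG (the MstII recognition sequence), was chosen here because the sickle mutation is known to abolish it; the tutorial does not name the enzyme it uses.

```python
import re

# Minimal codon table: only the codons that occur in the fragments below.
CODONS = {
    "GTG": "V", "CAT": "H", "CTG": "L", "ACT": "T",
    "CCT": "P", "GAG": "E", "AAG": "K",
}

# First eight codons of the mature beta-globin chain (illustrative excerpt).
NORMAL = "GTGCATCTGACTCCTGAGGAGAAG"
SICKLE = "GTGCATCTGACTCCTGTGGAGAAG"  # GAG -> GTG at codon 6


def translate(dna):
    """Translate a DNA fragment codon by codon, reading frame 1."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def differences(a, b):
    """Return (1-based position, symbol in a, symbol in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def has_site(dna, site):
    """True if a recognition site (N = any base) occurs in dna."""
    return re.search(site.replace("N", "[ACGT]"), dna) is not None


if __name__ == "__main__":
    print("DNA differences:    ", differences(NORMAL, SICKLE))
    print("protein differences:",
          differences(translate(NORMAL), translate(SICKLE)))
    for name, dna in (("normal", NORMAL), ("sickle", SICKLE)):
        print(f"CCTNAGG site in {name} fragment:", has_site(dna, "CCTNAGG"))
```

A single base change is enough to replace Glu with Val at position 6 and to remove the restriction site, which is why a restriction map can distinguish the two alleles.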