EUROPEAN MULTIMEDIA BIOINFORMATICS EDUCATIONAL RESOURCE
A new tutorial on sequence analysis and biocomputing

EMBER EMBnet teams: University of Manchester (United Kingdom), Swiss Institute of Bioinformatics (Switzerland), University of Nijmegen (The Netherlands), University of the Western Cape (South Africa), European Bioinformatics Institute (United Kingdom), Instituto Gulbenkian de Ciencia (Portugal), Université Libre de Bruxelles, ULB (Belgium), Canada Institute for Marine Biosciences (Canada), Research Institute for Genetic Engineering and Biotechnology (Turkey), Expert Center for Taxonomic Identification (The Netherlands).

The project coordinator is Professor Terri Attwood of the University of Manchester. The principal authors include Ioannis Selimas, from the Manchester group, and Marc Brugman, from the Expert Center for Taxonomic Identification.

EMBER is a new tutorial on sequence analysis and biocomputing, developed by several EMBnet teams within an EC framework. The course can be used by independent learners or as teaching material in academic courses, and it is organised into chapters of gradually increasing difficulty. Each chapter has several sections: AIM, INFO (theoretical aspects of the subject), INSTRUCTIONS (practical online exercises), QUIZ and REFERENCES.

Figure 1. EMBER presentation page, showing chapter 3 of the tutorial, which gives a detailed presentation of the most important secondary databases: PROSITE, eMOTIF, PRINTS, BLOCKS, Pfam and InterPro. The information is supported by multiple web links, illustrative animations and practical exercises.
The tutorial is aimed at a wide range of researchers (Master's and PhD students, post-docs, junior and senior researchers) in molecular biology and bioinformatics departments. It covers broad areas of analysis:

DNA analysis: DNA translation (chapter 1); similarity searches (chapter 2); multiple alignments (chapter 4); restriction mapping (chapter 13); determination of gene structure through intron/exon prediction (chapter 10); inference of protein coding sequence through open reading frame (ORF) analysis (chapter 10).

Protein analysis: retrieving protein sequences from databases (chapter 1); classifying proteins into families (chapter 3); searching primary and secondary protein databases (chapter 3); finding the best alignment between two or more proteins (chapter 4); computing amino-acid composition, molecular weight, isoelectric point and other parameters (chapter 5); computing hydrophobicity/hydrophilicity profiles and locating membrane-spanning segments (chapter 5); predicting elements of secondary structure (chapter 5); visualising protein structure in 3D (chapter 6); predicting a protein's 3D structure from its sequence (chapters 7 and 8); finding evolutionary relationships between proteins (chapter 12).

Genome analysis: analysing genomic sequences; locating genes in a genome; displaying genomes; parsing a eukaryotic genome sequence with GENSCAN (chapter 10), etc.
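To make two of the DNA analysis topics above concrete (DNA translation and ORF analysis), here is a minimal sketch in Python. It is not taken from the tutorial: the function names `translate` and `find_orfs`, the `min_codons` parameter and the test sequence are all hypothetical, and the sketch assumes the standard genetic code and scans only the three forward reading frames.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(dna, min_codons=2):
    """Return (frame, start_position, protein) for each ATG...stop stretch.

    Only the forward strand is scanned (an assumption of this sketch);
    start_position is the 0-based index of the ATG in the input string.
    """
    orfs = []
    for frame in range(3):
        protein = translate(dna[frame:])
        start = None
        for idx, aa in enumerate(protein):
            if aa == "M" and start is None:
                start = idx                      # first Met opens a candidate ORF
            elif aa == "*" and start is not None:
                if idx - start >= min_codons:    # keep ORFs above the length cut-off
                    orfs.append((frame, frame + 3 * start, protein[start:idx]))
                start = None
    return orfs


if __name__ == "__main__":
    toy = "CCATGGCCAAATAGGC"  # made-up sequence, for illustration only
    print(translate("ATGGCCTAA"))
    print(find_orfs(toy))
```

A real analysis would also scan the reverse complement and, for eukaryotic genes, account for introns, which is where gene prediction programs such as those covered in chapter 10 come in.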
The tutorial presents a wide variety of tools and websites for many types of analysis: similarity search tools (BLAST, PSI-BLAST); protein family analysis through database searches (PROSITE, eMOTIF, BLOCKS, PRINTS, Pfam); multiple alignment tools (Clustal, DIALIGN, T-COFFEE, CINEMA, Jalview); physicochemical parameter and profile prediction (ProtParam and ProtScale); transmembrane helix prediction (MEMSAT, TMpred); secondary structure prediction (Jpred, NNPREDICT); 3D prediction, comparison and visualisation (RasMol, QuickPDB, Cn3D); homology modelling (SWISS-MODEL, Geno3D); fold recognition (GenTHREADER, 3D-PSSM); phylogenetic analysis (PHYLIP); SRS (sequence retrieval), etc.

Figure 7. The Human Genome case study chapter proposes a complex analysis that applies advanced bioinformatics tools to a concrete research problem. Starting from a genomic fragment of human chromosome 6, students are invited to find potential genes in the fragment with the GeneMark and GENSCAN software. They can then compare the results and assess their reliability using GeneQuiz, an integrated system for large-scale biological sequence analysis, and the current database annotation from the Human Genome Project in Ensembl.

Figure 6. In the Sickle cell haemoglobin case study chapter, users compare sickle cell and normal globin sequences to reveal the nature of the sickle cell mutation. The exercise integrates several database searches and multiple tools (SRS, CLUSTALW, Restriction map), as well as an advanced RasMol session that uses script files to visualise the mutant haemoglobin and the interaction between the mutant chains and further amino acid side chains in the vicinity of the mutated Val6 residue. In this representation, the two central mutant chains are highlighted as white and orange wireframes. Also highlighted are the side chains of the central Val6 mutation and the porphyrin prosthetic group (as CPK-coloured space-filling models).
Both the porphyrin prosthetic groups (blue) and the mutant Val6 residues (red) are represented as space-filling models. Highlighted in yellow are the side chains in the vicinity of Val6 at the interface of the two haemoglobin molecules.

Viorica Ghita*, Valérie Ledent*, Robert Herzog*, Terri Attwood#, Ioannis Selimas#, Marc Brugman$

* Belgian EMBnet Node (BEN), Laboratoire de Bioinformatique, Université Libre de Bruxelles, Campus de la Plaine, Bât. NO, Bd du Triomphe, 1050 Bruxelles.
# UMBER, the University of Manchester Specialist Node of EMBnet, School of Biological Sciences, Oxford Road, M13 9PL, Manchester.
$ University of Amsterdam, Mauritskade 61, 1092 AD Amsterdam, The Netherlands.

Figure 2. The tutorial presents the most important tools for multiple sequence alignment, with detailed information on manual and automatic alignment tools, exercises, and links to various software and alignment databases (chapter 4).

Figure 3. Chapter 5 presents tools for computing physicochemical parameters (molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, hydropathy, chain flexibility, solvent-accessible surface area, etc.), software for predicting the transmembrane topology of proteins, and some secondary structure prediction software.

Figure 4. Chapter 6 (Fold classification) gives a detailed presentation of the Protein Data Bank, the principal repository of biological macromolecule structures, and of some structure classification resources (CATH, SCOP, EC->PDB). It also covers visualisation and comparison of protein 3D structures with various molecular structure viewers: RasMol, QuickPDB, Deep View and Cn3D.

Figure 5. Different protein structure viewers presented in the tutorial, displaying the ubiquitin-like signalling protein Nedd8 (PDB ID: 1NND): (A) Deep View, (B) RasMol, (C) QuickPDB and (D) Cn3D. (A) illustrates the classical ball-and-stick mode, (B) cartoon mode, (C) a wireframe α-carbon trace, with a small section of the structure highlighted in blue, and (D) a hybrid display with amino acid chains in cartoon mode and non-amino-acid atoms in space-filling mode.
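The sequence comparison at the heart of the sickle cell haemoglobin case study (Figure 6) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tutorial's own exercise: the two DNA fragments are short illustrative stretches covering the initiator ATG and the first codons of the beta-globin coding sequence, and should be replaced by entries retrieved from a database (for example via SRS); the helper names are hypothetical; and DdeI (recognition sequence CTNAG) is used here only as an example of an enzyme whose site is affected by the mutation, since the enzymes used in the tutorial's restriction mapping step are not named in this text.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative fragments (assumption): ATG plus the first nine codons.
NORMAL = "ATGGTGCATCTGACTCCTGAGGAGAAGTCT"
SICKLE = "ATGGTGCATCTGACTCCTGTGGAGAAGTCT"

# DdeI recognition sequence CTNAG written as a regular expression.
DDEI = "CT[ACGT]AG"


def translate(dna):
    """Translate a DNA string in frame 0 using the standard code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def differences(a, b):
    """List (1-based position, symbol in a, symbol in b) where a and b differ."""
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def count_sites(dna, pattern):
    """Count non-overlapping matches of a recognition pattern."""
    return len(re.findall(pattern, dna))


if __name__ == "__main__":
    # Drop the initiator Met so residues are numbered as in the mature chain.
    normal_protein = translate(NORMAL)[1:]
    sickle_protein = translate(SICKLE)[1:]
    print("DNA differences:    ", differences(NORMAL, SICKLE))
    print("Protein differences:", differences(normal_protein, sickle_protein))
    print("DdeI sites (normal, sickle):",
          count_sites(NORMAL, DDEI), count_sites(SICKLE, DDEI))
```

With these fragments the comparison reports a single base change and a single residue change at position 6 of the mature chain (Glu to Val), and the loss of the one DdeI site in the fragment, which is the kind of result the alignment and restriction mapping steps of the case study are meant to reveal.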
