CMSE 520
BIOMOLECULAR STRUCTURE, FUNCTION AND DYNAMICS
(Computational Structural Biology)
OUTLINE
Review: Molecular biology
Proteins: structure, conformation and function (5 lectures)
Generalized coordinates, phi, psi angles, DNA/RNA: structure and function (3 lectures)
Structural and functional databases (PDB, SCOP, CATH, Functional domain database, Gene Ontology)
Use scripting languages (e.g. Python) to cross-reference between these databases: starting from a sequence, find the function
Relationship between sequence, structure and function
Molecular modeling, homology modeling
Conservation, CONSURF
Relationship between function and dynamics
Conformational changes in proteins (structural changes due to ligation, hinge motions, allosteric changes in proteins and the consequent change in function)
Molecular Dynamics
Monte Carlo
Protein-protein interaction: recognition, structural matching, docking
PPI databases: DIP, BIND, MINT, etc.
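The cross-referencing item in the outline can be pictured with a small Python sketch. It is only an illustration under loud assumptions: the three dictionaries stand in for a sequence database, a structure database and a function annotation source, and every sequence and identifier in them (SEQ_A, STRUCT_1, FUNC_X, ...) is a made-up placeholder, not a real PDB, SCOP, CATH or Gene Ontology entry. A real script would fill such tables from downloaded database files.

```python
# Toy cross-reference: sequence -> entry id -> structures -> functions.
# All sequences and identifiers below are invented placeholders.

SEQUENCE_TO_ID = {
    "MKTAYIAK": "SEQ_A",
    "GSHMLEDP": "SEQ_B",
}
ID_TO_STRUCTURES = {
    "SEQ_A": ["STRUCT_1", "STRUCT_2"],
    "SEQ_B": ["STRUCT_3"],
}
STRUCTURE_TO_FUNCTIONS = {
    "STRUCT_1": ["FUNC_X"],
    "STRUCT_2": ["FUNC_X", "FUNC_Y"],
    "STRUCT_3": [],
}


def functions_for_sequence(sequence):
    """Follow the links sequence -> entry id -> structures -> functions."""
    entry_id = SEQUENCE_TO_ID.get(sequence)
    if entry_id is None:
        return []  # unknown sequence: nothing to cross-reference
    found = set()
    for structure in ID_TO_STRUCTURES.get(entry_id, []):
        found.update(STRUCTURE_TO_FUNCTIONS.get(structure, []))
    return sorted(found)  # sorted list without duplicates
```

Keeping each database as its own lookup table mirrors the way the real resources are linked: every step only needs the identifier produced by the previous one.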
References:
CURRENT PROTOCOLS IN BIOINFORMATICS (e-book) (http://www.mrw.interscience.wiley.com/cp/cpbi/articles/bi0101/frame.html)
Andreas D. Baxevanis, Daniel B. Davison, Roderic D.M. Page, Gregory A. Petsko, Lincoln D. Stein, and Gary D. Stormo (eds.), 2003, John Wiley & Sons, Inc.
INTRODUCTION TO PROTEIN STRUCTURE
Branden C. & Tooze J., 2nd ed., 1999, Garland Publishing
COMPUTER SIMULATION OF BIOMOLECULAR SYSTEMS
Van Gunsteren, Weiner, Wilkinson
Internet sources
Ref: Department of Energy
Human Genome Project
Two major goals:
  1. DNA mapping
  2. DNA sequencing
Rapid growth in experimental technologies
Weiss, S. (1999). Fluorescence spectroscopy of single molecules. Science 283, 1676-1683.
Microarray technologies – serial gene expression patterns and mutations
Time-resolved optical, rapid mixing techniques - folding & function mechanisms (ns)
Techniques for probing single molecule mechanics (AFM, STM) (pN)
More accurate models/data for computer-aided studies
Structural Biology/Molecular Biophysics
Most (all?) basic “life processes” are mediated by “machines” that represent the ultimate miniaturization achievable in a universe comprised of atoms and molecules.
The goal is to understand the underlying principles that govern the operation of these molecular machines.
What this course is about
overview of ways in which computers are used to solve problems in biology
supervised learning of illustrative or frequently-used algorithms and programs and databases
supervised learning of programming techniques and algorithms selected from these uses
Structure
What do the molecules look like?
How do we determine that experimentally?
Are there general structural principles?
How is this information organized?
How do structural generalizations relate to simple physical/chemical principles?
Dynamics
Time is of the essence in biological processes. How do we understand time-dependent processes at the molecular level?
How do we do this experimentally?
How do we do this computationally?
Promising Future for Computational Biology
Exponential growth in data
Sequence and structure data from experiments
Computational technology
12,665 structures as of July 11, 2000
22,810 structures as of October 7, 2003
35,026 structures as of February 7, 2006
Rost, B. (1998). Marrying structure and genomics. Structure 6, 259-263.
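The claim of exponential growth can be checked roughly against the three structure counts quoted above. The sketch below assumes pure exponential growth between each pair of dates and computes the implied doubling time; the function name and the use of a 365.25-day year are choices made here for illustration, not part of the original slides.

```python
from datetime import date
from math import log

# Structure counts and dates as quoted on the slide.
COUNTS = [
    (date(2000, 7, 11), 12665),
    (date(2003, 10, 7), 22810),
    (date(2006, 2, 7), 35026),
]


def doubling_time_years(start, end):
    """Doubling time implied by two (date, count) points, assuming exponential growth."""
    (d0, n0), (d1, n1) = start, end
    years = (d1 - d0).days / 365.25
    rate = log(n1 / n0) / years  # continuous growth rate per year
    return log(2) / rate


early = doubling_time_years(COUNTS[0], COUNTS[1])
late = doubling_time_years(COUNTS[1], COUNTS[2])
print(f"2000-2003: {early:.1f} years, 2003-2006: {late:.1f} years")
```

Similar doubling times for the two intervals would be consistent with steady exponential growth; a clearly shorter second value would mean the growth itself is speeding up.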
Large databases
Archival databanks of biological information:
  Protein, DNA sequence databases
  Protein structure and nucleic acid databases
  Protein expression patterns
Derived databanks:
  Sequence motifs
  Mutations and variations in proteins
  Classifications and/or relationships
Databanks of web sites:
  Databanks of databanks containing biological information
  Links between databanks
Experimental Techniques
BIOINFORMATICS (definition)
Definition by Luscombe et al., Yale, Dept. of Molecular Biophysics and Biochemistry, 2001
“Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical chemistry) and then applying ‘informatics’ techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale”
COMPUTATIONAL BIOLOGY (definition)
Definition by NIH (working definition)
The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.
Information flow
A major task in computational molecular biology is to “decipher” information contained in biological sequences
Since the nucleotide sequence of a genome contains all information necessary to produce a functional organism, we should in theory be able to duplicate this decoding using computers
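As a small illustration of what "decoding" a sequence can mean, the sketch below transcribes a DNA coding strand to mRNA and translates it with the standard genetic code. It is a minimal example, not a gene-finding method: the function names are chosen here for illustration, the input is assumed to be an uppercase coding strand read in frame from its first base, and the short test sequences are made up.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons of the standard genetic code,
# listed in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Coding-strand DNA to mRNA: replace T with U."""
    return dna.replace("T", "U")


def translate(dna):
    """Translate in frame from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: end of the protein
        protein.append(amino_acid)
    return "".join(protein)


print(transcribe("ATGGCCTAA"), translate("ATGGCCTAA"))
```

Real decoding is much harder than this lookup suggests: finding where genes start, handling introns and predicting the folded structure are exactly the open problems the slide points to.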
http://www-fp.mcs.anl.gov/~gaasterland/sg-review-slides.html
Two major challenges after completion of the HGP:
Structural Genomics and Functional Genomics
Schematic representation of the universe of proteins in a given organism
Aim: “to construct the complete scheme of biological functions and cellular pathways for the entire organism”
Kim, S.H. (1998). Nature Struct. Biol. 5, 643-645
What's E-Cell Project?
E-Cell Project is an international research project aiming to model and reconstruct biological phenomena in silico, and developing necessary theoretical supports, technologies and software platforms to allow precise whole cell simulation.
Metabolism model of the model cell constructed with 127 genes
PROTEOMICS
Covers the following areas (but not limited to):
Protein structure
  Primary Structure: sequence of amino acids
  Secondary Structure: local spatial arrangement
  Tertiary Structure: three dimensional native conformation
Protein Function: related to 3-D shape of the protein
Protein clusters according to a specified characteristic
Protein-Protein Interaction: interaction among a number of proteins
Protein-DNA Interaction: interaction between one protein and the genome
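The local backbone arrangement behind secondary structure is usually described with torsion angles such as the phi and psi angles listed in the course outline. As a hedged sketch, the function below computes the torsion angle defined by four points in space. The function name and the test coordinates are illustrative assumptions, and picking the right backbone atoms from a real structure file (C, N, CA, C for phi; N, CA, C, N for psi) is left out.

```python
from math import atan2, degrees, sqrt


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees (between -180 and 180) for four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return degrees(atan2(y, x))


# Made-up coordinates: the last point is rotated a quarter turn about the central bond.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Using atan2 rather than acos keeps the sign of the angle, which matters because phi and psi values of opposite sign correspond to different backbone conformations.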