16 Genome


  • 8/2/2019 16 Genome

    1/80

    Genomics

  • 8/2/2019 16 Genome

    2/80

    The Human Genome Project

    Mapping and Sequencing the Genomes of Model Organisms

    Data Collection and Distribution

    Ethical, Legal, and Social Considerations

    Research Training

    Technology Development

    Technology Transfer

  • 8/2/2019 16 Genome

    3/80

    A Few Genome Resources

    NCBI Genome Resources: http://www.ncbi.nlm.nih.gov/Genomes/index.html

    UCSC Human Genome Browser: http://genome.ucsc.edu/

    Ensembl Human Genome Server: http://www.ensembl.org/
  • 8/2/2019 16 Genome

    4/80

    Genome Sequencing Progress

    NCBI Genome Sequence Repository: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Genome

    All organisms: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/allorg.html

    Eukaryotic genomes: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/euk.html

    Prokaryotic genomes: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/eub.html

    Archaea genomes: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/a.html

    Viruses: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/vis.html
  • 8/2/2019 16 Genome

    5/80

    Genome Sequencing

    From NCBI, 5/2001

  • 8/2/2019 16 Genome

    6/80

    Human Genome Sequencing 2/11/2001

    From NCBI

  • 8/2/2019 16 Genome

    7/80

    Human Genome Progress 2/11/2001

                  Total sequence (kb)   Non-redundant sequence (kb)   Percentage of genome
    Finished      1,140,365             1,040,372                     32.50%
    Unfinished    3,547,899             1,951,344                     61.00%
    Total         4,688,264             2,991,716                     93.50%

    From NCBI
  • 8/2/2019 16 Genome

    8/80

    Microbial Genomes

    Published complete microbial genomes: http://www.tigr.org/tdb/mdb/mdbcomplete.html

    Microbial genomes and chromosomes in progress: http://www.tigr.org/tdb/mdb/mdbinprogress.html
  • 8/2/2019 16 Genome

    9/80

    Genome Informatics

    Annotation and Analysis

    Data Handling

    Metabolic Reconstruction

    Comparative Genomics

    Functional Genomics

  • 8/2/2019 16 Genome

    10/80

    Genome Project Organization

    Cloning

    Mapping

    Sequencing

    Annotation

    Analysis

  • 8/2/2019 16 Genome

    11/80

    Cloning and Mapping

  • 8/2/2019 16 Genome

    12/80

    Cloning

    Large: YACs (1 Mb); BACs (100 - 200 Kb)

    Intermediate: Cosmids; Lambda clones

    Small: Plasmids; M13
  • 8/2/2019 16 Genome

    13/80

    Mapping

    Establishment of Guideposts

    Aids in Assembly

    Error Checking

    Useful in mapping of genetic disorders

  • 8/2/2019 16 Genome

    14/80

    Genetic Maps

    Cytogenetic markers

    Linkage maps

    Polymorphic loci screened by PCR to determine inheritance patterns

    Produce linkage map with nearby loci
  • 8/2/2019 16 Genome

    15/80

    Physical Maps

    Radiation Hybrid/YACs/Cosmids

    Restriction Sites

    Sequence Tagged Sites (STSs): 100 Kb resolution needed (30,000 STSs)

    Expressed Sequence Tags

    Detection: PCR; Hybridization; FISH (Fluorescent in situ Hybridization)
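
    The relation between map resolution and marker count can be checked with a short calculation. This is only a sketch: the function name is made up for illustration, and the 3,000,000,000 bp genome size is an assumed round figure, not a value taken from the slides.

```python
def markers_needed(genome_size_bp, resolution_bp):
    """Number of evenly spaced markers needed for one marker per resolution_bp."""
    # Ceiling division, so a partial final interval still gets a marker.
    return -(-genome_size_bp // resolution_bp)


# Assumption: a round 3 Gb human genome (not stated on the slide).
ASSUMED_GENOME_SIZE_BP = 3_000_000_000

print(markers_needed(ASSUMED_GENOME_SIZE_BP, 100_000))
```

    Under that assumption, one marker every 100 Kb comes to 30,000 markers, which agrees with the STS count on the slide. Real markers are not evenly spaced, so this is a lower bound rather than a design target.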
  • 8/2/2019 16 Genome

    16/80

    Human Genome STS Mapping Strategy

    STS Content Mapping: screen YACs by PCR

    Radiation Hybrid Mapping: screen RH cell lines by PCR

    Genetic Mapping: PCR screening of polymorphic loci

    Combine the above to produce an integrated map
  • 8/2/2019 16 Genome

    17/80

    Mapping Resolution

    YAC mapping: 1 Mb

    Radiation hybrid mapping: 10 Mb

    Genetic map: 30 Mb
  • 8/2/2019 16 Genome

    18/80

    GeneMap98

    Integrated Human Genetic Map

    Over 30,000 unique gene-based markers

    100 Kb resolution

    http://www.ncbi.nlm.nih.gov/genemap98/
  • 8/2/2019 16 Genome

    19/80

    Map Integration

    http://www.ncbi.nlm.nih.gov/genemap98/
  • 8/2/2019 16 Genome

    20/80

    Human Chromosome 1 Genetic Map

    http://www.ncbi.nlm.nih.gov/genemap98/
  • 8/2/2019 16 Genome

    21/80

    Human Chromosome 1 Combination Map

    http://www.ncbi.nlm.nih.gov/genemap98/
  • 8/2/2019 16 Genome

    22/80

    Sequencing

  • 8/2/2019 16 Genome

    23/80

    Sequencing Methods

    Random Shotgun

    Ordered Shotgun

    Directed Primer Walking

    Direct genomic sequencing

  • 8/2/2019 16 Genome

    24/80

    Random Shotgun Sequencing

    Randomly shear or cut DNA into small pieces (2-4 Kb)

    Clone into M13, pUC, or some other sequencing vector

    Sequence the clones from both ends

    Rely on the computer to assemble the sequences into one (or as few as possible) contigs
  • 8/2/2019 16 Genome

    25/80

    Shotgun Sequencing Statistics

    Lander and Waterman equation (Poisson distribution)

    $P_0 = e^{-m}$: probability that a base is not sequenced, where $m$ = sequence coverage
  • 8/2/2019 16 Genome

    26/80

    H. influenzae Sequencing

    For 1X random sequence coverage = 1.8 Mb: $P_0 = e^{-1} = 0.37$ (63% of the bases are sequenced)

    To get > 99% of the bases sequenced: 5X coverage = 8.74 Mb of sequence, $P_0 = e^{-5} = 0.0067$

    This coverage would leave approx. 128 gaps of about 100 bp in size

    From Science 269:496-512. 1995
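
    The coverage figures above follow from the Poisson estimate $P_0 = e^{-m}$. The sketch below evaluates it for a few coverage levels. The function names are illustrative, and the read count of 19,000 used for the gap estimate is an assumption chosen for the example, not a number from the slides.

```python
import math


def fraction_unsequenced(coverage):
    """P0 = e^(-m): probability that a given base is covered by no read."""
    return math.exp(-coverage)


def coverage_needed(fraction_sequenced):
    """Coverage m at which the expected sequenced fraction reaches the target."""
    return -math.log(1.0 - fraction_sequenced)


def expected_gaps(n_reads, coverage):
    """Lander-Waterman style estimate: about n_reads * e^(-m) gaps remain."""
    return n_reads * math.exp(-coverage)


for m in (1, 5):
    p0 = fraction_unsequenced(m)
    print(f"{m}X coverage: P0 = {p0:.4f}, sequenced = {100 * (1 - p0):.1f}%")

print(f"coverage for 99% of bases: {coverage_needed(0.99):.2f}X")

# Assumption: 19,000 reads is a hypothetical count used only for illustration.
print(f"expected gaps at 5X: {expected_gaps(19_000, 5):.0f}")
```

    Solving for 99% gives roughly 4.6X, which is why 5X is the working figure. The estimate assumes reads land uniformly at random, so repeats and cloning bias will leave more gaps in practice.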
  • 8/2/2019 16 Genome

    27/80

    Ordered Sequencing

    Generate a set of large sequence clones in lambda phage

    May be subcloned from YACs or BACs as necessary

    End sequence the lambda clones and order the clones to produce a map of the genome

    Choose a minimal tiling path of the genome from the ordered lambda clones
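
    Choosing a minimal tiling path from ordered clones can be treated as an interval-cover problem. The sketch below uses a simple greedy rule, which may differ from what any particular genome project used. It assumes each clone has already been placed on the map as a (name, start, end) tuple; the clone names and coordinates are hypothetical.

```python
def minimal_tiling_path(clones, genome_length):
    """Greedy interval cover.

    clones: list of (name, start, end) with 0-based, half-open coordinates.
    Returns the fewest clones that cover [0, genome_length).
    Raises ValueError if the clones leave a gap.
    """
    ordered = sorted(clones, key=lambda clone: clone[1])
    path = []
    covered = 0  # everything in [0, covered) is already tiled
    i = 0
    while covered < genome_length:
        best = None
        # Among clones starting inside the covered region, keep the one
        # that reaches furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"no clone extends past position {covered}")
        path.append(best)
        covered = best[2]
    return path


# Hypothetical clone placements, for illustration only.
example_clones = [
    ("A", 0, 40), ("B", 10, 30), ("C", 35, 80), ("D", 38, 90), ("E", 85, 120),
]
print([name for name, _, _ in minimal_tiling_path(example_clones, 120)])
```

    This version accepts clones that merely touch end to end. A real tiling path would require a minimum overlap between neighbours so the finished inserts can be joined.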
  • 8/2/2019 16 Genome

    28/80

    Ordered Sequencing...

    Shear and subclone the lambda inserts that comprise the minimal tiling set into sequencing vectors

    Shotgun sequence and assemble each of these lambda inserts individually

    Assemble all sequences into one, contiguous genome
  • 8/2/2019 16 Genome

    29/80

    Directed Sequencing

    Process used for finishing following the shotgun sequencing phase

    Gap closure

    Use specific sequencing primers to extend appropriate clones into gap regions

    Use specific sequencing primers to sequence directly from genomic DNA
  • 8/2/2019 16 Genome

    30/80

    Sequence Assembly

  • 8/2/2019 16 Genome

    31/80

    Assembly of Shotgun Fragments

    For H. influenzae (TIGR), 1.8 Mb

    24,304 sequence fragments were generated for the random assembly phase (11,631,485 bases)

    Generated 140 contigs

    Assembled using the TIGR Assembler: 30 hours of CPU time
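
    The idea of assembling shotgun fragments into contigs can be shown with a toy greedy overlap merger. This is not the TIGR Assembler or phrap algorithm; it is a minimal sketch that assumes error-free reads from one strand, and the reads are made up for the example.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)


# Made-up reads drawn from the toy sequence ATGCGTACGTTAGC.
toy_reads = ["ATGCGTAC", "GTACGTTA", "CGTTAGC"]
print(greedy_assemble(toy_reads))
```

    The all-against-all comparison is quadratic in the number of reads. Production assemblers also have to handle sequencing errors, both strands, and repeats, none of which this sketch attempts.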
  • 8/2/2019 16 Genome

    32/80

    phred/phrap/consed

    Widely used programs for sequence:

    base calling (phred)

    assembly (phrap)

    editing (consed)

    Developed at the University of Washington

    Phil Green (phrap), Brent Ewing (phred), David Gordon (consed)
  • 8/2/2019 16 Genome

    33/80

    Genome Annotation and Analysis

    Pattern Matching

  • 8/2/2019 16 Genome

    34/80

    Sequence Annotation

    ORF identification

    Frameshift resolution

    Genome map construction

    Functional assignments

    Metabolic pathway assignment

    Metabolic pathway Reconstruction

    Comparative analysis

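
    ORF identification, the first step in the list above, can be sketched as a six-frame scan for ATG...stop stretches. This is a minimal illustration rather than the method used by any tool named in these slides. The stop-codon set is a parameter because mycoplasmas and ureaplasmas read TGA as tryptophan (translation table 4), so only TAA and TAG terminate there. The example sequences are made up.

```python
STANDARD_STOPS = frozenset({"TAA", "TAG", "TGA"})
TABLE4_STOPS = frozenset({"TAA", "TAG"})  # TGA codes for Trp in mycoplasmas
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3, stop_codons=STANDARD_STOPS):
    """Return (strand, frame, start, end) for each ATG...stop ORF.

    Coordinates are 0-based, half-open, and relative to the scanned strand.
    min_codons counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first ATG opens the ORF
                elif start is not None and codon in stop_codons:
                    if (pos - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, pos + 3))
                    start = None
    return orfs


# Made-up sequences, for illustration only.
print(find_orfs("CCATGAAACCCGGGTTTTAGCC"))
print(find_orfs("ATGTGAAAATAA", stop_codons=TABLE4_STOPS))
```

    Real gene finders also consider alternative start codons, codon usage, and overlapping ORFs. A scan like this only yields candidates for the later annotation steps.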
  • 8/2/2019 16 Genome

    35/80

  • 8/2/2019 16 Genome

    36/80

    Annotation Tools

    Semi-automated

    Manual

  • 8/2/2019 16 Genome

    37/80

    MAGPIE

    Multipurpose Automated Genome Project Investigation Environment

    Terry Gaasterland et al. http://genomes.rockefeller.edu/magpie/magpie.html

    Automated / semi-automated analysis tool for microbial genome projects
  • 8/2/2019 16 Genome

    38/80

    MAGPIE Example

  • 8/2/2019 16 Genome

    39/80

    Non-Automated Analysis and Prediction

    The Ureaplasma urealyticum genome database

    Run analysis tool

    Parse results

    Dump results into the database

    View results

    Manually annotate
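
    The run, parse, store, and view steps above can be sketched with an in-memory SQLite database. Everything specific here is hypothetical: the tab-delimited output format, the gene identifiers, the descriptions, the table layout, and the E-value cutoff are invented for illustration and do not describe the actual Ureaplasma database.

```python
import sqlite3

# Stand-in for the output of an analysis tool (hypothetical format:
# gene id, best-hit description, E-value, separated by tabs).
RAW_OUTPUT = """\
gene_0001\texample protein A\t1e-120
gene_0002\texample protein B\t3e-85
gene_0003\thypothetical protein\t0.002
"""


def parse_hits(text):
    """Parse tab-delimited tool output into (gene_id, description, evalue)."""
    hits = []
    for line in text.splitlines():
        if line.strip():
            gene_id, description, evalue = line.split("\t")
            hits.append((gene_id, description, float(evalue)))
    return hits


def store_hits(conn, hits):
    """Dump parsed results into the database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hit "
        "(gene_id TEXT, description TEXT, evalue REAL)"
    )
    conn.executemany("INSERT INTO hit VALUES (?, ?, ?)", hits)
    conn.commit()


def confident_hits(conn, max_evalue=1e-10):
    """View step: hits strong enough to show an annotator first."""
    cursor = conn.execute(
        "SELECT gene_id, description FROM hit "
        "WHERE evalue <= ? ORDER BY gene_id",
        (max_evalue,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
store_hits(conn, parse_hits(RAW_OUTPUT))
print(confident_hits(conn))
```

    Keeping the parsed results in a database separates the automated steps from the manual annotation step, which can then work from queries like the one above.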

  • 8/2/2019 16 Genome

    40/80

    Genomic Sequence Database

    Data Storage: Sequence; Gene Map; Annotation

    User Interface: Web browser; Customizable

  • 8/2/2019 16 Genome

    41/80

    The Ureaplasma urealyticum Genome Project

    Uu - 751,719 bp

    http://genome.microbio.uab.edu/uu/uugen.htm

    Web-based genome analysis tool
  • 8/2/2019 16 Genome

    42/80

  • 8/2/2019 16 Genome

    43/80

  • 8/2/2019 16 Genome

    44/80

  • 8/2/2019 16 Genome

    45/80

  • 8/2/2019 16 Genome

    46/80

    Annotation Problems

    Problems with existing sequence databases

    Incomplete datasets

    Skewed datasets

    Incorrectly annotated records

    Annotations based on experimental vs. predicted data

    Nomenclature differences

    Transitive errors in gene function predictions

    Functional predictions for hypothetical genes

  • 8/2/2019 16 Genome

    47/80

    Metabolic Pathway Reconstruction

  • 8/2/2019 16 Genome

    48/80

    Metabolic Pathway Reconstruction

    Role assignment

    Extract metabolic pathways from genomes

    Navigation and analysis

    Pathway editing

  • 8/2/2019 16 Genome

    49/80

    Metabolic Assignments

    Amino acid Biosynthesis

    Biosynthesis of cofactors, prosthetic groups, and carriers

    Cell envelope

    Cellular processes

    Central intermediary metabolism

    Energy metabolism

    Fatty acid and phospholipid metabolism

    Purines, pyrimidines, nucleosides, and nucleotides

    Regulatory functions

    Replication

    Transcription

    Translation

    Transport and binding proteins

    Other categories

    Unassigned

    Hypothetical

  • 8/2/2019 16 Genome

    50/80

    Ureaplasma urealyticum Gene Map

    [Figure: gene map with position labels running from 1 to 751,719 in 50,000 bp intervals. Legend categories: Cofactor Biosynthesis, Cell envelope, Cellular processes, Central Intermediary Metabolism, Energy Metabolism, Fatty Acid Metabolism, Hypothetical, Nucleotide Metabolism, Replication, Transcription, Translation, Transport, RNA, tRNA, Other.]

  • 8/2/2019 16 Genome

    51/80

    Role                               Uu Genes                  Mg Genes
                                       #     Percent of Total    #     Percent of Total
    Amino acid Biosynthesis            1     0.2%                0     0.0%
    Biosynthesis of cofactors          10    1.7%                7     1.5%
    Cell envelope                      19    3.1%                26    5.4%
    Cellular processes                 13    2.1%                15    3.1%
    Central intermediary metabolism    15    2.5%                7     1.5%
    Energy metabolism                  23    3.8%                30    6.3%
    Fatty acid - phospholipids         6     1.0%                7     1.5%
    Hypothetical                       293   48.3%               169   35.3%
    Other categories                   1     0.2%                3     0.6%
    Purines, pyrimidines               18    3.0%                20    4.2%
    Regulatory functions               4     0.7%                4     0.8%
    Replication                        45    7.4%                31    6.5%
    Transcription                      17    2.8%                19    4.0%
    Translation                        100   16.5%               99    20.7%
    Transport and binding proteins     37    6.1%                35    7.3%
    Unassigned                         4     0.7%                7     1.5%
    Total                              606   100.0%              479   100.0%

  • 8/2/2019 16 Genome

    52/80

    EcoCyc

    Peter D. Karp, PhD

    SRI International, Menlo Park, CA

    http://ecocyc.pangeasystems.com/ecocyc/ecocyc.html
  • 8/2/2019 16 Genome

    53/80

    Pathway Reconstruction

    Genomic Maps

    Genes

    Gene Products

    Reactions (Compounds)

    Pathways

    Metabolic Network

    Annotated Genome

    List of Genes/ORFs

    List of Gene Products

    DNA Sequence

    Cell

    Adapted from P. Karp, Pangea Systems

  • 8/2/2019 16 Genome

    54/80

  • 8/2/2019 16 Genome

    55/80

  • 8/2/2019 16 Genome

    56/80

  • 8/2/2019 16 Genome

    57/80

  • 8/2/2019 16 Genome

    58/80

  • 8/2/2019 16 Genome

    59/80

  • 8/2/2019 16 Genome

    60/80

  • 8/2/2019 16 Genome

    61/80

    Glycolysis in Uu?

    [Figure: glycolysis pathway diagram. Metabolites shown: glucose-1-phosphate, glucose-6-phosphate, fructose-6-phosphate, fructose-1,6-bisphosphate, glyceraldehyde-3-phosphate, 3-phospho-D-glyceroyl-phosphate, 3-phosphoglycerate, pyruvate. Enzymes shown: phosphoglucomutase, phosphoglucose isomerase, 6-phosphofructokinase, fructose bisphosphate aldolase, glyceraldehyde-3-phosphate dehydrogenase (1.2.1.12), glyceraldehyde-3-phosphate dehydrogenase (1.2.1.9), phosphoglycerate kinase. One step is marked "?".]
  • 8/2/2019 16 Genome

    62/80

    Uu Energy Metabolism

    Glycolysis: missing several components

    Pentose-phosphate pathway: only 2/8 enzyme complexes present

    Proton motive force - ATP synthase complex

    Urease Gene Complex: biologically relevant
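
    Checking a pathway for missing components amounts to comparing the enzymes the pathway requires against the enzymes annotated in the genome. The sketch below does this with EC numbers. EC 1.2.1.12 appears on the glycolysis slide; the other EC numbers are the standard assignments for the named enzymes and were added here. The annotated set is hypothetical and does not represent the actual Uu gene content.

```python
# Glycolytic steps named on the slide, keyed by EC number.
GLYCOLYSIS_STEPS = {
    "5.4.2.2": "phosphoglucomutase",
    "5.3.1.9": "phosphoglucose isomerase",
    "2.7.1.11": "6-phosphofructokinase",
    "4.1.2.13": "fructose bisphosphate aldolase",
    "1.2.1.12": "glyceraldehyde-3-phosphate dehydrogenase",
    "2.7.2.3": "phosphoglycerate kinase",
}


def pathway_report(pathway_steps, annotated_ec_numbers):
    """Split pathway steps into those found in the annotation and those missing."""
    present = {ec: name for ec, name in pathway_steps.items()
               if ec in annotated_ec_numbers}
    missing = {ec: name for ec, name in pathway_steps.items()
               if ec not in annotated_ec_numbers}
    return present, missing


# Hypothetical annotation, for illustration only.
example_annotation = {"5.3.1.9", "4.1.2.13", "2.7.2.3", "1.2.1.9"}

present, missing = pathway_report(GLYCOLYSIS_STEPS, example_annotation)
print(f"{len(present)}/{len(GLYCOLYSIS_STEPS)} steps present")
for ec, name in sorted(missing.items()):
    print(f"missing: {name} (EC {ec})")
```

    A missing step does not prove the pathway is absent. The gene may be unrecognised among the hypothetical proteins, or a different enzyme, such as the 1.2.1.9 dehydrogenase shown on the slide, may bypass the step.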
  • 8/2/2019 16 Genome

    63/80

    Comparative Genomics

    What makes one organism different from all other organisms?

    Molecular Biology

    Physiology

    Pathogenesis

    Epidemiology

    Genetics
  • 8/2/2019 16 Genome

    64/80

    Ortholog Comparisons

    Uu to Mg genes: 324 (53% of Uu; 67% of Mg); 71 hypothetical

    Mh to Mg genes: 314 (41% of Mh; 57% of Mg); 55 hypothetical (2 unique hypothetical)

    Mh to Uu genes: 330 (47% of Uu; 43% of Mh); 82 hypothetical (19 unique hypothetical)
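
    The slide does not say how orthologs were assigned. One widely used heuristic is reciprocal best hits, and the sketch below assumes that approach purely for illustration. The best-hit tables and gene totals are hypothetical toy values, not the Uu, Mg, or Mh data.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    return sorted(
        (a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a
    )


def percent_shared(n_shared, n_genes):
    """Percentage of a genome's genes that have an ortholog in the other genome."""
    return 100.0 * n_shared / n_genes


# Hypothetical best hits, as might be parsed from a similarity search.
best_a_to_b = {"a1": "b1", "a2": "b2", "a3": "b2", "a4": "b9"}
best_b_to_a = {"b1": "a1", "b2": "a2", "b3": "a4"}
genes_in_a, genes_in_b = 5, 4  # hypothetical genome gene counts

pairs = reciprocal_best_hits(best_a_to_b, best_b_to_a)
pct_a = percent_shared(len(pairs), genes_in_a)
pct_b = percent_shared(len(pairs), genes_in_b)
print(f"{len(pairs)} orthologs: {pct_a:.0f}% of A; {pct_b:.0f}% of B")
```

    The same shared count gives a different percentage for each genome because the denominators differ, which is why each comparison on the slide reports two percentages.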

  • 8/2/2019 16 Genome

    65/80

    M. genitalium - M. pneumoniae Gene Order

    [Figure: M. genitalium gene position (y-axis, 0 to 500,000) plotted against M. pneumoniae gene position (x-axis, 0 to 800,000)]
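
    A gene order plot of this kind places one point per ortholog pair, with the gene's position in one genome on the x-axis and its position in the other on the y-axis. The following is a minimal sketch under assumed inputs: the gene names and coordinates are hypothetical, and the collinearity score is a simple illustrative measure rather than the method behind the figure.

```python
def gene_order_points(orthologs, pos_x, pos_y):
    """Return (x, y) start coordinates for each ortholog pair, sorted by x."""
    return sorted(
        (pos_x[gx], pos_y[gy])
        for gx, gy in orthologs
        if gx in pos_x and gy in pos_y
    )


def fraction_collinear(points):
    """Fraction of neighbouring points (ordered by x) whose y also increases."""
    if len(points) < 2:
        return 0.0
    rising = sum(1 for (_, y0), (_, y1) in zip(points, points[1:]) if y1 > y0)
    return rising / (len(points) - 1)


# Hypothetical gene start positions in two genomes.
pos_x = {"x1": 1000, "x2": 50000, "x3": 120000, "x4": 300000}
pos_y = {"y1": 2000, "y2": 40000, "y3": 30000, "y4": 90000}
orthologs = [("x1", "y1"), ("x2", "y2"), ("x3", "y3"), ("x4", "y4")]

points = gene_order_points(orthologs, pos_x, pos_y)
collinear = fraction_collinear(points)
print(points)
print(f"collinear steps: {collinear:.2f}")
```

    Points that fall along a diagonal indicate conserved gene order. This simple score ignores inversions and the circular nature of bacterial chromosomes, both of which a real analysis would need to handle.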

  • 8/2/2019 16 Genome

    66/80

    M. genitalium - U. urealyticum Gene Order

    [Figure: M. genitalium gene position (y-axis, 0 to 500,000) plotted against U. urealyticum gene position (x-axis, 0 to 700,000)]

  • 8/2/2019 16 Genome

    67/80

    Paralog Analysis

    Identification of conserved, paralogous groups

    All-against-all comparison of genes within one organism

    Identifies groups of related genes: primary sequence, structure, function
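
    The all-against-all idea can be sketched as follows: compare every pair of proteins from one genome, link the pairs that score above a threshold, and report the connected groups. This sketch makes several assumptions that are not from the slides: shared 3-mer content (Jaccard index) stands in for a real alignment score, the 0.4 threshold is arbitrary, and the sequences are invented.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Shared fraction of two k-mer sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def paralog_clusters(proteins, threshold=0.4, k=3):
    """Single-linkage clusters of proteins whose similarity meets the threshold."""
    parent = {name: name for name in proteins}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    profiles = {name: kmers(seq, k) for name, seq in proteins.items()}
    for a, b in combinations(proteins, 2):  # all against all
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in proteins:
        groups.setdefault(find(name), []).append(name)
    multi = (sorted(members) for members in groups.values() if len(members) > 1)
    return sorted(multi, key=lambda members: (-len(members), members))


# Invented toy sequences: p1 to p3 are near-identical, p4 is unrelated.
proteins = {
    "p1": "MKTAYIAKQR",
    "p2": "MKTAYIAKQK",
    "p3": "MKTAYLAKQR",
    "p4": "GGSWWPLLNN",
}
clusters = paralog_clusters(proteins)
print(clusters)
```

    Single linkage means p2 and p3 end up in the same group through p1 even though they are less similar to each other directly. Stricter clustering rules would split such groups.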

  • 8/2/2019 16 Genome

    68/80

    Uu Paralogous Clusters >3

    4   tRNA synthetase
    4   Translation factors
    4   Hypothetical membrane lipoprotein
    5   ATP synthase alpha, beta chains
    6   MBA
    7   Hypothetical membrane lipoprotein
    8   Hypothetical
    10  Iron transporters
    13  Transporters

  • 8/2/2019 16 Genome

    69/80

    Functional Genomics

    Gene Expression

    Gene Regulation

    Genome-wide Mutagenesis

  • 8/2/2019 16 Genome

    70/80

    Expression Arrays

    Cell growth in different environments

    Isolate cDNAs

    Measure expression using array technology

    Create database of expression information

    Display information in an easy-to-use format

    Show ratio of expression under different conditions
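
    The ratio of expression between two conditions is commonly reported on a log2 scale, so that a doubling and a halving are symmetric around zero. Below is a minimal sketch with invented gene names and intensity values. The pseudocount is an assumption added here to avoid dividing by zero; it is not something the slides specify.

```python
import math


def log2_ratios(condition_a, condition_b, pseudocount=1.0):
    """log2(a / b) for every gene measured in both conditions."""
    shared = sorted(condition_a.keys() & condition_b.keys())
    return {
        gene: math.log2(
            (condition_a[gene] + pseudocount) / (condition_b[gene] + pseudocount)
        )
        for gene in shared
    }


# Invented array intensities for two growth conditions.
condition_a = {"geneA": 799.0, "geneB": 99.0, "geneC": 49.0}
condition_b = {"geneA": 199.0, "geneB": 99.0, "geneC": 199.0}

ratios = log2_ratios(condition_a, condition_b)
for gene, value in ratios.items():
    print(f"{gene}\t{value:+.2f}")
```

    Positive values mean higher expression in the first condition and negative values mean higher expression in the second.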

  • 8/2/2019 16 Genome

    71/80

    Putting it all together

  • 8/2/2019 16 Genome

    72/80

    From F. Blattner, U. Wisc.

  • 8/2/2019 16 Genome

    73/80

    Chromosome Views

    Ensembl view

    UC Santa Cruz view

    NCBI View

  • 8/2/2019 16 Genome

    74/80

  • 8/2/2019 16 Genome

    75/80

  • 8/2/2019 16 Genome

    76/80

  • 8/2/2019 16 Genome

    77/80

    A Final Caveat

    The difficulty of identifying genes in

    anonymous vertebrate sequences

    Claverie JM, Poirot O, Lopez F

    Comput Chem 1997;21(4):203-14

    The identification of genes in newly determined vertebrate genomic

  • 8/2/2019 16 Genome

    78/80

    sequences can range from a trivial to an impossible task. In a

    statistical preamble, we show how "insignificant" are the individual

    features on which gene identification can be rigorously based:

    promoter signals, splice sites, open reading frames, etc. The practical identification of genes is thus ultimately a tributary of their

    resemblance to those already present in sequence databases, or

    incorporated into training sets. The inherent conservatism of the

    currently popular methods (database similarity search, GRAIL) will greatly limit our capacity for making unexpected biological

    discoveries from increasingly abundant genomic data. Beyond a very

    limited subset of trivial cases, the automated interpretation (i.e.

    without experimental validation) of genomic data, is still a myth. On

    the other hand, characterizing the 60,000 to 100,000 genes thought to

    be hidden in the human genome by the mean of individual

    experiments is not feasible. Thus, it appears that our only hope of

    turning genome data into genome information must rely on drastic

    progresses in the way we identify and analyze genes in silico.

  • 8/2/2019 16 Genome

    79/80

    Only One Final Word of Wisdom...

    ...although the computer is a wonderful

    helpmate for the sequence searcher and

    comparer, biochemists and molecular biologists must guard against the blind

    acceptance of any algorithmic output;

    given the choice, think like a biologist and

    not a statistician. - Russell F. Doolittle, 1990

  • 8/2/2019 16 Genome

    80/80