inamul-hasanmadar
8/2/2019 16 Genome
Genomics
The Human Genome Project
Mapping and Sequencing the Genomes of Model Organisms
Data Collection and Distribution
Ethical, Legal, and Social Considerations
Research Training
Technology Development
Technology Transfer
A Few Genome Resources
NCBI Genome Resources: http://www.ncbi.nlm.nih.gov/Genomes/index.html
UCSC Human Genome Browser: http://genome.ucsc.edu/
Ensembl Human Genome Server: http://www.ensembl.org/
Genome Sequencing Progress
NCBI Genome Sequence Repository: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Genome
All organisms: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/allorg.html
Eukaryotic genomes: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/euk.html
Prokaryotic genomes: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/eub.html
Archaea genomes: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/a.html
Viruses: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/vis.html
Genome Sequencing
From NCBI, 5/2001
Human Genome Sequencing 2/11/2001
From NCBI
Human Genome Progress 2/11/2001
              Total sequence (kb)   Non-redundant sequence (kb)   Percentage of genome
Finished      1,140,365             1,040,372                     32.50%
Unfinished    3,547,899             1,951,344                     61.00%
Total         4,688,264             2,991,716                     93.50%
From NCBI
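The non-redundant column can be cross-checked with a few lines of arithmetic. The sketch below copies the figures from the 2/11/2001 progress table; the variable names are illustrative, and the implied genome size is an inference from the percentage column, not a number given on the slide.

```python
# Non-redundant sequence in kb, copied from the 2/11/2001 progress table.
finished_kb = 1_040_372
unfinished_kb = 1_951_344
total_kb = 2_991_716

# The finished and unfinished rows should add up to the total row.
assert finished_kb + unfinished_kb == total_kb

# Genome size implied by "93.50% of genome" (an inference, not a slide figure).
implied_genome_kb = total_kb / 0.935
print(f"Implied genome size: {implied_genome_kb / 1e6:.2f} Gb")
```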
Microbial Genomes
Published complete microbial genomes: http://www.tigr.org/tdb/mdb/mdbcomplete.html
Microbial genomes and chromosomes in progress: http://www.tigr.org/tdb/mdb/mdbinprogress.html
Genome Informatics
Annotation and Analysis
Data Handling
Metabolic Reconstruction
Comparative Genomics
Functional Genomics
Genome Project Organization
Cloning
Mapping
Sequencing
Annotation
Analysis
Cloning and Mapping
Cloning
Large
  YACs: 1 Mb
  BACs: 100-200 kb
Intermediate
  Cosmids
  Lambda clones
Small
  Plasmids; M13
Mapping
Establishment of Guideposts
Aids in Assembly
Error Checking
Useful in mapping of genetic disorders
Genetic Maps
Cytogenetic markers
Linkage maps
  Polymorphic loci screened by PCR to determine inheritance patterns
  Produce linkage map with nearby loci
Physical Maps
Radiation Hybrid/YACs/Cosmids
Restriction Sites
Sequence Tagged Sites (STSs)
  100 kb resolution needed 30,000 STSs
Expressed Sequence Tags
Detection
  PCR
  Hybridization
  FISH (fluorescent in situ hybridization)
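The figure of 30,000 STSs for 100 kb resolution follows from dividing the genome length by the marker spacing. A minimal sketch, assuming a human genome of about 3 Gb (a value not stated on this slide) and evenly spaced markers; `markers_needed` is a hypothetical helper name.

```python
def markers_needed(genome_size_bp: int, resolution_bp: int) -> int:
    """Number of evenly spaced markers for one marker per `resolution_bp`."""
    # Ceiling division without floats.
    return -(-genome_size_bp // resolution_bp)

# Assumed human genome size of about 3 Gb (not stated on the slide).
HUMAN_GENOME_BP = 3_000_000_000
print(markers_needed(HUMAN_GENOME_BP, 100_000))  # 30000
```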
Human Genome STS Mapping Strategy
STS Content Mapping
  Screen YACs by PCR
Radiation Hybrid Mapping
  Screen RH cell lines by PCR
Genetic Mapping
  PCR screening of polymorphic loci
Combine the above to produce an integrated map
Mapping Resolution
YAC mapping 1 Mb
Radiation hybrid mapping 10 Mb
Genetic map 30 Mb
GeneMap98
Integrated Human Genetic Map
Over 30,000 unique gene-based markers
100 kb resolution
http://www.ncbi.nlm.nih.gov/genemap98/
Map Integration
Human Chromosome 1 Genetic Map
Human Chromosome 1 Combination Map
Sequencing
Sequencing Methods
Random Shotgun
Ordered Shotgun
Directed Primer Walking
Direct genomic sequencing
Random Shotgun Sequencing
Randomly shear or cut DNA into small pieces (2-4 kb)
Clone into M13, pUC, or some other sequencing vector
Sequence the clones from both ends
Rely on the computer to assemble the sequences into one (or as few as possible) contigs
Shotgun Sequencing Statistics
Lander and Waterman equation
Poisson distribution
$P_0 = e^{-m}$: probability that a base is not sequenced
where $m$ = sequence coverage
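The Poisson estimate above is easy to tabulate. The following is a small sketch of that formula only; `fraction_unsequenced` is an illustrative name, and the coverage values in the loop are arbitrary examples.

```python
import math

def fraction_unsequenced(coverage: float) -> float:
    """Poisson probability that a given base is covered by zero reads."""
    return math.exp(-coverage)

for m in (1, 2, 5, 8, 10):
    p0 = fraction_unsequenced(m)
    print(f"{m:>2}X coverage: P0 = {p0:.5f}, sequenced = {1 - p0:.2%}")
```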
H. influenzae Sequencing
For 1X random sequence coverage = 1.8 Mb
  $P_0 = e^{-1} = 0.37$ (63% of the bases are sequenced)
To get > 99% of the bases sequenced
  5X coverage = 8.74 Mb of sequence
  $P_0 = e^{-5} = 0.0067$
This coverage would leave approx. 128 gaps of about 100 bp in size
From Science 269:496-512. 1995
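Inverting the same Poisson relation gives the coverage needed for a target fraction of sequenced bases, which shows why 5X is enough for more than 99%. This is a sketch under the idealized Poisson assumption only; `coverage_for_fraction` is a hypothetical helper name.

```python
import math

def coverage_for_fraction(target_fraction: float) -> float:
    """Coverage m such that 1 - e^(-m) equals target_fraction."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    return -math.log(1.0 - target_fraction)

m99 = coverage_for_fraction(0.99)
print(f"Coverage needed for 99%: {m99:.2f}X")
print(f"Fraction sequenced at 5X: {1 - math.exp(-5):.4f}")
```

Under this model roughly 4.6X already reaches 99%, so 5X leaves a small margin.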
Ordered Sequencing
Generate a set of large sequence clones in lambda phage
  May be subcloned from YACs or BACs as necessary
End sequence the lambda clones and order the clones to produce a map of the genome
Choose a minimal tiling path of the genome from the ordered lambda clones
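Choosing a minimal tiling path from ordered clones can be treated as an interval-cover problem. The sketch below uses the standard greedy approach; it is an illustration, not the procedure used by any particular genome project, and the clone names and coordinates are made up.

```python
from typing import List, Tuple

Clone = Tuple[str, int, int]  # (name, start, end), half-open [start, end)

def minimal_tiling_path(clones: List[Clone], region_start: int,
                        region_end: int) -> List[Clone]:
    """Greedy interval cover: repeatedly pick the clone that starts at or
    before the current position and reaches furthest to the right."""
    ordered = sorted(clones, key=lambda c: c[1])
    path: List[Clone] = []
    position = region_start
    i = 0
    while position < region_end:
        best = None
        # Consider every clone that begins at or before the covered position.
        while i < len(ordered) and ordered[i][1] <= position:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= position:
            raise ValueError(f"no clone covers position {position}")
        path.append(best)
        position = best[2]
    return path

# Hypothetical lambda clones on a 120-unit region.
example_clones = [("L1", 0, 40), ("L2", 10, 45), ("L3", 35, 80),
                  ("L4", 42, 90), ("L5", 85, 120), ("L6", 60, 100)]
tiling = minimal_tiling_path(example_clones, 0, 120)
print([name for name, _, _ in tiling])
```

Only the clones on the returned path would then be subcloned and shotgun sequenced.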
Ordered Sequencing...
Shear and subclone the lambda inserts that comprise the minimal tiling set into sequencing vectors
Shotgun sequence and assemble each of these lambda inserts individually
Assemble all sequences into one contiguous genome
Directed Sequencing
Process used for finishing following the shotgun sequencing phase
Gap closure
  Use specific sequencing primers to extend appropriate clones into gap regions
  Use specific sequencing primers to sequence directly from genomic DNA
Sequence Assembly
Assembly of Shotgun Fragments
For H. influenzae (TIGR), 1.8 Mb
24,304 sequence fragments were generated for the random assembly phase
  11,631,485 bases
Generated 140 contigs
Assembled using the TIGR Assembler
  30 hours of CPU time
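Two derived quantities follow directly from the numbers on this slide: the mean fragment length and the approximate coverage. A short sketch, taking the rounded 1.8 Mb genome size from the slide as given; the variable names are illustrative.

```python
genome_bp = 1_800_000        # "1.8 Mb" from the slide (rounded)
fragments = 24_304
total_bases = 11_631_485

mean_read_length = total_bases / fragments
coverage = total_bases / genome_bp

print(f"Mean fragment length: {mean_read_length:.0f} bp")
print(f"Approximate coverage: {coverage:.1f}X")
```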
phred/phrap/consed
Widely used programs for sequence:
  base calling (phred)
  assembly (phrap)
  editing (consed)
Developed at the University of Washington
  Phil Green (phrap)
  Brent Ewing (phred)
  David Gordon (consed)
Genome Annotation and Analysis
Pattern Matching
Sequence Annotation
ORF identification
Frameshift resolution
Genome map construction
Functional assignments
Metabolic pathway assignment
Metabolic pathway Reconstruction
Comparative analysis
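ORF identification, the first step in the list above, can be illustrated with a naive scan for ATG-to-stop stretches. This is a minimal sketch that covers the forward strand only and is not the method of any annotation tool named in these slides; `find_orfs` and its parameters are hypothetical. Mycoplasmas and ureaplasmas read TGA as tryptophan, so for those genomes the stop set would be passed as `{"TAA", "TAG"}`.

```python
from typing import Iterator, Tuple

def find_orfs(seq: str, min_codons: int = 30,
              stops: frozenset = frozenset({"TAA", "TAG", "TGA"})
              ) -> Iterator[Tuple[int, int, int]]:
    """Yield (frame, start, end) for ATG-to-stop ORFs on the forward strand.
    Coordinates are 0-based, end-exclusive, and include the stop codon."""
    seq = seq.upper()
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos              # first start codon opens the ORF
            elif start is not None and codon in stops:
                if (pos - start) // 3 >= min_codons:
                    yield frame, start, pos + 3
                start = None             # look for the next ORF in this frame

# Toy sequence: one short ORF in frame 2.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
print(list(find_orfs(toy, min_codons=3)))
```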
Annotation Tools
Semi-automated
Manual
MAGPIE
Multipurpose Automated Genome Project Investigation Environment
Terry Gaasterland et al.
http://genomes.rockefeller.edu/magpie/magpie.html
Automated
Semi-automated analysis tool for microbial genome projects
MAGPIE Example
Non-Automated Analysis and Prediction
The Ureaplasma urealyticum genome database
Run analysis tool
Parse results
Dump results into the database
View results
Manually annotate
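The run, parse, store, view, and annotate cycle listed above can be sketched with Python's built-in sqlite3 module. Everything specific here is an assumption for illustration: the tab-delimited result format, the gene identifiers, the hit descriptions, and the table layout are invented and do not describe the actual Ureaplasma database.

```python
import sqlite3

# Hypothetical tab-delimited tool output: gene id, best hit description, score.
raw_results = """\
UU001\tDNA polymerase III beta subunit\t412.0
UU002\thypothetical protein\t55.5
"""

def parse_results(text):
    """Turn each non-empty line into a (gene_id, description, score) tuple."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        gene_id, description, score = line.split("\t")
        rows.append((gene_id, description, float(score)))
    return rows

# Dump parsed results into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits (gene_id TEXT, description TEXT, score REAL, curator_note TEXT)"
)
conn.executemany(
    "INSERT INTO hits (gene_id, description, score) VALUES (?, ?, ?)",
    parse_results(raw_results),
)

# Manual annotation step: a curator records a note for one gene.
conn.execute(
    "UPDATE hits SET curator_note = ? WHERE gene_id = ?",
    ("low score, needs review", "UU002"),
)

# View results.
for row in conn.execute(
    "SELECT gene_id, description, score, curator_note FROM hits ORDER BY gene_id"
):
    print(row)
```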
Genomic Sequence Database
Data Storage
  Sequence
  Gene Map
  Annotation
User Interface
  Web browser
  Customizable
The Ureaplasma urealyticum Genome Project
Uu - 751,719 bp
http://genome.microbio.uab.edu/uu/uugen.htm
Web-based genome analysis tool
Annotation Problems
Problems with existing sequence databases
  Incomplete datasets
  Skewed datasets
  Incorrectly annotated records
Annotations based on experimental vs. predicted data
Nomenclature differences
Transitive errors in gene function predictions
Functional predictions for hypothetical genes
Metabolic Pathway Reconstruction
Metabolic Pathway Reconstruction
Role assignment
Extract metabolic pathways from genomes
Navigation and analysis
Pathway editing
Metabolic Assignments
Amino acid biosynthesis
Biosynthesis of cofactors, prosthetic groups, and carriers
Cell envelope
Cellular processes
Central intermediary metabolism
Energy metabolism
Fatty acid and phospholipid metabolism
Purines, pyrimidines, nucleosides, and nucleotides
Regulatory functions
Replication
Transcription
Translation
Transport and binding proteins
Other categories
Unassigned
Hypothetical
Ureaplasma urealyticum Gene Map
[Figure: linear gene map of the genome drawn in rows of 50,000 bp, from position 1 to 751,719. Genes are colored by role: Cofactor Biosynthesis, Cell envelope, Cellular processes, Central Intermediary Metabolism, Energy Metabolism, Fatty Acid Metabolism, Hypothetical, Nucleotide Metabolism, Replication, Transcription, Translation, Transport, RNA, tRNA, Other.]
Role                               Uu genes (#)   Uu (% of total)   Mg genes (#)   Mg (% of total)
Amino acid biosynthesis            1              0.2%              0              0.0%
Biosynthesis of cofactors          10             1.7%              7              1.5%
Cell envelope                      19             3.1%              26             5.4%
Cellular processes                 13             2.1%              15             3.1%
Central intermediary metabolism    15             2.5%              7              1.5%
Energy metabolism                  23             3.8%              30             6.3%
Fatty acid - phospholipids         6              1.0%              7              1.5%
Hypothetical                       293            48.3%             169            35.3%
Other categories                   1              0.2%              3              0.6%
Purines, pyrimidines               18             3.0%              20             4.2%
Regulatory functions               4              0.7%              4              0.8%
Replication                        45             7.4%              31             6.5%
Transcription                      17             2.8%              19             4.0%
Translation                        100            16.5%             99             20.7%
Transport and binding proteins     37             6.1%              35             7.3%
Unassigned                         4              0.7%              7              1.5%
Total                              606            100.0%            479            100.0%
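The percentage columns are each role's gene count divided by the genome's total. The sketch below recomputes a few of them from the counts in the role table; it reads the fused values in the extracted slide (for example "1019" and "6293") as two adjacent cells (10 and 19, 6 and 293), which is an interpretation supported by the column totals of 606 and 479.

```python
# Gene counts per role, in the row order of the table (Total row excluded).
uu = [1, 10, 19, 13, 15, 23, 6, 293, 1, 18, 4, 45, 17, 100, 37, 4]
mg = [0, 7, 26, 15, 7, 30, 7, 169, 3, 20, 4, 31, 19, 99, 35, 7]

def percent_of_total(counts):
    """Each count as a percentage of the column total, to one decimal place."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]

uu_pct = percent_of_total(uu)
mg_pct = percent_of_total(mg)

# Index 7 is the "Hypothetical" row.
print(sum(uu), uu_pct[7])
print(sum(mg), mg_pct[7])
```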
EcoCyc
Peter D. Karp, PhD
SRI International Menlo Park, CA
http://ecocyc.pangeasystems.com/ecocyc/ecocyc.html
Pathway Reconstruction
Genomic Maps
Genes
Gene Products
Reactions (Compounds)
Pathways
Metabolic Network
Annotated Genome
List of Genes/ORFs
List of Gene Products
DNA Sequence
Cell
Adapted from P. Karp, Pangea Systems
Glycolysis in Uu?
[Figure: glycolysis pathway diagram with one step marked "?". Compounds shown: glucose-1-phosphate, glucose-6-phosphate, fructose-6-phosphate, fructose-1,6-bisphosphate, glyceraldehyde-3-phosphate, 3-phospho-D-glyceroyl-phosphate, 3-phosphoglycerate, pyruvate. Enzymes shown: phosphoglucomutase, phosphoglucose isomerase, 6-phosphofructokinase, fructose bisphosphate aldolase, glyceraldehyde-3-phosphate dehydrogenase (EC 1.2.1.12 and EC 1.2.1.9), phosphoglycerate kinase.]
Uu Energy Metabolism
Glycolysis
  Missing several components
Pentose-phosphate pathway
  Only 2/8 enzyme complexes present
Proton motive force - ATP synthase complex
Urease Gene Complex
  Biologically relevant
Comparative Genomics
What makes one organism different from all other organisms?
Molecular Biology
Physiology
Pathogenesis
Epidemiology
Genetics
Ortholog Comparisons
Uu to Mg genes: 324
  53% of Uu; 67% of Mg
  71 hypothetical
Mh to Mg genes: 314
  41% of Mh; 57% of Mg
  55 hypothetical (2 unique hypothetical)
Mh to Uu genes: 330
  47% of Uu; 43% of Mh
  82 hypothetical (19 unique hypothetical)
M. genitalium - M. pneumoniae Gene Order
[Figure: plot of M. genitalium gene position (y-axis, 0 to 500,000) against M. pneumoniae gene position (x-axis, 0 to 800,000).]
M. genitalium - U. urealyticum Gene Order
[Figure: gene order dot plot. x-axis: U. urealyticum Gene Position (0 to 700,000); y-axis: M. genitalium Gene Position (0 to 500,000)]
Paralog Analysis
Identification of conserved, paralogous groups
All against All comparison
  Genes within one organism
Identifies groups of related genes
  Primary sequence
  Structure
  Function
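The all-against-all comparison within one organism can be sketched as follows. This is a toy illustration only: the k-mer Jaccard score is a crude stand-in for a real alignment score, and the threshold, function names, and sequences are all assumptions rather than anything taken from the slides.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Set of all overlapping k-letter substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of k-mer sets (assumed stand-in for an alignment score)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def paralog_groups(proteins, threshold=0.5):
    """Compare every pair of sequences from one genome and link similar ones."""
    parent = {name: name for name in proteins}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(proteins, 2):
        if similarity(proteins[a], proteins[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in proteins:
        groups.setdefault(find(name), []).append(name)
    # Keep only groups with more than one member.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical sequences, invented for illustration.
proteins = {
    "p1": "MKTLLVAGGSGK",
    "p2": "MKTLLVAGGSGR",
    "p3": "MSDNQWERTYHH",
}
print(paralog_groups(proteins))
```

Single-linkage grouping is used here because it is the simplest way to turn pairwise hits into clusters; it can chain distantly related sequences together, so a real analysis would likely need a stricter criterion.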
Uu Paralogous Clusters >3
4 tRNA synthetase
4 Translation factors
4 Hypothetical membrane lipoprotein
5 ATP synthase alpha, beta chains
6 MBA
7 Hypothetical membrane lipoprotein
8 Hypothetical
10 Iron transporters
13 Transporters
Functional Genomics
Gene Expression
Gene Regulation
Genome-wide Mutagenesis
Expression Arrays
Cell growth in different environments
Isolate cDNAs
Measure expression using array technology
Create database of expression information
Display information in an easy-to-use format
Show ratio of expression under different conditions
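The last step, showing the ratio of expression under different conditions, could look something like the sketch below. It assumes one intensity value per gene per condition and reports a log2 ratio; the pseudocount, function name, gene names, and intensities are illustrative assumptions, not part of the original workflow.

```python
import math


def log2_ratios(condition_a, condition_b, pseudocount=1.0):
    """log2(a / b) for each gene measured in both conditions.

    The pseudocount (an assumption) avoids division by zero for unexpressed genes.
    """
    ratios = {}
    for gene in condition_a.keys() & condition_b.keys():
        a = condition_a[gene] + pseudocount
        b = condition_b[gene] + pseudocount
        ratios[gene] = math.log2(a / b)
    return ratios


# Hypothetical array intensities for two growth conditions.
rich_medium = {"geneX": 799.0, "geneY": 99.0, "geneZ": 49.0}
minimal_medium = {"geneX": 199.0, "geneY": 99.0, "geneZ": 199.0}

ratios = log2_ratios(rich_medium, minimal_medium)
for gene in sorted(ratios):
    print(f"{gene}\t{ratios[gene]:+.2f}")
```

A log scale is a common choice because a fourfold increase and a fourfold decrease then appear as symmetric values (+2 and -2) around zero.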
Putting it all together
From F. Blattner, U. Wisc.
Chromosome Views
Ensembl view
UC Santa Cruz view
NCBI View
A Final Caveat
The difficulty of identifying genes in
anonymous vertebrate sequences
Claverie JM, Poirot O, Lopez F
Comput Chem 1997;21(4):203-14
The identification of genes in newly determined vertebrate genomic
sequences can range from a trivial to an impossible task. In a
statistical preamble, we show how "insignificant" are the individual
features on which gene identification can be rigorously based:
promoter signals, splice sites, open reading frames, etc. The practical
identification of genes is thus ultimately a tributary of their
resemblance to those already present in sequence databases, or
incorporated into training sets. The inherent conservatism of the
currently popular methods (database similarity search, GRAIL) will
greatly limit our capacity for making unexpected biological
discoveries from increasingly abundant genomic data. Beyond a very
limited subset of trivial cases, the automated interpretation (i.e.
without experimental validation) of genomic data, is still a myth. On
the other hand, characterizing the 60,000 to 100,000 genes thought to
be hidden in the human genome by the mean of individual
experiments is not feasible. Thus, it appears that our only hope of
turning genome data into genome information must rely on drastic
progresses in the way we identify and analyze genes in silico.
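One way to see why a single feature such as an open reading frame carries little weight on its own is a back-of-the-envelope calculation. The sketch below assumes uniform, independent base composition, so that 3 of the 64 codons are stops; this is a simplification for illustration and not the statistical model used in the cited paper. The function names are hypothetical.

```python
def prob_open_frame(n_codons, stop_codons=3, total_codons=64):
    """Chance that n_codons random codons contain no stop codon."""
    return (1 - stop_codons / total_codons) ** n_codons


def expected_open_windows(seq_length, n_codons):
    """Expected count of stop-free windows of n_codons over all start positions on one strand."""
    starts = max(seq_length - 3 * n_codons + 1, 0)
    return starts * prob_open_frame(n_codons)


for n in (20, 50, 100):
    print(n, round(prob_open_frame(n), 4))
```

Under these assumptions short stop-free stretches are common by chance, which is consistent with the abstract's point that short vertebrate exons cannot be recognized from open reading frames alone.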
Only One Final Word of Wisdom...
...although the computer is a wonderful
helpmate for the sequence searcher and
comparer, biochemists and molecular biologists must guard against the blind
acceptance of any algorithmic output;
given the choice, think like a biologist and
not a statistician. - Russell F. Doolittle, 1990