45
Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing Eric D. Green, M.D., Ph.D. National Human Genome Research Institute Phone: 301-402-2023 FAX: 301-402-2040 E-Mail: [email protected] Mendel 1865 Miescher 1871 Avery 1944 Watson & Crick 1953 Foundational Milestones in Genetics & Genomics

Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

  • Upload
    doque

  • View
    223

  • Download
    2

Embed Size (px)

Citation preview

Page 1: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 1

Techniques for Genome Mapping & Sequencing

Eric D. Green, M.D., Ph.D.National Human Genome Research Institute

Phone: 301-402-2023FAX: 301-402-2040E-Mail: [email protected]

Mendel

1865

Miescher

1871

Avery

1944

Watson & Crick

1953

Foundational Milestones in Genetics & Genomics

Page 2: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 2

Collins et al. (2003)

Page 3: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 3

Collins et al. (2003)

Outline

I. Fundamentals of Genome Mapping

II. Fundamentals of Genome Sequencing

III. Mapping & Sequencing in the Human Genome Project

IV. Comparative Sequencing

V. New DNA Sequencing Technologies

Page 4: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 4

Genome Sizes

~3,000,000,000 bp

~160,000,000 bp

~100,000,000 bp

~15,000,000 bp

~5,000,000 bp

Human GenomeMouse Genome

Fruit Fly Genome

Nematode Genome

Yeast Genome

E. coli Genome

The Human Cytogenetic Map

Page 5: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 5

Genetic Map

Physical Map

Cytogenetic Map

25 50 75 100 125 150 Mb

cM2520303020

…GATCTGCTATACTACCGCATTATTCCG…

Sequence MapClone-Based MapRH Map

Clone-Based Physical MappingChromosome

Contigs

Clones

Page 6: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 6

Construction of YACs and BACs

High-Molecular Weight DNAPartial Restriction Digestion

Size SelectionSize Selection

YAC Vector Arms BAC Vector

Ligate & Transform into Yeast

Ligate & Transform into Bacteria

YAC Insert: ~100-1000 kb BAC Insert: ~100-200 kb

TelomeresYeast DNAPlasmid DNAInsert DNA

BAC Vector

Insert DNAGreen et al. (1998)Birren et al. (1998)

Cloning Capacity

YAC

BAC

Cosmid

Bacteriophage

~1,000,000 bp

~100,000 bp

~45,000 bp

~25,000 bp

Genome Sizes

~3,000,000,000 bp

~160,000,000 bp

~100,000,000 bp

~15,000,000 bp

~5,000,000 bp

HumanMouse

Fruit Fly

Nematode

Yeast

E. coli

Page 7: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 7

Bacterial Artificial Chromosomes (BACs)

• Bacterial-Based Cloning System

• Based on the E. coli F Factor (Fertility Plasmid): Replication Control

• Cloned Inserts: 100-200 kb, Circular DNA

• Low Copy Number

Low Yields of DNA by Standard MethodsReasonably Stable

• BAC Libraries from Many Different Species Available(e.g., www.chori.org/bacpac)

• See Birren et al. (1998)

Chromosome(~130 Mb)

BAC(~0.1-0.2 Mb)

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTAGGTATTGGGGCATTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCACCGGTTAGACATACATGAGGCCCACCGCCGCTGTGCACGTCCACCACC

Genome(~3000 Mb)

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTGTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCACCGAGGCCCACCGCCGCTGTGCACGTCCACCACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTGTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCACCGAGGCCCACCGCCGCTGTGCACGTCCACCACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTGTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCACCGAGGCCCACCGCCGCTGTGCACGTCCACCACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTGTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCACCGAGGCCCACCGCCGCTGTGCACGTCCACCACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTGTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCACCGAGGCCCACCGCCGCTGTGCACGTCCACCACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTGTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCACCGAGGCCCACCGCCGCTGTGCACGTCCACCACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCACCGGTTAGACATACATGAGGCCCACCGCCGCTGTGCACGTCCACCACC

YAC(~0.5-1.0 Mb)

Page 8: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 8

Sequence-Ready Contig MapMarra et al. (1997) and Gregory et al. (1997)

Physical Mapping: Future Prospects

• Availability of Many BAC Libraries is Allowing Physical Mapping of More Species’ Genomes

• Strategies for Physical Mapping have AdvancedGreatly in the Sequence-Based Era

• Close Interplay of Mapping and Sequencing in the Exploration of Genomes

Page 9: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 9

DNA Sequencing

History of DNA Sequencing

Avery: Proposes DNA as ‘Genetic Material’

Watson & Crick: Double Helix Structure of DNA

Holley: Sequences Yeast tRNAAla

1870

1953

1940

1965

1970

1977

1980

1990

2006

Miescher: Discovers DNA

Wu: Sequences λ Cohesive End DNA

Sanger: Dideoxy Chain TerminationGilbert: Chemical Degradation

Messing: M13 Cloning

Hood et al.: Partial Automation

• Cycle Sequencing • Improved Sequencing Enzymes• Improved Fluorescent Detection Schemes

1986

Adapted from Messing & Llaca, PNAS (1998)

1

15

150

50,000

25,000

1,500

200,000

>100,000,000

Efficiency(bp/person/year)

15,000

Page 10: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 10

DNA Tagged with Radioactivity

G A T C

G: G ReactionA: A ReactionT: T ReactionC: C Reaction

Radioactive Sequencing

Page 11: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 11

Fluorescent DNA Sequencing

A G T A C T G G G A T C

GelElectrophoresis

Wilson & Mardis (1997)

Detection of Fluorescently Tagged DNA

DNA FragmentsSeparated by

Electrophoresis

Output to Computer

Laser Excites Fluorescent Dyes

Optical Detection System

Page 12: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 12

Analyzing Fluorescent DNA Sequencing Data

ComputerAnalysis

A G T A C T G G G A T C

Fluorescent DNA Sequencing Results

Page 13: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 13

Slab Gel-Based DNA Sequencing Instruments

Capillary-Based DNA Sequencing Instruments

Page 14: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 14

Large-Scale cDNA Sequencing

• Full-Insert (Full-Length) cDNA Sequencing

• ESTs: Expressed-Sequence Tags

• SAGE: Serial Analysis of Gene Expression

mgc.nci.nih.govGerhard et al. (2004)

Shotgun SequencingWilson & Mardis (1997)

Green (2001)

Large-Scale Genome Sequencing

Page 15: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 15

Subclone Construction

Subclone FragmentsSubclone Fragments

Randomly FragmentRandomly Fragment

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACCGATTATTACCATTTTAATCCTTAGGATTGACA

BAC DNA

Prepare Multiple CopiesPrepare Multiple CopiesGATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACCGATTATTACCATTTTAATCCTTAGGATTGACA

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACCGATTATTACCATTTTAATCCTTAGGATTGACA

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACCGATTATTACCATTTTAATCCTTAGGATTGACA

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACCGATTATTACCATTTTAATCCTTAGGATTGACA

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACCGATTATTACCATTTTAATCCTTAGGATTGACA

BAC

Shotgun Sequencing Strategy

Page 16: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 16

Poisson Calculations The sequencing strategy for the shotgun approach follows the

Lander and Waterman application of the Poisson distribution

The probability a base is not sequenced is given by:

P0=e-c

Where:

• c = fold sequence coverage (c=LN/G),

• LN = # bases sequenced, i.e. L = average sequencing

read length and N = # reads

• G = target sequence length

• e = 2.718 (e=2.718281828459)

Fold Coverage P0=e-c % not sequenced % sequenced 1 0.37 37% 63% 2 0.135 13.5% 87.5% 3 0.05 5% 95% 4 0.018 1.8% 98.2% 5 0.0067 0.6% 99.4% 6 0.0025 0.25% 99.75% 7 0.0009 0.09% 99.91% 8 0.0003 0.03% 99.97 9 0.0001 0.01% 99.99% 10 0.000045 0.005% 99.995%

Shotgun Sequence Assembly

“Consed” (Gordon et al., 1998)

Page 17: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 17

BAC

FinishedSequence

“Finishing”“Working Draft”Sequence

Shotgun Sequencing Strategy

Page 18: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 18

Sequence Finishing: Resolving Ambiguities

*** Sequence Finishing: Remains Relatively Expensive ***

Historically SignificantGenome Sequencing Projects

Page 19: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 19

www.tigr.org

Microbial Genome Sequences

Goffeau et al. (1997)

First Eukaryotic Genome Sequence

Page 20: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 20

C. elegans Sequencing Consortium (1998)

First Animal Genome Sequence

Adams et al. (2000)

Second Animal Genome Sequence

Page 21: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 21

Human Genome Sequencing Centers

Clone-Based Shotgun Sequencing

Green (2001)

Page 22: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 22

Whole-Genome Shotgun Sequencing

Green (2001)

February, 2001 Draft Sequence

Venter et al. (2001)International Human Genome Sequencing Consortium (2001)

Page 23: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 23

April, 2003 Completion

October, 2004 Publication

International Human Genome Sequencing Consortium (2004)

Page 24: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 24

Tuesday, March 29, 2005 Posted: 4:24 PM EST (2124 GMT)

CNN’s #1 Medical Story of Past 25 Years

April, 2003April, 1953

Page 25: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 25

All of the original goals of the Human Genome Project have

been accomplished!

What’s Next?

Page 26: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 26

Collins et al. (2003)

Collins et al. (2003)

Page 27: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 27

Mappingthe Human Genome

Sequencingthe Human Genome

~1990 to ~2000

~1998 to ~2003

Interpretingthe Human Genome

Sequence~2003 to ???

The Human Genome Project

BeyondThe Human

Genome Project

Page 28: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 28

TGCCGCGGAACTTTTCGGCTCTCTAAGGCTGTATTTTGATATACGAAAGGCACATTTTCCTTCCCTTTTCAAAATGCACCTTGCAAACGTAACAG GAACCCGACTAGGATCATCGGGAAAAGGAGGAGGAGGAGGAAGGCAGGCTCCGGGGAAGCTGGTGGCAGCGGGTCCTGGGTCTGGCGGACCCTGA CGCGAAGGAGGGTCTAGGAAGCTCTCCGGGGAGCCGGTTCTCCCGCCGGTGGCTTCTTCTGTCCTCCAGCGTTGCCAACTGGACCTAAAGAGAGG CCGCGACTGTCGCCCACCTGCGGGATGGGCCTGGTGCTGGGCGGTAAGGACACGGACCTGGAAGGAGCGCGCGCGAGGGAGGGAGGCTGGGAGTC AGAATCGGGAAAGGGAGGTGCGGGGCGGCGAGGGAGCGAAGGAGGAGAGGAGGAAGGAGCGGGAGGGGTGCTGGCGGGGGTGCGTAGTGGGTGGA GAAAGCCGCTAGAGCAAATTTGGGGCCGGACCAGGCAGCACTCGGCTTTTAACCTGGGCAGTGAAGGCGGGGGAAAGAGCAAAAGGAAGGGGTGG TGTGCGGAGTAGGGGTGGGTGGGGGGAATTGGAAGCAAATGACATCACAGCAGGTCAGAGAAAAAGGGTTGAGCGGCAGGCACCCAGAGTAGTAG GTCTTTGGCATTAGGAGCTTGAGCCCAGACGGCCCTAGCAGGGACCCCAGCGCCCGAGAGACCATGCAGAGGTCGCCTCTGGAAAAGGCCAGCGT TGTCTCCAAACTTTTTTTCAGGTGAGAAGGTGGCCAACCGAGCTTCGGAAAGACACGTGCCCACGAAAGAGGAGGGCGTGTGTATGGGTTGGGTT TGGGGTAAAGGAATAAGCAGTTTTTAAAAAGATGCGCTATCATTCATTGTTTTGAAAGAAAATGTGGGTATTGTAGAATAAAACAGAAAGCATTA AGAAGAGATGGAAGAATGAACTGAAGCTGATTGAATAGAGAGCCACATCTACTTGCAACTGAAAAGTTAGAATCTCAAGACTCAAGTACGCTACT ATGCACTTGTTTTATTTCATTTTTCTAAGAAACTAAAAATACTTGTTAATAAGTACCTAAGTATGGTTTATTGGTTTTCCCCCTTCATGCCTTGG ACACTTGATTGTCTTCTTGGCACATACAGGTGCCATGCCTGCATATAGTAAGTGCTCAGAAAACATTTCTTGACTGAATTCAGCCAACAAAAATT TTGGGGTAGGTAGAAAATATATGCTTAAAGTATTTATTGTTATGAGACTGGATATATCTAGTATTTGTCACAGGTAAATGATTCTTCAAAAATTG AAAGCAAATTTGTTGAAATATTTATTTTGAAAAAAGTTACTTCACAAGCTATAAATTTTAAAAGCCATAGGAATAGATACCGAAGTTATATCCAA CTGACATTTAATAAATTGTATTCATAGCCTAATGTGATGAGCCACAGAAGCTTGCAAACTTTAATGAGATTTTTTAAAATAGCATCTAAGTTCGG AATCTTAGGCAAAGTGTTGTTAGATGTAGCACTTCATATTTGAAGTGTTCTTTGGATATTGCATCTACTTTGTTCCTGTTATTATACTGGTGTGA ATGAATGAATAGGTACTGCTCTCTCTTGGGACATTACTTGACACATAATTACCCAATGAATAAGCATACTGAGGTATCAAAAAAGTCAAATATGT TATAAATAGCTCATATATGTGTGTAGGGGGGAAGGAATTTAGCTTTCACATCTCTCTTATGTTTAGTTCTCTGCATGTGCAGTTAATCCTGGAAC TCCGGTGCTAAGGAGAGACTGTTGGCCCTTGAAGGAGAGCTCCTCCCTGTGGATGAGAGAGAAGGACTTTACTCTTTGGAATTATCTTTTTGTGT TGATGTTATCCACCTTTTGTTACTCCACCTATAAAATCGGCTTATCTATTGATCTGTTTTCCTAGTCCTTATAAAGTCAAAATGTTAATTGGCAT AAATTATAGACTTTTTTTAGCAGAGAACTTTGAGGAACCTAAATGCCAACCAGTCTAAAAATGCAGTTTTCAGAAGAATGAATATTTCATGGATA GTTCTAAATACTAATGAACTTTAAAATAGCTTACTATTGATCTGTCAAAGTGGGTTTTTATATAATTTTCTTTTTACAAATCACCTGACACATTT AATATAGGTTAAAAAATGCTATCAGGCTGGTTTGCAAAGAAAATGTATTACAAAGGCTGCTAAGTGTGTTAAGAGCATACTCATTTCTGTTCTCC AAAATATTTCATAAGGTGCTTTAAGAATAGGTATGTTTTTAAAAGTTAAGTTCCTACTATTTATAGGAACTGACAATCACCTAAAATACCAATGA TTACAAACTTCCTTCTGGCCTTCTGGACTGCAATTCTAAAAGTGTAAAAAACATATTTTCTGCATTAAGTTAGGCAGTATTGCTTAGTTTTCAAA GTGGTAGGCTTTGGAGTCAGATTATTTTGATTCAGATCCTACATCTACTGTTTAGTAGCTCTGTTGCCTGAGGCAGGTCCCTTAACATCTCTGTG TGTGACTTGACCTTTAAAATTTGGAGACTGTCATAGGGGTTAATCCCTTGAGAAAATGAATGTGAAAAGTTAGCCTAATGTTAACTGCTATTATT ATGGATTACCATATTTTCACATTCATCACAGTACATGCACCTTGTTAATATAAGATGCTCAATTCATCTTTGAGTATAATTTTGTGACTCTCAAT CTGGATATGCAATGAGTGGGCCTGTATGAGAATTTAATTTATGAAAAATTGTGTTTCACATGGCCTTACCAGATATACAGGAAACACGTCACATG TTTCTATTGTATGTTGTTAAATGCCTTAGAATTTAACTTTCTGAATAGGATCCCTTCAGTTTGAGAGTCATAAAAGAGTAAAATTATTATGGTAT

~3,000 bp (0.0001%) of Human Genome Sequence

The Human Genome… by the Numbers

~5% of Human Genome is Functionally Important5% of 3B Bases = ~150M BasesDo NOT Yet Know the Position of these ~150M Functional Bases

~1.5% Encodes for Protein (Genes)Corresponds to ~18-22K GenesMany More than ~22K Different ProteinsGood Inventory at Present

~3.5% Functional But Non-CodingGene Regulatory ElementsChromosomal Functional ElementsUndiscovered Functional Elements (NOT Yet in Textbooks!)Poor Inventory at Present

Page 29: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 29

Darwin

1859

Mendel

1865

Miescher

1871

Avery

1944

Watson & Crick

1953

Foundational Milestones in Genetics & Genomics

Comparing Genomes is Like Cryptography

CKQEBHEREYTWASUISCZMEISDFOGETHEBLPBGOODFQSTLKSTUFFRTAC

DLUCEHEREZBRTTOISAWNDCDARJJPTHERROFGOODERGHCLSTUFFBRHA

Page 30: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 30

Functional Elements: Coding vs. Non-Coding

Coding Sequences (i.e., Genes)Relatively EASY to IdentifyMostly Know What to Look ForComplementary Data Sets Available (ESTs, cDNAs)Ever-Improving Computational Gene Predictions

Non-Coding Functional SequencesHARD to IdentifyVery Little Known About What to Look ForVirtually No Complementary Data Sets AvailablePoor Computational Predictions

++ ++

+ --

The Language of the Genome

Page 31: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 31

Functional Elements: Coding vs. Non-Coding

Coding Sequences (i.e., Genes)Relatively EASY to IdentifyMostly Know What to Look ForComplementary Data Sets Available (ESTs, cDNAs)Ever-Improving Computational Gene Predictions

Non-Coding Functional SequencesHARD to IdentifyVery Little Known About What to Look ForVirtually No Complementary Data Sets AvailablePoor Computational Predictions

Major role for comparative sequence analysis will be the identification of functionally important, non-coding sequences

Species BTATCGGCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACCGAGGGAGAGGGGTGACCACACTGTGACA

Species AGATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACGATTTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACTACCGAGATACACGATACCTACACAGGTGTGACACACCCCTACCCGTCCACCACACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACTACCGAGATACACGATACCTACACAGGTGTGACACACGATCCTTACCACATTACACATTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACCGGGACCGAGG

Comparative Sequence Analysis

Species BTATCGGCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACCGAGGGAGAGGGGTGACCACACTGTGACA

Using the Experiments of Evolutionto Decode the Human Genome

Species AGATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACGATTTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACTACCGAGATACACGATACCTACACAGGTGTGACACACCCCTACCCGTCCACCACACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACTACCGAGATACACGATACCTACACAGGTGTGACACACGATCCTTACCACATTACACATTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACCGGGACCGAGG

Sequences in Common (i.e., ‘Conserved’ or ‘Constrained’)

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACCGAGGGAGAGGGGTGACCACACTGTGACA

Compare

Page 32: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 32

Zebrafish Pufferfish

Vertebrate Genome Sequences

Cow

Xenopus

MonodelphisMacaque

Chicken

Platypus

ChimpanzeeRatMouse

Orangutan Marmoset

Dog

Hybrid Shotgun Sequencing

Green (2001)

Page 33: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 33

Margulies EM et al. (2005)

Low-Redundancy, Whole-Genome Shotgun Sequencing

Mouse

Pufferfish

ChickenChimpanzee

Zebrafish

DogCow

Rat

XenopusMonodelphisMacaque

Human

PlatypusMarmosetOrangutan

Landscape of Vertebrate Genome Sequencing

RabbitCat

ElephantTenrec

Armadillo

ShrewGuinea PigHedgehog(and others…)

Page 34: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 34

Multi-Species Sequence Comparisons

Species C Species DSpecies B Species ESpecies A Species FGATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACC

GATCGTCTAGAATCTCGAGATCTCTGAGAGTCGTGGGAAACTGTGTGATGTGACTAGCCACAGTTACGTGTGAGAGATGTATGATGCACCTGACCCGGGTTTCACTCTCAACGACTCACTCCACCTCAGAGGCCCACCGCCGCTGTGCACGTCCACCACGATCCTTACCACACTTACACATTACCATATATCCACCTACCACACATACCTACCCCATTGCACACCTATTATTATTACCAGTAACTACTACCACCTAGAGGGAGCTA

HUMAN

Multi-Species Conserved Sequences (MCSs)

Margulies et al. (2003)

Future Genomes to Sequence???

Page 35: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 35

ENCODE Project

● ENCODE: ENCyclopedia Of DNA Elements

● Initial Pilot Project: 1% of Human Genome

● Apply Multiple, Diverse Approaches to Study and Analyze that 1% in a Consortium Fashion

● Goal: Compile a Comprehensive Encyclopedia ofAll Functional Elements in the Human Genome

ENCODE Project Consortium (2004)

Page 36: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 36

ENCODE Project: Comparative Sequencing

ENCODE Project: Web Sites

genome.gov/ENCODE genome.ucsc.edu/ENCODE

Page 37: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 37

Human Genome Sequence

~$100,000

~$1,000

>$1,000,000,000

Page 38: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 38

En Route to the $1000 Genome

Page 39: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 39

Margulies M et al. (2005)

Bennett (2004)

Shendure et al. (2005)

Bennett et al. (2005)

Metzker (2005) Church (2006)

Page 40: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 40

Method Read LengthFeasibilityRaw Data ProductionData Quality

SangerSequencing

Long(800-1200 bases)

Well Established ++++

Stepwise Synthesis

Short(25-100 bases)

BecomingEstablished ++++++

Single Molecule

Long(>1000 bases ?)

Far fromEstablished +++++ ????

DNA Sequencing Technologies

Realities of New DNA Sequencing Technologies…

Page 41: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 41

“Inter-Species” Comparisons

“Intra-Species” Comparisons

Page 42: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 42

? HGP

Realization ofGenomic Medicine

The Pathway to Genomic Medicine

? HGP

Platypus

Armadillo Marmoset

Elephant Chicken

Tenrec

Realization ofGenomic Medicine

The Pathway to Genomic Medicine

Page 43: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Eric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing

Page 43

The Current Big Challenges…

• Large-Scale Deployment of Medical Sequencing

• Defining “Saturation Points” in Terms of Information Gained by Comparative Sequence Analyses

• Achieving the “$1000 Genome”

…from base pairs to bedside.

The Human Genome Sequence to Genomic Medicine…

Page 44: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

Bibliography

Adams MD et al. (2000). The genome sequence of Drosophila melanogaster. Science 287:2185-2195.

Bennett S (2004). Solexa Ltd. Pharmacogenomics 5:433-438. Bennett ST (2005). Toward the $1000 human genome. Pharmacogenomics 6:373-382. Birren B et al. (1998). Bacterial artificial chromosomes. In Genome Analysis: A Laboratory

Manual, Vol. 3 Cloning systems (B Birren et al., eds.; Cold Spring Harbor Laboratory Press), pp. 241-295.

C. elegans Sequencing Consortium (1998). Genome sequence of the nematode C. elegans: a

platform for investigating biology. Science 282:2012-2018. Chimpanzee Sequencing and Analysis Consortium (2005). Initial sequence of the chimpanzee

genome and comparison with the human genome. Nature 437:69-87. Church GM (2006). Genomes for all. Sci Am 294:46-54. Collins FS et al. (2003). A vision for the future of genomics research: a blueprint for the genomic

era. Nature 422:835-847. ENCODE Project Consortium (2004). The ENCODE (ENCyclopedia Of DNA Elements) Project.

Science 306:636-640. Gerhard DS et al. (2004). The status, quality, and expansion of the NIH full-length cDNA project:

the Mammalian Gene Collection (MGC). Genome Res 14:2121-2127. Goffeau A et al. (1997). The Yeast Genome Directory. Nature 387S:1-105. Gordon D et al. (1998). Consed: a graphical tool for sequence finishing. Genome Res 8:195-202. Green ED (2001). Strategies for the systematic sequencing of complex genomes. Nature Rev

Genet 2:573-583. Green ED et al. (1998). Yeast artificial chromosomes. In Genome Analysis: A Laboratory Manual,

Vol. 3 Cloning systems (B Birren et al., eds.; Cold Spring Harbor Laboratory Press), pp. 297-565.

Gregory SG et al. (1997). Genome mapping by fluorescent fingerprinting. Genome Res 7:1162-

1168. Hillier LW et al. (2004). Sequence and comparative analysis of the chicken genome provide

unique perspectives on vertebrate evolution. Nature 432:695-716. International Human Genome Sequencing Consortium (2001). Initial sequencing and analysis of

the human genome. Nature 409:860-921.

Page 45: Techniques for Genome Mapping & Sequencing · PDF fileEric D. Green, M.D., Ph.D. Techniques for Genome Mapping & Sequencing Page 1 Techniques for Genome Mapping & Sequencing ... (e=2.718281828459

International Human Genome Sequencing Consortium (2004). Finishing the euchromatic

sequence of the human genome. Nature 431:931-945. Lindblad-Toh K et al. (2005). Genome sequence, comparative analysis and haplotype structure of

the domestic dog. Nature 438:803-819. Margulies EM et al. (2003). Identification and characterization of multi-species conserved

sequences. Genome Res 13:2507-2518. Margulies EM et al. (2005). An initial strategy for the systematic identification of functional

elements in the human genome by low-redundancy comparative sequencing. Proc Natl Acad Sci 102:4795-4800.

Margulies M et al. (2005). Genome sequencing in microfabricated high-density picolitre reactors.

Nature 437:376-380. Marra MA et al. (1997). High throughput fingerprint analysis of large-insert clones. Genome Res

7:1072-1084. Messing J and Llaca V (1998). Importance of anchor genomes for any plant genome project. Proc

Natl Acad Sci 95:2017-2020. Metzker ML et al. (2005). Emerging technologies in DNA sequencing. Genome Res 15:1767-

1776. Mouse Genome Sequencing Consortium (2002). Initial sequencing and comparative analysis of

the mouse genome. Nature 420:520-562. Rat Genome Sequencing Project Consortium (2004). Genome sequence of the Brown Norway rat

yields insights into mammalian evolution. Nature 428:493-521. Shendure J et al. (2005). Accurate multiplex polony sequencing of an evolved bacterial genome.

Science 309:1728-1732. Thomas JW et al. (2003). Comparative analyses of multi-species sequences from targeted

genomic regions. Nature 424:788-793. Venter JC et al. (2001). The sequence of the human genome. Science 291:1304-1351. Wilson RK and Mardis ER (1997). Fluorescence-based DNA sequencing. In Genome Analysis: A

Laboratory Manual, Vol. 1 Analyzing DNA (B Birren et al., eds.; Cold Spring Harbor Laboratory Press), pp. 301-395.

Wilson RK and Mardis ER (1997). Shotgun sequencing. In Genome Analysis: A Laboratory

Manual, Vol. 1 Analyzing DNA (B Birren et al., eds.; Cold Spring Harbor Laboratory Press), pp. 397-454.