
GGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTT ...bejerano.stanford.edu/talks/BejeranoGoogleOct06.pdf · Ultraconservation and Living Fossils: Mysteries of the Human Genome

Page 1

http://bejerano.stanford.edu 1

[Title slide background: DNA sequence beginning GGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGG...]

Ultraconservation and Living Fossils: Mysteries of the Human Genome

Gill Bejerano, PhD

2007: Assistant Professor, Dept. of Developmental Biology & Dept. of Computer Science, Stanford University

2006: Postdoc w/ David Haussler, School of Engineering, UC Santa Cruz

Page 2

This is “the Century of Biology”

Page 3

We can now cast Biology in “our” terms

strings

time series

circuits

Page 4

The Meaning of Life (abridged)

Page 5

DNA: Functional and Non-Functional

DNA = linear molecule that carries genetic instructions for making living organisms
~ long string over a small alphabet
Alphabet of four: {A,C,G,T}
Strings of length 10^4 to 10^11

...ACGTACGACTGACTAGCATCGACTACGACTAGCAC...

genetic instructions:

how to...when to...where to...

“junk” DNA “junk” DNA

Page 6

One Cell, One Genome, One Replication

Every cell holds a single copy of all its DNA = its genome.
The genome is replicated every cell division.
The human body is made of ~10^14 cells.
All originate from a single cell through cell division.

[Figure labels: cell; genome = all DNA; DNA string; cell division; egg; chicken]

chicken ≈ 10^14 copies (DNA) of egg (DNA)
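As a rough worked example of the numbers on this slide: if every cell divided in lockstep and none ever died (a simplifying assumption made here, not a claim from the talk), reaching ~10^14 cells from a single cell would take about 47 rounds of division. The function name is illustrative.

```python
import math

def min_division_rounds(target_cells: float) -> int:
    """Fewest rounds of synchronous doubling needed to reach target_cells from one cell."""
    return math.ceil(math.log2(target_cells))

# ~10^14 cells in a human body, all descended from one cell.
rounds = min_division_rounds(1e14)
print(rounds)  # 47, because 2**46 < 10**14 <= 2**47
```

Each of those rounds copies the whole genome again, which is why even a very small per-letter error rate matters.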

Page 7

Comparative Genomics

[Figure labels (species shown): human, chimp, macaque, mouse, rat, dog, cow, opossum, platypus, chicken, fugu, tetra, zfish]

“Nothing in Biology Makes Sense Except in the Light of Evolution”

Theodosius Dobzhansky

[Figure labels: t; Intelligent Designer]

Page 8

DNA Replication is Imperfect

Small Scale: single letters are substituted, erased, added

chicken: ...ACGTACGACTGACTAGCATCGACTACGA...

egg: ...ACGTACGACTGACTAGCATCGACTACGA...

[Figure labels: functional: many changes are not tolerated; junk: “anything goes”; example changes: TT, CAT]

thus, sequence conservation implies function!
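The three small-scale changes named on this slide (substitute, erase, add) can be sketched as string edits. This is a minimal illustration, assuming 0-based positions; the function names are made up for this example.

```python
def substitute(dna: str, pos: int, letter: str) -> str:
    """Replace the letter at pos with a different one."""
    return dna[:pos] + letter + dna[pos + 1:]

def erase(dna: str, pos: int) -> str:
    """Drop the letter at pos; the string gets one letter shorter."""
    return dna[:pos] + dna[pos + 1:]

def add(dna: str, pos: int, letter: str) -> str:
    """Insert a letter before pos; the string gets one letter longer."""
    return dna[:pos] + letter + dna[pos:]

parent = "ACGTACGACTGACTAGCATCGACTACGA"  # sequence shown on the slide
for child in (substitute(parent, 3, "C"), erase(parent, 3), add(parent, 3, "C")):
    print(child, len(child))
```

In a "junk" region any of these children may persist; in a functional region most of them would be selected against, which is the slide's argument for why conservation suggests function.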

Page 9

Conservation implies Function

Comparative Genomics of Distantly related species:

[Figure labels: human; mouse; mammalian ancestor; functional region!]

...CTTTGCGA-TGAGTAGCATCTACTATTT...

...ACGTGGGACTGACTA-CATCGACTACGA...

(but which function/s?...)
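One simple way to quantify how conserved two aligned sequences are is to count identical columns. The sketch below applies that to the two aligned strings shown on this slide, skipping any column that contains a gap (`-`). Ignoring gap columns is a choice made for this example, and the function name is hypothetical.

```python
def alignment_identity(a: str, b: str) -> tuple[int, int]:
    """Return (matching columns, compared columns) for two aligned strings.

    Columns where either string has a gap ('-') are not compared.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = compared = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches, compared

# The two aligned sequences from the slide (which species is which is not stated).
seq1 = "CTTTGCGA-TGAGTAGCATCTACTATTT"
seq2 = "ACGTGGGACTGACTA-CATCGACTACGA"
matches, compared = alignment_identity(seq1, seq2)
print(f"{matches} of {compared} ungapped columns are identical")
```

A stretch whose identity is well above the background for the two species is a candidate functional region, although, as the slide notes, identity alone does not say which function.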

Page 10

The Human Genome is Full of Mysteries

Human Genome: 3*10^9 letters
1.5% known function
>50% junk

[Figure: frequency vs. conservation level (lo to hi), for all human-mouse DNA and for human-mouse junk DNA. Difference: 5% of Human] [Mouse Consortium 2002]

[Science 2004 Breakthrough of the Year, 5th runner up]

3x more functional DNA than known!
But what do these 10^7 substrings do?..

why bother?
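The "3x" on this slide follows from simple arithmetic, sketched below. It assumes the "Difference: 5% of Human" figure is read as the fraction of the genome conserved beyond the junk background; that reading, and the variable names, are assumptions of this example.

```python
GENOME_LETTERS = 3 * 10**9                   # human genome size from the slide

conserved = GENOME_LETTERS * 5 // 100        # 5% of the genome
known_function = GENOME_LETTERS * 15 // 1000 # 1.5% of the genome

print(conserved)                   # 150000000 letters
print(known_function)              # 45000000 letters
print(conserved / known_function)  # about 3.3
```

So roughly 150 million letters appear to be under selection, of which only about 45 million had a known function at the time of the talk.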

Page 11

Genes, Proteins and Gene Control

gene (how to)
control region (when & where)
DNA
proximal: in 10^3 letters
genome.ucsc.edu
3kb

Page 12

Ultraconserved Elements

[Bejerano et al., Science 2004]

HOXA4 exon

Page 13

Why is Perfect Conservation So Surprising?

If a substring is identical between enough distant species, it must have rejected many different changes over time.
But... all functions we understand in our genome are encoded using redundant codes.

Coding: 3 DNA letters → 1 Protein letter.
E.g. Protein Coding Genes:
DNA – 108 letters over alphabet of 4.
Protein – 102 letters over alphabet of 20.
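To make the redundancy concrete: three letters over an alphabet of four give 4^3 = 64 possible codons, but they encode only 20 protein letters, so several codons map to the same one. The sketch below uses a small excerpt of the standard genetic code (only the codons needed here) to show two different DNA strings that translate to the same protein string. The `translate` helper is written for this example only.

```python
# Excerpt of the standard genetic code; only the codons used below.
CODON_TABLE = {
    "ATG": "M",                                      # methionine
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
}

def translate(dna: str) -> str:
    """Translate DNA to protein letters, three DNA letters at a time."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

print(4 ** 3)  # 64 possible codons for 20 protein letters

dna_a = "ATGGGTGCA"
dna_b = "ATGGGCGCG"  # differs from dna_a at two positions
print(translate(dna_a), translate(dna_b))  # MGA MGA
```

Because such silent changes cost nothing at the protein level, a coding region is expected to drift at those positions over time, which is why letter-for-letter identity across distant species is surprising.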

Page 14

Genes, Proteins and Gene Control Revisited

gene (how to)
control region (when & where)
distal: in 10^6 letters
DNA
proximal: in 10^3 letters
DNA binding proteins

Page 15

Vertebrate Gene Regulation

gene (how to)
control region (when & where)
~10^6 letters!!!
DNA
~10^3 letters

crucial regulation
many thousands
previously invisible

Page 16

Ultraconserved Elements

481 regions perfectly conserved over 200 DNA bases or more, between human, mouse and rat (P<10^-22 in "junk")

• Evolve 20-fold slower than human average.
• Most do not overlap protein coding DNA.
• Those that do not code cluster spatially, near genes encoding DNA binding proteins. Dozens validated since as controlling genes.
• Those that do code, are found in genes coding for a specific type of protein.
• The tip of a continuum of very slowly evolving elements.
• The ultras cannot be found beyond vertebrates.

[Bejerano et al., Science 2004; Chicken Consortium, Nature 2004]

[Figure axes: conservation; freq]
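The definition on this slide (a stretch of at least 200 columns that is identical in human, mouse and rat) can be sketched as a scan over a multiple alignment. This is a toy illustration, not the pipeline used in the cited papers: the input format (equal-length aligned strings with `-` for gaps), the function name, and the short example sequences are all assumptions.

```python
def perfectly_conserved_runs(alignment: list[str], min_len: int) -> list[tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, where every row
    carries the same non-gap letter for at least min_len consecutive columns."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned rows must have equal length")

    runs = []
    start = None
    for col in range(width + 1):  # one extra step to close a run at the end
        letter = alignment[0][col] if col < width else None
        conserved = (
            col < width
            and letter != "-"
            and all(row[col] == letter for row in alignment)
        )
        if conserved:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs

# Made-up toy alignment; a real scan would use min_len=200 on genome alignments.
toy = [
    "ACGTTTGACCA",  # "human"
    "ACGATTGACCT",  # "mouse"
    "ACGTTTGACGA",  # "rat"
]
print(perfectly_conserved_runs(toy, min_len=4))  # [(4, 9)]
```

Lowering `min_len` to 3 in the toy case also reports the run at columns 0 to 3, which shows how the length threshold controls how strict the definition is.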

Page 17

Origins of Ultraconserved Elements?

[Figure label: ultraconserved elements]

Page 18

Origins of Ultraconserved Element

uc.338

Page 19

Coelacanth Homologs to uc.338 Closer than Human Ones

[Bejerano et al., Nature 2006]

Page 20

Coelacanth “the Living Fossil” Fish

Fossil Record: Appeared >360Mya, Peaked 240Mya, Disappeared 80Mya.
Rediscovered (by science) in 1938.
Possible Explanation: Habitat Switch.

Page 21

Repeats / Mobile Elements ("selfish DNA")

Human Genome: 3*10^9 letters
1.5% known function
>50% junk

Page 22

>360My Old and Going Strong

Up to 80% id between Coelacanth repeat and human instances, inc. uc.338.

[Figure labels: repeat; repeat]

Page 23

Cis-reg & Ultra elements from Mobile Elements

Co-option event, probably due to favorable genomic context

All other copies are destined to decay over time at a neutral rate

[Yass is a small town in New South Wales, Australia.]

[Bejerano et al., Nature 2006]

Page 24

Exapted Into Which Cellular Roles?

[Figure labels: gene; ?]

Human instances cluster together, found <1Mb from 35 TFs (P<3*10^-6).

No evidence for Transcription (Tx) as small RNAs, no orientation preference in introns, not in antisense Tx.

Page 25

Transient Transgenics

Eddy Rubin’s Lab, LBNL

[Construct labels: Reporter Gene; Minimal Promoter; Conserved Element]

Construct is injected into 1 cell embryos
Taken out at embryonic day 10.5-14.5
Assayed for reporter gene activity

[Figure panels: in situ; transgenic]

Page 26

Instance 500kb Downstream of ISL1

ISL1 is a neuro-developmental gene, also expressed in testis.
Three previously known enhancers are conserved in all vertebrates.

1Mb

Page 27

Mouse Isl1 in situ (B) vs. LacZ driven by LF SINE region (C)

Matched staining in genital eminence

Matched staining in dorsal apical ectodermal ridge (part of limb bud)

Nadav Ahituv, Eddy Rubin

Page 28

Matched Level Sections

Bryan King, Sofie Salama, Nadav Ahituv, Eddy Rubin

[Figure panels: in situ; transgenic]

Corresponding expression patterns in:
(a, b) the developing thalamus (Th) and basal plate (BP) in the brain.
(c, d) the trigeminal (V) ganglion and facio-acoustic (VII/VIII) ganglia in the head region.
(e, f) the dorsal root ganglion (DRG), and the lateral region of the ventral horn (VH) of the spinal cord in thoracic sections.

Page 29

DNA Replication is Imperfect (contd)

Medium Scale: substrings are duplicated, deleted, inverted
Large Scale: whole DNA strings are duplicated, deleted

junk functional

...ACGTACGACTGACTAGCATCGACTACGA........TCTGACTAGCATCGACTACGA...

...ACGTACGACTGACTAGCATCGACTACGA...

...ACGTACGACTGACTAGCATCGACTACGA........TCTGACTAGCATCGACTACGA...

[Figure labels: functional, functional; substring duplication; functional’, functional’’; functional divergence]

So...More Genes...More Complexity!!...Right?
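The medium-scale changes listed on this slide can be sketched as operations on a half-open range `[start, end)` of a DNA string. Two modelling choices here are assumptions of this example rather than statements from the talk: the duplicate is placed directly after the original (a tandem copy), and inversion is modelled as the reverse complement, which is what flipping a segment of double-stranded DNA amounts to on a single strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def duplicate(dna: str, start: int, end: int) -> str:
    """Insert a second copy of dna[start:end] right after the original."""
    return dna[:end] + dna[start:end] + dna[end:]

def delete(dna: str, start: int, end: int) -> str:
    """Remove dna[start:end]."""
    return dna[:start] + dna[end:]

def invert(dna: str, start: int, end: int) -> str:
    """Replace dna[start:end] with its reverse complement."""
    flipped = dna[start:end].translate(COMPLEMENT)[::-1]
    return dna[:start] + flipped + dna[end:]

dna = "AAACCGTT"
print(duplicate(dna, 2, 5))  # AAACCACCGTT
print(delete(dna, 2, 5))     # AAGTT
print(invert(dna, 2, 5))     # AAGGTGTT
```

After a duplication the two copies can accumulate different small-scale changes, which is the "functional divergence" step in the slide's figure.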

Page 30

Genes & Complexity

Gene numbers do not correlate with organism complexity. Many gene families are surprisingly old.

[Figure: # genes for fly, worm, human, weed, fish, rice; labels: 10^3 cells, 10^14 cells]

pre-genomic era: “100,000 genes to the human genome”

Page 31

The Evolution of Morphological Diversity

Gene numbers do not correlate with organism complexity. Many gene families are surprisingly old.

“Regulatory sequence evolution must be the major contribution to the evolution of form.” [Sean Carroll, PLoS Bio 2005]

[Figure: # genes for fly, worm, human, weed, fish, rice, with the In/vertebrate Divide marked]

Hold on... junk DNA can contribute these elements

Page 32

From junk DNA to recruitment into pathway?

[Davidson & Erwin, 2006]

[Britten & Davidson, 1971]

Page 33

Same Junk, Different Functional Elements

[Figure labels: protein coding; repeat; gene regulating]

Page 34

Additional Mysteries Abound

Page 35

Genome in Flux

Human Genome

Copied out to make ... ???

Copied out to make protein coding genes

Page 36

Bejerano Lab: Research Interests

Many thousands of human conserved elements congregate en-masse near developmental genes.
[Dog Genome Paper, Nature, 2005; Bejerano et al., Nature Methods, 2005]

Contribution to Human Disease

Origins & Evolution
Functions & Encoding

[Figure label: development]

Page 37

Bejerano Lab: Research Interests

Many thousands of human conserved elements congregate en-masse near developmental genes.
[Dog Genome Paper, Nature, 2005; Bejerano et al., Nature Methods, 2005]

Contribution to Human Disease

Origins & Evolution
Functions & Encoding

[Figure label: development]

[Ernst Haeckel, 1866]

Page 38

Bejerano Lab: Research Interests

Many thousands of human conserved elements congregate en-masse near developmental genes.

Contribution to Human Disease

Origins & Evolution
Functions & Encoding

Break regulatory code
• syntax
• grammar
• meaning

Page 39

Bejerano Lab: Research Interests

Many thousands of human conserved elements congregate en-masse near developmental genes.

Contribution to Human Disease

Origins & Evolution
Functions & Encoding

[Figure label: In/vertebrate Divide]

Understand our evolution
• Reconstruct ancient genomes
• Track regulatory regions' histories

Page 40

Bejerano Lab: Research Interests

Many thousands of human conserved elements congregate en-masse near developmental genes.

Contribution to Human Disease

Origins & Evolution
Functions & Encoding

Make a difference
• “bench to bedside”

Page 41

Bejerano Lab: Research Interests

Many thousands of human conserved elements congregate en-masse near developmental genes.

Contribution to Human Disease

Origins & Evolution
Functions & Encoding

Discovery tools
• large databases
• heterogeneous, noisy data
• statistical correlations
• human interfaces

thousands and thousands of page requests served daily

exponential growth of public data

Page 42

Summary

We are only beginning to understand the complexity unearthed by observing whole genomes.

Technology (genome sequencing, gene chips, etc.) is flooding us with different forms of whole genome measurements – extremely valuable, if challenging.

Some of the challenges discussed today:
• Explain Ultraconservation in particular, and the myriad of unexplained constrained elements in our genome.
• Understand the evolution of morphological diversity (how much have repeats contributed to it, quantitatively and qualitatively).
• Understand why so much of our genome is transcribed.

Page 43

Kudos

UC Santa Cruz: David Haussler, Sofie Salama, Jim Kent, Craig Lowe, Bryan King, Adam Siepel, Jakob Pedersen, Katie Pollard, Courtney Onodera, Rachel Harte, Genomics/Browser Group

Lawrence Berkeley Labs: Eddy Rubin, Nadav Ahituv

McGill U.: Mathieu Blanchette

Penn State U.: Webb Miller’s group

U. Queensland: John Mattick’s group

Genome Sequencing Consortia
All GenBank contributors

Gill [email protected]