Erythropoiesis, gene regulation and comparative genomics CCGB Jan 30, 2008



Connections and expansions

• Three recent and current projects:
– Determinants of GATA-1 occupancy
– Regulation of microRNAs that regulate erythropoiesis
– Variation in regulation in humans

• Explore connections and possible expansions
– Finding discriminatory motifs
– More complete data on erythroid regulation
– What are the functions of DNA segments occupied by transcription factors?
– Relationship of conservation and function
– Which sequence variants make a difference?
– Improved methods of predicting regulatory regions

What determines occupancy of a genomic interval by a sequence-specific binding protein?

GATA-1 is required for erythroid maturation

Aria Rad, 2007 http://commons.wikimedia.org/wiki/Image:Hematopoiesis_(human)_diagram.png

Figure: hematopoiesis diagram. Labels: hematopoietic stem cell, common myeloid progenitor, common lymphoid progenitor, MEP, myeloblast, basophil, neutrophil, eosinophil, monocyte/macrophage. Annotations: GATA-1; G1E cells; G1E-ER4 cells.

GATA-1 occupancy in 66Mb of mouse

ChIP-chip in reference sites

qPCR to validate ChIP-chip peaks

Distribution of occupied sites

Multiple WGATAR motifs in occupied sites

But only about 1 in 1000 intervals with WGATAR are occupied!
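As a rough illustration of what "intervals with WGATAR" means, the sketch below scans a sequence for the consensus on both strands. It assumes the standard IUPAC codes (W = A or T, R = A or G); the function names and the toy sequences are hypothetical and are not taken from the talk.

```python
import re

# IUPAC codes in the consensus: W = A or T, R = A or G.
# The second alternative is the reverse complement (YTATCW), so a site on
# either strand is reported. The lookahead lets overlapping sites be found.
WGATAR = re.compile(r"(?=([AT]GATA[AG]|[CT]TATC[AT]))")


def find_wgatar(seq):
    """Return 0-based start positions of WGATAR matches on either strand."""
    return [m.start() for m in WGATAR.finditer(seq.upper())]


def count_by_interval(intervals):
    """Map interval name -> number of WGATAR matches (hypothetical helper)."""
    return {name: len(find_wgatar(seq)) for name, seq in intervals.items()}


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    toy = {"interval_1": "CCAGATAAGG", "interval_2": "CCCCCCCCCC"}
    print(count_by_interval(toy))
```

A six-base degenerate motif like this occurs very often by chance, which is consistent with the point of the slide: the presence of the motif alone does not predict occupancy.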

Probabilistic methods find some discriminatory motifs

Combinations of motifs are associated with a majority of occupied sites

Connections and expansions: Motifs

• Enumerative methods for finding enriched motifs
– Hexamer counting: Ying Zhang, Bob Harris, Kuan-Bei Chen, Francesca Chiaromonte
– Chungoo Park, K. Makova, F. Chiaromonte
• Suffix trees
– Jianbin He, A. Nekrutenko
• Other efforts
– Cizhong Jiang, Frank Pugh
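A minimal sketch of hexamer counting as an enumerative approach: count every 6-mer in occupied (foreground) and unoccupied (background) sequences and rank by a frequency ratio. This is an assumption about the general idea, not the method used by the people named above; the function names, the pseudocount, and the toy input are hypothetical. A real analysis would also need a statistical test and would treat both strands.

```python
from collections import Counter


def kmer_counts(seqs, k=6):
    """Count all k-mers made only of A, C, G, T across a list of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip windows containing N or gaps
                counts[kmer] += 1
    return counts


def enrichment(fg_seqs, bg_seqs, k=6, pseudocount=1.0):
    """Rank k-mers seen in the foreground by foreground/background frequency."""
    fg = kmer_counts(fg_seqs, k)
    bg = kmer_counts(bg_seqs, k)
    n_kmers = 4 ** k
    fg_total = sum(fg.values()) + pseudocount * n_kmers
    bg_total = sum(bg.values()) + pseudocount * n_kmers
    scores = {}
    for kmer in fg:
        f = (fg[kmer] + pseudocount) / fg_total
        b = (bg[kmer] + pseudocount) / bg_total
        scores[kmer] = f / b
    # Highest ratio first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy input, made up for illustration only.
    ranked = enrichment(["AGATAAAGATAA"], ["CCCCCCCCCCCC"])
    print(ranked[:3])
```

The pseudocount keeps the ratio defined for hexamers that never appear in the background set.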

Connections and expansions: Other genome features

• Are the regions not bound by GATA-1 in a “repressed” or “condensed” state?
– Histone modifications associated with silencing, e.g. H3K27Me
• What other transcription factors are also bound?
• Multiple grad students and labs interested
• More high-throughput ChIP analysis:
– NimbleGen microarrays?
– Solexa sequencing?
– AB SOLiD sequencing?

Do the occupied sites have identifiable functions?

Occupied sites are closer to genes regulated by GATA-1 than to randomly chosen sites or genes

Occupied sites act as enhancers much more frequently than unoccupied sites

About half the occupied sites are enhancers

Many sites occupied by GATA-1 are close to promoters

GATA-1 occupied sites that overlap TSS (total occupied sites = 63)

TSS source                  Total    After merge     Occupied sites that overlap TSS       TSS that overlap occupied sites
                                                     Number   %     Empirical p-value      Number
RefSeq, 5000bp              830      745, 3.81Mb     19       30%   <0.001                 19
Known Genes, 5000bp         1170     736, 3.84Mb     23       37%   <0.001                 22
CAGE tags (RIKEN), 500bp    20,431   4255, 2.13Mb    25       40%   <0.001                 24
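The overlap counts and empirical p-values in the table above could be obtained along the following lines. The slide does not say how the p-values were computed, so the random-placement null model here is an assumption, as are the function names, the single-region coordinate system, and the toy intervals. The % column is consistent with the count divided by the 63 occupied sites (for example, 19/63 ≈ 30%).

```python
import random


def overlaps(a, b):
    """Half-open intervals (start, end) overlap if each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def count_overlapping(sites, windows):
    """Number of sites that overlap at least one window."""
    return sum(any(overlaps(s, w) for w in windows) for s in sites)


def empirical_p(sites, windows, region_len, n_perm=1000, seed=0):
    """Assumed null model: place each site uniformly at random in [0, region_len)."""
    rng = random.Random(seed)
    observed = count_overlapping(sites, windows)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in sites:
            length = end - start
            new_start = rng.randrange(0, region_len - length + 1)
            shuffled.append((new_start, new_start + length))
        if count_overlapping(shuffled, windows) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy coordinates, made up for illustration only.
    sites = [(100, 200), (1000, 1100)]
    tss_windows = [(150, 160), (5000, 5100)]
    print(empirical_p(sites, tss_windows, region_len=100_000))
```

With this estimator, 1000 permutations cannot report a p-value below 1/1001, so more permutations are needed to resolve smaller values.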

80% of genes whose expression changes upon restoration of GATA-1 have an occupied site in the vicinity.

Connections and expansions: Function

• Occupied sites can be far from each other on the linear sequence map, but are they close in the interphase nucleus?
– Chromosome conformation capture, genome-wide
• Can one see a gain-of-function OR loss-of-function phenotype in other species or systems?
• Are occupied sites also associated with the nuclear matrix?
• Insulators? Domain boundaries?

What is conservation good for?

Different functional classes diverge at distinctive rates

Miller et al. (2007) 28-species alignments … Genome Research

Depth of conservation of GATA-1 occupied sites and motifs

Conservation of the WGATAR motif is associated with enhancer activity
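One simple way to ask how deeply a WGATAR motif is conserved is to check, species by species, whether the aligned columns still match the consensus. The sketch below assumes a multiple-alignment block given as gapped strings of equal length; the function name, the species labels, and the toy alignment are hypothetical and do not reproduce the analysis behind these slides.

```python
import re

# WGATAR on the forward strand, or its reverse complement YTATCW.
WGATAR = re.compile(r"[AT]GATA[AG]|[CT]TATC[AT]")


def motif_conserved_in(alignment, col_start, width=6):
    """Report, per species, whether the aligned columns still match WGATAR.

    alignment: dict mapping species name -> gapped aligned string.
    col_start: 0-based alignment column where the motif starts.
    A segment containing a gap ('-') cannot match, so it counts as not conserved.
    """
    result = {}
    for species, row in alignment.items():
        segment = row[col_start:col_start + width].upper()
        result[species] = bool(WGATAR.fullmatch(segment))
    return result


if __name__ == "__main__":
    # Toy alignment block, made up for illustration only.
    block = {
        "species_A": "CCAGATAAGG",
        "species_B": "CCTGATAGGG",
        "species_C": "CCAGACAAGG",
        "species_D": "CC--ATAAGG",
    }
    print(motif_conserved_in(block, col_start=2))
```

This column-wise check treats a motif that has shifted by a few bases as lost, so it would undercount motifs subject to turnover.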

Connections and expansions: Using conservation

• Need to provide robust tools for finding and classifying regulatory regions and motifs as clade-specific, conserved to a certain distance, or subject to turnover
– Current efforts: David King, Kuan-Bei Chen, Demesew Abebe
• Are regulatory regions and motifs conserved to a particular phylogenetic distance associated with something distinctive?
– Functional category of the regulated gene (e.g. GO terms)
– Biochemical mechanism?
• What sorts of functions are found in clade-specific regulatory regions?
• Are we missing conserved regions because of limitations in sequences available or the choice of comparison sequences?

Two interesting and cautionary tales…

Control of a regulatory miRNA

• Multiple labs, institutions
– Lou Dore, Julio Amigo, Camila dos Santos …
– Mitch Weiss, Barry Paw, Hardison
– Children’s Hospital of Phila, Harvard, PSU
• miRNAs 144 and 451, two products of the same miRNA gene, are required for late erythroid maturation
• Predicting the position of an enhancer and its occupancy by GATA-1 was successful using regulatory potential plus a conserved consensus GATA-1 binding site

miR-144, 451 are highly expressed in response to GATA-1

L. Dore, J. Amigo, C. dos Santos…..R. Hardison, B. Paw, M. Weiss (2008) PNAS

miR-451 is required for erythroid maturation

Accurate prediction of a GATA-1 responsive enhancer for miR-144, 451


miRNA enhancer is seen only in mammals by multiZ

But you CAN see something like it in multiZ alignments of fish genomes!

A site occupied by GATA-1 upstream of HBG2 is a negative regulator of expression in adult cells

• Multiple institutes:
– Zhiyi Chen, Hong-yuan Luo, Marty Steinberg, David Chui, Boston Univ
– George Patrinos, Erasmus University
– Hardison, PSU
• Family of Iranian descent with elevated HbF (clinically well)
• New polymorphism upstream of promoter of HBG2, encoding gamma-globin
• GATA-1 binding motif is found only in simian primates, correlates with pattern of expression
• Affects binding by GATA-1 and acts as a negative regulator in mutagenesis and transfection assays

Table 1. Clinical laboratory findings of an Iranian family with increased Hb F

Family members   Age at diagnosis (years)   Hb (g/dl)   MCV (fl)   Hb A2 (%)   Hb F (%)   α-Globin genotype
Father           52                         15.7        82         2.5         10.2       αα / αα
Mother           44                         14.1        77         3.0         0.7        -α3.7 / αα
Brother          13                         13.2        75         3.4         0.7        -α3.7 / αα
Proband          9                          13.8        75         3.3         5.9        αα / αα

Fig. 1 (panels A and B): position -567 is marked; strands are labeled 5’ to 3’.

β-like genes: ε Gγ Aγ ψβ δ β (figure label order: ε Gγ ψβ Aγ δ β)

-567 motif restricted to simians

-1383
GTCTTTTAGCCGCCTAACACTTTGAGCAGATATAAGCCTTACACAGGATTATGAAGTCTGAAAGGATTCC
GATA -1288, GATA -1280
ACCAATATTATTATAATTCCTATCAACCTGATAGGTTAGGGGAAGGTAGAGCTCTCCTCCAATAAGCCAGATTTCCAGAGTTTCTGACGTCATAATCTACCAAGGTCATGGATCGAGTTCAGAGAAAAAACAAAAGCAAAACCAAACCTACCAAAAAATAAAAATCCCAAAGAAAAAATAAAGAAAAAAACAGCATGAATACTTCCTGCC
GATA -1058
ATGTTAAGTGGCCAATATGTCAGAAACAGCACTGAGTTACAGATAAAGATGTCTAAACTACAGTGACATCCCAGCTGTCACAGTGTGTGGACTATTAGTCAATAAAACAGTCCCTGCCTCTTAAGAGTTGTTTTCCATGC
GATA -898
AAATACATGTCTTATGTCTTAGAATAAGATTCCCTAAGAAGTGAACCTAGCATTTATACAAGATAATTAATTCTAATCCATAGTATCTGGTAAAGAGCATTCTACCATCATCTTTACCGAGCATAGAAGAGCTACACCAA
GATA -781
AACCCTGGGTCATCAGCCAGCACATACACTTATCCAGTGATAAATACACATCATCGGGTGCCTACATACATACCTGAATATAAAAAAAATACTTTTGCTGAGATGAAACAGGCGTGATTTATTTCAAATAGGTACGGATAAGTAGATATTGAAGTAAGGATTCAGTCTTATATTATATTACATAACATTAATCTATTCCTGCACTGAAAC
GATA -567
TGTTGCTTTATAGGATTTTTCACTACACTAATGAGAACTTAAGAGATAATGGCCTAAAACCACAGAGAGT
GATA -534
ATATTCAAAGATAAGTATAGCACTTCTTATTTGGAAACCAATGCTTACTAAATGAGACTAAGACGTGTCCCATCAAAAATCCTGGACCTATGCCTAAAACACATTTCACAATCCCTGAACTTTTCAAAAATTGGTACATGCTTTAACTTTAAACTACAGGCCTCACTGGAGCTACAGACAAGAAGGTGAAAAACGGCTGACAAAAGAAGTCCTGGTATCTTCTATGGTGGGAGAAGAAAACTAGCTAAAGGGAAGAATAAATTAGAGAAAAATTGGAATGACTGAATCGGAACAAGGCAAAGGCTATAAAAAAAATTAAGCAGCAGTATCCTCTTGGGGGCCCCTTCCCC
GATA -185, GATA -171
ACACTATCTCAATGCAAATATCTGTCTGAAACGGTCCCTGGCTAAACTCCACCCATGGGTTGGCCAGCCTTGCCTTGACCAATAGCCTTGACAAGGCAAACTTGACCAATAGTCTTAGAGTATCCAGTGAGGCCAGGGGC
+1
CGGCGGCTGGCTAGGGATGAAGAATAAAAGGAAGCACCCTTCAGCAGTTCCACA
+49

Fig. 6 (panels A and B): Relative luciferase activity (% WT), axis from 0 to 350, for pGL3-Basic, WT, and the mutant constructs MUT -1288, MUT -1280, MUT -1058, MUT -898, MUT -781, MUT -567, MUT -534, MUT -185 and MUT -171. * (P < 0.005)

Connections and expansions: Sequence variants that make a difference

• Human variation and conservation with apes used to find DNA segments deviating from neutrality
– Heather Lawson, Hiroshi Akashi, several others
• Recording and organizing human variation data
– Belinda Giardine and Cathy Riemer, Chui, Patrinos
• Finding variants comprehensively
– Genomes of Jim and Craig
– More humans?
– Other species?
• What are regions that should be targeted for resequencing in specific populations?

Connections and Expansions: Improving prediction of cis-regulatory modules

• Can we break the 50% ceiling? That is the validation rate for:
– Ultra- and highly conserved noncoding sequences as early developmental enhancers by transient transgenics (Pennacchio et al.)
– Positive regulatory potential and conserved consensus WGATAR for erythroid regulated genes (Wang et al.)
– Sites occupied by GATA-1 (Cheng et al.)
• ESPERR applied to occupied sites, different classes of promoters (David Gilmour), others?