Transcript

Genome-wide profiling of PRC1 and PRC2 Polycombchromatin binding in Drosophila melanogasterBas Tolhuis1, Elzo de Wit1,2, Inhua Muijrers1,2, Hans Teunissen1, Wendy Talhout1, Bas van Steensel1 &Maarten van Lohuizen1

Polycomb group (PcG) proteins maintain transcriptionalrepression of developmentally important genes1 and have beenimplicated in cell proliferation and stem cell self-renewal2. Weused a genome-wide approach3 to map binding patterns of PcGproteins (Pc, esc and Sce) in Drosophila melanogaster Kc cells.We found that Pc associates with large genomic regions of upto B150 kb in size, hereafter referred to as ‘Pc domains’. Sceand esc accompany Pc in most of these domains. PcG-boundchromatin is trimethylated at histone H3 Lys27 and isgenerally transcriptionally silent. Furthermore, PcG proteinspreferentially bind to developmental genes. Many of theseencode transcriptional regulators and key components of signaltransduction pathways, including Wingless, Hedgehog, Notchand Delta. We also identify several new putative functions ofPcG proteins, such as in steroid hormone biosynthesis. Theseresults highlight the extensive involvement of PcG proteins inthe coordination of development through the formation oflarge repressive chromatin domains.

We generated genome-wide binding profiles of three PcG proteins(Pc, esc and Sce) in D. melanogaster Kc cells to gain comprehensiveinsight into the nature and regulation of Polycomb target genes. Pcand Sce represent core components of the Polycomb repressivecomplex (PRC) 1, whereas esc belongs to PRC2. These biochemicallydistinct complexes are involved in several aspects of PcG-mediatedgene repression1.

To study PcG binding profiles, we used DamID, a method based onthe ability of a chromatin protein fused to Escherichia coli DNAadenine methyltransferase (Dam) to methylate the native binding siteof the chromatin protein3,4. In this method, Dam fusion proteins areexpressed at very low levels to avoid mistargeting4. Subsequently,methylated DNA fragments are isolated, labeled and hybridized to amicroarray. Methylated DNA fragments from cells transfected withDam alone serve as reference. Genomic binding sites of the proteincan then be identified based on the targeted methylation pattern3. Toensure that the PcG-Dam fusion proteins are functional, we usedprotein blots and immunoprecipitation assays to demonstrate correctexpression and the ability of these fusion proteins to incorporate into

native PcG complexes (Supplementary Fig. 1 online). In addition, ourDamID approach was functionally validated by the increased bindingratios we found at most known PcG target genes (SupplementaryTable 1 online).

Initially, we used a high-density genomic tiling array, coveringB30% of the D. melanogaster genome with 60-mer oligonucleotidesspanning every 100 bp. We were surprised to find that Pc binds largegenomic regions (Fig. 1a, Gene Expression Omnibus (GEO)GSE4564). Using a running mean algorithm we identified 131 regionslarger than 10 kb, which we refer to as ‘Pc domains’. A total of 97 Pcdomains ranged from 10 to 30 kb, and 34 domains were larger than30 kb (Supplementary Fig. 2 online). The largest domain covered145 kb and included three genes (Fig. 1b).

In total, the Pc domains contained 241 genes (SupplementaryTable 2 online), including known PcG target genes such as engrailed,invected (Fig. 1c) and polyhomeotic1. The number of genes within Pcdomains ranged from 0 to 9. Half of the domains contained two ormore genes, about 40% contained a single gene and B10% did notcover any known genes (Fig. 1d).

Previous DamID tiling array data for GAGA factor (GAF) andheterochromatin protein HP1a in the Adh-cactus region (2.9 Mb) didnot show such large domains5, whereas our data shows that this regiondoes contain Pc domains (Supplementary Fig. 3 online). We did notobserve any correlation between Pc and HP1a binding. Notably, mostPc domains seemed to contain at least one sharp peak of GAF binding,which generally coincides with a strong peak of Pc binding. Previousstudies in the homeotic gene cluster showed that GAF and Pc indeedcan bind the same chromatin regions1. However, GAF peaks are notrestricted to Pc domains, which further supports the specificity of thedetected Pc pattern. Notably, comparison of our DamID map of Pcwith chromatin immunoprecipitation maps of endogenous Pc and phin D. melanogaster embryos6 showed a marked correspondencebetween the data obtained with these two independent methods(data not shown). This demonstrates that our DamID results accu-rately reflect the localization of endogenous Pc.

Genetics and reporter studies have identified cis-regulatoryelements, called PcG response elements (PREs), that are essential forPcG-mediated repression and serve as binding sites for PcG proteins1.

Received 26 October 2005; accepted 31 March 2006; published online 20 April 2006; corrected after print 9 June 2006; doi:10.1038/ng1792

1Division of Molecular Genetics, and the Centre for Biomedical Genetics, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands.2These authors contributed equally to this work. Correspondence should be addressed to B.v.S. ([email protected]) and M.v.L. ([email protected]).

6 94 VOLUME 38 [ NUMBER 6 [ JUNE 2006 NATURE GENETICS

LET TERS

In addition, 167 PREs have been predicted based on sequencesimilarities, and for 29 of these PREs, enrichment for Pc bindinghas been confirmed in Schneider cells7. Our tiling array included eightof these experimentally confirmed PREs, and all of these are locatedwithin a Pc domain. Thus, our DamID map is in excellent agreementwith previous experimental data. Of 39 computationally predictedPREs that were present on our tiling array, we found 12 to be locatedwithin a Pc domain (Fig. 1a–c). Additionally, 119 Pc domains lack apredicted PRE, suggesting that the sequence definition of PREs needsfurther refinement. This is supported by the recent discovery of apreviously unknown PRE binding protein and its consensus bindingsite8. Alternatively, PcG proteins could be targeted to chromatin by aPRE-independent mechanism.

The observation that Pc binds to large domains that span entiregenes allowed us to use a 12K cDNA microarray, covering B60% of thecoding genome, to study the distribution of PcG proteins morecomprehensively. Notably, data generated by the two microarraysshowed a high degree of similarity (Supplementary Fig. 4 online). Inaddition, we performed DamID with esc and Sce. We noticed that thechromatin binding profiles of the three PcG proteins were very similar(Fig. 2a–c, GEO GSE4564). In contrast, none of the proteins showedsignificant overlap with HP1a (ref. 9; Fig. 2d), consistent with the tilingarray data and in good agreement with the differences in nuclearlocalization of endogenous HP1a and Pc (Supplementary Fig. 5online). These results demonstrate that Pc, Sce and esc bind mainlyto the same chromatin regions that are distinct from heterochromatin.

Extensive links have been documented between PcG proteins andtrimethylation of Lys27 of histone H3 (H3K27me3)1. We performedH3K27me3 chromatin immunoprecipitation combined with micro-array analysis (ChIP-on-chip), using the 12K cDNA array and com-pared this with the PcG binding profiles. As expected, we observed astrong positive correlation between H3K27me3 levels and binding ofthe three PcG proteins (Fig. 3a–c, GEO GSE4564). For a control, weperformed ChIP-on-chip with an antibody against H3K4me3 or withan irrelevant IgG antibody, neither of which showed any positivecorrelation with PcG proteins (Fig. 3d–f, GEO GSE4564). We con-clude that Pc, Sce and esc bind to chromatin regions containing highlevels of H3K27me3.

Genetic analysis indicated that PcG proteins repress transcription ofsome developmental genes1. Therefore, we compared Pc and escbinding patterns with a genome-wide mRNA expression data set10.Although approximately 70% of all genes are actively transcribed in Kccells10, the Pc and esc targets contained only B20% active genes(Fig. 4a). Moreover, expression of this small active fraction wasroughly twofold lower than expression of non-targets (Fig. 4b). Toconfirm the repressed status of PcG-bound genes, we compared PcGbinding patterns to genome-wide histone modification mappingdata11. The levels of histone H3 acetylation (H3ac) and H3 Lys4dimethylation (H3K4me2), two histone marks that are linked to activetranscription11, were markedly lower at both Pc and esc targets than atnon-targets (Fig. 4c–f). We found similar results with other ‘active’histone modification levels, such as H3K4me3 (note the negative

a

b c

d

8

4

0

0 5 10 15 20 Mb

8

4

0

6

3

0

6

3

0

0 4 8 Mb 0 0.0 0.5 1.0 1.51 Mb Mb

15

10

CG14021 H15 CG6634 CG14020 CG30023 CG7777 inv en touCG7759

CG30022 CG33472 CG30034CG30026

CG7763E(Pc)

anon-48aCG6907

CG31648

clCG12512 CG31647 CG7382

CG7371CG31989

nompC

5.35

40

20

00 1 2 3 4 5 6 7 8 9

5.40 5.45 5.50 6.45 6.50 6.55 6.60 6.65

5

15

20

25

10

0

5

Chromosome 2R (part)

Chromosomal location (Mb)

Number of genes/Pc domain

Pc domain(30 kb)

Pc domain(10–30 kb)

Predicted PREwithin Pc domain

Predicted PREoutside Pc domain

Threshold level Pc

dom

ain

coun

t

Chromosomal location (Mb)

Chromosome 2L

Pc-

Dam

:Dam

Pc-

Dam

:Dam

Pc-

Dam

:Dam

Pc-

Dam

:Dam

Pc-

Dam

:Dam

Pc-

Dam

:Dam

Chromosome 4 Chromosome X (part)

Figure 1 Pc binds large genomic regions or ‘domains’. (a) Plots show running

mean of Pc-Dam:Dam ratios from genomic tiling array with 60-nt probes every

100 bp on chromosomes 2L, 2R, 4 and X. (b,c) Histograms showing unfiltered

data of a 145 kb domain (b) and the engrailed domain (c). Genes located within

the illustrated regions are indicated below the histograms. Black line indicates

running mean filtered data. Y-axes depict binding ratios, and x-axes show genomic

position. Pc domains are highlighted in green (domains 10–30 kb) and red

(domains 430 kb). Blue line represents threshold level. Dots indicate predicted

PREs7 that are either overlapping with domains (blue) or separate from domains

(gray). (d) Bar graph showing the number of Pc domains encompassing the

indicated number of genes. Color key refers to panels a–c.

NATURE GENETICS VOLUME 38 [ NUMBER 6 [ JUNE 2006 69 5

LET TERS

correlations in Fig. 3d,e) and H4ac and H3K79me2 (data not shown).Thus, our genome-wide data demonstrate that binding of Pc and escto chromatin is globally linked to gene repression. However, a smallportion of PcG-bound genes is transcribed, suggesting the existence ofan ‘escape’ mechanism that may allow certain PcG-bound genes toevade silencing. Likewise, Pc binding has been observed at thepromoters of active HOX genes, although the binding level waslower than at the promoters of repressed HOX genes12.

The observation that many Pc domains contain multiple genesraised the possibility that domains act as units of regulation.To investigate this, we compared developmental expression profiles13

of genes within each domain. Indeed, we found some domains thatshowed a high degree of coordinated expression (Fig. 5a). However,the majority of the domains did not show such a correlation (Fig. 5band data not shown). This general lack ofcoordinated expression suggests a more com-plex regulation mechanism. For example, Pcbinding to chromatin has been reported to bedynamic and cell type–specific14,15. Alterna-tively, the ‘escape’ mechanism suggestedabove may allow genes within a Pc domainto be activated individually.

Our genome-wide data allowed for the firsttime to test in an unbiased and thoroughmanner whether PcG proteins preferentiallybind to genes with specific biological func-tions. Therefore, we systematically queried allgene annotation categories in the GeneOntology database16 for enrichment of bind-ing by the three proteins as detected by thecDNA arrays, using the Quontology algo-rithm17 (Supplementary Table 3 online).We found 42 Gene Ontology categories thatdisplayed significantly enriched binding by atleast one of the three proteins, encompassing329 different target genes in total (Fig. 6a,band Supplementary Table 4 online). Theresults suggest that PcG proteins prominentlycontrol developmental processes, as 12PcG-enriched Gene Ontology categories

were annotated as such, including early developmental processes(such as ectoderm and mesoderm development) and late develop-mental processes (such as eye morphogenesis) developmental pro-cesses. Notably, many of the target genes are involved in multipledevelopmental pathways. This is illustrated by the 54 Pc target genes inthe ‘ectoderm development’ category (Fig. 6c and SupplementaryTable 5 online): a majority (44) of these also function in one or moreother developmental categories (Fig. 6c, blue).

The PcG-bound genes that are involved in developmental processesencode primarily two types of proteins. First, a large fraction ofthese target genes encode transcription regulators. For example,almost half (25/54) of the Pc target genes involved in ectodermdevelopment are also annotated as ‘transcription factor activity’,‘RNAP II transcription factor activity’ or ‘specific transcriptionalrepressor activity’, and they include 17 homeodomain-like proteins,four basic-Helix-Loop-Helix proteins, two high mobility group pro-teins, a LIM-domain protein and Dorsal (Fig. 6c, green, and Supple-mentary Table 5). We observed similar enrichment of transcriptionregulators in other developmental categories, and a total of at least 67PcG-bound genes encode proteins involved in transcription regulation(Supplementary Table 4). Moreover, the Pc domains identified on thetiling array included many genes encoding transcriptional regulators(Supplementary Table 6 online).

Secondly, many PcG target genes involved in development encodeproteins implicated in major signaling cascades controlling cell fate.For example, ten Pc targets linked to ectoderm development encodeintegral membrane or secreted proteins (Fig. 6c, red), includingcomponents of important signal transduction pathways such as

r = 0.76 r = 0.78 r = 0.42

r = –0.50 r = –0.60 r = –0.02

3

2

1

0

–1

–1 0 1 2 3–2log2 Pc-Dam:Dam log2 esc-Dam:Dam log2 Sce-Dam:Dam

log 2

H3K

27m

e3:H

3

3

2

1

0

–1

–1 0 1 2 3–2

log 2

H3K

27m

e3:H

3

log 2

H3K

4me3

:H3

log 2

H3K

4me3

:H3

log 2

IgG

:H3

3

2

1

0

–1

3

2

1

0

–1

2

1

0

–1

2

1

0

–1

–1 0 1 2 3–2

–1 0 1 2 3–2log2 Pc-Dam:Dam log2 esc-Dam:Dam log2 Pc-Dam:Dam

–1 0 1 2 3–2 –1 0 1 2 3–2

log 2

H3K

27m

e3:H

3a b c

d e f

Figure 3 PcG-bound chromatin is trimethylated at Lys27 of histone H3. Bivariate scatter plots of

log2-transformed data compare ChIP-on-chip analyses of Kc cell histone modifications and PcG DamID

profiles. (a–c) The mapping data of H3K27me3 fit very well with the Pc (a), esc (b) and Sce (c)

binding profiles. (d–f) In contrast, H3K4me3 data correlate negative with Pc (d) and esc (e). Moreover,

a nonspecific control ChIP shows no overlap with Pc (f). Pearson’s correlation coefficient is shown in

each plot (r).

r = 0.82 r = 0.66

r = 0.54

r = 0.14

3

3

2

2

1

1

log2 Pc-Dam:Dam

log 2

esc

-Dam

:Dam

0

0

–1

–1

–2

–2

3a b

c d

3

2

2

1

1

log2 Pc-Dam:Dam

log 2

Sce

-Dam

:Dam

log 2

HP

1a-D

am:D

am

log 2

Sce

-Dam

:Dam

0

0

–1

–1

–2

–2

3

3

2

2

1

1

log2 Pc-Dam:Dam

0

0

–1

–1

–2

–2

4

3

3

2

2

1

log2 Pc-Dam:Dam

1

0

0

–1

–1

–2

Figure 2 PcG proteins bind largely to same genomic sites. Bivariate scatter

plots of log2-transformed binding ratios demonstrate a high degree of

similarity between binding profiles of different PcG proteins. Shown are

binding profiles for esc versus Sce (a), Pc versus Sce (b), Pc versus esc (c)

and Pc versus HP1a (d). Pearson’s correlation coefficient (r) is shown in

each plot.

6 96 VOLUME 38 [ NUMBER 6 [ JUNE 2006 NATURE GENETICS

LET TERS

Wingless, Notch and Delta (Supplementary Table 5). In addition, thegenomic tiling array revealed that Pc domains included Wnt pathwaygenes (such as wingless, Wnt6 and two Wnt-binding proteins,CG33115 and CG8942). Wnt signaling has previously been correlatedto increased expression of PcG proteins18,19. Combined with our data,this suggests the existence of a feedback loop between PcG proteinsand Wnt signaling. Finally, the hedgehog pathway has been known tobe controlled by PcG proteins20,21, and our data confirm this con-nection (Supplementary Table 1).

We also identified a number of biological functions that had notbeen previously linked to PcG proteins (Fig. 6a). Among these areelectron and glucose transporter activities, various hydrolases, carbo-hydrate and steroid metabolism, oxidoreductases, G protein signalingpathways and transmission of nerve impulse. Some of these genes havepreviously been linked to development. For instance, the nrv2 geneencodes a Na+/K+ ATPase cation transporter that is required fortrachea development22. Notably, several cytochrome P450 proteinsimplicated in biosynthesis of steroid hormones23 were among the PcGtargets (Supplementary Table 4), and the tiling array data showed inaddition that a gene encoding a steroid hormone receptor (EcR) islocated in a Pc domain (Supplementary Table 6). This suggests thatPcG proteins regulate signaling by steroids, which are essentialhormonal stimuli during metamorphosis stages of development24.

In summary, our tiling array data demonstrate that Pc binds tolarge genomic regions, which typically encompass one or more genes.These Pc domains have important implications for understandingPcG-mediated gene repression mechanisms. In addition, we demon-strate strong overlaps between the genomic binding patterns of Pc, Sceand esc, as well as H3K27me3. Finally, our genome-wide data allow,for the first time, examination of the functional role of PcG proteins inan unbiased manner. We observe that PcG-bound genes encode manytranscriptional regulators, suggestive of roles in signaling pathwaysand in steroid hormone biosynthesis. These results imply a key role forPcG proteins in a variety of distinct developmental processes andprovide a broad framework for future developmental studies.

METHODSCell and molecular biology. D. melanogaster Kc cells were grown in BPYE

medium (Shields and Sang M3 Insect medium (Sigma) supplemented with

25% (w/v) bacto-peptone, 20% (w/v) yeast extract and 5% heat inactivated

fetal calf serum (Gibco)) at 23 1C. Full-length Pc, esc and Sce were cloned into

pNDamMyc4, and plasmids were transfected into Kc cells as described25.

DamID and data analysis. DamID experiments were performed as

described4,9. Two independent Pc DamID samples were hybridized to custom

60-mer oligonucleotide tiling arrays, using different dye orientations (Nimble-

Gen Systems). The tiling arrays contained the entire chromosomes 2L and 4,

the first 11 Mb of chromosome 2R, the first 2 Mb of chromosome X,

heterochromatic sequences of chromosome 2, rDNA genes and randomly

generated control probes. Log2-transformed ratios were normalized using

lowess normalization26. To define Pc domains, we first removed random noise

by calculating a running mean with a window size of 2 kb, which is the

approximate resolution of DamID27. Next, we calculated the running mean on

a randomly shuffled chromosome and its distribution for estimation of the

random error. As a threshold level for binding or no binding, we chose three

times the standard deviation of this randomly sampled set. A Pc domain

was defined in the noise-filtered data as 410 kb of consecutive probes that

had a value greater than the threshold. In addition, we allowed for five

consecutive probes to fall beneath the threshold and not break the domain.

Notably, when this method is applied on a randomly shuffled chromosome, not

a single domain is identified, indicating that our definition of Pc domains is

highly stringent.

For each PcG fusion, samples from four independent experiments were

hybridized to Fly12K microarrays (Genomics Facility, Fred Hutchinson Cancer

Research Center) containing 11,857 cDNA clones from the D. melanogaster

Gene Collection, Release 1 and 2, and 126 additional cDNAs from the

Northwest flychip consortium. After removal of low-quality, redundant or

ambiguous proves, hybridization data of 9,994 probes were used for analyses.

Probe annotations were obtained as described9 using Berkeley Drosophila

Genome Project Release 3.1. For statistical analysis, all measured ratios were

log2 transformed and normalized to the median value of the entire array.

Normalized log2 ratios from four independent experiments were averaged.

For the expression state of PcG bound chromatin, we defined Pc and esc

target genes by testing whether log2-transformed ratios were significantly

aP < 2 × 10–16

P < 2 × 10–16 P < 2 × 10–16

P < 2 × 10–16 P < 2 × 10–16 P < 1 × 10–11b

c d

e f

0.8

Pro

port

ion

of a

ctiv

e ge

nes

Mea

n lo

g 2 ex

pres

sion

leve

l

log2 modification level

Rel

ativ

e fr

eque

ncy

0.4

0.0 0.0

0.0

0.4

0.4

0.8

0.8

0.0

0.4

0.8

1.2

1.2

Pc

–2 –1 0 1 2 –2 –1 0 1 2

0.0

0.4

0.8

0.0

0.4

0.8

1.2

–2 –1 0 1 2 –2 –1 0 1 2

N Nesc Pc

H3Ac H3K4me2

P < 2 × 10–16 P < 2 × 10–16

H3Ac H3K4me2

N Nesc

Figure 4 PcG-bound chromatin is transcriptionally repressed. (a–b) Transcrip-

tional activity assessed by the fraction of active genes (a) and transcription

levels of expressed genes (b). (c–f) Histone modification levels of either Pc

targets (c,d) or esc targets (e,f) versus non-targets, represented as density

plots. In all panels, Pc target genes are black (n ¼ 644), esc targets dark

gray (n ¼ 463) and non-targets gray. P values were calculated by

hypergeometric test (e) or Mann-Whitney U-test (a–d,f).

1.5

1.0

0.5

0.0

–0.5

0.0

0.4

0.8a b

–0.4

–1.0

E0 E3 L P M

Mean r = 0.8

CG11629CG1421CG1428

CG8853sobdrm

Mean r = –0.066

Developmental stage

log 2

rela

tive

expr

essi

on le

vel

F E0 E3 L P M F

Figure 5 Developmental expression of genes located within Pc domains.

(a,b) Developmental expression profiles of genes within the 2L_21713465

(a) and 2L_3523335 (b) Pc domains. X-axes show developmental stage:

embryos at 0 h (E0) and 3 h (E3), larva (L), pupa (P), adult male (M) and

adult female (F).

NATURE GENETICS VOLUME 38 [ NUMBER 6 [ JUNE 2006 69 7

LET TERS

greater than 0 using the CyberT algorithm28 followed by a correction for

multiple testing29 adjusting the false discovery rate to 1%.

For the coordinate expression analysis, we used developmental expression

data from reported whole-genome expression profiles of six developmental

stages of D. melanogaster13. Hybridization intensities of each exon probe were

log2 transformed, data from replicate experiments were averaged and exon

probes corresponding to the same gene were averaged to obtain a single log2-

transformed expression level for each gene at each developmental stage. Next,

these values were normalized to the mean log2 value, first by developmental

stage and then by gene. Pearson correlation coefficients (r) were calculated for

the expression data of each possible gene pair within a Pc domain and were

averaged to yield one mean r.

We used the Quontology algorithm17 to identify functional groups of genes

that are significantly bound by PcG proteins. A Z score is calculated for the

functional categories from the FlyBase Gene Ontology database, which mea-

sures the deviation of the average log2-transformed binding ratio for genes in a

category from the genome-wide average, in units of the standard deviation. A

Z score larger than 4 is significant (P o 0.05).

All further statistical analyses were performed using the R package for

statistical computing; scripts are available upon request.

ChIP-on-chip. The ChIP method used was an adaptation of a previously

published protocol30. Approximately 2 � 108 Kc cells were cross-linked

by the addition of formaldehyde (1% final concentration). The reaction

was quenched with 125 mM glycine, and the cells were washed twice

with PBS. Subsequently, the cells were incubated for 5 min on ice in

1.5 ml lysis buffer (1% (v/v) SDS; 10 mM EDTA, pH 8.0; 50 mM Tris-HCl,

pH 8.0; protease inhibitor cocktail (Roche)). The cell lysates were sonicated

and centrifuged to remove cell debris. We used the equivalent of 1 � 106 cells

for one immunoprecipitation. Chromatin samples were diluted tenfold in

immunoprecipitation buffer (1% Triton X-100; 1.2 mM EDTA; 16.7 mM

Tris-HCl, pH 8.0; 167 mM NaCl; protease inhibitor cocktail (Roche)) and

precleared with 10 ml 50% protein A Sepharose slurry containing 1 mg/ml BSA

for 1 h agitation at 4 1C. Precleared chromatin was incubated with 2 mg

antibody overnight at 4 1C. All subsequent ChIP steps were performed

as described30.

We used the following antibodies for ChIP analyses: anti-H3K27me3

(Upstate), anti-H3K4me3 (Abcam) and anti-IgG (Jackson ImmunoResearch).

The data were normalized to a parallel ChIP with antibody to H3 (Abcam) to

correct for differences in nucleosome density.

For the microarray analysis, immunoprecipitated DNA fragments were

amplified using linker-modified PCR. Randomly sheared DNA fragments were

made blunt using T4 DNA polymerase for 15 min at 16 1C in 10 mM Tris-HCl,

pH 8.0; 50 mM NaCl; 10 mM MgCl2; 1 mM DTT and 0.1 mM dNTPs.

Adapter ligation, PCR amplification, labeling and hybridization were carried

out according to the DamID protocol3,9. Data analysis was performed as

described above.

URLs. Gene annotations and ontology data were downloaded from: http://

www.flybase.org. The R package can be downloaded from http://www.

r-project.org. The Gene Expression Omnibus (GEO) is found at http://

www.ncbi.nlm.nih.gov/geo/.

a b c

esc

ScePc Description

Pc targets/GO categoryEctoderm developmentPc targets/GO category

Ectoderm developmentMesoderm development

Ventral cord developmentNeurogenesis

Neuroblast cell fate determination

Eye morphogenesisSkeletal development

Tracheal system developmentMuscle developmentHindgut morphogenesisRegulation of transcription

Cation transportAnion transportCarbohydrate transportCarbohydrate metabolismSteroid metabolism

Transcription factor activity

Electron transporter activity

Endothelin-converting enzyme activityCarboxylesterase activityAlkaline phosphatase activityChitinase activityOxidoreductase activityG-protein coupled receptor activityStructural constituent of peritrophic membrane

MicrosomeMembraneIntegral to membraneExtracellular regionNucleus

Z score Pc targets Pc targets annotated in ectoderm development

Other developmental processes

Transcriptional regulators/nuclear proteins

(Integral) membrane/secreted proteins

Other GO categories

–4 0 4 8

Glucose transporter activitySerine-type endopeptidase activityTrypsin activity

RNAP II transcription factor activitySpecific RNAP II transcription factor activitySpecific transcriptional repressor activity

Transmission of nerve impulseG-protein coupled receptor protein signaling pathway

Negative regulation of transcriptionNegative regulation of transcription from Pol II promoter

Ganglion mother cell fate determination

Leg disc proximal/distal pattern formation

Bio

logi

cal p

roce

ssM

olec

ular

func

tion

Com

part

men

t

0 40 80 0 20 40

Figure 6 Functional role of PcG proteins in Kc cells. (a) Gene Ontology (GO) fingerprint of PcG protein binding profiles. Z scores were calculated using the

Quontology algorithm17. GO categories with a Z score 44 (P o 0.05) in at least one binding profile are listed. GO subcategories are indicated at right.(b) Number of Pc targets for each GO category are indicated by the bars. (c) Bar plots show Pc targets annotated in ‘ectoderm development’. Black bars

represent the total number of Pc targets in this category. The colored bars represent targets that are also annotated in other categories.

6 98 VOLUME 38 [ NUMBER 6 [ JUNE 2006 NATURE GENETICS

LET TERS

Note: Supplementary information is available on the Nature Genetics website.

ACKNOWLEDGMENTSWe thank G. Cavalli and N. Negre for sharing unpublished results and forproviding Pc and ph antibodies; J. Muller for Su(z)12 antibody; F. Greil,H. Pickersgill, M. Heimerickx and the NKI Central Microarray Facility fortechnical assistance; M. Fornerod for sharing R scripts; H. Bussemaker for theQuontology script; the Genomics Facility of the Fred Hutchinson CancerResearch Center for preparing 12K cDNAs and A. Brinkman and H. Stunnenbergfor help with ChIP. This work was supported by a Veni fellowship from theNetherlands Organisation for Scientific Research (NWO) to B.T., a fellowshipfrom the Association of International Cancer Research to I.M., a Centre forBiomedical Genetics grant to M.v.L. and support from the European ‘Epigenome’Network of Excellence and a European Young Investigator (EURYI) award to B.v.S.

COMPETING INTERESTS STATEMENTThe authors declare that they have no competing financial interests.

Published online at http://www.nature.com/naturegenetics/

Reprints and permissions information is available online at http://npg.nature.com/

reprintsandpermissions/

1. Ringrose, L. & Paro, R. Epigenetic regulation of cellular memory by the Polycomb andTrithorax group proteins. Annu. Rev. Genet. 38, 413–443 (2004).

2. Valk-Lingbeek, M.E., Bruggeman, S.W. & van Lohuizen, M. Stem cells and cancer; thepolycomb connection. Cell 118, 409–418 (2004).

3. van Steensel, B., Delrow, J. & Henikoff, S. Chromatin profiling using targeted DNAadenine methyltransferase. Nat. Genet. 27, 304–308 (2001).

4. van Steensel, B. & Henikoff, S. Identification of in vivo DNA targets of chromatinproteins using tethered dam methyltransferase. Nat. Biotechnol. 18, 424–428(2000).

5. Sun, L.V. et al. Protein-DNA interaction mapping using genomic tiling path microarraysin Drosophila. Proc. Natl. Acad. Sci. USA 100, 9428–9433 (2003).

6. Negre, N. et al. Chromosomal distribution of PcG proteins during Drosophila develop-ment. PLoS Biol. 4, e170 (2006).

7. Ringrose, L., Rehmsmeier, M., Dura, J.M. & Paro, R. Genome-wide predictionof Polycomb/Trithorax response elements in Drosophila melanogaster. Dev. Cell 5,759–771 (2003).

8. Dejardin, J. et al. Recruitment of Drosophila Polycomb group proteins to chromatin byDSP1. Nature 434, 533–538 (2005).

9. Greil, F. et al. Distinct HP1 and Su(var)3–9 complexes bind to sets of developmentallycoexpressed genes depending on chromosomal location. Genes Dev. 17, 2825–2838(2003).

10. Schubeler, D. et al. Genome-wide DNA replication profile for Drosophila melanogaster:a link between transcription and replication timing. Nat. Genet. 32, 438–442 (2002).

11. Schubeler, D. et al. The histone modification pattern of active genes revealed throughgenome-wide chromatin analysis of a higher eukaryote. Genes Dev. 18, 1263–1271(2004).

12. Breiling, A., O’Neill, L.P., D’Eliseo, D., Turner, B.M. & Orlando, V. Epigenome changesin active and inactive polycomb-group-controlled regions. EMBO Rep. 5, 976–982(2004).

13. Stolc, V. et al. A gene expression map for the euchromatic genome of Drosophilamelanogaster. Science 306, 655–660 (2004).

14. Ficz, G., Heintzmann, R. & Arndt-Jovin, D.J. Polycomb group proteincomplexes exchange rapidly in living Drosophila. Development 132, 3963–3976(2005).

15. Hernandez-Munoz, I., Taghavi, P., Kuijl, C., Neefjes, J. & vanLohuizen, M. Assocationof BMI1 with Polycomb bodies is dynamic and requires PRC2/EZH2 and the main-tenance DNA methyltransferase DNMT1. Mol. Cell. Biol. 25, 11047–11058 (2005).

16. Harris, M.A. et al. The Gene Ontology (GO) database and informatics resource. NucleicAcids Res. 32, D258–D261 (2004).

17. Lascaris, R. et al. Hap4p overexpression in glucose-grown Saccharomycescerevisiae induces cells to enter a novel metabolic state. Genome Biol. 4, R3(2003).

18. Kirmizis, A., Bartley, S.M. & Farnham, P.J. Identification of the polycomb group proteinSU(Z)12 as a potential molecular target for human cancer therapy. Mol. Cancer Ther.2, 113–121 (2003).

19. Klebes, A. et al. Regulation of cellular plasticity in Drosophila imaginal disc cells bythe Polycomb group, trithorax group and lama genes. Development 132, 3753–3765(2005).

20. Leung, C. et al. Bmi1 is essential for cerebellar development and is overexpressed inhuman medulloblastomas. Nature 428, 337–341 (2004).

21. Maurange, C. & Paro, R. A cellular memory module conveys epigenetic inheritance ofhedgehog expression during Drosophila wing imaginal disc development. Genes Dev.16, 2672–2683 (2002).

22. Paul, S.M., Ternet, M., Salvaterra, P.M. & Beitel, G.J. The Na+/K+ ATPase is requiredfor septate junction function and epithelial tube-size control in the Drosophila trachealsystem. Development 130, 4963–4974 (2003).

23. Warren, J.T. et al. Molecular and biochemical characterization of two P450 enzymes inthe ecdysteroidogenic pathway of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA99, 11043–11048 (2002).

24. Baehrecke, E.H. Steroid regulation of programmed cell death during Drosophiladevelopment. Cell Death Differ. 7, 1057–1062 (2000).

25. Henikoff, S., Ahmad, K., Platero, J.S. & van Steensel, B. Heterochromatic depositionof centromeric histone H3-like proteins. Proc. Natl. Acad. Sci. USA 97, 716–721(2000).

26. Cleveland, W.S. Robust locally weighted regression and smoothing scatterplots. J. Am.Stat. Assoc. 74, 829–836 (1979).

27. van Steensel, B., Delrow, J. & Bussemaker, H.J. Genomewide analysis of DrosophilaGAGA factor target genes reveals context-dependent DNA binding. Proc. Natl. Acad.Sci. USA 100, 2580–2585 (2003).

28. Baldi, P. & Long, A.D. A Bayesian framework for the analysis of microarray expressiondata: regularized t-test and statistical inferences of gene changes. Bioinformatics 17,509–519 (2001).

29. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practicaland powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Methodolog. 57,289–300 (1995).

30. Rietveld, L.E., Caldenhoven, E. & Stunnenberg, H.G. In vivo repression of an erythroid-specific gene by distinct corepressor complexes. EMBO J. 21, 1389–1397(2002).

NATURE GENETICS VOLUME 38 [ NUMBER 6 [ JUNE 2006 69 9

LET TERS

CORR IGENDA

Corrigendum: Genome-wide profiling of PRC1 and PRC2 Polycomb chromatin binding in Drosophila melanogasterBas Tolhuis, Inhua Muijrers, Elzo de Wit, Hans Teunissen, Wendy Talhout, Bas van Steensel & Maarten van LohuizenNature Genetics 38, 694–699 (2006); published online 20 April 2006; corrected after print 9 June 2006

In the version of this article initially published, the order of the first three authors was incorrect. The correct order should be Bas Tolhuis, Elzo de Wit, Inhua Muijrers. This error has been corrected in the PDF version of the article.


Recommended