Genome-wide profiling of PRC1 and PRC2 Polycombchromatin binding in Drosophila melanogasterBas Tolhuis1, Elzo de Wit1,2, Inhua Muijrers1,2, Hans Teunissen1, Wendy Talhout1, Bas van Steensel1 &Maarten van Lohuizen1
Polycomb group (PcG) proteins maintain transcriptionalrepression of developmentally important genes1 and have beenimplicated in cell proliferation and stem cell self-renewal2. Weused a genome-wide approach3 to map binding patterns of PcGproteins (Pc, esc and Sce) in Drosophila melanogaster Kc cells.We found that Pc associates with large genomic regions of upto B150 kb in size, hereafter referred to as ‘Pc domains’. Sceand esc accompany Pc in most of these domains. PcG-boundchromatin is trimethylated at histone H3 Lys27 and isgenerally transcriptionally silent. Furthermore, PcG proteinspreferentially bind to developmental genes. Many of theseencode transcriptional regulators and key components of signaltransduction pathways, including Wingless, Hedgehog, Notchand Delta. We also identify several new putative functions ofPcG proteins, such as in steroid hormone biosynthesis. Theseresults highlight the extensive involvement of PcG proteins inthe coordination of development through the formation oflarge repressive chromatin domains.
We generated genome-wide binding profiles of three PcG proteins(Pc, esc and Sce) in D. melanogaster Kc cells to gain comprehensiveinsight into the nature and regulation of Polycomb target genes. Pcand Sce represent core components of the Polycomb repressivecomplex (PRC) 1, whereas esc belongs to PRC2. These biochemicallydistinct complexes are involved in several aspects of PcG-mediatedgene repression1.
To study PcG binding profiles, we used DamID, a method based onthe ability of a chromatin protein fused to Escherichia coli DNAadenine methyltransferase (Dam) to methylate the native binding siteof the chromatin protein3,4. In this method, Dam fusion proteins areexpressed at very low levels to avoid mistargeting4. Subsequently,methylated DNA fragments are isolated, labeled and hybridized to amicroarray. Methylated DNA fragments from cells transfected withDam alone serve as reference. Genomic binding sites of the proteincan then be identified based on the targeted methylation pattern3. Toensure that the PcG-Dam fusion proteins are functional, we usedprotein blots and immunoprecipitation assays to demonstrate correctexpression and the ability of these fusion proteins to incorporate into
native PcG complexes (Supplementary Fig. 1 online). In addition, ourDamID approach was functionally validated by the increased bindingratios we found at most known PcG target genes (SupplementaryTable 1 online).
Initially, we used a high-density genomic tiling array, coveringB30% of the D. melanogaster genome with 60-mer oligonucleotidesspanning every 100 bp. We were surprised to find that Pc binds largegenomic regions (Fig. 1a, Gene Expression Omnibus (GEO)GSE4564). Using a running mean algorithm we identified 131 regionslarger than 10 kb, which we refer to as ‘Pc domains’. A total of 97 Pcdomains ranged from 10 to 30 kb, and 34 domains were larger than30 kb (Supplementary Fig. 2 online). The largest domain covered145 kb and included three genes (Fig. 1b).
In total, the Pc domains contained 241 genes (SupplementaryTable 2 online), including known PcG target genes such as engrailed,invected (Fig. 1c) and polyhomeotic1. The number of genes within Pcdomains ranged from 0 to 9. Half of the domains contained two ormore genes, about 40% contained a single gene and B10% did notcover any known genes (Fig. 1d).
Previous DamID tiling array data for GAGA factor (GAF) andheterochromatin protein HP1a in the Adh-cactus region (2.9 Mb) didnot show such large domains5, whereas our data shows that this regiondoes contain Pc domains (Supplementary Fig. 3 online). We did notobserve any correlation between Pc and HP1a binding. Notably, mostPc domains seemed to contain at least one sharp peak of GAF binding,which generally coincides with a strong peak of Pc binding. Previousstudies in the homeotic gene cluster showed that GAF and Pc indeedcan bind the same chromatin regions1. However, GAF peaks are notrestricted to Pc domains, which further supports the specificity of thedetected Pc pattern. Notably, comparison of our DamID map of Pcwith chromatin immunoprecipitation maps of endogenous Pc and phin D. melanogaster embryos6 showed a marked correspondencebetween the data obtained with these two independent methods(data not shown). This demonstrates that our DamID results accu-rately reflect the localization of endogenous Pc.
Genetics and reporter studies have identified cis-regulatoryelements, called PcG response elements (PREs), that are essential forPcG-mediated repression and serve as binding sites for PcG proteins1.
Received 26 October 2005; accepted 31 March 2006; published online 20 April 2006; corrected after print 9 June 2006; doi:10.1038/ng1792
1Division of Molecular Genetics, and the Centre for Biomedical Genetics, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands.2These authors contributed equally to this work. Correspondence should be addressed to B.v.S. ([email protected]) and M.v.L. ([email protected]).
6 94 VOLUME 38 [ NUMBER 6 [ JUNE 2006 NATURE GENETICS
LET TERS
In addition, 167 PREs have been predicted based on sequencesimilarities, and for 29 of these PREs, enrichment for Pc bindinghas been confirmed in Schneider cells7. Our tiling array included eightof these experimentally confirmed PREs, and all of these are locatedwithin a Pc domain. Thus, our DamID map is in excellent agreementwith previous experimental data. Of 39 computationally predictedPREs that were present on our tiling array, we found 12 to be locatedwithin a Pc domain (Fig. 1a–c). Additionally, 119 Pc domains lack apredicted PRE, suggesting that the sequence definition of PREs needsfurther refinement. This is supported by the recent discovery of apreviously unknown PRE binding protein and its consensus bindingsite8. Alternatively, PcG proteins could be targeted to chromatin by aPRE-independent mechanism.
The observation that Pc binds to large domains that span entiregenes allowed us to use a 12K cDNA microarray, covering B60% of thecoding genome, to study the distribution of PcG proteins morecomprehensively. Notably, data generated by the two microarraysshowed a high degree of similarity (Supplementary Fig. 4 online). Inaddition, we performed DamID with esc and Sce. We noticed that thechromatin binding profiles of the three PcG proteins were very similar(Fig. 2a–c, GEO GSE4564). In contrast, none of the proteins showedsignificant overlap with HP1a (ref. 9; Fig. 2d), consistent with the tilingarray data and in good agreement with the differences in nuclearlocalization of endogenous HP1a and Pc (Supplementary Fig. 5online). These results demonstrate that Pc, Sce and esc bind mainlyto the same chromatin regions that are distinct from heterochromatin.
Extensive links have been documented between PcG proteins andtrimethylation of Lys27 of histone H3 (H3K27me3)1. We performedH3K27me3 chromatin immunoprecipitation combined with micro-array analysis (ChIP-on-chip), using the 12K cDNA array and com-pared this with the PcG binding profiles. As expected, we observed astrong positive correlation between H3K27me3 levels and binding ofthe three PcG proteins (Fig. 3a–c, GEO GSE4564). For a control, weperformed ChIP-on-chip with an antibody against H3K4me3 or withan irrelevant IgG antibody, neither of which showed any positivecorrelation with PcG proteins (Fig. 3d–f, GEO GSE4564). We con-clude that Pc, Sce and esc bind to chromatin regions containing highlevels of H3K27me3.
Genetic analysis indicated that PcG proteins repress transcription ofsome developmental genes1. Therefore, we compared Pc and escbinding patterns with a genome-wide mRNA expression data set10.Although approximately 70% of all genes are actively transcribed in Kccells10, the Pc and esc targets contained only B20% active genes(Fig. 4a). Moreover, expression of this small active fraction wasroughly twofold lower than expression of non-targets (Fig. 4b). Toconfirm the repressed status of PcG-bound genes, we compared PcGbinding patterns to genome-wide histone modification mappingdata11. The levels of histone H3 acetylation (H3ac) and H3 Lys4dimethylation (H3K4me2), two histone marks that are linked to activetranscription11, were markedly lower at both Pc and esc targets than atnon-targets (Fig. 4c–f). We found similar results with other ‘active’histone modification levels, such as H3K4me3 (note the negative
a
b c
d
8
4
0
0 5 10 15 20 Mb
8
4
0
6
3
0
6
3
0
0 4 8 Mb 0 0.0 0.5 1.0 1.51 Mb Mb
15
10
CG14021 H15 CG6634 CG14020 CG30023 CG7777 inv en touCG7759
CG30022 CG33472 CG30034CG30026
CG7763E(Pc)
anon-48aCG6907
CG31648
clCG12512 CG31647 CG7382
CG7371CG31989
nompC
5.35
40
20
00 1 2 3 4 5 6 7 8 9
5.40 5.45 5.50 6.45 6.50 6.55 6.60 6.65
5
15
20
25
10
0
5
Chromosome 2R (part)
Chromosomal location (Mb)
Number of genes/Pc domain
Pc domain(30 kb)
Pc domain(10–30 kb)
Predicted PREwithin Pc domain
Predicted PREoutside Pc domain
Threshold level Pc
dom
ain
coun
t
Chromosomal location (Mb)
Chromosome 2L
Pc-
Dam
:Dam
Pc-
Dam
:Dam
Pc-
Dam
:Dam
Pc-
Dam
:Dam
Pc-
Dam
:Dam
Pc-
Dam
:Dam
Chromosome 4 Chromosome X (part)
Figure 1 Pc binds large genomic regions or ‘domains’. (a) Plots show running
mean of Pc-Dam:Dam ratios from genomic tiling array with 60-nt probes every
100 bp on chromosomes 2L, 2R, 4 and X. (b,c) Histograms showing unfiltered
data of a 145 kb domain (b) and the engrailed domain (c). Genes located within
the illustrated regions are indicated below the histograms. Black line indicates
running mean filtered data. Y-axes depict binding ratios, and x-axes show genomic
position. Pc domains are highlighted in green (domains 10–30 kb) and red
(domains 430 kb). Blue line represents threshold level. Dots indicate predicted
PREs7 that are either overlapping with domains (blue) or separate from domains
(gray). (d) Bar graph showing the number of Pc domains encompassing the
indicated number of genes. Color key refers to panels a–c.
NATURE GENETICS VOLUME 38 [ NUMBER 6 [ JUNE 2006 69 5
LET TERS
correlations in Fig. 3d,e) and H4ac and H3K79me2 (data not shown).Thus, our genome-wide data demonstrate that binding of Pc and escto chromatin is globally linked to gene repression. However, a smallportion of PcG-bound genes is transcribed, suggesting the existence ofan ‘escape’ mechanism that may allow certain PcG-bound genes toevade silencing. Likewise, Pc binding has been observed at thepromoters of active HOX genes, although the binding level waslower than at the promoters of repressed HOX genes12.
The observation that many Pc domains contain multiple genesraised the possibility that domains act as units of regulation.To investigate this, we compared developmental expression profiles13
of genes within each domain. Indeed, we found some domains thatshowed a high degree of coordinated expression (Fig. 5a). However,the majority of the domains did not show such a correlation (Fig. 5band data not shown). This general lack ofcoordinated expression suggests a more com-plex regulation mechanism. For example, Pcbinding to chromatin has been reported to bedynamic and cell type–specific14,15. Alterna-tively, the ‘escape’ mechanism suggestedabove may allow genes within a Pc domainto be activated individually.
Our genome-wide data allowed for the firsttime to test in an unbiased and thoroughmanner whether PcG proteins preferentiallybind to genes with specific biological func-tions. Therefore, we systematically queried allgene annotation categories in the GeneOntology database16 for enrichment of bind-ing by the three proteins as detected by thecDNA arrays, using the Quontology algo-rithm17 (Supplementary Table 3 online).We found 42 Gene Ontology categories thatdisplayed significantly enriched binding by atleast one of the three proteins, encompassing329 different target genes in total (Fig. 6a,band Supplementary Table 4 online). Theresults suggest that PcG proteins prominentlycontrol developmental processes, as 12PcG-enriched Gene Ontology categories
were annotated as such, including early developmental processes(such as ectoderm and mesoderm development) and late develop-mental processes (such as eye morphogenesis) developmental pro-cesses. Notably, many of the target genes are involved in multipledevelopmental pathways. This is illustrated by the 54 Pc target genes inthe ‘ectoderm development’ category (Fig. 6c and SupplementaryTable 5 online): a majority (44) of these also function in one or moreother developmental categories (Fig. 6c, blue).
The PcG-bound genes that are involved in developmental processesencode primarily two types of proteins. First, a large fraction ofthese target genes encode transcription regulators. For example,almost half (25/54) of the Pc target genes involved in ectodermdevelopment are also annotated as ‘transcription factor activity’,‘RNAP II transcription factor activity’ or ‘specific transcriptionalrepressor activity’, and they include 17 homeodomain-like proteins,four basic-Helix-Loop-Helix proteins, two high mobility group pro-teins, a LIM-domain protein and Dorsal (Fig. 6c, green, and Supple-mentary Table 5). We observed similar enrichment of transcriptionregulators in other developmental categories, and a total of at least 67PcG-bound genes encode proteins involved in transcription regulation(Supplementary Table 4). Moreover, the Pc domains identified on thetiling array included many genes encoding transcriptional regulators(Supplementary Table 6 online).
Secondly, many PcG target genes involved in development encodeproteins implicated in major signaling cascades controlling cell fate.For example, ten Pc targets linked to ectoderm development encodeintegral membrane or secreted proteins (Fig. 6c, red), includingcomponents of important signal transduction pathways such as
r = 0.76 r = 0.78 r = 0.42
r = –0.50 r = –0.60 r = –0.02
3
2
1
0
–1
–1 0 1 2 3–2log2 Pc-Dam:Dam log2 esc-Dam:Dam log2 Sce-Dam:Dam
log 2
H3K
27m
e3:H
3
3
2
1
0
–1
–1 0 1 2 3–2
log 2
H3K
27m
e3:H
3
log 2
H3K
4me3
:H3
log 2
H3K
4me3
:H3
log 2
IgG
:H3
3
2
1
0
–1
3
2
1
0
–1
2
1
0
–1
2
1
0
–1
–1 0 1 2 3–2
–1 0 1 2 3–2log2 Pc-Dam:Dam log2 esc-Dam:Dam log2 Pc-Dam:Dam
–1 0 1 2 3–2 –1 0 1 2 3–2
log 2
H3K
27m
e3:H
3a b c
d e f
Figure 3 PcG-bound chromatin is trimethylated at Lys27 of histone H3. Bivariate scatter plots of
log2-transformed data compare ChIP-on-chip analyses of Kc cell histone modifications and PcG DamID
profiles. (a–c) The mapping data of H3K27me3 fit very well with the Pc (a), esc (b) and Sce (c)
binding profiles. (d–f) In contrast, H3K4me3 data correlate negative with Pc (d) and esc (e). Moreover,
a nonspecific control ChIP shows no overlap with Pc (f). Pearson’s correlation coefficient is shown in
each plot (r).
r = 0.82 r = 0.66
r = 0.54
r = 0.14
3
3
2
2
1
1
log2 Pc-Dam:Dam
log 2
esc
-Dam
:Dam
0
0
–1
–1
–2
–2
3a b
c d
3
2
2
1
1
log2 Pc-Dam:Dam
log 2
Sce
-Dam
:Dam
log 2
HP
1a-D
am:D
am
log 2
Sce
-Dam
:Dam
0
0
–1
–1
–2
–2
3
3
2
2
1
1
log2 Pc-Dam:Dam
0
0
–1
–1
–2
–2
4
3
3
2
2
1
log2 Pc-Dam:Dam
1
0
0
–1
–1
–2
Figure 2 PcG proteins bind largely to same genomic sites. Bivariate scatter
plots of log2-transformed binding ratios demonstrate a high degree of
similarity between binding profiles of different PcG proteins. Shown are
binding profiles for esc versus Sce (a), Pc versus Sce (b), Pc versus esc (c)
and Pc versus HP1a (d). Pearson’s correlation coefficient (r) is shown in
each plot.
6 96 VOLUME 38 [ NUMBER 6 [ JUNE 2006 NATURE GENETICS
LET TERS
Wingless, Notch and Delta (Supplementary Table 5). In addition, thegenomic tiling array revealed that Pc domains included Wnt pathwaygenes (such as wingless, Wnt6 and two Wnt-binding proteins,CG33115 and CG8942). Wnt signaling has previously been correlatedto increased expression of PcG proteins18,19. Combined with our data,this suggests the existence of a feedback loop between PcG proteinsand Wnt signaling. Finally, the hedgehog pathway has been known tobe controlled by PcG proteins20,21, and our data confirm this con-nection (Supplementary Table 1).
We also identified a number of biological functions that had notbeen previously linked to PcG proteins (Fig. 6a). Among these areelectron and glucose transporter activities, various hydrolases, carbo-hydrate and steroid metabolism, oxidoreductases, G protein signalingpathways and transmission of nerve impulse. Some of these genes havepreviously been linked to development. For instance, the nrv2 geneencodes a Na+/K+ ATPase cation transporter that is required fortrachea development22. Notably, several cytochrome P450 proteinsimplicated in biosynthesis of steroid hormones23 were among the PcGtargets (Supplementary Table 4), and the tiling array data showed inaddition that a gene encoding a steroid hormone receptor (EcR) islocated in a Pc domain (Supplementary Table 6). This suggests thatPcG proteins regulate signaling by steroids, which are essentialhormonal stimuli during metamorphosis stages of development24.
In summary, our tiling array data demonstrate that Pc binds tolarge genomic regions, which typically encompass one or more genes.These Pc domains have important implications for understandingPcG-mediated gene repression mechanisms. In addition, we demon-strate strong overlaps between the genomic binding patterns of Pc, Sceand esc, as well as H3K27me3. Finally, our genome-wide data allow,for the first time, examination of the functional role of PcG proteins inan unbiased manner. We observe that PcG-bound genes encode manytranscriptional regulators, suggestive of roles in signaling pathwaysand in steroid hormone biosynthesis. These results imply a key role forPcG proteins in a variety of distinct developmental processes andprovide a broad framework for future developmental studies.
METHODSCell and molecular biology. D. melanogaster Kc cells were grown in BPYE
medium (Shields and Sang M3 Insect medium (Sigma) supplemented with
25% (w/v) bacto-peptone, 20% (w/v) yeast extract and 5% heat inactivated
fetal calf serum (Gibco)) at 23 1C. Full-length Pc, esc and Sce were cloned into
pNDamMyc4, and plasmids were transfected into Kc cells as described25.
DamID and data analysis. DamID experiments were performed as
described4,9. Two independent Pc DamID samples were hybridized to custom
60-mer oligonucleotide tiling arrays, using different dye orientations (Nimble-
Gen Systems). The tiling arrays contained the entire chromosomes 2L and 4,
the first 11 Mb of chromosome 2R, the first 2 Mb of chromosome X,
heterochromatic sequences of chromosome 2, rDNA genes and randomly
generated control probes. Log2-transformed ratios were normalized using
lowess normalization26. To define Pc domains, we first removed random noise
by calculating a running mean with a window size of 2 kb, which is the
approximate resolution of DamID27. Next, we calculated the running mean on
a randomly shuffled chromosome and its distribution for estimation of the
random error. As a threshold level for binding or no binding, we chose three
times the standard deviation of this randomly sampled set. A Pc domain
was defined in the noise-filtered data as 410 kb of consecutive probes that
had a value greater than the threshold. In addition, we allowed for five
consecutive probes to fall beneath the threshold and not break the domain.
Notably, when this method is applied on a randomly shuffled chromosome, not
a single domain is identified, indicating that our definition of Pc domains is
highly stringent.
For each PcG fusion, samples from four independent experiments were
hybridized to Fly12K microarrays (Genomics Facility, Fred Hutchinson Cancer
Research Center) containing 11,857 cDNA clones from the D. melanogaster
Gene Collection, Release 1 and 2, and 126 additional cDNAs from the
Northwest flychip consortium. After removal of low-quality, redundant or
ambiguous proves, hybridization data of 9,994 probes were used for analyses.
Probe annotations were obtained as described9 using Berkeley Drosophila
Genome Project Release 3.1. For statistical analysis, all measured ratios were
log2 transformed and normalized to the median value of the entire array.
Normalized log2 ratios from four independent experiments were averaged.
For the expression state of PcG bound chromatin, we defined Pc and esc
target genes by testing whether log2-transformed ratios were significantly
aP < 2 × 10–16
P < 2 × 10–16 P < 2 × 10–16
P < 2 × 10–16 P < 2 × 10–16 P < 1 × 10–11b
c d
e f
0.8
Pro
port
ion
of a
ctiv
e ge
nes
Mea
n lo
g 2 ex
pres
sion
leve
l
log2 modification level
Rel
ativ
e fr
eque
ncy
0.4
0.0 0.0
0.0
0.4
0.4
0.8
0.8
0.0
0.4
0.8
1.2
1.2
Pc
–2 –1 0 1 2 –2 –1 0 1 2
0.0
0.4
0.8
0.0
0.4
0.8
1.2
–2 –1 0 1 2 –2 –1 0 1 2
N Nesc Pc
H3Ac H3K4me2
P < 2 × 10–16 P < 2 × 10–16
H3Ac H3K4me2
N Nesc
Figure 4 PcG-bound chromatin is transcriptionally repressed. (a–b) Transcrip-
tional activity assessed by the fraction of active genes (a) and transcription
levels of expressed genes (b). (c–f) Histone modification levels of either Pc
targets (c,d) or esc targets (e,f) versus non-targets, represented as density
plots. In all panels, Pc target genes are black (n ¼ 644), esc targets dark
gray (n ¼ 463) and non-targets gray. P values were calculated by
hypergeometric test (e) or Mann-Whitney U-test (a–d,f).
1.5
1.0
0.5
0.0
–0.5
0.0
0.4
0.8a b
–0.4
–1.0
E0 E3 L P M
Mean r = 0.8
CG11629CG1421CG1428
CG8853sobdrm
Mean r = –0.066
Developmental stage
log 2
rela
tive
expr
essi
on le
vel
F E0 E3 L P M F
Figure 5 Developmental expression of genes located within Pc domains.
(a,b) Developmental expression profiles of genes within the 2L_21713465
(a) and 2L_3523335 (b) Pc domains. X-axes show developmental stage:
embryos at 0 h (E0) and 3 h (E3), larva (L), pupa (P), adult male (M) and
adult female (F).
NATURE GENETICS VOLUME 38 [ NUMBER 6 [ JUNE 2006 69 7
LET TERS
greater than 0 using the CyberT algorithm28 followed by a correction for
multiple testing29 adjusting the false discovery rate to 1%.
For the coordinate expression analysis, we used developmental expression
data from reported whole-genome expression profiles of six developmental
stages of D. melanogaster13. Hybridization intensities of each exon probe were
log2 transformed, data from replicate experiments were averaged and exon
probes corresponding to the same gene were averaged to obtain a single log2-
transformed expression level for each gene at each developmental stage. Next,
these values were normalized to the mean log2 value, first by developmental
stage and then by gene. Pearson correlation coefficients (r) were calculated for
the expression data of each possible gene pair within a Pc domain and were
averaged to yield one mean r.
We used the Quontology algorithm17 to identify functional groups of genes
that are significantly bound by PcG proteins. A Z score is calculated for the
functional categories from the FlyBase Gene Ontology database, which mea-
sures the deviation of the average log2-transformed binding ratio for genes in a
category from the genome-wide average, in units of the standard deviation. A
Z score larger than 4 is significant (P o 0.05).
All further statistical analyses were performed using the R package for
statistical computing; scripts are available upon request.
ChIP-on-chip. The ChIP method used was an adaptation of a previously
published protocol30. Approximately 2 � 108 Kc cells were cross-linked
by the addition of formaldehyde (1% final concentration). The reaction
was quenched with 125 mM glycine, and the cells were washed twice
with PBS. Subsequently, the cells were incubated for 5 min on ice in
1.5 ml lysis buffer (1% (v/v) SDS; 10 mM EDTA, pH 8.0; 50 mM Tris-HCl,
pH 8.0; protease inhibitor cocktail (Roche)). The cell lysates were sonicated
and centrifuged to remove cell debris. We used the equivalent of 1 � 106 cells
for one immunoprecipitation. Chromatin samples were diluted tenfold in
immunoprecipitation buffer (1% Triton X-100; 1.2 mM EDTA; 16.7 mM
Tris-HCl, pH 8.0; 167 mM NaCl; protease inhibitor cocktail (Roche)) and
precleared with 10 ml 50% protein A Sepharose slurry containing 1 mg/ml BSA
for 1 h agitation at 4 1C. Precleared chromatin was incubated with 2 mg
antibody overnight at 4 1C. All subsequent ChIP steps were performed
as described30.
We used the following antibodies for ChIP analyses: anti-H3K27me3
(Upstate), anti-H3K4me3 (Abcam) and anti-IgG (Jackson ImmunoResearch).
The data were normalized to a parallel ChIP with antibody to H3 (Abcam) to
correct for differences in nucleosome density.
For the microarray analysis, immunoprecipitated DNA fragments were
amplified using linker-modified PCR. Randomly sheared DNA fragments were
made blunt using T4 DNA polymerase for 15 min at 16 1C in 10 mM Tris-HCl,
pH 8.0; 50 mM NaCl; 10 mM MgCl2; 1 mM DTT and 0.1 mM dNTPs.
Adapter ligation, PCR amplification, labeling and hybridization were carried
out according to the DamID protocol3,9. Data analysis was performed as
described above.
URLs. Gene annotations and ontology data were downloaded from: http://
www.flybase.org. The R package can be downloaded from http://www.
r-project.org. The Gene Expression Omnibus (GEO) is found at http://
www.ncbi.nlm.nih.gov/geo/.
a b c
esc
ScePc Description
Pc targets/GO categoryEctoderm developmentPc targets/GO category
Ectoderm developmentMesoderm development
Ventral cord developmentNeurogenesis
Neuroblast cell fate determination
Eye morphogenesisSkeletal development
Tracheal system developmentMuscle developmentHindgut morphogenesisRegulation of transcription
Cation transportAnion transportCarbohydrate transportCarbohydrate metabolismSteroid metabolism
Transcription factor activity
Electron transporter activity
Endothelin-converting enzyme activityCarboxylesterase activityAlkaline phosphatase activityChitinase activityOxidoreductase activityG-protein coupled receptor activityStructural constituent of peritrophic membrane
MicrosomeMembraneIntegral to membraneExtracellular regionNucleus
Z score Pc targets Pc targets annotated in ectoderm development
Other developmental processes
Transcriptional regulators/nuclear proteins
(Integral) membrane/secreted proteins
Other GO categories
–4 0 4 8
Glucose transporter activitySerine-type endopeptidase activityTrypsin activity
RNAP II transcription factor activitySpecific RNAP II transcription factor activitySpecific transcriptional repressor activity
Transmission of nerve impulseG-protein coupled receptor protein signaling pathway
Negative regulation of transcriptionNegative regulation of transcription from Pol II promoter
Ganglion mother cell fate determination
Leg disc proximal/distal pattern formation
Bio
logi
cal p
roce
ssM
olec
ular
func
tion
Com
part
men
t
0 40 80 0 20 40
Figure 6 Functional role of PcG proteins in Kc cells. (a) Gene Ontology (GO) fingerprint of PcG protein binding profiles. Z scores were calculated using the
Quontology algorithm17. GO categories with a Z score 44 (P o 0.05) in at least one binding profile are listed. GO subcategories are indicated at right.(b) Number of Pc targets for each GO category are indicated by the bars. (c) Bar plots show Pc targets annotated in ‘ectoderm development’. Black bars
represent the total number of Pc targets in this category. The colored bars represent targets that are also annotated in other categories.
6 98 VOLUME 38 [ NUMBER 6 [ JUNE 2006 NATURE GENETICS
LET TERS
Note: Supplementary information is available on the Nature Genetics website.
ACKNOWLEDGMENTSWe thank G. Cavalli and N. Negre for sharing unpublished results and forproviding Pc and ph antibodies; J. Muller for Su(z)12 antibody; F. Greil,H. Pickersgill, M. Heimerickx and the NKI Central Microarray Facility fortechnical assistance; M. Fornerod for sharing R scripts; H. Bussemaker for theQuontology script; the Genomics Facility of the Fred Hutchinson CancerResearch Center for preparing 12K cDNAs and A. Brinkman and H. Stunnenbergfor help with ChIP. This work was supported by a Veni fellowship from theNetherlands Organisation for Scientific Research (NWO) to B.T., a fellowshipfrom the Association of International Cancer Research to I.M., a Centre forBiomedical Genetics grant to M.v.L. and support from the European ‘Epigenome’Network of Excellence and a European Young Investigator (EURYI) award to B.v.S.
COMPETING INTERESTS STATEMENTThe authors declare that they have no competing financial interests.
Published online at http://www.nature.com/naturegenetics/
Reprints and permissions information is available online at http://npg.nature.com/
reprintsandpermissions/
1. Ringrose, L. & Paro, R. Epigenetic regulation of cellular memory by the Polycomb andTrithorax group proteins. Annu. Rev. Genet. 38, 413–443 (2004).
2. Valk-Lingbeek, M.E., Bruggeman, S.W. & van Lohuizen, M. Stem cells and cancer; thepolycomb connection. Cell 118, 409–418 (2004).
3. van Steensel, B., Delrow, J. & Henikoff, S. Chromatin profiling using targeted DNAadenine methyltransferase. Nat. Genet. 27, 304–308 (2001).
4. van Steensel, B. & Henikoff, S. Identification of in vivo DNA targets of chromatinproteins using tethered dam methyltransferase. Nat. Biotechnol. 18, 424–428(2000).
5. Sun, L.V. et al. Protein-DNA interaction mapping using genomic tiling path microarraysin Drosophila. Proc. Natl. Acad. Sci. USA 100, 9428–9433 (2003).
6. Negre, N. et al. Chromosomal distribution of PcG proteins during Drosophila develop-ment. PLoS Biol. 4, e170 (2006).
7. Ringrose, L., Rehmsmeier, M., Dura, J.M. & Paro, R. Genome-wide predictionof Polycomb/Trithorax response elements in Drosophila melanogaster. Dev. Cell 5,759–771 (2003).
8. Dejardin, J. et al. Recruitment of Drosophila Polycomb group proteins to chromatin byDSP1. Nature 434, 533–538 (2005).
9. Greil, F. et al. Distinct HP1 and Su(var)3–9 complexes bind to sets of developmentallycoexpressed genes depending on chromosomal location. Genes Dev. 17, 2825–2838(2003).
10. Schubeler, D. et al. Genome-wide DNA replication profile for Drosophila melanogaster:a link between transcription and replication timing. Nat. Genet. 32, 438–442 (2002).
11. Schubeler, D. et al. The histone modification pattern of active genes revealed throughgenome-wide chromatin analysis of a higher eukaryote. Genes Dev. 18, 1263–1271(2004).
12. Breiling, A., O’Neill, L.P., D’Eliseo, D., Turner, B.M. & Orlando, V. Epigenome changesin active and inactive polycomb-group-controlled regions. EMBO Rep. 5, 976–982(2004).
13. Stolc, V. et al. A gene expression map for the euchromatic genome of Drosophilamelanogaster. Science 306, 655–660 (2004).
14. Ficz, G., Heintzmann, R. & Arndt-Jovin, D.J. Polycomb group proteincomplexes exchange rapidly in living Drosophila. Development 132, 3963–3976(2005).
15. Hernandez-Munoz, I., Taghavi, P., Kuijl, C., Neefjes, J. & vanLohuizen, M. Assocationof BMI1 with Polycomb bodies is dynamic and requires PRC2/EZH2 and the main-tenance DNA methyltransferase DNMT1. Mol. Cell. Biol. 25, 11047–11058 (2005).
16. Harris, M.A. et al. The Gene Ontology (GO) database and informatics resource. NucleicAcids Res. 32, D258–D261 (2004).
17. Lascaris, R. et al. Hap4p overexpression in glucose-grown Saccharomycescerevisiae induces cells to enter a novel metabolic state. Genome Biol. 4, R3(2003).
18. Kirmizis, A., Bartley, S.M. & Farnham, P.J. Identification of the polycomb group proteinSU(Z)12 as a potential molecular target for human cancer therapy. Mol. Cancer Ther.2, 113–121 (2003).
19. Klebes, A. et al. Regulation of cellular plasticity in Drosophila imaginal disc cells bythe Polycomb group, trithorax group and lama genes. Development 132, 3753–3765(2005).
20. Leung, C. et al. Bmi1 is essential for cerebellar development and is overexpressed inhuman medulloblastomas. Nature 428, 337–341 (2004).
21. Maurange, C. & Paro, R. A cellular memory module conveys epigenetic inheritance ofhedgehog expression during Drosophila wing imaginal disc development. Genes Dev.16, 2672–2683 (2002).
22. Paul, S.M., Ternet, M., Salvaterra, P.M. & Beitel, G.J. The Na+/K+ ATPase is requiredfor septate junction function and epithelial tube-size control in the Drosophila trachealsystem. Development 130, 4963–4974 (2003).
23. Warren, J.T. et al. Molecular and biochemical characterization of two P450 enzymes inthe ecdysteroidogenic pathway of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA99, 11043–11048 (2002).
24. Baehrecke, E.H. Steroid regulation of programmed cell death during Drosophiladevelopment. Cell Death Differ. 7, 1057–1062 (2000).
25. Henikoff, S., Ahmad, K., Platero, J.S. & van Steensel, B. Heterochromatic depositionof centromeric histone H3-like proteins. Proc. Natl. Acad. Sci. USA 97, 716–721(2000).
26. Cleveland, W.S. Robust locally weighted regression and smoothing scatterplots. J. Am.Stat. Assoc. 74, 829–836 (1979).
27. van Steensel, B., Delrow, J. & Bussemaker, H.J. Genomewide analysis of DrosophilaGAGA factor target genes reveals context-dependent DNA binding. Proc. Natl. Acad.Sci. USA 100, 2580–2585 (2003).
28. Baldi, P. & Long, A.D. A Bayesian framework for the analysis of microarray expressiondata: regularized t-test and statistical inferences of gene changes. Bioinformatics 17,509–519 (2001).
29. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practicaland powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Methodolog. 57,289–300 (1995).
30. Rietveld, L.E., Caldenhoven, E. & Stunnenberg, H.G. In vivo repression of an erythroid-specific gene by distinct corepressor complexes. EMBO J. 21, 1389–1397(2002).
NATURE GENETICS VOLUME 38 [ NUMBER 6 [ JUNE 2006 69 9
LET TERS
CORR IGENDA
Corrigendum: Genome-wide profiling of PRC1 and PRC2 Polycomb chromatin binding in Drosophila melanogasterBas Tolhuis, Inhua Muijrers, Elzo de Wit, Hans Teunissen, Wendy Talhout, Bas van Steensel & Maarten van LohuizenNature Genetics 38, 694–699 (2006); published online 20 April 2006; corrected after print 9 June 2006
In the version of this article initially published, the order of the first three authors was incorrect. The correct order should be Bas Tolhuis, Elzo de Wit, Inhua Muijrers. This error has been corrected in the PDF version of the article.