INTEGRATED BIOLOGY
Supplement to Nature Publishing Group Journals, December 2012
Epigenetics
SPONSORED CONTENT
Produced with support from:

Nature Publishing Group - Epigenetics Collection


DESCRIPTION

A collection of articles dedicated to epigenetic phenomena and their role in human disease and the possibility of exploring epigenetic pathways as targets for drug discovery



The next frontier of personalized therapeutics – epigenetics

Epizyme is at the forefront of drug discovery and development, leveraging discoveries to

New Approaches to Personalized Cancer Treatment

www.epizyme.com

Personalized Therapeutics • The Power of Epigenetics

Each day we learn more about the biology of cancer and how genetic mutations in cancer cells cause them to grow and spread. This is the age of personalized therapeutics – medicines that home in on

NATURE REPRINT COLLECTION Epigenetics S1

Sponsor’s foreword

Disease-Driving Genes and Molecules to Target Them Create the Promise of Personalized Therapeutics

Personalized therapeutics pairs the identification of disease-causing genes with the discovery of innovative therapies that target key genetic aberrations. Today, a common precursor to personalized therapeutics is the discovery of small molecule tool compounds that bridge pathobiological understanding and chemical biology, thus establishing that drug-like chemicals can effectively modulate disease-relevant targets. From this crucible emerge bona fide drug discovery efforts that eventually lead to new medicines.

Histone methylation stands at the dawn of such a transformational moment. Histones are methylated by a class of enzymes known as protein methyltransferases (PMTs) and this methyl marking is reversed by another class of enzymes known as histone demethylases (HDMs). The reprints collected here illustrate how genetic alterations in both specific PMTs and HDMs lead to pathogenic changes that drive particular human cancers. A clear roadmap for translating these biological observations into systematic drug discovery for the PMTs is also described within the collection. This approach has led to potent and selective small molecule inhibitors of several PMTs that display cancer-specific cell killing effects; these are exemplified in the reprint collection by inhibitors of G9a and of EZH2. The collection also highlights how selective PMT inhibitors may play a role in regenerative medicine, by mediating the conversion of differentiated cells into a more stem cell-like state of pluripotency.

The ultimate test of these targets will come from clinical trials of specific enzyme inhibitors in genetically defined patients, with a relevant companion diagnostic. The first such clinical trial of a PMT inhibitor, EPZ-5676, began in September 2012; other specific inhibitors are likely to enter clinical trials shortly. How well the pathobiology of histone methylation translates into meaningful new medicines for genetically defined patients will be exciting to see.

Robert A. Copeland, Ph.D.

Chief Scientific Officer, Epizyme, Inc.

Protein methyltransferases as a target class for drug discovery. Copeland, RA et al. Nat. Rev. Drug Discov. 8, 724–732 (2009).

Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Morin, RD et al. Nature 476, 298–303 (2011).

A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells. Vedadi, M et al. Nat. Chem. Biol. 7, 566–574 (2011).

Chromatin-modifying enzymes as modulators of reprogramming. Onder, TT et al. Nature 483, 598–602 (2012).

Novel mutations target distinct subgroups of medulloblastoma. Robinson, G et al. Nature 488, 43–48 (2012).

A selective inhibitor of EZH2 blocks H3K27 methylation and kills mutant lymphoma cells. Knutson, SK et al. Nat. Chem. Biol. 8, 890–896 (2012).

This supplement is published by Nature Publishing Group on behalf of Epizyme. All content has been chosen by Epizyme.

Nature Reprint Collection Epigenetics

publisher: Melanie Brazil
editor: Terry L. Sheppard, Amy Donner
copyeditor: Yasmin Tayag
senior art editor: Erin Dewalt
production editor: Carol Evangelista
production manager: Mabel Eng, Kelly Hopkins
marketing: Nazly De La Rosa
sponsorship: Reya Silao, Yvette Smith
sponsor: Epizyme

Nature - WWW.NATURE.COM/NATURE
The Macmillan Building, 4 Crinan Street, London N1 9XW, UK
Tel: +44 (0) 20 7833 4000
e-mail: [email protected]

CITING THE COLLECTION
All papers have been previously published in Nature, Nature Reviews Drug Discovery and Nature Chemical Biology. Please use original citation, which can be found on the table of contents.

VISIT THE COLLECTION
www.nature.com/reprintcollections/epigenetics

SUBSCRIPTIONS AND CUSTOMER SERVICES
For UK/ROW (excluding Japan): Nature Publishing Group, Subscriptions, Brunel Road, Basingstoke, Hants, RG21 6XS, UK. Tel: +44 (0) 1256 329242.

Subscriptions and customer services for Americas, including Canada, Latin America and the Caribbean: Nature Publishing Group, Subscription Department, PO Box 5161, Brentwood, TN 37024-5161, USA. Tel: +1 (800) 524 2688 (US) or +1 615 850 5315 (outside the US).

COVER ART PROVIDED BY EPIZYME, INC.

Methyltransferases are enzymes that facilitate the transfer of a methyl (–CH3) group to

a reaction mechanism in which the nucleophilic acceptor site attacks the electrophilic carbon of S-adenosyl-L-methionine (SAM) in an SN2 displacement reaction that produces a methylated biomolecule and S-adenosyl-L-homocysteine (SAH) as a byproduct. Methylation reactions are essential transformations in small-molecule metabolism,

dynamic and reversible methylation of amino acid side chains of chromatin proteins, particularly within the N-terminal tail of histone proteins, has revealed the importance of methyl ‘marks’ as regulators of gene expression. Human protein methyltransferases (PMTs) fall into two major families, protein lysine methyltransferases (PKMTs) and protein arginine methyltransferases (PRMTs), that are distinguishable by the amino acid that accepts the methyl group and by the conserved sequences of their respective catalytic domains. Given their involvement in many cellular processes, PMTs have attracted attention as potential drug targets, spurring the search for small-molecule PMT

chemical probes that are active in cells will be required to elucidate the biological roles of PMTs and serve as potent leads for PMT-focused drug development.

The human protein methyltransferases

associated with the underlying causes of multiple human diseases. Our patient-driven approach to the creation of personalized therapeutics represents the future of cancer therapy, creating better therapeutics matched to the right patients more quickly and at lower cost than traditional approaches.

www.epizyme.com

[email protected]
Dr. Victoria Richon, Vice President, Biological [email protected]

4. Daigle, S.R. et al. Cancer Cell 20, 53–65 (2011).
5. Ferguson, A.D. et al. Structure 19, 1262–1273 (2011).
6. Mori, S. et al. Bioorg. Med. Chem. 18, 8158–8166 (2010).
7. Kubicek, S. et al. Mol. Cell 25, 473–481 (2007).

11. Allan, M. et al. Bioorg. Med. Chem. Lett. 19, 1218–1223 (2009).
12. Huynh, T. et al. Bioorg. Med. Chem. Lett. 19, 2924–2927 (2009).
13. Yao, Y. et al. J. Am. Chem. Soc. 133, 16746–16749 (2011).
14. Cheng, D. et al. J. Med. Chem. 54, 4928–4932 (2011).
15. Greiner, D. et al. Nat. Chem. Biol. 1, 143–145 (2005).

[Figure: phylogenetic trees of the predicted human protein lysine methyltransferases (PKMTs) and protein arginine methyltransferases (PRMTs); individual gene labels omitted.]

Protein lysine methyltransferases (PKMTs)
The phylogenetic tree shows 51 genes predicted to encode PKMTs, which are positioned in the tree on the basis of the similarities of their amino acid sequences (ref. 1). This tree excludes one validated PKMT, DOT1L, which lacks a SET domain, the catalytic domain

Protein arginine methyltransferases (PRMTs)
The human PRMT phylogenetic tree comprises 45 predicted enzymes including the PKMT DOT1L (ref. 1). There are two major types of PRMT; both catalyze the formation of monomethylarginine (Rme1) but distinct reaction mechanisms yield symmetric (Rme2s) or asymmetric (Rme2a)

© 2011 Nature Publishing Group

Available online at: http://www.nature.com/nchembio/poster/hpm.pdf

• A selection of small-molecule PMT inhibitors with some target selectivity is shown (minimally validated in quantitative in vitro assays) around the trees along with the name of the molecule, citation information and the chemical structure (refs 2,3).

• DOT1L is a validated therapeutic target for mixed-lineage leukemia (ref. 4). The majority of these leukemias result from chromosomal rearrangements that cause aberrant recruitment of DOT1L to MLL-fusion target genes. Inhibition of DOT1L with EPZ004777 demonstrated that these leukemia cells are addicted to DOT1L activity and established proof of concept for DOT1L inhibition as a therapeutic option.

• Priority therapeutic targets also include MLL for leukemias; SETD1B and CARM1 for neurodegeneration; as well as EZH2, SMYD3 and EHMTs for multiple cancers.

• Additional PMTs have been implicated in human diseases and may yet emerge as therapeutic targets.

• Elucidation of the biological function of PMTs would be facilitated by the development of selective chemical probes; this is a compelling area for future chemical biology studies, given the paucity of available tool compounds, many of which remain to be validated in cells. In particular, the emergence of these enzyme families as therapeutic targets suggests that such chemical probes could yield lead compounds for drug development.

• Understanding the mechanisms

especially for nonhistone targets, merits additional study.

Targeting PMTs

[Figure: chemical structures of selected small-molecule PMT inhibitors: AZ505 (ref. 5); BIX-01294 (ref. 7); chaetocin (ref. 15); UNC-0224 (ref. 8); UNC-0638 (ref. 9); EPZ004777 (ref. 4); IBAO (ref. 13); and unnamed compounds from refs 6, 10, 11, 12 and 14.]

Human protein methyltransferases

Nature Chemical Biology presents a poster highlighting the human protein methyltransferase families, the small molecules known to target them and the prospects for PMT-focused drug development.

Human protein methyltransferases (PMTs) transfer one or more methyl groups to the sidechains of lysine or arginine amino acids. Given their roles in regulating gene expression and driving disease, PMTs have attracted attention as potential drug targets. Several classes of small-molecule PMT inhibitors have been identified, but new specific chemical probes that are active in cells will be required to elucidate the biological roles of PMTs and serve as leads for PMT-focused drug development.

FREE POSTER

Poster sponsored by:

Download the Poster today by visiting: www.nature.com/nchembio/poster/hpm


Cellular differentiation is one of the most important components of embryonic development and postnatal tissue maintenance and repair. Almost every nucleated cell of the human body contains the same, complete complement of genomic DNA. However, the ability of pluripotent cells to differentiate into distinct lineages and ultimate cell types is conferred by specific patterns of transcription of subsets of genes in the genome. A large and growing body of data support the idea that epigenetic regulation of gene transcription is a key biological determinant of cellular differentiation1.

The chromosomes within eukaryotic cell nuclei are packaged together with structural proteins (histones) to form the complex known as chromatin. Four major histones (H2A, H2B, H3 and H4) form an octameric, disc-shaped aggregate (composed of two copies of each histone type) around which the DNA is wound to form regular, repeating units known as nucleosomes (FIG. 1). Chromatin exists in two main conformational states: a condensed state (heterochromatin) in which the nucleosomes are tightly packed together and gene transcription is largely repressed; and a more relaxed state (euchromatin) in which gene transcription is activated. Epigenetic regulation of gene transcription is mediated by selective, enzyme-catalysed, covalent modification of specific nucleotides within the genes and also by post-translational modifications of the histone proteins (FIG. 1). Modification of DNA can silence gene transcription directly, whereas the post-translational modifications of histones control the conformational transition between the heterochromatin and euchromatin states2. The enzymes that covalently modify DNA and histones are therefore the key mediators of epigenetic regulation of gene transcription.

Several putative epigenetic enzymes have recently been identified and, in some cases, their catalytic mechanism and three-dimensional structures have been determined2,3.

Epigenetic enzymes that are encoded in the human genome catalyse group transfer reactions and can be categorized according to the nature of the covalent modifications that they catalyse and by the substrates upon which they act. In humans, these enzymes include DNA methyltransferases (DNMTs), which methylate the carbon atom at the 5-position of cytosine in the CpG dinucleotide sites of the genome; protein methyltransferases (PMTs), which methylate lysine or arginine residues on histones and other proteins; protein demethylases, which remove methyl groups from the lysine or arginine residues of proteins; histone acetyltransferases, which acetylate lysine residues on histones and other proteins; histone deacetylases (HDACs), which remove acetyl groups from lysine residues on histones and other proteins; ubiquitin ligases, which add ubiquitin to lysine residues on histones and other proteins; and specific kinases that phosphorylate serine residues on histones4,5.

Given that small-molecule inhibitors have been successfully designed for HDACs and DNMTs (discussed below), it is likely that additional families of histone-modifying enzymes will also be amenable to small-molecule modulation. The opportunity for chemical-probe development and pharmacological control of epigenetic gene transcription is therefore of great interest in the fields of basic biology and drug discovery4,5. Indeed, the role of these enzymes in human diseases is highlighted by the recent approval of three drugs by the US Food and Drug Administration6 that act as selective, small-molecule inhibitors of HDACs and DNMTs for the treatment of specific human cancers (TABLE 1).

In recent years, there have been numerous reviews in the literature that highlight different aspects of the biology, disease association and/or structural biology of various histone-modifying enzymes. In this Review, we

Epizyme, Inc., 840 Memorial Drive, Cambridge, Massachusetts 02139, USA.
Correspondence to R.A.C. e-mail: [email protected]
doi:10.1038/nrd2974

Epigenetics
A stably heritable change in phenotype or gene expression in an organism or cell, resulting from changes in a chromosome that are not caused by a change in DNA sequence. The process of eukaryotic cell differentiation is one of the most well-known examples of epigenetic changes.

Protein methyltransferases as a target class for drug discovery

Robert A. Copeland, Michael E. Solomon and Victoria M. Richon

Abstract | The protein methyltransferases (PMTs) — which methylate protein lysine and arginine residues and have crucial roles in gene transcription — are emerging as an important group of enzymes that play key parts in normal physiology and human diseases. The collection of human PMTs is a large and diverse group of enzymes that have a common mechanism of catalysis. Here, we review the biological, biochemical and structural data that together present PMTs as a novel, chemically tractable target class for drug discovery.

R E V I E W S

724 | SEPTEMBER 2009 | VOLUME 8 www.nature.com/reviews/drugdisc

First published in Nature Reviews Drug Discovery 8, 724–732 (2009); doi: 10.1038/nrd2974


[Figure 1 labels: histone tail modifications Me, Ac, Ub and P on lysine (K), serine (S) and arginine (R) residues. Enzyme families: PKMTs (52 family members); PRMTs (≥10 family members); demethylases (~30 family members); deacetylases (~18 family members); acetyltransferases; kinases; ligases.]

Target class
A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM
S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human disease

In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer10. In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15. Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2.

Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving herpes simplex virus and human T-lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.
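The histone-site shorthand used in this section (for example H3K27, lysine 27 of histone H3) follows a regular pattern: histone name, one-letter residue code, residue position. As a minimal illustrative sketch, assuming only the four core histones and the lysine (K) and arginine (R) codes discussed in this Review, the shorthand could be decoded as follows. The names `HistoneSite` and `parse_site` are hypothetical and are not taken from any published tool.

```python
import re
from typing import NamedTuple


class HistoneSite(NamedTuple):
    """A decoded histone modification site (hypothetical helper type)."""
    histone: str   # core histone name, e.g. "H3"
    residue: str   # "lysine" or "arginine"
    position: int  # residue number within the histone


# Assumption: only the residue codes methylated by PKMTs (K) and PRMTs (R).
_RESIDUES = {"K": "lysine", "R": "arginine"}

# Histone name, one-letter residue code, then the residue position.
_SITE = re.compile(r"(H2A|H2B|H3|H4)([KR])(\d+)")


def parse_site(label: str) -> HistoneSite:
    """Decode shorthand such as 'H3K27' into its three components."""
    match = _SITE.fullmatch(label)
    if match is None:
        raise ValueError(f"not a recognised histone site label: {label!r}")
    histone, code, position = match.groups()
    return HistoneSite(histone, _RESIDUES[code], int(position))


if __name__ == "__main__":
    for label in ("H3K27", "H3K9", "H3R17"):
        print(label, "->", parse_site(label))
```

Labels that do not fit the pattern, such as the enzyme name EZH2, are rejected with a `ValueError` rather than guessed at.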

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.


SAH
S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human, non-SET domain PKMT can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site

The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism3,24,30. The lone pair electrons of a nitrogen atom (from lysine or arginine) ‘attack’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a pentacoordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.
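The methyl transfer described above can be summarized as a reaction scheme. The following is a schematic sketch only: R–NH2 stands for a neutral lysine side-chain amine (an illustrative choice; an arginine nitrogen would react analogously), the double dagger marks the pentacoordinate carbon transition state, and the protonation state shown for the product is a simplifying assumption.

```latex
\[
\mathrm{R{-}NH_2} \;+\; \mathrm{SAM}
\;\longrightarrow\;
\bigl[\,\mathrm{R{-}H_2N \cdots CH_3 \cdots SAH}\,\bigr]^{\ddagger}
\;\longrightarrow\;
\mathrm{R{-}NH_2^{+}{-}CH_3} \;+\; \mathrm{SAH}
\]
```

In this shorthand the positive charge that starts on the methylsulphonium centre of SAM ends on the methylated nitrogen, so the scheme is charge-balanced as written.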

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery31,32. Furthermore, despite binding a

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*

DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome

HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.


common natural ligand, the ATP-binding pockets of protein kinases have afforded medicinal chemists a rich diversity of chemical scaffolds, which have resulted in a range of drug molecules of varying degrees of target selectivity32. Similarly, the commonality of SAM use by the PMTs belies the structural, biological and pathological diversity of these enzymes. From the perspective of drug discovery and medicinal chemistry, the diversity of SAM-binding modes and catalytic mechanisms of these enzymes is of key importance.

A common structural feature of PKMTs and PRMTs that distinguishes these enzymes from other proteins that use SAM is the overall architecture of their extended catalytic active sites. This generally consists of a SAM-binding pocket that is accessed from one face of the protein, and a narrow, hydrophobic, acceptor (that is, lysine or arginine) channel that extends to the opposite face of the protein surface, such that the two substrates enter the active site from opposite sides of the enzyme surface.

Table 2 | Selected PKMTs and PRMTs that have shown an association with human cancers

Protein methyltransferase | Methylation substrates | Cancers | Cancer association | Refs

SUV39H1 | H3K9 | Colon cancer | Increased expression in colorectal tumours; associated with transcriptional repression | 53
EHMT2 | H3K9 | Lung, prostate and hepatocellular carcinoma | Increased expression in lung cancer cell lines; regulates centrosome duplication, presumably through chromatin structure | 54,55
MLL | H3K4 | Leukaemia | Chromosomal aberrations involving MLL are a cause of acute leukaemias; the SET domain is lost in translocation | 56–58
NSD1 | H3K36 | Acute myeloid leukaemia | Translocation fuses NSD1 to nucleoporin 98 in human acute myeloid leukaemia | 59
WHSC1 | H3K36 and H4K20 | Myeloma | Translocated and increased expression in myeloma; associated with transcriptional regulation | 60–62
WHSC1L1 | H3K4 | Lung and breast cancers, and childhood acute myeloid leukaemia | Amplified in lung cancer and breast cancer; translocation with nucleoporin 98; mediates transcriptional activation | 63–64
DOT1L | H3K79 | MLL-rearranged leukaemias | Recruited by MLL fusion partners MLLT1, MLLT2, MLLT3 and MLLT10 to homeobox genes; associated with transcriptional activation and elongation | 11,66,67
SMYD3 | H3K4 | Breast, liver, colon and gastric cancers | Overexpressed in multiple tumour types; associated with transcriptional activation | 68,69
EZH2 | H3K27 | Breast, prostate, colon, gastric, bladder and liver cancers, melanoma and lymphoma | Amplified and increased expression in several tumour types; a member of the polycomb repressive complex 2; associated with transcriptional repression | 10,15,70,71
SETD7 | H3K4 | Breast cancers | SET7-mediated methylation stabilizes the oestrogen receptor and is necessary for the recruitment of the oestrogen receptor to its target genes and target gene transactivation | 72
PRDM14 | No known substrate | Breast cancers | Amplified and overexpressed in cancers; associated with transcriptional repression | 73
CARM1 | H3R17, EP300–CBP and NCOA3 | Breast and prostate cancers | Increased expression correlates with androgen independence in human prostate carcinoma; overexpressed in breast tumours and associated with transcriptional activation | 74,75
PRMT5 | H3R8, p53, SNRPD1, SNRPD3 and SUPT5H | Lymphoma | PRMT5 expression and H3R8 methylation levels are increased in lymphoid cancer cells; PRMT5 mediates p53 methylation, which promotes cell arrest rather than cell death; H4R3 methylation promotes recruitment of DNMT3A, subsequent promoter CpG methylation and gene silencing | 12,76

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); CBP, CREB-binding protein; DNMT3A, DNA (cytosine-5-)-methyltransferase 3α; EHMT2, euchromatic histone-lysine N-methyltransferase 2 (also known as G9A and KMT1C); EP300, E1A-binding protein p300; EZH2, enhancer of zeste homologue 2 (also known as KMT6); DOT1L, DOT1-like, histone H3 methyltransferase (also known as KMT4); MLL, myeloid, lymphoid or mixed-lineage leukaemia (also known as KMT2A); MLLT1, myeloid, lymphoid or mixed-lineage leukemia, translocated to 1; NCOA3, nuclear receptor coactivator 3; NSD1, nuclear receptor-binding SET domain protein 1; PKMT, protein lysine methyltransferase; PRDM14, PR domain-containing protein 14; PRMT, protein arginine methyltransferase; SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SMYD3, SET and MYND domain-containing protein 3; SNRPD1, small nuclear ribonucleoprotein D1 polypeptide 16kDa (also known as SMD1); SNRPD3, small nuclear ribonucleoprotein D3 polypeptide 18kDa (also known as SMD3); SUPT5H, suppressor of Ty 5 homologue; SUV39H1, suppressor of variegation 3–9 homologue 1 (also known as KMT1A); WHSC1, Wolf–Hirschhorn syndrome candidate 1 (also known as MMSET and NSD2); WHSC1L1, Wolf–Hirschhorn syndrome candidate 1-like protein 1 (also known as NSD3).


[Figure 2 graphic not reproduced. Panel a: PMT-catalysed methyl transfer from SAM to a lysine side chain, yielding SAH. Panel b: generalized SN2 scheme showing the nucleophile (Nu–), the leaving group (LG) and the transition state (‡).]

Crystallographic studies have revealed two distinct binding modes for SAM or SAH in the cofactor-binding pockets of PMTs (REF. 24). For the SET domain PKMTs that have been co-crystallized with SAM or SAH, it is known that the cofactor adopts a ‘U-shaped’ configuration within the active site (FIG. 3) that aligns the methylsulphonium cation of SAM at the base of the narrow lysine channel, in perfect juxtaposition to the ε-amino group of the acceptor lysine residue, which facilitates group transfer. This U-shaped configuration is induced by a conserved aspartate or glutamate residue that binds to the ribose hydroxyl groups, and a positively charged lysine or arginine residue that forms a salt bridge with the carboxylate of SAM. In striking contrast to the U-shaped configuration that is adopted by the cofactor when bound to PKMTs, SAM bound within the active site of PRMTs adopts an extended configuration that resembles the extended SAM configuration seen in the DNA methyltransferases; again, the binding motif results in alignment of the SAM methylsulphonium cation with the base of the acceptor-binding channel. Another distinction between cofactor binding within the PKMTs and the PRMTs is that, in PRMTs, dimer formation seems to be a crucial component of SAM binding and catalysis, whereas this is not the case for PKMTs (REFS 3,24). The mechanistic consequences of obligate dimer formation in the PRMTs are not yet clear, but it may be involved in multiple methylations of the arginine residue.

From the above discussion, it could be concluded that the configuration of the bound SAM is structurally related to the identity of the methyl acceptor nitrogen species upon which the enzymes act; that is, U-shaped for PKMTs and extended for PRMTs. However, data on the non-SET domain PKMT, DOT1L, do not support this conclusion. In the co-crystal structures of human DOT1L bound to SAM (REF. 33), and the yeast homologue Dot1p bound to SAH (REF. 34), the cofactor is bound in the extended configuration, similar to that seen in the PRMTs. Additionally, the solvent-exposed surface area of the bound cofactor in DOT1L is more similar to that seen in the PRMTs than the PKMTs, as is the overall amino-acid sequence around the cofactor-binding pocket (REFS 24,33). Therefore, from a structural perspective, DOT1L seems to link the PKMT and PRMT groups of PMTs.

The discovery and optimization of selective drugs for the PMTs will depend not only on the static structure of the active site of the enzyme, as revealed through crystallographic studies, but also on the structural dynamics of the active site that accompany catalytic turnover (REFS 27,35). Studies on the kinetic mechanisms of the PMTs may provide some information in this area.

Some of the SET domain PKMTs, such as SETD7, perform a single round of catalysis on a lysine residue, resulting in a mono-methylated product, whereas other SET domain PKMTs catalyse multiple rounds of methylation on a specific lysine residue. Crystallographic studies suggest that the difference between single-turnover and multiple-turnover SET domain enzymes results from the degree of steric crowding and hydrogen-bonding patterns in the lysine-binding channel of these enzymes (REFS 3,24,36,37). In particular, the identity of an aromatic residue within the lysine-binding pocket seems to be the key determinant of the multiplicity of lysine methylation. In the PKMT DIM5, this residue is a phenylalanine (F281), and the enzyme can tri-methylate the acceptor lysine residue of its protein substrate. The corresponding residue in SETD7 is a tyrosine (Y305), and this enzyme can only mono-methylate its protein substrate. Remarkably, the mutant F281Y transforms DIM5 into a mono-methylating PKMT, and the corresponding mutant Y305F in SETD7 results in an enzyme that is capable of multiple rounds of lysine methylation (REF. 38). These mutagenesis results have been extended to the PKMT euchromatic histone lysine N-methyltransferase 2 (EHMT2; also known as G9A) (REF. 39), and the ‘tyrosine–phenylalanine switch’ seems to be a general determinant of product specificity among the SET domain PKMTs (REF. 24). Molecular dynamics and hybrid quantum mechanical–molecular mechanical studies also suggest a key role for bound water molecules (a water channel) in the extent of lysine methylation by PKMTs (REF. 30).

Figure 2 | PMT-catalysed methylation of proteins by an SN2 reaction with SAM as the methyl donor. The protein methyltransferases (PMTs) catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) to a nitrogen atom of lysine or arginine side chains to form S-adenosyl-L-homocysteine (SAH; also known as AdoHcy). a | The methyl group (shown in red) of the SAM sulphonium cation is ‘attacked’ by the lone pair electrons of a lysine (shown here) or arginine (not shown) side-chain nitrogen atom. The reaction results in transfer of the methyl group to the attacking nitrogen atom and the production of SAH from the reaction cofactor. b | A more generalized chemical scheme of a bimolecular nucleophilic substitution (SN2) group transfer reaction, illustrating the attacking nucleophile (Nu–; lysine or arginine in the case of PMTs), the leaving group (LG; the methyl group in the case of PMTs), and the transient but essential formation of a penta-coordinate carbon transition state (‡).

General base catalysis: A mechanism that can occur in enzyme catalysis, in which a basic group accepts protons from a substrate molecule, usually to stabilize a charged transition-state species.

An outstanding question that has yet to be reconciled with the mechanistic hypothesis described above is how the quaternary nitrogen atom is deprotonated to generate a neutral amine methyl acceptor. At physiological pH, the lysine amine is protonated (the negative logarithm of the acid dissociation constant (pKa) of the side chain amine is ~10.8 (REF. 35)), and so there are no lone pair electrons to act as the attacking nucleophile in the SN2-mediated methyl transfer reaction. A potential mechanism of deprotonation is through general base catalysis. However, inspection of the amino acids in the active sites of PKMTs reveals no obvious basic side chains that could act in this capacity. Another hypothesis is that the solvent acts as a proton ‘sponge’; however, this seems inconsistent with the fact that the lysine side chain is buried deeply in the protein, with no clear access to bulk solvent. An alternative hypothesis has recently been proposed, based on molecular dynamics simulations (REF. 30). According to this model, binding of SAM and protein substrates creates a ‘water shuttle’ that can remove a proton from the buried lysine side chain and ferry this proton along a contiguous chain of water molecules to be deposited into the bulk solvent. Additionally, the electrostatic repulsion created by the quaternary nitrogen atom and the positively charged SAM cofactor lowers the pKa of the lysine side-chain amine to ~8.2, thereby facilitating this deprotonation process. Furthermore, the water shuttle hypothesis provides an alternative mechanism to explain the differences in extent of lysine methylation by the PKMTs. The molecular dynamics studies suggest that the ability to form a water shuttle will determine the extent of methylation that is catalysed by a given enzyme. For example, simulations of SETD7-mediated catalysis suggest that mono-methylation of lysine prevents re-formation of a new water shuttle, and so this enzyme terminates catalysis after one round of methylation. The same simulations suggest that other PKMTs, such as the ribulose bisphosphate carboxylase–oxygenase large subunit methyltransferase, can readily re-form the water shuttle, leading to multiple rounds of methylation.
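The effect of the proposed pKa shift can be made concrete with the Henderson–Hasselbalch relationship. The short Python sketch below estimates the fraction of lysine side-chain amine that is neutral (and therefore nucleophilic) for the two pKa values quoted above. The pH of 7.4 is an assumed value for "physiological pH", and the function name is illustrative only.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of an amine in the neutral (deprotonated) form at a given pH.

    Henderson-Hasselbalch: [B]/[BH+] = 10**(pH - pKa),
    so the neutral fraction is 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PH = 7.4  # assumption: a typical value for "physiological pH"

free_lysine = fraction_deprotonated(10.8, PH)   # pKa quoted for the lysine side chain
bound_lysine = fraction_deprotonated(8.2, PH)   # pKa proposed within the enzyme complex

print(f"neutral fraction at pKa 10.8: {free_lysine:.2e}")
print(f"neutral fraction at pKa  8.2: {bound_lysine:.3f}")
print(f"fold increase: {bound_lysine / free_lysine:.0f}")
```

Under these assumptions, the neutral fraction rises from roughly 0.04% to about 14%, which illustrates why a pKa shift of this size would make deprotonation far more accessible.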

Enzymes that perform multiple rounds of catalysis on a macromolecular substrate can do so by one of two mechanisms: a distributive enzyme mechanism, in which each round of catalysis results in macromolecular product dissociation and rebinding, or a processive mechanism, in which multiple rounds of catalysis proceed before dissociation of the macromolecular product. PMTs use both of these mechanisms: some SET domain PKMTs that perform multiple rounds of lysine methylation have been found to use a processive mechanism (REFS 3,24), whereas DOT1L has been shown to perform multiple rounds of H3K79 methylation through a non-processive (distributive) mechanism (REF. 40).
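One practical consequence of the two mechanisms is the pattern of methylation states seen in the released substrate pool. The Python sketch below is a deliberately simplified illustration, not a model of any particular enzyme: it assumes pseudo-first-order kinetics, the same rate constant `k` for every step, and up to three methyl groups per lysine. All names and values are hypothetical.

```python
import math


def distributive(k: float, t: float) -> dict:
    """Pool fractions for me0 -> me1 -> me2 -> me3 with release after each step.

    Analytic solution for sequential first-order steps with equal rate k.
    """
    x = k * t
    me0 = math.exp(-x)
    me1 = x * math.exp(-x)
    me2 = 0.5 * x * x * math.exp(-x)
    return {"me0": me0, "me1": me1, "me2": me2, "me3": 1.0 - me0 - me1 - me2}


def processive(k: float, t: float) -> dict:
    """Pool fractions when all methyl groups are added before product release.

    Intermediates never leave the enzyme, so only me0 and me3 are observed.
    """
    me0 = math.exp(-k * t)
    return {"me0": me0, "me1": 0.0, "me2": 0.0, "me3": 1.0 - me0}


for name, model in (("distributive", distributive), ("processive", processive)):
    pool = model(k=1.0, t=1.0)
    print(name, {state: round(frac, 3) for state, frac in pool.items()})
```

In this sketch the distributive mechanism accumulates mono- and di-methylated intermediates in the pool, whereas the processive mechanism shows only unmethylated and fully methylated substrate, which is the kind of signature used experimentally to tell the two apart.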

The PRMTs are also capable of performing multiple rounds of arginine methylation to produce either mono- or di-methylated arginine products. The PRMTs that have been studied so far follow an ordered, sequential mechanism in which SAM binds before the arginine-containing substrate, and di-methyl arginine production occurs through a processive mechanism (REF. 3). On the basis of product specificity, PRMTs can be subdivided into two types: type I PRMTs, which produce an asymmetrical N,N-dimethyl arginine; and type II PRMTs, which produce a symmetrical N,N′-dimethyl arginine (REF. 3).

The variations in active-site structure and chemical mechanism that are summarized above reflect a target class with the potential for substantial chemical diversity among small-molecule modulators of individual enzymes in the class. Therefore, the opportunity for the development of different chemotypes that compete with the common, natural ligands of these enzymes (for example, SAM, lysine and arginine), and can be modified to produce enzyme-selective inhibitors, seems promising.

Figure 3 | Variations in the configuration of SAM or SAH bound within the active sites of different PMTs. a | The representative conformation shown for the protein arginine methyltransferases (PRMTs) was taken from the crystal structure of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) bound to coactivator-associated arginine methyltransferase 1 (CARM1) (REF. 49). b | The conformation shown for DOT1-like, histone H3 methyltransferase (DOT1L) was taken from the crystal structure of S-adenosyl-L-methionine (SAM; also known as AdoMet) bound to this protein (REF. 33). c | The representative conformation shown for the protein lysine methyltransferases (PKMTs) was taken from the crystal structure of SAH bound to SET domain-containing lysine methyltransferase 8 (REF. 50). Carbon atoms are represented by grey circles; nitrogen atoms are represented by blue circles; oxygen atoms are represented by red circles; and sulphur atoms are represented by yellow circles.

Structure–activity relationship: The relationship between the chemical structure of a compound and its pharmacological activity.

Known inhibitors of PMTs

Table 3 | Chemical structures and biochemical data for small-molecule inhibitors of PMTs (chemical structure drawings are not reproduced here)

Compound | Mechanism and potency | Selectivity* | Refs
--- | --- | --- | ---
SAH | Product of the reactions catalysed by PMTs; IC50 values range from 0.1 to 20 µM | Non-selective | 77,78
Sinefungin | Natural product analogue of SAM and SAH; IC50 values range from 0.1 to 20 µM | Non-selective | 36
Chaetocin | SAM-competitive inhibitor of SUV39; IC50 = 0.6 µM | >4-fold | 79
BIX-01294 | SAM-non-competitive inhibitor of EHMT2; IC50 = 2.7 µM | >4-fold | 80
Methylgene compound 7a of REF. 45 | CARM1 inhibitor; IC50 = 60 nM | >100-fold for PRMT1 and SETD7 | 45
Bristol–Myers Squibb compound 7f of REF. 47 | CARM1 inhibitor; IC50 = 40 nM | >100-fold for PRMT1 and PRMT3 | 46,47

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); EHMT2, euchromatic histone lysine N-methyltransferase 2 (also known as G9A and KMT1C); IC50, half-maximal inhibitory concentration; PMT, protein methyltransferase; PRMT, protein arginine methyltransferase; SAH, S-adenosyl-L-homocysteine (also known as AdoHcy); SAM, S-adenosyl-L-methionine (also known as AdoMet); SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SUV39, suppressor of variegation 3–9. *Selectivity is given as the ratio of the IC50 value for the most potent inhibition at a non-target PMT over the IC50 value for the primary target. See REF. 27.

Despite the convergence of data concerning PMTs, the search for potent, selective inhibitors of these enzymes has only recently begun in earnest. Some indirect approaches to inhibiting or depleting PMTs have been reported. For example, the antiviral compound 3-deazaneplanocin (DZNep) inhibits the enzyme SAH hydrolase and thereby increases intracellular levels of the universal product of PMTs, SAH (REF. 41). Product inhibition by SAH would therefore be expected for all PMTs and other SAM-dependent enzymes, with the degree of inhibition for specific enzymes being related to their relative inhibition constant (Ki) and Michaelis constant (Km) values for SAH and SAM, respectively (REF. 27). Similarly, the activity of all SAM-dependent enzymes in a cell could be reduced by blocking SAM biosynthesis, for example by inhibiting dihydrofolate reductase or SAM synthase, which are two enzymes involved in SAM biosynthesis (REF. 42). Also, the pan-HDAC inhibitor panobinostat has recently been shown to cause depletion of cellular levels of the PMT EZH2 (REF. 43). Although the mechanism by which this occurs is not yet fully understood, an approach of this type would nevertheless deplete the protein levels of EZH2 and so abolish the PMT catalytic activity of the enzyme along with any other non-enzymatic functions of EZH2.
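The dependence of SAH product inhibition on the Ki for SAH and the Km for SAM can be illustrated with a simple rate equation. The Python sketch below assumes that SAH competes directly with SAM for the cofactor pocket (classical competitive inhibition) and that the protein substrate is saturating; both are simplifying assumptions, and every concentration and constant shown is hypothetical.

```python
def relative_rate(sam: float, sah: float, km_sam: float, ki_sah: float) -> float:
    """v/Vmax for competitive inhibition: [S] / (Km * (1 + [I]/Ki) + [S])."""
    return sam / (km_sam * (1.0 + sah / ki_sah) + sam)


def fractional_inhibition(sam: float, sah: float, km_sam: float, ki_sah: float) -> float:
    """Fraction of activity lost relative to the same enzyme without SAH."""
    uninhibited = relative_rate(sam, 0.0, km_sam, ki_sah)
    inhibited = relative_rate(sam, sah, km_sam, ki_sah)
    return 1.0 - inhibited / uninhibited


# Hypothetical cellular concentrations and enzyme constants (all in µM).
SAM, SAH = 10.0, 5.0
enzyme_a = fractional_inhibition(SAM, SAH, km_sam=1.0, ki_sah=1.0)
enzyme_b = fractional_inhibition(SAM, SAH, km_sam=10.0, ki_sah=1.0)

print(f"enzyme A (Km = 1 µM):  {enzyme_a:.1%} inhibited")
print(f"enzyme B (Km = 10 µM): {enzyme_b:.1%} inhibited")
```

With the same Ki for SAH, the hypothetical enzyme with the weaker apparent affinity for SAM loses more of its activity, which is why raising SAH levels would be expected to affect different SAM-dependent enzymes to different degrees.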

Direct inhibitors of PMTs have recently been reviewed (REF. 4), along with other probes of histone-modifying enzymes. Some natural ligands for these enzymes have been known for some time, including the reaction product, SAH, and a natural inhibitor isolated from Streptomyces spp. cultures, sinefungin (TABLE 3). More selective inhibitors have been identified for SUV39 (chaetocin; reported half-maximal inhibitory concentration (IC50) = 0.6 μM) and for EHMT2 (BIX-01294; reported IC50 = 1.6 μM), but no further optimization of these compounds has been reported to date (REF. 4). A co-crystal structure of BIX-01294 bound to EHMT2 has recently been published (REF. 44). Surprisingly, the compound was found to bind to the enzyme non-competitively with respect to SAM, in a groove that is normally occupied by a portion of the protein substrate.

More recently, two groups have reported potent, selective, pyrazole-based inhibitors of the PRMT CARM1 (REFS 45–47) (TABLE 3). These compounds are the first examples of inhibitors of a specific PMT that are effective at nanomolar concentrations and display >100-fold selectivity for the primary target over related enzymes. The compound series reported by Methylgene (REF. 45) was found to be inactive in cellular assays; no cellular data have been reported for the compound series from Bristol–Myers Squibb (REFS 46,47). Therefore, although an exciting first step has been made towards developing selective inhibitors of PMTs, substantial work remains to be done before these findings can be translated into pharmacologically tractable species.
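Table 3 defines selectivity as the IC50 for the most potently inhibited non-target PMT divided by the IC50 for the primary target. A minimal Python sketch of that calculation follows; the enzyme labels and IC50 values are placeholders chosen for illustration and are not measured data for any compound discussed here.

```python
def selectivity_fold(target_ic50_nm: float, off_target_ic50_nm: dict) -> float:
    """Selectivity = IC50 of the most potently inhibited off-target / target IC50."""
    most_potent_off_target = min(off_target_ic50_nm.values())
    return most_potent_off_target / target_ic50_nm


# Hypothetical panel: IC50 values in nM for two non-target enzymes.
panel = {"off_target_X": 7500.0, "off_target_Y": 20000.0}
fold = selectivity_fold(target_ic50_nm=50.0, off_target_ic50_nm=panel)

print(f"selectivity: {fold:.0f}-fold")
```

Because the ratio uses the most sensitive off-target enzyme, the reported figure is a lower bound on selectivity across the panel that was tested, and it says nothing about enzymes outside that panel.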

The paucity of potent, selective, pharmacologically tractable inhibitors of the PMTs creates a crucial therapeutic gap that medicinal chemists should strive to fill. As described here, the pathobiological relevance of these enzymes, together with the structural and mechanistic information that suggests their druggability as a target class, converge to make the PMTs an attractive and important class of novel enzymes for contemporary drug discovery.

Conclusions

There is a growing body of evidence that enzymes in this target class have important pathogenic roles in human diseases. The structures and enzymatic mechanisms of the PMTs support the view that pharmacological modulation of these enzymes by small-molecule inhibitors will be an effective means of therapeutic intervention in cancer and numerous other unmet medical needs. The discovery of small-molecule inhibitors of PMTs as starting points for drug development should clearly be a key focus of new research efforts. Beyond this goal, there are many opportunities to use chemical probes of PMT function to define the underlying biology and pathobiology that are associated with protein modification by these enzymes. The nature of PMT catalysis, and the available structural information about these enzymes, should facilitate the discovery of PMT ligands through mechanism- and structure-guided discovery methods (REF. 48), as well as methods that do not rely on mechanistic knowledge, such as high-throughput screening of diverse chemical libraries.

A key remaining question when considering the PMTs as a drug discovery target class is whether or not selective inhibition of particular enzymes can be achieved through targeting the SAM-binding pocket. This is analogous to the question that hindered the early acceptance of protein kinases as drug targets: whether it was possible to achieve selectivity among the ATP-binding pockets of the kinases. In retrospect, it is clear that the diversity of binding-site architecture and the binding-site dynamics associated with enzyme catalysis provide ample opportunities for selective inhibition of kinases through medicinal chemistry efforts. Will the same be true for the SAM-binding pockets of PMTs? Ultimately, structure–activity relationship profiles, selectivity and collateral inhibition of off-target enzymes by PMT inhibitors will need to be determined empirically. Despite these limitations, it is our hope that the data presented here will help to stimulate systematic exploration of the human PMT target class towards the goal of developing selective inhibitors of PMTs as therapeutic agents for human diseases.

1. Strahl, B. D. & Allis, C. D. The language of covalent histone modifications. Nature 403, 41–45 (2000).

2. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693–705 (2007). A thorough overview of post-translational modifications on core histones, the enzymes that mediate these modifications and the biological functions of the modification.

3. Smith, B. C. & Denu, J. M. Chemical mechanisms of histone lysine and arginine modifications. Biochim. Biophys. Acta 1789, 45–57 (2008). An excellent review of the chemical biology of lysine- and arginine-modifying enzymes.

4. Cole, P. A. Chemical probes for histone-modifying enzymes. Nature Chem. Biol. 4, 590–597 (2008).

5. Keppler, B. R. & Archer, T. K. Chromatin-modifying enzymes as therapeutic targets — Part 1. Expert Opin. Ther. Targets. 12, 1301–1312 (2008).

6. Pray, L. At the flick of a switch: epigenetic drugs. Chem. Biol. 15, 640–641 (2008).

7. Jones, P. A. & Baylin, S. B. The epigenomics of cancer. Cell 128, 683–692 (2007).

8. Wilson, C. B., Rowell, E. & Sekimata, M. Epigenetic control of T-helper-cell differentiation. Nature Rev. Immunol. 9, 91–105 (2009).

9. Tsankova, N., Renthal, W., Kumar, A. & Nestler, E. J. Epigenetic regulation in psychiatric disorders. Nature Rev. Neurosci. 8, 355–367 (2007).

10. Kleer, C. G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl Acad. Sci. USA 100, 11606–11611 (2003).

11. Krivtsov, A. V. et al. H3K79 methylation profiles define murine and human MLL-AF4 leukemias. Cancer Cell 14, 355–368 (2008).

12. Jansson, M. et al. Arginine methylation regulates the p53 response. Nature Cell Biol. 10, 1431–1439 (2008).

13. Hong, H. et al. Aberrant expression of CARM1, a transcriptional coactivator of androgen receptor, in the development of prostate carcinoma and androgen-independent status. Cancer 101, 83–89 (2004).

14. Schneider, R., Bannister, A. J. & Kouzarides, T. Unsafe SETs: histone lysine methyltransferases and cancer. Trends Biochem. Sci. 27, 396–402 (2002).

15. Simon, J. A. & Lange, C. A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).

16. Dillon, S. C., Zhang, X., Trievel, R. C. & Cheng, X. The SET-domain protein superfamily: protein lysine methyltransferases. Genome Biol. 6, 227 (2005).

17. Ryu, H. et al. ESET/SETDB1 gene expression and histone H3 (K9) trimethylation in Huntington’s disease. Proc. Natl Acad. Sci. USA 103, 19176–19181 (2006).

18. Cheng, D., Cote, J., Shaaban, S. & Bedford, M. T. The arginine methyltransferase CARM1 regulates the coupling of transcription and mRNA processing. Mol. Cell 25, 71–83 (2007).

19. Li, Y. et al. Role of the histone H3 lysine 4 methyltransferase, SET7/9, in the regulation of NF-κB-dependent inflammatory genes. Relevance to diabetes and inflammation. J. Biol. Chem. 283, 26771–26781 (2008).

20. Covic, M. et al. Arginine methyltransferase CARM1 is a promoter-specific regulator of NF-κB-dependent gene expression. EMBO J. 24, 85–96 (2005).

21. Hassa, P. O., Covic, M., Bedford, M. T. & Hottiger, M. O. Protein arginine methyltransferase 1 coactivates NF-κB-dependent gene expression synergistically with CARM1 and PARP1. J. Mol. Biol. 377, 668–678 (2008).

22. Huang, J. et al. Trimethylation of histone H3 lysine 4 by Set1 in the lytic infection of human herpes simplex virus 1. J. Virol. 80, 5740–5746 (2006).


23. Jeong, S. J. et al. Coactivator-associated arginine methyltransferase 1 enhances transcriptional activity of the human T-cell lymphotropic virus type 1 long terminal repeat through direct interaction with Tax. J. Virol. 80, 10036–10044 (2006).

24. Cheng, X., Collins, R. E. & Zhang, X. Structural and sequence motifs of protein (histone) methylation enzymes. Annu. Rev. Biophys. Biomol. Struct. 34, 267–294 (2005).

25. Goldstein, D. M., Gray, N. S. & Zarrinkar, P. P. High-throughput kinase profiling as a platform for drug discovery. Nature Rev. Drug Discov. 7, 391–397 (2008).

26. Mook, R. A. The importance and complexity of target class selectivity in drug discovery. The American Association for Cancer Research Education Book 223–226 (The American Association for Cancer Research, Philadelphia, 2005).

27. Copeland, R. A. Evaluation of Enzyme Inhibitors in Drug Discovery: A Guide for Medicinal Chemists and Pharmacologists (Wiley, Hoboken, 2005).

28. Cheng, D. et al. Small molecule regulators of protein arginine methyltransferases. J. Biol. Chem. 279, 23892–23899 (2004).

29. Allis, C. D. et al. New nomenclature for chromatin-modifying enzymes. Cell 131, 633–636 (2007).

30. Zhang, X. & Bruice, T. C. Enzymatic mechanism and product specificity of SET-domain protein lysine methyltransferases. Proc. Natl Acad. Sci. USA 105, 5728–5732 (2008). This work provides a detailed theoretical basis to explain the substrate specificity of the protein lysine methyltransferases.

31. Fedorov, O. et al. A systematic interaction map of validated kinase inhibitors with Ser/Thr kinases. Proc. Natl Acad. Sci. USA 104, 20523–20528 (2007).

32. Karaman, M. W. et al. A quantitative analysis of kinase inhibitor selectivity. Nature Biotech. 26, 127–132 (2008).

33. Min, J., Feng, Q., Li, Z., Zhang, Y. & Xu, R. M. Structure of the catalytic domain of human DOT1L, a non-SET domain nucleosomal histone methyltransferase. Cell 112, 711–723 (2003).

34. Sawada, K. et al. Structure of the conserved core of the yeast Dot1p, a nucleosomal histone H3 lysine 79 methyltransferase. J. Biol. Chem. 279, 43296–43306 (2004).

35. Copeland, R. A. Enzymes: A Practical Introduction to Structure, Mechanism and Data Analysis 2nd edn (Wiley, Hoboken, 2000).

36. Couture, J. F., Hauk, G., Thompson, M. J., Blackburn, G. M. & Trievel, R. C. Catalytic roles for carbon–oxygen hydrogen bonding in SET domain lysine methyltransferases. J. Biol. Chem. 281, 19280–19287 (2006).

37. Collins, R. E. et al. In vitro and in vivo analyses of a Phe/Tyr switch controlling product specificity of histone lysine methyltransferases. J. Biol. Chem. 280, 5563–5570 (2005). This study provides a structural basis for the wide range of lysine methylation patterns that is achieved by different SET domain PKMTs.

38. Trievel, R. C., Flynn, E. M., Houtz, R. L. & Hurley, J. H. Mechanism of multiple lysine methylation by the SET domain enzyme Rubisco LSMT. Nature Struct. Biol. 10, 545–552 (2003).

39. Zhang, X. et al. Structural basis for the product specificity of histone lysine methyltransferases. Mol. Cell 12, 177–185 (2003).

40. Frederiks, F. et al. Nonprocessive methylation by Dot1 leads to functional redundancy of histone H3K79 methylation states. Nature Struct. Mol. Biol. 15, 550–557 (2008).

41. Chiang, P. K. Biological effects of inhibitors of S-adenosylhomocysteine hydrolase. Pharmacol. Ther. 77, 115–134 (1998).

42. Bender, C. M., Zingg, J.-M. & Jones, P. A. DNA methylation as a target for drug design. Pharm. Res. 15, 175–187 (1998).

43. Fiskus, W. et al. Panobinostat treatment depletes EZH2 and DNMT1 levels and enhances decitabine mediated de-repression of JunB and loss of survival of human acute leukemia cells. Cancer Biol. Ther. 8, 939–950 (2009).

44. Chang, Y. et al. Structural basis for G9a-like protein lysine methyltransferase inhibition by BIX-01294. Nature Struct. Mol. Biol. 16, 312–317 (2009).

45. Allan, M. et al. N-Benzyl-1-heteroaryl-3-(trifluoromethyl)-1H-pyrazole-5-carboxamides as inhibitors of co-activator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 1218–1223 (2009). The first examples of potent, drug-like inhibitors of a human PMT.

46. Purandare, A. V. et al. Pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 18, 4438–4441 (2008).

47. Huynh, T. et al. Optimization of pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 2924–2927 (2009).

48. Copeland, R. A., Gontarek, R. & Luo, L. in Textbook of Drug Design and Discovery 4th edn Ch. 12 (eds. Krogsgaard-Larsen, P., Madsen, U. & Stromgaard, K.) 378–407 (Taylor and Francis, New York, 2009).

49. Troffer-Charlier, N., Cura, V., Hassenboehler, P., Moras, D. & Cavarelli, J. Functional insights from structures of coactivator-associated arginine methyltransferase 1 domains. EMBO J. 26, 4391–4401 (2007).

50. Couture, J.-F., Collazo, E., Brunzelle, J. S. & Trievel, R. C. Structural and functional analysis of SET8, a histone H4 Lys-20 methyltransferase. Genes Dev. 19, 1455–1465 (2005).

51. Ma, W. W. & Adjei, A. A. Novel agents on the horizon for cancer therapy. CA Cancer J. Clin. 59, 111–137 (2009). A review of the current knowledge on how aberrant epigenetic mechanisms can contribute to the development of cancer and the progress in developing therapies that target these mechanisms.

52. Cortez, C. C. & Jones, P. A. Chromatin, cancer and drug therapies. Mutat. Res. 647, 44–51 (2008).

53. Kang, M. Y. et al. Association of the SUV39H1 histone methyltransferase with the DNA methyltransferase 1 at mRNA expression level in primary colorectal cancer. Int. J. Cancer 121, 2192–2197 (2007).

54. Watanabe, H. et al. Deregulation of histone lysine methyltransferases contributes to oncogenic transformation of human bronchoepithelial cells. Cancer Cell Int. 8, 15 (2008).

55. Kondo, Y. et al. Downregulation of histone H3 lysine 9 methyltransferase G9a induces centrosome disruption and chromosome instability in cancer cells. PLoS One 3, e2037 (2008).

56. Tkachuk, D., Kohler, S. & Cleary, M. L. Involvement of a homolog of Drosophila trithorax by 11q23 chromosomal translocations in acute leukemias. Cell 71, 691–700 (1992).

57. Gu, Y. et al. The t(4;11) chromosome translocation of human acute leukemias fuses the ALL-1 gene, related to Drosophila trithorax, to the AF-4 gene. Cell 71, 701–708 (1992).

58. Liedtke, M. & Cleary, M. L. Therapeutic targeting of MLL. Blood 113, 6061–6068 (2009).

59. Wang, G. G., Cai, L., Pasillas, M. P. & Kamps, M. P. NUP98-NSD1 links H3K36 methylation to Hox-A gene activation and leukaemogenesis. Nature Cell Biol. 9, 804–812 (2007).

60. Marango, J. et al. The MMSET protein is a histone methyltransferase with characteristics of a transcriptional corepressor. Blood 111, 3145–3154 (2008).

61. Kim, J. Y. et al. Multiple-myeloma-related WHSC1/MMSET isoform RE-IIBP is a histone methyltransferase with transcriptional repression activity. Mol. Cell Biol. 28, 2023–2034 (2008).

62. Lauring, J. et al. The multiple myeloma associated MMSET gene contributes to cellular adhesion, clonogenic growth, and tumorigenicity. Blood 111, 856–864 (2008).

63. Angrand, P. O. et al. NSD3, a new SET domain-containing gene, maps to 8p12 and is amplified in human breast cancer cell lines. Genomics 74, 79–88 (2001).

64. Rosati, R. et al. NUP98 is fused to the NSD3 gene in acute myeloid leukemia associated with t(8;11)(p11.2;p15). Blood 99, 3857–3860 (2002).

65. Tonon, G. et al. High-resolution genomic profiles of human lung cancer. Proc. Natl Acad. Sci. USA 102, 9625–9630 (2005).

66. Okada, Y. et al. hDOT1L links histone methylation to leukemogenesis. Cell 121, 167–178 (2005).

67. Bitoun, E., Oliver, P. L. & Davies, K. E. The mixed-lineage leukemia fusion partner AF4 stimulates RNA polymerase II transcriptional elongation and mediates coordinated chromatin remodeling. Hum. Mol. Genet. 16, 92–106 (2007).

68. Hamamoto, R. et al. SMYD3 encodes a histone methyltransferase involved in the proliferation of cancer cells. Nature Cell Biol. 6, 731–740 (2004).

69. Hamamoto, R. et al. Enhanced SMYD3 expression is essential for the growth of breast cancer cells. Cancer Sci. 97, 113–118 (2006).

70. Bracken, A. P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).

71. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).

72. Subramanian, K. et al. Regulation of estrogen receptor alpha by the SET7 lysine methyltransferase. Mol. Cell 30, 336–347 (2008).

73. Nishikawa, N. et al. Gene amplification and overexpression of PRDM14 in breast cancers. Cancer Res. 67, 9649–9657 (2007).

74. Majumder, S., Liu, Y., Ford, O. H., 3rd, Mohler, J. L. & Whang, Y. E. Involvement of arginine methyltransferase CARM1 in androgen receptor function and prostate cancer cell viability. Prostate 66, 1292–1301 (2006).

75. Frietze, S., Lupien, M., Silver, P. A. & Brown, M. CARM1 regulates estrogen-stimulated breast cancer growth through up-regulation of E2F1. Cancer Res. 68, 301–306 (2008).

76. Zhao, Q. et al. PRMT5-mediated methylation of histone H4R3 recruits DNMT3A, coupling histone and DNA methylation in gene silencing. Nature Struct. Mol. Biol. 16, 304–311 (2009).

77. Patnaik, D. et al. Substrate specificity and kinetic mechanism of mammalian G9a histone H3 methyltransferase. J. Biol. Chem. 279, 53248–53258 (2004).

78. Chin, H. G., Patnaik, D., Esteve, P.-O., Jacobsen, S. E. & Pradhan, S. Catalytic properties and kinetic mechanism of human recombinant lys-9 histone H3 methyltransferase SUV39H1: participation of the chromodomain in enzymatic catalysis. Biochemistry 45, 3272–3284 (2006).

79. Greiner, D., Bonaldi, T., Eskeland, R., Roemer, E. & Imhof, A. Identification of a specific inhibitor of the histone methyltransferase SU(VAR)3–9. Nature Chem. Biol. 1, 143–145 (2005).

80. Kubicek, S. et al. Reversal of H3K9me2 by a small-molecule inhibitor for the G9a histone methyltransferase. Mol. Cell 25, 473–481 (2007).

Acknowledgements

We are grateful to K. Shiosaki, C. T. Walsh, H. R. Horvitz, Y. Zhang, and R. Gould for their insights, constant support and encouragement. We also thank K. Boater, E. Olhava, L. Jin and T. Luly for expert help in preparation of this manuscript.

Competing interests statement

The authors declare competing financial interests: see web version for details.

DATABASES

UniProtKB: http://www.uniprot.org
CARM1 | DOT1L | EHMT2 | EZH2 | PRMT1 | SETD7 | SETD8 | SETD1A | SETDB1 | SUZ12

FURTHER INFORMATION

Author’s homepage: http://www.epizyme.com


Protein methyltransferases as a target class for drug discovery

Robert A. Copeland, Michael E. Solomon and Victoria M. Richon

Epizyme, Inc., 840 Memorial Drive, Cambridge, Massachusetts 02139, USA. Correspondence to R.A.C. e-mail: [email protected] doi:10.1038/nrd2974

First published in Nature Reviews Drug Discovery 8, 724–732 (2009); doi: 10.1038/nrd2974

Abstract | The protein methyltransferases (PMTs), which methylate protein lysine and arginine residues and have crucial roles in gene transcription, are emerging as an important group of enzymes that play key parts in normal physiology and human diseases. The collection of human PMTs is a large and diverse group of enzymes that have a common mechanism of catalysis. Here, we review the biological, biochemical and structural data that together present PMTs as a novel, chemically tractable target class for drug discovery.

Cellular differentiation is one of the most important components of embryonic development and postnatal tissue maintenance and repair. Almost every nucleated cell of the human body contains the same, complete complement of genomic DNA. However, the ability of pluripotent cells to differentiate into distinct lineages and ultimate cell types is conferred by specific patterns of transcription of subsets of genes in the genome. A large and growing body of data supports the idea that epigenetic regulation of gene transcription is a key biological determinant of cellular differentiation (REF. 1).

The chromosomes within eukaryotic cell nuclei are packaged together with structural proteins (histones) to form the complex known as chromatin. Four major histones (H2A, H2B, H3 and H4) form an octameric, disc-shaped aggregate (composed of two copies of each histone type) around which the DNA is wound to form regular, repeating units known as nucleosomes (FIG. 1). Chromatin exists in two main conformational states: a condensed state (heterochromatin) in which the nucleosomes are tightly packed together and gene transcription is largely repressed; and a more relaxed state (euchromatin) in which gene transcription is activated. Epigenetic regulation of gene transcription is mediated by selective, enzyme-catalysed, covalent modification of specific nucleotides within the genes and also by post-translational modifications of the histone proteins (FIG. 1). Modification of DNA can silence gene transcription directly, whereas the post-translational modifications of histones control the conformational transition between the heterochromatin and euchromatin states (REF. 2). The enzymes that covalently modify DNA and histones are therefore the key mediators of epigenetic regulation of gene transcription.

Several putative epigenetic enzymes have recently been identified and, in some cases, their catalytic mechanism and three-dimensional structures have been determined (REFS 2,3).

Epigenetic enzymes that are encoded in the human genome catalyse group transfer reactions and can be categorized according to the nature of the covalent modifications that they catalyse and by the substrates upon which they act. In humans, these enzymes include DNA methyltransferases (DNMTs), which methylate the carbon atom at the 5-position of cytosine in the CpG dinucleotide sites of the genome; protein methyltransferases (PMTs), which methylate lysine or arginine residues on histones and other proteins; protein demethylases, which remove methyl groups from the lysine or arginine residues of proteins; histone acetyltransferases, which acetylate lysine residues on histones and other proteins; histone deacetylases (HDACs), which remove acetyl groups from lysine residues on histones and other proteins; ubiquitin ligases, which add ubiquitin to lysine residues on histones and other proteins; and specific kinases that phosphorylate serine residues on histones (REFS 4,5).

Given that small-molecule inhibitors have been successfully designed for HDACs and DNMTs (discussed below), it is likely that additional families of histone-modifying enzymes will also be amenable to small-molecule modulation. The opportunity for chemical-probe development and pharmacological control of epigenetic gene transcription is therefore of great interest in the fields of basic biology and drug discovery (REFS 4,5). Indeed, the role of these enzymes in human diseases is highlighted by the recent approval of three drugs by the US Food and Drug Administration (REF. 6) that act as selective, small-molecule inhibitors of HDACs and DNMTs for the treatment of specific human cancers (TABLE 1).

In recent years, there have been numerous reviews in the literature that highlight different aspects of the biology, disease association and/or structural biology of various histone-modifying enzymes. In this Review, we focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

Epigenetics

A stably heritable change in phenotype or gene expression in an organism or cell, resulting from changes in a chromosome that are not caused by a change in DNA sequence. The process of eukaryotic cell differentiation is one of the most well-known examples of epigenetic changes.

Target class

A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM

S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

[Figure 1 artwork labels: histone-tail marks Me, Ac, Ub and P on K, S and R residues. Enzyme families: PKMTs, 52 family members; PRMTs, ≥10 family members; demethylases, ~30 family members; deacetylases, ~18 family members; acetyltransferases; kinases; ligases.]

PKMTs and PRMTs in human disease

In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance (REFS 7–15).

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes (REFS 14–16). Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas (REF. 15). In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer (REF. 10). In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis (REFS 10,15). Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2.

Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer (REFS 7–9). For example, SET domain, bifurcated 1 (SETDB1) (REF. 17) and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4) (REF. 18) have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7) (REF. 19), CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A) (REF. 22) and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts (REFS 4,5).
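The site nomenclature used above (H3K27 for lysine 27 of histone H3) follows a regular pattern: histone name, one-letter residue code, residue position and, in labels such as H3K9me2, an optional methylation state. The following is a minimal sketch of how such labels could be decoded. The function name `parse_site`, the `HistoneSite` record and the restriction to K and R residues with a `me1` to `me3` suffix are illustrative assumptions, not a published standard.

```python
import re
from typing import NamedTuple, Optional

# One-letter codes for the two residues methylated by PMTs (assumed subset).
RESIDUES = {"K": "lysine", "R": "arginine"}


class HistoneSite(NamedTuple):
    histone: str                  # e.g. "H3"
    residue: str                  # "lysine" or "arginine"
    position: int                 # residue number in the histone sequence
    methyl_groups: Optional[int]  # 1, 2 or 3 if a "meN" suffix is present


# Hypothetical pattern: histone, residue code, position, optional "me1".."me3".
_SITE = re.compile(r"^(H2A|H2B|H3|H4)([KR])(\d+)(?:me([123]))?$")


def parse_site(label: str) -> HistoneSite:
    """Decode a label such as 'H3K27' or 'H3K9me2' into its components."""
    match = _SITE.match(label)
    if match is None:
        raise ValueError(f"unrecognised histone site label: {label!r}")
    histone, code, position, methyl = match.groups()
    return HistoneSite(
        histone,
        RESIDUES[code],
        int(position),
        int(methyl) if methyl else None,
    )


print(parse_site("H3K27"))
print(parse_site("H3K9me2"))
```

Under these assumptions, H3K27 decodes to lysine 27 of histone H3 with no methylation state specified, which matches the usage in the text.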

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains (REF. 3). Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also be stringent (discussed below). Clearly, the structure of each enzyme is unique (REFS 16,24), as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class (REFS 25–27). Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified (REFS 24,28). With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

SAH

S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate (REFS 14,29). For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes (REF. 29). In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner (REFS 14,16): the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human, non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.
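The counts quoted above can be checked with a few lines of arithmetic. The sketch below simply tallies the family groupings and PKMT numbers stated in the text. The variable names and the short family labels are illustrative shorthand, and no per-family membership is implied because the source does not give it.

```python
# Seven SET domain families named in the text (labels are shorthand).
SET_DOMAIN_FAMILIES = [
    "SUV39", "SET1 (MLL)", "SET2 (NSD)", "RIZ (PRDM)", "SMYD", "EZ", "SUV4-20",
]
# The eighth ('others') and ninth (DOT1L) groupings described in the text.
ADDITIONAL_GROUPS = ["others (SETD7, SETD8)", "DOT1L (non-SET domain)"]

PKMTS_2007_NOMENCLATURE = 24  # human PKMTs identified in the 2007 proposal
SET_DOMAIN_PROTEINS = 51      # SET domain proteins in the extended survey
NON_SET_PKMTS = 1             # DOT1L


def pkmt_summary() -> dict:
    """Tally the groupings and totals quoted in the text."""
    total = SET_DOMAIN_PROTEINS + NON_SET_PKMTS
    return {
        "families": len(SET_DOMAIN_FAMILIES) + len(ADDITIONAL_GROUPS),
        "putative_pkmts": total,
        "fold_increase": total / PKMTS_2007_NOMENCLATURE,
    }


summary = pkmt_summary()
print(summary["families"], summary["putative_pkmts"],
      round(summary["fold_increase"], 2))
```

The total of 52 is slightly more than twice the 24 enzymes in the earlier nomenclature, consistent with the statement that the number was more than doubled.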

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well-established disease associations (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site

The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism (REFS 3,24,30). The lone-pair electrons of a nitrogen atom (from lysine or arginine) ‘attack’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.
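The methyl transfer described above can be written as a schematic reaction. This is a simplified sketch rather than the authors' own scheme: R stands for the rest of the lysine or arginine side chain, Ado for the adenosyl group and R' for the homocysteine-derived chain of the cofactor, all of which are shorthand introduced here. The protonation states shown (a neutral nitrogen nucleophile and a protonated methylated product) are an assumption made to keep the charges balanced.

```latex
\[
\underbrace{\mathrm{R{-}NH_2}}_{\text{Lys or Arg side-chain nitrogen}}
\;+\;
\underbrace{\mathrm{Ado{-}S^{+}(CH_3){-}R'}}_{\text{SAM}}
\;\longrightarrow\;
\left[\,\mathrm{N \cdots CH_3 \cdots S}\,\right]^{\ddagger}
\;\longrightarrow\;
\mathrm{R{-}NH_2^{+}{-}CH_3}
\;+\;
\underbrace{\mathrm{Ado{-}S{-}R'}}_{\text{SAH}}
\]
```

The bracketed species is the penta-coordinate carbon transition state, with the attacking nitrogen and the departing sulphur on opposite sides of the methyl carbon. The net charge is +1 on both sides of the scheme.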

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery (REFS 31,32). Furthermore, despite binding a

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*

DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome

HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.


S4 NATURE REPRINT COLLECTION Epigenetics

Nature Reviews | Drug Discovery

Me

Ac MeK

K

UbK

SP R

PKMTs52 family members

PRMTs≥10 family members

Demethylases~30 family members

Deacetylases~18 family members

Acetyltransferases

Kinases

Ligases

Target classA group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAMS-adenosyl-l-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human diseaseIn surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DoT1-like, histone H3 methyltransferase (DoT1l; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5- protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methyla-tion of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indica-tor of patient outcome for breast cancer10. In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15. Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2. 
Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated

arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenera-tive diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflamma-tory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T lymphotrophic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.

PMTs as a drug target classFrom a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-l-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate spe-cificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methyl-ation) that is catalysed by a particular enzyme can also

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

R E V I E W S

NATURE REVIEWS | Drug Discovery VolUME 8 | SEPTEMBER 2009 | 725

nrd_2974_sep09.indd 725 18/8/09 09:48:46

SAHS-adenosyl-l-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-l-methionine.

be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allows for certain efficiencies and economies in the dis-covery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which sev-eral cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these pro-teins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Representation of PMTs in the human genomeThe PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the related-ness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DoT1l).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature con-vention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the

suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as Mll) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DoT1l, the human, non-SET domain PKMT can be considered a ninth family of PKMTs.

our group has recently extended this work to system-atically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DoT1l) (l.F. Jerva, K.o. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.



Figure 1 labels (Nature Reviews | Drug Discovery): residue labels K, S and R with marks Me, Ac, Ub and P; PKMTs, 52 family members; PRMTs, ≥10 family members; demethylases, ~30 family members; deacetylases, ~18 family members; acetyltransferases; kinases; ligases.

Target class: A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM: S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human disease

In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer10. In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15. Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2.

Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T-lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

REVIEWS

NATURE REVIEWS | Drug Discovery VOLUME 8 | SEPTEMBER 2009 | 725

SAH: S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site

The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism3,24,30. The lone pair electrons of a nitrogen atom (from lysine or arginine) ‘attack’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.
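The methyl transfer described above can be written out as a schematic reaction. This is only an illustrative restatement of the text: R–NH2 stands for the neutral nitrogen nucleophile of the lysine or arginine side chain, and R′ and R″ stand for the two remaining substituents on the sulphur atom of SAM; these symbols are assumptions made for the sketch and are not the article's notation. The scheme uses `\text`, so it assumes the amsmath package.

```latex
\[
\mathrm{R{-}NH_2} \;+\; \mathrm{H_3C{-}S^{+}R'R''}\ (\text{SAM})
\;\longrightarrow\;
\left[\mathrm{R{-}H_2N^{\delta+}\cdots CH_3 \cdots S^{\delta+}R'R''}\right]^{\ddagger}
\;\longrightarrow\;
\mathrm{R{-}NH_2^{+}{-}CH_3} \;+\; \mathrm{S\,R'R''}\ (\text{SAH})
\]
```

The bracketed species is the penta-coordinate carbon transition state, with the incoming nitrogen and the departing sulphur on opposite sides of the methyl carbon, as expected for attack at 180° to the leaving group.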

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery31,32. Furthermore, despite binding a common natural ligand, the ATP-binding pockets of protein kinases have afforded medicinal chemists a rich diversity of chemical scaffolds, which have resulted in a range of drug molecules of varying degrees of target selectivity32. Similarly, the commonality of SAM use by the PMTs belies the structural, biological and pathological diversity of these enzymes. From the perspective of drug discovery and medicinal chemistry, the diversity of SAM-binding modes and catalytic mechanisms of these enzymes is of key importance.

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*
DNMT inhibitors
5-Azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome
HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.

A common structural feature of PKMTs and PRMTs that distinguishes these enzymes from other proteins that use SAM is the overall architecture of their extended catalytic active sites. This generally consists of a SAM-binding pocket that is accessed from one face of the protein, and a narrow, hydrophobic, acceptor (that is, lysine or arginine) channel that extends to the opposite face of the protein surface, such that the two substrates enter the active site from opposite sides of the enzyme surface.

Table 2 | Selected PKMTs and PRMTs that have shown an association with human cancers

Protein methyltransferase | Methylation substrates | Cancers | Cancer association | Refs
SUV39H1 | H3K9 | Colon cancer | Increased expression in colorectal tumours; associated with transcriptional repression | 53
EHMT2 | H3K9 | Lung, prostate and hepatocellular carcinoma | Increased expression in lung cancer cell lines; regulates centrosome duplication, presumably through chromatin structure | 54,55
MLL | H3K4 | Leukaemia | Chromosomal aberrations involving MLL are a cause of acute leukaemias; the SET domain is lost in translocation | 56–58
NSD1 | H3K36 | Acute myeloid leukaemia | Translocation fuses NSD1 to nucleoporin 98 in human acute myeloid leukaemia | 59
WHSC1 | H3K36 and H4K20 | Myeloma | Translocated and increased expression in myeloma; associated with transcriptional regulation | 60–62
WHSC1L1 | H3K4 | Lung and breast cancers, and childhood acute myeloid leukaemia | Amplified in lung cancer and breast cancer; translocation with nucleoporin 98; mediates transcriptional activation | 63–64
DOT1L | H3K79 | MLL-rearranged leukaemias | Recruited by MLL fusion partners MLLT1, MLLT2, MLLT3 and MLLT10 to homeobox genes; associated with transcriptional activation and elongation | 11,66,67
SMYD3 | H3K4 | Breast, liver, colon and gastric cancers | Overexpressed in multiple tumour types; associated with transcriptional activation | 68,69
EZH2 | H3K27 | Breast, prostate, colon, gastric, bladder and liver cancers, melanoma and lymphoma | Amplified and increased expression in several tumour types; a member of the polycomb repressive complex 2; associated with transcriptional repression | 10,15,70,71
SETD7 | H3K4 | Breast cancers | SET7-mediated methylation stabilizes the oestrogen receptor and is necessary for the recruitment of the oestrogen receptor to its target genes and target gene transactivation | 72
PRDM14 | No known substrate | Breast cancers | Amplified and overexpressed in cancers; associated with transcriptional repression | 73
CARM1 | H3R17, EP300–CBP and NCOA3 | Breast and prostate cancers | Increased expression correlates with androgen independence in human prostate carcinoma; overexpressed in breast tumours and associated with transcriptional activation | 74,75
PRMT5 | H3R8, p53, SNRPD1, SNRPD3 and SUPT5H | Lymphoma | PRMT5 expression and H3R8 methylation levels are increased in lymphoid cancer cells; PRMT5 mediates p53 methylation, which promotes cell arrest rather than cell death; H4R3 methylation promotes recruitment of DNMT3A, subsequent promoter CpG methylation and gene silencing | 12,76

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); CBP, CREB-binding protein; DNMT3A, DNA (cytosine-5-)-methyltransferase 3α; EHMT2, euchromatic histone-lysine N-methyltransferase 2 (also known as G9A and KMT1C); EP300, E1A-binding protein p300; EZH2, enhancer of zeste homologue 2 (also known as KMT6); DOT1L, DOT1-like, histone H3 methyltransferase (also known as KMT4); MLL, myeloid, lymphoid or mixed-lineage leukaemia (also known as KMT2A); MLLT1, myeloid, lymphoid or mixed-lineage leukaemia, translocated to 1; NCOA3, nuclear receptor coactivator 3; NSD1, nuclear receptor-binding SET domain protein 1; PKMT, protein lysine methyltransferase; PRDM14, PR domain-containing protein 14; PRMT, protein arginine methyltransferase; SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SMYD3, SET and MYND domain-containing protein 3; SNRPD1, small nuclear ribonucleoprotein D1 polypeptide 16 kDa (also known as SMD1); SNRPD3, small nuclear ribonucleoprotein D3 polypeptide 18 kDa (also known as SMD3); SUPT5H, suppressor of Ty 5 homologue; SUV39H1, suppressor of variegation 3–9 homologue 1 (also known as KMT1A); WHSC1, Wolf–Hirschhorn syndrome candidate 1 (also known as MMSET and NSD2); WHSC1L1, Wolf–Hirschhorn syndrome candidate 1-like protein 1 (also known as NSD3).


Figure 2 (panels a and b; Nature Reviews | Drug Discovery): chemical structures of SAM and SAH in the methyl transfer reaction catalysed by PMTs, and a generic SN2 scheme showing a nucleophile (Nu), a leaving group (LG) and the transition state (‡).

Crystallographic studies have revealed two distinct binding modes for SAM or SAH in the cofactor-binding pockets of PMTs24. For the SET domain PKMTs that have been co-crystallized with SAM or SAH, it is known that the cofactor adopts a ‘U-shaped’ configuration within the active site (FIG. 3) that aligns the methylsulphonium cation of SAM at the base of the narrow lysine channel, in perfect juxtaposition to the ε-amino group of the acceptor lysine residue, which facilitates group transfer. This U-shaped configuration is induced by a conserved aspartate or glutamate residue that binds to the ribose hydroxyl groups, and a positively charged lysine or arginine residue that forms a salt bridge with the carboxylate of SAM. In striking contrast to the U-shaped configuration that is adopted by the cofactor when bound to PKMTs, SAM bound within the active site of PRMTs adopts an extended configuration that resembles the extended SAM configuration seen in the DNA methyltransferases; again, the binding motif results in alignment of the SAM methylsulphonium cation with the base of the acceptor-binding channel. Another distinction between cofactor binding within the PKMTs and the PRMTs is that, in PRMTs, dimer formation seems to be a crucial component of SAM binding and catalysis, whereas this is not the case for PKMTs3,24. The mechanistic consequences of obligate dimer formation in the PRMTs are not yet clear, but it may be involved in multiple methylations of the arginine residue.

From the above discussion, it could be concluded that the configuration of the bound SAM is structurally related to the identity of the methyl acceptor nitrogen species upon which the enzymes act; that is, U-shaped for PKMTs and extended for PRMTs. However, data on the non-SET domain PKMT, DOT1L, do not support this conclusion. In the co-crystal structures of human DOT1L bound to SAM33, and the yeast homologue Dot1p bound to SAH34, the cofactor is bound in the extended configuration, similar to that seen in the PRMTs. Additionally, the solvent-exposed surface area of the bound cofactor in DOT1L is more similar to that seen in the PRMTs than the PKMTs, as is the overall amino-acid sequence around the cofactor-binding pocket24,33. Therefore, from a structural perspective, DOT1L seems to link the PKMT and PRMT groups of PMTs.

The discovery and optimization of selective drugs for the PMTs will depend not only on the static structure of the active site of the enzyme, as revealed through crystallographic studies, but also on the structural dynamics of the active site that accompany catalytic turnover27,35. Studies on the kinetic mechanisms of the PMTs may provide some information in this area.

Some of the SET domain PKMTs, such as SETD7, perform a single round of catalysis on a lysine residue, resulting in a mono-methylated product, whereas other SET domain PKMTs catalyse multiple rounds of methylation on a specific lysine residue. Crystallographic studies suggest that the difference between single-turnover and multiple-turnover SET domain enzymes results from the degree of steric crowding and hydrogen-bonding patterns in the lysine-binding channel of these enzymes3,24,36,37. In particular, the identity of an aromatic residue within the lysine-binding pocket seems to be the key determinant of the multiplicity of lysine methylation. In the PKMT DIM5, this residue is a phenylalanine (F281), and the enzyme can tri-methylate the acceptor lysine residue of its protein substrate. The corresponding residue in SETD7 is a tyrosine (Y305), and this enzyme can only mono-methylate its protein substrate. Remarkably, the mutant F281Y transforms DIM5 into a mono-methylating PKMT, and the corresponding mutant Y305F in SETD7 results in an enzyme that is capable of multiple rounds of lysine methylation38. These mutagenesis results have been extended to the PKMT euchromatic histone lysine N-methyltransferase 2 (EHMT2; also known as G9A)39, and the ‘tyrosine–phenylalanine switch’ seems to be a general determinant of product specificity among the SET domain PKMTs24. Molecular dynamics and hybrid quantum mechanical–molecular mechanical studies also suggest a key role for bound water molecules (a water channel) in the extent of lysine methylation by PKMTs30.
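The tyrosine–phenylalanine switch described above can be written down as a simple lookup. The following is a minimal Python sketch, not a predictive model: the names `SWITCH_RULE`, `WILD_TYPE` and `predicted_multiplicity` are illustrative assumptions, and the rule deliberately ignores the steric, hydrogen-bonding and water-channel effects that the text also mentions.

```python
# Crude encoding of the 'tyrosine-phenylalanine switch' for SET domain PKMTs.
# Assumption: only the aromatic residue at the switch position is considered.
SWITCH_RULE = {
    "Y": "mono-methylation only",
    "F": "multiple rounds of methylation",
}

# Wild-type switch residues quoted in the text for two enzymes.
WILD_TYPE = {"DIM5": ("F", 281), "SETD7": ("Y", 305)}


def predicted_multiplicity(enzyme, mutation=None):
    """Return the product multiplicity expected for an enzyme or point mutant.

    `mutation` uses the usual notation, e.g. "F281Y"
    (original residue, position, replacement residue).
    """
    residue, position = WILD_TYPE[enzyme]
    if mutation is not None:
        old, pos, new = mutation[0], int(mutation[1:-1]), mutation[-1]
        if (old, pos) != (residue, position):
            raise ValueError(f"{mutation} does not match the switch residue of {enzyme}")
        residue = new
    return SWITCH_RULE[residue]


for enzyme, mutation in [("DIM5", None), ("DIM5", "F281Y"),
                         ("SETD7", None), ("SETD7", "Y305F")]:
    print(enzyme, mutation or "wild type", "->", predicted_multiplicity(enzyme, mutation))
```

The lookup reproduces the four cases reported in the text (two wild-type enzymes and their reciprocal mutants); any other residue at the switch position raises a `KeyError`, because the text gives no rule for it.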

An outstanding question that has yet to be reconciled with the mechanistic hypothesis described above is how the quaternary nitrogen atom is deprotonated to generate a neutral amine methyl acceptor. At physiological pH, the lysine amine is protonated (the negative logarithm of the acid dissociation constant (pKa) of the side chain amine is ~10.8 (REF. 35)), and so there are no lone pair electrons to act as the attacking nucleophile in the

Figure 2 | PMT-catalysed methylation of proteins by an SN2 reaction with SAM as the methyl donor. The protein methyltransferases (PMTs) catalyse methyl transfer from their universal methyl donor, S-adenosyl-l-methionine (SAM; also known as AdoMet) to a nitrogen atom of lysine or arginine side chains to form S-adenosyl-l-homocysteine (SAH; also known as AdoHcy). a | The methyl group (shown in red) of the SAM sulphonium cation is ‘attacked’ by the lone pair electrons of a lysine (shown here) or arginine (not shown) side-chain nitrogen atom. The reaction results in transfer of the methyl group to the attacking nitrogen atom and the production of SAH from the reaction cofactor. b | A more generalized chemical scheme of a bimolecular nucleophilic substitution (SN2) group transfer reaction, illustrating the attacking nucleophile (Nu–; lysine or arginine in the case of PMTs), the leaving group (LG; the methyl group in the case of PMTs), and the transient but essential formation of a penta-coordinate carbon transition state (‡).

R E V I E W S

728 | SEPTEMBER 2009 | VOLUME 8 www.nature.com/reviews/drugdisc


[Figure 1 graphic labels: PKMTs, 52 family members; PRMTs, ≥10 family members; demethylases, ~30 family members; deacetylases, ~18 family members; acetyltransferases; kinases; ligases.]

Target class: A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM: S-adenosyl-l-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human disease

In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer10. In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15. Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2.

Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T-lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-l-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.


SAH: S-adenosyl-l-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-l-methionine.

be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human, non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site

The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism3,24,30. The lone pair electrons of a nitrogen atom (from lysine or arginine) ‘attack’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-l-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery31,32. Furthermore, despite binding a

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*

DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome

HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.


[Figure 3 panel labels: a, PRMT; b, DOT1L; c, SET domain.]

General base catalysis: A mechanism that can occur in enzyme catalysis, in which a basic group accepts protons from a substrate molecule, usually to stabilize a charged transition-state species.

SN2-mediated methyl transfer reaction. A potential mechanism of deprotonation is through general base catalysis. However, inspection of the amino acids in the active sites of PKMTs reveals no obvious basic side chains that could act in this capacity. Another hypothesis is that the solvent acts as a proton ‘sponge’; however, this seems inconsistent with the fact that the lysine side chain is buried deeply in the protein, with no clear access to bulk solvent. An alternative hypothesis has recently been proposed, based on molecular dynamics simulations30. According to this model, binding of SAM and protein substrates creates a ‘water shuttle’ that can remove a proton from the buried lysine side chain and ferry this proton along a contiguous chain of water molecules to be deposited into the bulk solvent. Additionally, the electrostatic repulsion created by the quaternary nitrogen atom and the positively charged SAM cofactor lowers the pKa of the lysine side-chain amine to ~8.2, thereby facilitating this deprotonation process. Furthermore, the water shuttle hypothesis provides an alternative mechanism to explain the differences in extent of lysine methylation by the PKMTs. The molecular dynamics studies suggest that the ability to form a water shuttle will determine the extent of methylation that is catalysed by a given enzyme. For example, simulations of SETD7-mediated catalysis suggest that mono-methylation of lysine prevents re-formation of a new water shuttle, and so this enzyme terminates catalysis after one round of methylation. The same simulations suggest that other PKMTs, such as the ribulose bisphosphate carboxylase–oxygenase large subunit methyltransferase, can readily re-form the water shuttle, leading to multiple rounds of methylation.
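To give a sense of scale for the pKa shift, the Henderson–Hasselbalch relationship can be used to estimate what fraction of the lysine amine is neutral (and therefore nucleophilic) at the two pKa values quoted in this article (~10.8 for the free side chain and ~8.2 in the simulated active site). The sketch below is only a back-of-the-envelope illustration: the function name `neutral_fraction` and the pH value of 7.4 are assumptions, as the text says only ‘physiological pH’, and bulk-solution acid–base equilibria may not describe a buried active site well.

```python
def neutral_fraction(pka, ph):
    """Fraction of an amine in the neutral (deprotonated) form at a given pH.

    Follows from Henderson-Hasselbalch: [B]/[BH+] = 10 ** (pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PH = 7.4  # assumed value for 'physiological pH'; not stated in the text

free_lysine = neutral_fraction(10.8, PH)   # side-chain pKa quoted for lysine
bound_lysine = neutral_fraction(8.2, PH)   # lowered pKa suggested by the simulations

print(f"pKa 10.8: {free_lysine:.2%} neutral")
print(f"pKa  8.2: {bound_lysine:.2%} neutral")
print(f"fold increase: {bound_lysine / free_lysine:.0f}")
```

Under these assumptions the neutral fraction rises from roughly 0.04% to roughly 14%, which is consistent with the statement that the lowered pKa facilitates deprotonation.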

Enzymes that perform multiple rounds of catalysis on a macromolecular substrate can do so by one of two mechanisms: a distributive enzyme mechanism, in which each round of catalysis results in macromolecular product dissociation and rebinding, or a processive mechanism, in which multiple rounds of catalysis proceed before dissociation of the macromolecular product. PMTs use both of these mechanisms: some SET domain PKMTs that perform multiple rounds of lysine methylation have been found to use a processive mechanism3,24, whereas DOT1L has been shown to perform multiple rounds of H3K79 methylation through a non-processive (distributive) mechanism40.
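The practical difference between the two mechanisms is which species are released into solution. The toy Python sketch below makes this explicit; it is an idealized illustration rather than a kinetic model, and the function name `released_products` and its arguments are hypothetical.

```python
def released_products(max_methyl, processive):
    """List the methylation state of each product released from the enzyme.

    Starts from an unmodified residue (0 methyl groups) and stops once
    `max_methyl` groups have been added.
    """
    released = []
    state = 0
    while state < max_methyl:
        if processive:
            # All rounds of catalysis occur during a single binding event.
            state = max_methyl
        else:
            # One methyl group is added per binding event, then the product
            # dissociates and must rebind for the next round.
            state += 1
        released.append(state)
    return released


print(released_products(3, processive=False))  # [1, 2, 3]
print(released_products(3, processive=True))   # [3]
```

In the distributive case the mono- and di-methylated intermediates appear as free products, whereas in the idealized processive case only the final product is released; this is the kind of signature that distinguishes the two mechanisms experimentally.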

The PRMTs are also capable of performing multiple rounds of arginine methylation to produce either mono- or di-methylated arginine products. The PRMTs that have been studied so far follow an ordered, sequential mechanism in which SAM binds before the arginine-containing substrate, and di-methyl arginine production occurs through a processive mechanism3. On the basis of product specificity, PRMTs can be subdivided into two types: type I PRMTs, which produce an asymmetrical N,N-dimethyl arginine; and type II PRMTs, which produce a symmetrical N,N′-dimethyl arginine3.

The variations in active-site structure and chemical mechanism that are summarized above reflect a target class with the potential for substantial chemical diversity among small-molecule modulators of individual enzymes in the class. Therefore, the opportunity for the development of different chemotypes that compete with the common, natural ligands of these enzymes (for example, SAM, lysine and arginine), and can be modified to produce enzyme-selective inhibitors, seems promising.

Known inhibitors of PMTs

Despite the convergence of data concerning PMTs, the search for potent, selective inhibitors of these enzymes has only recently begun in earnest. Some indirect approaches to inhibiting or depleting PMTs have been reported. For example, the antiviral compound 3-deazaneplanocin (DZNep) inhibits the enzyme SAH hydrolase and thereby increases intracellular levels of the universal product of PMTs, SAH41. Product inhibition by SAH would therefore be expected for all PMTs and other SAM-dependent enzymes, with the degree of inhibition for specific enzymes being related to their relative inhibition constant (Ki) and Michaelis constant (Km) values for

Figure 3 | Variations in the configuration of SAM or SAH bound within the active sites of different PMTs. a | The representative conformation shown for the protein arginine methyltransferases (PRMTs) was taken from the crystal structure of S-adenosyl-l-homocysteine (SAH; also known as AdoHcy) bound to coactivator-associated arginine methyltransferase 1 (CARM1)49. b | The conformation shown for DOT1-like, histone H3 methyltransferase (DOT1L) was taken from the crystal structure of S-adenosyl-l-methionine (SAM; also known as AdoMet) bound to this protein33. c | The representative conformation shown for the protein lysine methyltransferases (PKMTs) was taken from the crystal structure of SAH bound to SET domain-containing lysine methyltransferase 8 (REF. 50). Carbon atoms are represented by grey circles; nitrogen atoms are represented by blue circles; oxygen atoms are represented by red circles; and sulphur atoms are represented by yellow circles.


SAH and SAM, respectively27. Similarly, the activity of all SAM-dependent enzymes in a cell could be reduced by blocking SAM biosynthesis, for example by inhibiting dihydrofolate reductase or SAM synthase, which are two enzymes involved in SAM biosynthesis42. Also, the pan-HDAC inhibitor panobinostat has recently been shown to cause depletion of cellular levels of the PMT EZH2 (REF. 43). Although the mechanism by which this

Table 3 | Chemical structures and biochemical data for small-molecule inhibitors of PMTs

Compound | Mechanism and potency | Selectivity* | Refs

SAH | Product of the reactions catalysed by PMTs; IC50 values range from 0.1 to 20 µM | Non-selective | 77,78
Sinefungin | Natural product analogue of SAM and SAH; IC50 values range from 0.1 to 20 µM | Non-selective | 36
Chaetocin | SAM-competitive inhibitor of SUV39; IC50 = 0.6 µM | >4-fold | 79
BIX-01294 | SAM-non-competitive inhibitor of EHMT2; IC50 = 2.7 µM | >4-fold | 80
Methylgene compound 7a of REF. 45 | CARM1 inhibitor; IC50 = 60 nM | >100-fold for PRMT1 and SETD7 | 45
Bristol–Myers Squibb compound 7f of REF. 47 | CARM1 inhibitor; IC50 = 40 nM | >100-fold for PRMT1 and PRMT3 | 46,47

[The structure column of the original table contains chemical drawings that are not reproduced here.]

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); EHMT2, euchromatic histone lysine N-methyltransferase 2 (also known as G9A and KMT1C); IC50, half-maximal inhibitory concentration; PMT, protein methyltransferase; PRMT, protein arginine methyltransferase; SAH, S-adenosyl-l-homocysteine (also known as AdoHcy); SAM, S-adenosyl-l-methionine (also known as AdoMet); SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SUV39, suppressor of variegation 3–9. *Selectivity is given as the ratio of the IC50 value for the most potent inhibition at a non-target PMT over the IC50 value for the primary target. See REF. 27.
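The selectivity definition in the table footnote amounts to a simple ratio. A minimal Python sketch follows, using the CARM1 IC50 of 60 nM listed for Methylgene compound 7a as the primary-target value; the off-target IC50 values, the unit table and the function names `to_nm` and `selectivity_fold` are hypothetical and serve only to illustrate the arithmetic.

```python
# Conversion factors to nanomolar so that all IC50 values share one unit.
UNIT_TO_NM = {"nM": 1.0, "uM": 1000.0}


def to_nm(value, unit):
    """Convert a concentration to nanomolar."""
    return value * UNIT_TO_NM[unit]


def selectivity_fold(off_target_ic50s_nm, target_ic50_nm):
    """Ratio of the most potent (lowest) off-target IC50 to the target IC50."""
    return min(off_target_ic50s_nm) / target_ic50_nm


target = to_nm(60, "nM")  # CARM1 IC50 listed in Table 3 for Methylgene compound 7a
# Hypothetical off-target IC50 values, chosen only to illustrate the calculation.
off_targets = [to_nm(9, "uM"), to_nm(25, "uM")]

print(selectivity_fold(off_targets, target))  # 150.0
```

Because the ratio uses the most potent off-target activity, a compound is only as selective as its closest off-target enzyme, however weakly it inhibits the others.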

R E V I E W S

730 | SEPTEMBER 2009 | VolUME 8 www.nature.com/reviews/drugdisc

nrd_2974_sep09.indd 730 18/8/09 09:48:50

Nature Reviews | Drug Discovery

Me

Ac MeK

K

UbK

SP R

PKMTs52 family members

PRMTs≥10 family members

Demethylases~30 family members

Deacetylases~18 family members

Acetyltransferases

Kinases

Ligases

Target classA group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAMS-adenosyl-l-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human diseaseIn surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DoT1-like, histone H3 methyltransferase (DoT1l; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5- protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methyla-tion of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indica-tor of patient outcome for breast cancer10. In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15. Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2. 
Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T-lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

SAH: S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

R E V I E W S

NATURE REVIEWS | Drug Discovery VOLUME 8 | SEPTEMBER 2009 | 725

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site

The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism3,24,30. The lone pair electrons of a nitrogen atom (from lysine or arginine) ‘attack’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.
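The methyl transfer step described above can be written as a charge-balanced scheme. This is a simplified sketch rather than a full mechanism. The notation is an assumption made for illustration: R–NH2 stands for the neutral side-chain nitrogen of lysine or arginine, SAM is abbreviated as a methylsulphonium cation with substituents R' and R'', SAH is the corresponding neutral thioether, and the subsequent loss of a proton from the methylated nitrogen is omitted.

```latex
\[
\underbrace{\mathrm{R{-}NH_2}}_{\text{nucleophile}}
\;+\;
\underbrace{\mathrm{CH_3{-}S^{+}(R')(R'')}}_{\text{SAM}}
\;\longrightarrow\;
\Bigl[\mathrm{R{-}H_2N \cdots CH_3 \cdots S(R')(R'')}\Bigr]^{+\,\ddagger}
\;\longrightarrow\;
\mathrm{R{-}NH_2^{+}{-}CH_3}
\;+\;
\underbrace{\mathrm{S(R')(R'')}}_{\text{SAH}}
\]
```

The bracketed species is the penta-coordinate carbon transition state, with the incoming nitrogen and the departing sulphur on opposite faces of the methyl carbon, as expected for an SN2 displacement.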

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery31,32. Furthermore, despite binding a

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*

DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome

HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.


Structure–activity relationship: The relationship between the chemical structure of a compound and its pharmacological activity.

occurs is not yet fully understood, an approach of this type would nevertheless deplete the protein levels of EZH2 and so abolish the PMT catalytic activity of the enzyme along with any other non-enzymatic functions of EZH2.

Direct inhibitors of PMTs have recently been reviewed4, along with other probes of histone-modifying enzymes. Some natural ligands for these enzymes have been known for some time, including the reaction product, SAH, and a natural inhibitor isolated from Streptomyces spp. cultures, sinefungin (TABLE 3). More selective inhibitors have been identified for SUV39 (chaetocin; reported half-maximal inhibitory concentration (IC50) = 0.6 μM) and for EHMT2 (BIX-01294; reported IC50 = 1.6 μM), but no further optimization of these compounds has been reported to date4. A co-crystal structure of BIX-01294 bound to EHMT2 has recently been published44. Surprisingly, the compound was found to bind to the enzyme non-competitively with respect to SAM, in a groove that is normally occupied by a portion of the protein substrate.

More recently, two groups have reported potent, selective, pyrazole-based inhibitors of the PRMT CARM1 (REFS 45–47) (TABLE 3). These compounds are the first examples of inhibitors of a specific PMT that are effective at nanomolar concentrations and display >100-fold selectivity for the primary target over related enzymes. The compound series reported by Methylgene45 was found to be inactive in cellular assays; no cellular data have been reported for the compound series from Bristol–Myers Squibb46,47. Therefore, although an exciting first step has been made towards developing selective inhibitors of PMTs, substantial work remains to be done before these findings can be translated into pharmacologically tractable species.
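As a rough illustration of how potency and selectivity figures of this kind can be read, the following Python sketch assumes a simple one-site dose-response (Hill) model with full inhibition at saturating compound. That model is an assumption made here and is not taken from the cited studies. The function names and the off-target IC50 used in the selectivity example are hypothetical; only the two IC50 values for chaetocin and BIX-01294 come from the text.

```python
def fractional_activity(inhibitor_um: float, ic50_um: float, hill: float = 1.0) -> float:
    """Fraction of enzyme activity remaining at a given inhibitor concentration.

    Assumes a simple Hill-type dose-response curve running from 1 (no
    inhibitor) to 0 (saturating inhibitor). Concentrations are in micromolar.
    """
    if inhibitor_um < 0 or ic50_um <= 0:
        raise ValueError("concentrations must be non-negative and IC50 positive")
    return 1.0 / (1.0 + (inhibitor_um / ic50_um) ** hill)


def fold_selectivity(ic50_off_target_um: float, ic50_target_um: float) -> float:
    """Ratio of off-target IC50 to on-target IC50 (larger means more selective)."""
    if ic50_off_target_um <= 0 or ic50_target_um <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_off_target_um / ic50_target_um


# IC50 values reported in the text (micromolar).
REPORTED_IC50_UM = {
    "chaetocin vs SUV39": 0.6,
    "BIX-01294 vs EHMT2": 1.6,
}

if __name__ == "__main__":
    for label, ic50 in REPORTED_IC50_UM.items():
        # By definition, half of the activity remains at [I] = IC50.
        at_ic50 = fractional_activity(ic50, ic50)
        at_10x = fractional_activity(10 * ic50, ic50)
        print(f"{label}: {at_ic50:.2f} at IC50, {at_10x:.3f} at 10x IC50")

    # Hypothetical numbers: a 50 nM on-target IC50 and a 5 uM off-target IC50
    # would correspond to 100-fold selectivity.
    print(f"fold selectivity: {fold_selectivity(5.0, 0.05):.0f}")
```

Under this model a compound at its IC50 leaves half of the enzyme activity intact, so the >100-fold selectivity quoted above means the off-target enzyme would need at least a 100-fold higher concentration to reach the same half-maximal inhibition.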

The paucity of potent, selective, pharmacologically tractable inhibitors of the PMTs creates a crucial therapeutic gap that medicinal chemists should strive to fill. As described here, the pathobiological relevance of these enzymes, together with the structural and mechanistic information that suggests their druggability as a target class, converge to make the PMTs an attractive and important class of novel enzymes for contemporary drug discovery.

Conclusions

There is a growing body of evidence that enzymes in this target class have important pathogenic roles in human diseases. The structures and enzymatic mechanisms of the PMTs support the view that pharmacological modulation of these enzymes by small-molecule inhibitors will be an effective means of therapeutic intervention in cancer and numerous other unmet medical needs. The discovery of small-molecule inhibitors of PMTs as starting points for drug development should clearly be a key focus of new research efforts. Beyond this goal, there are many opportunities to use chemical probes of PMT function to define the underlying biology and pathobiology that are associated with protein modification by these enzymes. The nature of PMT catalysis, and the available structural information about these enzymes, should facilitate the discovery of PMT ligands through mechanism- and structure-guided discovery methods48, as well as methods that do not rely on mechanistic knowledge, such as high-throughput screening of diverse chemical libraries.

A key remaining question when considering the PMTs as a drug discovery target class is whether or not selective inhibition of particular enzymes can be achieved through targeting the SAM-binding pocket. This is analogous to the question that hindered the early acceptance of protein kinases as drug targets: whether it was possible to achieve selectivity among the ATP-binding pockets of the kinases. In retrospect, it is clear that the diversity of binding-site architecture and the binding-site dynamics associated with enzyme catalysis provide ample opportunities for selective inhibition of kinases through medicinal chemistry efforts. Will the same be true for the SAM-binding pockets of PMTs? Ultimately, structure–activity relationship profiles, selectivity and collateral inhibition of off-target enzymes by PMT inhibitors will need to be determined empirically. Despite these limitations, it is our hope that the data presented here will help to stimulate systematic exploration of the human PMT target class towards the goal of developing selective inhibitors of PMTs as therapeutic agents for human diseases.

1. Strahl, B. D. & Allis, C. D. The language of covalent histone modifications. Nature 403, 41–45 (2000).

2. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693–705 (2007). A thorough overview of post-translational modifications on core histones, the enzymes that mediate these modifications and the biological functions of the modification.

3. Smith, B. C. & Denu, J. M. Chemical mechanisms of histone lysine and arginine modifications. Biochim. Biophys. Acta 1789, 45–57 (2008). An excellent review of the chemical biology of lysine- and arginine-modifying enzymes.

4. Cole, P. A. Chemical probes for histone-modifying enzymes. Nature Chem. Biol. 4, 590–597 (2008).

5. Keppler, B. R. & Archer, T. K. Chromatin-modifying enzymes as therapeutic targets — Part 1. Expert Opin. Ther. Targets. 12, 1301–1312 (2008).

6. Pray, L. At the flick of a switch: epigenetic drugs. Chem. Biol. 15, 640–641 (2008).

7. Jones, P. A. & Baylin, S. B. The epigenomics of cancer. Cell 128, 683–692 (2007).

8. Wilson, C. B., Rowell, E. & Sekimata, M. Epigenetic control of T-helper-cell differentiation. Nature Rev. Immunol. 9, 91–105 (2009).

9. Tsankova, N., Renthal, W., Kumar, A. & Nestler, E. J. Epigenetic regulation in psychiatric disorders. Nature Rev. Neurosci. 8, 355–367 (2007).

10. Kleer, C. G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl Acad. Sci. USA 100, 11606–11611 (2003).

11. Krivtsov, A. V. et al. H3K79 methylation profiles define murine and human MLL-AF4 leukemias. Cancer Cell 14, 355–368 (2008).

12. Jansson, M. et al. Arginine methylation regulates the p53 response. Nature Cell Biol. 10, 1431–1439 (2008).

13. Hong, H. et al. Aberrant expression of CARM1, a transcriptional coactivator of androgen receptor, in the development of prostate carcinoma and androgen-independent status. Cancer 101, 83–89 (2004).

14. Schneider, R., Bannister, A. J. & Kouzarides, T. Unsafe SETs: histone lysine methyltransferases and cancer. Trends Biochem. Sci. 27, 396–402 (2002).

15. Simon, J. A. & Lange, C. A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).

16. Dillon, S. C., Zhang, X., Trievel, R. C. & Cheng, X. The SET-domain protein superfamily: protein lysine methyltransferases. Genome Biol. 6, 227 (2005).

17. Ryu, H. et al. ESET/SETDB1 gene expression and histone H3 (K9) trimethylation in Huntington’s disease. Proc. Natl Acad. Sci. USA 103, 19176–19181 (2006).

18. Cheng, D., Cote, J., Shaaban, S. & Bedford, M. T. The arginine methyltransferase CARM1 regulates the coupling of transcription and mRNA processing. Mol. Cell 25, 71–83 (2007).

19. Li, Y. et al. Role of the histone H3 lysine 4 methyltransferase, SET7/9, in the regulation of NF-κB-dependent inflammatory genes. Relevance to diabetes and inflammation. J. Biol. Chem. 283, 26771–26781 (2008).

20. Covic, M. et al. Arginine methyltransferase CARM1 is a promoter-specific regulator of NF-κB-dependent gene expression. EMBO J. 24, 85–96 (2005).

21. Hassa, P. O., Covic, M., Bedford, M. T. & Hottiger, M. O. Protein arginine methyltransferase 1 coactivates NF-κB-dependent gene expression synergistically with CARM1 and PARP1. J. Mol. Biol. 377, 668–678 (2008).

22. Huang, J. et al. Trimethylation of histone H3 lysine 4 by Set1 in the lytic infection of human herpes simplex virus 1. J. Virol. 80, 5740–5746 (2006).


23. Jeong, S. J. et al. Coactivator-associated arginine methyltransferase 1 enhances transcriptional activity of the human T-cell lymphotropic virus type 1 long terminal repeat through direct interaction with Tax. J. Virol. 80, 10036–10044 (2006).

24. Cheng, X., Collins, R. E. & Zhang, X. Structural and sequence motifs of protein (histone) methylation enzymes. Annu. Rev. Biophys. Biomol. Struct. 34, 267–294 (2005).

25. Goldstein, D. M., Gray, N. S. & Zarrinkar, P. P. High-throughput kinase profiling as a platform for drug discovery. Nature Rev. Drug Discov. 7, 391–397 (2008).

26. Mook, R. A. The importance and complexity of target class selectivity in drug discovery. The American Association for Cancer Research Education Book 223–226 (The American Association for Cancer Research, Philadelphia, 2005).

27. Copeland, R. A. Evaluation of Enzyme Inhibitors in Drug Discovery: A Guide for Medicinal Chemists and Pharmacologists (Wiley, Hoboken, 2005).

28. Cheng, D. et al. Small molecule regulators of protein arginine methyltransferases. J. Biol. Chem. 279, 23892–23899 (2004).

29. Allis, C. D. et al. New nomenclature for chromatin-modifying enzymes. Cell 131, 633–636 (2007).

30. Zhang, X. & Bruice, T. C. Enzymatic mechanism and product specificity of SET-domain protein lysine methyltransferases. Proc. Natl Acad. Sci. USA 105, 5728–5732 (2008). This work provides a detailed theoretical basis to explain the substrate specificity of the protein lysine methyltransferases.

31. Fedorov, O. et al. A systematic interaction map of validated kinase inhibitors with Ser/Thr kinases. Proc. Natl Acad. Sci. USA 104, 20523–20528 (2007).

32. Karaman, M. W. et al. A quantitative analysis of kinase inhibitor selectivity. Nature Biotech. 26, 127–132 (2008).

33. Min, J., Feng, Q., Li, Z., Zhang, Y. & Xu, R. M. Structure of the catalytic domain of human DOT1L, a non-SET domain nucleosomal histone methyltransferase. Cell 112, 711–723 (2003).

34. Sawada, K. et al. Structure of the conserved core of the yeast Dot1p, a nucleosomal histone H3 lysine 79 methyltransferase. J. Biol. Chem. 279, 43296–43306 (2004).

35. Copeland, R. A. Enzymes: A Practical Introduction to Structure, Mechanism and Data Analysis 2nd edn (Wiley, Hoboken, 2000).

36. Couture, J. F., Hauk, G., Thompson, M. J., Blackburn, G. M. & Trievel, R. C. Catalytic roles for carbon–oxygen hydrogen bonding in SET domain lysine methyltransferases. J. Biol. Chem. 281, 19280–19287 (2006).

37. Collins, R. E. et al. In vitro and in vivo analyses of a Phe/Tyr switch controlling product specificity of histone lysine methyltransferases. J. Biol. Chem. 280, 5563–5570 (2005). This study provides a structural basis for the wide range of lysine methylation patterns that is achieved by different SET domain PKMTs.

38. Trievel, R. C., Flynn, E. M., Houtz, R. L. & Hurley, J. H. Mechanism of multiple lysine methylation by the SET domain enzyme Rubisco LSMT. Nature Struct. Biol. 10, 545–552 (2003).

39. Zhang, X. et al. Structural basis for the product specificity of histone lysine methyltransferases. Mol. Cell 12, 177–185 (2003).

40. Frederiks, F. et al. Nonprocessive methylation by Dot1 leads to functional redundancy of histone H3K79 methylation states. Nature Struct. Mol. Biol. 15, 550–557 (2008).

41. Chiang, P. K. Biological effects of inhibitors of S-adenosylhomocysteine hydrolase. Pharmacol. Ther. 77, 115–134 (1998).

42. Bender, C. M., Zingg, J.-M. & Jones, P. A. DNA methylation as a target for drug design. Pharm. Res. 15, 175–187 (1998).

43. Fiskus, W. et al. Panobinostat treatment depletes EZH2 and DNMT1 levels and enhances decitabine mediated de-repression of JunB and loss of survival of human acute leukemia cells. Cancer Biol. Ther. 8, 939–950 (2009).

44. Chang, Y. et al. Structural basis for G9a-like protein lysine methyltransferase inhibition by BIX-01294. Nature Struct. Mol. Biol. 16, 312–317 (2009).

45. Allan, M. et al. N-Benzyl-1-heteroaryl-3-(trifluoromethyl)-1H-pyrazole-5-carboxamides as inhibitors of co-activator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 1218–1223 (2009). The first examples of potent, drug-like inhibitors of a human PMT.

46. Purandare, A. V. et al. Pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 18, 4438–4441 (2008).

47. Huynh, T. et al. Optimization of pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 2924–2927 (2009).

48. Copeland, R. A., Gontarek, R. & Luo, L. in Textbook of Drug Design and Discovery 4th edn Ch. 12 (eds. Krogsgaard-Larsen, P., Madsen, U. & Stromgaard, K.) 378–407 (Taylor and Francis, New York, 2009).

49. Troffer-Charlier, N., Cura, V., Hassenboehler, P., Moras, D. & Cavarelli, J. Functional insights from structures of coactivator-associated arginine methyltransferase 1 domains. EMBO J. 26, 4391–4401 (2007).

50. Couture, J.-F., Collazo, E., Brunzelle, J. S. & Trievel, R. C. Structural and functional analysis of SET8, a histone H4 Lys-20 methyltransferase. Genes Dev. 19, 1455–1465 (2005).

51. Ma, W. W. & Adjei, A. A. Novel agents on the horizon for cancer therapy. CA Cancer J. Clin. 59, 111–137 (2009). A review of the current knowledge on how aberrant epigenetic mechanisms can contribute to the development of cancer and the progress in developing therapies that target these mechanisms.

52. Cortez, C. C. & Jones, P. A. Chromatin, cancer and drug therapies. Mutat. Res. 647, 44–51 (2008).

53. Kang, M. Y. et al. Association of the SUV39H1 histone methyltransferase with the DNA methyltransferase 1 at mRNA expression level in primary colorectal cancer. Int. J. Cancer 121, 2192–2197 (2007).

54. Watanabe, H. et al. Deregulation of histone lysine methyltransferases contributes to oncogenic transformation of human bronchoepithelial cells. Cancer Cell Int. 8, 15 (2008).

55. Kondo, Y. et al. Downregulation of histone H3 lysine 9 methyltransferase G9a induces centrosome disruption and chromosome instability in cancer cells. PLoS One 3, e2037 (2008).

56. Tkachuk, D., Kohler, S. & Cleary, M. L. Involvement of a homolog of Drosophila trithorax by 11q23 chromosomal translocations in acute leukemias. Cell 71, 691–700 (1992).

57. Gu, Y. et al. The t(4;11) chromosome translocation of human acute leukemias fuses the ALL-1 gene, related to Drosophila trithorax, to the AF-4 gene. Cell 71, 701–708 (1992).

58. Liedtke, M. & Cleary, M. L. Therapeutic targeting of MLL. Blood 113, 6061–6068 (2009).

59. Wang, G. G., Cai, L., Pasillas, M. P. & Kamps, M. P. NUP98-NSD1 links H3K36 methylation to Hox-A gene activation and leukaemogenesis. Nature Cell Biol. 9, 804–812 (2007).

60. Marango, J. et al. The MMSET protein is a histone methyltransferase with characteristics of a transcriptional corepressor. Blood 111, 3145–3154 (2008).

61. Kim, J. Y. et al. Multiple-myeloma-related WHSC1/MMSET isoform RE-IIBP is a histone methyltransferase with transcriptional repression activity. Mol. Cell Biol. 28, 2023–2034 (2008).

62. Lauring, J. et al. The multiple myeloma associated MMSET gene contributes to cellular adhesion, clonogenic growth, and tumorigenicity. Blood 111, 856–864 (2008).

63. Angrand, P. O. et al. NSD3, a new SET domain-containing gene, maps to 8p12 and is amplified in human breast cancer cell lines. Genomics 74, 79–88 (2001).

64. Rosati, R. et al. NUP98 is fused to the NSD3 gene in acute myeloid leukemia associated with t(8;11)(p11.2;p15). Blood 99, 3857–3860 (2002).

65. Tonon, G. et al. High-resolution genomic profiles of human lung cancer. Proc. Natl Acad. Sci. USA 102, 9625–9630 (2005).

66. Okada, Y. et al. hDOT1L links histone methylation to leukemogenesis. Cell 121, 167–178 (2005).

67. Bitoun, E., Oliver, P. L. & Davies, K. E. The mixed-lineage leukemia fusion partner AF4 stimulates RNA polymerase II transcriptional elongation and mediates coordinated chromatin remodeling. Hum. Mol. Genet. 16, 92–106 (2007).

68. Hamamoto, R. et al. SMYD3 encodes a histone methyltransferase involved in the proliferation of cancer cells. Nature Cell Biol. 6, 731–740 (2004).

69. Hamamoto, R. et al. Enhanced SMYD3 expression is essential for the growth of breast cancer cells. Cancer Sci. 97, 113–118 (2006).

70. Bracken, A. P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).

71. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).

72. Subramanian, K. et al. Regulation of estrogen receptor alpha by the SET7 lysine methyltransferase. Mol. Cell 30, 336–347 (2008).

73. Nishikawa, N. et al. Gene amplification and overexpression of PRDM14 in breast cancers. Cancer Res. 67, 9649–9657 (2007).

74. Majumder, S., Liu, Y., Ford, O. H., 3rd, Mohler, J. L. & Whang, Y. E. Involvement of arginine methyltransferase CARM1 in androgen receptor function and prostate cancer cell viability. Prostate 66, 1292–1301 (2006).

75. Frietze, S., Lupien, M., Silver, P. A. & Brown, M. CARM1 regulates estrogen-stimulated breast cancer growth through up-regulation of E2F1. Cancer Res. 68, 301–306 (2008).

76. Zhao, Q. et al. PRMT5-mediated methylation of histone H4R3 recruits DNMT3A, coupling histone and DNA methylation in gene silencing. Nature Struct. Mol. Biol. 16, 304–311 (2009).

77. Patnaik, D. et al. Substrate specificity and kinetic mechanism of mammalian G9a histone H3 methyltransferase. J. Biol. Chem. 279, 53248–53258 (2004).

78. Chin, H. G., Patnaik, D., Esteve, P.-O., Jacobsen, S. E. & Pradhan, S. Catalytic properties and kinetic mechanism of human recombinant lys-9 histone H3 methyltransferase SUV39H1: participation of the chromodomain in enzymatic catalysis. Biochemistry 45, 3272–3284 (2006).

79. Greiner, D., Bonaldi, T., Eskeland, R., Roemer, E. & Imhof, A. Identification of a specific inhibitor of the histone methyltransferase SU(VAR)3–9. Nature Chem. Biol. 1, 143–145 (2005).

80. Kubicek, S. et al. Reversal of H3K9me2 by a small-molecule inhibitor for the G9a histone methyltransferase. Mol. Cell 25, 473–481 (2007).

Acknowledgements
We are grateful to K. Shiosaki, C. T. Walsh, H. R. Horvitz, Y. Zhang, and R. Gould for their insights, constant support and encouragement. We also thank K. Boater, E. Olhava, L. Jin and T. Luly for expert help in preparation of this manuscript.

Competing interests statement
The authors declare competing financial interests: see web version for details.

DATABASES
UniProtKB: http://www.uniprot.org
CARM1 | DOT1L | EHMT2 | EZH2 | PRMT1 | SETD7 | SETD8 | SETD1A | SETDB1 | SUZ12

FURTHER INFORMATION
Author’s homepage: http://www.epizyme.com



Protein methyltransferases as a target class for drug discovery

Robert A. Copeland, Michael E. Solomon and Victoria M. Richon

First published in Nature Reviews Drug Discovery 8, 724–732 (2009); doi: 10.1038/nrd2974

Epizyme, Inc., 840 Memorial Drive, Cambridge, Massachusetts 02139, USA.
Correspondence to R.A.C. e-mail: [email protected]

Abstract | The protein methyltransferases (PMTs), which methylate protein lysine and arginine residues and have crucial roles in gene transcription, are emerging as an important group of enzymes that play key parts in normal physiology and human diseases. The collection of human PMTs is a large and diverse group of enzymes that have a common mechanism of catalysis. Here, we review the biological, biochemical and structural data that together present PMTs as a novel, chemically tractable target class for drug discovery.

Cellular differentiation is one of the most important components of embryonic development and postnatal tissue maintenance and repair. Almost every nucleated cell of the human body contains the same, complete complement of genomic DNA. However, the ability of pluripotent cells to differentiate into distinct lineages and ultimate cell types is conferred by specific patterns of transcription of subsets of genes in the genome. A large and growing body of data supports the idea that epigenetic regulation of gene transcription is a key biological determinant of cellular differentiation (REF. 1).

The chromosomes within eukaryotic cell nuclei are packaged together with structural proteins (histones) to form the complex known as chromatin. Four major histones (H2A, H2B, H3 and H4) form an octameric, disc-shaped aggregate, composed of two copies of each histone type, around which the DNA is wound to form regular, repeating units known as nucleosomes (FIG. 1). Chromatin exists in two main conformational states: a condensed state (heterochromatin) in which the nucleosomes are tightly packed together and gene transcription is largely repressed; and a more relaxed state (euchromatin) in which gene transcription is activated. Epigenetic regulation of gene transcription is mediated by selective, enzyme-catalysed, covalent modification of specific nucleotides within the genes and also by post-translational modifications of the histone proteins (FIG. 1). Modification of DNA can silence gene transcription directly, whereas the post-translational modifications of histones control the conformational transition between the heterochromatin and euchromatin states (REF. 2). The enzymes that covalently modify DNA and histones are therefore the key mediators of epigenetic regulation of gene transcription.

Several putative epigenetic enzymes have recently been identified and, in some cases, their catalytic mechanism and three-dimensional structures have been determined (REFS 2,3). Epigenetic enzymes that are encoded in the human genome catalyse group transfer reactions and can be categorized according to the nature of the covalent modifications that they catalyse and by the substrates upon which they act. In humans, these enzymes include DNA methyltransferases (DNMTs), which methylate the carbon atom at the 5-position of cytosine in the CpG dinucleotide sites of the genome; protein methyltransferases (PMTs), which methylate lysine or arginine residues on histones and other proteins; protein demethylases, which remove methyl groups from the lysine or arginine residues of proteins; histone acetyltransferases, which acetylate lysine residues on histones and other proteins; histone deacetylases (HDACs), which remove acetyl groups from lysine residues on histones and other proteins; ubiquitin ligases, which add ubiquitin to lysine residues on histones and other proteins; and specific kinases that phosphorylate serine residues on histones (REFS 4,5).

Given that small-molecule inhibitors have been successfully designed for HDACs and DNMTs (discussed below), it is likely that additional families of histone-modifying enzymes will also be amenable to small-molecule modulation. The opportunity for chemical-probe development and pharmacological control of epigenetic gene transcription is therefore of great interest in the fields of basic biology and drug discovery (REFS 4,5). Indeed, the role of these enzymes in human diseases is highlighted by the recent approval of three drugs by the US Food and Drug Administration (REF. 6) that act as selective, small-molecule inhibitors of HDACs and DNMTs for the treatment of specific human cancers (TABLE 1).

In recent years, there have been numerous reviews in the literature that highlight different aspects of the biology, disease association and/or structural biology of various histone-modifying enzymes. In this Review, we focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

Epigenetics
A stably heritable change in phenotype or gene expression in an organism or cell, resulting from changes in a chromosome that are not caused by a change in DNA sequence. The process of eukaryotic cell differentiation is one of the most well-known examples of epigenetic changes.

Target class
A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM
S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

[Figure 1 panel labels: Me, Ac, Ub and P marks on K, R and S residues of the histone tail; PKMTs, 52 family members; PRMTs, ≥10 family members; demethylases, ~30 family members; deacetylases, ~18 family members; acetyltransferases; kinases; ligases.]

PKMTs and PRMTs in human disease

In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance (REFS 7–15).

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes (REFS 14–16). Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas (REF. 15). In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer (REF. 10). In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis (REFS 10,15). Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2.

Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer (REFS 7–9). For example, SET domain, bifurcated 1 (SETDB1) (REF. 17) and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4) (REF. 18) have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7) (REF. 19), CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A) (REF. 22) and CARM1 (REF. 23) have been associated with viral infections involving herpes simplex virus and human T-lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts (REFS 4,5).
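The site nomenclature used above (for example H3K27, lysine 27 of histone H3) is regular enough to be decoded mechanically. The following is a minimal Python sketch, not part of the original review; the names `HistoneMark` and `parse_mark` are hypothetical, and the pattern assumes only simple lysine (K) or arginine (R) sites with an optional methylation degree such as the `me2` in H3K9me2.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed, simplified grammar: histone name, one-letter residue code (K or R),
# residue position, then an optional "me1"/"me2"/"me3" methylation degree.
_MARK = re.compile(r"^(H2A|H2B|H3|H4)([KR])(\d+)(?:me([123]))?$")
_RESIDUES = {"K": "lysine", "R": "arginine"}


@dataclass(frozen=True)
class HistoneMark:
    histone: str                  # e.g. "H3"
    residue: str                  # "lysine" or "arginine"
    position: int                 # residue number within the histone
    methyl_groups: Optional[int]  # 1, 2 or 3; None if the label names only the site


def parse_mark(label: str) -> HistoneMark:
    """Decode a label such as 'H3K27' or 'H3K9me2' into its components."""
    match = _MARK.match(label)
    if match is None:
        raise ValueError(f"unrecognised histone mark: {label!r}")
    histone, code, position, degree = match.groups()
    return HistoneMark(
        histone=histone,
        residue=_RESIDUES[code],
        position=int(position),
        methyl_groups=int(degree) if degree else None,
    )


if __name__ == "__main__":
    for label in ("H3K27", "H3K9me2", "H4R3"):
        print(label, "->", parse_mark(label))
```

The sketch deliberately ignores other modification types (acetylation, phosphorylation, ubiquitylation) and the symmetric or asymmetric forms of arginine dimethylation, which would need a richer grammar.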

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains (REF. 3). Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also be stringent (discussed below). Clearly, the structure of each enzyme is unique (REFS 16,24), as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class (REFS 25–27). Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified (REFS 24,28). With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

SAH
S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate (REFS 14,29). For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes (REF. 29). In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner (REFS 14,16): the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.
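The family grouping and the counts quoted above can be checked with a few lines of arithmetic. This is an illustrative Python sketch only: the variable and function names are hypothetical, and the numbers are taken directly from the text (24 PKMTs in the 2007 nomenclature study, 51 SET domain proteins plus DOT1L in the extended survey).

```python
# PKMT families as listed in the text: seven major SET domain families,
# an eighth "others" group, and DOT1L as the single non-SET domain family.
SET_DOMAIN_FAMILIES = ["SUV39", "SET1", "SET2", "RIZ", "SMYD", "EZ", "SUV4-20"]
ADDITIONAL_FAMILIES = ["others", "DOT1L"]

PKMTS_IN_2007_NOMENCLATURE = 24  # count reported in the 2007 nomenclature study
SET_DOMAIN_PROTEINS = 51         # count from the extended survey quoted in the text
NON_SET_DOMAIN_PKMTS = 1         # DOT1L


def pkmt_summary() -> dict:
    """Return the family count, total PKMT count and fold increase over 2007."""
    total = SET_DOMAIN_PROTEINS + NON_SET_DOMAIN_PKMTS
    return {
        "families": len(SET_DOMAIN_FAMILIES) + len(ADDITIONAL_FAMILIES),
        "total_pkmts": total,
        "fold_increase": total / PKMTS_IN_2007_NOMENCLATURE,
    }


if __name__ == "__main__":
    print(pkmt_summary())
```

A fold increase above 2 is what the text means by "more than doubled".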

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established disease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site

The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism (REFS 3,24,30). The lone pair electrons of a nitrogen atom (from lysine or arginine) ‘attack’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.
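The methyl transfer described above can be summarized as a reaction scheme. The scheme below is a schematic sketch rather than a reproduction of FIG. 2: it shows a lysine side chain as the nucleophile, writes the adenosyl group as Ado, uses R′ as an assumed shorthand for the amino acid portion shared by SAM and SAH, marks the transition state with ‡, and simplifies protonation states.

```latex
\[
\mathrm{Lys{-}NH_2} \;+\; \mathrm{Ado{-}S^{+}(CH_3){-}R'}
\;\longrightarrow\;
\bigl[\,\mathrm{Lys{-}H_2N \cdots CH_3 \cdots S(Ado)R'}\,\bigr]^{+\,\ddagger}
\;\longrightarrow\;
\mathrm{Lys{-}NH_2^{+}{-}CH_3} \;+\; \mathrm{Ado{-}S{-}R'}
\]
```

Here the first species on the left is the substrate nitrogen, the second is SAM, and the products are the methylated side chain and SAH. Net charge is conserved: the positive charge carried by the sulphonium centre of SAM ends up on the methylated nitrogen, which can subsequently lose a proton.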

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery (REFS 31,32). Furthermore, despite binding a

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*

DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome

HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.



focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human diseaseIn surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes (REFS 14–16). Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas (REF. 15). In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer (REF. 10). In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis (REFS 10,15). Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2.

Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer (REFS 7–9). For example, SET domain, bifurcated 1 (SETDB1) (REF. 17) and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4) (REF. 18) have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7) (REF. 19), CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A) (REF. 22) and CARM1 (REF. 23) have been associated with viral infections involving herpes simplex virus and human T-lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts (REFS 4,5).
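The site nomenclature introduced above (H3K27 for lysine 27 of histone H3) packs three pieces of information into one label. As an illustration only, the following minimal Python sketch splits such a label into its parts. The names `parse_site`, `HistoneSite` and `SITE_PATTERN` are hypothetical helpers invented for this example, and the pattern assumes labels of the simple form histone, residue letter (K or R), position, with no modification suffix.

```python
import re
from typing import NamedTuple

# Hypothetical helper for illustration; not part of the article.
# One-letter residue codes for the two side chains methylated by PMTs.
RESIDUES = {"K": "lysine", "R": "arginine"}

# Assumed label shape: histone name, residue letter, residue position.
SITE_PATTERN = re.compile(r"^(H(?:1|2A|2B|3|4))([KR])(\d+)$")


class HistoneSite(NamedTuple):
    histone: str   # e.g. "H3"
    residue: str   # "lysine" or "arginine"
    position: int  # residue number within the histone sequence


def parse_site(label: str) -> HistoneSite:
    """Split a label such as 'H3K27' into histone, residue and position."""
    match = SITE_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a recognised histone site label: {label!r}")
    histone, code, position = match.groups()
    return HistoneSite(histone, RESIDUES[code], int(position))


if __name__ == "__main__":
    for label in ("H3K27", "H3K9", "H4K20", "H3R17"):
        site = parse_site(label)
        print(f"{label}: {site.residue} {site.position} of histone {site.histone}")
```

Non-histone substrates such as p53 do not follow this pattern, so the sketch rejects them rather than guessing.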

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains (REF. 3). Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also be stringent (discussed below). Clearly, the structure of each enzyme is unique (REFS 16,24), as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class (REFS 25–27). Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified (REFS 24,28). With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

SAH
S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate (REFS 14,29). For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes (REF. 29). In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner (REFS 14,16): the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well-established disease associations (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site

The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism (REFS 3,24,30). The lone pair electrons of a nitrogen atom (from lysine or arginine) ‘attack’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.
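As a schematic of the methyl transfer just described, the first methylation of a lysine side chain can be written as below. This is a sketch under simplifying assumptions rather than a scheme taken from the article: R–NH2 stands for the neutral side-chain amino group of the substrate lysine, SAM carries the positive sulphonium charge, the bracketed species is the penta-coordinate transition state, and the proton lost from nitrogen after transfer is shown explicitly so that the charges balance.

```latex
\[
\mathrm{R{-}NH_2} \;+\; \mathrm{SAM^{+}}
\;\longrightarrow\;
\bigl[\,\mathrm{R{-}H_2N}\cdots\mathrm{CH_3}\cdots\mathrm{S}\,\bigr]^{\ddagger}
\;\longrightarrow\;
\mathrm{R{-}NH{-}CH_3} \;+\; \mathrm{SAH} \;+\; \mathrm{H^{+}}
\]
```

Under the same assumptions, repeating the step on the product would give the di- and tri-methylated states mentioned earlier, each consuming one SAM and releasing one SAH.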

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery (REFS 31,32). Furthermore, despite binding a

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*

DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome

HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.

common natural ligand, the ATP-binding pockets of protein kinases have afforded medicinal chemists a rich diversity of chemical scaffolds, which have resulted in a range of drug molecules of varying degrees of target selectivity (REF. 32). Similarly, the commonality of SAM use by the PMTs belies the structural, biological and pathological diversity of these enzymes. From the perspective of drug discovery and medicinal chemistry, the diversity of SAM-binding modes and catalytic mechanisms of these enzymes is of key importance.

A common structural feature of PKMTs and PRMTs that distinguishes these enzymes from other proteins that use SAM is the overall architecture of their extended catalytic active sites. This generally consists of a SAM-binding pocket that is accessed from one face of the protein, and a narrow, hydrophobic, acceptor (that is, lysine or arginine) channel that extends to the opposite face of the protein surface, such that the two substrates enter the active site from opposite sides of the enzyme surface.

Table 2 | Selected PKMTs and PRMTs that have shown an association with human cancers

Protein methyltransferase

Methylation substrates

cancers cancer association refs

SUV39H1 H3K9 Colon cancer Increased expression in colorectal tumours; associated with transcriptional repression

53

EHMT2 H3K9 Lung, prostate and hepatocellular carcinoma

Increased expression in lung cancer cell lines; regulates centrosome duplication, presumably through chromatin structure

54,55

MLL | H3K4 | Leukaemia | Chromosomal aberrations involving MLL are a cause of acute leukaemias; the SET domain is lost in translocation | 56–58

NSD1 | H3K36 | Acute myeloid leukaemia | Translocation fuses NSD1 to nucleoporin 98 in human acute myeloid leukaemia | 59

WHSC1 | H3K36 and H4K20 | Myeloma | Translocated and increased expression in myeloma; associated with transcriptional regulation | 60–62

WHSC1L1 | H3K4 | Lung and breast cancers, and childhood acute myeloid leukaemia | Amplified in lung cancer and breast cancer; translocation with nucleoporin 98; mediates transcriptional activation | 63–64

DOT1L | H3K79 | MLL-rearranged leukaemias | Recruited by MLL fusion partners MLLT1, MLLT2, MLLT3 and MLLT10 to homeobox genes; associated with transcriptional activation and elongation | 11, 66, 67

SMYD3 | H3K4 | Breast, liver, colon and gastric cancers | Overexpressed in multiple tumour types; associated with transcriptional activation | 68, 69

EZH2 | H3K27 | Breast, prostate, colon, gastric, bladder and liver cancers, melanoma and lymphoma | Amplified and increased expression in several tumour types; a member of the polycomb repressive complex 2; associated with transcriptional repression | 10, 15, 70, 71

SETD7 | H3K4 | Breast cancers | SET7-mediated methylation stabilizes the oestrogen receptor and is necessary for the recruitment of the oestrogen receptor to its target genes and target gene transactivation | 72

PRDM14 | No known substrate | Breast cancers | Amplified and overexpressed in cancers; associated with transcriptional repression | 73

CARM1 | H3R17, EP300–CBP and NCOA3 | Breast and prostate cancers | Increased expression correlates with androgen independence in human prostate carcinoma; overexpressed in breast tumours and associated with transcriptional activation | 74, 75

PRMT5 | H3R8, p53, SNRPD1, SNRPD3 and SUPT5H | Lymphoma | PRMT5 expression and H3R8 methylation levels are increased in lymphoid cancer cells; PRMT5 mediates p53 methylation, which promotes cell arrest rather than cell death; H4R3 methylation promotes recruitment of DNMT3A, subsequent promoter CpG methylation and gene silencing | 12, 76

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); CBP, CREB-binding protein; DNMT3A, DNA (cytosine-5-)-methyltransferase 3α; EHMT2, euchromatic histone-lysine N-methyltransferase 2 (also known as G9A and KMT1C); EP300, E1A-binding protein p300; EZH2, enhancer of zeste homologue 2 (also known as KMT6); DOT1L, DOT1-like, histone H3 methyltransferase (also known as KMT4); MLL, myeloid, lymphoid or mixed-lineage leukaemia (also known as KMT2A); MLLT1, myeloid, lymphoid or mixed-lineage leukaemia, translocated to 1; NCOA3, nuclear receptor coactivator 3; NSD1, nuclear receptor-binding SET domain protein 1; PKMT, protein lysine methyltransferase; PRDM14, PR domain-containing protein 14; PRMT, protein arginine methyltransferase; SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SMYD3, SET and MYND domain-containing protein 3; SNRPD1, small nuclear ribonucleoprotein D1 polypeptide 16 kDa (also known as SMD1); SNRPD3, small nuclear ribonucleoprotein D3 polypeptide 18 kDa (also known as SMD3); SUPT5H, suppressor of Ty 5 homologue; SUV39H1, suppressor of variegation 3–9 homologue 1 (also known as KMT1A); WHSC1, Wolf–Hirschhorn syndrome candidate 1 (also known as MMSET and NSD2); WHSC1L1, Wolf–Hirschhorn syndrome candidate 1-like protein 1 (also known as NSD3).

R E V I E W S

NATURE REVIEWS | Drug Discovery VOLUME 8 | SEPTEMBER 2009 | 727


SAH: S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

be stringent (discussed below). Clearly, the structure of each enzyme is unique (REFS 16,24), as is the resulting biology and pathobiology associated with each enzyme. Yet the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class (REFS 25–27). Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified (REFS 24,28). With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore of the regulation of gene transcription.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate (REFS 14,29). For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes (REF. 29). In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner (REFS 14,16): the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well-established disease associations (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site

The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism (REFS 3,24,30). The lone pair electrons of a nitrogen atom (from lysine or arginine) ‘attack’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.
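As a compact restatement of the mechanism just described, the methyl transfer to lysine can be written as a single scheme. This is a sketch only: Lys–NH2 denotes the neutral lysine ε-amine (an arginine nitrogen would behave analogously), the notation is ours rather than the authors', and the bracketed species is the penta-coordinate carbon transition state.

```latex
\[
\mathrm{Lys{-}NH_2} \;+\; \mathrm{SAM}^{+}
\;\longrightarrow\;
\left[\,\mathrm{Lys{-}H_2N}\cdots\mathrm{CH_3}\cdots\mathrm{S(SAH)}\,\right]^{+\,\ddagger}
\;\longrightarrow\;
\mathrm{Lys{-}NH_2^{+}{-}CH_3} \;+\; \mathrm{SAH}
\]
```

Charge is conserved across the scheme: the sulphonium cation of SAM becomes the neutral thioether SAH, and the positive charge ends up on the newly methylated nitrogen.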

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery (REFS 31,32). Furthermore, despite binding a

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*

DNMT inhibitors

5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome

Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome

HDAC inhibitors

Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma

Romidepsin | FK228 | New drug application filing

Panobinostat | LBH-589 | Phase II

Belinostat | PXD-101 | Phase II

Entinostat | MS-275, SNDX-275 | Phase II

MGCD-0103 | MG-0103 | Phase II

JNJ-26481585 | None | Phase I

Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.


[Figure 2 graphic. Panel a: PMT-catalysed conversion of SAM to SAH with transfer of the methyl group to a lysine side-chain nitrogen. Panel b: generalized SN2 scheme showing the nucleophile (Nu–), the leaving group (LG) and the transition state (‡).]

Crystallographic studies have revealed two distinct binding modes for SAM or SAH in the cofactor-binding pockets of PMTs (REF. 24). For the SET domain PKMTs that have been co-crystallized with SAM or SAH, the cofactor adopts a ‘U-shaped’ configuration within the active site (FIG. 3) that aligns the methylsulphonium cation of SAM at the base of the narrow lysine channel, in perfect juxtaposition to the ε-amino group of the acceptor lysine residue, which facilitates group transfer. This U-shaped configuration is induced by a conserved aspartate or glutamate residue that binds to the ribose hydroxyl groups, and a positively charged lysine or arginine residue that forms a salt bridge with the carboxylate of SAM. In striking contrast, SAM bound within the active site of PRMTs adopts an extended configuration that resembles the extended SAM configuration seen in the DNA methyltransferases; again, the binding motif results in alignment of the SAM methylsulphonium cation with the base of the acceptor-binding channel. Another distinction between cofactor binding within the PKMTs and the PRMTs is that, in PRMTs, dimer formation seems to be a crucial component of SAM binding and catalysis, whereas this is not the case for PKMTs (REFS 3,24). The mechanistic consequences of obligate dimer formation in the PRMTs are not yet clear, but it may be involved in multiple methylations of the arginine residue.

From the above discussion, it could be concluded that the configuration of the bound SAM is structurally related to the identity of the methyl acceptor nitrogen species upon which the enzymes act; that is, U-shaped for PKMTs and extended for PRMTs. However, data on the non-SET domain PKMT, DOT1L, do not support this conclusion. In the co-crystal structures of human DOT1L bound to SAM (REF. 33), and the yeast homologue Dot1p bound to SAH (REF. 34), the cofactor is bound in the extended configuration, similar to that seen in the PRMTs. Additionally, the solvent-exposed surface area of the bound cofactor in DOT1L is more similar to that seen in the PRMTs than the PKMTs, as is the overall amino-acid sequence around the cofactor-binding pocket (REFS 24,33). Therefore, from a structural perspective, DOT1L seems to link the PKMT and PRMT groups of PMTs.

The discovery and optimization of selective drugs for the PMTs will depend not only on the static structure of the active site of the enzyme, as revealed through crystallographic studies, but also on the structural dynamics of the active site that accompany catalytic turnover (REFS 27,35). Studies on the kinetic mechanisms of the PMTs may provide some information in this area.

Some of the SET domain PKMTs, such as SETD7, perform a single round of catalysis on a lysine residue, resulting in a mono-methylated product, whereas other SET domain PKMTs catalyse multiple rounds of methylation on a specific lysine residue. Crystallographic studies suggest that the difference between single-turnover and multiple-turnover SET domain enzymes results from the degree of steric crowding and hydrogen-bonding patterns in the lysine-binding channel of these enzymes (REFS 3,24,36,37). In particular, the identity of an aromatic residue within the lysine-binding pocket seems to be the key determinant of the multiplicity of lysine methylation. In the PKMT DIM5, this residue is a phenylalanine (F281), and the enzyme can tri-methylate the acceptor lysine residue of its protein substrate. The corresponding residue in SETD7 is a tyrosine (Y305), and this enzyme can only mono-methylate its protein substrate. Remarkably, the mutant F281Y transforms DIM5 into a mono-methylating PKMT, and the corresponding mutant Y305F in SETD7 results in an enzyme that is capable of multiple rounds of lysine methylation (REF. 38). These mutagenesis results have been extended to the PKMT euchromatic histone lysine N-methyltransferase 2 (EHMT2; also known as G9A) (REF. 39), and the ‘tyrosine–phenylalanine switch’ seems to be a general determinant of product specificity among the SET domain PKMTs (REF. 24). Molecular dynamics and hybrid quantum mechanical–molecular mechanical studies also suggest a key role for bound water molecules (a water channel) in the extent of lysine methylation by PKMTs (REF. 30).

An outstanding question that has yet to be reconciled with the mechanistic hypothesis described above is how the quaternary nitrogen atom is deprotonated to generate a neutral amine methyl acceptor. At physiological pH, the lysine amine is protonated (the negative logarithm of the acid dissociation constant (pKa) of the side chain amine is ~10.8 (REF. 35)), and so there are no lone pair electrons to act as the attacking nucleophile in the

Figure 2 | PMT-catalysed methylation of proteins by an SN2 reaction with SAM as the methyl donor. The protein methyltransferases (PMTs) catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) to a nitrogen atom of lysine or arginine side chains to form S-adenosyl-L-homocysteine (SAH; also known as AdoHcy). a | The methyl group (shown in red) of the SAM sulphonium cation is ‘attacked’ by the lone pair electrons of a lysine (shown here) or arginine (not shown) side-chain nitrogen atom. The reaction results in transfer of the methyl group to the attacking nitrogen atom and the production of SAH from the reaction cofactor. b | A more generalized chemical scheme of a bimolecular nucleophilic substitution (SN2) group transfer reaction, illustrating the attacking nucleophile (Nu–; lysine or arginine in the case of PMTs), the leaving group (LG; the methyl group in the case of PMTs), and the transient but essential formation of a penta-coordinate carbon transition state (‡).


[Figure 1 graphic: a nucleosome with histone-tail modifications (Me, Ac, P, Ub) and the enzyme families responsible: PKMTs (52 family members), PRMTs (≥10 family members), demethylases (~30 family members), deacetylases (~18 family members), acetyltransferases, kinases and ligases.]

Target class: A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM: S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human disease

In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance (REFS 7–15).

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes (REFS 14–16). Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas (REF. 15). In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer (REF. 10). In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis (REFS 10,15). Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2.

Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer (REFS 7–9). For example, SET domain, bifurcated 1 (SETDB1; REF. 17) and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4; REF. 18) have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7; REF. 19), CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A; REF. 22) and CARM1 (REF. 23) have been associated with viral infections involving Herpes simplex virus and human T-lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts (REFS 4,5).

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains (REF. 3). Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.


[Figure 3 graphic: cofactor conformations in three panels: a, PRMT; b, DOT1L; c, SET domain.]

General base catalysis: A mechanism that can occur in enzyme catalysis, in which a basic group accepts protons from a substrate molecule, usually to stabilize a charged transition-state species.

SN2-mediated methyl transfer reaction. A potential mechanism of deprotonation is through general base catalysis. However, inspection of the amino acids in the active sites of PKMTs reveals no obvious basic side chains that could act in this capacity. Another hypothesis is that the solvent acts as a proton ‘sponge’; however, this seems inconsistent with the fact that the lysine side chain is buried deeply in the protein, with no clear access to bulk solvent. An alternative hypothesis has recently been proposed, based on molecular dynamics simulations (REF. 30). According to this model, binding of SAM and protein substrates creates a ‘water shuttle’ that can remove a proton from the buried lysine side chain and ferry this proton along a contiguous chain of water molecules to be deposited into the bulk solvent. Additionally, the electrostatic repulsion created by the quaternary nitrogen atom and the positively charged SAM cofactor lowers the pKa of the lysine side-chain amine to ~8.2, thereby facilitating this deprotonation process. Furthermore, the water shuttle hypothesis provides an alternative mechanism to explain the differences in extent of lysine methylation by the PKMTs. The molecular dynamics studies suggest that the ability to form a water shuttle will determine the extent of methylation that is catalysed by a given enzyme. For example, simulations of SETD7-mediated catalysis suggest that mono-methylation of lysine prevents re-formation of a new water shuttle, and so this enzyme terminates catalysis after one round of methylation. The same simulations suggest that other PKMTs, such as the ribulose bisphosphate carboxylase–oxygenase large subunit methyltransferase, can readily re-form the water shuttle, leading to multiple rounds of methylation.
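To put the two pKa values quoted for the lysine side chain (~10.8 free, ~8.2 in the enzyme-bound state) into perspective, the Henderson–Hasselbalch relation gives the fraction of the amine that is neutral, and therefore nucleophilic, at a given pH. The sketch below is illustrative only: the pH of 7.4 is an assumed value for "physiological pH", and the function name is ours.

```python
def neutral_amine_fraction(pka: float, ph: float) -> float:
    """Fraction of an amine present in its neutral (deprotonated) form.

    Henderson-Hasselbalch for a base B with conjugate acid BH+:
        [B] / ([B] + [BH+]) = 1 / (1 + 10**(pKa - pH))
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PHYSIOLOGICAL_PH = 7.4  # assumption: the text says only "physiological pH"

# pKa ~10.8: value quoted in the text for the lysine side-chain amine
free_lysine = neutral_amine_fraction(10.8, PHYSIOLOGICAL_PH)
# pKa ~8.2: value suggested by the molecular dynamics simulations
bound_lysine = neutral_amine_fraction(8.2, PHYSIOLOGICAL_PH)

print(f"pKa 10.8: {free_lysine:.4%} neutral")
print(f"pKa  8.2: {bound_lysine:.1%} neutral")
print(f"fold increase in neutral amine: {bound_lysine / free_lysine:.0f}")
```

Under these assumptions, roughly 0.04% of the amine is neutral at pKa 10.8 compared with about 14% at pKa 8.2, a difference of more than two orders of magnitude in the population able to act as the nucleophile.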

Enzymes that perform multiple rounds of catalysis on a macromolecular substrate can do so by one of two mechanisms: a distributive enzyme mechanism, in which each round of catalysis results in macromolecular product dissociation and rebinding, or a processive mechanism, in which multiple rounds of catalysis proceed before dissociation of the macromolecular product. PMTs use both of these mechanisms: some SET domain PKMTs that perform multiple rounds of lysine methylation have been found to use a processive mechanism (REFS 3,24), whereas DOT1L has been shown to perform multiple rounds of H3K79 methylation through a non-processive (distributive) mechanism (REF. 40).
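The practical difference between the two mechanisms shows up in the distribution of partially methylated intermediates. The toy simulation below is a sketch under stated assumptions, not a kinetic model of any real enzyme: the limit of three methyl groups per lysine, the pool of 100 substrates and the budget of 150 catalytic events are all hypothetical, and binding is treated as a uniformly random choice.

```python
import random
from collections import Counter

MAX_METHYLS = 3  # hypothetical cap: mono-, di- or tri-methyl lysine


def distributive(n_substrates: int, n_events: int, rng: random.Random) -> Counter:
    """Each event: bind a random unsaturated substrate, add ONE methyl, release."""
    states = [0] * n_substrates
    for _ in range(n_events):
        open_sites = [i for i, m in enumerate(states) if m < MAX_METHYLS]
        if not open_sites:
            break
        states[rng.choice(open_sites)] += 1
    return Counter(states)


def processive(n_substrates: int, n_events: int, rng: random.Random) -> Counter:
    """Bind a random unsaturated substrate and methylate it to saturation before release."""
    states = [0] * n_substrates
    remaining = n_events
    while remaining > 0:
        open_sites = [i for i, m in enumerate(states) if m < MAX_METHYLS]
        if not open_sites:
            break
        i = rng.choice(open_sites)
        added = min(MAX_METHYLS - states[i], remaining)
        states[i] += added
        remaining -= added
    return Counter(states)


# Same substrate pool and the same number of methyl transfers in both cases.
dist = distributive(100, 150, random.Random(0))
proc = processive(100, 150, random.Random(0))

print("distributive:", sorted(dist.items()))  # (methyl count, number of substrates)
print("processive:  ", sorted(proc.items()))
```

In this model the processive enzyme leaves only unmodified and fully methylated substrates, whereas the distributive enzyme populates the mono- and di-methylated states as well, which is the kind of signature that distinguishes the two mechanisms experimentally.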

The PRMTs are also capable of performing multiple rounds of arginine methylation to produce either mono- or di-methylated arginine products. The PRMTs that have been studied so far follow an ordered, sequential mechanism in which SAM binds before the arginine-containing substrate, and di-methyl arginine production occurs through a processive mechanism (REF. 3). On the basis of product specificity, PRMTs can be subdivided into two types: type I PRMTs, which produce an asymmetrical N,N-dimethyl arginine (both methyl groups on the same terminal nitrogen); and type II PRMTs, which produce a symmetrical N,N′-dimethyl arginine (one methyl group on each terminal nitrogen) (REF. 3).

The variations in active-site structure and chemical mechanism that are summarized above reflect a target class with the potential for substantial chemical diversity among small-molecule modulators of individual enzymes in the class. Therefore, the opportunity for the development of different chemotypes that compete with the common, natural ligands of these enzymes (for example, SAM, lysine and arginine), and can be modified to produce enzyme-selective inhibitors, seems promising.

Known inhibitors of PMTs

Despite the convergence of data concerning PMTs, the search for potent, selective inhibitors of these enzymes has only recently begun in earnest. Some indirect approaches to inhibiting or depleting PMTs have been reported. For example, the antiviral compound 3-deazaneplanocin (DZNep) inhibits the enzyme SAH hydrolase and thereby increases intracellular levels of the universal product of PMTs, SAH (REF. 41). Product inhibition by SAH would therefore be expected for all PMTs and other SAM-dependent enzymes, with the degree of inhibition for specific enzymes being related to their relative inhibition constant (Ki) and Michaelis constant (Km) values for

Figure 3 | Variations in the configuration of SAM or SAH bound within the active sites of different PMTs. a | The representative conformation shown for the protein arginine methyltransferases (PRMTs) was taken from the crystal structure of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) bound to coactivator-associated arginine methyltransferase 1 (CARM1) (REF. 49). b | The conformation shown for DOT1-like, histone H3 methyltransferase (DOT1L) was taken from the crystal structure of S-adenosyl-L-methionine (SAM; also known as AdoMet) bound to this protein (REF. 33). c | The representative conformation shown for the protein lysine methyltransferases (PKMTs) was taken from the crystal structure of SAH bound to SET domain-containing lysine methyltransferase 8 (REF. 50). Carbon atoms are represented by grey circles; nitrogen atoms are represented by blue circles; oxygen atoms are represented by red circles; and sulphur atoms are represented by yellow circles.

R E V I E W S

NATURE REVIEWS | Drug Discovery VolUME 8 | SEPTEMBER 2009 | 729

SAH and SAM, respectively27. Similarly, the activity of all SAM-dependent enzymes in a cell could be reduced by blocking SAM biosynthesis — for example, by inhibiting dihydrofolate reductase or SAM synthase, which are two enzymes involved in SAM biosynthesis42. Also, the pan-HDAC inhibitor panobinostat has recently been shown to cause depletion of cellular levels of the PMT EZH2 (REF. 43). Although the mechanism by which this

Table 3 | Chemical structures and biochemical data for small-molecule inhibitors of PMTs

[Chemical structure drawings are not reproduced here.]

Compound | Mechanism and potency | Selectivity* | Refs
SAH | Product of the reactions catalysed by PMTs; IC50 values range from 0.1 to 20 µM | Non-selective | 77,78
Sinefungin | Natural product analogue of SAM and SAH; IC50 values range from 0.1 to 20 µM | Non-selective | 36
Chaetocin | SAM-competitive inhibitor of SUV39; IC50 = 0.6 µM | >4-fold | 79
BIX-01294 | SAM-non-competitive inhibitor of EHMT2; IC50 = 2.7 µM | >4-fold | 80
Methylgene compound 7a of REF. 45 | CARM1 inhibitor; IC50 = 60 nM | >100-fold for PRMT1 and SETD7 | 45
Bristol–Myers Squibb compound 7f of REF. 47 | CARM1 inhibitor; IC50 = 40 nM | >100-fold for PRMT1 and PRMT3 | 46,47

CARM1, coactivator-associated arginine methyltransferase 1 (also known as PRMT4); EHMT2, euchromatic histone lysine N-methyltransferase 2 (also known as G9A and KMT1C); IC50, half-maximal inhibitory concentration; PMT, protein methyltransferase; PRMT, protein arginine methyltransferase; SAH, S-adenosyl-L-homocysteine (also known as AdoHcy); SAM, S-adenosyl-L-methionine (also known as AdoMet); SETD7, SET domain-containing lysine methyltransferase 7 (also known as KMT7); SUV39, suppressor of variegation 3–9. *Selectivity is given as the ratio of the IC50 value for the most potent inhibition at a non-target PMT over the IC50 value for the primary target. See REF. 27.
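The selectivity definition in the Table 3 footnote amounts to a simple ratio, which can be sketched in a few lines of Python. This is a minimal illustration, assuming all IC50 values are expressed in the same unit. The function name `selectivity_ratio` and the off-target IC50 values are hypothetical and chosen only for illustration; only the 60 nM primary IC50 is taken from the table.

```python
def selectivity_ratio(primary_ic50, off_target_ic50s):
    """Fold selectivity as defined in the Table 3 footnote.

    primary_ic50     -- IC50 at the primary target
    off_target_ic50s -- mapping of non-target PMT name -> IC50 (same unit)
    """
    if primary_ic50 <= 0:
        raise ValueError("primary IC50 must be positive")
    if not off_target_ic50s:
        raise ValueError("at least one off-target IC50 is required")
    # The most potent off-target inhibition is the smallest off-target IC50.
    most_potent_off_target = min(off_target_ic50s.values())
    return most_potent_off_target / primary_ic50


# Primary IC50 of 60 nM as listed for a CARM1 inhibitor in Table 3;
# the off-target IC50 values (nM) are hypothetical placeholders.
off_targets_nm = {"PRMT1": 9000.0, "SETD7": 30000.0}
fold = selectivity_ratio(60.0, off_targets_nm)
print(f"{fold:.0f}-fold selective")  # prints "150-fold selective"
```

Because the ratio uses the smallest off-target IC50, a single potently inhibited non-target enzyme sets the reported selectivity, however weakly the others are inhibited.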

[Figure 1 labels. Marks shown: Me, Ac, Ub and P. Enzyme families: PKMTs (52 family members); PRMTs (≥10 family members); demethylases (~30 family members); deacetylases (~18 family members); acetyltransferases; kinases; ligases.]

Target class
A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM
S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human disease

In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer10. In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15. Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2.

Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving herpes simplex virus and human T-cell lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or trimethylation) that is catalysed by a particular enzyme can also

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

SAH
S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human non-SET domain PKMT, can be considered a ninth family of PKMTs.

Our group has recently extended this work to systematically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DOT1L) (L.F. Jerva, K.O. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well-established disease associations (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active site

The pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic substitution (SN2) methyl transfer mechanism3,24,30. The lone pair electrons of a nitrogen atom (from lysine or arginine) ‘attack’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relocation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-L-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.
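The methyl transfer described above can be summarized as a schematic reaction. The notation is an illustrative assumption rather than the authors' own scheme: R–NH2 stands for the neutral side-chain nitrogen of the substrate (written here for lysine), the bracketed species is the penta-coordinate carbon transition state, and the released proton is included only to balance charge.

```latex
\[
\mathrm{R{-}NH_2} \;+\; \mathrm{SAM}
\;\longrightarrow\;
\bigl[\mathrm{N \cdots CH_3 \cdots S}\bigr]^{\ddagger}
\;\longrightarrow\;
\mathrm{R{-}NH{-}CH_3} \;+\; \mathrm{SAH} \;+\; \mathrm{H^+}
\]
```

The linear N···C···S arrangement in the bracketed species corresponds to the 180° angle of attack relative to the leaving group.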

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases — another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery31,32. Furthermore, despite binding a

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*
DNMT inhibitors
5-azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome
HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.

Structure–activity relationship
The relationship between the chemical structure of a compound and its pharmacological activity.

occurs is not yet fully understood, an approach of this type would nevertheless deplete the protein levels of EZH2 and so abolish the PMT catalytic activity of the enzyme along with any other non-enzymatic functions of EZH2.

Direct inhibitors of PMTs have recently been reviewed4, along with other probes of histone-modifying enzymes. Some natural ligands for these enzymes have been known for some time, including the reaction product, SAH, and a natural inhibitor isolated from Streptomyces spp. cultures, sinefungin (TABLE 3). More selective inhibitors have been identified for SUV39 (chaetocin; reported half-maximal inhibitory concentration (IC50) = 0.6 μM) and for EHMT2 (BIX-01294; reported IC50 = 1.6 μM), but no further optimization of these compounds has been reported to date4. A co-crystal structure of BIX-01294 bound to EHMT2 has recently been published44. Surprisingly, the compound was found to bind to the enzyme non-competitively with respect to SAM, in a groove that is normally occupied by a portion of the protein substrate.

More recently, two groups have reported potent, selective, pyrazole-based inhibitors of the PRMT CARM1 (REFS 45–47) (TABLE 3). These compounds are the first examples of inhibitors of a specific PMT that are effective at nanomolar concentrations and display >100-fold selectivity for the primary target over related enzymes. The compound series reported by Methylgene45 was found to be inactive in cellular assays; no cellular data have been reported for the compound series from Bristol–Myers Squibb46,47. Therefore, although an exciting first step has been made towards developing selective inhibitors of PMTs, substantial work remains to be done before these findings can be translated into pharmacologically tractable species.

The paucity of potent, selective, pharmacologically tractable inhibitors of the PMTs creates a crucial therapeutic gap which medicinal chemists should strive to fill. As described here, the pathobiological relevance of these enzymes, together with the structural and mechanistic information that suggests their druggability as a target class, converge to make the PMTs an attractive and important class of novel enzymes for contemporary drug discovery.

Conclusions

There is a growing body of evidence that enzymes in this target class have important pathogenic roles in human diseases. The structures and enzymatic mechanisms of the PMTs support the view that pharmacological modulation of these enzymes by small-molecule inhibitors will be an effective means of therapeutic intervention in cancer and numerous other unmet medical needs. The discovery of small-molecule inhibitors of PMTs as starting points for drug development should clearly be a key focus of new research efforts. Beyond this goal, there are many opportunities to use chemical probes of PMT function to define the underlying biology and pathobiology that are associated with protein modification by these enzymes. The nature of PMT catalysis, and the available structural information about these enzymes, should facilitate the discovery of PMT ligands through mechanism- and structure-guided discovery methods48, as well as methods that do not rely on mechanistic knowledge, such as high-throughput screening of diverse chemical libraries.

A key remaining question when considering the PMTs as a drug discovery target class is whether or not selective inhibition of particular enzymes can be achieved through targeting the SAM-binding pocket. This is analogous to the question that hindered the early acceptance of protein kinases as drug targets: whether it was possible to achieve selectivity among the ATP-binding pockets of the kinases. In retrospect, it is clear that the diversity of binding-site architecture and the binding-site dynamics associated with enzyme catalysis provide ample opportunities for selective inhibition of kinases through medicinal chemistry efforts. Will the same be true for the SAM-binding pockets of PMTs? Ultimately, structure–activity relationship profiles, selectivity and collateral inhibition of off-target enzymes by PMT inhibitors will need to be determined empirically. Despite these limitations, it is our hope that the data presented here will help to stimulate systematic exploration of the human PMT target class towards the goal of developing selective inhibitors of PMTs as therapeutic agents for human diseases.

1. Strahl, B. D. & Allis, C. D. The language of covalent histone modifications. Nature 403, 41–45 (2000).

2. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693–705 (2007). A thorough overview of post-translational modifications on core histones, the enzymes that mediate these modifications and the biological functions of the modifications.

3. Smith, B. C. & Denu, J. M. Chemical mechanisms of histone lysine and arginine modifications. Biochim. Biophys. Acta 1789, 45–57 (2008). An excellent review of the chemical biology of lysine- and arginine-modifying enzymes.

4. Cole, P. A. Chemical probes for histone-modifying enzymes. Nature Chem. Biol. 4, 590–597 (2008).

5. Keppler, B. R. & Archer, T. K. Chromatin-modifying enzymes as therapeutic targets — Part 1. Expert Opin. Ther. Targets 12, 1301–1312 (2008).

6. Pray, L. At the flick of a switch: epigenetic drugs. Chem. Biol. 15, 640–641 (2008).

7. Jones, P. A. & Baylin, S. B. The epigenomics of cancer. Cell 128, 683–692 (2007).

8. Wilson, C. B., Rowell, E. & Sekimata, M. Epigenetic control of T-helper-cell differentiation. Nature Rev. Immunol. 9, 91–105 (2009).

9. Tsankova, N., Renthal, W., Kumar, A. & Nestler, E. J. Epigenetic regulation in psychiatric disorders. Nature Rev. Neurosci. 8, 355–367 (2007).

10. Kleer, C. G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl Acad. Sci. USA 100, 11606–11611 (2003).

11. Krivtsov, A. V. et al. H3K79 methylation profiles define murine and human MLL-AF4 leukemias. Cancer Cell 14, 355–368 (2008).

12. Jansson, M. et al. Arginine methylation regulates the p53 response. Nature Cell Biol. 10, 1431–1439 (2008).

13. Hong, H. et al. Aberrant expression of CARM1, a transcriptional coactivator of androgen receptor, in the development of prostate carcinoma and androgen-independent status. Cancer 101, 83–89 (2004).

14. Schneider, R., Bannister, A. J. & Kouzarides, T. Unsafe SETs: histone lysine methyltransferases and cancer. Trends Biochem. Sci. 27, 396–402 (2002).

15. Simon, J. A. & Lange, C. A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).

16. Dillon, S. C., Zhang, X., Trievel, R. C. & Cheng, X. The SET-domain protein superfamily: protein lysine methyltransferases. Genome Biol. 6, 227 (2005).

17. Ryu, H. et al. ESET/SETDB1 gene expression and histone H3 (K9) trimethylation in Huntington’s disease. Proc. Natl Acad. Sci. USA 103, 19176–19181 (2006).

18. Cheng, D., Cote, J., Shaaban, S. & Bedford, M. T. The arginine methyltransferase CARM1 regulates the coupling of transcription and mRNA processing. Mol. Cell 25, 71–83 (2007).

19. Li, Y. et al. Role of the histone H3 lysine 4 methyltransferase, SET7/9, in the regulation of NF-κB-dependent inflammatory genes. Relevance to diabetes and inflammation. J. Biol. Chem. 283, 26771–26781 (2008).

20. Covic, M. et al. Arginine methyltransferase CARM1 is a promoter-specific regulator of NF-κB-dependent gene expression. EMBO J. 24, 85–96 (2005).

21. Hassa, P. O., Covic, M., Bedford, M. T. & Hottiger, M. O. Protein arginine methyltransferase 1 coactivates NF-κB-dependent gene expression synergistically with CARM1 and PARP1. J. Mol. Biol. 377, 668–678 (2008).

22. Huang, J. et al. Trimethylation of histone H3 lysine 4 by Set1 in the lytic infection of human herpes simplex virus 1. J. Virol. 80, 5740–5746 (2006).

R E V I E W S

NATURE REVIEWS | Drug Discovery VolUME 8 | SEPTEMBER 2009 | 731

nrd_2974_sep09.indd 731 18/8/09 09:48:50

SAHS-adenosyl-l-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-l-methionine.

be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet, the shared chemical mechanisms of the PKMTs and PRMTs allows for certain efficiencies and economies in the dis-covery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which sev-eral cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these pro-teins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Representation of PMTs in the human genomeThe PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the related-ness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DoT1l).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature con-vention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the

suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as Mll) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DoT1l, the human, non-SET domain PKMT can be considered a ninth family of PKMTs.

our group has recently extended this work to system-atically identify all of the SET domains that are encoded by the human genome. This study has more than doubled the number of putative human PKMTs to 52 (51 SET domain proteins plus DoT1l) (l.F. Jerva, K.o. Elliston, V.M.R., M.E.S. and R.A.C., unpublished observations). From the perspective of drug discovery, the salient point is that these enzymes are numerous in humans.

The PRMTs are similarly well represented in humans. There are at least eight human PRMTs for which some level of methyltransferase activity has been shown. These proteins have a canonical sequence domain that is associ-ated with the binding sites for the cofactor and substrate (arginine), although the sequence conservation among these proteins is low. Estimates of the total number of PRMTs that are encoded by the human genome vary, depending on the method of sequence alignment and the level of alignment stringency that is applied. Nevertheless, it is clear that 10–50 of these enzymes are represented in humans.

The human PMTs are thus a large class of enzymes, and several of them already have well established dis-ease association (discussed above). Furthermore, owing to the common features of their chemical mechanism of catalysis (discussed below), the PMTs are likely to be inherently tractable as targets for small-molecule drug intervention. The PMT target class, as defined here, therefore provides an important pool of potential targets for drug discovery efforts.

The PMT active siteThe pursuit of the PMTs as a drug target class is facilitated by a rich literature base of crystallographic and enzyme kinetic studies of these enzymes that have helped to define their mechanisms of catalysis. All of these enzymes probably use a common bimolecular nucleophilic sub-stitution (SN2) methyl transfer mechanism3,24,30. The lone pair electrons of a nitrogen atom (from lysine or arginine) ‘attacks’ the electrophilic methylsulphonium cation of SAM at a 180° angle to the leaving group, to form a penta-coordinate carbon transition state. The transition state structure then collapses, with methyl group relo-cation to the nitrogen atom of the lysine or arginine side chain and formation of S-adenosyl-l-homocysteine (SAH; also known as AdoHcy) (FIG. 2) as a product.

The use of a naturally occurring adenosyl analogue as the universal group transfer donor by PMTs is reminiscent of protein kinases, another large family of druggable enzyme targets, the ATP-binding pockets of which have proved to be highly tractable targets for drug discovery31,32. Furthermore, despite binding a

Table 1 | Epigenetic-enzyme inhibitors for cancer therapy

Generic name | Alternative names | Clinical status*

DNMT inhibitors
5-Azacitidine | Vidaza | Approved in the United States for myelodysplastic syndrome
Decitabine | Dacogen | Approved in the United States for myelodysplastic syndrome

HDAC inhibitors
Vorinostat | Zolinza | Approved in the United States for cutaneous manifestation in cutaneous T cell lymphoma
Romidepsin | FK228 | New drug application filing
Panobinostat | LBH-589 | Phase II
Belinostat | PXD-101 | Phase II
Entinostat | MS-275, SNDX-275 | Phase II
MGCD-0103 | MG-0103 | Phase II
JNJ-26481585 | None | Phase I
Givinostat | ITF2357 | Phase II

*See REFS 51,52 for comprehensive reviews of novel cancer therapies, including those described in the table. DNMT, DNA methyltransferase; HDAC, histone deacetylase.

R E V I E W S

726 | SEPTEMBER 2009 | VOLUME 8 www.nature.com/reviews/drugdisc


23. Jeong, S. J. et al. Coactivator-associated arginine methyltransferase 1 enhances transcriptional activity of the human T-cell lymphotropic virus type 1 long terminal repeat through direct interaction with Tax. J. Virol. 80, 10036–10044 (2006).

24. Cheng, X., Collins, R. E. & Zhang, X. Structural and sequence motifs of protein (histone) methylation enzymes. Annu. Rev. Biophys. Biomol. Struct. 34, 267–294 (2005).

25. Goldstein, D. M., Gray, N. S. & Zarrinkar, P. P. High-throughput kinase profiling as a platform for drug discovery. Nature Rev. Drug Discov. 7, 391–397 (2008).

26. Mook, R. A. The importance and complexity of target class selectivity in drug discovery. The American Association for Cancer Research Education Book 223–226 (The American Association for Cancer Research, Philadelphia, 2005).

27. Copeland, R. A. Evaluation of Enzyme Inhibitors in Drug Discovery: A Guide for Medicinal Chemists and Pharmacologists (Wiley, Hoboken, 2005).

28. Cheng, D. et al. Small molecule regulators of protein arginine methyltransferases. J. Biol. Chem. 279, 23892–23899 (2004).

29. Allis, C. D. et al. New nomenclature for chromatin-modifying enzymes. Cell 131, 633–636 (2007).

30. Zhang, X. & Bruice, T. C. Enzymatic mechanism and product specificity of SET-domain protein lysine methyltransferases. Proc. Natl Acad. Sci. USA 105, 5728–5732 (2008). This work provides a detailed theoretical basis to explain the substrate specificity of the protein lysine methyltransferases.

31. Fedorov, O. et al. A systematic interaction map of validated kinase inhibitors with Ser/Thr kinases. Proc. Natl Acad. Sci. USA 104, 20523–20528 (2007).

32. Karaman, M. W. et al. A quantitative analysis of kinase inhibitor selectivity. Nature Biotech. 26, 127–132 (2008).

33. Min, J., Feng, Q., Li, Z., Zhang, Y. & Xu, R. M. Structure of the catalytic domain of human DOT1L, a non-SET domain nucleosomal histone methyltransferase. Cell 112, 711–723 (2003).

34. Sawada, K. et al. Structure of the conserved core of the yeast Dot1p, a nucleosomal histone H3 lysine 79 methyltransferase. J. Biol. Chem. 279, 43296–43306 (2004).

35. Copeland, R. A. Enzymes: A Practical Introduction to Structure, Mechanism and Data Analysis 2nd edn (Wiley, Hoboken, 2000).

36. Couture, J. F., Hauk, G., Thompson, M. J., Blackburn, G. M. & Trievel, R. C. Catalytic roles for carbon–oxygen hydrogen bonding in SET domain lysine methyltransferases. J. Biochem. 281, 19280–19287 (2006).

37. Collins, R. E. et al. In vitro and in vivo analyses of a Phe/Tyr switch controlling product specificity of histone lysine methyltransferases. J. Biol. Chem. 280, 5563–5570 (2005). This study provides a structural basis for the wide range of lysine methylation patterns that is achieved by different SET domain PKMTs.

38. Trievel, R. C., Flynn, E. M., Houtz, R. L. & Hurley, J. H. Mechanism of multiple lysine methylation by the SET domain enzyme Rubisco LSMT. Nature Struct. Biol. 10, 545–552 (2003).

39. Zhang, X. et al. Structural basis for the product specificity of histone lysine methyltransferases. Mol. Cell 12, 177–185 (2003).

40. Frederiks, F. et al. Nonprocessive methylation by Dot1 leads to functional redundancy of histone H3K79 methylation states. Nature Struct. Mol. Biol. 15, 550–557 (2008).

41. Chiang, P. K. Biological effects of inhibitors of S-adenosylhomocysteine hydrolase. Pharmacol. Ther. 77, 115–134 (1998).

42. Bender, C. M., Zingg, J.-M. & Jones, P. A. DNA methylation as a target for drug design. Pharm. Res. 15, 175–187 (1998).

43. Fiskus, W. et al. Panobinostat treatment depletes EZH2 and DNMT1 levels and enhances decitabine mediated de-repression of JunB and loss of survival of human acute leukemia cells. Cancer Biol. Ther. 8, 939–950 (2009).

44. Chang, Y. et al. Structural basis for G9a-like protein lysine methyltransferase inhibition by BIX-01294. Nature Struct. Mol. Biol. 16, 312–317 (2009).

45. Allan, M. et al. N-Benzyl-1-heteroaryl-3-(trifluoromethyl)-1H-pyrazole-5-carboxamides as inhibitors of co-activator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 1218–1223 (2009). The first examples of potent, drug-like inhibitors of a human PMT.

46. Purandare, A. V. et al. Pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 18, 4438–4441 (2008).

47. Huynh, T. et al. Optimization of pyrazole inhibitors of coactivator associated arginine methyltransferase 1 (CARM1). Bioorg. Med. Chem. Lett. 19, 2924–2927 (2009).

48. Copeland, R. A., Gontarek, R. & Luo, L. in Textbook of Drug Design and Discovery 4th edn Ch. 12 (eds. Krogsgaard-Larsen, P., Madsen, U. & Stromgaard, K.) 378–407 (Taylor and Francis, New York, 2009).

49. Troffer-Charlier, N., Cura, V., Hassenboehler, P., Moras, D. & Cavarelli, J. Functional insights from structures of coactivator-associated arginine methyltransferase 1 domains. EMBO J. 26, 4391–4401 (2007).

50. Couture, J.-F., Collazo, E., Brunzelle, J. S. & Trievel, R. C. Structural and functional analysis of SET8, a histone H4 Lys-20 methyltransferase. Genes Dev. 19, 1455–1465 (2005).

51. Ma, W. W. & Adjei, A. A. Novel agents on the horizon for cancer therapy. CA Cancer J. Clin. 59, 111–137 (2009). A review of the current knowledge on how aberrant epigenetic mechanisms can contribute to the development of cancer and the progress in developing therapies that target these mechanisms.

52. Cortez, C. C. & Jones, P. A. Chromatin, cancer and drug therapies. Mutat. Res. 647, 44–51 (2008).

53. Kang, M. Y. et al. Association of the SUV39H1 histone methyltransferase with the DNA methyltransferase 1 at mRNA expression level in primary colorectal cancer. Int. J. Cancer 121, 2192–2197 (2007).

54. Watanabe, H. et al. Deregulation of histone lysine methyltransferases contributes to oncogenic transformation of human bronchoepithelial cells. Cancer Cell Int. 8, 15 (2008).

55. Kondo, Y. et al. Downregulation of histone H3 lysine 9 methyltransferase G9a induces centrosome disruption and chromosome instability in cancer cells. PLoS One 3, e2037 (2008).

56. Tkachuk, D., Kohler, S. & Cleary, M. L. Involvement of a homolog of Drosophila trithorax by 11q23 chromosomal translocations in acute leukemias. Cell 71, 691–700 (1992).

57. Gu, Y. et al. The t(4;11) chromosome translocation of human acute leukemias fuses the ALL-1 gene, related to Drosophila trithorax, to the AF-4 gene. Cell 71, 701–708 (1992).

58. Liedtke, M. & Cleary, M. L. Therapeutic targeting of MLL. Blood 113, 6061–6068 (2009).

59. Wang, G. G., Cai, L., Pasillas, M. P. & Kamps, M. P. NUP98-NSD1 links H3K36 methylation to Hox-A gene activation and leukaemogenesis. Nature Cell Biol. 9, 804–812 (2007).

60. Marango, J. et al. The MMSET protein is a histone methyltransferase with characteristics of a transcriptional corepressor. Blood 111, 3145–3154 (2008).

61. Kim, J. Y. et al. Multiple-myeloma-related WHSC1/MMSET isoform RE-IIBP is a histone methyltransferase with transcriptional repression activity. Mol. Cell Biol. 28, 2023–2034 (2008).

62. Lauring, J. et al. The multiple myeloma associated MMSET gene contributes to cellular adhesion, clonogenic growth, and tumorigenicity. Blood 111, 856–864 (2008).

63. Angrand, P. O. et al. NSD3, a new SET domain-containing gene, maps to 8p12 and is amplified in human breast cancer cell lines. Genomics 74, 79–88 (2001).

64. Rosati, R. et al. NUP98 is fused to the NSD3 gene in acute myeloid leukemia associated with t(8;11)(p11.2;p15). Blood 99, 3857–3860 (2002).

65. Tonon, G. et al. High-resolution genomic profiles of human lung cancer. Proc. Natl Acad. Sci. USA 102, 9625–9630 (2005).

66. Okada, Y. et al. hDOT1L links histone methylation to leukemogenesis. Cell 121, 167–178 (2005).

67. Bitoun, E., Oliver, P. L. & Davies, K. E. The mixed-lineage leukemia fusion partner AF4 stimulates RNA polymerase II transcriptional elongation and mediates coordinated chromatin remodeling. Hum. Mol. Genet. 16, 92–106 (2007).

68. Hamamoto, R. et al. SMYD3 encodes a histone methyltransferase involved in the proliferation of cancer cells. Nature Cell Biol. 6, 731–740 (2004).

69. Hamamoto, R. et al. Enhanced SMYD3 expression is essential for the growth of breast cancer cells. Cancer Sci. 97, 113–118 (2006).

70. Bracken, A. P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).

71. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).

72. Subramanian, K. et al. Regulation of estrogen receptor alpha by the SET7 lysine methyltransferase. Mol. Cell 30, 336–347 (2008).

73. Nishikawa, N. et al. Gene amplification and overexpression of PRDM14 in breast cancers. Cancer Res. 67, 9649–9657 (2007).

74. Majumder, S., Liu, Y., Ford, O. H., 3rd, Mohler, J. L. & Whang, Y. E. Involvement of arginine methyltransferase CARM1 in androgen receptor function and prostate cancer cell viability. Prostate 66, 1292–1301 (2006).

75. Frietze, S., Lupien, M., Silver, P. A. & Brown, M. CARM1 regulates estrogen-stimulated breast cancer growth through up-regulation of E2F1. Cancer Res. 68, 301–306 (2008).

76. Zhao, Q. et al. PRMT5-mediated methylation of histone H4R3 recruits DNMT3A, coupling histone and DNA methylation in gene silencing. Nature Struct. Mol. Biol. 16, 304–311 (2009).

77. Patnaik, D. et al. Substrate specificity and kinetic mechanism of mammalian G9a histone H3 methyltransferase. J. Biol. Chem. 279, 53248–53258 (2004).

78. Chin, H. G., Patnaik, D., Esteve, P.-O., Jacobsen, S. E. & Pradhan, S. Catalytic properties and kinetic mechanism of human recombinant lys-9 histone H3 methyltransferase SUV39H1: participation of the chromodomain in enzymatic catalysis. Biochemistry 45, 3272–3284 (2006).

79. Greiner, D., Bonaldi, T., Eskeland, R., Roemer, E. & Imhof, A. Identification of a specific inhibitor of the histone methyltransferase SU(VAR)3–9. Nature Chem. Biol. 1, 143–145 (2005).

80. Kubicek, S. et al. Reversal of H3K9me2 by a small-molecule inhibitor for the G9a histone methyltransferase. Mol. Cell 25, 473–481 (2007).

Acknowledgements

We are grateful to K. Shiosaki, C. T. Walsh, H. R. Horvitz, Y. Zhang, and R. Gould for their insights, constant support and encouragement. We also thank K. Boater, E. Olhava, L. Jin and T. Luly for expert help in preparation of this manuscript.

Competing interests statement

The authors declare competing financial interests: see web version for details.

DATABASES

UniProtKB: http://www.uniprot.org

CARM1 | DOT1L | EHMT2 | EZH2 | PRMT1 | SETD7 | SETD8 | SETD1A | SETDB1 | SUZ12

FURTHER INFORMATION

Author’s homepage: http://www.epizyme.com


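[Figure 1 graphic (Nature Reviews | Drug Discovery). Labels: modification marks Me, Ac, Ub and P on residues labelled K, R and S. Enzyme families: PKMTs (52 family members), PRMTs (≥10 family members), demethylases (~30 family members), deacetylases (~18 family members), acetyltransferases, kinases and ligases.]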

Target class: A group of proteins that are related by a common type of drug-binding pocket, but sufficiently diverse that selective inhibition of specific proteins can be achieved, using medicinal chemical elaboration of the basic chemotype structures.

SAM: S-adenosyl-L-methionine, the universal methyl group donor of all enzymatic methyltransferase reactions.

focus on the PMTs, and in particular on those aspects that make PMTs attractive targets for drug discovery efforts. We summarize the data that contribute to the validation of PMTs as targets for specific human diseases, as well as the structural and mechanistic data that suggest PMTs are a tractable (that is, druggable) target class.

PKMTs and PRMTs in human disease

In surveying the histone-modifying enzymes of the human genome, the enzymes that catalyse methylation of lysine residues (protein lysine methyltransferases (PKMTs)) and arginine residues (protein arginine methyltransferases (PRMTs)) are of substantial interest from the perspective of drug discovery and medicinal chemistry. The action of these enzymes is crucial in controlling gene regulation, and there is an increasing amount of biochemical and biological data to suggest that the enzymatic activities of several of these proteins have pathogenic roles in cancer, inflammatory diseases, neurodegenerative diseases and other conditions of importance7–15.

For example, with the exception of DOT1-like, histone H3 methyltransferase (DOT1L; also known as KMT4), all human PKMTs contain a ~130 amino-acid domain, referred to as the SET domain, which constitutes the catalytic domain of these enzymes14–16. Enhancer of zeste homologue 2 (EZH2; also known as KMT6) is a SET domain protein that forms the catalytic subunit of the 4–5-protein core of polycomb repressive complex 2 (PRC2). PRC2 is a PKMT that catalyses the methylation of lysine 27 of histone H3 (in the nomenclature of histone modification, this site is referred to as H3K27). Although EZH2 contains the catalytic active site, all of the proteins of the PRC2 complex are required for full PKMT activity. Overexpression of EZH2 or another PRC2 subunit, suppressor of zeste 12 homologue (SUZ12), has been associated with numerous human cancer types, including prostate, breast, bladder, colon, skin, liver, endometrial, lung and gastric cancers, as well as lymphomas and myelomas15. In breast carcinomas, increased levels of EZH2 have been shown to correlate with increased invasiveness and proliferation rate; it has been suggested that EZH2 could be a prognostic indicator of patient outcome for breast cancer10. In cell culture, overexpression of EZH2 in breast epithelial cells causes anchorage-independent cell growth and increased invasiveness. Additionally, when EZH2-overexpressing cells were injected into the mammary fat pads of nude mice, the animals developed tumours, demonstrating the tumorigenicity of EZH2 overexpression. Importantly, the phenotypic effects of EZH2 overexpression are correlated with increased H3K27 methylation and are dependent on the presence of an intact SET domain, both of which imply a role for EZH2 enzymatic activity in pathogenesis10,15. Several other human PKMTs and PRMTs are strongly associated with human cancers, as summarized in TABLE 2.

Similarly, there is compelling evidence that other PKMTs and PRMTs have a pathogenic role in serious human diseases other than cancer7–9. For example, SET domain, bifurcated 1 (SETDB1)17 and coactivator-associated arginine methyltransferase 1 (CARM1; also known as PRMT4)18 have been implicated in the neurodegenerative diseases Huntington’s disease and spinal muscular atrophy, respectively. SET domain-containing lysine methyltransferase 7 (SETD7; also known as KMT7)19, CARM1 (REF. 20) and PRMT1 (REF. 21) have been associated with nuclear factor-κB-related inflammatory diseases, and SET domain-containing protein 1A (SETD1A)22 and CARM1 (REF. 23) have been associated with viral infections involving herpes simplex virus and human T-lymphotropic virus, respectively. PKMTs and PRMTs are therefore emerging as compelling targets for drug discovery efforts4,5.

PMTs as a drug target class

From a chemical biology and medicinal chemistry perspective, the PKMTs and PRMTs are of interest because they have a common mechanism of catalysis (discussed below), involving a small, organic cofactor. As other druggable classes of enzymes, such as the protein kinases, share this mechanistic feature, it is likely that the PMTs will be similarly amenable to inhibition by small, organic molecules.

The PKMTs and PRMTs catalyse methyl transfer from their universal methyl donor, S-adenosyl-L-methionine (SAM; also known as AdoMet) (FIG. 2), to a nitrogen atom of lysine or arginine side chains3. Protein substrate specificity can be stringent in these enzymes; some PKMTs seem to selectively methylate a particular lysine residue on a specific histone, and the extent of methylation on a single lysine residue (that is, mono-, di- or tri-methylation) that is catalysed by a particular enzyme can also be stringent (discussed below). Clearly, the structure of each enzyme is unique16,24, as is the resulting biology and pathobiology associated with each enzyme. Yet the shared chemical mechanisms of the PKMTs and PRMTs allow for certain efficiencies and economies in the discovery of selective drugs for these enzymes, by treating them as a target class25–27. Several of these enzymes have been found to catalyse methyl transfer to lysine or arginine residues on a number of cellular proteins; this is especially the case for the PRMTs, for which several cytosolic substrates have been identified24,28. With respect to gene regulation, however, the most important targets for both PKMTs and PRMTs are likely to be the histones, as post-translational modification of these proteins is clearly a determinant of chromatin remodelling and therefore regulation of gene transcription.

Figure 1 | A nucleosome and the post-translational histone protein modifications that can influence epigenetic regulation of gene transcription. Modifications of the histone protein tail are shown: changes in acetylation (Ac) by acetyltransferases and deacetylases, phosphorylation (P) by kinases, ubiquitylation (Ub) by ligases and changes in methylation (Me) by methyltransferases and demethylases. The enzyme families that are responsible for the various post-translational modifications are shown. PKMT, protein lysine methyltransferase; PRMT, protein arginine methyltransferase.

SAH: S-adenosyl-L-homocysteine, the universal product of all enzymatic methyltransferase reactions, formed by methyl group transfer from S-adenosyl-L-methionine.

Representation of PMTs in the human genome

The PMT target class is represented in many species, and the human genome encodes several PKMTs and PRMTs. Attempts to quantify the representation of PKMTs in a particular organism, and to understand the relatedness of these proteins to one another, have focused on the sequence alignment of the SET domain because, as discussed above, this domain is common to all PKMTs (except DOT1L).

Several attempts have been made to systematically group the PKMTs on the basis of sequence homology and substrate14,29. For example, in 2007, a nomenclature convention was proposed for the PKMTs, along with other types of chromatin-modifying enzymes29. In this study, 24 human PKMTs were identified. These SET domain PKMTs have been divided into related families on the basis of sequence alignment; initially four, and later seven, major families were defined in this manner14,16: the suppressor of variegation 3–9 (SUV39) family; the SET1 (also known as MLL) family; the SET2 (also known as NSD) family; the retinoblastoma protein-interacting zinc finger protein (RIZ) (also known as PRDM) family; the SET and MYND domain-containing (SMYD) family; the enhancer of zeste (EZ) family; and the SUV4–20 family. An eighth family, known as ‘others’, included the enzymes SETD7 and SETD8 (also known as PRSET7). Finally, DOT1L, the human non-SET domain PKMT, can be considered a ninth family of PKMTs.
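The family grouping listed above can be captured as a small lookup table, which makes the count of nine families and the alternative names explicit. This is only an illustrative sketch: the family names and aliases are transcribed from the paragraph above, whereas the dictionary layout and the helper name `family_for` are assumptions made for this example and do not come from the cited nomenclature papers.

```python
# Family names and aliases transcribed from the text; the layout is hypothetical.
PKMT_FAMILIES = {
    "SUV39": [],
    "SET1": ["MLL"],
    "SET2": ["NSD"],
    "RIZ": ["PRDM"],
    "SMYD": [],
    "EZ": [],
    "SUV4-20": [],
    "others": [],   # the text places SETD7 and SETD8 (PRSET7) here
    "DOT1L": [],    # the single non-SET domain PKMT
}


def family_for(name):
    """Return the family key matching a family name or alias, or None."""
    key = name.upper()
    for family, aliases in PKMT_FAMILIES.items():
        if key == family.upper() or key in (alias.upper() for alias in aliases):
            return family
    return None


print(len(PKMT_FAMILIES))   # number of families in the table
print(family_for("NSD"))    # resolves the alias to its family key
```

Keeping aliases next to the canonical name is a convenience for matching the older literature, where the same family appears under either label.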



ARTICLE

doi:10.1038/nature10351

Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma

Ryan D. Morin1*, Maria Mendez-Lago1*, Andrew J. Mungall1, Rodrigo Goya1, Karen L. Mungall1, Richard D. Corbett1, Nathalie A. Johnson2, Tesa M. Severson1, Readman Chiu1, Matthew Field1, Shaun Jackman1, Martin Krzywinski1, David W. Scott2, Diane L. Trinh1, Jessica Tamura-Wells1, Sa Li1, Marlo R. Firme1, Sanja Rogic2, Malachi Griffith1, Susanna Chan1, Oleksandr Yakovenko1, Irmtraud M. Meyer3, Eric Y. Zhao1, Duane Smailus1, Michelle Moksa1, Suganthi Chittaranjan1, Lisa Rimsza4, Angela Brooks-Wilson1,5, John J. Spinelli6,7, Susana Ben-Neriah2, Barbara Meissner2, Bruce Woolcock2, Merrill Boyle2, Helen McDonald1, Angela Tam1, Yongjun Zhao1, Allen Delaney1, Thomas Zeng1, Kane Tse1, Yaron Butterfield1, Inanç Birol1, Rob Holt1, Jacqueline Schein1, Douglas E. Horsman2, Richard Moore1, Steven J. M. Jones1, Joseph M. Connors2, Martin Hirst1, Randy D. Gascoyne2,8 & Marco A. Marra1,9

Follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL) are the two most common non-Hodgkin lymphomas (NHLs). Here we sequenced tumour and matched normal DNA from 13 DLBCL cases and one FL case to identify genes with mutations in B-cell NHL. We analysed RNA-seq data from these and another 113 NHLs to identify genes with candidate mutations, and then re-sequenced tumour and matched normal DNA from these cases to confirm 109 genes with multiple somatic mutations. Genes with roles in histone modification were frequent targets of somatic mutation. For example, 32% of DLBCL and 89% of FL cases had somatic mutations in MLL2, which encodes a histone methyltransferase, and 11.4% and 13.4% of DLBCL and FL cases, respectively, had mutations in MEF2B, a calcium-regulated gene that cooperates with CREBBP and EP300 in acetylating histones. Our analysis suggests a previously unappreciated disruption of chromatin biology in lymphomagenesis.

Non-Hodgkin lymphomas (NHLs) are cancers of B, T or natural killer lymphocytes. The two most common types of NHL, follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL), together comprise 60% of new B-cell NHL diagnoses each year in North America1. FL is an indolent and typically incurable disease characterized by clinical and genetic heterogeneity. DLBCL is aggressive and likewise heterogeneous, comprising at least two distinct subtypes that respond differently to standard treatments. Both FL and the germinal centre B-cell (GCB) cell of origin (COO) subtype of DLBCL derive from germinal centre B cells, whereas the activated B-cell (ABC) variety, which has a more aggressive clinical course, is thought to originate from B cells that have exited, or are poised to exit, the germinal centre2. Current knowledge of the specific genetic events leading to DLBCL and FL is limited to the presence of a few recurrent genetic abnormalities2. For example, 85–90% of FL and 30–40% of GCB DLBCL cases3,4 harbour t(14;18)(q32;q21), which results in deregulated expression of the BCL2 oncoprotein. Other genetic abnormalities unique to GCB DLBCL include amplification of the c-REL gene and of the miR-17-92 microRNA cluster5. In contrast to GCB cases, 24% of ABC DLBCLs harbour structural alterations or inactivating mutations affecting PRDM1, which is involved in differentiation of GCB cells into antibody-secreting plasma cells6. ABC-specific mutations also affect genes regulating NF-κB signalling7,8,9, with TNFAIP3 (also known as A20) and MYD88 (ref. 10) the most abundantly mutated in 24% and 39% of cases, respectively. To enhance our understanding of the genetic architecture of B-cell NHL, we undertook a study to (1) identify somatic mutations and (2) determine the prevalence, expression and focal recurrence of mutations in FL and DLBCL. Using strategies and techniques applied to cancer genome and transcriptome characterization by ourselves and others11,12,13, we sequenced tumour DNA and/or RNA from 117 tumour samples and 10 cell lines (Supplementary Tables 1 and 2) and identified 651 genes (Supplementary Figure 1) with evidence of somatic mutation in B-cell NHL. After validation, we showed that 109 genes were somatically mutated in two or more NHL cases. We further characterized the frequency and nature of mutations within MLL2 and MEF2B, which were among the most frequently mutated genes with no previously known role in lymphoma.
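The recurrence criterion stated above (a gene counts as recurrently mutated when validated somatic mutations are found in two or more cases) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the input layout of (case, gene) pairs, and the toy records, including the placeholder gene `GENE_X`, are all hypothetical.

```python
from collections import defaultdict


def recurrently_mutated(confirmed_mutations, min_cases=2):
    """Return genes with confirmed somatic mutations in at least min_cases distinct cases."""
    cases_by_gene = defaultdict(set)
    for case_id, gene in confirmed_mutations:
        cases_by_gene[gene].add(case_id)   # a set, so repeat hits in one case count once
    return sorted(gene for gene, cases in cases_by_gene.items() if len(cases) >= min_cases)


# Hypothetical toy input, not data from the study.
toy_mutations = [
    ("case_A", "MLL2"), ("case_B", "MLL2"),
    ("case_A", "MEF2B"), ("case_C", "MEF2B"),
    ("case_C", "GENE_X"), ("case_C", "GENE_X"),   # two mutations, but only one case
]

print(recurrently_mutated(toy_mutations))  # ['MEF2B', 'MLL2']
```

Counting distinct cases rather than distinct mutations matters here: a gene hit twice in a single tumour would otherwise pass the threshold without showing recurrence across patients.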

Identification of recurrently mutated genesWe sequenced the genomes or exomes of 14 NHL cases, all withmatched constitutional DNA sequenced to comparable depths(Supplementary Tables 1 and 2). After screening for single nucleotidevariants followed by subtraction of known polymorphisms and visualinspection of the sequence read alignments, we identified 717 non-synonymous variants (coding single nucleotide variants; cSNVs)affecting 651 genes (Supplementary Figure 1 and SupplementaryMethods). We identified between 20 and 135 cSNVs in each of thesegenomes. Only 25 of the 651 genes with cSNVs were represented inthe cancer gene census (December 2010 release)14.We performed RNA sequencing (RNA-seq) on these 14 NHL cases

and an expanded set of 113 samples comprising 83 DLBCL, 12 FL and8 B-cell NHL cases with other histologies and 10 DLBCL-derived celllines (Supplementary Table 2). We analysed these data to identify

1Canada’s Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada. 2Centre for Lymphoid Cancer, BC Cancer Agency, Vancouver, British Columbia V5Z1L3, Canada. 3Centre for High-throughput Biology, Department of Computer Science, Vancouver, British Columbia V6T 1Z4, Canada. 4Department of Pathology, University of Arizona, Tucson, Arizona85724, USA. 5Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada. 6Cancer Control Research, BC Cancer Agency, Vancouver,British Columbia V5Z 1L3, Canada. 7School of Population and Public Health, University of British Columbia, Vancouver, British Columbia V6T 1Z3, Canada. 8Department of Pathology, University of BritishColumbia, Vancouver, British Columbia V6T 2B5, Canada. 9Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia V6H 3N1, Canada.*These authors contributed equally to this work.

2 9 8 | N A T U R E | V O L 4 7 6 | 1 8 A U G U S T 2 0 1 1

First published in Nature 476, 298–303 (2011); doi: 10.1038/nature10351

NATURE REPRINT COLLECTION Epigenetics S13

novel fusion transcripts (Supplementary Table 3) and cSNVs (Fig. 1). We identified 240 genes with at least one cSNV in a genome/exome or an RNA-seq ‘mutation hot spot’ (see later), and with cSNVs in at least three cases in total (Supplementary Table 4). We selected cSNVs from each of these 240 genes for re-sequencing to confirm their somatic status. We did not re-sequence genes with previously documented mutations in lymphoma (for example, CD79B, BCL2). We confirmed the somatic status of 543 cSNVs in 317 genes, with 109 genes having at least two confirmed somatic mutations (Supplementary Tables 4 and 5). Of the successfully re-sequenced cSNVs predicted from the genomes, 171 (94.5%) were confirmed somatic, 7 were false calls and 3 were present in the germ line. These 109 recurrently mutated genes were significantly enriched for genes implicated in lymphocyte activation (P = 8.3 × 10^-4; for example, STAT6, BCL10), lymphocyte differentiation (P = 3.5 × 10^-3; for example, CARD11), and regulation of apoptosis (P = 1.9 × 10^-3; for example, BTG1, BTG2). Also significantly enriched were genes linked to transcriptional regulation (P = 5.4 × 10^-4; for example, TP53) and genes involved in methylation (P = 2.2 × 10^-4) and acetylation (P = 1.2 × 10^-2), including histone methyltransferase (HMT) and acetyltransferase (HAT) enzymes known previously to be mutated in lymphoma (for example, EZH2 (ref. 13) and CREBBP (ref. 15); Supplementary Methods).

Mutation hot spots can result from mutations at sites under strong selective pressure and we have previously identified such sites using RNA-seq data (ref. 13). We searched our RNA-seq data for genes with mutation hot spots, and identified 10 genes that were not mutated in the 14 genomes (PIM1, FOXO1, CCND3, TP53, IRF4, BTG2, CD79B, BCL7A, IKZF3 and B2M), of which five (FOXO1, CCND3, BTG2, IKZF3 and B2M) were not previously known targets of point mutation in NHL (Supplementary Table 6 and Supplementary Methods). FOXO1, BCL7A and B2M had hot spots affecting their start codons. The effect of a FOXO1 start codon mutation, which was observed in three cases, was further studied using a cell line in which the initiating ATG was mutated to TTG. Western blots probed with a FOXO1 antibody revealed a band with a reduced molecular weight, indicative of a FOXO1 amino-terminal truncation (Supplementary Figure 2), consistent with use of the next in-frame ATG for translation initiation. A second hot spot in FOXO1 at T24 was mutated in two cases. T24 is reportedly phosphorylated by AKT subsequent to B-cell receptor (BCR) stimulation (ref. 16), inducing FOXO1 nuclear export.

We analysed the RNA-seq data to determine whether any of the somatic mutations in the 109 recurrently mutated genes showed evidence for allelic imbalance with expression favouring one allele. Out of 380 expressed heterozygous mutant alleles, we observed preferential expression of the mutation for 16.8% (64/380) and preferential expression of the wild type for 27.8% (106/380; Supplementary Table 7). Seven genes showed evidence for significant preferential expression of the mutant allele in at least two cases: BCL2, CARD11, CD79B, EZH2, IRF4, MEF2B and TP53 (Supplementary Methods). In 27 out of 43 cases with BCL2 cSNVs, expression favoured the mutant allele, consistent with the previously described hypothesis that the translocated (and hence, transcriptionally deregulated) allele of BCL2 is targeted by somatic hypermutation (ref. 17). Examples of mutations at known oncogenic hot spot sites such as F123I in CARD11 (ref. 18) showed allelic imbalance favouring the mutant allele in some cases. Similarly, we noted expression favouring two novel hot spot mutations in MEF2B (Y69 and D83) and two sites in EZH2 not previously reported as mutated in lymphoma (A682G and A692V).

We sought to distinguish new cancer-related mutations from passenger mutations using the approach proposed previously (ref. 19). We reasoned that this would reveal genes with strong selection signatures, and mutations in such genes would be good candidate cancer drivers. We identified 26 genes with significant evidence for positive selection (false discovery rate = 0.03, Supplementary Methods), with either selective pressure for acquiring non-synonymous point mutations or truncating/nonsense mutations (Supplementary Methods; Table 1 and Supplementary Table 8). Included were known lymphoma oncogenes (BCL2, CD79B (ref. 9), CARD11 (ref. 18), MYD88 (ref. 10) and EZH2 (ref. 13)), all of which showed signatures indicative of selection for non-synonymous variants.

Evidence for selection of inactivating changes

We expected tumour suppressor genes to show strong selection for the acquisition of nonsense mutations. In our analysis, the eight most significant genes included seven with strong selective pressure for nonsense mutations, including the known tumour suppressor genes TP53 and TNFRSF14 (ref. 20; Table 1). CREBBP, recently reported as commonly inactivated in DLBCL (ref. 15), also showed some evidence for acquisition of nonsense mutations and cSNVs (Supplementary Figure 3 and Supplementary Table 9). We also observed enrichment for nonsense mutations in BCL10, a positive regulator of NF-κB, in which oncogenic truncated products have been described in lymphomas (ref. 21). The remaining strongly significant genes (BTG1, GNA13, SGK1 and MLL2) had no reported role in lymphoma. GNA13 was affected by mutations in 22 cases including multiple nonsense mutations. GNA13 encodes the alpha subunit of a heterotrimeric G-protein coupled receptor responsible for modulating RhoA activity (ref. 22). Some of the mutated residues negatively affect its function (refs 23, 24), including a T203A mutation, which also showed allelic imbalance favouring the mutant allele (Supplementary Table 7). GNA13 protein was reduced or absent on western blots in cell lines harbouring either a nonsense mutation, a stop codon deletion, a frame shifting deletion, or changes affecting splice sites (Supplementary Methods and Supplementary Figure 4).

[Figure 1 image: circular genome plot. Legend: cSNVs, CNV loss, CNV gain, CNV high-level gain, LOH. Gene labels: HIST1H1C, EZH2, BCL2, MLL2, SGK1, CREBBP, BTG1, GNA13, MEF2B, MYD88, KLHL6, TNFRSF14, IRF8, ETS1, FOXO1, IGH, B2M, CARD11, BTG2, CD70, CD79B, CD58, TP53, TMEM30A, BCL10, FAS, CCND3, STAT3.]

Figure 1 | Genome-wide visualization of somatic mutation targets in NHL. Overview of structural rearrangements and copy number variations (CNVs) in the 11 DLBCL genomes and cSNVs in the 109 recurrently mutated genes identified in our analysis. Inner arcs represent somatic fusion transcripts identified in at least one of the 11 genomes. The CNVs and LOH detected in each of the 11 DLBCL tumour/normal pairs are displayed on the concentric sets of rings. The inner 11 rings show regions of enhanced homozygosity plotted with blue (interpreted as LOH). The outer 11 rings show somatic CNVs. Purple circles indicate the position of genes with at least two confirmed somatic mutations with circle diameter proportional to the number of cases with cSNVs detected in that gene. Circles representing the genes with significant evidence for positive selection are labelled. Coincidence between recurrently mutated genes and regions of gain/loss are colour-coded in the labels (green, loss; red, gain). For example B2M, which encodes beta-2-microglobulin, is recurrently mutated and is deleted in two cases.



SGK1 encodes a phosphatidylinositol-3-OH kinase (PI(3)K)-regulated kinase with functions including regulation of FOXO transcription factors (ref. 25), regulation of NF-κB by phosphorylating IκB kinase (ref. 26), and negative regulation of NOTCH signalling (ref. 27). SGK1 also resides within a region of chromosome 6 commonly deleted in DLBCL (Fig. 1; ref. 5). The mechanism by which SGK1 and GNA13 inactivation may contribute to lymphoma is unclear, but the strong degree of apparent selection towards their inactivation and their overall high mutation frequency (each mutated in 18 of 106 DLBCL cases) suggests that their loss contributes to B-cell NHL. Certain genes are known to be mutated more commonly in GCB DLBCLs (for example, TP53 (ref. 28) and EZH2 (ref. 13)). Here, both SGK1 and GNA13 mutations were found only in GCB cases (P = 1.93 × 10^-3 and 2.28 × 10^-4, Fisher’s exact test; n = 15 and 18, respectively) (Fig. 2). Two additional genes (MEF2B and TNFRSF14) with no previously described role in DLBCL showed a similar restriction to GCB cases (Fig. 2).

Inactivating MLL2 mutations

MLL2 showed the most significant evidence for selection and the largest number of nonsense SNVs. Our RNA-seq analysis indicated that 26.0% (33/127) of cases carried at least one MLL2 cSNV. To address the possibility that variable RNA-seq coverage of MLL2 failed to capture some mutations, we PCR-amplified the entire MLL2 locus (~36 kilobases) in 89 cases (35 primary FLs, 17 DLBCL cell lines, and 37 DLBCLs). Of these cases 58 were among the RNA-seq cohort. Illumina amplicon re-sequencing (Supplementary Methods) revealed 78 mutations, confirming the RNA-seq mutations in the overlapping cases and identifying 33 additional mutations. We confirmed the somatic status of 46 variants using Sanger sequencing (Supplementary Table 10), and showed that 20 of the 33 additional mutations were insertions or deletions (indels). Three SNVs at splice sites were also detected, as were 10 new cSNVs that had not been detected by RNA-seq.

The somatic mutations were distributed across MLL2 (Fig. 3a). Of these, 37% (n = 29/78) were nonsense mutations, 46% (n = 36/78) were indels that altered the reading frame, 8% (n = 6/78) were point mutations at splice sites and 9% (n = 7/78) were non-synonymous amino acid substitutions (Table 2). Four of the somatic splice site mutations had effects on MLL2 transcript length and structure. For example, two heterozygous splice site mutations resulted in the use of a novel splice donor site and an intron retention event.

Approximately half of the NHL cases we sequenced had two MLL2 mutations (Supplementary Table 10). We used bacterial artificial

Table 1 | Overview of cSNVs and confirmed somatic mutations in most frequently mutated genes

Gene           Cases       Total        Somatic  P (raw)        q              NS SP  T SP  Skew (M, WT, both)†
               NS  S   T   NS  S    T   cSNVs (RNA-seq cohort)*
MLL2‡          16  8   17  17  8    18  10       6.85 × 10^-8   8.50 × 10^-7   0.834  14.4  WT
TNFRSF14 (G)‡  7   1   7   8   1    7   11       6.85 × 10^-8   8.50 × 10^-7   7.52   118   Both
SGK1 (G)‡      18  6   6   37  10   6   9        6.85 × 10^-8   8.50 × 10^-7   19.5   61.7  –
BCL10‡         2   0   4   3   0    4   4        6.85 × 10^-8   8.50 × 10^-7   3.62   112   WT
GNA13 (G)‡     21  1   2   33  1    2   5        6.85 × 10^-8   8.50 × 10^-7   24.1   25.7  Both
TP53 (G)‡      20  2   1   23  3    1   22       6.85 × 10^-8   8.50 × 10^-7   15.6   14.1  Both
EZH2 (G)‡      33  0   0   33  0    0   33       6.85 × 10^-8   8.50 × 10^-7   11.4   0.00  Both
BTG2‡          12  6   1   14  6    1   2        6.85 × 10^-8   8.50 × 10^-7   23.9   35.1  –
BCL2 (G)‡      42  45  0   96  105  0   43       9.35 × 10^-8   8.50 × 10^-7   3.78   0.00  M
BCL6‡§         11  2   0   12  2    0   2        9.35 × 10^-8   8.50 × 10^-7   0.175  0.00  M
CIITA‡§        5   3   0   6   3    0   2        9.35 × 10^-8   8.50 × 10^-7   0.086  0.00
FAS‡           2   0   4   3   0    4   2        1.52 × 10^-7   1.17 × 10^-6   2.54   66.5  WT
BTG1‡          11  6   2   11  7    2   10       1.52 × 10^-7   1.17 × 10^-6   17.5   52.5  Both
MEF2B (G)‡     20  2   0   20  2    0   10       2.05 × 10^-7   1.47 × 10^-6   14.2   0.00  M
IRF8‡          11  5   3   14  5    3   3        4.55 × 10^-7   3.03 × 10^-6   8.82   28.2  WT
TMEM30A‡       1   0   4   1   0    4   4        6.06 × 10^-7   3.79 × 10^-6   0.785  65.0  WT
CD58‡          2   0   3   2   0    3   2        2.42 × 10^-6   1.43 × 10^-5   2.29   69.2  –
KLHL6‡         10  2   2   12  2    2   4        1.00 × 10^-5   5.26 × 10^-5   5.42   16.4  –
MYD88 (A)‡     13  2   0   14  2    0   9        1.00 × 10^-5   5.26 × 10^-5   12.4   0.00  WT
CD70‡          5   0   1   5   0    2   3        1.70 × 10^-5   8.48 × 10^-5   7.08   44.0  –
CD79B (A)‡     7   2   1   9   2    1   5        2.00 × 10^-5   9.52 × 10^-5   10.9   18.3  M
CCND3‡         7   1   2   7   1    2   6        2.80 × 10^-5   1.27 × 10^-4   6.55   36.3  WT
CREBBP‡        20  7   4   24  7    4   9        1.00 × 10^-4   4.35 × 10^-4   2.72   6.04  Both
HIST1H1C‡      9   0   0   10  0    0   6        1.80 × 10^-4   7.50 × 10^-4   11.9   0.00  Both
B2M‡           7   0   0   7   0    0   4        3.90 × 10^-4   1.56 × 10^-3   16.6   0.00  WT
ETS1‡          10  1   0   10  1    0   4        4.10 × 10^-4   1.58 × 10^-3   5.76   0.00  WT
CARD11‡        14  3   0   14  3    0   3        1.90 × 10^-3   7.04 × 10^-3   3.37   0.00  Both
FAT2‡§         2   1   0   2   1    0   2        6.30 × 10^-3   2.25 × 10^-2   0.128  0.00  –
IRF4‡§         9   4   0   26  5    0   5        7.00 × 10^-3   2.41 × 10^-2   0.569  0.00  Both
FOXO1‡         8   4   0   10  4    0   4        7.60 × 10^-3   2.53 × 10^-2   4.02   0.00  –
STAT3          9   0   0   9   0    0   4        2.19 × 10^-2   6.08 × 10^-2   –      –     Both
RAPGEF1        8   3   0   10  3    0   3        2.98 × 10^-2   7.45 × 10^-2   –      –     WT
ABCA7          12  3   0   15  3    0   2        7.76 × 10^-2   1.67 × 10^-1   –      –     WT
RNF213         10  8   0   10  8    0   2        7.87 × 10^-2   1.67 × 10^-1   –      –     –
MUC16          17  12  0   39  25   0   2        8.32 × 10^-2   1.73 × 10^-1   –      –     –
HDAC7          8   4   0   8   4    0   2        8.94 × 10^-2   1.82 × 10^-1   –      –     WT
PRKDC          7   3   0   7   4    0   2        1.06 × 10^-1   2.05 × 10^-1   –      –     –
SAMD9          9   2   0   9   2    0   2        1.79 × 10^-1   3.01 × 10^-1   –      –     –
TAF1           10  0   0   10  0    0   2        3.03 × 10^-1   4.74 × 10^-1   –      –     –
PIM1           20  19  0   33  34   0   11       3.40 × 10^-1   5.23 × 10^-1   –      –     WT
COL4A2         8   2   0   8   2    0   2        7.64 × 10^-1   8.99 × 10^-1   –      –     –
EP300          8   7   1   8   7    1   3        9.54 × 10^-1   1.00           –      –     WT

Individual cases with non-synonymous (NS), synonymous (S) and truncating (T) mutations and the total number of mutations of each class are shown separately because some genes contained multiple mutations in the same case. The P values indicated in bold are the upper limit on the P value for that gene determined with the approach described in ref. 19 (see Supplementary Methods), q is the Benjamini-corrected q value, and NS SP and T SP refer to selective pressure estimates from this model for the acquisition of non-synonymous or truncating mutations, respectively. Genes marked (A) or (G) were found to have mutations significantly enriched in ABC or GCB cases, respectively (P < 0.05, Fisher’s exact test).
*Additional somatic mutations identified in larger cohorts and insertion/deletion mutations are not included in this total.
†‘Both’ indicates that we observed separate cases in which skewed expression was seen but where this skew was not consistent for the mutant or wild-type allele.
‡Genes significant at a false discovery rate of 0.03. SNVs in BCL2 and previously confirmed hot spot mutations in EZH2 and CD79B are probably somatic in these samples based on published observations of others.
§Selective pressure estimates are both <1, indicating purifying selection rather than positive selection acting on this gene.
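The relationship between the raw P values and the q values in Table 1 can be illustrated with a short sketch. It assumes that "Benjamini-corrected" means the Benjamini-Hochberg step-up procedure and that 100 genes were tested in total; neither is stated in the text, and the figure of 100 is only an inference from the tabulated ratios. The function name `bh_qvalues` and its `n_tests` parameter are illustrative, not taken from the authors' methods.

```python
def bh_qvalues(pvalues, n_tests=None):
    """Benjamini-Hochberg q-values for a list of P values.

    n_tests is the total number of hypotheses tested. It defaults to
    len(pvalues) but may be larger when only the top of a ranked list
    is supplied (as with the excerpt of Table 1 used below).
    """
    m = n_tests if n_tests is not None else len(pvalues)
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    q = [0.0] * len(pvalues)
    running_min = 1.0
    # Walk from the largest P value to the smallest so that each q-value
    # is the minimum of p * m / rank over its own and all higher ranks.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Raw P values of the 14 top-ranked genes in Table 1 (MLL2 through MEF2B).
top_p = [6.85e-8] * 8 + [9.35e-8] * 3 + [1.52e-7] * 2 + [2.05e-7]

# ASSUMPTION: 100 tests in total (inferred, not stated in the source).
q = bh_qvalues(top_p, n_tests=100)
for p, qv in zip(top_p, q):
    print(f"P = {p:.2e}  ->  q = {qv:.2e}")
```

Under these assumptions the first eleven genes share a q value of about 8.5 × 10^-7, in line with the table; small differences in later rows (for example MEF2B) are to be expected because the tabulated P values are rounded. Because only the top of the ranked list is supplied, q values for the last few ranks could in principle be lowered further by genes not listed here.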


chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.

With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense mutations. One such gene was MEF2B, which had not previously been linked to lymphoma. We found that 20 (15.7%) cases had MEF2B cSNVs and 4 (3.1%) cases had MEF2C cSNVs. All cSNVs detected by RNA-seq affected either the MADS box or MEF2 domains. To determine the frequency and scope of MEF2B mutations, we Sanger-sequenced exons 2 and 3 in 261 primary FL samples; 259 DLBCL primary tumours; 17 cell lines; 35 cases of assorted NHL (IBL, composite FL and PBMCL); and eight non-malignant centroblast samples. We also used a capture strategy (Supplementary Methods) to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. Of the variants 55 (80%) affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table 11; Fig. 3b). Each patient generally had a single MEF2B variant and we observed relatively few (eight in total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table 12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).

[Figure 2 image: heat map. Markers: ABC enrichment, GCB enrichment. Bar chart of cases (axis 10 to 40) split by ABC, GCB, U and FL. P-value shades: <0.05, 0.1–0.05, 0.3–0.1. Gene order: MYD88, CD79B, BCL6s, TNFAIP3, CARD11, FAS, TMEM30A, CD58, CD70, STAT3, ETS1, HIST1H1C, CCND3, KLHL6, BTG1, BTG2, IRF8, B2M, EP300, CREBBP, MLL2, FOXO1, TNFRSF14, MEF2B, TP53, BCL2, SGK1, GNA13, EZH2, BCL2s.]

Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher’s exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher’s exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).
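The colour assignment described in the Figure 2 legend (minimum of a left- and right-tailed Fisher's exact test, a trend threshold of 0.3, significance at P ≤ 0.05) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `fisher_tails` and `classify_pair` are hypothetical, and the case counts in the usage lines are invented for demonstration only.

```python
from math import comb


def fisher_tails(a, b, c, d):
    """One-sided Fisher's exact P values for the 2x2 table [[a, b], [c, d]].

    a = cases with both genes mutated, b = gene 1 only,
    c = gene 2 only, d = neither. Returns (p_left, p_right), where
    p_left is P(overlap <= a) and p_right is P(overlap >= a).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def pmf(k):
        # Hypergeometric probability of exactly k doubly mutated cases.
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    p_left = sum(pmf(k) for k in range(lo, a + 1))
    p_right = sum(pmf(k) for k in range(a, hi + 1))
    return p_left, p_right


def classify_pair(a, b, c, d, trend=0.3, significant=0.05):
    """Return (trend label, is_significant) following the legend's rule."""
    p_left, p_right = fisher_tails(a, b, c, d)
    p = min(p_left, p_right)
    if p > trend:
        return ("none", False)
    label = "mutual exclusion" if p_left < p_right else "co-occurrence"
    return (label, p <= significant)


# HYPOTHETICAL counts in a 127-case cohort (not taken from the paper):
# two genes mutated in 15 and 13 cases that never overlap.
print(classify_pair(0, 15, 13, 99))
# The same marginal totals with 10 doubly mutated cases.
print(classify_pair(10, 5, 3, 109))
```

With the invented counts, the first pair falls below the 0.3 trend threshold on the left tail without reaching significance, and the second is significant on the right tail, which corresponds to a pale blue and a dark red square respectively under the legend's scheme.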

[Figure 3 image: a, MLL2 domain map (PHD, PHD, HMG box, COG5141, FYRN, FYRC, SET) on an axis from 0 to 5,500 bp. b, MEF2B domain map (MADS box, MEF2) on an axis from 0 to 350 bp, with labelled variants K4E, Y69H, Y69C, N81K, N81Y, D83G, D83A and D83V.]

Figure 3 | Summary and effect of somatic mutations affecting MLL2 and MEF2B. a, Re-sequencing the MLL2 locus in 89 samples revealed mainly nonsense (red circles) and frameshift-inducing indel mutations (orange triangles; inverted triangles for insertions and upright triangles for deletions). A smaller number of non-synonymous somatic mutations (green circles) and point mutations or deletions affecting splice sites (yellow stars) were also observed. All of the non-synonymous point mutations affected a residue within either the catalytic SET domain, the FYRC domain (FY-rich carboxy-terminal domain) or PHD zinc finger domains. The effect of these splice-site mutations on MLL2 splicing was also explored (Supplementary Figure 7). b, The cSNVs and somatic mutations found in MEF2B in all FL and DLBCL cases sequenced are shown with the same symbols. Only the amino acids with variants in at least two patients are labelled. cSNVs were most prevalent in the first two protein-coding exons of MEF2B (exons 2 and 3). The crystal structure of MEF2 bound to EP300 supports the idea that two of the mutated sites (L67 and Y69) are important in the interaction between these proteins (Supplementary Figure 8 and Supplementary Discussion; ref. 50).

Table 2 | Summary of types of MLL2 somatic mutations

Sample type                    FL     DLBCL  DLBCL cell-line  Centroblast
Truncation                     18     4      7                0
Indel with frameshift          22     8      6                0
Splice site                    4      2      0                0
SNV                            3      2      2                0
Any mutation/number of cases   31/35  12/37  10/17            0/8
Percentage                     89     32     59               0
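The percentages quoted for MLL2 can be cross-checked against the counts in Table 2 with a few lines of arithmetic. The sketch below uses only numbers from the table; the variable names are illustrative, and treating truncating, frameshift and splice-site mutations together as "inactivating" is an assumption made here because it is the grouping that reproduces the 91% figure in the text.

```python
# Mutation counts per class and sample type, copied from Table 2.
table2 = {
    "Truncation":            {"FL": 18, "DLBCL": 4, "DLBCL cell-line": 7, "Centroblast": 0},
    "Indel with frameshift": {"FL": 22, "DLBCL": 8, "DLBCL cell-line": 6, "Centroblast": 0},
    "Splice site":           {"FL": 4,  "DLBCL": 2, "DLBCL cell-line": 0, "Centroblast": 0},
    "SNV":                   {"FL": 3,  "DLBCL": 2, "DLBCL cell-line": 2, "Centroblast": 0},
}
# (cases with any MLL2 mutation, cases sequenced) per sample type.
cases = {"FL": (31, 35), "DLBCL": (12, 37), "DLBCL cell-line": (10, 17), "Centroblast": (0, 8)}

# Total mutations per class and overall.
class_totals = {name: sum(row.values()) for name, row in table2.items()}
n_mutations = sum(class_totals.values())
class_pct = {name: round(100 * n / n_mutations) for name, n in class_totals.items()}

# Share of mutated cases per sample type.
case_pct = {name: round(100 * hit / total) for name, (hit, total) in cases.items()}

# ASSUMPTION: "inactivating" = truncation + frameshift indel + splice site.
inactivating = sum(class_totals[k] for k in ("Truncation", "Indel with frameshift", "Splice site"))
inactivating_pct = round(100 * inactivating / n_mutations)

print(class_totals, n_mutations)
print(class_pct)
print(case_pct)
print(inactivating_pct)
```

The class totals come to 29, 36, 6 and 7 out of 78 mutations, which matches the 37%, 46%, 8% and 9% breakdown given in the text, and the per-sample percentages agree with the last row of the table.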

ARTICLE RESEARCH

1 8 A U G U S T 2 0 1 1 | V O L 4 7 6 | N A T U R E | 3 0 1

chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.

With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common, mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense mutations. One such gene was MEF2B, which had not previously been linked to lymphoma. We found that 20 (15.7%) cases had MEF2B cSNVs and 4 (3.1%) cases had MEF2C cSNVs. All cSNVs detected by RNA-seq affected either the MADS box or MEF2 domains. To determine the frequency and scope of MEF2B mutations, we Sanger-sequenced exons 2 and 3 in 261 primary FL samples; 259 DLBCL primary tumours; 17 cell lines; 35 cases of assorted NHL (IBL, composite FL and PBMCL); and eight non-malignant centroblast samples. We also used a capture strategy (Supplementary Methods) to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. Of the variants, 55 (80%) affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table 11; Fig. 3b). Each patient generally had a single MEF2B variant and we observed relatively few (eight in total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table 12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).



Discussion

In our study of genome, transcriptome and exome sequences from 127 B-cell NHL cases, we identified 109 genes with clear evidence of somatic mutation in multiple individuals. Significant selection seems to act on at least 26 of these for the acquisition of either nonsense or missense mutations. To the best of our knowledge, the majority of these genes had not previously been associated with any cancer type. We observed an enrichment of somatic mutations affecting genes involved in transcriptional regulation and, more specifically, chromatin modification.

MLL2 emerged from our analysis as a major tumour suppressor locus in NHL. It is one of six human H3K4-specific methyltransferases (ref. 29), all of which share homology with the Drosophila trithorax gene. Trimethylated H3K4 (H3K4me3) is an epigenetic mark associated with the promoters of actively transcribed genes. By laying down this mark, MLLs are responsible for the transcriptional regulation of developmental genes including the homeobox (Hox) gene family (ref. 30), which collectively control segment specificity and cell fate in the developing embryo (refs 31, 32). Each MLL family member is thought to target different subsets of Hox genes (ref. 33) and, in addition, MLL2 is known to regulate the transcription of a diverse set of genes (ref. 34). Recently, MLL2 mutations were reported in a small-cell lung cancer cell line (ref. 35) and in renal carcinoma (ref. 36), but the frequency of nonsense mutations affecting MLL2 in these cancers was not established in these reports. Inactivating mutations were reported recently in MLL2 or MLL3 in 16% of medulloblastoma patients (ref. 37), further implicating MLL2 as a cancer gene.

Our data link MLL2 somatic mutations to B-cell NHL. The reported mutations are likely to be inactivating and in eight of the cases with multiple mutations, we confirmed that both alleles were affected, presumably resulting in essentially complete loss of MLL2 function. The high prevalence of MLL2 mutations in FL (89%) equals the frequency of the t(14;18)(q32;q21) translocation, which is considered the most prevalent genetic abnormality in FL (ref. 3). In DLBCL tumour samples and cell lines, MLL2 mutation frequencies were 32% and 59%, respectively, also exceeding the prevalence of the most frequent cytogenetic abnormalities, such as the various translocations involving 3q27, which occur in 25–30% of DLBCLs and are enriched in ABC cases (ref. 38). Importantly, we found MLL2 mutated in both DLBCL subtypes (Fig. 2). Our analyses thus indicate that MLL2 acts as a central tumour suppressor in FL and both DLBCL subtypes.

The MEF2 gene family encodes four related transcription factors that recruit histone-modifying enzymes including histone deacetylases (HDACs) and HATs in a calcium-regulated manner. Although truncating variants were detected in our analysis of MEF2 gene family members, our analysis suggests that, in contrast to MLL2, MEF2 family members tend to selectively acquire non-synonymous amino acid substitutions. In the case of MEF2B, 59.4% of all the cSNVs were found at four sites within the protein (K4, Y69, N81 and D83), and all four of these sites were confirmed to be targets of somatic mutation. D83 is affected in 39% of the MEF2B alterations, resulting in replacement of the charged aspartate with any of alanine, glycine or valine. Although we cannot yet predict the consequences of these substitutions on protein function, it seems likely that their effect would have an impact on the ability of MEF2B to facilitate gene expression and thus have a role in promoting the malignant transformation of germinal centre B cells to lymphoma (Supplementary Discussion).

MEF2B mutations can be linked to CREBBP and EP300 mutations, and to recurrent Y641 mutations in EZH2 (ref. 13). One target of CREBBP/EP300 HAT activity is H3K27, which is methylated by EZH2 to repress transcription. There is evidence that the action of EZH2 antagonizes that of CREBBP/EP300 (ref. 39). One function of MEF2 is to recruit either HDACs or CREBBP/EP300 to target genes (ref. 40), and it has been suggested that HDACs compete with CREBBP/EP300 for the same binding site on MEF2 (ref. 41). Under normal Ca2+ levels, MEF2 is bound by type IIa HDACs, which maintain the tails of histone proteins in a deacetylated repressive chromatin state (ref. 42). Increased cytoplasmic Ca2+ levels induce the nuclear export of HDACs, enabling the recruitment of HATs such as CREBBP/EP300, facilitating transcription at MEF2 target genes. Mutation of CREBBP, EP300 or MEF2B may have an impact on the expression of MEF2 target genes owing to reduced acetylation of nucleosomes near these genes (Supplementary Figure 5; Supplementary Discussion). In light of the recent finding that heterozygous EZH2 Y641 mutations enhance overall H3K27 trimethylation activity of PRC2 (refs 43, 44), it is possible that mutation of both MLL2 and EZH2 could cooperate in reducing the expression of some of the same target genes. Our data indicate that (1) post-transcriptional modification of histones is of key importance in germinal centre B cells and (2) deregulated histone modification due to these mutations is likely to result in reduced acetylation and enhanced methylation, and acts as a core driver event in the development of NHL (Supplementary Figure 5).

METHODS SUMMARY

All samples analysed contained at least 50% tumour cells. Genomes, exomes and transcriptomes were sequenced using a combination of Illumina GAIIx and HiSeq 2000 instruments to read lengths of between 36 and 100 nucleotides. Exome capture was performed using the Agilent SureSelect Target Enrichment System Protocol (Version 1.0, September 2009). Alignment was accomplished using BWA (ref. 45) and variants were identified using SNVmix (ref. 46). Variants were manually reviewed in IGV and were confirmed (where applicable) by PCR followed by either Sanger sequencing or Illumina re-sequencing. Structural rearrangements in genomes and transcriptomes were identified using ABySS (ref. 47). Gene expression values used for subtype assignment were calculated as reads per kilobase gene model per million mapped reads (RPKM) values (ref. 48) and subtypes were assigned using an adaptation of the method developed for data from Affymetrix expression arrays (ref. 49) trained with samples previously classified by this standard approach.
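The RPKM normalisation named above follows a standard formula: read count divided by gene-model length in kilobases and by the number of mapped reads in millions. A minimal Python sketch is given here for orientation; the function name and the example numbers are hypothetical and do not come from the study's pipeline.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads.

    Equivalent to read_count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2,000 bp gene model in a library of
# 10 million mapped reads gives 500 / 2 kb / 10 M = 25 RPKM.
example_value = rpkm(read_count=500, gene_length_bp=2_000, total_mapped_reads=10_000_000)
print(example_value)
```

Because the value is scaled by both gene length and library size, expression levels become comparable between genes and between samples, which is what a subtype classifier trained on array data needs as input.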

Received 13 November 2010; accepted 7 July 2011.

Published online 27 July 2011.

1. Anderson, J. R., Armitage, J. O., Weisenburger, D. D., Non-Hodgkin’s Lymphoma Classification Project. Epidemiology of the non-Hodgkin’s lymphomas: distributions of the major subtypes differ by geographic locations. Ann. Oncol. 9, 717–720 (1998).

2. Lenz, G. & Staudt, L. M. Aggressive lymphomas. N. Engl. J. Med. 362, 1417–1429 (2010).

3. Horsman, D. E. et al. Follicular lymphoma lacking the t(14;18)(q32;q21): identification of two disease subtypes. Br. J. Haematol. 120, 424–433 (2003).

4. Iqbal, J. et al. BCL2 translocation defines a unique tumor subset within the germinal center B-cell-like diffuse large B-cell lymphoma. Am. J. Pathol. 165, 159–166 (2004).

5. Lenz, G. et al. Molecular subtypes of diffuse large B-cell lymphoma arise by distinct genetic pathways. Proc. Natl Acad. Sci. USA 105, 13520–13525 (2008).

6. Pasqualucci, L. et al. Inactivation of the PRDM1/BLIMP1 gene in diffuse large B cell lymphoma. J. Exp. Med. 203, 311–317 (2006).

7. Kato, M. et al. Frequent inactivation of A20 in B-cell lymphomas. Nature 459, 712–716 (2009).

8. Compagno, M. et al. Mutations of multiple genes cause deregulation of NF-κB in diffuse large B-cell lymphoma. Nature 459, 717–721 (2009).

9. Davis, R. E. et al. Chronic active B-cell-receptor signalling in diffuse large B-cell lymphoma. Nature 463, 88–92 (2010).

10. Ngo, V. N. et al. Oncogenically active MYD88 mutations in human lymphoma. Nature 470, 115–119 (2011).

11. Mardis, E. R. et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N. Engl. J. Med. 361, 1058–1066 (2009).

12. Shah, S. P. et al. Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Nature 461, 809–813 (2009).

13. Morin, R. D. et al. Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin. Nature Genet. 42, 181–185 (2010).

14. Futreal, P. A. et al. A census of human cancer genes. Nature Rev. Cancer 4, 177–183 (2004).

15. Pasqualucci, L. et al. Inactivating mutations of acetyltransferase genes in B-cell lymphoma. Nature 471, 189–195 (2011).

16. Yusuf, I., Zhu, X., Kharas, M. G., Chen, J. & Fruman, D. A. Optimal B-cell proliferation requires phosphoinositide 3-kinase-dependent inactivation of FOXO transcription factors. Blood 104, 784–787 (2004).

17. Saito, M. et al. BCL6 suppression of BCL2 via Miz1 and its disruption in diffuse large B cell lymphoma. Proc. Natl Acad. Sci. USA 106, 11294–11299 (2009).

18. Lenz, G. et al. Oncogenic CARD11 mutations in human diffuse large B cell lymphoma. Science 319, 1676–1679 (2008).


SGK1 encodes a phosphatidylinositol-3-OH kinase (PI(3)K)-regulated kinase with functions including regulation of FOXO transcription factors (ref. 25), regulation of NF-κB by phosphorylating IκB kinase (ref. 26), and negative regulation of NOTCH signalling (ref. 27). SGK1 also resides within a region of chromosome 6 commonly deleted in DLBCL (Fig. 1; ref. 5). The mechanism by which SGK1 and GNA13 inactivation may contribute to lymphoma is unclear, but the strong degree of apparent selection towards their inactivation and their overall high mutation frequency (each mutated in 18 of 106 DLBCL cases) suggests that their loss contributes to B-cell NHL. Certain genes are known to be mutated more commonly in GCB DLBCLs (for example, TP53 (ref. 28) and EZH2 (ref. 13)). Here, both SGK1 and GNA13 mutations were found only in GCB cases (P = 1.93 × 10^-3 and 2.28 × 10^-4, Fisher’s exact test; n = 15 and 18, respectively) (Fig. 2). Two additional genes (MEF2B and TNFRSF14) with no previously described role in DLBCL showed a similar restriction to GCB cases (Fig. 2).

Inactivating MLL2 mutations

MLL2 showed the most significant evidence for selection and the largest number of nonsense SNVs. Our RNA-seq analysis indicated that 26.0% (33/127) of cases carried at least one MLL2 cSNV. To address the possibility that variable RNA-seq coverage of MLL2 failed to capture some mutations, we PCR-amplified the entire MLL2 locus (~36 kilobases) in 89 cases (35 primary FLs, 17 DLBCL cell lines, and 37 DLBCLs). Of these cases 58 were among the RNA-seq cohort. Illumina amplicon re-sequencing (Supplementary Methods) revealed 78 mutations, confirming the RNA-seq mutations in the overlapping cases and identifying 33 additional mutations. We confirmed the somatic status of 46 variants using Sanger sequencing (Supplementary Table 10), and showed that 20 of the 33 additional mutations were insertions or deletions (indels). Three SNVs at splice sites were also detected, as were 10 new cSNVs that had not been detected by RNA-seq.

The somatic mutations were distributed across MLL2 (Fig. 3a). Of these, 37% (n = 29/78) were nonsense mutations, 46% (n = 36/78) were indels that altered the reading frame, 8% (n = 6/78) were point mutations at splice sites and 9% (n = 7/78) were non-synonymous amino acid substitutions (Table 2). Four of the somatic splice site mutations had effects on MLL2 transcript length and structure. For example, two heterozygous splice site mutations resulted in the use of a novel splice donor site and an intron retention event.

Approximately half of the NHL cases we sequenced had two MLL2 mutations (Supplementary Table 10). We used bacterial artificial

Table 1 | Overview of cSNVs and confirmed somatic mutations in most frequently mutated genes

Columns: Gene; Cases (NS, S, T); Total (NS, S, T); Somatic cSNVs (RNA-seq cohort)*; P (raw); q; NS SP; T SP; Skew (M, WT, both)†.

Gene          Cases           Total           Somatic  P (raw)        q              NS SP   T SP   Skew
              NS   S    T     NS   S    T     cSNVs*
MLL2‡         16   8    17    17   8    18    10       6.85 × 10^-8   8.50 × 10^-7   0.834   14.4   WT
TNFRSF14^G‡   7    1    7     8    1    7     11       6.85 × 10^-8   8.50 × 10^-7   7.52    118    Both
SGK1^G‡       18   6    6     37   10   6     9        6.85 × 10^-8   8.50 × 10^-7   19.5    61.7   –
BCL10‡        2    0    4     3    0    4     4        6.85 × 10^-8   8.50 × 10^-7   3.62    112    WT
GNA13^G‡      21   1    2     33   1    2     5        6.85 × 10^-8   8.50 × 10^-7   24.1    25.7   Both
TP53^G‡       20   2    1     23   3    1     22       6.85 × 10^-8   8.50 × 10^-7   15.6    14.1   Both
EZH2^G‡       33   0    0     33   0    0     33       6.85 × 10^-8   8.50 × 10^-7   11.4    0.00   Both
BTG2‡         12   6    1     14   6    1     2        6.85 × 10^-8   8.50 × 10^-7   23.9    35.1   –
BCL2^G‡       42   45   0     96   105  0     43       9.35 × 10^-8   8.50 × 10^-7   3.78    0.00   M
BCL6‡§        11   2    0     12   2    0     2        9.35 × 10^-8   8.50 × 10^-7   0.175   0.00   M
CIITA‡§       5    3    0     6    3    0     2        9.35 × 10^-8   8.50 × 10^-7   0.086   0.00
FAS‡          2    0    4     3    0    4     2        1.52 × 10^-7   1.17 × 10^-6   2.54    66.5   WT
BTG1‡         11   6    2     11   7    2     10       1.52 × 10^-7   1.17 × 10^-6   17.5    52.5   Both
MEF2B^G‡      20   2    0     20   2    0     10       2.05 × 10^-7   1.47 × 10^-6   14.2    0.00   M
IRF8‡         11   5    3     14   5    3     3        4.55 × 10^-7   3.03 × 10^-6   8.82    28.2   WT
TMEM30A‡      1    0    4     1    0    4     4        6.06 × 10^-7   3.79 × 10^-6   0.785   65.0   WT
CD58‡         2    0    3     2    0    3     2        2.42 × 10^-6   1.43 × 10^-5   2.29    69.2   –
KLHL6‡        10   2    2     12   2    2     4        1.00 × 10^-5   5.26 × 10^-5   5.42    16.4   –
MYD88^A‡      13   2    0     14   2    0     9        1.00 × 10^-5   5.26 × 10^-5   12.4    0.00   WT
CD70‡         5    0    1     5    0    2     3        1.70 × 10^-5   8.48 × 10^-5   7.08    44.0   –
CD79B^A‡      7    2    1     9    2    1     5        2.00 × 10^-5   9.52 × 10^-5   10.9    18.3   M
CCND3‡        7    1    2     7    1    2     6        2.80 × 10^-5   1.27 × 10^-4   6.55    36.3   WT
CREBBP‡       20   7    4     24   7    4     9        1.00 × 10^-4   4.35 × 10^-4   2.72    6.04   Both
HIST1H1C‡     9    0    0     10   0    0     6        1.80 × 10^-4   7.50 × 10^-4   11.9    0.00   Both
B2M‡          7    0    0     7    0    0     4        3.90 × 10^-4   1.56 × 10^-3   16.6    0.00   WT
ETS1‡         10   1    0     10   1    0     4        4.10 × 10^-4   1.58 × 10^-3   5.76    0.00   WT
CARD11‡       14   3    0     14   3    0     3        1.90 × 10^-3   7.04 × 10^-3   3.37    0.00   Both
FAT2‡§        2    1    0     2    1    0     2        6.30 × 10^-3   2.25 × 10^-2   0.128   0.00   –
IRF4‡§        9    4    0     26   5    0     5        7.00 × 10^-3   2.41 × 10^-2   0.569   0.00   Both
FOXO1‡        8    4    0     10   4    0     4        7.60 × 10^-3   2.53 × 10^-2   4.02    0.00   –
STAT3         9    0    0     9    0    0     4        2.19 × 10^-2   6.08 × 10^-2   –       –      Both
RAPGEF1       8    3    0     10   3    0     3        2.98 × 10^-2   7.45 × 10^-2   –       –      WT
ABCA7         12   3    0     15   3    0     2        7.76 × 10^-2   1.67 × 10^-1   –       –      WT
RNF213        10   8    0     10   8    0     2        7.87 × 10^-2   1.67 × 10^-1   –       –      –
MUC16         17   12   0     39   25   0     2        8.32 × 10^-2   1.73 × 10^-1   –       –      –
HDAC7         8    4    0     8    4    0     2        8.94 × 10^-2   1.82 × 10^-1   –       –      WT
PRKDC         7    3    0     7    4    0     2        1.06 × 10^-1   2.05 × 10^-1   –       –      –
SAMD9         9    2    0     9    2    0     2        1.79 × 10^-1   3.01 × 10^-1   –       –      –
TAF1          10   0    0     10   0    0     2        3.03 × 10^-1   4.74 × 10^-1   –       –      –
PIM1          20   19   0     33   34   0     11       3.40 × 10^-1   5.23 × 10^-1   –       –      WT
COL4A2        8    2    0     8    2    0     2        7.64 × 10^-1   8.99 × 10^-1   –       –      –
EP300         8    7    1     8    7    1     3        9.54 × 10^-1   1.00           –       –      WT

Individual cases with non-synonymous (NS), synonymous (S) and truncating (T) mutations and the total number of mutations of each class are shown separately because some genes contained multiple mutations in the same case. The P values indicated in bold are the upper limit on the P value for that gene determined with the approach described in ref. 19 (see Supplementary Methods), q is the Benjamini-corrected q value, and NS SP and T SP refer to selective pressure estimates from this model for the acquisition of non-synonymous or truncating mutations, respectively. Genes with a superscript of either A or G were found to have mutations significantly enriched in ABC or GCB cases, respectively (P < 0.05, Fisher’s exact test).
* Additional somatic mutations identified in larger cohorts and insertion/deletion mutations are not included in this total.
† ‘Both’ indicates that we observed separate cases in which skewed expression was seen but where this skew was not consistent for the mutant or wild-type allele.
‡ Genes significant at a false discovery rate of 0.03. SNVs in BCL2 and previously confirmed hot spot mutations in EZH2 and CD79B are probably somatic in these samples based on published observations of others.
§ Selective pressure estimates are both < 1, indicating purifying selection rather than positive selection acting on this gene.
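The q values in Table 1 are described only as Benjamini-corrected. Assuming this refers to the widely used Benjamini–Hochberg step-up procedure (an assumption, since the source does not spell out the variant), the correction can be sketched in Python as follows. The function name and the four example P values are hypothetical and are not taken from the table.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q values in the same order as the input P values.

    For the i-th smallest P value (rank i of m tests) the raw adjustment is
    p * m / i; a running minimum taken from the largest rank downwards keeps
    the q values monotonic in the P values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


# Hypothetical P values for four genes.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(example_q)
```

Under this procedure, calling a gene significant at a false discovery rate of 0.03 corresponds to keeping genes whose q value is below 0.03.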


19. Greenman, C., Wooster, R., Futreal, P. A., Stratton, M. R. & Easton, D. F. Statistical analysis of pathogenicity of somatic mutations in cancer. Genetics 173, 2187–2198 (2006).

20. Cheung, K. J. et al. Acquired TNFRSF14 mutations in follicular lymphoma are associated with worse prognosis. Cancer Res. 70, 9166–9174 (2010).

21. Du, M. Q. et al. BCL10 gene mutation in lymphoma. Blood 95, 3885–3890 (2000).

22. Kreutz, B., Hajicek, N., Yau, D. M., Nakamura, S. & Kozasa, T. Distinct regions of Gα13 participate in its regulatory interactions with RGS homology domain-containing RhoGEFs. Cell. Signal. 19, 1681–1689 (2007).

23. Bhattacharyya, R. & Wedegaertner, P. Gα13 requires palmitoylation for plasma membrane localization, Rho-dependent signaling, and promotion of p115-RhoGEF membrane binding. J. Biol. Chem. 275, 14992–14999 (2000).

24. Manganello, J. M., Huang, J., Kozasa, T., Voyno-Yasenetskaya, T. A. & LeBreton, G. C. Protein kinase A-mediated phosphorylation of the Gα13 switch I region alters the Gαβγ13-G protein-coupled receptor complex and inhibits Rho activation. J. Biol. Chem. 278, 124–130 (2003).

25. Brunet, A. et al. Protein kinase SGK mediates survival signals by phosphorylating the forkhead transcription factor FKHRL1 (FOXO3a). Mol. Cell. Biol. 21, 952–965 (2001).

26. Tai, D. J. C., Su, C.-C., Ma, Y.-L. & Lee, E. H. Y. SGK1 phosphorylation of IκB kinase α and p300 Up-regulates NF-κB activity and increases N-methyl-D-aspartate receptor NR2A and NR2B expression. J. Biol. Chem. 284, 4073–4089 (2009).

27. Mo, J. et al. Serum- and glucocorticoid-inducible kinase 1 (SGK1) controls Notch1 signaling by downregulation of protein stability through Fbw7 ubiquitin ligase. J. Cell Sci. 124, 100–112 (2011).

28. Young, K. H. et al. Structural profiles of TP53 gene mutations predict clinical outcome in diffuse large B-cell lymphoma: an international collaborative study. Blood 112, 3088–3098 (2008).

29. Shilatifard, A. Molecular implementation and physiological roles for histone H3 lysine 4 (H3K4) methylation. Curr. Opin. Cell Biol. 20, 341–348 (2008).

30. Milne, T. et al. MLL targets SET domain methyltransferase activity to Hox gene promoters. Mol. Cell 10, 1107–1117 (2002).

31. Krumlauf, R. Hox genes in vertebrate development. Cell 78, 191–201 (1994).

32. Canaani, E. et al. ALL-1/MLL1, a homologue of Drosophila TRITHORAX, modifies chromatin and is directly involved in infant acute leukaemia. Br. J. Cancer 90, 756–760 (2004).

33. Wang, P. et al. Global analysis of H3K4 methylation defines MLL family member targets and points to a role for MLL1-mediated H3K4 methylation in the regulation of transcriptional initiation by RNA polymerase II. Mol. Cell. Biol. 29, 6074–6085 (2009).

34. Issaeva, I. et al. Knockdown of ALR (MLL2) reveals ALR target genes and leads to alterations in cell adhesion and growth. Mol. Cell. Biol. 27, 1889–1903 (2007).

35. Pleasance, E. D. et al. A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463, 184–190 (2010).

36. Dalgliesh, G. L. et al. Systematic sequencing of renal carcinoma reveals inactivation of histone modifying genes. Nature 463, 360–363 (2010).

37. Parsons, D. W. et al. The genetic landscape of the childhood cancer medulloblastoma. Science 331, 435–439 (2011).

38. Iqbal, J. et al. Distinctive patterns of BCL6 molecular alterations and their functional consequences in different subgroups of diffuse large B-cell lymphoma. Leukemia 21, 2332–2343 (2007).

39. Pasini, D. et al. Characterization of an antagonistic switch between histone H3 lysine 27 methylation and acetylation in the transcriptional regulation of Polycomb group target genes. Nucleic Acids Res. (2010).

40. Giordano, A. & Avantaggiati, M. p300 and CBP: partners for life and death. J. Cell. Physiol. 181, 218–230 (1999).

41. Han, A., He, J., Wu, Y., Liu, J. O. & Chen, L. Mechanism of recruitment of class II histone deacetylases by myocyte enhancer factor-2. J. Mol. Biol. 345, 91–102 (2005).

42. Youn, H. & Liu, J. Cabin1 represses MEF2-dependent Nur77 expression and T cell apoptosis by controlling association of histone deacetylases and acetylases with MEF2. Immunity 13, 85–94 (2000).

43. Yap, D. B. et al. Somatic mutations at EZH2 Y641 act dominantly through a mechanism of selectively altered PRC2 catalytic activity, to increase H3K27 trimethylation. Blood 117, 2451–2459 (2011).

44. Sneeringer, C. J. et al. Coordinated activities of wild-type plus mutant EZH2 drive tumor-associated hypertrimethylation of lysine 27 on histone H3 (H3K27) in human B-cell lymphomas. Proc. Natl Acad. Sci. USA 107, 20980–20985 (2010).

45. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

46. Goya, R. et al. SNVMix: predicting single nucleotide variants from next-generation sequencing of tumors. Bioinformatics 26, 730–736 (2010).

47. Robertson, G. et al. De novo assembly and analysis of RNA-seq data. Nature Methods 7, 909–912 (2010).

48. Mortazavi, A., Williams, B. A., Mccue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, 621–628 (2008).

49. Wright, G. et al. A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proc. Natl Acad. Sci. USA 100, 9991–9996 (2003).

50. He, J. et al. Structure of p300 bound to MEF2 on DNA reveals a mechanism of enhanceosome assembly. Nucleic Acids Res. (2011).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This study was funded in part by funding from the National Cancer Institute Office of Cancer Genomics (Contract No. HHSN261200800001E), the Terry Fox Foundation (grant 019001, Biology of Cancer: Insights from Genomic Analyses of Lymphoid Neoplasms) and Genome Canada/Genome British Columbia Grant Competition III (Project Title: High Resolution Analysis of Follicular Lymphoma Genomes) to J.M.C., R.D.G. and M.A.M. We acknowledge support from NIH grants P50CA130805-01 “SPORE in Lymphoma, Tissue Resource Core (PI Fisher)” and 1U01CA114778 “Molecular Signatures to Improve Diagnosis and Outcome in Lymphoma (PI Chan)”. A.J.M. is a Career Development Program Fellow of the Leukemia and Lymphoma Society. N.A.J. was a research fellow of the Terry Fox Foundation (award NCIC 019005) and the Michael Smith Foundation for Health Research (ST-PDF-01793). M.A.M. is a Terry Fox Young Investigator and a Michael Smith Senior Research Scholar. R.D.M. is a Vanier Scholar (CIHR) and holds a MSFHR senior graduate studentship. M.M.-L. acknowledges support from a Postdoctoral Fellowship from the Spanish Ministry of Education, under the “Programa Nacional de Movilidad de Recursos Humanos del Plan Nacional de I+D+i 2008-2011”. D.W.S. was supported by the Terry Fox Foundation Strategic Health Research Training Program in Cancer Research at Canadian Institutes of Health Research (Grant No. TGT-53912). J.J.S. acknowledges funding from The Canadian Cancer Society and the Canadian Institutes of Health Research. R.G. is supported by a UBC Four Year Fellowship. I.M.M. acknowledges the Canadian Foundation for Innovation for a Leaders Opportunity Fund. The laboratory work for this study was undertaken at the Genome Sciences Centre, British Columbia Cancer Research Centre and the Centre for Translational and Applied Genomics, a program of the Provincial Health Services Authority Laboratories. The authors would like to thank C. Greenman for supplying his software and also acknowledge D. Gerhard and S. Aparicio for discussions and guidance. Special thanks to C. Suragh, R. Roscoe, A. Troussard and A. Drobnies for expert project management assistance, and to the Library Construction, Sequencing and Bioinformatics teams at the Genome Sciences Centre. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government.

Author Contributions M.A.M., R.D.G., D.E.H., M.H. and J.M.C. conceived of the study and led the design of the experiments. R.D.M. performed the analysis of sequence data, identified mutations and, with M.M.-L., A.J.M. and M.A.M., produced figures and wrote the manuscript. M.M.-L., A.J.M., D.L.T., S. Chan, S. Chittarajan, D.S., H.M., J.S., M.M., T.Z., A.D., K.T., Y.B., M.R.F., J.T.-W. and T.M.S. designed and performed experiments to amplify, discover and validate mutations. R.G., M.G. and I.M.M. contributed to analyses and reviewed the manuscript. N.A.J., M.B., B.W. and B.M. prepared the samples, performed sample sorting and COO analysis and contributed to the text. A.B.-W. and J.J.S. collected and prepared constitutional DNA samples. K.L.M., R.C., S.L., M.F. and S.J. generated de novo assemblies and identified mutations. M.K., S.R., M.G., O.Y. and E.Y.Z. wrote software and contributed to figures. R.D.C. performed copy number analysis and produced a figure and S.B.-N. performed confirmatory FISH experiments. Y.Z. and A.T. produced the sequencing libraries. I.B., R.H., S.J.M.J., R.M., J.S. and M.H. contributed to the development of experimental and analytical protocols. L.R. provided materials and reviewed the manuscript.

Author Information The SRA accession number for the submission of the data not included in previous publications is SRP001599, which is linked to the dbGAP study accession phs000235.v2.p1. Reprints and permissions information is available at www.nature.com/reprints. This paper is distributed under the terms of the Creative Commons Attribution-Non-Commercial-Share Alike licence, and is freely available to all readers at www.nature.com/nature. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to M.A.M. ([email protected]).

ARTICLE RESEARCH

18 AUGUST 2011 | VOL 476 | NATURE | 303

chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.

With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense mutations. One such gene was MEF2B, which had not previously been linked to lymphoma. We found that 20 (15.7%) cases had MEF2B cSNVs and 4 (3.1%) cases had MEF2C cSNVs. All cSNVs detected by RNA-seq affected either the MADS box or MEF2 domains. To determine the frequency and scope of MEF2B mutations, we Sanger-sequenced exons 2 and 3 in 261 primary FL samples; 259 DLBCL primary tumours; 17 cell lines; 35 cases of assorted NHL (IBL, composite FL and PBMCL); and eight non-malignant centroblast samples. We also used a capture strategy (Supplementary Methods) to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. Of the variants, 55 (80%) affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table 11; Fig. 3b). Each patient generally had a single MEF2B variant and we observed relatively few (eight in total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table 12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).

[Figure 2 heat map. Side annotations: ABC enrichment and GCB enrichment markers; number of cases (scale 10 to 40); sample groups ABC, GCB, U and FL. P-value colour bins: <0.05, 0.1–0.05, 0.3–0.1. Genes, top to bottom (and left to right): MYD88, CD79B, BCL6s, TNFAIP3, CARD11, FAS, TMEM30A, CD58, CD70, STAT3, ETS1, HIST1H1C, CCND3, KLHL6, BTG1, BTG2, IRF8, B2M, EP300, CREBBP, MLL2, FOXO1, TNFRSF14, MEF2B, TP53, BCL2, SGK1, GNA13, EZH2, BCL2s.]

Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher’s exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher’s exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).
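The colouring rule in the Figure 2 caption can be illustrated with a minimal sketch. It assumes a 2x2 table of case counts for a gene pair (both mutated, only the first, only the second, neither) and computes both one-sided Fisher's exact P values from the hypergeometric distribution. The function names, the example counts and the choice to treat the 0.3 threshold as inclusive are assumptions made for illustration; they are not taken from the authors' software.

```python
from math import comb

def fisher_tails(a, b, c, d):
    """One-sided Fisher exact P values for the 2x2 table [[a, b], [c, d]].

    a = cases with both genes mutated, b = only gene 1, c = only gene 2,
    d = neither. Returns (left, right): P(X <= a) and P(X >= a) under the
    hypergeometric null with fixed margins.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    denom = comb(n, col1)

    def pmf(k):
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    left = sum(pmf(k) for k in range(lo, a + 1))
    right = sum(pmf(k) for k in range(a, hi + 1))
    return left, right

def classify_pair(a, b, c, d, trend=0.3):
    """Label a gene pair using the minimum of the two tails (assumed inclusive threshold)."""
    left, right = fisher_tails(a, b, c, d)
    p = min(left, right)
    if p > trend:
        return "none", p
    # A small left tail means fewer co-mutated cases than expected by chance.
    return ("mutual exclusion" if left < right else "co-occurrence"), p

# Hypothetical counts, for illustration only.
print(classify_pair(0, 10, 10, 0))   # never mutated together
print(classify_pair(10, 0, 0, 10))   # always mutated together
print(classify_pair(5, 5, 5, 5))     # no trend
```

In this sketch the darkest shade in the heat map would correspond to a returned P value of 0.05 or less.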

[Figure 3 diagrams. a, MLL2: domains PHD, PHD, HMG box, COG5141, FYRN, FYRC and SET; scale 0 to 5,500 bp. b, MEF2B: MADS box and MEF2 domains; scale 0 to 350 bp; labelled variants K4E, Y69H, Y69C, N81K, N81Y, D83G, D83A and D83V.]

Figure 3 | Summary and effect of somatic mutations affecting MLL2 and MEF2B. a, Re-sequencing the MLL2 locus in 89 samples revealed mainly nonsense (red circles) and frameshift-inducing indel mutations (orange triangles; inverted triangles for insertions and upright triangles for deletions). A smaller number of non-synonymous somatic mutations (green circles) and point mutations or deletions affecting splice sites (yellow stars) were also observed. All of the non-synonymous point mutations affected a residue within either the catalytic SET domain, the FYRC domain (FY-rich carboxy-terminal domain) or PHD zinc finger domains. The effect of these splice-site mutations on MLL2 splicing was also explored (Supplementary Figure 7). b, The cSNVs and somatic mutations found in MEF2B in all FL and DLBCL cases sequenced are shown with the same symbols. Only the amino acids with variants in at least two patients are labelled. cSNVs were most prevalent in the first two protein-coding exons of MEF2B (exons 2 and 3). The crystal structure of MEF2 bound to EP300 supports the idea that two of the mutated sites (L67 and Y69) are important in the interaction between these proteins (Supplementary Figure 8 and Supplementary Discussion)50.

Table 2 | Summary of types of MLL2 somatic mutations

Sample type                    FL      DLBCL   DLBCL cell line   Centroblast
Truncation                     18      4       7                 0
Indel with frameshift          22      8       6                 0
Splice site                    4       2       0                 0
SNV                            3       2       2                 0
Any mutation/number of cases   31/35   12/37   10/17             0/8
Percentage                     89      32      59                0
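As a quick arithmetic check, the percentage row of Table 2 follows from the "any mutation/number of cases" row when rounded to the nearest whole number. The sketch below simply recomputes it; the dictionary layout and function name are illustrative choices.

```python
# (cases with any MLL2 mutation, cases sequenced), taken from Table 2.
MLL2_COUNTS = {
    "FL": (31, 35),
    "DLBCL": (12, 37),
    "DLBCL cell line": (10, 17),
    "Centroblast": (0, 8),
}

def mutation_percentages(counts):
    """Return the percentage of mutated cases per sample type, rounded to an integer."""
    return {name: round(100 * mutated / total) for name, (mutated, total) in counts.items()}

percentages = mutation_percentages(MLL2_COUNTS)
for name, value in percentages.items():
    print(f"{name}: {value}%")
```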


S18 NATURE REPRINT COLLECTION Epigenetics 566 NATURE CHEMICAL BIOLOGY | VOL 7 | AUGUST 2011 | www.nature.com/naturechemicalbiology

ARTICLE PUBLISHED ONLINE: 10 JULY 2011 | DOI: 10.1038/NCHEMBIO.599

Protein lysine methylation is increasingly recognized as a major signaling mechanism in eukaryotic cells. This process has been most heavily studied in the context of epigenetic regulation of gene expression through methylation of lysine residues of histone proteins 1–6, but a growing number of known non-histone substrates suggest that the impact of lysine methylation is not limited to chromatin biology 7–10. Protein lysine methyltransferases (PKMTs) catalyze the transfer of a methyl group from S-adenosyl-L-methionine (SAM) to the ε-amino group of lysine residues of proteins, including histones 1,11. Since the first PKMT was characterized in 2000 (ref. 12), more than 50 human PKMTs have been identified 1,11. PKMTs show substantial variations in protein substrate selectivity and the degree of methylation on lysine, from mono- to di- to trimethylation. Selective pharmacological inhibition of individual PKMTs’ catalytic activity in cellular systems is a useful strategy for deciphering the complex signaling mechanisms of histone and protein lysine methylation. However, very few small-molecule tools are currently available for probing the activity of individual PKMTs 13.

Growing evidence suggests that PKMTs are important in the development of various human diseases 11,14. In particular, G9a (also known as KMT1C or EHMT2), which was initially identified as a histone H3 Lys9 (H3K9) methyltransferase 15, is overexpressed in various human cancers including leukemia 8, prostate carcinoma 8,16, hepatocellular carcinoma 17 and lung cancer 18. It has been shown that knockdown of G9a inhibits prostate, lung and leukemia cancer cell growth 16,18,19. The closely related protein GLP (also known as KMT1D or EHMT1) shares 80% sequence identity with G9a in their respective SET domains and forms a heterodimer with G9a 20. In addition to catalyzing mono- and dimethylation of H3K9 (refs. 15,20), both G9a and GLP dimethylate Lys373 of the tumor suppressor p53, inactivating p53’s transcriptional activity 8. Moreover, G9a has been shown to be involved in cocaine addiction 21, mental retardation 22, maintenance of HIV-1 latency 23 and DNA methylation in mouse embryonic stem (mES) cells 24–26. Furthermore, pharmacologic inhibition of G9a and GLP has been reported to facilitate reprogramming of mouse fetal neural precursor cells into induced pluripotent stem (iPS) cells 27,28. This broad range of cellular and disease-related activities poses a challenge for understanding G9a- and GLP-related biology and for the potential targeting of these proteins therapeutically. Thus, selective, potent and cell-active chemical probes for G9a and GLP would be extremely valuable tools for investigating the cellular role of these PKMTs, as well as for assessing their potential as therapeutic targets.

1 Structural Genomics Consortium, University of Toronto, Toronto, Ontario, Canada. 2 Center for Integrative Chemical Biology and Drug Discovery, Division of Medicinal Chemistry and Natural Products, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. 3 Developmental and Stem Cell Biology Program, SickKids Hospital, Toronto, Ontario, Canada. 4 Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada. 5 INRA, UMR 1198 Biologie du Développement et Reproduction, Jouy en Josas, France. 6 Krembil Family Epigenetic Laboratory, Centre for Addiction and Mental Health, Toronto, Ontario, Canada. 7 Department of Chemistry, Princeton University, Princeton, New Jersey, USA. 8 Institute of Systems Biology and Bioinformatics, National Central University, Jhongli City, Taiwan. 9 National Institute of Mental Health Psychoactive Drug Screening Program, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. 10 Department of Biochemistry and Biophysics, UNC Macromolecular Interactions Facility, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. 11 Ontario Cancer Institute, Campbell Family Cancer Research Institute and Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada. 12 These authors contributed equally to this work. * e-mail: [email protected] or [email protected]

A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells

Masoud Vedadi 1,12, Dalia Barsyte-Lovejoy 1,12, Feng Liu 2,12, Sylvie Rival-Gervier 3,5, Abdellah Allali-Hassani 1, Viviane Labrie 6, Tim J Wigle 2, Peter A DiMaggio 7, Gregory A Wasney 1, Alena Siarheyeva 1, Aiping Dong 1, Wolfram Tempel 1, Sun-Chong Wang 6,8, Xin Chen 2, Irene Chau 1, Thomas J Mangano 9, Xi-ping Huang 9, Catherine D Simpson 2, Samantha G Pattenden 2, Jacqueline L Norris 2, Dmitri B Kireev 2, Ashutosh Tripathy 10, Aled Edwards 1, Bryan L Roth 9, William P Janzen 2, Benjamin A Garcia 7, Arturas Petronis 6, James Ellis 3,4, Peter J Brown 1, Stephen V Frye 2, Cheryl H Arrowsmith 1,11* & Jian Jin 2*

Protein lysine methyltransferases G9a and GLP modulate the transcriptional repression of a variety of genes via dimethylation of Lys9 on histone H3 (H3K9me2) as well as dimethylation of non-histone targets. Here we report the discovery of UNC0638, an inhibitor of G9a and GLP with excellent potency and selectivity over a wide range of epigenetic and non-epigenetic targets. UNC0638 treatment of a variety of cell lines resulted in lower global H3K9me2 levels, equivalent to levels observed for small hairpin RNA knockdown of G9a and GLP with the functional potency of UNC0638 being well separated from its toxicity. UNC0638 markedly reduced the clonogenicity of MCF7 cells, reduced the abundance of H3K9me2 marks at promoters of known G9a-regulated endogenous genes and disproportionately affected several genomic loci encoding microRNAs. In mouse embryonic stem cells, UNC0638 reactivated G9a-silenced genes and a retroviral reporter gene in a concentration-dependent manner without promoting differentiation.

First published in Nature Chemical Biology 7, 566–574 (2011); doi: 10.1038/nchembio.599



The recent report of BIX01294 (1), a small-molecule inhibitor of G9a and GLP 29, was an important advance, as this compound is, to our knowledge, the first potent and selective PKMT inhibitor. BIX01294 has since been used successfully as a probe of G9a in cellular reprogramming 27,28 and reactivation of latent HIV-1 (ref. 23). BIX01294 at 4.1 μM reduced the abundance of the H3K9me2 mark in bulk histones in several cell lines and reduced H3K9me2 levels at G9a target genes 29. However, BIX01294 was toxic to cells at concentrations higher than 4.1 μM (ref. 29). This poor separation between the concentration producing robust functional effects in cells and the concentration causing toxicity has limited the compound’s usefulness as a G9a and GLP chemical probe. To provide a high-quality chemical probe 30 of G9a and GLP with an improved ratio of toxicity to functional potency (the toxicity/function ratio, determined by dividing the EC50 value of observed toxicity by the IC50 value of the functional potency), we have explored this 2,4-diamino-6,7-dimethoxyquinazoline template. We have previously reported the discovery of UNC0224 and UNC0321 (2) as potent and selective G9a and GLP inhibitors and described robust structure-activity relationships (SAR) of their analogs 31,32. Other studies of the SAR of this scaffold have resulted in the discovery of E72 as a potent and selective GLP inhibitor 33. However, UNC0321 (Supplementary Results, Supplementary Fig. 1) and E72 (ref. 33) are less potent than BIX01294 in cellular assays.

Here we report that UNC0638 (3) is a potent, selective and cell-penetrant chemical probe for G9a and GLP, with a toxicity/function ratio of >100, compared to <6 for BIX01294. We describe the discovery of UNC0638 and its in vitro potency, selectivity, mechanism of action and kinetics, X-ray cocrystal structure and robust on-target activities in cells. This greatly improved, well-characterized chemical probe represents a substantial advance in PKMT probe discovery and will enable better understanding of the epigenetic and cellular role(s) of G9a and GLP.
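The toxicity/function ratio defined above is the EC50 for toxicity divided by the IC50 for the functional readout. A minimal sketch follows; the EC50 and IC50 inputs are hypothetical placeholders chosen only to show a well-separated and a poorly separated compound, and they are not measurements from this study.

```python
def tox_function_ratio(ec50_toxicity_nm, ic50_function_nm):
    """Toxicity/function ratio: EC50 of observed toxicity divided by IC50 of functional potency.

    Both inputs must be in the same concentration unit (nM here).
    """
    if ic50_function_nm <= 0:
        raise ValueError("IC50 must be positive")
    return ec50_toxicity_nm / ic50_function_nm

# Hypothetical values for illustration only (not data from the paper).
well_separated = tox_function_ratio(ec50_toxicity_nm=10_000, ic50_function_nm=80)
poorly_separated = tox_function_ratio(ec50_toxicity_nm=2_500, ic50_function_nm=500)

print(f"well separated:   {well_separated:.0f}")    # above the >100 range reported for UNC0638
print(f"poorly separated: {poorly_separated:.0f}")  # below the <6 bound reported for BIX01294
```

A larger ratio means the functional effect appears at concentrations far below those that cause toxicity, which is the property the text highlights for a usable chemical probe.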

RESULTS

Discovery of UNC0638

Previously, initial inhibitor design and synthesis based on the X-ray cocrystal structures of the GLP–BIX01294 (PDB 3FPD) 34 and G9a–UNC0224 (PDB 3K5K) 31 complexes led us to discover UNC0321, a potent and selective inhibitor of G9a and GLP 32 (Scheme 1). However, UNC0321 was less potent in cellular assays than BIX01294 (Supplementary Fig. 1), even though it was more potent than BIX01294 in biochemical assays. We hypothesized that the poor cellular potency of UNC0321 was probably due to poor cell membrane permeability. Here, to improve the cellular potency of this series of compounds, we exploited the SAR of the quinazoline scaffold discovered previously 31,32 and designed several generations of new analogs aimed at increasing lipophilicity while maintaining high in vitro potency. Among the newly synthesized compounds, UNC0638 (Scheme 1), which has balanced in vitro potency and physicochemical properties aiding cell penetration, showed high potency in cellular assays and was considerably less toxic to cells than BIX01294 (see below). UNC0638 was efficiently synthesized via a novel seven-step synthetic sequence (Supplementary Scheme 1). In contrast to our previous synthetic route to UNC0321 (ref. 32), this new synthesis avoided the Mitsunobu reaction as the last synthetic step and thus greatly facilitated purification of the final compounds.

In addition, we designed and synthesized UNC0737 (4) (Scheme 1), the N-methyl analog of UNC0638, as a structurally similar but less potent G9a and GLP inhibitor for use as a negative control. UNC0737 was designed to eliminate the hydrogen bond interaction seen in the G9a–UNC0224 cocrystal structure between Asp1083 of G9a and the secondary amino group at the 4-position of UNC0224’s quinazoline

Table 1 | Selectivity of UNC0638 against epigenetic targets

Target     IC50 (nM) a           Ki (nM) a      Tm shift (°C) b
G9a        <15 c                 3 ± 0.05 d     4
GLP        19 ± 1 c                             8
SUV39H2    >10,000 c                            NT
SUV39H1    >10,000 e                            NT
SETD7      >10,000 c                            NT
SMYD3      NT                                   ND
MLL        >10,000 e                            NT
EZH2       >10,000 e                            NT
DOT1L      NT                                   ND
SETD8      >10,000 c                            ND
PRDM1      NT                                   ND
PRDM10     NT                                   ND
PRDM12     NT                                   ND
PRMT1      >10,000 e                            NT
PRMT3      >10,000 c                            ND
HTATIP     NT                                   ND
JMJD2E     4,500 ± 1,100 f                      NT
DNMT1      107,000 ± 6,000 g                    NT

a IC50 or Ki values are the average of at least two separate experiments. b Results from single DSF or differential static light scattering (DSLS) assay at 100 μM. c SAHH-coupled assay results. d MCE assay results. e Assay results from BPS Bioscience. f AlphaScreen assay results. g Radioactive methyl transfer assay results. NT, not tested. ND, not detected.

Scheme 1 | Discovery of UNC0638. Structure-based design and SAR exploration of the quinazoline template represented by BIX01294 led to the identification of UNC0321, a G9a and GLP inhibitor with high in vitro potency but poor cellular potency. The design and synthesis of several generations of new analogs aimed at improving cell membrane permeability while maintaining high in vitro potency resulted in the discovery of UNC0638, which has balanced in vitro potency and physicochemical properties aiding cell penetration. UNC0737, the N-methyl analog of UNC0638, was discovered as a structurally similar but less potent G9a and GLP inhibitor for use as a negative control.

[Scheme 1 chemical structures omitted. 1 (BIX01294): G9a IC50 = 180 nM; GLP IC50 = 34 nM. Step "improve in vitro potency" leads to 2 (UNC0321): G9a IC50 < 15 nM; GLP IC50 = 15 nM. Step "improve cellular potency" leads to 3 (UNC0638), R = H: G9a IC50 < 15 nM (n = 4); GLP IC50 = 19 ± 1 nM (n = 2). 4 (UNC0737), R = Me (negative control): G9a IC50 = 5,000 ± 200 nM (n = 2); GLP IC50 > 10,000 nM (n = 2).]


ring 31. Indeed, UNC0737 was >300-fold less potent than UNC0638 in G9a and GLP biochemical assays (see below).

UNC0638 is a potent and substrate-competitive inhibitor

The inhibitory effect of UNC0638 on G9a and GLP activity was first evaluated using the fluorescence-based S-adenosyl-L-homocysteine hydrolase (SAHH)-coupled assay, which monitors the conversion of the cofactor, SAM, to the cofactor product, S-adenosyl-L-homocysteine (SAH) 35. UNC0638 was a potent G9a (IC50 < 15 nM (n = 4)) and GLP inhibitor (IC50 = 19 ± 1 nM (n = 2)) in these SAHH-coupled assays (Table 1). An endoproteinase-coupled microfluidic capillary electrophoresis (MCE) assay 36, which is orthogonal and complementary to the SAHH-coupled assay, was also used to evaluate G9a inhibition by UNC0638, yielding an IC50 < 10 nM (n = 3). In addition, UNC0638 displaced a fluorescein-labeled 15-mer H3 peptide (residues 1–15) with high efficiency in a G9a fluorescence-polarization assay, suggesting that UNC0638 binds in the substrate peptide–binding site of G9a (Supplementary Fig. 2). UNC0638 also stabilized G9a and GLP in differential scanning fluorimetry (DSF) experiments, with Tm shifts of 4 °C and 8 °C, respectively, consistent with high-affinity binding (Supplementary Fig. 3).

We next determined detailed mechanism-of-action and Michaelis-Menten kinetic parameters associated with both the peptide and SAM as a function of UNC0638 concentration (Fig. 1a–d). These experiments confirmed that UNC0638 was competitive with the peptide substrate, as the Km of the peptide (Km peptide) increased linearly with UNC0638 concentration (Fig. 1b), and noncompetitive with cofactor SAM, as the apparent Km of SAM (Km app) remained constant in the presence of increasing concentrations of the compound (Fig. 1d). The Ki of UNC0638 was determined to be 3.0 ± 0.05 nM (n = 2). Consistent with this, the Morrison Ki (ref. 37) for UNC0638 was 3.7 ± 0.2 nM (n = 3) (Supplementary Fig. 4).
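The linear rise of the peptide Km with inhibitor concentration is what the textbook competitive-inhibition model predicts: Km,app = Km (1 + [I]/Ki), so a straight-line fit of Km,app against [I] has intercept Km and slope Km/Ki. The sketch below generates noise-free values from that model and recovers Ki from the fit. The uninhibited Km of 10 μM is a hypothetical placeholder, the concentration series is an assumed reading of the Figure 1 legend, and this is not the authors' fitting procedure.

```python
def km_apparent(km, inhibitor, ki):
    """Apparent Km under competitive inhibition: Km * (1 + [I] / Ki)."""
    return km * (1 + inhibitor / ki)

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ki_from_km_series(concs, km_values):
    """Ki = intercept / slope, because intercept = Km and slope = Km / Ki."""
    slope, intercept = fit_line(concs, km_values)
    return intercept / slope

KI_NM = 3.0    # Ki reported in the text (nM)
KM_UM = 10.0   # hypothetical uninhibited peptide Km (uM), illustration only
CONCS_NM = [0.0, 1.2, 1.8, 2.6, 3.9, 5.9, 8.9, 13.3]  # assumed UNC0638 series (nM)

km_series = [km_apparent(KM_UM, conc, KI_NM) for conc in CONCS_NM]
ki_estimate = ki_from_km_series(CONCS_NM, km_series)
print(f"recovered Ki = {ki_estimate:.2f} nM")
```

For the cofactor, a flat Km,app across the same series (slope near zero) is the pattern the text describes as not competing with SAM.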

Kinetics of the inhibition of G9a by UNC0638 was also studied using surface plasmon resonance (SPR). UNC0638 bound G9a tightly, with rapid association (ka = 2.12 × 10^6 1/Ms) and dissociation (kd = 5.7 × 10^−2 1/s) rates (Supplementary Fig. 5), consistent with a classic reversible mechanism of inhibition of G9a. The Kd of UNC0638 measured from equilibrium analysis of the Langmuir binding isotherms in the SPR studies was 27 nM, consistent with results from homogeneous assays.
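As a consistency check on the SPR numbers, a simple 1:1 reversible binding model gives Kd = kd/ka. The sketch below assumes the association rate is in units of 1/(M·s) and the dissociation rate in 1/s; the source reports its Kd from equilibrium analysis, so the kinetic ratio here is only a cross-check, and the half-life is a derived quantity not stated in the text.

```python
from math import log

K_ASSOC = 2.12e6   # association rate from the text, assumed units 1/(M*s)
K_DISSOC = 5.7e-2  # dissociation rate from the text, 1/s

def kd_from_rates(k_dissoc, k_assoc):
    """Equilibrium dissociation constant (M) for 1:1 binding: Kd = kd / ka."""
    return k_dissoc / k_assoc

def dissociation_half_life(k_dissoc):
    """Half-life (s) of the complex for first-order dissociation: ln(2) / kd."""
    return log(2) / k_dissoc

kd_nm = kd_from_rates(K_DISSOC, K_ASSOC) * 1e9
half_life_s = dissociation_half_life(K_DISSOC)

print(f"Kd from rates: {kd_nm:.1f} nM")         # close to the 27 nM equilibrium value
print(f"complex half-life: {half_life_s:.1f} s")
```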

As expected, UNC0737, the N-methyl analog of UNC0638, was a poor inhibitor of G9a (IC50 = 5,000 ± 200 nM (n = 2)) and GLP (IC50 > 10,000 nM (n = 2)) in the SAHH-coupled assays (Supplementary Table 2). The combination of the high structural similarity between UNC0737 and UNC0638 and the >300-fold loss of potency in UNC0737 compared to UNC0638 makes UNC0737 an appropriate negative control for use in cellular and functional assays.

UNC0638 is a selective inhibitor of G9a and GLP

The selectivity of UNC0638 over a wide range of epigenetic targets was evaluated (Table 1). Notably, UNC0638 was inactive against other H3K9 (SUV39H1 and SUV39H2), H3K27 (EZH2), H3K4 (SETD7, MLL and SMYD3), H3K79 (DOT1L) and H4K20 (SETD8) methyltransferases, as well as PRDM1, PRDM10 and PRDM12. In addition, UNC0638 was inactive against protein arginine methyltransferases PRMT1 and PRMT3, and HTATIP, a histone acetyltransferase. Of note, UNC0638 had weak but measurable activity against JMJD2E (IC50 = 4,500 ± 1,100 nM (n = 3)), a Jumonji protein demethylase, and DNA methyltransferase DNMT1 (IC50 = 107,000 ± 6,000 nM (n = 2)). Nevertheless, the selectivity of UNC0638 for G9a and GLP over JMJD2E was >200-fold, and selectivity for G9a and GLP over DNMT1 was >5,000-fold.
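The fold-selectivity figures can be reproduced as a ratio of IC50 values. The sketch below assumes the comparison uses the GLP IC50 of 19 nM as the on-target reference, which gives a conservative lower bound because the G9a IC50 is reported only as below 15 nM. The choice of reference and the function name are assumptions for illustration.

```python
ON_TARGET_IC50_NM = 19.0  # GLP IC50 from the text; G9a is lower (< 15 nM)

OFF_TARGET_IC50_NM = {
    "JMJD2E": 4_500.0,    # weak but measurable activity
    "DNMT1": 107_000.0,
}

def fold_selectivity(off_target_ic50, on_target_ic50):
    """Fold selectivity: off-target IC50 divided by on-target IC50 (same units)."""
    return off_target_ic50 / on_target_ic50

selectivity = {
    target: fold_selectivity(ic50, ON_TARGET_IC50_NM)
    for target, ic50 in OFF_TARGET_IC50_NM.items()
}
for target, fold in selectivity.items():
    print(f"{target}: {fold:.0f}-fold")
```

With these inputs the ratios come out above 200 for JMJD2E and above 5,000 for DNMT1, in line with the bounds quoted in the text.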

We also evaluated the selectivity of UNC0638 over a broad range of non-epigenetic targets, including G protein coupled receptors (GPCRs), ion channels, transporters and kinases (Supplementary Tables 3 and 4). UNC0638 was clean (<30% inhibition at 1 μM) against 26 out of 29 targets in the Ricerca Selectivity Panel. At 1 μM concentration, UNC0638 showed 64%, 90% and 69% inhibition of muscarinic M2, adrenergic α1A and adrenergic α1B receptors, respectively. Because UNC0638 hit three GPCRs in the Ricerca Selectivity Panel, we further assessed its selectivity against GPCRs by testing UNC0638 in the US National Institute of Mental Health’s Psychoactive Drug Screen Program Selectivity Panel, which consists of a total of 45 targets, including 36 GPCRs. UNC0638 had <50% inhibition at 1 μM against 39 targets in the panel, and >50% inhibition at 1 μM against 6 targets in the panel. Ki in the radioligand binding assay for each of the six interacting GPCRs was subsequently determined. The Ki measurements showed UNC0638 was at least 100-fold selective for G9a over these six GPCRs. In M1, M2 and M4 functional assays, UNC0638 had no agonist activity, low antagonist potency against M1 and M4 (IC50 > 10,000 nM (n = 2)), and modest antagonist potency against M2 (IC50 = 480 ± 10 nM (n = 2)). Furthermore, UNC0638 was tested against a panel of 24 kinases and showed <10% inhibition at 1 μM against these kinases. Therefore, we conclude that when used at appropriate concentrations (for example, <500 nM), the effects of UNC0638 on histone or other lysine methylation substrates can be interpreted as primarily due to the inhibition of G9a and GLP.

The selectivity of UNC0737 is summarized in Supplementary Table 2. Like UNC0638, UNC0737 was inactive against SUV39H2, SETD7, SETD8 and PRMT3, had a binding-affinity range of 60 to

Figure 1 | UNC0638 competes with the peptide substrate but not with the cofactor SAM. We determined the velocity of the reaction by measuring the conversion of substrate to product at six time points spanning 100 min, analyzed these data by linear regression to determine initial steady-state enzyme velocity and fitted them to Michaelis-Menten kinetics. The Km values of the peptide and SAM were then plotted as a function of UNC0638 concentrations. (a, b) UNC0638 is competitive with the H3K9 peptide substrate, as Km peptide increases linearly with compound concentration. (c, d) UNC0638 does not compete with the cofactor SAM, as Km app was not affected by the compound. (e, f) The X-ray cocrystal structure of the G9a–UNC0638–SAH complex confirms the mechanism of action of UNC0638. UNC0638 (in gray, blue and red sticks) occupies the peptide binding groove and does not interact with the SAM binding pocket. The 7-(3-pyrrolidin-1-yl-)propoxy side chain of UNC0638 interacts with the lysine binding channel.

Tyr1154

Asp1083

Leu1086

Asp1078

0.4

0.03

0.02

0.01

0

UNC0638 (nM)13.3 80

60

40

K mpe

ptid

e (μM

)K m

app (μ

M)

20

25

15

10

5

0

20

00 0.005 0.010

Ki = 3.0 ± 0.05 nM

UNC0638 (μM)0.015

8.93.92.61.81.20

13.3UNC0638 (nM)

8.95.93.92.61.81.20

0.3

V obs (μ

M/m

in)

V obs (μ

M m

in–1

)

0.2

0.1

00 20 40

0 0 0.002

Hydrogenbonding

(side chain)

Hydrogenbonding

(side chain) Hydrogenbonding

(side chain)

⊕ – π(side chain)

0.004UNC0638 (μM)

0.00610050SAM (μM)

60H3K9 peptide (μM)

80 100

ba

c d

e f

Asp1088

Asp1083

Asp1088

Leu1086

Tyr1154

NATURE REPRINT COLLECTION Epigenetics S21568 NATURE CHEMICAL BIOLOGY | VOL 7 | AUGUST 2011 | www.nature.com/naturechemicalbiology

ARTICLE NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.599

ring 31. Indeed, UNC0737 was >300-fold less potent than UNC0638 in G9a and GLP biochemical assays (see below).

UNC0638 is a potent and substrate-competitive inhibitor

The inhibitory effect of UNC0638 on G9a and GLP activity was first evaluated using the fluorescence-based S-adenosyl-L-homocysteine hydrolase (SAHH)-coupled assay, which monitors the conversion of the cofactor, SAM, to the cofactor product, S-adenosyl-L-homocysteine (SAH) 35. UNC0638 was a potent G9a (IC50 < 15 nM (n = 4)) and GLP inhibitor (IC50 = 19 ± 1 nM (n = 2)) in these SAHH-coupled assays (Table 1). An endoproteinase-coupled microfluidic capillary electrophoresis (MCE) assay 36, which is orthogonal and complementary to the SAHH-coupled assay, was also used to evaluate G9a inhibition by UNC0638, yielding an IC50 < 10 nM (n = 3). In addition, UNC0638 displaced a fluorescein-labeled 15-mer H3 peptide (residues 1–15) with high efficiency in a G9a fluorescence-polarization assay, suggesting that UNC0638 binds in the substrate peptide–binding site of G9a (Supplementary Fig. 2). UNC0638 also stabilized G9a and GLP in differential scanning fluorimetry (DSF) experiments, with Tm shifts of 4 °C and 8 °C, respectively, consistent with high-affinity binding (Supplementary Fig. 3).

We next determined the detailed mechanism of action and the Michaelis-Menten kinetic parameters associated with both the peptide and SAM as a function of UNC0638 concentration (Fig. 1a–d). These experiments confirmed that UNC0638 was competitive with the peptide substrate, as the Km of the peptide (Km peptide) increased linearly with UNC0638 concentration (Fig. 1b), and noncompetitive with cofactor SAM, as the apparent Km of SAM (Km app) remained constant in the presence of increasing concentrations of the compound (Fig. 1d). The Ki of UNC0638 was determined to be 3.0 ± 0.05 nM (n = 2). Consistent with this, the Morrison Ki (ref. 37) for UNC0638 was 3.7 ± 0.2 nM (n = 3) (Supplementary Fig. 4).
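The linear rise of Km peptide with inhibitor concentration is what the textbook competitive-inhibition relation, Km,app = Km × (1 + [I]/Ki), predicts. The sketch below is only an illustration under that assumed model: the uninhibited peptide Km of 10 μM is a hypothetical placeholder, whereas Ki = 3.0 nM and the inhibitor concentrations (0 to 13.3 nM) are taken from the text and from Figure 1. It shows how Ki can be recovered as intercept/slope from a straight-line fit of Km,app against inhibitor concentration.

```python
def km_apparent(km, inhibitor_nm, ki_nm):
    """Apparent Km of the competing substrate under competitive inhibition."""
    return km * (1.0 + inhibitor_nm / ki_nm)


def ki_from_line(concs_nm, km_apps):
    """Fit a least-squares line to (inhibitor, Km,app) and return Ki = intercept / slope."""
    n = len(concs_nm)
    mean_x = sum(concs_nm) / n
    mean_y = sum(km_apps) / n
    sxx = sum((x - mean_x) ** 2 for x in concs_nm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concs_nm, km_apps))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept / slope


KM_PEPTIDE_UM = 10.0  # hypothetical uninhibited Km, for illustration only
KI_NM = 3.0           # Ki reported in the text
concs = [0.0, 1.2, 1.8, 2.6, 3.9, 5.9, 8.9, 13.3]  # nM, as labeled in Figure 1

km_apps = [km_apparent(KM_PEPTIDE_UM, c, KI_NM) for c in concs]
recovered_ki = ki_from_line(concs, km_apps)
print(f"Ki recovered from the line: {recovered_ki:.2f} nM")
```

Because the synthetic points lie exactly on the model line, the fit returns the input Ki; with real data the same ratio gives an estimate whose uncertainty depends on the scatter of the Km values.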

The kinetics of the inhibition of G9a by UNC0638 were also studied using surface plasmon resonance (SPR). UNC0638 bound G9a tightly, with rapid association (ka = 2.12 × 10^6 1/(M·s)) and dissociation (kd = 5.7 × 10^−2 1/s) rates (Supplementary Fig. 5), consistent with a classic reversible mechanism of inhibition of G9a. The Kd of UNC0638 measured from equilibrium analysis of the Langmuir binding isotherms in the SPR studies was 27 nM, consistent with results from homogeneous assays.
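For a simple 1:1 (Langmuir) interaction, the dissociation constant can also be estimated from the rate constants as Kd = kd/ka. The sketch below assumes that model and assumes the association rate constant is expressed per molar per second. The reported 27 nM comes from equilibrium analysis, so the kinetic ratio serves here only as a consistency check.

```python
import math


def kinetic_kd_nm(k_on_per_m_per_s, k_off_per_s):
    """Kd = k_off / k_on for a 1:1 binding model, converted from M to nM."""
    return k_off_per_s / k_on_per_m_per_s * 1e9


K_ON = 2.12e6   # association rate constant from the text, 1/(M*s) assumed
K_OFF = 5.7e-2  # dissociation rate constant from the text, 1/s

kd_nm = kinetic_kd_nm(K_ON, K_OFF)
complex_half_life_s = math.log(2) / K_OFF  # time for half of the complex to dissociate

print(f"Kinetic Kd: {kd_nm:.1f} nM")
print(f"Complex half-life: {complex_half_life_s:.1f} s")
```

The ratio comes out close to the equilibrium value quoted in the text, and the short half-life of the complex fits the description of a rapidly reversible inhibitor.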

As expected, UNC0737, the N-methyl analog of UNC0638, was a poor inhibitor of G9a (IC50 = 5,000 ± 200 nM (n = 2)) and GLP (IC50 > 10,000 nM (n = 2)) in the SAHH-coupled assays (Supplementary Table 2). The combination of the high structural similarity between UNC0737 and UNC0638 and the >300-fold loss of potency in UNC0737 compared to UNC0638 makes UNC0737 an appropriate negative control for use in cellular and functional assays.

UNC0638 is a selective inhibitor of G9a and GLP

The selectivity of UNC0638 over a wide range of epigenetic targets was evaluated (Table 1). Notably, UNC0638 was inactive against other H3K9 (SUV39H1 and SUV39H2), H3K27 (EZH2), H3K4 (SETD7, MLL and SMYD3), H3K79 (DOT1L) and H4K20 (SETD8) methyltransferases, as well as PRDM1, PRDM10 and PRDM12. In addition, UNC0638 was inactive against protein arginine methyltransferases PRMT1 and PRMT3, and HTATIP, a histone acetyltransferase. Of note, UNC0638 had weak but measurable activity against JMJD2E (IC50 = 4,500 ± 1,100 nM (n = 3)), a Jumonji protein demethylase, and DNA methyltransferase DNMT1 (IC50 = 107,000 ± 6,000 nM (n = 2)). Nevertheless, the selectivity of UNC0638 for G9a and GLP over JMJD2E was >200-fold, and selectivity for G9a and GLP over DNMT1 was >5,000-fold.
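The fold-selectivity figures above are consistent with simple ratios of IC50 values. The sketch below assumes the ratio is taken against the GLP value (19 nM), which is the weaker of the two on-target IC50 values. Because the G9a IC50 is reported only as an upper bound (<15 nM), a ratio against G9a would be a lower bound on selectivity. This is an illustrative reading of the numbers and is not necessarily how the authors computed them.

```python
def fold_selectivity(ic50_off_target_nm, ic50_target_nm):
    """Ratio of off-target to on-target IC50; larger means more selective."""
    return ic50_off_target_nm / ic50_target_nm


GLP_IC50_NM = 19.0         # on-target IC50 from the text
JMJD2E_IC50_NM = 4_500.0   # off-target IC50 from the text
DNMT1_IC50_NM = 107_000.0  # off-target IC50 from the text

jmjd2e_fold = fold_selectivity(JMJD2E_IC50_NM, GLP_IC50_NM)
dnmt1_fold = fold_selectivity(DNMT1_IC50_NM, GLP_IC50_NM)

print(f"JMJD2E vs GLP: {jmjd2e_fold:.0f}-fold")
print(f"DNMT1 vs GLP: {dnmt1_fold:.0f}-fold")
```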

We also evaluated the selectivity of UNC0638 over a broad range of non-epigenetic targets, including G protein-coupled receptors (GPCRs), ion channels, transporters and kinases (Supplementary Tables 3 and 4). UNC0638 was clean (<30% inhibition at 1 μM)

against 26 out of 29 targets in the Ricerca Selectivity Panel. At 1 μM concentration, UNC0638 showed 64%, 90% and 69% inhibition of muscarinic M2, adrenergic α1A and adrenergic α1B receptors, respectively. Because UNC0638 hit three GPCRs in the Ricerca Selectivity Panel, we further assessed its selectivity against GPCRs by testing UNC0638 in the US National Institute of Mental Health’s Psychoactive Drug Screen Program Selectivity Panel, which consists of a total of 45 targets, including 36 GPCRs. UNC0638 had <50% inhibition at 1 μM against 39 targets in the panel, and >50% inhibition at 1 μM against 6 targets in the panel. Ki in the radioligand binding assay for each of the six interacting GPCRs was subsequently determined. The Ki measurements showed UNC0638 was at least 100-fold selective for G9a over these six GPCRs. In M1, M2 and M4 functional assays, UNC0638 had no agonist activity, low antagonist potency against M1 and M4 (IC50 > 10,000 nM (n = 2)), and modest antagonist potency against M2 (IC50 = 480 ± 10 nM (n = 2)). Furthermore, UNC0638 was tested against a panel of 24 kinases and showed <10% inhibition at 1 μM against these kinases. Therefore, we conclude that when used at appropriate concentrations (for example, <500 nM), the effects of UNC0638 on histone or other lysine methylation substrates can be interpreted as primarily due to the inhibition of G9a and GLP.

The selectivity of UNC0737 is summarized in Supplementary Table 2. Like UNC0638, UNC0737 was inactive against SUV39H2, SETD7, SETD8 and PRMT3, had a binding-affinity range of 60 to

Figure 1 | UNC0638 competes with the peptide substrate but not with the cofactor SAM. We determined the velocity of the reaction by measuring the conversion of substrate to product at six time points spanning 100 min, analyzed these data by linear regression to determine initial steady-state enzyme velocity and fitted them to Michaelis-Menten kinetics. The Km values of the peptide and SAM were then plotted as a function of UNC0638 concentrations. (a, b) UNC0638 is competitive with the H3K9 peptide substrate, as Km peptide increases linearly with compound concentration. (c, d) UNC0638 does not compete with the cofactor SAM, as Km app was not affected by the compound. (e, f) The X-ray cocrystal structure of the G9a–UNC0638–SAH complex confirms the mechanism of action of UNC0638. UNC0638 (in gray, blue and red sticks) occupies the peptide binding groove and does not interact with the SAM binding pocket. The 7-(3-pyrrolidin-1-yl-)propoxy side chain of UNC0638 interacts with the lysine binding channel.

[Figure 1 panels: (a) Vobs (μM/min) versus H3K9 peptide (μM) at 0 to 13.3 nM UNC0638; (b) Km peptide (μM) versus UNC0638 (μM), annotated Ki = 3.0 ± 0.05 nM; (c) Vobs (μM/min) versus SAM (μM) at 0 to 13.3 nM UNC0638; (d) Km app (μM) versus UNC0638 (μM); (e, f) structure views with labeled residues Asp1078, Asp1083, Leu1086, Asp1088 and Tyr1154 and annotated hydrogen-bonding and cation–π side-chain interactions.]


1,000 nM (Ki values) for α1A, α2C, M1, M2 and M4 receptors, had low to modest potency (IC50 800 to >10,000 nM) in M1, M2 and M4 functional assays and was inactive against a panel of 24 kinases. The similar molecular profiles of UNC0737 and UNC0638 against epigenetic targets (except G9a and GLP) and non-epigenetic targets make UNC0737 an appropriate negative control in terms of selectivity.

Crystal structure of the G9a–UNC0638–SAH complex

An X-ray crystal structure of the G9a–UNC0638–SAH complex (2.56-Å resolution; Fig. 1e,f and Supplementary Table 1) provides structural insight into the mechanism of action. First, UNC0638 occupies the substrate binding groove and does not interact with the SAM binding pocket. This finding is consistent with the results from the inhibitor-peptide-SAM competition experiments. Second, the hydrogen of the secondary amino group at the 4-position of the quinazoline ring indeed forms a hydrogen bond with Asp1083, explaining the marked potency loss of UNC0737 compared to UNC0638. Finally, the lysine binding channel is occupied by the 7-(3-pyrrolidin-1-yl-)propoxy side chain. Compared to the X-ray crystal structures of the GLP–BIX01294 (PDB 3FPD) 34 and G9a–UNC0224 (PDB 3K5K) 31 complexes, the same binding mode was observed for UNC0638 (Supplementary Fig. 6).

UNC0638 is stable under cellular assay conditions

1H NMR and LC-MS analysis of a solution of UNC0638 (10 mM) in deuterated DMSO and deuterated H2O (90:10 ratio) that had been kept at room temperature for 4 weeks indicated that UNC0638 was stable under these conditions; no degradation products were found. Incubation of UNC0638 with or without MCF7, U2OS or H1299 cells in two types of cell media for 65 h also did not produce degradation products of UNC0638. In mouse drug metabolism and pharmacokinetic studies, UNC0638 had high clearance, short half-life, high volume of distribution and low exposure after intravenous, oral or intraperitoneal administration (Supplementary Table 5). Thus, although UNC0638 is probably not suitable for in vivo animal studies owing to low exposure levels, its high stability under cellular assay conditions, in combination with high potency and selectivity, makes UNC0638 an ideal chemical tool for cell-based studies.

UNC0638 has high cellular potency and low toxicity

G9a and GLP are the primary enzymes affecting dimethylation of histone H3K9 in cells 15,20. To assess the cellular potency of UNC0638, we used an H3K9me2 antibody-based cell immunofluorescence (in-cell western) assay. This assay allows rapid processing of multiple samples for H3K9me2 immunofluorescence signal (Fig. 2a, green signal) and normalization to cell number via the use of the nucleic acid dye DRAQ5 (Fig. 2a, red signal). We verified the specificity of the antibody by comparison of dose-dependent dot blots and by the reduced cellular immunofluorescence signal in G9a and GLP knockdown experiments (Fig. 2a). Initially, the data were normalized to total H3 levels (Supplementary Fig. 7); however, this was found to be consistent with the DRAQ5 normalization, and subsequently the latter was used. We characterized UNC0638 and UNC0737 in MDA-MB-231 cells because of their robust H3K9me2 levels and good tolerance of G9a and GLP knockdown. In MDA-MB-231 and MCF7 cells, treatment with several short hairpin RNAs (shRNAs) reduced G9a and GLP to 25–40% of the levels in control experiments, and also yielded consistently lower levels of H3K9me2 (Supplementary Fig. 8). In MDA-MB-231 cells, UNC0638 (48 h exposure) reduced H3K9me2 levels in a concentration-dependent manner with an IC50

Figure 2 | UNC0638 inhibits cellular H3K9 dimethylation and shows good separation of functional potency and toxicity in MDA-MB-231 cells. (a) UNC0638 (48 h) or G9a and/or GLP shRNAs reduced H3K9 dimethylation levels. H3K9me2 antibody was used for cell immunostaining (in-cell western) and results were normalized to cell number measured by uptake of a nucleic acid dye (DRAQ5). (b) UNC0638 was considerably more potent than BIX01294 and UNC0737 (negative control) in reducing cellular H3K9me2 levels, which were measured after MDA-MB-231 cells were treated with inhibitors for 48 h. Dashed line indicates level of H3K9me2 resulting from G9a and GLP knockdown. (c) Cellular levels of H3K9me2 were progressively reduced from 1 d to 4 d exposure to UNC0638 at three concentrations (80 nM, representing IC50; 250 nM, representing IC90; and 500 nM, representing 2 × IC90). The reductions with 250-nM and 500-nM treatments after 4 d were equal or very close to that of G9a and GLP knockdown cells. Refreshing the inhibitor after 2 d (2 + 2d) increased inhibition by UNC0638 at 80 nM but had little further effect at 250 and 500 nM. The effects of UNC0638 were long-lasting. In cells with 2 d exposure to UNC0638, levels of H3K9me2 remained low after washout of compound followed by 2 d incubation without the inhibitor (2 + 2dw). (d) UNC0638 and UNC0737 had lower cellular toxicity than BIX01294 in MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide) assays. (e, f) UNC0638 had good separation of functional potency (decrease in H3K9me2 levels) and toxicity (from the MTT assay), whereas BIX01294 had poor separation of these effects.

[Figure 2 panels: (a) in-cell western images of H3K9me2 and cell viability (DRAQ5) signals for UNC0638 at 0 to 5,000 nM and for control, G9a, GLP and G9a + GLP shRNAs; (b) percentage H3K9me2 versus compound (nM) for BIX01294, UNC0638 and UNC0737, with the G9a and GLP shRNA level marked; (c) percentage H3K9me2 at 1 d, 2 d, 3 d, 4 d, 2 + 2d and 2 + 2dw for 80, 250 and 500 nM UNC0638, with no-antibody controls; (d) percentage MTT versus compound (nM) for BIX01294, UNC0638 and UNC0737; (e, f) percentage response (H3K9me2 and MTT) versus compound (nM) for UNC0638 and BIX01294.]


of 81 ± 9 nM (n = 3), which indicates considerably higher potency than BIX01294 (IC50 = 500 ± 43 nM (n = 3)) (Fig. 2b). The maximum effect of UNC0638 in reducing H3K9me2 levels was greater than that of BIX01294 and close, but not equal, to that of the double knockdown of G9a and GLP via shRNA (Fig. 2b). Consistent with its poor in vitro potency, UNC0737 (negative control) showed poor cellular potency in the in-cell western (IC50 > 5,000 nM (n = 3); Fig. 2b) and chromatin immunoprecipitation (ChIP) assays (Supplementary Fig. 9).

We next studied the time course of effects of UNC0638 on cellular levels of H3K9me2. Because H3K9me2 has a half-life of about 1 d (ref. 38), we hypothesized that exposure beyond 48 h might result in even greater reduction of the mark. We found that H3K9me2 levels in MDA-MB-231 cells gradually decreased over the course of treatment (Fig. 2c). After 4 d, the cellular H3K9me2 levels under treatment with 250 or 500 nM of UNC0638 were equal or very close to those of G9a and GLP knockdown cells. At both UNC0638 concentrations, changing the cell medium after 2 d (denoted “2 + 2d” in Fig. 2c) had little effect compared with not changing the medium (denoted “4d” in Fig. 2c). Notably, reduced cellular levels of H3K9me2 were still observed at the 4-d time point after cells were exposed to UNC0638 for 2 d, followed by washout of the compound and another 2 d of culture without the inhibitor (denoted “2 + 2dw” in Fig. 2c). The level of H3K9me2 at day 2 + 2dw was inversely proportional to the original dosage of UNC0638, suggesting that residual amounts of UNC0638 remain in the cells and can have a lasting effect. Inhibitor treatment did not affect the protein levels of G9a or GLP (Supplementary Fig. 10) or the mRNA levels of G9a (Supplementary Fig. 11), indicating the observed effects were due to inhibition of the enzymatic function of the proteins and not to changes in protein abundance.
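The reasoning behind extending exposure past 48 h can be illustrated with an idealized first-order decay. The sketch below assumes that new dimethylation is completely blocked and that the existing mark is lost with the stated half-life of about 1 d. Neither assumption is claimed by the text, and the real data plateau near the knockdown level and do not fall to zero.

```python
def fraction_remaining(days, half_life_days=1.0):
    """Fraction of the mark left after `days` of idealized first-order loss."""
    return 0.5 ** (days / half_life_days)


for day in (1, 2, 3, 4):
    # Under these assumptions the mark halves each day.
    print(f"day {day}: {fraction_remaining(day):.1%} of starting H3K9me2")
```

Under this simplified picture a quarter of the mark would remain at 2 d and about 6% at 4 d, which is why a 48-h readout could understate the maximal effect of an inhibitor.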

One of the desirable characteristics of a good chemical probe is low toxicity due to off-target effects. Both UNC0638 (EC50 = 11,000 ± 710 nM (n = 3)) and UNC0737 (EC50 = 8,700 ± 790 nM (n = 3)) were considerably less toxic than BIX01294 (EC50 = 2,700 ± 76 nM (n = 3)) in an MTT assay (Fig. 2d). Notably, UNC0737 had cellular toxicity similar to that of UNC0638, suggesting that the observed cellular toxicity is probably not due to inhibition of G9a and GLP in this cell type. Thus, the toxicity/function ratio of UNC0638 was 138, whereas the same ratio for BIX01294 was <6 (Fig. 2e,f). This much improved toxicity/function ratio enables UNC0638 to be used in cell-based model systems over a range of concentrations without interference from cellular toxicity.
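The toxicity/function ratio is the MTT EC50 divided by the H3K9me2 IC50. The sketch below recomputes it from the rounded mean values quoted in the text. For UNC0638 this gives roughly 136, not exactly the reported 138, and the assumption here is that the small gap reflects rounding of the published means.

```python
def toxicity_function_ratio(ec50_toxicity_nm, ic50_function_nm):
    """Ratio of toxic EC50 (MTT) to functional IC50 (H3K9me2 reduction)."""
    return ec50_toxicity_nm / ic50_function_nm


# Rounded mean values for MDA-MB-231 cells, taken from the text.
unc0638_ratio = toxicity_function_ratio(11_000.0, 81.0)
bix01294_ratio = toxicity_function_ratio(2_700.0, 500.0)

print(f"UNC0638: {unc0638_ratio:.0f}")
print(f"BIX01294: {bix01294_ratio:.1f}")
```

A larger ratio means a wider concentration window in which the mark is suppressed before general toxicity appears.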

Furthermore, we have evaluated the cellular potency and toxicity of UNC0638 in other tumor and normal cell lines. UNC0638 had high potency, ranging from 48 to 238 nM, in reducing H3K9me2 levels in breast, prostate, colon carcinomas and normal fibroblast cells, with the two prostate carcinoma cell lines, PC3 (IC50 = 59 nM) and 22RV1 (IC50 = 48 nM), being the most sensitive (Supplementary Table 6). The EC50 for the cellular toxicity of UNC0638 (from MTT assays) in these tumor and normal cell lines was considerably higher than the corresponding IC50 for the functional effects. The toxicity/function ratio of UNC0638 in these cell lines varied by up to ten-fold (19 for IMR90 compared with 233 for PC3), but was well above the value of 6 observed for BIX01294 in MDA-MB-231 cells. These results again support the conclusion that UNC0638 is suitable as a chemical probe of G9a and GLP in a broad range of cell types without interference from potential off-target toxicity.

Although UNC0638 is well tolerated by several cell types in terms of general cell viability, we investigated whether it might affect the growth properties of cancer cell lines. At concentrations of UNC0638 that considerably reduce H3K9me2 levels and for which acute off-target toxicity is minimal, we monitored the effect of G9a and GLP inhibition on cell growth using a clonogenicity assay. There was a marked concentration-dependent reduction of clonogenicity in MCF7 cells upon treatment with UNC0638 or upon G9a or GLP knockdown but much less effect on MDA-MB-231 cells (Supplementary Fig. 12). These data show that inhibition of G9a and GLP can have differential phenotypic effects depending on the cell type, possibly

Figure 3 | Quantitative MS analysis of histone post-translational modifications in MDA-MB-231 cells. (a–d) MS of doubly charged peptides (KSTGGKAPR) corresponding to H3K9me2 (a, b) and H3K9me3 (c, d). In a, c, cells were treated with indicated shRNAs (control, D0 propionyl labeled; G9a, D0,D5 propionyl labeled, ~2.5 m/z heavier than the control; G9a + GLP, D5,D5 propionyl labeled, ~5 m/z heavier than the control). In b, d, cells were mock-treated (control, D0 propionyl labeled) or treated with 1 μM of BIX01294 (D0,D5 propionyl labeled, ~2.5 m/z heavier than the control) or UNC0638 (D5,D5 propionyl labeled, ~5 m/z heavier than the control) for 48 h.

[Figure 3 panels: mass spectra, relative abundance (0 to 100) versus m/z. (a) H3K9me2, control shRNA, shRNA G9a and shRNA G9a + GLP, with isotope clusters starting at m/z 521.307, 523.822 and 526.337; (b) H3K9me2, control, BIX01294 and UNC0638, with clusters starting at m/z 521.306, 523.821 and 526.337; (c) H3K9me3, same shRNA samples as in a, with clusters starting at m/z 528.314, 530.829 and 533.345; (d) H3K9me3, same treatments as in b, with clusters starting at m/z 528.314, 530.829 and 533.345.]
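The "~2.5" and "~5" m/z offsets in the caption follow from the label chemistry: each D5 propionyl group replaces five hydrogens with deuterium, and a doubly charged ion shows half of the mass difference on the m/z axis. The sketch below uses standard monoisotopic masses for 1H and 2H and checks the predicted offsets against the first-isotope peaks printed in Figure 3a (m/z 521.307, 523.822 and 526.337). The function name is illustrative.

```python
H_MASS = 1.00782503  # monoisotopic mass of 1H in Da
D_MASS = 2.01410178  # monoisotopic mass of 2H (deuterium) in Da


def label_shift_mz(n_d5_groups, charge):
    """m/z offset caused by swapping D0 propionyl groups for D5 ones."""
    delta_mass = n_d5_groups * 5 * (D_MASS - H_MASS)
    return delta_mass / charge


CONTROL_MZ = 521.307  # D0-labeled H3K9me2 peptide, doubly charged (Figure 3a)

one_label = CONTROL_MZ + label_shift_mz(1, charge=2)   # D0,D5 sample
two_labels = CONTROL_MZ + label_shift_mz(2, charge=2)  # D5,D5 sample

print(f"predicted D0,D5 peak: {one_label:.3f} (figure shows 523.822)")
print(f"predicted D5,D5 peak: {two_labels:.3f} (figure shows 526.337)")
```

Because the three samples are separated by these fixed offsets, they can be mixed and compared within a single spectrum.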


related to differences in epigenetic state or p53 status (MCF7 cells have functional p53, whereas MDA-MB-231 cells do not).

MS analysis confirms that UNC0638 reduces H3K9me2 levels

To confirm the effect of UNC0638 on cellular levels of H3K9me2 and assess the potential effect on other histone post-translational modification marks, we analyzed acid-extracted histones from MDA-MB-231 cells treated with UNC0638 using quantitative MS-based proteomics as previously described 39. After treatment of MDA-MB-231 cells with UNC0638 (1 μM for 48 h), the levels of H3K9me2 were considerably lower, similar to those observed with shRNA double knockdown of G9a and GLP (Fig. 3a,b). BIX01294 (1 μM for 48 h) reduced the cellular levels of H3K9me2 to a lesser extent than UNC0638 did. These results are consistent with the findings from the in-cell western assay. We note that the levels of unmodified H3K9 were higher upon treatment with UNC0638, consistent with decreased modification by G9a and GLP (Supplementary Fig. 13). In contrast, the levels of H3K9me3 remained constant with all treatments (Fig. 3c,d), suggesting that, at least in these cells, trimethylation of H3K9 is not dependent on prior dimethylation of H3K9 by G9a and/or GLP. We also analyzed other well-known histone marks after treatment with UNC0638, BIX01294 and shRNAs targeting G9a and GLP. With the exception of acetylation of histone H3 Lys14 (H3K14ac), no changes in abundance were observed for 21 different modification states of H3 and H4 (Supplementary Table 7). Notably, the levels of H3K14ac doubled both with UNC0638 treatment and with G9a and GLP knockdown, suggesting a possible link or cross-talk between H3K9me2 and H3K14ac (Supplementary Fig. 13). This result is consistent with a previous finding in HEK293 cells in which G9a and GLP were knocked down via siRNA 39.

Genomic profiling of UNC0638-modulated H3K9me2 levels

To better understand how UNC0638 might regulate specific genes, we investigated the H3K9me2 levels at genomic loci along chromosomes 3 and X (chr3 and chrX). In chromatin immunoprecipitation on chip (ChIP-chip) experiments using a selective H3K9me2 antibody, MCF7 cells treated with UNC0638 at 320 nM (the IC90 for H3K9me2 inhibition) for 14 d had significantly fewer genomic regions containing H3K9me2 on chr3 and chrX (P < 2.2 × 10^−16; Fig. 4a and Supplementary Fig. 14). Lower levels of H3K9me2 were observed in the MAGEA1 promoter in our ChIP-chip study (P = 4.3 × 10^−3; Fig. 4b), and this was confirmed in an independent ChIP–quantitative PCR (ChIP-qPCR) analysis. In agreement with previously reported data 29, our ChIP-chip and ChIP-qPCR data show significant, concentration-dependent reductions of H3K9me2 levels at the TBC1D5 and MAGEA2 promoters (P = 2.0 × 10^−10 and 2.6 × 10^−3, respectively) but not at the MAGEB4 promoter (P = 0.07) after exposure to UNC0638 (Supplementary Fig. 15a–c). Thus, UNC0638 shows robust on-target modulation of H3K9me2 levels, consistent with its activity as a selective G9a and GLP inhibitor.

H3K9me2 is modestly correlated with euchromatic silenced genes 40,41; however, it has also been reported at active genes 42. Notably, when we examined the genes with the greatest reduction of H3K9me2 levels within their gene bodies, we found an enrichment of miRNA genes: 25% of the top 100 most affected genes encoded miRNAs, whereas only 5% of the probed genes on chr3 and chrX were miRNA genes (Supplementary Table 8). Furthermore, under the conditions of these experiments (treatment for 14 d at 70 or 320 nM), we found that the total levels of DNA methylation were not altered on chr3 and chrX in UNC0638-treated MCF7 cells compared to control-treated cells (Fig. 4c and Supplementary Fig. 15d). G9a inactivation has previously been shown to be ineffective in altering global DNA methylation in human cancer cell lines (in contrast to mES cells) 24. We note that although our result supports the conclusion that inhibition of G9a catalytic activity does not produce global changes in genomic DNA methylation, it does not exclude the possibility of small, targeted changes below the resolution of these experiments. Taken together, these results further support the value of UNC0638 as a tool for investigating the effects of specific and global changes to H3K9me2 levels in human cells.
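The miRNA observation amounts to a five-fold enrichment (25% observed against a 5% background). As a rough illustration of how unlikely that would be by chance, the sketch below computes a one-sided binomial tail probability. The text reports no such test; treating the top 100 genes as independent draws at the 5% background rate is an assumption made here only for illustration.

```python
from math import comb


def binomial_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


OBSERVED_FRACTION = 0.25    # miRNA genes among the top 100 affected genes
BACKGROUND_FRACTION = 0.05  # miRNA genes among all probed genes on chr3 and chrX

fold_enrichment = OBSERVED_FRACTION / BACKGROUND_FRACTION
p_tail = binomial_tail(25, 100, BACKGROUND_FRACTION)

print(f"fold enrichment: {fold_enrichment:.1f}")
print(f"binomial tail probability: {p_tail:.2e}")
```

A hypergeometric test against the actual number of probed genes would be the more exact choice, but that count is not given in this section.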

UNC0638 reactivates silenced gene expression in mES cells

Embryonic stem cells are unique in their ability to efficiently silence retroviruses through epigenetic mechanisms including H3K9 dimethylation 43. To investigate the ability of UNC0638 to reactivate silent retrovirus vectors, we first determined the cellular potency of UNC0638 and BIX01294 in J1 mES cells. Consistent with the above results, UNC0638 showed greater cellular potency than BIX01294 (at 48 h, IC50 = 138 and 2,041 nM, respectively; Supplementary Fig. 16a). To establish retrovirus silencing, we infected J1 mES cells with an HSC1-EF1α-EGFP-Puromycin retrovirus and selected for transduced cells with a short puromycin treatment. We observed the initial 100% EGFP+ cell population diminish to 30–36% EGFP+ cells as retrovirus silencing was gradually established over 150 d of extended culture.

To investigate the ability of the probe compounds to reactivate silent retrovirus vectors, we followed EGFP expression by flow cytometry after treatment with UNC0638 (100, 250 or 500 nM), BIX01294 (2 μM) or UNC0737 (500 nM, as a negative control). Whereas UNC0737 did not reactivate EGFP expression above the 36% EGFP+ cells seen in the untreated sample, UNC0638 reactivated EGFP expression in a concentration-dependent manner to a maximal level of 63% EGFP+ cells at day 10 (Fig. 5a and Supplementary Fig. 16c). BIX01294 reactivated expression reaching the level of 53% EGFP+ cells at day 10, an expression level exceeded when cells were treated with UNC0638 at 250 nM, one-eighth the concentration of BIX01294. Moreover, we observed cell morphology changes under BIX01294 treatment, suggesting that this inhibitor may

Figure 4 | Effects of UNC0638 on H3K9me2 and DNA methylation. (a) Example of a genomic region (3p24.3) showing reductions in H3K9me2 after UNC0638 exposure (P = 2.0 × 10^−10). Light blue bars show the log ratios of anti-H3K9me2 to IgG in control (top) and treated (bottom) samples. (b) Administration of UNC0638 decreased H3K9me2 in the MAGEA1 promoter. Log ratios are shown as in a. The promoter was defined to be 4 kilobase pairs upstream and 0.5 kilobase pairs downstream of the transcription start site. (c) UNC0638 did not change DNA methylation levels. MCF7 cells were exposed to either UNC0638 (at 70 or 320 nM), 5-azacytidine (DAC) or control. y-axis scale is the excess of the number of significantly hypomethylated probes over the number of hypermethylated probes on chromosome 3. Significance cutoff was set at P = 10^−3 for two-sample t-test between treated and control log intensities.

[Figure 4 panels: (a) chromosome 3, positions 17,300,000 to 17,700,000 (scale bar 200 kb), H3K9me2 log-ratio tracks (−0.9 to 2.8) for control and 320 nM UNC0638; (b) chromosome X, positions 152,139,000 to 152,143,000 (scale bar 2 kb), H3K9me2 log-ratio tracks (−0.03 to 2.2) for control and 320 nM UNC0638 over the MAGEA1 RefSeq gene; (c) excess of DNA hypomethylated probes (0 to 4,000) for DAC, 70 nM UNC0638 and 320 nM UNC0638.]


induce cell differentiation. By day 12, BIX01294-treated cells had arrested or died, whereas UNC0638 reactivated EGFP expression in 75% of cells without showing morphological signs of cell differentiation (Fig. 5a). At day 10 of BIX01294 treatment, only 65% of cells were positive for the pluripotency marker SSEA1 (Supplementary Fig. 16b). In contrast, UNC0638 treatment maintained expression of the SSEA1 pluripotency marker: the level of marker in cells treated with 100 nM of UNC0638 (97% SSEA1+ cells) was indistinguishable from that in UNC0737-treated cells or untreated cells. We conclude that inhibition of G9a with UNC0638 functionally reactivates silent retrovirus vectors without promoting differentiation into SSEA1− cells and is considerably more potent than BIX01294 treatment.

We next tested whether MAGEA2 and DUB1, genes previously shown to be reactivated in G9a-knockout mES cells 25,44, could be reactivated with UNC0638 treatment in J1 mES cells. At day 10, DUB1 and MAGEA2 genes were more highly expressed in UNC0638-treated cells than in untreated or UNC0737-treated cells (Fig. 5b). Similar to the results for retroviral vector reactivation, mRNA levels of DUB1 and MAGEA2 genes showed a concentration-dependent increase upon treatment with UNC0638. We note that reactivation of endogenous genes occurred by day 3 (Supplementary Fig. 16d), whereas EGFP retrovirus reactivation was first evident by flow cytometry at day 7 (Fig. 5a).

In addition to directly methylating H3K9 (ref. 20), G9a has been reported to indirectly facilitate DNA methylation in mES cells 24–26. We first analyzed the presence of H3K9me2 by ChIP on the EGFP provirus and the endogenous MAGEA2 promoter and found that UNC0638 treatment decreased H3K9me2 at both targets by day 3, with a further decrease by day 7 (Fig. 5c,d). To test whether this reduction of H3K9me2 affects DNA methylation, we performed bisulfite sequencing on the retrovirus long terminal repeat and MAGEA2 promoter after 10 d of treatment. Both regions were hypermethylated in untreated or UNC0737-treated mES cells. In contrast, UNC0638 treatment induced DNA hypomethylation in a concentration-dependent manner (Fig. 5e). These results suggest that inhibition of gene silencing by UNC0638 primarily drives H3K9me2 loss to reactivate gene expression and facilitates DNA hypomethylation in mES cells.

DISCUSSION

Protein lysine methyltransferase G9a has been implicated in various human diseases including leukemia 8, prostate cancer 8,16, liver cancer 17, lung cancer 18, drug addiction 21, mental retardation 22 and maintenance of HIV-1 latency 23. Given the broad areas of biology in which G9a and GLP have a role, a high-quality chemical probe of these two PKMTs would be very valuable for dissecting the molecular mechanism(s) of these activities, the cell types in which they are relevant and which diseases (if any) would benefit from their inhibition. Here we report the discovery and characterization of UNC0638, which has all the properties of a high-quality chemical probe 30. (i) UNC0638 is a potent, substrate-competitive inhibitor of G9a (IC50 < 15 nM, Ki = 3 nM) and the closely related GLP (IC50 = 19 nM). (ii) It is selective for G9a and GLP over a wide range of epigenetic and non-epigenetic targets. (iii) It is highly active in cells: at 250 nM concentration, it reduces the levels of H3K9me2 by ~60–80% in a variety of cell lines, similar to the reductions seen for shRNA knockdown of G9a and GLP, and modulates expression of known G9a-regulated genes. (iv) UNC0737, an N-methyl derivative of UNC0638, is >300-fold less potent against G9a and GLP, with similar selectivity and cellular toxicity compared to UNC0638, and therefore is a useful negative control. (v) UNC0638 has low cellular toxicity in seven cell lines tested at functional doses. Notably, the greatly improved cellular toxicity/function ratio relative to the previously available probe, BIX01294, makes UNC0638 much more versatile as a chemical probe. (vi) Finally, a useful chemical probe must be available to the biological research community. As such, we have made UNC0638 available through a commercial vendor (Sigma-Aldrich).

[Figure 5 panels: (a) percentage of EGFP+ cells versus time (d), days 0–15; (b) relative expression of MAGEA2 and DUB1; (c, d) percentage of input for IgG and H3K9me2 ChIP in untreated cells and after 3 d and 7 d of treatment; (e) LTR and MAGEA2 promoter. Treatment groups: untreated, UNC0737 (500 nM), UNC0638 (100 nM, 250 nM and 500 nM) and BIX01294 (2 μM).]

Figure 5 | UNC0638 reactivates a silent EGFP retrovirus vector and G9a-regulated endogenous genes in mES cells. (a) Time course of EGFP retrovirus activation during indicated treatments in J1 mES cells infected by HSC1-EF1α-EGFP-Puromycin retroviral vector. Plotted percentage of EGFP expression is the mean of the percentage of EGFP+ cells in three independent experiments. Error bars, s.d. (b) Analysis of mRNA levels of two G9a-regulated endogenous genes (MAGEA2 and DUB1) in J1 mES cells treated for 10 d with indicated compounds. The graph shows the normalized expression of MAGEA2 and DUB1 relative to the mRNA level detected in untreated cells (Δ(ΔCt)). (c, d) ChIP analysis of H3K9me2 enrichment at the EGFP gene (c) and MAGEA2 promoter (d) in cells treated with UNC0638 (500 nM) for 3 or 7 d. (e) Analysis of DNA methylation in the long terminal repeat (LTR) of the HSC1-EF1α-EGFP-Puromycin retroviral vector and in the MAGEA2 promoter after 10 d of treatment.
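The relative expression in panel b is reported as Δ(ΔCt). The following Python sketch shows the standard 2^(−ΔΔCt) calculation that such a value usually implies. The reference gene and every Ct value below are hypothetical placeholders, because neither is stated in this section; the authors' own normalization is described in Supplementary Methods and may differ.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_untreated, ct_ref_untreated):
    """Fold change of a target gene relative to untreated cells (2^-ddCt).

    Assumption: amplification efficiency is close to 100%, so one
    cycle corresponds to a twofold difference in template.
    """
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_untreated = ct_target_untreated - ct_ref_untreated
    # Compare the treated condition with the untreated condition.
    delta_delta_ct = delta_ct_treated - delta_ct_untreated
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only (not data from the paper).
fold_change = relative_expression(
    ct_target_treated=24.0, ct_ref_treated=18.0,
    ct_target_untreated=27.0, ct_ref_untreated=18.0,
)
print(f"fold change versus untreated: {fold_change}")
```

A lower Ct for the target gene in treated cells (earlier amplification) yields a fold change above 1, which is the direction expected for a reactivated gene.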

NATURE REPRINT COLLECTION Epigenetics S25 | NATURE CHEMICAL BIOLOGY | VOL 7 | AUGUST 2011 | www.nature.com/naturechemicalbiology 573

ARTICLE NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.599

Our proteomics and immunofluorescence data show that pharmacologic inhibition of G9a and GLP with UNC0638 leads to global reductions in H3K9me2 levels, comparable to those achieved with shRNAs. However, the genomic regions specifically marked with H3K9me2 vary with cell type and possibly with disease state 19–21,23,45 . Similarly, the cellular levels of G9a and GLP also vary with cell type and disease state 8 . Indeed, we observed considerable variation in the global concentration-response behavior of UNC0638 in six cancer cell lines and human fibroblasts. We also observed a marked growth-inhibitory effect in MCF7 cells but not MDA-MB-231 cells at moderate UNC0638 concentrations under which H3K9me2 levels are fully suppressed. Given the association of G9a and GLP with DNMTs 46,47 and repression of tumor suppressor p53 (refs. 8,48), there may be specific epigenetic cellular states (in particular, H3K9me2-mediated repression of a specific set of genes and/or p53 status) in which cells are selectively vulnerable to G9a and GLP inhibition. UNC0638 will be a useful tool to search for such types of cancer or other disease-related epigenetic states. BIX01294 is likely to be less effective in such studies owing to off-target toxic effects encountered at or near its required concentration for full H3K9me2 suppression.

It is notable that most studies of G9a and GLP to date have used knockout or knockdown of G9a and/or GLP, whereas UNC0638 inhibits only the enzymatic function of G9a and GLP and does not affect their protein and mRNA levels, thereby preserving a potential scaffolding function in the many protein complexes reported for G9a and/or GLP 20,46–49 . For example, it has been shown that the catalytic activity of G9a or GLP is not required for all of their functions 25,26 . This may explain the milder phenotype of UNC0638 compared to knockdowns of G9a and GLP, and it suggests that UNC0638 can be used to separate enzymatic from non-enzymatic functions of these proteins.

We also show that UNC0638 can reactivate endogenous genes and silenced retroviral reporters in mES cells, further implicating H3K9me2-mediated repression in these processes. Retroviral silencing is a reliable criterion for identification of fully reprogrammed cells and is a good indicator of pluripotency 50 . UNC0638 reduced H3K9me2 on endogenous genes and retrovirus vectors within 3 d, and DNA hypomethylation was observed by day 10, when the cells had already reactivated expression. Together, these results suggest that a cascade of events is involved in the reactivation of silenced genes, and concentration-dependent inhibition of G9a by UNC0638 drives this process. Therefore, UNC0638 is a potent chemical tool for modulating G9a-related activities in cells to alter their expression profiles and epigenetic landscapes, to assist in manipulating their cell identity and phenotype, and to decipher the timing and inter-relationship of H3K9me2 and DNA methylation in gene silencing.

METHODS Biochemical assays. SAHH-coupled assay, peptide displacement, DSF and DSLS assay protocols are described in Supplementary Methods. Morrison Ki and IC50 in the G9a MCE assay were determined as described 36 . Details of other selectivity assays are included in Supplementary Methods.

Cell immunostaining (in-cell western). We added 2% (w/v) formaldehyde in PBS to fix cells for 15 min. After five washes with 0.1% (v/v) Triton X100 in PBS, cells were blocked for 1 h with 1% (w/v) BSA in PBS. Three of four replicates were exposed to primary H3K9me2 antibody (Abcam no. 1220) at 1:800 dilution in 1% BSA in PBS for 2 h. One replicate was reserved as a background control. The wells were washed five times with 0.1% (v/v) Tween 20 in PBS, then secondary IR800-conjugated antibody (LiCor) and a nucleic acid–intercalating dye, DRAQ5 (LiCor), were added for 1 h. After five washes with 0.1% Tween 20 in PBS, the plates were read on an Odyssey (LiCor) scanner at 800 nm (H3K9me2 signal) and 700 nm (DRAQ5 signal). Fluorescence intensity was quantified, normalized to the background and then to the DRAQ5 signal, and expressed as a percentage of control.
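The normalization described above can be sketched in Python as follows. This is one plausible reading of the text, not the authors' analysis script: it assumes that "normalized to the background" means subtracting the 800 nm signal of the no-primary-antibody replicate, and that each well is then divided by its own DRAQ5 (700 nm) signal. All function names and intensity values are hypothetical.

```python
from statistics import mean


def normalized_h3k9me2(signal_800, background_800, draq5_700):
    """Mean background-subtracted 800 nm signal per unit DRAQ5 signal.

    signal_800     -- 800 nm readings of the antibody-exposed replicates
    background_800 -- 800 nm reading of the replicate without primary antibody
    draq5_700      -- 700 nm (DRAQ5) readings of the same wells as signal_800
    """
    ratios = [
        (signal - background_800) / draq5
        for signal, draq5 in zip(signal_800, draq5_700)
    ]
    return mean(ratios)


def percent_of_control(treated_value, control_value):
    """Express a normalized signal as a percentage of the control condition."""
    return 100.0 * treated_value / control_value


# Hypothetical plate readings (arbitrary fluorescence units).
control = normalized_h3k9me2([1100.0, 1100.0, 1100.0], 100.0, [500.0, 500.0, 500.0])
treated = normalized_h3k9me2([500.0, 500.0, 500.0], 100.0, [500.0, 500.0, 500.0])
print(f"H3K9me2 signal: {percent_of_control(treated, control):.1f}% of control")
```

Dividing by the DRAQ5 signal corrects for differences in cell number between wells, so a drop in the ratio reflects less H3K9me2 per cell and not simply fewer cells.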

Quantitative MS analysis of histones. Histones from MDA-MB-231 cells were extracted using standard acid procedures and analyzed after chemical derivatization by propionic anhydride and trypsin as described 39 . Digested histone samples were analyzed by LC-MS/MS on an Orbitrap mass spectrometer as described 39 . All data were manually verified.

ChIP-chip. ChIP samples were amplified for arrays using a whole-genome amplification (WGA) method (reference is provided in Supplementary Methods). In WGA, DNA fragments are primed to generate a library of DNA fragments with a common end sequence. The library is then replicated using linear, isothermal amplification, followed by a limited round of geometric PCR amplifications. The GenomePlex Complete WGA kit (Sigma) was used for library preparation. ChIP samples concentrated to 10 μl were mixed with 2 μl library-preparation buffer and then with 1 μl library-stabilization solution. Samples were incubated at 95 °C for 2 min and then immediately cooled on ice. Each sample was mixed with 1 μl of library-preparation enzyme and incubated as follows: 16 °C for 20 min, 24 °C for 20 min, 37 °C for 20 min, 75 °C for 5 min, 4 °C hold. Amplification of the samples was completed with the GenomePlex WGA kit (Sigma). Each sample was combined with 44 μl nuclease-free water, 7.5 μl Amplification Master Mix, 3 μl dNTP/dUTP mix (10 mM dATP, 10 mM dCTP, 10 mM dGTP, 8 mM dTTP and 2 mM dUTP) and 5 μl WGA DNA polymerase. Amplified samples were purified using the QIAquick PCR purification kit (Qiagen) and then processed for array hybridization as described below. Samples exposed to the H3K9me2 antibody or IgG control antibody were hybridized to the arrays (n = 2 per group).
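As a bookkeeping aid, the volumes and incubation steps quoted above can be tallied with a short Python sketch. The numbers are transcribed from the protocol text; the variable and function names are illustrative, and the sketch assumes that the entire library-preparation reaction is carried into the amplification step, which the text implies but does not state outright.

```python
# Volumes in microlitres, transcribed from the protocol text above.
LIBRARY_PREP_UL = {
    "concentrated ChIP sample": 10.0,
    "library-preparation buffer": 2.0,
    "library-stabilization solution": 1.0,
    "library-preparation enzyme": 1.0,
}
AMPLIFICATION_ADDITIONS_UL = {
    "nuclease-free water": 44.0,
    "Amplification Master Mix": 7.5,
    "dNTP/dUTP mix": 3.0,
    "WGA DNA polymerase": 5.0,
}
# (temperature in deg C, minutes) after enzyme addition; a 4 deg C hold follows.
LIBRARY_PROGRAM = [(16, 20), (24, 20), (37, 20), (75, 5)]
# dTTP and dUTP concentrations (mM) in the nucleotide mix.
DTTP_MM, DUTP_MM = 8.0, 2.0


def total_volume_ul(*component_sets):
    """Sum the volumes of one or more component dictionaries."""
    return sum(sum(components.values()) for components in component_sets)


def timed_minutes(program):
    """Total duration of the timed incubation steps (the hold is excluded)."""
    return sum(minutes for _, minutes in program)


def dutp_fraction(dttp_mm, dutp_mm):
    """Fraction of the combined dTTP + dUTP pool that is dUTP."""
    return dutp_mm / (dttp_mm + dutp_mm)


print("library reaction (ul):", total_volume_ul(LIBRARY_PREP_UL))
print("amplification reaction (ul):",
      total_volume_ul(LIBRARY_PREP_UL, AMPLIFICATION_ADDITIONS_UL))
print("timed library incubation (min):", timed_minutes(LIBRARY_PROGRAM))
print("dUTP share of dTTP + dUTP pool:", dutp_fraction(DTTP_MM, DUTP_MM))
```

Keeping the recipe as data makes it easy to scale to several samples or to check a pipetting scheme against the stated volumes.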

Reactivation of retrovirus expression in mES cells. J1 mES cells were cultured (reference is provided in Supplementary Methods) in DMEM with 15% (v/v) ES-qualified FBS supplemented with 4 mM L-glutamine, 0.1 mM MEM non-essential amino acids, 1 mM sodium pyruvate, 0.55 mM 2-mercaptoethanol and purified recombinant leukemia inhibitory factor on 0.1% gelatin-coated plates. Cells were infected with a self-inactivated HSC1 retroviral vector (reference is provided in Supplementary Methods) engineered to harbor an EGFP-Puromycin bicistronic reporter gene controlled by the human EF1α promoter. EGFP gene expression was analyzed by flow cytometry. Pluripotency of mES cells was tested by immunostaining for SSEA1 (a surface marker of undifferentiated mouse cells), also measured by flow cytometry. Briefly, to perform flow cytometry, we fixed trypsinized cells with 2% formaldehyde in phosphate-buffered saline with 2% (v/v) FBS for 10 min at room temperature. Cells were then suspended in PBS with 2% (v/v) serum (flow buffer) and filtered through 70-μm nylon membranes. EGFP expression analyses were performed on an LSRII flow cytometer (Becton-Dickinson) using CellQuest Pro software. SSEA1 immunostaining was performed on non-permeabilized fixed cells, which were incubated with mouse IgM antibody to SSEA1 (DSHB, MC-480) for 30 min at 4 °C. After being washed three times with flow buffer, cells were incubated for 30 min with secondary antibody, Phycoerythrin-Cy5.5 (PE-Cy5.5) anti-mouse IgM (eBioscience), at 4 °C. Cells were washed three times in the flow buffer, and SSEA1 immunostaining was analyzed on the LSRII flow cytometer. Cell debris was excluded from analysis by forward- and side-scatter gating. The uninfected J1 ES cell line was used as a negative control to adjust EGFP fluorescence measurements. Mouse embryonic fibroblasts were used as a negative control cell line for SSEA1 measurement. Reactivation of endogenous genes, ChIP of endogenous gene and retrovirus, and DNA methylation analysis in mES cells are described in Supplementary Methods.
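The way a negative control can set the EGFP gate, and how replicate percentages are summarized as mean and s.d., can be illustrated with a small Python sketch. The gating rule used here (a threshold below which a chosen fraction of uninfected cells fall) is an assumption made for illustration; the text only states that uninfected J1 cells were used to adjust EGFP measurements. All names and values are hypothetical.

```python
from statistics import mean, stdev


def gate_threshold(negative_control, fraction_below=0.99):
    """Fluorescence value at which roughly `fraction_below` of control cells fall below.

    Assumption: the gate is placed from the uninfected control distribution;
    the actual gating strategy is not specified in this section.
    """
    ordered = sorted(negative_control)
    index = min(len(ordered) - 1, int(fraction_below * len(ordered)))
    return ordered[index]


def percent_positive(sample, threshold):
    """Percentage of cells whose fluorescence exceeds the gate threshold."""
    positives = sum(1 for value in sample if value > threshold)
    return 100.0 * positives / len(sample)


def summarize(percentages):
    """Mean and sample standard deviation across independent experiments."""
    return mean(percentages), stdev(percentages)


# Hypothetical fluorescence values (arbitrary units), for illustration only.
uninfected = [1.0, 2.0, 3.0, 4.0]
threshold = gate_threshold(uninfected, fraction_below=0.5)
infected = [1.0, 3.0, 5.0, 7.0]
print("EGFP+ cells (%):", percent_positive(infected, threshold))

# Hypothetical percentages from three independent experiments.
average, spread = summarize([10.0, 20.0, 30.0])
print(f"mean = {average}, s.d. = {spread}")
```

In practice the scatter gate that excludes debris would be applied before this step, so that only intact cells contribute to the percentage.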

Additional methods. Synthesis of UNC0638 and UNC0737, UNC0638 mechanism-of-action and kinetic studies, determination of X-ray crystal structure of the G9a–UNC0638–SAH complex, UNC0638 stability studies, mouse drug metabolism and pharmacokinetic studies, cell growth, MTT, shRNA, ChIP, western blotting, immunofluorescence microscopy, clonogenicity, immunostaining flow cytometry, reverse transcription–qPCR, DNA extraction, enrichment of unmethylated DNA fraction, and microarray experiments and data analysis are described in Supplementary Methods.

Accession codes. Protein Data Bank: Coordinates and structure factors for the cocrystal structure of the G9a–UNC0638–SAH complex have been deposited with accession code 3RJW.

Received 25 February 2011; accepted 27 April 2011; published online 10 July 2011

References

1. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693–705 (2007).

2. Martin, C. & Zhang, Y. The diverse functions of histone lysine methylation. Nat. Rev. Mol. Cell Biol. 6, 838–849 (2005).

3. Jenuwein, T. & Allis, C.D. Translating the histone code. Science 293, 1074–1080 (2001).

4. Bernstein, B.E., Meissner, A. & Lander, E.S. The mammalian epigenome. Cell 128, 669–681 (2007).

5. Gelato, K.A. & Fischle, W. Role of histone modifications in defining chromatin structure and function. Biol. Chem. 389, 353–363 (2008).

6. Strahl, B.D. & Allis, C.D. The language of covalent histone modifications. Nature 403, 41–45 (2000).

7. Huang, J. et al. Repression of p53 activity by Smyd2-mediated methylation. Nature 444, 629–632 (2006).

8. Huang, J. et al. G9A and GLP methylate lysine 373 in the tumor suppressor p53. J. Biol. Chem. 285, 9636–9641 (2010).


9. Rathert, P. et al. Protein lysine methyltransferase G9a acts on non-histone targets. Nat. Chem. Biol. 4, 344–346 (2008).

10. Huang, J. et al. p53 is regulated by the lysine demethylase LSD1. Nature 449, 105–108 (2007).

11. Copeland, R.A., Solomon, M.E. & Richon, V.M. Protein methyltransferases as a target class for drug discovery. Nat. Rev. Drug Discov. 8, 724–732 (2009).

12. Rea, S. et al. Regulation of chromatin structure by site-specific histone H3 methyltransferases. Nature 406, 593–599 (2000).

13. Cole, P.A. Chemical probes for histone-modifying enzymes. Nat. Chem. Biol. 4, 590–597 (2008).

14. Esteller, M. Epigenetics in cancer. N. Engl. J. Med. 358, 1148–1159 (2008).

15. Tachibana, M. et al. G9a histone methyltransferase plays a dominant role in euchromatic histone H3 lysine 9 methylation and is essential for early embryogenesis. Genes Dev. 16, 1779–1791 (2002).

16. Kondo, Y. et al. Downregulation of histone H3 lysine 9 methyltransferase G9a induces centrosome disruption and chromosome instability in cancer cells. PLoS ONE 3, e2037 (2008).

17. Kondo, Y. et al. Alterations of DNA methylation and histone modifications contribute to gene silencing in hepatocellular carcinomas. Hepatol. Res. 37, 974–983 (2007).

18. Watanabe, H. et al. Deregulation of histone lysine methyltransferases contributes to oncogenic transformation of human bronchoepithelial cells. Cancer Cell Int. 8, 15 (2008).

19. Goyama, S. et al. EVI-1 interacts with histone methyltransferases SUV39H1 and G9a for transcriptional repression and bone marrow immortalization. Leukemia 24, 81–88 (2010).

20. Tachibana, M. et al. Histone methyltransferases G9a and GLP form heteromeric complexes and are both crucial for methylation of euchromatin at H3–K9. Genes Dev. 19, 815–826 (2005).

21. Maze, I. et al. Essential role of the histone methyltransferase G9a in cocaine-induced plasticity. Science 327, 213–216 (2010).

22. Schaefer, A. et al. Control of cognition and adaptive behavior by the GLP/G9a epigenetic suppressor complex. Neuron 64, 678–691 (2009).

23. Imai, K., Togami, H. & Okamoto, T. Involvement of histone H3 Lysine 9 (H3K9) methyltransferase G9a in the maintenance of HIV-1 latency and its reactivation by BIX01294. J. Biol. Chem. 285, 16538–16545 (2010).

24. Link, P.A. et al. Distinct roles for histone methyltransferases G9a and GLP in cancer germ-line antigen gene regulation in human cancer cells and murine embryonic stem cells. Mol. Cancer Res. 7, 851–862 (2009).

25. Tachibana, M., Matsumura, Y., Fukuda, M., Kimura, H. & Shinkai, Y. G9a/GLP complexes independently mediate H3K9 and DNA methylation to silence transcription. EMBO J. 27, 2681–2690 (2008).

26. Dong, K.B. et al. DNA methylation in ES cells requires the lysine methyltransferase G9a but not its catalytic activity. EMBO J. 27, 2691–2701 (2008).

27. Shi, Y. et al. Induction of pluripotent stem cells from mouse embryonic fibroblasts by Oct4 and Klf4 with small-molecule compounds. Cell Stem Cell 3, 568–574 (2008).

28. Shi, Y. et al. A combined chemical and genetic approach for the generation of induced pluripotent stem cells. Cell Stem Cell 2, 525–528 (2008).

29. Kubicek, S. et al. Reversal of H3K9me2 by a small-molecule inhibitor for the G9a histone methyltransferase. Mol. Cell 25, 473–481 (2007).

30. Frye, S.V. The art of the chemical probe. Nat. Chem. Biol. 6, 159–161 (2010).

31. Liu, F. et al. Discovery of a 2,4-diamino-7-aminoalkoxyquinazoline as a potent and selective inhibitor of histone lysine methyltransferase G9a. J. Med. Chem. 52, 7950–7953 (2009).

32. Liu, F. et al. Protein lysine methyltransferase G9a inhibitors: design, synthesis, and structure activity relationships of 2,4-diamino-7-aminoalkoxy-quinazolines. J. Med. Chem. 53, 5844–5857 (2010).

33. Chang, Y. et al. Adding a lysine mimic in the design of potent inhibitors of histone lysine methyltransferases. J. Mol. Biol. 400, 1–7 (2010).

34. Chang, Y. et al. Structural basis for G9a-like protein lysine methyltransferase inhibition by BIX-01294. Nat. Struct. Mol. Biol. 16, 312–317 (2009).

35. Collazo, E., Couture, J.F., Bulfer, S. & Trievel, R.C. A coupled fluorescent assay for histone methyltransferases. Anal. Biochem. 342, 86–92 (2005).

36. Wigle, T.J. et al. Accessing protein methyltransferase and demethylase enzymology using microfluidic capillary electrophoresis. Chem. Biol. 17, 695–704 (2010).

37. Morrison, J.F. Kinetics of the reversible inhibition of enzyme-catalysed reactions by tight-binding inhibitors. Biochim. Biophys. Acta 185, 269–286 (1969).

38. Zee, B.M. et al. In vivo residue-specific histone methylation dynamics. J. Biol. Chem. 285, 3341–3350 (2010).

39. Plazas-Mayorca, M.D. et al. Quantitative proteomics reveals direct and indirect alterations in the histone code following methyltransferase knockdown. Mol. Biosyst. 6, 1719–1729 (2010).

40. Barski, A. et al. High-resolution profiling of histone methylations in the human genome. Cell 129, 823–837 (2007).

41. Rice, J.C. et al. Histone methyltransferases direct different degrees of methylation to define distinct chromatin domains. Mol. Cell 12, 1591–1598 (2003).

42. Vakoc, C.R., Mandat, S.A., Olenchock, B.A. & Blobel, G.A. Histone H3 lysine 9 methylation and HP1gamma are associated with transcription elongation through mammalian chromatin. Mol. Cell 19, 381–391 (2005).

43. Wolf, D. & Goff, S.P. TRIM28 mediates primer binding site-targeted silencing of murine leukemia virus in embryonic cells. Cell 131, 46–57 (2007).

44. Yokochi, T. et al. G9a selectively represses a class of late-replicating genes at the nuclear periphery. Proc. Natl. Acad. Sci. USA 106, 19363–19368 (2009).

45. Hosey, A.M., Chaturvedi, C.P. & Brand, M. Crosstalk between histone modifications maintains the developmental pattern of gene expression on a tissue-specific locus. Epigenetics 5, 273–281 (2010).

46. Epsztejn-Litman, S. et al. De novo DNA methylation promoted by G9a prevents reprogramming of embryonically silenced genes. Nat. Struct. Mol. Biol. 15, 1176–1183 (2008).

47. Estève, P.O. et al. Direct interaction between DNMT1 and G9a coordinates DNA and histone methylation during replication. Genes Dev. 20, 3089–3103 (2006).

48. Chen, L. et al. MDM2 recruitment of lysine methyltransferases regulates p53 transcriptional output. EMBO J. 29, 2538–2552 (2010).

49. Fritsch, L. et al. A subset of the histone H3 lysine 9 methyltransferases Suv39h1, G9a, GLP, and SETDB1 participate in a multimeric complex. Mol. Cell 37, 46–56 (2010).

50. Stadtfeld, M., Maherali, N., Breault, D.T. & Hochedlinger, K. Defining molecular cornerstones during fibroblast to iPS cell reprogramming in mouse. Cell Stem Cell 2, 230–240 (2008).

Acknowledgments We thank A. Tumber for JMJD2E assay support; J. Moffat (University of Toronto) for the gift of shRNAs; R. Bristow (University Health Network) for RV221 and PC3 cells; T. Hajian and F. Syeda for protein purification; G. Senisterra for contributing to DSF and DSLS data analysis; M. Herold for graphical design and illustration; I. Korboukh, M. Herold and J. Yost for critical reading of the manuscript; and R. Trump and C. Yates for helpful discussion. The research described here was supported by the National Institute of General Medical Sciences, US National Institutes of Health (NIH; grant RC1GM090732), the Carolina Partnership and University Cancer Research Fund from the University of North Carolina at Chapel Hill, the US National Science Foundation (NSF), the Ontario Research Fund, the Ontario Ministry of Health and Long-term Care and the Structural Genomics Consortium. The Structural Genomics Consortium is a registered charity (number 1097737) that receives funds from the Canadian Institutes for Health Research (CIHR), the Canada Foundation for Innovation, Genome Canada through the Ontario Genomics Institute, GlaxoSmithKline, Karolinska Institutet, the Knut and Alice Wallenberg Foundation, the Ontario Innovation Trust, the Ontario Ministry for Research and Innovation, Merck & Co. Inc., the Novartis Research Foundation, the Swedish Agency for Innovation Systems, the Swedish Foundation for Strategic Research and the Wellcome Trust. C.H.A. holds a Canada Research Chair in Structural Genomics. V.L. is supported by a CIHR fellowship. A.P. is supported by grants from the CIHR (199170 and 186007) and from the NIH (MH074127, MH088413, DP3DK085698 and HG004535). A.P. is Tapscott Chair in Schizophrenia Studies and a Senior Fellow of the Ontario Mental Health Foundation. J.E. is supported by CIHR grant IG1-102956. B.A.G. 
is supported by grants from the NSF (Early Faculty CAREER award and CBET-0941143), the American Society for Mass Spectrometry and the NIH Office of the Director (DP2OD007447). P.A.D. is supported by NIH postdoctoral fellowship F32 NRSA.

Author contributions M.V., A.A.-H., A.S. and I.C. performed SAHH-coupled, fluorescence-polarization, DSF, DSLS and DNMT1 assays; D.B.-L. developed and performed in-cell western, MTT, ChIP, gene expression, clonogenicity, western blotting and immunofluorescence studies; F.L. developed the synthetic route to UNC0638 and UNC0737 and synthesized the compounds; S.R.-G. and J.E. developed and performed mES cell studies; V.L., S.-C.W. and A.P. performed H3K9me2 genomic localization and DNA methylation analysis; T.J.W. and W.P.J. performed mechanism-of-action studies; P.A.D. and B.A.G. performed MS-based proteomics studies; M.V., G.A.W., A.D., W.T., D.B.K. and C.H.A. solved and analyzed the X-ray crystal structure of the G9a-UNC0638-SAH complex; T.J.W. and A.T. performed SPR studies; X.C. and S.G.P. performed UNC0638 stability studies; S.G.P. performed RT-qPCR studies; T.J.M., X.-p.H. and B.L.R. performed GPCR selectivity studies; C.D.S. and W.P.J. performed kinase selectivity studies; J.L.N. purified proteins; J.J. designed UNC0638 and UNC0737; C.H.A., J.J., S.V.F., M.V., D.B.-L., P.J.B., J.E., S.R.-G., A.P., V.L., B.A.G., T.J.W. and A.E. designed studies and discussed results; J.J., C.H.A., D.B.-L., S.R.-G., M.V., V.L., S.V.F., J.E., A.P., B.A.G., P.J.B. and T.J.W. wrote the paper.

Competing financial interests The authors declare no competing financial interests.

Additional information Supplementary information, chemical compound information and chemical probe information is available online at http://www.nature.com/naturechemicalbiology/ . Reprints and permissions information is available online at http://www.nature.com/reprints/index.html . Correspondence and requests for materials should be addressed to C.H.A. and J.J.


LETTER doi:10.1038/nature10953

Chromatin-modifying enzymes as modulators of reprogramming

Tamer T. Onder1,2,3,4, Nergis Kara5, Anne Cherry1,2,3,4, Amit U. Sinha6,7, Nan Zhu3,6,7, Kathrin M. Bernt3,6,7, Patrick Cahan1,2,3,4, B. Ogan Mancarci8, Juli Unternaehrer1,2,3,4, Piyush B. Gupta9,10, Eric S. Lander9,11,12, Scott A. Armstrong3,6,7 & George Q. Daley1,2,3,4,6,13,14

Generation of induced pluripotent stem cells (iPSCs) by somatic cell reprogramming involves global epigenetic remodelling1. Whereas several proteins are known to regulate chromatin marks associated with the distinct epigenetic states of cells before and after reprogramming2,3, the role of specific chromatin-modifying enzymes in reprogramming remains to be determined. To address how chromatin-modifying proteins influence reprogramming, we used short hairpin RNAs (shRNAs) to target genes in DNA and histone methylation pathways, and identified positive and negative modulators of iPSC generation. Whereas inhibition of the core components of the polycomb repressive complex 1 and 2, including the histone 3 lysine 27 methyltransferase EZH2, reduced reprogramming efficiency, suppression of SUV39H1, YY1 and DOT1L enhanced reprogramming. Specifically, inhibition of the H3K79 histone methyltransferase DOT1L by shRNA or a small molecule accelerated reprogramming, significantly increased the yield of iPSC colonies, and substituted for KLF4 and c-Myc (also known as MYC). Inhibition of DOT1L early in the reprogramming process is associated with a marked increase in two alternative factors, NANOG and LIN28, which play essential functional roles in the enhancement of reprogramming. Genome-wide analysis of H3K79me2 distribution revealed that fibroblast-specific genes associated with the epithelial to mesenchymal transition lose H3K79me2 in the initial phases of reprogramming. DOT1L inhibition facilitates the loss of this mark from genes that are fated to be repressed in the pluripotent state. These findings implicate specific chromatin-modifying enzymes as barriers to or facilitators of reprogramming, and demonstrate how modulation of chromatin-modifying enzymes can be exploited to more efficiently generate iPSCs with fewer exogenous transcription factors.

To examine the influence of chromatin modifiers on somatic cell reprogramming, we used a loss-of-function approach to interrogate the role of 22 select genes in DNA and histone methylation pathways. We tested a pool of three hairpins for each of 22 target genes and observed knockdown efficiencies of >60% for 21 out of 22 targets (Supplementary Fig. 1). We infected fibroblasts differentiated from the H1 human embryonic stem cell (ESC) line (dH1fs) with shRNA pools, transduced them with reprogramming vectors expressing OCT4 (also known as POU5F1), SOX2, KLF4 and c-Myc (OSKM), and identified the resulting iPSCs by Tra-1-60 staining (Fig. 1a)4. Eight shRNA pools reduced reprogramming efficiency (Fig. 1b). Among the target genes were OCT4 (included as a control), and EHMT1 and SETDB1, two H3K9 methyltransferases whose histone mark is associated with transcriptional repression. The remaining five shRNA pools targeted components of polycomb repressive complexes (PRC), major mediators of gene silencing and heterochromatin formation5. Inhibition of PRC1 (BMI1, RING1) and PRC2 components (EZH2, EED, SUZ12) significantly decreased reprogramming efficiency while having negligible effects on cell proliferation (Fig. 1c and Supplementary Fig. 2). This finding is of particular significance given that EZH2 is necessary for fusion-based reprogramming6 and highlights the importance of transcriptional silencing of the somatic cell gene expression program during generation of iPSCs.

In contrast to genes whose functions seem to be required for reprogramming, inhibition of three genes enhanced reprogramming: YY1, SUV39H1 and DOT1L (Fig. 1b, d). YY1 is a context-dependent transcriptional activator or repressor7, whereas SUV39H1 is a histone H3K9 methyltransferase implicated in heterochromatin formation8. Interestingly, enzymes that modify H3K9 were associated with both inhibition and enhancement of reprogramming, which suggested that unravelling the mechanisms for their effects might be challenging. Thus, we focused on DOT1L, a histone H3 lysine 79 methyltransferase that has not previously been studied in the context of reprogramming9. We used two hairpin vectors that resulted in the most significant downregulation of DOT1L and concomitant decrease in global H3K79 methylation levels (Supplementary Fig. 3a, b). Fibroblasts expressing DOT1L shRNAs formed significantly more iPSC colonies when tested separately or in a context where they were fluorescently labelled and co-mixed with control cells (Fig. 2a and Supplementary Fig. 4). This enhanced reprogramming phenotype could be reversed by overexpressing an shRNA-resistant wild-type DOT1L, but not a catalytically inactive DOT1L, indicating that inhibition of catalytic activity of DOT1L is key to enhance reprogramming10 (Fig. 2a). Our findings with dH1fs were applicable to other human fibroblasts, as IMR-90 and MRC-5 cells also showed threefold and sixfold increases in reprogramming efficiency, respectively, upon DOT1L suppression (Supplementary Fig. 5). To validate our findings independently of shRNA-mediated knockdown, we used a recently discovered small molecule inhibitor of DOT1L catalytic activity. EPZ004777 (ref. 11, referred to as iDot1L) abrogated H3K79 methylation at concentrations ranging from 1 μM to 10 μM and increased reprogramming efficiency three- to fourfold (Fig. 2b and Supplementary Fig. 6a, b). Combination of inhibitor treatment with DOT1L knockdown did not further increase reprogramming efficiency, reinforcing our previous observation that inhibition of the catalytic activity of DOT1L is key to reprogramming (Supplementary Fig. 6c). iPSCs generated through DOT1L inhibition showed characteristic ESC morphology, immunoreactivity for SSEA4, SSEA3, Tra-1-81, OCT4 and NANOG,

1Stem Cell Transplantation Program, Division of Pediatric Hematology and Oncology, Manton Center for Orphan Disease Research, Children’s Hospital Boston and Dana Farber Cancer Institute, Boston, Massachusetts, 02115, USA. 2Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, 02115, USA. 3Harvard Stem Cell Institute, Cambridge, Massachusetts, 02138, USA. 4Stem Cell Program, Children’s Hospital Boston, Boston, Massachusetts, 02115, USA. 5German Cancer Research Center, Heidelberg, 69120, Germany. 6Division of Hematology/Oncology, Children’s Hospital, Harvard Medical School, Boston, Massachusetts, 02115, USA. 7Department of Pediatric Oncology, Harvard Medical School, Boston, Massachusetts, 02115, USA. 8Department of Molecular Biology and Genetics, Bilkent University, Ankara, 06800, Turkey. 9Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02142, USA. 10Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, 02142, USA. 11The Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, 02142, USA. 12Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, 02115, USA. 13Division of Hematology, Brigham and Women’s Hospital, Boston, Massachusetts, 02115, USA. 14Howard Hughes Medical Institute, Chevy Chase, Maryland, 20815, USA.

598 | NATURE | VOL 483 | 29 MARCH 2012

First published in Nature 483, 598–602 (2012); doi: 10.1038/nature10953

Personalized Therapeutics • The Power of Epigenetics

MAPPING THE HMTome: The Power of Personalized Therapeutics

Epizyme is creating personalized therapeutics for patients with genetically defined cancers based on breakthrough discoveries in the field of epigenetics.

Epizyme mapped the HMTome, a therapeutically important class of enzymes known as histone methyltransferases (HMTs) that are proven drivers of diseases such as cancer. The HMTome includes two major families: lysine methyltransferases (KMTs) and arginine methyltransferases (RMTs). Epizyme is creating small molecule HMT inhibitors as personalized therapeutics for the treatment of patients with genetically defined cancers.

EZH2: In pre-clinical development for patients with genetically defined lymphomas and solid tumors

DOT1L: In Phase I development for patients with MLL-r, a genetically defined type of acute leukemia

[Figure: HMTome family trees. Labels in the first tree: MLL, EZH1, EZH2, MLL4, SETD1B, SETD1A, MLL2, MLL3, SUV39H1, SUV39H2, EHMT1, EHMT2, SETDB1, SETMAR, SETDB2, Q6ZW69, MLL5, SETD5, NSD1, WHSC1L1, WHSC1, ASH1L, SETD2, SETD7, SETD8, SUV420H2, SUV420H1, SETD6, SETD3, PRDM3, PRDM5, PRDM16, PRDM2, PRDM1, PRDM11, PRDM7, PRDM9, PRDM10, PRDM8, PRDM13, PRDM6, PRDM14, PRDM12, PRDM4, SETD4, SMYD5, SMYD1, SMYD2, SMYD3, SMYD4, PRDM15. Labels in the second tree: METTL11A, METTL11B, COQ3, METTL12, METTL13, ECE2, PRMT5, METTL10, METTL20, PRMT7, PRMT10, PRMT6, PRMT2, PRMT3, PRMT1, PRMT8, CARM1, WBSCR22, ALKBH8, WBSCR27, COQ5, DOT1L, METTL7B, AS3MT, METTL7A, NSUN4, PNMT, ASMT, NOP2, NSUN7, PRMT9, PRMT11, NSUN5B, NNMT, INMT, NSUN5C, NSUN3, NSUN6, NSUN2, NSUN5, METTL2A, METTL2B, METTL6, METTL8, C20orf7.]

To learn more about moving science forward in genetically defined cancers, visit www.epizyme.com


and differentiated into all three embryonic germ layers in vitro and in teratomas (Supplementary Fig. 7a–c). Therefore, iPSCs generated following DOT1L inhibition display all of the hallmarks of pluripotency.

We next assessed DOT1L inhibition in murine reprogramming. iDot1L treatment led to threefold enhancement of reprogramming of mouse embryonic fibroblasts carrying an OCT4-GFP (green fluorescent protein) reporter gene (OCT4–GFP MEFs; Fig. 2c). Reprogramming of tail-tip fibroblasts (TTFs) derived from a conditional knockout DOT1L mouse strain yielded significantly more iPSC colonies upon deletion of DOT1L (ref. 12; Supplementary Fig. 8a). Cre-mediated excision of both floxed DOT1L alleles in iPSC clones derived from homozygous TTFs was confirmed by genomic PCR (Supplementary Fig. 8b). DOT1L inhibition also increased reprogramming efficiency of MEFs and peripheral blood cells derived from an inducible secondary iPSC mouse strain (ref. 13; Supplementary Fig. 8c, d). Taken together, these results demonstrate that DOT1L inhibition enhances reprogramming of both mouse and human cells.

We next examined the cellular mechanisms by which DOT1L inhibition promotes reprogramming. DOT1L inhibition affected neither retroviral transgene expression nor cellular proliferation (Supplementary Fig. 9a–c). Although previous studies indicated that DOT1L-null cells have increased apoptosis and accumulation of cells in G2 phase (ref. 9), we failed to observe a significant increase in apoptosis or change in the cell cycle profile of DOT1L-inhibited fibroblasts (Supplementary Fig. 9d, e). In human iPSC clones derived from shDot1L fibroblasts, DOT1L inhibition was no longer evident, reflecting the known silencing of retroviruses that occurs during reprogramming (Supplementary Fig. 10a). Quantitative PCR (qPCR) analysis revealed that the silencing occurred by day 15 after OSKM transduction (Supplementary Fig. 10b, c). To define the crucial time window for DOT1L inhibition, we treated fibroblasts with iDot1L at 1-week intervals during reprogramming. iDot1L treatment in either the first or second week was sufficient to enhance reprogramming, whereas treatment in the third week or a 5-day pretreatment had no effect (Supplementary Fig. 10d, e). Immunofluorescence analysis revealed significantly greater numbers of Tra-1-60-positive cell clusters on day 10 and day 14 in shDot1L cultures (Supplementary Fig. 11a, b), indicating that the emergence of iPSCs is accelerated upon DOT1L inhibition. When we extended the reprogramming experiments by 10 more days, shDot1L cells still yielded more iPSC colonies than controls (Supplementary Fig. 11c). Taken together, these findings indicate that DOT1L inhibition acts in early to middle stages to accelerate and increase the efficiency of the reprogramming process.

To assess whether DOT1L inhibition could replace any of the reprogramming factors, we infected control and DOT1L-inhibited fibroblasts with three factors, omitting one factor at a time. In the absence of OCT4 or SOX2 no iPSC colonies emerged (Fig. 2d). When we omitted either KLF4 or c-Myc, DOT1L-inhibited fibroblasts gave rise to robust numbers of Tra-1-60-positive colonies, whereas control cells generated very few colonies, as reported previously (ref. 4; Fig. 2d–f and Supplementary Fig. 12a). Importantly, DOT1L-inhibited fibroblasts transduced with only OCT4 and SOX2 gave rise to Tra-1-60-positive colonies, whereas control fibroblasts did not (Fig. 2d–f). These two-factor iPSCs showed typical ESC morphology, silenced the reprogramming vectors and had all of the hallmarks of pluripotency as gauged by endogenous pluripotency factor expression and the ability to form all three embryonic germ layers in vitro and in teratomas (Supplementary

Figure 1 | Screening for inhibitors and enhancers of reprogramming. a, Timeline of shRNA infection and iPSC generation. b, Number of Tra-1-60+ colonies 21 days after OSKM transduction of 25,000 dH1f cells previously infected with pools of shRNAs against the indicated genes. Representative Tra-1-60-stained reprogramming wells are shown. The dotted lines indicate 3 standard deviations from the mean number of colonies in control wells. c, Validation of primary screen hits that decrease reprogramming efficiency. Fold change in Tra-1-60+ iPSC colonies relative to control cells. *P < 0.05, **P < 0.01 compared to control shRNA-expressing fibroblasts (n = 4; error bars, ± s.e.m.). Representative Tra-1-60-stained wells are shown. d, Validation of primary screen hits that increase reprogramming efficiency. Fold change in Tra-1-60+ iPSC colonies relative to control cells. *P < 0.05, **P < 0.01 compared to control shRNA-expressing fibroblasts (n = 4; error bars, ± s.e.m.). Representative Tra-1-60-stained wells are shown.
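The dotted-line criterion in Fig. 1b (3 standard deviations from the mean of control wells) can be illustrated with a minimal Python sketch. The colony counts and the gene names `shGeneA`, `shGeneB` and `shGeneC` are hypothetical placeholders rather than data from the screen, and the use of the sample standard deviation is an assumption, since the caption does not specify the estimator.

```python
from statistics import mean, stdev

# Hypothetical Tra-1-60+ colony counts; none of these values come from the paper.
control_wells = [78, 85, 92, 88, 81, 90]
screen_wells = {"shGeneA": 24, "shGeneB": 87, "shGeneC": 160}

mu = mean(control_wells)
sd = stdev(control_wells)  # sample standard deviation (an assumption)
lower, upper = mu - 3 * sd, mu + 3 * sd


def classify(count):
    """Label a well relative to the mean +/- 3 s.d. band of the control wells."""
    if count > upper:
        return "more colonies than controls"
    if count < lower:
        return "fewer colonies than controls"
    return "within control range"


calls = {gene: classify(n) for gene, n in screen_wells.items()}
for gene, call in calls.items():
    print(gene, call)
```

Under this reading, a knockdown whose count falls outside the band would be carried forward for validation, as in panels c and d.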


Figs 7a–c and 12b). PCR on genomic DNA isolated from expanded colonies confirmed the absence of integrated KLF4 and c-Myc transgenes (Supplementary Fig. 12c). Thus, we were able to generate two-factor iPSCs either by suppression of DOT1L expression or chemical inhibition of its methyltransferase activity.

To gain insights into the molecular mechanisms of how DOT1L inhibition promotes reprogramming and replaces KLF4, we performed global gene-expression analyses on control and shDot1L fibroblasts before and 6 days after OSKM and OSM transduction, along with cells that were treated with iDot1L. Relatively few genes were differentially expressed in shDot1L cells on day 6 of reprogramming (22 up, 23 down; Supplementary Table 3). Inhibitor-treated cells showed broader gene expression changes (405 up and 175 down; Supplementary Table 3), presumably due to more complete inhibition of K79me2 levels (Fig. 3a). In the absence of KLF4, 94 genes were differentially upregulated in shDot1L cells; intersection of this set of genes with the set differentially upregulated in four-factor reprogramming of DOT1L-inhibited cells yielded only five common genes (Fig. 3a, b). We were particularly intrigued to find NANOG and LIN28 upregulated in all three instances of DOT1L inhibition, because these two genes are part of the core pluripotency network of human ESCs (refs 14, 15) and can reprogram human fibroblasts into iPSCs when used in combination with OCT4 and SOX2 (ref. 16).

We explored the possibility that NANOG and LIN28 upregulation might account for the enhanced reprogramming observed following DOT1L inhibition, and validated their upregulation in shDot1L fibroblasts upon OSM or OS transduction (Supplementary Fig. 13a, b). Interestingly, at this early time point REX1 (also known as ZFP42) and DNMT3B, two other well-characterized pluripotency genes, were not upregulated, indicating that DOT1L inhibition does not broadly upregulate the pluripotency network. Suppression of either Nanog or Lin28 abrogated the two-factor (OS) reprogramming of shDot1L fibroblasts, indicating the essential roles of NANOG and LIN28 in this process (Fig. 3c and Supplementary Fig. 13c). DOT1L inhibition also led to increased NANOG expression in the context of OCT4, SOX2 and LIN28 (OSL) and LIN28 expression in the context of OCT4, SOX2 and NANOG (OSN) (Supplementary Fig. 14a). Furthermore, DOT1L inhibition significantly increased the efficiency of three-factor reprogramming in the context of OSN and OSL (Supplementary Fig. 14b). Finally, inclusion of NANOG and LIN28 in the OSKM reprogramming cocktail did not confer any additional enhancement to shDot1L cells (Fig. 4d and Supplementary Fig. 14c). Taken together, these data implicate NANOG and LIN28 in the enhancement of reprogramming and replacement of KLF4 and c-Myc with DOT1L inhibition.

To gain insight into the genome-wide chromatin changes that are facilitated by DOT1L inhibition, we performed chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq) for H3K79me2 and H3K27me3 in human ESCs as well as fibroblasts undergoing

Figure 2 | DOT1L inhibition enhances reprogramming efficiency and substitutes for KLF4 and Myc. a, Fold change in the reprogramming efficiency of dH1f cells infected with two independent DOT1L shRNAs or co-infected with shRNA-1 and a vector expressing an shRNA-resistant wild-type or catalytically dead mutant DOT1L. Data correspond to the average and s.e.m.; n, number of independent experiments (n = 5 or n = 3, as marked on the bars). *P < 0.01 compared to control shRNA-expressing fibroblasts. b, Fold change in the reprogramming efficiency of dH1f cells treated with iDot1L at the indicated concentrations (1, 3.3 and 10 μM) for 21 days. Data correspond to the mean ± s.d.; n = 3. *P < 0.001 compared to untreated fibroblasts. c, Number of alkaline-phosphatase-positive (AP+) colonies derived from OSKM-transduced untreated or iDot1L-treated (10 μM) OCT4–GFP MEFs. *P < 0.001 compared to untreated MEFs (n = 4; error bars, ± s.d.). Representative AP-stained wells are shown. d, Tra-1-60-stained plates of shCntrl and shDot1L fibroblasts in the absence of each factor or both KLF4 and c-Myc. e, Tra-1-60-stained plates of untreated and iDot1L-treated (3.3 μM) fibroblasts in the absence of each factor or both KLF4 and c-Myc. f, Quantification of the Tra-1-60+ colonies in Fig. 2d, e representing mean and s.d. of two independent experiments done in triplicate.
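The fold-change values with s.e.m. error bars reported in panel a can be illustrated with a short Python sketch. The colony counts are hypothetical, and normalising each experiment to its own control before averaging is an assumption about the analysis, not a procedure stated in the caption.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical colony counts from three paired experiments (not data from the paper).
control_counts = [40, 50, 45]
knockdown_counts = [120, 140, 144]

# Fold change per experiment, each normalised to its own control (an assumption).
fold_changes = [k / c for k, c in zip(knockdown_counts, control_counts)]

avg = mean(fold_changes)
# Standard error of the mean: sample s.d. divided by the square root of n.
sem = stdev(fold_changes) / sqrt(len(fold_changes))
print(f"fold change = {avg:.2f} +/- {sem:.2f} (s.e.m., n = {len(fold_changes)})")
```

Panels b and c report s.d. rather than s.e.m.; in this sketch that would correspond to using `stdev(fold_changes)` directly.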

Figure 3 | NANOG and LIN28 are required for enhancement of reprogramming by DOT1L inhibition. a, Overlap of differentially upregulated genes in shDot1L cells 6 days post-OSKM and OSM transduction with the genes upregulated in OSKM-transduced iDot1L-treated cells. Set sizes labelled in the panel: shDot1L_OSKM (22), iDot1L_OSKM (405), shDot1L_OSM (94); the five genes listed for the common overlap are LEFTY1, LIN28A, LUM, NANOG and UPP1. b, Heat maps showing differential expression levels of commonly upregulated genes in OSKM-transduced DOT1L-inhibited cells (CDO1, CHST15, COL11A1, LEFTY1, LEFTY2, LIN28A, LUM, NANOG, PROM1, SCG2 and UPP1). c, Number of Tra-1-60+ iPSC colonies upon knockdown of Nanog or Lin28 in 2-factor reprogramming of shDot1L cells. Data represent mean and s.e.m. of 2 independent experiments done in triplicate. d, Fold change in Tra-1-60+ iPSC colonies in 4-factor (OSKM) and 6-factor (OSKMNL) reprogramming of shCntrl and shDot1L fibroblasts. Data represent mean and s.e.m. of two independent experiments done in duplicate. Representative Tra-1-60-stained wells are shown above.
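The overlap analysis summarised in Fig. 3a amounts to a set intersection, sketched below in Python. The two gene lists are the labels printed on the panel; which pairwise overlap each list represents is not recoverable from the extracted text, so the neutral names `overlap_one` and `overlap_two` are assumptions made for illustration.

```python
# Gene labels as printed on Fig. 3a. Which pairwise overlap each list
# represents is an assumption, so the variable names are deliberately neutral.
overlap_one = {"CDO1", "CHST15", "COL11A1", "LEFTY1", "LEFTY2", "LIN28A",
               "LUM", "NANOG", "PROM1", "SCG2", "UPP1"}
overlap_two = {"ARL6IP1", "CADM1", "INHBA", "INHBB", "LEFTY1", "LIN28A",
               "LUM", "NANOG", "NPPB", "PMEPA1", "RUNX3", "UPP1"}

# Genes shared by both lists, sorted for a stable display order.
common = sorted(overlap_one & overlap_two)
print(common)  # ['LEFTY1', 'LIN28A', 'LUM', 'NANOG', 'UPP1']
```

The intersection contains five genes, including NANOG and LIN28A, which matches the five common genes reported in the text.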



and differentiated into all three embryonic germ layers in vitroand in teratomas (Supplementary Fig. 7a–c). Therefore, iPSCsgenerated following DOT1L inhibition display all of the hallmarks ofpluripotency.We next assessed DOT1L inhibition in murine reprogramming.

iDot1L treatment led to threefold enhancement of reprogrammingof mouse embryonic fibroblasts carrying an OCT4-GFP (greenfluorescent protein) reporter gene (OCT4–GFP MEFs; Fig. 2c).Reprogramming of tail-tip fibroblasts (TTFs) derived from a con-ditional knockout DOT1L mouse strain yielded significantly moreiPSC colonies upon deletion of DOT1L12 (Supplementary Fig. 8a).Cre-mediated excision of both floxed DOT1L alleles in iPSC clonesderived from homozygous TTFs was confirmed by genomic PCR(Supplementary Fig. 8b). DOT1L inhibition also increased reprogram-ming efficiency of MEFs and peripheral blood cells derived from aninducible secondary iPSC mouse strain13 (Supplementary Fig. 8c, d).Taken together, these results demonstrate that DOT1L inhibitionenhances reprogramming of both mouse and human cells.We next examined the cellular mechanisms by which DOT1L

inhibition promotes reprogramming. DOT1L inhibition affected nei-ther retroviral transgene expression nor cellular proliferation(Supplementary Fig. 9a–c). Although previous studies indicated thatDOT1L-null cells have increased apoptosis and accumulation of cellsin G2 phase9, we failed to observe a significant increase in apoptosis orchange in the cell cycle profile of DOT1L-inhibited fibroblasts(Supplementary Fig. 9d, e). In human iPSC clones derived fromshDot1L fibroblasts, DOT1L inhibition was no longer evident, reflect-ing the known silencing of retroviruses that occurs during reprogram-ming (Supplementary Fig. 10a). Quantitative PCR (qPCR) analysis

revealed that the silencing occurred by day 15 after OSKM transduc-tion (Supplementary Fig. 10b, c). To define the crucial timewindow forDOT1L inhibition, we treated fibroblasts with iDot1L at 1-week inter-vals during reprogramming. iDot1L treatment in either the first orsecond week was sufficient to enhance reprogramming, whereas treat-ment in the third week or a 5-day pretreatment had no effect(Supplementary Fig. 10d, e). Immunofluorescence analysis revealedsignificantly greater numbers of Tra-1-60-positive cell clusters onday 10 and day 14 in shDot1L cultures (Supplementary Fig. 11a, b),indicating that the emergence of iPSCs is accelerated upon DOT1Linhibition. When we extended the reprogramming experiments by 10more days, shDot1L cells still yieldedmore iPSC colonies than controls(Supplementary Fig. 11c). Taken together, these findings indicate thatDOT1L inhibition acts in early to middle stages to accelerate andincrease the efficiency of the reprogramming process.To assess whether DOT1L inhibition could replace any of the repro-

gramming factors, we infected control and DOT1L-inhibited fibro-blasts with three factors, omitting one factor at a time. In the absence ofOCT4 or SOX2 no iPSC colonies emerged (Fig. 2d).When we omittedeither KLF4 or c-Myc, DOT1L-inhibited fibroblasts gave rise to robustnumbers of Tra-1-60-positive colonies, whereas control cells gener-ated very few colonies, as reported previously4 (Fig. 2d–f and Sup-plementary Fig. 12a). Importantly, DOT1L-inhibited fibroblaststransduced with only OCT4 and SOX2 gave rise to Tra-1-60-positivecolonies, whereas control fibroblasts did not (Fig. 2d–f). These two-factor iPSCs showed typical ESCmorphology, silenced the reprogram-ming vectors and had all of the hallmarks of pluripotency as gauged byendogenous pluripotency factor expression and the ability to form allthree embryonic germ layers in vitro and in teratomas (Supplementary

dH1f fibroblasts

Day –6shRNA Re-seed OSKM Plate on MEFs

Day –5 Day –1 Day 0 Day 6Tra-1-60Staining

Day 21

a

b

c

shCnt

rl

shSETD

B1

shBm

i1

shRing

1

shEed

shEzh

2

shSuz

12

Fold

cha

nge

in T

ra-1

-60+

col

onie

s

d

**

** **

****

*

**

*

*

shCntrl shSuv39H1shYY1 shDot1L

0.2

0.4

0.6

0.8

1.2

0

1.0

Fold

cha

nge

in T

ra-1

-60+

col

onie

s0

3.5

3.0

2.5

1.5

2.0

1.0

0.5

SUV39H1 YY1 DOT1L DNMT3A MECP2 NR2F1 DNMT1 SMYD2

CNTRL MBD2 MBD4 EZH1 SUV39H2 MBD1 MBD3G9A

SETDB1 OCT4 BMI1 RING1

Num

ber

of T

ra-1

-60+

col

onie

s

SUV39H1

YY1

DOT1L

DNMT3A

MECP2

NR2F1

DNMT1

SMYD2

CNTR

LMBD2

MBD4

EZH1

SUV39H2

MBD1

G9A

MBD3

SETD

B1

OCT4

BMI1

RING1

SUZ12

EHMT1

EZH2

EED

SUZ12 EHMT1 EZH2 EED

0

20

200

4060

80

100120

140160

180

Figure 1 | Screening for inhibitors and enhancers of reprogramming.a, Timeline of shRNA infection and iPSC generation. b, Number of Tra-1-601

colonies 21 days after OSKM transduction of 25,000 dH1f cells previouslyinfected with pools of shRNAs against the indicated genes. Representative Tra-1-60-stained reprogramming wells are shown. The dotted lines indicates 3standard deviations from the mean number of colonies in control wells.c, Validation of primary screen hits that decrease reprogramming efficiency.

Fold change in Tra-1-601 iPSC colonies relative to control cells. *P, 0.05,**P, 0.01 compared to control shRNA-expressing fibroblasts (n5 4; errorbars,6s.e.m.). Representative Tra-1-60-stained wells are shown. d, Validationof primary screen hits that increase reprogramming efficiency. Fold change inTra-1-601 iPSC colonies relative to control cells. *P, 0.05, **P, 0.01compared to control shRNA-expressing fibroblasts (n5 4; errorbars,6 s.e.m.). Representative Tra-1-60-stained wells are shown.

LETTER RESEARCH

29 MARCH 2012 | VOL 483 | NATURE | 599

S30 NATURE REPRINT COLLECTION Epigenetics

reprogramming, with or without iDot1L treatment (Supplementary Fig. 15). In both ESCs and fibroblasts, H3K79me2 is positively associated with transcriptionally active genes and negatively associated with genes marked by H3K27me3 (Supplementary Fig. 16a–c). ESC-specific genes marked by H3K79me2 included pluripotency factors, a subset of their downstream targets, and genes involved in epithelial cell adhesion such as CDH1 (E-cadherin) (280 genes; Supplementary Fig. 17a, b and Supplementary Tables 4, 5). In contrast, in fibroblasts, genes marked by H3K79me2 were significantly enriched in genes induced during the epithelial to mesenchymal transition (EMT) (377 genes; Supplementary Fig. 17a).

Among the 348 genes that showed reduced H3K79me2 6 days after OSKM expression, we likewise found a significant enrichment of gene sets associated with the induction of a mesenchymal state, including SNAI2, TGFB2 and TGFBR1 (Supplementary Fig. 18a) (refs 17, 18). Only a few of these genes showed decreased expression at day 6 (12 out of 348), but the vast majority of them lacked this mark in the pluripotent state (272 out of the 348 devoid of H3K79me2 in ESCs), suggesting they were destined for transcriptional silencing during reprogramming. This finding prompted us to ask whether DOT1L inhibition results in the removal of H3K79me2 from such fibroblast-specific, EMT-associated genes. Upon DOT1L inhibitor treatment, H3K79me2 levels were reduced on almost all loci, with the exception of a subset comprised mostly of housekeeping genes that also had high levels of H3K79me2 in ESCs (Supplementary Fig. 19a). Strikingly, the genes that lost proportionally the most H3K79me2 in inhibitor-treated fibroblasts during reprogramming (eightfold or more) were again highly enriched in genes induced in EMT (Supplementary Fig. 19b). Mesenchymal master regulators such as SNAI1, SNAI2, ZEB1, ZEB2 and TGFB2 were among these genes (Fig. 4a) (ref. 19). In the presence of the DOT1L inhibitor, these regulators were more strongly repressed during reprogramming, whereas epithelial genes such as CDH1 and OCLN were more robustly upregulated (Fig. 4b). The extinction of fibroblast gene expression was accompanied by increased deposition of the repressive H3K27me3 mark on the majority of fibroblast-specific regulators examined (Supplementary Fig. 20). In contrast, H3K27me3 was depleted to a greater extent on SOX2 and E-cadherin promoters, reflecting their activation during reprogramming. Finally, the H3K27me3 status of master regulators of other lineages, such as OLIG2, MYOD1, NKX2-1 and GATA4, remained unchanged upon DOT1L inhibitor treatment, indicating that the deposition of H3K27me3 was specific to fibroblast-specific regulators.

To test the functional importance of downregulation of mesenchymal regulators in the iDot1L-mediated enhancement of reprogramming, we overexpressed TWIST1, SNAI1 and ZEB1 or added soluble TGF-β2 to cells undergoing reprogramming in the presence of the DOT1L inhibitor. All of these perturbations significantly counteracted the enhancement observed with DOT1L inhibition (Fig. 4c). Interestingly, expression of these factors also abrogated the iDot1L-mediated upregulation of NANOG and LIN28, suggesting that the effect of DOT1L

Figure 4 | Genome-wide analysis of H3K79me2 marks during reprogramming. a, H3K79me2 ChIP-sequencing tracks (blue) for select EMT-associated genes in fibroblasts (Fib) and ESCs along with the corresponding H3K27me3 tracks in ESCs (red). b, Expression of EMT-associated transcription factors (EMT-TF) and epithelial genes in control and iDot1L-treated fibroblasts at the indicated time points during reprogramming. qPCR was normalized to uninfected fibroblasts for EMT-TFs and H1 ESCs for CDH1 and OCLN. c, Number of Tra-1-60+ colonies derived from untreated and iDot1L-treated (3.3 μM) dH1f cells that are either infected with SNAI1, TWIST1 or ZEB1 expression vectors or treated with soluble TGF-β2 (2 ng ml⁻¹) (n = 3; error bars, ± s.d.). Representative Tra-1-60-stained wells are shown. d, qRT–PCR quantification of NANOG mRNA level on day 6 of OSKM-expressing untreated or iDot1L-treated (3.3 μM) fibroblasts expressing the indicated EMT-factors. Expression levels were normalized to those observed in H1 ESCs. e, qRT–PCR quantification of LIN28A mRNA level on day 6 of OSKM-expressing untreated or iDot1L-treated (3.3 μM) fibroblasts expressing the indicated EMT-factors. Expression levels were normalized to those observed in H1 ESCs.


Figs 7a–c and 12b). PCR on genomic DNA isolated from expanded colonies confirmed the absence of integrated KLF4 and c-Myc transgenes (Supplementary Fig. 12c). Thus, we were able to generate two-factor iPSCs either by suppression of DOT1L expression or chemical inhibition of its methyltransferase activity.

To gain insights into the molecular mechanisms of how DOT1L inhibition promotes reprogramming and replaces KLF4 we performed global gene-expression analyses on control and shDot1L fibroblasts before and 6 days after OSKM and OSM transduction, along with cells that were treated with iDot1L. Relatively few genes were differentially expressed in shDot1L cells on day 6 of reprogramming (22 up, 23 down; Supplementary Table 3). Inhibitor-treated cells showed broader gene expression changes (405 up and 175 down; Supplementary Table 3), presumably due to more complete inhibition of K79me2 levels (Fig. 3a). In the absence of KLF4, 94 genes were differentially upregulated in shDot1L cells; intersection of this set of genes with the set differentially upregulated in four-factor reprogramming of DOT1L-inhibited cells yielded only five common genes (Fig. 3a, b). We were particularly intrigued to find NANOG and LIN28 upregulated in all three instances of DOT1L inhibition, because these two genes are part of the core pluripotency network of human ESCs (refs 14, 15) and can reprogram human fibroblasts into iPSCs when used in combination with OCT4 and SOX2 (ref. 16).

We explored the possibility that NANOG and LIN28 upregulation might account for the enhanced reprogramming observed following DOT1L inhibition, and validated their upregulation in shDot1L fibroblasts upon OSM or OS transduction (Supplementary Fig. 13a, b). Interestingly, at this early time point REX1 (also known as ZFP42) and DNMT3B, two other well-characterized pluripotency genes, were not upregulated, indicating that DOT1L inhibition does not broadly upregulate the pluripotency network. Suppression of either Nanog or Lin28 abrogated the two-factor (OS) reprogramming of shDot1L fibroblasts, indicating the essential roles of NANOG and LIN28 in this process (Fig. 3c and Supplementary Fig. 13c). DOT1L inhibition also led to increased NANOG expression in the context of OCT4, SOX2 and LIN28 (OSL) and LIN28 expression in the context of OCT4, SOX2 and NANOG (OSN) (Supplementary Fig. 14a). Furthermore, DOT1L inhibition significantly increased the efficiency of three-factor reprogramming in the context of OSN and OSL (Supplementary Fig. 14b). Finally, inclusion of NANOG and LIN28 in the OSKM reprogramming cocktail did not confer any additional enhancement to shDot1L cells (Fig. 4d and Supplementary Fig. 14c). Taken together, these data implicate NANOG and LIN28 in the enhancement of reprogramming and replacement of KLF4 and c-Myc with DOT1L inhibition.

To gain insight into the genome-wide chromatin changes that are facilitated by DOT1L inhibition, we performed chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq) for H3K79me2 and H3K27me3 in human ESCs as well as fibroblasts undergoing

Figure 2 | DOT1L inhibition enhances reprogramming efficiency and substitutes for KLF4 and Myc. a, Fold change in the reprogramming efficiency of dH1f cells infected with two independent DOT1L shRNAs or co-infected with shRNA-1 and a vector expressing an shRNA-resistant wild-type or catalytically dead mutant DOT1L. Data correspond to the average and s.e.m.; n = number of independent experiments (n = 5 or n = 3, as marked on the panel). *P < 0.01 compared to control shRNA-expressing fibroblasts. b, Fold change in the reprogramming efficiency of dH1f cells treated with iDot1L at the indicated concentrations for 21 days. Data correspond to the mean ± s.d.; n = 3. *P < 0.001 compared to untreated fibroblasts. c, Number of alkaline-phosphatase-positive (AP+) colonies derived from OSKM-transduced untreated or iDot1L-treated (10 μM) OCT4–GFP MEFs. *P < 0.001 compared to untreated MEFs (n = 4; error bars, ± s.d.). Representative AP-stained wells are shown. d, Tra-1-60-stained plates of shCntrl and shDot1L fibroblasts in the absence of each factor or both KLF4 and c-Myc. e, Tra-1-60-stained plates of untreated and iDot1L-treated (3.3 μM) fibroblasts in the absence of each factor or both KLF4 and c-Myc. f, Quantification of the Tra-1-60+ colonies in Fig. 2d, e representing mean and s.d. of two independent experiments done in triplicate.
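As a rough illustration of how a "fold change in reprogramming efficiency" with mean ± s.d. might be computed from replicate colony counts, a minimal Python sketch follows. The normalization to the mean control count, the function name `fold_change` and the colony counts are assumptions for illustration; the source does not describe the exact calculation.

```python
from statistics import mean, stdev


def fold_change(treated_counts, control_counts):
    """Return (mean, s.d.) of treated colony counts expressed relative
    to the mean colony count of the control replicates."""
    baseline = mean(control_counts)
    if baseline == 0:
        raise ValueError("control mean is zero; fold change is undefined")
    ratios = [count / baseline for count in treated_counts]
    return mean(ratios), stdev(ratios)  # sample s.d. across replicates


# Hypothetical Tra-1-60+ colony counts from three replicate wells each.
untreated = [40, 50, 60]
inhibitor_treated = [140, 150, 160]

fc_mean, fc_sd = fold_change(inhibitor_treated, untreated)
print(f"fold change = {fc_mean:.2f} ± {fc_sd:.2f} (s.d., n = {len(inhibitor_treated)})")
```

Dividing each treated replicate by the control mean keeps the replicate-to-replicate spread visible in the reported error bar; other normalizations (for example, pairing each treated well with a control well from the same experiment) are equally plausible.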

[Panel a set labels: shDot1L_OSKM (22), iDot1L_OSKM (405), shDot1L_OSM (94). Genes listed in the overlap regions: CDO1, CHST15, COL11A1, LEFTY1, LEFTY2, LIN28A, LUM, NANOG, PROM1, SCG2, UPP1; ARL6IP1, CADM1, INHBA, INHBB, LEFTY1, LIN28A, LUM, NANOG, NPPB, PMEPA1, RUNX3, UPP1; LEFTY1, LIN28A, LUM, NANOG, UPP1.]

Figure 3 | NANOG and LIN28 are required for enhancement of reprogramming by DOT1L inhibition. a, Overlap of differentially upregulated genes in shDot1L cells 6 days post-OSKM and OSM transduction with the genes upregulated in OSKM-transduced iDot1L-treated cells. b, Heat maps showing differential expression levels of commonly upregulated genes in OSKM-transduced DOT1L-inhibited cells. c, Number of Tra-1-60+ iPSC colonies upon knockdown of Nanog or Lin28 in 2-factor reprogramming of shDot1L cells. Data represent mean and s.e.m. of 2 independent experiments done in triplicate. d, Fold-change in Tra-1-60+ iPSC colonies in 4-factor (OSKM) and 6-factor (OSKMNL) reprogramming of shCntrl and shDot1L fibroblasts. Data represent mean and s.e.m. of two independent experiments done in duplicate. Representative Tra-1-60-stained wells are shown above.
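The overlap analysis summarized in panel a amounts to a set intersection. The Python sketch below uses the gene symbols printed in the overlap regions of the panel; which pairwise comparison each list belongs to is not recoverable from the extracted figure text, so the variable names `overlap_1` and `overlap_2` are neutral placeholders and that assignment should be treated as an assumption.

```python
# Gene symbols printed in two overlap regions of the Fig. 3a Venn diagram.
# Which pairwise comparison each list represents is an assumption here.
overlap_1 = {
    "CDO1", "CHST15", "COL11A1", "LEFTY1", "LEFTY2", "LIN28A",
    "LUM", "NANOG", "PROM1", "SCG2", "UPP1",
}
overlap_2 = {
    "ARL6IP1", "CADM1", "INHBA", "INHBB", "LEFTY1", "LIN28A",
    "LUM", "NANOG", "NPPB", "PMEPA1", "RUNX3", "UPP1",
}

# Genes present in both pairwise overlaps are upregulated in all three
# conditions (shDot1L + OSKM, shDot1L + OSM, iDot1L + OSKM).
common = sorted(overlap_1 & overlap_2)
print(common)  # ['LEFTY1', 'LIN28A', 'LUM', 'NANOG', 'UPP1']
```

The five symbols returned match the five common genes reported in the text, including NANOG and LIN28A.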


inhibition on these two pluripotency genes is likely to be indirect (Fig. 4d, e). Conversely, we tested whether destabilization of the mesenchymal state by inhibition of TGF-β signalling would be redundant with DOT1L inhibition. A small molecule inhibitor of TGF-β signalling (SB431542) increased reprogramming efficiency, but in combination with the DOT1L inhibitor, showed no significant further increase in iPSC colonies (Supplementary Fig. 21). Taken together these data indicate that in fibroblasts, downregulation of the mesenchymal gene expression program is critical to enhancement of reprogramming by DOT1L inhibition.

Our loss-of-function survey indicates that chromatin-modifying enzymes play critical roles for both reactivating silenced loci as well as reinstating closed domains of heterochromatin during the global epigenetic remodelling of differentiated cells to pluripotency, thus implicating specific enzymes as facilitators or barriers to cell fate transitions. DOT1L inhibition seems to enhance reprogramming at least in part by facilitating loss of H3K79me2 from fibroblast genes whose silencing is required for reprogramming (Supplementary Fig. 22). Interestingly, KLF4, which can be replaced by DOT1L inhibition, has been shown to facilitate a mesenchymal to epithelial transition (MET) by inducing E-cadherin expression (ref. 20). Persistent H3K79me2 at the fibroblast master regulators during the initial phases of reprogramming seems to prevent shutdown of these genes, thus hindering the acquisition of an epithelial phenotype concomitant with delayed activation of NANOG and LIN28. In this regard H3K79me2 acts as a barrier to efficient repression of the somatic program by the reprogramming factors. This notion is consistent with the role of Dot1 in yeast, where it antagonizes gene repression (ref. 21). As reprogramming of blood cells is also enhanced by DOT1L inhibition, we speculate that DOT1L inhibition may enhance reprogramming in a broad range of cell types by facilitating the silencing of lineage-specific programs of gene expression. Finally, our results also demonstrate that specific chromatin modifiers can be modulated to generate iPSCs more efficiently and with fewer exogenously introduced transcription factors.

METHODS SUMMARY

shRNAs were designed using the RNAi Codex (ref. 22). 97-mer oligonucleotides (Supplementary Table 1) were PCR-amplified and cloned into the MSCV-PM vector (ref. 23). Reprogramming assays were carried out with either retroviral (ref. 4) or lentiviral (ref. 16) reprogramming vectors. dH1f cells were previously described (ref. 4). For gene expression analyses, total RNA was extracted from two or three independent culture plates for each condition and transcriptional profiling was performed using Affymetrix U133A microarrays. ChIP-seq was performed as described with slight modifications (ref. 12).

Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 16 May 2011; accepted 16 February 2012.

Published online 4 March; corrected 28 March 2012 (see full-text HTML version for details).

1. Takahashi, K. & Yamanaka, S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676 (2006).

2. Hawkins, R. D. et al. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479–491 (2010).

3. Mikkelsen, T. S. et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553–560 (2007).

4. Park, I.-H. et al. Reprogramming of human somatic cells to pluripotency with defined factors. Nature 451, 141–146 (2008).

5. Margueron, R. & Reinberg, D. The Polycomb complex PRC2 and its mark in life. Nature 469, 343–349 (2011).

6. Pereira, C. F. et al. ESCs require PRC2 to direct the successful reprogramming of differentiated cells toward pluripotency. Cell Stem Cell 6, 547–556 (2010).

7. Shi, Y. et al. Transcriptional repression by YY1, a human GLI-Krüppel-related protein, and relief of repression by adenovirus E1A protein. Cell 67, 377–388 (1991).

8. Schotta, G., Ebert, A. & Reuter, G. SU(VAR)3–9 is a conserved key function in heterochromatic gene silencing. Genetica 117, 149–158 (2003).

9. Jones, B. et al. The histone H3K79 methyltransferase Dot1L is essential for mammalian development and heterochromatin structure. PLoS Genet. 4, e1000190 (2008).

10. Okada, Y. et al. hDOT1L links histone methylation to leukemogenesis. Cell 121, 167–178 (2005).

11. Daigle, S. R. et al. Selective killing of mixed lineage leukemia cells by a potent small-molecule DOT1L inhibitor. Cancer Cell 20, 53–65 (2011).

12. Bernt, K. M. et al. MLL-rearranged leukemia is dependent on aberrant H3K79 methylation by DOT1L. Cancer Cell 20, 66–78 (2011).

13. Carey, B. W. et al. Single-gene transgenic mouse strains for reprogramming adult somatic cells. Nature Methods 7, 56–59 (2010).

14. Boyer, L. A. et al. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947–956 (2005).

15. Mikkelsen, T. S. et al. Dissecting direct reprogramming through integrative genomic analysis. Nature 454, 49–55 (2008).

16. Yu, J. et al. Induced pluripotent stem cell lines derived from human somatic cells. Science 318, 1917–1920 (2007).

17. Charafe-Jauffret, E. et al. Gene expression profiling of breast cell lines identifies potential new basal markers. Oncogene 25, 2273–2284 (2006).

18. Onder, T. T. et al. Loss of E-cadherin promotes metastasis via multiple downstream transcriptional pathways. Cancer Res. 68, 3645–3654 (2008).

19. Taube, J. H. et al. Core epithelial-to-mesenchymal transition interactome gene-expression signature is associated with claudin-low and metaplastic breast cancer subtypes. Proc. Natl Acad. Sci. USA 107, 15449–15454 (2010).

20. Samavarchi-Tehrani, P. et al. Functional genomics reveals a BMP-driven mesenchymal-to-epithelial transition in the initiation of somatic cell reprogramming. Cell Stem Cell 7, 64–77 (2010).

21. Stulemeijer, I. J. et al. Dot1 binding induces chromatin rearrangements by histone methylation-dependent and -independent mechanisms. Epigenetics Chromatin 4, 2 (2011).

22. Olson, A. et al. RNAi Codex: a portal/database for short-hairpin RNA (shRNA) gene-silencing constructs. Nucleic Acids Res. 34, D153–D157 (2006).

23. Schlabach, M. R. et al. Cancer proliferation gene discovery through functional genomics. Science 319, 620–624 (2008).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank G. Hu and S. J. Elledge for providing the MSCV-PM vector, K. Ng and M. W. Lensch for teratoma injections and assessment and S. Loewer for discussions. We also thank E. Olhava and Epizyme Inc. for synthesizing and providing the DOT1L inhibitor, EPZ004777. G.Q.D. is an investigator of the Howard Hughes Medical Institute. Research was funded by grants from the US National Institutes of Health (NIH) to S.A.A. (CA140575) and G.Q.D., and the CHB Stem Cell Program.

Author Contributions T.T.O. performed project planning, experimental work, data interpretation and preparation of the manuscript. N.K., A.C., N.Z., J.U. and B.O.M. performed experimental work. P.C. and A.U.S. participated in data analysis. K.M.B. and S.A.A. provided critical materials and participated in the preparation of the manuscript. P.B.G. and E.S.L. participated in data acquisition, data interpretation and preparation of the manuscript. G.Q.D. supervised research and participated in project planning, data interpretation and preparation of the manuscript.

Author Information The microarray and ChIP-seq data have been deposited in the National Center for Biotechnology Information Gene Expression Omnibus (GEO) and are accessible through GEO Series accession numbers GSE29253 and GSE35791. Reprints and permissions information is available at www.nature.com/reprints. The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to G.Q.D. ([email protected]).

RESEARCH LETTER

6 0 2 | N A T U R E | V O L 4 8 3 | 2 9 M A R C H 2 0 1 2

reprogramming, with or without iDot1L treatment (SupplementaryFig. 15). In both ESCs and fibroblasts, H3K79me2 is positively asso-ciated with transcriptionally active genes and negatively associatedwith genes marked by H3K27me3 (Supplementary Fig. 16a–c). ESC-specific genes marked by H3K79me2 included pluripotency factors, asubset of their downstream targets, and genes involved in epithelial celladhesion such as CDH1 (E-cadherin) (280 genes; SupplementaryFig. 17a, b and Supplementary Tables 4, 5). In contrast, in fibroblasts,genes marked by H3K79me2 were significantly enriched in genesinduced during the epithelial to mesenchymal transition (EMT) (377genes; Supplementary Fig. 17a).Among the 348 genes that showed reduced H3K79me2 6 days after

OSKM expression, we likewise found a significant enrichment of genesets associated with the induction of a mesenchymal state, includingSNAI2, TGFB2 and TGFBR1 (Supplementary Fig. 18a)17,18. Only a fewof these genes showed decreased expression at day 6 (12 out of 348),but the vast majority of them lacked this mark in the pluripotent state(272 out of the 348 devoid of H3K79me2 in ESCs), suggesting theywere destined for transcriptional silencing during reprogramming.This finding prompted us to ask whether DOT1L inhibition resultsin the removal of H3K79me2 from such fibroblast-specific, EMT-associated genes. UponDOT1L inhibitor treatment, H3K79me2 levelswere reduced on almost all loci, with the exception of a subsetcomprised mostly of housekeeping genes that also had high levels ofH3K79me2 in ESCs (Supplementary Fig. 19a). Strikingly, the genes

that lost proportionally the most H3K79me2 in inhibitor-treatedfibroblasts during reprogramming (eightfold or more) were againhighly enriched in genes induced in EMT (Supplementary Fig. 19b).Mesenchymal master regulators such as SNAI1, SNAI2, ZEB1, ZEB2and TGFB2 were among these genes (Fig. 4a)19. In the presence of theDOT1L inhibitor, these regulators were more strongly repressedduring reprogramming, whereas epithelial genes such as CDH1 andOCLN were more robustly upregulated (Fig. 4b). The extinction offibroblast gene expression was accompanied by increased depositionof the repressive H3K27me3 mark on the majority of fibroblast-specific regulators examined (Supplementary Fig. 20). In contrast,H3K27me3 was depleted to a greater extent on SOX2 and E-cadherinpromoters, reflecting their activation during reprogramming. Finally,the H3K27me3 status of master regulators of other lineages, such asOLIG2, MYOD1, NKX2-1 and GATA4, remained unchanged uponDOT1L inhibitor treatment, indicating that the deposition ofH3K27me3 was specific to fibroblast-specific regulators.To test the functional importance of downregulationofmesenchymal

regulators in the iDot1L-mediated enhancement of reprogramming,we overexpressed TWIST1, SNAI1 and ZEB1 or added soluble TGF-b2 to cells undergoing reprogramming in the presence of the DOT1Linhibitor. All of these perturbations significantly counteracted theenhancement observed with DOT1L inhibition (Fig. 4c). Interestingly,expression of these factors also abrogated the iDot1L-mediated upregu-lation of NANOG and LIN28, suggesting that the effect of DOT1L

a

b

SNAI1 SNAI2 ZEB1 ZEB2 TGFB2

ESCs

Fib

Fib_OSKM

Fib_iDot1L

Fib_iDot1L _OSKM

ES_H3K27me3

H3K

79m

e2

c

0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

SNAI1

SNAI2

ZEB1

ZEB2

CDH1

OCLN

SNAI1

SNAI2

ZEB1

ZEB2

CDH1

OCLN

Control iDot1LUnt

reat

ed

Vect

or

SNAI1

TWIST1

ZEB1

TGF-β2

Num

ber

of T

ra-1

-60+

col

onie

s

0

20

40

60

80

100

120

140

iDot1L

TGF-β2

ZEB1

TWIST1

SNAI1

Vect

or

Cntrl_

D0

Cntrl_

D60

0.002

0.004

0.006

0.008

0.010

0.012

0

0.02

0.18

0.16

0.14

0.12

0.04

0.06

0.08

0.10

Rel

ativ

e m

RN

A le

vels

Rel

ativ

e m

RN

A le

vels

Rel

ativ

e m

RN

A le

vels

NANOG

LIN28A

iDot1L

d

eD0

D6

D12

D15

TGF-β2

ZEB1

TWIST1

SNAI1

Vect

or

Cntrl_

D0

Cntrl_

D6

iDot1L

Figure 4 | Genome-wide analysis of H3K79me2 marks duringreprogramming. a, H3K79me2 ChIP-sequencing tracks (blue) for selectEMT-associated genes in fibroblasts (Fib) and ESCs along with thecorresponding H3K27me3 tracks in ESCs (red). b, Expression of EMT-associated transcription factors (EMT-TF) and epithelial genes in control andiDot1L-treated fibroblasts at the indicated time points during reprogramming.qPCR was normalized to uninfected fibroblasts for EMT-TFs and H1 ESCs forCDH1 and OCLN. c, Number of Tra-1-601 colonies derived from untreatedand iDot1L-treated (3.3mM) dH1f cells that are either infected with SNAI1,

TWIST1 or ZEB1 expression vectors or treated with soluble TGF-b2(2 ngml21) (n5 3; error bars,6 s.d.). Representative Tra-1-60-stained wellsare shown. d, qRT–PCR quantification of NANOG mRNA level on day 6 ofOSKM-expressing untreated or iDot1L-treated (3.3mM) fibroblasts expressingthe indicated EMT-factors. Expression levels were normalized to thoseobserved in H1 ESCs. e, qRT–PCR quantification of LIN28A mRNA level onday 6 of OSKM-expressing untreated or iDot1L-treated (3.3mM) fibroblastsexpressing the indicated EMT-factors. Expression levels were normalized tothose observed in H1 ESCs.

LETTER RESEARCH

2 9 M A R C H 2 0 1 2 | V O L 4 8 3 | N A T U R E | 6 0 1

Figs 7a–c and 12b). PCR on genomic DNA isolated from expandedcolonies confirmed the absence of integrated KLF4 and c-Myc trans-genes (Supplementary Fig. 12c). Thus, we were able to generate two-factor iPSCs either by suppression of DOT1L expression or chemicalinhibition of its methyltransferase activity.To gain insights into the molecular mechanisms of how DOT1L

inhibition promotes reprogramming and replaces KLF4we performedglobal gene-expression analyses on control and shDot1L fibroblastsbefore and 6 days after OSKM andOSM transduction, along with cellsthat were treated with iDot1L. Relatively few genes were differentiallyexpressed in shDot1L cells on day 6 of reprogramming (22 up, 23down; Supplementary Table 3). Inhibitor-treated cells showed broadergene expressionchanges (405upand175down; SupplementaryTable 3),presumably due to more complete inhibition of K79me2 levels (Fig. 3a).In the absence of KLF4, 94 genes were differentially upregulated inshDot1L cells; intersection of this set of genes with the set differentiallyupregulated in four-factor reprogramming of DOT1L-inhibited cellsyielded only five common genes (Fig. 3a, b). We were particularlyintrigued to find NANOG and LIN28 upregulated in all three instancesof DOT1L inhibition, because these two genes are part of the corepluripotency network of human ESCs14,15 and can reprogram human

fibroblasts into iPSCswhen used in combinationwithOCT4 and SOX2(ref. 16).We explored the possibility that NANOG and LIN28 upregula-

tionmight account for the enhanced reprogramming observed follow-ing DOT1L inhibition, and validated their upregulation in shDot1Lfibroblasts uponOSMorOS transduction (Supplementary Fig. 13a, b).Interestingly, at this early time point REX1 (also known as ZFP42) andDNMT3B, two other well-characterized pluripotency genes, were notupregulated, indicating that DOT1L inhibition does not broadlyupregulate the pluripotency network. Suppression of either Nanogor Lin28 abrogated the two-factor (OS) reprogramming of shDot1Lfibroblasts, indicating the essential roles of NANOG and LIN28 in thisprocess (Fig. 3c and Supplementary Fig. 13c). DOT1L inhibition alsoled to increased NANOG expression in the context of OCT4, SOX2and LIN28 (OSL) and LIN28 expression in the context ofOCT4, SOX2and NANOG (OSN) (Supplementary Fig. 14a). Furthermore, DOT1Linhibition significantly increased the efficiency of three-factor repro-gramming in the context of OSN and OSL (Supplementary Fig. 14b).Finally, inclusion ofNANOGand LIN28 in theOSKMreprogrammingcocktail did not confer any additional enhancement to shDot1L cells(Fig. 4d and Supplementary Fig. 14c). Taken together, these dataimplicate NANOG and LIN28 in the enhancement of reprogrammingand replacement of KLF4 and c-Myc with DOT1L inhibition.To gain insight into the genome-wide chromatin changes that are

facilitated by DOT1L inhibition, we performed chromatin immuno-precipitation followed byDNA sequencing (ChIP-seq) for H3K79me2and H3K27me3 in human ESCs as well as fibroblasts undergoing

0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

Fold

cha

nge

in T

ra-1

-60+

col

onie

s

shCnt

rl

shDot

1L-1

shDot

1L-2

shDot

1L-1

+Dot1L

_wt

shDot

1L-1

+Dot1L

_mut

n = 5

n = 5

n = 5

n = 3

n = 3

*

*

*

a

shCntrl

shDot1L

OSKM SKM OKMOSM OSK OS

iDot1L

UntreatedshCntrlshDot1LUntreatediDot1L

OSKM OSM OSK OS

Tra-

1-60

+ c

olon

ies

50

0

100

150

200

250

d

4.5

Untre

ated

1 μM

3.3 μM

10 μM

iDot1L

*

0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

4.0

5.0

Fold

cha

nge

in T

ra-1

-60+

col

onie

s

*

*

b c

Num

ber

of A

P+ c

olon

ies

der

ived

fr

om O

CT4

–GFP

ME

Fs

0

50

100

150

200

250

300

350

400

450*

Untre

ated

iDot

1L

ef

Figure 2 | DOT1L inhibition enhances reprogramming efficiency andsubstitutes for KLF4 andMyc. a, Fold change in the reprogramming efficiencyof dH1f cells infected with two independentDOT1L shRNAs or co-infected withshRNA-1 and a vector expressing an shRNA-resistant wild-type or catalyticallydead mutant DOT1L. Data correspond to the average and s.e.m.;n5 independent experiments. *P, 0.01 compared to control shRNA-expressing fibroblasts. b, Fold change in the reprogramming efficiency of dH1fcells treated with iDot1L at the indicated concentrations for 21days. Datacorrespond to the mean6 s.d.; n5 3. *P, 0.001 compared to untreatedfibroblasts. c, Number of alkaline-phosphatase-positive (AP1) colonies derivedfrom OSKM-transduced untreated or iDot1L-treated (10mM) OCT4–GFPMEFs. *P, 0.001 compared untreated MEFs (n5 4; error bars,6 s.d.).Representative AP-stained wells are shown. d, Tra-1-60 stained of plates ofshCntrl and shDot1L fibroblasts in the absence of each factor or both KLF4 andc-Myc. e, Tra-1-60-stained plates of untreated and iDot1L treated (3.3mM)fibroblasts in the absenceof each factor or bothKLF4andc-Myc. f,Quantificationof the Tra-1-601 colonies in Fig. 2d, e representing mean and s.d. of twoindependent experiments done in triplicate.

a

shDot1L_OSKM (22)((22)

L_M

iDot1L_OSKM (405)

(2

shDot1L_OSM (94)

CDO1CHST15COL11A1LEFTY1LEFTY2LIN28ALUM

NANOGPROM1SCG2UPP1

ARL6IP1CADM1INHBAINHBBLEFTY1LIN28ALUM

NANOGNPPBPMEPA1RUNX3UPP1

LEFTY1LIN28ALUM

NANOGUPP1

oKM

b

c d

CDO1CHST15COLL11A1LEFTY1LEFTY2LIN28ALUMNANOGPROM1SCG2UPP1

Unt

reat

ed-b

iore

p1

Unt

reat

ed-b

iore

p2

iDo

t1L-

bio

rep

1iD

ot1

L-b

iore

p2

shC

ntrl-

bio

rep

1sh

Cnt

rl-b

iore

p2

shC

ntrl-

bio

rep

3sh

Do

t1L-

bio

rep

1sh

Do

t1L-

bio

rep

2sh

Do

t1L-

bio

rep

3

0

20

40

60

80

100

120

Cntrl shLin28A shNanog

Num

ber

of T

ra-1

-60+

col

onie

s

shDot1L + OS

0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

shCntrl shCntrl+ N2L

shDot1L+ N2L

shDot1L

Fold

cha

nge

in T

ra-1

-60+

col

onie

s

Figure 3 | NANOG and LIN28 are required for enhancement ofreprogramming by DOT1L inhibition. a, Overlap of differentiallyupregulated genes in shDot1L cells 6 days post-OSKM and OSM transductionwith the genes upregulated in OSKM-transduced iDot1L-treated cells. b, Heatmaps showing differential expression levels of commonly upregulated genes inOSKM-transduced DOT1L-inhibited cells. c, Number of Tra-1-601 iPSCcolonies upon knockdown of Nanog or Lin28 in 2-factor reprogramming ofshDot1L cells. Data represent mean and s.e.m of 2 independent experimentsdone in triplicate. d, Fold-change in Tra-1-601 iPSC colonies in 4-factor(OSKM) and 6-factor (OSKMNL) reprogramming of shCntrl and shDot1Lfibroblasts. Data represent mean and s.e.m. of two independent experimentsdone in duplicate. Representative Tra-1-60-stained wells are shown above.

RESEARCH LETTER

600 | NATURE | VOL 483 | 29 MARCH 2012

and differentiated into all three embryonic germ layers in vitro and in teratomas (Supplementary Fig. 7a–c). Therefore, iPSCs generated following DOT1L inhibition display all of the hallmarks of pluripotency.

We next assessed DOT1L inhibition in murine reprogramming. iDot1L treatment led to threefold enhancement of reprogramming of mouse embryonic fibroblasts carrying an OCT4-GFP (green fluorescent protein) reporter gene (OCT4–GFP MEFs; Fig. 2c). Reprogramming of tail-tip fibroblasts (TTFs) derived from a conditional knockout DOT1L mouse strain yielded significantly more iPSC colonies upon deletion of DOT1L12 (Supplementary Fig. 8a). Cre-mediated excision of both floxed DOT1L alleles in iPSC clones derived from homozygous TTFs was confirmed by genomic PCR (Supplementary Fig. 8b). DOT1L inhibition also increased reprogramming efficiency of MEFs and peripheral blood cells derived from an inducible secondary iPSC mouse strain13 (Supplementary Fig. 8c, d). Taken together, these results demonstrate that DOT1L inhibition enhances reprogramming of both mouse and human cells.

We next examined the cellular mechanisms by which DOT1L inhibition promotes reprogramming. DOT1L inhibition affected neither retroviral transgene expression nor cellular proliferation (Supplementary Fig. 9a–c). Although previous studies indicated that DOT1L-null cells have increased apoptosis and accumulation of cells in G2 phase9, we failed to observe a significant increase in apoptosis or change in the cell cycle profile of DOT1L-inhibited fibroblasts (Supplementary Fig. 9d, e). In human iPSC clones derived from shDot1L fibroblasts, DOT1L inhibition was no longer evident, reflecting the known silencing of retroviruses that occurs during reprogramming (Supplementary Fig. 10a). Quantitative PCR (qPCR) analysis revealed that the silencing occurred by day 15 after OSKM transduction (Supplementary Fig. 10b, c). To define the crucial time window for DOT1L inhibition, we treated fibroblasts with iDot1L at 1-week intervals during reprogramming. iDot1L treatment in either the first or second week was sufficient to enhance reprogramming, whereas treatment in the third week or a 5-day pretreatment had no effect (Supplementary Fig. 10d, e). Immunofluorescence analysis revealed significantly greater numbers of Tra-1-60-positive cell clusters on day 10 and day 14 in shDot1L cultures (Supplementary Fig. 11a, b), indicating that the emergence of iPSCs is accelerated upon DOT1L inhibition. When we extended the reprogramming experiments by 10 more days, shDot1L cells still yielded more iPSC colonies than controls (Supplementary Fig. 11c). Taken together, these findings indicate that DOT1L inhibition acts in early to middle stages to accelerate and increase the efficiency of the reprogramming process.

To assess whether DOT1L inhibition could replace any of the reprogramming factors, we infected control and DOT1L-inhibited fibroblasts with three factors, omitting one factor at a time. In the absence of OCT4 or SOX2 no iPSC colonies emerged (Fig. 2d). When we omitted either KLF4 or c-Myc, DOT1L-inhibited fibroblasts gave rise to robust numbers of Tra-1-60-positive colonies, whereas control cells generated very few colonies, as reported previously4 (Fig. 2d–f and Supplementary Fig. 12a). Importantly, DOT1L-inhibited fibroblasts transduced with only OCT4 and SOX2 gave rise to Tra-1-60-positive colonies, whereas control fibroblasts did not (Fig. 2d–f). These two-factor iPSCs showed typical ESC morphology, silenced the reprogramming vectors and had all of the hallmarks of pluripotency as gauged by endogenous pluripotency factor expression and the ability to form all three embryonic germ layers in vitro and in teratomas (Supplementary

[Figure 1 panels. a, Timeline for dH1f fibroblasts running from day −6 to day 21, with the steps shRNA infection, re-seeding, OSKM transduction, plating on MEFs and Tra-1-60 staining. b, y axis, number of Tra-1-60+ colonies; shRNA pools, SUV39H1, YY1, DOT1L, DNMT3A, MECP2, NR2F1, DNMT1, SMYD2, CNTRL, MBD2, MBD4, EZH1, SUV39H2, MBD1, G9A, MBD3, SETDB1, OCT4, BMI1, RING1, SUZ12, EHMT1, EZH2 and EED. c, y axis, fold change in Tra-1-60+ colonies; conditions, shCntrl, shSETDB1, shBmi1, shRing1, shEed, shEzh2 and shSuz12. d, y axis, fold change in Tra-1-60+ colonies; conditions, shCntrl, shSuv39H1, shYY1 and shDot1L.]

Figure 1 | Screening for inhibitors and enhancers of reprogramming. a, Timeline of shRNA infection and iPSC generation. b, Number of Tra-1-60+ colonies 21 days after OSKM transduction of 25,000 dH1f cells previously infected with pools of shRNAs against the indicated genes. Representative Tra-1-60-stained reprogramming wells are shown. The dotted lines indicate 3 standard deviations from the mean number of colonies in control wells. c, Validation of primary screen hits that decrease reprogramming efficiency. Fold change in Tra-1-60+ iPSC colonies relative to control cells. *P < 0.05, **P < 0.01 compared to control shRNA-expressing fibroblasts (n = 4; error bars, ± s.e.m.). Representative Tra-1-60-stained wells are shown. d, Validation of primary screen hits that increase reprogramming efficiency. Fold change in Tra-1-60+ iPSC colonies relative to control cells. *P < 0.05, **P < 0.01 compared to control shRNA-expressing fibroblasts (n = 4; error bars, ± s.e.m.). Representative Tra-1-60-stained wells are shown.
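The dotted-line rule described for panel b (three standard deviations from the mean colony count in control wells) can be sketched in a few lines of Python. This is only an illustration of the thresholding idea, not the authors' analysis: the control counts and pool names are invented placeholders, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Hypothetical Tra-1-60+ colony counts from control wells (not data from the paper).
control_counts = [60, 64, 58, 62, 56]

# Hypothetical colony counts for three shRNA pools (placeholder names).
pool_counts = {"pool_A": 120, "pool_B": 20, "pool_C": 63}


def call_hits(controls, pools, n_sd=3.0):
    """Label each pool relative to mean(controls) +/- n_sd * sample s.d."""
    centre = mean(controls)
    spread = stdev(controls)  # sample standard deviation (an assumption)
    low, high = centre - n_sd * spread, centre + n_sd * spread
    calls = {}
    for name, count in pools.items():
        if count > high:
            calls[name] = "more colonies than control"
        elif count < low:
            calls[name] = "fewer colonies than control"
        else:
            calls[name] = "within control range"
    return (low, high), calls


(low, high), calls = call_hits(control_counts, pool_counts)
print(f"control band: {low:.1f} to {high:.1f}")
for name, call in calls.items():
    print(name, "->", call)
```

Pools falling outside the band would be the candidates carried forward to the validation experiments of panels c and d; pools inside the band are treated as indistinguishable from control.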


ARTICLE
doi:10.1038/nature11213

Novel mutations target distinct subgroups of medulloblastoma

Giles Robinson1,2,3*, Matthew Parker1,4*, Tanya A. Kranenburg1,2*, Charles Lu1,5, Xiang Chen1,4, Li Ding1,5,6, Timothy N. Phoenix1,2, Erin Hedlund1,4, Lei Wei1,4,7, Xiaoyan Zhu1,2, Nader Chalhoub1,2, Suzanne J. Baker1,2, Robert Huether1,4,8, Richard Kriwacki1,8, Natasha Curley1,2, Radhika Thiruvenkatam1,2, Jianmin Wang1,9, Gang Wu1,4, Michael Rusch1,4, Xin Hong1,5, Jared Becksfort1,9, Pankaj Gupta1,9, Jing Ma1,7, John Easton1,4, Bhavin Vadodaria1,4, Arzu Onar-Thomas1,10, Tong Lin1,10, Shaoyi Li1,10, Stanley Pounds1,10, Steven Paugh1,11, David Zhao1,9, Daisuke Kawauchi1,12, Martine F. Roussel1,12, David Finkelstein1,4, David W. Ellison1,7, Ching C. Lau1,13, Eric Bouffet1,14, Tim Hassall1,15, Sridharan Gururangan1,16, Richard Cohn1,17, Robert S. Fulton1,5,6, Lucinda L. Fulton1,5,6, David J. Dooling1,5,6, Kerri Ochoa1,5,6, Amar Gajjar1,3, Elaine R. Mardis1,5,6,18, Richard K. Wilson1,5,6,19, James R. Downing1,7, Jinghui Zhang1,4 & Richard J. Gilbertson1,2,3

Medulloblastoma is a malignant childhood brain tumour comprising four discrete subgroups. Here, to identify mutations that drive medulloblastoma, we sequenced the entire genomes of 37 tumours and matched normal blood. One-hundred and thirty-six genes harbouring somatic mutations in this discovery set were sequenced in an additional 56 medulloblastomas. Recurrent mutations were detected in 41 genes not yet implicated in medulloblastoma; several target distinct components of the epigenetic machinery in different disease subgroups, such as regulators of H3K27 and H3K4 trimethylation in subgroups 3 and 4 (for example, KDM6A and ZMYM3), and CTNNB1-associated chromatin re-modellers in WNT-subgroup tumours (for example, SMARCA4 and CREBBP). Modelling of mutations in mouse lower rhombic lip progenitors that generate WNT-subgroup tumours identified genes that maintain this cell lineage (DDX3X), as well as mutated genes that initiate (CDH1) or cooperate (PIK3CA) in tumorigenesis. These data provide important new insights into the pathogenesis of medulloblastoma subgroups and highlight targets for therapeutic development.

Medulloblastoma is the most common malignant childhood brain tumour1. The disease includes four subgroups (sonic hedgehog (SHH) subgroup, WNT subgroup, subgroup 3 and subgroup 4), defined primarily by gene expression profiling, that show differences in karyotype, histology and prognosis2. Studies of genetically engineered mice show that these tumours arise from different cell types: SHH-subgroup medulloblastomas develop from committed cerebellar granule neuron progenitors (GNPs) in Ptch1+/− mice3,4; WNT-subgroup tumours are generated by lower rhombic lip progenitors (LRLPs) in Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53flx/flx mice5; whereas subgroup-3 medulloblastomas probably arise from an undefined class of cerebellar progenitors6. The identification of medulloblastoma subgroups has not changed clinical practice. All patients currently receive the same combination of surgery, radiation and chemotherapy. This aggressive treatment fails to cure two thirds of patients with subgroup-3 disease, and probably over-treats children with WNT-subgroup medulloblastoma who invariably survive with long-term cognitive and endocrine side effects2,7. Drugs targeting the genetic alterations that drive each medulloblastoma subgroup could prove more effective and less toxic, but the identity of these alterations remains largely unknown.

The genomic landscape of medulloblastoma

To identify genetic alterations that drive medulloblastoma, we performed whole-genome sequencing (WGS) of DNA from 37 tumours and matched normal blood (discovery cohort). Tumours were subgrouped by gene expression (WNT subgroup, n = 5; SHH subgroup, n = 5; subgroup 3, n = 6; subgroup 4, n = 19; ‘unclassified’ (profiles not available), n = 2; Fig. 1, Supplementary Figs 1–3 and Supplementary Table 1). Validation of all putative somatic alterations including single nucleotide variations (SNVs), insertion/deletions (indels) and structural variations (SVs) identified by CREST8, was conducted for 12 tumours using custom capture arrays and Illumina-based DNA sequencing (Supplementary Table 2). Putative coding alterations and SVs were validated in the remaining 25 discovery cohort cases by polymerase chain reaction (PCR) and Sanger-based sequencing. Mutation frequency was determined in a separate ‘validation cohort’ of 56 medulloblastomas (WNT subgroup, n = 6; SHH subgroup, n = 8; subgroup 3, n = 11; subgroup 4, n = 19; unclassified, n = 12; Fig. 1 and Supplementary Table 1).

WGS of the discovery cohort detected 22,887 validated or high-quality somatic sequence mutations (SNVs and indels), 536 validated or curated SVs, and 5,802 copy number variations (CNVs; 92%

*These authors contributed equally to this work.

1St Jude Children’s Research Hospital, Washington University Pediatric Cancer Genome Project, Memphis, Tennessee 38105, USA. 2Department of Developmental Neurobiology, St Jude Children’s Research Hospital, Memphis, Tennessee 38105, USA. 3Department of Oncology, St Jude Children’s Research Hospital, Memphis, Tennessee 38105, USA. 4Department of Computational Biology and Bioinformatics, St Jude Children’s Research Hospital, Memphis, Tennessee 38105, USA. 5The Genome Institute, Washington University School of Medicine in St Louis, St Louis, Missouri 63108, USA. 6Department of Genetics, Washington University School of Medicine in St Louis, St Louis, Missouri 63108, USA. 7Department of Pathology, St Jude Children’s Research Hospital, Memphis, Tennessee 38105, USA. 8Department of Structural Biology, St Jude Children’s Research Hospital, Memphis, Tennessee 38105, USA. 9Department of Information Sciences, St Jude Children’s Research Hospital, Memphis, Tennessee 38105, USA. 10Department of Biostatistics, St Jude Children’s Research Hospital, Memphis, Tennessee 38105, USA. 11Department of Pharmaceutical Sciences, St Jude Children’s Research Hospital, Memphis, Tennessee 38105, USA. 12Department of Tumour Biology and Genetics, St Jude Children’s Research Hospital, Memphis, Tennessee 38105, USA. 13Texas Children’s Cancer and Hematology Centers, 6701 Fannin Street, Ste. 1420, Houston, Texas 77030, USA. 14The Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8, Canada. 15The Royal Children’s Hospital, 50 Flemington Road, Parkville, Victoria 3052, Australia. 16Duke University Medical Center, 102382, Durham, North Carolina 27710, USA. 17The School of Women’s and Children’s Health, University of New South Wales, Kensington, New South Wales NSW 2052, Australia. 18Siteman Cancer Center, Washington University School of Medicine in St Louis, St Louis, Missouri 63108, USA. 19Department of Medicine, Washington University School of Medicine in St Louis, St Louis, Missouri 63108, USA.

2 AUGUST 2012 | VOL 488 | NATURE | 43

First published in Nature 488, 43–48 (2012); doi: 10.1038/nature11213


concordant with 6.0 SNP mapping arrays; Supplementary Tables 3–6 and Supplementary Figs 4–7). In all but five tumours with the highest mutation rates, >50% of SNVs were C→T/G→A transitions (Supplementary Fig. 8). The mean missense:silent mutation ratio was 3.6:1 and 40% of all missense mutations were predicted to be deleterious, suggesting a selective pressure for SNVs that affect protein coding (Supplementary Table 5). Global patterns of total SNVs and amplifications varied significantly among medulloblastoma subgroups, even when corrected for age and sex, supporting the notion that these tumours are distinct pathological entities (Fig. 1 and Supplementary Fig. 6). Custom capture-based analysis of the allele frequency of all somatic mutations in 12 medulloblastomas allowed us to predict the ancestry of certain genetic alterations, suggesting that aneuploidy precedes widespread sequence mutation in medulloblastomas with highly mutated genomes (Supplementary Figs 9–11).

Novel CNVs and SVs are rare in medulloblastoma

The repertoire of focally amplified or deleted genes seems to be very limited in medulloblastoma. We detected expected2 gains of MYC, MYCN and OTX2 in subgroups 3 and 4, but no novel recurrent amplifications (Fig. 1, Supplementary Fig. 12 and Supplementary Table 7). In keeping with recent reports9, high-level amplification of MYCN in subgroup-3 sample no. 16 (sample numbering as in Fig. 1) was generated by chromothripsis; although chromothripsis was observed infrequently (n = 2/37 of the discovery cohort; Supplementary Fig. 13).

Focal homo- or heterozygous deletions of genes previously implicated in medulloblastoma were also detected (for example, PTCH1, PTEN; Fig. 1)10,11 but novel recurrent focal deletions were rare. Three subgroup-4 tumours (nos 11–13) and one unclassified tumour deleted DDX31, AK8 and TSC1 at chromosome 9q34.14 in concert with OTX2 amplification, suggesting that these alterations are cooperative (P < 0.0005, Fisher’s exact test). The breakpoint in this deletion occurs in DDX31, and two samples contained a missense mutation (subgroup 4, no. 15) and complex rearrangement (unidentified case SJMB026) in this gene, suggesting that DDX31 is the target of these alterations (Supplementary Fig. 14).

Over 50% of SVs detected by WGS broke the coding region of at least one gene, but less than 2% (n = 6/314, excluding two tumours with excessive SVs) encode potential in-frame fusion proteins (Supplementary Fig. 15); none affect the same gene or signal pathway. Therefore, fusion proteins are likely to be an uncommon transforming mechanism in medulloblastoma.

Although germline mutations in TP53, PTCH1, APC and CREBBP predispose to medulloblastoma11–14, only 23 mutations previously associated with cancer were detected in discovery cohort germ lines. Only one of these, in a known case of Turcot’s syndrome, was accompanied by a somatic mutation (germline APC Y935*/somatic deletion; WNT subgroup no. 11; Supplementary Table 8). Thus, inherited forms of medulloblastoma seem to be rare in our cohort.

Novel mutations in medulloblastoma subgroups

Because SVs and CNVs are unlikely to drive most medulloblastomas, we investigated whether recurrent (more than two samples) somatic SNVs and/or indels might target discrete genes and pathways. This analysis identified 49 genes, across all 93 tumours, which were targeted by non-silent, recurrent, somatic mutations; 84% (n = 41/49) have not yet been implicated in medulloblastoma (Supplementary Tables 9 and 10). Several of these congregated in disease subgroups and converged on specific cell pathways (Fig. 1, Supplementary Fig. 8 and Supplementary Table 11).

Histone methylation is deregulated in subgroups 3 and 4

The H3K27 trimethyl mark (H3K27me3) represses lineage-specific genes in stem cells15 (Supplementary Fig. 8). H3K27me3 is written by the polycomb repressive complex 2 (PRC2) that includes the methylase EZH2 (refs 16, 17) and is erased during differentiation by the demethylase KDM6A18. As H3K27me3 is erased, chromatin remodellers recruited to H3K4me3 promote differentiation, for example, CHD7 (refs 19, 20). This process is tightly controlled during development and deregulated in cancers; EZH2 is mutated in lymphomas21 and upregulated in breast22 and prostate23 cancer, while biallelic inactivation of KDM6A (chromosome Xp11.2) or KDM6A and its paralogue UTY (chromosome Yq11), occurs in adult female and male cancers, respectively24.

Hypergeometric distribution analyses revealed selective mutation of histone modifiers in subgroup-3 and -4 medulloblastomas (Supplementary Table 11). Six subgroup-4, one subgroup-3, and

[Figure 1 layout. Columns: WNT (samples 1–11), SHH (1–13), subgroup 3 (1–17) and subgroup 4 (1–38). Clinical rows: age, sex, histology, stage, outcome. Chromosome rows: monosomy 6, 7q, 9p, 9q, 10p, 10q, 16q, 17p, 17q, Y. Further rows: nCTNNB1 (IHC), cohort. Gene rows: CTNNB1, CDH1, DDX3X, SMARCA4, CREBBP, TRRAP, MED13, PIK3CA, SUFU, PTCH1, PTEN, TP53, MLL2, GABRG1, MYCN, MYC, OTX2, DDX31, KDM6A, ZMYM3, CHD7, KDM1A, KDM3A, KDM4C, KDM5A, KDM5B, KDM7A. Colour key: age (<5 yr, >5 yr) and sex (F, M); stage; outcome (disease free, progression, dead); histology (melanotic, classic, desmoplastic, anaplastic); chromosome (balance, loss, gain, ND); cohort (WGS, valid.); nCTNNB1 (−, +, ND); alteration type (wild type, missense, nonsense/FS, splice site, focal del., focal amp., homo del., microdel.). Right-hand columns: ER and FDR.]

Figure 1 | The genomic landscape of medulloblastoma. Top, clinical, histological, gross chromosomal, nuclear CTNNB1 (nCTNNB1) and cohort (discovery or validation) details of 79 medulloblastomas by subgroup. ER, enrichment. Bottom, genetic alterations detected in 27 genes of particular interest. Colour key at top right. ANOVA (continuous) or Fisher’s exact test (categorical) P value is shown on right. False discovery rate (FDR) estimates of each mutation are shown on right. Slash indicates loss or mutation of wild-type allele, including X chromosome in males. ***P < 0.0005; **P < 0.005; *P < 0.05; NS, not significant. F, female; M, male. amp., amplification; del., deletion; microdel., microdeletion; valid., validation cohort. ND, not done.
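The cohort sizes quoted in the text can be cross-checked against the 79 tumours shown in this figure with a short tally. The sketch below uses only the subgroup counts reported for the discovery and validation cohorts; the reading that the figure displays the tumours with an assigned subgroup (that is, all tumours minus the unclassified ones) is an inference from the arithmetic, not a statement made by the authors.

```python
# Subgroup sizes as reported in the text for the two cohorts.
discovery = {"WNT": 5, "SHH": 5, "subgroup 3": 6, "subgroup 4": 19, "unclassified": 2}
validation = {"WNT": 6, "SHH": 8, "subgroup 3": 11, "subgroup 4": 19, "unclassified": 12}

# Combine the cohorts subgroup by subgroup.
combined = {name: discovery[name] + validation[name] for name in discovery}

total = sum(combined.values())
subgrouped = total - combined["unclassified"]

print("discovery cohort:", sum(discovery.values()))
print("validation cohort:", sum(validation.values()))
print("all tumours:", total)
print("tumours with an assigned subgroup:", subgrouped)
```

The combined per-subgroup totals (11 WNT, 13 SHH, 17 subgroup 3, 38 subgroup 4) also match the sample numbering used along the top of the figure.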


chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.

With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense mutations. One such gene was MEF2B, which had not previously been linked to lymphoma. We found that 20 (15.7%) cases had MEF2B cSNVs and 4 (3.1%) cases had MEF2C cSNVs. All cSNVs detected by RNA-seq affected either the MADS box or MEF2 domains. To determine the frequency and scope of MEF2B mutations, we Sanger-sequenced exons 2 and 3 in 261 primary FL samples; 259 DLBCL primary tumours; 17 cell lines; 35 cases of assorted NHL (IBL, composite FL and PBMCL); and eight non-malignant centroblast samples. We also used a capture strategy (Supplementary Methods) to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. Of the variants 55 (80%) affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table 11; Fig. 3b). Each patient generally had a single MEF2B variant and we observed relatively few (eight in total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table 12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).

[Figure 2 heat map. Genes along both axes: BCL2s, EZH2, GNA13, SGK1, BCL2, TP53, MEF2B, TNFRSF14, FOXO1, MLL2, CREBBP, EP300, B2M, IRF8, BTG2, BTG1, KLHL6, CCND3, HIST1H1C, ETS1, STAT3, CD70, CD58, TMEM30A, FAS, CARD11, TNFAIP3, BCL6s, CD79B, MYD88. Side annotations: ABC enrichment and GCB enrichment; number of cases (axis ticks 10 to 40) in ABC, GCB, U and FL; shading bins for P values of <0.05, 0.1–0.05 and 0.3–0.1.]

Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher’s exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher’s exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).
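The colouring rule in this caption (take the smaller of the left- and right-tailed Fisher's exact P values, show a trend at P ≤ 0.3 and the darkest shade at P ≤ 0.05) can be illustrated with the Python standard library. This is a sketch under stated assumptions rather than the authors' code: the function names are invented for the example, the counts are hypothetical, and the mapping of the left tail to mutual exclusion and the right tail to co-occurrence follows from how the 2x2 table is laid out here.

```python
from math import comb


def fisher_tails(both, only_a, only_b, neither):
    """Left- and right-tailed Fisher's exact P values for a 2x2 co-mutation table."""
    row1 = both + only_a        # cases with gene A mutated
    row2 = only_b + neither     # cases with gene A wild type
    col1 = both + only_b        # cases with gene B mutated
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric weights for every possible number of co-mutated cases.
    weight = {k: comb(row1, k) * comb(row2, col1 - k) for k in range(lo, hi + 1)}
    left = sum(w for k, w in weight.items() if k <= both) / total    # as few or fewer co-mutated
    right = sum(w for k, w in weight.items() if k >= both) / total   # as many or more co-mutated
    return left, right


def heatmap_call(both, only_a, only_b, neither, trend=0.3, significant=0.05):
    """Return the minimum tail P value and a label following the caption's thresholds."""
    left, right = fisher_tails(both, only_a, only_b, neither)
    p = min(left, right)
    if p > trend:
        return p, "no trend"
    direction = "mutual exclusion" if left < right else "co-occurrence"
    strength = "significant" if p <= significant else "trend"
    return p, f"{direction} ({strength})"


# Hypothetical counts for two genes across 40 cases (not taken from the paper).
p, call = heatmap_call(both=0, only_a=10, only_b=10, neither=20)
print(f"P = {p:.3f}: {call}")
```

With these invented counts the two genes are never mutated together, so the left tail is the smaller one and the pair would be drawn as a dark blue square.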

[Figure 3 panels. a, MLL2 (scale 0–5,500 bp) with annotated domains PHD, PHD, HMG box, COG5141, FYRN, FYRC and SET. b, MEF2B (scale 0–350) with the MADS box and MEF2 domains and labelled variants K4E, Y69H, Y69C, N81K, N81Y, D83G, D83A and D83V.]

Figure 3 | Summary and effect of somatic mutations affecting MLL2 and MEF2B. a, Re-sequencing the MLL2 locus in 89 samples revealed mainly nonsense (red circles) and frameshift-inducing indel mutations (orange triangles; inverted triangles for insertions and upright triangles for deletions). A smaller number of non-synonymous somatic mutations (green circles) and point mutations or deletions affecting splice sites (yellow stars) were also observed. All of the non-synonymous point mutations affected a residue within either the catalytic SET domain, the FYRC domain (FY-rich carboxy-terminal domain) or PHD zinc finger domains. The effect of these splice-site mutations on MLL2 splicing was also explored (Supplementary Figure 7). b, The cSNVs and somatic mutations found in MEF2B in all FL and DLBCL cases sequenced are shown with the same symbols. Only the amino acids with variants in at least two patients are labelled. cSNVs were most prevalent in the first two protein-coding exons of MEF2B (exons 2 and 3). The crystal structure of MEF2 bound to EP300 supports the idea that two of the mutated sites (L67 and Y69) are important in the interaction between these proteins (Supplementary Figure 8 and Supplementary Discussion)50.

Table 2 | Summary of types of MLL2 somatic mutations

Sample type                    FL     DLBCL  DLBCL cell line  Centroblast
Truncation                     18     4      7                0
Indel with frameshift          22     8      6                0
Splice site                    4      2      0                0
SNV                            3      2      2                0
Any mutation/number of cases   31/35  12/37  10/17            0/8
Percentage                     89     32     59               0
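The percentage row of Table 2 follows directly from the "any mutation/number of cases" row. A minimal check, using only the counts printed in the table and assuming the percentages were rounded to the nearest whole number:

```python
# "Any mutation / number of cases" row of Table 2, as (mutated, total) pairs.
mutated_cases = {
    "FL": (31, 35),
    "DLBCL": (12, 37),
    "DLBCL cell line": (10, 17),
    "Centroblast": (0, 8),
}

# Percentage of cases carrying at least one MLL2 mutation, rounded to an integer.
percentages = {
    sample: round(100 * mutated / total)
    for sample, (mutated, total) in mutated_cases.items()
}

for sample, pct in percentages.items():
    print(f"{sample}: {pct}%")
```

Note that the per-type rows above it count mutations rather than cases, so they are not expected to sum to the number of mutated cases.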

ARTICLE RESEARCH

18 AUGUST 2011 | VOL 476 | NATURE | 301


one unclassified medulloblastoma contained novel inactivating mutations in KDM6A (Figs 1 and 2 and Supplementary Figs 8 and 16). The single female with a KDM6A splice-site mutation showed a deletion of the second allele that escapes X inactivation25 (subgroup 4, no. 15), and 57% (n = 4/7) of KDM6A-mutant male medulloblastomas deleted chromosome Y, compared with only 6% (n = 3/51) of male, KDM6A wild-type tumours (P < 0.005, Fisher’s exact test; Fig. 1). Thus, a two-hit model of KDM6A-UTY tumour suppression seems to operate in subgroup-4 medulloblastomas. Notably, mutations in six other KDM family members (KDM1A, KDM3A, KDM4C, KDM5A, KDM5B and KDM7A) were detected exclusively in subgroup-3 and -4 tumours, implicating broad disruption of lysine demethylation in these medulloblastomas (Fig. 1, Supplementary Table 11 and Supplementary Fig. 16).

Subgroup-3 and -4 medulloblastomas also gained and overexpressed EZH2 (chromosome 7q35-34), which writes H3K27me3, and contained novel inactivating mutations in effectors and regulators of the H3K4me3 mark26 (Fig. 2a and Supplementary Fig. 8). Gain of chromosome 7q was significantly enriched among subgroup-3 and -4 medulloblastomas (P < 0.005, Fisher’s exact test) and correlated directly with EZH2 expression. Indeed, EZH2 was the eighth most significantly overexpressed gene on chromosome 7 among subgroup-3 and -4 medulloblastomas that gained chromosome 7q relative to those with diploid chromosome 7 (P < 0.005, Bonferroni correction). Nonsense and frameshift mutations were detected in CHD7 in four subgroup-3 and -4 tumours. ZMYM3 (chromosome Xq13.1), which participates in a protein complex with KDM1A to regulate gene expression at the H3K4me3 mark27, was targeted by novel frameshift, nonsense and missense mutations in three male subgroup-4 medulloblastomas. All three tumours with mutations in ZMYM3 also mutated KDM6A (subgroup 4, nos 19, 20) or KDM1A (subgroup 4, no. 21), suggesting that these alterations are cooperative. Remarkably, KDM6A, CHD7 and ZMYM3 mutations were confined to subgroups 3 and 4, and clustered in samples with sub-median EZH2 expression levels (Fig. 2a; P < 0.05, Fisher’s exact test). These data suggest that subgroup-3 and -4 medulloblastomas retain a stem-like epigenetic state by aberrantly writing (EZH2 upregulation) or preserving (KDM6A-UTY inactivation) H3K27me3, or disrupting H3K4me3-associated transcription (CHD7 and ZMYM3 inactivation). Indeed, human and mouse subgroup-3 and -4 medulloblastomas contained significantly more H3K27me3 than did WNT- or SHH-subgroup tumours (Fig. 2b). Thus, gain of EZH2 and loss of KDM6A probably maintains H3K27me3 in subgroup-3 and -4 medulloblastomas.

Finally, we looked to see if the differential expression of H3K27me3 among medulloblastoma subgroups reflects ancestral chromatin marking in the progenitors that generate these tumours (Fig. 2b). Relatively low levels of H3K27me3 were detected in LRLPs and committed GNPs, which generate WNT- and SHH-subgroup medulloblastomas, respectively3–5, potentially explaining why mutations that preserve this epigenetic mark are absent from these tumours. We recently showed that subgroup-3 medulloblastomas arise from a rare fraction of cerebellar progenitors6. We are currently investigating whether these progenitors are found among the H3K27me3-positive cells seen in the external germinal layer (Fig. 2b).
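The chromosome Y comparison reported above (4 of 7 KDM6A-mutant male tumours deleted chromosome Y, against 3 of 51 KDM6A wild-type male tumours) is a 2x2 table suited to Fisher's exact test. The sketch below recomputes a P value from those counts with the standard library only. The two-sided convention used here (summing all tables no more probable than the observed one) is an assumption, since the text does not state which tail convention the authors applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalised hypergeometric weights for every table with the same margins.
    weights = {k: comb(row1, k) * comb(row2, col1 - k) for k in range(lo, hi + 1)}
    observed = weights[a]
    # Sum the tables that are no more probable than the observed one.
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / comb(row1 + row2, col1)


# Rows: KDM6A mutant, KDM6A wild type (male tumours only).
# Columns: chromosome Y deleted, chromosome Y retained.
p_value = fisher_exact_two_sided(4, 3, 3, 48)
print(f"P = {p_value:.4f}")
```

Under this convention the result falls below the P < 0.005 bound quoted in the text.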

Novel mutations in WNT-subgroup medulloblastomas

WNT-subgroup medulloblastomas contained mutations in epigenetic regulators that are different to those seen in subgroup-3 and -4 disease. CTNNB1, the principal effector of the WNT pathway, forms a transcription factor with the T-cell factor/lymphoid enhancer factor (TCF/LEF)28. The carboxy terminus of CTNNB1 then recruits a series of protein complexes that remodel chromatin and promote transcription at WNT-responsive genes (Supplementary Fig. 8). These include: histone acetyltransferases (for example, CREBBP and TRRAP–TIP60 complexes)28,29; ATPases of the SWI/SNF family (for example, SMARCA4)30; and the mediator complex that coordinates RNA polymerase II placement (for example, MED13)31. As expected, >70% (n = 8/11) of WNT-subgroup medulloblastomas contained mutations that stabilize CTNNB1 (Fig. 1 and Supplementary Fig. 8; P < 0.0001, Fisher’s exact test)32,33. A single subgroup-3 case (no. 5)

[Figure 2 panels. a, Rows for WNT, SHH and subgroup-3 and -4 tumours: Chr 7 (p and q arms) copy number score; EZH2 expression (log ratio); KDM6A, CHD7 and ZMYM3 status (mutant or wild type); H3K27me3 immunohistochemistry with colorimetry values (×10³ a.u.), including example cases WNT7, SHH7 and GP4-7. Panel annotations: P < 0.005, P = 0.05 and P = 0.001. b, H3K27me3 staining of mouse E14.5 hindbrain (fourth ventricle, cerebellum, choroid, brainstem; insets (i) LRL and (ii) URL), P7 CB (IGL, EGL) and mouse WNT, SHH and subgroup-3 medulloblastomas.]

Figure 2 | Deregulation of H3K27me3 in subgroup-3 and -4 human and mouse medulloblastoma. a, Top row, SNP profiles of chromosome 7 (Chr 7) copy number in medulloblastomas (samples as Fig. 1; asterisk indicates subgroup-3 cases). Second row, expression of EZH2. Subgroup-3 and -4 tumours are ordered left to right by expression level, dagger indicates median expression point (Bonferroni-corrected P value of EZH2 expression versus chromosome 7 gain). Third row, mutation status of KDM6A, CHD7 and ZMYM3 (P value, Fisher’s exact test mutations versus EZH2 expression). Fourth row, H3K27me3 immunohistochemistry (numbers indicate colorimetry, P value ANOVA). GP4-7 indicates case subgroup-4, no. 7. a.u., arbitrary units. N/A, not available. b, H3K27me3 expression in mouse Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53flx/flx (WNT-subgroup), Ptch1+/−;Tp53−/− (SHH-subgroup) and Myc;Ink4c−/− (subgroup-3) medulloblastomas (right) and developing hindbrain (left). High-power views of E14.5 LRL (i) and upper rhombic lip (URL) (ii). EGL, external germinal layer; IGL, internal granule layer. Scale bar, 50 μm. White arrows in P7 cerebellum (CB) pinpoint H3K27me3-positive cells in the EGL.

ARTICLE RESEARCH

2 A U G U S T 2 0 1 2 | V O L 4 8 8 | N A T U R E | 4 5

SGK1 encodes a phosphatidylinositol-3-OH kinase (PI(3)K)-regulated kinase with functions including regulation of FOXO transcription factors25, regulation of NF-κB by phosphorylating IκB kinase26, and negative regulation of NOTCH signalling27. SGK1 also resides within a region of chromosome 6 commonly deleted in DLBCL (Fig. 1)5. The mechanism by which SGK1 and GNA13 inactivation may contribute to lymphoma is unclear, but the strong degree of apparent selection towards their inactivation and their overall high mutation frequency (each mutated in 18 of 106 DLBCL cases) suggests that their loss contributes to B-cell NHL. Certain genes are known to be mutated more commonly in GCB DLBCLs (for example, TP53 (ref. 28) and EZH2 (ref. 13)). Here, both SGK1 and GNA13 mutations were found only in GCB cases (P = 1.93 × 10⁻³ and 2.28 × 10⁻⁴, Fisher’s exact test; n = 15 and 18, respectively) (Fig. 2). Two additional genes (MEF2B and TNFRSF14) with no previously described role in DLBCL showed a similar restriction to GCB cases (Fig. 2).

Inactivating MLL2 mutations

MLL2 showed the most significant evidence for selection and the largest number of nonsense SNVs. Our RNA-seq analysis indicated that 26.0% (33/127) of cases carried at least one MLL2 cSNV. To address the possibility that variable RNA-seq coverage of MLL2 failed to capture some mutations, we PCR-amplified the entire MLL2 locus (~36 kilobases) in 89 cases (35 primary FLs, 17 DLBCL cell lines, and 37 DLBCLs). Of these cases 58 were among the RNA-seq cohort. Illumina amplicon re-sequencing (Supplementary Methods) revealed 78 mutations, confirming the RNA-seq mutations in the overlapping cases and identifying 33 additional mutations. We confirmed the somatic status of 46 variants using Sanger sequencing (Supplementary Table 10), and showed that 20 of the 33 additional mutations were insertions or deletions (indels). Three SNVs at splice sites were also detected, as were 10 new cSNVs that had not been detected by RNA-seq.

The somatic mutations were distributed across MLL2 (Fig. 3a). Of these, 37% (n = 29/78) were nonsense mutations, 46% (n = 36/78) were indels that altered the reading frame, 8% (n = 6/78) were point mutations at splice sites and 9% (n = 7/78) were non-synonymous amino acid substitutions (Table 2). Four of the somatic splice site mutations had effects on MLL2 transcript length and structure. For example, two heterozygous splice site mutations resulted in the use of a novel splice donor site and an intron retention event.

Approximately half of the NHL cases we sequenced had two MLL2 mutations (Supplementary Table 10). We used bacterial artificial

Table 1 | Overview of cSNVs and confirmed somatic mutations in most frequently mutated genes

Columns: Cases (NS, S, T); Total (NS, S, T); Somatic cSNVs (RNA-seq cohort)*; P (raw); q; NS SP; T SP; Skew (M, WT, both)†.

Gene          Cases          Total          Somatic   P (raw)       q             NS SP   T SP   Skew†
              NS   S    T    NS   S    T    cSNVs*
MLL2‡         16   8    17   17   8    18   10        6.85 × 10⁻⁸   8.50 × 10⁻⁷   0.834   14.4   WT
TNFRSF14 G‡   7    1    7    8    1    7    11        6.85 × 10⁻⁸   8.50 × 10⁻⁷   7.52    118    Both
SGK1 G‡       18   6    6    37   10   6    9         6.85 × 10⁻⁸   8.50 × 10⁻⁷   19.5    61.7   –
BCL10‡        2    0    4    3    0    4    4         6.85 × 10⁻⁸   8.50 × 10⁻⁷   3.62    112    WT
GNA13 G‡      21   1    2    33   1    2    5         6.85 × 10⁻⁸   8.50 × 10⁻⁷   24.1    25.7   Both
TP53 G‡       20   2    1    23   3    1    22        6.85 × 10⁻⁸   8.50 × 10⁻⁷   15.6    14.1   Both
EZH2 G‡       33   0    0    33   0    0    33        6.85 × 10⁻⁸   8.50 × 10⁻⁷   11.4    0.00   Both
BTG2‡         12   6    1    14   6    1    2         6.85 × 10⁻⁸   8.50 × 10⁻⁷   23.9    35.1   –
BCL2 G‡       42   45   0    96   105  0    43        9.35 × 10⁻⁸   8.50 × 10⁻⁷   3.78    0.00   M
BCL6‡§        11   2    0    12   2    0    2         9.35 × 10⁻⁸   8.50 × 10⁻⁷   0.175   0.00   M
CIITA‡§       5    3    0    6    3    0    2         9.35 × 10⁻⁸   8.50 × 10⁻⁷   0.086   0.00
FAS‡          2    0    4    3    0    4    2         1.52 × 10⁻⁷   1.17 × 10⁻⁶   2.54    66.5   WT
BTG1‡         11   6    2    11   7    2    10        1.52 × 10⁻⁷   1.17 × 10⁻⁶   17.5    52.5   Both
MEF2B G‡      20   2    0    20   2    0    10        2.05 × 10⁻⁷   1.47 × 10⁻⁶   14.2    0.00   M
IRF8‡         11   5    3    14   5    3    3         4.55 × 10⁻⁷   3.03 × 10⁻⁶   8.82    28.2   WT
TMEM30A‡      1    0    4    1    0    4    4         6.06 × 10⁻⁷   3.79 × 10⁻⁶   0.785   65.0   WT
CD58‡         2    0    3    2    0    3    2         2.42 × 10⁻⁶   1.43 × 10⁻⁵   2.29    69.2   –
KLHL6‡        10   2    2    12   2    2    4         1.00 × 10⁻⁵   5.26 × 10⁻⁵   5.42    16.4   –
MYD88 A‡      13   2    0    14   2    0    9         1.00 × 10⁻⁵   5.26 × 10⁻⁵   12.4    0.00   WT
CD70‡         5    0    1    5    0    2    3         1.70 × 10⁻⁵   8.48 × 10⁻⁵   7.08    44.0   –
CD79B A‡      7    2    1    9    2    1    5         2.00 × 10⁻⁵   9.52 × 10⁻⁵   10.9    18.3   M
CCND3‡        7    1    2    7    1    2    6         2.80 × 10⁻⁵   1.27 × 10⁻⁴   6.55    36.3   WT
CREBBP‡       20   7    4    24   7    4    9         1.00 × 10⁻⁴   4.35 × 10⁻⁴   2.72    6.04   Both
HIST1H1C‡     9    0    0    10   0    0    6         1.80 × 10⁻⁴   7.50 × 10⁻⁴   11.9    0.00   Both
B2M‡          7    0    0    7    0    0    4         3.90 × 10⁻⁴   1.56 × 10⁻³   16.6    0.00   WT
ETS1‡         10   1    0    10   1    0    4         4.10 × 10⁻⁴   1.58 × 10⁻³   5.76    0.00   WT
CARD11‡       14   3    0    14   3    0    3         1.90 × 10⁻³   7.04 × 10⁻³   3.37    0.00   Both
FAT2‡§        2    1    0    2    1    0    2         6.30 × 10⁻³   2.25 × 10⁻²   0.128   0.00   –
IRF4‡§        9    4    0    26   5    0    5         7.00 × 10⁻³   2.41 × 10⁻²   0.569   0.00   Both
FOXO1‡        8    4    0    10   4    0    4         7.60 × 10⁻³   2.53 × 10⁻²   4.02    0.00   –
STAT3         9    0    0    9    0    0    4         2.19 × 10⁻²   6.08 × 10⁻²   –       –      Both
RAPGEF1       8    3    0    10   3    0    3         2.98 × 10⁻²   7.45 × 10⁻²   –       –      WT
ABCA7         12   3    0    15   3    0    2         7.76 × 10⁻²   1.67 × 10⁻¹   –       –      WT
RNF213        10   8    0    10   8    0    2         7.87 × 10⁻²   1.67 × 10⁻¹   –       –      –
MUC16         17   12   0    39   25   0    2         8.32 × 10⁻²   1.73 × 10⁻¹   –       –      –
HDAC7         8    4    0    8    4    0    2         8.94 × 10⁻²   1.82 × 10⁻¹   –       –      WT
PRKDC         7    3    0    7    4    0    2         1.06 × 10⁻¹   2.05 × 10⁻¹   –       –      –
SAMD9         9    2    0    9    2    0    2         1.79 × 10⁻¹   3.01 × 10⁻¹   –       –      –
TAF1          10   0    0    10   0    0    2         3.03 × 10⁻¹   4.74 × 10⁻¹   –       –      –
PIM1          20   19   0    33   34   0    11        3.40 × 10⁻¹   5.23 × 10⁻¹   –       –      WT
COL4A2        8    2    0    8    2    0    2         7.64 × 10⁻¹   8.99 × 10⁻¹   –       –      –
EP300         8    7    1    8    7    1    3         9.54 × 10⁻¹   1.00          –       –      WT

Individual cases with non-synonymous (NS), synonymous (S) and truncating (T) mutations and the total number of mutations of each class are shown separately because some genes contained multiple mutations in the same case. The P values indicated in bold are the upper limit on the P value for that gene determined with the approach described in ref. 19 (see Supplementary Methods), q is the Benjamini-corrected q value, and NS SP and T SP refer to selective pressure estimates from this model for the acquisition of non-synonymous or truncating mutations, respectively. Genes with a superscript of either A or G were found to have mutations significantly enriched in ABC or GCB cases, respectively (P < 0.05, Fisher’s exact test).
*Additional somatic mutations identified in larger cohorts and insertion/deletion mutations are not included in this total.
†‘Both’ indicates that we observed separate cases in which skewed expression was seen but where this skew was not consistent for the mutant or wild-type allele.
‡Genes significant at a false discovery rate of 0.03. SNVs in BCL2 and previously confirmed hot spot mutations in EZH2 and CD79B are probably somatic in these samples based on published observations of others.
§Selective pressure estimates are both <1, indicating purifying selection rather than positive selection acting on this gene.
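The Table 1 footnote describes q as the Benjamini-corrected q value. As a rough illustration of how such a correction behaves, the sketch below applies the standard Benjamini-Hochberg step-up adjustment to a short list of raw P values. The function name and the example P values are hypothetical, and the exact procedure and the set of genes over which the correction was run are assumptions here, so the output is not expected to reproduce the q column of the table.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values, in the input order."""
    m = len(p_values)
    # indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # walk from the largest P value down, keeping q monotonic in rank
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical raw P values, for illustration only (not taken from Table 1)
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
q = benjamini_hochberg(example_p)
for p_raw, q_adj in zip(example_p, q):
    print(f"P = {p_raw:<6} q = {q_adj:.5f}")
```

Genes would then be called significant when q falls below the chosen false discovery rate, which the footnote gives as 0.03.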


300 | NATURE | VOL 476 | 18 AUGUST 2011


one unclassified medulloblastoma contained novel inactivating mutations in KDM6A (Figs 1 and 2 and Supplementary Figs 8 and 16). The single female with a KDM6A splice-site mutation showed a deletion of the second allele that escapes X inactivation25 (subgroup 4, no. 15), and 57% (n = 4/7) of KDM6A-mutant male medulloblastomas deleted chromosome Y, compared with only 6% (n = 3/51) of male, KDM6A wild-type tumours (P < 0.005, Fisher’s exact test; Fig. 1). Thus, a two-hit model of KDM6A-UTY tumour suppression seems to operate in subgroup-4 medulloblastomas. Notably, mutations in six other KDM family members (KDM1A, KDM3A, KDM4C, KDM5A, KDM5B and KDM7A) were detected exclusively in subgroup-3 and -4 tumours, implicating broad disruption of lysine demethylation in these medulloblastomas (Fig. 1, Supplementary Table 11 and Supplementary Fig. 16).

Subgroup-3 and -4 medulloblastomas also gained and overexpressed EZH2 (chromosome 7q35-34), which writes H3K27me3, and contained novel inactivating mutations in effectors and regulators of the H3K4me3 mark26 (Fig. 2a and Supplementary Fig. 8). Gain of chromosome 7q was significantly enriched among subgroup-3 and -4 medulloblastomas (P < 0.005, Fisher’s exact test) and correlated directly with EZH2 expression. Indeed, EZH2 was the eighth most significantly overexpressed gene on chromosome 7 among subgroup-3 and -4 medulloblastomas that gained chromosome 7q relative to those with diploid chromosome 7 (P < 0.005, Bonferroni correction). Nonsense and frameshift mutations were detected in CHD7 in four subgroup-3 and -4 tumours. ZMYM3 (chromosome Xq13.1), which participates in a protein complex with KDM1A to regulate gene expression at the H3K4me3 mark27, was targeted by novel frameshift, nonsense and missense mutations in three male subgroup-4 medulloblastomas. All three tumours with mutations in ZMYM3 also mutated KDM6A (subgroup 4, nos 19, 20) or KDM1A (subgroup 4, no. 21), suggesting that these alterations are cooperative. Remarkably, KDM6A, CHD7 and ZMYM3 mutations were confined to subgroups 3 and 4, and clustered in samples with sub-median EZH2 expression levels (Fig. 2a; P < 0.05, Fisher’s exact test). These data suggest that subgroup-3 and -4 medulloblastomas retain a stem-like epigenetic state by aberrantly writing (EZH2 upregulation) or preserving (KDM6A-UTY inactivation) H3K27me3, or disrupting H3K4me3 associated transcription (CHD7 and ZMYM3 inactivation). Indeed, human and mouse subgroup-3 and -4 medulloblastomas contained significantly more H3K27me3 than did WNT- or SHH-subgroup tumours (Fig. 2b). Thus, gain of EZH2 and loss of KDM6A probably maintains H3K27me3 in subgroup-3 and -4 medulloblastomas.

Finally, we looked to see if the differential expression of H3K27me3 among medulloblastoma subgroups reflects ancestral chromatin marking in the progenitors that generate these tumours (Fig. 2b). Relatively low levels of H3K27me3 were detected in LRLPs and committed GNPs, which generate WNT- and SHH-subgroup medulloblastomas, respectively3–5, potentially explaining why mutations that preserve this epigenetic mark are absent from these tumours. We recently showed that subgroup-3 medulloblastomas arise from a rare fraction of cerebellar progenitors6. We are currently investigating whether these progenitors are found among the H3K27me3-positive cells seen in the external germinal layer (Fig. 2b).
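The chromosome Y comparison reported for KDM6A can be re-derived from the counts given in the text: 4 of 7 KDM6A-mutant male tumours deleted chromosome Y, against 3 of 51 KDM6A wild-type male tumours. What follows is a minimal sketch of a two-sided Fisher’s exact test in plain Python. The function name is hypothetical, and the two-sided convention used here (summing every table no more probable than the observed one) is an assumption, since the text does not say how the test was configured.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # sum all tables that are no more probable than the observed table
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: KDM6A-mutant vs KDM6A wild-type male tumours
# Columns: chromosome Y deleted vs chromosome Y retained
p = fisher_exact_two_sided(4, 3, 3, 48)
print(f"P = {p:.4f}")
```

Under these assumptions the result falls below 0.005, in line with the threshold reported in the text.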

Novel mutations in WNT-subgroup medulloblastomas

WNT-subgroup medulloblastomas contained mutations in epigenetic regulators that are different to those seen in subgroup-3 and -4 disease. CTNNB1, the principal effector of the WNT pathway, forms a transcription factor with the T-cell factor/lymphoid enhancer factor (TCF/LEF)28. The carboxy terminus of CTNNB1 then recruits a series of protein complexes that remodel chromatin and promote transcription at WNT-responsive genes (Supplementary Fig. 8). These include: histone acetyltransferases (for example, CREBBP and TRRAP–TIP60 complexes)28,29; ATPases of the SWI/SNF family (for example, SMARCA4)30; and the mediator complex that coordinates RNA polymerase II placement (for example, MED13)31. As expected, >70% (n = 8/11) of WNT-subgroup medulloblastomas contained mutations that stabilize CTNNB1 (Fig. 1 and Supplementary Fig. 8; P < 0.0001, Fisher’s exact test)32,33. A single subgroup-3 case (no. 5)

[Figure 2 image not reproduced. Panel a tracks: Chr 7 copy number score; EZH2 expression (log ratio); KDM6A, CHD7 and ZMYM3 mutation status (mutant or wild type); H3K27me3 immunohistochemistry (×10³ a.u.). Panel b: H3K27me3 staining in mouse E14.5 hindbrain, P7 cerebellum (CB) and mouse medulloblastomas (WNT, SHH, subgroup 3).]

Figure 2 | Deregulation of H3K27me3 in subgroup-3 and -4 human and mouse medulloblastoma. a, Top row, SNP profiles of chromosome 7 (Chr 7) copy number in medulloblastomas (samples as Fig. 1; asterisk indicates subgroup-3 cases). Second row, expression of EZH2. Subgroup-3 and -4 tumours are ordered left to right by expression level, dagger indicates median expression point (Bonferroni-corrected P value of EZH2 expression versus chromosome 7 gain). Third row, mutation status of KDM6A, CHD7 and ZMYM3 (P value, Fisher’s exact test mutations versus EZH2 expression). Fourth row, H3K27me3 immunohistochemistry (numbers indicate colorimetry, P value ANOVA). GP4-7 indicates case subgroup-4, no. 7. a.u., arbitrary units. N/A, not available. b, H3K27me3 expression in mouse Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53flx/flx (WNT-subgroup), Ptch1+/−;Tp53−/− (SHH-subgroup) and Myc;Ink4c−/− (subgroup-3) medulloblastomas (right) and developing hindbrain (left). High-power views of E14.5 LRL (i) and upper rhombic lip (URL) (ii). EGL, external germinal layer; IGL, internal granule layer. Scale bar, 50 μm. White arrows in P7 cerebellum (CB) pinpoint H3K27me3 cells in the EGL.


2 AUGUST 2012 | VOL 488 | NATURE | 45

also showed a mutation in CTNNB1, but this mutation has not been reported in cancer, did not upregulate nuclear CTNNB1 (Fig. 1) and is of unclear relevance. Remarkably, six WNT-subgroup medulloblastomas showed mutations in chromatin modifiers that are recruited to TCF/LEF WNT-responsive genes by CTNNB1 (Fig. 1 and Supplementary Fig. 8). Four WNT-subgroup tumours contained heterozygous missense mutations in the helicase domain of SMARCA4 (P < 0.002, Fisher’s exact test), two samples, including one with a SMARCA4 mutation (no. 5), contained nonsense mutations in CREBBP (WNT-subgroup enrichment, P < 0.02, Fisher’s exact test), and missense mutations in TRRAP and MED13 were detected in a single WNT-subgroup medulloblastoma each. Thus, in addition to stabilization of CTNNB1, the development of WNT-subgroup medulloblastoma may require disruption of chromatin remodelling at WNT-responsive genes.

A small number of WNT-subgroup medulloblastomas lack mutations in CTNNB1 or APC, suggesting that alternative mechanisms drive aberrant WNT signals in these tumours. Three WNT-subgroup medulloblastomas in our series contained wild-type CTNNB1 (nos 1, 10 and 11; Fig. 1). Sample no. 11 inactivated APC as the sole case of Turcot’s syndrome in our study, but this tumour and sample no. 10 also contained novel missense mutations in CDH1 (R63G, V329F; WNT-subgroup enrichment, P < 0.05, Fisher’s exact test; Fig. 1). CDH1 sequesters CTNNB1 at the cell membrane34, and mutations that disrupt this interaction promote WNT signalling in adult cancers35,36. The functional consequences of CDH1(R63G) and CDH1(V329F) remain to be determined, but their restriction to WNT-subgroup tumours, mutual exclusivity with CTNNB1 mutations, and adjacency to residues mutated in breast cancer (http://www.sanger.ac.uk/genetics/CGP/cosmic/), suggest they might promote aberrant WNT signals in medulloblastoma.

We showed previously in mice that mutant Ctnnb1 initiates WNT-subgroup medulloblastoma by arresting the migration of LRLPs from the embryonic dorsal brainstem to the pontine grey nucleus (PGN)5. Therefore, to test whether disruption of CDH1 might substitute for mutant CTNNB1 in medulloblastoma, we used short hairpin (sh)RNAs to knockdown Cdh1 in embryonic day (E)14.5 mouse LRLPs (Fig. 3a–c). Deletion of Cdh1 expression upregulated Tcf/Lef-mediated gene transcription in LRLPs and more than doubled their self-renewal capacity (Fig. 3b). Furthermore, in utero electroporation of LRLPs with Cdh1 shRNAs impeded their migration from the dorsal brainstem to the PGN with an efficiency similar to that of mutant Ctnnb1 (Fig. 3d, e; see Supplementary Methods). These data support the hypothesis that CDH1 suppresses the formation of WNT-subgroup medulloblastoma by regulating WNT-signals in LRLPs.

WNT-subgroup medulloblastomas were also enriched for novel, recurrent somatic missense mutations in the DEAD-box RNA helicase DDX3X at chromosome Xp11.3 (P < 0.0001, Fisher’s exact test; Fig. 1). DDX3X regulates several critical cell processes including chromosome segregation37, cell cycle progression38, gene transcription and translation39. Previously reported cancer-associated mutations in DDX3X disrupt the ATPase activity of the protein, but seven of eight mutations identified in our series clustered in the DEAD-box domain (Supplementary Information and Supplementary Fig. 8). Structural modelling predicts that these mutations interfere with nucleic acid binding, possibly altering specificity and/or affinity for RNA substrates, rather than inactivating DDX3X (Supplementary Figs 17–22). Indeed, the wild-type allele of DDX3X that escapes X inactivation25 was retained by two of three DDX3X-mutant female medulloblastomas, and knockdown of Ddx3x halved the self-renewal rate of mouse LRLPs, suggesting that this protein is important for the proliferation and/or maintenance of the LRLP lineage (Fig. 3b).

To understand better the role of DDX3X in WNT-subgroup medulloblastoma, we used our in utero migration assay to assess the impact of Ddx3x shRNAs, mutant Ddx3xT275M (identified in WNT-subgroup sample no. 9), or mutant Ddx3xG325E (WNT sample no. 8) on LRLPs. Remarkably, although Ddx3x shRNAs were expressed abundantly in E14.5 brainstem cells within 48 h of electroporation, ≤0.5% of Ddx3x-shRNA-positive cells were present by postnatal day (P)1, confirming the critical importance of this gene to maintain the LRLP lineage (Fig. 3d, e). In contrast, mice electroporated with either mutant Ddx3xT275M or Ddx3xG325E consistently contained

[Figure 3 image not reproduced. Panels a, b: phase, Tcf reporter, Olig3, Wnt1, DAPI and merged images of LRLPs transduced with no construct, vector, mutant Ctnnb1, or Cdh1, Smarca4 or Ddx3x shRNA. Panel c: percentage expression of controls (real-time qPCR) for Cdh1, Smarca4, Ddx3x, Gabrg1, Mll2 and Kdm6a shRNA. Panel d: dorsal brainstem and PGN at E16.5 and P1 for control, mutant Ctnnb1, Cdh1 shRNA, Ddx3x shRNA and Ddx3xT275M. Panel e: relative distance travelled and labelled cells (%), with P values for cell distance and cell number and median distance (μm), for each construct.]

Figure 3 | Genes mutated in WNT-subgroup medulloblastomas regulate LRLPs. a, b, Isolated Olig3+/Wnt1+ LRLPs were transduced in b with mutant Ctnnb1 (above hashed line) or the indicated shRNA-RFP (red fluorescence protein) construct (below hashed line). LRLPs were also transduced (+) or not (−) with a Tcf/Lef-enhanced green fluorescence (Tcf) reporter. Numbers on right show clonal percentage 2° to 3° passage neurosphere formation (± standard deviation (s.d.)). N/A, not applicable. Scale bar, 10 μm. c, Knockdown of genes targeted by shRNA relative to control transduced cells. Data show mean ± s.d. d, Immunofluorescence of P1 mouse hindbrains electroporated in utero at E14.5 with GFP (to control for equivalence of electroporation between embryos control) and the indicated construct. High-power views of indicated areas are shown right. Cells targeted by Ddx3x shRNA are present 48 h after electroporation but ablated by P1. Scale bars, 200 μm. e, Heat map showing the distribution of GFP+/RFP+ cells in electroporated mice at P1. Median distance migrated by cells and P values of migration distance and cell number relative to controls is shown. ****P < 0.00005; ***P < 0.0005; **P < 0.005; *P < 0.05. Red and green text reports significant increase or decrease, respectively, relative to control.


chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.

With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense mutations. One such gene was MEF2B, which had not previously been linked to lymphoma. We found that 20 (15.7%) cases had MEF2B cSNVs and 4 (3.1%) cases had MEF2C cSNVs. All cSNVs detected by RNA-seq affected either the MADS box or MEF2 domains. To determine the frequency and scope of MEF2B mutations, we Sanger-sequenced exons 2 and 3 in 261 primary FL samples; 259 DLBCL primary tumours; 17 cell lines; 35 cases of assorted NHL (IBL, composite FL and PBMCL); and eight non-malignant centroblast samples. We also used a capture strategy (Supplementary Methods) to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. Of the variants 55 (80%) affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table 11; Fig. 3b). Each patient generally had a single MEF2B variant and we observed relatively few (eight in total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table 12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).

[Figure 2 image not reproduced. Heat map axes list, in order: BCL2s, EZH2, GNA13, SGK1, BCL2, TP53, MEF2B, TNFRSF14, FOXO1, MLL2, CREBBP, EP300, B2M, IRF8, BTG2, BTG1, KLHL6, CCND3, HIST1H1C, ETS1, STAT3, CD70, CD58, TMEM30A, FAS, CARD11, TNFAIP3, BCL6s, CD79B, MYD88. Side bars show the number of cases (ABC, GCB, U, FL) and mark ABC or GCB enrichment. Colour legend bins: P < 0.05, 0.05–0.1 and 0.1–0.3.]

Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher’s exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher’s exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).
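The colouring rule in the Figure 2 legend (minimum of the left- and right-tailed Fisher’s exact test, a trend threshold of 0.3, darkest shade at P ≤ 0.05) can be sketched for a single pair of genes as follows. This is an illustration under stated assumptions rather than the authors’ code: the function names, the shade labels and the example counts are hypothetical, and the intermediate bin boundary of 0.1 is read from the figure’s colour legend.

```python
from math import comb


def fisher_tails(a, b, c, d):
    """Left- and right-tailed Fisher P values for the table [[a, b], [c, d]].

    a = cases with both genes mutated, b = gene 1 only,
    c = gene 2 only, d = neither gene mutated.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    prob = {x: comb(row1, x) * comb(row2, col1 - x) / denom
            for x in range(lo, hi + 1)}
    left = sum(p for x, p in prob.items() if x <= a)   # as few or fewer shared cases
    right = sum(p for x, p in prob.items() if x >= a)  # as many or more shared cases
    return left, right


def heatmap_cell(a, b, c, d):
    """Return (direction, shade) for one gene pair, or None if P > 0.3."""
    left, right = fisher_tails(a, b, c, d)
    p = min(left, right)
    if p > 0.3:
        return None  # no trend captured; cell left uncoloured
    direction = "co-occurrence" if right < left else "mutual exclusion"
    if p <= 0.05:
        shade = "dark"
    elif p <= 0.1:
        shade = "medium"
    else:
        shade = "light"
    return direction, shade


# Hypothetical counts for 100 cases, for illustration only
print(heatmap_cell(0, 20, 20, 60))   # two genes never mutated together
print(heatmap_cell(15, 5, 5, 75))    # two genes usually mutated together
print(heatmap_cell(4, 16, 16, 64))   # overlap equal to chance expectation
```

Taking the smaller of the two one-sided P values lets one statistic encode both the strength of a trend and its direction, which is what allows a single heat map to show co-occurrence and mutual exclusion together.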

[Figure 3 image not reproduced. Panel a: MLL2 map (scale 0 to 5,500 bp) with PHD, PHD, HMG box, COG5141, FYRN, FYRC and SET domains. Panel b: MEF2B map (scale 0 to 350 bp) with MADS box and MEF2 domains; labelled variants K4E, Y69H, Y69C, N81K, N81Y, D83G, D83A and D83V.]

Figure 3 | Summary and effect of somatic mutations affecting MLL2 and MEF2B. a, Re-sequencing the MLL2 locus in 89 samples revealed mainly nonsense (red circles) and frameshift-inducing indel mutations (orange triangles; inverted triangles for insertions and upright triangles for deletions). A smaller number of non-synonymous somatic mutations (green circles) and point mutations or deletions affecting splice sites (yellow stars) were also observed. All of the non-synonymous point mutations affected a residue within either the catalytic SET domain, the FYRC domain (FY-rich carboxy-terminal domain) or PHD zinc finger domains. The effect of these splice-site mutations on MLL2 splicing was also explored (Supplementary Figure 7). b, The cSNVs and somatic mutations found in MEF2B in all FL and DLBCL cases sequenced are shown with the same symbols. Only the amino acids with variants in at least two patients are labelled. cSNVs were most prevalent in the first two protein-coding exons of MEF2B (exons 2 and 3). The crystal structure of MEF2 bound to EP300 supports the idea that two of the mutated sites (L67 and Y69) are important in the interaction between these proteins (Supplementary Figure 8 and Supplementary Discussion)50.

Table 2 | Summary of types of MLL2 somatic mutations

Sample type                    FL      DLBCL   DLBCL cell-line   Centroblast
Truncation                     18      4       7                 0
Indel with frameshift          22      8       6                 0
Splice site                    4       2       0                 0
SNV                            3       2       2                 0
Any mutation/number of cases   31/35   12/37   10/17             0/8
Percentage                     89      32      59                0
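The counts in Table 2 can be cross-checked against the figures quoted in the text (78 MLL2 mutations in total, split 37%, 46%, 8% and 9% by class, and mutation frequencies of 89%, 32%, 59% and 0% by sample type). The short sketch below does that arithmetic; the variable names are illustrative, and it assumes the table's "Truncation" row corresponds to the nonsense mutations described in the text.

```python
# Counts transcribed from Table 2
# Column order: FL, DLBCL, DLBCL cell line, centroblast
table2 = {
    "Truncation": [18, 4, 7, 0],
    "Indel with frameshift": [22, 8, 6, 0],
    "Splice site": [4, 2, 0, 0],
    "SNV": [3, 2, 2, 0],
}
mutated_cases = [31, 12, 10, 0]
sequenced_cases = [35, 37, 17, 8]

# Total mutations per class, summed across the four sample types
class_totals = {name: sum(counts) for name, counts in table2.items()}
total_mutations = sum(class_totals.values())

# Share of each class among all mutations, rounded to whole percent
class_percent = {name: round(100 * n / total_mutations)
                 for name, n in class_totals.items()}

# Fraction of cases with any MLL2 mutation, per sample type
case_percent = [round(100 * m / n)
                for m, n in zip(mutated_cases, sequenced_cases)]

print(total_mutations)
print(class_percent)
print(case_percent)
```

The class totals (29, 36, 6 and 7) sum to the 78 mutations reported from amplicon re-sequencing, so the table and the running text agree.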


~50% more labelled cells at P1 than did controls, although these cells migrated normally (Fig. 3d, e and data not shown). Thus, mutations in DDX3X may contribute to WNT-subgroup medulloblastoma by increasing LRLP proliferation rather than perturbing the migration of their daughter cells. Notably, comparable knockdown in utero of Mll2, Gabrg1 and Kdm6a that were selectively mutated in non-WNT medulloblastomas had no apparent impact on LRLPs; supporting the value of our assay for assessing WNT-subgroup specific mutations and underscoring the importance of cell context for functional studies of genes mutated in cancer subgroups.

PIK3CA mutations promote WNT-subgroup medulloblastoma

Cancer-associated, activating mutations in PIK3CA were detected in a single case each of WNT-subgroup (PIK3CA(Q546K)), SHH-subgroup (PIK3CA(H1047R)) and subgroup-4 (PIK3CA(N345K)) medulloblastoma (Fig. 1 and Supplementary Fig. 23). Although PIK3CA mutations are common in adult cancers40 and reported in medulloblastoma41, their role in tumorigenesis remains controversial. In particular it is not known if these mutations initiate or progress cancer. To test this, we generated mice that express a conditional allele of the Pik3caE545K mutation. Mice harbouring Pik3caE545K or Pik3caE545K and Tp53flx/flx were bred with Blbp-Cre, which drives efficient recombination in LRLPs5. Blbp-Cre;Pik3caE545K mice, with or without Tp53flx/flx, survived tumour free for a median of 212 days with no evidence of aberrant LRLP migration (Fig. 4a and data not shown). In stark contrast, 100% (n = 11/11) of Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53+/flx;Pik3caE545K mice developed WNT-subgroup medulloblastomas by 3 months of age; only 4% (n = 2/54) of Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53+/flx mice develop WNT-subgroup medulloblastoma by 11 months (Fig. 4a, b). Pik3ca wild-type and mutant mouse medulloblastomas displayed similar ‘classic’ histologies and nuclear Ctnnb1+, but Pik3caE545K mutant tumours contained greater AKT pathway activity as measured by pS6 and p4EBP1 immunostaining. Thus mutations in PIK3CA probably activate the AKT pathway to progress, rather than initiate, WNT-subgroup medulloblastoma.

SHH-subgroup medulloblastomas

Four of thirteen SHH-subgroup medulloblastomas contained expected biallelic inactivating alterations in SUFU or PTCH1. What drives aberrant SHH signals in the remaining cases remains unclear. These tumours contained mutations in MLL2, TP53 and PTEN that have been reported previously in medulloblastoma42; but these mutations occur in other subgroups and are not known to activate SHH signals. Two SHH-subgroup tumours (nos 11 and 12) contained identical novel T48M mutations in the GABAA (γ-aminobutyric acid, subtype A) receptor, γ1, which is predicted to be deleterious (Fig. 1 and Supplementary Table 9). Disruption of GABAA receptors can enhance neural stem cell proliferation43, suggesting that these mutations might deregulate the proliferation of GNPs that generate SHH-subgroup medulloblastomas.

Discussion

We have identified several, new, recurrent, somatic mutations in specific subgroups of medulloblastoma. Alterations affecting EZH2, KDM6A, CHD7 and ZMYM3 seem to disrupt chromatin marking of genes in subgroup-3 and -4 tumours. Further epigenetic studies will be required to uncover the identity of these genes, but evidence suggests these may include OTX2, MYC and MYCN44,45. As amplification of these genes was detected almost exclusively in subgroup-3 and -4 tumours that lacked mutations in KDM6A, CHD7 or ZMYM3, it is tempting to speculate that these genetic alterations target common transforming pathways. A recent study detected recurrent mutations in three other chromatin remodellers in medulloblastoma42: SMARCA4, MLL2 and MLL3, but this study did not include details of tumour subgroup. Here, we show that mutations in SMARCA4, CREBBP, TRRAP and MED13 are enriched in WNT-subgroup medulloblastomas; thereby uncovering potential cooperative mutations in chromatin remodellers and their binding-partner oncogene, CTNNB1. Thus, disruptions in the epigenetic machinery of medulloblastoma are likely to be subgroup specific and may cooperate with other oncogenic mutations. The low incidence of MLL2 mutations detected in our study relative to previous work42 probably reflects differences in study populations (see Supplementary Results).

Although medulloblastoma is more prevalent in males, especially with subgroup-3 and -4 disease46, the reason for this sex bias is unknown. One potential explanation is the location of medulloblastoma oncogenes or tumour suppressor genes on chromosome X47. Three of the most recurrently mutated genes detected in our study are located on chromosome X, of which two (ZMYM3 and KDM6A) were observed almost exclusively in males. Mutation in these genes might explain some of the male sex bias in medulloblastoma. The third mutated X chromosome gene, DDX3X, is more likely to be a WNT-subgroup medulloblastoma oncogene. Three of four female medulloblastomas carried heterozygous mutations in DDX3X that escape X inactivation25, and our functional data indicate that mutations in this gene provide a proliferative advantage to LRLPs that generate these tumours.

Our findings also have important implications for drug development. Inhibitors of the epigenetic machinery, especially those that maintain H3K27me3 (for example, EZH2 methylase), may be useful treatments for subgroup-3 and -4 disease. These tumours include the most aggressive forms of medulloblastoma, for which treatment options are limited. Mutations that activate PIK3CA and DDX3X in WNT-subgroup tumours might also be targeted with novel therapeutic strategies48,49. Future clinical trials of drugs that target these mutant proteins must recruit the appropriate patient populations, as we demonstrate that mutations show subgroup specificity in medulloblastoma. Our accurate mouse models of WNT-subgroup, SHH-subgroup and subgroup-3 medulloblastoma should help with future studies of the biological and therapeutic importance of the novel genetic alterations described in this study.

METHODS SUMMARY

Human tumour and matched blood samples were obtained with informed consent through an institutional review board approved protocol at St Jude Children’s Research Hospital. WGS and analysis of WGS data were performed

a

b Ctnnb1 pS6(Ser 235/236) H&E p4EBP1(Thr 37/46)

Tp53

+/fl

x ;Pik

3ca+

/flx

Ctn

nb1+

/lox(

Ex3

) ;Tp

53+

/flx

Ctn

nb1+

/lox(

Ex3

) ;

P < 0.0001Ctnnb1+/lox(Ex3);Tp53+/flx;Pik3ca+/flx n = 11

0 50 100 150 200 250 300 350 400 450 500 550 6000

25

50

75

100

Time (days)

Tum

our-

free

sur

viva

l (%

)

Ctnnb1+/lox(Ex3);Tp53flx/flx n = 55

Ctnnb1+/lox(Ex3);Tp53+/flx n = 54

Pik3ca+/flx n =11Pik3ca+/flx;Tp53+/flx n =15

Figure 4 | Pik3caE545K accelerates but does not initiate WNT-subgroupmedulloblastoma. a, Tumour-free survival of mice of the indicated genotype.All mice carry the Blbp-cre allele. Log rank P, 0.0001. b, Haematoxylin andeosin (H&E) and immunohistochemical stains of indicated tumours.Scale bar, 50mm.

ARTICLE RESEARCH

2 A U G U S T 2 0 1 2 | V O L 4 8 8 | N A T U R E | 4 7

SGK1 encodes a phosphatidylinositol-3-OH kinase (PI(3)K)-regulated kinase with functions including regulation of FOXO transcription factors25, regulation of NF-κB by phosphorylating IκB kinase26, and negative regulation of NOTCH signalling27. SGK1 also resides within a region of chromosome 6 commonly deleted in DLBCL (Fig. 1)5. The mechanism by which SGK1 and GNA13 inactivation may contribute to lymphoma is unclear, but the strong degree of apparent selection towards their inactivation and their overall high mutation frequency (each mutated in 18 of 106 DLBCL cases) suggests that their loss contributes to B-cell NHL. Certain genes are known to be mutated more commonly in GCB DLBCLs (for example, TP53 (ref. 28) and EZH2 (ref. 13)). Here, both SGK1 and GNA13 mutations were found only in GCB cases (P = 1.93 × 10⁻³ and 2.28 × 10⁻⁴, Fisher’s exact test; n = 15 and 18, respectively) (Fig. 2). Two additional genes (MEF2B and TNFRSF14) with no previously described role in DLBCL showed a similar restriction to GCB cases (Fig. 2).

Inactivating MLL2 mutations

MLL2 showed the most significant evidence for selection and the largest number of nonsense SNVs. Our RNA-seq analysis indicated that 26.0% (33/127) of cases carried at least one MLL2 cSNV. To address the possibility that variable RNA-seq coverage of MLL2 failed to capture some mutations, we PCR-amplified the entire MLL2 locus (~36 kilobases) in 89 cases (35 primary FLs, 17 DLBCL cell lines, and 37 DLBCLs). Of these cases 58 were among the RNA-seq cohort. Illumina amplicon re-sequencing (Supplementary Methods) revealed 78 mutations, confirming the RNA-seq mutations in the overlapping cases and identifying 33 additional mutations. We confirmed the somatic status of 46 variants using Sanger sequencing (Supplementary Table 10), and showed that 20 of the 33 additional mutations were insertions or deletions (indels). Three SNVs at splice sites were also detected, as were 10 new cSNVs that had not been detected by RNA-seq.

The somatic mutations were distributed across MLL2 (Fig. 3a). Of these, 37% (n = 29/78) were nonsense mutations, 46% (n = 36/78) were indels that altered the reading frame, 8% (n = 6/78) were point mutations at splice sites and 9% (n = 7/78) were non-synonymous amino acid substitutions (Table 2). Four of the somatic splice site mutations had effects on MLL2 transcript length and structure. For example, two heterozygous splice site mutations resulted in the use of a novel splice donor site and an intron retention event.

Approximately half of the NHL cases we sequenced had two MLL2 mutations (Supplementary Table 10). We used bacterial artificial

Table 1 | Overview of cSNVs and confirmed somatic mutations in most frequently mutated genes

Gene | Cases NS | Cases S | Cases T | Total NS | Total S | Total T | Somatic cSNVs (RNA-seq cohort)* | P (raw) | q | NS SP | T SP | Skew (M, WT, both)†
MLL2‡ | 16 | 8 | 17 | 17 | 8 | 18 | 10 | 6.85 × 10⁻⁸ | 8.50 × 10⁻⁷ | 0.834 | 14.4 | WT
TNFRSF14 (G)‡ | 7 | 1 | 7 | 8 | 1 | 7 | 11 | 6.85 × 10⁻⁸ | 8.50 × 10⁻⁷ | 7.52 | 118 | Both
SGK1 (G)‡ | 18 | 6 | 6 | 37 | 10 | 6 | 9 | 6.85 × 10⁻⁸ | 8.50 × 10⁻⁷ | 19.5 | 61.7 | –
BCL10‡ | 2 | 0 | 4 | 3 | 0 | 4 | 4 | 6.85 × 10⁻⁸ | 8.50 × 10⁻⁷ | 3.62 | 112 | WT
GNA13 (G)‡ | 21 | 1 | 2 | 33 | 1 | 2 | 5 | 6.85 × 10⁻⁸ | 8.50 × 10⁻⁷ | 24.1 | 25.7 | Both
TP53 (G)‡ | 20 | 2 | 1 | 23 | 3 | 1 | 22 | 6.85 × 10⁻⁸ | 8.50 × 10⁻⁷ | 15.6 | 14.1 | Both
EZH2 (G)‡ | 33 | 0 | 0 | 33 | 0 | 0 | 33 | 6.85 × 10⁻⁸ | 8.50 × 10⁻⁷ | 11.4 | 0.00 | Both
BTG2‡ | 12 | 6 | 1 | 14 | 6 | 1 | 2 | 6.85 × 10⁻⁸ | 8.50 × 10⁻⁷ | 23.9 | 35.1 | –
BCL2 (G)‡ | 42 | 45 | 0 | 96 | 105 | 0 | 43 | 9.35 × 10⁻⁸ | 8.50 × 10⁻⁷ | 3.78 | 0.00 | M
BCL6‡§ | 11 | 2 | 0 | 12 | 2 | 0 | 2 | 9.35 × 10⁻⁸ | 8.50 × 10⁻⁷ | 0.175 | 0.00 | M
CIITA‡§ | 5 | 3 | 0 | 6 | 3 | 0 | 2 | 9.35 × 10⁻⁸ | 8.50 × 10⁻⁷ | 0.086 | 0.00 |
FAS‡ | 2 | 0 | 4 | 3 | 0 | 4 | 2 | 1.52 × 10⁻⁷ | 1.17 × 10⁻⁶ | 2.54 | 66.5 | WT
BTG1‡ | 11 | 6 | 2 | 11 | 7 | 2 | 10 | 1.52 × 10⁻⁷ | 1.17 × 10⁻⁶ | 17.5 | 52.5 | Both
MEF2B (G)‡ | 20 | 2 | 0 | 20 | 2 | 0 | 10 | 2.05 × 10⁻⁷ | 1.47 × 10⁻⁶ | 14.2 | 0.00 | M
IRF8‡ | 11 | 5 | 3 | 14 | 5 | 3 | 3 | 4.55 × 10⁻⁷ | 3.03 × 10⁻⁶ | 8.82 | 28.2 | WT
TMEM30A‡ | 1 | 0 | 4 | 1 | 0 | 4 | 4 | 6.06 × 10⁻⁷ | 3.79 × 10⁻⁶ | 0.785 | 65.0 | WT
CD58‡ | 2 | 0 | 3 | 2 | 0 | 3 | 2 | 2.42 × 10⁻⁶ | 1.43 × 10⁻⁵ | 2.29 | 69.2 | –
KLHL6‡ | 10 | 2 | 2 | 12 | 2 | 2 | 4 | 1.00 × 10⁻⁵ | 5.26 × 10⁻⁵ | 5.42 | 16.4 | –
MYD88 (A)‡ | 13 | 2 | 0 | 14 | 2 | 0 | 9 | 1.00 × 10⁻⁵ | 5.26 × 10⁻⁵ | 12.4 | 0.00 | WT
CD70‡ | 5 | 0 | 1 | 5 | 0 | 2 | 3 | 1.70 × 10⁻⁵ | 8.48 × 10⁻⁵ | 7.08 | 44.0 | –
CD79B (A)‡ | 7 | 2 | 1 | 9 | 2 | 1 | 5 | 2.00 × 10⁻⁵ | 9.52 × 10⁻⁵ | 10.9 | 18.3 | M
CCND3‡ | 7 | 1 | 2 | 7 | 1 | 2 | 6 | 2.80 × 10⁻⁵ | 1.27 × 10⁻⁴ | 6.55 | 36.3 | WT
CREBBP‡ | 20 | 7 | 4 | 24 | 7 | 4 | 9 | 1.00 × 10⁻⁴ | 4.35 × 10⁻⁴ | 2.72 | 6.04 | Both
HIST1H1C‡ | 9 | 0 | 0 | 10 | 0 | 0 | 6 | 1.80 × 10⁻⁴ | 7.50 × 10⁻⁴ | 11.9 | 0.00 | Both
B2M‡ | 7 | 0 | 0 | 7 | 0 | 0 | 4 | 3.90 × 10⁻⁴ | 1.56 × 10⁻³ | 16.6 | 0.00 | WT
ETS1‡ | 10 | 1 | 0 | 10 | 1 | 0 | 4 | 4.10 × 10⁻⁴ | 1.58 × 10⁻³ | 5.76 | 0.00 | WT
CARD11‡ | 14 | 3 | 0 | 14 | 3 | 0 | 3 | 1.90 × 10⁻³ | 7.04 × 10⁻³ | 3.37 | 0.00 | Both
FAT2‡§ | 2 | 1 | 0 | 2 | 1 | 0 | 2 | 6.30 × 10⁻³ | 2.25 × 10⁻² | 0.128 | 0.00 | –
IRF4‡§ | 9 | 4 | 0 | 26 | 5 | 0 | 5 | 7.00 × 10⁻³ | 2.41 × 10⁻² | 0.569 | 0.00 | Both
FOXO1‡ | 8 | 4 | 0 | 10 | 4 | 0 | 4 | 7.60 × 10⁻³ | 2.53 × 10⁻² | 4.02 | 0.00 | –
STAT3 | 9 | 0 | 0 | 9 | 0 | 0 | 4 | 2.19 × 10⁻² | 6.08 × 10⁻² | – | – | Both
RAPGEF1 | 8 | 3 | 0 | 10 | 3 | 0 | 3 | 2.98 × 10⁻² | 7.45 × 10⁻² | – | – | WT
ABCA7 | 12 | 3 | 0 | 15 | 3 | 0 | 2 | 7.76 × 10⁻² | 1.67 × 10⁻¹ | – | – | WT
RNF213 | 10 | 8 | 0 | 10 | 8 | 0 | 2 | 7.87 × 10⁻² | 1.67 × 10⁻¹ | – | – | –
MUC16 | 17 | 12 | 0 | 39 | 25 | 0 | 2 | 8.32 × 10⁻² | 1.73 × 10⁻¹ | – | – | –
HDAC7 | 8 | 4 | 0 | 8 | 4 | 0 | 2 | 8.94 × 10⁻² | 1.82 × 10⁻¹ | – | – | WT
PRKDC | 7 | 3 | 0 | 7 | 4 | 0 | 2 | 1.06 × 10⁻¹ | 2.05 × 10⁻¹ | – | – | –
SAMD9 | 9 | 2 | 0 | 9 | 2 | 0 | 2 | 1.79 × 10⁻¹ | 3.01 × 10⁻¹ | – | – | –
TAF1 | 10 | 0 | 0 | 10 | 0 | 0 | 2 | 3.03 × 10⁻¹ | 4.74 × 10⁻¹ | – | – | –
PIM1 | 20 | 19 | 0 | 33 | 34 | 0 | 11 | 3.40 × 10⁻¹ | 5.23 × 10⁻¹ | – | – | WT
COL4A2 | 8 | 2 | 0 | 8 | 2 | 0 | 2 | 7.64 × 10⁻¹ | 8.99 × 10⁻¹ | – | – | –
EP300 | 8 | 7 | 1 | 8 | 7 | 1 | 3 | 9.54 × 10⁻¹ | 1.00 | – | – | WT

Individual cases with non-synonymous (NS), synonymous (S) and truncating (T) mutations and the total number of mutations of each class are shown separately because some genes contained multiple mutations in the same case. The P values indicated in bold are the upper limit on the P value for that gene determined with the approach described in ref. 19 (see Supplementary Methods), q is the Benjamini-corrected q value, and NS SP and T SP refer to selective pressure estimates from this model for the acquisition of non-synonymous or truncating mutations, respectively. Genes with a superscript of either A or G (shown here as (A) or (G)) were found to have mutations significantly enriched in ABC or GCB cases, respectively (P < 0.05, Fisher’s exact test).
* Additional somatic mutations identified in larger cohorts and insertion/deletion mutations are not included in this total.
† ‘Both’ indicates that we observed separate cases in which skewed expression was seen but where this skew was not consistent for the mutant or wild-type allele.
‡ Genes significant at a false discovery rate of 0.03. SNVs in BCL2 and previously confirmed hot spot mutations in EZH2 and CD79B are probably somatic in these samples based on published observations of others.
§ Selective pressure estimates are both < 1, indicating purifying selection rather than positive selection acting on this gene.
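The q values in Table 1 are described only as Benjamini-corrected. As a minimal sketch of one common reading, the Benjamini–Hochberg step-up adjustment, the following may help readers see how raw P values map to q values. The function name and the example P values are illustrative assumptions and are not taken from the study, whose exact procedure is given in its Supplementary Methods.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that each q is the minimum of
    # p * m / rank over this rank and all larger ranks (keeps q monotone
    # and caps it at 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical P values, for illustration only.
example_p = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(example_p))
```

Because the adjustment depends on the total number of genes tested, which the table does not state, the sketch should not be expected to reproduce the published q values from the listed rows alone.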

RESEARCH ARTICLE

300 | NATURE | VOL 476 | 18 AUGUST 2011


~50% more labelled cells at P1 than did controls, although these cells migrated normally (Fig. 3d, e and data not shown). Thus, mutations in DDX3X may contribute to WNT-subgroup medulloblastoma by increasing LRLP proliferation rather than perturbing the migration of their daughter cells. Notably, comparable knockdown in utero of Mll2, Gabrg1 and Kdm6a, which were selectively mutated in non-WNT medulloblastomas, had no apparent impact on LRLPs, supporting the value of our assay for assessing WNT-subgroup-specific mutations and underscoring the importance of cell context for functional studies of genes mutated in cancer subgroups.

PIK3CA mutations promote WNT-subgroup medulloblastoma

Cancer-associated, activating mutations in PIK3CA were detected in a single case each of WNT-subgroup (PIK3CA(Q546K)), SHH-subgroup (PIK3CA(H1047R)) and subgroup-4 (PIK3CA(N345K)) medulloblastoma (Fig. 1 and Supplementary Fig. 23). Although PIK3CA mutations are common in adult cancers40 and reported in medulloblastoma41, their role in tumorigenesis remains controversial. In particular, it is not known whether these mutations initiate or progress cancer. To test this, we generated mice that express a conditional allele of the Pik3caE545K mutation. Mice harbouring Pik3caE545K or Pik3caE545K and Tp53flx/flx were bred with Blbp-Cre, which drives efficient recombination in LRLPs5. Blbp-Cre;Pik3caE545K mice, with or without Tp53flx/flx, survived tumour free for a median of 212 days with no evidence of aberrant LRLP migration (Fig. 4a and data not shown). In stark contrast, 100% (n = 11/11) of Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53+/flx;Pik3caE545K mice developed WNT-subgroup medulloblastomas by 3 months of age; only 4% (n = 2/54) of Blbp-Cre;Ctnnb1+/lox(Ex3);Tp53+/flx mice develop WNT-subgroup medulloblastoma by 11 months (Fig. 4a, b). Pik3ca wild-type and mutant mouse medulloblastomas displayed similar ‘classic’ histologies and nuclear Ctnnb1+, but Pik3caE545K mutant tumours contained greater AKT pathway activity as measured by pS6 and p4EBP1 immunostaining. Thus mutations in PIK3CA probably activate the AKT pathway to progress, rather than initiate, WNT-subgroup medulloblastoma.

SHH-subgroup medulloblastomas

Four of thirteen SHH-subgroup medulloblastomas contained expected biallelic inactivating alterations in SUFU or PTCH1. What drives aberrant SHH signals in the remaining cases remains unclear. These tumours contained mutations in MLL2, TP53 and PTEN that have been reported previously in medulloblastoma42, but these mutations occur in other subgroups and are not known to activate SHH signals. Two SHH-subgroup tumours (nos 11 and 12) contained identical novel T48M mutations in the GABAA (γ-aminobutyric acid, subtype A) receptor, γ1, which is predicted to be deleterious (Fig. 1 and Supplementary Table 9). Disruption of GABAA receptors can enhance neural stem cell proliferation43, suggesting that these mutations might deregulate the proliferation of GNPs that generate SHH-subgroup medulloblastomas.

Discussion

We have identified several new, recurrent, somatic mutations in specific subgroups of medulloblastoma. Alterations affecting EZH2, KDM6A, CHD7 and ZMYM3 seem to disrupt chromatin marking of genes in subgroup-3 and -4 tumours. Further epigenetic studies will be required to uncover the identity of these genes, but evidence suggests these may include OTX2, MYC and MYCN44,45. As amplification of these genes was detected almost exclusively in subgroup-3 and -4 tumours that lacked mutations in KDM6A, CHD7 or ZMYM3, it is tempting to speculate that these genetic alterations target common transforming pathways. A recent study detected recurrent mutations in three other chromatin remodellers in medulloblastoma42: SMARCA4, MLL2 and MLL3, but this study did not include details of tumour subgroup. Here, we show that mutations in SMARCA4, CREBBP, TRRAP and MED13 are enriched in WNT-subgroup medulloblastomas, thereby uncovering potential cooperative mutations in chromatin remodellers and their binding-partner oncogene, CTNNB1. Thus, disruptions in the epigenetic machinery of medulloblastoma are likely to be subgroup specific and may cooperate with other oncogenic mutations. The low incidence of MLL2 mutations detected in our study relative to previous work42 probably reflects differences in study populations (see Supplementary Results).

Although medulloblastoma is more prevalent in males, especially with subgroup-3 and -4 disease46, the reason for this sex bias is unknown. One potential explanation is the location of medulloblastoma oncogenes or tumour suppressor genes on chromosome X47. Three of the most recurrently mutated genes detected in our study are located on chromosome X, of which two (ZMYM3 and KDM6A) were observed almost exclusively in males. Mutation in these genes might explain some of the male sex bias in medulloblastoma. The third mutated X chromosome gene, DDX3X, is more likely to be a WNT-subgroup medulloblastoma oncogene. Three of four female medulloblastomas carried heterozygous mutations in DDX3X that escape X inactivation25, and our functional data indicate that mutations in this gene provide a proliferative advantage to LRLPs that generate these tumours.

Our findings also have important implications for drug development. Inhibitors of the epigenetic machinery, especially of those components that maintain H3K27me3 (for example, the EZH2 methylase), may be useful treatments for subgroup-3 and -4 disease. These tumours include the most aggressive forms of medulloblastoma, for which treatment options are limited. Mutations that activate PIK3CA and DDX3X in WNT-subgroup tumours might also be targeted with novel therapeutic strategies48,49. Future clinical trials of drugs that target these mutant proteins must recruit the appropriate patient populations, as we demonstrate that mutations show subgroup specificity in medulloblastoma. Our accurate mouse models of WNT-subgroup, SHH-subgroup and subgroup-3 medulloblastoma should help with future studies of the biological and therapeutic importance of the novel genetic alterations described in this study.

METHODS SUMMARY

Human tumour and matched blood samples were obtained with informed consent through an institutional review board approved protocol at St Jude Children’s Research Hospital. WGS and analysis of WGS data were performed as previously described50. Details of sequence coverage, custom capture and other validation procedures are provided in Supplementary Information (Supplementary Tables 12–15). Immunohistochemistry and immunofluorescence of human and mouse tissues were performed using routine techniques and primary antibodies of the appropriate tissues as described (Supplementary Methods). Medulloblastoma mRNA and DNA profiles were generated using Affymetrix U133v2 and SNP 6.0 arrays, respectively (Supplementary Methods). Real-time PCR with reverse transcriptase (RT–PCR) analysis of genes targeted in mouse LRLPs by shRNAs was performed as described previously32. LRLPs were isolated and transduced with indicated lentiviruses in stem cell cultures or targeted in utero with shRNAs or mutant cDNA sequences by electroporation as described5 (Supplementary Information). Mice harbouring a Cre-inducible Pik3caE545K allele were generated using homologous recombination: a lox-puro-STOP-lox cassette was introduced immediately upstream of the exon containing the initiation codon, and exon 9 was replaced with an exon containing the E545K mutation. Pik3caE545K mice were bred with Blbp-Cre;Ctnnb1lox(Ex3)/lox(Ex3) and Tp53flx/flx mice to generate progeny of the appropriate genotype and subjected to clinical surveillance.

Figure 4 | Pik3caE545K accelerates but does not initiate WNT-subgroup medulloblastoma. a, Tumour-free survival (%) against time (days) of mice of the indicated genotype: Ctnnb1+/lox(Ex3);Tp53+/flx;Pik3ca+/flx (n = 11), Ctnnb1+/lox(Ex3);Tp53flx/flx (n = 55), Ctnnb1+/lox(Ex3);Tp53+/flx (n = 54), Pik3ca+/flx (n = 11) and Pik3ca+/flx;Tp53+/flx (n = 15). All mice carry the Blbp-cre allele. Log rank P < 0.0001. b, Haematoxylin and eosin (H&E) and immunohistochemical stains (Ctnnb1, pS6(Ser 235/236) and p4EBP1(Thr 37/46)) of indicated tumours. Scale bar, 50 µm.

ARTICLE RESEARCH

2 AUGUST 2012 | VOL 488 | NATURE | 47

Received 13 January; accepted 2 May 2012.

Published online 20 June 2012.

1. Central Brain Tumor Registry of the United States. Statistical report: primary brain tumors in the United States, 1995–1999. http://www.cbtrus.org/reports/2002/2002report.pdf (CBTRUS, 2006).

2. Taylor, M. D. et al. Molecular subgroups of medulloblastoma: the current consensus. Acta Neuropathol. 123, 465–472 (2012).

3. Schuller, U. et al. Acquisition of granule neuron precursor identity is a critical determinant of progenitor cell competence to form Shh-induced medulloblastoma. Cancer Cell 14, 123–134 (2008).

4. Yang, Z. J. et al. Medulloblastoma can be initiated by deletion of Patched in lineage-restricted progenitors or stem cells. Cancer Cell 14, 135–145 (2008).

5. Gibson, P. et al. Subtypes of medulloblastoma have distinct developmental origins. Nature 468, 1095–1099 (2010).

6. Kawauchi, D. et al. A mouse model of the most aggressive subgroup of human medulloblastoma. Cancer Cell 21, 168–180 (2012).

7. Mulhern, R. K. et al. Neurocognitive consequences of risk-adapted therapy for childhood medulloblastoma. J. Clin. Oncol. 23, 5511–5519 (2005).

8. Wang, J. et al. CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nature Methods 8, 652–654 (2011).

9. Rausch, T. et al. Genome sequencing of pediatric medulloblastoma links catastrophic DNA rearrangements with TP53 mutations. Cell 148, 59–71 (2012).

10. Castellino, R. C. et al. Heterozygosity for Pten promotes tumorigenesis in a mouse model of medulloblastoma. PLoS ONE 5, e10849 (2010).

11. Hahn, H. et al. Mutations of the human homolog of Drosophila patched in the nevoid basal cell carcinoma syndrome. Cell 85, 841–851 (1996).

12. Malkin, D. et al. Germ line p53 mutations in a familial syndrome of breast cancer, sarcomas, and other neoplasms. Science 250, 1233–1238 (1990).

13. Hamilton, S. R. et al. The molecular basis of Turcot’s syndrome. N. Engl. J. Med. 332, 839–847 (1995).

14. Taylor, M. D. et al. Medulloblastoma in a child with Rubenstein-Taybi syndrome: case report and review of the literature. Pediatr. Neurosurg. 35, 235–238 (2001).

15. Mikkelsen, T. S. et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553–560 (2007).

16. Cao, R. et al. Role of histone H3 lysine 27 methylation in Polycomb-group silencing. Science 298, 1039–1043 (2002).

17. Czermin, B. et al. Drosophila Enhancer of Zeste/ESC complexes have a histone H3 methyltransferase activity that marks chromosomal Polycomb sites. Cell 111, 185–196 (2002).

18. Agger, K. et al. UTX and JMJD3 are histone H3K27 demethylases involved in HOX gene regulation and development. Nature 449, 731–734 (2007).

19. Schnetz, M. P. et al. Genomic distribution of CHD7 on chromatin tracks H3K4 methylation patterns. Genome Res. 19, 590–601 (2009).

20. Sauvageau, M. & Sauvageau, G. Polycomb group proteins: multi-faceted regulators of somatic stem cells and cancer. Cell Stem Cell 7, 299–313 (2010).

21. Morin, R. D. et al. Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin. Nature Genet. 42, 181–185 (2010).

22. Kleer, C. G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl Acad. Sci. USA 100, 11606–11611 (2003).

23. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).

24. van Haaften, G. et al. Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer. Nature Genet. 41, 521–523 (2009).

25. Yang, F., Babak, T., Shendure, J. & Disteche, C. M. Global survey of escape from X inactivation by RNA-sequencing in mouse. Genome Res. 20, 614–622 (2010).

26. Christensen, J. et al. RBP2 belongs to a family of demethylases, specific for tri- and dimethylated lysine 4 on histone 3. Cell 128, 1063–1076 (2007).

27. Lee, M. G., Wynder, C., Cooch, N. & Shiekhattar, R. An essential role for CoREST in nucleosomal histone 3 lysine 4 demethylation. Nature 437, 432–435 (2005).

28. Mosimann, C., Hausmann, G. & Basler, K. β-Catenin hits chromatin: regulation of Wnt target gene activation. Nature Rev. Mol. Cell Biol. 10, 276–286 (2009).

29. Hecht, A., Vleminckx, K., Stemmler, M. P., van Roy, F. & Kemler, R. The p300/CBP acetyltransferases function as transcriptional coactivators of β-catenin in vertebrates. EMBO J. 19, 1839–1850 (2000).

30. Barker, N. et al. The chromatin remodelling factor Brg-1 interacts with β-catenin to promote target gene activation. EMBO J. 20, 4935–4943 (2001).

31. Carrera, I., Janody, F., Leeds, N., Duveau, F. & Treisman, J. E. Pygopus activates Wingless target gene transcription through the mediator complex subunits Med12 and Med13. Proc. Natl Acad. Sci. USA 105, 6644–6649 (2008).

32. Thompson, M. C. et al. Genomics identifies medulloblastoma subgroups that are enriched for specific genetic alterations. J. Clin. Oncol. 24, 1924–1931 (2006).

33. Kool, M. et al. Integrated genomics identifies five medulloblastoma subtypes with distinct genetic profiles, pathway signatures and clinicopathological features. PLoS ONE 3, e3088 (2008).

34. Orsulic, S., Huber, O., Aberle, H., Arnold, S. & Kemler, R. E-cadherin binding prevents β-catenin nuclear localization and β-catenin/LEF-1-mediated transactivation. J. Cell Sci. 112, 1237–1245 (1999).

35. Risinger, J. I., Berchuck, A., Kohler, M. F. & Boyd, J. Mutations of the E-cadherin gene in human gynecologic cancers. Nature Genet. 7, 98–102 (1994).

36. Becker, K.-F. et al. E-Cadherin gene mutations provide clues to diffuse type gastric carcinomas. Cancer Res. 54, 3845–3852 (1994).

37. Pek, J. W. & Kai, T. DEAD-box RNA helicase Belle/DDX3 and the RNA interference pathway promote mitotic chromosome segregation. Proc. Natl Acad. Sci. USA 108, 12007–12012 (2011).

38. Lai, M. C., Chang, W. C., Shieh, S. Y. & Tarn, W. Y. DDX3 regulates cell growth through translational control of cyclin E1. Mol. Cell. Biol. 30, 5444–5453 (2010).

39. Schroder, M. Human DEAD-box protein 3 has multiple functions in gene regulation and cell cycle control and is a prime target for viral manipulation. Biochem. Pharmacol. 79, 297–306 (2010).

40. Samuels, Y. et al. High frequency of mutations of the PIK3CA gene in human cancers. Science 304, 554 (2004).

41. Broderick, D. K. et al. Mutations of PIK3CA in anaplastic oligodendrogliomas, high-grade astrocytomas, and medulloblastomas. Cancer Res. 64, 5048–5050 (2004).

42. Parsons, D. W. et al. The genetic landscape of the childhood cancer medulloblastoma. Science 331, 435–439 (2011).

43. Andang, M. et al. Histone H2AX-dependent GABAA receptor regulation of stem cell proliferation. Nature 451, 460–464 (2008).

44. Pasini, D. et al. Coordinated regulation of transcriptional repression by the RBP2 H3K4 demethylase and Polycomb-Repressive Complex 2. Genes Dev. 22, 1345–1355 (2008).

45. Khan, A., Shover, W. & Goodliffe, J. M. Su(z)2 antagonizes auto-repression of Myc in Drosophila, increasing Myc levels and subsequent trans-activation. PLoS ONE 4, e5076 (2009).

46. Northcott, P. A. et al. Medulloblastoma comprises four distinct molecular variants. J. Clin. Oncol. 29, 1408–1414 (2011).

47. Spatz, A., Borg, C. & Feunteun, J. X-chromosome genetics and human cancer. Nature Rev. Cancer 4, 617–629 (2004).

48. Lindqvist, L. et al. Selective pharmacological targeting of a DEAD box RNA helicase. PLoS ONE 3, e1583 (2008).

49. Engelman, J. A. Targeting PI3K signalling in cancer: opportunities, challenges and limitations. Nature Rev. Cancer 9, 550–562 (2009).

50. Zhang, J. et al. The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature 481, 157–163 (2012).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This research was supported as part of the St Jude Children’s Research Hospital, Washington University Pediatric Cancer Genome Project. This work was supported by grants from the National Institutes of Health (R01CA129541, P01CA96832 and P30CA021765; R.J.G.), the Collaborative Ependymoma Research Network (CERN), Musicians against Childhood Cancer (MACC), The Noyes Brain Tumour Foundation, and by the American Lebanese Syrian Associated Charities (ALSAC). We are grateful to S. Temple for the gift of reagents and the staff of the Hartwell Center for Bioinformatics and Biotechnology and ARC at St Jude Children’s Research Hospital for technical assistance.

Author Contributions G.R., M.P., T.A.K., C.L., X.C., L.D., T.N.P., E.H., L.W., X.Z., N.Ch., R.H., N.Cu., R.T., J.W., G.W., M.R., X.H., J.B., P.G., J.M., J.E., B.V., A.O.-T., T.L., S.Po., S.Pa., D.Z., D.K. and D.F. contributed to the design and conduct of experiments and to the writing. S.J.B., R.K., M.F.R., R.S.F., L.L.F., D.J.D., K.O. and E.R.M. contributed to experimental design and to the writing. A.G., D.W.E., C.C.L., E.B., T.H., S.G. and R.C. provided clinical expertise. R.K.W., J.R.D., J.Z. and R.J.G. conceived the research and contributed to the design, direction and reporting of the study.

Author Information Sequence and SNP array data were deposited in dbGaP under accession number phs000409 and in the Sequence Read Archive (SRA) under accession number SRP008292. Reprints and permissions information is available at www.nature.com/reprints. This paper is distributed under the terms of the Creative Commons Attributions-Non-Commercial-Share Alike licence, and is freely available to all readers at www.nature.com/nature. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to R.J.G. ([email protected]) or J.Z. ([email protected]).


chromosome (BAC) clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.

With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations seemed to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours showed copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Supplementary Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common, mechanism by which MLL2 function is lost.

MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Fig. 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), indicating to us that MLL2 is a tumour suppressor of significance in NHL.

Recurrent point mutations in MEF2B

Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense mutations. One such gene was MEF2B, which had not previously been linked to lymphoma. We found that 20 (15.7%) cases had MEF2B cSNVs and 4 (3.1%) cases had MEF2C cSNVs. All cSNVs detected by RNA-seq affected either the MADS box or MEF2 domains. To determine the frequency and scope of MEF2B mutations, we Sanger-sequenced exons 2 and 3 in 261 primary FL samples; 259 DLBCL primary tumours; 17 cell lines; 35 cases of assorted NHL (IBL, composite FL and PBMCL); and eight non-malignant centroblast samples. We also used a capture strategy (Supplementary Methods) to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL, 12.67%; and 35 FL, 15.33%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. Of the variants, 55 (80%) affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table 11; Fig. 3b). Each patient generally had a single MEF2B variant and we observed relatively few (eight in total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table 12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B have a role unique to the development of GCB DLBCL and FL (Fig. 2).

[Figure 2: heat map of gene-by-gene mutation relationships. Legend: ABC enrichment, GCB enrichment; case classes ABC, GCB, U, FL; P-value bins <0.05, 0.1–0.05 and 0.3–0.1; bar scale "Cases" (10–40). Genes on the axes, in order: BCL2s, EZH2, GNA13, SGK1, BCL2, TP53, MEF2B, TNFRSF14, FOXO1, MLL2, CREBBP, EP300, B2M, IRF8, BTG2, BTG1, KLHL6, CCND3, HIST1H1C, ETS1, STAT3, CD70, CD58, TMEM30A, FAS, CARD11, TNFAIP3, BCL6s, CD79B, MYD88.]

Figure 2 | Overview of mutations and potential cooperative interactions in NHL. This heat map displays possible trends towards co-occurrence (red) and mutual exclusion (blue) of somatic mutations and structural rearrangements. Colours were assigned by taking the minimum value of a left- and right-tailed Fisher’s exact test. To capture trends a P-value threshold of 0.3 was used, with the darkest shade of the colour indicating those meeting statistical significance (P ≤ 0.05). The relative frequency of mutations in ABC (blue), GCB (red), unclassifiable (black) DLBCLs and FL (yellow) cases is shown on the left. Genes were arranged with those having significant (P < 0.05, Fisher’s exact test) enrichment for mutations in ABC cases (blue triangle) towards the top (and left) and those with significant enrichment for mutations in GCB cases (red triangle) towards the bottom (and right). The total number of cases in which each gene contained either cSNVs or confirmed somatic mutations is shown at the top. The cluster of blue squares (upper-right) results from the mutual exclusion of the ABC-enriched mutations (for example, MYD88, CD79B) from the GCB-enriched mutations (for example, EZH2, GNA13). Presence of structural rearrangements involving the two oncogenes BCL6 and BCL2 (indicated as BCL6s and BCL2s) was determined with FISH techniques using break-apart probes (Supplementary Methods).
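The colour assignment described in the Figure 2 legend (the minimum of a left- and a right-tailed Fisher’s exact test, a trend threshold of 0.3 and a significance threshold of 0.05) can be sketched as follows. This is only an illustration under stated assumptions: the 2 × 2 table layout, the function names and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_tails(a, b, c, d):
    """Left- and right-tailed Fisher's exact P values for the 2x2 table
    [[a, b], [c, d]], where a counts cases mutated in both genes,
    b and c cases mutated in only one gene, and d cases mutated in neither."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of observing x double-mutant cases.
        return comb(row1, x) * comb(total - row1, col1 - x) / denom

    lo = max(0, col1 - (total - row1))
    hi = min(row1, col1)
    left = sum(prob(x) for x in range(lo, a + 1))   # this few overlaps or fewer
    right = sum(prob(x) for x in range(a, hi + 1))  # this many overlaps or more
    return left, right


def classify_pair(a, b, c, d, trend=0.3, significant=0.05):
    """Label a gene pair using the smaller of the two tail P values."""
    left, right = fisher_tails(a, b, c, d)
    p = min(left, right)
    if p > trend:
        return "none", p
    direction = "co-occurrence" if right < left else "mutual exclusion"
    strength = "significant" if p <= significant else "trend"
    return f"{direction} ({strength})", p


# Hypothetical counts: 0 cases with both mutations, 10 with only gene A,
# 12 with only gene B and 20 with neither.
print(classify_pair(0, 10, 12, 20))
```

In this sketch a small right tail corresponds to the red (co-occurrence) cells and a small left tail to the blue (mutual exclusion) cells of the heat map.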

[Figure 3: a, MLL2 domain map (PHD, PHD, HMG box, COG5141, FYRN, FYRC, SET) on a 0–5,500 bp scale. b, MEF2B domain map (MADS box, MEF2) on a 0–350 bp scale, with labelled variants K4E, Y69H, Y69C, N81K, N81Y, D83G, D83A and D83V.]

Figure 3 | Summary and effect of somatic mutations affecting MLL2 and MEF2B. a, Re-sequencing the MLL2 locus in 89 samples revealed mainly nonsense (red circles) and frameshift-inducing indel mutations (orange triangles; inverted triangles for insertions and upright triangles for deletions). A smaller number of non-synonymous somatic mutations (green circles) and point mutations or deletions affecting splice sites (yellow stars) were also observed. All of the non-synonymous point mutations affected a residue within either the catalytic SET domain, the FYRC domain (FY-rich carboxy-terminal domain) or PHD zinc finger domains. The effect of these splice-site mutations on MLL2 splicing was also explored (Supplementary Figure 7). b, The cSNVs and somatic mutations found in MEF2B in all FL and DLBCL cases sequenced are shown with the same symbols. Only the amino acids with variants in at least two patients are labelled. cSNVs were most prevalent in the first two protein-coding exons of MEF2B (exons 2 and 3). The crystal structure of MEF2 bound to EP300 supports the idea that two of the mutated sites (L67 and Y69) are important in the interaction between these proteins (Supplementary Figure 8 and Supplementary Discussion)50.

Table 2 | Summary of types of MLL2 somatic mutations

Type / Sample                   FL      DLBCL   DLBCL cell-line   Centroblast
Truncation                      18      4       7                 0
Indel with frameshift           22      8       6                 0
Splice site                     4       2       0                 0
SNV                             3       2       2                 0
Any mutation/number of cases    31/35   12/37   10/17             0/8
Percentage                      89      32      59                0

ARTICLE RESEARCH

18 AUGUST 2011 | VOL 476 | NATURE | 301

S38 NATURE REPRINT COLLECTION Epigenetics 890 NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER 2012 | www.nature.com/naturechemicalbiology

ARTICLE PUBLISHED ONLINE: 30 SEPTEMBER 2012 | DOI: 10.1038/NCHEMBIO.1084

Trimethylation of H3K27 is a transcriptionally repressive epigenetic mark that has been causally associated with a number of hematologic and solid human cancers. Methylation of H3K27 is catalyzed by polycomb repressive complex 2 (PRC2), containing the enzymatic subunit EZH2 or EZH1 (refs. 1,2). Reversal of H3K27 methylation is catalyzed by the histone demethylases UTX and JMJD3 (refs. 3–7). Several molecular mechanisms leading to a hypertrimethylated state of H3K27 are seen among human cancers. For example, EZH2 itself and other PRC2 subunits are amplified and/or overexpressed in subsets of several human cancers including breast, prostate and lymphoma (refs. 8–13). Loss-of-function mutations in the demethylase UTX are found in subsets of myeloma, renal and esophageal cancers (ref. 14), and overexpression of the PRC2-associated protein PHF19 is observed in a number of solid tumors (ref. 15).

Most recently, point mutations at Tyr641 (Y641F, Y641N, Y641S and Y641H) have been identified in 8–24% of non-Hodgkin lymphomas in several studies (refs. 16–18). The EZH2 mutation is always heterozygous in primary tumor samples from these patients. Although these mutations were originally characterized as loss-of-function mutations, our group later demonstrated that the mutations in fact change the substrate specificity of EZH2 (ref. 19). The wild-type enzyme is most efficient as a monomethyltransferase and wanes in catalytic efficiency for the second and especially the third methylation reaction. In contrast, all of the mutant enzymes show the exact opposite order of substrate use; they are essentially inactive as monomethyltransferases but are effective at catalyzing the reaction from mono- to dimethyl and are very efficient at catalyzing the reaction from di- to trimethyl. We, and subsequently others (ref. 20), demonstrated that lymphoma cells heterozygous for these Tyr641 mutants show hypertrimethylation of H3K27 compared to EZH2 wild-type lymphoma cells; the hypertrimethylation results from the enzymatic coupling between wild-type (to drive monomethylation) and mutant (to drive di- and trimethylation) EZH2 in the heterozygous cells.
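The coupling argument can be made concrete with a toy steady-state model. This is only a sketch under strong assumptions: the three methylation steps are treated as first-order, demethylation is given a single uniform rate, a heterozygous cell is modelled as the average of the two enzymes, and every rate below is a hypothetical relative value chosen to mirror the qualitative ordering described in the text, not a measured constant.

```python
def steady_state_fractions(forward_rates, demethylation_rate):
    """Steady-state fractions of me0, me1, me2 and me3 for a linear chain.

    For a linear chain with forward rate f_i and a common reverse rate r,
    detailed balance gives p[i + 1] / p[i] = f_i / r.
    """
    weights = [1.0]
    for f in forward_rates:
        weights.append(weights[-1] * f / demethylation_rate)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical relative rates for me0->me1, me1->me2 and me2->me3.
WILD_TYPE = (1.0, 0.3, 0.05)   # best at the first step, weakest at the third
MUTANT = (0.0, 0.5, 1.0)       # inactive at the first step, best at the third
HETEROZYGOUS = tuple((w + m) / 2 for w, m in zip(WILD_TYPE, MUTANT))
DEMETHYLATION = 0.3            # hypothetical uniform removal rate

for label, rates in [("wild type", WILD_TYPE),
                     ("mutant only", MUTANT),
                     ("heterozygous", HETEROZYGOUS)]:
    me3 = steady_state_fractions(rates, DEMETHYLATION)[3]
    print(f"{label}: H3K27me3 fraction = {me3:.2f}")
```

In this toy setting the mutant alone produces no trimethylation because it cannot perform the first step, whereas the mixture of the two enzymes accumulates more H3K27me3 than the wild type alone, which is the behaviour the coupling hypothesis predicts.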

Additionally, a heterozygous EZH2 mutation within the SU(VAR)3–9, enhancer of zeste, trithorax (SET) domain at Ala677 (A677G) is seen both in the Pfeiffer cell line and in primary patient samples (refs. 18,21). Further investigation of this mutation indicates that it also results in increased H3K27me3 while decreasing H3K27me2 in vitro, similar to the Tyr641 mutations. However, at the biochemical level, the substrate specificity of this enzyme differs from that seen in the Tyr641 mutants. Specifically, in vitro assays demonstrate that the A677G mutant efficiently catalyzes all three H3K27 methylation steps, whereas the Tyr641 mutants preferentially catalyze the reaction from di- to trimethyl (ref. 21). This is another example of a heterozygous change-of-function point mutation within the EZH2 SET domain observed in lymphoma.

On the basis of this enzymatic coupling and the resultant hypertrimethylation of H3K27, we hypothesized that the hypertrimethylated H3K27 phenotype drives the lymphomagenic proliferation in these EZH2 mutant-bearing cells; the cells thus depend on EZH2 enzymatic activity for proliferation and survival. This hypothesis has not been adequately tested, however, until now. In this paper, we report the discovery of a potent and selective small-molecule inhibitor of EZH2, EPZ005687 (4). The ability of this compound to directly and selectively inhibit PRC2 enzymatic activity distinguishes it from DZNep, a compound that has been used previously to probe cellular EZH2 function. DZNep is an inhibitor of S-adenosylhomocysteine (SAH) hydrolase and is thought to inhibit and cause the degradation of the PRC2 complex by an indirect mechanism involving an increase in the cellular concentration of SAH, an inhibitory byproduct of cellular methyltransferase reactions (refs. 22,23).

Interpretation of cellular phenotypes caused by DZNep is complicated by DZNep’s ability to reduce methylation at multiple histone residues targeted by protein methyltransferases (PMTs) other than EZH2. In contrast, treatment of cells with EPZ005687 resulted in concentration-dependent ablation of H3K27 methylation without major decreases in any other histone methyl marks. When the compound was applied to lymphoma cells bearing an EZH2 Tyr641 or Ala677 mutation, concentration-dependent cell killing was observed. Unlike the potent cell killing seen for mutant-bearing lymphoma cell lines, EPZ005687 had minimal effects on

1 Epizyme, Inc., Cambridge, Massachusetts, USA. 2 These authors contributed equally to this work. * e-mail: [email protected]

A selective inhibitor of EZH2 blocks H3K27 methylation and kills mutant lymphoma cells

Sarah K Knutson 1,2, Tim J Wigle 1,2, Natalie M Warholic 1, Christopher J Sneeringer 1, Christina J Allain 1, Christine R Klaus 1, Joelle D Sacks 1, Alejandra Raimondi 1, Christina R Majer 1, Jeffrey Song 1, Margaret Porter Scott 1, Lei Jin 1, Jesse J Smith 1, Edward J Olhava 1, Richard Chesworth 1, Mikel P Moyer 1, Victoria M Richon 1, Robert A Copeland 1, Heike Keilhack 1, Roy M Pollock 1 & Kevin W Kuntz 1*

EZH2 catalyzes trimethylation of histone H3 lysine 27 (H3K27). Point mutations of EZH2 at Tyr641 and Ala677 occur in subpopulations of non-Hodgkin’s lymphoma, where they drive H3K27 hypertrimethylation. Here we report the discovery of EPZ005687, a potent inhibitor of EZH2 (Ki of 24 nM). EPZ005687 has greater than 500-fold selectivity against 15 other protein methyltransferases and has 50-fold selectivity against the closely related enzyme EZH1. The compound reduces H3K27 methylation in various lymphoma cells; this translates into apoptotic cell killing in heterozygous Tyr641 or Ala677 mutant cells, with minimal effects on the proliferation of wild-type cells. These data suggest that genetic alteration of EZH2 (for example, mutations at Tyr641 or Ala677) results in a critical dependency on enzymatic activity for proliferation (that is, the equivalent of oncogene addiction), thus portending the clinical use of EZH2 inhibitors for cancers in which EZH2 is genetically altered.

First published in Nature Chemical Biology 8, 890–896 (2012); doi:10.1038/nchembio.1084



the proliferation of lymphoma cell lines containing wild-type EZH2. Thus, EPZ005687 represents a chemical probe molecule for testing the dependency of cancer cell lines on EZH2 enzymatic activity. The data reported here provide substantial support for the hypothesis (described above) that EZH2 mutant-bearing lymphomas critically depend on EZH2 enzymatic activity for proliferation and survival.

RESULTS

Hit identification and optimization of target potency

High-throughput screening of a 175,000-compound subset of a chemical diversity library against recombinant wild-type PRC2, under balanced assay conditions (ref. 24), yielded inhibitors of varying chemotypes with half-maximum inhibitory concentration (IC50) values in the 3- to 30-μM range. The hits were divided into clusters on the basis of structural similarity, and an additional 5,000 compounds, representing 25 clusters, were mined from the remainder of the compound library and screened. The majority of the hits proved to be promiscuous inhibitors or had poor physicochemical properties (had poor solubility or were redox active, irreversible inhibitors or aggregate forming). However, this hit expansion identified a pyridone-containing chemotype, 1 (Fig. 1), which had an IC50 of 620 nM for wild-type PRC2. Early attempts to use 1 in cellular assays quickly identified poor solubility as a liability of this chemical series. A survey of vectors around the template showed that amines were tolerated in the 4-position of the phenyl ring (2), which led to large improvements in solubility with a slight increase in potency. We made a variety of 5,6-fused heteroaryl ring systems, and the indazole showed improved potency compared to the pyrazolopyridine (3 versus 2). Increasing the size of the lipophilic group off the 1-position of the indazole led to improved potency and provided EPZ005687 (4). A comprehensive exploration of the optimization of these inhibitors through iterative structure-activity relationship studies to yield EPZ005687 and related compounds will be presented in full in a separate publication. Subsequent to the discovery of EPZ005687, two patent applications were published containing EZH2 inhibitors with structures similar to those described here (refs. 25,26).

Biochemical characterization of EPZ005687

As illustrated in Figure 2a, EPZ005687 showed concentration-dependent inhibition of PRC2 enzymatic activity with an IC50 value of 54 ± 5 nM. Dual titration of the compound and the substrate S-adenosylmethionine (SAM) yielded Michaelis-Menten plots that were best fit by the steady-state equation for competitive inhibition, yielding a Ki value for EPZ005687 of 24 ± 7 nM. Consistent with competitive inhibition, the IC50 for EPZ005687 inhibition of PRC2 showed a positive linear dependence on SAM concentration (Fig. 2b). Dual titration of compound and oligonucleosome substrate resulted in Michaelis-Menten plots that were best described by the steady-state equation for noncompetitive inhibition, and the IC50 of EPZ005687 was independent of the oligonucleosome substrate concentration (Fig. 2c).
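The relationship between these IC50 and Ki values follows from the standard Cheng-Prusoff relation, sketched below. The function names are illustrative, and the Km of SAM is not given in this section, so substrate concentration is expressed as the ratio [S]/Km; the noncompetitive case assumes pure noncompetitive inhibition (equal affinity for free enzyme and enzyme-substrate complex).

```python
def ic50_competitive(ki_nm, substrate_over_km):
    """Cheng-Prusoff relation for a competitive inhibitor: IC50 = Ki * (1 + [S]/Km)."""
    return ki_nm * (1.0 + substrate_over_km)

def ic50_noncompetitive(ki_nm, substrate_over_km):
    """Pure noncompetitive inhibition: IC50 equals Ki at every substrate level."""
    return ki_nm

KI_NM = 24.0  # Ki reported in the text for EPZ005687

for ratio in (0.5, 1.0, 2.0, 5.0):
    print(f"[S]/Km = {ratio}: "
          f"competitive IC50 = {ic50_competitive(KI_NM, ratio):.0f} nM, "
          f"noncompetitive IC50 = {ic50_noncompetitive(KI_NM, ratio):.0f} nM")
```

Under balanced conditions ([S] = Km) a competitive inhibitor is expected to give IC50 = 2 × Ki, that is 48 nM for a Ki of 24 nM, which is broadly consistent with the measured 54 ± 5 nM given the stated uncertainties. The same factor of two relates the 620 nM IC50 quoted for compound 1 to the 310 nM Ki listed for it in Figure 1.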

The above data suggested that EPZ005687 binds in the SAM pocket of the EZH2 SET domain. Definitive proof of binding within the SAM pocket of the enzyme requires structural confirmation by crystallographic or NMR methods; however, the multisubunit nature of enzymatically active PRC2 presents a challenge with respect to structural biology. Indeed, though the structure of the embryonic ectoderm development (EED) subunit of PRC2 has been determined by high-resolution crystallography (refs. 27–29), there have been no literature reports of intact PRC2 or EZH2 crystal structures. Our own efforts to generate apo or cocrystal structures of the entire PRC2 complex with EPZ005687 were unsuccessful. Additionally, biophysical methods to confirm binding to the EZH2 subunit in isolation were not possible owing to the poor solubility, unstable tertiary structure and complete absence of enzymatic activity of the isolated subunit. Therefore, a Yonetani-Theorell analysis (ref. 30) was performed to determine whether EPZ005687 bound in a mutually exclusive fashion with SAH. In a previous report, we demonstrated that SAH inhibits EZH2 in a SAM-competitive manner with a Ki of 7.5 μM (ref. 31). The structural similarity between SAM and SAH implies overlapping binding sites for these two ligands, and this inference is confirmed by crystallographic analysis of SAM and SAH complexes of a number of PMTs (reviewed in ref. 32). Figure 2d shows a Yonetani-Theorell plot of the reciprocal of reaction velocity as a function of SAH concentration at several different EPZ005687 concentrations. The data were best fit with a series of parallel lines, indicative of mutually exclusive binding of the two inhibitors. Overall, these data suggest that EPZ005687 inhibits EZH2 by binding in the SAM pocket.
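Why parallel lines indicate mutually exclusive binding can be seen from the usual form of the Yonetani-Theorell equation, 1/v = (1/v0)(1 + [I]/Ki + [J]/Kj + [I][J]/(α·Ki·Kj)), where α is the interaction term and α → ∞ corresponds to mutual exclusivity. The sketch below evaluates the slope of 1/v against [I] at several fixed [J]. The apparent inhibition constants and concentrations are placeholders chosen for illustration (apparent values under assay conditions would differ from true Ki values), not data from the paper.

```python
import math

def reciprocal_velocity(i, j, ki, kj, alpha=math.inf, v0=1.0):
    """1/v for two inhibitors I and J; alpha = inf means they cannot bind together."""
    interaction = (i * j) / (alpha * ki * kj)   # evaluates to 0.0 when alpha is inf
    return (1.0 + i / ki + j / kj + interaction) / v0

def slope_vs_i(j, ki, kj, alpha=math.inf):
    """Slope of the 1/v versus [I] line at a fixed [J] (the relation is linear in [I])."""
    return reciprocal_velocity(1.0, j, ki, kj, alpha) - reciprocal_velocity(0.0, j, ki, kj, alpha)

KI_SAH_UM = 7.5      # placeholder apparent constant for I = SAH, in uM
KJ_EPZ_UM = 0.024    # placeholder apparent constant for J = EPZ005687, in uM
EPZ_LEVELS_UM = (0.0, 0.025, 0.05, 0.1)  # hypothetical fixed EPZ005687 concentrations

exclusive = [slope_vs_i(j, KI_SAH_UM, KJ_EPZ_UM) for j in EPZ_LEVELS_UM]
independent = [slope_vs_i(j, KI_SAH_UM, KJ_EPZ_UM, alpha=1.0) for j in EPZ_LEVELS_UM]

print("mutually exclusive (alpha = inf):", [f"{s:.4f}" for s in exclusive])
print("independent binding (alpha = 1): ", [f"{s:.4f}" for s in independent])
```

With α = ∞ the slope is the same at every EPZ005687 concentration, so the lines are parallel and only the intercept shifts. With a finite α the slope grows with [J] and the lines converge, which is the pattern expected if both inhibitors could occupy the enzyme at once.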

EPZ005687 is a potent and selective inhibitor of PRC2 activity. We tested the activity of the compound against a panel of 15 other human PMTs and 6 EZH2 enzymes with point mutations in the SET domain at Tyr641 or Ala677. As illustrated in the ligand affinity map (Fig. 3), EPZ005687 had >500-fold selectivity against all of the tested PMTs, with the exception of the closely related PRC2 complex containing EZH1 in place of EZH2. The selectivity of EPZ005687 was further evaluated by measuring its ability to displace radioligands from 77 human ion channels and G protein-coupled receptors. At a concentration of 10 μM, EPZ005687 did not displace radioligands from most of the targets tested. Radioligands for only four targets were displaced by more than 50% (Supplementary Results, Supplementary Table 1), and the lowest IC50 extrapolated for any of these targets was 1.5 μM, indicating a selectivity of >60-fold.

EPZ005687 also showed ~50-fold selectivity for EZH2 over EZH1-containing PRC2 (ΔΔG_binding > 2 kcal mol⁻¹; Supplementary Fig. 1 and Supplementary Table 1). The affinity of EPZ005687 was similar (within a two-fold range) for PRC2 complexes containing wild-type and Tyr641 mutant EZH2. In contrast, the compound had significantly greater affinity for the A677G mutant enzyme (5.4-fold; P < 0.05). These findings were consistent across several hundred compounds within this chemotype series. Taken together with our previous demonstration that the Km of SAM is unaffected by the mutations (refs. 19,33), this observation implies that the structural recognition elements of the indazole series do not differ between

[Figure 1 graphic: chemical structures of compounds 1–4. (1) PRC2 Ki = 310 nM; (2) PRC2 Ki = 180 nM; (3) PRC2 Ki = 80 nM; EPZ005687 (4) PRC2 Ki = 24 nM.]

Figure 1 | Chemical structures of PRC2 inhibitors. Wild-type EZH2-containing PRC2 Ki values shown are the mean of at least two independent experiments, with each experiment run in duplicate.


and selective DOT1L inhibitor, was used to elucidate the causal role of DOT1L enzymatic activity in MLL-rearranged leukemia (ref. 34).

In the present work, we identified EPZ005687 as a potent and selective inhibitor of wild-type and mutant EZH2-containing PRC2 enzymatic activity. We showed that the compound selectively inhibits H3K27 methylation in cells and that this translated into selective cell killing for lymphoma cells that contain heterozygous EZH2 mutations at Tyr641 or Ala677. These data established a critical and unique dependency on PRC2 enzymatic activity for the lymphoma cell lines that bear these EZH2 mutations. This dependency is equivalent to the concept of oncogene addiction, in which cells become abnormally dependent on the biochemical activity of a specific oncogene product for growth, survival or both, such that ablation of the oncogene is cytotoxic in the genetically altered cells but inconsequential to growth of normal cells. The present results provide a compelling foundation for the clinical use of selective EZH2 inhibitors for the treatment of mutant-bearing lymphomas. The current compound represents a chemical biological probe for in vitro experiments, and we do not suggest that this compound itself could form the basis for patient treatment. Pharmacological optimization of compounds such as EPZ005687 holds great promise for this eventual outcome.

Genetic alterations in EZH2 and other PRC2 subunits are not limited to the Tyr641 and Ala677 mutations observed in lymphoma. A broad spectrum of genetic alterations of PRC2 has been documented in a range of hematologic and solid tumors. Notably, in myeloid malignancies and T-cell leukemia, mutations in EZH2 and other PRC2 components lead to a loss of function of the complex (refs. 45–47). The fact that both activating and inactivating mutations of EZH2 are associated with malignancy is remarkable and reflects the complex role of PRC2 target genes in cell fate decisions.

EPZ005687 is shown here to be an equally potent inhibitor of both wild-type and Tyr641 or Ala677 mutants of EZH2, suggesting that pharmacologically optimized inhibitors with this inhibition profile may be useful in the treatment of a number of human cancers wherein gain-of-enzymatic function of PRC2 drives disease.

METHODS

Determination of inhibitor IC50 values in the PMT panel. Values for enzymes in the histone methyltransferase panel were determined under balanced assay conditions with both SAM and protein or peptide substrate present at concentrations equal to their respective Km values (ref. 24). Where a peptide was used as a methyl-accepting substrate, the peptide is referred to here by the histone and residue numbers that it represents. For example, peptide H3:16–30 refers to a peptide representing histone H3 residues 16 through 30. All reactions were run at 25 °C in a 50-μl volume with 2% (v/v) DMSO in the final reaction. Flag- and His-tagged CARM1 (residues 2–585) expressed in 293 cells was assayed at a final concentration of 0.25 nM against a biotinylated peptide corresponding to histone H3:16–30 with a monomethylated Arg26. His-tagged Dot1L (residues 1–416) expressed in Escherichia coli was assayed at a final concentration of 0.25 nM against chicken erythrocyte oligonucleosomes. His-tagged EHMT2 (residues 913–1193) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. His-tagged EHMT1 (residues 951–1235) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length glutathione S-transferase (GST)-tagged PRMT1 expressed in Spodoptera frugiperda cells was assayed at a final concentration of 0.75 nM against biotinylated peptide corresponding to H4:36–50. GST-tagged PRMT3 (residues 2–531) expressed in E. coli was assayed at a final concentration of 0.5 nM against a biotinylated peptide with the sequence biotin-aminohexyl-GGRGGFGGRGGFGGRGGFG-amide. Flag-tagged full-length PRMT5 expressed in 293 cells was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:1–15. His-tagged PRMT6 (residues 2–375) expressed in 293 cells was assayed at a final concentration of 1 nM against a peptide corresponding to H4:N36–50 with monomethylated Lys44. Full-length PRMT8 expressed in E. coli was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:31–45. Full-length SETD7 expressed in E. coli was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length Flag-tagged SMYD3 was expressed in E. coli and assayed at a final concentration of 50 nM against recombinant histone H4. His-tagged full-length SMYD2 was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H4:36–50. Flag- and His-tagged full-length WHSC1 was expressed in 293 cells and assayed at a final concentration of 2.5 nM against chicken erythrocyte oligonucleosomes. Flag-tagged full-length WHSC1L1 was expressed in S. frugiperda cells and was assayed at a final concentration of 4 nM against chicken erythrocyte oligonucleosomes.

Cell culture. Lymphoma cell lines OCI-LY19 (ACC-528), WSU-DLCL2 (ACC-575) and Karpas422 (ACC-32) were obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen. Toledo (CRL-2631), HT (CRL-2260), Pfeiffer (CRL-2632) and SUDHL6 (CRL-2959) cell lines were obtained from American Type Culture Collection. DOHH2 (HTL99022) was obtained from Banca Biologica e Cell Factory. SUDHL6 and Karpas422 cell lines were cultured in RPMI plus 20% (v/v) FBS, and all other cell lines were cultured in RPMI plus 10% (v/v) FBS.

Analysis of long-term proliferation and cell cycle. Proliferation and cell cycle analysis were performed as previously described (ref. 34), with slight exceptions. For the 11-d proliferation assay, plating densities were determined for each cell line on the basis of linear log-phase growth. Cells were counted and split back to the original plating density in fresh medium with EPZ005687 on days 4 and 7. Viable cell counts and IC50 calculations were performed as previously described (ref. 34), and LCC calculations were performed as described in Supplementary Methods.

For cell cycle, WSU-DLCL2 cells were plated in 12-well plates at a density of 1 × 10⁵ cells per ml. Cells were incubated with EPZ005687 at 0.2 μM, 0.67 μM, 2 μM and 6 μM, in a total of 2 ml, over a course of 10 d. All remaining cell cycle analysis was performed as previously described (ref. 34).
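Because cells in the 11-d proliferation assay are split back to the plating density on days 4 and 7, raw counts from later days have to be scaled by the earlier dilutions before growth can be compared across the whole time course. The sketch below shows one common way to do this; it is an assumption about the bookkeeping, not the procedure of ref. 34 or the Supplementary Methods, and the densities are hypothetical.

```python
def split_adjusted_counts(plating_density, counts):
    """Scale each viable count by the product of all earlier split dilutions.

    plating_density: density the culture is reset to at every split
    counts: viable densities measured at successive counting days,
            each taken just before the culture is split again
    """
    adjusted = []
    dilution = 1.0
    for count in counts:
        adjusted.append(count * dilution)
        # After counting, the culture is diluted back to the plating density.
        dilution *= count / plating_density
    return adjusted

# Hypothetical viable densities (cells per ml) on days 4, 7 and 11.
PLATING_DENSITY = 1e5
RAW_COUNTS = [8e5, 6e5, 4e5]

for day, value in zip((4, 7, 11), split_adjusted_counts(PLATING_DENSITY, RAW_COUNTS)):
    print(f"day {day}: split-adjusted viable cells per ml = {value:.3g}")
```

Split-adjusted counts of this kind, computed for each compound concentration, are the sort of values from which day-11 dose-response curves and IC50 estimates would then be derived.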

Received 19 March 2012; accepted 13 July 2012; published online 30 September 2012

References

1. Kuzmichev, A., Nishioka, K., Erdjument-Bromage, H., Tempst, P. & Reinberg, D. Histone methyltransferase activity associated with a human multiprotein complex containing the Enhancer of Zeste protein. Genes Dev. 16, 2893–2905 (2002).

2. Cao, R. et al. Role of histone H3 lysine 27 methylation in Polycomb-group silencing. Science 298, 1039–1043 (2002).

3. Agger, K. et al. UTX and JMJD3 are histone H3K27 demethylases involved in HOX gene regulation and development. Nature 449, 731–734 (2007).

4. Hong, S. et al. Identification of JmjC domain-containing UTX and JMJD3 as histone H3 lysine 27 demethylases. Proc. Natl. Acad. Sci. USA 104, 18439–18444 (2007).

5. Lee, M.G. et al. Demethylation of H3K27 regulates polycomb recruitment and H2A ubiquitination. Science 318, 447–450 (2007).

6. Lan, F. et al. A histone H3 lysine 27 demethylase regulates animal posterior development. Nature 449, 689–694 (2007).

7. De Santa, F. et al. The histone H3 lysine-27 demethylase Jmjd3 links inflammation to inhibition of polycomb-mediated gene silencing. Cell 130, 1083–1094 (2007).

8. Kleer, C.G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl. Acad. Sci. USA 100, 11606–11611 (2003).

9. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).

10. Kirmizis, A. et al. Silencing of human polycomb target genes is associated with methylation of histone H3 Lys 27. Genes Dev. 18, 1592–1605 (2004).

11. Bracken, A.P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).

12. Simon, J.A. & Lange, C.A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).

13. Velichutina, I. et al. EZH2-mediated epigenetic silencing in germinal center B cells contributes to proliferation and lymphomagenesis. Blood 116, 5247–5255 (2010).

14. van Haaften, G. et al. Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer. Nat. Genet. 41, 521–523 (2009).

15. Wang, S., Robertson, G.P. & Zhu, J. A novel human homologue of Drosophila polycomblike gene is up-regulated in multiple cancers. Gene 343, 69–78 (2004).

16. Morin, R.D. et al. Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin. Nat. Genet. 42, 181–185 (2010).

17. Lohr, J.G. et al. Discovery and prioritization of somatic mutations in diffuse large B-cell lymphoma (DLBCL) by whole-exome sequencing. Proc. Natl. Acad. Sci. USA 109, 3879–3884 (2012).

18. Morin, R.D. et al. Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Nature 476, 298–303 (2011).

19. Sneeringer, C.J. et al. Coordinated activities of wild-type plus mutant EZH2 drive tumor-associated hypertrimethylation of lysine 27 on histone H3 (H3K27) in human B-cell lymphomas. Proc. Natl. Acad. Sci. USA 107, 20980–20985 (2010).

20. Yap, D.B. et al. Somatic mutations at EZH2 Y641 act dominantly through a mechanism of selectively altered PRC2 catalytic activity, to increase H3K27 trimethylation. Blood 117, 2451–2459 (2011).


wild-type and Tyr641 mutations. However, the enhanced affinity for the A677G mutant leads us to surmise that EPZ005687 may engage additional interactions as a result of this mutation. Similarly, the significantly (P < 0.05) diminished affinity of the compound for EZH1-containing PRC2, which contains the identical Suz12, EED and RbAp48 subunits, likewise suggests that compound affinity is affected by key recognition elements of binding within these closely related catalytic subunits and not by the other three members of the holoenzyme complex. In aggregate, the SAM-competitive inhibition modality, mutually exclusive binding with SAH and impact on binding affinity of A677G or EZH1 substitution for wild-type EZH2 in the PRC2 complex strongly lead us to infer that the binding site for EPZ005687 is contained within the catalytic EZH2 or EZH1 subunit of the PRC2 complex and is likely to overlap with the binding site for SAM.
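The affinity ratios discussed above can be converted into binding free-energy differences with the standard relation ΔΔG = RT ln(fold difference). The sketch below does so, assuming the 25 °C assay temperature stated in Methods; the function name is illustrative.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal mol^-1 K^-1

def ddg_from_fold(fold, temperature_k=298.15):
    """Difference in binding free energy (kcal/mol) implied by a fold change in affinity."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold)

for label, fold in [("EZH2 versus EZH1 (~50-fold)", 50.0),
                    ("A677G versus wild type (5.4-fold)", 5.4)]:
    print(f"{label}: ddG = {ddg_from_fold(fold):.2f} kcal/mol")
```

A roughly 50-fold preference corresponds to a little over 2 kcal/mol, in line with the ΔΔG_binding > 2 kcal mol⁻¹ quoted for EZH1, while the 5.4-fold gain for A677G amounts to about 1 kcal/mol, an energy on the scale of a single modest additional contact.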

We have further demonstrated that EPZ005687 is a direct inhibitor of PRC2 enzymatic activity and does not function by disrupting the protein-protein interactions among the PRC2 subunits. This was shown by performing a magnetic Flag pulldown of the wild-type PRC2 complex containing a Flag-tagged EED subunit with and without saturating concentrations of EPZ005687 and by analyzing the supernatant and boiled magnetic beads by SDS-PAGE. The PRC2 complex was pulled down intact regardless of whether or not EPZ005687 was bound, and the supernatant was not enriched for any displaced subunit relative to the DMSO control (Supplementary Fig. 2).

Intracellular inhibition of H3K27 methylation We next tested the ability of EPZ005687 to block methylation of the PRC2 substrate H3K27 within lymphoma cells. Figure 4a and Supplementary Figure 3 illustrate a typical western blot against H3K27me3 at increasing concentrations of EPZ005687 for the EZH2 wild-type lymphoma cell line OCI-LY19 and demonstrate a clear concentration-dependent inhibition of H3K27me3. Quantification of H3K27me3 by ELISA yielded an IC 50 of 80 ± 30 nM for the blot in Figure 4a . Similar results were obtained for additional EZH2 wild type, EZH2 Tyr641 and Ala677 mutant lymphoma cell lines as well as for cell lines of other cancer types, including breast and pros-tate cancer. Thus we conclude that the compound is cell permeable and inhibits methylation of the physiologically relevant substrate of PRC2.

The exquisite selectivity of EPZ005687 for PRC2, demonstrated in biochemical assays ( Fig. 3 ), is recapitulated within the cellular milieu. This is illustrated for the wild-type lymphoma line OCI-LY19 and the Y641F mutant – bearing lymphoma line WSU-DLCL2 in Figure 4b and c , respectively. In these experiments, histones were isolated from cells after treatment with or without a high con-centration of EPZ005687 (5.6 μ M) for 4 d and probed for a broad panel of histone post-translational modifications ( Figure 4b,c and Supplementary Fig. 4 ). The only histone methyl marks decreased by compound treatment are those at H3K27. In the wild-type OCI-LY19 cell line, both H3K27me3 and H3K27me2 are greatly reduced by compound treatment. The EZH2 mutant WSU-DLCL2 cell line, however, showed a decrease only in the H3K27me3 mark; it was not possible to observe a decrease in H3K27me2 owing to the already undetectable dimethylation in Tyr641 mutant cell lines. To our sur-prise, the amount of monomethylated H3K27 seemed to be unaf-fected by compound treatment in both cell types, suggesting that H3K27 monomethylation may be carried out by enzymes other

a b

c d

100300

200

100

0Perc

enta

ge o

f inh

ibiti

on

IC50

(nM

)

300 0.3200

EPZ005687 (nM)

200

100

00 2 4 6 8 10 0 10,000 20,000 30,000

IC50

(nM

)

1/ve

loci

ty (c

.p.m

.–1 m

in–1

)

50

00 1 2 3

log EPZ005687 (nM)

SAH (nM)

4 5 0 2 4[SAM]/Km

[Nucleosome]/Km

6

0.2

0.1

0

133895939261812

Figure 2 | EPZ005687 is a SAM-competitive inhibitor of EZH2 enzyme activity. ( a ) Inhibition of EZH2 when activity is assessed under balanced conditions 24 for both SAM and peptide substrates using a Flashplate assay to measure the transfer of a tritiated methyl group from SAM to the peptide. The data are fit to a standard Langmuir isotherm for inhibition, and the IC 50 of EPZ005687 was calculated to be 54 ± 5 nM with a Hill slope of 1. The data shown are the average and s.d. of seven independent duplicate runs. ( b ) Plot of IC 50 values of EPZ005687 as a function of SAM concentration relative to the K m of SAM ([SAM] / K m ) measured using a Flashplate assay similar to the IC 50 measurements described above. These values show a linear relationship, as expected for SAM-competitive inhibition with a K i of 24 ± 7 nM ( ± s.d. of three experiments). ( c ) Plot of IC 50 values of EPZ005687 as a function of chicken erythrocyte oligonucleosome concentration relative to the K m of nucleosome ([Nucleosome] / K m ) measured using a filter-binding microplate assay to measure the transfer of tritiated methyl groups from SAM to the oligonucleosome. As expected for a noncompetitive inhibitor with respect to this substrate, the IC 50 is unaffected as the concentration of oligonucleosome is increased. The mean and standard error of three experiments are shown. ( d ) Yonetani-Theorell analysis of SAH and EPZ005687 indicates that they are mutually exclusive inhibitors of PRC2. Assays were performed by combining several concentrations of SAH and EPZ005687 and yielded a series of parallel lines in a plot of 1 / velocity as a function of SAH concentration for several concentrations of EPZ005687 tested. The mean and standard error of three experiments are shown.

SMYD3SMYD2

SETD7

EHMT1EHMT2

WHSC1L1Legend:Ki (M)

10–9 10–8 10–7 10–6 10–5 >5 × 10–5

WHSC1

EZH1 EZH2

Y641C

PRMT8

PRMT1PRMT3

PRMT6 PRMT5 DOT1L

CARM1

Y641HY641SY641N

Y641FA677G

Figure 3 | Ligand affinity maps of EPZ005687 across the family trees of human lysine methyltransferases and arginine methyltransferase enzymes show EPZ005687 is a selective and potent inhibitor of EZH2 and EZH1 enzymes. The K i of EPZ005687 was measured across a panel of recombinant lysine methyltransferase (KMT; left) and arginine methyltransferase (RMT; right) enzymes at balanced conditions 24 of both the SAM and peptide or protein substrates. The EZH2 Tyr641 and Ala677 mutant enzymes are indicated above wild-type EZH2. The K i values were converted to p K i values and used to generate red circles of proportional sizes to indicate the extent of inhibition as shown in the legend. Larger circles correlate to increased potency versus the enzymes, and gray circles indicate that inhibition was not measurable at concentrations up to 50 μ M of EPZ005687.

896 NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER 2012 | www.nature.com/naturechemicalbiology

ARTICLE NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.1084

21. McCabe, M.T. et al. Mutation of A677 in histone methyltransferase EZH2 in human B-cell lymphoma promotes hypertrimethylation of histone H3 on lysine 27 (H3K27). Proc. Natl. Acad. Sci. USA 109, 2989–2994 (2012).

22. Miranda, T.B. et al. DZNep is a global histone methylation inhibitor that reactivates developmental genes not silenced by DNA methylation. Mol. Cancer Ther. 8, 1579–1588 (2009).

23. Tan, J. et al. Pharmacologic disruption of Polycomb-repressive complex 2-mediated gene repression selectively induces apoptosis in cancer cells. Genes Dev. 21, 1050–1063 (2007).

24. Copeland, R.A. Evaluation of enzyme inhibitors in drug discovery. A guide for medicinal chemists and pharmacologists (John Wiley & Sons, 2005).

25. Duquenne, C. et al. Indazoles. International patent application PCT WO2011140325 (2011).

26. Burgess, J. et al. Azaindazoles. International patent application PCT WO2012005805 (2012).

27. Xu, C. et al. Binding of different histone marks differentially regulates the activity and specificity of polycomb repressive complex 2 (PRC2). Proc. Natl. Acad. Sci. USA 107, 19266–19271 (2010).

28. Han, Z. et al. Structural basis of EZH2 recognition by EED. Structure 15, 1306–1315 (2007).

29. Margueron, R. et al. Role of the polycomb protein EED in the propagation of repressive histone marks. Nature 461, 762–767 (2009).

30. Yonetani, T. & Theorell, H. Studies on liver alcohol dehydrogenase complexes. 3. Multiple inhibition kinetics in the presence of two competitive inhibitors. Arch. Biochem. Biophys. 106, 243–251 (1964).

31. Richon, V.M. et al. Chemogenetic analysis of human protein methyltransferases. Chem. Biol. Drug Des. 78, 199–210 (2011).

32. Chapman, P.B. et al. Improved survival with vemurafenib in melanoma with BRAF V600E mutation. N. Engl. J. Med. 364, 2507–2516 (2011).

33. Wigle, T.J. et al. The Y641C mutation of EZH2 alters substrate specificity for histone H3 lysine 27 methylation states. FEBS Lett. 585, 3011–3014 (2011).

34. Daigle, S.R. et al. Selective killing of mixed lineage leukemia cells by a potent small-molecule DOT1L inhibitor. Cancer Cell 20, 53–65 (2011).

35. Ben-Porath, I. et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat. Genet. 40, 499–507 (2008).

36. Dornan, D. et al. Therapeutic potential of an anti-CD79b antibody-drug conjugate, anti-CD79b-vc-MMAE, for the treatment of non-Hodgkin lymphoma. Blood 114, 2721–2729 (2009).

37. Renan, M.J. How many mutations are required for tumorigenesis? Implications from human cancer data. Mol. Carcinog. 7, 139–146 (1993).

38. Kaelin, W.G. Jr. Choosing anticancer drug targets in the postgenomic era. J. Clin. Invest. 104, 1503–1506 (1999).

39. Li, R. & Stafford, J.A. Kinase Inhibitor Drugs (John Wiley & Sons, Inc., 2009).

40. Tsai, J. et al. Discovery of a selective inhibitor of oncogenic B-Raf kinase with potent antimelanoma activity. Proc. Natl. Acad. Sci. USA 105, 3041–3046 (2008).

41. Kwak, E.L. et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N. Engl. J. Med. 363, 1693–1703 (2010).

42. Copeland, R.A. Protein methyltransferase inhibitors as personalized cancer therapeutics. Drug Discov. Today Ther. Strateg. published online, doi:10.1016/j.ddstr.2011.08.001 (16 September 2011).

43. Copeland, R.A., Solomon, M.E. & Richon, V.M. Protein methyltransferases as a target class for drug discovery. Nat. Rev. Drug Discov. 8, 724–732 (2009).

44. Vedadi, M. et al. A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells. Nat. Chem. Biol. 7, 566–574 (2011).

45. Ernst, T. et al. Inactivating mutations of the histone methyltransferase gene EZH2 in myeloid disorders. Nat. Genet. 42, 722–726 (2010).

46. Nikoloski, G. et al. Somatic mutations of the histone methyltransferase gene EZH2 in myelodysplastic syndromes. Nat. Genet. 42, 665–667 (2010).

47. Ntziachristos, P. et al. Genetic inactivation of the polycomb repressive complex 2 in T cell acute lymphoblastic leukemia. Nat. Med. 18, 298–301 (2012).

Acknowledgments

We thank D. Johnston and A. Basavapathruni for performing DOT1L and WHSC1 enzyme selectivity assays, K. Kuplast for help with the LCC calculations, A. Santospago for preparation of assay plates and R. Gould for helpful discussions.

Author contributions

L.J. made the enzymes. K.W.K. and E.J.O. designed compounds including EPZ005687. T.J.W., C.R.M. and C.J.S. performed the enzyme inhibition assays, and T.J.W. performed substrate competitions, Yonetani-Theorell analysis and the in vitro EZH2 pull-down assay. S.K.K., N.M.W., C.J.A., C.R.K., J.S. and J.D.S. performed the intracellular inhibition of H3K27 methylation ELISA. S.K.K. and N.M.W. performed the western blotting of all methyl marks and proliferation assays. S.K.K., N.M.W. and J.J.S. performed gene expression and cell cycle experiments. S.K.K., T.J.W., K.W.K., A.R., J.J.S., M.P.S., R.M.P., R.C., M.P.M., V.M.R., R.A.C. and H.K. designed studies and interpreted results. S.K.K., T.J.W., K.W.K. and R.A.C. wrote the paper.

Competing financial interests

The authors declare competing financial interests: details accompany the online version of the paper.

Additional information

Supplementary information, chemical compound information and chemical probe information is available in the online version of the paper. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html. Correspondence and requests for materials should be addressed to K.W.K.

NATURE REPRINT COLLECTION Epigenetics | NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER 2012 | www.nature.com/naturechemicalbiology

ARTICLE NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.1084

wild-type and Tyr641 mutations. However, the enhanced affinity for the A677G mutant leads us to surmise that EPZ005687 may engage additional interactions as a result of this mutation. Similarly, the significantly (P < 0.05) diminished affinity of the compound for EZH1-containing PRC2, which contains the identical Suz12, EED and RbAp48 subunits, likewise suggests that compound affinity is affected by key recognition elements of binding within these closely related catalytic subunits and not by the other three members of the holoenzyme complex. In aggregate, the SAM-competitive inhibition modality, mutually exclusive binding with SAH and impact on binding affinity of A677G or EZH1 substitution for wild-type EZH2 in the PRC2 complex strongly lead us to infer that the binding site for EPZ005687 is contained within the catalytic EZH2 or EZH1 subunit of the PRC2 complex and is likely to overlap with the binding site for SAM.
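The reasoning behind "mutually exclusive binding" can be illustrated with the general two-inhibitor rate expression of the Yonetani-Theorell type (ref. 30), 1/v = (1/v0)(1 + [I]/Ki + [J]/Kj + [I][J]/(alpha Ki Kj)). The following is a minimal sketch under stated assumptions, not the analysis code used in the study: the function names are hypothetical, Ki and Kj are treated as apparent constants at a fixed substrate concentration, and every numerical value is a placeholder rather than a measurement. When the interaction term alpha is infinite (the two inhibitors cannot be bound together), plots of 1/v against [I] at different [J] have the same slope and are therefore parallel.

```python
import math

def reciprocal_velocity(i_conc, j_conc, v0, ki_app, kj_app, alpha=math.inf):
    """Return 1/v for inhibitors I and J present together.

    alpha is the interaction term: math.inf means I and J cannot be bound
    at the same time; a finite alpha means both can be bound at once.
    """
    cross_term = (i_conc * j_conc) / (alpha * ki_app * kj_app)
    return (1.0 + i_conc / ki_app + j_conc / kj_app + cross_term) / v0

def slope_versus_i(j_conc, v0, ki_app, kj_app, alpha=math.inf):
    """Slope of the 1/v versus [I] line at a fixed [J]."""
    low = reciprocal_velocity(0.0, j_conc, v0, ki_app, kj_app, alpha)
    high = reciprocal_velocity(1000.0, j_conc, v0, ki_app, kj_app, alpha)
    return (high - low) / 1000.0

# Hypothetical placeholder values (not measurements from the paper).
V0 = 1000.0                           # uninhibited velocity, c.p.m. per min
KI_SAH = 5000.0                       # apparent Ki of inhibitor I (SAH), nM
KJ_EPZ = 50.0                         # apparent Ki of inhibitor J (EPZ005687), nM
J_SERIES = [0.0, 25.0, 50.0, 100.0]   # fixed concentrations of J, nM

# Mutually exclusive binding: every line has the same slope (parallel lines).
exclusive_slopes = [slope_versus_i(j, V0, KI_SAH, KJ_EPZ) for j in J_SERIES]

# Simultaneous binding (alpha = 1): slopes grow with [J] (intersecting lines).
joint_slopes = [slope_versus_i(j, V0, KI_SAH, KJ_EPZ, alpha=1.0) for j in J_SERIES]
```

In this simplified picture, a family of parallel lines points to a shared or overlapping site, whereas lines that fan out would indicate that both ligands can occupy the enzyme at once.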

We have further demonstrated that EPZ005687 is a direct inhibitor of PRC2 enzymatic activity and does not function by disrupting the protein-protein interactions among the PRC2 subunits. This was shown by performing a magnetic Flag pulldown of the wild-type PRC2 complex containing a Flag-tagged EED subunit with and without saturating concentrations of EPZ005687 and by analyzing the supernatant and boiled magnetic beads by SDS-PAGE. The PRC2 complex was pulled down intact regardless of whether or not EPZ005687 was bound, and the supernatant was not enriched for any displaced subunit relative to the DMSO control (Supplementary Fig. 2).

Intracellular inhibition of H3K27 methylation

We next tested the ability of EPZ005687 to block methylation of the PRC2 substrate H3K27 within lymphoma cells. Figure 4a and Supplementary Figure 3 illustrate a typical western blot against H3K27me3 at increasing concentrations of EPZ005687 for the EZH2 wild-type lymphoma cell line OCI-LY19 and demonstrate a clear concentration-dependent inhibition of H3K27me3. Quantification of H3K27me3 by ELISA yielded an IC50 of 80 ± 30 nM for the blot in Figure 4a. Similar results were obtained for additional EZH2 wild-type, EZH2 Tyr641 and Ala677 mutant lymphoma cell lines as well as for cell lines of other cancer types, including breast and prostate cancer. Thus we conclude that the compound is cell permeable and inhibits methylation of the physiologically relevant substrate of PRC2.

The exquisite selectivity of EPZ005687 for PRC2, demonstrated in biochemical assays (Fig. 3), is recapitulated within the cellular milieu. This is illustrated for the wild-type lymphoma line OCI-LY19 and the Y641F mutant-bearing lymphoma line WSU-DLCL2 in Figure 4b and c, respectively. In these experiments, histones were isolated from cells after treatment with or without a high concentration of EPZ005687 (5.6 μM) for 4 d and probed for a broad panel of histone post-translational modifications (Figure 4b,c and Supplementary Fig. 4). The only histone methyl marks decreased by compound treatment are those at H3K27. In the wild-type OCI-LY19 cell line, both H3K27me3 and H3K27me2 are greatly reduced by compound treatment. The EZH2 mutant WSU-DLCL2 cell line, however, showed a decrease only in the H3K27me3 mark; it was not possible to observe a decrease in H3K27me2 owing to the already undetectable dimethylation in Tyr641 mutant cell lines. To our surprise, the amount of monomethylated H3K27 seemed to be unaffected by compound treatment in both cell types, suggesting that H3K27 monomethylation may be carried out by enzymes other

Figure 2 | EPZ005687 is a SAM-competitive inhibitor of EZH2 enzyme activity. (a) Inhibition of EZH2 when activity is assessed under balanced conditions (ref. 24) for both SAM and peptide substrates using a Flashplate assay to measure the transfer of a tritiated methyl group from SAM to the peptide. The data are fit to a standard Langmuir isotherm for inhibition, and the IC50 of EPZ005687 was calculated to be 54 ± 5 nM with a Hill slope of 1. The data shown are the average and s.d. of seven independent duplicate runs. (b) Plot of IC50 values of EPZ005687 as a function of SAM concentration relative to the Km of SAM ([SAM]/Km) measured using a Flashplate assay similar to the IC50 measurements described above. These values show a linear relationship, as expected for SAM-competitive inhibition with a Ki of 24 ± 7 nM (± s.d. of three experiments). (c) Plot of IC50 values of EPZ005687 as a function of chicken erythrocyte oligonucleosome concentration relative to the Km of nucleosome ([Nucleosome]/Km) measured using a filter-binding microplate assay to measure the transfer of tritiated methyl groups from SAM to the oligonucleosome. As expected for a noncompetitive inhibitor with respect to this substrate, the IC50 is unaffected as the concentration of oligonucleosome is increased. The mean and standard error of three experiments are shown. (d) Yonetani-Theorell analysis of SAH and EPZ005687 indicates that they are mutually exclusive inhibitors of PRC2. Assays were performed by combining several concentrations of SAH and EPZ005687 and yielded a series of parallel lines in a plot of 1/velocity as a function of SAH concentration for several concentrations of EPZ005687 tested. The mean and standard error of three experiments are shown.

Panel axes: (a) percentage of inhibition versus log EPZ005687 (nM); (b) IC50 (nM) versus [SAM]/Km; (c) IC50 (nM) versus [Nucleosome]/Km; (d) 1/velocity (c.p.m.^-1 min^-1) versus SAH (nM).
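The IC50 and Ki values quoted in this caption can be related through the Cheng-Prusoff expressions. The sketch below is illustrative only and rests on assumptions not stated in the caption: simple single-site behaviour, IC50 = Ki(1 + [S]/Km) for an inhibitor competitive with the varied substrate, and IC50 = Ki for a pure noncompetitive inhibitor. The function names are hypothetical.

```python
def ic50_competitive(ki_nm, s_over_km):
    """IC50 (nM) for an inhibitor competitive with the varied substrate."""
    return ki_nm * (1.0 + s_over_km)

def ic50_noncompetitive(ki_nm, s_over_km):
    """IC50 (nM) for a pure noncompetitive inhibitor.

    The substrate ratio is accepted only to mirror the competitive case;
    under this assumption it has no effect on the result.
    """
    return ki_nm

KI_NM = 24.0  # Ki from the caption (24 ± 7 nM)

# Balanced conditions: [SAM] = Km, so the predicted IC50 is twice Ki.
predicted_balanced_ic50 = ic50_competitive(KI_NM, 1.0)

# Competitive case: IC50 rises linearly with [SAM]/Km, as in panel b.
competitive_series = [ic50_competitive(KI_NM, r) for r in (0.0, 2.0, 4.0, 8.0)]

# Noncompetitive case: IC50 is flat across [Nucleosome]/Km, as in panel c.
flat_series = [ic50_noncompetitive(KI_NM, r) for r in (0.0, 2.0, 4.0)]
```

Under these assumptions a Ki of 24 nM predicts an IC50 of 48 nM at [SAM] = Km, which is close to the measured 54 ± 5 nM given the ± 7 nM uncertainty on Ki.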

Figure 3 | Ligand affinity maps of EPZ005687 across the family trees of human lysine methyltransferases and arginine methyltransferase enzymes show EPZ005687 is a selective and potent inhibitor of EZH2 and EZH1 enzymes. The Ki of EPZ005687 was measured across a panel of recombinant lysine methyltransferase (KMT; left) and arginine methyltransferase (RMT; right) enzymes at balanced conditions (ref. 24) of both the SAM and peptide or protein substrates. The EZH2 Tyr641 and Ala677 mutant enzymes are indicated above wild-type EZH2. The Ki values were converted to pKi values and used to generate red circles of proportional sizes to indicate the extent of inhibition as shown in the legend. Larger circles correlate to increased potency versus the enzymes, and gray circles indicate that inhibition was not measurable at concentrations up to 50 μM of EPZ005687.

Enzymes labeled in the figure: SMYD3, SMYD2, SETD7, EHMT1, EHMT2, WHSC1L1, WHSC1, EZH1, EZH2 (wild type and the Y641C, Y641H, Y641S, Y641N, Y641F and A677G mutants), PRMT8, PRMT1, PRMT3, PRMT6, PRMT5, DOT1L and CARM1. Legend: Ki (M) of 10^-9, 10^-8, 10^-7, 10^-6, 10^-5 and >5 × 10^-5.
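The Ki to pKi conversion mentioned in this caption is the negative base-10 logarithm of Ki expressed in molar units. A minimal sketch follows; the function name is hypothetical, and how pKi was mapped onto circle size in the figure is not specified in the caption, so that step is not attempted here.

```python
import math

def p_ki(ki_molar):
    """Convert an inhibition constant in mol/l to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be positive")
    return -math.log10(ki_molar)

# Legend values from the figure, in mol/l.
legend_ki = [1e-9, 1e-8, 1e-7, 1e-6, 1e-5]
legend_pki = [p_ki(k) for k in legend_ki]

# Ki of 24 nM quoted in the Figure 2 caption, converted to mol/l.
fig2_pki = p_ki(24e-9)
```

A tenfold gain in potency therefore adds one pKi unit, so the legend spans pKi values from 5 to 9.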


than EZH2-containing PRC2, such as EZH1-containing PRC2 (as described above). A modest increase in H3K27 acetylation was also observed upon treatment of the OCI-LY19 cells with compound. Additionally, a slight increase in H3K36me2 was seen in the WSU-DLCL2 cells treated with EPZ005687. The degree of interplay between these two methylation marks may be dependent on cell context, as the increase in H3K36me2 was not observed in the OCI-LY19 cell line upon inhibition of EZH2.

Impact of EPZ005687 on cell growth

Having established that EPZ005687 can enter cells and selectively affect H3K27 methylation, we investigated the impact of PRC2 inhibition on cell growth in wild-type and mutant lymphoma cell lines. We studied the effects of varying concentrations of EPZ005687 on three lymphoma lines, OCI-LY19, WSU-DLCL2 and Pfeiffer. These cell lines respectively contain wild-type EZH2, EZH2 Y641F and EZH2 A677G. Increasing concentrations of compound from 0.011 μM to 8.3 μM had a minimal effect on proliferation of OCI-LY19 cells over the course of 11 d (Fig. 5a). In contrast, EPZ005687 had a notable effect on proliferation of the EZH2 Y641F-bearing cell line (Fig. 5b). In all of the EZH2 Y641F-bearing cell lines tested (described below), there was a consistent and reproducible latency period of 4 d over which the compound seemed to have little impact on cell growth followed by a period from 4–11 d in which the impact of the compound was fully realized. A time course of H3K27me3 inhibition in cells treated with EPZ005687 (Supplementary Fig. 5) demonstrated that the diminution of H3K27me3 was apparent within 24 h but was not fully realized until day 4 and beyond. Remarkably, when the potent and selective DOT1L inhibitor EPZ004777 (ref. 34) was applied to MLL-rearranged leukemia cell lines, a similar latency period in the inhibition of H3K79me2 methylation and cellular proliferation was observed. This delay may be a common feature of inhibitors of PMT enzymatic activity. The latency of the antiproliferative effect was shorter in the Pfeiffer cell line, which contains EZH2 A677G (ref. 21), and this cell line was found to be particularly sensitive to EZH2 inhibition by EPZ005687 (Fig. 5c).

Antiproliferative compounds may reduce cell growth by causing either cell stasis or cell killing. Historically, the effects of such compounds have been quantitatively compared using their IC50 values, which report on the concentration of compound required to reduce the rate of cell growth (or, more typically, the cell number at a specified time point) by half of the untreated control value. We have found the use of IC50 values inadequate to differentiate between cytostatic and cytotoxic effects of compound treatment of cells. Therefore, we propose a new metric for quantifying the effects of antiproliferative compounds on cell growth, the lowest cytotoxic concentration (LCC). The LCC is defined as the concentration of inhibitor at which the proliferative rate becomes zero and represents the crossover point between cytostasis and cytotoxicity. Additional information on the calculation of LCC is presented in the Supplementary Methods.
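The LCC definition can be made concrete with a small numerical sketch. The actual calculation is described in the Supplementary Methods, which are not reproduced here, so everything below is an assumption made for illustration: the growth rate at each concentration is taken as the least-squares slope of ln(viable cells) against time, the zero crossing is found by linear interpolation on log10 concentration, and all data are synthetic. The function names are hypothetical.

```python
import math

def growth_rate(days, cells):
    """Least-squares slope of ln(cells) versus time, in units of per day."""
    n = len(days)
    logs = [math.log(c) for c in cells]
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    den = sum((t - mean_t) ** 2 for t in days)
    return num / den

def lowest_cytotoxic_concentration(concs, rates):
    """Concentration at which the growth rate crosses zero, or None.

    Interpolates linearly on log10(concentration) between the two tested
    concentrations that bracket the sign change. Returns None when the rate
    stays positive at every tested concentration.
    """
    pairs = sorted(zip(concs, rates))
    for (c_lo, k_lo), (c_hi, k_hi) in zip(pairs, pairs[1:]):
        if k_lo > 0.0 >= k_hi:
            frac = k_lo / (k_lo - k_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Synthetic example: exponential growth or decay from day 4 onward.
days = [4.0, 6.0, 8.0, 11.0]
true_rates = {0.1: 0.4, 1.0: 0.2, 10.0: -0.2}  # concentration (uM) -> rate per day
counts = {c: [1e5 * math.exp(k * (t - 4.0)) for t in days] for c, k in true_rates.items()}

fitted = {c: growth_rate(days, n) for c, n in counts.items()}
lcc = lowest_cytotoxic_concentration(list(fitted), list(fitted.values()))

# A line whose rate never reaches zero has no LCC within the tested range.
no_lcc = lowest_cytotoxic_concentration([0.1, 1.0], [0.4, 0.2])
```

A result of None corresponds to reporting the LCC as greater than the highest concentration tested, whereas an IC50 could still be measurable for the same data because growth is slowed without net cell loss.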

The IC50 and LCC values for EPZ005687 treatment of multiple wild-type and mutant (EZH2 Y641F, EZH2 Y641N and EZH2 A677G) lymphoma cell lines are summarized in Supplementary Table 3. These data make clear the differential effects of the compound on wild-type and mutant-bearing cells. Though some modest cytostatic effects were observed in wild-type lymphoma cells, the compound showed robust cell killing only for the Tyr641 mutant- and EZH2 A677G-bearing lymphoma lines. The wild-type cell lines had LCC values greater than the highest concentration used in the proliferation assay (>25 μM). In contrast, the LCC values for the Tyr641 mutant cell lines were all in the low- to mid-micromolar range, and the LCC for the EZH2 A677G mutant cell line was even more potent (36 nM). Clearly, the presence of heterozygous mutations in the EZH2 SET domain is a key driver of sensitivity to

Figure 4 | EPZ005687 specifically inhibits H3K27 methylation in lymphoma cells. (a) The wild-type EZH2 lymphoma cell line OCI-LY19 shows a dose-dependent decrease in H3K27me3 after treatment with EPZ005687 for 96 h. (b,c) A wild-type lymphoma cell line, OCI-LY19 (b), and a mutant lymphoma cell line, WSU-DLCL2 (c), show the specificity of H3K27 methylation inhibition by EPZ005687 across a broad panel of histone methylation marks. Quantification of methylation changes is represented in the bar graphs to the right of each panel of western blots. Representative western blots (n = 1) were normalized to corresponding total H3 and expressed as percent change in EPZ005687-treated versus DMSO-treated cells.

Panel a lanes: untreated, DMSO and EPZ005687 at 0.044, 0.088, 0.18, 0.35, 0.70, 1.4, 2.8 and 5.6 μM, blotted for K27me3 and total H3. Panels b and c marks: K27me3, K27me2, K27me1, K4me3, K9me3, K27ac, K36me2 and K79me2, with bar graphs (DMSO versus EPZ005687) on an axis of percentage of DMSO from 0 to 400.

Figure 5 | EPZ005687 decreases proliferation in mutant but not wild-type EZH2 lymphoma cells. (a–c) Wild-type (WT) OCI-LY19 cells (a), WSU-DLCL2 (Y641F) (b) and Pfeiffer (A677G) (c) cells were treated with EPZ005687 over an 11-d time course, and proliferation was measured at the indicated time points. The viable cell count (y axis) in each panel is presented on a logarithmic scale as the mean of triplicates ± s.e.m. The proliferation IC50 and LCC values are listed in Supplementary Table 2.

Panel axes: viable cells per ml (logarithmic scale) versus time (d), from 0 to 11 d. Concentrations of EPZ005687 (μM): DMSO, 0.011, 0.034, 0.10, 0.31, 0.93, 2.8 and 8.3.


and selective DOT1L inhibitor, was used to elucidate the causal role of DOT1L enzymatic activity in MLL-rearranged leukemia (ref. 34).

In the present work, we identified EPZ005687 as a potent and selective inhibitor of wild-type and mutant EZH2-containing PRC2 enzymatic activity. We showed that the compound selectively inhibits H3K27 methylation in cells and that this translated into selective cell killing for lymphoma cells that contain heterozygous EZH2 mutations at Tyr641 or Ala677. These data established a critical and unique dependency on PRC2 enzymatic activity for the lymphoma cell lines that bear these EZH2 mutations. This dependency is equivalent to the concept of oncogene addiction, in which cells become abnormally dependent on the biochemical activity of a specific oncogene product for growth, survival or both, such that ablation of the oncogene is cytotoxic in the genetically altered cells but inconsequential to growth of normal cells. The present results provide a compelling foundation for the clinical use of selective EZH2 inhibitors for the treatment of mutant-bearing lymphomas. The current compound represents a chemical biological probe for in vitro experiments, and we do not suggest that this compound itself could form the basis for patient treatment. Pharmacological optimization of compounds such as EPZ005687 holds great promise for this eventual outcome.

Genetic alterations in EZH2 and other PRC2 subunits are not limited to the Tyr641 and Ala677 mutations observed in lymphoma. A broad spectrum of genetic alterations of PRC2 has been documented in a range of hematologic and solid tumors. Notably, in myeloid malignancies and T-cell leukemia, mutations in EZH2 and other PRC2 components lead to a loss of function of the complex (refs. 45–47). The fact that both activating and inactivating mutations of EZH2 are associated with malignancy is remarkable and reflects the complex role of PRC2 target genes in cell fate decisions.

EPZ005687 is shown here to be an equally potent inhibitor of both wild-type and Tyr641 or Ala677 mutants of EZH2, suggesting that pharmacologically optimized inhibitors with this inhibition profile may be useful in the treatment of a number of human cancers wherein gain-of-enzymatic function of PRC2 drives disease.

METHODS

Determination of inhibitor IC50 values in the PMT panel. Values for enzymes in the histone methyltransferase panel were determined under balanced assay conditions with both SAM and protein or peptide substrate present at concentrations equal to their respective Km values 24. Where a peptide was used as a methyl-accepting substrate, the peptide is referred to here by the histone and residue numbers that it represents. For example, peptide H3:16–30 refers to a peptide representing histone H3 residues 16 through 30. All reactions were run at 25 °C in a 50-μl volume with 2% (v/v) DMSO in the final reaction. Flag- and His-tagged CARM1 (residues 2–585) expressed in 293 cells was assayed at a final concentration of 0.25 nM against a biotinylated peptide corresponding to histone H3:16–30 with a monomethylated Arg26. His-tagged Dot1L (residues 1–416) expressed in Escherichia coli was assayed at a final concentration of 0.25 nM against chicken erythrocyte oligonucleosomes. His-tagged EHMT2 (residues 913–1193) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. His-tagged EHMT1 (residues 951–1235) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length glutathione S-transferase (GST)-tagged PRMT1 expressed in Spodoptera frugiperda cells was assayed at a final concentration of 0.75 nM against a biotinylated peptide corresponding to H4:36–50. GST-tagged PRMT3 (residues 2–531) expressed in E. coli was assayed at a final concentration of 0.5 nM against a biotinylated peptide with the sequence biotin-aminohexyl-GGRGGFGGRGGFGGRGGFG-amide. Flag-tagged full-length PRMT5 expressed in 293 cells was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:1–15. His-tagged PRMT6 (residues 2–375) expressed in 293 cells was assayed at a final concentration of 1 nM against a peptide corresponding to H4:N36–50 with monomethylated Lys44. Full-length PRMT8 expressed in E. coli was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:31–45. Full-length SETD7 expressed in E. coli was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length Flag-tagged SMYD3 was expressed in E. coli and assayed at a final concentration of 50 nM against recombinant histone H4. His-tagged full-length SMYD2 was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H4:36–50. Flag- and His-tagged full-length WHSC1 was expressed in 293 cells and assayed at a final concentration of 2.5 nM against chicken erythrocyte oligonucleosomes. Flag-tagged full-length WHSC1L1 was expressed in S. frugiperda cells and was assayed at a final concentration of 4 nM against chicken erythrocyte oligonucleosomes.
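The reason for running the panel under balanced conditions can be illustrated with the Cheng-Prusoff relationships. These are not stated in this paper and are included here as a textbook assumption for a single substrate: with substrate at its Km, a competitive and an uncompetitive inhibitor of equal Ki give the same IC50 (twice the Ki), so the assay does not favor one modality over the other. A minimal sketch, with a hypothetical function name and illustrative numbers:

```python
def ic50_from_ki(ki, s_over_km, modality):
    """Cheng-Prusoff conversion from Ki to IC50 for one substrate.

    ki         -- inhibition constant (any concentration unit)
    s_over_km  -- substrate concentration divided by its Km
    modality   -- "competitive", "uncompetitive" or "noncompetitive"
    """
    if s_over_km <= 0:
        raise ValueError("s_over_km must be positive")
    if modality == "competitive":
        return ki * (1.0 + s_over_km)
    if modality == "uncompetitive":
        return ki * (1.0 + 1.0 / s_over_km)
    if modality == "noncompetitive":
        # Pure noncompetitive inhibition (alpha = 1): IC50 equals Ki at any [S].
        return ki
    raise ValueError(f"unknown modality: {modality}")


# Balanced conditions: [S] = Km, so s_over_km = 1 (Ki of 10 is illustrative).
balanced = {
    m: ic50_from_ki(10.0, 1.0, m)
    for m in ("competitive", "uncompetitive", "noncompetitive")
}
```

Raising the substrate well above Km in this sketch inflates the competitive IC50 and leaves the uncompetitive one close to Ki, which is the bias that balanced conditions are meant to avoid.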

Cell culture. Lymphoma cell lines OCI-LY19 (ACC-528), WSU-DLCL2 (ACC-575) and Karpas422 (ACC-32) were obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen. Toledo (CRL-2631), HT (CRL-2260), Pfeiffer (CRL-2632) and SUDHL6 (CRL-2959) cell lines were obtained from American Type Culture Collection. DOHH2 (HTL99022) was obtained from Banca Biologica e Cell Factory. SUDHL6 and Karpas422 cell lines were cultured in RPMI plus 20% (v/v) FBS, and all other cell lines were cultured in RPMI plus 10% (v/v) FBS.

Analysis of long-term proliferation and cell cycle. Proliferation and cell cycle analysis were performed as previously described 34, with slight exceptions. For the 11-d proliferation assay, plating densities were determined for each cell line on the basis of linear log-phase growth. Cells were counted and split back to the original plating density in fresh medium with EPZ005687 on days 4 and 7. Viable cell counts and IC50 calculations were performed as previously described 34, and LCC calculations were performed as described in Supplementary Methods.
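One common way to handle the split-back step is to carry a cumulative split factor, so that counts taken after each split are expressed on the scale of the original, never-split culture. The sketch below assumes that approach; the paper defers to ref. 34 and the Supplementary Methods for the actual calculations, and the function name and cell counts here are hypothetical.

```python
def split_adjusted_counts(observed, plating_density, split_days):
    """Return counts rescaled to the original, never-split culture.

    observed        -- list of (day, viable cells per ml), in time order
    plating_density -- density the culture is reset to at each split
    split_days      -- days on which cells were counted, then split back
    """
    factor = 1.0  # cumulative dilution applied so far
    adjusted = []
    for day, count in observed:
        adjusted.append((day, count * factor))
        if day in split_days:
            # Splitting from `count` back to `plating_density` dilutes
            # the culture by count / plating_density.
            factor *= count / plating_density
    return adjusted


# Hypothetical counts (cells per ml) for an 11-d assay split on days 4 and 7.
example = split_adjusted_counts(
    observed=[(0, 1e5), (4, 8e5), (7, 4e5), (11, 6e5)],
    plating_density=1e5,
    split_days={4, 7},
)
```

In this example the day-11 reading of 6 × 10^5 cells per ml corresponds to 1.92 × 10^7 split-adjusted cells per ml, because the culture was diluted 8-fold on day 4 and 4-fold on day 7.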

For cell cycle, WSU-DLCL2 cells were plated in 12-well plates at a density of 1 × 10^5 cells per ml. Cells were incubated with EPZ005687 at 0.2 μM, 0.67 μM, 2 μM and 6 μM, in a total of 2 ml, over a course of 10 d. All remaining cell cycle analysis was performed as previously described 34.

Received 19 March 2012; accepted 13 July 2012; published online 30 September 2012

References

1. Kuzmichev, A., Nishioka, K., Erdjument-Bromage, H., Tempst, P. & Reinberg, D. Histone methyltransferase activity associated with a human multiprotein complex containing the Enhancer of Zeste protein. Genes Dev. 16, 2893–2905 (2002).

2. Cao, R. et al. Role of histone H3 lysine 27 methylation in Polycomb-group silencing. Science 298, 1039–1043 (2002).

3. Agger, K. et al. UTX and JMJD3 are histone H3K27 demethylases involved in HOX gene regulation and development. Nature 449, 731–734 (2007).

4. Hong, S. et al. Identification of JmjC domain-containing UTX and JMJD3 as histone H3 lysine 27 demethylases. Proc. Natl. Acad. Sci. USA 104, 18439–18444 (2007).

5. Lee, M.G. et al. Demethylation of H3K27 regulates polycomb recruitment and H2A ubiquitination. Science 318, 447–450 (2007).

6. Lan, F. et al. A histone H3 lysine 27 demethylase regulates animal posterior development. Nature 449, 689–694 (2007).

7. De Santa, F. et al. The histone H3 lysine-27 demethylase Jmjd3 links inflammation to inhibition of polycomb-mediated gene silencing. Cell 130, 1083–1094 (2007).

8. Kleer, C.G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl. Acad. Sci. USA 100, 11606–11611 (2003).

9. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).

10. Kirmizis, A. et al. Silencing of human polycomb target genes is associated with methylation of histone H3 Lys 27. Genes Dev. 18, 1592–1605 (2004).

11. Bracken, A.P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).

12. Simon, J.A. & Lange, C.A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).

13. Velichutina, I. et al. EZH2-mediated epigenetic silencing in germinal center B cells contributes to proliferation and lymphomagenesis. Blood 116, 5247–5255 (2010).

14. van Haaften, G. et al. Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer. Nat. Genet. 41, 521–523 (2009).

15. Wang, S., Robertson, G.P. & Zhu, J. A novel human homologue of Drosophila polycomblike gene is up-regulated in multiple cancers. Gene 343, 69–78 (2004).

16. Morin, R.D. et al. Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin. Nat. Genet. 42, 181–185 (2010).

17. Lohr, J.G. et al. Discovery and prioritization of somatic mutations in diffuse large B-cell lymphoma (DLBCL) by whole-exome sequencing. Proc. Natl. Acad. Sci. USA 109, 3879–3884 (2012).

18. Morin, R.D. et al. Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Nature 476, 298–303 (2011).

19. Sneeringer, C.J. et al. Coordinated activities of wild-type plus mutant EZH2 drive tumor-associated hypertrimethylation of lysine 27 on histone H3 (H3K27) in human B-cell lymphomas. Proc. Natl. Acad. Sci. USA 107, 20980–20985 (2010).

20. Yap, D.B. et al. Somatic mutations at EZH2 Y641 act dominantly through a mechanism of selectively altered PRC2 catalytic activity, to increase H3K27 trimethylation. Blood 117, 2451–2459 (2011).


compound in these lymphoma cells. We believe these data strongly support the notion that the enzymatic activity of PRC2 becomes uniquely required for cell growth and survival of lymphoma cells bearing mutant EZH2; these data therefore point to the change-of-function mutations in EZH2 as causal genetic drivers of lymphomagenesis in these cells.

Impact of EPZ005687 on cell cycle and gene expression

To explore further the mechanism of action of EPZ005687 in mutant-bearing lymphoma, we performed cell cycle analysis and transcriptional profiling in WSU-DLCL2 (EZH2 Y641F) mutant lymphoma cells treated with EPZ005687. To investigate the cell killing in mutant lymphoma, WSU-DLCL2 cells were treated with EPZ005687 at concentrations ranging from 0.2 μM to 6 μM, and cell cycle analysis was performed by flow cytometry at 4-, 7- and 10-d time points after treatment (Fig. 5 and Supplementary Table 3). After 4 d, the G1 phase of the cell cycle increased, with correlative decreases in the S as well as the G2/M phases (Fig. 6a). By 7 d, the highest dose of EPZ005687 (6 μM) led to an increase in the sub-G1 population of cells, whereas the lower doses resulted in a continued increase of the G1 population (Fig. 6b). The prolonged exposure to EPZ005687 for 10 d led to WSU-DLCL2 cells progressing further toward the sub-G1 population, as seen with both the 2-μM and 6-μM doses (Fig. 6c).

Gene set enrichment analysis (GSEA) of transcriptional profiling data from WSU-DLCL2 cells treated with a high and low dose (6 μM and 1.5 μM, respectively) of EPZ005687 revealed a negative enrichment of cell cycle gene sets as early as 24 h after addition of EPZ005687 (Supplementary Fig. 6a and Supplementary Table 4). These data further complement the cell cycle analysis showing a progression toward G1 accumulation upon treatment with EPZ005687 (Fig. 6).

Additional GSEA showed strong enrichment of PRC2-regulated gene sets in WSU-DLCL2 cells treated with EPZ005687. Using a ‘centroblast-repressed’ gene signature, in which the chosen genes were identified by chromatin immunoprecipitation to be bound by EZH2 and marked with H3K27me3 in centroblast cells relative to naive B cells 13, a strong enrichment of this gene set was observed with the higher dose of EPZ005687 at all time points (Supplementary Fig. 6b and Supplementary Table 5). Expression of these genes upon EZH2 inhibition may lead to a more naive B cell or a more differentiated phenotype. Upregulation of a PRC2-repressed gene signature 35 is also significantly enriched in the EPZ005687-treated WSU-DLCL2 cells (P < 0.01 across all time points; Supplementary Fig. 6c and Supplementary Table 6), suggesting that small-molecule inhibition of EZH2 can lead to increased expression of known repressed targets of EZH2 (ref. 36). Taken together, the GSEA data strongly suggest that small-molecule inhibition of EZH2 in a Tyr641 mutant lymphoma cell line can lead to derepression of known EZH2 target genes and affect genes specifically repressed by the EZH2 Tyr641 mutant.
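For orientation, the enrichment score at the heart of GSEA is a running-sum statistic over a ranked gene list: a gene set concentrated near the top of the ranking gives a positive score, and one concentrated near the bottom gives a negative score (the ‘negative enrichment’ reported for cell cycle gene sets). The sketch below implements the unweighted, Kolmogorov-Smirnov-like form purely as an illustration. The paper does not state which GSEA software or weighting was used, and the function name and gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style enrichment score.

    ranked_genes -- unique genes ordered from most up- to most down-regulated
    gene_set     -- collection of genes in the signature being tested
    Returns the running-sum value with the largest absolute deviation
    from zero (positive: set concentrated near the top of the ranking).
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some but not all ranked genes")
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        # Step up on a hit, down on a miss; the walk ends at zero.
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking of six hypothetical genes.
ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_heavy = enrichment_score(ranking, {"g1", "g2"})
bottom_heavy = enrichment_score(ranking, {"g5", "g6"})
```

Published GSEA implementations typically weight each hit by its ranking metric and assess significance by permutation; both are omitted here to keep the sketch short.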

DISCUSSION

Chemical probes are increasingly proving indispensable for a molecular understanding of the biology and physiology of cellular processes in normal and disease states. In human cancers, multiple genetic alterations are commonly associated with the genetic instability that leads to transformation of cells to a hyperproliferative, malignant phenotype. It has been estimated that a minimum of five separate genetic alterations must be accumulated to effect such transformation 37. Because of the genetic instability of cancer cells, many genetic alterations are observed that do not substantially affect cancer transformation or proliferation in a causal manner; such mutations have been referred to as ‘passenger mutations’ to distinguish them from the true ‘driver mutations’ that have a causal role in tumorigenesis 38. Hence, a major hurdle to the development of new cancer treatments based on molecular targeting has been the ability to distinguish passenger from driver mutations. The use of selective inhibitors of genetically altered enzymes and antagonists of altered receptors has proven valuable in making such distinctions.

Over the past decade, numerous kinase inhibitors have become available to the chemical biology community and have been used to probe the impact of selective kinase inhibition on cancer cells 39. These studies provide a basis for establishing specific genetic alterations as drivers of particular human cancers and pave the way for the development of targeted therapeutic agents for patients that may be identified by the presence of the specific genetic change. Two contemporary examples of this are provided by the recent US Food and Drug Administration approval of vemurafenib to specifically treat melanoma patients carrying the BRAF V600E mutant 32,40 and of crizotinib to specifically treat lung cancer patients with a chromosomal translocation of the ALK gene 41. These drugs exemplify a paradigm shift in the clinical treatment of cancer, with increasing reliance on a molecular understanding of the underlying disease and the use of drugs targeted to the genetic alterations that drive a particular individual’s cancer. This paradigm has been referred to as personally targeted cancer therapeutics 42.

The PMTs represent a large class of epigenetic enzymes that have a paramount role in the control of gene transcription. Several examples of genetic alterations in specific PMTs have been reported in association with different human cancers 43. It thus seems timely to begin to probe the driver status of genetic alterations in PMTs by the use of potent, selective small-molecule inhibitors of specific PMTs. Indeed, a number of specific PMT inhibitors have begun to be reported in the literature 42. For example, UNC0638 has been identified as a G9A and GLP inhibitor that modulates H3K9 methylation in cells 44, and EPZ004777, a potent

[Figure 6, panels a, b and c: Day 4, Day 7 and Day 10. y axis: percentage of total cell cycle (0–100). x axis: untreated, DMSO, 0.2 μM, 0.6 μM, 2 μM and 6 μM. Legend: sub-G1, G1, S, G2/M.]

Figure 6 | Inhibition of EZH2 by EPZ005687 results in accumulation in the G1 phase of the cell cycle in an EZH2 Tyr641 mutant lymphoma cell line. (a) Treatment of WSU-DLCL2 cells with EPZ005687 for 4 d results in a dose-dependent increase of accumulation in G1. (b,c) Prolonged exposure to EPZ005687 leads to increases in the sub-G1 population after 7 d (b) and 10 d (c) at the higher doses. Graphs represent the mean of duplicates ± s.e.m.


21. McCabe, M.T. et al. Mutation of A677 in histone methyltransferase EZH2 in human B-cell lymphoma promotes hypertrimethylation of histone H3 on lysine 27 (H3K27). Proc. Natl. Acad. Sci. USA 109, 2989–2994 (2012).

22. Miranda, T.B. et al. DZNep is a global histone methylation inhibitor that reactivates developmental genes not silenced by DNA methylation. Mol. Cancer Ther. 8, 1579–1588 (2009).

23. Tan, J. et al. Pharmacologic disruption of Polycomb-repressive complex 2-mediated gene repression selectively induces apoptosis in cancer cells. Genes Dev. 21, 1050–1063 (2007).

24. Copeland, R.A. Evaluation of enzyme inhibitors in drug discovery. A guide for medicinal chemists and pharmacologists (John Wiley & Sons, 2005).

25. Duquenne, C. et al. Indazoles. International patent application PCT WO2011140325 (2011).

26. Burgess, J. et al. Azaindazoles. International patent application PCT WO2012005805 (2012).

27. Xu, C. et al. Binding of different histone marks differentially regulates the activity and specificity of polycomb repressive complex 2 (PRC2). Proc. Natl. Acad. Sci. USA 107, 19266–19271 (2010).

28. Han, Z. et al. Structural basis of EZH2 recognition by EED. Structure 15, 1306–1315 (2007).

29. Margueron, R. et al. Role of the polycomb protein EED in the propagation of repressive histone marks. Nature 461, 762–767 (2009).

30. Yonetani, T. & Theorell, H. Studies on liver alcohol hydrogenase complexes. 3. Multiple inhibition kinetics in the presence of two competitive inhibitors. Arch. Biochem. Biophys. 106, 243–251 (1964).

31. Richon, V.M. et al. Chemogenetic analysis of human protein methyltransferases. Chem. Biol. Drug Des. 78, 199–210 (2011).

32. Chapman, P.B. et al. Improved survival with vemurafenib in melanoma with BRAF V600E mutation. N. Engl. J. Med. 364, 2507–2516 (2011).

33. Wigle, T.J. et al. The Y641C mutation of EZH2 alters substrate specificity for histone H3 lysine 27 methylation states. FEBS Lett. 585, 3011–3014 (2011).

34. Daigle, S.R. et al. Selective killing of mixed lineage leukemia cells by a potent small-molecule DOT1L inhibitor. Cancer Cell 20, 53–65 (2011).

35. Ben-Porath, I. et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat. Genet. 40, 499–507 (2008).

36. Dornan, D. et al. Therapeutic potential of an anti-CD79b antibody-drug conjugate, anti-CD79b-vc-MMAE, for the treatment of non-Hodgkin lymphoma. Blood 114, 2721–2729 (2009).

37. Renan, M.J. How many mutations are required for tumorigenesis? Implications from human cancer data. Mol. Carcinog. 7, 139–146 (1993).

38. Kaelin, W.G. Jr. Choosing anticancer drug targets in the postgenomic era. J. Clin. Invest. 104, 1503–1506 (1999).

39. Li, R. & Stafford, J.A. Kinase Inhibitor Drugs (John Wiley & Sons, Inc., 2009).

40. Tsai, J. et al. Discovery of a selective inhibitor of oncogenic B-Raf kinase with potent antimelanoma activity. Proc. Natl. Acad. Sci. USA 105, 3041–3046 (2008).

41. Kwak, E.L. et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N. Engl. J. Med. 363, 1693–1703 (2010).

42. Copeland, R.A. Protein methyltransferase inhibitors as personalized cancer therapeutics. Drug Discov. Today Ther. Strateg. published online, doi:10.1016/j.ddstr.2011.08.001 (16 September 2011).

43. Copeland, R.A., Solomon, M.E. & Richon, V.M. Protein methyltransferases as a target class for drug discovery. Nat. Rev. Drug Discov. 8, 724–732 (2009).

44. Vedadi, M. et al. A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells. Nat. Chem. Biol. 7, 566–574 (2011).

45. Ernst, T. et al. Inactivating mutations of the histone methyltransferase gene EZH2 in myeloid disorders. Nat. Genet. 42, 722–726 (2010).

46. Nikoloski, G. et al. Somatic mutations of the histone methyltransferase gene EZH2 in myelodysplastic syndromes. Nat. Genet. 42, 665–667 (2010).

47. Ntziachristos, P. et al. Genetic inactivation of the polycomb repressive complex 2 in T cell acute lymphoblastic leukemia. Nat. Med. 18, 298–301 (2012).

Acknowledgments We thank D. Johnston and A. Basavapathruni for performing DOT1L and WHSC1 enzyme selectivity assays, K. Kuplast for help with the LCC calculations, A. Santospago for preparation of assay plates and R. Gould for helpful discussions.

Author contributions L.J. made the enzymes. K.W.K. and E.J.O. designed compounds including EPZ005687. T.J.W., C.R.M. and C.J.S. performed the enzyme inhibition assays, and T.J.W. performed substrate competitions, Yonetani-Theorell analysis and the in vitro EZH2 pull-down assay. S.K.K., N.M.W., C.J.A., C.R.K., J.S. and J.D.S. performed the intracellular inhibition of H3K27 methylation ELISA. S.K.K. and N.M.W. performed the western blotting of all methyl marks and proliferation assays. S.K.K., N.M.W. and J.J.S. performed gene expression and cell cycle experiments. S.K.K., T.J.W., K.W.K., A.R., J.J.S., M.P.S., R.M.P., R.C., M.P.M., V.M.R., R.A.C. and H.K. designed studies and interpreted results. S.K.K., T.J.W., K.W.K. and R.A.C. wrote the paper.


NATURE REPRINT COLLECTION Epigenetics S43894 NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER 2012 | www.nature.com/naturechemicalbiology

ARTICLE NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.1084

compound in these lymphoma cells. We believe these data strongly support the notion that the enzymatic activity of PRC2 becomes uniquely required for cell growth and survival of lymphoma cells bearing mutant EZH2; these data therefore point to the change-of-function mutations in EZH2 as causal genetic drivers of lymphom-agenesis in these cells.

Impact of EPZ005687 on cell cycle and gene expression To explore further the mechanism of action of EPZ005687 in mutant-bearing lymphoma, we performed cell cycle analysis and transcriptional profiling in WSU-DLCL2 (EZH2 Y641F ) mutant lym-phoma cells treated with EPZ005687. To investigate the cell kill-ing in mutant lymphoma, WSU-DLCL2 cells were treated with EPZ005687 at concentrations ranging from 0.2 μ M to 6 μ M, and cell cycle analysis was performed by flow cytometry at 4-, 7- and 10-d time points after treatment ( Fig. 5 and Supplementary Table 3 ). After 4 d, the G1 phase of the cell cycle increased, with correlative decreases in the S as well as the G2 / M phases ( Fig. 6a ). By 7 d, the highest dose of EPZ005687 (6 μ M) led to an increase in the sub-G1 population of cells, whereas the lower doses resulted in a continued increase of the G1 population ( Fig. 6b ). The prolonged exposure to EPZ005687 for 10 d led to WSU-DLCL2 cells progressing fur-ther toward the sub-G1 population, as seen with both the 2- μ M and 6- μ M doses ( Fig. 6c ).

Gene set enrichment analysis (GSEA) of transcriptional profil-ing data from WSU-DLCL2 cells treated with a high and low dose (6 μ M and 1.5 μ M, respectively) of EPZ005687 revealed a nega-tive enrichment of cell cycle gene sets as early as 24 h after addi-tion of EPZ005687 ( Supplementary Fig. 6a and Supplementary Table 4 ). These data further complement the cell cycle analysis showing a progression toward G1 accumulation upon treatment of EPZ005687 ( Fig. 6 ).

Additional GSEA showed strong enrichment of PRC2-regulated gene sets in WSU-DLCL2 cells treated with EPZ005687. Using a ‘ centroblast-repressed ’ gene signature, in which the chosen genes were identified by chromatin immunoprecipitation to be bound by EZH2 and marked with H3K27me3 in centroblast cells rela-tive to naive B cells 13 , a strong enrichment of this gene set was observed with the higher dose of EPZ005687 at all time points ( Supplementary Fig. 6b and Supplementary Table 5 ). Expression of these genes upon EZH2 inhibition may lead to a more naive B cell or a more differentiated phenotype. Upregulation of a PRC2-repressed gene signature 35 is also significantly enriched in the EPZ005687-treated WSU-DLCL2 cells ( P < 0.01 across all time points; Supplementary Fig. 6c and Supplementary Table 6 ), suggesting that small-molecule inhibition of EZH2 can lead to increased expression of known repressed targets of EZH2 (ref. 36). Taken together, the GSEA data strongly suggest that small- molecule inhibition of EZH2 in a Tyr641 mutant lymphoma

cell line can lead to derepression of known EZH2 target genes and affect genes specifically repressed by the EZH2 Tyr641 mutant.

DISCUSSION Chemical probes are increasingly proving indispensible for a molecular understanding of the biology and physiology of cellular processes in normal and disease states. In human cancers, mul-tiple genetic alterations are commonly associated with the genetic instability that leads to transformation of cells to a hyperprolifera-tive, malignant phenotype. It has been estimated that a minimum of five separate genetic alterations must be accumulated to effect such transformation 37 . Because of the genetic instability of cancer cells, many genetic alterations are observed that do not substan-tially affect cancer transformation or proliferation in a causal man-ner; such mutations have been referred to a ‘ passenger mutations ’ to distinguish them from the true ‘ driver mutations ’ that have a causal role in tumorigenesis 38 . Hence, a major hurdle to the devel-opment of new cancer treatments based on molecular targeting has been the ability to distinguish passenger from driver muta-tions. The use of selective inhibitors of genetically altered enzymes and antagonists of altered receptors has proven valuable in making such distinctions.

Over the past decade, numerous kinase inhibitors have become available to the chemical biology community and have been used to probe the impact of selective kinase inhibition on cancer cells 39 . These studies provide a basis for establishing specific genetic altera-tions as drivers of particular human cancers and pave the way for the development of targeted therapeutic agents for patients that may be identified by the presence of the specific genetic change. Two contemporary examples of this are provided by the recent US Food and Drug Administration approval of vemurafenib to specifically treat melanoma patients carrying the BRAF V600E mutant 32, 40 and of crizotinib to specifically treat lung cancer patients with a chromo-somal translocation of the ALK gene 41 . These drugs exemplify a paradigm shift in the clinical treatment of cancer, with increasing reliance on a molecular understanding of the underlying disease and the use of drugs targeted to the genetic alterations that drive a particular individual ’ s cancer. This paradigm has been referred to as personally targeted cancer therapeutics 42 .

The PMTs represent a large class of epigenetic enzymes that have a paramount role in the control of gene transcription. Several examples of genetic alterations in specific PMTs have been reported in association with different human cancers 43 . It thus seems timely to begin to probe the driver status of genetic alterations in PMTs by the use of potent, selective small-molecule inhibitors of specific PMTs. Indeed, a number of specific PMT inhibitors have begun to be reported in the literature 42 . For exam-ple, UNC0638 has been identified as a G9A and GLP inhibitor that modulates H3K9 methylation in cells 44 , and EPZ004777, a potent

a b c100 Day 4

Perc

enta

ge o

f tot

al c

ell c

ycle

Perc

enta

ge o

f tot

al c

ell c

ycle

Perc

enta

ge o

f tot

al c

ell c

ycle

80

60

40

20

0

Untreated

0.6 μM0.2 μM

DMSO2 μM

6 μM

100 Day 7

80

60

40

20

0

Untreated

0.6 μM0.2 μM

DMSO2 μM

6 μM

100 Day 10

80

60

40

20

0

Untreated

0.6 μM0.2 μM

DMSO2 μM

6 μM

G1Sub-G1

SG2/M

G1Sub-G1

SG2/M

G1Sub-G1

SG2/M

Figure 6 | Inhibition of EZH2 by EPZ005687 results in accumulation in the G1 phase of the cell cycle in an EZH2 Tyr641 mutant lymphoma cell line. ( a ) Treatment of WSU-DLCL2 cells with EPZ005687 for 4 d results in a dose-dependent increase of accumulation in G1. ( b , c ) Prolonged exposure of EPZ005687 leads to increases in the sub-G1 population after 7 d ( b ) and 10 d ( c ) at the higher doses. Graphs represent the mean of duplicates ± s.e.m.

NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER 2012 | www.nature.com/naturechemicalbiology 895

ARTICLE NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.1084

and selective DOT1L inhibitor, was used to elucidate the causal role of DOT1L enzymatic activity in MLL -rearranged leukemia 34 .

In the present work, we identified EPZ005687 as a potent and selective inhibitor of wild-type and mutant EZH2 – containing PRC2 enzymatic activity. We showed that the compound selectively inhib-its H3K27 methylation in cells and that this translated into selec-tive cell killing for lymphoma cells that contain heterozygous EZH2 mutations at Tyr641 or Ala677. These data established a critical and unique dependency on PRC2 enzymatic activity for the lym-phoma cell lines that bear these EZH2 mutations. This dependency is equivalent to the concept of oncogene addiction, in which cells become abnormally dependent on the biochemical activity of a specific oncogene product for growth, survival or both, such that ablation of the oncogene is cytotoxic in the genetically altered cells but inconsequential to growth of normal cells. The present results provide a compelling foundation for the clinical use of selective EZH2 inhibitors for the treatment of mutant-bearing lymphomas. The current compound represents a chemical biological probe for in vitro experiments, and we do not suggest that this compound itself could form the basis for patient treatment. Pharmacological optimization of compounds such as EPZ005687 holds great promise for this eventual outcome.

Genetic alterations in EZH2 and other PRC2 subunits are not limited to the Tyr641 and Ala677 mutations observed in lym-phoma. A broad spectrum of genetic alterations of PRC2 has been documented in a range of hematologic and solid tumors. Notably, in myeloid malignancies and T-cell leukemia, mutations in EZH2 and other PRC2 components lead to a loss of function of the com-plex 45 – 47 . The fact that both activating and inactivating mutations of EZH2 are associated with malignancy is remarkable and reflects the complex role of PRC2 target genes in cell fate decisions.

EPZ005687 is shown here to be an equally potent inhibitor of both wild-type and Tyr641 or Ala677 mutants of EZH2, suggest-ing that pharmacologically optimized inhibitors with this inhibition profile may be useful in the treatment of a number of human can-cers wherein gain-of-enzymatic function of PRC2 drives disease.

METHODS

Determination of inhibitor IC50 values in the PMT panel. Values for enzymes in the histone methyltransferase panel were determined under balanced assay conditions with both SAM and protein or peptide substrate present at concentrations equal to their respective Km values (ref. 24). Where a peptide was used as a methyl-accepting substrate, the peptide is referred to here by the histone and residue numbers that it represents. For example, peptide H3:16–30 refers to a peptide representing histone H3 residues 16 through 30. All reactions were run at 25 °C in a 50-μl volume with 2% (v/v) DMSO in the final reaction.

Flag- and His-tagged CARM1 (residues 2–585) expressed in 293 cells was assayed at a final concentration of 0.25 nM against a biotinylated peptide corresponding to histone H3:16–30 with a monomethylated Arg26. His-tagged Dot1L (residues 1–416) expressed in Escherichia coli was assayed at a final concentration of 0.25 nM against chicken erythrocyte oligonucleosomes. His-tagged EHMT2 (residues 913–1193) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. His-tagged EHMT1 (residues 951–1235) expressed in E. coli was assayed at a final concentration of 0.1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length glutathione S-transferase (GST)-tagged PRMT1 expressed in Spodoptera frugiperda cells was assayed at a final concentration of 0.75 nM against a biotinylated peptide corresponding to H4:36–50. GST-tagged PRMT3 (residues 2–531) expressed in E. coli was assayed at a final concentration of 0.5 nM against a biotinylated peptide with the sequence biotin-aminohexyl-GGRGGFGGRGGFGGRGGFG-amide. Flag-tagged full-length PRMT5 expressed in 293 cells was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:1–15. His-tagged PRMT6 (residues 2–375) expressed in 293 cells was assayed at a final concentration of 1 nM against a peptide corresponding to H4:N36–50 with monomethylated Lys44. Full-length PRMT8 expressed in E. coli was assayed at a final concentration of 1.5 nM against a biotinylated peptide corresponding to H4:31–45. Full-length SETD7 expressed in E. coli was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H3:1–15. Full-length Flag-tagged SMYD3 was expressed in E. coli and assayed at a final concentration of 50 nM against recombinant histone H4. His-tagged full-length SMYD2 was assayed at a final concentration of 1 nM against a biotinylated peptide corresponding to H4:36–50. Flag- and His-tagged full-length WHSC1 was expressed in 293 cells and assayed at a final concentration of 2.5 nM against chicken erythrocyte oligonucleosomes. Flag-tagged full-length WHSC1L1 was expressed in S. frugiperda cells and was assayed at a final concentration of 4 nM against chicken erythrocyte oligonucleosomes.
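One way to see why balanced conditions (substrate at its Km) are convenient for a selectivity panel is through the standard Cheng-Prusoff-type relations between Ki and IC50. The sketch below is illustrative only and is not taken from the paper: it treats a single varied substrate, uses the textbook single-substrate forms, and the function name and the 10 nM Ki are hypothetical values chosen for the example.

```python
def ic50_from_ki(ki_nm, s_over_km, mode):
    """Return the IC50 (nM) expected for a given Ki (nM).

    s_over_km is the substrate concentration expressed as a multiple of Km.
    Textbook single-substrate relations (an assumption for this sketch):
      competitive:    IC50 = Ki * (1 + [S]/Km)
      uncompetitive:  IC50 = Ki * (1 + Km/[S])
      noncompetitive: IC50 = Ki   (equal affinity for E and ES)
    """
    if s_over_km <= 0:
        raise ValueError("s_over_km must be positive")
    if mode == "competitive":
        return ki_nm * (1.0 + s_over_km)
    if mode == "uncompetitive":
        return ki_nm * (1.0 + 1.0 / s_over_km)
    if mode == "noncompetitive":
        return ki_nm
    raise ValueError("unknown inhibition mode: " + mode)


if __name__ == "__main__":
    hypothetical_ki = 10.0  # nM, illustrative value only
    for mode in ("competitive", "uncompetitive", "noncompetitive"):
        # Balanced conditions: [S] = Km, so s_over_km = 1.
        print(mode, ic50_from_ki(hypothetical_ki, 1.0, mode))
```

At [S] = Km the IC50 is never more than twofold above Ki for any of these modalities, so IC50 values measured across different enzymes are comparable without first knowing each inhibition mechanism.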

Cell culture. Lymphoma cell lines OCI-LY19 (ACC-528), WSU-DLCL2 (ACC-575) and Karpas422 (ACC-32) were obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen. Toledo (CRL-2631), HT (CRL-2260), Pfeiffer (CRL-2632) and SUDHL6 (CRL-2959) cell lines were obtained from American Type Culture Collection. DOHH2 (HTL99022) was obtained from Banca Biologica e Cell Factory. SUDHL6 and Karpas422 cell lines were cultured in RPMI plus 20% (v/v) FBS, and all other cell lines were cultured in RPMI plus 10% (v/v) FBS.

Analysis of long-term proliferation and cell cycle. Proliferation and cell cycle analysis were performed as previously described (ref. 34), with slight exceptions. For the 11-d proliferation assay, plating densities were determined for each cell line on the basis of linear log-phase growth. Cells were counted and split back to the original plating density in fresh medium with EPZ005687 on days 4 and 7. Viable cell counts and IC50 calculations were performed as previously described (ref. 34), and LCC calculations were performed as described in Supplementary Methods.
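Because cultures are split back to the plating density on days 4 and 7, raw counts from later time points understate total growth unless the splits are accounted for. The sketch below shows one common way to do this, by multiplying each later count by the cumulative dilution factor. It is an assumption for illustration, not the procedure of ref. 34 or of the Supplementary Methods; the function name and all numbers are hypothetical.

```python
def split_adjusted_counts(observed, plating_density, split_indices):
    """Convert raw viable counts into split-adjusted counts.

    observed        -- viable cells per ml at each time point, in order
    plating_density -- density (cells per ml) restored at every split
    split_indices   -- indices of time points at which, after counting,
                       the culture was split back to plating_density
    """
    factor = 1.0  # cumulative dilution applied so far
    adjusted = []
    for i, count in enumerate(observed):
        adjusted.append(count * factor)
        if i in split_indices:
            # Splitting from `count` back to `plating_density` dilutes the
            # culture by count / plating_density from this point onward.
            factor *= count / plating_density
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts on days 0, 4, 7 and 11; splits on days 4 and 7.
    raw = [100000, 800000, 400000, 200000]
    print(split_adjusted_counts(raw, 100000, {1, 2}))
```

With these made-up numbers the day-11 raw count of 2 × 10^5 cells per ml corresponds to 6.4 × 10^6 split-adjusted cells per ml, since the culture was diluted 8-fold on day 4 and 4-fold on day 7.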

For cell cycle, WSU-DLCL2 cells were plated in 12-well plates at a density of 1 × 10^5 cells per ml. Cells were incubated with EPZ005687 at 0.2 μM, 0.67 μM, 2 μM and 6 μM, in a total of 2 ml, over a course of 10 d. All remaining cell cycle analysis was performed as previously described (ref. 34).

Received 19 March 2012; accepted 13 July 2012; published online 30 September 2012

References

1. Kuzmichev, A., Nishioka, K., Erdjument-Bromage, H., Tempst, P. & Reinberg, D. Histone methyltransferase activity associated with a human multiprotein complex containing the Enhancer of Zeste protein. Genes Dev. 16, 2893–2905 (2002).

2. Cao, R. et al. Role of histone H3 lysine 27 methylation in Polycomb-group silencing. Science 298, 1039–1043 (2002).

3. Agger, K. et al. UTX and JMJD3 are histone H3K27 demethylases involved in HOX gene regulation and development. Nature 449, 731–734 (2007).

4. Hong, S. et al. Identification of JmjC domain-containing UTX and JMJD3 as histone H3 lysine 27 demethylases. Proc. Natl. Acad. Sci. USA 104, 18439–18444 (2007).

5. Lee, M.G. et al. Demethylation of H3K27 regulates polycomb recruitment and H2A ubiquitination. Science 318, 447–450 (2007).

6. Lan, F. et al. A histone H3 lysine 27 demethylase regulates animal posterior development. Nature 449, 689–694 (2007).

7. De Santa, F. et al. The histone H3 lysine-27 demethylase Jmjd3 links inflammation to inhibition of polycomb-mediated gene silencing. Cell 130, 1083–1094 (2007).

8. Kleer, C.G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl. Acad. Sci. USA 100, 11606–11611 (2003).

9. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629 (2002).

10. Kirmizis, A. et al. Silencing of human polycomb target genes is associated with methylation of histone H3 Lys 27. Genes Dev. 18, 1592–1605 (2004).

11. Bracken, A.P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).

12. Simon, J.A. & Lange, C.A. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat. Res. 647, 21–29 (2008).

13. Velichutina, I. et al. EZH2-mediated epigenetic silencing in germinal center B cells contributes to proliferation and lymphomagenesis. Blood 116, 5247–5255 (2010).

14. van Haaften, G. et al. Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer. Nat. Genet. 41, 521–523 (2009).

15. Wang, S., Robertson, G.P. & Zhu, J. A novel human homologue of Drosophila polycomblike gene is up-regulated in multiple cancers. Gene 343, 69–78 (2004).

16. Morin, R.D. et al. Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin. Nat. Genet. 42, 181–185 (2010).

17. Lohr, J.G. et al. Discovery and prioritization of somatic mutations in diffuse large B-cell lymphoma (DLBCL) by whole-exome sequencing. Proc. Natl. Acad. Sci. USA 109, 3879–3884 (2012).

18. Morin, R.D. et al. Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Nature 476, 298–303 (2011).

19. Sneeringer, C.J. et al. Coordinated activities of wild-type plus mutant EZH2 drive tumor-associated hypertrimethylation of lysine 27 on histone H3 (H3K27) in human B-cell lymphomas. Proc. Natl. Acad. Sci. USA 107, 20980–20985 (2010).

20. Yap, D.B. et al. Somatic mutations at EZH2 Y641 act dominantly through a mechanism of selectively altered PRC2 catalytic activity, to increase H3K27 trimethylation. Blood 117, 2451–2459 (2011).

896 NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER 2012 | www.nature.com/naturechemicalbiology

ARTICLE NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.1084

21. McCabe, M.T. et al. Mutation of A677 in histone methyltransferase EZH2 in human B-cell lymphoma promotes hypertrimethylation of histone H3 on lysine 27 (H3K27). Proc. Natl. Acad. Sci. USA 109, 2989–2994 (2012).

22. Miranda, T.B. et al. DZNep is a global histone methylation inhibitor that reactivates developmental genes not silenced by DNA methylation. Mol. Cancer Ther. 8, 1579–1588 (2009).

23. Tan, J. et al. Pharmacologic disruption of Polycomb-repressive complex 2-mediated gene repression selectively induces apoptosis in cancer cells. Genes Dev. 21, 1050–1063 (2007).

24. Copeland, R.A. Evaluation of enzyme inhibitors in drug discovery. A guide for medicinal chemists and pharmacologists (John Wiley & Sons, 2005).

25. Duquenne, C. et al. Indazoles. International patent application PCT WO2011140325 (2011).

26. Burgess, J. et al. Azaindazoles. International patent application PCT WO2012005805 (2012).

27. Xu, C. et al. Binding of different histone marks differentially regulates the activity and specificity of polycomb repressive complex 2 (PRC2). Proc. Natl. Acad. Sci. USA 107, 19266–19271 (2010).

28. Han, Z. et al. Structural basis of EZH2 recognition by EED. Structure 15, 1306–1315 (2007).

29. Margueron, R. et al. Role of the polycomb protein EED in the propagation of repressive histone marks. Nature 461, 762–767 (2009).

30. Yonetani, T. & Theorell, H. Studies on liver alcohol hydrogenase complexes. 3. Multiple inhibition kinetics in the presence of two competitive inhibitors. Arch. Biochem. Biophys. 106, 243–251 (1964).

31. Richon, V.M. et al. Chemogenetic analysis of human protein methyltransferases. Chem. Biol. Drug Des. 78, 199–210 (2011).

32. Chapman, P.B. et al. Improved survival with vemurafenib in melanoma with BRAF V600E mutation. N. Engl. J. Med. 364, 2507–2516 (2011).

33. Wigle, T.J. et al. The Y641C mutation of EZH2 alters substrate specificity for histone H3 lysine 27 methylation states. FEBS Lett. 585, 3011–3014 (2011).

34. Daigle, S.R. et al. Selective killing of mixed lineage leukemia cells by a potent small-molecule DOT1L inhibitor. Cancer Cell 20, 53–65 (2011).

35. Ben-Porath, I. et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat. Genet. 40, 499–507 (2008).

36. Dornan, D. et al. Therapeutic potential of an anti-CD79b antibody–drug conjugate, anti–CD79b-vc-MMAE, for the treatment of non-Hodgkin lymphoma. Blood 114, 2721–2729 (2009).

37. Renan, M.J. How many mutations are required for tumorigenesis? Implications from human cancer data. Mol. Carcinog. 7, 139–146 (1993).

38. Kaelin, W.G. Jr. Choosing anticancer drug targets in the postgenomic era. J. Clin. Invest. 104, 1503–1506 (1999).

39. Li, R. & Stafford, J.A. Kinase Inhibitor Drugs (John Wiley & Sons, Inc., 2009).

40. Tsai, J. et al. Discovery of a selective inhibitor of oncogenic B-Raf kinase with potent antimelanoma activity. Proc. Natl. Acad. Sci. USA 105, 3041–3046 (2008).

41. Kwak, E.L. et al. Anaplastic lymphoma kinase inhibition in non–small-cell lung cancer. N. Engl. J. Med. 363, 1693–1703 (2010).

42. Copeland, R.A. Protein methyltransferase inhibitors as personalized cancer therapeutics. Drug Discov. Today Ther. Strateg. published online, doi:10.1016/j.ddstr.2011.08.001 (16 September 2011).

43. Copeland, R.A., Solomon, M.E. & Richon, V.M. Protein methyltransferases as a target class for drug discovery. Nat. Rev. Drug Discov. 8, 724–732 (2009).

44. Vedadi, M. et al. A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells. Nat. Chem. Biol. 7, 566–574 (2011).

45. Ernst, T. et al. Inactivating mutations of the histone methyltransferase gene EZH2 in myeloid disorders. Nat. Genet. 42, 722–726 (2010).

46. Nikoloski, G. et al. Somatic mutations of the histone methyltransferase gene EZH2 in myelodysplastic syndromes. Nat. Genet. 42, 665–667 (2010).

47. Ntziachristos, P. et al. Genetic inactivation of the polycomb repressive complex 2 in T cell acute lymphoblastic leukemia. Nat. Med. 18, 298–301 (2012).

Acknowledgments We thank D. Johnston and A. Basavapathruni for performing DOT1L and WHSC1 enzyme selectivity assays, K. Kuplast for help with the LCC calculations, A. Santospago for preparation of assay plates and R. Gould for helpful discussions.

Author contributions L.J. made the enzymes. K.W.K. and E.J.O. designed compounds including EPZ005687. T.J.W., C.R.M. and C.J.S. performed the enzyme inhibition assays, and T.J.W. performed substrate competitions, Yonetani-Theorell analysis and the in vitro EZH2 pull-down assay. S.K.K., N.M.W., C.J.A., C.R.K., J.S. and J.D.S. performed the intracellular inhibition of H3K27 methylation ELISA. S.K.K. and N.M.W. performed the western blotting of all methyl marks and proliferation assays. S.K.K., N.M.W. and J.J.S. performed gene expression and cell cycle experiments. S.K.K., T.J.W., K.W.K., A.R., J.J.S., M.P.S., R.M.P., R.C., M.P.M., V.M.R., R.A.C. and H.K. designed studies and interpreted results. S.K.K., T.J.W., K.W.K. and R.A.C. wrote the paper.

Competing financial interests The authors declare competing financial interests: details accompany the online version of the paper.

Additional information Supplementary information, chemical compound information and chemical probe information are available in the online version of the paper. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html. Correspondence and requests for materials should be addressed to K.W.K.


and selective DOT1L inhibitor, was used to elucidate the causal role of DOT1L enzymatic activity in MLL-rearranged leukemia (ref. 34).

In the present work, we identified EPZ005687 as a potent and selective inhibitor of wild-type and mutant EZH2-containing PRC2 enzymatic activity. We showed that the compound selectively inhibits H3K27 methylation in cells and that this translated into selective cell killing for lymphoma cells that contain heterozygous EZH2 mutations at Tyr641 or Ala677. These data established a critical and unique dependency on PRC2 enzymatic activity for the lymphoma cell lines that bear these EZH2 mutations. This dependency is equivalent to the concept of oncogene addiction, in which cells become abnormally dependent on the biochemical activity of a specific oncogene product for growth, survival or both, such that ablation of the oncogene is cytotoxic in the genetically altered cells but inconsequential to growth of normal cells. The present results provide a compelling foundation for the clinical use of selective EZH2 inhibitors for the treatment of mutant-bearing lymphomas. The current compound represents a chemical biological probe for in vitro experiments, and we do not suggest that this compound itself could form the basis for patient treatment. Pharmacological optimization of compounds such as EPZ005687 holds great promise for this eventual outcome.

Genetic alterations in EZH2 and other PRC2 subunits are not limited to the Tyr641 and Ala677 mutations observed in lymphoma. A broad spectrum of genetic alterations of PRC2 has been documented in a range of hematologic and solid tumors. Notably, in myeloid malignancies and T-cell leukemia, mutations in EZH2 and other PRC2 components lead to a loss of function of the complex (refs. 45–47). The fact that both activating and inactivating mutations of EZH2 are associated with malignancy is remarkable and reflects the complex role of PRC2 target genes in cell fate decisions.



896 NATURE CHEMICAL BIOLOGY | VOL 8 | NOVEMBER 2012 | www.nature.com/naturechemicalbiology

ARTICLE NATURE CHEMICAL BIOLOGY DOI: 10.1038/NCHEMBIO.1084

21 . McCabe , M . T . et al. Mutation of A677 in histone methyltransferase EZH2 in human B-cell lymphoma promotes hypertrimethylation of histone H3 on lysine 27 (H3K27) . Proc. Natl. Acad. Sci. USA 109 , 2989 – 2994 ( 2012 ).

22 . Miranda , T . B . et al. DZNep is a global histone methylation inhibitor that reactivates developmental genes not silenced by DNA methylation . Mol. Cancer Th er. 8 , 1579 – 1588 ( 2009 ).

23 . Tan , J . et al. Pharmacologic disruption of Polycomb-repressive complex 2-mediated gene repression selectively induces apoptosis in cancer cells . Genes Dev. 21 , 1050 – 1063 ( 2007 ).

24 . Copeland , R . A . Evaluation of enzyme inhibitors in drug discovery. A guide for medicinal chemists and pharmacologists ( John Wiley & Sons , 2005 ) .

25 . Duquenne , C . et al. Indazoles. International patent application PCT WO2011140325 ( 2011 ).

26 . Burgess , J . et al. Azaindazoles. International patent application PCT WO2012005805 ( 2012 ).

27 . Xu , C . et al. Binding of diff erent histone marks diff erentially regulates the activity and specifi city of polycomb repressive complex 2 (PRC2) . Proc. Natl. Acad. Sci. USA 107 , 19266 – 19271 ( 2010 ).

28 . Han , Z . et al. Structural basis of EZH2 recognition by EED . Structure 15 , 1306 – 1315 ( 2007 ).

29 . Margueron , R . et al. Role of the polycomb protein EED in the propagation of repressive histone marks . Nature 461 , 762 – 767 ( 2009 ).

30. Yonetani, T. & Theorell, H. Studies on liver alcohol hydrogenase complexes. 3. Multiple inhibition kinetics in the presence of two competitive inhibitors. Arch. Biochem. Biophys. 106, 243–251 (1964).

31. Richon, V. M. et al. Chemogenetic analysis of human protein methyltransferases. Chem. Biol. Drug Des. 78, 199–210 (2011).

32. Chapman, P. B. et al. Improved survival with vemurafenib in melanoma with BRAF V600E mutation. N. Engl. J. Med. 364, 2507–2516 (2011).

33. Wigle, T. J. et al. The Y641C mutation of EZH2 alters substrate specificity for histone H3 lysine 27 methylation states. FEBS Lett. 585, 3011–3014 (2011).

34. Daigle, S. R. et al. Selective killing of mixed lineage leukemia cells by a potent small-molecule DOT1L inhibitor. Cancer Cell 20, 53–65 (2011).

35. Ben-Porath, I. et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat. Genet. 40, 499–507 (2008).

36. Dornan, D. et al. Therapeutic potential of an anti-CD79b antibody–drug conjugate, anti–CD79b-vc-MMAE, for the treatment of non-Hodgkin lymphoma. Blood 114, 2721–2729 (2009).

37. Renan, M. J. How many mutations are required for tumorigenesis? Implications from human cancer data. Mol. Carcinog. 7, 139–146 (1993).

38. Kaelin, W. G. Jr. Choosing anticancer drug targets in the postgenomic era. J. Clin. Invest. 104, 1503–1506 (1999).

39. Li, R. & Stafford, J. A. Kinase Inhibitor Drugs (John Wiley & Sons, Inc., 2009).

40. Tsai, J. et al. Discovery of a selective inhibitor of oncogenic B-Raf kinase with potent antimelanoma activity. Proc. Natl. Acad. Sci. USA 105, 3041–3046 (2008).

41. Kwak, E. L. et al. Anaplastic lymphoma kinase inhibition in non–small-cell lung cancer. N. Engl. J. Med. 363, 1693–1703 (2010).

42. Copeland, R. A. Protein methyltransferase inhibitors as personalized cancer therapeutics. Drug Discov. Today Ther. Strateg. published online, doi:10.1016/j.ddstr.2011.08.001 (16 September 2011).

43. Copeland, R. A., Solomon, M. E. & Richon, V. M. Protein methyltransferases as a target class for drug discovery. Nat. Rev. Drug Discov. 8, 724–732 (2009).

44. Vedadi, M. et al. A chemical probe selectively inhibits G9a and GLP methyltransferase activity in cells. Nat. Chem. Biol. 7, 566–574 (2011).

45. Ernst, T. et al. Inactivating mutations of the histone methyltransferase gene EZH2 in myeloid disorders. Nat. Genet. 42, 722–726 (2010).

46. Nikoloski, G. et al. Somatic mutations of the histone methyltransferase gene EZH2 in myelodysplastic syndromes. Nat. Genet. 42, 665–667 (2010).

47. Ntziachristos, P. et al. Genetic inactivation of the polycomb repressive complex 2 in T cell acute lymphoblastic leukemia. Nat. Med. 18, 298–301 (2012).

Acknowledgments We thank D. Johnston and A. Basavapathruni for performing DOT1L and WHSC1 enzyme selectivity assays, K. Kuplast for help with the LCC calculations, A. Santospago for preparation of assay plates and R. Gould for helpful discussions.

Author contributions L.J. made the enzymes. K.W.K. and E.J.O. designed compounds including EPZ005687. T.J.W., C.R.M. and C.J.S. performed the enzyme inhibition assays, and T.J.W. performed substrate competitions, Yonetani-Theorell analysis and the in vitro EZH2 pull-down assay. S.K.K., N.M.W., C.J.A., C.R.K., J.S. and J.D.S. performed the intracellular inhibition of H3K27 methylation ELISA. S.K.K. and N.M.W. performed the western blotting of all methyl marks and proliferation assays. S.K.K., N.M.W. and J.J.S. performed gene expression and cell cycle experiments. S.K.K., T.J.W., K.W.K., A.R., J.J.S., M.P.S., R.M.P., R.C., M.P.M., V.M.R., R.A.C. and H.K. designed studies and interpreted results. S.K.K., T.J.W., K.W.K. and R.A.C. wrote the paper.

Competing financial interests The authors declare competing financial interests: details accompany the online version of the paper.

Additional information Supplementary information, chemical compound information and chemical probe information are available in the online version of the paper. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html. Correspondence and requests for materials should be addressed to K.W.K.


MAPPING THE HMTome
The Power of Personalized Therapeutics

Epizyme is creating personalized therapeutics for patients with genetically defined cancers based on breakthrough discoveries in the field of epigenetics.

EZH2: In pre-clinical development for patients with genetically defined lymphomas and solid tumors

[Figure: map of the HMTome, lysine methyltransferase (KMT) family, with labels for individual enzymes including the MLL, EZH, SETD, SUV39H, EHMT, NSD/WHSC1, PRDM and SMYD groups.]

DOT1L: In Phase I development for patients with MLL-r, a genetically defined type of acute leukemia

To learn more about moving science forward in genetically defined cancers, visit www.epizyme.com

Epizyme mapped the HMTome, a therapeutically important class of enzymes known as histone methyltransferases (HMTs) that are proven drivers of diseases such as cancer. The HMTome includes two major families: lysine methyltransferases (KMTs) and arginine methyltransferases (RMTs). Epizyme is creating small molecule HMT inhibitors as personalized therapeutics for the treatment of patients with genetically defined cancers.

[Figure: map of the HMTome, arginine methyltransferase (RMT) family and related enzymes, with labels for individual enzymes including the PRMT group, CARM1, DOT1L and the METTL and NSUN groups.]

www.nature.com/reprintcollections/epigenetics
