

nature biotechnology  VOLUME 31 NUMBER 3 MARCH 2013 251

RESOURCE

A library of TAL effector nucleases spanning the human genome

Yongsub Kim1,5, Jiyeon Kweon1,5, Annie Kim1,5, Jae Kyung Chon2,5, Ji Yeon Yoo3,5, Hye Joo Kim1, Sojung Kim1, Choongil Lee1, Euihwan Jeong1, Eugene Chung1, Doyoung Kim1, Mi Seon Lee3, Eun Mi Go3, Hye Jung Song3, Hwangbeom Kim4, Namjin Cho4, Duhee Bang4, Seokjoong Kim3 & Jin-Soo Kim1

Transcription activator–like (TAL) effector nucleases (TALENs) can be readily engineered to bind specific genomic loci, enabling the introduction of precise genetic modifications such as gene knockouts and additions. Here we present a genome-scale collection of TALENs for efficient and scalable gene targeting in human cells. We chose target sites that did not have highly similar sequences elsewhere in the genome to avoid off-target mutations and assembled TALEN plasmids for 18,740 protein-coding genes using a high-throughput Golden-Gate cloning system. A pilot test involving 124 genes showed that all TALENs were active and disrupted their target genes at high frequencies, although two of these TALENs became active only after their target sites were partially demethylated using an inhibitor of DNA methyltransferase. We used our TALEN library to generate single- and double-gene-knockout cells in which NF-κB signaling pathways were disrupted. Compared with cells treated with short interfering RNAs, these cells showed unambiguous suppression of signal transduction.

1 National Creative Initiatives Research Center for Genome Engineering and Department of Chemistry, Seoul National University, Gwanak-gu, Seoul, South Korea. 2 Department of Interdisciplinary Program in Bioinformatics, Seoul National University, Gwanak-gu, Seoul, South Korea. 3 ToolGen, Inc., Geumcheon-Gu, Seoul, South Korea. 4 Department of Chemistry, Yonsei University, Seoul, South Korea. 5 These authors contributed equally to this work. Correspondence should be addressed to J.-S.K. ([email protected]) or S.K. ([email protected]).

Received 5 October 2012; accepted 29 January 2013; published online 17 February 2013; doi:10.1038/nbt.2517

The human genome, which encodes ~20,000 genes, was sequenced over a decade ago1,2, but the biological functions of most genes remain elusive. To explore the functions of these genes and to identify druggable targets, researchers often rely on short interfering RNAs (siRNAs) that suppress the expression of a gene of interest. Despite the broad utility of this approach in basic research and drug discovery, siRNAs are limited by many factors. (i) siRNAs induce gene knockdown rather than gene knockout. Thus, 10–50% of the activities of target genes almost always remain after siRNA treatment3. (ii) siRNAs are not specific, displaying sequence-dependent off-target effects4. It takes only several nucleotide matches between a portion of the 21-bp siRNA sequence, known as the seed region, and the 3′ untranslated region of an off-target gene to trigger suppression of gene expression5. Thus, hundreds of genes in addition to the intended target gene can be downregulated by a single siRNA. (iii) siRNAs can broadly affect cellular physiology by activating an innate immune response and competing with endogenous microRNAs (miRNAs) for proteins such as Dicer and the RNA-induced silencing complex, which are critical for miRNA function6,7. (iv) Furthermore, many genes are refractory to inhibition by siRNAs3.

The gold standard in genetics is gene knockout, a complete disruption of genetic information, rather than gene knockdown. Gene knockout can be achieved by gene targeting through homologous recombination, an endogenous DNA double-strand break (DSB) repair process. A homologous donor DNA, termed a targeting vector, which can recombine with the chromosomal locus of interest, is introduced into cells8. Because the efficiency of homologous recombination is extremely low in mammalian and other higher eukaryotic cells, ranging from 10⁻⁶ to 10⁻⁷, isolation of clonal populations of cells with a disrupted allele inevitably relies on the use of a selectable marker gene embedded in the targeting vector. To knock out a gene of interest in diploid cells completely, the other allele must be disrupted in a second, separate round of homologous recombination after removal of the marker gene or by using another targeting vector that contains a different marker gene, a time-consuming, labor-intensive and costly procedure. Creation of a double- or triple-gene knockout in cell lines is even more difficult, if not impossible, because of the lack of multiple selectable markers. To make matters worse, many transformed cell lines, such as the widely used HeLa and human embryonic kidney (HEK)293 model systems, are not diploid but contain three or more copies of most chromosomes9,10. To the best of our knowledge, no systematic gene-knockout experiments have been done in these polyploid cell lines.

Programmable nucleases, which include zinc finger nucleases (ZFNs), TALENs, and CRISPR-Cas–based RNA-guided endonucleases, known as RGENs, are promising new tools that enable targeted genome modifications in cell lines and organisms11–17. These enzymes induce site-specific DSBs in a genome, which trigger error-prone non-homologous end-joining (NHEJ), another endogenous DSB repair system. As a result, these engineered nucleases induce small insertions and deletions (indels) at the target site, often disrupting genetic information. No homologous DNA with a selectable marker gene is used in this process. Furthermore, two alleles of a gene of interest can be disrupted in a single round of nuclease transfection18,19.

npg © 2013 Nature America, Inc. All rights reserved.


In addition to disrupting genes, programmable nucleases can also be used to induce precise genomic changes or insert new sequences at the DSB site by enhancing the efficiency of homologous recombination12. ZFNs and TALENs, although they share the same FokI-derived nuclease domain, differ in that they employ distinctive DNA-binding arrays: ZFNs use zinc finger arrays20 and TALENs use TAL effector repeat arrays14. Because these arrays recognize target DNA sequences in a modular fashion, tailor-made DNA-binding arrays with desired specificities can be constructed by mixing and matching characterized modules. Although these enzymes have been successfully used for targeted genome modification in diverse systems, many custom-made ZFNs and TALENs fail to induce the targeted mutations. The success rates of ZFNs and TALENs are in the range of 24–82% (refs. 21,22) and 64–88% (refs. 23–25), respectively.

Here, we present a collection of TALENs that is designed to target every protein-coding gene in the human genome. To this end, we used a computer program to search for unique TALEN target sites in each gene to avoid potential off-target mutations and developed a one-step Golden-Gate cloning system to assemble TALEN plasmids on an unprecedented scale. Our study suggests that TALENs can have a nearly 100% success rate if carefully chosen sites are targeted.

RESULTS

Design of prototype TALENs

We first optimized the architecture of TALENs by testing variable fusion junctions that connect a TAL effector array to the FokI nuclease domain for their ability to cleave target sites with different spacer lengths. TALENs, which function as pairs, recognize two half-sites separated by spacers and cleave DNA in the spacers. We used RFP-GFP reporters, which contain potential target sites with variable spacers between the RFP- and GFP-encoding DNA sequences, to measure the activities of TALENs in HEK293 cells26. In these reporters, the GFP sequence is fused to the RFP sequence out of frame, and functional GFP is expressed only when TALENs induce DSBs at the target site, whose repair by error-prone NHEJ gives rise to indels that often result in frameshifting mutations (Fig. 1a). From among several TALENs with different fusion junctions that we tested in this assay, we chose one (L4) that showed high activity at target sites with 12- to 14-bp spacers but no or negligible activity at sites with <12-bp or >14-bp spacers (Fig. 1b,c). Compared to the two original TALEN architectures that contain additional amino acid residues between the TAL effector array and the FokI sequence (S+28 and S+63 in Fig. 1b,c)14, our TALEN constructs showed mutagenesis activities at sites with a narrow range of spacers, a desirable property for high specificity. These new architectures should provide new options in genome engineering.

Golden-Gate assembly system

We developed a one-step Golden-Gate cloning system to assemble TALEN plasmids with variable lengths in a high-throughput manner. Although Golden-Gate cloning methods have been used for the assembly of TALEN plasmids previously27–31, these methods either rely on the use of PCR, or require gel isolation of DNA segments, or require at least two rounds of subcloning steps. Our new Golden-Gate system employs a total of 424 TAL effector array plasmids (6 × 64 tripartite arrays, 2 × 16 bipartite arrays, and 2 × 4 monopartite arrays) and 8 obligatory heterodimeric FokI-encoding plasmids32. We used four TAL effector repeat domains, termed NI, NN, NG and HD repeat variable di-residues (RVDs), each specific to one of the four bases (A, G, T and C, respectively), to make these modular arrays33,34. These repeat domains consisted of 34 amino acid residues with similar sequences (Supplementary Table 1); the RVDs at positions 12 and 13 determine the base specificities.
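The one-RVD-per-base code described above can be illustrated with a minimal Python sketch. The mapping (NI for A, NN for G, NG for T, HD for C) comes from the text; the function name `rvds_for_sequence` and the error handling are hypothetical and are not part of the authors' software.

```python
# RVD-to-base code stated in the text: NI -> A, NN -> G, NG -> T, HD -> C.
BASE_TO_RVD = {"A": "NI", "G": "NN", "T": "NG", "C": "HD"}


def rvds_for_sequence(dna: str) -> list[str]:
    """Return the RVD that would read each base of `dna` (written 5'->3').

    Hypothetical helper for illustration only.
    """
    dna = dna.upper()
    unknown = set(dna) - set(BASE_TO_RVD)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [BASE_TO_RVD[base] for base in dna]


print(rvds_for_sequence("GATC"))  # ['NN', 'NI', 'NG', 'HD']
```

Grouping consecutive RVDs in threes would then correspond to choosing one of the 64 (4 × 4 × 4) tripartite array plasmids for a given position.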

The TAL effector array plasmids are divided into six subgroups according to their positions (Fig. 2a). Each array in a given position generates the same four-base overhang sequence when digested with BsaI. TAL effector arrays in different positions produce different four-base overhang sequences. One member in each of the six positions is chosen; the six arrays are combined with each other for subcloning into one of the FokI expression plasmids (Fig. 2b). This system allows construction of TALEN plasmids that contain at least 14.5 (4 tripartite arrays + 2 monopartite arrays) and up to 18.5 (6 tripartite arrays) RVD modules in a single Golden-Gate reaction. The last half-repeat (0.5) is encoded in the FokI plasmids. These TALENs recognize DNA sequences of 16–20 bp in length, including a conserved base T at the 5′ end. Because TALENs function as dimers, these TALEN pairs recognize 32- to 40-bp DNA sequences, which consist of two half-sites separated by 12- to 14-bp spacers.
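The module arithmetic in this paragraph can be checked with a short Python sketch. It assumes, based on the plasmid counts given above (6 × 64, 2 × 16 and 2 × 4), that four positions always carry tripartite arrays and that two positions may carry a mono-, bi- or tripartite array; that layout and the names below are assumptions made for illustration.

```python
from itertools import product

FIXED_TRIPARTITE_POSITIONS = 4    # assumed: four positions always hold 3-RVD arrays
VARIABLE_ARRAY_SIZES = (1, 2, 3)  # assumed: mono-, bi- or tripartite at the other two
HALF_REPEAT = 0.5                 # last half-repeat, encoded in the FokI plasmid


def talen_sizes() -> dict[float, int]:
    """Map each reachable RVD module count to the half-site length (bp) it reads."""
    sizes = {}
    for a, b in product(VARIABLE_ARRAY_SIZES, repeat=2):
        rvd_modules = 3 * FIXED_TRIPARTITE_POSITIONS + a + b + HALF_REPEAT
        # Each full repeat and the half-repeat read one base; add the 5' T.
        half_site_bp = int(rvd_modules + 0.5) + 1
        sizes[rvd_modules] = half_site_bp
    return dict(sorted(sizes.items()))


for rvds, bp in talen_sizes().items():
    print(f"{rvds} RVD modules -> {bp}-bp half-site, {2 * bp} bp recognized per pair")
```

Under these assumptions the sketch reproduces the stated range: 14.5 to 18.5 RVD modules, 16- to 20-bp half-sites and 32 to 40 bp recognized per TALEN pair.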

Figure 1 Optimization of TALEN architecture. (a) Scheme of the RFP-GFP reporter-based assay for measuring the gene-editing activities of various TALEN constructs. (b) TALEN target site and amino-acid sequences in the fusion junctions that connect the TAL effector array to the FokI domain. (c) Comparison of TALEN gene-editing activities. Reporter plasmids that contain target sites with variable spacers (color coded) and TALEN plasmids were co-transfected into HEK293 cells, and GFP+ cells were counted by flow cytometry. S+28 and S+63 are two prototype TALEN architectures previously reported14. Error bars indicate s.e.m. from at least three independent experiments.

[Figure 1 graphic not reproduced. Panel a: PCMV-driven mRFP, target site and EGFP (out of frame, +1 or +2); TALEN-induced DSBs and frame-shifting mutations via error-prone NHEJ. Panel b: left and right half-sites separated by 8- to 21-bp spacers; fusion junctions L1, L2, L3, L4, S+28 and S+63. Panel c: y axis, TALEN activity (% of GFP+ cells); x axis, variable fusion junctions; spacers of 8–21 bp.]

A pilot-scale construction of TALENs

To test whether the new TALEN architecture assembled using the single-step Golden-Gate system could provide a framework for efficient genome editing in cultured human cells, we constructed 15 TALEN pairs, each targeting a different human gene. These TALENs each consisted of 18.5 RVD modules and an obligatory heterodimeric FokI domain. We measured the genome-editing activities of these TALENs in HEK293 cells using T7 endonuclease I (T7E1), an enzyme that specifically recognizes and cleaves heteroduplexes formed by the hybridization of wild-type and mutant DNA sequences18. HEK293 cells were transfected with plasmids that encoded each TALEN pair, and PCR products amplified from genomic DNA were assayed by T7E1. Mutation frequencies were estimated from the intensities of cleaved bands relative to intact bands. Mutations were detected at all 15 sites at frequencies of 3.9–43% (Fig. 2c). This pilot experiment demonstrates that both the new TALEN architecture and the Golden-Gate assembly system are robust enough to allow genome-scale construction of TALENs.
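The text does not state how band intensities were converted into mutation frequencies. As an illustration only, the sketch below uses a commonly applied estimate for mismatch-cleavage assays, indel % = 100 × (1 − √(1 − f)), where f is the cleaved fraction; whether the authors used this exact formula is an assumption, and the function name and intensity values are hypothetical.

```python
import math


def estimate_indel_percent(intact: float, cleaved: list[float]) -> float:
    """Estimate mutation frequency (%) from T7E1 gel band intensities.

    Assumes random re-annealing of PCR products and cleavage of every duplex
    that contains a mutant strand, so that the uncleaved fraction equals
    (1 - p)**2 for a mutant fraction p. This model is an assumption here.
    """
    total = intact + sum(cleaved)
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = sum(cleaved) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical intensities: 36% of the signal lies in the two cleaved bands.
print(round(estimate_indel_percent(64.0, [20.0, 16.0]), 1))
```

The square root reflects that a heteroduplex forms only when at least one of the two annealed strands is mutant, so the cleaved fraction grows faster than the underlying mutation frequency.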

Choice of TALEN target sites in the human genome

To construct a human TALEN library (a genome-scale collection of human gene-targeting TALENs), we obtained the DNA sequences of 18,742 protein-coding genes from the databases compiled by the HUGO Gene Nomenclature Committee35 in March 2011 and at the National Center for Biotechnology Information36 in November 2011. We then developed a computational strategy to identify potential target sites in each gene according to the following criteria (Supplementary Fig. 1).

1. To maximize gene-disrupting activities, we identified 40-bp target sequences with 12- or 13-bp spacers in each gene, which are recognized by TALEN pairs that consist of two 18.5 RVDs, the largest TALEN monomers that can be assembled using our Golden-Gate cloning system.

2. Our algorithm searched for unique target sequences with the minimum number of potential off-target sites. Thus, we avoided target sites with homologous sequences that can be found elsewhere in the genome. As a result, the most homologous potential off-target sites carried at least 7-base mismatches with the target site of choice. We defined potential off-target sites as any homodimeric or heterodimeric half-sites separated by a 12- to 14-bp spacer.

3. To ensure complete gene disruption, we searched for sites located upstream in each protein-coding region near the start codon and excluded sites that resided within the downstream 30% of each coding sequence.

4. We chose target sites that resided in a common exon, in case a single gene is expressed in two or more splicing variants.
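Criteria 1 and 3 can be sketched as a simple scan over a coding sequence. This is a minimal Python illustration, not the authors' program: it ignores the uniqueness check (criterion 2) and exon structure (criterion 4), and it assumes that the right TALEN reads the bottom strand, so its conserved 5′ T appears as an A at the 3′ end of the site on the top strand. The function name and parameters are hypothetical.

```python
def candidate_sites(cds: str, half_len: int = 20, spacers=(12, 13),
                    max_fraction: float = 0.7):
    """Yield (start, spacer_len, left, right) for candidate TALEN sites.

    `cds` is a coding sequence written 5'->3'. Sites must end within the
    first `max_fraction` of the sequence (the downstream 30% is excluded).
    """
    cds = cds.upper()
    limit = int(len(cds) * max_fraction)
    for start in range(len(cds)):
        for spacer in spacers:
            end = start + 2 * half_len + spacer
            if end > limit:
                continue
            left = cds[start:start + half_len]
            right = cds[end - half_len:end]
            # Left half-site begins with T; right half-site ends with A on the
            # top strand (the 5' T of the TALEN that reads the bottom strand).
            if left[0] == "T" and right[-1] == "A":
                yield start, spacer, left, right


# Toy sequence with one site near the 5' end (two 20-bp half-sites, 12-bp spacer).
toy = "T" + "C" * 19 + "G" * 12 + "C" * 19 + "A" + "G" * 100
for hit in candidate_sites(toy):
    print(hit)
```

A full implementation would additionally compare every candidate against the genome and keep only sites whose closest match differs by at least seven bases, as stated in criterion 2.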

These criteria were stringent enough to avoid poor sites that were not appropriate for site-specific gene knockout but were flexible enough to identify multiple sites in most genes. A genome-wide survey identified at least one TALEN site that satisfied all of the criteria in 17,120 out of 18,742 genes (91%). These sites were classified as group A (Supplementary Tables 2,3 and Supplementary Methods). To identify additional target sites, we loosened the criteria and identified a total of 169,362 TALEN sites in 18,740 protein-coding genes (9.0 sites per gene on average). Our website (http://www.talenlibrary.net/) reports up to 10 TALEN sites in each gene. The vast majority (98%) of these sites are targeted by 18.5/18.5 RVD TALENs. Of these sites 95% (160,712/169,362) do not have any homologous sites with >85% sequence identity (that is, 6-base mismatches/≤40-bp sequence) in the human genome.

Genome-scale assembly of TALENs

Next we chose one target site per gene and assembled TALEN expression plasmids using the Golden-Gate cloning system. To facilitate the process of large-scale assembly, we preferentially chose 18.5/18.5 RVD TALEN sites with 12-bp spacers in each gene. A total of 37,480 plasmids encoding 18,740 TALEN pairs were assembled in 96-well plates according to our optimized protocol (Fig. 2b and Supplementary Methods).

We performed quality control of the TALEN plasmids by (i) digestion with EcoRI restriction enzyme and (ii) DNA sequencing. We chose one E. coli transformant from each of the 399 96-well plates. TALEN plasmids were purified from four colonies that were grown from each transformant, and then digested with EcoRI. Correctly assembled TALEN plasmids yielded a diagnostic 2.5-kbp DNA band. Typically, at least two out of four plasmids isolated from each transformant passed this test (Supplementary Fig. 2). To confirm the TAL effector array sequences in these plasmids, we subjected 298 plasmids that had passed the EcoRI test to dideoxy DNA sequencing.

[Figure 2 graphic not reproduced. Panel a: array positions 1–6, each with 64 (4 × 4 × 4) tripartite arrays, together with 16 (4 × 4) bipartite and 4 monopartite arrays; FokI expression vector with pCMV, NLS-HA tag, N-terminal and C-terminal domains and FokI. Panel b, one-step Golden-Gate cloning: 1. Mix TAL effector array plasmids (KanR). 2. Add expression vectors and Golden-Gate reaction mixture. 3. Assemble TALEN plasmid by thermocycling. 4. Transformation. 5. Overnight culture in LB (AmpR). 6. Mix glycerol and store at –80 °C. Panel c: T7E1 gels with and without TALEN, with indels (%), for ACAT1, FOXF1, TRIM45, FCRL3, BEND2, EME1, MOSPD2, GFAP, AGT, KRT23, BBS9, ANGPT1, AIRE, CYP11A1 and F8.]

Figure 2 Assembly of TALEN plasmids using a one-step Golden-Gate cloning system. (a) Scheme of Golden-Gate assembly of TALEN plasmids. A total of 424 TAL effector array plasmids (64 × 6 + 16 × 2 + 4 × 2) (KanR) and 8 FokI plasmids (AmpR) are used. (b) High-throughput Golden-Gate cloning in 96-well plates. Six TAL effector array plasmids and one FokI plasmid are mixed in each well. BsaI releases the TAL effector arrays and allows an ordered assembly of six TAL effector arrays into the FokI plasmid. (c) Pilot test of 15 TALENs using the T7E1 assay. Asterisks indicate the expected positions of DNA bands cleaved by T7E1. The numbers at the bottom of the gels indicate mutation frequencies measured by band intensities.


All of these plasmids contained the expected sequences, confirming the robustness of our Golden-Gate cloning system.

We chose 104 TALEN pairs that targeted different genes to validate their genome-editing activities in HEK293 cells using the T7E1 assay. Mutations were detected at 101 out of 103 target sites that were successfully PCR-amplified (assay sensitivity, ~0.5%) (Supplementary Fig. 3). Thus, the success rate of our TALENs was 98.1%. These TALENs were highly active: 76% (78/103) of TALENs were associated with mutation frequencies (indel %) of >5%, and 55% (57/103) of TALENs showed frequencies of >10% (Fig. 3a). The average mutation frequency was 16%. The two best-performing TALENs, which were specific to the FKTN and OPN1SW genes, induced mutations at frequencies of 54% and 46%, respectively. Mutations induced by ten different TALENs were confirmed by DNA sequencing. Indels, signatures of error-prone NHEJ repair of DSBs, were detected at the target sites (Supplementary Fig. 4a).

Analysis and rescue of initially inactive TALENs

We investigated why two of the 118 TALENs (including 15 TALENs used in the pilot study), which were designed to target CYP27B1 and FGFR3, failed to show any genome-editing activity in this assay. First, we sequenced the two target sites in HEK293 cells and found no polymorphisms or mutations at these sites. Thus, the target sites do not contain sequence variations that prevent the TALENs from binding. Next, we tested these TALENs using the RFP-GFP reporter assay. The two TALENs were highly active and specific, producing GFP+ cells when co-transfected with a reporter that contains their cognate target site but not with a reporter that does not contain the target site (Fig. 3b). To reconcile this unexpected discrepancy in cleavage of episomal and chromosomal DNA, we hypothesized that these TALENs could not access the endogenous sites owing to the local chromatin structure or DNA methylation. To test this idea, we treated cells with either trichostatin A, an inhibitor of histone deacetylase (HDAC), or 5-aza-2′-deoxycytidine (5-aza-dC), an inhibitor of DNA methyltransferase. The two TALENs both induced mutations when cells were pretreated with the inhibitor of DNA methylation but not with the HDAC inhibitor, as shown by the T7E1 assay (Fig. 3c) and by DNA sequencing (Supplementary Fig. 4b). In addition, bisulfite DNA sequencing analysis revealed that all six CpG dinucleotides in each of the two target sites were, indeed, methylated (Fig. 3d). These results show that TALENs cannot recognize heavily methylated DNA and that 5-aza-dC rescued the TALENs that did not cleave chromosomal DNA initially, consistent with recent biochemical and structural studies37–39.

To avoid treating cell lines with 5-aza-dC, a mutagen, we tested whether the two genome-inactive TALENs could be replaced with other TALENs that target the same gene. We synthesized new TALENs specific to the CYP27B1 and FGFR3 genes (Supplementary Table 4) and found that these TALENs were able to induce targeted genome modifications at the two new sites (Fig. 3e).

Undetectable off-target mutations

Both ZFNs and TALENs can induce mutations at sites other than their intended target sites40–42. As noted above, we have chosen all the TALEN sites carefully to minimize off-target effects. We investigated whether our TALENs could still induce unintended mutations at sites whose sequences are homologous to those of on-target sites. We chose ten TALENs that were highly active, showing mutation frequencies of 30% or greater (36% on average) at their on-target sites, searched for the most likely off-target sites based on the sequence homology in the genome, and tested the genome-editing activities of these TALENs at these potential off-target sites using the T7E1 assay (Supplementary Fig. 5a and Supplementary Table 5). None of these TALENs elicited any measurable mutations at their most likely off-target sites (assay sensitivity, ~0.5%), demonstrating exquisite specificities of these TALENs.

The specificities of RVDs are not absolute. Among the four RVDs, NN appears to be the most promiscuous, as it cannot distinguish G from A efficiently (refs. 14,33). To address this issue, we searched for additional sites of the ten TALENs that would be homologous to the on-target TALEN sites if the promiscuity of NN is considered. Nine TALEN sites were not associated with any potential off-target sites that carry six or fewer mismatches and thus still belong to group A. Only one TALEN site, which contained an unusually high number (17) of guanines in the 40-bp sequence, was associated with additional off-target sites; in this case, there were three potential off-target sites, each of which carries six mismatched bases. The T7E1 assay showed that this TALEN did not induce off-target mutations at these sites (Supplementary Fig. 5b). Among the 17,120 group A sites, the vast majority (86%) still satisfy the group A criterion when we take the promiscuity of NN into account. As expected, the other 14% of the sites contain an unusually high number of guanines in their target sequences. These results suggest that we would still choose most of the same sites even when we consider the promiscuity of NN. Because our website reports up to ten sites in each gene, potential users who are concerned about NN promiscuity may avoid sites that contain too many guanines.

[Figure 3 graphic not reproduced. Recoverable axis labels: No. of TALENs versus mutation frequency (%); TALEN activity (% of GFP+ cells); indels (%).]

Figure 3 Targeted gene-disrupting activities of TALENs. (a) Distribution of TALEN activities. (b) RFP-GFP reporter assays of two genome-inactive TALENs. Zif268-FokI (ZFN) was used as a positive control. Error bars, means ± s.e.m. (c) TALEN-driven mutations in drug-treated cells. (d) Cytosine methylation at two initially unmodified sites revealed by bisulfite sequencing. The DNA sequences of two half-sites and spacers are shown in upper case and lower case, respectively. CpG dinucleotides are underlined. Closed and open circles indicate methylated and unmethylated cytosines, respectively. (e) Targeted mutagenesis using a second set of TALENs.
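The NN-aware off-target search can be illustrated with a small sketch. It assumes the canonical RVD code (NI for A, HD for C, NG for T, NN for G) and models promiscuity by letting NN also accept A. The function names and both example sequences are hypothetical and are not taken from the TALEN library or its design software.

```python
# Bases each RVD is allowed to pair with. Assumption: canonical code
# (NI->A, HD->C, NG->T, NN->G); RELAXED additionally lets NN accept A.
STRICT = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}
RELAXED = {**STRICT, "NN": "GA"}


def rvds_for(site):
    """RVD array that would be chosen for an on-target half-site."""
    base_to_rvd = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}
    return [base_to_rvd[base] for base in site]


def mismatches(rvds, candidate, code):
    """Count positions where the candidate base is not accepted by the RVD."""
    if len(rvds) != len(candidate):
        raise ValueError("RVD array and candidate must have the same length")
    return sum(base not in code[rvd] for rvd, base in zip(rvds, candidate))


on_target = "TGGAGCTGGA"   # hypothetical half-site
candidate = "TAAAGCTGAA"   # hypothetical genomic sequence
rvds = rvds_for(on_target)
strict_mm = mismatches(rvds, candidate, STRICT)
relaxed_mm = mismatches(rvds, candidate, RELAXED)
```

In this toy case every difference between the two sequences is a G-to-A change, so the candidate looks distant under the strict code but identical once NN promiscuity is allowed, which is why guanine-rich sites are the ones most affected.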

Targeted chromosomal deletions using TALENs

Homologous genes are often clustered in a chromosome. Many of these homologous genes are redundant in their functions. A single gene knockout often fails to cause any phenotypic changes owing to these redundancies. We propose that our TALENs could be used to induce large chromosomal deletions in a targeted manner to remove a cluster of homologous and functionally redundant genes. The TALEN library reported here consists of 18,740 TALEN pairs whose target sites are distributed all across the human genome. Thus, two neighboring TALEN target sites are separated by about 170 kbp, on average, in a chromosome.
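The average spacing follows from dividing the genome length by the number of TALEN pairs. A minimal sketch, assuming a haploid human genome of roughly 3.2 Gbp (a round figure that is not stated in the text):

```python
GENOME_BP = 3.2e9       # assumed haploid genome size, not from the paper
TALEN_PAIRS = 18_740    # library size reported in the text

# Mean distance between neighboring target sites if sites were spread evenly.
mean_spacing_kbp = GENOME_BP / TALEN_PAIRS / 1_000
print(f"mean spacing: about {mean_spacing_kbp:.0f} kbp")
```

The result lands close to the 170-kbp figure; the exact value depends on the genome size assumed, and real sites cluster in genes rather than being evenly spread.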

As a proof-of-principle experiment, we investigated whether and how efficiently two TALEN pairs can induce large chromosomal deletions in human cell lines. Two combinations of TALEN pairs were tested: (i) a BRAF-specific TALEN and a NOBOX-specific TALEN and (ii) an EDNRB-specific TALEN and an FGF14-specific TALEN. The two target sites of the first pair are separated by 3.6 Mbp on chromosome 7, and those of the second pair are separated by 24 Mbp on chromosome 13 (Fig. 4 and Supplementary Table 6). Targeted deletions of these large chromosomal segments were detected by PCR and confirmed by DNA sequencing. The deletion frequencies measured by limiting dilution and PCR detection were 0.6% (BRAF and NOBOX) and 0.4% (EDNRB and FGF14), which are in line with those obtained with ZFNs (ref. 43). Recently, two TALEN pairs were used to induce a 7-kbp chromosomal deletion within a single gene in porcine fibroblasts at a frequency of 10% (ref. 24). Apparently, TALEN-induced deletions on the megabase scale are at least tenfold less efficient than those on the kilobase scale.
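How a frequency is obtained from limiting dilution and PCR can be sketched with the single-hit Poisson model on which such analyses are commonly based: a well loaded with d genome copies is PCR-negative with probability exp(-f·d), where f is the deletion frequency. The code below is not the analysis program used in this study; the function name and the plate counts are hypothetical.

```python
import math


def estimate_frequency(doses, positives, totals, rel_tol=1e-9):
    """Maximum-likelihood event frequency under a single-hit Poisson model.

    doses[i]     : genome copies loaded per well at dilution i
    positives[i] : wells giving the deletion-specific PCR product
    totals[i]    : wells tested at dilution i
    """
    negatives = [t - p for p, t in zip(positives, totals)]
    if sum(positives) == 0 or sum(negatives) == 0:
        raise ValueError("need both positive and negative wells")

    def score(f):
        # Derivative of the log-likelihood; strictly decreasing in f.
        # exp(-x) / (1 - exp(-x)) is used to avoid overflow for large x.
        gain = sum(p * d * math.exp(-f * d) / -math.expm1(-f * d)
                   for p, d in zip(positives, doses))
        loss = sum(n * d for n, d in zip(negatives, doses))
        return gain - loss

    lo, hi = 1e-12, 1.0
    if score(hi) > 0:
        raise ValueError("frequency too high to estimate at these doses")
    while hi / lo > 1.0 + rel_tol:
        mid = math.sqrt(lo * hi)      # bisect on a logarithmic scale
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Hypothetical plate: genome copies per well, positive wells, wells tested.
doses = [1000, 300, 100]
positives = [24, 20, 9]
totals = [24, 24, 24]
freq = estimate_frequency(doses, positives, totals)
print(f"estimated deletion frequency: {freq:.2%}")
```

With a single dilution the estimate reduces to -ln(fraction of negative wells)/d, which is a convenient sanity check; confidence limits such as those reported with the frequencies would need an additional step that is omitted here.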

Isolation and characterization of gene-knockout cells

TALENs have been successfully used for making gene-knockout animals and plants (refs. 25,44–47), but the generation of gene-knockout cell lines using TALENs has lagged far behind. In fact, although many TALENs specific to human or other mammalian genes have been reported (refs. 14,23,26,40), clonal populations of gene-knockout cells have yet to be created using TALENs. It seems particularly difficult to disrupt all the copies of a gene of interest in polyploid cells. Most model cell lines such as HEK293 cells and HeLa cells contain three or more copies of most chromosomes (refs. 9,10). Thus, the disruption of one or two alleles does not yield gene-knockout cell lines. In addition, some engineered nucleases are cytotoxic (ref. 48). Thus, cells that contain nuclease-induced mutations often die out, making it difficult to isolate rare clonal populations of gene-knockout cells (ref. 18).

As a proof-of-principle experiment for creating gene-knockout cells, we used TALENs to disrupt a series of genes associated with NF-κB, a transcription factor linked to inflammation, septic shock, apoptosis, oncogenesis and DNA repair (refs. 49,50). We transfected HEK293 cells or HeLa cells with both TALEN plasmids and surrogate reporters (ref. 26), which enabled enrichment of cells in which a gene of interest is completely disrupted. We inspected clonal populations of cells using fluorescent PCR. Gene-knockout cells produced amplicon peaks corresponding to indels but not the wild-type peak (Fig. 5a and Supplementary Fig. 6). Note that all three alleles were mutated in a clonal population of cells (Fig. 5a). DNA sequencing confirmed the presence of frameshift mutations at target loci in these cells. No mutations were detected at potential off-target sites that are highly homologous to the target site in these clones (Supplementary Fig. 5c and Supplementary Table 5). We also performed western blot analysis and RT-PCR to confirm that target proteins were not expressed in gene-knockout cells (Fig. 5b and Supplementary Fig. 7).
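The genotyping logic (no wild-type peak, and every remaining allele frameshifted) can be written as a short check on fluorescent-PCR amplicon sizes. This is an illustrative sketch only: the function name is hypothetical, the sizes are example values (a 292-bp wild-type amplicon with peaks 31, 10 and 4 bp shorter), and size-based calls still need confirmation by sequencing, as was done here.

```python
def classify_clone(wild_type_bp, peak_sizes_bp):
    """Classify a clone from fluorescent-PCR amplicon sizes (in bp)."""
    if not peak_sizes_bp:
        raise ValueError("no peaks supplied")
    # Negative values are deletions, positive values are insertions.
    indels = [size - wild_type_bp for size in peak_sizes_bp]
    if any(delta == 0 for delta in indels):
        return "wild-type allele present"
    # An indel shifts the reading frame unless its length is a multiple of 3.
    if all(delta % 3 != 0 for delta in indels):
        return "knockout candidate (all alleles frameshifted)"
    return "mutant, but at least one in-frame allele"


# Example: three mutant peaks and no wild-type peak in a triploid clone.
call = classify_clone(292, [261, 282, 288])
print(call)
```

A clone that retains an in-frame deletion may still express a partly functional protein, which is why the sketch separates that case from a full knockout candidate.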

First, we compared TALEN-mediated gene knockout with siRNA-mediated gene knockdown. We measured NF-κB–dependent luciferase reporter activities after cells were treated with tumor necrosis factor alpha (TNFα) or interleukin-1 beta (IL-1β), two well-known cytokines that activate NF-κB signaling. Two validated siRNAs, each specific to one of the two receptor genes, IL1R1 or TNFR1, only partially suppressed the NF-κB signaling induced by IL-1β or TNFα, respectively (Fig. 5c). Thus, the two cytokines still activated the NF-κB–dependent reporter in siRNA-transfected wild-type cells, compared to cells not treated with cytokines (P < 0.01, Student’s t-test). In contrast, IL1R1 and TNFR1 knockout cells showed complete suppression of IL-1β– and TNFα-mediated NF-κB signal transduction, respectively. In addition, the two cytokines did not activate the luciferase reporter in double-knockout cells in which both IL1R1 and TNFR1 were disrupted. Transfection of the IL1R1 cDNA into IL1R1 knockout cells restored IL-1β-mediated signal transduction, demonstrating that the gene-knockout cells were not impaired in NF-κB signaling (Fig. 5d). Furthermore, mitoxantrone (Novantrone), an anti-cancer drug that induces nonspecific DSBs in cells, strongly activated the reporter gene in the single- and double-gene-knockout cells, demonstrating that NF-κB activation triggered by DNA damage is independent of signaling through the two cytokines (Fig. 5e). These gene-knockout cells can be used for dissecting NF-κB pathways and screening for reagents or factors that selectively modulate NF-κB activation independent of signaling via IL-1β or TNFα or both.

[Figure 4 graphic not reproduced. Recoverable from panel c, deletion frequency (%), estimate (lower limit to upper limit): BRAF + NOBOX (3.6-Mbp deletion, chr. 7), 0.61 (0.26 to 1.42); EDNRB + FGF14 (24-Mbp deletion, chr. 13), 0.40 (0.17 to 0.91).]

Figure 4 TALEN-mediated chromosomal deletions. (a) Scheme of targeted chromosomal deletions induced by TALENs. Zigzag lines indicate TALEN target sites. Small arrows are PCR primers. (b) PCR products corresponding to large chromosomal deletions. (c) Deletion frequencies measured by dilution PCR. (d) DNA sequences of deletion junctions. TALEN recognition sequences are underlined and shown in boldface.

To demonstrate the usefulness of TALEN-mediated gene knockout further, we disrupted the SMEK1 gene in HEK293 cells. This gene had been identified as a potential mediator of NF-κB activation in a recent genome-wide siRNA screen (ref. 51). Both IL-1β and TNFα strongly activated the luciferase reporter gene in two independent knockout clones, revealing that this gene was a false positive (Fig. 5f).

DISCUSSION

We have developed a scalable Golden-Gate assembly system that consists of a total of 432 plasmids (424 TAL effector array plasmids and 8 FokI plasmids) and used this system to construct a TALEN library, a collection of 18,740 TALEN pairs, to disrupt or modify every protein-coding gene in the human genome. Recently, two different solid surface–based assembly methods for the synthesis of TALEN plasmids were reported: iterative capped assembly and fast ligation-based automatable solid-phase high-throughput (FLASH) systems (refs. 23,52). Although these methods can be automated at least partially, multiple cycles of ligation and washing steps in addition to PCR are involved before subcloning into an expression vector. Furthermore, FLASH uses up to microgram quantities of TAL effector array plasmids in each reaction, making it cost-inefficient and cumbersome to scale up. Compared to these methods, our Golden-Gate cloning system allows PCR-free TALEN assembly with much less hands-on time and greatly reduced plasmid quantities. In addition, cloning efficiency is higher in general with Golden-Gate systems than with solid surface–based methods.
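The plasmid counts can be checked by enumeration. According to the Online Methods, the 424 array plasmids were derived from 84 TAL effector plasmids covering all combinations of the four RVD modules as tripartite, bipartite and monopartite arrays. The short sketch below only reproduces that count; the variable names are illustrative.

```python
from itertools import product

RVDS = ("NN", "HD", "NI", "NG")

# All ordered RVD combinations of length 3 (tripartite), 2 and 1.
arrays = {n: list(product(RVDS, repeat=n)) for n in (3, 2, 1)}
counts = {n: len(combos) for n, combos in arrays.items()}
source_plasmids = sum(counts.values())

# Totals stated in the text: 424 array plasmids plus 8 FokI plasmids.
system_plasmids = 424 + 8
print(counts, source_plasmids, system_plasmids)
```

The enumeration gives 64, 16 and 4 combinations, matching the stated 84 source plasmids, and the two plasmid classes add up to the 432 plasmids of the assembly system.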

We reported here that 98.4% (124/126) of TALENs induced site-specific mutations at high frequencies. Only 2 out of 126 TALENs (including 15 TALENs used in the pilot test, 101 successful TALENs, 2 genome-inactive TALENs, 2 backup TALENs, and 6 TALENs used for making gene-knockout cells) failed to show any measurable genome-editing activities. Each of these two uncleavable TALEN sites contained 6 CpG dinucleotides, all of which turned out to be fully methylated (Fig. 3d). By contrast, each of 124 sites that were successfully targeted contained four or fewer CpG dinucleotides (Supplementary Table 4). DNA methylation patterns vary widely among cell types. To avoid unexpected failure in TALEN-mediated genome-editing, we recommend choosing target sites free of CpG dinucleotides or sites with four or fewer CpG dinucleotides. Our website (http://www.talenlibrary.net/) provides up to ten TALEN sites per gene, many of which are free of CpG dinucleotides: 35.6% (60,266/169,362) of all TALEN sites are CpG-free. There are 14,866 (79.3%) genes with at least one CpG-free site, and 18,688 genes (99.7%) with at least one site with four or fewer CpG dinucleotides. Among the 18,740 TALEN pairs we assembled, 9,299 (49.6%) or 17,865 (95.3%) TALENs recognize CpG-free sites or sites with four or fewer CpG dinucleotides, respectively (Supplementary Table 3).
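The site-selection recommendation can be expressed as a simple filter. The sketch below counts CpG dinucleotides in each candidate sequence, discards candidates above a threshold and lists CpG-free candidates first. Function names and sequences are hypothetical, and whether the spacer should be counted together with the two half-sites is an assumption left to the user.

```python
def count_cpg(seq):
    """Number of CpG dinucleotides (C followed by G).

    CG is its own reverse complement, so the count is the same on both strands.
    """
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def rank_sites(sites, max_cpg=4):
    """Keep sites with at most max_cpg CpGs, fewest CpGs first."""
    scored = sorted((count_cpg(site), site) for site in sites)
    return [site for n_cpg, site in scored if n_cpg <= max_cpg]


# Hypothetical candidate target sequences for one gene.
candidates = ["TACGTTCGAACGGCGCGT", "ATCGGA", "TTGACCATGGA"]
preferred = rank_sites(candidates)
print(preferred)
```

A threshold of four follows the recommendation in the text; because methylation varies among cell types, a CpG-free site remains the safer default where one is available.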

Two recent studies reported that the success rates of TALEN-mediated gene disruption range from 64% to 88% in mammalian cells (refs. 23,24). These success rates are certainly impressive compared to those achieved with ZFNs (refs. 21,53). On a genomic scale, however, failure rates of 12–36% correspond to 2,000–7,000 genes, if one TALEN is used per gene. Our study suggests that a nearly 100% success rate can be achieved with TALENs by avoiding heavily methylated sites, making it feasible to initiate genome-scale gene-knockout studies.
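The genome-scale consequence of a given success rate is simple arithmetic, sketched below. Using the 18,740 genes covered by the library as the base is an assumption (the text gives only the rounded range), and the last line is an extrapolation of the 124/126 rate observed here, not a result reported in the paper.

```python
def genes_left_untargeted(n_genes, success_rate):
    """Expected number of genes not disrupted with one TALEN pair per gene."""
    return round(n_genes * (1.0 - success_rate))


N_GENES = 18_740   # assumed base: one TALEN pair per gene in the library

low = genes_left_untargeted(N_GENES, 0.88)          # best previously reported rate
high = genes_left_untargeted(N_GENES, 0.64)         # worst previously reported rate
this_study = genes_left_untargeted(N_GENES, 124 / 126)
print(low, high, this_study)
```

The first two values round to the 2,000 to 7,000 genes quoted in the text, while a success rate near 98% would leave only a few hundred genes needing a backup site.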

In the short term, TALEN-mediated gene-knockout experiments should be useful for validating or invalidating the functions of genes identified by genome-scale siRNA library screens, which have revealed many potential drug targets associated with various diseases (ref. 54). Unfortunately, many of the newly identified genes turned out to be false positives that resulted from off-target effects (refs. 55,56). To distinguish true positives from false positives, one could use TALENs to knock out candidate genes as demonstrated here, rather than using another siRNA to downregulate the genes. In the long term, TALENs could enable the creation of a series of cell lines in which a family of druggable target genes encoding proteases, kinases, phosphodiesterases and G protein–coupled receptors are disrupted.

[Figure 5 graphic not reproduced. Recoverable labels: fPCR peaks at 292, 261, 282 and 288 bp, with alleles labeled wild type (WT), Δ31, Δ4 and Δ10; western blot rows TNFR1 and GAPDH; y axes, relative luciferase activity; treatments (–), TNFα, IL-1β and MX; siRNAs siCTRL, siTNFR1 and siIL1R1; cells WT, TNFR1 KO, IL1R1 KO, TNFR1/IL1R1 double KO, and SMEK1 KO clones no. 1 and no. 2.]

Figure 5 NF-κB-associated gene-knockout cell lines. (a) Genotyping of a clonogenic gene-knockout cell line. A single clone derived from cells transfected with TALEN plasmids was analyzed by fPCR (upper panel) and DNA sequencing (lower panel). TALEN recognition sequences are underlined and shown in boldface. (b) Western blot of gene-knockout cells. (c) Comparison of TALEN-mediated gene knockout (KO) with siRNA-mediated gene knockdown. (d) Transfection of cDNA into gene-knockout cells restores cytokine-dependent NF-κB activation. The IL1R1 or TNFR1 cDNA was transfected into IL1R1 or TNFR1 knockout cells, respectively. Note that the overexpression of the TNFR1 cDNA in TNFR1 knockout cells induced NF-κB activation even in the absence of TNFα. (e) Cytokine-independent NF-κB activation by mitoxantrone (MX) in single- and double-gene-knockout cells. (f) Invalidation of a false-positive gene identified in a genome-wide siRNA screen. Two independent clones of SMEK1 knockout cells were treated with cytokines, and the luciferase reporter activities were measured. Error bars indicate s.e.m. from at least three independent experiments.


In principle, genetic screens can be performed using these cell lines instead of siRNA libraries. The TALEN library reported here provides a foundation for functional genomic studies based on gene knockout rather than knockdown in mammalian cells.

In addition, some of the TALENs in the library can be used to generate precise genetic modifications by homology-directed repair. For example, a reporter gene can be fused to a gene of interest to monitor the expression of the gene or trafficking of the protein. Single-nucleotide changes can be made in a gene of interest to study phenotypic differences associated with single-nucleotide polymorphisms in an otherwise isogenic background.

Although the primary use of our TALENs in drug discovery will be to identify and validate druggable target genes, we envision that some TALENs or their improved versions would be useful therapeutically themselves. One example of a targeted nuclease used in this way is Sangamo’s CCR5-specific ZFN for the treatment of AIDS, which is now under clinical investigation in the United States (ref. 57). TALENs can also be used to disrupt disease-associated genes or correct genetic defects in stem and somatic cells. We propose that the TALEN library reported here will be widely used for basic and biomedical research. To this end, we are committed to making the TALEN and TAL effector–array plasmids (Supplementary Table 7) available to the scientific community. Requests for TALEN plasmids can be placed at our website (http://www.talenlibrary.net/).

METHODS

Methods and any associated references are available in the online version of the paper.

Note: Supplementary information is available in the online version of the paper.

ACKNOWLEDGMENTS

This work was supported by the National Research Foundation of Korea (J.-S.K., 2012-0001225), the Intelligent Synthetic Biology Center of the Global Frontier Project funded by the Ministry of Education, Science and Technology, Korea (D.B., 2011-0031956), Korea Institute of Planning and Evaluation for Technology in Food, Agriculture, Forestry and Fisheries (S.K., 311062-04-2-sb1010), and Plant Molecular Breeding Center of Next-Generation BioGreen 21 Program (S.K., PJ009081).

AUTHOR CONTRIBUTIONS

J.-S.K., S.K. and D.B. supervised the research and wrote the manuscript. All the other authors performed the experiments.

COMPETING FINANCIAL INTERESTS

The authors declare competing financial interests: details are available in the online version of the paper.

Published online at http://www.nature.com/doifinder/10.1038/nbt.2517. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.

1. Venter, J.C. et al. The sequence of the human genome. Science 291, 1304–1351 (2001).

2. Lander, E.S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).

3. Krueger, U. et al. Insights into effective RNAi gained from large-scale siRNA validation screening. Oligonucleotides 17, 237–250 (2007).

4. Jackson, A.L. et al. Expression profiling reveals off-target gene regulation by RNAi. Nat. Biotechnol. 21, 635–637 (2003).

5. Birmingham, A. et al. 3′ UTR seed matches, but not overall identity, are associated with RNAi off-targets. Nat. Methods 3, 199–204 (2006).

6. Khan, A.A. et al. Transfection of small RNAs globally perturbs gene regulation by endogenous microRNAs. Nat. Biotechnol. 27, 549–555 (2009).

7. Sledz, C.A., Holko, M., de Veer, M.J., Silverman, R.H. & Williams, B.R. Activation of the interferon system by short-interfering RNAs. Nat. Cell Biol. 5, 834–839 (2003).

8. Smithies, O., Gregg, R.G., Boggs, S.S., Koralewski, M.A. & Kucherlapati, R.S. Insertion of DNA sequences into the human chromosomal beta-globin locus by homologous recombination. Nature 317, 230–234 (1985).

9. Bylund, L., Kytola, S., Lui, W.O., Larsson, C. & Weber, G. Analysis of the cytogenetic stability of the human embryonal kidney cell line 293 by cytogenetic and STR profiling approaches. Cytogenet. Genome Res. 106, 28–32 (2004).

10. Macville, M. et al. Comprehensive and definitive molecular cytogenetic characterization of HeLa cells by spectral karyotyping. Cancer Res. 59, 141–150 (1999).

11. Mali, P. et al. RNA-guided human genome engineering via Cas9. Science doi:10.1126/science.1232033 (3 January 2013).

12. Urnov, F.D., Rebar, E.J., Holmes, M.C., Zhang, H.S. & Gregory, P.D. Genome editing with engineered zinc finger nucleases. Nat. Rev. Genet. 11, 636–646 (2010).

13. Bibikova, M., Beumer, K., Trautman, J.K. & Carroll, D. Enhancing gene targeting with designed zinc finger nucleases. Science 300, 764 (2003).

14. Miller, J.C. et al. A TALE nuclease architecture for efficient genome editing. Nat. Biotechnol. 29, 143–148 (2011).

15. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science doi:10.1126/science.1231143 (3 January 2013).

16. Hwang, W.Y. et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat. Biotechnol. advance online publication, doi:10.1038/nbt.2501 (29 January 2013).

17. Cho, S.W., Kim, S., Kim, J.M. & Kim, J.-S. Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat. Biotechnol. advance online publication, doi:10.1038/nbt.2507 (29 January 2013).

18. Kim, H.J., Lee, H.J., Kim, H., Cho, S.W. & Kim, J.S. Targeted genome editing in human cells with zinc finger nucleases constructed via modular assembly. Genome Res. 19, 1279–1288 (2009).

19. Santiago, Y. et al. Targeted gene knockout in mammalian cells by using engineered zinc-finger nucleases. Proc. Natl. Acad. Sci. USA 105, 5809–5814 (2008).

20. Kim, Y.G., Cha, J. & Chandrasegaran, S. Hybrid restriction enzymes: zinc finger fusions to Fok I cleavage domain. Proc. Natl. Acad. Sci. USA 93, 1156–1160 (1996).

21. Kim, J.S., Lee, H.J. & Carroll, D. Genome editing with modularly assembled zinc-finger nucleases. Nat. Methods 7, 91; author reply 91–92 (2010).

22. Gupta, A. et al. An optimized two-finger archive for ZFN-mediated gene targeting. Nat. Methods 9, 588–590 (2012).

23. Reyon, D. et al. FLASH assembly of TALENs for high-throughput genome editing. Nat. Biotechnol. 30, 460–465 (2012).

24. Carlson, D.F. et al. Efficient TALEN-mediated gene knockout in livestock. Proc. Natl. Acad. Sci. USA 109, 17382–17387 (2012).

25. Cade, L. et al. Highly efficient generation of heritable zebrafish gene mutations using homo- and heterodimeric TALENs. Nucleic Acids Res. 40, 8001–8010 (2012).

26. Kim, H., Um, E., Cho, S.R., Jung, C. & Kim, J.S. Surrogate reporters for enrichment of cells with nuclease-induced mutations. Nat. Methods 8, 941–943 (2011).

27. Li, T. et al. Modularly assembled designer TAL effector nucleases for targeted gene knockout and gene replacement in eukaryotes. Nucleic Acids Res. 39, 6315–6325 (2011).

28. Zhang, F. et al. Efficient construction of sequence-specific TAL effectors for modulating mammalian transcription. Nat. Biotechnol. 29, 149–153 (2011).

29. Cermak, T. et al. Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Res. 39, e82 (2011).

30. Morbitzer, R., Elsaesser, J., Hausner, J. & Lahaye, T. Assembly of custom TALE-type DNA binding domains by modular cloning. Nucleic Acids Res. 39, 5790–5799 (2011).

31. Weber, E., Gruetzner, R., Werner, S., Engler, C. & Marillonnet, S. Assembly of designer TAL effectors by Golden Gate cloning. PLoS ONE 6, e19722 (2011).

32. Guo, J., Gaj, T. & Barbas, C.F. III. Directed evolution of an enhanced and highly efficient FokI cleavage domain for zinc finger nucleases. J. Mol. Biol. 400, 96–107 (2010).

33. Boch, J. et al. Breaking the code of DNA binding specificity of TAL-type III effectors. Science 326, 1509–1512 (2009).

34. Moscou, M.J. & Bogdanove, A.J. A simple cipher governs DNA recognition by TAL effectors. Science 326, 1501 (2009).

35. Seal, R.L., Gordon, S.M., Lush, M.J., Wright, M.W. & Bruford, E.A. genenames.org: the HGNC resources in 2011. Nucleic Acids Res. 39, D514–D519 (2011).

36. Pruitt, K.D., Tatusova, T., Brown, G.R. & Maglott, D.R. NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy. Nucleic Acids Res. 40, D130–D135 (2012).

37. Valton, J. et al. Overcoming transcription activator-like effector (TALE) DNA binding domain sensitivity to cytosine methylation. J. Biol. Chem. 287, 38427–38432 (2012).

38. Deng, D. et al. Recognition of methylated DNA by TAL effectors. Cell Res. 22, 1502–1504 (2012).

39. Bultmann, S. et al. Targeted transcriptional activation of silent oct4 pluripotency gene by combining designer TALEs and inhibition of epigenetic modifiers. Nucleic Acids Res. 40, 5368–5377 (2012).

40. Mussolino, C. et al. A novel TALE nuclease scaffold enables high genome editing activity in combination with low toxicity. Nucleic Acids Res. 39, 9283–9293 (2011).

41. Pattanayak, V., Ramirez, C.L., Joung, J.K. & Liu, D.R. Revealing off-target cleavage specificities of zinc-finger nucleases by in vitro selection. Nat. Methods 8, 765–770 (2011).

42. Gabriel, R. et al. An unbiased genome-wide analysis of zinc-finger nuclease specificity. Nat. Biotechnol. 29, 816–823 (2011).


43. Lee, H.J., Kim, E. & Kim, J.S. Targeted chromosomal deletions in human cells using zinc finger nucleases. Genome Res. 20, 81–89 (2010).

44. Tesson, L. et al. Knockout rats generated by embryo microinjection of TALENs. Nat. Biotechnol. 29, 695–696 (2011).

45. Huang, P. et al. Heritable gene targeting in zebrafish using customized TALENs. Nat. Biotechnol. 29, 699–700 (2011).

46. Sander, J.D. et al. Targeted gene disruption in somatic zebrafish cells using engineered TALENs. Nat. Biotechnol. 29, 697–698 (2011).

47. Li, T., Liu, B., Spalding, M.H., Weeks, D.P. & Yang, B. High-efficiency TALEN-based gene editing produces disease-resistant rice. Nat. Biotechnol. 30, 390–392 (2012).

48. Cornu, T.I. et al. DNA-binding specificity is a major determinant of the activity and toxicity of zinc-finger nucleases. Mol. Ther. 16, 352–358 (2008).

49. Perkins, N.D. Integrating cell-signalling pathways with NF-kappaB and IKK function. Nat. Rev. Mol. Cell Biol. 8, 49–62 (2007).

50. Volcic, M. et al. NF-kappaB regulates DNA double-strand break repair in conjunction with BRCA1-CtIP complexes. Nucleic Acids Res. 40, 181–195 (2012).

51. Gewurz, B.E. et al. Genome-wide siRNA screen for mediators of NF-kappaB activation. Proc. Natl. Acad. Sci. USA 109, 2467–2472 (2012).

52. Briggs, A.W. et al. Iterative capped assembly: rapid and scalable synthesis of repeat-module DNA such as TAL effectors from individual monomers. Nucleic Acids Res. 40, e117 (2012).

53. Kim, S., Lee, M.J., Kim, H., Kang, M. & Kim, J.S. Preassembled zinc-finger arrays for rapid construction of ZFNs. Nat. Methods 8, 7 (2011).

54. Sigoillot, F.D. & King, R.W. Vigilance and validation: Keys to success in RNAi screening. ACS Chem. Biol. 6, 47–60 (2011).

55. Lin, X. et al. siRNA-mediated off-target gene silencing triggered by a 7 nt complementation. Nucleic Acids Res. 33, 4527–4535 (2005).

56. Adamson, B., Smogorzewska, A., Sigoillot, F.D., King, R.W. & Elledge, S.J. A genome-wide homologous recombination screen identifies the RNA-binding protein RBMX as a component of the DNA-damage response. Nat. Cell Biol. 14, 318–328 (2012).

57. Holt, N. et al. Human hematopoietic stem/progenitor cells modified by zinc-finger nucleases targeted to CCR5 control HIV-1 in vivo. Nat. Biotechnol. 28, 839–847 (2010).


ONLINE METHODS

Construction of plasmids for Golden-Gate assembly of TALENs. The 424 TAL effector array plasmids were constructed using a total of 84 TAL effector plasmids, which include 64 tripartite, 16 bipartite and 4 monopartite arrays of all possible combinations of NN, HD, NI and NG RVD modules custom-synthesized by GenScript. To avoid undesired effects, we excluded rare human codons and limited the maximum sequence identity between different RVDs to 81% (Supplementary Table 1). Each of the 84 plasmids was subjected to PCR amplification using carefully selected primer sets that confer different overhang sequences upon digestion by BsaI according to the six TAL effector array positions. The PCR amplicons were then subcloned into a backbone vector with the kanamycin-resistance selectable marker. The 8 FokI expression plasmids contain the ampicillin-resistance gene, the CMV promoter, an HA epitope tag, a nuclear localization signal, the N-terminal 135 amino acids of AvrBs3, one of the four RVD half-repeats and the Sharkey FokI domain (DAS or RR) (ref. 32). The entire amino-acid and DNA sequences of a TALEN pair assembled using this system are shown in Supplementary Figure 8.

Mammalian cell culture and transfection. HEK293T/17 (ATCC, CRL-11268) and HeLa cells (ATCC, CCL-2) were maintained in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 100 units/ml penicillin, 100 µg/ml streptomycin, 0.1 mM nonessential amino acids, and 10% fetal bovine serum (FBS). We transfected 400,000 HEK293 cells using 3 µl of polyethylenimine and 1 µg of plasmid DNA in 24-well plates. We transfected 200,000 HeLa cells with Lipofectamine 2000 (Invitrogen) according to the manufacturer’s protocol.

Genome-editing activities of TALENs measured using T7E1. Genomic DNA was extracted using the G-DEX IIc Genomic DNA Extraction Kit (iNtRON) 3 d after transfection. TALEN target sites were PCR-amplified using primer pairs listed in Supplementary Table 6. For sequencing analysis, PCR products were purified and subcloned into the T-Blunt vector (SolGent) and subjected to dideoxy DNA sequencing. The T7E1 analysis was done as described previously (ref. 18).

TALEN-induced genome rearrangements. Genomic DNA was isolated from cells transfected with two pairs of TALENs. For estimating frequencies of chromosomal rearrangements, genomic DNA was serially diluted and subjected to digital PCR using appropriate primer pairs (Supplementary Table 6). The results were analyzed using the Extreme Limiting Dilution Analysis program as described previously (ref. 43). The breakpoint junctions were analyzed by dideoxy DNA sequencing.

Analysis and rescue of genome-inactive TALENs. One day before TALEN plasmid transfection, HEK293 cells (two million) were pretreated with 0.2 µM 5-aza-dC or 100 ng/ml trichostatin A in 24-well plates. After 3 d of incubation, genomic DNA was isolated from transfected cells and subjected to the T7E1 assay. To determine whether the TALEN sites were methylated, genomic DNA was treated with bisulfite using the EpiTect Bisulfite kit (QIAGEN) according to the manufacturer’s protocol. The bisulfite-converted DNA was then amplified using bisulfite-specific primers (Supplementary Table 6), and the amplified products were subcloned into the T-Blunt vector and sequenced.

Gene-knockout cell lines. HEK293T/17 and HeLa cells were co-transfected with TALEN plasmids and surrogate reporter plasmids that contain the TALEN target site. Gene-knockout cells were enriched by selection as described (ref. 26) and cloned by limiting dilution in 96-well plates. Typically, cells were maintained for 2 weeks in 96-well plates to isolate single clones. These clones were analyzed using T7E1, fPCR and dideoxy sequencing.

Episomal reporter assay to detect NF-κB signaling. The NF-κB–dependent firefly luciferase reporter was constructed by placing three tandem copies of the NF-κB recognition element (TGGGGACTTTCCGC) in front of a synthetic promoter that consists of the TATA-box and the initiator element. Gene-knockout or wild-type cells were co-transfected with the luciferase reporter plasmid and the Renilla luciferase plasmid. After 24 h of incubation, cells were treated with TNFα (1 ng/ml) or IL-1β (25 ng/ml) and incubated for 15 h. Cells were lysed in 1× lysis buffer (100 µl) (Promega), and the dual luciferase assays were done according to the manufacturer’s protocol.
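The relative luciferase activities plotted for the episomal NF-κB reporter assay are typically derived from dual-luciferase readings by dividing the firefly signal by the co-transfected Renilla signal and scaling to a reference sample. The sketch below shows that normalization under stated assumptions: the function name and the readings are hypothetical, and the choice of reference (set to 100%) is not specified in the Methods.

```python
def relative_activity(firefly, renilla, firefly_ref, renilla_ref):
    """Firefly/Renilla ratio of a sample as a percentage of a reference ratio."""
    if renilla <= 0 or renilla_ref <= 0 or firefly_ref <= 0:
        raise ValueError("Renilla and reference readings must be positive")
    sample_ratio = firefly / renilla
    reference_ratio = firefly_ref / renilla_ref
    return 100.0 * sample_ratio / reference_ratio


# Hypothetical luminometer readings (arbitrary units).
knockout_pct = relative_activity(firefly=500, renilla=100,
                                 firefly_ref=1000, renilla_ref=100)
print(f"{knockout_pct:.0f}% of reference")
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency and cell number, so that a drop in the ratio reflects reduced NF-κB–dependent transcription rather than fewer transfected cells.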
