

nature biotechnology  VOLUME 30 NUMBER 5 MAY 2012 447

ARTICLES

A proteomics approach for the identification and cloning of monoclonal antibodies from serum

Wan Cheung Cheung¹,², Sean A Beausoleil¹,², Xiaowu Zhang¹, Shuji Sato¹, Sandra M Schieferl¹, James S Wieler¹, Jason G Beaudet¹, Ravi K Ramenani¹, Lana Popova¹, Michael J Comb¹, John Rush¹ & Roberto D Polakiewicz¹

We describe a proteomics approach that identifies antigen-specific antibody sequences directly from circulating polyclonal antibodies in the serum of an immunized animal. The approach involves affinity purification of antibodies with high specific activity, followed by analysis of digested antibody fractions by nano-flow liquid chromatography coupled to tandem mass spectrometry. High-confidence peptide spectral matches of antibody variable regions are obtained by searching a reference database created by next-generation DNA sequencing of the B-cell immunoglobulin repertoire of the immunized animal. Finally, heavy and light chain sequences are paired and expressed as recombinant monoclonal antibodies. Using this technology, we isolated monoclonal antibodies for five antigens from the sera of immunized rabbits and mice. The antigen-specific activities of the monoclonal antibodies recapitulate or surpass those of the original affinity-purified polyclonal antibodies. This technology may aid the discovery and development of vaccines and antibody therapeutics, and help us gain a deeper understanding of the humoral response.

¹Cell Signaling Technology, Inc., Danvers, Massachusetts, USA. ²These authors contributed equally to this work. Correspondence should be addressed to R.D.P. ([email protected]).

Received 6 December 2011; accepted 21 February 2012; published online 25 March 2012; doi:10.1038/nbt.2167

© 2012 Nature America, Inc. All rights reserved.

After exposure to a foreign antigen, the mammalian humoral immune response generates a diverse repertoire of antibodies through changes in the genome of B cells by V(D)J gene recombination, gene conversion (in rabbit and chicken) and somatic hypermutation [1–4]. Each B-cell clone that undergoes this process contributes a specific monoclonal antibody to the diverse polyclonal response that is critical to fending off infection. For over a hundred years, polyclonal antibodies have been used as tools for basic research and clinical diagnostics, as well as for passive immunity therapy for infectious diseases [5,6]. However, polyclonal antibodies enriched for defined properties have historically been difficult to produce at large scale, limiting their value for diagnostic and therapeutic applications. The hybridoma method [7] provided for the first time a way to obtain monoclonal antibodies, and it opened the door to interrogating the complexity of the humoral immune response to an antigen. Since then, various newer technologies have been developed to obtain antigen-specific monoclonal antibodies. Some of these alternative strategies (involving B-cell immortalization, single-cell sorting and molecular cloning, or phage display) have become increasingly effective, but the antibodies they generate do not necessarily represent the actual antibody repertoire found in circulation, and the methods are often labor intensive and time consuming [8–15].

To directly investigate the monoclonal composition of polyclonal antibodies in circulating serum, we used a proteomics approach based on nano-flow liquid chromatography coupled to mass spectrometry (LC-MS/MS). Such an approach, however, is difficult to implement because of the high complexity of the polyclonal mixture and the lack of a reference database of the constantly evolving repertoire of antibodies generated against foreign antigens in an individual animal.

To address these challenges, we used affinity purification to reduce sample complexity, and next-generation DNA sequencing to generate a reference database derived specifically from the animal's B-cell repertoire [16] (Fig. 1). To validate our approach, we generated monoclonal antibodies with potential for diagnostic application from the serum of rabbits immunized with human progesterone receptor A/B (PR A/B) peptide antigens. We focused on PR A/B because of its clinical significance as a biomarker used in immunohistological assays for the diagnosis of breast cancer [17].

RESULTS
We immunized New Zealand white rabbits with human PR A/B peptides conjugated to keyhole limpet hemocyanin. Next, we screened antigen-specific antibody activity in the crude serum of each animal to select the rabbit with the strongest enzyme-linked immunosorbent assay (ELISA) and western blot analysis signals to PR A/B (data not shown). Serum from this animal was collected from 20 ml of blood, and RNA was obtained from splenic B cells. We isolated total IgG from the serum using a protein A sepharose column and purified antigen-specific polyclonal antibodies by affinity chromatography using a custom column consisting of antigen-specific peptide conjugated to sepharose beads. Bound IgGs were washed extensively with PBS and then subjected to sequential elutions with progressively acidic buffers (pH 3.5, pH 2.7 and pH 1.8) (Fig. 2a). Fractions from each elution were collected, neutralized and screened by antigen-specific ELISA (data not shown) and western blot analysis of lysate from the PR A/B-expressing cell line T47D and the PR A/B-negative cell line HT1080 (Fig. 2a). We found that specific activity to PR A/B by western blot analysis was greatly enriched in the pH 1.8 fraction, enriched to a lesser extent in the pH 2.7 fraction and undetectable in the pH 3.5 fraction when the polyclonal fractions were matched by concentration.

We therefore sought to identify the antibodies in the pH 1.8 fraction using LC-MS/MS. To maximize sequence coverage, we divided 5 µg of polyclonal antibody into equal aliquots and digested each aliquot separately using either chymotrypsin, elastase, pepsin or trypsin. A total of four LC-MS/MS runs using a 45-min gradient were collected (Fig. 2b), producing an average of 10,000 spectra per run (data not shown). However, to interpret these LC-MS/MS spectra, a custom reference database was necessary, as translated genome databases would not contain sequences that evolved upon antibody affinity maturation. To generate such a custom database of immunoglobulin V-region sequences, we isolated RNA from total splenocytes collected from the same animal that showed strong specific activity to PR A/B. We generated immunoglobulin heavy and light chain variable region amplicons using primers specific to rabbit immunoglobulin γ- and κ-chains. Primers contained barcodes and followed the specific requirements for 454 titanium fusion primer design for the Roche 454 next-generation sequencing platform. To increase the number of V-region sequences collected, we combined three 454 GS Junior sequencing runs, which consisted of γ- and κ-chains. We obtained 80,000 high-confidence reads based on Roche 454 filters, 44,363 of which contained the entire V-region. These sequences provided the basis for the proteomics approach described below. We collected 5,279 unique γ-chain complementarity determining region 3 (CDR3) sequences and 11,681 unique κ-chain CDR3 sequences of varying length that followed a Gaussian distribution (Supplementary Fig. 1). Consistent with previous data, this rabbit preferentially used VH1 (V1S69 + V1S40, >64%) followed by VH4 (V1S44 + V1S45, ~30%) in heavy chain V(D)J rearrangement [18–20] (Supplementary Fig. 2).
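Counting unique CDR3 sequences and ranking them by read frequency can be illustrated with a short sketch. This is not the authors' pipeline: it assumes the CDR3 of each read has already been extracted upstream, and the function name `rank_cdr3` and the toy reads are hypothetical.

```python
from collections import Counter

def rank_cdr3(cdr3_reads):
    """Return {cdr3: rank}; rank 1 is the most frequent CDR3.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    counts = Counter(cdr3_reads)
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    return {cdr3: i + 1 for i, cdr3 in enumerate(ordered)}

# Toy input: one CDR3 string per sequencing read (hypothetical data).
reads = ["GFAL", "KLGL", "GFAL", "GNL", "GFAL", "GNL"]
ranks = rank_cdr3(reads)
unique_count = len(ranks)
```

The number of keys in `ranks` corresponds to the count of unique CDR3 sequences, and the rank values play the role of a frequency ranking within one chain type.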

We next searched each LC-MS/MS run using SEQUEST [21] against the custom immunoglobulin V-region database to identify the peptides in the sample. To estimate the false-discovery rate (FDR) of the search results, we used the target-decoy approach by generating a composite database of forward- and reverse-oriented sequences [22]. Peptide spectral matches were filtered to a final FDR of ≤2% using a linear discriminant analysis [23], taking into account enzyme specificity when possible (that is, for the aliquots digested with chymotrypsin or trypsin). An example of a heavy chain CDR3 peptide identified with high confidence using this method is shown in Figure 2c. Individual runs were combined and a total of 2,356 V-region peptide spectral matches were identified with an FDR of 1.8%.
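The target-decoy idea described above can be sketched as follows. This is a simplified illustration, not the published workflow: it filters on a single score rather than a linear discriminant, it uses decoys/targets as one common FDR estimate, and the `REV_` prefix, function names and toy scores are assumptions.

```python
def make_decoys(targets):
    """Build reverse-oriented decoy entries from {name: sequence}."""
    return {"REV_" + name: seq[::-1] for name, seq in targets.items()}

def fdr_at(psms, threshold):
    """Estimate FDR among PSMs scoring >= threshold.

    psms is a list of (score, is_decoy) tuples.
    """
    targets = sum(1 for s, d in psms if s >= threshold and not d)
    decoys = sum(1 for s, d in psms if s >= threshold and d)
    return decoys / targets if targets else 0.0

def threshold_for_fdr(psms, max_fdr=0.02):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr."""
    best = None
    for cutoff in sorted({s for s, _ in psms}, reverse=True):
        if fdr_at(psms, cutoff) <= max_fdr:
            best = cutoff
    return best

# Toy PSM list: (score, is_decoy); values are hypothetical.
psms = [(5.0, False), (4.5, False), (4.0, False), (3.5, True), (3.0, False)]
cutoff = threshold_for_fdr(psms, max_fdr=0.02)
```

Searching the composite database means a decoy hit is, by construction, a false match, so the decoy count above a cutoff approximates the number of false target matches at that cutoff.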

A database of antibody V-region sequences is analogous to a database of protein isoforms. As a result, traditional approaches using shotgun sequencing by LC-MS/MS, in which only a few peptides are often used to confidently identify a protein, are insufficient for identifying an antibody V-region sequence in a polyclonal antibody mixture. In addition, because the sequences of antibody V-regions can vary by as little as one amino acid, high mass accuracy helped provide additional confidence in peptide spectral matches. Each V-region peptide spectral match with a mass error from −5 to 5 p.p.m. as determined by SEQUEST was mapped back to the entire V-region database to address the redundancy of peptide spectral matches and coverage across the data set (Supplementary Fig. 3). After remapping, for each V-region sequence, we determined the total number of peptides, the unique number of peptides, spectrum share (total peptides mapping to sequence/total V-region peptide spectral matches), the total V-region sequence coverage and the CDR3 coverage.
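The per-sequence metrics listed above can be made concrete with a minimal sketch. It assumes exact substring matching of peptides to V-region sequences and uses hypothetical function names and toy sequences; the authors' in-house software may work differently. CDR3 coverage could be obtained by applying the same `coverage` function to the CDR3 span only.

```python
def ppm_error(observed_mz, expected_mz):
    """Mass error in parts per million."""
    return (observed_mz - expected_mz) / expected_mz * 1e6

def coverage(sequence, peptides):
    """Fraction of residues covered by at least one matching peptide."""
    covered = [False] * len(sequence)
    for pep in peptides:
        start = sequence.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = sequence.find(pep, start + 1)
    return sum(covered) / len(sequence) if sequence else 0.0

def summarize(v_regions, psm_peptides):
    """Map every PSM peptide back to every V-region sequence.

    v_regions: {name: sequence}; psm_peptides: one peptide string per PSM.
    """
    total_psms = len(psm_peptides)
    summary = {}
    for name, seq in v_regions.items():
        hits = [p for p in psm_peptides if p in seq]
        summary[name] = {
            "total_peptides": len(hits),
            "unique_peptides": len(set(hits)),
            "spectrum_share": len(hits) / total_psms if total_psms else 0.0,
            "coverage": coverage(seq, set(hits)),
        }
    return summary

# Toy data (hypothetical): two V-regions sharing an N-terminal peptide.
v_regions = {"H1": "QEQLVESGGG", "H2": "QEQLKKKKKK"}
stats = summarize(v_regions, ["QEQL", "QEQL", "VES"])
```

Because a shared peptide counts toward every sequence that contains it, closely related V-regions accumulate overlapping evidence, which is why coverage and unique-peptide counts are assessed per sequence.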

[Figure 1 diagram labels: serum or plasma; affinity purification; polyclonal functional characterization; multiple protease digestion; LC-MS/MS; B-cell source; amplicon library generation; 454 NGS; SEQUEST reference database; antibody sequence ID list; antibody chain pairing and expression; monoclonal antibody functional validation.]

Figure 1 Overview of proteomics approach for identifying functionally relevant monoclonal antibodies from an immunized animal. Serum or plasma from an immunized animal is first purified by protein A or G and subsequently subjected to antigen affinity purification. Purified polyclonal antibodies are then functionally characterized to ensure specific activity enrichment. Validated purified antibodies are digested with various proteases to prepare peptide fragments to be analyzed by high mass accuracy LC-MS/MS. To identify peptide sequences corresponding to antibody fragments by SEQUEST, we generated a reference database of immunoglobulin V-region sequences by next-generation sequencing (NGS) of the immunized animal's B-cell repertoire. V-region sequences identified with high confidence that correspond to antibodies purified from the serum are identified using in-house software. These heavy and light chain sequences are then synthesized and cloned into a single-open-reading-frame antibody expression platform. Recombinant monoclonal antibodies are expressed combinatorially in a matrix of heavy and light chains, screened for precise function and compared to the specificity and activity of the original polyclonal antibody mixture.

To identify with high confidence V-region sequences that were enriched from the polyclonal mixture, we applied stringent empirical criteria in our proteomics analysis, including (i) overall high coverage (≥65%), (ii) at least 12 unique peptides, owing to the high degree of homology of V-region sequences, and (iii) high hypervariable region coverage, specifically ≥95% coverage of CDR3. Although we could identify V-region sequences using one protease alone (data not shown), we found that because of the high degree of variability in V-region sequences along with the unpredictable complexity of a polyclonal mixture, it was advantageous to use multiple proteases to increase V-region coverage. For example, multiple overlapping peptide fragments from different proteases contributed to the identification of the entire CDR3 of both heavy and light chain sequences (Fig. 2d). Identifying unique peptide spectral matches across multiple runs from multiple proteases that map to the same V-region sequence increased spectral counts and coverage across the entire V-region sequence, provided higher confidence that specific V-region sequences were present in the polyclonal mixture and further increased confidence in the quality of the sequence obtained by next-generation sequencing [24]. Using the filtering criteria described above, we identified a total of ten γ- and eight κ-chain sequences with high confidence from the pH 1.8 elution fraction (Table 1). For an example of the peptides that match the sequences shown in Figure 2d, see Supplementary Table 1 for heavy chain and Supplementary Table 2 for light chain.
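The three empirical criteria stated above translate directly into a filter. The sketch below is illustrative only: the thresholds come from the text, while the record layout, the function names and the candidate values are hypothetical. Sorting by total peptide count follows the ordering used in Table 1.

```python
MIN_COVERAGE = 0.65          # overall V-region coverage, from the text
MIN_UNIQUE_PEPTIDES = 12     # unique peptides, from the text
MIN_CDR3_COVERAGE = 0.95     # CDR3 coverage, from the text

def passes(candidate):
    """Apply the three selection criteria to one V-region record."""
    return (
        candidate["coverage"] >= MIN_COVERAGE
        and candidate["unique_peptides"] >= MIN_UNIQUE_PEPTIDES
        and candidate["cdr3_coverage"] >= MIN_CDR3_COVERAGE
    )

def select(candidates):
    """Keep passing candidates, ranked by total peptide count."""
    kept = [c for c in candidates if passes(c)]
    return sorted(kept, key=lambda c: c["total_peptides"], reverse=True)

# Hypothetical candidate records for illustration.
candidates = [
    {"id": "chain_a", "coverage": 0.82, "unique_peptides": 20,
     "cdr3_coverage": 1.0, "total_peptides": 50},
    {"id": "chain_b", "coverage": 0.90, "unique_peptides": 8,
     "cdr3_coverage": 1.0, "total_peptides": 30},
    {"id": "chain_c", "coverage": 0.70, "unique_peptides": 15,
     "cdr3_coverage": 0.60, "total_peptides": 40},
    {"id": "chain_d", "coverage": 0.96, "unique_peptides": 30,
     "cdr3_coverage": 1.0, "total_peptides": 101},
]
selected_ids = [c["id"] for c in select(candidates)]
```

Here `chain_b` fails on unique peptides and `chain_c` on CDR3 coverage, even though both exceed the overall coverage threshold.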

LC-MS/MS data provide evidence with high confidence to support the existence of V-region sequences in affinity-purified serum, but they cannot provide direct information on cognate heavy and light chain pairing, owing to proteolysis and the reduction of disulfide bonds during sample preparation. Therefore, we expressed all possible combinations of heavy and light chain pairings (an 8 × 10 matrix for a total of 80 antibodies in one 96-well plate transfection) in addition to the heavy and light chain sequences most frequently observed in the B-cell sequencing data. We then used ELISA to screen these antibodies for antigen-specific binding activity to PR A/B peptide. A total of 12 heavy- and light-chain pairs were positive by antigen-specific ELISA (Fig. 3a). Each antigen-specific, ELISA-positive clone was then tested by western blot analysis for specificity against endogenously expressed PR A/B in cell lysates (Fig. 3b). We found six clones that specifically bound to PR A/B (Fig. 3b); two clones showed a much stronger signal compared to the original polyclonal mixture when assayed at the same antibody concentration. Antigen-specific clones positive by western blot analysis were further characterized in additional assays. Two monoclonal antibodies, clone F9 (Fig. 3b–e) and clone C1 (data not shown), exhibited superior signal and specificity in western blot analysis and immunohistochemistry (Fig. 3b,c) and also reacted specifically in flow cytometry and immunofluorescence assays where the polyclonal mixture did not (Fig. 3d,e). In contrast, γ- and κ-chains selected by virtue of their highest next-generation sequencing rank did not yield antigen-specific antibodies (data not shown).
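The combinatorial pairing step can be sketched as a plate layout. This is an illustration under assumptions: chains are identified by the CDR3 sequences listed in Table 1, light chains are placed on rows and heavy chains on columns, and the column order is arbitrary, so the well IDs produced here do not necessarily correspond to the clone names in Figure 3. The function name `pairing_plate` is hypothetical.

```python
from itertools import product
from string import ascii_uppercase

def pairing_plate(light_chains, heavy_chains):
    """Map well IDs (row letter + column number) to (light, heavy) pairs."""
    if len(light_chains) > 8 or len(heavy_chains) > 12:
        raise ValueError("layout exceeds a 96-well plate (8 rows x 12 columns)")
    plate = {}
    rows = enumerate(light_chains)
    cols = enumerate(heavy_chains, start=1)
    for (r, light), (c, heavy) in product(rows, cols):
        plate[f"{ascii_uppercase[r]}{c}"] = (light, heavy)
    return plate

# CDR3 identifiers taken from Table 1; the ordering is an assumption.
heavy = ["KLGL", "GFSL", "DLGDL", "DLGNL", "GNL",
         "DFHL", "GSLGTLPL", "GFAL", "GHDDGYNYVYKL", "GFTL"]
light = ["LAGYDCTTGDCFA", "LGGYDCDNGDCFT", "LGTYDCRRADCNT", "QSTLYSSTDEIV",
         "QCSYVNSNT", "LGSYDCRSDDCNV", "LGAYDDAADNS", "LGTYDCNSADCNV"]
plate = pairing_plate(light, heavy)
```

With eight light chains and ten heavy chains the layout fills 80 wells, leaving two columns of a 96-well plate free for controls.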

We did not observe CDR3-containing peptides from the γ- and κ-chains ranked highest by next-generation sequencing based on our stringent selection criteria, and none of the CDR3 sequences from the 30 highest-ranking γ- or κ-chains (Supplementary Table 3) were identified with high confidence by our proteomics approach (data not shown). We cannot rule out that the absence of activity may be due to a lack of cognate pairing, but the fact that we could not observe any of these chains by LC-MS/MS suggests that none of the chains that were most highly ranked by next-generation sequencing were specific against the antigen. Thus, in our experiments it would have been extremely challenging to find antigen-specific antibodies relying on sequencing data alone (Table 1).

[Figure 2 panels: (a) affinity purification scheme (protein A–purified total serum IgG, antigen column, wash off nonspecific IgG, elute antigen-specific IgG with pH gradient) and western blot of the pH 3.5, pH 2.7 and pH 1.8 fractions showing PR A and PR B, with ladder in kDa; (b) purified polyclonal pH fraction digested with chymotrypsin, elastase, pepsin or trypsin before LC-MS/MS; (c) heavy chain CDR3 identification, MS/MS spectrum with annotated b and y ions, relative abundance versus m/z; (d) heavy chain and light chain coverage by chymotrypsin, elastase, pepsin and trypsin peptides, with CDRs marked.]

Figure 2 Affinity purification of progesterone receptor–specific polyclonal rabbit IgG. (a) Total IgG from the serum of the immunized rabbit was isolated with protein A and further affinity purified on immobilized antigen peptides by gravity flow. After extensive washing to reduce nonspecific IgG, a sequential elution with progressively acidic pH was used to fractionate the antigen-specific polyclonal IgG. Each fraction was tested for specific activity by western blot analysis at matched antibody concentration (21.5 ng/ml) to detect PR A/B in lysates from T47D cells (+). Negative control lysates from HT1080 (−) were also tested. (b) The fraction with the highest specific activity, pH 1.8, was processed with four proteases for LC-MS/MS analysis. (c) An MS/MS spectrum matched by SEQUEST to the V-region full tryptic peptide GFALWGPGTLVTVSSGQPK containing CDR3 (underlined) with an XCorr of 5.560 and a ∆M (observed m/z – expected m/z) of 0.39 p.p.m. (d) Rabbit heavy and light chain sequence identification coverage of clone F9 (heavy chain, 77% coverage of 50 peptides total; light chain, 65% coverage of 24 peptides total). The depicted V-region sequences, when paired, specifically bind human PR A/B (Fig. 3). Amino acids mapped by one or more peptides are shown in bold. To maximize V-region coverage and account for highly variable amino acid composition, complementary proteases were used.
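For readers unfamiliar with the b and y ion labels in a spectrum such as the one in panel c, the theoretical fragment m/z values of the tryptic peptide GFALWGPGTLVTVSSGQPK can be computed from standard monoisotopic residue masses. This is a generic sketch, not the SEQUEST scoring procedure; it assumes unmodified residues, and the function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da), unmodified amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

def neutral_mass(peptide):
    """Monoisotopic mass of the intact, uncharged peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER

def fragment_ions(peptide, charge=1):
    """Return ({i: b_i m/z}, {i: y_i m/z}) for i = 1 .. len(peptide) - 1."""
    masses = [MONO[aa] for aa in peptide]
    b_ions, y_ions = {}, {}
    for i in range(1, len(masses)):
        b_ions[i] = (sum(masses[:i]) + charge * PROTON) / charge
        y_ions[i] = (sum(masses[-i:]) + WATER + charge * PROTON) / charge
    return b_ions, y_ions

peptide = "GFALWGPGTLVTVSSGQPK"
b_ions, y_ions = fragment_ions(peptide)                 # singly charged
_, y_ions_2plus = fragment_ions(peptide, charge=2)      # e.g. y++ series
```

For singly charged fragments, each complementary pair satisfies b_i + y_(n−i) = neutral mass + 2 × proton mass, which is a convenient consistency check on any annotated spectrum.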


To visualize clonal diversity, we performed phylogenetic analysis [25] on the heavy and light chain V-region sequences identified with high confidence shown in Table 1. Closely related sequences for either heavy or light chains clustered into discrete groups. Notably, all PR A/B–specific monoclonal antibodies that we discovered clustered closely together in the phylogenetic tree, most likely owing to clonal expansion from closely related B cells during immunization (Supplementary Fig. 4). Germline usage also supported this observation (Table 1). Similar observations were made in an independent experiment with a different antigen (Lin28A, Supplementary Fig. 5).

Table 1 Identification of high-confidence heavy and light chains

NGS ref. no.      Total peptide count   Percent variable region coverage   CDR3 sequence    NGS rank by CDR3 frequency   Germline V(D)J

Heavy chains
G2JXQJ001A2Q81    101                   95.69                              KLGL             212                          IGHV1S45, D4-2, J4
G2JXQJ001AGJSJ    91                    92.04                              GFSL             76                           IGHV1S69, *, J4
G2JXQJ001BJE8R    78                    98.26                              DLGDL            423                          IGHV1S45, D3-1, J4
G2JXQJ001BT2NA    70                    86.21                              DLGNL            461                          IGHV1S45, D4-1, J4
G2JXQJ001AFBNC    61                    87.27                              GNL              58                           IGHV1S44, D4-1, J4
G2JXQJ001AL49Y    59                    87.72                              DFHL             237                          IGHV1S45, *, J4
G2JXQJ001BWR23    56                    89.17                              GSLGTLPL         103                          IGHV1S45, D8-1, J2
G2JXQJ001BN8MH    50                    82.14                              GFAL             109                          IGHV1S69, *, J4
G2JXQJ001BPNUG    48                    81.51                              GHDDGYNYVYKL     123                          IGHV1S69, D6-1, J4
G2JXQJ001BZA42    35                    95.54                              GFTL             1,417                        IGHV1S69, *, J4

Light chains
G2JXQJ001BJ8KJ    93                    87.27                              LAGYDCTTGDCFA    2,769                        IGKV1S15, J1-2
G2JXQJ001BQM6D    47                    95.5                               LGGYDCDNGDCFT    85                           IGKV1S15, J1-2
G2JXQJ001A9VP3    33                    92.79                              LGTYDCRRADCNT    5,654                        IGKV1S19, J1-2
G2JXQJ001BQJFD    28                    98.15                              QSTLYSSTDEIV     86                           IGKV1S10, J1-2
G2JXQJ001BJCLS    28                    96.23                              QCSYVNSNT        4,518                        IGKV1S44, J1-2
G2JXQJ001AG4TB    24                    65.45                              LGSYDCRSDDCNV    179                          IGKV1S2, J1-2
G2JXQJ001AIZ32    17                    86.11                              LGAYDDAADNS      252                          IGKV1S19, J1-2
G2JXQJ001BJYR5    15                    72.07                              LGTYDCNSADCNV    1,128                        IGKV1S15, J1-2

Heavy and light chains with 100% CDR3 spectrum coverage and overall ≥65% variable region coverage were identified and ranked in order of confidence as measured by total peptide count. CDR3 sequence identity and rabbit germline determination are also indicated. Heavy and light chains were chosen for gene synthesis, cloning and expression of combinatorial antibodies for characterization. NGS rank indicates the frequency ranking of the given CDR3 sequence identified in the NGS database for each chain. *, no possible D gene can be identified. NGS, next-generation sequencing.

[Figure 3 panels: (a) ELISA and western blot results for the combinatorial matrix of light chains (rows A–H) and heavy chains (columns 1–10), each labeled by CDR3 sequence; (b) western blots for clones F1, F9, H1, C1, F7, H9, E6 and H7 and the polyclonal Ab, showing PR-B and PR-A; (c) immunohistochemistry of human breast carcinoma, T47D, MCF-7 and MDA-MB-231 with polyclonal Ab and monoclonal Ab F9; (d) flow cytometry histograms for polyclonal Ab and monoclonal Ab F9, counts versus mean fluorescence intensity, T47D and MDA-MB-231 cells; (e) immunofluorescence of MCF-7 and MDA-MB-231 cells with monoclonal Ab F9, polyclonal Ab or secondary antibody only.]

Figure 3 Identification and characterization of functional monoclonal antibodies against progesterone receptor A/B. (a) Combinatorial pairing of heavy and light chains yielded 12 antigen-specific, ELISA-reactive clones (in yellow). CDR3 sequence is used as an identifier. Marked wells indicate western blot–positive clones (see b). (b) Six clones (F1, F9, H1, C1, F7 and H9) were specific for progesterone receptor A/B detection by western blot analysis. Clones E6 (negative by ELISA and western blot) and H7 (positive by ELISA, negative by western blot) are shown as controls. +, T47D cell lysate (PR A/B positive); −, MDA-MB-231 cell lysate (PR A/B negative). All antibodies tested at 21.5 ng/ml. Ab, antibody. (c) Comparison of specific activity of clone F9 to the affinity-purified polyclonal mixture by immunohistochemistry. F9 (0.4 µg/ml) specifically stained PR A/B-positive tissue or cell lines (T47D and MCF-7), but not MDA-MB-231. Polyclonal (0.2 µg/ml) antibody was used as a positive control. 20× magnification. (d) Flow cytometry analysis. Polyclonal antibody signal/noise ratio, 1.69; concentration, 3.7 µg/ml. Monoclonal antibody F9 signal/noise ratio, 36.4; concentration, 0.5 µg/ml. (e) Confocal immunofluorescence microscopy analysis showed specific nuclear staining pattern on MCF-7 but not on MDA-MB-231 cells at 0.46 µg/ml. No primary antibody was included as background staining control. Polyclonal antibodies were also used as comparison at a concentration of 1.85 µg/ml. 20× magnification.


DISCUSSION
In this report, we demonstrate an approach that leverages the strengths of two technologies, LC-MS/MS and next-generation sequencing, for antibody discovery. In addition to the PR A/B example, we have successfully applied this approach to three additional antigens in rabbits (Table 2 and Supplementary Fig. 6) and one in mice (Supplementary Fig. 7), demonstrating that the approach is robust and reproducible in at least two laboratory animal species.

The advantage of our method is its reliance on the direct proteomic investigation of circulating polyclonal antibodies purified from the serum of an immunized animal. This approach thus avoids the technical and practical challenges of existing methods, such as attrition during in vitro culturing of B cells and the need to construct and pan phage libraries or to screen thousands of clones extensively. By design, our approach disrupts cognate pairing of heavy and light chains, which can be maintained when using B-cell culturing methods. This did not hamper, however, our ability to efficiently identify monoclonal antibody pairs with functional properties equal or superior to the polyclonal antibody from which they were derived. We speculate that this is because antigen-specific chains, including putative cognate pairs, were highly enriched by co-elution during affinity purification. Antigen affinity purification is therefore a critical step in our approach. In the future, we could envision using different purification schemes to study potential correlations between purification properties, antibody sequences and their relative affinity for the antigen. Another advantage of our method is speed. It typically takes 7–10 weeks to obtain functionally validated and sequenced monoclonal antibodies using the traditional hybridoma method. In contrast, in our experience it takes no more than 3 weeks using our proteomics-based approach (Supplementary Fig. 8).

It has been reported that the relative frequency of the V-gene repertoire in immunized mice can be used to select antigen-specific monoclonal antibodies [26]. In our study, however, peptides mapping to antibody sequences ranked highest by next-generation sequencing (top 30) were not observed with high confidence by LC-MS/MS. Thus, these antibody chains were most likely not originally present in the serum or were not enriched by affinity purification. We used RNA isolated from total splenocytes to generate a DNA sequence reference database that was sufficient to obtain monoclonal antibodies that recapitulated the specific activity of the polyclonal antibody. It would be interesting to examine the relative contribution of other B-cell sources (e.g., bone marrow and lymph nodes) to the circulating polyclonal antibody pool using the approach described here. It is also important to note that even in the presence of a comprehensive reference database, it was challenging to consistently obtain 100% peptide coverage of full V-region sequences confirmed to be present in the purified polyclonal mixture (Fig. 3). Given the complexity and sequence homology of the polyclonal response, it could be difficult to apply MS-based de novo sequencing approaches shown to be successful with single monoclonal antibodies [27,28]. As we demonstrate in this study, LC-MS/MS and next-generation sequencing synergize to decipher the complexity of circulating polyclonal antibodies.

The technology presented here for direct identification of circulating antibodies in animals has applications in basic immunology and therapeutics. For example, our approach can be used to study central questions in the field of immunology, including serum antibody diversity, dynamics, kinetics, clonality and migration of antibody-secreting B cells following antigen exposure. Furthermore, our approach can be readily applied to vaccine research and to pursue therapeutically relevant human monoclonal antibodies from immunized, naturally infected or diseased individuals.

METHODS
Methods and any associated references are available in the online version of the paper at http://www.nature.com/naturebiotechnology/.

Note: Supplementary information is available on the Nature Biotechnology website.

ACKNOWLEDGMENTS
We would like to dedicate this work to the memories of César Milstein and George Kohler. We thank A. Singh and S. Kane for polyclonal antibody development, and W. Colpoys, L. Cunningham, J. Simendinger, K. Crosby and G. Innocenti for help with western blot analysis, immunohistochemistry, immunofluorescence and flow cytometry. We thank K. Smith for help with polyclonal purification and M. Lewis for help with animal immunization and spleen isolation. We thank C. Reeves for help with DNA sequencing and J. Knott for peptide antigen synthesis. Finally, we thank P. Hornbeck, C. Hoffman, S. Chow and T. Singleton for reading the manuscript and providing useful discussions.

AUTHOR CONTRIBUTIONS
W.C.C., S.A.B. and R.D.P. developed the methodology, designed experiments, analyzed the data and wrote the manuscript. W.C.C. and S.A.B. performed experiments and did the bioinformatic analysis. S.S. designed experiments, analyzed data and wrote the manuscript. X.Z., S.M.S., J.S.W., J.G.B., R.K.R. and L.P. performed experiments. M.J.C. and J.R. helped analyze the data and write the manuscript.

COMPETING FINANCIAL INTERESTS
The authors declare competing financial interests: details accompany the full-text HTML version of the paper at http://www.nature.com/naturebiotechnology/.

Published online at http://www.nature.com/naturebiotechnology/. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.

1. Baltimore, D. Gene conversion: some implications for immunoglobulin genes. Cell 24, 592–594 (1981).

2. Becker, R.S. & Knight, K.L. Somatic diversification of immunoglobulin heavy chain VDJ genes: evidence for somatic gene conversion in rabbits. Cell 63, 987–997 (1990).

3. Hozumi, N. & Tonegawa, S. Evidence for somatic rearrangement of immunoglobulin genes coding for variable and constant regions. Proc. Natl. Acad. Sci. USA 73, 3628–3632 (1976).

4. Kim, S., Davis, M., Sinn, E., Patten, P. & Hood, L. Antibody diversity: somatic hypermutation of rearranged VH genes. Cell 27, 573–581 (1981).

5. Keller, M.A. & Stiehm, E.R. Passive immunity in prevention and treatment of infectious diseases. Clin. Microbiol. Rev. 13, 602–614 (2000).

6. Lambert, J.S. & Stiehm, E.R. Passive immunity in the prevention of maternal-fetal transmission of human immunodeficiency virus infection. Ann. NY Acad. Sci. 693, 186–193 (1993).

7. Köhler, G. & Milstein, C. Continuous cultures of fused cells secreting antibody of predefined specificity. Nature 256, 495–497 (1975).

8. Barbas, C.F. III., Burton, D.R., Scott, J.K. & Silverman, G.J. Phage Display: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 2001).

9. Harlow, E. & Lane, D. Using Antibodies: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 1998).

10. Lanzavecchia, A., Corti, D. & Sallusto, F. Human monoclonal antibodies by immortalization of memory B cells. Curr. Opin. Biotechnol. 18, 523–528 (2007).

11. Meijer, P.J. et al. Isolation of human antibody repertoires with preservation of the natural heavy and light chain pairing. J. Mol. Biol. 358, 764–772 (2006).

12. O’Brien, P.M. & Aitken, R. Antibody Phage Display: Methods and Protocols (Humana Press, 2002).

Table 2 Functionally relevant monoclonal antibodies against multiple targets identified by the LC-MS/MS platform as tested by ELISA and western blot analysis

Antigen     Immunized species   High-confidence heavy + light chains   Unique ELISA+ clones   Unique WB+ clones
PR A/B      Rabbit              8 + 10                                 12                     6
pMET        Rabbit              11 + 10                                6                      4
Lin28A      Rabbit              7 + 4                                  5                      5
Sox1        Rabbit              9 + 5                                  12                     1
p-p44/42    Mouse               12 + 13                                15                     3

NGS, next-generation sequencing; WB, western blot analysis.

© 2012 Nature America, Inc. All rights reserved.


13. Pasqualini, R. & Arap, W. Hybridoma-free generation of monoclonal antibodies. Proc. Natl. Acad. Sci. USA 101, 257–259 (2004).

14. Sullivan, M., Kaur, K., Pauli, N. & Wilson, P.C. Harnessing the immune system’s arsenal: producing human monoclonal antibodies for therapeutics and investigating immune responses. F1000 Biol. Rep. 3, 17 <http://f1000.com/reports/b/3/17> (2011).

15. Walker, L.M. et al. Broad and potent neutralizing antibodies from an African donor reveal a new HIV-1 vaccine target. Science 326, 285–289 (2009).

16. Fischer, N. Sequencing antibody repertoires: the next generation. MAbs 3, 17–20 (2011).

17. Lakhani, S.R. et al. The pathology of familial breast cancer: predictive value of immunohistochemical markers estrogen receptor, progesterone receptor, HER-2, and p53 in patients with mutations in BRCA1 and BRCA2. J. Clin. Oncol. 20, 2310–2318 (2002).

18. Becker, R.S., Suter, M. & Knight, K.L. Restricted utilization of VH and DH genes in leukemic rabbit B cells. Eur. J. Immunol. 20, 397–402 (1990).

19. Knight, K.L. Restricted VH gene usage and generation of antibody diversity in rabbit. Annu. Rev. Immunol. 10, 593–616 (1992).

20. Mage, R.G., Lanning, D. & Knight, K.L. B cell and antibody repertoire development in rabbits: the requirement of gut-associated lymphoid tissues. Dev. Comp. Immunol. 30, 137–153 (2006).

21. Yates, J.R. III, Eng, J.K., McCormack, A.L. & Schieltz, D. Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal. Chem. 67, 1426–1436 (1995).

22. Elias, J.E. & Gygi, S.P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods 4, 207–214 (2007).

23. Huttlin, E.L. et al. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell 143, 1174–1189 (2010).

24. Kircher, M. & Kelso, J. High-throughput DNA sequencing–concepts and limitations. Bioessays 32, 524–536 (2010).

25. Dereeper, A. et al. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 36, W465–W469 (2008).

26. Reddy, S.T. et al. Monoclonal antibodies isolated without screening by analyzing the variable-gene repertoire of plasma cells. Nat. Biotechnol. 28, 965–969 (2010).

27. Bandeira, N., Pham, V., Pevzner, P., Arnott, D. & Lill, J.R. Automated de novo protein sequencing of monoclonal antibodies. Nat. Biotechnol. 26, 1336–1338 (2008).

28. Castellana, N.E. et al. Resurrection of a clinical antibody: template proteogenomic de novo proteomic sequencing and reverse engineering of an anti-lymphotoxin-alpha antibody. Proteomics 11, 395–405 (2011).


nature biotechnology doi:10.1038/nbt.2167

ONLINE METHODS
Immunization and handling of animals. New Zealand white rabbits were immunized by intradermal injection with four separate doses, each 3 weeks apart, with a mixture of keyhole limpet hemocyanin-conjugated peptides, synthesized in-house, derived from the amino acid sequence of different regions of each human protein antigen. Peptides were conjugated to Imject maleimide-activated keyhole limpet hemocyanin (Thermo-Pierce). Mouse immunizations were carried out in the same manner, except that the route of immunization was intraperitoneal and the injections were 2 weeks apart. Blood was drawn 3 d after the final boost. Whole spleen from each animal was harvested at the time of euthanasia after confirmation of the desired polyclonal activity.

Next-generation DNA sequencing of rabbit and mouse B-cell repertoires. Splenocytes from hyperimmunized rabbits and mice were harvested and lysed for total RNA purification using Qiagen’s RNeasy kit following the manufacturer’s protocol. The RNA was treated on-column with DNase I (Qiagen) to eliminate genomic DNA using the provided protocol. To generate heavy and light chain amplicon libraries from this material for sequencing on the 454 Life Sciences platform (Roche), RT-PCR was carried out as follows. cDNA was generated from the splenocyte total RNA as template using Thermoscript reverse transcriptase (Invitrogen) with oligo dT as primer. For rabbit IgG sequencing, variable regions of γ-, κ1-, κ2- and λ-chains were amplified with sequence-specific 454 fusion primers (hybridizing to the leader on the 5′ end and containing sequences on the 3′ end required for identification and bar-coding in the Lib-L format of the 454 sequencing platform) using Phusion Hot Start II High-Fidelity DNA Polymerase (Finnzymes, Thermo Scientific) with the following steps: denaturation at 98 °C for 90 s; 20 cycles of (denaturation at 98 °C for 10 s; annealing at 60 °C for 30 s; extension at 72 °C for 30 s). For mouse IgG sequencing, heavy and light chain amplicons were generated by a two-step PCR process. In the first step, γ- or κ-chain variable regions were amplified (15 cycles with the same conditions as described above for rabbit) with a mixture of gene family–specific degenerate oligonucleotides as sense primers, and antisense primers that hybridize to a highly conserved region at the start of the constant region, each sense and antisense primer containing distinct adaptor sequences at its 5′ end. Each reaction from the first round was column-purified with a commercial kit (Qiagen cat. no. 28104) then further amplified by an additional 10 cycles (γ-chain) or 8 cycles (κ-chain) in the second step using adaptor sequence-specific primers that contain sequences on the 3′ end required for identification and bar-coding in the Lib-L format of the 454 sequencing platform. For either species, all light chain amplification reactions for each animal were pooled. Excess primers for heavy and light chain samples were eliminated using the Agencourt AMPure XP DNA purification system following the provided protocol. The quality and purity of the amplicon pool after primer elimination were verified on an Agilent Bioanalyzer 2100 (Agilent Technologies), and the concentration of the DNA was accurately quantified on a fluorometer using the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen). Emulsion PCR and bead enrichment were carried out following the Lib-L LV, GS FLX Titanium Series protocol from 454 Life Sciences. Bead number was counted on a Beckman Coulter Z2 Particle Counter, and the library was sequenced on a 454 GS Junior (Roche).

Affinity purification of antigen-specific IgG. Total IgG from the serum of the hyperimmunized rabbits (New Zealand white) was purified using Protein A sepharose beads (GE Healthcare), then incubated with rotation for 15 min in a column with the immunogen peptide covalently coupled to sepharose beads. The unbound fraction was drained by gravity flow, and the column was washed extensively with 1× PBS to eliminate nonspecific IgG. The antigen-specific polyclonal IgG pool was eluted sequentially with 0.1 M glycine/HCl buffer at pH 3.5, followed by pH 2.7 and finally pH 1.8. Each elution was immediately neutralized with 1 M Tris buffer (pH 8.5). Total IgG from the serum of the hyperimmunized mice was purified using Protein-G magnetic beads (Millipore, cat. no. LSKMAGG10), then incubated with rotation overnight at 4 °C with immunogen peptide immobilized on magnetic beads (Pierce, cat. no. 88817). Using a magnetic tube rack (Invitrogen, cat. no. 12321D), the beads were washed extensively with PBS, then antibody bound to the beads was sequentially eluted with progressively acidic pH as described for the rabbit IgG purification.

Protease digestion of affinity-purified antibody. Polyclonal antibody was denatured in 8 M urea in 20 mM HEPES pH 8 then reduced in 10 mM DTT for 1 h at 55 °C. Reduced polyclonal antibody was cooled to room temperature (~25 °C) and alkylated in the presence of 20 mM iodoacetamide for 1 h. Chymotrypsin, elastase and trypsin digestion was done in the presence of 2 M urea in 20 mM HEPES pH 8.0 overnight at 37 °C at an enzyme/substrate ratio of 1:50. Pepsin digestion was done in the presence of 3 M acetic acid at room temperature (~25 °C) overnight at an enzyme/substrate ratio of 1:50. Digested peptides were desalted by STAGE-TIPS as published previously29, and analyzed by LC-MS/MS.

Mass spectrometry. LC-MS/MS was done using the LTQ Orbitrap Velos (Thermo Fisher) mass spectrometer. The samples were loaded for 7 min using a Famos autosampler (LC Packings) onto a hand-poured fused silica capillary column (125 µm internal diameter × 20 cm) packed with Magic C18aQ resin (5 µm, 200 Å) using an Agilent 1100 series binary pump with an in-line flow splitter. Chromatography was developed using a binary gradient at 400 nl/min of 5–30% solvent B for 45 min (solvent A, 0.25% formic acid (FA); solvent B, 0.1% FA, 97% acetonitrile). Twenty MS/MS spectra were acquired in a data-dependent fashion30 from a preceding master spectrum in the Orbitrap (300–1,500 m/z at a resolution setting of 6 × 10⁴) with an automatic gain control (AGC) target of 10⁶. Charge-state screening was used to reject singly charged species, and a threshold of 500 counts was required to trigger an MS/MS spectrum. When possible, the LTQ and Orbitrap were operated in parallel processing mode.

Database searching and data processing. MS/MS spectra were searched using the SEQUEST algorithm (version 28 rev 12)21 against a custom hybrid database composed of 21,932 full-length gamma and 22,431 full-length kappa V-region sequences and gamma and kappa constant region sequences concatenated to 6,358 yeast proteins (S. cerevisiae, NCBI) and 42 common contaminants, including several human keratins, trypsin and chymotrypsin. Because V-region sequences are highly related, the yeast proteome artificially contributed more diverse sequences to the reference database31 and provided another source of confidence after filtering, as the filtered data set should not include peptides identified from yeast. Search parameters included partial specificity for chymotrypsin and trypsin and no specificity for elastase and pepsin, a mass tolerance of ±50 p.p.m., a static modification of 57.0214 Da on cysteine, and a dynamic modification of 15.9949 Da on methionine. FDR in the data set was estimated using the target/decoy approach22. Data sets were filtered to an FDR of ≤2% using a linear discriminant analysis23 with XCorr, deltaCN, charge state, peptide length and measured mass accuracy as scoring criteria. Although the mass accuracy of the Orbitrap greatly exceeds 50 p.p.m., when searched with a wider precursor ion tolerance, correct peptide identifications result in small precursor mass errors (±1 p.p.m.), whereas incorrect peptide identifications distribute across the entire 50 p.p.m. window. As a result, stringent precursor mass filters selectively remove many incorrect peptide-spectrum matches (PSMs) from the data set.
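The target/decoy estimate and the precursor mass error described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `PSM` record, the single `score` field (standing in for the linear discriminant score) and the convention FDR = decoys / targets are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """Hypothetical peptide-spectrum match record (illustrative only)."""
    peptide: str
    is_decoy: bool
    score: float           # higher is better; stands in for a discriminant score
    observed_mz: float
    theoretical_mz: float


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed precursor mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def estimate_fdr(psms) -> float:
    """Estimate FDR as decoy hits / target hits (one common convention)."""
    decoys = sum(1 for p in psms if p.is_decoy)
    targets = len(psms) - decoys
    return decoys / targets if targets else 0.0


def filter_to_fdr(psms, max_fdr: float = 0.02):
    """Keep the largest top-scoring set whose estimated FDR is <= max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    best_cut = 0
    decoys = 0
    for i, p in enumerate(ranked, start=1):
        decoys += p.is_decoy
        targets = i - decoys
        if targets and decoys / targets <= max_fdr:
            best_cut = i
    return ranked[:best_cut]


if __name__ == "__main__":
    # Toy data: 120 confident target hits and 10 low-scoring decoy hits.
    toy = [PSM("PEPTIDEK", False, 10.0, 500.0005, 500.0) for _ in range(120)]
    toy += [PSM("KEDITPEP", True, 1.0, 500.02, 500.0) for _ in range(10)]
    kept = filter_to_fdr(toy, max_fdr=0.02)
    print(len(kept), round(estimate_fdr(kept), 4))
    print(round(ppm_error(500.0005, 500.0), 3))
```

In this toy set the correct matches sit near 1 p.p.m. while the decoys sit near 40 p.p.m., which mirrors the reasoning in the text for why a stringent precursor mass filter removes many incorrect matches.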

Post-acquisition analysis was done as described in the text. Briefly, passing peptides derived from V-region sequences were remapped to the NGS immunoglobulin database. For peptides that arose from chymotryptic and tryptic digests, matches were limited to those arising from expected cleavages (KR for trypsin, YWFLMA for chymotrypsin). CDR coverage was determined by identifying CDRs using the rules as described32. In all cases, coverage was defined as the total number of amino acids identified from high-confidence peptides divided by the number of amino acids in the mature V-region sequence.
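The coverage definition and the expected-cleavage rule above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, the example sequence is arbitrary, and requiring both peptide termini to match the expected cleavage sites is an assumption, since the text does not say whether one or both termini were checked.

```python
TRYPSIN_SITES = set("KR")            # cleavage C-terminal to K or R
CHYMOTRYPSIN_SITES = set("YWFLMA")   # residues listed in the text


def find_all(sequence: str, peptide: str):
    """Return the start index of every occurrence of peptide in sequence."""
    starts = []
    i = sequence.find(peptide)
    while i != -1:
        starts.append(i)
        i = sequence.find(peptide, i + 1)
    return starts


def is_expected_cleavage(peptide: str, v_region: str, sites) -> bool:
    """True if some occurrence of the peptide has both termini at expected sites.

    Assumption: both ends must be consistent with the protease; the ends of
    the V-region sequence itself are always accepted.
    """
    for start in find_all(v_region, peptide):
        end = start + len(peptide)
        n_ok = start == 0 or v_region[start - 1] in sites
        c_ok = end == len(v_region) or peptide[-1] in sites
        if n_ok and c_ok:
            return True
    return False


def coverage(v_region: str, peptides) -> float:
    """Fraction of V-region residues covered by at least one peptide."""
    if not v_region:
        return 0.0
    covered = [False] * len(v_region)
    for pep in peptides:
        for start in find_all(v_region, pep):
            for j in range(start, start + len(pep)):
                covered[j] = True
    return sum(covered) / len(v_region)


if __name__ == "__main__":
    v = "ACDEFGHIKLMNPQRSTVWY"  # arbitrary 20-residue example, not a real V region
    print(coverage(v, ["ACDEFGHIK", "LMNPQR"]))                 # 15 of 20 residues
    print(is_expected_cleavage("LMNPQR", v, TRYPSIN_SITES))     # preceded by K, ends in R
    print(is_expected_cleavage("MNPQR", v, TRYPSIN_SITES))      # preceded by L
```

Overlapping peptides are counted once per residue, so coverage never exceeds 1.0 however many peptides map to the same stretch.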

Cloning, expression and characterization of identified immunoglobulin chains. γ- and κ-chains identified through the mass spectrometry analysis of the affinity-purified polyclonal IgG pool were cloned and expressed as follows. For each identified chain, the nucleic acid sequence encoding the entire variable domain from FWR1 through FWR4 was synthesized (Integrated DNA Technologies). Using overlap PCR, each heavy-light chain combination was expressed with a viral 2A sequence that uses a ribosomal skip mechanism to generate two polypeptides from a single open reading frame33,34. A single open reading frame cassette of, in order from 5′ to 3′, light chain variable and constant regions, 2A peptide sequence from Thosea asigna virus, and heavy chain variable domain was cloned into a cytomegalovirus promoter-driven mammalian expression plasmid containing in-frame rabbit γ-chain leader sequence and rabbit γ-chain constant regions, 5′ and 3′ of the cloning site, respectively. HEK293 cells were transfected with plasmid preps encoding each light-heavy chain combination assembled in this manner using polyethylenimine35. The supernatant was screened 2 to 5 days post-transfection for secretion of antigen-specific antibody by ELISA using the immunogen peptide as the coating antigen, and light-heavy chain permutations that showed reactivity were further characterized. For mouse antibody expression, constant regions were of mouse IgG2a.
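Because every heavy chain is paired with every light chain, the number of constructs to screen grows as the product of the two counts. The short Python sketch below enumerates the pairings for the PR A/B counts reported in Table 2 (8 heavy and 10 light chains); the chain identifiers and the construct naming scheme are hypothetical and serve only to illustrate the combinatorics.

```python
from itertools import product

# Cassette order from 5' to 3' as described in the text; the leader sequence
# and the heavy chain constant regions are supplied by the expression plasmid.
CASSETTE_ORDER = ("light_variable", "light_constant", "2A_peptide", "heavy_variable")


def enumerate_pairings(heavy_ids, light_ids):
    """Return every (heavy, light) combination as a list of tuples."""
    return list(product(heavy_ids, light_ids))


def construct_name(heavy_id: str, light_id: str) -> str:
    """Hypothetical label for one light-2A-heavy construct."""
    return f"{light_id}-2A-{heavy_id}"


if __name__ == "__main__":
    heavy = [f"H{i}" for i in range(1, 9)]    # 8 heavy chains (PR A/B, Table 2)
    light = [f"L{i}" for i in range(1, 11)]   # 10 light chains (PR A/B, Table 2)
    pairs = enumerate_pairings(heavy, light)
    print(len(pairs))                  # 8 x 10 = 80 candidate constructs
    print(construct_name(*pairs[0]))   # L1-2A-H1
```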

Characterization of polyclonal and monoclonal antibodies by ELISA, western blot analysis, flow cytometry, immunofluorescence and immunohistochemistry. Detailed protocols of ELISA, western blot analysis, flow cytometry, immunofluorescence and immunohistochemistry can be found online at http://www.cellsignal.com/support/protocols/index.html. Costar cat. no. 3369 certified high-binding polystyrene 96-well plates were used for ELISA. Antigens used for ELISA analysis for each target were the same peptides used for immunizations. For Progesterone Receptor antibodies, western blot analysis was done on T47D (PR+), MDA-MB-231 cells (PR−) and HT-1080 (PR−) cell lysate, flow cytometry analysis on T47D (PR+) and MDA-MB-231 cells (PR−), confocal immunofluorescence analysis on MCF-7 cells (PR+) compared with MDA-MB-231 cells (PR−), and immunohistochemical analysis on paraffin-embedded primary human breast carcinoma sections, T47D and paraffin-embedded MCF-7 cells (PR+) compared with MDA-MB-231 cells (PR−). For phospho-p44/42 MAPK mouse antibodies, western blot analysis was done on lysate from Jurkat cells treated with either U0126 (Cell Signaling Technology) or 12-O-Tetradecanoylphorbol-13-Acetate (TPA) (Cell Signaling Technology). For Lin28A antibodies, western blot analysis was done on total lysate from NCCIT, NTERA, MES and IGROV1 cell lines, and confocal immunofluorescence and flow cytometry analyses on NTERA (Lin28A+) and HeLa (Lin28A−) cells. For phospho-Met (pMet) antibodies, lysates from MKN45 cells untreated (pMet+) and treated (pMet−) with SU11274 Met kinase inhibitor were used. For Sox1 antibodies, mouse brain extract (Sox1+) and lysate from NIH-3T3 (Sox1−) cells were used.

29. Rappsilber, J., Ishihama, Y. & Mann, M. Stop and go extraction tips for matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in proteomics. Anal. Chem. 75, 663–670 (2003).

30. Villén, J. & Gygi, S.P. The SCX/IMAC enrichment approach for global phosphorylation analysis by mass spectrometry. Nat. Protoc. 3, 1630–1638 (2008).

31. Beausoleil, S.A., Villén, J., Gerber, S.A., Rush, J. & Gygi, S.P. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat. Biotechnol. 24, 1285–1292 (2006).

32. Wu, T.T. & Kabat, E.A. An analysis of the sequences of the variable regions of Bence Jones proteins and myeloma light chains and their implications for antibody complementarity. J. Exp. Med. 132, 211–250 (1970).

33. Doronina, V.A. et al. Site-specific release of nascent chains from ribosomes at a sense codon. Mol. Cell. Biol. 28, 4227–4239 (2008).

34. Donnelly, M.L. et al. The ‘cleavage’ activities of foot-and-mouth disease virus 2A site-directed mutants and naturally occurring ‘2A-like’ sequences. J. Gen. Virol. 82, 1027–1041 (2001).

35. Boussif, O. et al. A versatile vector for gene and oligonucleotide transfer into cells in culture and in vivo: polyethylenimine. Proc. Natl. Acad. Sci. USA 92, 7297–7301 (1995).
