

Molecular Determinants of Mutant Phenotypes, Inferred from Saturation Mutagenesis Data

Arti Tripathi,†,1 Kritika Gupta,†,1 Shruti Khare,†,1 Pankaj C. Jain,1 Siddharth Patel,1 Prasanth Kumar,1 Ajai J. Pulianmackal,1 Nilesh Aghera,1 and Raghavan Varadarajan*,1,2

1 Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
2 Jawaharlal Nehru Center for Advanced Scientific Research, Bangalore, India
† These authors contributed equally to this work.

*Corresponding author: E-mail: varadar@mbu.iisc.ernet.in.

Associate editor: Csaba Pal

Abstract

Understanding how mutations affect protein activity and organismal fitness is a major challenge. We used saturation mutagenesis combined with deep sequencing to determine mutational sensitivity scores for 1,664 single-site mutants of the 101 residue Escherichia coli cytotoxin, CcdB, at seven different expression levels. Active-site residues could be distinguished from buried ones, based on their differential tolerance to aliphatic and charged amino acid substitutions. At nonactive-site positions, the average mutational tolerance correlated better with depth from the protein surface than with accessibility. Remarkably, similar results were observed for two other small proteins, PDZ domain (PSD95pdz3) and IgG-binding domain of protein G (GB1). Mutational sensitivity data obtained with CcdB were used to derive a procedure for predicting functional effects of mutations. Results compared favorably with those of two widely used computational predictors. In vitro characterization of 80 single, nonactive-site mutants of CcdB showed that activity in vivo correlates moderately with thermal stability and solubility. The inability to refold reversibly, as well as a decreased folding rate in vitro, is associated with decreased activity in vivo. Upon probing the effect of modulating expression of various proteases and chaperones on mutant phenotypes, most deleterious mutants showed an increased in vivo activity and solubility only upon over-expression of either Trigger factor or SecB ATP-independent chaperones. Collectively, these data suggest that folding kinetics rather than protein stability is the primary determinant of activity in vivo. This study enhances our understanding of how mutations affect phenotype, as well as the ability to predict fitness effects of point mutations.

Key words: mutagenesis, deep sequencing, protein folding, fitness effect prediction.

Introduction

The amino acid sequence of a protein determines its three dimensional structure, function and stability. Understanding and predicting the effects of mutations on protein structure, function and organismal fitness is a major challenge in biology. It has been suggested that most positions in a protein can tolerate mutations while retaining stability and function (DePristo et al. 2005; Bershtein et al. 2006). Other studies indicate that proteins acquire new functions at the cost of stability (Wang et al. 2002). Human single nucleotide polymorphisms analyses suggest that >80% of disease-causing mutations cause a loss of stability (Yue et al. 2005). The stability of the wildtype protein is believed to determine the nature and extent of mutations that can be tolerated. The distribution of fitness effects of mutations is thought to be primarily shaped by their effects on protein thermodynamic stability (Firnberg et al. 2014). It is widely believed that residues in the protein interior are important for protein shape and stability, and those on the surface for function/interaction (Ponder and Richards 1987; Bowie and Sauer 1989; Milla et al. 1994). However, there are few studies which exhaustively test this assertion. For example, in the case of Thioredoxin, the correlation between thermodynamic stability and biological activity was not evident for single mutants (Hellinga et al. 1992). Buried residues are less tolerant to mutations than nonactive-site surface exposed residues. Hence, the proportion of solvent exposed residues in a protein is an important determinant of its evolutionary rate (Lin et al. 2007). Further, it is difficult to determine whether deleterious mutations at nonactive-site residues act primarily through affecting thermodynamic stability or folding kinetics, because both factors can affect the amount of properly folded, functional protein in vivo. It has been suggested that mutations which are destabilizing beyond a certain threshold can render a protein dysfunctional and hence accumulation of such mutations can decrease organismal fitness (Yue et al. 2005; Bershtein et al. 2006; Randles et al. 2006).

In the past, the effects of mutations on protein stability were usually determined by creating a limited number of single-site mutants, followed by protein expression, purification and characterization of the properties of each mutant, relative to the wildtype protein. Each of these steps is laborious and limits the number of mutants that can be studied. If a convenient phenotypic readout for protein function is available, this can be combined with deep sequencing to obtain relative activity estimates for large numbers of mutants (Tripathi and Varadarajan 2014). In cases where a phenotypic readout is unavailable, monitoring the levels of a reporter gene fused to the protein of interest can be used as a proxy for activity, although such fusions may also affect the stability and folding of the protein (Kim et al. 2013). The advent of next generation sequencing has provided a considerable amount of phenotypic data linked to mutations, but studies that aim at understanding the molecular basis of these phenotypes are limited. Many studies that employ site-saturation mutagenesis methodology have goals specific to a given protein, such as to identify active-site residues (Melnikov et al. 2014; Romero et al. 2015), improve/alter protein properties (Wang et al. 2002; Deng et al. 2012; Whitehead et al. 2012; Starita et al. 2013), identify stabilizing mutations (Araya et al. 2012; Traxlmayr et al. 2012; Kim et al. 2013), determine affinity and specificity determinants of protein–protein interaction (DeBartolo et al. 2012; Dutta et al. 2013), or to study the fitness landscape (Hietpas et al. 2011, 2012; Melnikov et al. 2014; Thyagarajan and Bloom 2014; Sarkisyan et al. 2016). The readout in most cases is either qualitative (binding/no binding) or semi-quantitative; experiments are carried out at a single expression level; some cases sample a limited number of sites (Fowler et al. 2010; Hietpas et al. 2011; Deng et al. 2012; McLaughlin et al. 2012; Schlinkmann et al. 2012), and can involve metastable proteins with multiple functional conformations (Thyagarajan and Bloom 2014). Some of these studies sample multi-site as well as single mutations, complicating interpretation of the data (Hietpas et al. 2011; Deng et al. 2012), and in most cases, inferences from these analyses are not validated by detailed characterization of individual single mutants. Previously, attempts to obtain residue-specific contributions to activity with either a full length protein such as Ubiquitin (76 aa) (Roscoe et al. 2013), or with protein domains such as the hYAP65 WW domain (25-aa region) (Fowler et al. 2010), have been made, but in such cases it is difficult to separate the effect of single mutations on stability/folding from those that directly affect function, either because the system has multiple binding partners, such as in the case of Ubiquitin, or due to a limited number of single mutants and presence of several double and triple mutants in the library (Fowler et al. 2010; Deng et al. 2012).

Article

© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com. Open Access.
Mol. Biol. Evol. 33(11):2960–2975. doi:10.1093/molbev/msw182. Advance Access publication August 25, 2016.
Downloaded from https://academic.oup.com/mbe/article/33/11/2960/2272479 by guest on 29 March 2022

There have also been numerous prior attempts to understand and predict the functional consequences of mutations by using computational methods (Bloom et al. 2005; Parthiban et al. 2006; Moretti et al. 2013; Pires et al. 2014). While experimental approaches often measure changes in thermodynamic stability or activity of proteins upon mutation, computational methods typically predict stabilities based on either sequence and/or structure. Some recent methods based on machine learning, such as SNAP2 (Hecht et al. 2015) and SuSPect (Yates et al. 2014), take into account evolutionary information and other sequence and structure based features to predict functional consequences of mutations.

In the present study, we attempt to understand the contribution of every amino acid in a protein to its structure, stability and function, understand how mutations modulate protein activity in vivo, and use this information in predicting the functional effects of mutations computationally. We attempt to address the following issues: (1) Can we distinguish active-site residues from buried ones based solely on saturation mutagenesis phenotypes? (2) Are there consistent patterns in substitution preferences at buried sites? (3) What is the primary mechanism by which mutations at buried sites affect activity in vivo? (4) Can we predict functional effects of specific mutations at buried sites? We use the protein CcdB (Controller of Cell Death protein B) as an experimental test protein. CcdB is a homodimeric protein and each protomer contains 101 residues (Loris et al. 1999). CcdB is a part of the CcdAB toxin–antitoxin system present on the Escherichia coli F-plasmid and plays an important role in F-plasmid maintenance by killing plasmid free cells (Jaffe et al. 1985; Hayes 2003). Biophysical and thermodynamic studies of dimeric CcdB (Chakshusmathi 2002; Bajaj et al. 2004) indicate that the protein exists as a homodimer at neutral pH and undergoes a two-state unfolding process with a free energy of unfolding of ~21 kcal/mol at 298 K (Bajaj et al. 2004). CcdB has two primary ligands: its cognate antitoxin CcdA, and cellular target, DNA Gyrase. The Kd of CcdB for CcdA37–72 is in the picomolar range and is much smaller than for GyrA, which is ~10 nM (De Jonge et al. 2009).

Phenotypes of 1,664 single-site mutants of CcdB were determined at seven different expression levels (designated as 2–8 in the order of increasing expression level) by using two different deep sequencing techniques, 454 (Adkar et al. 2012) and Illumina (this work). We describe a mutational sensitivity score derived from sequencing (MSseq) and use it to quantitatively rank order mutant effects on phenotype at both buried and exposed positions, and to distinguish buried from active-site residues based solely on mutational data. Two other systems for which experimentally derived mutational sensitivity scores were available, namely PDZ domain (PSD95pdz3) and IgG-binding domain of protein G (GB1), were used to compare the substitution preferences and determine if a coherent set of rules derived from a fraction of the CcdB mutational data can also be used for predicting the functional effects of other mutations in CcdB, as well as the two additional test proteins.

To gain additional insights into the molecular determinants of phenotype for the nonactive-site mutants, ~80 CcdB mutants with a range of in vivo activities were purified and characterized in vitro to obtain insights into determinants of protein stability, solubility and activity. Effects of chaperone over-expression, as well as chaperone and protease deletion, on activities of individual mutants were also studied to rationalize the effect of mutations on protein folding and stability. The data suggest that mutational effects on folding, rather than stability, determine the in vivo phenotype of CcdB mutants.


In summary, this work has important implications for understanding the molecular basis of mutant phenotypes and for mutant phenotype prediction.

Results

Phenotypes Determined from 454/Sanger Sequencing Match Well with Phenotypes Determined by Illumina Sequencing

We have previously described a library consisting of approximately 1,000 single-site mutants of CcdB (Adkar et al. 2012), which was constructed by pooling single-site mutants, and individually sequenced by 454/Sanger sequencing to obtain phenotypes (Bajaj et al. 2008). We have previously shown that phenotypes of individual mutants, determined by growing them on plates at various repressor and inducer concentrations, correlate well (r = 0.95) with those obtained from 454 deep sequencing (Adkar et al. 2012). In the present study, a fresh library for CcdB was prepared by individually randomizing each codon using an inverse PCR procedure (Jain and Varadarajan 2014). This library was transformed and screened at seven different expression levels under identical conditions to those used for the earlier library. The relative population of each mutant as a function of repressor/inducer concentration was estimated using Illumina deep sequencing. In contrast to 454 sequencing, where the read length was sufficient to cover the entire gene, each Illumina read provided only 50–70 bp of useful sequence. Hence, it was necessary to create six PCR products to obtain complete sequence coverage for the whole gene. The key assumption here is that each mutant gene is mutant only at a single codon; thus we considered reads which contain exactly one mutant codon. We observed 78.5% of the reads to be wildtype, which is close to the expected 83.3% (5/6 × 100), since a single mutant codon falls within only one of the six PCR products. Only 2.5% of the non wildtype reads (0.12% of total reads) had two mutations. Since the additional mutations will likely be randomly distributed, and given that most single mutants show an active phenotype, the fraction of incorrectly assigned inactive phenotypes is expected to be small. Since expression of active CcdB leads to cell death, the number of sequencing reads for a given mutant abruptly decreases at the expression level where the mutant shows an active phenotype. These expression levels are assigned numerical values from 2 to 8 (a value of 9 is assigned to the mutants that show cell growth even at the highest expression level). The CcdB gene is amplified from colonies surviving at each expression level and tagged with a Multiplex IDentifier sequence (MID) unique to each expression level. MSseq is the expression level at which the number of the sequencing reads for a particular mutant decreases by a factor of five or more compared to the previous expression level (Adkar et al. 2012; Sahoo et al. 2015). Based on this, phenotypes for a total of 1,664 single-site mutants in the two independent single-site libraries of CcdB were mapped collectively by the two deep sequencing methods, 454 and Illumina, which corresponds to 16.5 mutants per position (87.6% of all possible mutants). Of the 1,093 mutants analyzed by 454 sequencing and 1,342 by Illumina sequencing, 771 mutants were common; 625 of these have the same MSseq value, and the MSseq score differed by at most 1 for 59 mutants. In the few cases where the MSseq value differed between Illumina and 454, the lower value (higher activity) was taken. The high concordance between phenotypes derived from Illumina, 454 and plate based assays of individual mutants validates the deep sequencing based phenotypic identification.
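The MSseq assignment rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (first expression level with a five-fold or greater drop in reads, 9 if no such drop occurs), not the authors' pipeline. The function names, the use of a separate baseline read count as the "previous level" for level 2, and the assumption that read counts are already normalized for sequencing depth are all hypothetical choices made for this sketch.

```python
# Expression levels (MIDs) in increasing order, as designated in the text.
LEVELS = (2, 3, 4, 5, 6, 7, 8)
FOLD_DROP = 5      # "decreases by a factor of five or more"
NO_ACTIVITY = 9    # score for mutants that grow even at the highest level


def assign_msseq(baseline_reads, reads_by_level):
    """Return the first level at which reads drop >= 5-fold vs. the previous level.

    baseline_reads: reads for the mutant under the condition preceding level 2
                    (hypothetical input; the text does not specify it here).
    reads_by_level: dict {level: read count}; missing levels count as 0 reads.
    Counts are assumed to be normalized for sequencing depth already.
    """
    if baseline_reads <= 0:
        return None  # mutant not observed, so no phenotype can be assigned
    previous = baseline_reads
    for level in LEVELS:
        current = reads_by_level.get(level, 0)
        if current * FOLD_DROP <= previous:
            return level
        previous = current
    return NO_ACTIVITY


def merge_msseq(illumina_score, score_454):
    """Combine scores from the two platforms; the lower value (higher activity) wins."""
    scores = [s for s in (illumina_score, score_454) if s is not None]
    return min(scores) if scores else None


# Hypothetical read counts for one mutant: reads collapse between levels 3 and 4.
example = assign_msseq(1000, {2: 900, 3: 850, 4: 120, 5: 5})
print(example)             # 4
print(merge_msseq(4, 3))   # 3
```

Taking the minimum in `merge_msseq` mirrors the statement that the lower value was used whenever the two platforms disagreed.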

Determination of the Active-Site Residues Solely from the Mutational Data

As a first step towards understanding and interpretation of the large amount of mutational data, we calculated residue-wise mutational tolerance, namely the fraction of active mutants for each residue at a given condition.
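As a rough illustration of the residue-wise tolerance calculation, the sketch below computes the percentage of observed mutants at each position that are active at a chosen expression level. It assumes that a mutant counts as active at a level when its MSseq score is less than or equal to that level; this reading follows from the MSseq definition but is an interpretation, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def mutational_tolerance(msseq_scores, level):
    """Percent of observed mutants per position that are active at `level`.

    msseq_scores: dict {(position, mutant_residue): MSseq score}.
    A mutant is treated as active at `level` if its MSseq score <= level
    (an assumption of this sketch).
    """
    active = defaultdict(int)
    total = defaultdict(int)
    for (position, _residue), score in msseq_scores.items():
        total[position] += 1
        if score <= level:
            active[position] += 1
    return {pos: 100.0 * active[pos] / total[pos] for pos in total}


# Hypothetical scores for two positions.
scores = {
    (18, "A"): 2, (18, "L"): 2, (18, "K"): 8, (18, "D"): 9,
    (40, "A"): 2, (40, "D"): 2,
}
print(mutational_tolerance(scores, level=2))  # {18: 50.0, 40: 100.0}
print(mutational_tolerance(scores, level=8))  # {18: 75.0, 40: 100.0}
```

Only observed mutants enter the denominator, so positions with sparse coverage give noisier tolerance estimates.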

Residues with low mutational tolerance are mostly buried, whereas some are surface exposed. The latter are likely to be a part of the active-site (Wu et al. 2015). Active-site residues can be distinguished from buried ones, even in the absence of structural information, based on the pattern of mutational sensitivity. At buried positions, typically most aliphatic substitutions are tolerated, except when the wildtype residue is a small A or G residue, whereas polar and charged residues are poorly tolerated. In contrast, for active-site residues (which are typically exposed), mutations to aliphatic residues are often poorly tolerated, polar and charged residues are sometimes tolerated, and the average mutational tolerance is typically lower than that of the buried residues. Based on these criteria, we can identify residues Q2, F3, Y6, S22, I24, N95, W99, G100 and I101 as putative active-site residues, based solely on the mutational data (fig. 1). Upon examining the crystal structure of free CcdB (PDB ID 3VUB), all the active-site residues identified from the mutational phenotypes, with the exception of Y6, are in close proximity to each other and line a surface groove, indicating that these eight residues are likely to be part of the active-site (fig. 1D). In the structure of CcdB bound to a fragment of GyrA (PDB ID 1X75), all eight residues are in proximity to GyrA, confirming that these are indeed part of the active-site. Y6 has an exposure of just 9%, and only the terminal OH group is exposed, suggesting that the low mutational tolerance at this position is likely to be primarily due to mutational effects on folding and stability, rather than due to direct effects on GyrA binding. In subsequent analyses, we focus primarily on effects of mutations at nonactive-site positions. Mutational effects on active-site residues involved in binding Gyrase will be discussed in more detail elsewhere.
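The qualitative pattern described above can be turned into a crude first-pass check. The sketch below compares, for one low-tolerance position, how well aliphatic substitutions are tolerated relative to polar and charged ones. It is a simplified heuristic written for illustration and should not be read as the criteria actually applied in this work: the residue groupings, the simple greater-than comparison, and all names are assumptions.

```python
# Residue classes; the exact membership is an assumption inferred from the
# groupings used in the figure legend and the reported substitution order.
ALIPHATIC = set("ACVLIM")
POLAR_OR_CHARGED = set("STNQDEKR")


def fraction_active(scores, residues, level):
    """Fraction of observed substitutions from `residues` with MSseq <= level."""
    observed = [scores[r] for r in residues if r in scores]
    if not observed:
        return None
    return sum(s <= level for s in observed) / len(observed)


def classify_sensitive_position(scores, level=2):
    """Label one low-tolerance position from its substitution pattern.

    scores: dict {mutant_residue: MSseq score} for a single position.
    Returns "buried-like" when aliphatic substitutions are tolerated better
    than polar/charged ones, "active-site-like" for the opposite pattern,
    and "undetermined" otherwise.
    """
    aliphatic = fraction_active(scores, ALIPHATIC, level)
    polar_charged = fraction_active(scores, POLAR_OR_CHARGED, level)
    if aliphatic is None or polar_charged is None:
        return "undetermined"
    if aliphatic > polar_charged:
        return "buried-like"
    if polar_charged > aliphatic:
        return "active-site-like"
    return "undetermined"


# Hypothetical positions illustrating the two patterns.
buried = {"A": 2, "V": 2, "L": 2, "I": 3, "D": 9, "K": 9, "E": 8, "S": 7}
active_site = {"A": 9, "V": 9, "L": 8, "K": 2, "E": 2, "S": 9}
print(classify_sensitive_position(buried))       # buried-like
print(classify_sensitive_position(active_site))  # active-site-like
```

A practical version would also need to handle the exceptions noted in the text, such as small wildtype A or G residues at buried positions, and would compare average tolerance across positions.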

Substitution Preferences at Buried Positions

There are 92 nonactive-site positions in CcdB, of which 21 positions are buried (accessibility ≤5%) and 71 are exposed (accessibility >5%). Of the 21 buried residues, 18 are hydrophobic (table 1). Mutational tolerance increased with increasing expression level (supplementary fig. S2, Supplementary Material online) and was lower at buried positions compared with the exposed positions. At the lowest expression level (MID 2), the average mutational tolerance for the 14 buried residues that are not part of the dimer interface or active-site is 48.5%, while for dimer-interface buried residues it is 47.5%, indicating that both classes of buried residues are equally sensitive to mutation (supplementary table S3, Supplementary Material online). Residue D19 is the only buried, potentially charged residue and yet, surprisingly, shows the highest mutational tolerance relative to other buried residues. Although the residue is largely buried, the side-chain points outwards towards solvent, explaining its high tolerance to mutation. A subset of buried residues most sensitive to mutation was selected using the following criteria: tolerance at MID 2 <40%, tolerance at MID 8 <90%, and phenotypic data for ≥15 mutants is available. Interestingly, this selected subset (V18, V20, I34, I90 and I94) clusters together in the interior of each monomer (supplementary fig. S1E, Supplementary Material online).
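The three selection criteria stated above translate directly into a filter. The sketch below applies them to a handful of rows taken from Table 1 (residue, number of mutants, tolerance at MID 2, tolerance at MID 8); the function name and the tuple layout are illustrative assumptions, and only a few rows are included to keep the example short.

```python
def most_sensitive_buried(rows, max_tol_mid2=40, max_tol_mid8=90, min_mutants=15):
    """Select residues with tolerance < 40% at MID 2, < 90% at MID 8,
    and phenotypic data for at least 15 mutants.

    rows: iterable of (residue, n_mutants, tol_mid2_percent, tol_mid8_percent).
    """
    return [
        residue
        for residue, n_mutants, tol_mid2, tol_mid8 in rows
        if tol_mid2 < max_tol_mid2
        and tol_mid8 < max_tol_mid8
        and n_mutants >= min_mutants
    ]


# A few rows from Table 1 (not the full table).
ROWS = [
    ("V05", 18, 39, 94),   # fails: tolerance at MID 8 is not below 90%
    ("V18", 18, 33, 83),
    ("D19", 18, 83, 100),  # fails: highly tolerant at both levels
    ("L36", 12, 0, 67),    # fails: data for fewer than 15 mutants
    ("I90", 19, 26, 89),
]
print(most_sensitive_buried(ROWS))  # ['V18', 'I90']
```

Keeping the thresholds as keyword arguments makes it easy to see how the selected subset changes if the cutoffs are relaxed.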

On analyzing the mutational tolerance as a function of mutant amino acid at buried residues, we found that at the lowest expression level D, R and P are the least tolerated mutations, and tolerance decreases in the order aliphatic > aromatic, polar > charged. Interestingly, for charged and polar amino acids, smaller amino acids were consistently more poorly tolerated than larger ones (compare D, E, N, Q, S, T tolerances in supplementary table S2, Supplementary Material online). The opposite trend is observed for aromatic substitutions, where tolerance decreases in the order F > Y, H > W. D and R are the least tolerated substitutions (fig. 1C), though most other mutations are well tolerated at the highest expression level (supplementary table S2, Supplementary Material online). The poor tolerance for a buried Aspartate at all expression levels is likely due to the inability of the small charged side-chain to be solvated upon burial, and reconfirms our earlier result (Bajaj et al. 2005), indicating that Aspartate mutant phenotypes are good indicators of residue burial.

FIG. 1. Mutational effects on CcdB protein activity inferred from phenotypic screening and deep sequencing. (A), (B) and (C) show the MSseq values for representative exposed-site (accessibility >5%), all active-site, and buried-site residues (accessibility ≤5%), respectively. On the vertical axis, residues are grouped into (G, P), aliphatic (A–M), aromatic (F–W), polar (S–Q) and charged (D–R) amino acids. Residue numbers and substitutions are indicated on the horizontal and vertical axes, respectively. Each heatmap is colored according to the MSseq value of the mutant. Green to red color gradation represents increasing MSseq values. Zero value (light green) indicates that the corresponding mutant was not observed in the library. WT residue at each position is indicated in white. Data for only representative residue positions are shown for clarity. (D) Active-site residues (highlighted in cyan) identified from the mutational phenotypes, mapped onto the crystal structure of CcdB (PDB ID 3VUB).

We further attempted to quantitate the relative preference for different substitutions for all buried positions by incorporating phenotypic data at multiple expression levels. The distribution of MSseq values for introducing a specific residue "X" at every buried-site was obtained. Pair-wise comparisons of these distributions were made using a Wilcoxon signed-rank test. The heatmap (fig. 2A) indicates the log10 P value for the null hypothesis that the introduction of the row residue at a buried site does not reduce protein function significantly more than introduction of the corresponding column residue at the same site. It is important to note here that both the residues being compared are mutant residues. Unlike typical amino acid substitution matrices (Henikoff and Henikoff 1992) used for sequence alignment, our matrix is asymmetric. Aspartate and Arginine mutants possess significantly higher MSseq values than 18 and 16 other residues, respectively, indicating that they are the least tolerated mutations. Proline is the next most poorly tolerated mutation. P values for (D, E), (N, Q) and (S, T) (row, column) pairs are lower than for (E, D), (Q, N) and (T, S), indicating that, on an average, the order of tolerance is D < E, N < Q and S < T. Similarly, for aromatic residue tolerances, W < Y, H < F. In order to examine if these observations remain valid for systems other than CcdB, we examined previously published mutational sensitivity data for PSD95pdz3 (McLaughlin et al. 2012) and GB1 (Olson et al. 2014) (fig. 2B and C). The general trends were very similar and confirm our observation that for buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite is true for aromatic residues. Close examination of the log10 P values in figure 2A suggests that at buried sites, the substitution preference is approximately in the following order: A,C,V,L,I,M > T > F > H,Y,S > Q,G,W > N > K,P,E > R > D. A similar (but not identical) trend is also visible in the PSD95pdz3 and GB1 data, though this is based on fewer buried positions and at a single expression level. Additional saturation mutagenesis studies on other systems using quantitative or semi-quantitative readouts would be useful in consolidating our observations.
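To make the pairwise comparison concrete, the sketch below computes a one-sided Wilcoxon signed-rank P value for the hypothesis that a "row" residue gives higher MSseq values than a "column" residue across the same buried sites. It is a standard-library illustration, not the software used in this study. The assumptions are: values are paired by site, zero differences are dropped, tied absolute differences receive midranks, and the P value comes from exhaustively enumerating sign assignments (exact, but only practical for a modest number of sites). The example data are invented.

```python
import math
from itertools import product


def signed_rank_p_greater(row_values, col_values):
    """One-sided exact signed-rank P value for H1: row values exceed column values.

    row_values, col_values: MSseq scores paired by buried site (hypothetical layout).
    """
    diffs = [r - c for r, c in zip(row_values, col_values) if r != c]
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    # Under the null, each difference is equally likely to be positive or negative.
    as_extreme = sum(
        1
        for signs in product((0, 1), repeat=n)
        if sum(rank for rank, s in zip(ranks, signs) if s) >= w_plus
    )
    return as_extreme / 2 ** n


# Invented MSseq values at five buried sites for two mutant residues.
asp = [9, 9, 8, 9, 7]   # poorly tolerated substitution
leu = [2, 3, 2, 4, 2]   # well tolerated substitution
p_row_asp = signed_rank_p_greater(asp, leu)
p_row_leu = signed_rank_p_greater(leu, asp)
print(p_row_asp, round(math.log10(p_row_asp), 3))  # 0.03125 -1.505
print(p_row_leu)                                   # 1.0
```

The two calls give different P values for the same pair of residues, which illustrates why a matrix built from one-sided tests of this kind is asymmetric.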

Substitution preferences at active-site residues should be different than those at buried sites, because protein–protein interfaces are more polar than protein interiors (Janin et al. 1988; Tsai et al. 1997), and are also likely to display a greater context dependence. Extensive analysis of a large amount of mutational data would be required to decipher these substitution preferences. In the case of CcdB, data for only 142 active-site mutants is available. Hence, we did not attempt to predict mutational sensitivities at active-site residues.

Mutational Tolerance as a Function of Depth

Mutational tolerances at the lowest (MID 2) and highest (MID 8) expression levels for all nonactive-site residues are listed (supplementary table S2, Supplementary Material online, and fig. 1). At the lowest expression level, mutational tolerance increased with increasing accessibility, while at the highest expression level, it is less sensitive to accessibility and most mutants show an active phenotype. Most substitutions are tolerated at exposed nonactive-site residues, both at low and high expression levels (fig. 1A and supplementary fig. S1A, Supplementary Material online). However, a few mutants with accessibility >40% were found to show an inactive phenotype. These exposed, inactive, nonactive-site substitutions are typically either aromatic residues or proline (supplementary table S4, Supplementary Material online). These exposed aromatic substitutions probably affect the folding of CcdB protein, as they show high propensity to aggregation, although Tm's are somewhat comparable to the wildtype (see mutants G29W, L41F and V73F in supplementary table S5, Supplementary Material online).

Cation–π interactions are thought to contribute to protein stability (Gallivan and Dougherty 1999), though an earlier study (Prajapati et al. 2006) shows these contribute little to the stability of Maltose Binding Protein. We find that all the 19 and 11 mutations at the 13th and 14th positions, respectively, involved in cation–π interaction, including the charge reversal mutant R13D, were well tolerated even at the lowest expression levels (supplementary table S6, Supplementary Material online). Salt-bridges are another possible stabilizing noncovalent electrostatic interaction in proteins. In the case of CcdB, five salt-bridges are present between the following pairs of residues: D19-R31, D23-R31, E59-R40, E79-K4 and D89-R86. All amino acids participating in salt-bridges are solvent exposed, except for D19, in which only the terminal oxygens are exposed. Mutations at all these positions are well tolerated even at the lowest expression level (supplementary table S6, Supplementary Material online), suggesting that none of the salt-bridges in CcdB contributes significantly to the stability or activity of the protein.

We also examined the correlation of average MSseq values with residue depth for all nonactive-site positions in CcdB (PDB ID 3VUB) (fig. 2D). Similar calculations were performed for PSD95pdz3 and GB1 using the phenotypic data obtained from McLaughlin et al. (2012) (PDB ID 1BE9) and Olson et al. (2014) (PDB ID 1PGA), respectively. In these studies, the ability

Table 1. Mutational Tolerance at the Buried-Site Residues at Lowest and Highest Expression Levels.

Amino acid   No. of mutants   Depth (Å)   ACC^a (%)   Tol at MID 2^b (%)   Tol at MID 8^b (%)
V05          18               6.8         0           39                   94
F17          17               7.3         0.2         82                   100
V18          18               9.3         0           33                   83
D19          18               6.7         1.4         83                   100
V20^c,d      19               8.6         0           32                   74
Q21^c,d      19               6.5         1           63                   100
M32^d        17               7.8         0.3         76                   100
V33          19               6.5         1.4         68                   95
I34          19               7.9         0           37                   79
L36          12               7.2         0           0                    67
P52          17               5.4         3.5         41                   100
V54          15               5.6         0.4         73                   100
M63          19               8.1         0.1         47                   89
T65          9                7.9         0           44                   100
M68^c,d      12               6.6         0           33                   100
L83          19               5.8         1.5         53                   100
I90          19               7.4         0.1         26                   89
A93^c,d      14               6.0         0           36                   100
I94^c,d      18               7.9         0.6         33                   83
M97^c,d      16               7.5         0           56                   94
F98^c,d      19               7.7         0.7         37                   79

^a Side-chain accessibility.
^b Mutational tolerance at the lowest (MID 2) and highest (MID 8) expression levels.
^c Residues within van der Waals distance of the active-site residues.
^d Residues present at the dimer interface.

Tripathi et al. doi:10.1093/molbev/msw182 MBE
Downloaded from https://academic.oup.com/mbe/article/33/11/2960/2272479 by guest on 29 March 2022

FIG. 2. Relative tolerance for substitutions at buried positions. (A) Mutational sensitivity data at all buried positions, obtained at different expression levels for CcdB, were used to obtain the distribution of MSseq values for a given mutant residue. The distributions for row and column residues were compared using a Wilcoxon signed-rank test and the corresponding P values were calculated. The log10 of the P values is indicated. Gradation from red to blue indicates increasing values of log10 P, i.e., decreasing destabilizing effect of the row residue with respect to the column residue. A lower P value implies that introduction of the row residue at a buried site is typically more destabilizing than introduction of the corresponding column residue. (B and C) Similar plots, but using $\Delta E_i^x$ values derived from saturation mutagenesis of the PDZ domain (PSD95pdz3) and lnW values from saturation mutagenesis of the IgG-binding domain of protein G (GB1), respectively. (D–F) Correlation of the average MSseq values, $\Delta E_i^x$ values and lnW values with side-chain depth for all nonactive-site residues of CcdB, PSD95pdz3 and GB1, respectively. Accessibility and depth values were calculated based on the crystal structures of WT homodimeric CcdB (PDB ID 3VUB), PSD95pdz3 (PDB ID 1BE9) and GB1 (PDB ID 1PGA). A residue was defined as buried if the side-chain accessibility is ≤5%.

Table 2. Mutant Phenotype Prediction by MSpred, SNAP2 and SuSPect.

Protein      Prediction method   Pearson's corr. coeff.^a   Matthews corr. coeff.^b   Sensitivity^c (%)   Specificity^d (%)   Accuracy^e (%)
CcdB         MSpred^f            0.69                       0.65                      69                  95                  90
CcdB         SNAP2^g             0.27                       0.19                      100                 11                  37
CcdB         SuSPect^h           0.29                       0.14                      100                 8                   30
PSD95pdz3    MSpred^f            0.57                       0.53                      61                  93                  88
PSD95pdz3    SNAP2^g             0.24                       0.15                      100                 7                   34
PSD95pdz3    SuSPect^h           0.6                        0.61                      87                  87                  87
GB1          MSpred^f            0.65                       0.49                      44                  96                  79
GB1          SNAP2^g             0.27                       0.11                      100                 3                   42
GB1          SuSPect^h           0.08                       0.03                      73                  24                  38

^a Modulus of the correlation coefficient.
^b Matthews correlation coefficient $= \dfrac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$, where TP, TN, FP and FN are True Positives, True Negatives, False Positives and False Negatives, respectively.
^c Sensitivity $= \dfrac{TP}{TP+FN}$.
^d Specificity $= \dfrac{TN}{TN+FP}$.
^e Accuracy $= \dfrac{TP+TN}{TP+TN+FP+FN}$.
^f Mutant was classified as nonneutral if MSpred > 2 and neutral if the score = 2. Mutants were classified into true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN).
^g Mutant was classified as nonneutral if SNAP2 score > 50 and neutral if the score < −50. Predictions with −50 < score < 50 were of low reliability and were omitted. Mutants were classified into TP, TN, FP and FN.
^h Mutant was classified as nonneutral if SuSPect score > 75 and neutral if the score < 25. Predictions with 25 < score < 75 were of low reliability and were omitted. Mutants were classified into TP, TN, FP and FN.


of these proteins to bind their cognate ligands is quantitatively linked to a phenotypic readout. In all three cases, the average phenotypic effect was observed to increase with residue depth (correlation coefficients of magnitude 0.85, 0.73 and 0.61 for CcdB, PSD95pdz3 and GB1, respectively) (fig. 2D–F). Buried positions with small (A or G) wildtype residues were not included in the correlations. These positions are unusually sensitive to mutation because all substitutions result in large steric overlap. These data suggest that a large fraction of the average sensitivity to mutation at nonactive-site residues is governed by a single parameter, the residue depth. This is a remarkably simple metric that provides an alternative to the sector-based models used to analyze mutational data for PSD95pdz3 as well as other proteins (McLaughlin et al. 2012).
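The depth correlation described above can be illustrated with a minimal Python sketch. It assumes one record per nonactive-site position, and it assumes that "buried" means side-chain accessibility ≤5%. The record fields, function names and toy numbers are hypothetical and are not taken from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def depth_correlation(positions, buried_cutoff=5.0):
    """Correlate the average score with depth over nonactive-site positions.

    Buried positions (accessibility <= buried_cutoff, in %) whose wildtype
    residue is A or G are skipped, as described in the text.
    """
    kept = [
        p for p in positions
        if not (p["wt"] in ("A", "G") and p["accessibility"] <= buried_cutoff)
    ]
    depths = [p["depth"] for p in kept]
    scores = [p["avg_score"] for p in kept]
    return pearson_r(depths, scores)

# Hypothetical toy records, NOT values from the paper.
toy_positions = [
    {"wt": "V", "depth": 7.0, "accessibility": 0.0, "avg_score": 6.0},
    {"wt": "L", "depth": 5.5, "accessibility": 2.0, "avg_score": 4.5},
    {"wt": "A", "depth": 6.0, "accessibility": 0.0, "avg_score": 8.0},   # buried A: skipped
    {"wt": "K", "depth": 3.5, "accessibility": 60.0, "avg_score": 2.5},
    {"wt": "G", "depth": 3.0, "accessibility": 45.0, "avg_score": 2.0},  # exposed G: kept
]
print(round(depth_correlation(toy_positions), 3))
```

The sign of the coefficient depends on whether a larger score means a more or a less deleterious mutation, so only its magnitude is comparable across data sets with different readouts.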

One alternative approach to estimating burial preferences of amino acids is measuring free energies of transfer of amino acid side-chain analogs from water to cyclohexane (Wolfenden et al. 2015). Another approach is to measure accessible surface areas of the side-chains, averaged over a large database of protein structures, and either infer free energies of transfer from aqueous solution into the protein interior as described previously (Rose et al. 1985) or construct environment-dependent substitution matrices from such data (Overington et al. 1992). Relative ΔΔG's of burial from the first two approaches are shown in supplementary figure S1C and D, Supplementary Material online. Both of these show some qualitative similarities with the mutational data in figure 2A–C, but there are several notable differences. For example, the relative ΔΔG's of burial inferred from the free energy of transfer approaches show that the introduction of W at buried positions is clearly favored over Y and H, unlike the situation for the experimental mutational data. In addition, the transfer data predict that mutation to G and P will be largely tolerated, whereas the experimental mutational data suggest that substitutions to G or P are rarely tolerated. It is also observed in the mutational data sets that at buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite trend is observed for aromatic residues. In the case of the ΔΔG transfer data, the trend is preserved for polar and charged residues but clearly not for aromatic residues.

Prediction of Mutational Sensitivity Score (MSpred) Using Penalties Derived from the CcdB Data
We further determined whether the above observations regarding substitution preferences could be employed for prediction of the functional consequences of individual mutations. To this end, we developed a predictive model using a coherent set of rules derived from a randomly chosen subset of the CcdB mutational data containing 60% of the mutants, and tested its applicability in predicting the mutational sensitivities of the remaining 40% of the mutants as well as of two other proteins, PSD95pdz3 and GB1. The predicted score is denoted as MSpred.

For CcdB mutational data, a mutational sensitivity score (MSseq) of 2 is indicative of wild-type like behavior in the mutant, and higher values of MSseq indicate higher mutational sensitivity. Therefore, a base MSpred value of 2 was assigned to all the mutants in the test set, and penalties were subsequently added according to the nature of the substitution, taking into account the wildtype residue identity. As exposed nonactive-site positions tolerated almost all substitutions, penalties were calculated only for buried positions. We also observed that buried side-chains that point outwards with respect to the protein core are less sensitive to mutations compared with the ones that point inside. These residues were identified by their side-chain depth values (see "Materials and Methods" section) and were not considered for penalty calculation.

Substitutions were divided into categories based on the nature of the wildtype and mutant residue. Each wildtype and mutant residue was assigned to one of six categories, namely aliphatic, aromatic, polar, charged, G and P, resulting in a total of 34 (36 − 2 [GG and PP]) types of substitutions. The CcdB data were randomly divided into training (60% of the data) and test (40% of the data) sets. The category penalty for each type of substitution was calculated using only the training data set, by averaging the MSseq values observed for each category of substitution and subtracting the base MSpred value of 2 from the average MSseq. Additional "residue-specific penalties" were also derived to account for the residue-size-wise substitution preferences, e.g., smaller polar residues being more destabilizing than larger ones (Materials and Methods; supplementary table S7, Supplementary Material online). Penalties for proline substitutions (both buried and exposed) were derived using the flowchart described previously (Bajaj et al. 2007). Next, MSpred values were calculated for all buried positions based on these penalties (MSpred = 2 + category penalty + residue-specific penalty), and all exposed nonactive-site positions were assigned an MSpred of 2. Active-site residues were not considered in the analysis. The predicted mutational sensitivity scores (MSpred) for the test data set showed a high Pearson's correlation (r = 0.69) with the experimental MSseq values and a SD of 1.26 (table 2). We also derived the Matthews correlation coefficient in order to evaluate the performance of MSpred in classifying mutants as neutral and nonneutral (see "Materials and Methods" section). It was observed to be 0.65 (table 2).
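The scoring rule above (MSpred = 2 + category penalty + residue-specific penalty) can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation. The assignment of amino acids to the six categories is an assumption made here for the example (the actual assignment is given in Materials and Methods), the residue-specific penalty is passed in as a plain number, the separate proline flowchart is not modelled, and the training tuples are invented.

```python
from collections import defaultdict

BASE_MSPRED = 2.0  # score of a wildtype-like mutant

# ASSUMPTION: this residue-to-category assignment is illustrative only;
# the paper defines its own assignment in Materials and Methods.
CATEGORY = {}
for residues, name in [("AVLIMC", "aliphatic"), ("FWY", "aromatic"),
                       ("STNQH", "polar"), ("DEKR", "charged"),
                       ("G", "G"), ("P", "P")]:
    for aa in residues:
        CATEGORY[aa] = name

def substitution_type(wt, mut):
    """Return the (wildtype category, mutant category) pair."""
    return (CATEGORY[wt], CATEGORY[mut])

def category_penalties(training):
    """Average MSseq per substitution type, minus the base score of 2.

    `training` is an iterable of (wt, mut, msseq) tuples for buried positions.
    """
    grouped = defaultdict(list)
    for wt, mut, msseq in training:
        grouped[substitution_type(wt, mut)].append(msseq)
    return {k: sum(v) / len(v) - BASE_MSPRED for k, v in grouped.items()}

def predict_mspred(wt, mut, buried, penalties, residue_penalty=0.0):
    """MSpred = 2 + category penalty + residue-specific penalty (buried only)."""
    if not buried:
        return BASE_MSPRED  # exposed nonactive-site positions keep the base score
    # Substitution types absent from the training set get no category penalty here.
    return BASE_MSPRED + penalties.get(substitution_type(wt, mut), 0.0) + residue_penalty

# Six categories give 36 ordered pairs; G->G and P->P are not substitutions.
categories = sorted(set(CATEGORY.values()))
n_types = sum(1 for a in categories for b in categories
              if (a, b) not in {("G", "G"), ("P", "P")})

# Hypothetical training tuples, NOT data from the paper.
toy_training = [("V", "D", 8.0), ("L", "E", 6.0), ("V", "I", 2.0), ("I", "L", 2.0)]
toy_penalties = category_penalties(toy_training)
print(n_types, predict_mspred("I", "K", True, toy_penalties))
```

Keeping the category penalty and the residue-specific penalty as separate additive terms mirrors the formula in the text and makes it easy to see which term drives a given prediction.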

We tested the performance of MSpred on two other proteins. The MSpred values for PSD95pdz3 and GB1 agreed well with the experimental mutational sensitivity data, with Pearson's correlation coefficients of 0.57 and 0.65 and Matthews correlation coefficients of 0.53 and 0.49, respectively (table 2).
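The classification metrics reported in table 2 follow standard definitions, which the Python sketch below reproduces. The function names are illustrative, the confusion counts at the end are invented, and treating a score that lies exactly on a cutoff as "omitted" is an assumption, since the table footnotes only define strict inequalities.

```python
import math

def classify_with_cutoffs(score, nonneutral_above, neutral_below):
    """Return True (nonneutral), False (neutral) or None (low reliability, omitted).

    For example, SNAP2 would use (50, -50) and SuSPect (75, 25), per table 2.
    ASSUMPTION: scores exactly on a cutoff are treated as omitted.
    """
    if score > nonneutral_above:
        return True
    if score < neutral_below:
        return False
    return None

def confusion_counts(predicted, observed):
    """Count TP, TN, FP and FN, where True means 'nonneutral'."""
    pairs = [(p, o) for p, o in zip(predicted, observed) if p is not None]
    tp = sum(1 for p, o in pairs if p and o)
    tn = sum(1 for p, o in pairs if not p and not o)
    fp = sum(1 for p, o in pairs if p and not o)
    fn = sum(1 for p, o in pairs if not p and o)
    return tp, tn, fp, fn

def classification_metrics(tp, tn, fp, fn):
    """Matthews correlation coefficient, sensitivity, specificity and accuracy."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Hypothetical confusion counts, NOT taken from the paper.
toy_metrics = classification_metrics(tp=40, tn=50, fp=5, fn=5)
print({name: round(value, 3) for name, value in toy_metrics.items()})
```

A predictor that calls almost every mutant nonneutral reaches a sensitivity near 100% with very low specificity, which is why the Matthews correlation coefficient is the more informative single number here.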

We also carried out mutational sensitivity predictions for CcdB, PSD95pdz3 and GB1 using two frequently used methods, SNAP2 (Hecht et al. 2015) and SuSPect (Yates et al. 2014). Both SNAP2 and SuSPect show poorer correlation with the experimental mutational sensitivity data than MSpred (except SuSPect predictions for PSD95pdz3; table 2). Both methods generally show a very high sensitivity but a very low specificity compared with MSpred. Thus MSpred, which is derived from very simple rules, compares favorably with the popular machine learning based methods SNAP2 and SuSPect. This approach should work to rank order mutational effects at buried sites for other globular proteins. While a three dimensional structure is not essential, it is important to have residue burial information, because predictions have been optimized for buried residues. A saturation mutagenesis data set is also not required. However, it is important to have experimental data on the functional effects of multiple point mutants to decide on the cutoff value of MSpred that would result in an observable phenotype. This value would likely depend on factors such as intrinsic protein stability, expression level and gene essentiality, which would vary from one protein to another (Miosge et al. 2015).

In Vitro Determined Apparent Tm's Correlate Better with in Vivo Solubility than with Relative Activity Derived from Deep Sequencing
To experimentally probe the molecular basis for mutant phenotypes at nonactive-site positions, around 80 single-site mutants of CcdB were selected from the saturation mutagenesis library (Bajaj et al. 2008) based on MSseq and accessibility class (Adkar et al. 2012) (supplementary table S5, Supplementary Material online). All the mutants were purified by affinity purification against the immobilized ligands GyrA or CcdA. Each purified protein was subjected to thermal denaturation monitored using Sypro orange dye (Niesen et al. 2007), and the apparent Tm was calculated for each mutant (supplementary fig. S3A and table S5, Supplementary Material online). During purification of various CcdB mutants, it is possible that the protein may be inactivated by aggregation or misfolding. Hence, the ability of purified protein to bind CcdA was examined by monitoring the thermal denaturation of each mutant in the absence and presence of a CcdA peptide that contains the CcdB binding residues (residues 46–72). If the mutant binds the CcdA peptide, this should result in an increase in its apparent Tm (supplementary fig. S3B, Supplementary Material online) (Fukada et al. 1983; Brandts and Lin 1990; Gonzalez et al. 1999). There were nine mutants, e.g., V05S, V05L, Y06G and F17D, that did not show an increase in apparent Tm in the presence of CcdA peptide (supplementary table S5, Supplementary Material online), suggesting that these are misfolded or aggregated; hence these mutants were removed from the analysis. Most of these mutants are largely found in inclusion bodies and have Tm's between 40 °C and 50 °C, in contrast to WT CcdB, which has a Tm of 68.4 °C. Further studies were restricted to the remaining 71 mutants that showed an enhancement in thermal stability in the presence of CcdA peptide (supplementary fig. S3B, Supplementary Material online). Mutants showed a range of apparent Tm's (supplementary table S5, Supplementary Material online). When in vitro determined thermal stability was compared with in vivo phenotypes (MSseq) determined

FIG. 3. Correlation between apparent in vitro Tm, in vivo solubility and activity (MSseq value) for CcdB mutants. Correlations of ΔTm [Tm (WT) − Tm (Mutant)] for 67 single-site mutants with (A) in vivo activity and (B) in vivo fraction of soluble protein, respectively. (C) Correlation of relative thermal stability (ΔTm) of mutants with ΔΔG° of unfolding estimated by GdnHCl denaturation. (D) Correlation of fraction of protein in the soluble fraction with in vivo activity of mutants.


by deep sequencing, a moderate correlation (r = 0.65) was obtained (fig. 3A). However, there were many mutants that showed similar activity but differed substantially in their stability, such as L16S, V18T, D19N and V54E (supplementary table S5, Supplementary Material online). Conversely, there were also mutants (e.g., V33D, M32N) that showed similar thermal stability to the wildtype but had substantially lower activity in vivo. This shows that the in vivo activity of a protein depends on many factors inside a cell which assist in proper folding and in maintaining an active conformation. Since the apparent Tm determined by the thermal shift assay may not reflect the true thermodynamic stability of the protein, a subset of 21 mutants was also subjected to GdnHCl chemical denaturation. These mutants were chosen to span a range of Tm and MSseq values. These measurements were done to see whether the two measures of stability, i.e., thermal and chemical denaturation, correlate with one another. Both measures of stability were found to be highly correlated (fig. 3C and supplementary table S8, Supplementary Material online).

Different mutations have different effects on protein stability and activity. Properly folded proteins are found in the soluble fraction of the cell lysate, whereas misfolded proteins often form insoluble aggregates called inclusion bodies. Hence, misfolding reduces the amount of active, soluble and functional protein, though studies have shown that some of the protein in the soluble fraction can also be misfolded (Liu et al. 2014). To study the relation between in vivo solubility of CcdB mutants and in vitro determined thermal stability, E. coli strain CSH501 (which has a mutation in the gyrA gene and is hence resistant to CcdB action) was transformed individually with the mutants, and the amount of protein in both the soluble fraction and in inclusion bodies was estimated. Surprisingly, for a few mutants, although very little protein was found in the soluble fraction, these showed an active phenotype with an MSseq of 2 (fig. 3D). Hence, for these mutants, the small amount of protein present in the soluble fraction is properly folded and sufficient to cause cell death in a CcdB sensitive strain. In some cases, different mutants have similar fractions of soluble protein in vivo but have different in vivo activity and in vitro thermal stability (supplementary table S5, Supplementary Material online). The overall thermal stabilities of mutants correlated well with the in vivo amount of soluble protein (fig. 3B). This indicates that protein stability is an important determinant of proper folding in vivo. The moderate correlation of stability or solubility with in vivo activity likely arises because only a small amount of properly folded, soluble protein is sufficient to result in an active phenotype.

One reason for the lack of a better correlation between solubility and in vivo activity is that, for each mutant, various conformational forms of the protein can partition differently between the soluble and insoluble fractions of the cell lysate. The soluble fraction can comprise both folded protein, which is active, and soluble aggregates/partially misfolded protein, which are inactive (Liu et al. 2014). Moreover, this partitioning can be influenced by perturbations in the cytosolic proteostasis network. To study the relation between in vivo activity and solubility, the ability of four selected CcdB mutants (V33K and Y06G as examples of active but insoluble mutants, and R31G and V80N as examples of soluble but inactive mutants) in the soluble fraction of the cell lysate to bind Gyrase was monitored by surface plasmon resonance (supplementary fig. S4A and B, Supplementary Material online). Mutants with only a small amount of protein in the soluble fraction but displaying an active phenotype in vivo (V33K, Y06G) showed binding to Gyrase comparable to the wild-type in this surface plasmon resonance assay, showing that the protein is well folded. In contrast, where a mutant is mostly in the soluble fraction but shows an inactive phenotype in vivo (R31G, V80N), the in vitro binding to Gyrase was also negligible compared with the wild-type (supplementary fig. S4C, Supplementary Material online).

Refolding and Unfolding Kinetics
Refolding and unfolding kinetics for 10 mutants that have similar thermal stability but different in vivo solubility and activity were monitored by time-course fluorescence spectroscopy at 25 °C. Refolding and unfolding were carried out at pH 7.4 at final GdnHCl concentrations of 0.6 and 3.2 M, respectively. Of the 10 selected mutants, four (V05S, I56G, V18R and V18H) could not be studied for their refolding profiles due to high precipitation immediately following purification. Further, for these mutants the proportion in the soluble fraction in vivo was low, ranging from 0.1 to 0.3 (supplementary table S5, Supplementary Material online). Of these, V05S and I56G are active (MSseq = 2), whereas V18R and V18H show an inactive phenotype (MSseq of 9 and 6, respectively). Most mutants (except V80N) showed slower refolding kinetics than the wild-type, indicating that these mutants are folding defective (table 3). Refolding for the wild-type occurs with a significant burst phase (k > 0.5 s⁻¹) and a slow phase. Mutants typically show a much smaller burst phase, an intermediate phase and a slow phase of much higher amplitude than the wildtype. Most mutants show unfolding kinetics similar to the wildtype, except V54E, which shows a much higher unfolding rate. The ability of the refolded mutants to bind to the cognate ligand GyrA or the CcdA peptide (residues 46–72) was also monitored. Binding of refolded mutants to GyrA immobilized on Amine Reactive Second Generation (AR2G) biosensors was monitored using Bio-layer interferometry (Sultana and Lee 2015), and binding to the CcdA peptide was monitored using the Thermal Shift Assay (Niesen et al. 2007). Active mutants (L16S, V18T) retained their binding to both GyrA and CcdA upon refolding, even though their refolding kinetics were slow (table 3). Surprisingly, V54E, which is also an active mutant, failed to bind GyrA and CcdA upon refolding, even though the native protein showed binding (supplementary fig. S6, Supplementary Material online). On the other hand, the inactive mutants R31G and M63N did not bind to GyrA and CcdA after refolding (table 3), showing that their refolded state is nonnative. Interestingly, the native V80N mutant did not show any binding to GyrA, but the refolded protein binds weakly to both ligands. Two of these mutants, V80N and V54E, also show formation of higher order oligomers (supplementary fig. S5, Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.
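To make the kinetic description more concrete, the Python sketch below evaluates exponential models of the kind used to fit the refolding and unfolding traces (table 3). The decaying sign of the exponents is an assumption, the function names are illustrative, and the L16S amplitudes and rate constants are taken as 0.72 and 0.07 s⁻¹ (fast phase) and 0.24 and 0.02 s⁻¹ (slow phase), as read from table 3.

```python
import math

def refolding_signal(t, y0, a, b, c, d):
    """Five-parameter double exponential: f = y0 + a*exp(-b*t) + c*exp(-d*t).

    ASSUMPTION: both exponents decay; b and d are the fast and slow rate constants.
    """
    return y0 + a * math.exp(-b * t) + c * math.exp(-d * t)

def unfolding_signal(t, y0, a, b):
    """Three-parameter single exponential: f = y0 + a*exp(-b*t).

    ASSUMPTION: a negative amplitude `a` gives a signal that rises towards y0.
    """
    return y0 + a * math.exp(-b * t)

def half_time(k):
    """Half-time (s) of a first-order phase with rate constant k (1/s)."""
    return math.log(2) / k

# L16S refolding phases as read from table 3 (amplitudes are fractional).
l16s = {"a1": 0.72, "k1": 0.07, "a2": 0.24, "k2": 0.02}

for t in (0, 10, 50, 200):
    unfinished = refolding_signal(t, 0.0, l16s["a1"], l16s["k1"], l16s["a2"], l16s["k2"])
    print(f"t = {t:>3} s: fraction of observable change remaining = {unfinished:.3f}")

print(f"fast phase half-time: {half_time(l16s['k1']):.1f} s")
print(f"slow phase half-time: {half_time(l16s['k2']):.1f} s")
```

A burst phase with k > 0.5 s⁻¹ has a half-time below about 1.4 s, which helps explain why it appears only as an initial amplitude rather than as a resolved rate constant.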

Over-Expression of Chaperones Rescues Folding Defects of Mutants
Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone and protease deleted strains as well as in chaperone over-expressing strains (see "Materials and Methods" section). Eleven CcdB mutants with a range of solubility and activity were chosen to study whether the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and activity of the mutants. Of these mutants, L16S, V33K, L36K and V80N had low Tm's (<55 °C) but differ in their in vivo activity, whereas mutants G29W, D67P and V73F show a higher Tm (>56 °C) but are inactive. In vivo activity of these mutants was monitored in both chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed into different strains, and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in activity in the protease deletion strains BWΔlon, BWΔclpP and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on the activity was not observed, probably due to the direct involvement of proteases in the process of CcdB mediated cell death (Van Melderen et al. 1996). Many of these proteases have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed a substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that for many of these mutants, inactivity primarily results from folding defects which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to the wildtype but lower solubility (V73F, D67P and G29W). This further indicates that defects in folding rather than stability are the primary causes of inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations in E. coli enzymes during in vitro mutational drift experiments, but can also have significant effects on E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding and that, once misfolding occurs, it cannot be rescued by ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because, for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed, as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Discussion
Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure,

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants^a,b.

Refolding columns: a0 (burst phase), a1 and k1 (fast phase), a2 and k2 (slow phase). Unfolding columns: A0, A1 and K1.

Mutant   Fraction soluble   MSseq   ΔTm (WT − mutant) (°C)   a0     a1     k1 (s⁻¹)   a2     k2 (s⁻¹)   A0     A1     K1 (s⁻¹)   CcdA binding to refolded protein (TSA)   Gyrase binding to refolded protein (BLI)
L16S     0.4                2       16.7                     0.04   0.72   0.07       0.24   0.02       0.83   0.17   0.06       ++++                                     +++
V18T     0.7                2       9                        0.04   0.7    0.1        0.26   0.02       0.8    0.2    0.16       ++++                                     ++++
R31G     0.6                6       11                       0.05   0.8    0.2        0.15   0.02       0.85   0.15   0.02       –^c                                      –^c
V54E     0.4                2       14.5                     0.14   0.17   0.28       0.68   0.04       1      –      –          –^c                                      –^c
M63N     0.2                6       15.2                     0.15   –      –          0.85   0.08       0.84   0.16   0.07       –^c                                      –^c
V80N     0.8                6       17.5                     0.8    –      >0.5       0.2    0.04       0.76   0.24   0.07       +                                        +
WT       1                  2       –                        0.84   –      >0.5       0.16   0.046      0.62   0.38   0.04       ++++                                     ++++

^a The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9 and 6, respectively.
^b The traces were fit to a 5-parameter equation for exponential decay for refolding ($f = y_0 + a e^{-bx} + c e^{-dx}$), yielding fast (k1) and slow phase rate constants (k2) with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding ($f = y_0 + a e^{-bx}$), yielding the rate constant k1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were 10% of the measured experimental value.
^c No observable binding.


stability and function, and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While there have been other saturation mutagenesis studies published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels and the effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped on the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA-Gyrase (Bajaj et al. 2008). A similar observation was made in another study, which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of TEM-1 β-lactamase, it has been found that deleterious effects of mutations

FIG. 4. In vivo activity and solubility of CcdB mutants in the presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein, for cells grown at 37 °C and induced for CcdB with 0.2% arabinose, in both supernatant (soluble) and pellet (insoluble), with or without over-expression of the chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants are shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.


primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues virtually all mutations are tolerated. At a few highly exposed positions (accessibility above 40%), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by about 5 kJ/mol, and suggested that loss of packing interactions is the major contributor to the decrease in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains, and suggests that while small increases in volume can be tolerated, changes in the shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated the effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size, of up to three methylene groups, can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000) and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current Protherm database (Kumar et al. 2006) (http://www.abren.net/protherm/, last accessed 31 August 2016), 4805 single buried-site mutants from 180 proteins were available. About 1667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of magnitude 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order the effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those for two other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity scores (MSpred) for other proteins (PSD95pdz3 and GB1), using penalties derived from the CcdB data and taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential of sequencing-based phenotypic data obtained from saturation mutagenesis for understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from other saturation mutagenesis studies can be used to improve predictions of the effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

Mutant   Strain                                         Fraction   Fractional increase
                                                        soluble    in solubility(a)
         BW25113  BWΔtig  BWΔsecB  BWpTig  BWpSecB                 Tig     SecB
WT       1        1       1        1       1            1          1       1
L16S     4        7       7        2       3            0.4        1.5     1.7
G29W     8        8       8        2       4            0.6        1.2     1.1
M32N     4        6       6        2       3            0.1        3       2
V33K     4        7       6        2       3            0.1        2       2
P35I     8        7       8        5       5            0.6        1.3     1.1
L36K     8        8       8        3       5            0.05       4       2
L41F     7        8       8        3       3            0.4        1.8     2.5
D67P     6        8       8        2       4            0.2        0.8     0.5
S70W     6        8       8        2       4            0.5        1       0.4
V73F     7        7       8        2       3            0.5        2       1.3
V80N     6        6       7        3       4            0.6        1.3     1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
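The footnote to Table 4 defines the fractional increase in solubility as a ratio, so the soluble fraction under chaperone over-expression can be recovered by multiplying the baseline soluble fraction by that ratio. The sketch below illustrates this arithmetic for a few rows. It assumes the decimal points in the table read as 0.4, 1.5, 1.7 and so on; the function names are hypothetical and are not part of the original analysis.

```python
# Rows transcribed from Table 4, with decimal points restored (an assumption):
# mutant -> (fraction soluble, Tig ratio, SecB ratio)
TABLE4 = {
    "L16S": (0.4, 1.5, 1.7),
    "M32N": (0.1, 3.0, 2.0),
    "L36K": (0.05, 4.0, 2.0),
    "L41F": (0.4, 1.8, 2.5),
    "D67P": (0.2, 0.8, 0.5),
}


def soluble_with_chaperone(fraction_soluble: float, ratio: float) -> float:
    """Invert the footnote ratio: chaperone fraction = baseline fraction * ratio."""
    return fraction_soluble * ratio


def summarize(table):
    """Return {mutant: (soluble fraction with Tig, with SecB)}, rounded to 2 places."""
    return {
        mutant: (
            round(soluble_with_chaperone(base, tig), 2),
            round(soluble_with_chaperone(base, secb), 2),
        )
        for mutant, (base, tig, secb) in table.items()
    }


if __name__ == "__main__":
    for mutant, (with_tig, with_secb) in summarize(TABLE4).items():
        print(f"{mutant}: Tig {with_tig}, SecB {with_secb}")
```

A ratio below 1, as for D67P, corresponds to a lower soluble fraction when the chaperone is over-expressed.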

To obtain further insights into the determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and by equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate to Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well-folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, the unfolding rates of most mutants studied were similar to that of the wildtype.

The ability of a mutant to fold to the native state is affected by many parameters, including the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms involved in the degradation and removal of misfolded proteins from the cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects, without additional energy input or protein stabilization, results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of single nucleotide polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods

Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation

Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out using adjacent, nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T, in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
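To make the NNK design concrete, the following sketch enumerates the degenerate codon under the standard genetic code and checks which amino acids it can encode. It also works out the number of possible single-site mutants for a 101-residue protein (101 × 19), which is consistent with 1,430 mutants being about 75% of the total. This is an illustration only, not the authors' library-design code, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... (base order T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """Enumerate NNK codons: N is any of A/C/G/T, K is G or T."""
    return ["".join(codon) for codon in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons}          # amino acids plus '*' for stop
stops = [c for c in codons if CODON_TABLE[c] == "*"]

# Possible single-site substitutions for a 101-residue protein (19 alternatives per site).
possible_mutants = 101 * 19
coverage_percent = round(100 * 1430 / possible_mutants)

if __name__ == "__main__":
    print(len(codons), "codons;", len(encoded - {"*"}), "amino acids; stops:", stops)
    print(possible_mutants, "possible mutants;", coverage_percent, "% covered previously")
```

NNK therefore halves the codon space relative to NNN while still reaching all 20 amino acids, and it leaves only a single stop codon in the mix.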

In Vivo Activity of Individual Single-Site Mutants

Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose, at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.
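One way to summarize such plate data is to record, for each mutant, the lowest expression level at which colonies disappear, that is, the level at which the mutant becomes active. The sketch below is a minimal illustration of that idea under stated assumptions: the seven conditions are listed in order of increasing expression, with concentrations read as percentages with negative exponents, and the scoring function is hypothetical rather than the authors' actual procedure.

```python
# Seven plating conditions in order of increasing CcdB expression.
# The percentage values are a restored reading of the text (an assumption).
CONDITIONS = [
    "2e-1 % glucose",
    "4e-2 % glucose",
    "7e-3 % glucose",
    "0 % glucose/arabinose",
    "2e-5 % arabinose",
    "7e-5 % arabinose",
    "2e-2 % arabinose",
]


def first_active_level(colonies_seen):
    """Return the 1-based index of the lowest expression level with no colonies.

    colonies_seen holds one boolean per condition, in the order of CONDITIONS.
    Colonies mean the mutant is inactive at that level, since active CcdB kills
    the cell. Returns None if colonies are seen at every level.
    """
    if len(colonies_seen) != len(CONDITIONS):
        raise ValueError("expected one observation per condition")
    for level, seen in enumerate(colonies_seen, start=1):
        if not seen:
            return level
    return None


if __name__ == "__main__":
    # Hypothetical mutant: colonies at the two most repressed conditions only.
    observed = [True, True, False, False, False, False, False]
    level = first_active_level(observed)
    print(level, CONDITIONS[level - 1])
```

A lower level indicates a more active mutant, because less protein is needed to kill the cells.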


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance

Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800 × g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000 × g for 10 min at 4 °C. Various dilutions of the supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as the change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases

Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), and ClpB, DnaK, DnaJ, and GroEL (all ATP-dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases, Lon, ClpP, HslU, HslV, and HchA, were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose, at 30 °C, as many of these strains are temperature sensitive. In the case of chaperone over-expression strains, the medium used for recovery following transformation was LB+Chl (35 μg/ml), as the chaperone-expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB+Amp plates containing 0.5 mM IPTG to induce chaperone expression, and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition where each of the mutants became active in the presence or absence of the chaperone was tabulated.

Supplementary Material

Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

We thank Dr. Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NO.BT/COE/34/SP15219/2015, DT.20/11/2015) and the Department of Science and Technology, Government of India (grant number F.NO.SB/SO/BB-0099/2013, DTD. 24.6.14). The authors declare that no competing interests exist.

References

Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis, Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing, and protein structure. Annu Rev Biophys Bioeng 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci 79:19 25 11–19 25 26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol 356:1263–1274.


laborious and limits the number of mutants that can be studied. If a convenient phenotypic readout for protein function is available, this can be combined with deep sequencing to obtain relative activity estimates for large numbers of mutants (Tripathi and Varadarajan 2014). In cases where a phenotypic readout is unavailable, monitoring the levels of a reporter gene fused to the protein of interest can be used as a proxy for activity, although such fusions may also affect the stability and folding of the protein (Kim et al. 2013). The advent of next generation sequencing has provided a considerable amount of phenotypic data linked to mutations, but studies that aim at understanding the molecular basis of these phenotypes are limited. Many studies that employ site-saturation mutagenesis methodology have goals specific to a given protein, such as to identify active-site residues (Melnikov et al. 2014; Romero et al. 2015), improve/alter protein properties (Wang et al. 2002; Deng et al. 2012; Whitehead et al. 2012; Starita et al. 2013), identify stabilizing mutations (Araya et al. 2012; Traxlmayr et al. 2012; Kim et al. 2013), determine affinity and specificity determinants of protein–protein interaction (DeBartolo et al. 2012; Dutta et al. 2013), or to study the fitness landscape (Hietpas et al. 2011, 2012; Melnikov et al. 2014; Thyagarajan and Bloom 2014; Sarkisyan et al. 2016). The readout in most cases is either qualitative (binding/no binding) or semi-quantitative; experiments are carried out at a single expression level; some cases sample a limited number of sites (Fowler et al. 2010; Hietpas et al. 2011; Deng et al. 2012; McLaughlin et al. 2012; Schlinkmann et al. 2012) and can involve metastable proteins with multiple functional conformations (Thyagarajan and Bloom 2014). Some of these studies sample multi-site as well as single mutations, complicating interpretation of the data (Hietpas et al. 2011; Deng et al. 2012), and in most cases, inferences from these analyses are not validated by detailed characterization of individual single mutants. Previously, attempts to obtain residue-specific contributions to activity have been made with either a full-length protein such as Ubiquitin (76 aa) (Roscoe et al. 2013) or with protein domains such as the hYAP65 WW domain (25-aa region) (Fowler et al. 2010), but in such cases it is difficult to separate the effects of single mutations on stability/folding from those that directly affect function, either because the system has multiple binding partners, as in the case of Ubiquitin, or due to a limited number of single mutants and the presence of several double and triple mutants in the library (Fowler et al. 2010; Deng et al. 2012).

There have also been numerous prior attempts to understand and predict the functional consequences of mutations by using computational methods (Bloom et al. 2005; Parthiban et al. 2006; Moretti et al. 2013; Pires et al. 2014). While experimental approaches often measure changes in thermodynamic stability or activity of proteins upon mutation, computational methods typically predict stabilities based on either sequence and/or structure. Some recent methods based on machine learning, such as SNAP2 (Hecht et al. 2015) and SuSPect (Yates et al. 2014), take into account evolutionary information and other sequence and structure based features to predict functional consequences of mutations.

In the present study, we attempt to understand the contribution of every amino acid in a protein to its structure, stability and function, understand how mutations modulate protein activity in vivo, and use this information in predicting the functional effects of mutations computationally. We attempt to address the following issues: (1) Can we distinguish active-site residues from buried ones based solely on saturation mutagenesis phenotypes? (2) Are there consistent patterns in substitution preferences at buried sites? (3) What is the primary mechanism by which mutations at buried sites affect activity in vivo? (4) Can we predict functional effects of specific mutations at buried sites? We use the protein CcdB (Controller of Cell Death protein B) as an experimental test protein. CcdB is a homodimeric protein and each protomer contains 101 residues (Loris et al. 1999). CcdB is a part of the CcdAB toxin–antitoxin system present on the Escherichia coli F-plasmid and plays an important role in F-plasmid maintenance by killing plasmid free cells (Jaffe et al. 1985; Hayes 2003). Biophysical and thermodynamic studies of dimeric CcdB (Chakshusmathi 2002; Bajaj et al. 2004) indicate that the protein exists as a homodimer at neutral pH and undergoes a two-state unfolding process with a free energy of unfolding of approximately 21 kcal/mol at 298 K (Bajaj et al. 2004). CcdB has two primary ligands: its cognate antitoxin CcdA and its cellular target, DNA Gyrase. The Kd of CcdB for CcdA37–72 is in the picomolar range and is much smaller than that for GyrA, which is approximately 10 nM (De Jonge et al. 2009).

Phenotypes of 1,664 single-site mutants of CcdB were determined at seven different expression levels (designated as 2–8 in the order of increasing expression level) by using two different deep sequencing techniques, 454 (Adkar et al. 2012) and Illumina (this work). We describe a mutational sensitivity score derived from sequencing (MSseq) and use it to quantitatively rank order mutant effects on phenotype at both buried and exposed positions, and to distinguish buried from active-site residues based solely on mutational data. Two other systems for which experimentally derived mutational sensitivity scores were available, namely PDZ domain (PSD95pdz3) and IgG-binding domain of protein G (GB1), were used to compare the substitution preferences and determine if a coherent set of rules derived from a fraction of the CcdB mutational data can also be used for predicting the functional effects of other mutations in CcdB as well as in the two additional test proteins.

To gain additional insights into the molecular determinants of phenotype for the nonactive-site mutants, approximately 80 CcdB mutants with a range of in vivo activities were purified and characterized in vitro to obtain insights into determinants of protein stability, solubility and activity. Effects of chaperone over-expression, as well as chaperone and protease deletion, on activities of individual mutants were also studied to rationalize the effect of mutations on protein folding and stability. The data suggest that mutational effects on folding, rather than stability, determine the in vivo phenotype of CcdB mutants.

Molecular Determinants of Mutant Phenotypes. doi:10.1093/molbev/msw182. MBE


In summary, this work has important implications for understanding the molecular basis of mutant phenotypes and for mutant phenotype prediction.

Results

Phenotypes Determined from 454 Sanger Sequencing Match Well with Phenotypes Determined by Illumina Sequencing

We have previously described a library consisting of approximately 1,000 single-site mutants of CcdB (Adkar et al. 2012), which was constructed by pooling single-site mutants and individually sequenced by 454 Sanger sequencing to obtain phenotypes (Bajaj et al. 2008). We have previously shown that phenotypes of individual mutants, determined by growing them on plates at various repressor and inducer concentrations, correlate well (r = 0.95) with those obtained from 454 deep sequencing (Adkar et al. 2012). In the present study, a fresh library for CcdB was prepared by individually randomizing each codon using an inverse PCR procedure (Jain and Varadarajan 2014). This library was transformed and screened at seven different expression levels under identical conditions to those used for the earlier library. The relative population of each mutant as a function of repressor/inducer concentration was estimated using Illumina deep sequencing. In contrast to 454 sequencing, where the read length was sufficient to cover the entire gene, each Illumina read provided only 50–70 bp of useful sequence. Hence, it was necessary to create six PCR products to obtain complete sequence coverage for the whole gene. The key assumption here is that each mutant gene is mutant only at a single codon; thus, we considered reads which contain exactly one mutant codon. We observed 78.5% of the reads to be wildtype, which is close to the expected 83.3% (5/6 × 100, since only one of the six PCR products from a single-site mutant carries the mutant codon). Only 2.5% of the non wildtype reads (0.12% of total reads) had two mutations. Since the additional mutations will likely be randomly distributed, and given that most single mutants show an active phenotype, the fraction of incorrectly assigned inactive phenotypes is expected to be small. Since expression of active CcdB leads to cell death, the number of sequencing reads for a given mutant abruptly decreases at the expression level where the mutant shows an active phenotype. These expression levels are assigned numerical values from 2 to 8 (a value of 9 is assigned to the mutants that show cell growth even at the highest expression level). The CcdB gene is amplified from colonies surviving at each expression level and tagged with a Multiplex IDentifier sequence (MID) unique to each expression level. MSseq is the expression level at which the number of sequencing reads for a particular mutant decreases by a factor of five or more compared to the previous expression level (Adkar et al. 2012; Sahoo et al. 2015). Based on this, phenotypes for a total of 1,664 single-site mutants in the two independent single-site libraries of CcdB were mapped collectively by the two deep sequencing methods, 454 and Illumina, which corresponds to 16.5 mutants per position (87.6% of all possible mutants). Of the 1,093 mutants analyzed by 454 sequencing and 1,342 by Illumina sequencing, 771 mutants were common; 625 mutants have the same MSseq value, and the MSseq score differed by at most 1 for 59 mutants. In the few cases where the MSseq value differed between Illumina and 454, the lower value (higher activity) was taken. The high concordance between phenotypes derived from Illumina, 454 and plate based assays of individual mutants validates the deep sequencing based phenotypic identification.
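The MSseq assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the dictionary layout, and the presence of a reference read count at a level labelled 1 (so that a drop at level 2 can be detected) are all hypothetical, and the read counts are invented.

```python
def ms_seq(reads_by_level, fold=5.0, no_drop_value=9):
    """Return the first expression level at which the read count is at least
    `fold` times lower than at the previous level.

    reads_by_level maps expression level -> read count for one mutant.
    Assumption: level 1 holds a reference count so that a drop at level 2
    can be scored. If no drop is found, `no_drop_value` (9) is returned.
    """
    levels = sorted(reads_by_level)
    for prev, cur in zip(levels, levels[1:]):
        prev_reads = reads_by_level[prev]
        cur_reads = reads_by_level[cur]
        if prev_reads > 0 and prev_reads >= fold * cur_reads:
            return cur
    return no_drop_value


def combine_platforms(ms_454, ms_illumina):
    """When the two platforms disagree, keep the lower value (higher activity)."""
    return min(ms_454, ms_illumina)


# Hypothetical read counts for three mutants (not data from the study).
wt_like = {1: 500, 2: 40, 3: 5, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0}
partial = {1: 500, 2: 480, 3: 450, 4: 60, 5: 5, 6: 0, 7: 0, 8: 0}
inactive = {level: 500 for level in range(1, 9)}

print(ms_seq(wt_like), ms_seq(partial), ms_seq(inactive))
print(combine_platforms(4, 3))
```

A mutant that kills cells already at the lowest expression level scores 2 (wild-type like), while one whose reads never drop scores 9 (inactive at all levels tested).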

Determination of the Active-Site Residues Solely from the Mutational Data

As a first step towards understanding and interpretation of the large amount of mutational data, we calculated residue-wise mutational tolerance, namely the fraction of active mutants for each residue at a given condition.
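As a rough illustration of this quantity, the sketch below computes per-residue tolerance from a table of MSseq scores. It assumes that a mutant counts as active at expression level k when its MSseq is at most k; this reading, the function name, and the toy scores are assumptions made for the example.

```python
from collections import defaultdict


def mutational_tolerance(ms_seq_scores, level):
    """Percentage of observed mutants at each position that are active at `level`.

    ms_seq_scores maps (position, mutant_residue) -> MSseq.
    Assumption: a mutant is active at `level` if its MSseq <= level.
    """
    active = defaultdict(int)
    total = defaultdict(int)
    for (position, _mutant), score in ms_seq_scores.items():
        total[position] += 1
        if score <= level:
            active[position] += 1
    return {pos: 100.0 * active[pos] / total[pos] for pos in total}


# Hypothetical scores for two positions (not data from the study).
scores = {
    (18, "A"): 2, (18, "D"): 9, (18, "L"): 2, (18, "R"): 8,
    (31, "A"): 2,
}
print(mutational_tolerance(scores, level=2))
print(mutational_tolerance(scores, level=8))
```

Under this definition tolerance can only stay the same or rise as the expression level increases, which matches the trend reported for CcdB.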

Residues with low mutational tolerance are mostly buried, whereas some are surface exposed. The latter are likely to be a part of the active-site (Wu et al. 2015). Active-site residues can be distinguished from buried ones, even in the absence of structural information, based on the pattern of mutational sensitivity. At buried positions, typically most aliphatic substitutions are tolerated, except when the wildtype residue is a small A or G residue, whereas polar and charged residues are poorly tolerated. In contrast, for active-site residues (which are typically exposed), mutations to aliphatic residues are often poorly tolerated, polar and charged residues are sometimes tolerated, and the average mutational tolerance is typically lower than that of the buried residues. Based on these criteria, we can identify residues Q2, F3, Y6, S22, I24, N95, W99, G100 and I101 as putative active-site residues from the mutational data alone (fig. 1). In the crystal structure of free CcdB (PDB ID 3VUB), all the active-site residues identified from the mutational phenotypes, with the exception of Y6, are in close proximity to each other and line a surface groove, indicating that these eight residues are likely to be part of the active-site (fig. 1D). In the structure of CcdB bound to a fragment of GyrA (PDB ID 1X75), all eight residues are in proximity to GyrA, confirming that these are indeed part of the active-site. Y6 has an exposure of just 9% and only the terminal OH group is exposed, suggesting that the low mutational tolerance at this position is likely to be primarily due to mutational effects on folding and stability rather than due to direct effects on GyrA binding. In subsequent analyses, we focus primarily on effects of mutations at nonactive-site positions. Mutational effects on active-site residues involved in binding Gyrase will be discussed in more detail elsewhere.

Substitution Preferences at Buried Positions

There are 92 nonactive-site positions in CcdB, of which 21 positions are buried (accessibility ≤ 5%) and 71 are exposed (accessibility > 5%). Of the 21 buried residues, 18 are hydrophobic (table 1). Mutational tolerance increased with increasing expression level (supplementary fig. S2, Supplementary Material online) and was lower at buried positions compared with the exposed positions. At the lowest expression level (MID 2), the average mutational tolerance for the 14 buried residues that are not part of the dimer interface or active-site is 48.5%, while for dimer-interface buried residues it is 47.5%, indicating that both classes of buried residues are equally sensitive to mutation (supplementary table S3, Supplementary Material online). Residue D19 is the only buried, potentially charged residue, and yet surprisingly shows the highest mutational tolerance relative to other buried residues. Although the residue is largely buried, the side-chain points outwards towards solvent, explaining its high tolerance to mutation. A subset of buried residues most sensitive to mutation was selected using the following criteria: tolerance at MID 2 < 40%, tolerance at MID 8 < 90%, and phenotypic data available for at least 15 mutants. Interestingly, this selected subset (V18, V20, I34, I90 and I94) clusters together in the interior of each monomer (supplementary fig. S1E, Supplementary Material online).

On analyzing the mutational tolerance as a function of mutant amino acid at buried residues, we found that at the lowest expression level D, R and P are the least tolerated mutations, and tolerance decreases in the order aliphatic > aromatic, polar > charged. Interestingly, for charged and polar amino acids, smaller amino acids were consistently more poorly tolerated than larger ones (compare D, E, N, Q, S, T tolerances in supplementary table S2, Supplementary Material online). The opposite trend is observed for aromatic substitutions, where tolerance decreases in the order F > Y, H > W. D and R are the least tolerated substitutions (fig. 1C), though most other mutations are well tolerated at the highest expression level (supplementary table S2, Supplementary Material online). The poor tolerance for a buried Aspartate at all expression levels is likely due to the inability of the small charged side-chain to be solvated upon burial and reconfirms our earlier result (Bajaj et al. 2005), indicating that Aspartate mutant phenotypes are good indicators of residue burial.

We further attempted to quantitate the relative preference for different substitutions for all buried positions by incorporating phenotypic data at multiple expression levels. The distribution of MSseq values for introducing a specific

FIG. 1. Mutational effects on CcdB protein activity inferred from phenotypic screening and deep sequencing. (A), (B) and (C) show the MSseq values for representative exposed-site (accessibility > 5%), all active-site, and buried-site residues (accessibility ≤ 5%), respectively. On the vertical axis, residues are grouped into (G, P), aliphatic (A–M), aromatic (F–W), polar (S–Q) and charged (D–R) amino acids. Residue numbers and substitutions are indicated on the horizontal and vertical axes, respectively. Each heatmap is colored according to the MSseq value of the mutant. Green to red color gradation represents increasing MSseq values. Zero value (light green) indicates that the corresponding mutant was not observed in the library. The WT residue at each position is indicated in white. Data for only representative residue positions are shown for clarity. (D) Active-site residues (highlighted in cyan) identified from the mutational phenotypes, mapped onto the crystal structure of CcdB (PDB ID 3VUB).


residue “X” at every buried-site was obtained. Pair-wise comparisons of these distributions were made using a Wilcoxon signed-rank test. The heatmap (fig. 2A) indicates the log10 P value for the null hypothesis that the introduction of the row residue at a buried site does not reduce protein function significantly more than introduction of the corresponding column residue at the same site. It is important to note here that both the residues being compared are mutant residues. Unlike typical amino acid substitution matrices (Henikoff and Henikoff 1992) used for sequence alignment, our matrix is asymmetric. Aspartate and Arginine mutants possess significantly higher MSseq values than 18 and 16 other residues, respectively, indicating that they are the least tolerated mutations. Proline is the next most poorly tolerated mutation. P values for (D, E), (N, Q) and (S, T) (row, column) pairs are lower than for (E, D), (Q, N) and (T, S), indicating that, on average, the order of tolerance is D < E, N < Q and S < T. Similarly, for aromatic residue tolerances, W < Y, H < F. In order to examine if these observations remain valid for systems other than CcdB, we examined previously published mutational sensitivity data for PSD95pdz3 (McLaughlin et al. 2012) and GB1 (Olson et al. 2014) (fig. 2B and C). The general trends were very similar and confirm our observation that for buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite is true for aromatic residues. Close examination of the log10 P values in figure 2A suggests that at buried sites, the substitution preference is approximately in the following order: A, C, V, L, I, M > T > F > H, Y, S > Q, G, W > N > K, P, E > R > D. A similar (but not identical) trend is also visible in the PSD95pdz3 and GB1 data, though this is based on fewer buried positions and at a single expression level. Additional saturation mutagenesis studies on other systems, using quantitative or semi-quantitative readouts, would be useful in consolidating our observations.
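To make the asymmetry of the row versus column comparison concrete, the following sketch implements a one-sided Wilcoxon signed-rank test on paired MSseq values (one pair per buried site). It is an assumption-laden illustration rather than the procedure used in the study: zero differences are dropped, tied absolute differences receive average ranks, and the P value is computed exactly from the sign-flip null distribution. The MSseq values are invented.

```python
def signed_rank_one_sided(row_scores, col_scores):
    """One-sided exact Wilcoxon signed-rank P value.

    Tests whether the row residue tends to give HIGHER MSseq (lower activity)
    than the column residue at the same sites. Pairs with zero difference
    are dropped; tied |differences| get average ranks.
    """
    diffs = [a - b for a, b in zip(row_scores, col_scores) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0

    # Average ranks of the absolute differences.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Exact null distribution of W+ over all 2**n sign assignments,
    # built on doubled ranks so that half-integer ranks stay integral.
    distribution = {0: 1}
    for r in ranks:
        step = int(round(2 * r))
        updated = {}
        for total, count in distribution.items():
            updated[total] = updated.get(total, 0) + count
            updated[total + step] = updated.get(total + step, 0) + count
        distribution = updated

    target = int(round(2 * w_plus))
    tail = sum(count for total, count in distribution.items() if total >= target)
    return tail / 2 ** n


# Hypothetical MSseq values for D and E substitutions at six buried sites.
asp = [9, 8, 9, 7, 9, 8]
glu = [2, 2, 3, 2, 4, 2]
print(signed_rank_one_sided(asp, glu))  # (row D, column E)
print(signed_rank_one_sided(glu, asp))  # (row E, column D)
```

Because the test is one-sided, swapping row and column gives a different P value, which is why a matrix built this way is asymmetric.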

Substitution preferences at active-site residues should be different than those at buried sites because protein–protein interfaces are more polar than protein interiors (Janin et al. 1988; Tsai et al. 1997) and are also likely to display a greater context dependence. Extensive analysis of a large amount of mutational data would be required to decipher these substitution preferences. In the case of CcdB, data for only 142 active-site mutants is available. Hence, we did not attempt to predict mutational sensitivities at active-site residues.

Mutational Tolerance as a Function of Depth

Mutational tolerances at the lowest (MID 2) and highest (MID 8) expression levels for all nonactive-site residues are listed (supplementary table S2, Supplementary Material online and fig. 1). At the lowest expression level, mutational tolerance increased with increasing accessibility, while at the highest expression level it is less sensitive to accessibility and most mutants show an active phenotype. Most substitutions are tolerated at exposed nonactive-site residues both at low and high expression levels (fig. 1A and supplementary fig. S1A, Supplementary Material online). However, a few mutants with accessibility > 40% were found to show an inactive phenotype. These exposed, inactive, nonactive-site substitutions are typically either aromatic residues or proline (supplementary table S4, Supplementary Material online). These exposed aromatic substitutions probably affect the folding of CcdB protein, as they show high propensity to aggregation although their Tm's are somewhat comparable to the wildtype (see mutants G29W, L41F and V73F in supplementary table S5, Supplementary Material online).

Cation–π interactions are thought to contribute to protein stability (Gallivan and Dougherty 1999), though an earlier study (Prajapati et al. 2006) shows these contribute little to the stability of Maltose Binding Protein. We find that all the 19 and 11 mutations at the 13th and 14th positions, respectively, involved in cation–π interaction, including the charge reversal mutant R13D, were well tolerated even at the lowest expression levels (supplementary table S6, Supplementary Material online). Salt-bridges are another possible stabilizing noncovalent electrostatic interaction in proteins. In the case of CcdB, five salt-bridges are present between the following pairs of residues: D19-R31, D23-R31, E59-R40, E79-K4 and D89-R86. All amino acids participating in salt-bridges are solvent exposed except for D19, in which only the terminal oxygens are exposed. Mutations at all these positions are well tolerated even at the lowest expression level (supplementary table S6, Supplementary Material online), suggesting that none of the salt-bridges in CcdB contributes significantly to the stability or activity of the protein.

We also examined the correlation of average MSseq values with residue depth for all nonactive-site positions in CcdB (PDB ID 3VUB) (fig. 2D). Similar calculations were performed for PSD95pdz3 and GB1 using the phenotypic data obtained from McLaughlin et al. (2012) (PDB ID 1BE9) and Olson et al. (2014) (PDB ID 1PGA), respectively. In these studies, the ability

Table 1. Mutational Tolerance at the Buried-Site Residues at Lowest and Highest Expression Levels.

Amino acid   No. of mutants   Depth (Å)   ACC(a) (%)   Tol at MID 2(b) (%)   Tol at MID 8(b) (%)
V05          18               6.8         0            39                    94
F17          17               7.3         0.2          82                    100
V18          18               9.3         0            33                    83
D19          18               6.7         1.4          83                    100
V20(c,d)     19               8.6         0            32                    74
Q21(c,d)     19               6.5         1            63                    100
M32(d)       17               7.8         0.3          76                    100
V33          19               6.5         1.4          68                    95
I34          19               7.9         0            37                    79
L36          12               7.2         0            0                     67
P52          17               5.4         3.5          41                    100
V54          15               5.6         0.4          73                    100
M63          19               8.1         0.1          47                    89
T65          9                7.9         0            44                    100
M68(c,d)     12               6.6         0            33                    100
L83          19               5.8         1.5          53                    100
I90          19               7.4         0.1          26                    89
A93(c,d)     14               6.0         0            36                    100
I94(c,d)     18               7.9         0.6          33                    83
M97(c,d)     16               7.5         0            56                    94
F98(c,d)     19               7.7         0.7          37                    79

(a) Side-chain accessibility.
(b) Mutational tolerance at the lowest (MID 2) and highest (MID 8) expression levels.
(c) Residues within van der Waals distance of the active-site residues.
(d) Residues present at dimer interface.


FIG. 2. Relative tolerance for substitutions at buried positions. (A) Mutational sensitivity data at all buried positions obtained at different expression levels for CcdB was used to obtain the distribution of MSseq values for a given mutant residue. The distributions for row and column residues were compared using a Wilcoxon signed-rank test and the corresponding P values were calculated. The log10 of the P values is indicated. Gradation from red to blue indicates increasing values of log10 P, i.e., decreasing destabilizing effect of the row residue w.r.t. the column residue. A lower P value implies that introduction of the row residue at a buried site is typically more destabilizing than introduction of the corresponding column residue. (B and C) Similar plot but using ΔE_i^x values derived from saturation mutagenesis of the PDZ domain (PSD95pdz3) and lnW values from saturation mutagenesis of IgG Binding domain of protein G (GB1), respectively. (D–F) Correlation of the average MSseq values, ΔE_i^x values and lnW values with side-chain depth for all nonactive-site residues of CcdB, PSD95pdz3 and GB1, respectively. Accessibility and depth values were calculated based on the crystal structure of WT homodimeric CcdB (PDB ID 3VUB), PSD95pdz3 (PDB ID 1BE9) and GB1 (PDB ID 1PGA). A residue was defined as buried if the side-chain accessibility is ≤ 5%.

Table 2. Mutant Phenotype Prediction by MSpred, SNAP2 and SuSPect.

Protein      Prediction method   Pearson's r(a)   MCC(b)   Sensitivity(c) (%)   Specificity(d) (%)   Accuracy(e) (%)
CcdB         MSpred(f)           0.69             0.65     69                   95                   90
             SNAP2(g)            0.27             0.19     100                  11                   37
             SuSPect(h)          0.29             0.14     100                  8                    30
PSD95pdz3    MSpred(f)           0.57             0.53     61                   93                   88
             SNAP2(g)            0.24             0.15     100                  7                    34
             SuSPect(h)          0.6              0.61     87                   87                   87
GB1          MSpred(f)           0.65             0.49     44                   96                   79
             SNAP2(g)            0.27             0.11     100                  3                    42
             SuSPect(h)          0.08             0.03     73                   24                   38

(a) Modulus of the Pearson's correlation coefficient.
(b) Matthews correlation coefficient, $\mathrm{MCC} = \dfrac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$, where TP, TN, FP, FN are True Positives, True Negatives, False Positives and False Negatives, respectively.
(c) $\mathrm{Sensitivity} = \dfrac{TP}{TP+FN}$
(d) $\mathrm{Specificity} = \dfrac{TN}{TN+FP}$
(e) $\mathrm{Accuracy} = \dfrac{TP+TN}{TP+TN+FP+FN}$
(f) Mutant was classified as nonneutral if MSpred > 2 and neutral if the score = 2. Mutants were classified into true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN).
(g) Mutant was classified as nonneutral if SNAP2 score > 50 and neutral if the score < −50. Predictions with −50 < score < 50 are of low reliability and were omitted. Mutants were classified into TP, TN, FP and FN.
(h) Mutant was classified as nonneutral if SuSPect score > 75 and neutral if the score < 25. Predictions with 25 < score < 75 are of low reliability and were omitted. Mutants were classified into TP, TN, FP and FN.
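The performance measures defined in the Table 2 footnotes can be computed from a confusion matrix as in the sketch below. The function names are hypothetical, the confusion-matrix counts are invented, and the SNAP2 and SuSPect cutoffs follow one reading of the footnotes (nonneutral above the upper cutoff, neutral below the lower cutoff, omitted in between); returning 0 for an undefined MCC is likewise a convention chosen for the example.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """MCC plus sensitivity, specificity and accuracy (the last three in percent)."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: report MCC as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "mcc": mcc,
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
    }


def classify_mspred(score):
    """MSpred: nonneutral if above the base value of 2, otherwise neutral."""
    return "nonneutral" if score > 2 else "neutral"


def classify_banded(score, upper, lower):
    """Banded classification; returns None for low-reliability scores.

    Assumed cutoffs: SNAP2 upper=50, lower=-50; SuSPect upper=75, lower=25.
    """
    if score > upper:
        return "nonneutral"
    if score < lower:
        return "neutral"
    return None


# Hypothetical confusion matrix (not results from the study).
print(classification_metrics(tp=30, tn=60, fp=0, fn=10))
print(classify_mspred(3.5), classify_banded(10, upper=50, lower=-50))
```

Omitting the low-reliability band means that SNAP2 and SuSPect are evaluated on fewer mutants than MSpred, which should be kept in mind when comparing rows of the table.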


of these proteins to bind their cognate ligands is quantitatively linked to a phenotypic readout. In all the three cases, the average phenotypic effect was observed to increase with residue depth (correlation coefficients of magnitude 0.85, 0.73 and 0.61 for CcdB, PSD95pdz3 and GB1, respectively) (fig. 2D–F). Buried positions with small (A or G) wildtype residues were not included in the correlations. These positions are unusually sensitive to mutation because all substitutions result in large steric overlap. These data suggest that a large fraction of the average sensitivity to mutation at nonactive-site residues is governed by a single parameter, the residue depth. This is a remarkably simple metric that provides an alternative to the sector based models used to analyze mutational data for PSD95pdz3 as well as other proteins (McLaughlin et al. 2012).
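A depth correlation of this kind can be sketched as follows. The record layout, field names and values are hypothetical; the only elements taken from the text are the use of a Pearson correlation between residue depth and average mutational sensitivity, and the exclusion of buried positions whose wildtype residue is A or G.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def depth_correlation(residues):
    """Correlate depth with mean MSseq, skipping buried A/G wildtype positions."""
    kept = [
        r for r in residues
        if not (r["buried"] and r["wildtype"] in ("A", "G"))
    ]
    depths = [r["depth"] for r in kept]
    sensitivities = [r["mean_ms_seq"] for r in kept]
    return pearson_r(depths, sensitivities)


# Hypothetical nonactive-site residues (not data from the study).
residues = [
    {"wildtype": "K", "depth": 3.5, "mean_ms_seq": 2.0, "buried": False},
    {"wildtype": "T", "depth": 5.0, "mean_ms_seq": 3.0, "buried": False},
    {"wildtype": "V", "depth": 6.5, "mean_ms_seq": 4.0, "buried": True},
    {"wildtype": "I", "depth": 8.0, "mean_ms_seq": 5.0, "buried": True},
    {"wildtype": "A", "depth": 6.0, "mean_ms_seq": 8.5, "buried": True},
]
print(depth_correlation(residues))
```

In the toy data the buried alanine is far more sensitive than its depth alone would suggest, so leaving it in would weaken the correlation; this mirrors the rationale given for excluding small wildtype residues.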

One alternative approach to estimating burial preferences of amino acids is measuring free energies of transfer of amino acid side-chain analogs from water to cyclohexane (Wolfenden et al. 2015). Another approach is to measure accessible surface areas of the side-chains averaged over a large database of protein structures and either infer free energies of transfer from aqueous solution into the protein interior, as described previously (Rose et al. 1985), or construct environment dependent substitution matrices from such data (Overington et al. 1992). Relative ΔΔG's of burial from the first two approaches are shown in supplementary figure S1C and D, Supplementary Material online. Both of these show some qualitative similarities with the mutational data in figure 2A–C, but there are several notable differences. For example, the relative ΔΔG's of burial inferred from the free energy of transfer approaches show that the introduction of W at buried positions is clearly favored over Y and H, unlike the situation for the experimental mutational data. In addition, the transfer data predict that mutation to G and P will be largely tolerated, whereas the experimental mutational data suggest that substitutions to G or P are rarely tolerated. It is also observed in the mutational data-sets that at buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite trend is observed for aromatic residues. In the case of the ΔΔG transfer data, the trend is preserved for polar and charged residues but clearly not for aromatic residues.

Prediction of Mutational Sensitivity Score (MSpred) Using Penalties Derived from the CcdB Data

We further determined whether the above observations regarding substitution preferences could be employed for prediction of functional consequences of individual mutations. To this end, we developed a predictive model using a coherent set of rules derived from a randomly chosen subset of the CcdB mutational data containing 60% of the mutants, and tested its applicability in predicting the mutational sensitivities of the remaining 40% of the mutants as well as those of two other proteins, PSD95pdz3 and GB1. The predicted score is denoted as MSpred.

For CcdB mutational data, a mutational sensitivity score (MSseq) of 2 is indicative of wild-type like behavior in the mutant, and higher values of MSseq indicate higher mutational sensitivity. Therefore, a base MSpred value of 2 was assigned to all the mutants in the test set, and penalties were subsequently added according to the nature of the substitution, taking into account the wildtype residue identity. As exposed nonactive-site positions tolerated almost all substitutions, penalties were calculated only for buried positions. We also observed that buried side-chains that point outwards with respect to the protein core are less sensitive to mutations compared with the ones that point inside. These residues were identified by their side-chain depth values (see “Materials and Methods” section) and were not considered for penalty calculation.

Substitutions were divided into categories based on the nature of the wildtype and mutant residue. Each wildtype and mutant residue was assigned to one of six categories, namely aliphatic, aromatic, polar, charged, G and P, resulting in a total of 34 (36 − 2 [GG and PP]) types of substitutions. The CcdB data was randomly divided into training (60% of the data) and test sets (40% of the data). The category penalty for each type of substitution was calculated using only the training data set, by averaging the MSseq values observed for each category of substitution and subtracting the base MSpred value of 2 from the average MSseq. Additional “residue-specific penalties” were also derived to account for the residue-size-wise substitution preferences, e.g., smaller polar residues being more destabilizing than larger ones (Materials and Methods; supplementary table S7, Supplementary Material online). Penalties for proline substitutions (both buried and exposed) were derived using the flowchart described previously (Bajaj et al. 2007). Next, MSpred values were calculated for all buried positions based on these penalties (MSpred = 2 + category penalty + residue-specific penalty), and all exposed nonactive-site positions were assigned an MSpred of 2. Active-site residues were not considered in the analysis. The predicted mutational sensitivity scores (MSpred) for the test data set showed a high Pearson's correlation (r = 0.69) with the experimental MSseq values and a SD of 1.26 (table 2). We also derived the Matthews correlation coefficient in order to evaluate the performance of MSpred in classifying mutants as neutral and nonneutral (see “Materials and Methods” section). It was observed to be 0.65 (table 2).
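The scoring scheme can be sketched as below. This is a simplified illustration rather than the published procedure: the assignment of the twenty residues to categories (for example C with the aliphatics and H with the aromatics), the zero penalty used for a category pair absent from the training data, the residue-specific penalties being a plain lookup on the mutant residue, and all numerical values are assumptions. The separate proline flowchart and the exclusion of outward-pointing buried side-chains are not modelled.

```python
import random
from collections import defaultdict

BASE_SCORE = 2.0

# Assumed category membership (hypothetical grouping for illustration).
CATEGORY = {}
for residues, name in (
    ("ACVLIM", "aliphatic"), ("FYHW", "aromatic"), ("STNQ", "polar"),
    ("DEKR", "charged"), ("G", "G"), ("P", "P"),
):
    for residue in residues:
        CATEGORY[residue] = name


def split_60_40(records, seed=0):
    """Random 60/40 split into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def category_penalties(training):
    """Mean MSseq per (wildtype category, mutant category) minus the base score.

    training: iterable of (wildtype, mutant, ms_seq) for buried positions.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for wildtype, mutant, score in training:
        key = (CATEGORY[wildtype], CATEGORY[mutant])
        sums[key] += score
        counts[key] += 1
    return {key: sums[key] / counts[key] - BASE_SCORE for key in sums}


def ms_pred(wildtype, mutant, buried, penalties, residue_penalties=None):
    """MSpred = 2 + category penalty + residue-specific penalty (buried sites only)."""
    if not buried:
        return BASE_SCORE
    category_penalty = penalties.get((CATEGORY[wildtype], CATEGORY[mutant]), 0.0)
    residue_penalty = (residue_penalties or {}).get(mutant, 0.0)
    return BASE_SCORE + category_penalty + residue_penalty


# Hypothetical training records (not data from the study).
training = [("V", "D", 9), ("V", "D", 7), ("I", "L", 2), ("L", "A", 4)]
penalties = category_penalties(training)
print(ms_pred("V", "E", True, penalties))
print(ms_pred("V", "E", False, penalties))
print(ms_pred("I", "D", True, penalties, {"D": 0.5}))
```

Because penalties are pooled by category, a substitution never seen in training (V to E here) still receives a prediction from other aliphatic-to-charged examples, which is what allows rules derived from one protein to be applied to another.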

We tested the performance of MSpred on two other proteins. The MSpred values for PSD95pdz3 and GB1 agreed well with the experimental mutational sensitivity data, with Pearson's correlation coefficients of 0.57 and 0.65 and Matthews correlation coefficients of 0.53 and 0.49, respectively (table 2).

We also carried out mutational sensitivity predictions for CcdB, PSD95pdz3 and GB1 using two frequently used methods, SNAP2 (Hecht et al. 2015) and SuSPect (Yates et al. 2014). Both SNAP2 and SuSPect show poorer correlation with the experimental mutational sensitivity data than MSpred (except SuSPect predictions for PSD95pdz3; table 2). Both the methods show a very high sensitivity but a very low specificity value compared with MSpred. Thus MSpred, which is derived based on very simple rules, compares favorably with the popular machine learning based methods SNAP2 and SuSPect. This approach should work to rank order mutational effects at buried sites for other globular proteins. While a three dimensional structure is not essential, it is important to have residue burial information because predictions have been optimized for buried residues. A saturation mutagenesis data set is also not required. However, it is important to have experimental data on the functional effects of multiple point mutants to decide on the cutoff value of MSpred that would result in an observable phenotype. This value would likely depend on factors such as intrinsic protein stability, expression level and gene essentiality that would vary from one protein to another (Miosge et al. 2015).

In Vitro Determined Apparent Tm's Correlate Better with in Vivo Solubility than with Relative Activity Derived from Deep Sequencing

To experimentally probe the molecular basis for mutant phenotypes at nonactive-site positions, around 80 single-site mutants of CcdB were selected from the saturation mutagenesis library (Bajaj et al. 2008) based on MSseq and accessibility class (Adkar et al. 2012) (supplementary table S5, Supplementary Material online). All the mutants were purified by affinity purification against immobilized ligands, GyrA or CcdA. Each purified protein was subjected to thermal denaturation monitored using Sypro orange dye (Niesen et al. 2007), and the apparent Tm was calculated for each mutant (supplementary fig. S3A and table S5, Supplementary Material online). During purification of various CcdB mutants, it is possible that the protein may be inactivated by aggregation or misfolding. Hence, the ability of purified protein to bind CcdA was examined by monitoring the thermal denaturation of each mutant in the absence and presence of a CcdA peptide that contains CcdB binding residues (residues 46–72). If the mutant binds the CcdA peptide, this should result in an increase in its apparent Tm (supplementary fig. S3B, Supplementary Material online) (Fukada et al. 1983; Brandts and Lin 1990; Gonzalez et al. 1999). There were nine mutants, e.g., V05S, V05L, Y06G and F17D, that did not show an increase in apparent Tm in the presence of CcdA peptide (supplementary table S5, Supplementary Material online), suggesting that these are misfolded or aggregated; hence, these mutants were removed from the analysis. Most of these mutants are largely found in inclusion bodies and have Tm's between 40 °C and 50 °C, in contrast to WT CcdB, which has a Tm of 68.4 °C. Further studies were restricted to the remaining 71 mutants that showed an enhancement in thermal stability in the presence of CcdA peptide (supplementary fig. S3B, Supplementary Material online). Mutants showed a range of apparent Tm's (supplementary table S5, Supplementary Material online). When in vitro determined thermal stability was compared with in vivo phenotypes (MSseq) determined

FIG. 3. Correlation between apparent in vitro Tm, in vivo solubility and activity (MSseq value) for CcdB mutants. Correlations of ΔTm [Tm (WT) − Tm (Mutant)] for 67 single-site mutants with (A) in vivo activity and (B) in vivo fraction of soluble protein, respectively. (C) Correlation of relative thermal stability (ΔTm) of mutants with ΔΔG° of unfolding estimated by GdnHCl denaturation. (D) Correlation of fraction of protein in the soluble fraction with in vivo activity of mutants.


by deep sequencing, a moderate correlation (r = 0.65) was obtained (fig. 3A). However, many mutants showed similar activity but differed substantially in their stability, such as L16S, V18T, D19N, and V54E (supplementary table S5, Supplementary Material online). Conversely, there were also mutants (e.g., V33D, M32N) that showed thermal stability similar to wildtype but had substantially lower activity in vivo. This shows that the in vivo activity of a protein depends on many factors inside a cell which assist in proper folding and in maintaining an active conformation. Since the apparent Tm determined by the thermal shift assay may not reflect the true thermodynamic stability of the protein, a subset of 21 mutants was also subjected to GdnHCl chemical denaturation. These mutants were chosen to span a range of Tm and MSseq values. These measurements were done to see whether the two measures of stability, i.e., thermal and chemical denaturation, correlate with one another. Both measures of stability were found to be highly correlated (fig. 3C and supplementary table S8, Supplementary Material online).
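The correlation coefficients quoted in this section are Pearson r values. As a minimal sketch of how such a value is computed, the Python snippet below applies the standard formula to the six mutants listed in table 3 (ΔTm and MSseq, where a higher MSseq means lower activity). These six mutants were chosen for similar stability, so the result is not expected to reproduce the r reported for the full data set; the function and variable names are illustrative and are not taken from the original analysis.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# (mutant, delta Tm in deg C, MSseq) as listed in table 3.
# Illustration only: a small subset, not the full set behind fig. 3A.
TABLE3_SUBSET = [
    ("L16S", 16.7, 2),
    ("V18T", 9.0, 2),
    ("R31G", 11.0, 6),
    ("V54E", 14.5, 2),
    ("M63N", 15.2, 6),
    ("V80N", 17.5, 6),
]

delta_tm = [row[1] for row in TABLE3_SUBSET]
ms_seq = [row[2] for row in TABLE3_SUBSET]
r_subset = pearson_r(delta_tm, ms_seq)
print(f"r (table 3 subset, n = {len(TABLE3_SUBSET)}) = {r_subset:.2f}")
```

With only six points, and all of them moderately destabilized, the subset value mainly illustrates the calculation; it says little about the strength of the relationship across all mutants.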

Various mutations have different effects on protein stability and activity. Properly folded proteins are found in the soluble fraction of the cell lysate, whereas misfolded proteins often form insoluble aggregates called inclusion bodies. Hence, misfolding reduces the amount of active, soluble, and functional protein, though studies have shown that some of the protein in the soluble fraction can also be misfolded (Liu et al. 2014). To study the relation between the in vivo solubility of CcdB mutants and in vitro determined thermal stability, E. coli strain CSH501 (which has a mutation in the gyrA gene and is hence resistant to CcdB action) was transformed individually with the mutants, and the amount of protein in both the soluble fraction and in inclusion bodies was estimated. Surprisingly, for a few mutants, although very little protein was found in the soluble fraction, these showed an active phenotype with an MSseq of 2 (fig. 3D). Hence, for these mutants, the small amount of protein present in the soluble fraction is properly folded and sufficient to cause cell death in a CcdB-sensitive strain. In some cases, different mutants have similar fractions of soluble protein in vivo but have different in vivo activity and in vitro thermal stability (supplementary table S5, Supplementary Material online). The overall thermal stabilities of mutants correlated well with the in vivo amount of soluble protein (fig. 3B). This indicates that protein stability is an important determinant of proper folding in vivo. The moderate correlation of stability or solubility with in vivo activity likely arises because only a small amount of properly folded, soluble protein is sufficient to result in an active phenotype.

One reason for the lack of a better correlation between solubility and in vivo activity is that, for each mutant, various conformational forms of the protein can partition differently between the soluble and insoluble fractions of the cell lysate. The soluble fraction can comprise both folded protein, which is active, and soluble aggregates/partially misfolded protein, which are inactive (Liu et al. 2014). Moreover, this partitioning can be influenced by perturbations in the cytosolic proteostasis network. To study the relation between in vivo activity and solubility, the ability of four selected CcdB mutants (V33K and Y06G as examples of active but insoluble mutants, and R31G and V80N as examples of soluble but inactive mutants) in the soluble fraction of the cell lysate to bind Gyrase was monitored by surface plasmon resonance (supplementary fig. S4A and B, Supplementary Material online). Mutants with only a small amount of protein in the soluble fraction but displaying an active phenotype in vivo (V33K, Y06G) showed binding to Gyrase comparable to the wild-type in this surface plasmon resonance assay, showing that the protein is well folded. In contrast, for mutants that are mostly in the soluble fraction but show an inactive phenotype in vivo (R31G, V80N), the in vitro binding to Gyrase was negligible compared with the wild-type (supplementary fig. S4C, Supplementary Material online).

Refolding and Unfolding Kinetics
Refolding and unfolding kinetics for 10 mutants that have similar thermal stability but different in vivo solubility and activity were monitored by time-course fluorescence spectroscopy at 25 °C. Refolding and unfolding were carried out at pH 7.4 at final GdnHCl concentrations of 0.6 and 3.2 M, respectively. Of the 10 selected mutants, four (V05S, I56G, V18R, and V18H) could not be studied for their refolding profiles due to high precipitation immediately following purification. Further, for these mutants, the proportion in the soluble fraction in vivo was low, ranging from 0.1 to 0.3 (supplementary table S5, Supplementary Material online). Of these, V05S and I56G are active (MSseq = 2), whereas V18R and V18H show an inactive phenotype (MSseq of 9 and 6, respectively). Most mutants (except V80N) showed slower refolding kinetics than the wild-type, indicating that these mutants are folding defective (table 3). Refolding of the wild-type occurs with a significant burst phase (k > 0.5 s⁻¹) and a slow phase. Mutants typically show a much smaller burst phase, an intermediate phase, and a slow phase of much higher amplitude than the wildtype. Most mutants show unfolding kinetics similar to the wildtype, except V54E, which shows a much higher unfolding rate. The ability of the refolded mutants to bind to the cognate ligand, GyrA or the CcdA peptide (residues 46–72), was also monitored. Binding of refolded mutants to immobilized GyrA on Amine Reactive Second Generation (AR2G) biosensors was monitored using bio-layer interferometry (Sultana and Lee 2015), and binding to the CcdA peptide was monitored using the thermal shift assay (Niesen et al. 2007). Active mutants (L16S, V18T) retained their binding to both GyrA and CcdA upon refolding, even though their refolding kinetics were slow (table 3). Surprisingly, V54E, which is also an active mutant, failed to bind GyrA and CcdA upon refolding, even though the native protein showed binding (supplementary fig. S6, Supplementary Material online). On the other hand, the inactive mutants R31G and M63N did not bind to GyrA and CcdA after refolding (table 3), showing that their refolded state is nonnative. Interestingly, the native V80N mutant did not show any binding to GyrA, but the refolded protein binds weakly to both ligands. Two of these mutants, V80N and V54E, also show formation of higher order oligomers (supplementary fig. S5,


Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.

Over-Expression of Chaperones Rescues Folding Defects of Mutants
Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone- and protease-deleted strains as well as chaperone over-expressing strains (see "Materials and Methods" section). Eleven CcdB mutants with a range of solubility and activity were chosen to study whether the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and activity of the mutants. Of these mutants, L16S, V33K, L36K, and V80N had low Tm's (<55 °C) but differ in their in vivo activity, whereas mutants G29W, D67P, and V73F show a higher Tm (>56 °C) but are inactive. In vivo activity of these mutants was monitored in both chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed into different strains, and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL, and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in activity in the protease deletion strains BWΔlon, BWΔclpP, and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on activity was not observed, probably due to the direct involvement of proteases in the process of CcdB-mediated cell death (Van Melderen et al. 1996). Many of these proteases have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed a substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that, for many of these mutants, inactivity primarily results from folding defects which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to wildtype but lower solubility (V73F, D67P, and G29W). This further indicates that defects in folding, rather than stability, are the primary causes of inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations in E. coli enzymes during in vitro mutational drift experiments but can also have significant effects on E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding and that, once misfolding occurs, it cannot be rescued by ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because, for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed, as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Discussion
Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure,

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants (see notes a and b).

Refolding: a0 (burst phase), a1 and k1 (fast phase), a2 and k2 (slow phase). Unfolding: A0, A1, and K1. ΔTm = Tm (WT) − Tm (mutant). CcdA (TSA) and Gyrase (BLI) denote binding of each ligand to the refolded protein.

Mutant  Fraction  MSseq  ΔTm    a0     a1     k1       a2     k2       A0     A1     K1       CcdA     Gyrase
        soluble          (°C)                 (s⁻¹)           (s⁻¹)                  (s⁻¹)    (TSA)    (BLI)
L16S    0.4       2      16.7   0.04   0.72   0.07     0.24   0.02     0.83   0.17   0.06     ++++     +++
V18T    0.7       2      9      0.04   0.7    0.1      0.26   0.02     0.8    0.2    0.16     ++++     ++++
R31G    0.6       6      11     0.05   0.8    0.2      0.15   0.02     0.85   0.15   0.02     –(c)     –(c)
V54E    0.4       2      14.5   0.14   0.17   0.28     0.68   0.04     1      –      –        –(c)     –(c)
M63N    0.2       6      15.2   0.15   –      –        0.85   0.08     0.84   0.16   0.07     –(c)     –(c)
V80N    0.8       6      17.5   0.8    –      >0.5     0.2    0.04     0.76   0.24   0.07     +        +
WT      1         2      –      0.84   –      >0.5     0.16   0.046    0.62   0.38   0.04     ++++     ++++

(a) The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9, and 6, respectively.
(b) The traces were fit to a 5-parameter equation for exponential decay for refolding ($f = y_0 + a e^{-bx} + c e^{-dx}$), yielding fast (k1) and slow phase (k2) rate constants with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding ($f = y_0 + a e^{-bx}$), yielding the rate constant K1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were within 10% of the measured experimental value.
(c) No observable binding.
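To make the rate constants in table 3 easier to read, the following Python sketch evaluates the double-exponential refolding model from note b and converts rate constants into half-times (t½ = ln 2 / k). It is an illustration under stated assumptions: the exponents are taken to be negative (as expected for a decay), the baseline y0 is set to zero, and the function names are hypothetical rather than taken from the fitting software used in the study.

```python
import math

def refolding_trace(t, y0, a, b, c, d):
    """Double-exponential decay: f = y0 + a*exp(-b*t) + c*exp(-d*t)."""
    return y0 + a * math.exp(-b * t) + c * math.exp(-d * t)

def half_time(k):
    """Half-time (s) of a single exponential phase with rate constant k (s^-1)."""
    if k <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k

# Relative amplitudes and rate constants for L16S refolding, from table 3.
L16S = {"a1": 0.72, "k1": 0.07, "a2": 0.24, "k2": 0.02}

for phase, k in (("fast", L16S["k1"]), ("slow", L16S["k2"])):
    print(f"L16S {phase} phase: half-time = {half_time(k):.1f} s")

# Fraction of the total amplitude still to decay 10 s after mixing,
# with y0 = 0 (an assumption made only for this illustration).
remaining = refolding_trace(10.0, 0.0, L16S["a1"], L16S["k1"], L16S["a2"], L16S["k2"])
print(f"L16S amplitude remaining at 10 s: {remaining:.2f}")
```

On this reading, a burst-phase rate of more than 0.5 s⁻¹ (wild-type) corresponds to a half-time shorter than about 1.4 s, whereas the slow phases in the table have half-times of tens of seconds.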


stability and function, and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While other saturation mutagenesis studies have been published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels and the effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped on the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage-dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA Gyrase (Bajaj et al. 2008). A similar observation was made in another study, which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of the TEM-1 β-lactamase protein, it has been found that deleterious effects of mutations

FIG. 4. In vivo activity and solubility of CcdB mutants in the presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone-deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein in both supernatant (soluble) and pellet (insoluble), for cells grown at 37 °C and induced for CcdB with 0.2% arabinose, with or without over-expression of the chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants are shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.


primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues, virtually all mutations are tolerated. At a few highly exposed positions (>40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by approximately 5 kJ/mol and suggested that loss of packing interactions is the major contributor to the decrease in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions in the aliphatic-to-aliphatic category are well tolerated. In contrast, aliphatic-to-aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains and suggests that, while small increases in volume can be tolerated, changes in the shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While many studies have addressed the stability effects associated with large-to-small substitutions (Main et al. 1998; Loladze et al. 2002), relatively few have quantitated the effects of small-to-large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size, of up to three methylene groups, can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current Protherm database (Kumar et al. 2006) (http://www.abren.net/protherm/, last accessed 31 August 2016), 4,805 single buried-site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic-to-aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic-to-aromatic substitutions were available. About 50 aliphatic-to-aliphatic and 8 aliphatic-to-aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order the mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those for other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity scores (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential of sequencing-based phenotypic data obtained from saturation mutagenesis for understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

        In vivo activity, by strain                     Fraction   Fractional increase in solubility (a)
Mutant  BW25113  BWΔtig  BWΔsecB  BWpTig  BWpSecB       soluble    Tig      SecB
WT      1        1       1        1       1             1          1        1
L16S    4        7       7        2       3             0.4        1.5      1.7
G29W    8        8       8        2       4             0.6        1.2      1.1
M32N    4        6       6        2       3             0.1        3        2
V33K    4        7       6        2       3             0.1        2        2
P35I    8        7       8        5       5             0.6        1.3      1.1
L36K    8        8       8        3       5             0.05       4        2
L41F    7        8       8        3       3             0.4        1.8      2.5
D67P    6        8       8        2       4             0.2        0.8      0.5
S70W    6        8       8        2       4             0.5        1        0.4
V73F    7        7       8        2       3             0.5        2        1.3
V80N    6        6       7        3       4             0.6        1.3      1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.


other saturation mutagenesis studies can be used to improve predictions of the effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate to Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well-folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had unfolding rates similar to wildtype.

The ability of a mutant to fold to the native state is affected by many parameters, including the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms which are involved in the degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of the fitness effects of single nucleotide polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods
Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation
Previously, a total of 1,430 single-site mutants of CcdB (∼75% of possible mutants) were generated using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out using adjacent nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
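The NNK degeneracy can be checked by enumeration. The Python sketch below, which is an illustration and not part of the original protocol, lists all NNK codons, translates them with the standard genetic code, and counts the encoded amino acids and stop codons. The coverage estimate at the end assumes that all 101 CcdB positions can each be replaced by 19 other residues; the variable names are illustrative.

```python
from itertools import product

# Standard genetic code in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def nnk_codons():
    """All codons matching NNK, where N = A/C/G/T and K = G/T."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]

codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons}
amino_acids = sorted(encoded - {"*"})
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), "NNK codons")
print(len(amino_acids), "amino acids encoded:", "".join(amino_acids))
print("stop codons within NNK:", stop_codons)

# Assumption: 101 positions x 19 non-wildtype residues per position.
possible_mutants = 101 * 19
coverage = 1430 / possible_mutants
print(f"1,430 of {possible_mutants} possible mutants = {coverage:.1%}")
```

Restricting the third base to G or T keeps all 20 amino acids accessible while halving the codon count and excluding two of the three stop codons, which is the usual reason for choosing NNK over NNN.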

In Vivo Activity of Individual Single-Site Mutants
Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose, at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.
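The plate readout described above can be summarized as the first condition, in order of increasing CcdB expression, at which colonies no longer appear. The Python sketch below illustrates that bookkeeping only. The 1-based numbering, the percent units, and the example growth patterns are assumptions for illustration and are not taken from the study's data.

```python
# Plating conditions in order of increasing CcdB expression, as
# (compound, concentration in percent). Units are assumed to be percent.
CONDITIONS = [
    ("glucose", 2e-1),
    ("glucose", 4e-2),
    ("glucose", 7e-3),
    ("none", 0.0),
    ("arabinose", 2e-5),
    ("arabinose", 7e-5),
    ("arabinose", 2e-2),
]

def active_condition(colonies_seen):
    """Return the 1-based index of the first condition with no colonies.

    colonies_seen holds one boolean per condition (True = colonies grew,
    so the mutant is inactive there). Returns None if colonies grew under
    every condition, i.e., the mutant never became active.
    """
    if len(colonies_seen) != len(CONDITIONS):
        raise ValueError("expected one observation per plating condition")
    for index, grew in enumerate(colonies_seen, start=1):
        if not grew:
            return index
    return None

# Hypothetical plate readings for two mutants (not real data).
mild_mutant = [True, True, False, False, False, False, False]
severe_mutant = [True, True, True, True, True, True, False]

for name, plate in (("mild", mild_mutant), ("severe", severe_mutant)):
    index = active_condition(plate)
    compound, percent = CONDITIONS[index - 1]
    print(f"{name} mutant becomes active at condition {index}: {percent}% {compound}")
```

A mutant that kills cells only at the highest inducer concentration therefore receives a high condition number, which corresponds to low activity.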


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance
Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800 × g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000 × g for 10 min at 4 °C. Various dilutions of the supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as the change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases
Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger Factor and SecB (both ATP-independent), and ClpB, DnaK, DnaJ, and GroEL (all ATP-dependent chaperones). The resulting strains are referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases, Lon, ClpP, HslU, HslV, and HchA, were also used and are referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose, at 30 °C, as many of these strains are temperature sensitive. In the case of chaperone over-expression strains, the medium used for recovery following transformation was LB + Chl (35 μg/ml), as the chaperone-expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB + Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition where each of the mutants became active in the presence or absence of the chaperone was tabulated.

Supplementary Material
Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References

Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A 86:2172–2175.

Molecular Determinants of Mutant Phenotypes. doi:10.1093/molbev/msw182. MBE


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing, and protein structure. Annu Rev Biophys Bioeng 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci 79:19.25.11–19.25.26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol 356:1263–1274.


In summary, this work has important implications for understanding the molecular basis of mutant phenotypes and for mutant phenotype prediction.

Results

Phenotypes Determined from 454/Sanger Sequencing Match Well with Phenotypes Determined by Illumina Sequencing

We have previously described a library consisting of approximately 1,000 single-site mutants of CcdB (Adkar et al. 2012), which was constructed by pooling single-site mutants and individually sequenced by 454/Sanger sequencing to obtain phenotypes (Bajaj et al. 2008). We have previously shown that phenotypes of individual mutants, determined by growing them on plates at various repressor and inducer concentrations, correlate well (r = 0.95) with those obtained from 454 deep sequencing (Adkar et al. 2012). In the present study, a fresh library for CcdB was prepared by individually randomizing each codon using an inverse PCR procedure (Jain and Varadarajan 2014). This library was transformed and screened at seven different expression levels under conditions identical to those used for the earlier library. The relative population of each mutant as a function of repressor/inducer concentration was estimated using Illumina deep sequencing. In contrast to 454 sequencing, where the read length was sufficient to cover the entire gene, each Illumina read provided only 50–70 bp of useful sequence. Hence, it was necessary to create six PCR products to obtain complete sequence coverage for the whole gene. The key assumption here is that each mutant gene is mutant only at a single codon; thus, we considered only reads that contain exactly one mutant codon. We observed 78.5% of the reads to be wildtype, which is close to the expected 83.3% (5/6 × 100). Only 25 of the non-wildtype reads (012 of total reads) had two mutations. Since the additional mutations will likely be randomly distributed, and given that most single mutants show an active phenotype, the fraction of incorrectly assigned inactive phenotypes is expected to be small. Since expression of active CcdB leads to cell death, the number of sequencing reads for a given mutant abruptly decreases at the expression level where the mutant shows an active phenotype. These expression levels are assigned numerical values from 2 to 8 (a value of 9 is assigned to mutants that show cell growth even at the highest expression level). The CcdB gene is amplified from colonies surviving at each expression level and tagged with a Multiplex IDentifier sequence (MID) unique to each expression level. MSseq is the expression level at which the number of sequencing reads for a particular mutant decreases by a factor of five or more compared to the previous expression level (Adkar et al. 2012; Sahoo et al. 2015). Based on this, phenotypes for a total of 1,664 single-site mutants in the two independent single-site libraries of CcdB were mapped collectively by the two deep sequencing methods, 454 and Illumina, which corresponds to 16.5 mutants per position (87.6% of all possible mutants). Of the 1,093 mutants analyzed by 454 sequencing and 1,342 by Illumina sequencing, 771 mutants were common; 625 mutants have the same MSseq value, and the MSseq score differed by at most 1 for 59 mutants. In the few cases where the MSseq value differed between Illumina and 454, the lower value (higher activity) was taken. The high concordance between phenotypes derived from Illumina, 454, and plate-based assays of individual mutants validates the deep sequencing based phenotypic identification.
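The MSseq assignment described above can be illustrated with a minimal Python sketch. It assumes that the read counts for a mutant are already normalized across expression levels and that a baseline level (labeled 1 here, a hypothetical label) precedes level 2. The function names and the toy counts are illustrative and are not taken from the original analysis pipeline.

```python
def ms_seq(read_counts, fold_drop=5.0, never_active=9):
    """Return the MSseq score for one mutant.

    read_counts maps expression level (int) -> normalized read count.
    The lowest level present is used only as the baseline for comparison.
    """
    levels = sorted(read_counts)
    for previous, current in zip(levels, levels[1:]):
        before, after = read_counts[previous], read_counts[current]
        # Active phenotype: reads fall by a factor of five or more
        # relative to the previous expression level.
        if before > 0 and after * fold_drop <= before:
            return current
    # No abrupt drop at any level: mutant grows even at the highest level.
    return never_active


def merge_platforms(ms_454, ms_illumina):
    """When the two platforms disagree, keep the lower value (higher activity)."""
    return min(ms_454, ms_illumina)


# Hypothetical normalized counts for one mutant (level 1 is the assumed baseline).
counts = {1: 1000, 2: 950, 3: 900, 4: 120, 5: 10, 6: 2, 7: 1, 8: 0}
score = ms_seq(counts)
```

In this toy example the first fivefold drop occurs between levels 3 and 4, so the mutant would receive an MSseq value of 4.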

Determination of the Active-Site Residues Solely from the Mutational Data

As a first step towards understanding and interpreting the large amount of mutational data, we calculated residue-wise mutational tolerance, namely the fraction of active mutants for each residue at a given condition.
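A minimal sketch of this calculation is given below. It assumes that a mutant counts as active at a given expression level when its MSseq value is less than or equal to that level, which is an interpretation of the MSseq definition rather than a rule stated explicitly in the text. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def mutational_tolerance(ms_seq_by_mutant, level):
    """Percent of observed mutants at each position that are active at `level`.

    ms_seq_by_mutant maps (position, mutant_residue) -> MSseq score.
    Assumption: a mutant is active at `level` if its MSseq score <= level.
    """
    active = defaultdict(int)
    observed = defaultdict(int)
    for (position, _residue), score in ms_seq_by_mutant.items():
        observed[position] += 1
        if score <= level:
            active[position] += 1
    return {pos: 100.0 * active[pos] / observed[pos] for pos in observed}


# Hypothetical MSseq scores for four substitutions at one buried position.
toy_scores = {(18, "A"): 2, (18, "L"): 2, (18, "R"): 8, (18, "D"): 9}
tolerance_low = mutational_tolerance(toy_scores, level=2)
tolerance_high = mutational_tolerance(toy_scores, level=8)
```

With these toy values, tolerance at position 18 is 50% at the lowest expression level and 75% at the highest, mirroring the trend that tolerance rises with expression level.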

Residues with low mutational tolerance are mostly buried, whereas some are surface exposed. The latter are likely to be part of the active-site (Wu et al. 2015). Active-site residues can be distinguished from buried ones, even in the absence of structural information, based on the pattern of mutational sensitivity. At buried positions, most aliphatic substitutions are typically tolerated, except when the wildtype residue is a small A or G residue, whereas polar and charged residues are poorly tolerated. In contrast, for active-site residues (which are typically exposed), mutations to aliphatic residues are often poorly tolerated, polar and charged residues are sometimes tolerated, and the average mutational tolerance is typically lower than that of the buried residues. Using these criteria, we can identify residues Q2, F3, Y6, S22, I24, N95, W99, G100, and I101 as putative active-site residues based solely on the mutational data (fig. 1). Upon examining the crystal structure of free CcdB (PDB ID 3VUB), all the active-site residues identified from the mutational phenotypes, with the exception of Y6, are in close proximity to each other and line a surface groove, indicating that these eight residues are likely to be part of the active-site (fig. 1D). In the structure of CcdB bound to a fragment of GyrA (PDB ID 1X75), all eight residues are in proximity to GyrA, confirming that these are indeed part of the active-site. Y6 has an exposure of just 9%, and only the terminal OH group is exposed, suggesting that the low mutational tolerance at this position is likely to be primarily due to mutational effects on folding and stability rather than to direct effects on GyrA binding. In subsequent analyses, we focus primarily on effects of mutations at nonactive-site positions. Mutational effects on active-site residues involved in binding Gyrase will be discussed in more detail elsewhere.

Substitution Preferences at Buried Positions

There are 92 nonactive-site positions in CcdB, of which 21 positions are buried (accessibility ≤ 5%) and 71 are exposed (accessibility > 5%). Of the 21 buried residues, 18 are hydrophobic (table 1). Mutational tolerance increased with increasing expression level (supplementary fig. S2, Supplementary Material online) and was lower at buried positions than at exposed positions. At the lowest expression level (MID 2), the average mutational tolerance for the 14 buried residues that are not part of the dimer interface or active-site is 48.5%, while for dimer-interface buried residues it is 47.5%, indicating that both classes of buried residues are equally sensitive to mutation (supplementary table S3, Supplementary Material online). Residue D19 is the only buried, potentially charged residue and yet, surprisingly, shows the highest mutational tolerance relative to other buried residues. Although the residue is largely buried, the side-chain points outwards towards solvent, explaining its high tolerance to mutation. A subset of buried residues most sensitive to mutation was selected using the following criteria: tolerance at MID 2 < 40%, tolerance at MID 8 < 90%, and phenotypic data available for ≥ 15 mutants. Interestingly, this selected subset (V18, V20, I34, I90, and I94) clusters together in the interior of each monomer (supplementary fig. S1E, Supplementary Material online).

On analyzing the mutational tolerance as a function of mutant amino acid at buried residues, we found that at the lowest expression level D, R, and P are the least tolerated mutations, and tolerance decreases in the order aliphatic > aromatic ≈ polar > charged. Interestingly, for charged and polar amino acids, smaller amino acids were consistently more poorly tolerated than larger ones (compare D, E, N, Q, S, T tolerances in supplementary table S2, Supplementary Material online). The opposite trend is observed for aromatic substitutions, where tolerance decreases in the order F > Y ≈ H > W. D and R are the least tolerated substitutions (fig. 1C), though most other mutations are well tolerated at the highest expression level (supplementary table S2, Supplementary Material online). The poor tolerance for a buried Aspartate at all expression levels is likely due to the inability of the small charged side-chain to be solvated upon burial, and reconfirms our earlier result (Bajaj et al. 2005) indicating that Aspartate mutant phenotypes are good indicators of residue burial.

FIG. 1. Mutational effects on CcdB protein activity inferred from phenotypic screening and deep sequencing. (A), (B), and (C) show the MSseq values for representative exposed-site (accessibility > 5%), all active-site, and buried-site residues (accessibility ≤ 5%), respectively. On the vertical axis, residues are grouped into (G, P), aliphatic (A–M), aromatic (F–W), polar (S–Q), and charged (D–R) amino acids. Residue numbers and substitutions are indicated on the horizontal and vertical axes, respectively. Each heatmap is colored according to the MSseq value of the mutant. Green to red color gradation represents increasing MSseq values. Zero value (light green) indicates that the corresponding mutant was not observed in the library. The WT residue at each position is indicated in white. Data for only representative residue positions are shown for clarity. (D) Active-site residues (highlighted in cyan) identified from the mutational phenotypes, mapped onto the crystal structure of CcdB (PDB ID 3VUB).

We further attempted to quantitate the relative preference for different substitutions at all buried positions by incorporating phenotypic data at multiple expression levels. The distribution of MSseq values for introducing a specific residue “X” at every buried site was obtained. Pair-wise comparisons of these distributions were made using a Wilcoxon signed-rank test. The heatmap (fig. 2A) indicates the log10 P value for the null hypothesis that the introduction of the row residue at a buried site does not reduce protein function significantly more than introduction of the corresponding column residue at the same site. It is important to note here that both the residues being compared are mutant residues. Unlike typical amino acid substitution matrices (Henikoff and Henikoff 1992) used for sequence alignment, our matrix is asymmetric. Aspartate and Arginine mutants possess significantly higher MSseq values than 18 and 16 other residues, respectively, indicating that they are the least tolerated mutations. Proline is the next most poorly tolerated mutation. P values for (D, E), (N, Q), and (S, T) (row, column) pairs are lower than for (E, D), (Q, N), and (T, S), indicating that on average the order of tolerance is D < E, N < Q, and S < T. Similarly, for aromatic residue tolerances, W < Y ≈ H < F. In order to examine whether these observations remain valid for systems other than CcdB, we examined previously published mutational sensitivity data for PSD95pdz3 (McLaughlin et al. 2012) and GB1 (Olson et al. 2014) (fig. 2B and C). The general trends were very similar and confirm our observation that, for buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite is true for aromatic residues. Close examination of the log10 P values in figure 2A suggests that at buried sites the substitution preference is approximately in the following order: A, C, V, L, I, M > T > F > H, Y, S > Q, G, W > N > K, P, E > R > D. A similar (but not identical) trend is also visible in the PSD95pdz3 and GB1 data, though this is based on fewer buried positions and on a single expression level. Additional saturation mutagenesis studies on other systems, using quantitative or semi-quantitative readouts, would be useful in consolidating our observations.
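The asymmetry of such a pairwise comparison can be illustrated with a small, self-contained sketch of a one-sided Wilcoxon signed-rank test. This is not the authors' code: the exact-enumeration P value, the handling of ties and zero differences, the function name, and the toy MSseq values are all assumptions made for illustration. The text does not state which variant of the test was used.

```python
def signed_rank_p_greater(row_scores, col_scores):
    """Exact one-sided Wilcoxon signed-rank P value for paired scores.

    Null hypothesis: the row residue does not give higher scores than the
    column residue at the same sites. Zero differences are dropped and tied
    absolute differences receive average ranks.
    """
    diffs = [r - c for r, c in zip(row_scores, col_scores) if r != c]
    n = len(diffs)
    if n == 0:
        return 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    doubled_rank = [0] * n  # ranks multiplied by two so tied averages stay integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            doubled_rank[order[k]] = i + j + 2  # twice the mean of ranks i+1 .. j+1
        i = j + 1
    observed = sum(r for r, d in zip(doubled_rank, diffs) if d > 0)
    total = sum(doubled_rank)
    # ways[s] = number of sign assignments whose positive doubled-rank sum equals s
    ways = [0] * (total + 1)
    ways[0] = 1
    for r in doubled_rank:
        for s in range(total, r - 1, -1):
            ways[s] += ways[s - r]
    return sum(ways[observed:]) / 2 ** n


# Hypothetical MSseq values at five buried sites for two mutant residues.
msseq_asp = [9, 9, 8, 9, 7]
msseq_ala = [2, 3, 2, 4, 2]
p_asp_vs_ala = signed_rank_p_greater(msseq_asp, msseq_ala)  # row D, column A
p_ala_vs_asp = signed_rank_p_greater(msseq_ala, msseq_asp)  # row A, column D
```

Swapping row and column changes the P value, which is why a matrix built this way is asymmetric, unlike a conventional substitution matrix.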

Substitution preferences at active-site residues should be different from those at buried sites, because protein–protein interfaces are more polar than protein interiors (Janin et al. 1988; Tsai et al. 1997) and are also likely to display a greater context dependence. Extensive analysis of a large amount of mutational data would be required to decipher these substitution preferences. In the case of CcdB, data for only 142 active-site mutants are available. Hence, we did not attempt to predict mutational sensitivities at active-site residues.

Mutational Tolerance as a Function of Depth

Mutational tolerances at the lowest (MID 2) and highest (MID 8) expression levels for all nonactive-site residues are listed (supplementary table S2, Supplementary Material online, and fig. 1). At the lowest expression level, mutational tolerance increased with increasing accessibility, while at the highest expression level it is less sensitive to accessibility and most mutants show an active phenotype. Most substitutions are tolerated at exposed nonactive-site residues at both low and high expression levels (fig. 1A and supplementary fig. S1A, Supplementary Material online). However, a few mutants with accessibility > 40% were found to show an inactive phenotype. These exposed, inactive, nonactive-site substitutions are typically either aromatic residues or proline (supplementary table S4, Supplementary Material online). These exposed aromatic substitutions probably affect the folding of CcdB protein, as they show a high propensity to aggregate, although their Tm's are somewhat comparable to that of the wildtype (see mutants G29W, L41F, and V73F in supplementary table S5, Supplementary Material online).

Cation–π interactions are thought to contribute to protein stability (Gallivan and Dougherty 1999), though an earlier study (Prajapati et al. 2006) shows that these contribute little to the stability of Maltose Binding Protein. We find that all the 19 and 11 mutations at the 13th and 14th positions, respectively, involved in a cation–π interaction, including the charge reversal mutant R13D, were well tolerated even at the lowest expression levels (supplementary table S6, Supplementary Material online). Salt-bridges are another possible stabilizing noncovalent electrostatic interaction in proteins. In the case of CcdB, five salt-bridges are present between the following pairs of residues: D19-R31, D23-R31, E59-R40, E79-K4, and D89-R86. All amino acids participating in salt-bridges are solvent exposed, except for D19, in which only the terminal oxygens are exposed. Mutations at all these positions are well tolerated even at the lowest expression level (supplementary table S6, Supplementary Material online), suggesting that none of the salt-bridges in CcdB contributes significantly to the stability or activity of the protein.

We also examined the correlation of average MSseq values with residue depth for all nonactive-site positions in CcdB (PDB ID 3VUB) (fig. 2D). Similar calculations were performed for PSD95pdz3 and GB1 using the phenotypic data obtained from McLaughlin et al. (2012) (PDB ID 1BE9) and Olson et al. (2014) (PDB ID 1PGA), respectively. In these studies the ability

Table 1. Mutational Tolerance at the Buried-Site Residues at Lowest and Highest Expression Levels.

Amino acid   No. of mutants   Depth (Å)   ACC^a (%)   Tol at MID 2^b (%)   Tol at MID 8^b (%)
V05          18               6.8         0           39                   94
F17          17               7.3         0.2         82                   100
V18          18               9.3         0           33                   83
D19          18               6.7         1.4         83                   100
V20^c,d      19               8.6         0           32                   74
Q21^c,d      19               6.5         1           63                   100
M32^d        17               7.8         0.3         76                   100
V33          19               6.5         1.4         68                   95
I34          19               7.9         0           37                   79
L36          12               7.2         0           0                    67
P52          17               5.4         3.5         41                   100
V54          15               5.6         0.4         73                   100
M63          19               8.1         0.1         47                   89
T65          9                7.9         0           44                   100
M68^c,d      12               6.6         0           33                   100
L83          19               5.8         1.5         53                   100
I90          19               7.4         0.1         26                   89
A93^c,d      14               6.0         0           36                   100
I94^c,d      18               7.9         0.6         33                   83
M97^c,d      16               7.5         0           56                   94
F98^c,d      19               7.7         0.7         37                   79

^a Side-chain accessibility.
^b Mutational tolerance at the lowest (MID 2) and highest (MID 8) expression levels.
^c Residues within van der Waals distance of the active-site residues.
^d Residues present at dimer interface.


FIG. 2. Relative tolerance for substitutions at buried positions. (A) Mutational sensitivity data at all buried positions, obtained at different expression levels for CcdB, were used to obtain the distribution of MSseq values for a given mutant residue. The distributions for row and column residues were compared using a Wilcoxon signed-rank test, and the corresponding P values were calculated. The log10 of the P values is indicated. Gradation from red to blue indicates increasing values of log10 P, i.e., decreasing destabilizing effect of the row residue with respect to the column residue. A lower P value implies that introduction of the row residue at a buried site is typically more destabilizing than introduction of the corresponding column residue. (B and C) Similar plots, but using ΔE_i^x values derived from saturation mutagenesis of the PDZ domain (PSD95pdz3) and lnW values from saturation mutagenesis of the IgG-binding domain of protein G (GB1), respectively. (D–F) Correlation of the average MSseq values, ΔE_i^x values, and lnW values with side-chain depth for all nonactive-site residues of CcdB, PSD95pdz3, and GB1, respectively. Accessibility and depth values were calculated based on the crystal structures of WT homodimeric CcdB (PDB ID 3VUB), PSD95pdz3 (PDB ID 1BE9), and GB1 (PDB ID 1PGA). A residue was defined as buried if the side-chain accessibility is ≤ 5%.

Table 2. Mutant Phenotype Prediction by MSpred, SNAP2, and SuSPect.

Protein      Prediction method   Pearson's r^a   MCC^b   Sensitivity^c (%)   Specificity^d (%)   Accuracy^e (%)
CcdB         MSpred^f            0.69            0.65    69                  95                  90
CcdB         SNAP2^g             0.27            0.19    100                 11                  37
CcdB         SuSPect^h           0.29            0.14    100                 8                   30
PSD95pdz3    MSpred^f            0.57            0.53    61                  93                  88
PSD95pdz3    SNAP2^g             0.24            0.15    100                 7                   34
PSD95pdz3    SuSPect^h           0.6             0.61    87                  87                  87
GB1          MSpred^f            0.65            0.49    44                  96                  79
GB1          SNAP2^g             0.27            0.11    100                 3                   42
GB1          SuSPect^h           0.08            0.03    73                  24                  38

a Modulus of the Pearson's correlation coefficient.
b Matthews correlation coefficient, $\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$, where TP, TN, FP, and FN are True Positives, True Negatives, False Positives, and False Negatives, respectively.
c $\mathrm{Sensitivity} = \frac{TP}{TP+FN}$.
d $\mathrm{Specificity} = \frac{TN}{TN+FP}$.
e $\mathrm{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN}$.
f Mutant was classified as nonneutral if MSpred > 2 and neutral if the score = 2. Mutants were classified into true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
g Mutant was classified as nonneutral if SNAP2 score > 50 and neutral if the score < −50. Scores with −50 < Score < 50 were low-reliability predictions and were omitted. Mutants were classified into TP, TN, FP, and FN.
h Mutant was classified as nonneutral if SuSPect score > 75 and neutral if the score < 25. Scores with 25 < Score < 75 were low-reliability predictions and were omitted. Mutants were classified into TP, TN, FP, and FN.
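The classification metrics defined in the footnotes to table 2 can be computed directly from the four confusion-matrix counts. The following is a minimal sketch for illustration only; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Return MCC, sensitivity, specificity and accuracy from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "mcc": mcc,
        "sensitivity": tp / (tp + fn),   # fraction of nonneutral mutants recovered
        "specificity": tn / (tn + fp),   # fraction of neutral mutants recovered
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=40, tn=50, fp=5, fn=5)
```

Because MCC uses all four counts, it penalizes a predictor that labels nearly every mutant as nonneutral, which sensitivity alone does not.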


of these proteins to bind their cognate ligands is quantitatively linked to a phenotypic readout. In all three cases, the average phenotypic effect was observed to increase with residue depth (correlation coefficients of magnitude 0.85, 0.73, and 0.61 for CcdB, PSD95pdz3, and GB1, respectively) (fig. 2D–F). Buried positions with small (A or G) wildtype residues were not included in the correlations. These positions are unusually sensitive to mutation because all substitutions result in large steric overlap. These data suggest that a large fraction of the average sensitivity to mutation at nonactive-site residues is governed by a single parameter, the residue depth. This is a remarkably simple metric that provides an alternative to the sector-based models used to analyze mutational data for PSD95pdz3 as well as other proteins (McLaughlin et al. 2012).
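The depth correlation described above can be sketched as a Pearson correlation between residue depth and the average mutational sensitivity per position, after excluding positions whose wildtype residue is A or G. The sketch below is illustrative only: the data values, the tuple layout, and the function name `pearson_r` are hypothetical and do not reproduce the study's data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical (wildtype residue, depth in angstroms, average MSseq) records.
positions = [
    ("V", 4.0, 2.1), ("L", 5.5, 2.6), ("I", 6.8, 3.9),
    ("M", 7.9, 4.4), ("V", 9.3, 5.8), ("A", 8.0, 7.5),
]

# Small wildtype residues (A or G) are excluded, as in the text.
kept = [(depth, ms) for wt, depth, ms in positions if wt not in {"A", "G"}]
r = pearson_r([d for d, _ in kept], [m for _, m in kept])
```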

One alternative approach to estimating burial preferences of amino acids is measuring free energies of transfer of amino acid side-chain analogs from water to cyclohexane (Wolfenden et al. 2015). Another approach is to measure accessible surface areas of the side-chains, averaged over a large database of protein structures, and either infer free energies of transfer from aqueous solution into the protein interior, as described previously (Rose et al. 1985), or construct environment-dependent substitution matrices from such data (Overington et al. 1992). Relative ΔΔG's of burial from the first two approaches are shown in supplementary figure S1C and D, Supplementary Material online. Both of these show some qualitative similarities with the mutational data in figure 2A–C, but there are several notable differences. For example, the relative ΔΔG's of burial inferred from the free energy of transfer approaches show that the introduction of W at buried positions is clearly favored over Y and H, unlike the situation for the experimental mutational data. In addition, the transfer data predict that mutation to G and P will be largely tolerated, whereas the experimental mutational data suggest that substitutions to G or P are rarely tolerated. It is also observed in the mutational data sets that, at buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite trend is observed for aromatic residues. In the case of the ΔΔG transfer data, the trend is preserved for polar and charged residues but clearly not for aromatic residues.

Prediction of Mutational Sensitivity Score (MSpred) Using Penalties Derived from the CcdB Data
We further determined whether the above observations regarding substitution preferences could be employed for prediction of functional consequences of individual mutations. To this end, we developed a predictive model using a coherent set of rules derived from a randomly chosen subset of the CcdB mutational data containing 60% of the mutants, and tested its applicability in predicting the mutational sensitivities of the remaining 40% of the mutants as well as of two other proteins, PSD95pdz3 and GB1. The predicted score is denoted as MSpred.

For CcdB mutational data, a mutational sensitivity score (MSseq) of 2 is indicative of wild-type like behavior in the mutant, and higher values of MSseq indicate higher mutational sensitivity. Therefore, a base MSpred value of 2 was assigned to all the mutants in the test set and penalties were subsequently added according to the nature of the substitution, taking into account the wildtype residue identity. As exposed nonactive-site positions tolerated almost all substitutions, penalties were calculated only for buried positions. We also observed that buried side-chains that point outwards with respect to the protein core are less sensitive to mutations compared with the ones that point inside. These residues were identified by their side-chain depth values (see "Materials and Methods" section) and were not considered for penalty calculation.

Substitutions were divided into categories based on the nature of the wildtype and mutant residue. Each wildtype and mutant residue was assigned to one of six categories, namely aliphatic, aromatic, polar, charged, G, and P, resulting in a total of 34 (36 − 2 [G→G and P→P]) types of substitutions. The CcdB data were randomly divided into training (60% of the data) and test sets (40% of the data). The category penalty for each type of substitution was calculated using only the training data set, by averaging the MSseq values observed for each category of substitution and subtracting the base MSpred value of 2 from the average MSseq. Additional "residue-specific penalties" were also derived to account for the residue-size-wise substitution preferences, e.g., smaller polar residues being more destabilizing than larger ones (Materials and Methods; supplementary table S7, Supplementary Material online). Penalties for proline substitutions (both buried and exposed) were derived using the flowchart described previously (Bajaj et al. 2007). Next, MSpred values were calculated for all buried positions based on these penalties (MSpred = 2 + category penalty + residue-specific penalty), and all exposed nonactive-site positions were assigned an MSpred of 2. Active-site residues were not considered in the analysis. The predicted mutational sensitivity scores (MSpred) for the test data set showed a high Pearson's correlation (r = 0.69) with the experimental MSseq values and a SD of 1.26 (table 2). We also derived the Matthews correlation coefficient in order to evaluate the performance of MSpred in classifying mutants as neutral and nonneutral (see "Materials and Methods" section). It was observed to be 0.65 (table 2).
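The scoring rule described above (a base value of 2, plus a category penalty derived from the training-set average, plus a residue-specific penalty) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the toy training records, and the choice of a zero penalty for substitution categories absent from the training set are all hypothetical.

```python
from collections import defaultdict

BASE_MSPRED = 2.0  # wild-type-like score, as stated in the text

def category_penalties(training):
    """Average MSseq per (wildtype category, mutant category), minus the base value.

    `training` is an iterable of (wt_category, mut_category, msseq) records
    for buried positions in the training set.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for wt_cat, mut_cat, msseq in training:
        entry = totals[(wt_cat, mut_cat)]
        entry[0] += msseq
        entry[1] += 1
    return {key: s / n - BASE_MSPRED for key, (s, n) in totals.items()}

def ms_pred(wt_cat, mut_cat, buried, penalties, residue_penalty=0.0):
    """MSpred = 2 + category penalty + residue-specific penalty (buried sites only)."""
    if not buried:
        return BASE_MSPRED  # exposed nonactive-site positions are assigned 2
    # Assumption: unseen categories get no category penalty.
    return BASE_MSPRED + penalties.get((wt_cat, mut_cat), 0.0) + residue_penalty

# Hypothetical training records, for illustration only.
training = [
    ("aliphatic", "aliphatic", 2), ("aliphatic", "aliphatic", 3),
    ("aliphatic", "charged", 7), ("aliphatic", "charged", 9),
]
penalties = category_penalties(training)
score = ms_pred("aliphatic", "charged", True, penalties, residue_penalty=0.5)
```

Keeping the penalty table separate from the scoring function mirrors the train/test split in the text: penalties are estimated once from the training subset and then applied unchanged to held-out mutants or to other proteins.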

We tested the performance of MSpred on two other proteins. The MSpred values for PSD95pdz3 and GB1 agreed well with the experimental mutational sensitivity data, with Pearson's correlation coefficients of 0.57 and 0.65 and Matthews correlation coefficients of 0.53 and 0.49, respectively (table 2).

We also carried out mutational sensitivity predictions for CcdB, PSD95pdz3, and GB1 using two frequently used methods, SNAP2 (Hecht et al. 2015) and SuSPect (Yates et al. 2014). Both SNAP2 and SuSPect show poorer correlation with the experimental mutational sensitivity data than MSpred (except SuSPect predictions for PSD95pdz3; table 2). Both methods show a very high sensitivity but a very low specificity compared with MSpred. Thus MSpred, which is derived from very simple rules, compares favorably with the popular machine-learning-based methods SNAP2 and SuSPect. This approach should work to rank order mutational effects at buried sites for other globular proteins. While a three-dimensional structure is not essential, it is important to have residue burial information because predictions have been optimized for buried residues. A saturation mutagenesis data set is also not required. However, it is important to have experimental data on the functional effects of multiple point mutants to decide on the cutoff value of MSpred that would result in an observable phenotype. This value would likely depend on factors such as intrinsic protein stability, expression level, and gene essentiality, which would vary from one protein to another (Miosge et al. 2015).
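Since the cutoff on MSpred that yields an observable phenotype is expected to be protein dependent, a classifier built on these scores would naturally take the cutoff as a parameter. The sketch below is hypothetical: the function name `classify` is an assumption, and the default cutoff of 2 simply follows the CcdB convention in which mutants scoring above 2 are called nonneutral.

```python
def classify(ms_pred, cutoff=2.0):
    """Label a mutant as 'nonneutral' if its MSpred exceeds the cutoff."""
    return "nonneutral" if ms_pred > cutoff else "neutral"

# The same predicted score can change class when the cutoff is recalibrated
# against experimental data for a different protein (values are illustrative).
labels = [classify(3.5), classify(3.5, cutoff=4.0), classify(2.0)]
```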

In Vitro Determined Apparent Tm's Correlate Better with in Vivo Solubility than with Relative Activity Derived from Deep Sequencing
To experimentally probe the molecular basis for mutant phenotypes at nonactive-site positions, around 80 single-site mutants of CcdB were selected from the saturation mutagenesis library (Bajaj et al. 2008) based on MSseq and accessibility class (Adkar et al. 2012) (supplementary table S5, Supplementary Material online). All the mutants were purified by affinity purification against immobilized ligands, GyrA or CcdA. Each purified protein was subjected to thermal denaturation monitored using Sypro orange dye (Niesen et al. 2007), and the apparent Tm was calculated for each mutant (supplementary fig. S3A and table S5, Supplementary Material online). During purification of various CcdB mutants, it is possible that the protein may be inactivated by aggregation or misfolding. Hence, the ability of purified protein to bind CcdA was examined by monitoring the thermal denaturation of each mutant in the absence and presence of a CcdA peptide that contains CcdB binding residues (residues 46–72). If the mutant binds the CcdA peptide, this should result in an increase in its apparent Tm (supplementary fig. S3B, Supplementary Material online) (Fukada et al. 1983; Brandts and Lin 1990; Gonzalez et al. 1999). There were nine mutants, e.g., V05S, V05L, Y06G, and F17D, that did not show an increase in apparent Tm in the presence of CcdA peptide (supplementary table S5, Supplementary Material online), suggesting that these are misfolded or aggregated; hence, these mutants were removed from the analysis. Most of these mutants are largely found in inclusion bodies and have Tm's between 40 °C and 50 °C, in contrast to WT CcdB, which has a Tm of 68.4 °C. Further studies were restricted to the remaining 71 mutants that showed an enhancement in thermal stability in the presence of CcdA peptide (supplementary fig. S3B, Supplementary Material online). Mutants showed a range of apparent Tm's (supplementary table S5, Supplementary Material online). When in vitro determined thermal stability was compared with in vivo phenotypes (MSseq) determined

FIG. 3. Correlation between apparent in vitro Tm, in vivo solubility, and activity (MSseq value) for CcdB mutants. Correlations of ΔTm [Tm (WT) − Tm (Mutant)] for 67 single-site mutants with (A) in vivo activity and (B) in vivo fraction of soluble protein, respectively. (C) Correlation of relative thermal stability (ΔTm) of mutants with ΔΔG° of unfolding estimated by GdnHCl denaturation. (D) Correlation of fraction of protein in the soluble fraction with in vivo activity of mutants.


by deep sequencing, a moderate correlation (r = 0.65) was obtained (fig. 3A). However, there were many mutants that showed similar activity but differed substantially in their stability, such as L16S, V18T, D19N, and V54E (supplementary table S5, Supplementary Material online). Conversely, there were also mutants (e.g., V33D, M32N) that showed similar thermal stability to wildtype but had substantially lower activity in vivo. This shows that the in vivo activity of a protein depends on many factors inside a cell which assist in proper folding and maintaining an active conformation. Since the apparent Tm determined by the thermal shift assay may not reflect the true thermodynamic stability of the protein, a subset of 21 mutants was also subjected to GdnHCl chemical denaturation. These mutants were chosen to span a range of Tm and MSseq values. These measurements were done to see if the two measures of stability, i.e., thermal and chemical denaturation, correlate with one another. It was found that both measures of stability were highly correlated (fig. 3C and supplementary table S8, Supplementary Material online).

Various mutations have different effects on protein stability and activity. Properly folded proteins are found in the soluble fraction of the cell lysate, whereas misfolded proteins often form insoluble aggregates called inclusion bodies. Hence, misfolding reduces the amount of active, soluble, and functional protein, though studies have shown that some amount of protein in the soluble fraction can also be misfolded (Liu et al. 2014). To study the relation between in vivo solubility of CcdB mutants and in vitro determined thermal stability, E. coli strain CSH501 (which has a mutation in the gyrA gene and is hence resistant towards CcdB action) was transformed individually with the mutants, and the amount of protein in both the soluble fraction and in inclusion bodies was estimated. Surprisingly, for a few mutants, although very little protein was found in the soluble fraction, these showed an active phenotype with an MSseq of 2 (fig. 3D). Hence, for these mutants, the small amount of protein present in the soluble fraction is properly folded and sufficient to cause cell death in a CcdB-sensitive strain. In some cases, different mutants have similar fractions of soluble protein in vivo but have different in vivo activity and in vitro thermal stability (supplementary table S5, Supplementary Material online). The overall thermal stabilities of mutants correlated well with the in vivo amount of soluble protein (fig. 3B). This indicates that protein stability is an important determinant of proper folding in vivo. The moderate correlation of stability or solubility with in vivo activity likely arises because only a small amount of properly folded, soluble protein is sufficient to result in an active phenotype.

One reason for the lack of a better correlation between solubility and in vivo activity is that, for each mutant, various conformational forms of the protein can partition differently in the soluble and insoluble fractions of the cell lysate. The soluble fraction can comprise both folded protein, which is active, and soluble aggregates/partially misfolded protein, which are inactive (Liu et al. 2014). Moreover, this partitioning can be influenced by perturbations in the cytosolic proteostasis network. To study the relation between in vivo activity and solubility, the ability of four selected CcdB mutants (V33K and Y06G as examples of active but insoluble mutants, and R31G and V80N as examples of soluble but inactive mutants) in the soluble fraction of the cell lysate to bind Gyrase was monitored by surface plasmon resonance (supplementary fig. S4A and B, Supplementary Material online). Mutants with only a small amount of protein in the soluble fraction but displaying an active phenotype in vivo (V33K, Y06G) showed binding to Gyrase comparable to the wild-type in this surface plasmon resonance assay, showing that the protein is well folded. In contrast, in cases where a mutant is mostly in the soluble fraction but shows an inactive phenotype in vivo (R31G, V80N), the in vitro binding with Gyrase was also negligible compared with the wild-type (supplementary fig. S4C, Supplementary Material online).

Refolding and Unfolding Kinetics
Refolding and unfolding kinetics for 10 mutants that have similar thermal stability but different in vivo solubility and activity were monitored by time-course fluorescence spectroscopy at 25 °C. Refolding and unfolding were carried out at pH 7.4 at final GdnHCl concentrations of 0.6 and 3.2 M, respectively. Of the 10 selected mutants, four (V05S, I56G, V18R, and V18H) could not be studied for their refolding profiles due to high precipitation immediately following purification. Further, for these mutants the proportion in the soluble fraction in vivo was low, ranging from 0.1 to 0.3 (supplementary table S5, Supplementary Material online). Of these, V05S and I56G are active (MSseq = 2), whereas V18R and V18H show an inactive phenotype (MSseq of 9 and 6, respectively). Most mutants (except V80N) showed slower refolding kinetics than the wild-type, indicating that these mutants are folding defective (table 3). Refolding for the wild-type occurs with a significant burst phase (k > 0.5 s⁻¹) and a slow phase. Mutants typically show a much smaller burst phase, an intermediate phase, and a slow phase of much higher amplitude than the wildtype. Most mutants show unfolding kinetics similar to the wildtype, except V54E, which shows a much higher unfolding rate. The ability of the refolded mutants to bind to the cognate ligand GyrA or the CcdA peptide (residues 46–72) was also monitored. Binding of refolded mutants to immobilized GyrA on Amine Reactive Second Generation (AR2G) biosensors was monitored using bio-layer interferometry (Sultana and Lee 2015), and the binding to CcdA peptide was monitored using the thermal shift assay (Niesen et al. 2007). Active mutants (L16S, V18T) retained their binding to both GyrA and CcdA upon refolding, even though their refolding kinetics was slow (table 3). Surprisingly, V54E, which is also an active mutant, failed to bind GyrA and CcdA upon refolding, even though the native protein showed binding (supplementary fig. S6, Supplementary Material online). On the other hand, the inactive mutants R31G and M63N did not bind to GyrA and CcdA after refolding (table 3), showing that their refolded state is nonnative. Interestingly, the native V80N mutant did not show any binding to GyrA, but the refolded protein binds weakly to both ligands. Two of these mutants, V80N and V54E, also show formation of higher order oligomers (supplementary fig. S5,


Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.

Over-Expression of Chaperones Rescues Folding Defects of Mutants
Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone and protease deleted strains as well as chaperone over-expressing strains (see "Materials and Methods" section). Eleven CcdB mutants with a range of solubility and activity were chosen to study whether the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and activity of the mutants. Of these mutants, L16S, V33K, L36K, and V80N had low Tm's (<55 °C) but differ in their in vivo activity, whereas mutants G29W, D67P, and V73F show a higher Tm (>56 °C) but are inactive. In vivo activity of these mutants was monitored in both chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed in different strains and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL, and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in activity in the protease deletion strains BWΔlon, BWΔclpP, and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on the activity was not observed, probably due to direct involvement of proteases in the process of CcdB-mediated cell death (Van Melderen et al. 1996). Many of these proteases

have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed a substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that for many of these mutants, inactivity primarily results from folding defects which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to wildtype but lower solubility (V73F, D67P, and G29W). This further indicates that defects in folding, rather than stability, are the primary causes for inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations shown in E. coli enzymes during in vitro mutational drift experiments, but can have significant effects on E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding, and once the misfolding occurs it cannot be rescued by the ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because, for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed, as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Discussion
Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure,

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants.^a,b

Columns a0 to k2 describe refolding (a1, k1: fast phase; a2, k2: slow phase); columns A0 to k1(unf) describe unfolding. TSA: CcdA binding to refolded protein; BLI: Gyrase binding to refolded protein.

Mutant   Fraction soluble   MSseq   ΔTm (WT − mutant) (°C)   a0     a1     k1 (s⁻¹)   a2     k2 (s⁻¹)   A0     A1     k1(unf) (s⁻¹)   CcdA binding (TSA)   Gyrase binding (BLI)
L16S     0.4                2       16.7                     0.04   0.72   0.07       0.24   0.02       0.83   0.17   0.06            ++++                 +++
V18T     0.7                2       9                        0.04   0.7    0.1        0.26   0.02       0.8    0.2    0.16            ++++                 ++++
R31G     0.6                6       11                       0.05   0.8    0.2        0.15   0.02       0.85   0.15   0.02            –^c                  –^c
V54E     0.4                2       14.5                     0.14   0.17   0.28       0.68   0.04       1      –      –               –^c                  –^c
M63N     0.2                6       15.2                     0.15   –      –          0.85   0.08       0.84   0.16   0.07            –^c                  –^c
V80N     0.8                6       17.5                     0.8    –      >0.5       0.2    0.04       0.76   0.24   0.07            +                    +
WT       1                  2       –                        0.84   –      >0.5       0.16   0.046      0.62   0.38   0.04            ++++                 ++++

a The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9, and 6, respectively.
b The traces were fit to a 5-parameter equation for exponential decay for refolding ($f = y_0 + a\,e^{-bx} + c\,e^{-dx}$), yielding fast (k1) and slow phase rate constants (k2) with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding ($f = y_0 + a\,e^{-bx}$), yielding the rate constant k1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were ≤10% of the measured experimental value.
c No observable binding.
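To give a feel for the rate constants in table 3, the double-exponential model from the table footnote can be evaluated directly, and each first-order rate constant converted to a half-time via t½ = ln 2 / k. This is an illustrative sketch only: the function names are hypothetical, the exponents are assumed to be negative (decaying phases), and the L16S values are read from table 3 as k1 = 0.07 s⁻¹ and k2 = 0.02 s⁻¹.

```python
import math

def refolding_trace(t, y0, a, b, c, d):
    """5-parameter double exponential: y0 + a*exp(-b*t) + c*exp(-d*t)."""
    return y0 + a * math.exp(-b * t) + c * math.exp(-d * t)

def half_time(k):
    """Half-time (s) of a first-order phase with rate constant k (s^-1)."""
    return math.log(2) / k

# Assumed L16S refolding rate constants (fast and slow phase), in s^-1.
fast_half = half_time(0.07)   # roughly 10 s
slow_half = half_time(0.02)   # roughly 35 s

# Signal at t = 0 equals y0 plus the two phase amplitudes (a1 = 0.72, a2 = 0.24).
start = refolding_trace(0.0, 0.0, 0.72, 0.07, 0.24, 0.02)
```

Under these assumptions, the wild-type burst phase (k > 0.5 s⁻¹) corresponds to a half-time below about 1.4 s, which illustrates how much slower the resolved mutant phases are.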


stability, and function, and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While there have been other saturation mutagenesis studies published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels and the effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped on the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage-dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA gyrase (Bajaj et al. 2008). A similar observation was made in another study, which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of the TEM-1 β-lactamase protein, it has been found that deleterious effects of mutations

FIG. 4. In vivo activity and solubility of CcdB mutants in presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein in both supernatant (soluble) and pellet (insoluble), for cells grown at 37 °C and induced for CcdB with 0.2% arabinose, with or without over-expression of the chaperones Trigger Factor and SecB, respectively, was determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants are shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.


primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues virtually all mutations are tolerated. At a few highly exposed positions (≥40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by about 5 kJ/mol and suggested that loss of packing interactions is the major contributor to the increase in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains, and suggests that while small increases in volume can be tolerated, changes in shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size, of up to three methylene groups, can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current ProTherm database (Kumar et al. 2006) (http://www.abren.net/protherm/, last accessed 31 August 2016), 4805 single buried-site mutants from 180 proteins were available. About 1667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation–π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those of other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity score (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing-based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from

Table 4 In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones

Mutant Strain Fraction soluble Fractional increase in solubilitya

BW25113 BWDtig BWDsecB BWpTig BWpSecB Tig SecB

WT 1 1 1 1 1 1 1 1L16S 4 7 7 2 3 04 15 17G29W 8 8 8 2 4 06 12 11M32N 4 6 6 2 3 01 3 2V33K 4 7 6 2 3 01 2 2P35I 8 7 8 5 5 06 13 11L36K 8 8 8 3 5 005 4 2L41F 7 8 8 3 3 04 18 25D67P 6 8 8 2 4 02 08 05S70W 6 8 8 2 4 05 1 04V73F 7 7 8 2 3 05 2 13V80N 6 6 7 3 4 06 13 12

aRatio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB respectively) to the soluble fraction of the protein undernormal conditions

Molecular Determinants of Mutant Phenotypes doi101093molbevmsw182 MBE

2971

Dow

nloaded from httpsacadem

icoupcomm

bearticle331129602272479 by guest on 29 March 2022

other saturation mutagenesis studies can be used to improvepredictions of effects of nonsynonymous single nucleotidepolymorphisms on protein activity (Guerois et al 2002Randles et al 2006 Yue and Moult 2006 Bromberg et al2008 Radivojac et al 2013) as well as for protein threadingapplications to guide structure prediction (Shen and Sali2006 Yang et al 2015)

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate to Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well-folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had unfolding rates similar to wildtype.

The ability of a mutant to fold to the native state is affected by many parameters, including the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms that degrade and remove misfolded proteins from the cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.
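As a small worked illustration of the chaperone effect on solubility, the sketch below inverts the ratio defined in the footnote of Table 4: multiplying a mutant's soluble fraction under normal conditions by the reported fractional increase gives the implied soluble fraction when Trigger Factor or SecB is over-expressed. The three rows are transcribed from Table 4 with decimal points restored, which is an assumption of this transcription; the function and variable names are illustrative and are not part of the original analysis.

```python
# Rows transcribed from Table 4 (decimal points restored; treat as an assumption):
# mutant -> (fraction soluble, fractional increase with Tig, fractional increase with SecB)
TABLE4_ROWS = {
    "L16S": (0.4, 1.5, 1.7),
    "L36K": (0.05, 4.0, 2.0),
    "D67P": (0.2, 0.8, 0.5),
}


def soluble_with_chaperone(fraction_soluble, fractional_increase):
    """Invert the footnote ratio: increase = (soluble with chaperone) / (soluble, normal)."""
    return fraction_soluble * fractional_increase


for mutant, (normal, tig, secb) in TABLE4_ROWS.items():
    with_tig = soluble_with_chaperone(normal, tig)
    with_secb = soluble_with_chaperone(normal, secb)
    print(f"{mutant}: normal={normal:.2f}  +Tig={with_tig:.2f}  +SecB={with_secb:.2f}")
```

A ratio above 1 (for example L36K) corresponds to a chaperone-dependent gain in soluble protein, while a ratio below 1 (for example D67P) corresponds to a loss.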

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of single nucleotide polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods
Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation
Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent, nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T, in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
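The NNK degeneracy described above can be checked by direct enumeration. The following is a minimal sketch, assuming the standard genetic code; the helper name `nnk_codons` is illustrative and not from the original protocol. It lists every codon matching NNK and reports how many amino acids and stop codons the set encodes.

```python
from itertools import product

# Standard genetic code, with codons ordered by first, second, then third base over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """Return all codons matching NNK (N = A/C/G/T at positions 1 and 2, K = G/T at position 3)."""
    return ["".join(codon) for codon in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
stops = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), len(amino_acids), stops)  # prints: 32 20 ['TAG']
```

Restricting the third position to G/T halves the codon set while still covering all 20 amino acids, and leaves TAG as the only stop codon in the pool.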

In Vivo Activity of Individual Single-Site Mutants
Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10^-1% glucose, 4 × 10^-2% glucose, 7 × 10^-3% glucose, 0% glucose/arabinose, 2 × 10^-5% arabinose, 7 × 10^-5% arabinose, and 2 × 10^-2% arabinose, at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance
Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000g, 10 min, 4 °C. Various dilutions of supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases
Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), and ClpB, DnaK, DnaJ, and GroEL (all ATP-dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases, Lon, ClpP, HslU, HslV, and HchA, were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep-well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10^-1% glucose, 2 × 10^-2% glucose, 2 × 10^-3% glucose, 0% glucose/arabinose, 2 × 10^-3% arabinose, 2 × 10^-2% arabinose, and 2 × 10^-1% arabinose, at 30 °C, as many of these strains are temperature sensitive. In the case of chaperone over-expression strains, the medium used for recovery following transformation was LB+Chl (35 µg/ml), as the chaperone-expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB+Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition where each of the mutants became active in the presence or absence of the chaperone was tabulated.

Supplementary Material
Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References

Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing and protein structure. Annu Rev Biophys Bioeng 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci 79:19.25.1–19.25.26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol 356:1263–1274.


buried residues are equally sensitive to mutation (supplementary table S3, Supplementary Material online). Residue D19 is the only buried, potentially charged residue and yet, surprisingly, shows the highest mutational tolerance relative to other buried residues. Although the residue is largely buried, the side-chain points outwards towards solvent, explaining its high tolerance to mutation. A subset of buried residues most sensitive to mutation was selected using the following criteria: tolerance at MID 2 < 40%, tolerance at MID 8 < 90%, and phenotypic data available for ≥15 mutants. Interestingly, this selected subset (V18, V20, I34, I90, and I94) clusters together in the interior of each monomer (supplementary fig. S1E, Supplementary Material online).
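The three selection criteria above amount to a simple filter over per-residue summary statistics. The sketch below illustrates that filter under stated assumptions: tolerance is treated as a percentage, the mutant-count criterion is read as "at least 15", and the residue names and numbers are hypothetical placeholders rather than values from the supplementary tables.

```python
from dataclasses import dataclass


@dataclass
class BuriedResidue:
    name: str
    tolerance_mid2: float  # percent of substitutions tolerated at the lowest expression level (MID 2)
    tolerance_mid8: float  # percent of substitutions tolerated at the highest expression level (MID 8)
    n_mutants: int         # number of mutants with phenotypic data at this position


def is_highly_sensitive(res, mid2_cutoff=40.0, mid8_cutoff=90.0, min_mutants=15):
    """Apply the three criteria: MID 2 tolerance < 40, MID 8 tolerance < 90, enough mutants observed."""
    return (
        res.tolerance_mid2 < mid2_cutoff
        and res.tolerance_mid8 < mid8_cutoff
        and res.n_mutants >= min_mutants
    )


# Hypothetical values for illustration only; not taken from the paper's data.
residues = [
    BuriedResidue("site_A", 10.0, 60.0, 18),   # passes all three criteria
    BuriedResidue("site_B", 35.0, 95.0, 17),   # fails the MID 8 cutoff
    BuriedResidue("site_C", 20.0, 80.0, 12),   # too few mutants observed
    BuriedResidue("site_D", 70.0, 100.0, 19),  # fails the MID 2 cutoff
]

selected = [r.name for r in residues if is_highly_sensitive(r)]
print(selected)  # prints: ['site_A']
```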

On analyzing the mutational tolerance as a function of mutant amino acid at buried residues, we found that at the lowest expression level, D, R, and P are the least tolerated mutations, and tolerance decreases in the order aliphatic > aromatic ≈ polar > charged. Interestingly, for charged and polar amino acids, smaller amino acids were consistently more poorly tolerated than larger ones (compare D, E, N, Q, S, T tolerances in supplementary table S2, Supplementary Material online). The opposite trend is observed for aromatic substitutions, where tolerance decreases in the order F > Y, H > W. D and R are the least tolerated substitutions (fig. 1C), though most other mutations are well tolerated at the highest expression level (supplementary table S2, Supplementary Material online). The poor tolerance for a buried aspartate at all expression levels is likely due to the inability of the small, charged side-chain to be solvated upon burial, and reconfirms our earlier result (Bajaj et al. 2005), indicating that aspartate mutant phenotypes are good indicators of residue burial.

We further attempted to quantitate the relative prefer-ence for different substitutions for all buried positions byincorporating phenotypic data at multiple expression levelsThe distribution of MSseq values for introducing a specific

FIG 1 Mutational effects on CcdB protein activity inferred from phenotypic screening and deep sequencing (A) (B) and (C) show the MSseq valuesfor representative exposed-site (accessibilitygt5) all active-site and buried-site residues (accessibility5) respectively On the vertical axisresidues are grouped into (G P) aliphatic (AndashM) aromatic (FndashW) polar (SndashQ) and charged (DndashR) amino acids Residue numbers and substi-tutions are indicated on the horizontal and vertical axes respectively Each heatmap is colored according to the MSseq value of the mutant Greento red color gradation represents increasing MSseq values Zero value (light green) indicates that the corresponding mutant was not observed in thelibrary WT residue at each position is indicated in white Data for only representative residue positions are shown for clarity (D) Active-siteresidues (highlighted in cyan) identified from the mutational phenotypes mapped onto the crystal structure of CcdB (PDB ID 3VUB)

Molecular Determinants of Mutant Phenotypes doi101093molbevmsw182 MBE

2963

Dow

nloaded from httpsacadem

icoupcomm

bearticle331129602272479 by guest on 29 March 2022

residue “X” at every buried site was obtained. Pairwise comparisons of these distributions were made using a Wilcoxon signed-rank test. The heatmap (fig. 2A) indicates the log10 P value for the null hypothesis that the introduction of the row residue at a buried site does not reduce protein function significantly more than introduction of the corresponding column residue at the same site. It is important to note here that both the residues being compared are mutant residues. Unlike typical amino acid substitution matrices (Henikoff and Henikoff 1992) used for sequence alignment, our matrix is asymmetric. Aspartate and arginine mutants possess significantly higher MSseq values than 18 and 16 other residues, respectively, indicating that they are the least tolerated mutations. Proline is the next most poorly tolerated mutation. P values for the (D, E), (N, Q), and (S, T) (row, column) pairs are lower than for (E, D), (Q, N), and (T, S), indicating that, on average, the order of tolerance is D < E, N < Q, and S < T. Similarly, for aromatic residue tolerances, W < Y and H < F. In order to examine if these observations remain valid for systems other than CcdB, we examined previously published mutational sensitivity data for PSD95pdz3 (McLaughlin et al. 2012) and GB1 (Olson et al. 2014) (fig. 2B and C). The general trends were very similar and confirm our observation that, for buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite is true for aromatic residues. Close examination of the log10 P values in figure 2A suggests that at buried sites the substitution preference is approximately in the following order: A, C, V, L, I, M > T > F > H, Y, S > Q, G, W > N > K, P, E > R > D. A similar (but not identical) trend is also visible in the PSD95pdz3 and GB1 data, though this is based on fewer buried positions and on a single expression level. Additional saturation mutagenesis studies on other systems, using quantitative or semi-quantitative readouts, would be useful in consolidating our observations.
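The pairwise comparison behind each heatmap cell can be illustrated with a minimal sketch. It assumes that each (row, column) cell pairs the MSseq values of two mutant residues at the same buried sites and expression levels, and it computes an exact one-sided Wilcoxon signed-rank P value. The function name, the handling of ties and zero differences, and the toy scores are illustrative assumptions, not the procedure actually used to generate figure 2A.

```python
import math

def wilcoxon_one_sided(row_scores, col_scores):
    """Exact one-sided P value that paired row scores exceed column scores."""
    # Paired differences; zero differences are dropped (a common convention,
    # assumed here rather than taken from the paper).
    diffs = [r - c for r, c in zip(row_scores, col_scores) if r != c]
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank |d|, giving tied values their average rank. Ranks are stored
    # doubled so that half-integer average ranks remain integers.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    doubled_rank = [0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        for k in range(start, end + 1):
            doubled_rank[order[k]] = (start + 1) + (end + 1)
        start = end + 1
    observed = sum(r for r, d in zip(doubled_rank, diffs) if d > 0)
    # Null distribution of the positive-rank sum: every sign pattern is
    # equally likely, so count the subsets of ranks that reach each total.
    total = sum(doubled_rank)
    ways = [1] + [0] * total
    for r in doubled_rank:
        for s in range(total, r - 1, -1):
            ways[s] += ways[s - r]
    return sum(ways[observed:]) / 2 ** n

# Hypothetical MSseq values for two mutant residues at the same five buried sites.
asp_scores = [9, 8, 7, 6, 9]
glu_scores = [2, 2, 3, 2, 2]
log10_p = math.log10(wilcoxon_one_sided(asp_scores, glu_scores))
```

Because the test is one-sided, swapping the row and column residues generally gives a different P value, which is why a matrix built this way is asymmetric.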

Substitution preferences at active-site residues should be different from those at buried sites, because protein–protein interfaces are more polar than protein interiors (Janin et al. 1988; Tsai et al. 1997) and are also likely to display a greater context dependence. Extensive analysis of a large amount of mutational data would be required to decipher these substitution preferences. In the case of CcdB, data for only 142 active-site mutants are available. Hence, we did not attempt to predict mutational sensitivities at active-site residues.

Mutational Tolerance as a Function of Depth

Mutational tolerances at the lowest (MID 2) and highest (MID 8) expression levels for all nonactive-site residues are listed (supplementary table S2, Supplementary Material online, and fig. 1). At the lowest expression level, mutational tolerance increased with increasing accessibility, while at the highest expression level it is less sensitive to accessibility and most mutants show an active phenotype. Most substitutions are tolerated at exposed nonactive-site residues, both at low and high expression levels (fig. 1A and supplementary fig. S1A, Supplementary Material online). However, a few mutants with accessibility >40% were found to show an inactive phenotype. These exposed, inactive, nonactive-site substitutions are typically either aromatic residues or proline (supplementary table S4, Supplementary Material online). These exposed aromatic substitutions probably affect the folding of the CcdB protein, as they show a high propensity to aggregate, although their Tm values are somewhat comparable to that of the wildtype (see mutants G29W, L41F, and V73F in supplementary table S5, Supplementary Material online).

Cation–π interactions are thought to contribute to protein stability (Gallivan and Dougherty 1999), though an earlier study (Prajapati et al. 2006) shows that these contribute little to the stability of Maltose Binding Protein. We find that all the 19 and 11 mutations at the 13th and 14th positions, respectively, which are involved in a cation–π interaction, including the charge reversal mutant R13D, were well tolerated even at the lowest expression levels (supplementary table S6, Supplementary Material online). Salt bridges are another possible stabilizing noncovalent electrostatic interaction in proteins. In the case of CcdB, five salt bridges are present between the following pairs of residues: D19-R31, D23-R31, E59-R40, E79-K4, and D89-R86. All amino acids participating in salt bridges are solvent exposed, except for D19, in which only the terminal oxygens are exposed. Mutations at all these positions are well tolerated even at the lowest expression level (supplementary table S6, Supplementary Material online), suggesting that none of the salt bridges in CcdB contributes significantly to the stability or activity of the protein.

We also examined the correlation of average MSseq values with residue depth for all nonactive-site positions in CcdB (PDB ID 3VUB) (fig. 2D). Similar calculations were performed for PSD95pdz3 and GB1 using the phenotypic data obtained from McLaughlin et al. (2012) (PDB ID 1BE9) and Olson et al. (2014) (PDB ID 1PGA), respectively. In these studies, the ability

Table 1. Mutational Tolerance at the Buried-Site Residues at Lowest and Highest Expression Levels.

Amino acid  No. of mutants  Depth (Å)  ACC^a (%)  Tol at MID2^b (%)  Tol at MID8^b (%)
V05         18              6.8        0          39                 94
F17         17              7.3        0.2        82                 100
V18         18              9.3        0          33                 83
D19         18              6.7        1.4        83                 100
V20^c,d     19              8.6        0          32                 74
Q21^c,d     19              6.5        1          63                 100
M32^d       17              7.8        0.3        76                 100
V33         19              6.5        1.4        68                 95
I34         19              7.9        0          37                 79
L36         12              7.2        0          0                  67
P52         17              5.4        3.5        41                 100
V54         15              5.6        0.4        73                 100
M63         19              8.1        0.1        47                 89
T65         9               7.9        0          44                 100
M68^c,d     12              6.6        0          33                 100
L83         19              5.8        1.5        53                 100
I90         19              7.4        0.1        26                 89
A93^c,d     14              6.0        0          36                 100
I94^c,d     18              7.9        0.6        33                 83
M97^c,d     16              7.5        0          56                 94
F98^c,d     19              7.7        0.7        37                 79

^a Side-chain accessibility.
^b Mutational tolerance at the lowest (MID 2) and highest (MID 8) expression levels.
^c Residues within van der Waals distance of the active-site residues.
^d Residues present at the dimer interface.


FIG. 2. Relative tolerance for substitutions at buried positions. (A) Mutational sensitivity data at all buried positions, obtained at different expression levels for CcdB, were used to obtain the distribution of MSseq values for a given mutant residue. The distributions for row and column residues were compared using a Wilcoxon signed-rank test, and the corresponding P values were calculated. The log10 of the P values is indicated. Gradation from red to blue indicates increasing values of log10 P, i.e., decreasing destabilizing effect of the row residue with respect to the column residue. A lower P value implies that introduction of the row residue at a buried site is typically more destabilizing than introduction of the corresponding column residue. (B and C) Similar plots, but using ΔE_i^x values derived from saturation mutagenesis of the PDZ domain (PSD95pdz3) and lnW values from saturation mutagenesis of the IgG-binding domain of protein G (GB1), respectively. (D–F) Correlation of the average MSseq values, ΔE_i^x values, and lnW values with side-chain depth for all nonactive-site residues of CcdB, PSD95pdz3, and GB1, respectively. Accessibility and depth values were calculated based on the crystal structures of WT homodimeric CcdB (PDB ID 3VUB), PSD95pdz3 (PDB ID 1BE9), and GB1 (PDB ID 1PGA). A residue was defined as buried if the side-chain accessibility is ≤5%.

Table 2. Mutant Phenotype Prediction by MSpred, SNAP2, and SuSPect.

Protein    Prediction method  Pearson's correlation coefficient^a  Matthews correlation coefficient^b  Sensitivity^c (%)  Specificity^d (%)  Accuracy^e (%)
CcdB       MSpred^f           0.69                                 0.65                                69                 95                 90
           SNAP2^g            0.27                                 0.19                                100                11                 37
           SuSPect^h          0.29                                 0.14                                100                8                  30
PSD95pdz3  MSpred^f           0.57                                 0.53                                61                 93                 88
           SNAP2^g            0.24                                 0.15                                100                7                  34
           SuSPect^h          0.6                                  0.61                                87                 87                 87
GB1        MSpred^f           0.65                                 0.49                                44                 96                 79
           SNAP2^g            0.27                                 0.11                                100                3                  42
           SuSPect^h          0.08                                 0.03                                73                 24                 38

^a Modulus of the correlation coefficient.
^b Matthews correlation coefficient $= \dfrac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$, where TP, TN, FP, and FN are true positives, true negatives, false positives, and false negatives, respectively.
^c Sensitivity $= \dfrac{TP}{TP+FN}$.
^d Specificity $= \dfrac{TN}{TN+FP}$.
^e Accuracy $= \dfrac{TP+TN}{TP+TN+FP+FN}$.
^f Mutant was classified as nonneutral if MSpred > 2 and neutral if the score = 2. Mutants were classified into true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
^g Mutant was classified as nonneutral if SNAP2 score > 50 and neutral if the score < −50. Predictions with −50 < score < 50 have low reliability and were omitted. Mutants were classified into TP, TN, FP, and FN.
^h Mutant was classified as nonneutral if SuSPect score > 75 and neutral if the score < 25. Predictions with 25 < score < 75 have low reliability and were omitted. Mutants were classified into TP, TN, FP, and FN.
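The performance measures defined in the footnotes to table 2 can be computed from the four confusion counts as in the following minimal sketch. The function name and the example counts are hypothetical and are not taken from the study.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Return MCC, sensitivity, specificity, and accuracy from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        # MCC is reported as 0 when any marginal total is zero (an assumed convention).
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Hypothetical counts: nonneutral mutants are the positive class.
metrics = classification_metrics(tp=3, tn=3, fp=1, fn=1)
```

A predictor that calls almost every mutant nonneutral reaches a sensitivity near 100% with very low specificity, and it is this imbalance that the Matthews correlation coefficient penalizes.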


of these proteins to bind their cognate ligands is quantitatively linked to a phenotypic readout. In all three cases, the average phenotypic effect was observed to increase with residue depth (correlation coefficients of 0.85, 0.73, and 0.61 for CcdB, PSD95pdz3, and GB1, respectively) (fig. 2D–F). Buried positions with small (A or G) wildtype residues were not included in the correlations. These positions are unusually sensitive to mutation because all substitutions result in large steric overlap. These data suggest that a large fraction of the average sensitivity to mutation at nonactive-site residues is governed by a single parameter, the residue depth. This is a remarkably simple metric that provides an alternative to the sector-based models used to analyze mutational data for PSD95pdz3 as well as other proteins (McLaughlin et al. 2012).

One alternative approach to estimating burial preferences of amino acids is measuring free energies of transfer of amino acid side-chain analogs from water to cyclohexane (Wolfenden et al. 2015). Another approach is to measure accessible surface areas of the side-chains, averaged over a large database of protein structures, and either infer free energies of transfer from aqueous solution into the protein interior, as described previously (Rose et al. 1985), or construct environment-dependent substitution matrices from such data (Overington et al. 1992). Relative ΔΔG values of burial from the first two approaches are shown in supplementary figure S1C and D, Supplementary Material online. Both of these show some qualitative similarities with the mutational data in figure 2A–C, but there are several notable differences. For example, the relative ΔΔG values of burial inferred from the free energy of transfer approaches show that the introduction of W at buried positions is clearly favored over Y and H, unlike the situation for the experimental mutational data. In addition, the transfer data predict that mutation to G and P will be largely tolerated, whereas the experimental mutational data suggest that substitutions to G or P are rarely tolerated. It is also observed in the mutational data sets that, at buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite trend is observed for aromatic residues. In the case of the ΔΔG transfer data, the trend is preserved for polar and charged residues, but clearly not for aromatic residues.

Prediction of Mutational Sensitivity Score (MSpred) Using Penalties Derived from the CcdB Data

We further determined whether the above observations regarding substitution preferences could be employed for prediction of functional consequences of individual mutations. To this end, we developed a predictive model using a coherent set of rules derived from a randomly chosen subset of the CcdB mutational data containing 60% of the mutants, and tested its applicability in predicting the mutational sensitivities of the remaining 40% of mutants, as well as those of two other proteins, PSD95pdz3 and GB1. The predicted score is denoted as MSpred.

For CcdB mutational data, a mutational sensitivity score (MSseq) of 2 is indicative of wildtype-like behavior in the mutant, and higher values of MSseq indicate higher mutational sensitivity. Therefore, a base MSpred value of 2 was assigned to all the mutants in the test set, and penalties were subsequently added according to the nature of the substitution, taking into account the wildtype residue identity. As exposed nonactive-site positions tolerated almost all substitutions, penalties were calculated only for buried positions. We also observed that buried side-chains that point outwards with respect to the protein core are less sensitive to mutations compared with the ones that point inside. These residues were identified by their side-chain depth values (see “Materials and Methods” section) and were not considered for penalty calculation.

Substitutions were divided into categories based on the nature of the wildtype and mutant residue. Each wildtype and mutant residue was assigned to one of six categories, namely aliphatic, aromatic, polar, charged, G, and P, resulting in a total of 34 (36 − 2 [GG and PP]) types of substitutions. The CcdB data were randomly divided into training (60% of the data) and test (40% of the data) sets. The category penalty for each type of substitution was calculated using only the training data set, by averaging the MSseq values observed for each category of substitution and subtracting the base MSpred value of 2 from the average MSseq. Additional “residue-specific penalties” were also derived to account for the residue-size-wise substitution preferences, e.g., smaller polar residues being more destabilizing than larger ones (Materials and Methods; supplementary table S7, Supplementary Material online). Penalties for proline substitutions (both buried and exposed) were derived using the flowchart described previously (Bajaj et al. 2007). Next, MSpred values were calculated for all buried positions based on these penalties (MSpred = 2 + category penalty + residue-specific penalty), and all exposed nonactive-site positions were assigned an MSpred of 2. Active-site residues were not considered in the analysis. The predicted mutational sensitivity scores (MSpred) for the test data set showed a high Pearson's correlation (r = 0.69) with the experimental MSseq values and an SD of 1.26 (table 2). We also derived the Matthews correlation coefficient in order to evaluate the performance of MSpred in classifying mutants as neutral and nonneutral (see “Materials and Methods” section). It was observed to be 0.65 (table 2).
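The scoring rule MSpred = 2 + category penalty + residue-specific penalty can be sketched as follows. The assignment of residues to categories shown here is an assumption made for illustration (histidine is grouped with the aromatic residues, following the W < Y, H < F comparison above); the actual assignments and residue-specific penalties are given in Materials and Methods and supplementary table S7. The sketch also omits the separate proline flowchart and the exclusion of outward-pointing buried side-chains, and the training values are invented.

```python
from collections import defaultdict

BASE_SCORE = 2.0  # MSseq of 2 denotes wildtype-like behavior

# Assumed category membership (illustrative, not taken from the paper).
CATEGORY = {}
for name, residues in {"aliphatic": "ACILMV", "aromatic": "FWYH",
                       "polar": "STNQ", "charged": "DEKR",
                       "G": "G", "P": "P"}.items():
    for res in residues:
        CATEGORY[res] = name

def category_penalties(training):
    """Mean MSseq minus the base score for each (wildtype, mutant) category pair.

    `training` is an iterable of (wildtype, mutant, msseq) for buried sites.
    """
    scores = defaultdict(list)
    for wt, mut, msseq in training:
        scores[(CATEGORY[wt], CATEGORY[mut])].append(msseq)
    return {pair: sum(v) / len(v) - BASE_SCORE for pair, v in scores.items()}

def ms_pred(wt, mut, buried, penalties, residue_penalty=None):
    """Predicted score; exposed nonactive-site positions keep the base score."""
    if not buried:
        return BASE_SCORE
    category = penalties.get((CATEGORY[wt], CATEGORY[mut]), 0.0)
    # Keying the residue-specific penalty by mutant residue is an assumption.
    specific = (residue_penalty or {}).get(mut, 0.0)
    return BASE_SCORE + category + specific

# Hypothetical training triples (not data from the study).
toy_training = [("V", "D", 8.0), ("L", "E", 6.0), ("I", "L", 2.0)]
toy_penalties = category_penalties(toy_training)
predicted = ms_pred("V", "K", buried=True, penalties=toy_penalties)
```

Deriving the penalties from the training subset alone keeps the held-out mutants independent of the rules used to score them.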

We tested the performance of MSpred on two other proteins. The MSpred values for PSD95pdz3 and GB1 agreed well with the experimental mutational sensitivity data, with Pearson's correlation coefficients of 0.57 and 0.65 and Matthews correlation coefficients of 0.53 and 0.49, respectively (table 2).

We also carried out mutational sensitivity predictions for CcdB, PSD95pdz3, and GB1 using two frequently used methods, SNAP2 (Hecht et al. 2015) and SuSPect (Yates et al. 2014). Both SNAP2 and SuSPect show poorer correlation with the experimental mutational sensitivity data than MSpred (except SuSPect predictions for PSD95pdz3; table 2). Both methods show a very high sensitivity but a very low specificity compared with MSpred. Thus, MSpred, which is derived based on very simple rules, compares favorably with the popular machine learning based methods SNAP2 and SuSPect. This approach should work to rank order mutational effects at buried sites for other globular proteins. While a three-dimensional structure is not essential, it is important to have residue burial information, because predictions have been optimized for buried residues. A saturation mutagenesis data set is also not required. However, it is important to have experimental data on the functional effects of multiple point mutants to decide on the cutoff value of MSpred that would result in an observable phenotype. This value would likely depend on factors such as intrinsic protein stability, expression level, and gene essentiality, which would vary from one protein to another (Miosge et al. 2015).

In Vitro Determined Apparent Tm's Correlate Better with In Vivo Solubility than with Relative Activity Derived from Deep Sequencing

To experimentally probe the molecular basis for mutant phenotypes at nonactive-site positions, around 80 single-site mutants of CcdB were selected from the saturation mutagenesis library (Bajaj et al. 2008) based on MSseq and accessibility class (Adkar et al. 2012) (supplementary table S5, Supplementary Material online). All the mutants were purified by affinity purification against the immobilized ligands GyrA or CcdA. Each purified protein was subjected to thermal denaturation monitored using Sypro orange dye (Niesen et al. 2007), and the apparent Tm was calculated for each mutant (supplementary fig. S3A and table S5, Supplementary Material online). During purification of various CcdB mutants, it is possible that the protein may be inactivated by aggregation or misfolding. Hence, the ability of the purified protein to bind CcdA was examined by monitoring the thermal denaturation of each mutant in the absence and presence of a CcdA peptide that contains the CcdB-binding residues (residues 46–72). If the mutant binds the CcdA peptide, this should result in an increase in its apparent Tm (supplementary fig. S3B, Supplementary Material online) (Fukada et al. 1983; Brandts and Lin 1990; Gonzalez et al. 1999). There were nine mutants, e.g., V05S, V05L, Y06G, and F17D, that did not show an increase in apparent Tm in the presence of the CcdA peptide (supplementary table S5, Supplementary Material online), suggesting that these are misfolded or aggregated; hence, these mutants were removed from the analysis. Most of these mutants are largely found in inclusion bodies and have Tm's between 40 °C and 50 °C, in contrast to WT CcdB, which has a Tm of 68.4 °C. Further studies were restricted to the remaining 71 mutants that showed an enhancement in thermal stability in the presence of the CcdA peptide (supplementary fig. S3B, Supplementary Material online). Mutants showed a range of apparent Tm's (supplementary table S5, Supplementary Material online). When in vitro determined thermal stability was compared with in vivo phenotypes (MSseq) determined

FIG. 3. Correlation between apparent in vitro Tm, in vivo solubility, and activity (MSseq value) for CcdB mutants. Correlations of ΔTm [Tm (WT) − Tm (mutant)] for 67 single-site mutants with (A) in vivo activity and (B) in vivo fraction of soluble protein, respectively. (C) Correlation of relative thermal stability (ΔTm) of mutants with ΔΔG° of unfolding estimated by GdnHCl denaturation. (D) Correlation of the fraction of protein in the soluble fraction with in vivo activity of mutants.


by deep sequencing, a moderate correlation (r = 0.65) was obtained (fig. 3A). However, there were many mutants that showed similar activity but differed substantially in their stability, such as L16S, V18T, D19N, and V54E (supplementary table S5, Supplementary Material online). Conversely, there were also mutants (e.g., V33D, M32N) that showed similar thermal stability to the wildtype but had substantially lower activity in vivo. This suggests that the in vivo activity of a protein depends not only on its stability but also on the many factors inside a cell that assist in proper folding and in maintaining an active conformation. Since the apparent Tm determined by the thermal shift assay may not reflect the true thermodynamic stability of the protein, a subset of 21 mutants was also subjected to GdnHCl chemical denaturation. These mutants were chosen to span a range of Tm and MSseq values. These measurements were done to see if the two measures of stability, i.e., thermal and chemical denaturation, correlate with one another. It was found that both measures of stability were highly correlated (fig. 3C and supplementary table S8, Supplementary Material online).

Various mutations have different effects on protein stability and activity. Properly folded proteins are found in the soluble fraction of the cell lysate, whereas misfolded proteins often form insoluble aggregates called inclusion bodies. Hence, misfolding reduces the amount of active, soluble, and functional protein, though studies have shown that some amount of protein in the soluble fraction can also be misfolded (Liu et al. 2014). To study the relation between in vivo solubility of CcdB mutants and in vitro determined thermal stability, E. coli strain CSH501 (which has a mutation in the gyrA gene and is hence resistant towards CcdB action) was transformed individually with the mutants, and the amount of protein in both the soluble fraction and in inclusion bodies was estimated. Surprisingly, for a few mutants, although very little protein was found in the soluble fraction, these showed an active phenotype with an MSseq of 2 (fig. 3D). Hence, for these mutants, the small amount of protein present in the soluble fraction is properly folded and sufficient to cause cell death in a CcdB-sensitive strain. In some cases, different mutants have similar fractions of soluble protein in vivo but have different in vivo activity and in vitro thermal stability (supplementary table S5, Supplementary Material online). The overall thermal stabilities of mutants correlated well with the in vivo amount of soluble protein (fig. 3B). This indicates that protein stability is an important determinant of proper folding in vivo. The moderate correlation of stability or solubility with in vivo activity likely arises because only a small amount of properly folded, soluble protein is sufficient to result in an active phenotype.

One reason for the lack of a better correlation between solubility and in vivo activity is that, for each mutant, various conformational forms of the protein can partition differently in the soluble and insoluble fractions of the cell lysate. The soluble fraction can comprise both folded protein, which is active, and soluble aggregates or partially misfolded protein, which are inactive (Liu et al. 2014). Moreover, this partitioning can be influenced by perturbations in the cytosolic proteostasis network. To study the relation between in vivo activity and solubility, the ability of four selected CcdB mutants (V33K and Y06G as examples of active but insoluble mutants, and R31G and V80N as examples of soluble but inactive mutants) in the soluble fraction of the cell lysate to bind Gyrase was monitored by surface plasmon resonance (supplementary fig. S4A and B, Supplementary Material online). Mutants with only a small amount of protein in the soluble fraction but displaying an active phenotype in vivo (V33K, Y06G) showed binding to Gyrase comparable to that of the wildtype in this surface plasmon resonance assay, showing that the protein is well folded. In contrast, in cases where a mutant is mostly in the soluble fraction but shows an inactive phenotype in vivo (R31G, V80N), the in vitro binding to Gyrase was also negligible compared with the wildtype (supplementary fig. S4C, Supplementary Material online).

Refolding and Unfolding Kinetics

Refolding and unfolding kinetics for 10 mutants that have similar thermal stability but different in vivo solubility and activity were monitored by time-course fluorescence spectroscopy at 25 °C. Refolding and unfolding were carried out at pH 7.4 at final GdnHCl concentrations of 0.6 and 3.2 M, respectively. Of the 10 selected mutants, four (V05S, I56G, V18R, and V18H) could not be studied for their refolding profiles due to high precipitation immediately following purification. Further, for these mutants, the proportion in the soluble fraction in vivo was low, ranging from 0.1 to 0.3 (supplementary table S5, Supplementary Material online). Of these, V05S and I56G are active (MSseq = 2), whereas V18R and V18H show an inactive phenotype (MSseq of 9 and 6, respectively). Most mutants (except V80N) showed slower refolding kinetics than the wildtype, indicating that these mutants are folding defective (table 3). Refolding for the wildtype occurs with a significant burst phase (k > 0.5 s⁻¹) and a slow phase. Mutants typically show a much smaller burst phase, an intermediate phase, and a slow phase of much higher amplitude than the wildtype. Most mutants show unfolding kinetics similar to the wildtype, except V54E, which shows a much higher unfolding rate. The ability of the refolded mutants to bind to the cognate ligand GyrA or the CcdA peptide (residues 46–72) was also monitored. Binding of refolded mutants to immobilized GyrA on Amine Reactive Second Generation (AR2G) biosensors was monitored using bio-layer interferometry (Sultana and Lee 2015), and the binding to CcdA peptide was monitored using the thermal shift assay (Niesen et al. 2007). Active mutants (L16S, V18T) retained their binding to both GyrA and CcdA upon refolding, even though their refolding kinetics were slow (table 3). Surprisingly, V54E, which is also an active mutant, failed to bind GyrA and CcdA upon refolding, even though the native protein showed binding (supplementary fig. S6, Supplementary Material online). On the other hand, the inactive mutants R31G and M63N did not bind to GyrA and CcdA after refolding (table 3), showing that their refolded state is nonnative. Interestingly, the native V80N mutant did not show any binding to GyrA, but the refolded protein binds weakly to both ligands. Two of these mutants, V80N and V54E, also show formation of higher order oligomers (supplementary fig. S5, Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.

Over-Expression of Chaperones Rescues Folding Defects of Mutants

Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone- and protease-deleted strains, as well as in chaperone over-expressing strains (see “Materials and Methods” section). Eleven CcdB mutants with a range of solubility and activity were chosen to study whether the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and the activity of the mutants. Of these mutants, L16S, V33K, L36K, and V80N had low Tm's (<55 °C) but differ in their in vivo activity, whereas mutants G29W, D67P, and V73F show a higher Tm (>56 °C) but are inactive. In vivo activity of these mutants was monitored in both chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed into different strains, and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL, and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in activity in the protease deletion strains BWΔlon, BWΔclpP, and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on the activity was not observed, probably due to the direct involvement of proteases in the process of CcdB-mediated cell death (Van Melderen et al. 1996). Many of these proteases have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed a substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that for many of these mutants, inactivity primarily results from folding defects, which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to the wildtype but lower solubility (V73F, D67P, and G29W). This further indicates that defects in folding, rather than stability, are the primary causes of inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations in E. coli enzymes during in vitro mutational drift experiments but can also have significant effects on E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding and that, once misfolding occurs, it cannot be rescued by ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because, for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed, as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Discussion

Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure,

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants^a,b.

Column groups: refolding parameters are a0, a1 and k1 (fast phase), and a2 and k2 (slow phase); unfolding parameters are A0, A1, and K1. ΔTm is Tm (WT) − Tm (mutant). The last two columns report binding of CcdA (by TSA) and Gyrase (by BLI) to the refolded protein.

Mutant  Fraction soluble  MSseq  ΔTm (°C)  a0    a1    k1 (s^-1)  a2    k2 (s^-1)  A0    A1    K1 (s^-1)  CcdA (TSA)  Gyrase (BLI)
L16S    0.4               2      16.7      0.04  0.72  0.07       0.24  0.02       0.83  0.17  0.06       ++++        +++
V18T    0.7               2      9         0.04  0.7   0.1        0.26  0.02       0.8   0.2   0.16       ++++        ++++
R31G    0.6               6      11        0.05  0.8   0.2        0.15  0.02       0.85  0.15  0.02       –^c         –^c
V54E    0.4               2      14.5      0.14  0.17  0.28       0.68  0.04       1     –     –          –^c         –^c
M63N    0.2               6      15.2      0.15  –     –          0.85  0.08       0.84  0.16  0.07       –^c         –^c
V80N    0.8               6      17.5      0.8   –     >0.5       0.2   0.04       0.76  0.24  0.07       +           +
WT      1                 2      –         0.84  –     >0.5       0.16  0.046      0.62  0.38  0.04       ++++        ++++

^a The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9, and 6, respectively.
^b The traces were fit to a 5-parameter equation for exponential decay for refolding, $f = y_0 + a\,e^{-bx} + c\,e^{-dx}$, yielding fast (k1) and slow (k2) phase rate constants with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding, $f = y_0 + a\,e^{-bx}$, yielding the rate constant K1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were within 10% of the measured experimental value.
^c No observable binding.


stability, and function, and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While there have been other saturation mutagenesis studies published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels and the effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped onto the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage-dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA gyrase (Bajaj et al. 2008). A similar observation was made in another study, which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of the TEM-1 β-lactamase protein, it has been found that deleterious effects of mutations

FIG. 4. In vivo activity and solubility of CcdB mutants in the presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone-deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein in both supernatant (soluble) and pellet (insoluble), for cells grown at 37 °C and induced for CcdB with 0.2% arabinose, with or without over-expression of the chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants are shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.

primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues, virtually all mutations are tolerated. At a few highly exposed positions (>40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by 5 kJ/mol, and suggested that loss of packing interactions is the major contributor to the decrease in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains, and suggests that while small increases in volume can be tolerated, changes in the shape of side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size of up to three methylene groups can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current Protherm database (Kumar et al. 2006) (http://www.abren.net/protherm, last accessed 31 August 2016), 4,805 single buried site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those for two other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity score (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

Mutant   Strain                                         Fraction   Fractional increase in solubility(a)
         BW25113  BWΔtig  BWΔsecB  BWpTig  BWpSecB      soluble    Tig    SecB
WT       1        1       1        1       1            1          1      1
L16S     4        7       7        2       3            0.4        1.5    1.7
G29W     8        8       8        2       4            0.6        1.2    1.1
M32N     4        6       6        2       3            0.1        3      2
V33K     4        7       6        2       3            0.1        2      2
P35I     8        7       8        5       5            0.6        1.3    1.1
L36K     8        8       8        3       5            0.05       4      2
L41F     7        8       8        3       3            0.4        1.8    2.5
D67P     6        8       8        2       4            0.2        0.8    0.5
S70W     6        8       8        2       4            0.5        1      0.4
V73F     7        7       8        2       3            0.5        2      1.3
V80N     6        6       7        3       4            0.6        1.3    1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
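The ratio defined in the footnote to table 4 can be illustrated with a short sketch. It assumes that the soluble fraction is estimated from supernatant (S) and pellet (P) band intensities as S/(S + P). The function names and the intensity values are hypothetical placeholders and are not data from this study.

```python
def fraction_soluble(supernatant, pellet):
    """Fraction of total protein found in the supernatant (soluble) fraction."""
    total = supernatant + pellet
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return supernatant / total


def fractional_increase(soluble_with_chaperone, soluble_normal):
    """Soluble fraction with over-expressed chaperone divided by the
    soluble fraction under normal conditions (table 4 footnote)."""
    if soluble_normal <= 0:
        raise ValueError("soluble fraction under normal conditions must be positive")
    return soluble_with_chaperone / soluble_normal


# Hypothetical densitometry readings (arbitrary units), for illustration only.
normal = fraction_soluble(supernatant=25.0, pellet=75.0)
with_chaperone = fraction_soluble(supernatant=50.0, pellet=50.0)
ratio = fractional_increase(with_chaperone, normal)
print(f"soluble fraction: {normal:.2f} -> {with_chaperone:.2f}, ratio {ratio:.1f}")
```

A ratio above 1 indicates that chaperone over-expression increased the soluble fraction, and a ratio below 1 indicates a decrease.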

other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate with Gyrase, using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had similar unfolding rates to wildtype.
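The correlation coefficients quoted above are of the kind produced by a standard Pearson calculation. The sketch below shows that calculation, on the assumption that r denotes the Pearson coefficient. The melting temperatures and soluble fractions in it are invented for illustration and are not measurements from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: apparent melting temperature (deg C) and the
# fraction of protein found in the soluble fraction for six mutants.
tm_values = [48.0, 52.5, 55.0, 60.0, 63.5, 67.0]
soluble_fraction = [0.10, 0.35, 0.30, 0.55, 0.60, 0.90]

r = pearson_r(tm_values, soluble_fraction)
print(f"r = {r:.2f}")
```

Comparing r for stability against solubility with r for stability against in vivo phenotype, as done in the text, only requires calling the same function on the two pairs of variables.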

The ability of a mutant to fold to the native state is affected by many parameters that include the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms which are involved in degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in the activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of single nucleotide polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods
Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation
Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T, in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
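To make the coverage of the NNK codon concrete, the following sketch enumerates every codon matching NNK and translates it with the standard genetic code. It is only an illustration of the degenerate codon itself, not of the primer design used in this study, and the variable names are arbitrary.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """All codons matching NNK: N is any base, K is G or T."""
    return [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]


codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons}
amino_acids = sorted(encoded - {"*"})
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), "codons")
print(len(amino_acids), "amino acids:", "".join(amino_acids))
print("stop codons:", stop_codons)
```

The enumeration gives 32 codons that together encode all 20 amino acids, with TAG as the only stop codon, which is why NNK is a common choice for site-saturation libraries.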

In Vivo Activity of Individual Single-Site Mutants
Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose, at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.

Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance
Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000g, 10 min, 4 °C. Various dilutions of supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases
Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), ClpB, DnaK, DnaJ, and GroEL (all ATP dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases were also used: Lon, ClpP, HslU, HslV, and HchA, referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose, at 30 °C, as many of these strains are temperature sensitive. In the case of chaperone over-expression strains, the medium used for recovery following transformation was LB+Chl (35 mg/ml), as the chaperone expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB+Amp plates containing 0.5 mM IPTG to induce chaperone expression, and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition where each of the mutants became active in presence or absence of the chaperone was tabulated.
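The readout described above can be sketched as a small scoring function. The sketch assumes that the seven plating conditions are numbered 1 to 7 from the most repressing to the most inducing, and that a mutant which still forms colonies under all seven conditions is given a score of 8. That last convention is an assumption made here for illustration, and the function name is hypothetical.

```python
N_CONDITIONS = 7  # plating conditions, from most repressing to most inducing


def active_condition(growth):
    """Return the first condition (1-based) at which no colonies are seen.

    `growth` holds one boolean per condition, True when colonies grew.
    Active CcdB kills the cells, so the first condition without growth is
    the lowest expression level at which the mutant is active.
    """
    if len(growth) != N_CONDITIONS:
        raise ValueError(f"expected {N_CONDITIONS} growth observations")
    for index, grew in enumerate(growth, start=1):
        if not grew:
            return index
    # Assumption: a mutant that grows everywhere is scored one past the end.
    return N_CONDITIONS + 1


# Hypothetical plate observations for three mutants.
print(active_condition([False] * 7))                       # active even when repressed
print(active_condition([True, True, True] + [False] * 4))  # active from condition 4
print(active_condition([True] * 7))                        # never active
```

A lower score therefore means that less expression is needed for killing, so a chaperone that shifts a mutant from a high score to a low one has increased its in vivo activity.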

Supplementary Material
Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References
Abriata LA Palzkill T Dal Peraro M 2015 How structural and physico-chemical determinants shape sequence constraints in a functional enzyme PLoS One 10e0118684

Adkar BV Tripathi A Sahoo A Bajaj K Goswami D Chakrabarti PSwarnkar MK Gokhale RS Varadarajan R 2012 Protein model dis-crimination using mutational sensitivity derived from deep sequenc-ing Structure 20371ndash381

Araya CL Fowler DM Chen W Muniez I Kelly JW Fields S 2012 Afundamental protein property thermodynamic stability revealedsolely from large-scale measurements of protein function ProcNatl Acad Sci U S A 10916858ndash16863

Bajaj K Chakrabarti P Varadarajan R 2005 Mutagenesis-based defini-tions and probes of residue burial in proteins Proc Natl Acad SciU S A 10216221ndash16226

Bajaj K Chakshusmathi G Bachhawat-Sikder K Surolia A Varadarajan R2004 Thermodynamic characterization of monomeric and dimericforms of CcdB (controller of cell division or death B protein)Biochem J 380409ndash417

Bajaj K Dewan PC Chakrabarti P Goswami D Barua B Baliga CVaradarajan R 2008 Structural correlates of the temperature sensi-tive phenotype derived from saturation mutagenesis studies ofCcdB Biochemistry 4712964ndash12973

Bajaj K Madhusudhan MS Adkar BV Chakrabarti P Ramakrishnan CSali A Varadarajan R 2007 Stereochemical criteria for prediction ofthe effects of proline mutations on protein stability PLoS ComputBiol 3e241

Bershtein S Mu W Serohijos AW Zhou J Shakhnovich EI 2013 Proteinquality control acts on folding intermediates to shape the effects ofmutations on organismal fitness Mol Cell 49133ndash144

Bershtein S Mu W Shakhnovich EI 2012 Soluble oligomerization pro-vides a beneficial fitness effect on destabilizing mutations Proc NatlAcad Sci U S A 1094857ndash4862

Bershtein S Segal M Bekerman R Tokuriki N Tawfik DS 2006Robustness-epistasis link shapes the fitness landscape of a randomlydrifting protein Nature 444929ndash932

Bloom JD Silberg JJ Wilke CO Drummond DA Adami C Arnold FH2005 Thermodynamic prediction of protein neutrality Proc NatlAcad Sci U S A 102606ndash611

Bowie JU Sauer RT 1989 Identifying determinants of folding and activityfor a protein of unknown structure Proc Natl Acad Sci U S A862152ndash2156

Brandts JF Lin LN 1990 Study of strong to ultratight protein interactionsusing differential scanning calorimetry Biochemistry 296927ndash6940

Bromberg Y Yachdav G Rost B 2008 SNAP predicts effect of mutationson protein function Bioinformatics 242397ndash2398

Chakravarty S Bhinge A Varadarajan R 2002 A procedure for detectionand quantitation of cavity volumes proteins Application to measurethe strength of the hydrophobic driving force in protein foldingJ Biol Chem 27731345ndash31353

Chakshusmathi G 2002 Temperature sensitive mutants of CcdBin vivoand in vitro studies PhD thesis Indian Institute of Science BangaloreIndia

Chung CT Niemela SL Miller RH 1989 One-step preparation ofcompetent Escherichia coli transformation and storage of bacte-rial cells in the same solution Proc Natl Acad Sci U S A 862172ndash2175

De Jonge N Garcia-Pino A Buts L Haesaerts S Charlier D Zangger KWyns L De Greve H Loris R 2009 Rejuvenation of CcdB-poisonedgyrase by an intrinsically disordered protein domain Mol Cell35154ndash163

DeBartolo J Dutta S Reich L Keating AE 2012 Predictive Bcl-2 familybinding models rooted in experiment or structure J Mol Biol422124ndash144

Deng Z Huang W Bakkalbasi E Brown NG Adamski CJ Rice K MuznyD Gibbs RA Palzkill T 2012 Deep sequencing of systematic com-binatorial libraries reveals beta-lactamase sequence constraints athigh resolution J Mol Biol 424150ndash167

DePristo MA Weinreich DM Hartl DL 2005 Missense meanderings insequence space a biophysical view of protein evolution Nat RevGenet 6678ndash687

Dutta S Chen TS Keating AE 2013 Peptide ligands for pro-survivalprotein Bfl-1 from computationally guided library screening ACSChem Biol 8778ndash788

Firnberg E Labonte JW Gray JJ Ostermeier M 2014 A comprehensivehigh-resolution map of a genersquos fitness landscape Mol Biol Evol311581ndash1592

Fowler DM Araya CL Fleishman SJ Kellogg EH Stephany JJ Baker DFields S 2010 High-resolution mapping of protein sequence-function relationships Nat Methods 7741ndash746

Fukada H Sturtevant JM Quiocho FA 1983 Thermodynamics of thebinding of L-arabinose and of D-galactose to the L-arabinose-bindingprotein of Escherichia coli J Biol Chem 25813193ndash13198

Gallivan JP Dougherty DA 1999 Cation-pi interactions in structuralbiology Proc Natl Acad Sci U S A 969459ndash9464

Geiler-Samerotte KA Dion MF Budnik BA Wang SM Hartl DLDrummond DA 2011 Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein re-sponse in yeast Proc Natl Acad Sci U S A 108680ndash685

Gonzalez M Argarana CE Fidelio GD 1999 Extremely high thermalstability of streptavidin and avidin upon biotin binding BiomolEng 1667ndash72

Gottesman S Wickner S Maurizi MR 1997 Protein quality controltriage by chaperones and proteases Genes Dev 11815ndash823

Guerois R Nielsen JE Serrano L 2002 Predicting changes in the stabilityof proteins and protein complexes a study of more than 1000 mu-tations J Mol Biol 320369ndash387

Hayes F 2003 Toxins-antitoxins plasmid maintenance programmedcell death and cell cycle arrest Science 3011496ndash1499

Hecht M Bromberg Y Rost B 2015 Better prediction of functionaleffects for sequence variants BMC Genomics 16(Suppl 8)S1

Hellinga HW Wynn R Richards FM 1992 The hydrophobic core ofEscherichia coli thioredoxin shows a high tolerance to nonconserva-tive single amino acid substitutions Biochemistry 3111203ndash11209

Henikoff S Henikoff JG 1992 Amino acid substitution matrices fromprotein blocks Proc Natl Acad Sci U S A 8910915ndash10919

Hietpas R Roscoe B Jiang L Bolon DN 2012 Fitness analyses of allpossible point mutations for regions of genes in yeast Nat Protoc71382ndash1396

Hietpas RT Jensen JD Bolon DN 2011 Experimental illumination of afitness landscape Proc Natl Acad Sci U S A 1087896ndash7901

Jaffe A Ogura T Hiraga S 1985 Effects of the ccd function of the Fplasmid on bacterial growth J Bacteriol 163841ndash849

Jain PC Varadarajan R 2014 A rapid efficient and economical inversepolymerase chain reaction-based method for generating a site sat-uration mutant library Anal Biochem 44990ndash98

Janin J Miller S Chothia C 1988 Surface subunit interfaces and interiorof oligomeric proteins J Mol Biol 204155ndash164

Jiang L Mishra P Hietpas RT Zeldovich KB Bolon DN 2013 Latenteffects of Hsp90 mutants revealed at reduced expression levelsPLoS Genet 9e1003600

Kim I Miller CR Young DL Fields S 2013 High-throughput analysis ofin vivo protein stability Mol Cell Proteomics 123370ndash3378

Kowalsky CA Klesmith JR Stapleton JA Kelly V Reichkitzer NWhitehead TA 2015 High-resolution sequence-function mappingof full-length proteins PLoS One 10e0118193

Kumar MD Bava KA Gromiha MM Prabakaran P Kitajima K UedairaH Sarai A 2006 ProTherm and ProNIT thermodynamic databasesfor proteins and protein-nucleic acid interactions Nucleic Acids Res34D204ndashD206

Lim WA Hodel A Sauer RT Richards FM 1994 The crystal structure of amutant protein with altered but improved hydrophobic core pack-ing Proc Natl Acad Sci U S A 91423ndash427

Lin YS Hsu WL Hwang JK Li WH 2007 Proportion of solvent-exposedamino acids in a protein and rate of protein evolution Mol Biol Evol241005ndash1011

Liu R Baase WA Matthews BW 2000 The introduction of strain and itseffects on the structure and stability of T4 lysozyme J Mol Biol295127ndash145

Liu Y Tan YL Zhang X Bhabha G Ekiert DC Genereux JC Cho Y KipnisY Bjelic S Baker D et al 2014 Small molecule probes to quantify thefunctional fraction of a specific protein in a cell with minimal foldingequilibrium shifts Proc Natl Acad Sci U S A 1114449ndash4454

Loladze VV Ermolenko DN Makhatadze GI 2002 Thermodynamic con-sequences of burial of polar and non-polar amino acid residues inthe protein interior J Mol Biol 320343ndash357

Loris R Dao-Thi MH Bahassi EM Van Melderen L Poortmans FLiddington R Couturier M Wyns L 1999 Crystal structure ofCcdB a topoisomerase poison from E coli J Mol Biol 2851667ndash1677

Maier T Ferbitz L Deuerling E Ban N 2005 A cradle for new proteinstrigger factor at the ribosome Curr Opin Struct Biol 15204ndash212

Main ER Fulton KF Jackson SE 1998 Context-dependent nature ofdestabilizing mutations on the stability of FKBP12 Biochemistry376145ndash6153

McLaughlin RN Jr Poelwijk FJ Raman A Gosal WS Ranganathan R2012 The spatial architecture of protein function and adaptationNature 491138ndash142

Melnikov A Rogov P Wang L Gnirke A Mikkelsen TS 2014Comprehensive mutational scanning of a kinase in vivo revealssubstrate-dependent fitness landscapes Nucleic Acids Res 42e112

Milla ME Brown BM Sauer RT 1994 Protein stability effects of a com-plete set of alanine substitutions in Arc repressor Nat Struct Biol1518ndash523

Miosge LA Field MA Sontani Y Cho V Johnson S Palkova ABalakishnan B Liang R Zhang Y Lyon S et al 2015 Comparisonof predicted and actual consequences of missense mutations ProcNatl Acad Sci U S A 112E5189ndashE5198

Mogk A Kummer E Bukau B 2015 Cooperation of Hsp70 and Hsp100chaperone machines in protein disaggregation Front Mol Biosci 222

Moretti R Fleishman SJ Agius R Torchala M Bates PA Kastritis PLRodrigues JP Trellet M Bonvin AM Cui M et al 2013Community-wide evaluation of methods for predicting the effectof mutations on protein-protein interactions Proteins 811980ndash1987

Niesen FH Berglund H Vedadi M 2007 The use of differential scanningfluorimetry to detect ligand interactions that promote protein sta-bility Nat Protoc 22212ndash2221

Nishihara K Kanemori M Yanagi H Yura T 2000 Overexpression oftrigger factor prevents aggregation of recombinant proteins inEscherichia coli Appl Environ Microbiol 66884ndash889

Olson CA Wu NC Sun R 2014 A comprehensive biophysical descrip-tion of pairwise epistasis throughout an entire protein domain CurrBiol 242643ndash2651

Overington J Donnelly D Johnson MS Sali A Blundell TL 1992Environment-specific amino acid substitution tables tertiary tem-plates and prediction of protein folds Protein Sci 1216ndash226

Parthiban V Gromiha MM Schomburg D 2006 CUPSAT prediction ofprotein stability upon point mutations Nucleic Acids Res34W239ndashW242

Pires DE Ascher DB Blundell TL 2014 DUET a server for predictingeffects of mutations on protein stability using an integrated com-putational approach Nucleic Acids Res 42W314ndashW319

Ponder JW Richards FM 1987 Tertiary templates for proteins Use ofpacking criteria in the enumeration of allowed sequences for differ-ent structural classes J Mol Biol 193775ndash791

Prajapati RS Sirajuddin M Durani V Sreeramulu S Varadarajan R 2006Contribution of cation-pi interactions to protein stabilityBiochemistry 4515000ndash15010

Radivojac P Clark WT Oron TR Schnoes AM Wittkop T Sokolov AGraim K Funk C Verspoor K Ben-Hur A et al 2013 A large-scaleevaluation of computational protein function prediction NatMethods 10221ndash227

Randles LG Lappalainen I Fowler SB Moore B Hamill SJ Clarke J 2006Using model proteins to quantify the effects of pathogenic muta-tions in Ig-like proteins J Biol Chem 28124216ndash24226

Richards FM 1977 Areas volumes packing and protein structure AnnuRev Biophys Bioeng 6151ndash176

Romero PA Tran TM Abate AR 2015 Dissecting enzyme function withmicrofluidic-based deep mutational scanning Proc Natl Acad SciU S A 1127159ndash7164

Roscoe BP Thayer KM Zeldovich KB Fushman D Bolon DN 2013Analyses of the effects of all ubiquitin point mutants on yeastgrowth rate J Mol Biol 4251363ndash1377

Rose GD Geselowitz AR Lesser GJ Lee RH Zehfus MH 1985Hydrophobicity of amino acid residues in globular proteinsScience 229834ndash838

Sahoo A Khare S Devanarayanan S Jain PC Varadarajan R 2015Residue proximity information and protein model discriminationusing saturation-suppressor mutagenesis Elife 4e09532

Sarkisyan KS Bolotin DA Meer MV Usmanova DR Mishin AS SharonovGV Ivankov DN Bozhanova NG Baranov MS Soylemez O et al2016 Local fitness landscape of the green fluorescent protein Nature533397ndash401

Schlinkmann KM Honegger A Tureci E Robison KE Lipovsek DPluckthun A 2012 Critical features for biosynthesis stability andfunctionality of a G protein-coupled receptor uncovered by all-versus-all mutations Proc Natl Acad Sci U S A 1099810ndash9815

Shen MY Sali A 2006 Statistical potential for assessment and predictionof protein structures Protein Sci 152507ndash2524

Starita LM Pruneda JN Lo RS Fowler DM Kim HJ Hiatt JB Shendure JBrzovic PS Fields S Klevit RE 2013 Activity-enhancing mutations inan E3 ubiquitin ligase identified by high-throughput mutagenesisProc Natl Acad Sci U S A 110E1263ndashE1272

Starita LM Young DL Islam M Kitzman JO Gullingsrud J Hause RJFowler DM Parvin JD Shendure J Fields S 2015 Massively ParallelFunctional Analysis of BRCA1 RING Domain Variants Genetics200413ndash422

Sultana A Lee JE 2015 Measuring protein-protein and protein-nucleicacid interactions by biolayer interferometry Curr Protoc Protein Sci7919 25 11ndash19 25 26

Tanaka M Chon H Angkawidjaja C Koga Y Takano K Kanaya S2010 Protein core adaptability crystal structures of the cavity-filling variants of Escherichia coli RNase HI Protein Pept Lett171163ndash1169

Thyagarajan B Bloom JD 2014 The inherent mutational tolerance andantigenic evolvability of influenza hemagglutinin Elife 3e03300

Tokuriki N Tawfik DS 2009 Chaperonin overexpression promotes ge-netic variation and enzyme evolution Nature 459668ndash673

Traxlmayr MW Hasenhindl C Hackl M Stadlmayr G Rybka JD Borth NGrillari J Ruker F Obinger C 2012 Construction of a stability land-scape of the CH3 domain of human IgG1 by combining directedevolution with high throughput sequencing J Mol Biol 423397ndash412

Tripathi A Varadarajan R 2014 Residue specific contributions to stabil-ity and activity inferred from saturation mutagenesis and deep se-quencing Curr Opin Struct Biol 2463ndash71

Tsai CJ Lin SL Wolfson HJ Nussinov R 1997 Studies of protein-proteininterfaces a statistical analysis of the hydrophobic effect Protein Sci653ndash64

Ullers RS Luirink J Harms N Schwager F Georgopoulos C Genevaux P2004 SecB is a bona fide generalized chaperone in Escherichia coliProc Natl Acad Sci U S A 1017583ndash7588

Van Melderen L Thi MH Lecchi P Gottesman S Couturier M MauriziMR 1996 ATP-dependent degradation of CcdA by Lon proteaseEffects of secondary structure and heterologous subunit interactionsJ Biol Chem 27127730ndash27738

Wang X Minasov G Shoichet BK 2002 Evolution of an antibiotic resis-tance enzyme constrained by stability and activity trade-offs J MolBiol 32085ndash95

Whitehead TA Chevalier A Song Y Dreyfus C Fleishman SJ De MattosC Myers CA Kamisetty H Blair P Wilson IA et al 2012Optimization of affinity specificity and function of designed influ-enza inhibitors using deep sequencing Nat Biotechnol 30543ndash548

Williams TA Fares MA 2010 The effect of chaperonin buffering onprotein evolution Genome Biol Evol 2609ndash619

Wolfenden R Lewis CA Jr Yuan Y Carter CW Jr 2015 Temperaturedependence of amino acid hydrophobicities Proc Natl Acad SciU S A 1127484ndash7488

Wu NC Olson CA Du Y Le S Tran K Remenyi R Gong D Al-MawsawiLQ Qi H Wu TT et al 2015 Functional Constraint Profiling of a ViralProtein Reveals Discordance of Evolutionary Conservation andFunctionality PLoS Genet 11e1005310

Wynn R Harkins PC Richards FM Fox RO 1996 Mobile unnaturalamino acid side chains in the core of staphylococcal nucleaseProtein Sci 51026ndash1031

Yang J Yan R Roy A Xu D Poisson J Zhang Y 2015 The I-TASSER Suiteprotein structure and function prediction Nat Methods 127ndash8

Yates CM Filippis I Kelley LA Sternberg MJ 2014 SuSPect enhancedprediction of single amino acid variant (SAV) phenotype using net-work features J Mol Biol 4262692ndash2701

Yue P Li Z Moult J 2005 Loss of protein structure stability as a majorcausative factor in monogenic disease J Mol Biol 353459ndash473

Yue P Moult J 2006 Identification and analysis of deleterious humanSNPs J Mol Biol 3561263ndash1274

residue "X" at every buried site was obtained. Pair-wise comparisons of these distributions were made using a Wilcoxon signed-rank test. The heatmap (fig. 2A) indicates the log10 P value for the null hypothesis that the introduction of the row residues at a buried site does not reduce protein function significantly more than introduction of the corresponding column residue at the same site. It is important to note here that both the residues being compared are mutant residues. Unlike typical amino acid substitution matrices (Henikoff and Henikoff 1992) used for sequence alignment, our matrix is asymmetric. Aspartate and arginine mutants possess significantly higher MSseq values than 18 and 16 other residues, respectively, indicating that they are the least tolerated mutations. Proline is the next most poorly tolerated mutation. P values for (D, E), (N, Q) and (S, T) (row, column) pairs are lower than for (E, D), (Q, N) and (T, S), indicating that on an average the order of tolerance is D < E, N < Q and S < T. Similarly, for aromatic residue tolerances, W < Y and H < F. In order to examine if these observations remain valid for systems other than CcdB, we examined previously published mutational sensitivity data for PSD95pdz3 (McLaughlin et al. 2012) and GB1 (Olson et al. 2014) (fig. 2B and C). The general trends were very similar and confirm our observation that for buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite is true for aromatic residues. Close examination of the log10 P values in figure 2A suggests that at buried sites, the substitution preference is approximately in the following order: A,C,V,L,I,M > T > F > H,Y,S > Q,G,W > N > K,P,E > R > D. A similar (but not identical) trend is also visible in the PSD95pdz3 and GB1 data, though this is based on fewer buried positions and at a single expression level. Additional saturation mutagenesis studies on other systems using quantitative or semi-quantitative readouts would be useful in consolidating our observations.
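The pairwise comparison behind the heatmap can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, the P value uses the large-sample normal approximation without tie or continuity correction, and the two score lists are assumed to be already paired (same buried site and expression level at each index).

```python
import math

def signed_rank_p_greater(row_scores, col_scores):
    """One-sided Wilcoxon signed-rank P value (normal approximation).

    A small P value suggests that the row residue tends to give higher
    sensitivity scores than the column residue at the same sites.
    Assumption: inputs are paired lists of equal length.
    """
    diffs = [r - c for r, c in zip(row_scores, col_scores) if r != c]
    n = len(diffs)
    if n == 0:
        return 1.0  # no informative pairs
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability

def log10_p_matrix(scores_by_residue):
    """Asymmetric matrix of log10 P values keyed by (row, column) residue."""
    residues = sorted(scores_by_residue)
    matrix = {}
    for row in residues:
        for col in residues:
            if row != col:
                p = signed_rank_p_greater(scores_by_residue[row],
                                          scores_by_residue[col])
                matrix[(row, col)] = math.log10(max(p, 1e-300))
    return matrix
```

Because the test is one-sided, the (row, column) and (column, row) entries differ, which is why the resulting matrix is asymmetric.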

Substitution preferences at active-site residues should be different than those at buried sites because protein–protein interfaces are more polar than protein interiors (Janin et al. 1988; Tsai et al. 1997) and are also likely to display a greater context dependence. Extensive analysis of a large amount of mutational data would be required to decipher these substitution preferences. In the case of CcdB, data for only 142 active-site mutants is available. Hence, we did not attempt to predict mutational sensitivities at active-site residues.

Mutational Tolerance as a Function of Depth

Mutational tolerances at the lowest (MID 2) and highest (MID 8) expression levels for all nonactive-site residues are listed (supplementary table S2, Supplementary Material online and fig. 1). At the lowest expression level, mutational tolerance increased with increasing accessibility, while at the highest expression level it is less sensitive to accessibility and most mutants show an active phenotype. Most substitutions are tolerated at exposed nonactive-site residues, both at low and high expression levels (fig. 1A and supplementary fig. S1A, Supplementary Material online). However, a few mutants with accessibility > 40% were found to show an inactive phenotype. These exposed, inactive, nonactive-site substitutions are typically either aromatic residues or proline (supplementary table S4, Supplementary Material online). These exposed aromatic substitutions probably affect the folding of CcdB protein, as they show high propensity to aggregation, although Tm's are somewhat comparable to the wildtype (see mutants G29W, L41F and V73F in supplementary table S5, Supplementary Material online).

Cation–π interactions are thought to contribute to protein stability (Gallivan and Dougherty 1999), though an earlier study (Prajapati et al. 2006) shows these contribute little to the stability of Maltose Binding Protein. We find that all the 19 and 11 mutations at the 13th and 14th positions, respectively, involved in cation–π interaction, including the charge reversal mutant R13D, were well tolerated even at the lowest expression levels (supplementary table S6, Supplementary Material online). Salt-bridges are another possible stabilizing noncovalent electrostatic interaction in proteins. In the case of CcdB, five salt-bridges are present between the following pairs of residues: D19-R31, D23-R31, E59-R40, E79-K4 and D89-R86. All amino acids participating in salt-bridges are solvent exposed except for D19, in which only the terminal oxygens are exposed. Mutations at all these positions are well tolerated even at the lowest expression level (supplementary table S6, Supplementary Material online), suggesting that none of the salt-bridges in CcdB contributes significantly to the stability or activity of the protein.

We also examined the correlation of average MSseq values with residue depth for all nonactive-site positions in CcdB (PDB ID 3VUB) (fig. 2D). Similar calculations were performed for PSD95pdz3 and GB1 using the phenotypic data obtained from McLaughlin et al. (2012) (PDB ID 1BE9) and Olson et al. (2014) (PDB ID 1PGA), respectively. In these studies, the ability

Table 1. Mutational Tolerance at the Buried-Site Residues at Lowest and Highest Expression Levels.

Amino acid   No. of mutants   Depth (Å)   ACC^a (%)   Tol at MID2^b (%)   Tol at MID8^b (%)
V05          18               6.8         0           39                  94
F17          17               7.3         0.2         82                  100
V18          18               9.3         0           33                  83
D19          18               6.7         1.4         83                  100
V20^c,d      19               8.6         0           32                  74
Q21^c,d      19               6.5         1           63                  100
M32^d        17               7.8         0.3         76                  100
V33          19               6.5         1.4         68                  95
I34          19               7.9         0           37                  79
L36          12               7.2         0           0                   67
P52          17               5.4         3.5         41                  100
V54          15               5.6         0.4         73                  100
M63          19               8.1         0.1         47                  89
T65          9                7.9         0           44                  100
M68^c,d      12               6.6         0           33                  100
L83          19               5.8         1.5         53                  100
I90          19               7.4         0.1         26                  89
A93^c,d      14               6.0         0           36                  100
I94^c,d      18               7.9         0.6         33                  83
M97^c,d      16               7.5         0           56                  94
F98^c,d      19               7.7         0.7         37                  79

^a Side-chain accessibility.
^b Mutational tolerance at the lowest (MID 2) and highest (MID 8) expression levels.
^c Residues within van der Waals distance of the active-site residues.
^d Residues present at dimer interface.


FIG. 2. Relative tolerance for substitutions at buried positions. (A) Mutational sensitivity data at all buried positions obtained at different expression levels for CcdB was used to obtain the distribution of MSseq values for a given mutant residue. The distributions for row and column residues were compared using a Wilcoxon signed-rank test and the corresponding P values were calculated. A log10 of the P values is indicated. Gradation from red to blue indicates increasing values of log10 P, i.e., decreasing destabilizing effect of the row residue w.r.t. column residue. A lower P value implies that introduction of the row residue at a buried site is typically more destabilizing than introduction of the corresponding column residue. (B and C) Similar plot but using ΔE_i^x values derived from saturation mutagenesis of the PDZ domain (PSD95pdz3) and lnW values from saturation mutagenesis of IgG Binding domain of protein G (GB1), respectively. (D–F) Correlation of the average MSseq values, ΔE_i^x values and lnW values with side-chain depth for all nonactive-site residues of CcdB, PSD95pdz3 and GB1, respectively. Accessibility and depth values were calculated based on the crystal structure of WT homodimeric CcdB (PDB ID 3VUB), PSD95pdz3 (PDB ID 1BE9) and GB1 (PDB ID 1PGA). A residue was defined as buried if the side-chain accessibility is ≤5%.

Table 2. Mutant Phenotype Prediction by MSpred, SNAP2 and SuSPect.

Protein     Prediction method   Pearson's correlation coefficient^a   Matthews correlation coefficient^b   Sensitivity^c (%)   Specificity^d (%)   Accuracy^e (%)
CcdB        MSpred^f            0.69                                  0.65                                 69                  95                  90
            SNAP2^g             0.27                                  0.19                                 100                 11                  37
            SuSPect^h           0.29                                  0.14                                 100                 8                   30
PSD95pdz3   MSpred^f            0.57                                  0.53                                 61                  93                  88
            SNAP2^g             0.24                                  0.15                                 100                 7                   34
            SuSPect^h           0.6                                   0.61                                 87                  87                  87
GB1         MSpred^f            0.65                                  0.49                                 44                  96                  79
            SNAP2^g             0.27                                  0.11                                 100                 3                   42
            SuSPect^h           0.08                                  0.03                                 73                  24                  38

^a Modulus of the correlation coefficient.
^b Matthews correlation coefficient $= \dfrac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$, where TP, TN, FP, FN are True Positives, True Negatives, False Positives and False Negatives, respectively.
^c Sensitivity $= \dfrac{TP}{TP+FN}$.
^d Specificity $= \dfrac{TN}{TN+FP}$.
^e Accuracy $= \dfrac{TP+TN}{TP+TN+FP+FN}$.
^f Mutant was classified as nonneutral if MSpred > 2 and neutral if the score = 2. Mutants were classified into true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN).
^g Mutant was classified as nonneutral if SNAP2 score > 50 and neutral if the score < −50; predictions with −50 < Score < 50 were of low reliability and were omitted. Mutants were classified into TP, TN, FP and FN.
^h Mutant was classified as nonneutral if SuSPect score > 75 and neutral if the score < 25; predictions with 25 < Score < 75 were of low reliability and were omitted. Mutants were classified into TP, TN, FP and FN.
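The footnote definitions above translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the lower SNAP2 cutoff of −50 is assumed from the stated score ranges, and a zero denominator is reported as 0.0 by convention rather than raising an error.

```python
import math

def classify_mspred(score):
    """Footnote f: nonneutral (True) if MSpred > 2, otherwise neutral."""
    return score > 2

def classify_by_cutoffs(score, upper, lower):
    """Footnotes g and h: True = nonneutral, False = neutral,
    None = low-reliability prediction that is omitted.
    Assumed cutoffs: SNAP2 (upper=50, lower=-50), SuSPect (upper=75, lower=25).
    """
    if score > upper:
        return True
    if score < lower:
        return False
    return None

def _ratio(numerator, denominator):
    # Convention chosen for this sketch: undefined ratios are reported as 0.0.
    return numerator / denominator if denominator else 0.0

def classification_metrics(tp, tn, fp, fn):
    """MCC, sensitivity, specificity and accuracy from confusion counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
    }
```

Omitting low-reliability predictions changes the set of mutants being scored, so metrics for different predictors are computed over different subsets.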


of these proteins to bind their cognate ligands is quantitatively linked to a phenotypic readout. In all the three cases, average phenotypic effect was observed to increase with residue depth (correlation coefficients of 0.85, −0.73 and −0.61 for CcdB, PSD95pdz3 and GB1, respectively) (fig. 2D–F). Buried positions with small (A or G) wildtype residues were not included in the correlations. These positions are unusually sensitive to mutation because all substitutions result in large steric overlap. These data suggest that a large fraction of the average sensitivity to mutation at nonactive-site residues is governed by a single parameter, the residue depth. This is a remarkably simple metric that provides an alternative to the sector based models used to analyze mutational data for PSD95pdz3 as well as other proteins (McLaughlin et al. 2012).
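As an illustration of the depth correlation described here, the following sketch computes a Pearson correlation between residue depth and position-averaged sensitivity, leaving out buried positions whose wildtype residue is A or G. The record layout, the function names and the 5% burial cutoff passed as a default are assumptions made for this example, and the data in the usage note are invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.
    Raises ZeroDivisionError if either sequence has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def depth_correlation(positions, buried_cutoff=5.0):
    """Correlate depth with mean sensitivity over nonactive-site positions.

    positions: list of dicts with hypothetical keys
      'wt' (one-letter wildtype residue), 'depth' (angstrom),
      'accessibility' (percent) and 'mean_score' (average over mutants).
    Buried positions with A or G as wildtype residue are skipped.
    """
    kept = [p for p in positions
            if not (p["accessibility"] <= buried_cutoff and p["wt"] in ("A", "G"))]
    return pearson_r([p["depth"] for p in kept],
                     [p["mean_score"] for p in kept])
```

For example, with made-up records at depths 4, 6 and 8 Å and mean scores 2, 4 and 6, `depth_correlation` returns a value close to 1; adding a buried alanine with a high score does not change the result because that position is excluded.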

One alternative approach to estimating burial preferences of amino acids is measuring free energies of transfer of amino acid side-chain analogs from water to cyclohexane (Wolfenden et al. 2015). Another approach is to measure accessible surface areas of the side-chains averaged over a large database of protein structures, and either infer free energies of transfer from aqueous solution into the protein interior as described previously (Rose et al. 1985) or construct environment dependent substitution matrices from such data (Overington et al. 1992). Relative ΔΔG's of burial from the first two approaches are shown in supplementary figure S1C and D, Supplementary Material online. Both of these show some qualitative similarities with the mutational data in figure 2A–C, but there are several notable differences. For example, the relative ΔΔG's of burial inferred from the free energy of transfer approaches show that the introduction of W at buried positions is clearly favored over Y and H, unlike the situation for the experimental mutational data. In addition, the transfer data predict that mutation to G and P will be largely tolerated, whereas the experimental mutational data suggest that substitutions to G or P are rarely tolerated. It is also observed in the mutational data-sets that at buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite trend is observed for aromatic residues. In the case of ΔΔG transfer data, the trend is preserved for polar and charged residues but clearly not for aromatic residues.

Prediction of Mutational Sensitivity Score (MSpred) Using Penalties Derived from the CcdB Data

We further determined whether the above observations regarding substitution preferences could be employed for prediction of functional consequences of individual mutations. To this end, we developed a predictive model using a coherent set of rules derived from a randomly chosen subset of the CcdB mutational data containing 60% of the mutants, and tested its applicability in predicting the mutational sensitivities of the remaining 40% of the mutants as well as two other proteins, PSD95pdz3 and GB1. The predicted score is denoted as MSpred.

For CcdB mutational data, a mutational sensitivity score (MSseq) of 2 is indicative of wild-type like behavior in the mutant, and higher values of MSseq indicate higher mutational sensitivity. Therefore, a base MSpred value of 2 was assigned to all the mutants in the test set and penalties were subsequently added according to the nature of the substitution, taking into account the wildtype residue identity. As exposed nonactive-site positions tolerated almost all substitutions, penalties were calculated only for buried positions. We also observed that buried side-chains that point outwards with respect to the protein core are less sensitive to mutations compared with the ones that point inside. These residues were identified by their side-chain depth values (see "Materials and Methods" section) and were not considered for penalty calculation.

Substitutions were divided into categories based on the nature of the wildtype and mutant residue. Each wildtype and mutant residue was assigned to one of six categories, namely aliphatic, aromatic, polar, charged, G and P, resulting in a total of 34 (36 − 2 [G→G and P→P]) types of substitutions. The CcdB data was randomly divided into training (60% data) and test sets (40% data). The category penalty for each type of substitution was calculated using only the training data set, by averaging the MSseq values observed for each category of substitution and subtracting the base MSpred value of 2 from the average MSseq. Additional "residue-specific penalties" were also derived to account for the residue-size-wise substitution preferences, e.g., smaller polar residues being more destabilizing than larger ones (Materials and Methods; supplementary table S7, Supplementary Material online). Penalties for proline substitutions (both buried and exposed) were derived using the flowchart described previously (Bajaj et al. 2007). Next, MSpred values were calculated for all buried positions based on these penalties (MSpred = 2 + category penalty + residue-specific penalty), and all exposed nonactive-site positions were assigned an MSpred of 2. Active-site residues were not considered in the analysis. The predicted mutational sensitivity scores (MSpred) for the test data set showed a high Pearson's correlation (r = 0.69) with the experimental MSseq values and a SD of 1.26 (table 2). We also derived the Matthews correlation coefficient in order to evaluate the performance of MSpred in classifying mutants as neutral and nonneutral (see "Materials and Methods" section). It was observed to be 0.65 (table 2).
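The scoring scheme described in this section can be sketched in a few lines. This is a simplified illustration and not the authors' implementation: the assignment of residues to the six categories is an assumption made here (the text does not list it), the residue-specific penalties of supplementary table S7 are passed in as an optional dictionary rather than reproduced, and the separate flowchart used for proline penalties (Bajaj et al. 2007) is not modeled.

```python
import random
from collections import defaultdict

BASE_SCORE = 2.0  # MSseq of 2 corresponds to wild-type like behavior

# Hypothetical grouping of residues into the six categories named in the text.
CATEGORY = {}
for residues, name in [("AVLIMC", "aliphatic"), ("FWYH", "aromatic"),
                       ("STNQ", "polar"), ("DEKR", "charged"),
                       ("G", "G"), ("P", "P")]:
    for residue in residues:
        CATEGORY[residue] = name

def split_training_test(records, seed=0):
    """Random 60/40 split of (wt, mutant, msseq) records."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def train_category_penalties(training):
    """Category penalty = mean MSseq of that substitution type minus the base.
    training: iterable of (wt, mutant, msseq) for buried positions only."""
    scores = defaultdict(list)
    for wt, mutant, msseq in training:
        scores[(CATEGORY[wt], CATEGORY[mutant])].append(msseq)
    return {key: sum(vals) / len(vals) - BASE_SCORE for key, vals in scores.items()}

def predict_mspred(wt, mutant, buried, category_penalties, residue_penalties=None):
    """MSpred = 2 + category penalty + residue-specific penalty for buried
    positions; exposed nonactive-site positions keep the base score."""
    if not buried:
        return BASE_SCORE
    category = category_penalties.get((CATEGORY[wt], CATEGORY[mutant]), 0.0)
    specific = (residue_penalties or {}).get(mutant, 0.0)
    return BASE_SCORE + category + specific
```

Substitution types that never occur in the training set receive a zero category penalty in this sketch, which is a simplification chosen for brevity.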

We tested the performance of MSpred on two other proteins. The MSpred values for PSD95pdz3 and GB1 agreed well with the experimental mutational sensitivity data, with Pearson's correlation coefficients of 0.57 and 0.65 and Matthews correlation coefficients of 0.53 and 0.49, respectively (table 2).

We also carried out mutational sensitivity predictions for CcdB, PSD95pdz3 and GB1 using two frequently used methods, SNAP2 (Hecht et al. 2015) and SuSPect (Yates et al. 2014). Both SNAP2 and SuSPect show poorer correlation with the experimental mutational sensitivity data than MSpred (except SuSPect predictions for PSD95pdz3; table 2). Both the methods show a very high sensitivity but a very low specificity value compared with MSpred. Thus MSpred, which is derived based on very simple rules, compares favorably with the popular machine learning based methods SNAP2 and SuSPect. This approach should work to rank order mutational effects at buried sites for other globular proteins. While a three-dimensional structure is not essential, it is important to have residue burial information because predictions have been optimized for buried residues. A saturation mutagenesis data set is also not required. However, it is important to have experimental data on the functional effects of multiple point mutants to decide on the cutoff value of MSpred that would result in an observable phenotype. This value would likely depend on factors such as intrinsic protein stability, expression level and gene essentiality that would vary from one protein to another (Miosge et al. 2015).

In Vitro Determined Apparent Tm's Correlate Better with in Vivo Solubility than with Relative Activity Derived from Deep Sequencing

To experimentally probe the molecular basis for mutant phenotypes at nonactive-site positions, around 80 single-site mutants of CcdB were selected from the saturation mutagenesis library (Bajaj et al. 2008) based on MSseq and accessibility class (Adkar et al. 2012) (supplementary table S5, Supplementary Material online). All the mutants were purified by affinity purification against immobilized ligands, GyrA or CcdA. Each purified protein was subjected to thermal denaturation monitored using Sypro orange dye (Niesen et al. 2007), and the apparent Tm was calculated for each mutant (supplementary fig. S3A and table S5, Supplementary Material online). During purification of various CcdB mutants, it is possible that the protein may be inactivated by aggregation or misfolding. Hence, the ability of purified protein to bind CcdA was examined by monitoring the thermal denaturation of each mutant in the absence and presence of a CcdA peptide that contains CcdB binding residues (residues 46–72). If the mutant binds the CcdA peptide, this should result in an increase in its apparent Tm (supplementary fig. S3B, Supplementary Material online) (Fukada et al. 1983; Brandts and Lin 1990; Gonzalez et al. 1999). There were nine mutants, e.g., V05S, V05L, Y06G and F17D, that did not show an increase in apparent Tm in the presence of CcdA peptide (supplementary table S5, Supplementary Material online), suggesting that these are misfolded or aggregated; hence these mutants were removed from the analysis. Most of these mutants are largely found in inclusion bodies and have Tm's between 40 °C and 50 °C, in contrast to WT CcdB, which has a Tm of 68.4 °C. Further studies were restricted to the remaining 71 mutants that showed an enhancement in thermal stability in the presence of CcdA peptide (supplementary fig. S3B, Supplementary Material online). Mutants showed a range of apparent Tm's (supplementary table S5, Supplementary Material online). When in vitro determined thermal stability was compared with in vivo phenotypes (MSseq) determined

FIG. 3. Correlation between apparent in vitro Tm, in vivo solubility and activity (MSseq value) for CcdB mutants. Correlations of ΔTm [Tm (WT) − Tm (Mutant)] for 67 single-site mutants with (A) in vivo activity and (B) in vivo fraction of soluble protein, respectively. (C) Correlation of relative thermal stability (ΔTm) of mutants with ΔΔG° of unfolding estimated by GdnHCl denaturation. (D) Correlation of fraction of protein in the soluble fraction with in vivo activity of mutants.


by deep sequencing, a moderate correlation (r = 0.65) was obtained (fig. 3A). However, there were many mutants that showed similar activity but differed substantially in their stability, such as L16S, V18T, D19N and V54E (supplementary table S5, Supplementary Material online). Conversely, there were also mutants (e.g., V33D, M32N) that showed similar thermal stability to wildtype but had substantially lower activity in vivo. This shows that the in vivo activity of a protein depends on many factors inside a cell which assist in proper folding and maintaining an active conformation. Since the apparent Tm determined by the thermal shift assay may not reflect the true thermodynamic stability of the protein, a subset of 21 mutants was also subjected to GdnHCl chemical denaturation. These mutants were chosen to span a range of Tm and MSseq values. These measurements were done to see if the two measures of stability, i.e., thermal and chemical denaturation, correlate with one another. It was found that both measures of stability were highly correlated (fig. 3C and supplementary table S8, Supplementary Material online).

Various mutations have different effects on protein stability and activity. Properly folded proteins are found in the soluble fraction of the cell lysate, whereas misfolded proteins often form insoluble aggregates called inclusion bodies. Hence, misfolding reduces the amount of active, soluble and functional protein, though studies have shown that some amount of protein in the soluble fraction can also be misfolded (Liu et al. 2014). To study the relation between in vivo solubility of CcdB mutants with in vitro determined thermal stability, E. coli strain CSH501 (which has a mutation in the gyrA gene and is hence resistant towards CcdB action) was transformed individually with the mutants, and the amount of protein in both the soluble fraction and in inclusion bodies was estimated. Surprisingly, for a few mutants, although very little protein was found in the soluble fraction, these showed an active phenotype with an MSseq of 2 (fig. 3D). Hence, for these mutants, the small amount of protein present in the soluble fraction is properly folded and sufficient to cause cell death in a CcdB sensitive strain. In some cases, different mutants have similar fractions of soluble protein in vivo but have different in vivo activity and in vitro thermal stability (supplementary table S5, Supplementary Material online). The overall thermal stabilities of mutants correlated well with the in vivo amount of soluble protein (fig. 3B). This indicates that protein stability is an important determinant of proper folding in vivo. The moderate correlation of stability or solubility with in vivo activity likely arises because only a small amount of properly folded, soluble protein is sufficient to result in an active phenotype.

One reason for the lack of a better correlation between solubility and in vivo activity is that, for each mutant, various conformational forms of the protein can partition differently in the soluble and insoluble fractions of the cell lysate. The soluble fraction can comprise both folded protein, which is active, and soluble aggregates/partially misfolded protein, which are inactive (Liu et al. 2014). Moreover, this partitioning can be influenced by perturbations in the cytosolic proteostasis network. To study the relation between in vivo activity and solubility, the ability of four selected CcdB mutants (V33K and Y06G as examples of active but insoluble mutants, and R31G and V80N as examples of soluble but inactive mutants) in the soluble fraction of the cell lysate to bind Gyrase was monitored by surface plasmon resonance (supplementary fig. S4A and B, Supplementary Material online). Mutants with only a small amount of protein in the soluble fraction but displaying an active phenotype in vivo (V33K, Y06G) showed binding to Gyrase comparable to the wild-type in this surface plasmon resonance assay, showing that the protein is well folded. In contrast, in cases where a mutant is mostly in the soluble fraction but shows an inactive phenotype in vivo (R31G, V80N), the in vitro binding with Gyrase was also negligible compared with the wild-type (supplementary fig. S4C, Supplementary Material online).

Refolding and Unfolding Kinetics

Refolding and unfolding kinetics for 10 mutants that have similar thermal stability but different in vivo solubility and activity were monitored by time-course fluorescence spectroscopy at 25 °C. Refolding and unfolding were carried out at pH 7.4 at final GdnHCl concentrations of 0.6 and 3.2 M, respectively. Of the 10 selected mutants, four (V05S, I56G, V18R and V18H) could not be studied for their refolding profiles due to high precipitation immediately following purification. Further, for these mutants the proportion in the soluble fraction in vivo was low, ranging from 0.1 to 0.3 (supplementary table S5, Supplementary Material online). Of these, V05S and I56G are active (MSseq = 2), whereas V18R and V18H show an inactive phenotype (MSseq of 9 and 6, respectively). Most mutants (except V80N) showed slower refolding kinetics than the wild-type, indicating that these mutants are folding defective (table 3). Refolding for the wild-type occurs with a significant burst phase (k > 0.5 s^-1) and a slow phase. Mutants typically show a much smaller burst phase, an intermediate phase and a slow phase of much higher amplitude than the wildtype. Most mutants show unfolding kinetics similar to the wildtype, except V54E, which shows a much higher unfolding rate. The ability of the refolded mutants to bind to the cognate ligand GyrA or the CcdA peptide (residues 46–72) was also monitored. Binding of refolded mutants to immobilized GyrA on Amine Reactive Second Generation (AR2G) biosensors was monitored using Bio-layer interferometry (Sultana and Lee 2015), and the binding to CcdA peptide was monitored using Thermal Shift Assay (Niesen et al. 2007). Active mutants (L16S, V18T) retained their binding to both GyrA and CcdA upon refolding, even though their refolding kinetics was slow (table 3). Surprisingly, V54E, which is also an active mutant, failed to bind GyrA and CcdA upon refolding even though the native protein showed binding (supplementary fig. S6, Supplementary Material online). On the other hand, the inactive mutants R31G and M63N did not bind to GyrA and CcdA after refolding (table 3), showing that their refolded state is nonnative. Interestingly, the native V80N mutant did not show any binding to GyrA, but the refolded protein binds weakly to both ligands. Two of these mutants, V80N and V54E, also show formation of higher order oligomers (supplementary fig. S5, Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.

Over-Expression of Chaperones Rescues Folding Defects of Mutants

Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone and protease deleted strains as well as chaperone over-expressing strains (see "Materials and Methods" section). Eleven CcdB mutants with a range of solubility and activity were chosen to study if the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and activity of the mutants. Of these mutants, L16S, V33K, L36K and V80N had low Tm's (<55 °C) but they differ in their in vivo activity, whereas mutants G29W, D67P and V73F show a higher Tm (>56 °C) but are inactive. In vivo activity of these mutants was monitored both in chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed in different strains and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in the activity in protease deletion strains BWΔlon, BWΔclpP and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on the activity was not observed, probably due to direct involvement of proteases in the process of CcdB mediated cell death (Van Melderen et al. 1996). Many of these proteases have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that for many of these mutants, inactivity primarily results from folding defects which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to wildtype but lower solubility (V73F, D67P and G29W). This further indicates that defects in folding rather than stability are the primary causes for inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations shown in E. coli enzymes during in vitro mutational drift experiments, but can have significant effects on the E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding, and once the misfolding occurs, it cannot be rescued by the ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Discussion

Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure,

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants^a,b.

Columns a0 to k2 describe refolding (a0: burst phase; a1, k1: fast phase; a2, k2: slow phase); columns A0 to K1 describe unfolding.

Mutant   Fraction soluble   MSseq   ΔTm (WT − mutant) (°C)   a0     a1     k1 (s^-1)   a2     k2 (s^-1)   A0     A1     K1 (s^-1)   CcdA binding to refolded protein (TSA)   Gyrase binding to refolded protein (BLI)
L16S     0.4                2       16.7                     0.04   0.72   0.07        0.24   0.02        0.83   0.17   0.06        ++++                                     +++
V18T     0.7                2       9                        0.04   0.7    0.1         0.26   0.02        0.8    0.2    0.16        ++++                                     ++++
R31G     0.6                6       11                       0.05   0.8    0.2         0.15   0.02        0.85   0.15   0.02        –^c                                      –^c
V54E     0.4                2       14.5                     0.14   0.17   0.28        0.68   0.04        1      –      –           –^c                                      –^c
M63N     0.2                6       15.2                     0.15   –      –           0.85   0.08        0.84   0.16   0.07        –^c                                      –^c
V80N     0.8                6       17.5                     0.8    –      >0.5        0.2    0.04        0.76   0.24   0.07        +                                        +
WT       1                  2       –                        0.84   –      >0.5        0.16   0.046       0.62   0.38   0.04        ++++                                     ++++

^a The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9 and 6, respectively.
^b The traces were fit to a 5-parameter equation for exponential decay for refolding, $f = y_0 + a e^{-bx} + c e^{-dx}$, yielding fast (k1) and slow phase rate constants (k2) with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding, $f = y_0 + a e^{-bx}$, yielding the rate constant k1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were 10% of the measured experimental value.
^c No observable binding.
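For readers who wish to evaluate the fitting functions named in footnote b, a minimal sketch follows. It assumes the exponents carry negative signs (decay constants b and d in s^-1), uses hypothetical function names, and adds a half-time helper that is not part of the original analysis; it does not perform the curve fitting itself.

```python
import math

def refolding_signal(t, y0, a, b, c, d):
    """5-parameter double exponential: f = y0 + a*exp(-b*t) + c*exp(-d*t).
    b and d play the role of the fast (k1) and slow (k2) rate constants."""
    return y0 + a * math.exp(-b * t) + c * math.exp(-d * t)

def unfolding_signal(t, y0, a, b):
    """3-parameter single exponential: f = y0 + a*exp(-b*t).
    With a negative amplitude a, the signal rises toward y0."""
    return y0 + a * math.exp(-b * t)

def half_time(rate_constant):
    """Half-time of a single exponential phase, ln(2) / k, in seconds."""
    return math.log(2) / rate_constant
```

As a worked example, a fast-phase rate constant of 0.07 s^-1 (the value listed for L16S) corresponds to a half-time of ln(2)/0.07, or roughly 10 s, whereas a burst phase with k > 0.5 s^-1 is complete within about 1.4 s.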


stability and function and in understanding the relation be-tween genotype and phenotype In the present study a sat-uration mutagenesis library of single-site mutants of CcdBwas used to understand the molecular basis of mutant phe-notypes and to derive a simple procedure to predict suchphenotypes While there have been other saturation muta-genesis studies published in the recent past (Abriata et al2015 Kowalsky et al 2015 Romero et al 2015 Starita et al2015) the present study examines multiple expression levelseffects of multiple chaperones and proteases and employsextensive in vitro characterization to understand how muta-tions affect phenotype The tolerance of each residue to var-ious substitutions at multiple expression levels was calculatedand mapped on the crystal structure of CcdB (Loris et al1999) Mutational tolerance depended on both protein ex-pression level and structural context as noted by us earlier

(Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage-dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA Gyrase (Bajaj et al. 2008). A similar observation was made in another study, which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of TEM-1 β-lactamase protein, it has been found that deleterious effects of mutations

FIG. 4. In vivo activity and solubility of CcdB mutants in presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone-deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein for cells grown at 37 °C and induced for CcdB with 0.2% arabinose in both supernatant (soluble) and pellet (insoluble), with or without over-expression of chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants are shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.

Tripathi et al. · doi:10.1093/molbev/msw182 MBE


primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues, virtually all mutations are tolerated. At a few highly exposed positions (40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by 5 kJ/mol and suggested that loss of packing interactions is the major contributor to the loss of stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains, and suggests that while small increases in volume can be tolerated, changes in shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size of up to three methylene groups can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current Protherm database (Kumar et al. 2006) (http://www.abren.net/protherm/, last accessed 31 August 2016), 4,805 single buried-site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those of other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity score (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing-based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

         Strain                                        Fraction   Fractional increase in solubility (a)
Mutant   BW25113  BWΔtig  BWΔsecB  BWpTig  BWpSecB     soluble    Tig    SecB
WT       1        1       1        1       1           1          1      1
L16S     4        7       7        2       3           0.4        1.5    1.7
G29W     8        8       8        2       4           0.6        1.2    1.1
M32N     4        6       6        2       3           0.1        3      2
V33K     4        7       6        2       3           0.1        2      2
P35I     8        7       8        5       5           0.6        1.3    1.1
L36K     8        8       8        3       5           0.05       4      2
L41F     7        8       8        3       3           0.4        1.8    2.5
D67P     6        8       8        2       4           0.2        0.8    0.5
S70W     6        8       8        2       4           0.5        1      0.4
V73F     7        7       8        2       3           0.5        2      1.3
V80N     6        6       7        3       4           0.6        1.3    1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
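The ratio defined in the footnote to table 4 can be written out as a short Python sketch. The function names and the input values are hypothetical and serve only to illustrate the arithmetic; because a soluble fraction cannot exceed 1, the ratio for a given mutant is bounded by the reciprocal of its soluble fraction under normal conditions.

```python
def fractional_increase(soluble_with_chaperone, soluble_normal):
    """Soluble fraction with over-expressed chaperone divided by the
    soluble fraction under normal conditions (both between 0 and 1)."""
    if soluble_normal <= 0:
        raise ValueError("soluble fraction under normal conditions must be positive")
    return soluble_with_chaperone / soluble_normal

def max_possible_ratio(soluble_normal):
    """Upper bound on the ratio, reached when all protein becomes soluble."""
    return fractional_increase(1.0, soluble_normal)

# Hypothetical soluble fractions for one mutant, for illustration only.
ratio = fractional_increase(0.6, 0.4)   # chaperone raises solubility
bound = max_possible_ratio(0.4)         # largest ratio this mutant could show
```

A ratio above 1 indicates that chaperone over-expression increased the soluble fraction, and a ratio below 1 indicates a decrease.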


other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate with Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well-folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had similar unfolding rates to wildtype.
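The correlation coefficients quoted above are of the kind returned by a standard Pearson calculation. A minimal Python sketch follows, assuming Pearson's r was the statistic used (the text reports only "r"); the function name and the melting temperature and solubility values are hypothetical and are not data from this study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five mutants: thermal stability (Tm, in °C)
# and fraction of protein found in the soluble fraction.
tm_values = [48.0, 52.5, 55.0, 60.0, 66.0]
soluble_fraction = [0.10, 0.40, 0.35, 0.60, 0.90]
r_solubility = pearson_r(tm_values, soluble_fraction)
```

Comparing the coefficient obtained against solubility with the one obtained against the in vivo phenotype score, over the same set of mutants, is the comparison described in the paragraph above.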

The ability of a mutant to fold to the native state is affected by many parameters that include the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms which are involved in degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of single nucleotide polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods
Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation
Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
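The coverage offered by an NNK codon can be checked with a short Python sketch. It enumerates the 32 NNK codons against the standard genetic code and counts the encoded amino acids and stop codons. The variable names are illustrative, and the count of possible mutants assumes 19 substitutions at each of the 101 residues of CcdB.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: N is any of A/C/G/T at positions 1 and 2, K is G or T at position 3.
nnk_codons = [n1 + n2 + k for n1 in "ACGT" for n2 in "ACGT" for k in "GT"]
encoded = {CODON_TABLE[codon] for codon in nnk_codons}
amino_acids = encoded - {"*"}
stop_codons = [codon for codon in nnk_codons if CODON_TABLE[codon] == "*"]

# Assumed definition of "possible mutants": 19 substitutions at 101 positions.
possible_mutants = 101 * 19
coverage_of_earlier_library = 1430 / possible_mutants
```

Restricting the third position to G or T halves the number of codons relative to NNN while still encoding every amino acid, and it leaves only one of the three stop codons in the pool.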

In Vivo Activity of Individual Single-Site Mutants
Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance
Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1800g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11000g, 10 min, 4 °C. Various dilutions of supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases
Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), and ClpB, DnaK, DnaJ, and GroEL (all ATP-dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases: Lon, ClpP, HslU, HslV, and HchA, were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose at 30 °C, as many of these strains are temperature sensitive. In case of chaperone over-expression strains, the medium used for recovery following transformation was LB + Chl (35 μg/ml), as the chaperone-expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB + Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition where each of the mutants became active in presence or absence of the chaperone was tabulated.
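The tabulation step described above, in which the condition where a mutant becomes active is read off from colony growth, can be sketched in Python. This is an illustration, not the authors' analysis script. It assumes that the seven conditions are ordered from strongest repression to strongest induction, and that a mutant forming colonies under every condition is scored one past the last condition; both the function name and this scoring convention are hypothetical.

```python
def active_condition(colonies_observed):
    """Return the 1-based index of the first condition with no colonies.

    colonies_observed lists, for one mutant, whether colonies grew under
    each condition, ordered from most repressed to most induced (assumed).
    Growth ceases once enough active CcdB is made to kill the cells.
    A mutant that grows under every condition is scored len + 1 (assumed).
    """
    for index, grew in enumerate(colonies_observed, start=1):
        if not grew:
            return index
    return len(colonies_observed) + 1

# Hypothetical plate readouts for one mutant in two strains.
without_chaperone = [True, True, True, True, True, True, False]
with_chaperone = [True, True, False, False, False, False, False]
shift = active_condition(without_chaperone) - active_condition(with_chaperone)
```

Under this convention a positive shift means the mutant becomes active at a lower expression level when the chaperone is over-expressed.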

Supplementary Material
Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References
Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing, and protein structure. Annu Rev Biophys Bioeng 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci 79:19.25.11–19.25.26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol 356:1263–1274.

Molecular Determinants of Mutant Phenotypes. doi:10.1093/molbev/msw182 MBE



FIG. 2. Relative tolerance for substitutions at buried positions. (A) Mutational sensitivity data at all buried positions, obtained at different expression levels for CcdB, were used to obtain the distribution of MSseq values for a given mutant residue. The distributions for row and column residues were compared using a Wilcoxon signed-rank test and the corresponding P values were calculated. The log10 of the P values is indicated. Gradation from red to blue indicates increasing values of log10 P, i.e., decreasing destabilizing effect of the row residue with respect to the column residue. A lower P value implies that introduction of the row residue at a buried site is typically more destabilizing than introduction of the corresponding column residue. (B and C) Similar plots, but using $\Delta E_i^x$ values derived from saturation mutagenesis of the PDZ domain (PSD95pdz3) and lnW values from saturation mutagenesis of the IgG-binding domain of protein G (GB1), respectively. (D–F) Correlation of the average MSseq values, $\Delta E_i^x$ values, and lnW values with side-chain depth for all nonactive-site residues of CcdB, PSD95pdz3, and GB1, respectively. Accessibility and depth values were calculated based on the crystal structures of WT homodimeric CcdB (PDB ID 3VUB), PSD95pdz3 (PDB ID 1BE9), and GB1 (PDB ID 1PGA). A residue was defined as buried if the side-chain accessibility is ≤5%.

Table 2. Mutant Phenotype Prediction by MSpred, SNAP2, and SuSPect.

| Protein | Prediction method | Pearson's correlation coefficient (a) | Matthews correlation coefficient (b) | Sensitivity (c) (%) | Specificity (d) (%) | Accuracy (e) (%) |
|---|---|---|---|---|---|---|
| CcdB | MSpred (f) | 0.69 | 0.65 | 69 | 95 | 90 |
| CcdB | SNAP2 (g) | 0.27 | 0.19 | 100 | 11 | 37 |
| CcdB | SuSPect (h) | 0.29 | 0.14 | 100 | 8 | 30 |
| PSD95pdz3 | MSpred (f) | 0.57 | 0.53 | 61 | 93 | 88 |
| PSD95pdz3 | SNAP2 (g) | 0.24 | 0.15 | 100 | 7 | 34 |
| PSD95pdz3 | SuSPect (h) | 0.6 | 0.61 | 87 | 87 | 87 |
| GB1 | MSpred (f) | 0.65 | 0.49 | 44 | 96 | 79 |
| GB1 | SNAP2 (g) | 0.27 | 0.11 | 100 | 3 | 42 |
| GB1 | SuSPect (h) | 0.08 | 0.03 | 73 | 24 | 38 |

(a) Modulus of the correlation coefficient.
(b) Matthews correlation coefficient $= \dfrac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$, where TP, TN, FP, and FN are True Positives, True Negatives, False Positives, and False Negatives, respectively.
(c) Sensitivity $= \dfrac{TP}{TP+FN}$.
(d) Specificity $= \dfrac{TN}{TN+FP}$.
(e) Accuracy $= \dfrac{TP+TN}{TP+TN+FP+FN}$.
(f) A mutant was classified as nonneutral if MSpred > 2 and neutral if the score = 2. Mutants were classified into true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
(g) A mutant was classified as nonneutral if the SNAP2 score > 50 and neutral if the score < −50. Predictions with −50 < score < 50 were considered low reliability and were omitted. Mutants were classified into TP, TN, FP, and FN.
(h) A mutant was classified as nonneutral if the SuSPect score > 75 and neutral if the score < 25. Predictions with 25 < score < 75 were considered low reliability and were omitted. Mutants were classified into TP, TN, FP, and FN.
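The footnote definitions above can be turned into a small calculation. The following is a minimal Python sketch, not the authors' code; the function name `classification_metrics` and the confusion counts in the example are hypothetical and are not taken from the study.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute the four measures defined in the table 2 footnotes.

    tp, tn, fp, fn are counts of true positives, true negatives,
    false positives and false negatives. Assumes at least one observed
    positive and one observed negative, so the ratios are defined.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, report an MCC of 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "mcc": mcc,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts, for illustration only.
example = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

A predictor that calls almost every mutant nonneutral drives FN toward zero and FP upward, which is how a sensitivity of 100% can coexist with a very low specificity.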


of these proteins to bind their cognate ligands is quantitatively linked to a phenotypic readout. In all three cases, the average phenotypic effect was observed to increase with residue depth (correlation coefficients of 0.85, 0.73, and 0.61 for CcdB, PSD95pdz3, and GB1, respectively) (fig. 2D–F). Buried positions with small (A or G) wildtype residues were not included in the correlations. These positions are unusually sensitive to mutation because all substitutions result in large steric overlap. These data suggest that a large fraction of the average sensitivity to mutation at nonactive-site residues is governed by a single parameter, the residue depth. This is a remarkably simple metric that provides an alternative to the sector-based models used to analyze mutational data for PSD95pdz3 as well as other proteins (McLaughlin et al. 2012).
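The depth correlation described above can be illustrated with a short Python sketch. It assumes each position is recorded as a tuple of wildtype residue, buried flag, side-chain depth, and average mutational sensitivity; this record layout, the helper name `pearson_r`, and all numerical values are hypothetical and serve only to show the exclusion of buried A and G positions followed by a Pearson correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical records: (wildtype residue, buried?, depth, average sensitivity).
positions = [
    ("V", True, 7.5, 6.0),
    ("L", True, 6.0, 5.0),
    ("A", True, 7.0, 8.5),   # buried small residue: left out of the correlation
    ("S", False, 3.5, 2.0),
    ("K", False, 3.0, 2.5),
    ("I", True, 8.0, 7.0),
]

# Drop buried positions whose wildtype residue is A or G, as in the text.
kept = [(depth, score) for wt, buried, depth, score in positions
        if not (buried and wt in ("A", "G"))]
depths, scores = zip(*kept)
r = pearson_r(depths, scores)
```

With real data, `positions` would hold one record per nonactive-site residue, with the depth computed from the crystal structure.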

One alternative approach to estimating burial preferences of amino acids is measuring free energies of transfer of amino acid side-chain analogs from water to cyclohexane (Wolfenden et al. 2015). Another approach is to measure accessible surface areas of the side-chains, averaged over a large database of protein structures, and either infer free energies of transfer from aqueous solution into the protein interior as described previously (Rose et al. 1985) or construct environment-dependent substitution matrices from such data (Overington et al. 1992). Relative ΔΔG's of burial from the first two approaches are shown in supplementary figure S1C and D, Supplementary Material online. Both of these show some qualitative similarities with the mutational data in figure 2A–C, but there are several notable differences. For example, the relative ΔΔG's of burial inferred from the free energy of transfer approaches show that the introduction of W at buried positions is clearly favored over Y and H, unlike the situation for the experimental mutational data. In addition, the transfer data predict that mutation to G and P will be largely tolerated, whereas the experimental mutational data suggest that substitutions to G or P are rarely tolerated. It is also observed in the mutational data sets that, at buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite trend is observed for aromatic residues. In the case of the ΔΔG transfer data, the trend is preserved for polar and charged residues but clearly not for aromatic residues.

Prediction of Mutational Sensitivity Score (MSpred) Using Penalties Derived from the CcdB Data

We further determined whether the above observations regarding substitution preferences could be employed for prediction of functional consequences of individual mutations. To this end, we developed a predictive model using a coherent set of rules derived from a randomly chosen subset of the CcdB mutational data containing 60% of the mutants, and tested its applicability in predicting the mutational sensitivities of the remaining 40% of the mutants as well as of two other proteins, PSD95pdz3 and GB1. The predicted score is denoted as MSpred.

For CcdB mutational data, a mutational sensitivity score (MSseq) of 2 is indicative of wild-type-like behavior in the mutant, and higher values of MSseq indicate higher mutational sensitivity. Therefore, a base MSpred value of 2 was assigned to all the mutants in the test set, and penalties were subsequently added according to the nature of the substitution, taking into account the wildtype residue identity. As exposed nonactive-site positions tolerated almost all substitutions, penalties were calculated only for buried positions. We also observed that buried side-chains that point outwards with respect to the protein core are less sensitive to mutations compared with the ones that point inside. These residues were identified by their side-chain depth values (see "Materials and Methods" section) and were not considered for penalty calculation.

Substitutions were divided into categories based on the nature of the wildtype and mutant residue. Each wildtype and mutant residue was assigned to one of six categories, namely aliphatic, aromatic, polar, charged, G, and P, resulting in a total of 34 (36 − 2 [GG and PP]) types of substitutions. The CcdB data were randomly divided into training (60% of the data) and test sets (40% of the data). The category penalty for each type of substitution was calculated using only the training data set, by averaging the MSseq values observed for each category of substitution and subtracting the base MSpred value of 2 from the average MSseq. Additional "residue-specific penalties" were also derived to account for the residue-size-wise substitution preferences, e.g., smaller polar residues being more destabilizing than larger ones (Materials and Methods; supplementary table S7, Supplementary Material online). Penalties for proline substitutions (both buried and exposed) were derived using the flowchart described previously (Bajaj et al. 2007). Next, MSpred values were calculated for all buried positions based on these penalties (MSpred = 2 + category penalty + residue-specific penalty), and all exposed nonactive-site positions were assigned an MSpred of 2. Active-site residues were not considered in the analysis. The predicted mutational sensitivity scores (MSpred) for the test data set showed a high Pearson's correlation (r = 0.69) with the experimental MSseq values and a SD of 1.26 (table 2). We also derived the Matthews correlation coefficient in order to evaluate the performance of MSpred in classifying mutants as neutral and nonneutral (see "Materials and Methods" section). It was observed to be 0.65 (table 2).
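The scoring rule described above (MSpred = 2 + category penalty + residue-specific penalty for buried positions, and 2 for exposed nonactive-site positions) can be sketched in Python as follows. This is an illustrative sketch, not the published implementation. The assignment of individual amino acids to the six categories, the keying of residue-specific penalties by mutant residue, the function names, and the training values are all assumptions; the proline flowchart and the exclusion of outward-pointing buried side-chains are not modeled.

```python
from collections import defaultdict

BASE_SCORE = 2.0  # MSseq value indicating wild-type-like behavior

# ASSUMED grouping of amino acids into the six categories named in the text.
CATEGORY = {}
for name, residues in {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "charged": "DEKR",
    "G": "G",
    "P": "P",
}.items():
    for aa in residues:
        CATEGORY[aa] = name


def derive_category_penalties(training):
    """Average MSseq per (wildtype category, mutant category) minus the base score.

    training: iterable of (wildtype residue, mutant residue, MSseq) for
    buried positions in the training set.
    """
    observed = defaultdict(list)
    for wt, mut, ms_seq in training:
        observed[(CATEGORY[wt], CATEGORY[mut])].append(ms_seq)
    return {key: sum(vals) / len(vals) - BASE_SCORE for key, vals in observed.items()}


def ms_pred(wt, mut, buried, category_penalty, residue_penalty):
    """Predicted score: base value plus penalties, applied to buried positions only."""
    if not buried:
        return BASE_SCORE
    key = (CATEGORY[wt], CATEGORY[mut])
    return BASE_SCORE + category_penalty.get(key, 0.0) + residue_penalty.get(mut, 0.0)


# Hypothetical training observations, for illustration only.
training = [("V", "D", 8), ("L", "E", 6), ("V", "I", 2)]
penalties = derive_category_penalties(training)
buried_score = ms_pred("I", "K", True, penalties, residue_penalty={})
exposed_score = ms_pred("I", "K", False, penalties, residue_penalty={})
```

Substitution types absent from the training set default to a zero penalty here; that fallback is a choice made for the sketch, not something stated in the text.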

We tested the performance of MSpred on two other proteins. The MSpred values for PSD95pdz3 and GB1 agreed well with the experimental mutational sensitivity data, with Pearson's correlation coefficients of 0.57 and 0.65 and Matthews correlation coefficients of 0.53 and 0.49, respectively (table 2).

We also carried out mutational sensitivity predictions for CcdB, PSD95pdz3, and GB1 using two frequently used methods, SNAP2 (Hecht et al. 2015) and SuSPect (Yates et al. 2014). Both SNAP2 and SuSPect show poorer correlation with the experimental mutational sensitivity data than MSpred (except SuSPect predictions for PSD95pdz3; table 2). Both methods show a very high sensitivity but a very low specificity compared with MSpred. Thus MSpred, which is derived based on very simple rules, compares favorably with the popular machine-learning-based methods SNAP2 and SuSPect. This approach should work to rank order mutational effects at buried sites for other globular proteins. While a three dimensional structure is not essential, it is important to have residue burial information because predictions have been optimized for buried residues. A saturation mutagenesis data set is also not required. However, it is important to have experimental data on the functional effects of multiple point mutants to decide on the cutoff value of MSpred that would result in an observable phenotype. This value would likely depend on factors such as intrinsic protein stability, expression level, and gene essentiality that would vary from one protein to another (Miosge et al. 2015).
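The comparison with SNAP2 and SuSPect relies on a three-way call for each mutant: nonneutral, neutral, or omitted as a low-reliability prediction. A minimal Python sketch of that bookkeeping is given below. It assumes the cutoffs read from the table 2 footnotes (50 and −50 for SNAP2, whose scores run from −100 to 100; 75 and 25 for SuSPect); the function names and the example scores are hypothetical.

```python
def classify(score, nonneutral_above, neutral_below):
    """Return 'nonneutral', 'neutral', or None for a low-reliability score."""
    if score > nonneutral_above:
        return "nonneutral"
    if score < neutral_below:
        return "neutral"
    return None  # falls between the cutoffs and is omitted


# Cutoffs as read from the table 2 footnotes (assumed reconstruction).
SNAP2_CUTOFFS = {"nonneutral_above": 50, "neutral_below": -50}
SUSPECT_CUTOFFS = {"nonneutral_above": 75, "neutral_below": 25}


def confusion_counts(pairs, cutoffs):
    """Tally TP, TN, FP, FN from (predictor score, observed nonneutral?) pairs."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for score, observed_nonneutral in pairs:
        call = classify(score, **cutoffs)
        if call is None:
            continue  # low-reliability predictions are left out
        if call == "nonneutral":
            counts["TP" if observed_nonneutral else "FP"] += 1
        else:
            counts["FN" if observed_nonneutral else "TN"] += 1
    return counts


# Hypothetical SNAP2-style scores paired with experimental calls.
example_pairs = [(80, True), (60, False), (10, True), (-70, False), (-90, True)]
snap2_counts = confusion_counts(example_pairs, SNAP2_CUTOFFS)
```

The resulting counts feed directly into the sensitivity, specificity, accuracy, and Matthews correlation coefficient reported in table 2.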

In Vitro Determined Apparent Tm's Correlate Better with in Vivo Solubility than with Relative Activity Derived from Deep Sequencing

To experimentally probe the molecular basis for mutant phenotypes at nonactive-site positions, around 80 single-site mutants of CcdB were selected from the saturation mutagenesis library (Bajaj et al. 2008) based on MSseq and accessibility class (Adkar et al. 2012) (supplementary table S5, Supplementary Material online). All the mutants were purified by affinity purification against immobilized ligands, GyrA or CcdA. Each purified protein was subjected to thermal denaturation monitored using Sypro orange dye (Niesen et al. 2007), and the apparent Tm was calculated for each mutant (supplementary fig. S3A and table S5, Supplementary Material online). During purification of various CcdB mutants, it is possible that the protein may be inactivated by aggregation or misfolding. Hence, the ability of purified protein to bind CcdA was examined by monitoring the thermal denaturation of each mutant in the absence and presence of a CcdA peptide that contains CcdB binding residues (residues 46–72). If the mutant binds the CcdA peptide, this should result in an increase in its apparent Tm (supplementary fig. S3B, Supplementary Material online) (Fukada et al. 1983; Brandts and Lin 1990; Gonzalez et al. 1999). There were nine mutants, e.g., V05S, V05L, Y06G, and F17D, that did not show an increase in apparent Tm in the presence of CcdA peptide (supplementary table S5, Supplementary Material online), suggesting that these are misfolded or aggregated; hence, these mutants were removed from the analysis. Most of these mutants are largely found in inclusion bodies and have Tm's between 40 °C and 50 °C, in contrast to WT CcdB, which has a Tm of 68.4 °C. Further studies were restricted to the remaining 71 mutants that showed an enhancement in thermal stability in the presence of CcdA peptide (supplementary fig. S3B, Supplementary Material online). Mutants showed a range of apparent Tm's (supplementary table S5, Supplementary Material online). When in vitro determined thermal stability was compared with in vivo phenotypes (MSseq) determined by deep sequencing, a moderate correlation (r = 0.65) was obtained (fig. 3A). However, there were many mutants that showed similar activity but differed substantially in their stability, such as L16S, V18T, D19N, and V54E (supplementary table S5, Supplementary Material online). Conversely, there were also mutants (e.g., V33D, M32N) that showed similar thermal stability to wildtype but had substantially lower activity in vivo. This shows that the in vivo activity of a protein depends on many factors inside a cell which assist in proper folding and maintaining an active conformation. Since the apparent Tm determined by the thermal shift assay may not reflect the true thermodynamic stability of the protein, a subset of 21 mutants was also subjected to GdnHCl chemical denaturation. These mutants were chosen to span a range of Tm and MSseq values. These measurements were done to see if the two measures of stability, i.e., thermal and chemical denaturation, correlate with one another. It was found that both measures of stability were highly correlated (fig. 3C and supplementary table S8, Supplementary Material online).

FIG. 3. Correlation between apparent in vitro Tm, in vivo solubility, and activity (MSseq value) for CcdB mutants. Correlations of ΔTm [Tm (WT) − Tm (Mutant)] for 67 single-site mutants with (A) in vivo activity and (B) in vivo fraction of soluble protein, respectively. (C) Correlation of relative thermal stability (ΔTm) of mutants with ΔΔG° of unfolding estimated by GdnHCl denaturation. (D) Correlation of fraction of protein in the soluble fraction with in vivo activity of mutants.

Various mutations have different effects on protein stability and activity. Properly folded proteins are found in the soluble fraction of the cell lysate, whereas misfolded proteins often form insoluble aggregates called inclusion bodies. Hence, misfolding reduces the amount of active, soluble, and functional protein, though studies have shown that some amount of protein in the soluble fraction can also be misfolded (Liu et al. 2014). To study the relation between in vivo solubility of CcdB mutants and in vitro determined thermal stability, E. coli strain CSH501 (which has a mutation in the gyrA gene and is hence resistant towards CcdB action) was transformed individually with the mutants, and the amount of protein in both the soluble fraction and in inclusion bodies was estimated. Surprisingly, for a few mutants, although very little protein was found in the soluble fraction, these showed an active phenotype with an MSseq of 2 (fig. 3D). Hence, for these mutants, the small amount of protein present in the soluble fraction is properly folded and sufficient to cause cell death in a CcdB-sensitive strain. In some cases, different mutants have similar fractions of soluble protein in vivo but have different in vivo activity and in vitro thermal stability (supplementary table S5, Supplementary Material online). The overall thermal stabilities of mutants correlated well with the in vivo amount of soluble protein (fig. 3B). This indicates that protein stability is an important determinant of proper folding in vivo. The moderate correlation of stability or solubility with in vivo activity likely arises because only a small amount of properly folded, soluble protein is sufficient to result in an active phenotype.

One reason for the lack of a better correlation between solubility and in vivo activity is that, for each mutant, various conformational forms of the protein can partition differently in the soluble and insoluble fractions of the cell lysate. The soluble fraction can comprise both folded protein, which is active, and soluble aggregates/partially misfolded protein, which are inactive (Liu et al. 2014). Moreover, this partitioning can be influenced by perturbations in the cytosolic proteostasis network. To study the relation between in vivo activity and solubility, the ability of four selected CcdB mutants (V33K and Y06G as examples of active but insoluble mutants, and R31G and V80N as examples of soluble but inactive mutants) in the soluble fraction of the cell lysate to bind Gyrase was monitored by surface plasmon resonance (supplementary fig. S4A and B, Supplementary Material online). Mutants with only a small amount of protein in the soluble fraction but displaying an active phenotype in vivo (V33K, Y06G) showed binding to Gyrase comparable to the wild-type in this surface plasmon resonance assay, showing that the protein is well folded. In contrast, in cases where a mutant is mostly in the soluble fraction but shows an inactive phenotype in vivo (R31G, V80N), the in vitro binding with Gyrase was also negligible compared with the wild-type (supplementary fig. S4C, Supplementary Material online).

Refolding and Unfolding Kinetics

Refolding and unfolding kinetics for 10 mutants that have similar thermal stability but different in vivo solubility and activity were monitored by time-course fluorescence spectroscopy at 25 °C. Refolding and unfolding were carried out at pH 7.4 at final GdnHCl concentrations of 0.6 and 3.2 M, respectively. Of the 10 selected mutants, four (V05S, I56G, V18R, and V18H) could not be studied for their refolding profiles due to high precipitation immediately following purification. Further, for these mutants the proportion in the soluble fraction in vivo was low, ranging from 0.1 to 0.3 (supplementary table S5, Supplementary Material online). Of these, V05S and I56G are active (MSseq = 2), whereas V18R and V18H show an inactive phenotype (MSseq of 9 and 6, respectively). Most mutants (except V80N) showed slower refolding kinetics than the wild-type, indicating that these mutants are folding defective (table 3). Refolding for the wild-type occurs with a significant burst phase (k > 0.5 s⁻¹) and a slow phase. Mutants typically show a much smaller burst phase, an intermediate phase, and a slow phase of much higher amplitude than the wildtype. Most mutants show unfolding kinetics similar to the wildtype, except V54E, which shows a much higher unfolding rate. The ability of the refolded mutants to bind to the cognate ligand GyrA or the CcdA peptide (residues 46–72) was also monitored. Binding of refolded mutants to immobilized GyrA on Amine Reactive Second Generation (AR2G) biosensors was monitored using Bio-layer interferometry (Sultana and Lee 2015), and the binding to CcdA peptide was monitored using Thermal Shift Assay (Niesen et al. 2007). Active mutants (L16S, V18T) retained their binding to both GyrA and CcdA upon refolding even though their refolding kinetics was slow (table 3). Surprisingly, V54E, which is also an active mutant, failed to bind GyrA and CcdA upon refolding even though the native protein showed binding (supplementary fig. S6, Supplementary Material online). On the other hand, the inactive mutants R31G and M63N did not bind to GyrA and CcdA after refolding (table 3), showing that their refolded state is nonnative. Interestingly, the native V80N mutant did not show any binding to GyrA, but the refolded protein binds weakly to both ligands. Two of these mutants, V80N and V54E, also show formation of higher order oligomers (supplementary fig. S5, Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.
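To make the kinetic comparison concrete, the refolding traces in table 3 are described by a burst phase followed by fast and slow exponential phases. The Python sketch below evaluates the fraction of the refolding signal change still outstanding at a given time. It assumes that the table 3 amplitudes are relative fractions of the total signal change and that the burst phase is complete within the dead time; the function name and the choice of a 30 s time point are illustrative, and this is a sketch of the functional form only, not the fitting procedure used in the study.

```python
import math


def fraction_remaining(t, a1, k1, a2, k2):
    """Fraction of the refolding signal change not yet completed at time t (s).

    a1, k1: relative amplitude and rate constant (s^-1) of the fast phase.
    a2, k2: relative amplitude and rate constant (s^-1) of the slow phase.
    The burst-phase amplitude (1 - a1 - a2) is assumed to be already complete.
    """
    return a1 * math.exp(-k1 * t) + a2 * math.exp(-k2 * t)


# Amplitudes and rate constants as read from table 3.
l16s = {"a1": 0.72, "k1": 0.07, "a2": 0.24, "k2": 0.02}
# For WT the fast phase falls within the burst, so only the slow phase remains.
wt = {"a1": 0.0, "k1": 0.0, "a2": 0.16, "k2": 0.046}

l16s_at_30s = fraction_remaining(30.0, **l16s)
wt_at_30s = fraction_remaining(30.0, **wt)
```

Under these assumptions, L16S still has a larger share of its refolding signal outstanding after 30 s than the wild-type, in line with the slower refolding reported for most mutants.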

Over-Expression of Chaperones Rescues Folding Defects of Mutants

Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone- and protease-deleted strains as well as chaperone over-expressing strains (see "Materials and Methods" section). Eleven CcdB mutants with a range of solubility and activity were chosen to study if the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and activity of the mutants. Of these mutants, L16S, V33K, L36K, and V80N had low Tm's (<55 °C) but they differ in their in vivo activity, whereas mutants G29W, D67P, and V73F show a higher Tm (>56 °C) but are inactive. In vivo activity of these mutants was monitored both in chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed in different strains, and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL, and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in the activity in protease deletion strains BWΔlon, BWΔclpP, and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on the activity was not observed, probably due to direct involvement of proteases in the process of CcdB-mediated cell death (Van Melderen et al. 1996). Many of these proteases have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed a substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that for many of these mutants, inactivity primarily results from folding defects which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to wildtype but lower solubility (V73F, D67P, and G29W). This further indicates that defects in folding rather than stability are the primary causes for inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations shown in E. coli enzymes during in vitro mutational drift experiments but can also have significant effects on E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding, and that once misfolding occurs it cannot be rescued by the ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because, for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed, as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants (a, b).

| Mutant | Fraction soluble | MSseq | ΔTm (WT − mutant) (°C) | Refolding a0 | Refolding fast phase a1 | Refolding fast phase k1 (s⁻¹) | Refolding slow phase a2 | Refolding slow phase k2 (s⁻¹) | Unfolding A0 | Unfolding A1 | Unfolding k1 (s⁻¹) | CcdA binding to refolded protein (TSA) | Gyrase binding to refolded protein (BLI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| L16S | 0.4 | 2 | 16.7 | 0.04 | 0.72 | 0.07 | 0.24 | 0.02 | 0.83 | 0.17 | 0.06 | ++++ | +++ |
| V18T | 0.7 | 2 | 9 | 0.04 | 0.7 | 0.1 | 0.26 | 0.02 | 0.8 | 0.2 | 0.16 | ++++ | ++++ |
| R31G | 0.6 | 6 | 11 | 0.05 | 0.8 | 0.2 | 0.15 | 0.02 | 0.85 | 0.15 | 0.02 | – (c) | – (c) |
| V54E | 0.4 | 2 | 14.5 | 0.14 | 0.17 | 0.28 | 0.68 | 0.04 | 1 | – | – | – (c) | – (c) |
| M63N | 0.2 | 6 | 15.2 | 0.15 | – | – | 0.85 | 0.08 | 0.84 | 0.16 | 0.07 | – (c) | – (c) |
| V80N | 0.8 | 6 | 17.5 | 0.8 | – | >0.5 | 0.2 | 0.04 | 0.76 | 0.24 | 0.07 | + | + |
| WT | 1 | 2 | – | 0.84 | – | >0.5 | 0.16 | 0.046 | 0.62 | 0.38 | 0.04 | ++++ | ++++ |

(a) The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9, and 6, respectively.
(b) The traces were fit to a 5-parameter equation for exponential decay for refolding ($f = y_0 + a e^{-bx} + c e^{-dx}$), yielding fast (k1) and slow phase rate constants (k2) with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding ($f = y_0 + a e^{-bx}$), yielding the rate constant k1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were 10% of the measured experimental value.
(c) No observable binding.

FIG. 4. In vivo activity and solubility of CcdB mutants in presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone-deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein, for cells grown at 37 °C and induced for CcdB with 0.2% arabinose, in both supernatant (soluble) and pellet (insoluble), with or without over-expression of chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants are shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.

Discussion

Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure, stability, and function, and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While there have been other saturation mutagenesis studies published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels and the effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped on the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage-dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA-Gyrase (Bajaj et al. 2008). A similar observation was made in another study, which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of TEM-1 β-lactamase protein, it has been found that deleterious effects of mutations primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues, virtually all mutations are tolerated. At a few highly exposed positions (>40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by about 5 kJ/mol and suggested that loss of packing interactions is the major contributor to the increase in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains, and suggests that while small increases in volume can be tolerated, changes in shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large-to-small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small-to-large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size, of up to three methylene groups, can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current ProTherm database (Kumar et al. 2006) (http://www.abren.net/protherm/; last accessed 31 August 2016), 4,805 single buried-site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic-to-aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic-to-aromatic substitutions were available. About 50 aliphatic-to-aliphatic and 8 aliphatic-to-aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data, for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those of other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity score (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing-based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

         Strain                                                          Fractional increase in solubility(a)
Mutant   BW25113   BWΔtig   BWΔsecB   BWpTig   BWpSecB   Fraction soluble   Tig    SecB
WT       1         1        1         1        1         1                  1      1
L16S     4         7        7         2        3         0.4                1.5    1.7
G29W     8         8        8         2        4         0.6                1.2    1.1
M32N     4         6        6         2        3         0.1                3      2
V33K     4         7        6         2        3         0.1                2      2
P35I     8         7        8         5        5         0.6                1.3    1.1
L36K     8         8        8         3        5         0.05               4      2
L41F     7         8        8         3        3         0.4                1.8    2.5
D67P     6         8        8         2        4         0.2                0.8    0.5
S70W     6         8        8         2        4         0.5                1      0.4
V73F     7         7        8         2        3         0.5                2      1.3
V80N     6         6        7         3        4         0.6                1.3    1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
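The ratio defined in the table footnote can be written out as a small helper. This is only an illustrative sketch: the function name and the example numbers are hypothetical placeholders, not measurements from the study.

```python
def fractional_increase(soluble_with_chaperone: float, soluble_normal: float) -> float:
    """Ratio of the soluble fraction with over-expressed chaperone
    to the soluble fraction under normal conditions (table footnote a)."""
    if soluble_normal <= 0:
        raise ValueError("soluble fraction under normal conditions must be positive")
    return soluble_with_chaperone / soluble_normal


# Placeholder numbers (assumed for illustration): a soluble fraction
# rising from 0.4 to 0.6 when a chaperone is over-expressed.
ratio = fractional_increase(0.6, 0.4)
print(round(ratio, 2))
```

A ratio above 1 indicates that chaperone over-expression increased the soluble fraction; a ratio below 1 indicates a decrease.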


other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate with Gyrase, using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well-folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had unfolding rates similar to wildtype.
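For readers who want to see how a correlation coefficient of the kind quoted above is obtained, the following is a minimal sketch of Pearson's r in plain Python. The function name and the input values are made up for illustration; they are not data from this study, and the authors' actual analysis software is not specified here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values: melting temperatures (deg C) and
# soluble fractions for five imaginary mutants.
tm = [45.0, 52.0, 58.0, 61.0, 66.0]
soluble = [0.1, 0.3, 0.5, 0.6, 0.9]
r = pearson_r(tm, soluble)
```

The same function can be applied to any paired measurements, for example stability versus in vivo phenotype, to compare how strongly each pair co-varies.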

The ability of a mutant to fold to the native state is affected by many parameters that include the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms which are involved in degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in the activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of single nucleotide polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods
Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation
Previously, a total of 1,430 single-site mutants of CcdB (~75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent, nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T, in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
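The coverage of the NNK degenerate codon can be checked by direct enumeration. The sketch below is an illustration rather than part of the published protocol: it builds the standard genetic code from its conventional compact encoding, lists every NNK codon, and reports which amino acids and stop codons they encode. All variable and function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def nnk_codons():
    """All codons matching NNK: N = A/C/G/T at positions 1-2, K = G/T at position 3."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons}
amino_acids = sorted(encoded - {"*"})
stops = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), "codons,", len(amino_acids), "amino acids, stops:", stops)
```

Restricting the third position to G or T keeps all 20 amino acids accessible while excluding two of the three stop codons, which is the usual reason for choosing NNK over fully random NNN codons.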

In Vivo Activity of Individual Single-Site Mutants
Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose, at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance
Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000g, 10 min, 4 °C. Various dilutions of supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases
Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), ClpB, DnaK, DnaJ, and GroEL (all ATP-dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases, Lon, ClpP, HslU, HslV, and HchA, were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose, at 30 °C, as many of these strains are temperature sensitive. In case of chaperone over-expression strains, the medium used for recovery following transformation was LB + Chl (35 µg/ml), as the chaperone-expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB + Amp plates containing 0.5 mM IPTG to induce chaperone expression, and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition where each of the mutants became active in presence or absence of the chaperone was tabulated.
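The plate read-out described above can be summarized as a single number per mutant: the first expression condition at which colonies no longer appear. The following is a hypothetical sketch of that bookkeeping. The function name, the 1-based numbering, and the use of None for mutants that grow under every condition are assumptions made for illustration, not details taken from the text.

```python
def active_condition(colonies_present):
    """Return the 1-based index of the first condition with no colonies.

    colonies_present: list of booleans, one per plate condition, ordered
    from most repressed (highest glucose) to most induced (highest
    arabinose). Active CcdB kills the cells, so absence of colonies
    marks the condition at which the mutant became active.
    Returns None if colonies grow under every condition (an assumed
    convention for mutants that never become active).
    """
    for index, grew in enumerate(colonies_present, start=1):
        if not grew:
            return index
    return None


# Made-up example: colonies on the two most repressed plates only.
example = [True, True, False, False, False, False, False]
print(active_condition(example))
```

Because activity depends monotonically on expression level, scanning from the most repressed condition upward and stopping at the first plate without colonies is sufficient in this sketch.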

Supplementary Material
Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NO. BT/COE/34/SP15219/2015, DT. 20/11/2015) and the Department of Science and Technology, Government of India (grant number F. No. SB/SO/BB-0099/2013, DTD. 24.6.14). The authors declare that no competing interests exist.

References

Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing and protein structure. Annu Rev Biophys Bioeng 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci 79:19.25.11–19.25.26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol 356:1263–1274.


of these proteins to bind their cognate ligands is quantitatively linked to a phenotypic readout. In all the three cases, average phenotypic effect was observed to increase with residue depth (correlation coefficients of 0.85, 0.73, and 0.61 for CcdB, PSD95pdz3, and GB1, respectively) (fig. 2D–F). Buried positions with small (A or G) wildtype residues were not included in the correlations. These positions are unusually sensitive to mutation because all substitutions result in large steric overlap. These data suggest that a large fraction of the average sensitivity to mutation at nonactive-site residues is governed by a single parameter, the residue depth. This is a remarkably simple metric that provides an alternative to the sector-based models used to analyze mutational data for PSD95pdz3 as well as other proteins (McLaughlin et al. 2012).

One alternative approach to estimating burial preferences of amino acids is measuring free energies of transfer of amino acid side-chain analogs from water to cyclohexane (Wolfenden et al. 2015). Another approach is to measure accessible surface areas of the side-chains, averaged over a large database of protein structures, and either infer free energies of transfer from aqueous solution into the protein interior, as described previously (Rose et al. 1985), or construct environment-dependent substitution matrices from such data (Overington et al. 1992). Relative ΔΔG's of burial from the first two approaches are shown in supplementary figure S1C and D, Supplementary Material online. Both of these show some qualitative similarities with the mutational data in figure 2A–C, but there are several notable differences. For example, the relative ΔΔG's of burial inferred from the free energy of transfer approaches show that the introduction of W at buried positions is clearly favored over Y and H, unlike the situation for the experimental mutational data. In addition, the transfer data predict that mutation to G and P will be largely tolerated, whereas the experimental mutational data suggest that substitutions to G or P are rarely tolerated. It is also observed in the mutational data-sets that at buried sites, smaller charged and polar residues are disfavored relative to larger ones, whereas the opposite trend is observed for aromatic residues. In the case of the ΔΔG transfer data, the trend is preserved for polar and charged residues, but clearly not for aromatic residues.

Prediction of Mutational Sensitivity Score (MSpred) Using Penalties Derived from the CcdB Data

We further determined whether the above observations regarding substitution preferences could be employed for prediction of functional consequences of individual mutations. To this end, we developed a predictive model using a coherent set of rules derived from a randomly chosen subset of the CcdB mutational data containing 60% of the mutants, and tested its applicability in predicting the mutational sensitivities of the remaining 40% of the mutants, as well as those of two other proteins, PSD95pdz3 and GB1. The predicted score is denoted as MSpred.

For CcdB mutational data, a mutational sensitivity score (MSseq) of 2 is indicative of wild-type like behavior in the mutant, and higher values of MSseq indicate higher mutational sensitivity. Therefore, a base MSpred value of 2 was assigned to all the mutants in the test set, and penalties were subsequently added according to the nature of the substitution, taking into account the wildtype residue identity. As exposed nonactive-site positions tolerated almost all substitutions, penalties were calculated only for buried positions. We also observed that buried side-chains that point outwards with respect to the protein core are less sensitive to mutations compared with the ones that point inside. These residues were identified by their side-chain depth values (see “Materials and Methods” section) and were not considered for penalty calculation.

Substitutions were divided into categories based on the nature of the wildtype and mutant residue. Each wildtype and mutant residue was assigned to one of six categories, namely aliphatic, aromatic, polar, charged, G and P, resulting in a total of 34 (36 − 2 [G→G and P→P]) types of substitutions. The CcdB data was randomly divided into training (60% of the data) and test sets (40% of the data). The category penalty for each type of substitution was calculated using only the training data set, by averaging the MSseq values observed for each category of substitution and subtracting the base MSpred value of 2 from the average MSseq. Additional “residue-specific penalties” were also derived to account for the residue-size-wise substitution preferences, e.g., smaller polar residues being more destabilizing than larger ones (Materials and Methods; supplementary table S7, Supplementary Material online). Penalties for proline substitutions (both buried and exposed) were derived using the flowchart described previously (Bajaj et al. 2007). Next, MSpred values were calculated for all buried positions based on these penalties (MSpred = 2 + category penalty + residue-specific penalty), and all exposed nonactive-site positions were assigned an MSpred of 2. Active-site residues were not considered in the analysis. The predicted mutational sensitivity scores (MSpred) for the test data set showed a high Pearson's correlation (r = 0.69) with the experimental MSseq values and a SD of 1.26 (table 2). We also derived the Matthews correlation coefficient in order to evaluate the performance of MSpred in classifying mutants as neutral and nonneutral (see “Materials and Methods” section). It was observed to be 0.65 (table 2).
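The penalty scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure as implemented for the study: the assignment of individual residues to the aliphatic, aromatic, polar and charged categories, the zero penalty used for substitution types absent from the training data, and the function names are all hypothetical, and the residue-specific and proline penalties are reduced to a single optional argument.

```python
from collections import defaultdict
from statistics import mean

BASE_SCORE = 2  # MSseq of a wild-type-like mutant, as stated in the text

# The six categories are named in the text; which residue belongs to which of
# the first four categories is an ASSUMPTION made only for this sketch.
CATEGORY = {}
for name, residues in {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "charged": "DEKR",
    "G": "G",
    "P": "P",
}.items():
    for aa in residues:
        CATEGORY[aa] = name


def category_penalties(training):
    """Derive penalties from (wt_residue, mutant_residue, MSseq) tuples
    observed at buried positions of the training set."""
    grouped = defaultdict(list)
    for wt, mut, ms_seq in training:
        grouped[(CATEGORY[wt], CATEGORY[mut])].append(ms_seq)
    # Penalty = average MSseq of the substitution type minus the base score.
    return {key: mean(values) - BASE_SCORE for key, values in grouped.items()}


def ms_pred(wt, mut, buried, penalties, residue_penalty=0.0):
    """MSpred = 2 + category penalty + residue-specific penalty.
    Exposed nonactive-site positions keep the base score of 2."""
    if not buried:
        return BASE_SCORE
    # Unseen substitution types get no penalty here (a choice of this sketch).
    category_penalty = penalties.get((CATEGORY[wt], CATEGORY[mut]), 0.0)
    return BASE_SCORE + category_penalty + residue_penalty
```

For example, with hypothetical training values in which two buried aliphatic to charged substitutions have MSseq of 8 and 6, the category penalty is 5 and a new buried aliphatic to charged mutant is assigned an MSpred of 7.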

We tested the performance of MSpred on two other proteins. The MSpred values for PSD95pdz3 and GB1 agreed well with the experimental mutational sensitivity data, with Pearson's correlation coefficients of 0.57 and 0.65 and Matthews correlation coefficients of 0.53 and 0.49, respectively (table 2).
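For readers who wish to reproduce this kind of comparison on their own data, Pearson's correlation between predicted and experimental scores can be computed as below. The sketch uses only the Python standard library; the function name and the example values in the usage note are hypothetical and are not taken from table 2.

```python
from math import sqrt


def pearson_r(predicted, observed):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    covariance = sum((p - mean_p) * (o - mean_o)
                     for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return covariance / sqrt(var_p * var_o)
```

Calling `pearson_r(ms_pred_values, ms_seq_values)` on matched lists of predicted and experimental scores returns a value between −1 and 1; the function assumes that neither list is constant.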

We also carried out mutational sensitivity predictions for CcdB, PSD95pdz3 and GB1 using two frequently used methods, SNAP2 (Hecht et al. 2015) and SuSPect (Yates et al. 2014). Both SNAP2 and SuSPect show poorer correlation with the experimental mutational sensitivity data than MSpred (except SuSPect predictions for PSD95pdz3; table 2). Both methods show a very high sensitivity but a very low specificity compared with MSpred. Thus MSpred, which is derived from very simple rules, compares favorably with the popular machine learning based methods SNAP2 and SuSPect. This approach should work to rank order mutational effects at buried sites for other globular proteins. While a three dimensional structure is not essential, it is important to have residue burial information, because predictions have been optimized for buried residues. A saturation mutagenesis data set is also not required. However, it is important to have experimental data on the functional effects of multiple point mutants to decide on the cutoff value of MSpred that would result in an observable phenotype. This value would likely depend on factors such as intrinsic protein stability, expression level and gene essentiality, which would vary from one protein to another (Miosge et al. 2015).
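The sensitivity, specificity and Matthews correlation coefficient used in this comparison all follow from a confusion matrix built at a chosen score cutoff. The sketch below assumes that scores above the cutoff are called nonneutral; the cutoff itself is a hypothetical parameter, since the value used in the study is given in Materials and Methods and not in this section.

```python
from math import sqrt


def classification_metrics(predicted, observed, cutoff):
    """Classify scores above `cutoff` as nonneutral (positive) and compare
    predicted with observed calls. `cutoff` is a hypothetical parameter."""
    tp = fp = tn = fn = 0
    for p, o in zip(predicted, observed):
        if p > cutoff and o > cutoff:
            tp += 1
        elif p > cutoff:
            fp += 1
        elif o > cutoff:
            fn += 1
        else:
            tn += 1
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        # Fraction of truly nonneutral mutants that are predicted nonneutral.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of truly neutral mutants that are predicted neutral.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Matthews correlation coefficient; set to 0 when undefined.
        "mcc": (tp * tn - fp * fn) / denominator if denominator else 0.0,
    }
```

A predictor that calls nearly every mutant nonneutral reaches a sensitivity close to 1 while its specificity stays low, which is the pattern described above for SNAP2 and SuSPect.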

In Vitro Determined Apparent Tm's Correlate Better with in Vivo Solubility than with Relative Activity Derived from Deep Sequencing

To experimentally probe the molecular basis for mutant phenotypes at nonactive-site positions, around 80 single-site mutants of CcdB were selected from the saturation mutagenesis library (Bajaj et al. 2008) based on MSseq and accessibility class (Adkar et al. 2012) (supplementary table S5, Supplementary Material online). All the mutants were purified by affinity purification against immobilized ligands, GyrA or CcdA. Each purified protein was subjected to thermal denaturation monitored using Sypro orange dye (Niesen et al. 2007), and the apparent Tm was calculated for each mutant (supplementary fig. S3A and table S5, Supplementary Material online). During purification of various CcdB mutants, it is possible that the protein may be inactivated by aggregation or misfolding. Hence, the ability of purified protein to bind CcdA was examined by monitoring the thermal denaturation of each mutant in the absence and presence of a CcdA peptide that contains CcdB binding residues (residues 46–72). If the mutant binds the CcdA peptide, this should result in an increase in its apparent Tm (supplementary fig. S3B, Supplementary Material online) (Fukada et al. 1983; Brandts and Lin 1990; Gonzalez et al. 1999). There were nine mutants, e.g., V05S, V05L, Y06G and F17D, that did not show an increase in apparent Tm in the presence of CcdA peptide (supplementary table S5, Supplementary Material online), suggesting that these are misfolded or aggregated; hence these mutants were removed from the analysis. Most of these mutants are largely found in inclusion bodies and have Tm's between 40 °C and 50 °C, in contrast to WT CcdB, which has a Tm of 68.4 °C. Further studies were restricted to the remaining 71 mutants that showed an enhancement in thermal stability in the presence of CcdA peptide (supplementary fig. S3B, Supplementary Material online). Mutants showed a range of apparent Tm's (supplementary table S5, Supplementary Material online). When in vitro determined thermal stability was compared with in vivo phenotypes (MSseq) determined by deep sequencing, a moderate correlation (r = 0.65) was obtained (fig. 3A). However, there were many mutants that showed similar activity but differed substantially in their stability, such as L16S, V18T, D19N and V54E (supplementary table S5, Supplementary Material online). Conversely, there were also mutants (e.g., V33D, M32N) that showed similar thermal stability to wildtype but had substantially lower activity in vivo. This shows that the in vivo activity of a protein depends on many factors inside a cell which assist in proper folding and maintaining an active conformation. Since the apparent Tm determined by the thermal shift assay may not reflect the true thermodynamic stability of the protein, a subset of 21 mutants was also subjected to GdnHCl chemical denaturation. These mutants were chosen to span a range of Tm and MSseq values. These measurements were done to see if the two measures of stability, i.e., thermal and chemical denaturation, correlate with one another. It was found that both measures of stability were highly correlated (fig. 3C and supplementary table S8, Supplementary Material online).

FIG. 3. Correlation between apparent in vitro Tm, in vivo solubility and activity (MSseq value) for CcdB mutants. Correlations of ΔTm [Tm (WT) − Tm (Mutant)] for 67 single-site mutants with (A) in vivo activity and (B) in vivo fraction of soluble protein, respectively. (C) Correlation of relative thermal stability (ΔTm) of mutants with ΔΔG° of unfolding estimated by GdnHCl denaturation. (D) Correlation of fraction of protein in the soluble fraction with in vivo activity of mutants.
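The peptide-binding filter and the ΔTm calculation used in this analysis amount to simple arithmetic on the measured melting temperatures. The following sketch illustrates the logic; the minimum shift of 1 °C that counts as an increase, the function names and the mutant values in the usage note are hypothetical, and only the wild-type Tm of 68.4 °C is taken from the text.

```python
WT_TM = 68.4  # apparent Tm of wild-type CcdB in °C, as stated in the text


def delta_tm(tm_mutant, tm_wt=WT_TM):
    """ΔTm = Tm(WT) - Tm(mutant); positive values indicate destabilization."""
    return tm_wt - tm_mutant


def binds_peptide(tm_free, tm_with_peptide, min_shift=1.0):
    """True if the apparent Tm rises by at least `min_shift` °C with peptide.
    The threshold is a hypothetical choice for this sketch."""
    return (tm_with_peptide - tm_free) >= min_shift


def filter_folded(mutants, min_shift=1.0):
    """mutants maps a name to (Tm without peptide, Tm with peptide).
    Returns ΔTm only for mutants whose Tm increases on peptide binding."""
    return {
        name: delta_tm(tm_free)
        for name, (tm_free, tm_bound) in mutants.items()
        if binds_peptide(tm_free, tm_bound, min_shift)
    }
```

With hypothetical inputs such as `{"mutA": (51.7, 56.0), "mutB": (45.0, 45.2)}`, only `mutA` is retained, with a ΔTm of about 16.7 °C, while `mutB` would be excluded as presumably misfolded or aggregated.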

Various mutations have different effects on protein stability and activity. Properly folded proteins are found in the soluble fraction of the cell lysate, whereas misfolded proteins often form insoluble aggregates called inclusion bodies. Hence, misfolding reduces the amount of active, soluble and functional protein, though studies have shown that some amount of protein in the soluble fraction can also be misfolded (Liu et al. 2014). To study the relation between the in vivo solubility of CcdB mutants and in vitro determined thermal stability, E. coli strain CSH501 (which has a mutation in the gyrA gene and is hence resistant to CcdB action) was transformed individually with the mutants, and the amount of protein in both the soluble fraction and in inclusion bodies was estimated. Surprisingly, for a few mutants, although very little protein was found in the soluble fraction, these showed an active phenotype with an MSseq of 2 (fig. 3D). Hence, for these mutants, the small amount of protein present in the soluble fraction is properly folded and sufficient to cause cell death in a CcdB sensitive strain. In some cases, different mutants have similar fractions of soluble protein in vivo but have different in vivo activity and in vitro thermal stability (supplementary table S5, Supplementary Material online). The overall thermal stabilities of mutants correlated well with the in vivo amount of soluble protein (fig. 3B). This indicates that protein stability is an important determinant of proper folding in vivo. The moderate correlation of stability or solubility with in vivo activity likely arises because only a small amount of properly folded, soluble protein is sufficient to result in an active phenotype.

One reason for the lack of a better correlation between solubility and in vivo activity is that, for each mutant, various conformational forms of the protein can partition differently in the soluble and insoluble fractions of the cell lysate. The soluble fraction can comprise both folded protein, which is active, and soluble aggregates/partially misfolded protein, which are inactive (Liu et al. 2014). Moreover, this partitioning can be influenced by perturbations in the cytosolic proteostasis network. To study the relation between in vivo activity and solubility, the ability of four selected CcdB mutants (V33K and Y06G as examples of active but insoluble mutants, and R31G and V80N as examples of soluble but inactive mutants) in the soluble fraction of the cell lysate to bind Gyrase was monitored by surface plasmon resonance (supplementary fig. S4A and B, Supplementary Material online). Mutants with only a small amount of protein in the soluble fraction but displaying an active phenotype in vivo (V33K, Y06G) showed binding to Gyrase comparable to the wild-type in this surface plasmon resonance assay, showing that the protein is well folded. In contrast, in cases where a mutant is mostly in the soluble fraction but shows an inactive phenotype in vivo (R31G, V80N), the in vitro binding with Gyrase was also negligible compared with the wild-type (supplementary fig. S4C, Supplementary Material online).

Refolding and Unfolding Kinetics

Refolding and unfolding kinetics for 10 mutants that have similar thermal stability but different in vivo solubility and activity were monitored by time-course fluorescence spectroscopy at 25 °C. Refolding and unfolding were carried out at pH 7.4, at final GdnHCl concentrations of 0.6 and 3.2 M, respectively. Of the 10 selected mutants, four (V05S, I56G, V18R and V18H) could not be studied for their refolding profiles due to high precipitation immediately following purification. Further, for these mutants the proportion in the soluble fraction in vivo was low, ranging from 0.1 to 0.3 (supplementary table S5, Supplementary Material online). Of these, V05S and I56G are active (MSseq = 2), whereas V18R and V18H show an inactive phenotype (MSseq of 9 and 6, respectively). Most mutants (except V80N) showed slower refolding kinetics than the wild-type, indicating that these mutants are folding defective (table 3). Refolding for the wild-type occurs with a significant burst phase (k > 0.5 s−1) and a slow phase. Mutants typically show a much smaller burst phase, an intermediate phase, and a slow phase of much higher amplitude than the wildtype. Most mutants show unfolding kinetics similar to the wildtype, except V54E, which shows a much higher unfolding rate. The ability of the refolded mutants to bind to the cognate ligand GyrA or the CcdA peptide (residues 46–72) was also monitored. Binding of refolded mutants to immobilized GyrA on Amine Reactive Second Generation (AR2G) biosensors was monitored using Bio-layer interferometry (Sultana and Lee 2015), and the binding to CcdA peptide was monitored using the Thermal Shift Assay (Niesen et al. 2007). Active mutants (L16S, V18T) retained their binding to both GyrA and CcdA upon refolding, even though their refolding kinetics was slow (table 3). Surprisingly, V54E, which is also an active mutant, failed to bind GyrA and CcdA upon refolding, even though the native protein showed binding (supplementary fig. S6, Supplementary Material online). On the other hand, the inactive mutants R31G and M63N did not bind to GyrA and CcdA after refolding (table 3), showing that their refolded state is nonnative. Interestingly, the native V80N mutant did not show any binding to GyrA, but the refolded protein binds weakly to both ligands. Two of these mutants, V80N and V54E, also show formation of higher order oligomers (supplementary fig. S5, Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.
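The footnote to table 3 states that refolding traces were fit to a 5-parameter double exponential and unfolding traces to a 3-parameter single exponential. The sketch below writes out these two model functions so that the meaning of the amplitudes and rate constants is explicit. It assumes negative exponents (the signs are lost in the extracted text) and a negative amplitude for a rising unfolding trace; the function names and the parameter values in the usage note are illustrative only.

```python
from math import exp, log


def refolding_signal(t, y0, a, b, c, d):
    """5-parameter double exponential decay: f = y0 + a*exp(-b*t) + c*exp(-d*t).
    a and b are the fast-phase amplitude and rate constant (a1, k1);
    c and d are the slow-phase amplitude and rate constant (a2, k2)."""
    return y0 + a * exp(-b * t) + c * exp(-d * t)


def unfolding_signal(t, y0, a, b):
    """3-parameter single exponential: f = y0 + a*exp(-b*t).
    With a < 0 the signal rises from y0 + a at t = 0 towards y0."""
    return y0 + a * exp(-b * t)


def half_life(k):
    """Half-life in seconds of a first-order phase with rate constant k (s^-1)."""
    return log(2) / k
```

As an illustration, a fast phase with a rate constant of 0.07 s−1 has a half-life of about 10 s, whereas a slow phase with a rate constant of 0.02 s−1 has a half-life of about 35 s. Any amplitude missing at time zero relative to the total expected change corresponds to the burst phase, which is complete within the dead time of the measurement.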

Over-Expression of Chaperones Rescues Folding Defects of Mutants

Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone and protease deleted strains as well as chaperone over-expressing strains (see “Materials and Methods” section). Eleven CcdB mutants with a range of solubility and activity were chosen to study if the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and activity of the mutants. Of these mutants, L16S, V33K, L36K and V80N had low Tm's (<55 °C) but differ in their in vivo activity, whereas mutants G29W, D67P and V73F show a higher Tm (>56 °C) but are inactive. In vivo activity of these mutants was monitored in both chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed in different strains, and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in activity in the protease deletion strains BWΔlon, BWΔclpP and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on the activity was not observed, probably due to direct involvement of proteases in the process of CcdB mediated cell death (Van Melderen et al. 1996). Many of these proteases have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed a substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that for many of these mutants, inactivity primarily results from folding defects which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to wildtype but lower solubility (V73F, D67P and G29W). This further indicates that defects in folding rather than stability are the primary causes for inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations shown in E. coli enzymes during in vitro mutational drift experiments, but can have significant effects on E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding, and that once misfolding occurs, it cannot be rescued by the ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because, for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed, as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants (a, b).

Columns: Mutant; fraction soluble; MSseq; ΔTm (WT − mutant) (°C); refolding burst amplitude a0, fast phase a1 and k1 (s−1), slow phase a2 and k2 (s−1); unfolding A0, A1 and K1 (s−1); CcdA binding to refolded protein (TSA); Gyrase binding to refolded protein (BLI).

Mutant  Fraction soluble  MSseq  ΔTm   a0    a1    k1    a2    k2     A0    A1    K1    CcdA (TSA)  Gyrase (BLI)
L16S    0.4               2      16.7  0.04  0.72  0.07  0.24  0.02   0.83  0.17  0.06  ++++        +++
V18T    0.7               2      9     0.04  0.7   0.1   0.26  0.02   0.8   0.2   0.16  ++++        ++++
R31G    0.6               6      11    0.05  0.8   0.2   0.15  0.02   0.85  0.15  0.02  – (c)       – (c)
V54E    0.4               2      14.5  0.14  0.17  0.28  0.68  0.04   1     –     –     – (c)       – (c)
M63N    0.2               6      15.2  0.15  –     –     0.85  0.08   0.84  0.16  0.07  – (c)       – (c)
V80N    0.8               6      17.5  0.8   –     >0.5  0.2   0.04   0.76  0.24  0.07  +           +
WT      1                 2      –     0.84  –     >0.5  0.16  0.046  0.62  0.38  0.04  ++++        ++++

(a) The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9 and 6, respectively.
(b) The traces were fit to a 5-parameter equation for exponential decay for refolding (f = y0 + a·e^(−bx) + c·e^(−dx)), yielding fast (k1) and slow phase rate constants (k2) with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding (f = y0 + a·e^(−bx)), yielding the rate constant k1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were within 10% of the measured experimental value.
(c) No observable binding.

FIG. 4. In vivo activity and solubility of CcdB mutants in presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein, for cells grown at 37 °C and induced for CcdB with 0.2% arabinose, in both supernatant (soluble) and pellet (insoluble), with or without over-expression of chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants is shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.

Discussion

Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure, stability and function, and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While there have been other saturation mutagenesis studies published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels and the effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped on the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA Gyrase (Bajaj et al. 2008). A similar observation was made in another study, which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of the TEM-1 β-lactamase protein, it has been found that deleterious effects of mutations primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues, virtually all mutations are tolerated. At a few highly exposed positions (≥40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by 5 kJ/mol and suggested that loss of packing interactions is the major contributor to this decrease in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains, and suggests that while small increases in volume can be tolerated, changes in shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size of up to three methylene groups can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current Protherm database (Kumar et al. 2006) (http://www.abren.net/protherm/, last accessed 31 August 2016), 4,805 single buried site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation–π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those for other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity score (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

        Strain                                        Fraction  Fractional increase in solubility (a)
Mutant  BW25113  BWΔtig  BWΔsecB  BWpTig  BWpSecB     soluble   Tig    SecB
WT      1        1       1        1       1           1         1      1
L16S    4        7       7        2       3           0.4       1.5    1.7
G29W    8        8       8        2       4           0.6       1.2    1.1
M32N    4        6       6        2       3           0.1       3      2
V33K    4        7       6        2       3           0.1       2      2
P35I    8        7       8        5       5           0.6       1.3    1.1
L36K    8        8       8        3       5           0.05      4      2
L41F    7        8       8        3       3           0.4       1.8    2.5
D67P    6        8       8        2       4           0.2       0.8    0.5
S70W    6        8       8        2       4           0.5       1      0.4
V73F    7        7       8        2       3           0.5       2      1.3
V80N    6        6       7        3       4           0.6       1.3    1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
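The fractional increase in solubility reported in table 4 is defined in its footnote as a simple ratio of soluble fractions. A minimal sketch of that calculation follows; the function name and the input values in the usage note are hypothetical.

```python
def fractional_increase(soluble_with_chaperone, soluble_normal):
    """Ratio of the soluble fraction with over-expressed chaperone to the
    soluble fraction under normal conditions (table 4, footnote a)."""
    if soluble_normal <= 0:
        raise ValueError("soluble fraction under normal conditions must be > 0")
    return soluble_with_chaperone / soluble_normal
```

For instance, a mutant whose soluble fraction rises from 0.4 to 0.6 upon chaperone over-expression has a fractional increase of 1.5, whereas a value below 1 indicates that solubility decreased.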

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate with Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had similar unfolding rates to wildtype.

The ability of a mutant to fold to the native state is affected by many parameters that include the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms which are involved in degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in the activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of Single Nucleotide Polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods
Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation
Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent, nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T, in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
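The coverage offered by an NNK codon can be checked by enumeration. The following is a minimal sketch, not part of the published protocol: it builds the standard genetic code table and lists every codon matching NNK, under the assumption that the standard (non-mitochondrial) code applies. The function names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """Return all codons matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


def summarize_nnk():
    """Return the NNK codons, the amino acids they encode, and any stop codons."""
    codons = nnk_codons()
    encoded = {CODON_TABLE[c] for c in codons}
    amino_acids = sorted(encoded - {"*"})
    stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]
    return codons, amino_acids, stop_codons


codons, amino_acids, stop_codons = summarize_nnk()
print(len(codons), len(amino_acids), stop_codons)
```

Under the standard code, the 32 NNK codons encode all 20 amino acids and include a single stop codon (TAG), which is why this degenerate codon is a common choice for site-saturation libraries.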

In Vivo Activity of Individual Single-Site Mutants
Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.

Tripathi et al. doi:10.1093/molbev/msw182 MBE


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance
Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000g, 10 min, 4 °C. Various dilutions of supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases
Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent); ClpB, DnaK, DnaJ, and GroEL (all ATP-dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases, Lon, ClpP, HslU, HslV, and HchA, were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose at 30 °C, as many of these strains are temperature sensitive. In the case of chaperone over-expression strains, the medium used for recovery following transformation was LB+Chl (35 µg/ml), as the chaperone expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB+Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition where each of the mutants became active in the presence or absence of the chaperone was tabulated.
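The tabulation step described above can be sketched in a few lines. This is an illustrative reading of the plate assay, not the authors' analysis code: it assumes one colony/no-colony observation per condition, ordered from strongest repression to strongest induction, and reports the lowest expression level at which colonies disappear. The condition labels and the function name are hypothetical.

```python
# Expression ladder from most repressed to most induced (labels are illustrative).
CONDITIONS = [
    "2e-1% glucose", "2e-2% glucose", "2e-3% glucose", "no sugar",
    "2e-3% arabinose", "2e-2% arabinose", "2e-1% arabinose",
]


def activation_condition(colonies_present):
    """Return the first (lowest-expression) condition at which no colonies grow.

    colonies_present: list of booleans, one per entry in CONDITIONS.
    Active CcdB kills cells, so absence of colonies marks an active mutant.
    Returns None if colonies grow at every condition (inactive throughout).
    """
    if len(colonies_present) != len(CONDITIONS):
        raise ValueError("expected one observation per condition")
    for condition, grew in zip(CONDITIONS, colonies_present):
        if not grew:
            return condition
    return None


# Hypothetical observations: colonies at the three glucose levels only.
print(activation_condition([True, True, True, False, False, False, False]))
```

Comparing the returned condition for the same mutant in a wildtype and a chaperone over-expression background then indicates whether the chaperone shifts the mutant toward activity at lower expression.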

Supplementary Material
Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References
Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A. 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A. 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J. 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol. 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A. 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A. 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A. 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem. 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies [PhD thesis]. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A. 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol. 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol. 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet. 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol. 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol. 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem. 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A. 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A. 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng. 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev. 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol. 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A. 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc. 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A. 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol. 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem. 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol. 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet. 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res. 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A. 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol. 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol. 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A. 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol. 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol. 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol. 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res. 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol. 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A. 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci. 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc. 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol. 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol. 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci. 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res. 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res. 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol. 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem. 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing and protein structure. Annu Rev Biophys Bioeng. 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A. 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol. 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A. 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci. 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A. 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci. 79:19.25.11–19.25.26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett. 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol. 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol. 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci. 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A. 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem. 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol. 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol. 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol. 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A. 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet. 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci. 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol. 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol. 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol. 356:1263–1274.


dimensional structure is not essential, it is important to have residue burial information, because predictions have been optimized for buried residues. A saturation mutagenesis data set is also not required. However, it is important to have experimental data on the functional effects of multiple point mutants to decide on the cutoff value of MSpred that would result in an observable phenotype. This value would likely depend on factors such as intrinsic protein stability, expression level, and gene essentiality that would vary from one protein to another (Miosge et al. 2015).

In Vitro Determined Apparent Tm's Correlate Better with in Vivo Solubility than with Relative Activity Derived from Deep Sequencing
To experimentally probe the molecular basis for mutant phenotypes at nonactive-site positions, around 80 single-site mutants of CcdB were selected from the saturation mutagenesis library (Bajaj et al. 2008) based on MSseq and accessibility class (Adkar et al. 2012) (supplementary table S5, Supplementary Material online). All the mutants were purified by affinity purification against the immobilized ligands GyrA or CcdA. Each purified protein was subjected to thermal denaturation monitored using Sypro orange dye (Niesen et al. 2007), and the apparent Tm was calculated for each mutant (supplementary fig. S3A and table S5, Supplementary Material online). During purification of the various CcdB mutants, the protein may be inactivated by aggregation or misfolding. Hence, the ability of purified protein to bind CcdA was examined by monitoring the thermal denaturation of each mutant in the absence and presence of a CcdA peptide that contains the CcdB binding residues (residues 46–72). If the mutant binds the CcdA peptide, this should result in an increase in its apparent Tm (supplementary fig. S3B, Supplementary Material online) (Fukada et al. 1983; Brandts and Lin 1990; Gonzalez et al. 1999). Nine mutants, e.g., V05S, V05L, Y06G, and F17D, did not show an increase in apparent Tm in the presence of CcdA peptide (supplementary table S5, Supplementary Material online), suggesting that these are misfolded or aggregated; hence, these mutants were removed from the analysis. Most of these mutants are largely found in inclusion bodies and have Tm's between 40 °C and 50 °C, in contrast to WT CcdB, which has a Tm of 68.4 °C. Further studies were restricted to the remaining 71 mutants that showed an enhancement in thermal stability in the presence of CcdA peptide (supplementary fig. S3B, Supplementary Material online). Mutants showed a range of apparent Tm's (supplementary table S5, Supplementary Material online). When in vitro determined thermal stability was compared with in vivo phenotypes (MSseq) determined by deep sequencing, a moderate correlation (r = 0.65) was obtained (fig. 3A). However, there were many mutants that showed similar activity but differed substantially in their stability, such as L16S, V18T, D19N, and V54E (supplementary table S5, Supplementary Material online). Conversely, there were also mutants (e.g., V33D, M32N) that showed similar thermal stability to wildtype but had substantially lower activity in vivo. This shows that the in vivo activity of a protein depends on many factors inside a cell, which assist in proper folding and in maintaining an active conformation. Since the apparent Tm determined by the thermal shift assay may not reflect the true thermodynamic stability of the protein, a subset of 21 mutants was also subjected to GdnHCl chemical denaturation. These mutants were chosen to span a range of Tm and MSseq values. These measurements were done to see whether the two measures of stability, i.e., thermal and chemical denaturation, correlate with one another. Both measures of stability were found to be highly correlated (fig. 3C and supplementary table S8, Supplementary Material online).

FIG. 3. Correlation between apparent in vitro Tm, in vivo solubility, and activity (MSseq value) for CcdB mutants. Correlations of ΔTm [Tm (WT) − Tm (Mutant)] for 67 single-site mutants with (A) in vivo activity and (B) in vivo fraction of soluble protein, respectively. (C) Correlation of relative thermal stability (ΔTm) of mutants with ΔΔG° of unfolding estimated by GdnHCl denaturation. (D) Correlation of fraction of protein in the soluble fraction with in vivo activity of mutants.
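The ligand-binding check used to exclude misfolded preparations amounts to comparing the apparent Tm with and without the CcdA peptide. A minimal sketch of that filter follows, assuming one pair of Tm values per mutant; the function names, the `min_shift` cutoff, and the example values are hypothetical placeholders rather than anything reported in the study.

```python
def tm_shift(tm_free, tm_with_peptide):
    """Change in apparent Tm (degrees C) on adding the CcdA peptide."""
    return tm_with_peptide - tm_free


def split_by_ligand_binding(measurements, min_shift=0.0):
    """Partition mutants by whether the peptide raises the apparent Tm.

    measurements: dict mapping mutant name -> (tm_free, tm_with_peptide).
    min_shift: hypothetical cutoff; a shift must exceed it to count as binding.
    """
    binders, non_binders = [], []
    for name, (tm_free, tm_bound) in measurements.items():
        if tm_shift(tm_free, tm_bound) > min_shift:
            binders.append(name)
        else:
            non_binders.append(name)
    return binders, non_binders


# Placeholder values for illustration only; not measurements from the study.
example = {"mutant_A": (55.0, 60.5), "mutant_B": (45.0, 44.8)}
binders, non_binders = split_by_ligand_binding(example)
print(binders, non_binders)
```

In practice a nonzero `min_shift` tied to the assay's replicate error would be a more defensible cutoff than zero, but the source does not state one.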

Various mutations have different effects on protein stability and activity. Properly folded proteins are found in the soluble fraction of the cell lysate, whereas misfolded proteins often form insoluble aggregates called inclusion bodies. Hence, misfolding reduces the amount of active, soluble, and functional protein, though studies have shown that some amount of protein in the soluble fraction can also be misfolded (Liu et al. 2014). To study the relation between in vivo solubility of CcdB mutants and in vitro determined thermal stability, E. coli strain CSH501 (which has a mutation in the gyrA gene and is hence resistant to CcdB action) was transformed individually with the mutants, and the amount of protein in both the soluble fraction and in inclusion bodies was estimated. Surprisingly, for a few mutants, although very little protein was found in the soluble fraction, these showed an active phenotype with an MSseq of 2 (fig. 3D). Hence, for these mutants, the small amount of protein present in the soluble fraction is properly folded and sufficient to cause cell death in a CcdB sensitive strain. In some cases, different mutants have similar fractions of soluble protein in vivo but have different in vivo activity and in vitro thermal stability (supplementary table S5, Supplementary Material online). The overall thermal stabilities of mutants correlated well with the in vivo amount of soluble protein (fig. 3B). This indicates that protein stability is an important determinant of proper folding in vivo. The moderate correlation of stability or solubility with in vivo activity likely arises because only a small amount of properly folded, soluble protein is sufficient to result in an active phenotype.
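The correlations quoted in this section are correlation coefficients between paired per-mutant measurements. As a reminder of what such a coefficient summarizes, here is a minimal pure-Python sketch of the Pearson correlation; whether the authors used Pearson's r specifically is an assumption, and the toy numbers below are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy inputs chosen so the result is obvious; not measurements from the study.
print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

A single coefficient hides outliers such as highly active mutants with very little soluble protein, which is why the individual exceptions are discussed separately in the text.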

One reason for the lack of a better correlation between solubility and in vivo activity is that, for each mutant, various conformational forms of the protein can partition differently between the soluble and insoluble fractions of the cell lysate. The soluble fraction can comprise both folded protein, which is active, and soluble aggregates or partially misfolded protein, which are inactive (Liu et al. 2014). Moreover, this partitioning can be influenced by perturbations in the cytosolic proteostasis network. To study the relation between in vivo activity and solubility, the ability of four selected CcdB mutants (V33K and Y06G as examples of active but insoluble mutants, and R31G and V80N as examples of soluble but inactive mutants) in the soluble fraction of the cell lysate to bind Gyrase was monitored by surface plasmon resonance (supplementary fig. S4A and B, Supplementary Material online). Mutants with only a small amount of protein in the soluble fraction but displaying an active phenotype in vivo (V33K, Y06G) showed binding to Gyrase comparable to the wild-type in this surface plasmon resonance assay, showing that the protein is well folded. In contrast, where a mutant is mostly in the soluble fraction but shows an inactive phenotype in vivo (R31G, V80N), the in vitro binding to Gyrase was also negligible compared with the wild-type (supplementary fig. S4C, Supplementary Material online).

Refolding and Unfolding Kinetics
Refolding and unfolding kinetics for 10 mutants that have similar thermal stability but different in vivo solubility and activity were monitored by time-course fluorescence spectroscopy at 25 °C. Refolding and unfolding were carried out at pH 7.4, at final GdnHCl concentrations of 0.6 and 3.2 M, respectively. Of the 10 selected mutants, four (V05S, I56G, V18R, and V18H) could not be studied for their refolding profiles because of heavy precipitation immediately following purification. Further, for these mutants, the proportion in the soluble fraction in vivo was low, ranging from 0.1 to 0.3 (supplementary table S5, Supplementary Material online). Of these, V05S and I56G are active (MSseq = 2), whereas V18R and V18H show an inactive phenotype (MSseq of 9 and 6, respectively). Most mutants (except V80N) showed slower refolding kinetics than the wild-type, indicating that these mutants are folding defective (table 3). Refolding of the wild-type occurs with a significant burst phase (k > 0.5 s⁻¹) and a slow phase. Mutants typically show a much smaller burst phase, an intermediate phase, and a slow phase of much higher amplitude than the wildtype. Most mutants show unfolding kinetics similar to the wildtype, except V54E, which shows a much higher unfolding rate. The ability of the refolded mutants to bind to the cognate ligand GyrA or the CcdA peptide (residues 46–72) was also monitored. Binding of refolded mutants to immobilized GyrA on Amine Reactive Second Generation (AR2G) biosensors was monitored using bio-layer interferometry (Sultana and Lee 2015), and binding to the CcdA peptide was monitored using a thermal shift assay (Niesen et al. 2007). Active mutants (L16S, V18T) retained their binding to both GyrA and CcdA upon refolding, even though their refolding kinetics were slow (table 3). Surprisingly, V54E, which is also an active mutant, failed to bind GyrA and CcdA upon refolding, even though the native protein showed binding (supplementary fig. S6, Supplementary Material online). On the other hand, the inactive mutants R31G and M63N did not bind to GyrA and CcdA after refolding (table 3), showing that their refolded state is nonnative. Interestingly, the native V80N mutant did not show any binding to GyrA, but the refolded protein binds weakly to both ligands. Two of these mutants, V80N and V54E, also show formation of higher order oligomers (supplementary fig. S5, Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.
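To make the amplitudes and rate constants reported in table 3 more concrete, the following Python sketch evaluates how much of the total refolding signal change is complete at a given time. It treats the burst amplitude as instantaneous and each remaining phase as a single exponential. The function name and the normalization (amplitudes summing to about 1) are illustrative assumptions, not the fitting procedure used in the study; the parameter values are those listed for WT and L16S in table 3.

```python
import math


def refolded_fraction(t, burst, phases):
    """Fraction of the total refolding amplitude completed by time t (seconds).

    burst  -- amplitude a0 of the burst phase (assumed complete at t = 0)
    phases -- list of (amplitude, rate constant in s^-1) pairs
    """
    return burst + sum(a * (1.0 - math.exp(-k * t)) for a, k in phases)


# Values as listed in table 3 (WT has a burst phase and one slow phase).
WT = {"burst": 0.84, "phases": [(0.16, 0.046)]}
L16S = {"burst": 0.04, "phases": [(0.72, 0.07), (0.24, 0.02)]}

for t in (0, 10, 60, 300):
    print(t, round(refolded_fraction(t, **WT), 2), round(refolded_fraction(t, **L16S), 2))
```

Under these assumptions the wild-type trace is mostly complete within the burst phase, whereas L16S approaches completion only over several minutes, which is the qualitative difference described in the text.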

Over-Expression of Chaperones Rescues Folding Defects of Mutants
Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone and protease deleted strains, as well as in chaperone over-expressing strains (see "Materials and Methods" section). Eleven CcdB mutants with a range of solubility and activity were chosen to study whether the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and activity of the mutants. Of these mutants, L16S, V33K, L36K, and V80N had low Tm's (<55 °C) but differ in their in vivo activity, whereas mutants G29W, D67P, and V73F show a higher Tm (>56 °C) but are inactive. In vivo activity of these mutants was monitored both in chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed into different strains, and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL, and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in activity in the protease deletion strains BWΔlon, BWΔclpP, and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on activity was not observed, probably due to direct involvement of proteases in the process of CcdB-mediated cell death (Van Melderen et al. 1996). Many of these proteases have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed a substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that, for many of these mutants, inactivity primarily results from folding defects which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to wildtype but lower solubility (V73F, D67P, and G29W). This further indicates that defects in folding, rather than stability, are the primary causes of inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations shown in E. coli enzymes during in vitro mutational drift experiments but can also have significant effects on E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding and that, once misfolding occurs, it cannot be rescued by ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because, for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed, as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants (a, b).

Columns: fraction soluble; MSseq; ΔTm (WT − mutant) (°C); refolding amplitudes and rate constants (a0; fast phase a1 and k1 (s⁻¹); slow phase a2 and k2 (s⁻¹)); unfolding amplitudes and rate constant (A0, A1, and K1 (s⁻¹)); CcdA binding to refolded protein (TSA); Gyrase binding to refolded protein (BLI).

Mutant  Fraction  MSseq  ΔTm    a0    a1    k1     a2    k2     A0    A1    K1    CcdA   Gyrase
        soluble          (°C)                                                     (TSA)  (BLI)
L16S    0.4       2      16.7   0.04  0.72  0.07   0.24  0.02   0.83  0.17  0.06  ++++   +++
V18T    0.7       2      9      0.04  0.7   0.1    0.26  0.02   0.8   0.2   0.16  ++++   ++++
R31G    0.6       6      11     0.05  0.8   0.2    0.15  0.02   0.85  0.15  0.02  – (c)  – (c)
V54E    0.4       2      14.5   0.14  0.17  0.28   0.68  0.04   1     –     –     – (c)  – (c)
M63N    0.2       6      15.2   0.15  –     –      0.85  0.08   0.84  0.16  0.07  – (c)  – (c)
V80N    0.8       6      17.5   0.8   –     >0.5   0.2   0.04   0.76  0.24  0.07  +      +
WT      1         2      –      0.84  –     >0.5   0.16  0.046  0.62  0.38  0.04  ++++   ++++

(a) The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9, and 6, respectively.
(b) The traces were fit to a 5-parameter equation for exponential decay for refolding (f = y0 + a·e^(−bx) + c·e^(−dx)), yielding fast (k1) and slow phase rate constants (k2) with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding (f = y0 + a·e^(−bx)), yielding the rate constant K1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were 10% of the measured experimental value.
(c) No observable binding.

FIG. 4. In vivo activity and solubility of CcdB mutants in presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein, for cells grown at 37 °C and induced for CcdB with 0.2% arabinose, in both supernatant (soluble) and pellet (insoluble), with or without over-expression of the chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants are shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.

Discussion
Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure, stability, and function, and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While there have been other saturation mutagenesis studies published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels and the effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped on the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA Gyrase (Bajaj et al. 2008). A similar observation was made in another study, which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of TEM-1 β-lactamase, it has been found that deleterious effects of mutations primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues, virtually all mutations are tolerated. At a few highly exposed positions (>40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by approximately 5 kJ/mol and suggested that loss of packing interactions is the major contributor to the decrease in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains, and suggests that while small increases in volume can be tolerated, changes in shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size, of up to three methylene groups, can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current ProTherm database (Kumar et al. 2006) (http://www.abren.net/protherm/, last accessed 31 August 2016), 4,805 single buried site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those for other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity scores (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

        Strain                                         Fraction  Fractional increase
                                                       soluble   in solubility (a)
Mutant  BW25113  BWΔtig  BWΔsecB  BWpTig  BWpSecB                Tig    SecB
WT      1        1       1        1       1            1         1      1
L16S    4        7       7        2       3            0.4       1.5    1.7
G29W    8        8       8        2       4            0.6       1.2    1.1
M32N    4        6       6        2       3            0.1       3      2
V33K    4        7       6        2       3            0.1       2      2
P35I    8        7       8        5       5            0.6       1.3    1.1
L36K    8        8       8        3       5            0.05      4      2
L41F    7        8       8        3       3            0.4       1.8    2.5
D67P    6        8       8        2       4            0.2       0.8    0.5
S70W    6        8       8        2       4            0.5       1      0.4
V73F    7        7       8        2       3            0.5       2      1.3
V80N    6        6       7        3       4            0.6       1.3    1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
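As a small worked example of how the last two columns of table 4 relate to the soluble fraction, the sketch below multiplies the soluble fraction in the parent strain by the reported ratio to recover the implied soluble fraction under chaperone over-expression. This simply inverts the ratio defined in the table footnote; the dictionary layout and function name are illustrative assumptions, and only a few rows are included.

```python
# (fraction soluble in BW25113, ratio with Trigger Factor, ratio with SecB),
# values as listed in table 4 for four of the mutants.
TABLE4_ROWS = {
    "L16S": (0.4, 1.5, 1.7),
    "L36K": (0.05, 4.0, 2.0),
    "L41F": (0.4, 1.8, 2.5),
    "V73F": (0.5, 2.0, 1.3),
}


def implied_soluble_fraction(base_fraction, ratio):
    """Soluble fraction with chaperone = ratio x soluble fraction without it."""
    return base_fraction * ratio


for mutant, (base, tig_ratio, secb_ratio) in TABLE4_ROWS.items():
    with_tig = implied_soluble_fraction(base, tig_ratio)
    with_secb = implied_soluble_fraction(base, secb_ratio)
    print(f"{mutant}: base {base:.2f}, +Tig {with_tig:.2f}, +SecB {with_secb:.2f}")
```

Because the tabulated values are rounded, the products are approximate; none of the rows shown here implies a soluble fraction above 1, which is a useful sanity check on the reading of the table.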

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and by equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate to Gyrase, using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had unfolding rates similar to wildtype.

The ability of a mutant to fold to the native state is affected by many parameters, including the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms which are involved in degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of single nucleotide polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods
Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation
Previously, a total of 1,430 single-site mutants of CcdB (approximately 75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent, nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T, in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
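The consequences of choosing an NNK codon can be checked by direct enumeration. The Python sketch below lists all NNK codons, translates them with the standard genetic code, and counts the amino acids and stop codons they encode. It also computes the number of possible single-site substitutions for a 101-residue protein such as CcdB. The variable names are illustrative, and the calculation is a generic property of NNK degeneracy rather than a description of the primers used in the study.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (first base varies slowest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# N = A/C/G/T at the first two positions, K = G/T at the third.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
encoded = {CODON_TABLE[codon] for codon in nnk_codons}
stop_codons = [codon for codon in nnk_codons if CODON_TABLE[codon] == "*"]
amino_acids_covered = sorted(encoded - {"*"})

# Possible single-site substitutions: 19 alternative residues at each position.
n_residues = 101
possible_mutants = n_residues * 19

print(len(nnk_codons), "NNK codons")
print(len(amino_acids_covered), "amino acids covered:", "".join(amino_acids_covered))
print("stop codons:", stop_codons)
print("possible single-site mutants:", possible_mutants)
print("fraction covered by 1,430 mutants:", round(1430 / possible_mutants, 3))
```

The enumeration shows why NNK is a common choice for saturation mutagenesis: it reaches every amino acid while admitting only one of the three stop codons.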

In Vivo Activity of Individual Single-Site Mutants
Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose, at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.

Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance
Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800 × g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000 × g for 10 min at 4 °C. Various dilutions of the supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as the change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases
Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger Factor and SecB (both ATP-independent), and ClpB, DnaK, DnaJ, and GroEL (all ATP-dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEL. In addition, BW25113 strains deleted for the following proteases were also used: Lon, ClpP, HslU, HslV, and HchA, referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose, at 30 °C, as many of these strains are temperature sensitive. In the case of chaperone over-expression strains, the medium used for recovery following transformation was LB + Chl (35 µg/ml), as the chaperone expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB + Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition where each of the mutants became active in the presence or absence of the chaperone was tabulated.
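The plate read-out described above can be reduced to a single number per mutant and strain: the first condition, moving from strongest repression to strongest induction, at which colonies no longer appear. The Python sketch below illustrates one way to tabulate this. The function name, the 1-based numbering, and the convention of returning 8 when colonies grow under all seven conditions are assumptions made for illustration (values of 8 do appear in table 4, but their definition is not spelled out in this section); the condition labels read the listed concentrations as percentages.

```python
# Seven conditions ordered from most repressing to most inducing.
CONDITIONS = [
    "0.2% glucose",
    "0.02% glucose",
    "0.002% glucose",
    "no glucose or arabinose",
    "0.002% arabinose",
    "0.02% arabinose",
    "0.2% arabinose",
]


def active_condition(colonies_seen):
    """Return the 1-based index of the first condition without colonies.

    colonies_seen -- one boolean per condition, True if colonies grew.
    Returns len(CONDITIONS) + 1 if colonies grew under every condition
    (assumed here to mean the mutant is inactive throughout).
    """
    if len(colonies_seen) != len(CONDITIONS):
        raise ValueError("expected one observation per condition")
    for index, grew in enumerate(colonies_seen, start=1):
        if not grew:
            return index
    return len(CONDITIONS) + 1


# Hypothetical plate: colonies under the three glucose conditions only.
example = [True, True, True, False, False, False, False]
print(active_condition(example), CONDITIONS[active_condition(example) - 1])
```

A lower number then corresponds to a more active mutant, since less CcdB expression is needed to kill the cells.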

Supplementary Material
Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank Dr. Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References
Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.
Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.
Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A. 109:16858–16863.
Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A. 102:16221–16226.
Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J. 380:409–417.
Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.
Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol. 3:e241.
Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.
Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A. 109:4857–4862.
Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.
Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A. 102:606–611.
Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A. 86:2152–2156.
Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.
Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.
Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem. 277:31345–31353.
Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis. Indian Institute of Science, Bangalore, India.
Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A. 86:2172–2175.

De Jonge N Garcia-Pino A Buts L Haesaerts S Charlier D Zangger KWyns L De Greve H Loris R 2009 Rejuvenation of CcdB-poisonedgyrase by an intrinsically disordered protein domain Mol Cell35154ndash163

DeBartolo J Dutta S Reich L Keating AE 2012 Predictive Bcl-2 familybinding models rooted in experiment or structure J Mol Biol422124ndash144

Deng Z Huang W Bakkalbasi E Brown NG Adamski CJ Rice K MuznyD Gibbs RA Palzkill T 2012 Deep sequencing of systematic com-binatorial libraries reveals beta-lactamase sequence constraints athigh resolution J Mol Biol 424150ndash167

DePristo MA Weinreich DM Hartl DL 2005 Missense meanderings insequence space a biophysical view of protein evolution Nat RevGenet 6678ndash687

Dutta S Chen TS Keating AE 2013 Peptide ligands for pro-survivalprotein Bfl-1 from computationally guided library screening ACSChem Biol 8778ndash788

Firnberg E Labonte JW Gray JJ Ostermeier M 2014 A comprehensivehigh-resolution map of a genersquos fitness landscape Mol Biol Evol311581ndash1592

Fowler DM Araya CL Fleishman SJ Kellogg EH Stephany JJ Baker DFields S 2010 High-resolution mapping of protein sequence-function relationships Nat Methods 7741ndash746

Fukada H Sturtevant JM Quiocho FA 1983 Thermodynamics of thebinding of L-arabinose and of D-galactose to the L-arabinose-bindingprotein of Escherichia coli J Biol Chem 25813193ndash13198

Gallivan JP Dougherty DA 1999 Cation-pi interactions in structuralbiology Proc Natl Acad Sci U S A 969459ndash9464

Geiler-Samerotte KA Dion MF Budnik BA Wang SM Hartl DLDrummond DA 2011 Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein re-sponse in yeast Proc Natl Acad Sci U S A 108680ndash685

Gonzalez M Argarana CE Fidelio GD 1999 Extremely high thermalstability of streptavidin and avidin upon biotin binding BiomolEng 1667ndash72

Gottesman S Wickner S Maurizi MR 1997 Protein quality controltriage by chaperones and proteases Genes Dev 11815ndash823

Guerois R Nielsen JE Serrano L 2002 Predicting changes in the stabilityof proteins and protein complexes a study of more than 1000 mu-tations J Mol Biol 320369ndash387

Hayes F 2003 Toxins-antitoxins plasmid maintenance programmedcell death and cell cycle arrest Science 3011496ndash1499

Hecht M Bromberg Y Rost B 2015 Better prediction of functionaleffects for sequence variants BMC Genomics 16(Suppl 8)S1

Hellinga HW Wynn R Richards FM 1992 The hydrophobic core ofEscherichia coli thioredoxin shows a high tolerance to nonconserva-tive single amino acid substitutions Biochemistry 3111203ndash11209

Henikoff S Henikoff JG 1992 Amino acid substitution matrices fromprotein blocks Proc Natl Acad Sci U S A 8910915ndash10919

Hietpas R Roscoe B Jiang L Bolon DN 2012 Fitness analyses of allpossible point mutations for regions of genes in yeast Nat Protoc71382ndash1396

Hietpas RT Jensen JD Bolon DN 2011 Experimental illumination of afitness landscape Proc Natl Acad Sci U S A 1087896ndash7901

Jaffe A Ogura T Hiraga S 1985 Effects of the ccd function of the Fplasmid on bacterial growth J Bacteriol 163841ndash849

Jain PC Varadarajan R 2014 A rapid efficient and economical inversepolymerase chain reaction-based method for generating a site sat-uration mutant library Anal Biochem 44990ndash98

Janin J Miller S Chothia C 1988 Surface subunit interfaces and interiorof oligomeric proteins J Mol Biol 204155ndash164

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet. 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res. 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A. 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol. 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol. 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A. 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol. 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol. 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol. 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res. 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol. 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A. 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci. 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc. 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol. 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol. 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci. 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res. 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res. 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol. 193:775–791.

Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem. 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing and protein structure. Annu Rev Biophys Bioeng. 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A. 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol. 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A. 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci. 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A. 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci. 79:19.25.1–19.25.26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett. 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol. 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol. 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci. 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A. 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem. 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol. 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol. 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol. 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A. 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet. 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci. 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol. 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol. 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol. 356:1263–1274.

by deep sequencing, a moderate correlation (r = 0.65) was obtained (fig. 3A). However, there were many mutants that showed similar activity but differed substantially in their stability, such as L16S, V18T, D19N, and V54E (supplementary table S5, Supplementary Material online). Conversely, there were also mutants (e.g., V33D, M32N) that showed similar thermal stability to wildtype but had substantially lower activity in vivo. This shows that the in vivo activity of a protein depends on many factors inside a cell which assist in proper folding and maintaining an active conformation. Since the apparent Tm determined by the thermal shift assay may not reflect the true thermodynamic stability of the protein, a subset of 21 mutants was also subjected to GdnHCl chemical denaturation. These mutants were chosen to span a range of Tm and MSseq values. These measurements were done to see if the two measures of stability, i.e., thermal and chemical denaturation, correlate with one another. It was found that both measures of stability were highly correlated (fig. 3C and supplementary table S8, Supplementary Material online).
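The correlation coefficients quoted in this section (for example r = 0.65 between thermal stability and activity) are of the kind computed below. This is a minimal sketch of a Pearson correlation, assuming that is the statistic reported; the function name and the example numbers are placeholders for illustration and are not measurements from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only (hypothetical Tm in deg C and MSseq scores).
tm_values = [68.0, 61.5, 57.0, 52.5, 49.0]
msseq_scores = [2, 2, 4, 6, 9]
print(round(pearson_r(tm_values, msseq_scores), 2))
```

The sign of r depends on the convention used for each axis (for instance Tm versus the difference in Tm from wildtype), so only the magnitude should be compared with the values in the text.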

Various mutations have different effects on protein stability and activity. Properly folded proteins are found in the soluble fraction of the cell lysate, whereas misfolded proteins often form insoluble aggregates called inclusion bodies. Hence, misfolding reduces the amount of active, soluble, and functional protein, though studies have shown that some amount of protein in the soluble fraction can also be misfolded (Liu et al. 2014). To study the relation between the in vivo solubility of CcdB mutants and their in vitro determined thermal stability, E. coli strain CSH501 (which has a mutation in the gyrA gene and is hence resistant to CcdB action) was transformed individually with the mutants, and the amount of protein in both the soluble fraction and in inclusion bodies was estimated. Surprisingly, for a few mutants, although very little protein was found in the soluble fraction, these showed an active phenotype with an MSseq of 2 (fig. 3D). Hence, for these mutants, the small amount of protein present in the soluble fraction is properly folded and sufficient to cause cell death in a CcdB-sensitive strain. In some cases, different mutants have similar fractions of soluble protein in vivo but have different in vivo activity and in vitro thermal stability (supplementary table S5, Supplementary Material online). The overall thermal stabilities of mutants correlated well with the in vivo amount of soluble protein (fig. 3B). This indicates that protein stability is an important determinant of proper folding in vivo. The moderate correlation of stability or solubility with in vivo activity likely arises because only a small amount of properly folded, soluble protein is sufficient to result in an active phenotype.

One reason for the lack of a better correlation between solubility and in vivo activity is that, for each mutant, various conformational forms of the protein can partition differently in the soluble and insoluble fractions of the cell lysate. The soluble fraction can comprise both folded protein, which is active, and soluble aggregates/partially misfolded protein, which are inactive (Liu et al. 2014). Moreover, this partitioning can be influenced by perturbations in the cytosolic proteostasis network. To study the relation between in vivo activity and solubility, the ability of four selected CcdB mutants (V33K and Y06G as examples of active but insoluble mutants, and R31G and V80N as examples of soluble but inactive mutants) in the soluble fraction of the cell lysate to bind Gyrase was monitored by surface plasmon resonance (supplementary fig. S4A and B, Supplementary Material online). Mutants with only a small amount of protein in the soluble fraction but displaying an active phenotype in vivo (V33K, Y06G) showed binding to Gyrase comparable to the wild-type in this surface plasmon resonance assay, showing that the protein is well folded. In contrast, where a mutant is mostly in the soluble fraction but shows an inactive phenotype in vivo (R31G, V80N), the in vitro binding with Gyrase was also negligible compared with the wild-type (supplementary fig. S4C, Supplementary Material online).

Refolding and Unfolding Kinetics
Refolding and unfolding kinetics for 10 mutants that have similar thermal stability but different in vivo solubility and activity were monitored by time-course fluorescence spectroscopy at 25 °C. Refolding and unfolding were carried out at pH 7.4 at final GdnHCl concentrations of 0.6 and 3.2 M, respectively. Of the 10 selected mutants, four (V05S, I56G, V18R, and V18H) could not be studied for their refolding profiles due to high precipitation immediately following purification. Further, for these mutants, the proportion in the soluble fraction in vivo was low, ranging from 0.1 to 0.3 (supplementary table S5, Supplementary Material online). Of these, V05S and I56G are active (MSseq = 2), whereas V18R and V18H show an inactive phenotype (MSseq of 9 and 6, respectively). Most mutants (except V80N) showed slower refolding kinetics than the wild-type, indicating that these mutants are folding defective (table 3). Refolding for the wild-type occurs with a significant burst phase (k > 0.5 s⁻¹) and a slow phase. Mutants typically show a much smaller burst phase, an intermediate phase, and a slow phase of much higher amplitude than the wildtype. Most mutants show unfolding kinetics similar to the wildtype, except V54E, which shows a much higher unfolding rate. The ability of the refolded mutants to bind to the cognate ligand GyrA or the CcdA peptide (residues 46–72) was also monitored. Binding of refolded mutants to immobilized GyrA on Amine Reactive Second Generation (AR2G) biosensors was monitored using bio-layer interferometry (Sultana and Lee 2015), and the binding to CcdA peptide was monitored using the thermal shift assay (Niesen et al. 2007). Active mutants (L16S, V18T) retained their binding to both GyrA and CcdA upon refolding, even though their refolding kinetics were slow (table 3). Surprisingly, V54E, which is also an active mutant, failed to bind GyrA and CcdA upon refolding, even though the native protein showed binding (supplementary fig. S6, Supplementary Material online). On the other hand, the inactive mutants R31G and M63N did not bind to GyrA and CcdA after refolding (table 3), showing that their refolded state is nonnative. Interestingly, the native V80N mutant did not show any binding to GyrA, but the refolded protein binds weakly to both ligands. Two of these mutants, V80N and V54E, also show formation of higher order oligomers (supplementary fig. S5, Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.
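The kinetic models named in footnote b of table 3 (a double exponential for refolding, a single exponential for unfolding) can be sketched as follows. This is a minimal illustration, assuming first-order phases with negative exponents; the minus signs are inferred because the extracted footnote has lost them. The function names are hypothetical, and the parameter values are the L16S refolding entries of table 3 as read from the flattened table.

```python
import math


def refolding_signal(t, y0, a1, k1, a2, k2):
    """Double-exponential decay: f = y0 + a1*exp(-k1*t) + a2*exp(-k2*t)."""
    return y0 + a1 * math.exp(-k1 * t) + a2 * math.exp(-k2 * t)


def unfolding_signal(t, y0, a1, k1):
    """Single exponential: f = y0 + a1*exp(-k1*t)."""
    return y0 + a1 * math.exp(-k1 * t)


def half_time(k):
    """Half-time (s) of a first-order phase with rate constant k (s^-1)."""
    if k <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k


# L16S refolding parameters as read from table 3 (fast and slow phases).
a1, k1, a2, k2 = 0.72, 0.07, 0.24, 0.02
for t in (0, 10, 60, 300):
    # y0 is set to 0 here so the output is the remaining relative amplitude
    print(t, round(refolding_signal(t, 0.0, a1, k1, a2, k2), 3))
print("fast-phase half-time (s):", round(half_time(k1), 1))
print("slow-phase half-time (s):", round(half_time(k2), 1))
```

Comparing half-times rather than raw rate constants makes it easier to see how much longer a slow-folding mutant stays unfolded, which is the quantity the text relates to inclusion body formation.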

Over-Expression of Chaperones Rescues Folding Defects of Mutants
Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone and protease deleted strains as well as chaperone over-expressing strains (see “Materials and Methods” section). Eleven CcdB mutants with a range of solubility and activity were chosen to study if the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and activity of the mutants. Of these mutants, L16S, V33K, L36K, and V80N had low Tm values (<55 °C) but differ in their in vivo activity, whereas mutants G29W, D67P, and V73F show a higher Tm (>56 °C) but are inactive. In vivo activity of these mutants was monitored both in chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed in different strains, and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL, and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in the activity in protease deletion strains BWΔlon, BWΔclpP, and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on the activity was not observed, probably due to direct involvement of proteases in the process of CcdB-mediated cell death (Van Melderen et al. 1996). Many of these proteases have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed a substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that, for many of these mutants, inactivity primarily results from folding defects which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to wildtype but lower solubility (V73F, D67P, and G29W). This further indicates that defects in folding, rather than stability, are the primary causes for inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations shown in E. coli enzymes during in vitro mutational drift experiments but can also have significant effects on E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding and that, once misfolding occurs, it cannot be rescued by the ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because, for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed, as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants (a, b).

Mutant | Fraction soluble | MSseq | ΔTm (Wt-mutant) (°C) | Refolding a0 | Refolding fast phase a1 | Refolding fast phase k1 (s⁻¹) | Refolding slow phase a2 | Refolding slow phase k2 (s⁻¹) | Unfolding A0 | Unfolding A1 | Unfolding K1 (s⁻¹) | CcdA binding to refolded protein (TSA) | Gyrase binding to refolded protein (BLI)
L16S | 0.4 | 2 | 16.7 | 0.04 | 0.72 | 0.07 | 0.24 | 0.02 | 0.83 | 0.17 | 0.06 | ++++ | +++
V18T | 0.7 | 2 | 9 | 0.04 | 0.7 | 0.1 | 0.26 | 0.02 | 0.8 | 0.2 | 0.16 | ++++ | ++++
R31G | 0.6 | 6 | 11 | 0.05 | 0.8 | 0.2 | 0.15 | 0.02 | 0.85 | 0.15 | 0.02 | – (c) | – (c)
V54E | 0.4 | 2 | 14.5 | 0.14 | 0.17 | 0.28 | 0.68 | 0.04 | 1 | – | – | – (c) | – (c)
M63N | 0.2 | 6 | 15.2 | 0.15 | – | – | 0.85 | 0.08 | 0.84 | 0.16 | 0.07 | – (c) | – (c)
V80N | 0.8 | 6 | 17.5 | 0.8 | – | >0.5 | 0.2 | 0.04 | 0.76 | 0.24 | 0.07 | + | +
WT | 1 | 2 | – | 0.84 | – | >0.5 | 0.16 | 0.046 | 0.62 | 0.38 | 0.04 | ++++ | ++++

(a) The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9, and 6, respectively.
(b) The traces were fit to a 5-parameter equation for exponential decay for refolding ($f = y_0 + a e^{-bx} + c e^{-dx}$), yielding fast (k1) and slow phase rate constants (k2) with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding ($f = y_0 + a e^{-bx}$), yielding the rate constant k1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were 10% of the measured experimental value.
(c) No observable binding.

FIG. 4. In vivo activity and solubility of CcdB mutants in presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein for cells grown at 37 °C and induced for CcdB with 0.2% arabinose in both supernatant (soluble) and pellet (insoluble), with or without over-expression of chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants are shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.

Discussion
Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure, stability, and function and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While there have been other saturation mutagenesis studies published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels and the effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped on the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage-dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA Gyrase (Bajaj et al. 2008). A similar observation was made in another study which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of TEM-1 β-lactamase protein, it has been found that deleterious effects of mutations primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues, virtually all mutations are tolerated. At a few highly exposed positions (40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by 5 kJ/mol and suggested that loss of packing interactions is the major contributor to the decrease in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains and suggests that, while small increases in volume can be tolerated, changes in the shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size of up to three methylene groups can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current ProTherm database (Kumar et al. 2006) (http://www.abren.net/protherm/; last accessed 31 August 2016), 4,805 single buried-site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.
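The ProTherm comparison above amounts to grouping buried-site substitutions by category and averaging ΔΔG within each group. A minimal sketch of that bookkeeping follows. The residue classes (aliphatic = A, V, L, I, M; aromatic = F, W, Y), the function names, and the example records are assumptions made for illustration; they are not taken from the study or from ProTherm, and the matching on similar volume increase is not reproduced here.

```python
from collections import defaultdict

# Assumed residue classes; the study's exact definitions may differ.
ALIPHATIC = frozenset("AVLIM")
AROMATIC = frozenset("FWY")


def substitution_category(wt, mut):
    """Classify a point substitution by wild-type and mutant residue class."""
    if wt == mut:
        raise ValueError("wild-type and mutant residues are identical")
    if wt in ALIPHATIC and mut in ALIPHATIC:
        return "aliphatic_to_aliphatic"
    if wt in ALIPHATIC and mut in AROMATIC:
        return "aliphatic_to_aromatic"
    return "other"


def mean_ddg_by_category(records):
    """Average ddG (kcal/mol) per category for (wt, mut, ddg) records."""
    grouped = defaultdict(list)
    for wt, mut, ddg in records:
        grouped[substitution_category(wt, mut)].append(ddg)
    return {cat: sum(vals) / len(vals) for cat, vals in grouped.items()}


# Hypothetical records for illustration only (not ProTherm entries).
toy_records = [("I", "F", 2.0), ("L", "F", 4.0), ("L", "V", 1.0), ("D", "K", 0.5)]
print(mean_ddg_by_category(toy_records))
```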

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data, for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those of other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity score (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing-based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

Mutant | Strain: BW25113 | Strain: BWΔtig | Strain: BWΔsecB | Strain: BWpTig | Strain: BWpSecB | Fraction soluble | Fractional increase in solubility (a): Tig | Fractional increase in solubility (a): SecB
WT | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
L16S | 4 | 7 | 7 | 2 | 3 | 0.4 | 1.5 | 1.7
G29W | 8 | 8 | 8 | 2 | 4 | 0.6 | 1.2 | 1.1
M32N | 4 | 6 | 6 | 2 | 3 | 0.1 | 3 | 2
V33K | 4 | 7 | 6 | 2 | 3 | 0.1 | 2 | 2
P35I | 8 | 7 | 8 | 5 | 5 | 0.6 | 1.3 | 1.1
L36K | 8 | 8 | 8 | 3 | 5 | 0.05 | 4 | 2
L41F | 7 | 8 | 8 | 3 | 3 | 0.4 | 1.8 | 2.5
D67P | 6 | 8 | 8 | 2 | 4 | 0.2 | 0.8 | 0.5
S70W | 6 | 8 | 8 | 2 | 4 | 0.5 | 1 | 0.4
V73F | 7 | 7 | 8 | 2 | 3 | 0.5 | 2 | 1.3
V80N | 6 | 6 | 7 | 3 | 4 | 0.6 | 1.3 | 1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
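The two solubility quantities reported in table 4 (fraction soluble, and the fractional increase in solubility defined in its footnote) reduce to simple ratios. The sketch below shows that arithmetic under the assumption that the soluble fraction is the supernatant band intensity divided by the sum of supernatant and pellet intensities; the function names and the band intensities are hypothetical and are not taken from the study.

```python
def fraction_soluble(supernatant, pellet):
    """Soluble fraction from band intensities of supernatant (S) and pellet (P)."""
    total = supernatant + pellet
    if supernatant < 0 or pellet < 0 or total <= 0:
        raise ValueError("band intensities must be non-negative with a positive total")
    return supernatant / total


def fractional_increase(fraction_with_chaperone, fraction_normal):
    """Ratio of soluble fraction with chaperone over-expression to the normal value."""
    if fraction_normal <= 0:
        raise ValueError("normal soluble fraction must be positive")
    return fraction_with_chaperone / fraction_normal


# Hypothetical densitometry readings (arbitrary units), for illustration only.
normal = fraction_soluble(supernatant=40.0, pellet=60.0)
with_chaperone = fraction_soluble(supernatant=60.0, pellet=40.0)
print(round(normal, 2), round(with_chaperone, 2))
print(round(fractional_increase(with_chaperone, normal), 2))
```

A ratio above 1 indicates that chaperone over-expression shifted protein from the pellet to the supernatant, and a ratio below 1 indicates the opposite.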

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate with Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had similar unfolding rates to wildtype.

The ability of a mutant to fold to the native state is affected by many parameters, including the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms that degrade and remove misfolded proteins from the cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data indicate that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of single nucleotide polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods

Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation

Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent, nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T, in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
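The NNK scheme described above can be checked with a short enumeration. The following Python sketch is illustrative only and is not part of the original protocol. It lists the NNK codons, translates them with the standard genetic code, and recomputes the coverage of the earlier 1,430-mutant library under the assumption (ours) that the complete single-site library comprises 101 positions × 19 substitutions. The names `CODON_TABLE` and `nnk_codons` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """Enumerate every codon matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons}
amino_acids = sorted(encoded - {"*"})          # residues reachable via NNK
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]

# Assumption: 101 residues x 19 non-wild-type substitutions per residue.
possible_mutants = 101 * 19
coverage = 1430 / possible_mutants

print(len(codons), len(amino_acids), stop_codons, round(coverage, 3))
```

Under these assumptions, NNK randomization reaches all 20 amino acids with 32 codons while admitting a single stop codon, which is the usual motivation for preferring NNK over fully random NNN codons.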

In Vivo Activity of Individual Single-Site Mutants

Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose, at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.
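The plate readout above amounts to finding the lowest expression level at which a mutant stops yielding colonies. A minimal Python sketch of that logic follows. It is not the authors' analysis pipeline, the condition labels assume the concentrations are percentages, and the function name `first_active_condition` is hypothetical.

```python
# Conditions ordered from strongest repression to strongest induction.
# Labels assume the plated concentrations are given in percent.
CONDITIONS = [
    "2e-1 glucose",
    "4e-2 glucose",
    "7e-3 glucose",
    "0 glucose/arabinose",
    "2e-5 arabinose",
    "7e-5 arabinose",
    "2e-2 arabinose",
]


def first_active_condition(colonies_seen):
    """Return the lowest-expression condition at which no colonies grow.

    colonies_seen[i] is True when colonies appeared under CONDITIONS[i].
    Active CcdB kills the cell, so the absence of colonies marks the
    mutant as active. Returns None if colonies grow under every condition.
    """
    if len(colonies_seen) != len(CONDITIONS):
        raise ValueError("expected one observation per condition")
    for condition, grew in zip(CONDITIONS, colonies_seen):
        if not grew:
            return condition
    return None


# Hypothetical readout: colonies under repression, none once induced.
example = [True, True, True, True, False, False, False]
print(first_active_condition(example))
```

A mutant that becomes active only at higher inducer concentrations is more severely compromised, so the position of the returned condition in `CONDITIONS` can serve as a coarse ordinal activity score.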

Tripathi et al. doi:10.1093/molbev/msw182 MBE


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance

Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800 × g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000 × g for 10 min at 4 °C. Various dilutions of the supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as the change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases

Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), and ClpB, DnaK, DnaJ, and GroEL (all ATP-dependent). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the proteases Lon, ClpP, HslU, HslV, and HchA were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose, at 30 °C, as many of these strains are temperature sensitive. For the chaperone over-expression strains, the medium used for recovery following transformation was LB + Chl (35 µg/ml), as the chaperone-expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB + Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition at which each mutant became active in the presence or absence of the chaperone was tabulated.

Supplementary Material

Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

We thank Dr. Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References

Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One. 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure. 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A. 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A. 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J. 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry. 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol. 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell. 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A. 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature. 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A. 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A. 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry. 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics. 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem. 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A. 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell. 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol. 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol. 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet. 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol. 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol. 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods. 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem. 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A. 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A. 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng. 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev. 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol. 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science. 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics. 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry. 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A. 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc. 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A. 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol. 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem. 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol. 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet. 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics. 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One. 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res. 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A. 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol. 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol. 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A. 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol. 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol. 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol. 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry. 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature. 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res. 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol. 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A. 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci. 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins. 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc. 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol. 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol. 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci. 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res. 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res. 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol. 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry. 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods. 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem. 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing and protein structure. Annu Rev Biophys Bioeng. 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A. 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol. 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science. 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife. 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature. 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A. 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci. 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A. 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics. 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci. 79:19 25 11–19 25 26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett. 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife. 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature. 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol. 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol. 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci. 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A. 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem. 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol. 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol. 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol. 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A. 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet. 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci. 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods. 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol. 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol. 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol. 356:1263–1274.


Supplementary Material online). Overall, the data indicate that slower refolding in vitro is qualitatively correlated with targeting to inclusion bodies in vivo. Further, mutants with low activity in vivo often refold to an inactive state in vitro. Finally, some mutants which show high aggregation propensity in vitro show an active phenotype in vivo, presumably because of the presence of chaperones which help in folding to the native state.

Over-Expression of Chaperones Rescues Folding Defects of Mutants

Various factors within the cell influence the proper folding of proteins to the native state. Folding assistance by various chaperones and other quality control mechanisms can buffer mutational effects on protein stability and function (Bershtein et al. 2013). To study this, the in vivo activity of CcdB mutants was assayed in various chaperone- and protease-deleted strains, as well as in chaperone over-expressing strains (see “Materials and Methods” section). Eleven CcdB mutants with a range of solubility and activity were chosen to study whether the over-expression or deletion of chaperones and proteases affects both the in vivo solubility and activity of the mutants. Of these mutants, L16S, V33K, L36K, and V80N had low Tm values (<55 °C) but differed in their in vivo activity, whereas mutants G29W, D67P, and V73F showed a higher Tm (>56 °C) but were inactive. In vivo activity of these mutants was monitored in both chaperone and protease deletion strains to delineate effects on protein folding or stability. Mutants were transformed into different strains, and cells were plated in the presence of different repressor (glucose) and inducer (arabinose) concentrations to modulate CcdB expression. Over-expression of ATP-dependent chaperones (DnaJ, DnaK, GroEL, and ClpB) did not lead to a change in the in vivo activity of CcdB mutants. A few mutants showed a decrease in activity in the protease deletion strains BWΔlon, BWΔclpP, and BWΔhchA (supplementary table S9, Supplementary Material online), but a consistent effect on activity was not observed, probably due to direct involvement of proteases in the process of CcdB-mediated cell death (Van Melderen et al. 1996). Many of these proteases have also been shown to have chaperone-like activity (Gottesman et al. 1997), which can further complicate interpretation of the observed phenotypes. Over-expression of two ATP-independent chaperones, namely Trigger Factor and SecB, showed a substantial and consistent effect on mutant activity, probably due to their ability to cooperate in the folding of newly synthesized cytosolic proteins (Ullers et al. 2004; Maier et al. 2005). Most mutants show an increase in activity upon over-expression of these two chaperones, whereas they become less active in BWΔtig and BWΔsecB strains relative to the parent BW25113 strain (fig. 4A and B and table 4). An increase in the in vivo solubility of the mutants was also observed upon chaperone over-expression, the effect being larger for Trigger Factor over-expression (fig. 4B and C and table 4). These effects suggest that for many of these mutants, inactivity primarily results from folding defects which can be rescued by over-expression of chaperones. Interestingly, this is also the case for mutants which show similar stability to wildtype but lower solubility (V73F, D67P, and G29W). This further indicates that defects in folding, rather than stability, are the primary cause of inactivity. Previous studies have shown that GroEL/ES chaperonins, when over-expressed, can not only buffer destabilizing and adaptive mutations in E. coli enzymes during in vitro mutational drift experiments, but can also have significant effects on E. coli proteome evolution through their modulation of protein folding (Tokuriki and Tawfik 2009; Williams and Fares 2010). The observation that folding defects in CcdB mutants are rescued solely by the SecB and Trigger Factor chaperones implies that these defects occur at an early stage of folding, and that once misfolding occurs, it cannot be rescued by ATP-dependent chaperones such as GroEL and DnaK, as described above. This could also be because, for the ATP-dependent chaperones, multiple chaperones may need to be over-expressed, as they may have to cooperate to disaggregate misfolded mutants (Mogk et al. 2015).

Discussion

Saturation mutagenesis is a useful tool to study the contribution of each amino acid in a protein to its structure,

Table 3. Kinetic Parameters for In Vitro Refolding and Unfolding of Selected Moderately Stable CcdB Mutants (a, b).

| Mutant | Fraction soluble | MSseq | ΔTm (Wt − mutant) (°C) | Refolding a0 | Refolding fast phase a1 | Refolding fast phase k1 (s⁻¹) | Refolding slow phase a2 | Refolding slow phase k2 (s⁻¹) | Unfolding A0 | Unfolding A1 | Unfolding K1 (s⁻¹) | CcdA binding to refolded protein (TSA) | Gyrase binding to refolded protein (BLI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| L16S | 0.4 | 2 | 16.7 | 0.04 | 0.72 | 0.07 | 0.24 | 0.02 | 0.83 | 0.17 | 0.06 | ++++ | +++ |
| V18T | 0.7 | 2 | 9 | 0.04 | 0.7 | 0.1 | 0.26 | 0.02 | 0.8 | 0.2 | 0.16 | ++++ | ++++ |
| R31G | 0.6 | 6 | 11 | 0.05 | 0.8 | 0.2 | 0.15 | 0.02 | 0.85 | 0.15 | 0.02 | – (c) | – (c) |
| V54E | 0.4 | 2 | 14.5 | 0.14 | 0.17 | 0.28 | 0.68 | 0.04 | 1 | – | – | – (c) | – (c) |
| M63N | 0.2 | 6 | 15.2 | 0.15 | – | – | 0.85 | 0.08 | 0.84 | 0.16 | 0.07 | – (c) | – (c) |
| V80N | 0.8 | 6 | 17.5 | 0.8 | – | >0.5 | 0.2 | 0.04 | 0.76 | 0.24 | 0.07 | + | + |
| WT | 1 | 2 | – | 0.84 | – | >0.5 | 0.16 | 0.046 | 0.62 | 0.38 | 0.04 | ++++ | ++++ |

(a) The mutants chosen for refolding studies have similar stability and different solubility and activity (MSseq). Four other selected mutants could not be used for refolding studies due to very low solubility and high protein precipitation under the given reaction conditions. These had MSseq values of 2, 2, 9, and 6, respectively.

(b) The traces were fit to a 5-parameter equation for exponential decay for refolding ($f = y_0 + a e^{-bx} + c e^{-dx}$), yielding fast (k1) and slow (k2) phase rate constants with associated amplitudes a1 and a2, respectively, and to a 3-parameter exponential rise for unfolding ($f = y_0 + a e^{-bx}$), yielding the rate constant K1 with associated amplitude change A1. a0 and A0 are the amplitudes for the burst phase for refolding and unfolding, respectively. Errors for all the observed parameters were 10% of the measured experimental value.

(c) No observable binding.
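To make the fitted forms in the table footnote concrete, the sketch below evaluates the two model equations and converts a rate constant into a half-time. It is a minimal illustration, not the authors' fitting code. It assumes negative exponents in both equations and reads the slow-phase refolding rate constants of WT and L16S as 0.046 and 0.02 s⁻¹; the function names are hypothetical.

```python
import math


def refolding_trace(t, y0, a, b, c, d):
    """Five-parameter double-exponential decay: y0 + a*exp(-b*t) + c*exp(-d*t)."""
    return y0 + a * math.exp(-b * t) + c * math.exp(-d * t)


def unfolding_trace(t, y0, a, b):
    """Three-parameter single exponential: y0 + a*exp(-b*t).

    With a < 0 the signal rises from y0 + a toward the plateau y0.
    """
    return y0 + a * math.exp(-b * t)


def half_time(k):
    """Half-time (s) of a first-order phase with rate constant k (s^-1)."""
    return math.log(2) / k


# Assumed readings of the slow-phase refolding rate constants (s^-1).
k2_wt, k2_l16s = 0.046, 0.02
print(round(half_time(k2_wt), 1), round(half_time(k2_l16s), 1))
```

Comparing half-times rather than raw rate constants gives a more direct sense of how much longer a slowly refolding mutant remains in a non-native state.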


stability, and function, and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While there have been other saturation mutagenesis studies published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels and the effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped on the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage-dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA Gyrase (Bajaj et al. 2008). A similar observation was made in another study, which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of TEM-1 β-lactamase protein, it has been found that deleterious effects of mutations

FIG. 4. In vivo activity and solubility of CcdB mutants in presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein for cells grown at 37 °C and induced for CcdB with 0.2% arabinose in both supernatant (soluble) and pellet (insoluble), with or without over-expression of chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants are shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.


primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues, virtually all mutations are tolerated. At a few highly exposed positions (>40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by 5 kJ/mol and suggested that loss of packing interactions is the major contributor to the increase in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains and suggests that while small increases in volume can be tolerated, changes in shape of the side-chains require more reorganization of the neighboring residues that in turn incur a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size of up to three methylene groups can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current Protherm database (Kumar et al. 2006) (http://www.abren.net/protherm/, last accessed 31 August 2016), 4,805 single buried site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those of other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity score (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing-based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

Mutant   Strain                                         Fraction   Fractional increase
                                                        soluble    in solubilityᵃ
         BW25113  BWΔtig  BWΔsecB  BWpTig  BWpSecB                 Tig     SecB
WT       1        1       1        1       1            1          1       1
L16S     4        7       7        2       3            0.4        1.5     1.7
G29W     8        8       8        2       4            0.6        1.2     1.1
M32N     4        6       6        2       3            0.1        3       2
V33K     4        7       6        2       3            0.1        2       2
P35I     8        7       8        5       5            0.6        1.3     1.1
L36K     8        8       8        3       5            0.05       4       2
L41F     7        8       8        3       3            0.4        1.8     2.5
D67P     6        8       8        2       4            0.2        0.8     0.5
S70W     6        8       8        2       4            0.5        1       0.4
V73F     7        7       8        2       3            0.5        2       1.3
V80N     6        6       7        3       4            0.6        1.3     1.2

ᵃ Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
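The footnote defines the fractional increase as a ratio, so the soluble fraction in the presence of an over-expressed chaperone can be recovered by multiplication. The sketch below illustrates this for three rows of table 4, assuming the flattened table entries read as the decimals shown (for example 0.4, 1.5, and 1.7 for L16S); the function and variable names are illustrative and not part of the original analysis.

```python
# Illustrative sketch: values transcribed from table 4 under the assumption
# that the flattened entries are decimals (e.g. "04" -> 0.4, "15" -> 1.5).
# Each tuple: (fraction soluble, Tig fractional increase, SecB fractional increase)
TABLE4_SUBSET = {
    "L16S": (0.4, 1.5, 1.7),
    "M32N": (0.1, 3.0, 2.0),
    "L36K": (0.05, 4.0, 2.0),
}


def soluble_with_chaperone(base_fraction: float, fractional_increase: float) -> float:
    """Invert the footnote's ratio: soluble fraction with chaperone over-expression."""
    return base_fraction * fractional_increase


for mutant, (base, tig_ratio, secb_ratio) in TABLE4_SUBSET.items():
    with_tig = soluble_with_chaperone(base, tig_ratio)
    with_secb = soluble_with_chaperone(base, secb_ratio)
    print(f"{mutant}: base={base:.2f} +Tig={with_tig:.2f} +SecB={with_secb:.2f}")
```

Under this reading every product stays at or below 1, which is what a fraction must do and is a useful sanity check on the decimal placement.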


other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).
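The comparison with SNAP2 and SuSPect is reported in terms of specificity. As a minimal sketch, assuming the conventional definitions (specificity = TN / (TN + FP), sensitivity = TP / (TP + FN)) and binary labels in which True means a deleterious mutant, the two measures can be computed as follows. The labels below are hypothetical toy values, not data from the study, and the exact scoring used for table 2 may differ.

```python
from typing import Sequence, Tuple


def confusion_counts(observed: Sequence[bool], predicted: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for boolean labels where True = deleterious."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    tn = sum(1 for o, p in pairs if not o and not p)
    fp = sum(1 for o, p in pairs if not o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    return tp, tn, fp, fn


def specificity(observed: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Fraction of truly neutral mutants that are predicted neutral."""
    _, tn, fp, _ = confusion_counts(observed, predicted)
    return tn / (tn + fp)


def sensitivity(observed: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Fraction of truly deleterious mutants that are predicted deleterious."""
    tp, _, _, fn = confusion_counts(observed, predicted)
    return tp / (tp + fn)


# Hypothetical toy labels for six mutants (not experimental data).
observed = [True, True, False, False, False, True]
predicted = [True, False, False, True, False, True]
print(specificity(observed, predicted), sensitivity(observed, predicted))
```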

To obtain further insights into determinants of phenotypes, a set of 80 mutants were expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate with Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had similar unfolding rates to wildtype.

The ability of a mutant to fold to the native state is affected by many parameters that include the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms which are involved in degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in the activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of Single Nucleotide Polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods

Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation

Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent, nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
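The properties of the NNK codon can be checked by direct enumeration. The sketch below assumes the standard genetic code and assumes that "possible mutants" means 19 non-wildtype substitutions at each of the 101 CcdB residues; it counts the codons generated by NNK, the amino acids and stop codons they encode, and the coverage implied by 1,430 mutants. Names such as `nnk_codons` are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """Enumerate all codons matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
encoded = [CODON_TABLE[c] for c in codons]
amino_acids = sorted(set(encoded) - {"*"})
n_stop = encoded.count("*")

# Assumption: 19 possible substitutions at each of 101 positions.
possible_single_mutants = 101 * 19
earlier_coverage = 1430 / possible_single_mutants

print(len(codons), len(amino_acids), n_stop)  # 32 20 1
print(f"{earlier_coverage:.1%}")              # 74.5%
```

NNK therefore reaches all 20 amino acids with 32 codons while admitting only one stop codon (TAG), which is the usual reason for preferring it over NNN in saturation mutagenesis.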

In Vivo Activity of Individual Single-Site Mutants

Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data was analyzed and compared with relative activity estimates obtained by deep sequencing.
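The plate readout can be summarized per mutant as the first condition at which growth ceases. The following is a minimal sketch of that scoring, assuming the seven conditions are ordered from most repressing (highest glucose) to most inducing (highest arabinose) and that growth is recorded as a boolean per plate; the function name and the use of None for a mutant that grows under every condition are illustrative choices, not the authors' implementation.

```python
from typing import Optional, Sequence


def active_condition(growth: Sequence[bool]) -> Optional[int]:
    """Return the 1-based index of the first condition where growth ceases.

    `growth` lists colony growth (True = colonies seen) across conditions
    ordered from most repressing to most inducing. Because active CcdB
    kills the cell, absence of growth marks the mutant as active.
    Returns None if the mutant grows under every condition tested.
    """
    for index, grew in enumerate(growth, start=1):
        if not grew:
            return index
    return None


# Hypothetical example: a mutant that stays inactive until the fourth condition.
example = [True, True, True, False, False, False, False]
print(active_condition(example))  # 4
```

A lower index means the mutant is active at lower expression, that is, closer to wildtype behavior.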


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance

Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000g, 10 min, 4 °C. Various dilutions of supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases

Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), ClpB, DnaK, DnaJ, GroEL (all ATP-dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases: Lon, ClpP, HslU, HslV, and HchA were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989) and individually transformed with selected mutant CcdB plasmids and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose at 30 °C, as many of these strains are temperature sensitive. In case of chaperone over-expression strains, medium used for recovery following transformation was LB+Chl (35 µg/ml), as the chaperone expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB+Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data was analyzed, and the condition where each of the mutants became active in presence or absence of the chaperone was tabulated.

Supplementary Material

Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References

Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis, Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing and protein structure. Annu Rev Biophys Bioeng 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci 79:19 25 11–19 25 26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol 356:1263–1274.

Molecular Determinants of Mutant Phenotypes. doi:10.1093/molbev/msw182. MBE


stability and function, and in understanding the relation between genotype and phenotype. In the present study, a saturation mutagenesis library of single-site mutants of CcdB was used to understand the molecular basis of mutant phenotypes and to derive a simple procedure to predict such phenotypes. While there have been other saturation mutagenesis studies published in the recent past (Abriata et al. 2015; Kowalsky et al. 2015; Romero et al. 2015; Starita et al. 2015), the present study examines multiple expression levels, effects of multiple chaperones and proteases, and employs extensive in vitro characterization to understand how mutations affect phenotype. The tolerance of each residue to various substitutions at multiple expression levels was calculated and mapped on the crystal structure of CcdB (Loris et al. 1999). Mutational tolerance depended on both protein expression level and structural context, as noted by us earlier (Bajaj et al. 2005). Virtually all mutants which showed an inactive phenotype at low expression levels show an active phenotype when over-expressed. This is in contrast with other studies that showed growth defects in the presence of misfolded proteins in a dosage dependent manner (Geiler-Samerotte et al. 2011; Bershtein et al. 2012). In these studies, when destabilized mutants of YFP or DHFR were expressed at high levels, increased aggregation and growth defects were observed. In the case of the CcdB system, increasing expression results in an increased total amount of active protein inside a cell that is available for binding and inhibiting the function of DNA-Gyrase (Bajaj et al. 2008). A similar observation was made in another study which showed increased activity of Hsp90 mutants upon over-expression (Jiang et al. 2013). In the case of TEM-1 β-lactamase protein, it has been found that deleterious effects of mutations primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

FIG. 4. In vivo activity and solubility of CcdB mutants in presence and absence of ATP-independent chaperones. (A) The activity of the selected mutants was monitored in chaperone deleted (BWΔtig and BWΔsecB) as well as in chaperone over-expression strains (BWpTig and BWpSecB) under seven different repressing or activating conditions for the expression of mutants, and the condition where growth ceased was reported as the active condition. (B and C) The fraction of protein for cells grown at 37 °C and induced for CcdB with 0.2% arabinose in both supernatant (soluble) and pellet (insoluble), with or without over-expression of chaperones Trigger Factor and SecB, respectively, determined following SDS–PAGE and Coomassie staining using Quantity One software (Bio-Rad). S and P are supernatant and pellet, respectively. Data for representative mutants is shown. The relative estimates of protein present in the soluble fraction and inclusion bodies for all mutants are shown in table 4. The arrow indicates the band for the induced chaperone.

For CcdB, at exposed nonactive-site residues virtually all mutations are tolerated. At a few highly exposed positions (>40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by about 5 kJ/mol and suggested that loss of packing interactions is the major contributor to this loss of stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains, and suggests that while small increases in volume can be tolerated, changes in shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size of up to three methylene groups can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current Protherm database (Kumar et al. 2006) (http://www.abren.net/protherm/, last accessed 31 August 2016), 4,805 single buried site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those of other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity score (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

          Strain                                         Fraction   Fractional increase in solubility(a)
Mutant    BW25113   BWΔtig   BWΔsecB   BWpTig   BWpSecB  soluble    Tig      SecB
WT        1         1        1         1        1        1          1        1
L16S      4         7        7         2        3        0.4        1.5      1.7
G29W      8         8        8         2        4        0.6        1.2      1.1
M32N      4         6        6         2        3        0.1        3        2
V33K      4         7        6         2        3        0.1        2        2
P35I      8         7        8         5        5        0.6        1.3      1.1
L36K      8         8        8         3        5        0.05       4        2
L41F      7         8        8         3        3        0.4        1.8      2.5
D67P      6         8        8         2        4        0.2        0.8      0.5
S70W      6         8        8         2        4        0.5        1        0.4
V73F      7         7        8         2        3        0.5        2        1.3
V80N      6         6        7         3        4        0.6        1.3      1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
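As a rough worked example of the footnote to table 4: the fractional increase in solubility is the soluble fraction with over-expressed chaperone divided by the soluble fraction under normal conditions, so multiplying the two tabulated numbers gives the implied soluble fraction in the presence of the chaperone. The sketch below assumes the table 4 entries for three mutants read as shown (the decimal placement is inferred, not certain), and the function name is hypothetical.

```python
# Illustrative only: values are table 4 entries as read here (decimal
# placement is an assumption); the helper name is hypothetical.

def soluble_with_chaperone(fraction_soluble, fold_increase):
    """Implied soluble fraction when the chaperone is over-expressed.

    fold_increase = soluble fraction (+chaperone) / soluble fraction (normal),
    so the product recovers the numerator of that ratio.
    """
    return fraction_soluble * fold_increase

# mutant: (fraction soluble, fold increase with Tig, fold increase with SecB)
table4 = {
    "L36K": (0.05, 4.0, 2.0),
    "M32N": (0.1, 3.0, 2.0),
    "D67P": (0.2, 0.8, 0.5),
}

for mutant, (base, tig, secb) in table4.items():
    with_tig = soluble_with_chaperone(base, tig)
    with_secb = soluble_with_chaperone(base, secb)
    print(f"{mutant}: normal={base:.2f}  +Tig={with_tig:.2f}  +SecB={with_secb:.2f}")
```

Read this way, a ratio below 1 (as for D67P) means the chaperone lowered the soluble fraction rather than raising it.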

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate with Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well folded protein results in sufficient activity to cause cell death even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had similar unfolding rates to wildtype.

The ability of a mutant to fold to the native state is affected by many parameters that include the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms which are involved in degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in the activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects, without additional energy input or protein stabilization, results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of Single Nucleotide Polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods
Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation
Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
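To make the NNK design concrete, the following sketch enumerates the codons covered by an NNK primer using the standard genetic code and counts what they encode. It also checks the stated coverage of the earlier library, assuming 19 possible substitutions at each of the 101 residues of CcdB; that denominator is an assumption of this example and is not stated in the text.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def nnk_codons():
    """All codons matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]

codons = nnk_codons()
encoded = [CODON_TABLE[c] for c in codons]
amino_acids = sorted(set(encoded) - {"*"})
stops = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), "NNK codons")
print(len(amino_acids), "amino acids encoded")
print("stop codons:", stops)

# Coverage of the earlier library (assumes 101 positions x 19 substitutions).
possible = 101 * 19
print(f"1430 of {possible} = {1430 / possible:.0%}")
```

Restricting the third base to G/T halves the codon set relative to NNN while still reaching every amino acid, and leaves TAG as the only stop codon in the mix.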

In Vivo Activity of Individual Single-Site Mutants
Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data was analyzed and compared with relative activity estimates obtained by deep sequencing.
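The plate readout can be summarized as the first condition, in order of increasing expression, at which a mutant no longer gives colonies. A minimal sketch follows, assuming the seven conditions are ordered from strongest repression to strongest induction and that the concentrations are percentages (the units are inferred here). The function name and the example colony data are hypothetical.

```python
# Seven plating conditions, ordered from lowest to highest CcdB expression.
# Concentrations are taken as percent; this ordering is an assumption.
CONDITIONS = [
    ("glucose", 2e-1),
    ("glucose", 4e-2),
    ("glucose", 7e-3),
    ("none", 0.0),
    ("arabinose", 2e-5),
    ("arabinose", 7e-5),
    ("arabinose", 2e-2),
]

def first_active_condition(colonies_seen):
    """Return the 1-based index of the first condition with no colonies.

    Active CcdB kills the cells, so absence of colonies marks activity.
    Returns None if colonies appear under all conditions.
    """
    if len(colonies_seen) != len(CONDITIONS):
        raise ValueError("expected one observation per condition")
    for index, grew in enumerate(colonies_seen, start=1):
        if not grew:
            return index
    return None

# Hypothetical mutant: colonies at conditions 1-3, none from condition 4 onward.
example = [True, True, True, False, False, False, False]
print(first_active_condition(example))
```

A lower index then corresponds to a more active mutant, since less expression is needed to kill the cells.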


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance
Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000g, 10 min, 4 °C. Various dilutions of supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases
Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), ClpB, DnaK, DnaJ, GroEL (all ATP dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases, Lon, ClpP, HslU, HslV, and HchA, were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989) and individually transformed with selected mutant CcdB plasmids and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose at 30 °C, as many of these strains are temperature sensitive. In case of chaperone over-expression strains, the medium used for recovery following transformation was LB + Chl (35 μg/ml), as the chaperone expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB + Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data was analyzed, and the condition where each of the mutants became active in presence or absence of the chaperone was tabulated.

Supplementary Material
Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References
Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing and protein structure. Annu Rev Biophys Bioeng 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci 79:19.25.1–19.25.26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett 17:1163–1169.

Thyagarajan B Bloom JD 2014 The inherent mutational tolerance andantigenic evolvability of influenza hemagglutinin Elife 3e03300

Tokuriki N Tawfik DS 2009 Chaperonin overexpression promotes ge-netic variation and enzyme evolution Nature 459668ndash673

Traxlmayr MW Hasenhindl C Hackl M Stadlmayr G Rybka JD Borth NGrillari J Ruker F Obinger C 2012 Construction of a stability land-scape of the CH3 domain of human IgG1 by combining directedevolution with high throughput sequencing J Mol Biol 423397ndash412

Tripathi A Varadarajan R 2014 Residue specific contributions to stabil-ity and activity inferred from saturation mutagenesis and deep se-quencing Curr Opin Struct Biol 2463ndash71

Tsai CJ Lin SL Wolfson HJ Nussinov R 1997 Studies of protein-proteininterfaces a statistical analysis of the hydrophobic effect Protein Sci653ndash64

Ullers RS Luirink J Harms N Schwager F Georgopoulos C Genevaux P2004 SecB is a bona fide generalized chaperone in Escherichia coliProc Natl Acad Sci U S A 1017583ndash7588

Van Melderen L Thi MH Lecchi P Gottesman S Couturier M MauriziMR 1996 ATP-dependent degradation of CcdA by Lon proteaseEffects of secondary structure and heterologous subunit interactionsJ Biol Chem 27127730ndash27738

Wang X Minasov G Shoichet BK 2002 Evolution of an antibiotic resis-tance enzyme constrained by stability and activity trade-offs J MolBiol 32085ndash95

Whitehead TA Chevalier A Song Y Dreyfus C Fleishman SJ De MattosC Myers CA Kamisetty H Blair P Wilson IA et al 2012Optimization of affinity specificity and function of designed influ-enza inhibitors using deep sequencing Nat Biotechnol 30543ndash548

Williams TA Fares MA 2010 The effect of chaperonin buffering onprotein evolution Genome Biol Evol 2609ndash619

Wolfenden R Lewis CA Jr Yuan Y Carter CW Jr 2015 Temperaturedependence of amino acid hydrophobicities Proc Natl Acad SciU S A 1127484ndash7488

Wu NC Olson CA Du Y Le S Tran K Remenyi R Gong D Al-MawsawiLQ Qi H Wu TT et al 2015 Functional Constraint Profiling of a ViralProtein Reveals Discordance of Evolutionary Conservation andFunctionality PLoS Genet 11e1005310

Wynn R Harkins PC Richards FM Fox RO 1996 Mobile unnaturalamino acid side chains in the core of staphylococcal nucleaseProtein Sci 51026ndash1031

Yang J Yan R Roy A Xu D Poisson J Zhang Y 2015 The I-TASSER Suiteprotein structure and function prediction Nat Methods 127ndash8

Yates CM Filippis I Kelley LA Sternberg MJ 2014 SuSPect enhancedprediction of single amino acid variant (SAV) phenotype using net-work features J Mol Biol 4262692ndash2701

Yue P Li Z Moult J 2005 Loss of protein structure stability as a majorcausative factor in monogenic disease J Mol Biol 353459ndash473

Yue P Moult J 2006 Identification and analysis of deleterious humanSNPs J Mol Biol 3561263ndash1274


primarily arise from a decrease in specific protein activity and not cellular protein levels (Firnberg et al. 2014), contrary to the results of the present study.

For CcdB, at exposed nonactive-site residues, virtually all mutations are tolerated. At a few highly exposed positions (>40% accessibility), aromatic residues and proline are not tolerated (supplementary table S4, Supplementary Material online), presumably because of aggregation or misfolding. Previous experimental studies have shown that the removal of one methylene group from the protein interior destabilizes a protein by 5 kJ/mol and suggested that loss of packing interactions is the major contributor to this decrease in stability (Main et al. 1998; Chakravarty et al. 2002; Loladze et al. 2002), though the relative contributions of packing and the hydrophobic effect to protein stabilization remain a matter of debate.

Residue substitution penalties derived from analysis of the CcdB mutant data (supplementary table S7, Supplementary Material online) indicate that substitutions of the aliphatic to aliphatic category are well tolerated. In contrast, aliphatic to aromatic changes are poorly tolerated, even when the volume change is equivalent to a single methylene group, such as going from I, L, or M to F (Richards 1977). This is likely due to the difference in shape between aliphatic and aromatic side-chains and suggests that, while small increases in volume can be tolerated, changes in the shape of the side-chains require more reorganization of the neighboring residues, which in turn incurs a higher energetic penalty.

While there have been many studies that address the stability effects associated with large to small substitutions (Main et al. 1998; Loladze et al. 2002), there are relatively few studies which have quantitated effects of small to large substitutions, particularly substitutions to aromatic residues (Liu et al. 2000; Tanaka et al. 2010). In fact, some studies have shown that very significant increases in residue size of up to three methylene groups can be well tolerated (Hellinga et al. 1992; Wynn et al. 1996), that energetic effects are highly context dependent (Main et al. 1998; Liu et al. 2000), and that such substitutions can even be stabilizing (Lim et al. 1994; Liu et al. 2000). In the current Protherm database (Kumar et al. 2006) (http://www.abren.net/protherm/, last accessed 31 August 2016), 4,805 single buried-site mutants from 180 proteins were available. About 1,667 mutants belonged to the aliphatic to aliphatic category, nearly half of them being mutations to alanine. Only 154 aliphatic to aromatic substitutions were available. About 50 aliphatic to aliphatic and 8 aliphatic to aromatic substitutions had similar volume increases, with average ΔΔG(H2O) values of 0.43 and 2.75 kcal/mol, respectively. Thus, consistent with our mutational data, aromatic substitutions are more destabilizing than aliphatic ones involving similar volume changes.

Burial of polar groups in the nonpolar interior of a protein is highly destabilizing, and the degree of destabilization depends on the relative polarity of the group (Main et al. 1998). Interestingly, in the saturation mutagenesis data, for charged and polar amino acids at buried positions, smaller amino acids were consistently more poorly tolerated than larger ones, whereas the opposite trend is observed for aromatic substitutions. Surprisingly, mutations at residues involved in cation-π and salt-bridge interactions were well tolerated, indicating that these interactions do not contribute significantly to the stability and function of CcdB.

By combining phenotypic data at multiple expression levels at all buried positions, it was possible to approximately rank order mutational effects of substitutions at buried positions. The results obtained for CcdB were remarkably similar to those of other proteins, PSD95pdz3 and GB1, for which saturation mutagenesis data were also available (McLaughlin et al. 2012; Olson et al. 2014), and differed from trends observed in free energy of transfer data (compare fig. 2A–C with supplementary fig. S1C and D, Supplementary Material online). Prediction of mutational sensitivity score (MSpred) for other proteins (PSD95pdz3 and GB1) using penalties derived from the CcdB data, taking into account the wildtype residue identity (table 2), gave encouraging results and shows the potential for the use of sequencing-based phenotypic data obtained from saturation mutagenesis in understanding and predicting the functional effects of mutations. The present approach compared favorably with known computational predictors (SNAP2 and SuSPect), showing more consistent results and higher specificity (table 2). These and data from other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

Table 4. In Vivo Activity and Solubility of CcdB Mutants in Presence and Absence of ATP-Independent Chaperones.

Mutant   Strain                                          Fraction   Fractional increase
                                                         soluble    in solubility(a)
         BW25113  BWΔtig  BWΔsecB  BWpTig  BWpSecB                  Tig     SecB
WT       1        1       1        1       1             1          1       1
L16S     4        7       7        2       3             0.4        1.5     1.7
G29W     8        8       8        2       4             0.6        1.2     1.1
M32N     4        6       6        2       3             0.1        3       2
V33K     4        7       6        2       3             0.1        2       2
P35I     8        7       8        5       5             0.6        1.3     1.1
L36K     8        8       8        3       5             0.05       4       2
L41F     7        8       8        3       3             0.4        1.8     2.5
D67P     6        8       8        2       4             0.2        0.8     0.5
S70W     6        8       8        2       4             0.5        1       0.4
V73F     7        7       8        2       3             0.5        2       1.3
V80N     6        6       7        3       4             0.6        1.3     1.2

(a) Ratio of the soluble fraction of the protein in the presence of over-expressed chaperone (Trigger Factor and SecB, respectively) to the soluble fraction of the protein under normal conditions.
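The footnote to Table 4 defines the fractional increase in solubility as a ratio of two soluble fractions. The arithmetic can be sketched as follows; the function name and the example numbers are illustrative assumptions and are not taken from the study's measurements.

```python
def fractional_increase(soluble_with_chaperone: float, soluble_normal: float) -> float:
    """Soluble fraction with over-expressed chaperone divided by the
    soluble fraction under normal conditions (Table 4, footnote a)."""
    if soluble_normal <= 0:
        raise ValueError("soluble fraction under normal conditions must be positive")
    return soluble_with_chaperone / soluble_normal


# Hypothetical mutant: 25% soluble normally, 50% soluble with the chaperone.
example_ratio = fractional_increase(0.50, 0.25)
```

A ratio above 1 would indicate that chaperone over-expression increased the soluble fraction, and a ratio below 1 that it decreased it.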

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate with Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well-folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had similar unfolding rates to wildtype.
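The correlation coefficients quoted above compare thermal stability with solubility and with in vivo phenotype. The text does not state which coefficient was used; the sketch below assumes a Pearson coefficient, and all names and data values in it are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical melting temperatures (deg C) and soluble fractions for five mutants.
melting_temps = [40.0, 45.0, 50.0, 55.0, 60.0]
soluble_fractions = [0.1, 0.3, 0.4, 0.7, 0.9]
r_stability_vs_solubility = pearson_r(melting_temps, soluble_fractions)
```

Running the same function on stability against an activity score would give the second coefficient needed for the comparison.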

The ability of a mutant to fold to the native state is affected by many parameters that include the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms which are involved in degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in the activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of single nucleotide polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods
Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation
Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
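The NNK degeneracy described above can be checked by direct enumeration. The sketch below is an illustration and not the authors' code: it builds the standard genetic code, lists the 32 NNK codons, and reports which amino acids and stop codons they encode. It also recomputes the library coverage quoted in the text, taking the 101-residue length of CcdB from the abstract; all variable names are assumptions.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

N = "ACGT"  # any base
K = "GT"    # G or T only
nnk_codons = [a + b + c for a, b, c in product(N, N, K)]

encoded = {CODON_TABLE[codon] for codon in nnk_codons}
amino_acids_covered = sorted(encoded - {"*"})
stop_codons = [codon for codon in nnk_codons if CODON_TABLE[codon] == "*"]

# Coverage of the earlier library: 19 substitutions at each of 101 positions.
possible_single_mutants = 101 * 19
earlier_library_fraction = 1430 / possible_single_mutants

print(len(nnk_codons), len(amino_acids_covered), stop_codons)
print(possible_single_mutants, round(earlier_library_fraction, 3))
```

Restricting the third base to G or T keeps all 20 amino acids accessible while leaving TAG as the only stop codon in the mix.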

In Vivo Activity of Individual Single-Site Mutants
Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data were analyzed and compared with relative activity estimates obtained by deep sequencing.


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance
Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000g, 10 min, 4 °C. Various dilutions of supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).
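Binding in this assay is reported as a change in resonance units per unit time, that is, a slope. One way to estimate such a slope is an ordinary least-squares fit over the association phase, sketched below. It assumes that this phase is approximately linear; the function name and the sensorgram values are hypothetical and do not come from the study or from the Biacore software.

```python
def binding_rate(times_s, response_ru):
    """Least-squares slope of response (RU) against time (s)."""
    if len(times_s) != len(response_ru) or len(times_s) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(times_s) / len(times_s)
    mean_r = sum(response_ru) / len(response_ru)
    numerator = sum((t - mean_t) * (r - mean_r) for t, r in zip(times_s, response_ru))
    denominator = sum((t - mean_t) ** 2 for t in times_s)
    if denominator == 0:
        raise ValueError("time points must not all be identical")
    return numerator / denominator


# Hypothetical sensorgram: response rises by 5 RU every 10 s.
example_rate = binding_rate([0, 10, 20, 30], [0.0, 5.0, 10.0, 15.0])
```

Comparing slopes across lysate dilutions for a mutant and for wildtype would then give a relative measure of Gyrase-binding protein in the soluble fraction.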

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases
Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), and ClpB, DnaK, DnaJ, and GroEL (all ATP-dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the proteases Lon, ClpP, HslU, HslV, and HchA were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose at 30 °C, as many of these strains are temperature sensitive. In the case of chaperone over-expression strains, the medium used for recovery following transformation was LB + Chl (35 µg/ml), as the chaperone-expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB + Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data were analyzed, and the condition where each of the mutants became active in the presence or absence of the chaperone was tabulated.
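The tabulation step described above amounts to finding, for each mutant, the first condition along the repressor-to-inducer ladder at which no colonies grow. A minimal sketch of that logic follows. It assumes the seven conditions are ordered from strongest repression to strongest induction; the function name and the scoring convention (1 to 7, or None if colonies grow everywhere) are hypothetical and not taken from the paper.

```python
N_CONDITIONS = 7  # seven glucose/arabinose conditions listed in the text


def first_active_condition(colonies_observed):
    """Return the 1-based index of the first condition with no colonies.

    colonies_observed: booleans ordered from highest glucose (lowest CcdB
    expression) to highest arabinose (highest CcdB expression). True means
    colonies grew, i.e. the mutant was inactive at that expression level.
    Returns None if colonies grew under every condition.
    """
    if len(colonies_observed) != N_CONDITIONS:
        raise ValueError(f"expected {N_CONDITIONS} observations")
    for index, grew in enumerate(colonies_observed, start=1):
        if not grew:
            return index
    return None


# Hypothetical mutant that kills cells from the fourth condition onward.
example_score = first_active_condition([True, True, True, False, False, False, False])
```

Under this convention, a lower score in a chaperone over-expression strain than in BW25113 would correspond to the chaperone rescuing activity of that mutant.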

Supplementary Material
Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number No. BT/COE/34/SP15219/2015, DT. 20/11/2015) and the Department of Science and Technology, Government of India (grant number F. No. SB/SO/BB-0099/2013, DTD. 24.6.14). The authors declare that no competing interests exist.

References
Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A. 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A. 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J. 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol. 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A. 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A. 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A. 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem. 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies [PhD thesis]. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A. 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol. 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol. 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet. 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol. 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol. 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem. 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A. 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A. 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng. 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev. 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol. 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A. 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc. 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A. 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol. 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem. 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol. 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet. 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res. 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A. 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol. 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol. 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A. 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol. 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol. 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol. 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res. 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol. 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A. 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci. 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc. 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol. 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol. 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci. 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res. 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res. 42:W314–W319.


Williams TA Fares MA 2010 The effect of chaperonin buffering onprotein evolution Genome Biol Evol 2609ndash619

Wolfenden R Lewis CA Jr Yuan Y Carter CW Jr 2015 Temperaturedependence of amino acid hydrophobicities Proc Natl Acad SciU S A 1127484ndash7488

Wu NC Olson CA Du Y Le S Tran K Remenyi R Gong D Al-MawsawiLQ Qi H Wu TT et al 2015 Functional Constraint Profiling of a ViralProtein Reveals Discordance of Evolutionary Conservation andFunctionality PLoS Genet 11e1005310

Wynn R Harkins PC Richards FM Fox RO 1996 Mobile unnaturalamino acid side chains in the core of staphylococcal nucleaseProtein Sci 51026ndash1031

Yang J Yan R Roy A Xu D Poisson J Zhang Y 2015 The I-TASSER Suiteprotein structure and function prediction Nat Methods 127ndash8

Yates CM Filippis I Kelley LA Sternberg MJ 2014 SuSPect enhancedprediction of single amino acid variant (SAV) phenotype using net-work features J Mol Biol 4262692ndash2701

Yue P Li Z Moult J 2005 Loss of protein structure stability as a majorcausative factor in monogenic disease J Mol Biol 353459ndash473

Yue P Moult J 2006 Identification and analysis of deleterious humanSNPs J Mol Biol 3561263ndash1274

Molecular Determinants of Mutant Phenotypes. doi:10.1093/molbev/msw182. MBE


other saturation mutagenesis studies can be used to improve predictions of effects of nonsynonymous single nucleotide polymorphisms on protein activity (Guerois et al. 2002; Randles et al. 2006; Yue and Moult 2006; Bromberg et al. 2008; Radivojac et al. 2013), as well as for protein threading applications to guide structure prediction (Shen and Sali 2006; Yang et al. 2015).

To obtain further insights into determinants of phenotypes, a set of 80 mutants was expressed and purified. They showed a range of stabilities. Thermal stabilities measured by thermal shift assay (Niesen et al. 2007) and equilibrium chemical denaturation were well correlated. Mutations affect both the thermodynamic stability and aggregation propensity of proteins by enhancing misfolding. Both these factors lead to a decrease in the amount of properly folded, active protein. Thermal stabilities of CcdB mutants correlated better with the amount of soluble protein present in a cell (r = 0.82) than with in vivo phenotype (r = 0.65). In some cases, despite being highly soluble, mutants show low activity in vivo, suggesting that a significant fraction of soluble mutant protein is misfolded and that this fraction differs between mutants. In other cases, mutants show high or moderate in vivo activity but differ in in vivo solubility. Both these observations could be rationalized by monitoring in vitro binding of CcdB mutants in the soluble fraction of the cell lysate with Gyrase using surface plasmon resonance. Mutants with high solubility but low activity also show low binding to Gyrase, whereas partially soluble mutants with high in vivo activity bind well to Gyrase in this assay (supplementary fig. S4, Supplementary Material online). This shows that even a small amount of well folded protein results in sufficient activity to cause cell death, even at the lowest level of expression, despite low solubility and stability. Refolding and unfolding kinetics for a subset of mutants suggest that slow refolding rates measured in vitro correlate with the tendency to form inclusion bodies in vivo. Additionally, several inactive mutants fail to refold to a functional state in vitro as well. In contrast to the refolding rates, most mutants studied had unfolding rates similar to wildtype.
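The comparison of correlation coefficients in the preceding paragraph (thermal stability against soluble protein versus thermal stability against in vivo phenotype) can be illustrated with a minimal sketch. The text does not say which correlation coefficient was used, so Pearson's r is an assumption here. The function name `pearson_r` and all numbers in the example lists are hypothetical placeholders, not measurements from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values for five mutants (NOT data from the paper).
tm_celsius = [48.0, 55.5, 61.0, 52.0, 66.5]        # thermal stability
soluble_fraction = [0.20, 0.45, 0.80, 0.35, 0.90]  # in vivo solubility
activity_score = [2, 5, 6, 6, 7]                   # in vivo phenotype score

print(round(pearson_r(tm_celsius, soluble_fraction), 2))
print(round(pearson_r(tm_celsius, activity_score), 2))
```

Running the same function on both pairings makes the two r values directly comparable, which is the kind of side-by-side comparison the paragraph describes.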

The ability of a mutant to fold to the native state is affected by many parameters, including the crowded environment of the cell, folding assistance by various chaperones that buffer mutational effects on protein stability, and quality control mechanisms involved in degradation and removal of misfolded proteins from a cell. These factors are likely responsible for the less than perfect correlation between in vitro stability and in vivo activity. To study these effects, the cellular proteostasis machinery was perturbed by either over-expression or depletion of various chaperones and proteases. Interestingly, the most significant changes in the in vivo activity of many mutants were observed upon perturbing the levels of two ATP-independent chaperones, SecB and Trigger Factor, both of which act on their targets while the nascent polypeptide chain is being synthesized at the ribosome. This suggests that many of the CcdB mutants are targeted to inclusion bodies due to defects early in the folding pathway. Over-expression of these chaperones led to an increase in the amount of folded protein in the cell, as well as increased in vivo activity and solubility for several formerly inactive mutants, whereas chaperone deletion led to a corresponding decrease in activity. These chaperones have previously been shown to increase soluble protein expression by rescuing folding defects (Nishihara et al. 2000). Since these chaperones are ATP-independent, the data clearly show that rescuing folding defects without additional energy input or protein stabilization results in increased activity in vivo.

In conclusion, comprehensive analyses of a CcdB saturation mutagenesis library reveal the contribution of each residue to protein activity and function. Protein activity was found to depend monotonically on expression level and was related to stability and solubility in a complex fashion, but correlated well with the ability of mutant protein in the soluble fraction of the cell lysate to bind DNA Gyrase. The moderate correlation of stability with activity, the high in vivo activity of several destabilized mutants, and the ability of the ATP-independent chaperones SecB and Trigger Factor to enhance mutant activity all suggest that mutational effects on folding, rather than on solubility or stability, are the primary determinant of CcdB activity and fitness in vivo. Despite this apparent mechanistic complication, the data demonstrate consistent preferences in accommodating specific residues at buried positions. Besides enhancing our understanding of how mutations affect phenotype, these data can be used to enhance predictions of fitness effects of Single Nucleotide Polymorphisms and to guide protein design and structure prediction efforts.

Materials and Methods

Information about all the strains used in this study is available in supplementary table S1, Supplementary Material online.

Mutant Library Preparation

Previously, a total of 1,430 single-site mutants of CcdB (75% of possible mutants) were generated by using a mega-primer based method (Bajaj et al. 2005, 2008). In the present study, an inverse-PCR based approach was used, and mutagenesis was carried out by using adjacent, nonoverlapping forward and reverse primers. The forward primer contained the mutant codon NNK in the middle of the primer (N is A/C/G/T and K is G/T in equimolar ratio). The individual products were pooled, gel purified, phosphorylated, subjected to intramolecular ligation, and transformed to generate the mutant library (Jain and Varadarajan 2014).
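The NNK scheme described above can be checked by enumerating the degenerate codon against the standard genetic code. The sketch below is only an illustration: the variable names are hypothetical, and the count of possible single-site mutants assumes 19 substitutions at each of the 101 residues mentioned in the abstract.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """All codons matching NNK: N is any base, K is G or T."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]
print(len(codons), len(amino_acids), stop_codons)  # prints: 32 20 ['TAG']

# Assumption: 19 possible substitutions at each of 101 positions.
n_residues = 101
possible_single_mutants = n_residues * 19
print(possible_single_mutants, round(100 * 1430 / possible_single_mutants))  # prints: 1919 75
```

The enumeration shows why NNK is a common choice for saturation mutagenesis: 32 codons cover all 20 amino acids while admitting only one stop codon. The second line of output is consistent with the statement that 1,430 mutants correspond to about 75% of possible mutants.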

In Vivo Activity of Individual Single-Site Mutants

Escherichia coli strain TOP10pJAT was individually transformed with mutant CcdB plasmids, and activity was assayed by plating the transformation mix on LB-amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 4 × 10⁻²% glucose, 7 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻⁵% arabinose, 7 × 10⁻⁵% arabinose, and 2 × 10⁻²% arabinose at 37 °C. Since active CcdB protein kills the cells, colonies were obtained only for mutants that showed an inactive phenotype. Plate data was analyzed and compared with relative activity estimates obtained by deep sequencing.


Determination of Active Fraction of the Protein in the Cell Lysate Using Surface Plasmon Resonance

Cultures of E. coli strain CSH501 transformed with the mutant of interest were grown in LB media, induced with 0.2% (w/v) arabinose at an OD600 of 0.6, and grown for 3 h at 37 °C. Cells were centrifuged (1,800g, 10 min, RT). The pellet was resuspended in PBS buffer, pH 7.4, and sonicated, followed by centrifugation at 11,000g, 10 min, 4 °C. Various dilutions of supernatant were passed over GyrA14 fragment immobilized on the surface of a CM5 chip, and binding was monitored as change in resonance units per unit time. Analysis was carried out on a Biacore 3000 instrument (Biacore, GE Healthcare).
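One simple way to turn "change in resonance units per unit time" into a single number is a least-squares slope over a window of the sensorgram. The sketch below assumes that approach; the text does not describe the fitting procedure actually used, and the function name and sensorgram values are hypothetical.

```python
def binding_slope(times_s, response_ru):
    """Least-squares slope of response (RU) against time (s)."""
    n = len(times_s)
    if n != len(response_ru) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(times_s) / n
    mean_r = sum(response_ru) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("all time points are identical")
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times_s, response_ru))
    return sxy / sxx


# Hypothetical sensorgram window for one lysate dilution (NOT measured data).
times_s = [0, 10, 20, 30, 40]
response_ru = [0.0, 4.1, 7.9, 12.2, 15.8]
print(binding_slope(times_s, response_ru))  # slope in RU per second
```

Comparing slopes obtained at the same dilution for different mutants would give a relative measure of Gyrase-binding protein in each lysate, under the assumption that the chosen window lies in the roughly linear part of the association phase.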

In Vivo Activity of CcdB Mutants in Presence and Absence of Chaperones and Proteases

Escherichia coli BW25113 strain was transformed with plasmids expressing the following chaperones: Trigger factor and SecB (both ATP-independent), ClpB, DnaK, DnaJ, GroEL (all ATP-dependent chaperones). The resulting strains were referred to as BWpTig, BWpSecB, BWpClpB, BWpDnaK, BWpDnaJ, and BWpGroEl. In addition, BW25113 strains deleted for the following proteases, Lon, ClpP, HslU, HslV, and HchA, were also used and referred to as BWΔlon, BWΔclpP, BWΔhslU, BWΔhslV, and BWΔhchA, respectively. Competent cells of each of these E. coli strains were prepared (Chung et al. 1989), individually transformed with selected mutant CcdB plasmids, and grown in deep well plates. Transformation with pUC19 was used as a transformation efficiency control. Activity of the mutants was assayed by spotting the transformation mix on LB-Amp plates in the presence of the following concentrations of glucose (repressor) or arabinose (inducer): 2 × 10⁻¹% glucose, 2 × 10⁻²% glucose, 2 × 10⁻³% glucose, 0% glucose/arabinose, 2 × 10⁻³% arabinose, 2 × 10⁻²% arabinose, and 2 × 10⁻¹% arabinose at 30 °C, as many of these strains are temperature sensitive. In case of chaperone over-expression strains, the medium used for recovery following transformation was LB+Chl (35 µg/ml), as the chaperone expressing plasmids are ChlR. After 60 min of incubation at 30 °C in the above medium, cultures were spotted on LB+Amp plates containing 0.5 mM IPTG to induce chaperone expression and various concentrations of glucose and arabinose, as described above, to modulate CcdB expression. Since active CcdB protein kills the cells, colonies are obtained only for mutants that show an inactive phenotype under the conditions examined. Plates were imaged, data was analyzed, and the condition where each of the mutants became active in presence or absence of the chaperone was tabulated.
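The tabulation described above (the condition at which each mutant became active, with or without a chaperone) can be sketched as a small scoring routine. This is an illustration under stated assumptions, not the authors' analysis script: the function names are hypothetical, the condition labels are the seven concentrations listed above written as decimals and ordered from most repressed to most induced, and a mutant that never becomes active is scored as one step beyond the last condition.

```python
# Seven plate conditions, ordered from lowest to highest CcdB expression.
CONDITIONS = [
    "0.2% glucose", "0.02% glucose", "0.002% glucose",
    "0% glucose/arabinose",
    "0.002% arabinose", "0.02% arabinose", "0.2% arabinose",
]


def first_active_index(colonies_seen):
    """Index of the first condition without colonies (mutant is active there).

    colonies_seen[i] is True when colonies grew at CONDITIONS[i], meaning the
    mutant was inactive at that expression level. Returns None when colonies
    grew under every condition.
    """
    if len(colonies_seen) != len(CONDITIONS):
        raise ValueError("expected one observation per condition")
    for i, grew in enumerate(colonies_seen):
        if not grew:
            return i
    return None


def chaperone_shift(without_chaperone, with_chaperone):
    """Conditions gained by the chaperone (positive means rescue of activity)."""
    n = len(CONDITIONS)
    base = first_active_index(without_chaperone)
    test = first_active_index(with_chaperone)
    base = n if base is None else base
    test = n if test is None else test
    return base - test


# Hypothetical mutant: active only at the highest expression level without a
# chaperone, but already active at 0% glucose/arabinose with a chaperone.
without = [True, True, True, True, True, True, False]
with_chap = [True, True, True, False, False, False, False]
print(CONDITIONS[first_active_index(with_chap)], chaperone_shift(without, with_chap))
```

A positive shift means the mutant kills cells at a lower expression level when the chaperone is over-expressed, which is how increased in vivo activity would appear in this plate assay.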

Supplementary Material

Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NO. BT/COE/34/SP15219/2015, DT. 20/11/2015) and the Department of Science and Technology, Government of India (grant number F. No. SB/SO/BB-0099/2013, DTD. 24.6.14). The authors declare that no competing interests exist.

References

Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies. PhD thesis. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing, and protein structure. Annu Rev Biophys Bioeng 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci 79:19 25 11–19 25 26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol 356:1263–1274.


Determination of Active Fraction of the Protein in theCell Lysate Using Surface Plasmon ResonanceCultures of E coli strain CSH501 transformed with the mutantof interest were grown in LB media induced with 02 (wv)arabinose at an OD600 of 06 and grown for 3 h at 37 C Cellswere centrifuged (1800g 10 min RT) The pellet was resus-pended in PBS buffer pH 74 and sonicated followed by cen-trifugation at 11000g 10 min 4C Various dilutions ofsupernatant were passed over GyrA14 fragment immobilizedon the surface of a CM5 chip and binding was monitored aschange in resonance units per unit time Analysis was carriedout on a Biacore 3000 instrument (Biacore GE Healthcare)

In Vivo Activity of CcdB Mutants in Presence andAbsence of Chaperones and ProteasesEscherichia coli BW25113 strain was transformed with plas-mids expressing the following chaperones Trigger factor andSecB (both ATP-independent) ClpB DnaK DnaJ GroEL (allATP dependent chaperones) The resulting strains were re-ferred to as BWpTig BWpSecB BWpClpB BWpDnaKBWpDnaJ and BWpGroEl In addition BW25113 strains de-leted for the following proteases Lon ClpP HslU HslV andHchA were also used and referred to as BWDlon BWDclpPBWDhslU BWDhslV and BWDhchA respectivelyCompetent cells of each of these E coli strains were prepared(Chung et al 1989) and individually transformed with se-lected mutant CcdB plasmids and grown in deep well platesTransformation with pUC19 was used as a transformationefficiency control Activity of the mutants was assayed byspotting the transformation mix on LB-Amp plates in thepresence of the following concentrations of glucose (repres-sor) or arabinose (inducer) 2 10 1 glucose 2 10 2glucose 2 10 3 glucose 0 glucosearabinose2 10 3 arabinose 2 10 2 arabinose and 2 10 1arabinose at 30 C as many of these strains are temperaturesensitive In case of chaperone over-expression strains me-dium used for recovery following transformation wasLBthornChl (35 mgml) as the chaperone expressing plasmidsare ChlR After 60 min of incubation at 30C in the abovemedium cultures were spotted on LBthornAmp plates contain-ing 05 mM IPTG to induce chaperone expression and variousconcentrations of glucose and arabinose as described aboveto modulate CcdB expression Since active CcdB protein killsthe cells colonies are obtained only for mutants that show aninactive phenotype under the conditions examined Plateswere imaged data was analyzed and the condition whereeach of the mutants became active in presence or absenceof the chaperone was tabulated

Supplementary Material

Supplementary figures S1–S6, tables S1–S9, CcdB MSseq data (S1_Appendix.xlsx), and Materials and Methods are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

We thank Dr Abhijit Sarkar (Center for DNA Fingerprinting and Diagnosis, India) for providing chaperone and protease deletion strains. This work was supported by the Department of Biotechnology (grant number NOBTCOE34SP152192015 DT20112015) and the Department of Science and Technology, Government of India (grant number FNoSBSOBB-00992013 DTD 24614). The authors declare that no competing interests exist.

References

Abriata LA, Palzkill T, Dal Peraro M. 2015. How structural and physicochemical determinants shape sequence constraints in a functional enzyme. PLoS One 10:e0118684.

Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, Swarnkar MK, Gokhale RS, Varadarajan R. 2012. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure 20:371–381.

Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. 2012. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A 109:16858–16863.

Bajaj K, Chakrabarti P, Varadarajan R. 2005. Mutagenesis-based definitions and probes of residue burial in proteins. Proc Natl Acad Sci U S A 102:16221–16226.

Bajaj K, Chakshusmathi G, Bachhawat-Sikder K, Surolia A, Varadarajan R. 2004. Thermodynamic characterization of monomeric and dimeric forms of CcdB (controller of cell division or death B protein). Biochem J 380:409–417.

Bajaj K, Dewan PC, Chakrabarti P, Goswami D, Barua B, Baliga C, Varadarajan R. 2008. Structural correlates of the temperature sensitive phenotype derived from saturation mutagenesis studies of CcdB. Biochemistry 47:12964–12973.

Bajaj K, Madhusudhan MS, Adkar BV, Chakrabarti P, Ramakrishnan C, Sali A, Varadarajan R. 2007. Stereochemical criteria for prediction of the effects of proline mutations on protein stability. PLoS Comput Biol 3:e241.

Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. 2013. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell 49:133–144.

Bershtein S, Mu W, Shakhnovich EI. 2012. Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109:4857–4862.

Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. 2006. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 444:929–932.

Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. 2005. Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci U S A 102:606–611.

Bowie JU, Sauer RT. 1989. Identifying determinants of folding and activity for a protein of unknown structure. Proc Natl Acad Sci U S A 86:2152–2156.

Brandts JF, Lin LN. 1990. Study of strong to ultratight protein interactions using differential scanning calorimetry. Biochemistry 29:6927–6940.

Bromberg Y, Yachdav G, Rost B. 2008. SNAP predicts effect of mutations on protein function. Bioinformatics 24:2397–2398.

Chakravarty S, Bhinge A, Varadarajan R. 2002. A procedure for detection and quantitation of cavity volumes proteins. Application to measure the strength of the hydrophobic driving force in protein folding. J Biol Chem 277:31345–31353.

Chakshusmathi G. 2002. Temperature sensitive mutants of CcdB: in vivo and in vitro studies [PhD thesis]. Indian Institute of Science, Bangalore, India.

Chung CT, Niemela SL, Miller RH. 1989. One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc Natl Acad Sci U S A 86:2172–2175.


De Jonge N, Garcia-Pino A, Buts L, Haesaerts S, Charlier D, Zangger K, Wyns L, De Greve H, Loris R. 2009. Rejuvenation of CcdB-poisoned gyrase by an intrinsically disordered protein domain. Mol Cell 35:154–163.

DeBartolo J, Dutta S, Reich L, Keating AE. 2012. Predictive Bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422:124–144.

Deng Z, Huang W, Bakkalbasi E, Brown NG, Adamski CJ, Rice K, Muzny D, Gibbs RA, Palzkill T. 2012. Deep sequencing of systematic combinatorial libraries reveals beta-lactamase sequence constraints at high resolution. J Mol Biol 424:150–167.

DePristo MA, Weinreich DM, Hartl DL. 2005. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6:678–687.

Dutta S, Chen TS, Keating AE. 2013. Peptide ligands for pro-survival protein Bfl-1 from computationally guided library screening. ACS Chem Biol 8:778–788.

Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 31:1581–1592.

Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. 2010. High-resolution mapping of protein sequence-function relationships. Nat Methods 7:741–746.

Fukada H, Sturtevant JM, Quiocho FA. 1983. Thermodynamics of the binding of L-arabinose and of D-galactose to the L-arabinose-binding protein of Escherichia coli. J Biol Chem 258:13193–13198.

Gallivan JP, Dougherty DA. 1999. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A 96:9459–9464.

Geiler-Samerotte KA, Dion MF, Budnik BA, Wang SM, Hartl DL, Drummond DA. 2011. Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein response in yeast. Proc Natl Acad Sci U S A 108:680–685.

Gonzalez M, Argarana CE, Fidelio GD. 1999. Extremely high thermal stability of streptavidin and avidin upon biotin binding. Biomol Eng 16:67–72.

Gottesman S, Wickner S, Maurizi MR. 1997. Protein quality control: triage by chaperones and proteases. Genes Dev 11:815–823.

Guerois R, Nielsen JE, Serrano L. 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.

Hayes F. 2003. Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301:1496–1499.

Hecht M, Bromberg Y, Rost B. 2015. Better prediction of functional effects for sequence variants. BMC Genomics 16(Suppl 8):S1.

Hellinga HW, Wynn R, Richards FM. 1992. The hydrophobic core of Escherichia coli thioredoxin shows a high tolerance to nonconservative single amino acid substitutions. Biochemistry 31:11203–11209.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–10919.

Hietpas R, Roscoe B, Jiang L, Bolon DN. 2012. Fitness analyses of all possible point mutations for regions of genes in yeast. Nat Protoc 7:1382–1396.

Hietpas RT, Jensen JD, Bolon DN. 2011. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A 108:7896–7901.

Jaffe A, Ogura T, Hiraga S. 1985. Effects of the ccd function of the F plasmid on bacterial growth. J Bacteriol 163:841–849.

Jain PC, Varadarajan R. 2014. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal Biochem 449:90–98.

Janin J, Miller S, Chothia C. 1988. Surface, subunit interfaces and interior of oligomeric proteins. J Mol Biol 204:155–164.

Jiang L, Mishra P, Hietpas RT, Zeldovich KB, Bolon DN. 2013. Latent effects of Hsp90 mutants revealed at reduced expression levels. PLoS Genet 9:e1003600.

Kim I, Miller CR, Young DL, Fields S. 2013. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics 12:3370–3378.

Kowalsky CA, Klesmith JR, Stapleton JA, Kelly V, Reichkitzer N, Whitehead TA. 2015. High-resolution sequence-function mapping of full-length proteins. PLoS One 10:e0118193.

Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. 2006. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34:D204–D206.

Lim WA, Hodel A, Sauer RT, Richards FM. 1994. The crystal structure of a mutant protein with altered but improved hydrophobic core packing. Proc Natl Acad Sci U S A 91:423–427.

Lin YS, Hsu WL, Hwang JK, Li WH. 2007. Proportion of solvent-exposed amino acids in a protein and rate of protein evolution. Mol Biol Evol 24:1005–1011.

Liu R, Baase WA, Matthews BW. 2000. The introduction of strain and its effects on the structure and stability of T4 lysozyme. J Mol Biol 295:127–145.

Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. 2014. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci U S A 111:4449–4454.

Loladze VV, Ermolenko DN, Makhatadze GI. 2002. Thermodynamic consequences of burial of polar and non-polar amino acid residues in the protein interior. J Mol Biol 320:343–357.

Loris R, Dao-Thi MH, Bahassi EM, Van Melderen L, Poortmans F, Liddington R, Couturier M, Wyns L. 1999. Crystal structure of CcdB, a topoisomerase poison from E. coli. J Mol Biol 285:1667–1677.

Maier T, Ferbitz L, Deuerling E, Ban N. 2005. A cradle for new proteins: trigger factor at the ribosome. Curr Opin Struct Biol 15:204–212.

Main ER, Fulton KF, Jackson SE. 1998. Context-dependent nature of destabilizing mutations on the stability of FKBP12. Biochemistry 37:6145–6153.

McLaughlin RN Jr, Poelwijk FJ, Raman A, Gosal WS, Ranganathan R. 2012. The spatial architecture of protein function and adaptation. Nature 491:138–142.

Melnikov A, Rogov P, Wang L, Gnirke A, Mikkelsen TS. 2014. Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res 42:e112.

Milla ME, Brown BM, Sauer RT. 1994. Protein stability effects of a complete set of alanine substitutions in Arc repressor. Nat Struct Biol 1:518–523.

Miosge LA, Field MA, Sontani Y, Cho V, Johnson S, Palkova A, Balakishnan B, Liang R, Zhang Y, Lyon S, et al. 2015. Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci U S A 112:E5189–E5198.

Mogk A, Kummer E, Bukau B. 2015. Cooperation of Hsp70 and Hsp100 chaperone machines in protein disaggregation. Front Mol Biosci 2:22.

Moretti R, Fleishman SJ, Agius R, Torchala M, Bates PA, Kastritis PL, Rodrigues JP, Trellet M, Bonvin AM, Cui M, et al. 2013. Community-wide evaluation of methods for predicting the effect of mutations on protein-protein interactions. Proteins 81:1980–1987.

Niesen FH, Berglund H, Vedadi M. 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc 2:2212–2221.

Nishihara K, Kanemori M, Yanagi H, Yura T. 2000. Overexpression of trigger factor prevents aggregation of recombinant proteins in Escherichia coli. Appl Environ Microbiol 66:884–889.

Olson CA, Wu NC, Sun R. 2014. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain. Curr Biol 24:2643–2651.

Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. 1992. Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.

Parthiban V, Gromiha MM, Schomburg D. 2006. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34:W239–W242.

Pires DE, Ascher DB, Blundell TL. 2014. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res 42:W314–W319.

Ponder JW, Richards FM. 1987. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193:775–791.


Prajapati RS, Sirajuddin M, Durani V, Sreeramulu S, Varadarajan R. 2006. Contribution of cation-pi interactions to protein stability. Biochemistry 45:15000–15010.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, et al. 2013. A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.

Randles LG, Lappalainen I, Fowler SB, Moore B, Hamill SJ, Clarke J. 2006. Using model proteins to quantify the effects of pathogenic mutations in Ig-like proteins. J Biol Chem 281:24216–24226.

Richards FM. 1977. Areas, volumes, packing and protein structure. Annu Rev Biophys Bioeng 6:151–176.

Romero PA, Tran TM, Abate AR. 2015. Dissecting enzyme function with microfluidic-based deep mutational scanning. Proc Natl Acad Sci U S A 112:7159–7164.

Roscoe BP, Thayer KM, Zeldovich KB, Fushman D, Bolon DN. 2013. Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J Mol Biol 425:1363–1377.

Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 229:834–838.

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. 2015. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. Elife 4:e09532.

Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. 2016. Local fitness landscape of the green fluorescent protein. Nature 533:397–401.

Schlinkmann KM, Honegger A, Tureci E, Robison KE, Lipovsek D, Pluckthun A. 2012. Critical features for biosynthesis, stability, and functionality of a G protein-coupled receptor uncovered by all-versus-all mutations. Proc Natl Acad Sci U S A 109:9810–9815.

Shen MY, Sali A. 2006. Statistical potential for assessment and prediction of protein structures. Protein Sci 15:2507–2524.

Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S, Klevit RE. 2013. Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc Natl Acad Sci U S A 110:E1263–E1272.

Starita LM, Young DL, Islam M, Kitzman JO, Gullingsrud J, Hause RJ, Fowler DM, Parvin JD, Shendure J, Fields S. 2015. Massively Parallel Functional Analysis of BRCA1 RING Domain Variants. Genetics 200:413–422.

Sultana A, Lee JE. 2015. Measuring protein-protein and protein-nucleic acid interactions by biolayer interferometry. Curr Protoc Protein Sci 79:19.25.1–19.25.26.

Tanaka M, Chon H, Angkawidjaja C, Koga Y, Takano K, Kanaya S. 2010. Protein core adaptability: crystal structures of the cavity-filling variants of Escherichia coli RNase HI. Protein Pept Lett 17:1163–1169.

Thyagarajan B, Bloom JD. 2014. The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin. Elife 3:e03300.

Tokuriki N, Tawfik DS. 2009. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 459:668–673.

Traxlmayr MW, Hasenhindl C, Hackl M, Stadlmayr G, Rybka JD, Borth N, Grillari J, Ruker F, Obinger C. 2012. Construction of a stability landscape of the CH3 domain of human IgG1 by combining directed evolution with high throughput sequencing. J Mol Biol 423:397–412.

Tripathi A, Varadarajan R. 2014. Residue specific contributions to stability and activity inferred from saturation mutagenesis and deep sequencing. Curr Opin Struct Biol 24:63–71.

Tsai CJ, Lin SL, Wolfson HJ, Nussinov R. 1997. Studies of protein-protein interfaces: a statistical analysis of the hydrophobic effect. Protein Sci 6:53–64.

Ullers RS, Luirink J, Harms N, Schwager F, Georgopoulos C, Genevaux P. 2004. SecB is a bona fide generalized chaperone in Escherichia coli. Proc Natl Acad Sci U S A 101:7583–7588.

Van Melderen L, Thi MH, Lecchi P, Gottesman S, Couturier M, Maurizi MR. 1996. ATP-dependent degradation of CcdA by Lon protease. Effects of secondary structure and heterologous subunit interactions. J Biol Chem 271:27730–27738.

Wang X, Minasov G, Shoichet BK. 2002. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol 320:85–95.

Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. 2012. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat Biotechnol 30:543–548.

Williams TA, Fares MA. 2010. The effect of chaperonin buffering on protein evolution. Genome Biol Evol 2:609–619.

Wolfenden R, Lewis CA Jr, Yuan Y, Carter CW Jr. 2015. Temperature dependence of amino acid hydrophobicities. Proc Natl Acad Sci U S A 112:7484–7488.

Wu NC, Olson CA, Du Y, Le S, Tran K, Remenyi R, Gong D, Al-Mawsawi LQ, Qi H, Wu TT, et al. 2015. Functional Constraint Profiling of a Viral Protein Reveals Discordance of Evolutionary Conservation and Functionality. PLoS Genet 11:e1005310.

Wynn R, Harkins PC, Richards FM, Fox RO. 1996. Mobile unnatural amino acid side chains in the core of staphylococcal nuclease. Protein Sci 5:1026–1031.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nat Methods 12:7–8.

Yates CM, Filippis I, Kelley LA, Sternberg MJ. 2014. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol 426:2692–2701.

Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473.

Yue P, Moult J. 2006. Identification and analysis of deleterious human SNPs. J Mol Biol 356:1263–1274.



De Jonge N Garcia-Pino A Buts L Haesaerts S Charlier D Zangger KWyns L De Greve H Loris R 2009 Rejuvenation of CcdB-poisonedgyrase by an intrinsically disordered protein domain Mol Cell35154ndash163

DeBartolo J Dutta S Reich L Keating AE 2012 Predictive Bcl-2 familybinding models rooted in experiment or structure J Mol Biol422124ndash144

Deng Z Huang W Bakkalbasi E Brown NG Adamski CJ Rice K MuznyD Gibbs RA Palzkill T 2012 Deep sequencing of systematic com-binatorial libraries reveals beta-lactamase sequence constraints athigh resolution J Mol Biol 424150ndash167

DePristo MA Weinreich DM Hartl DL 2005 Missense meanderings insequence space a biophysical view of protein evolution Nat RevGenet 6678ndash687

Dutta S Chen TS Keating AE 2013 Peptide ligands for pro-survivalprotein Bfl-1 from computationally guided library screening ACSChem Biol 8778ndash788

Firnberg E Labonte JW Gray JJ Ostermeier M 2014 A comprehensivehigh-resolution map of a genersquos fitness landscape Mol Biol Evol311581ndash1592

Fowler DM Araya CL Fleishman SJ Kellogg EH Stephany JJ Baker DFields S 2010 High-resolution mapping of protein sequence-function relationships Nat Methods 7741ndash746

Fukada H Sturtevant JM Quiocho FA 1983 Thermodynamics of thebinding of L-arabinose and of D-galactose to the L-arabinose-bindingprotein of Escherichia coli J Biol Chem 25813193ndash13198

Gallivan JP Dougherty DA 1999 Cation-pi interactions in structuralbiology Proc Natl Acad Sci U S A 969459ndash9464

Geiler-Samerotte KA Dion MF Budnik BA Wang SM Hartl DLDrummond DA 2011 Misfolded proteins impose a dosage-dependent fitness cost and trigger a cytosolic unfolded protein re-sponse in yeast Proc Natl Acad Sci U S A 108680ndash685

Gonzalez M Argarana CE Fidelio GD 1999 Extremely high thermalstability of streptavidin and avidin upon biotin binding BiomolEng 1667ndash72

Gottesman S Wickner S Maurizi MR 1997 Protein quality controltriage by chaperones and proteases Genes Dev 11815ndash823

Guerois R Nielsen JE Serrano L 2002 Predicting changes in the stabilityof proteins and protein complexes a study of more than 1000 mu-tations J Mol Biol 320369ndash387

Hayes F 2003 Toxins-antitoxins plasmid maintenance programmedcell death and cell cycle arrest Science 3011496ndash1499

Hecht M Bromberg Y Rost B 2015 Better prediction of functionaleffects for sequence variants BMC Genomics 16(Suppl 8)S1

Hellinga HW Wynn R Richards FM 1992 The hydrophobic core ofEscherichia coli thioredoxin shows a high tolerance to nonconserva-tive single amino acid substitutions Biochemistry 3111203ndash11209

Henikoff S Henikoff JG 1992 Amino acid substitution matrices fromprotein blocks Proc Natl Acad Sci U S A 8910915ndash10919

Hietpas R Roscoe B Jiang L Bolon DN 2012 Fitness analyses of allpossible point mutations for regions of genes in yeast Nat Protoc71382ndash1396

Hietpas RT Jensen JD Bolon DN 2011 Experimental illumination of afitness landscape Proc Natl Acad Sci U S A 1087896ndash7901

Jaffe A Ogura T Hiraga S 1985 Effects of the ccd function of the Fplasmid on bacterial growth J Bacteriol 163841ndash849

Jain PC Varadarajan R 2014 A rapid efficient and economical inversepolymerase chain reaction-based method for generating a site sat-uration mutant library Anal Biochem 44990ndash98

Janin J Miller S Chothia C 1988 Surface subunit interfaces and interiorof oligomeric proteins J Mol Biol 204155ndash164

Jiang L Mishra P Hietpas RT Zeldovich KB Bolon DN 2013 Latenteffects of Hsp90 mutants revealed at reduced expression levelsPLoS Genet 9e1003600

Kim I Miller CR Young DL Fields S 2013 High-throughput analysis ofin vivo protein stability Mol Cell Proteomics 123370ndash3378

Kowalsky CA Klesmith JR Stapleton JA Kelly V Reichkitzer NWhitehead TA 2015 High-resolution sequence-function mappingof full-length proteins PLoS One 10e0118193

Kumar MD Bava KA Gromiha MM Prabakaran P Kitajima K UedairaH Sarai A 2006 ProTherm and ProNIT thermodynamic databasesfor proteins and protein-nucleic acid interactions Nucleic Acids Res34D204ndashD206

Lim WA Hodel A Sauer RT Richards FM 1994 The crystal structure of amutant protein with altered but improved hydrophobic core pack-ing Proc Natl Acad Sci U S A 91423ndash427

Lin YS Hsu WL Hwang JK Li WH 2007 Proportion of solvent-exposedamino acids in a protein and rate of protein evolution Mol Biol Evol241005ndash1011

Liu R Baase WA Matthews BW 2000 The introduction of strain and itseffects on the structure and stability of T4 lysozyme J Mol Biol295127ndash145

Liu Y Tan YL Zhang X Bhabha G Ekiert DC Genereux JC Cho Y KipnisY Bjelic S Baker D et al 2014 Small molecule probes to quantify thefunctional fraction of a specific protein in a cell with minimal foldingequilibrium shifts Proc Natl Acad Sci U S A 1114449ndash4454

Loladze VV Ermolenko DN Makhatadze GI 2002 Thermodynamic con-sequences of burial of polar and non-polar amino acid residues inthe protein interior J Mol Biol 320343ndash357

Loris R Dao-Thi MH Bahassi EM Van Melderen L Poortmans FLiddington R Couturier M Wyns L 1999 Crystal structure ofCcdB a topoisomerase poison from E coli J Mol Biol 2851667ndash1677

Maier T Ferbitz L Deuerling E Ban N 2005 A cradle for new proteinstrigger factor at the ribosome Curr Opin Struct Biol 15204ndash212

Main ER Fulton KF Jackson SE 1998 Context-dependent nature ofdestabilizing mutations on the stability of FKBP12 Biochemistry376145ndash6153

McLaughlin RN Jr Poelwijk FJ Raman A Gosal WS Ranganathan R2012 The spatial architecture of protein function and adaptationNature 491138ndash142

Melnikov A Rogov P Wang L Gnirke A Mikkelsen TS 2014Comprehensive mutational scanning of a kinase in vivo revealssubstrate-dependent fitness landscapes Nucleic Acids Res 42e112

Milla ME Brown BM Sauer RT 1994 Protein stability effects of a com-plete set of alanine substitutions in Arc repressor Nat Struct Biol1518ndash523

Miosge LA Field MA Sontani Y Cho V Johnson S Palkova ABalakishnan B Liang R Zhang Y Lyon S et al 2015 Comparisonof predicted and actual consequences of missense mutations ProcNatl Acad Sci U S A 112E5189ndashE5198

Mogk A Kummer E Bukau B 2015 Cooperation of Hsp70 and Hsp100chaperone machines in protein disaggregation Front Mol Biosci 222

Moretti R Fleishman SJ Agius R Torchala M Bates PA Kastritis PLRodrigues JP Trellet M Bonvin AM Cui M et al 2013Community-wide evaluation of methods for predicting the effectof mutations on protein-protein interactions Proteins 811980ndash1987

Niesen FH Berglund H Vedadi M 2007 The use of differential scanningfluorimetry to detect ligand interactions that promote protein sta-bility Nat Protoc 22212ndash2221

Nishihara K Kanemori M Yanagi H Yura T 2000 Overexpression oftrigger factor prevents aggregation of recombinant proteins inEscherichia coli Appl Environ Microbiol 66884ndash889

Olson CA Wu NC Sun R 2014 A comprehensive biophysical descrip-tion of pairwise epistasis throughout an entire protein domain CurrBiol 242643ndash2651

Overington J Donnelly D Johnson MS Sali A Blundell TL 1992Environment-specific amino acid substitution tables tertiary tem-plates and prediction of protein folds Protein Sci 1216ndash226

Parthiban V Gromiha MM Schomburg D 2006 CUPSAT prediction ofprotein stability upon point mutations Nucleic Acids Res34W239ndashW242

Pires DE Ascher DB Blundell TL 2014 DUET a server for predictingeffects of mutations on protein stability using an integrated com-putational approach Nucleic Acids Res 42W314ndashW319

Ponder JW Richards FM 1987 Tertiary templates for proteins Use ofpacking criteria in the enumeration of allowed sequences for differ-ent structural classes J Mol Biol 193775ndash791

Tripathi et al doi101093molbevmsw182 MBE

2974

Dow

nloaded from httpsacadem

icoupcomm

bearticle331129602272479 by guest on 29 March 2022

Prajapati RS Sirajuddin M Durani V Sreeramulu S Varadarajan R 2006Contribution of cation-pi interactions to protein stabilityBiochemistry 4515000ndash15010

Radivojac P Clark WT Oron TR Schnoes AM Wittkop T Sokolov AGraim K Funk C Verspoor K Ben-Hur A et al 2013 A large-scaleevaluation of computational protein function prediction NatMethods 10221ndash227

Randles LG Lappalainen I Fowler SB Moore B Hamill SJ Clarke J 2006Using model proteins to quantify the effects of pathogenic muta-tions in Ig-like proteins J Biol Chem 28124216ndash24226

Richards FM 1977 Areas volumes packing and protein structure AnnuRev Biophys Bioeng 6151ndash176

Romero PA Tran TM Abate AR 2015 Dissecting enzyme function withmicrofluidic-based deep mutational scanning Proc Natl Acad SciU S A 1127159ndash7164

Roscoe BP Thayer KM Zeldovich KB Fushman D Bolon DN 2013Analyses of the effects of all ubiquitin point mutants on yeastgrowth rate J Mol Biol 4251363ndash1377

Rose GD Geselowitz AR Lesser GJ Lee RH Zehfus MH 1985Hydrophobicity of amino acid residues in globular proteinsScience 229834ndash838

Sahoo A Khare S Devanarayanan S Jain PC Varadarajan R 2015Residue proximity information and protein model discriminationusing saturation-suppressor mutagenesis Elife 4e09532

Sarkisyan KS Bolotin DA Meer MV Usmanova DR Mishin AS SharonovGV Ivankov DN Bozhanova NG Baranov MS Soylemez O et al2016 Local fitness landscape of the green fluorescent protein Nature533397ndash401

Schlinkmann KM Honegger A Tureci E Robison KE Lipovsek DPluckthun A 2012 Critical features for biosynthesis stability andfunctionality of a G protein-coupled receptor uncovered by all-versus-all mutations Proc Natl Acad Sci U S A 1099810ndash9815

Shen MY Sali A 2006 Statistical potential for assessment and predictionof protein structures Protein Sci 152507ndash2524

Starita LM Pruneda JN Lo RS Fowler DM Kim HJ Hiatt JB Shendure JBrzovic PS Fields S Klevit RE 2013 Activity-enhancing mutations inan E3 ubiquitin ligase identified by high-throughput mutagenesisProc Natl Acad Sci U S A 110E1263ndashE1272

Starita LM Young DL Islam M Kitzman JO Gullingsrud J Hause RJFowler DM Parvin JD Shendure J Fields S 2015 Massively ParallelFunctional Analysis of BRCA1 RING Domain Variants Genetics200413ndash422

Sultana A Lee JE 2015 Measuring protein-protein and protein-nucleicacid interactions by biolayer interferometry Curr Protoc Protein Sci7919 25 11ndash19 25 26

Tanaka M Chon H Angkawidjaja C Koga Y Takano K Kanaya S2010 Protein core adaptability crystal structures of the cavity-filling variants of Escherichia coli RNase HI Protein Pept Lett171163ndash1169

Thyagarajan B Bloom JD 2014 The inherent mutational tolerance andantigenic evolvability of influenza hemagglutinin Elife 3e03300

Tokuriki N Tawfik DS 2009 Chaperonin overexpression promotes ge-netic variation and enzyme evolution Nature 459668ndash673

Traxlmayr MW Hasenhindl C Hackl M Stadlmayr G Rybka JD Borth NGrillari J Ruker F Obinger C 2012 Construction of a stability land-scape of the CH3 domain of human IgG1 by combining directedevolution with high throughput sequencing J Mol Biol 423397ndash412

Tripathi A Varadarajan R 2014 Residue specific contributions to stabil-ity and activity inferred from saturation mutagenesis and deep se-quencing Curr Opin Struct Biol 2463ndash71

Tsai CJ Lin SL Wolfson HJ Nussinov R 1997 Studies of protein-proteininterfaces a statistical analysis of the hydrophobic effect Protein Sci653ndash64

Ullers RS Luirink J Harms N Schwager F Georgopoulos C Genevaux P2004 SecB is a bona fide generalized chaperone in Escherichia coliProc Natl Acad Sci U S A 1017583ndash7588

Van Melderen L Thi MH Lecchi P Gottesman S Couturier M MauriziMR 1996 ATP-dependent degradation of CcdA by Lon proteaseEffects of secondary structure and heterologous subunit interactionsJ Biol Chem 27127730ndash27738

Wang X Minasov G Shoichet BK 2002 Evolution of an antibiotic resis-tance enzyme constrained by stability and activity trade-offs J MolBiol 32085ndash95

Whitehead TA Chevalier A Song Y Dreyfus C Fleishman SJ De MattosC Myers CA Kamisetty H Blair P Wilson IA et al 2012Optimization of affinity specificity and function of designed influ-enza inhibitors using deep sequencing Nat Biotechnol 30543ndash548

Williams TA Fares MA 2010 The effect of chaperonin buffering onprotein evolution Genome Biol Evol 2609ndash619

Wolfenden R Lewis CA Jr Yuan Y Carter CW Jr 2015 Temperaturedependence of amino acid hydrophobicities Proc Natl Acad SciU S A 1127484ndash7488

Wu NC Olson CA Du Y Le S Tran K Remenyi R Gong D Al-MawsawiLQ Qi H Wu TT et al 2015 Functional Constraint Profiling of a ViralProtein Reveals Discordance of Evolutionary Conservation andFunctionality PLoS Genet 11e1005310

Wynn R Harkins PC Richards FM Fox RO 1996 Mobile unnaturalamino acid side chains in the core of staphylococcal nucleaseProtein Sci 51026ndash1031

Yang J Yan R Roy A Xu D Poisson J Zhang Y 2015 The I-TASSER Suiteprotein structure and function prediction Nat Methods 127ndash8

Yates CM Filippis I Kelley LA Sternberg MJ 2014 SuSPect enhancedprediction of single amino acid variant (SAV) phenotype using net-work features J Mol Biol 4262692ndash2701

Yue P Li Z Moult J 2005 Loss of protein structure stability as a majorcausative factor in monogenic disease J Mol Biol 353459ndash473

Yue P Moult J 2006 Identification and analysis of deleterious humanSNPs J Mol Biol 3561263ndash1274

Molecular Determinants of Mutant Phenotypes doi101093molbevmsw182 MBE

2975

Dow

nloaded from httpsacadem

icoupcomm

bearticle331129602272479 by guest on 29 March 2022

  • msw182-TF1
  • msw182-TF2
  • msw182-TF3
  • msw182-TF4
  • msw182-TF5
  • msw182-TF6
  • msw182-TF7
  • msw182-TF8
  • msw182-TF9
  • msw182-TF10
  • msw182-TF11
  • msw182-TF12
  • msw182-TF13
  • msw182-TF14
  • msw182-TF15
  • msw182-TF16

