CREATION OF MODULAR INDUCIBLE REPRESSORS VIA DIRECTED EVOLUTION

by

Gareth Z. Evans

A thesis submitted to Johns Hopkins University in conformity with the requirements for the degree of Master of Science in Engineering.

Baltimore, Maryland

May 2019

Abstract

Inducible repressors are proteins which repress expression of genes downstream

from a target DNA sequence until induced by a change in ligand concentration. These

include repressors such as LacI and TetR. There are two problems with existing

repressors: they are not modular, meaning the target sequence cannot be adjusted, and the

inducer ligands are often prohibitively expensive for large-scale implementation. This

thesis reports progress on the creation of inducible repressors based on transcription

activator-like effectors (TALE) and nuclease-null deactivated Cas9 (dCas9). The strategy

of domain insertion of a circularly permuted glycine betaine binding protein (GBBP) was

utilized for the creation of these inducible repressors. The putative, TALE-based

inducible repressor BITElacO-C7 (identified in a previous study) was found to slightly

repress expression but did not lose its ability to repress in the presence of glycine betaine

(GB). Evidence suggests that BITElacO-C7’s initial identification as a potential switch

may have resulted from a pH change when GB was added to the media. This prompted

the design of a new strategy for creating a switch based on dCas9. A selection system

based on induction or repression of fluorescent proteins was confirmed to behave as

desired. However, construction of a library of insertions of GBBP into dCas9 was

hindered by apparent toxicity of dCas9. Several mitigation strategies were

attempted, but none significantly alleviated the toxicity.

Primary Reader: Marc Ostermeier

Secondary Reader: Michael Betenbaugh

Acknowledgements

Foremost, I would like to thank my advisor, Dr. Marc Ostermeier, for the

opportunity to work in his laboratory. Marc’s advice and guidance have been invaluable

not only for these projects, but for my learning and understanding of the principles of

conducting scientific research.

I would like to thank the entire Ostermeier lab, both current and previous

members, for always being friendly and making the time spent at work so enjoyable. I

owe special thanks to Ryan Weeks, Shaun Spisak, and Pricila Hauk. I would like to thank

Ryan for mentoring me when I first started in the lab, sharing his skills and knowledge

throughout my time working in the lab, and working alongside me on the BITE project. I

would like to thank Shaun for always sharing his helpful opinions and advice and for

characterizing the dCas9 system with me. I would also like to thank Pricila Hauk for

always eagerly and kindly sharing her knowledge and for always having her door open

for a quick question or a chat.

I would like to thank my friends outside of the lab for their support. In particular,

I’d like to thank Alyssa for her encouragement, support, and helpful edits.

Finally, I would like to thank my parents for their love and unwavering support

over the years. I could not have gotten this far without their encouragement, advice, and

patience in all of my pursuits.

Table of Contents

Abstract
Acknowledgements
List of Tables
List of Figures

Chapter 1: Introduction to Protein Switches
1.1 Creation of Synthetic Proteins by Genetic Engineering
1.2 Protein Switches
1.3 Inducible Repressors
1.4 Protein Engineering Design Methods
1.5 Random Domain Insertion and Circular Permutation
1.6 Engineering New Modular, Inducible Repressors

Chapter 2: Glycine Betaine-Inducible Transcription Activator-Like Effectors
2.1: Introduction
2.2: Materials and Methods
2.3: Results and Discussion

Chapter 3: Glycine Betaine-Inducible Nuclease-Inactive Cas9
3.1: Introduction
3.2 Materials and Methods
3.3 Results and Discussion

References
Appendices
Biography

List of Tables

Table S1: Strains and plasmids used in chapter 2

Table S2: Ribeiro dCas9 well positions corresponding with insertion sites

Table S3: Additional dCas9 well positions corresponding with insertion sites

Table S4: GBBP well positions corresponding with site of permutation

List of Figures

Figure 1.1: Lac system schematic

Figure 1.2: Conceptualization of domain insertion

Figure 1.3: Conceptualization of circular permutation

Figure 1.4: Comparison of Cas9 and dCas9

Figure 2.1: BITElacO-C7 schematic

Figure 2.2: Conceptualization of the bandpass system

Figure 2.3: Original characterization of BITElacO-C7

Figure 2.4: Ampicillin MIC with Ribeiro's constructs

Figure 2.5: Second ampicillin MIC of Ribeiro's constructs

Figure 2.6: GFP production rate experiment

Figure 2.7: Ampicillin MIC of BITElacO-C7 on smaller plasmid pLR

Figure 2.8: Normalized survival of BITElacO-C7 expressing cells with and without GB-HCl

Figure 2.9: Normalized survival of BITElacO-C7 expressing cells with and without GB-H2O

Figure 2.10: MIC ampicillin comparison of GB-HCl and GB-H2O at different pH values

Figure 3.1: Insertion tolerance of dCas9

Figure 3.2: Repression of mRFP with CRISPRi with different target sequences

Figure 3.3: Plasmid diagram and genomic reporters for CRISPRi

Figure 3.4: Fluorescence assay characterizing CRISPRi

Figure 3.5: Effect of decreasing aTc concentration on mRFP repression

Figure 3.6: Effect of minimal media on mRFP repression

Figure 3.7: Effect of aTc concentration on growth rate and mRFP repression

Figure 3.8: Effect of growth phase on mRFP repression

Figure 3.9: Side scatter profiles for mRFP expressing and repressing cells FACS

Figure 3.10: mRFP vs GFP profiles for mRFP expressing and repressing cells FACS

Figure 3.11: Number of events vs mRFP for mRFP expressing and repressing cells FACS

Figure 3.12: Number of events vs mRFP for test sort FACS

Figure 3.13: Enrichment of rare mRFP expressing cells during test sort FACS

Figure 3.14: Enrichment of rare mRFP repressing cells during test sort FACS

Figure 3.15: Effect of annealing temperature on PCR amplification

Figure 3.16: Inverse PCR amplification of pdCas9

Figure 3.19: Inverse PCR amplification of pdCas9 and amplification of cpGBBP

Figure 3.20: PCR amplification of cpGBBP

Figure 3.21: PCR amplification of cpGBBP

Figure 3.22: Effect of betaine and DMSO on difficult PCR reactions

Figure 3.23: First library diagnostic digest

Figure 3.24: Visualizing ligation reaction product

Figure 3.25: Effect of minimal media on successful transformations

Figure 3.26: Fluorescence assay with T5-TetR pgRNA variant

Figure 3.27: Effect of pgRNA-T5-TetR on dCas9 toxicity and successful insertions

Figure S1: Normalized survival of BITElacO-C7 expressing cells with and without GB-HCl

Figure S2: Normalized survival of BITElacO-C7 expressing cells with and without GB-HCl

Chapter 1: Introduction to Protein Switches

1.1 Creation of Synthetic Proteins by Genetic Engineering

Humans have contributed to protein modification for centuries, from the artificial

selection of plants such as corn to animals like cattle (Rangel, 2015). In 1973, Herbert

Boyer and Stanley Cohen sped up this process by creating the first genetically engineered

organism and providing the first techniques of DNA cloning. With this discovery, not

only were recombinant proteins like human insulin and growth hormones able to be

produced in new organisms (Goeddel, et al., 1979), but enzymes with new functions

(Wells & Estell, 1988) and new antibodies were able to be produced (Hudson, 2003).

Because proteins perform various functions efficiently with great specificity and can be

precisely and relatively cheaply synthesized through fermentation, the desire to create

and design new proteins with new functions makes protein engineering a very relevant

and growing field.

1.2 Protein Switches

Protein switches recognize an input and output a response. The input may be in the

form of a change in ligand concentration, a covalent bond formation, or other changes in

the protein’s environment. Output responses are diverse and can vary from an activation

of enzymatic activity to the regulation of transcription. These switches occur in

nature in the forms of protein kinases, allosteric enzymes, and inducible transcription

factors (Stein & Alexandrov, 2015). Due to the ability of proteins to behave in a specific

manner with specific interactions, they form an interesting target to control cellular

processes (Schreiber & Keating, 2011). Engineered, non-endogenous protein switches

allow the opportunity to selectively control desired cellular processes as an output

response to a custom, user-designed input. The first generation of protein switches

consisted of fluorescent proteins acting as biosensors, using fluorescence to indicate the

presence of calcium ions in the system. Beyond biosensors, protein switches have the

potential to be used in applications ranging from activating prodrugs through enzymatic

activation to regulating gene expression (Phelan, Ostermeier, & Townsend, 2009).

For a switch with the input of a ligand binding, such as the switches discussed in this

thesis, the characterization of the switch is determined by the ratio of the output activity

with and without the receptor domain’s ligand. The more successful the switch, the

higher the change in activity between its “on” state vs its “off” state. The switch can

either be initially on, and turned off by binding of ligand, or vice versa.
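As a small worked illustration of this definition, the sketch below computes a switching ratio from two activity measurements. The function name and the example values are hypothetical and chosen only for illustration. The ratio is taken as the higher activity over the lower one, so the same function covers switches that are turned on by ligand and switches that are turned off by ligand.

```python
def switching_ratio(activity_with_ligand: float, activity_without_ligand: float) -> float:
    """Fold change between the 'on' and 'off' states of a ligand-controlled switch.

    Both activities must be positive and in the same units (for example,
    fluorescence per cell). The higher value is treated as the 'on' state.
    """
    if activity_with_ligand <= 0 or activity_without_ligand <= 0:
        raise ValueError("activities must be positive")
    on_state = max(activity_with_ligand, activity_without_ligand)
    off_state = min(activity_with_ligand, activity_without_ligand)
    return on_state / off_state


# Hypothetical measurements: a switch that is turned off by ligand.
print(switching_ratio(activity_with_ligand=2.0, activity_without_ligand=9.0))
```

A ratio close to 1 would indicate that the ligand has little effect on output activity.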

1.3 Inducible Repressors

Because the transcription of DNA is the first step of gene expression, it serves as a

primary location for regulation (Latchman, 1993). Specific protein factors, transcription

factors, bind to regulatory regions of DNA to ensure genes are expressed only when

required. Transcription factors can respond to external stimuli, such as the concentration

of a hormone or an enzyme substrate. Jacob and Monod most notably examined this type of

transcription factor, an inducible transcription factor, when characterizing the lac system

(Jacob & Monod, 1961).

The lac system regulates the production of the lactose-metabolizing enzyme β-galactosidase

(LacZ), which hydrolyzes lactose, as well as other molecules with

glycosidic bonds, and occasionally transglycosylates lactose to allolactose. The lac

system also regulates production of β-galactoside permease (LacY), an intracellular

lactose transporter, and β-galactoside transacetylase (LacA). The gene for the lac

repressor (lacI) constitutively expresses an inducible transcription factor that binds to the

24-base pair long lac operator (lacO) site which represses transcription of the

downstream lacZ, lacY, and lacA (Gilbert & Maxam, 1973). When present intracellularly,

allolactose binds to LacI, causing an allosteric change in shape that prevents LacI from

binding to the lacO site (Schumacher, Choi, & Brennan, 1994). Isopropyl β-D-1-thiogalactopyranoside

(IPTG) mimics the inducing effect of allolactose. However,

unlike allolactose, IPTG is not hydrolyzable, so its concentration will not be reduced and

constant induction will occur (Figure 1.1).

Figure 1.1: Lac Operon scheme. When allolactose or IPTG is not present, LacI binds to the lacO site, stopping transcription of the genes downstream. When IPTG or allolactose is present in the system, it binds to LacI and allosterically changes the conformation of the binding domain, releasing LacI from lacO and allowing transcription of the downstream genes to occur. Adapted from (Rice University, 2016).

LacZ, LacY, and LacA do not have to be the genes downstream of the lacO site: any

gene can be cloned under lacO and expressed with the addition of IPTG. Due to this, the

lac system is a powerful tool to control transcription of any gene.

Other inducible repressors include the tetracycline repressor (TetR) (Orth,

Schnappinger, Hillen, Saenger, & Hinrichs, 2000) and the tryptophan repressor

(Arvidson, et al., 1994). These repressors work to control expression, but have two main

issues. They are not modular, meaning the DNA target sequence of the repressor cannot

be changed. This matters when one wants to control the expression of a gene: the

sequence in front of the gene must be changed, through additional cloning steps, to the

sequence of the operator. In addition to the lack of modularity, the inducers, such as IPTG,

are often cost-prohibitive for control of gene expression at industrial

scale. Due to this, there is a need for a modular inducible repressor that utilizes an

inexpensive inducer. This thesis will center around this concept.

1.4 Protein Engineering Design Methods

Proteins bind to other molecules and, from antibodies binding to specific viral targets

to transcription factors binding to DNA, form highly specific bonds (Alberts, Johnson, &

Lewis, 2002). Numerous weak, noncovalent bonds and hydrophobic interactions bring

together the protein and its ligand. Due to this, structure is inherently related to the

function of a protein. For instance, the three-dimensional structure of an enzyme or

receptor protein determines the shape of the active site or binding pocket. The shapes of

these sites have been evolved specifically to accommodate their ligands and, in the case

of enzymes, to catalyze specific enzymatic reactions. Changes within the protein’s

primary structure, such as mutation of amino acids, can change the hydrophobicity or

introduce new steric hindrances that change the three-dimensional structure of the folded

protein, therefore potentially affecting function. Changes in shape have the potential to

improve activity (e.g. rate of catalysis or binding affinity), but they also have the

potential to have no effect on or even diminish activity.

There are two main schools of thought when designing modifications to proteins

to create new, improved proteins. These are rational design and directed evolution (Cobb,

Si, & Zhao, 2012). Rational design centers around a precise understanding of the

structure of the protein. This understanding is applied to determine how modifications

will affect other residues within the protein, both near and far from the modified residue,

and how these will affect the overall structure, and therefore function, of the protein as a

whole. Advances in computational modeling and new protein structural data provide this

information, which can then be used to exploit protein structure via site-specific directed

approaches. From these computational data, groups have been able to determine areas of

related residues that are far from each other by statistical coupling analysis (Reynolds,

McLaughlin, & Ranganathan, 2011). Others have modeled areas of secondary structure to

determine which could contribute to desired perturbation of a protein’s active site

(Dagliyan, et al., 2016). The main drawbacks of rational design, however, are needing to

intimately know the protein structure and having to predict what functionality, structure,

and interactions will result when a new protein structure is created after altering the

primary structure. This uncertainty is an existing limit of computational modeling. Due to

this, often even functional modifications have lower than desired activities. For instance,

the best switching ratios for switches designed in this fashion are on the order of

4.5-fold (Dagliyan, et al., 2016). Directed evolution, however, has the

potential to increase the switching ratio to 600-fold, and possibly higher (Guntas,

Mansell, Kim, & Ostermeier, 2005). However, despite its drawbacks, rational design can

be used to guide the techniques and methods utilized in directed evolution.

Directed evolution describes the process of evolving proteins, similarly to how

nature evolves to produce the most efficient molecules to sustain survival, by applying

iterative rounds of diversification followed by selection or screening (Romero & Arnold,

2009). This design method has resulted in many new proteins that have been optimized to

perform specific functions in specific environments. The methods of diversification vary

but occur on the genetic level. No matter the method, the goal is to increase the diversity

of the protein in the cellular population expressing the protein of interest. The population

undergoes either a selection or a screening where only members with advantageous

protein characteristics are allowed to survive or become enriched, respectively. No

structural information or modeled predictions of relationships are needed to perform

directed evolution, which allows many combinations of variation to be explored, even

those not considered by computer modeling. These combinations are able to be quickly

characterized through high-throughput screening or selection methods such as

fluorescence-activated cell sorting (FACS) or growth selection assays, as will be explored

in this thesis. To use directed evolution, however, selective or screening methods must be

carefully chosen to select for or enrich only the phenotype wanted, while still preserving

the link to genotype.
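The diversify-then-select cycle described above can be sketched as a toy simulation. This is only an illustration under strong assumptions and is not the procedure used in this thesis: the score is a made-up stand-in for a screening or selection readout (the number of positions matching an arbitrary target sequence), and every name and parameter (mutate, fitness, evolve, library_size, keep) is hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng, rate=0.05):
    """Diversification step: randomly substitute residues at the given per-site rate."""
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq)


def fitness(seq, target):
    """Toy readout standing in for a screen or selection: positions matching the target."""
    return sum(a == b for a, b in zip(seq, target))


def evolve(parent, target, rounds=10, library_size=200, keep=10, seed=0):
    """Run iterative rounds of diversification followed by enrichment of the best variants."""
    rng = random.Random(seed)
    population = [parent]
    history = []
    for _ in range(rounds):
        # Diversify: build a library of variants from the current population.
        library = [mutate(rng.choice(population), rng) for _ in range(library_size)]
        # Keep the parents in the pool so the best score can never decrease.
        library.extend(population)
        # Screen: rank by readout and carry only the top variants forward.
        library.sort(key=lambda s: fitness(s, target), reverse=True)
        population = library[:keep]
        history.append(fitness(population[0], target))
    return population[0], history


best, history = evolve("A" * 12, "MKTAYIAKQRQI")
print(best, history)
```

In a real experiment the readout is a physical measurement on cells that carry the variant gene, which is why the link between genotype and phenotype must be preserved; here the sequence string plays both roles.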

1.5 Random Domain Insertion and Circular Permutation

Of the multiple methods of creating genetic variation in proteins, this thesis will focus

on variation created by random domain insertion and circular permutation. Domain

insertion describes the method of inserting one protein domain into another in order to

couple the functionalities of the two domains (Kanwar, Wright, Date, Tullman, &

Ostermeier, 2013). From this method, switches with high fold-changes and reversible

switching have been created. The inserted domain can contain either the desired output or

input (Figure 1.2).

Figure 1.2: “Schematic depiction of the creation of protein switches by domain insertion. A protein switch is a fusion of two domains (by domain insertion) in such a way that the activity of the output domain is regulated by the input domain’s recognition of an input signal. (a) DNA sequences are depicted as lines and (b) their corresponding proteins as geometric shapes. A light gray color of the output domain indicates that the domain is inactive or less active. The signal that modulates the switch is depicted as a black triangle” (Ribeiro, Warren, & Ostermeier, Construction of Protein Switches by Domain Insertion and Directed Evolution, 2017).

The positions of the inserted domain can either be selected based on prior knowledge or

chosen randomly. Whichever way they are chosen, however, a greater number of

insertion positions confers a higher degree of diversity to the final library. Three methods

are used to create random insertion libraries: the use of nonspecific endonucleases to

randomly create double-stranded breaks in DNA, the use of transposons, and the use of

inverse PCR with abutting primers at desired insertion sites.

Inverse PCR is used in this thesis, with a pair of primers designed to either create an

insertion between every codon of the acceptor gene, or some subset of this space. For this

method, the abutting PCR primers amplify the entire plasmid around the insertion site.

This process, termed multiplex inverse PCR when used on many sites, “opens” the

acceptor plasmid to allow for insertion of the second domain at the opened positions. The

benefit of this method is that all of the insertions will be in frame, and it creates a more

focused library of variant genes, which may be important for libraries that are difficult to

select or screen. Even though this method results in only in-frame insertions, half of the

insertions will be in the incorrect orientation. Also, because two primers

need to be designed and a PCR reaction run for each site of insertion, the process is more

expensive and more time-consuming than other options.
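The idea of abutting primers that open the plasmid at a codon boundary can be sketched in code. This is a minimal sketch, assuming a circular plasmid given as a single 5'-to-3' string, a fixed primer length, and no melting-temperature or secondary-structure optimization, all of which a real primer design would need. The function names and the example sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def abutting_primers(plasmid: str, gene_start: int, codon_index: int, primer_len: int = 20):
    """Primer pair that opens a circular plasmid directly after codon `codon_index`.

    `gene_start` is the 0-based index of the first base of the acceptor gene and
    `codon_index` is 0-based. The forward primer begins at the first base after
    the insertion point; the reverse primer anneals to the bases immediately
    before it, so the two primers abut and the whole plasmid is amplified.
    """
    n = len(plasmid)
    if primer_len > n:
        raise ValueError("primer longer than plasmid")
    doubled = plasmid + plasmid  # lets slices wrap around the circular sequence
    cut = (gene_start + 3 * (codon_index + 1)) % n
    forward = doubled[cut:cut + primer_len]
    upstream_start = (cut - primer_len) % n
    reverse = reverse_complement(doubled[upstream_start:upstream_start + primer_len])
    return forward, reverse


def primers_for_all_sites(plasmid: str, gene_start: int, n_codons: int, primer_len: int = 20):
    """One primer pair per boundary between adjacent codons of the acceptor gene."""
    return {i: abutting_primers(plasmid, gene_start, i, primer_len)
            for i in range(n_codons - 1)}


# Hypothetical 22-bp "plasmid" carrying a six-codon gene at position 0.
toy_plasmid = "ATGGCTAGCAAAGGTTAACCGG"
print(abutting_primers(toy_plasmid, gene_start=0, codon_index=1, primer_len=6))
```

Because every cut falls on a multiple of three bases from the start of the gene, each opened site accepts an insert in frame, which is the property the text attributes to this method.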

Another method of diversification is circular permutation, which can be used in

combination with random domain insertion (Yu & Lutz, 2011). Many examples of this

have been observed in nature through protein structure studies since its discovery in

1979. The overall goal of circular permutation is to cause perturbations within local

tertiary structures, which can result in not only improved ligand binding affinities, but

also increased allosteric effects once inserted into a new domain. The process of circular

permutation can conceptually be thought of as linking the N- and C-termini of a domain

together via a covalent linkage, forming a circle, and then creating new N- and C-termini by

breaking the circle at a new point (Figure 1.3).

Figure 1.3: Conceptualization of circular permutation. A peptide linker (shown in red) is covalently added to the native N- and C-termini to make the domain a circle. After this, one of the existing peptide bonds is broken to form the new N- and C-termini of the domain (Yu & Lutz, 2011).

In practice, this is implemented on the genetic level by putting two copies of the domain

on a plasmid, the two genes separated by the sequence coding for a linker with a few

residues of glycine or another hydrophilic amino acid. Multiple sets of primers are then

designed to amplify the gene from a desired point in the first copy of the gene to the

equivalent point in the second copy of the gene. The designed sets of primers will include

a pair that starts at the original N- and C-termini, and the next pair will shift one codon to

amplify the gene with the second residue of the original domain as the N-terminus and

the first residue of the original domain as the C-terminus, and so on. The domain can be

circularly permuted at every residue or only a subset of them. Generally, protein

domains of interest with spatially close N- and C-termini (5-10 Å) are good candidates,

since changes to the overall protein structure that lead to

misfolding are less likely (Kanwar, Wright, Date, Tullman, & Ostermeier, 2013).

Since a large fraction of single-domain proteins have less than 5 Å between the

N- and C-termini, circular permutation can be easily applied to many domains. By

circularly permuting an insert gene and then inserting it via random domain insertion

into the acceptor gene, a great degree of genetic diversity can be created.
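The tandem-copy construction of circular permutants can be illustrated at the protein-sequence level. The sketch below is only an illustration: it works on amino acid strings rather than on DNA and primers, and the toy domain sequence, the "GGS" linker, and the function names are assumptions made for the example.

```python
def circular_permutant(domain: str, new_start: int, linker: str = "GGS") -> str:
    """Circular permutant whose N-terminus is residue `new_start` (0-based) of `domain`.

    Mirrors the tandem-gene approach: two copies of the domain are joined by a
    linker, and a window is read from a point in the first copy to the
    equivalent point in the second copy.
    """
    n = len(domain)
    if not 1 <= new_start < n:
        raise ValueError("new_start must be between 1 and len(domain) - 1")
    tandem = domain + linker + domain
    return tandem[new_start:new_start + n + len(linker)]


def circular_permutation_library(domain: str, linker: str = "GGS") -> dict:
    """All permutants of a domain, keyed by the index of the new N-terminal residue."""
    return {i: circular_permutant(domain, i, linker) for i in range(1, len(domain))}


# Shifting by one residue: the second residue becomes the N-terminus and
# the original first residue becomes the C-terminus.
print(circular_permutant("MKTAYIAK", new_start=1))
```

Each permutant contains the same residues as the original domain plus the linker; only the position of the termini changes.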

1.6 Engineering New Modular, Inducible Repressors

As mentioned before, we wish to move away from inducible repressors, like LacI and

TetR, which use expensive inducers, such as IPTG and anhydrotetracycline (aTc), and

are not able to target new sequences other than their specific operator. Both LacI and

TetR have been previously engineered to respond to new inducers (Taylor, et al., 2016),

(Meinhardt, et al., 2012), (Scholz, Köstner, Reich, Gastiger, & Hillen, 2003). However,

not only are some of the new inducers still as expensive as IPTG and aTc, but these modified

proteins also retain the single-sequence specificity of their wild-type precursors. Due to

this, there is a need for an inducible repressor that is able to target a customizable

sequence and be induced with a cheap molecule.

Zinc fingers (ZF), transcription activator-like effectors (TALE), and the Cas9 protein

from the clustered regularly interspaced short palindromic repeats (CRISPR) system are

proteins which can be programmed to interact with a target DNA sequence

(Copeland, Politz, & Pfleger, 2014). Whichever protein is used, it must not only bind to

DNA but also bind in a way that blocks RNA polymerase (RNAP) from

transcribing the downstream gene. This is often most successfully performed upstream of

the gene, and results in a physical blocking of the RNAP by the bound ZF, TALE, or

Cas9 (Politz, Copeland, & Pfleger, 2013) (Qi, et al., 2013). Since a repressor must

specifically repress the target sequence, ZFs are not the best candidate due to their high

frequency of off-target effects (Waryah, Moses, Arooj, & Blancafort, 2018). TALEs

contain a DNA binding domain consisting of 15.5 to 19.5 repeats, each 34 amino acids

long, save the last, which is 20 amino acids long (Copeland, Politz, & Pfleger, 2014). In each repeat,

the 12th and 13th residues can vary and are responsible for DNA nucleotide recognition,

and these allow TALE to target any specified DNA sequence. Not only can the target

sequence be changed, but an engineered TALE designed to bind the lacO site in the lac operon

also outperformed LacI at blocking transcription (Politz, Copeland, & Pfleger, 2013). In

addition to this, TALE presents little to no off-target effects in both prokaryotes and

eukaryotes (Copeland, Politz, & Pfleger, 2014). There is some challenge, however, in

changing the 1.8 kb region required for target specificity. The repeats in the binding

domain limit cloning through PCR; therefore, Golden Gate cloning or other advanced

cloning strategies must be used. Nevertheless, it is an excellent protein candidate for

binding domain insertion to make an inducible repressor. This concept is explored in

Chapter 2 of this thesis.
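The one-repeat-per-nucleotide logic of TALE targeting can be sketched as a lookup. The residue pairs used below (NI, HD, NN, NG) are the assignments for residues 12 and 13 that are commonly reported in the TALE literature; they are not listed in this thesis and are included here as an assumption. The function names and the example target are hypothetical. The second function is a rough size estimate based on the repeat lengths given above.

```python
# Commonly reported residue 12/13 pairs for each nucleotide (assumption, not from this thesis).
RESIDUES_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def repeats_for_target(target: str) -> list:
    """Residue 12/13 pair for each repeat, one repeat per nucleotide of the target."""
    target = target.upper()
    unknown = set(target) - set(RESIDUES_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [RESIDUES_FOR_BASE[base] for base in target]


def repeat_region_length_bp(full_repeats: int, repeat_aa: int = 34, last_repeat_aa: int = 20) -> int:
    """Coding length of the repeat region: full repeats plus the shorter final repeat."""
    return (full_repeats * repeat_aa + last_repeat_aa) * 3


print(repeats_for_target("ACGT"))
# 17 full repeats plus the final 20-residue repeat ("17.5 repeats"):
print(repeat_region_length_bp(17))
```

With 17 full repeats and the shorter final repeat, the estimate comes to 1794 bp, in line with the roughly 1.8 kb region mentioned above. The highly repetitive nature of this region is what makes PCR-based cloning impractical.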

Cas9, the CRISPR associated protein from S. pyogenes, is responsible for foreign,

intracellular DNA degradation natively, as part of a cellular immune system (Copeland,

Politz, & Pfleger, 2014). Before DNA cleavage, the Cas9 protein must complex with

CRISPR RNA (crRNA) that contains a 20-base pair complementary (spacer) sequence to

the foreign DNA, as well as another RNA sequence, tracrRNA. In addition to the 20-base

pair region, a protospacer adjacent motif (PAM) that is of the form NGG for S. pyogenes

Cas9 must be present directly adjacent to the target DNA sequence. The complexity of

the system has been reduced by fusing the crRNA and tracrRNA to each other, forming a

chimera known as single guide RNA (sgRNA). In this system, only Cas9 and sgRNA are

needed, and simple PCR cloning of the 20-base pair spacer region of sgRNA is enough to

change the target DNA sequence. It was previously shown that two point mutations to

Cas9 yielded a nuclease-null form: deactivated Cas9 or dCas9 (Qi, et al., 2013). This

form is able to bind to the target sequence specified by the sgRNA spacer and, rather than

cleaving the DNA and unbinding, stays bound and prevents RNA polymerase from

transcribing the downstream gene. This system is known as CRISPRi, or CRISPR

interference (Figure 1.4).

Figure 1.4: Schematic comparison of wild-type (wt) Cas9 and dCas9. Wt Cas9 binds to and cleaves the target DNA whereas dCas9, the catalytically inactive version, binds to the target sequence and blocks RNAP from transcribing the downstream gene (Qi, et al., 2013).
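The targeting requirement described above, a 20-nucleotide sequence directly followed by an NGG PAM, can be sketched as a simple scan for candidate sites. This is a minimal sketch under stated assumptions: it checks only the PAM rule on both strands and ignores off-target matches, position relative to the promoter, and every other design consideration. The function names and the example sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_target_sites(dna: str, spacer_len: int = 20) -> list:
    """List (strand, start, protospacer) for each site followed directly by an NGG PAM.

    `start` is a 0-based index into the strand being read: the input sequence
    for "+", and its reverse complement for "-".
    """
    dna = dna.upper()
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for i in range(len(seq) - spacer_len - 2):
            pam = seq[i + spacer_len:i + spacer_len + 3]
            if pam[1:] == "GG":  # N can be any base, followed by GG
                sites.append((strand, i, seq[i:i + spacer_len]))
    return sites


# Hypothetical 23-nt sequence: 20 A's followed by the PAM "TGG".
print(find_target_sites("A" * 20 + "TGG"))
```

Retargeting then amounts to choosing one of the listed 20-nucleotide sequences as the spacer of the sgRNA, which is why changing the target requires only a short cloning step.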

dCas9/CRISPRi was shown to repress the transcription of a fluorescent reporter by 300-fold.

In addition, dCas9 has been shown to be tolerant to domain insertion, that is, able to

have a domain inserted at a position and still retain binding functionality, and the degree

of tolerance has been assessed at nearly each possible insertion position (Oakes, et al.,

2016). This knowledge, along with the great degree of initial repression and the ease of

reconfiguring target sequences, also made dCas9 a protein of interest for engineering a

protein switch. This concept is explored in Chapter 3 of this thesis.

In addition to the proteins which bind DNA, the engineered inducible repressor

needs a ligand binding protein, which binds an inexpensive molecule. Like IPTG, this

molecule must not be metabolized by the cell but also still be able to cross the cell

membrane and accumulate intracellularly. The molecule glycine betaine (GB) fits these

criteria, at least in E. coli (Schiefner, et al., 2004). In addition, GBBP binds GB

selectively and with a high affinity. GB binding also induces a large conformational

change in GBBP. This change, in addition to the properties of its ligand GB, makes it an

ideal candidate for introducing allosteric effects to DNA binding proteins such as TALE

and dCas9. Additionally, because its N- and C-termini are spatially near one another, it is

also an ideal candidate for circular permutation, allowing for increased genetic diversity

in the library.

Chapter 2: Glycine Betaine-Inducible Transcription Activator-Like Effectors

2.1: Introduction

Prior to my time in the Ostermeier lab, Lucas Ribeiro created a putative protein

switch via domain insertion of the E. coli glycine betaine binding protein (GBBP, also

known as ProX) into a transcription activator-like effector (TALE) protein. GBBP is a

periplasmic binding protein that exhibits a large conformational change when bound to

glycine betaine (GB), a small, membrane permeable, and low-cost molecule that

accumulates to high intracellular concentrations and is not metabolized by E. coli

(Ribeiro, Alperson, Pfleger, & Ostermeier, 2016). For the specific TALE protein, Ribeiro

used TALElacO1, the TALE customized to strongly bind to the lacO site. This specific

construction of TALE, created via Golden Gate cloning to introduce the repeat units

targeting each nucleotide of the lacO sequence, had been shown to prevent transcription

better than LacI (Politz, Copeland, & Pfleger, 2013). Ideally, when TALE is coupled with

GBBP to form a switch, the TALE-GBBP protein will be able to bind to the DNA site

and prevent transcription in the absence of the GB ligand. When the input signal, GB, is

present, it binds to GBBP and allosterically inhibits TALE; that is, the

conformation of TALE-GBBP changes in a way that prevents binding to DNA and

allows RNAP to transcribe the gene, producing the output (Figure 2.1).

Figure 2.1: Schematic of TALE-GBBP functionality. The multi-colored strip represents the TALE protein, able to bind to the TALE binding site (TBS) and repress gene activity. Each colored strip is a repeat unit that binds to one nucleotide (A, T, G, or C), which allows TALE to target the TBS specifically. The pocket on top represents the circularly permuted GBBP (cpGBBP) binding domain, inserted into TALE to form TALE-GBBP. When the ligand, GB, is added and binds to the GBBP domain, TALE-GBBP is no longer able to bind to the TBS, and the gene can be transcribed.

Within TALElacO1, Ribeiro identified 194 unique, in-frame, insertion points,

which he predicted would be less likely to disrupt TALE’s ability to bind DNA and

designed polymerase chain reaction (PCR) primers to linearize the plasmid containing the

gene through multiplex inverse PCR (Ribeiro, Alperson, Pfleger, & Ostermeier, 2016).

This construction of TALElacO1 was under the control of the constitutive promoter

J23102 and contained on plasmid pTS1, part of the bandpass system, a tunable system

allowing for positive and negative selection of beta-lactamase activity (Sohka, et al.,

2009). To circularly permute GBBP, he examined the crystal structure of the binding

domain to find residues which were “solvent accessible, flexible, loosely packed, and

between secondary structure elements” (Ribeiro, Alperson, Pfleger, & Ostermeier, 2016).

For domains with close N- and C-termini, like GBBP, circular permutation is possible.

A circular permutant joins the original termini with a linker and opens new termini elsewhere in the sequence. To generate permutants, two copies of the domain’s gene are placed in tandem on

a plasmid with a short linker between them. PCR amplification then produces reading frames

that each contain one full copy of the gene, but with different starting and

ending points. By using circular permutation, different geometries of the binding domain

can be achieved, adding to diversity and increasing the likelihood of finding a successful

switch. Ribeiro’s analysis of the GBBP domain yielded 137 positions at which to

circularly permute the gene. He used two glycine residues as the linker. Ligating the

amplified circularly permuted GBBP (cpGBBP) DNA fragments into the inverse

PCR-opened TALElacO1 plasmid produced many variants of

GBBP-TALElacO1 fusion DNA. This was transformed into NEB 5-alpha E. coli for the

creation of a naïve library, consisting of 2.8 x 10^5 transformants, 60% of which contained

cpGBBP inserted at sites well distributed throughout TALElacO1.
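The tandem-copy approach to circular permutation and the resulting library size can be sketched as follows. This is a minimal illustration at the protein level, assuming a toy eight-residue sequence that is not GBBP; the function name is hypothetical. The library arithmetic further assumes that every one of the 194 insertion points can pair with every one of the 137 permutation sites, which is an upper bound rather than a measured diversity.

```python
# Illustrative sketch only: the toy sequence and function name are hypothetical.

def circular_permutant(protein, new_start, linker="GG"):
    """Return the permutant whose N-terminus is residue new_start (1-based).

    Mirrors the tandem-copy approach: two copies joined by a linker, from
    which a window containing one full copy plus the linker is taken.
    """
    if not 2 <= new_start <= len(protein):
        raise ValueError("new_start must lie between 2 and the protein length")
    tandem = protein + linker + protein
    offset = new_start - 1
    return tandem[offset:offset + len(protein) + len(linker)]


toy_protein = "MKTAYIAK"  # toy sequence, not GBBP
permutant = circular_permutant(toy_protein, new_start=4)

# Library arithmetic using the counts reported in the text.
insertion_points = 194
permutation_sites = 137
max_variants = insertion_points * permutation_sites       # upper bound
transformants_with_insert = round(2.8e5 * 0.60)
coverage = transformants_with_insert / max_variants
```

Under these assumptions the naïve library contains several times more insert-bearing transformants than possible site combinations, although the ratio says nothing about how evenly the variants are represented.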

After creating a naïve library, Ribeiro needed to determine the degree of

switching for the gene constructions. Ribeiro harvested the naïve library, and, from an

aliquot, extracted the plasmid DNA. He transformed this DNA into SNOBLA cells (E.

coli strain SNO301 containing pDIMC8-BLA, a plasmid encoding for the beta lactamase

gene with a lacO1 site upstream). This strain of cells containing the pDIMC8-BLA

plasmid, along with the pTS1 vector containing the GBBP-TALElacO1 gene, formed the

complete bandpass system and allowed for positive and negative selection of beta-

lactamase activity (i.e. a system for selection of cells based on whether BLA was being

expressed or not) (Figure 2.2).

Figure 2.2: Plasmid schematic depiction for the essential components of the bandpass system, showing the BITElacO-C7 variant. “In the absence of sufficient cellular beta-lactamase (BLA) activity for hydrolysis of Amp, cell wall synthesis is compromised and cells cannot proliferate. In addition, cell wall breakdown results in the accumulation of aM-pentapeptide (aM-Pp), which induces the ampC promoter via interactions with AmpR, resulting in the production of TetC (which confers Tet resistance) and GFP. However, the level of Amp necessary to induce ampC is lower than the level that prevents the growth of E. coli cells” (Sohka, et al., 2009). BLA gene expression is regulated through GB induction of the lacO site after the tac promoter, upstream of the BLA gene. Figure adapted from (Sohka, et al., 2009).

A functional GBBP-TALElacO1 protein switch should repress BLA through the lacO

site downstream of the tac promoter in the absence of GB, and not repress in the presence

of GB. With the bandpass system, a positive and negative selection can be conducted. For

the negative selection, cells from the GBBP-TALElacO1 protein library that are able to

grow in the presence of tetracycline, low concentration of ampicillin, and in the absence

of GB are cells with functional GBBP-TALElacO1 genes repressing bla. When bla is

repressed, aM-Pp is formed, which allows tetC to be transcribed, allowing for cell

survival. The surviving cells are then subjected to the positive selection. For this, the cells

able to grow in the presence of higher concentrations of ampicillin and in the presence of

GB are cells with GBBP-TALElacO1 proteins that are allosterically inhibited by GB,

therefore selecting only proteins with some degree of activity change due to change in

concentration of GB. Cells that survived both conditions were subjected to increasingly

higher concentrations of ampicillin to find the minimum inhibitory concentration (MIC)

of ampicillin at which they grew to further characterize the degree of switching. Of the

selected variants, the fusion named BITElacO-C7 (Betaine-Inducible Transcription

Effector) appeared to perform the best with an eight-fold increase in ampicillin resistance

when exposed to GB compared to its absence. This construct consisted of a circularly

permuted GBBP split site between the 130th and 131st residues, located within the

secondary structure in a loop between a β-sheet and an α-helix. This cpGBBP was

inserted into TALElacO1 in the conserved N-terminal region, a common insertion region

in other characterized fusions. The selections took place at 5 mM GB, but with 10 mM,

full induction was achieved, so 10 mM was used for the rest of the experimentation

(Ribeiro, Alperson, Pfleger, & Ostermeier, 2016).
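The logic of the two-step selection described above can be summarized in a loose sketch. It abstracts each library member to two ampicillin MIC values (without and with GB) and treats the selection conditions as simple thresholds. All names, thresholds, and MIC values below are hypothetical and chosen only to make the logic visible; they are not the conditions Ribeiro used.

```python
# Loose abstraction of the bandpass selection; every number here is hypothetical.
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    mic_without_gb: float  # ampicillin MIC without GB, in ug/mL
    mic_with_gb: float     # ampicillin MIC with GB, in ug/mL


def passes_negative_selection(v, amp_low, repressed_mic_max):
    """Tet + low Amp, no GB: the cell must tolerate amp_low, yet have BLA
    activity low enough (bla repressed) for ampC-driven tetC expression."""
    return amp_low < v.mic_without_gb <= repressed_mic_max


def passes_positive_selection(v, amp_high):
    """Higher Amp with GB: the cell grows only if BLA is expressed."""
    return v.mic_with_gb > amp_high


def two_step_selection(library, amp_low, repressed_mic_max, amp_high):
    survivors = [v for v in library
                 if passes_negative_selection(v, amp_low, repressed_mic_max)]
    return [v for v in survivors if passes_positive_selection(v, amp_high)]


library = [
    Variant("no_repression", 2048, 2048),  # never represses bla
    Variant("always_represses", 32, 32),   # represses regardless of GB
    Variant("switch", 32, 512),            # represses only without GB
]
selected = two_step_selection(library, amp_low=8,
                              repressed_mic_max=64, amp_high=256)
```

Only the variant whose resistance rises when GB is present survives both steps, which is the behavior the selection is designed to enrich.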

Ribeiro performed many experiments to characterize BITElacO-C7. Starting with

an ampicillin MIC assay with BLA on a plasmid downstream of the lacO binding site, he

compared the degree of repression with and without GB in the system (Figure 2.3).

Figure 2.3: “Characterization of BITElacO-C7 controlling expression from lac-derived promoters on plasmids and the chromosome [in MG1655 E. coli]. (A) Scheme of the three different reporter systems used to assess BITElacO-C7’s abilities as a GB-inducible repressor. (B) GB-dependence of the minimum inhibitory concentrations (MIC) for ampicillin for cells in which expression of beta-lactamase from the tac promoter is regulated by LacI, TALElacO, or BITElacO-C7. The fold-increase in the MIC by the inclusion of 10 mM GB in the media is indicated. (C) GB-dependence of the mean sfGFP protein production rates for cells expressing LacI, TALElacO1 or BITElacO-C7 targeting a plasmid-borne sfGFP under the control of the tac promoter. (D) GB-dependence of the mean sfGFP protein production rates cells expressing LacI, TALElacO1 or BITElacO-C7 targeting a chromosomal sfGFP reporter integrated into the lacIZYA locus. The error bars represent the standard deviation (n=3)” (Ribeiro, Alperson, Pfleger, & Ostermeier, 2016).

He saw a 32-fold difference between induced and un-induced BITElacO-C7 compared to

a 256-fold difference with LacI. He moved next to a fluorescence assay with sfGFP on a

plasmid downstream of the lacO binding site. For this, he saw 9.8-fold difference

between induced and un-induced BITElacO-C7, compared to a 68.6-fold difference with

LacI. Finally, he tested a genomic copy of sfGFP downstream of the lacO site. In this

test, BITElacO-C7 exhibited a 19.6-fold difference, compared with a 4-fold LacI

difference. To achieve tighter control, Ribeiro placed multiple lacO TALE binding sites

(TBS) before the BLA gene. With 4 TBSs in a row upstream of the BLA gene,

BITElacO-C7 exhibited a 128-fold difference between induced and un-induced activity.

Finally, he changed the repeat motifs of TALE to bind to E. coli’s lysA promoter to test

the modularity of the switch. LysA is essential for growth in media without lysine. The

induced switch grew at the same rate as cells with access to lysine, and the un-induced

switch grew much less, although still to some extent. This showed that the switch could

be easily moved to target any site with only minor modification.

Immediately prior to my entry into the lab, repeat experiments of the type shown

in Figure 2.3 no longer indicated a switching effect with BITElacO-C7. The addition of GB

no longer produced a change in antibiotic resistance or fluorescence (depending on the

output reporter). Prior to the loss of function, I had started planning to transfer BITElacO-

C7 to HEK 293 and test its performance in mammalian cells, but first we needed to figure

out why the experiments could not be reproduced and, hopefully, fix the problem. Ryan

Weeks, also a new member to the lab, and I began our investigation into the loss of

function of BITElacO-C7.

2.2: Materials and Methods

Materials

Strains, Plasmids, and Reagents

A table of all strains and plasmids used can be found in Table S1. For the

experiments in this chapter, E. coli strains K-12 MG1655 ∆LacI and SNO301 were used.

The MG1655 genotype used was F- λ- ilvG- rfb-50 rph-1. The SNO301 genotype used was

ampD1, ampA1, ampC8, pyrB, recA, rpsL. All chemicals used were from Fisher

Scientific and Sigma Aldrich, unless otherwise noted. All enzymes used were from New

England Biolabs. Oligonucleotides were purchased from Integrated DNA Technologies.

All Sanger sequencing was performed by Genewiz.

Media

Two types of media were used in the experiments. Tryptone broth (TB) media and

agar were used for all non-fluorescent studies. The composition of this was 10 g/L BD

Bacto tryptone and 5 g/L NaCl. For TB agar, 15 g/L BD Bacto agar was added to the

above TB mixture. All TB media and agar was autoclaved and stored at room

temperature. For fluorescent studies, M63 minimal media was used. This composition

was 15 mM (NH4)2SO4, 22 mM KH2PO4, 40 mM K2HPO4, 25 µM FeSO4, 2 mM MgSO4,

0.1 mM CaCl2, 5 mM thiamine HCl, 0.2% (w/v) tryptone, 0.2% (w/v) glucose. M63 was

sterile filtered with a 0.2 µm pore Corning filter and stored at 4ºC.

Methods

Minimum Inhibitory Concentration Assay

For the minimum inhibitory concentration (MIC) agar plate assays, TB plates

were made with plasmid maintenance antibiotics and increasing concentrations of freshly

dissolved ampicillin, from 0 to 2048 or 4096 µg/mL with a two-fold dilution factor.

Inducer concentrations of 500 µM IPTG and 10 mM of GB were used, unless indicated

otherwise. All conditions were assayed in triplicate. Thirty (30) µL of frozen cell stocks

diluted to 3 x 10^5 CFU/mL were plated, grown overnight, and assayed after 16 hours. MIC

was determined as the lowest ampicillin concentration with no visible cell growth.
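The agar plate readout can be expressed as a short sketch: build the two-fold ampicillin series, record whether each plate shows visible growth, and report the lowest concentration without growth. The function names and the growth pattern are hypothetical, and the choice of 1 µg/mL as the lowest non-zero concentration is an assumption for the example only.

```python
# Illustrative sketch only: names and the example growth pattern are hypothetical.

def twofold_series(top, n_steps):
    """Two-fold dilution series starting at top, e.g. 2048, 1024, ..."""
    return [top // 2 ** k for k in range(n_steps)]


def mic_from_plates(growth_by_conc):
    """Lowest ampicillin concentration with no visible growth.

    Returns None when growth is seen at every concentration, i.e. the MIC
    exceeds the highest concentration tested.
    """
    no_growth = [conc for conc, grew in growth_by_conc.items() if not grew]
    return min(no_growth) if no_growth else None


concentrations = [0] + sorted(twofold_series(2048, 12))  # 0, 1, 2, ..., 2048 ug/mL

# Hypothetical result: visible growth up to 128 ug/mL, none from 256 ug/mL.
growth = {conc: conc <= 128 for conc in concentrations}
mic = mic_from_plates(growth)

# Cells plated per condition: 30 uL of a 3 x 10^5 CFU/mL stock.
cfu_plated = 3e5 * 30 / 1000
```

Returning None for a strain that grows on every plate keeps "MIC above the tested range" distinct from a measured value.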

For the MIC assays performed with the SpectraMAX M3 plate reader, a two-fold

dilution of ampicillin from 2048 or 4096 µg/mL to either 1 or 4 µg/mL was used. A

1/20,000 dilution of overnight culture was inoculated into 1 mL TB media with inducer if

required, ampicillin, and maintenance antibiotics in a 96-well plate with 2 mL well

capacity. This was incubated for 22 hours at 37ºC while shaking. After 22 hours, 200 µL

of each condition was removed to a clear bottomed 96-well plate and OD600 values were

measured on the plate reader.
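For the plate reader format, one way to turn OD600 readings into a MIC is to normalize each condition to the no-ampicillin control and call the MIC at the lowest concentration whose normalized survival falls below a cutoff. The 10% cutoff, the function names, and the OD600 values below are assumptions for illustration; the text does not specify how the MIC was called from optical density.

```python
# Illustrative sketch only: the cutoff and all OD600 values are hypothetical.
from statistics import mean


def normalized_survival(od_by_conc, reference_ods):
    """Mean OD600 at each concentration divided by the mean reference OD600."""
    reference = mean(reference_ods)
    return {conc: mean(ods) / reference for conc, ods in od_by_conc.items()}


def mic_from_od(survival, cutoff=0.1):
    """Lowest non-zero concentration with normalized survival below cutoff."""
    below = [conc for conc, value in survival.items()
             if conc > 0 and value < cutoff]
    return min(below) if below else None


# Hypothetical triplicate OD600 readings (ug/mL ampicillin -> readings).
od600 = {
    0: [1.00, 1.10, 0.90],
    128: [0.90, 1.00, 0.80],
    256: [0.05, 0.04, 0.06],
    512: [0.03, 0.03, 0.03],
}
survival = normalized_survival(od600, reference_ods=od600[0])
mic = mic_from_od(survival)
```

A fixed cutoff is simple but sensitive to background absorbance, so the blank-corrected OD600 would normally be used before normalizing.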

For the pH MIC assays, the same protocols as above were followed for both plate

reader and agar plate assays, but pH of the media or agar, respectively, was adjusted by

addition of HCl or NaOH to achieve a final pH of 5.5 or 7, depending on the condition

being tested. pH was measured using the Fisher Scientific Accumet AR15.

Green Fluorescent Protein Production Assay

The green fluorescent protein (GFP) production assay was performed on the

Synergy H4 plate reader. Overnight cultures were diluted to an OD600 of 0.005 in M63

media, appropriate maintenance antibiotics, and inducer, if required. These were left to

incubate on the shaker for 2.5 hours at 37ºC. After incubation, 75 µL of each sample was

added to a clear bottom plate, placed in the plate reader at 37ºC with intermittent shaking

enabled, and OD600 and 485 nm excitation/510 nm emission values were measured every

20 minutes for 15 hours.
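The time-course readings from this assay can be reduced in more than one way. The sketch below shows two common reductions: fluorescence normalized to OD600 at each time point, and a per-interval production rate taken as the change in fluorescence per hour divided by the mean OD600 over the interval. The rate definition is an assumption for illustration and may differ from the one used to produce the figures; the function names and readings are hypothetical.

```python
# Illustrative sketch only: the rate definition and all readings are assumptions.

def reads_per_well(duration_h, interval_min):
    """Number of reads, assuming one at time zero and one at the end."""
    return duration_h * 60 // interval_min + 1


def normalized_fluorescence(fluorescence, od600):
    """Fluorescence divided by OD600 at each time point."""
    return [f / od for f, od in zip(fluorescence, od600)]


def production_rates(times_min, fluorescence, od600):
    """Change in fluorescence per hour, per unit OD600, for each interval."""
    rates = []
    for k in range(1, len(times_min)):
        delta_f = fluorescence[k] - fluorescence[k - 1]
        delta_t_h = (times_min[k] - times_min[k - 1]) / 60
        mean_od = (od600[k] + od600[k - 1]) / 2
        rates.append(delta_f / delta_t_h / mean_od)
    return rates


n_reads = reads_per_well(duration_h=15, interval_min=20)

# Hypothetical readings for one well over the first 40 minutes.
times_min = [0, 20, 40]
fluorescence = [100.0, 160.0, 250.0]
od600 = [0.10, 0.14, 0.20]
rates = production_rates(times_min, fluorescence, od600)
normalized = normalized_fluorescence(fluorescence, od600)
```

Dividing by OD600 corrects for differences in cell density between wells, so that a slow-growing culture is not mistaken for a strongly repressed one.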

Competent Cell Creation and Transformation

Competent cells were created by inoculating low salt TB media with 1/100th

media volume of overnight culture. Glucose was added to 0.2% (w/v), and the culture was grown at 37ºC to

OD600 0.7-0.8. The resulting culture was spun at 3,000 x g for 20 minutes at 4ºC. The

supernatant was discarded, and the pellet was washed with water and spun down again.

This step was repeated once again with water and once more with 10% (v/v) glycerol.

The resulting pellet was resuspended in approximately 1 mL of remaining supernatant,

aliquoted, and frozen at -80ºC.

Transformations were performed by combining 1-5 µL of purified plasmid

products, depending on concentration, with an aliquot of competent cells. The mixture was

mixed once by pipetting, transferred to an electroporation cuvette, pulsed at 2.0 kV, 100

Ohms, 50 µF, and washed with 37ºC SOC recovery media. The culture was incubated for

60 minutes at 37ºC, diluted, and plated onto appropriate antibiotic-containing plates.

Plasmid Manipulation and Visualization

All plasmid preparations were performed from overnight cultures of cells using

the Qiagen QIAprep Spin Miniprep Kit according to the protocol of the kit. DNA gel

electrophoresis was performed using Bio-Rad power supplies and electrophoresis boxes.

Invitrogen Ultrapure agarose and TAE buffer were used. Imaging of the gels was

performed by the CareStream MI software and the Gel Logic 112 gel box. DNA gel

extraction was performed with and according to the protocol of the Invitrogen PureLink

Quick Gel Extraction Kit.

2.3: Results and Discussion

We set out to determine in which ways the switch did and did not function. In

order to establish the degree of repression and switching, we needed to repeat Ribeiro’s

characterization experiments (Figure 2.3). Of the three, the ampicillin MIC assays

were the most important because of the large degree of switching originally exhibited,

the ease of interpreting the results, and their low complexity. For these reasons, we

repeated Ribeiro’s MIC assays first.

Initial MIC Assays

Using new stocks made from Ribeiro’s glycerol stocks, we performed a MIC

assay (Figure 2.4).

Figure 2.4: The first MIC assay of Ribeiro’s TALElacO, BITElacO-C7, and no-repressor constructs in MG1655 cells, tested to a maximum concentration of 2048 µg/mL ampicillin with two-fold dilutions. The lacO site is located upstream of the BLA gene. Five hundred µM IPTG and 10 mM GB-H2O were utilized for induction where indicated. Values are the mean and error bars represent the standard deviation (n=3). Lack of error bars indicates that all repeats had the same value.

The fold change in Amp resistance when GB was added was negligible when BITElacO-

C7 was used. This confirmed the results Ribeiro recorded before he left: BITElacO-C7 no

longer achieved the initially recorded 32-fold switching. To verify these results, we

repeated the MIC assay for the BITElacO-C7 construct (Figure 2.5).

Figure 2.5: Repeat of MIC assay with only BITElacO-C7 and no-repressor control, both in MG1655 cells. The highest ampicillin concentration used was 4096 µg/mL, and the MIC of pDIM MG1655 exceeded this concentration as well. 10 mM GB-H2O was used for induction. Values are the mean and error bars represent the standard deviation (n=3). Lack of error bars indicates that all repeats had the same value.

BITElacO-C7 still exhibited minimal to no switching in the presence of GB-H2O,

duplicating the first result. Though switching was not present, some degree of repression

remained: the MICs were lower than those of cells with no repressor, though still higher than

those of cells expressing TALElacO.

Initial Protein Production Rate Assays

Concurrently with the MIC assays, we repeated the plasmid sfGFP production

rate assay Ribeiro used to characterize BITElacO-C7 (Figure 2.6).

Figure 2.6: Green fluorescent protein production rate assay. lacO site upstream of sfGFP gene. As shown, GFP expression was strongly induced by IPTG, TALElacO repressed GFP expression regardless of presence of GB, BITElacO-C7 repressed GFP expression (though less than TALElacO), and cells with no repressor expressed GFP regardless of GB presence. 500 µM IPTG and 10 mM GB-H2O used for induction as indicated. Values are the mean GFP fluorescence normalized to OD600 and error bars represent the standard deviation (n=3).

These results mirrored the MIC assay: BITElacO-C7 repressed gene expression to some

degree, but, upon addition of GB-H2O, did not switch to stop repressing gene expression.

From these results, as well as the MIC assay results, we could not confirm that

BITElacO-C7 performed as a switch.

Confirming Sequence of BITElacO-C7

To ensure the BITElacO-C7 gene and promoter were identical to the original gene

Ribeiro characterized, we used Sanger sequencing to rule out mutation causing the loss of

switching. We first sequenced the BITElacO-C7 gene. From the streak of Ribeiro’s

glycerol stock, we selected three colonies for characterization, performed a plasmid

preparation, and sequenced the resulting DNA. From the sequencing data, none of the

three had mutations affecting the amino acid sequence as compared to Ribeiro’s initial

sequencing results, though one of the three had a silent mutation at base pair 2466 of the

BITElacO-C7 gene: residue 822, leucine. In addition to the gene, we also sequenced the

J23102 promoter of BITElacO-C7. This was also not mutated.
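The mapping from base pair 2466 to residue 822 follows from reading the coding sequence in triplets, and can be checked with a short sketch. The function names are hypothetical, and the codons in the example are placeholders: the actual codon and base change observed are not stated here. The leucine codon set is the standard genetic code.

```python
# Illustrative sketch only: the example codons are placeholders, not sequencing data.

LEUCINE_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}  # standard code


def codon_coordinates(base_pair):
    """Map a 1-based coding-sequence position to (residue, position in codon)."""
    residue = (base_pair - 1) // 3 + 1
    position_in_codon = (base_pair - 1) % 3 + 1
    return residue, position_in_codon


def is_silent_leucine_change(codon_before, codon_after):
    """True if both codons encode leucine, so the residue is unchanged."""
    return codon_before in LEUCINE_CODONS and codon_after in LEUCINE_CODONS


residue, position_in_codon = codon_coordinates(2466)

# Placeholder example of a third-position change within a leucine codon.
example_is_silent = is_silent_leucine_change("CTG", "CTC")
```

Base pair 2466 falls on the third position of codon 822, where changes within the CTN family of leucine codons leave the amino acid unchanged.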

We also sequenced the pDIM reporter plasmid to ensure there were no mutations

at the lacO binding site or on the BLA gene itself that could be affecting the results.

These sequences were not mutated, however, which ruled out this possibility.

Attempts to Restore Functionality

Since we could not reproduce the results of Figure 2.3 and BITElacO-C7 failed to

stop repressing in the presence of GB, we chose to manipulate the system to attempt to restore

switching functionality. Ribeiro informed us that a few years prior, he had lost switch

function also. He restored functionality by retransforming both the pTS1 plasmid

containing BITElacO-C7 and the pDIM reporter plasmid with BLA on it into fresh cells.

This experience informed our next steps.

Transformation into Fresh MG1655 Cells

At the time Ribeiro retransformed the system, he was still using SNO301 cells for

characterization of his library. Since we were working in MG1655 cells, our first step

was to re-transform pTS1-BITElacO-C7 and pDIM-C8 into new MG1655 cells. In

addition to pTS1-BITElacO-C7, we found a construct of Ribeiro’s named pLR-BITElacO-C7.

We retransformed these plasmids into fresh MG1655 cells, sequenced to ensure

mutations (e.g. deletions) did not occur during transformation, and performed the MIC

assay to characterize both forms of BITElacO-C7 after this manipulation (Figure 2.7).

Figure 2.7: MIC assay with BITElacO-C7 on pLR and pTS plasmids and no-repressor control, all in fresh MG1655 cells. The MIC of pDIM MG1655 was greater than the highest concentration of ampicillin used, 2048 µg/mL. 10 mM GB-H2O was used for induction as indicated. Values are the mean (n=3); the SD for all six samples was 0. Lack of error bars indicates that all repeats had the same value.

Both of the BITElacO-C7 constructs performed identically. Their BITElacO-C7

sequences were also identical; the pLR plasmid simply lacked the GFP and TetC genes,

which reduced the plasmid’s size. The MIC results also mirrored those from the original

MG1655 cells: BITElacO-C7 repressed slightly but did not switch when exposed to GB-

H2O. These results showed that retransformation into fresh MG1655 cells did not

reestablish functionality.

Transformation into Fresh SNO301 Cells

Following the failure of MG1655 cells, we decided to try to repeat Ribeiro’s

transformation into fresh SNO301 cells. Ribeiro characterized BITElacO-C7 using

MG1655 cells (Figure 2.3), but originally isolated the switch in SNO301 cells. The

ampD1 mutation in SNO301 allows beta-lactam hyperinducible transcription of genes

under the control of the ampC promoter, a key functionality of the bandpass selection

(Sohka, et al., 2009). We hypothesized that if we could reproduce similar results to

Figure 2.3 in SNO301 cells, we could apply what we learned in the cells that originally

isolated BITElacO-C7 to MG1655 cells. We transformed both pTS1-BITElacO-C7 and

pDIM-C8 into SNO301 cells. During his screening to identify switches and his first

characterizations, Ribeiro used the hydrochloride salt of glycine betaine (GB-HCl),

instead of the monohydrate form he used for characterization later in Figure 2.3, which

we had also been using. For these trials, we returned to Ribeiro’s original methodology

and used GB-HCl. Ribeiro used 5 mM GB-HCl during selection and original

characterization, so we also used this reduced concentration.

The first two assays we performed went to 1024 µg/mL ampicillin. For these, we

achieved a MIC of 256 µg/mL for BITElacO-C7 without GB-HCl and a MIC greater than

1024 µg/mL for BITElacO-C7 with GB-HCl (Figures S1 and S2). We increased the

maximum ampicillin concentration to 4096 µg/mL and performed the assay again (Figure

2.8).

Figure 2.8: Normalized cell survival vs ampicillin concentration at 22 hours after addition of GB-HCl for BITElacO-C7 expressing SNO301 cells. These values are normalized to the mean absorbance of cells without GB-HCl at 0 µg/mL ampicillin. Cells induced with 5 mM GB-HCl for induced condition. Normalized cell survival values are the mean, error bars represent the standard deviation (n=3).

This assay revealed the MIC of BITElacO-C7 without GB-HCl, still at 256 µg/mL, as

well as the MIC of BITElacO-C7 with GB-HCl, 2048 µg/mL. This 8-fold difference in

MIC was the highest we had characterized yet, though still not Ribeiro’s 32-fold

characterized in Figure 2.3.
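Because the ampicillin series is two-fold, a fold change in MIC corresponds to a whole number of dilution steps. The short sketch below works through the values reported above; the function names are hypothetical.

```python
# Worked arithmetic for the MIC values reported in the text.
import math


def fold_change(mic_induced, mic_uninduced):
    """Ratio of the MIC with inducer to the MIC without inducer."""
    return mic_induced / mic_uninduced


def dilution_steps(fold):
    """Number of two-fold dilution steps that a fold change spans."""
    return math.log2(fold)


observed_fold = fold_change(2048, 256)          # with vs. without GB-HCl
observed_steps = dilution_steps(observed_fold)
original_steps = dilution_steps(32)             # originally reported 32-fold
```

The observed change spans three two-fold steps, compared with five for the originally reported 32-fold difference, so the gap between the two results is two dilution steps.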

In order to determine if the change in switching was from the system in the fresh

SNO301 cells or from a difference between GB-HCl and GB-H2O, we repeated this test

with GB-H2O (Figure 2.9).

Figure 2.9: Normalized cell survival vs ampicillin concentration at 22 hours after addition of GB-H2O for BITElacO-C7 expressing SNO301 cells. Values normalized to the mean absorbance of cells without GB-H2O at 0 µg/mL ampicillin. Normalized cell survival values are the mean, error bars represent the standard deviation (n=3).

The repeated test with GB-H2O showed a MIC of 512 µg/mL for both BITElacO-C7 with

and without GB-H2O. The difference between the two results therefore stemmed from the

form of GB used, not from GB causing an increase in BLA expression.

pH Dependence

We started an investigation to determine how the two forms of GB affected

ampicillin resistance. We tested the pH of TB media with 5 mM GB-HCl and GB-H2O.

TB with GB-HCl had a pH of 5.57 and TB with GB-H2O had a pH of 7.07. To

characterize the effects of different pH values and forms of GB, we performed a MIC

assay with each form of GB at the two pH values. The pH of the agar was adjusted after

addition of antibiotics and GB-H2O or GB-HCl. A solution of HCl or NaOH was added

to achieve a final pH of either 7 or 5.5 (Figure 2.10).

Figure 2.10: MIC performed on agar plates for LacI, TALElacO, and BITElacO-C7, all expressed in SNO301 cells. pH values of 5.5 and 7 were surveyed. Induction was with 1 mM IPTG; GB-H2O and GB-HCl were used at 5 mM. Values are the mean and error bars represent the standard deviation (n=3). Lack of error bars indicates that all repeats had the same value.

The MICs found were not only pH dependent, but, at pH 7, were affected by the form of

GB added. At pH 7, for the TALE expressing construct, the MIC increased with addition

of GB-HCl, whereas the BITElacO-C7 construct MIC decreased. At pH 5.5, the addition

of either form of GB increased MIC for both TALE and BITElacO-C7 constructs. Even

at pH 5.5 though, BITElacO-C7 only switched by 2.7-fold, not the 8-fold seen earlier in

the optical density assay.

In the end, the apparent initial isolation of BITElacO-C7 appears to have been

caused by the acid form of GB causing an increase in ampicillin resistance. When

originally selected for by Ribeiro, the hydrochloride form of GB lowered the pH of the

media and somehow affected what he selected for. Since higher MICs appeared to be

achieved at lower pH values, we considered a few possibilities. The most obvious

explanation could be that the switch was pH dependent, though this does not take into

account that Ribeiro reported functionality with GB-H2O at pH 7 in the MG1655 cells.

Another explanation could be that BLA has a higher activity at lower pH values.

Alternatively, and perhaps most likely, the lower pH may cause an increase in the

background rate of ampicillin hydrolysis, leading to an apparent increase in ampicillin

resistance. Some, or a combination, of these effects may have led to the observed

behavior: reduced ampicillin resistance without GB-HCl and increased resistance with

GB-HCl. When a particular switch is chosen from a selection, the switch demonstrates

qualities that it was selected to exhibit. In this case, there was a pH component to the

selection.

Regardless of the cause, we were unable to confirm BITElacO-C7 behaves as a

switch or to restore its switching functionality. We could not replicate Ribeiro’s MIC

assay results, even with our greatest change of 8-fold using the plate reader assay at pH

5.5 and GB-HCl. With this failure, we decided to abandon our study of BITElacO-C7 and

also abandon using TALEs as modular repressors and try the same strategy with

deactivated Cas9 (dCas9) proteins that can specifically inhibit transcription as directed by

guide RNA sequences.

Chapter 3: Glycine Betaine-Inducible Nuclease-Inactive Cas9

3.1: Introduction

Following the inability to replicate Ribeiro’s original BITElacO-C7 results from

Figure 2.3, we were faced with the option to perform a selection using GB-H2O from

Ribeiro’s original naïve library or examine other functional proteins to insert into. At the

time Ribeiro designed and created BITElacO-C7, TALEs, Zinc Fingers, and Cas9 were

all known as proteins which can be programmed to interact with target DNA sequences.

He chose TALEs due to their established success in terminating transcription at the lacO

site (Politz, Copeland, & Pfleger, 2013). Now, with the chance to redesign from a new

functional protein or continue with TALEs, we took the chance to examine the benefits

and disadvantages of the three proteins.

Our design goals remained the same as Ribeiro’s. In addition to a switch triggered

with inexpensive ligand, we wanted a modular switch—one that could easily be changed

to target new gene sequences if desired and would also bind specifically to the target site

with minimal off-target effects. Zinc Fingers, due to their high frequency of off-target

effects, were discounted (Waryah, Moses, Arooj, & Blancafort, 2018). TALEs have a low

frequency of off-target effects, but modularity is impacted by the difficulty of cloning due

to many repeats, causing the need for more complicated cloning methods such as Golden

Gate cloning. Ribeiro faced these difficulties while working on BITElacO-C7. In

addition, each time a new target is desired, the protein itself must be changed. TALEs are

also methylation sensitive and have difficulty binding to regions of condensed chromatin.

On the other hand, Cas9 has various reports regarding the frequency of off-target effects,

and there is not yet a general consensus (Iyer, et al., 2018) (Nature Methods, 2018). The

small size of bacterial genomes, however, ensures that off-target effects are very unlikely

(Copeland, Politz, & Pfleger, 2014). The modularity, however, is well defined. The

twenty-base pair spacer sequence of Cas9 located on the sgRNA is not only easily cloned

with oligonucleotides, it is separate from the Cas9 gene, allowing for the target sequence

to be adjusted without possibly perturbing the Cas9 protein. One slight downside is that

Cas9 can bind a specified target only if the target sequence is adjacent to a PAM sequence

(this sequence is 5’-NGG-3’ for S. pyogenes Cas9). However, Cas9 is not methylation

sensitive and can bind to the regions of condensed chromatin TALEs cannot (Waryah,

Moses, Arooj, & Blancafort, 2018). Due to its benefits over TALEs, we chose dCas9 as

our new repressor protein for our protein switch.

For our purposes, we were not interested in the endonuclease activity of Cas9,

only the DNA targeting and binding activity. We focused on the nuclease-null dCas9, a

part of the CRISPRi system (Figure 1.4) (Qi, et al., 2013). Additionally, it had already

been shown that dCas9 tolerates domain insertions (Figure 3.1) (Oakes, et al., 2016).

Figure 3.1: “Fold-change values for insertions at specific amino acid sites derived from sequencing data over two rounds of screening. A positive value indicates the preference of the domain insertion at a site to remain in the library after screening for function. A negative value indicates a loss of the clone with an insertion at the site. Bars that reach 10^2 represent sites that were not sequenced before screening. Bars that extend into the shaded region represent clones that were cleared from the library (i.e., were not observed after screening). P values were determined in DESeq with multiple-hypothesis-testing correction” (Oakes, et al., 2016).

By randomly inserting a domain into dCas9, performing CRISPRi screening, and deep

sequencing, Oakes, et al. measured the fold enrichment of each insertion to determine at

what positions dCas9 could and could not accept a domain insertion and still repress gene

transcription.

We looked into several systems that used dCas9 to choose the best one in which

to perform a selection, including the possibility of incorporating it into the bandpass

system Ribeiro used for his selections. However, we decided the best system to start from

would be the system dCas9 was originally characterized in (Qi, et al., 2013) because of

its degree of repression, the ability to make positive and negative screens, and the fact it

was already thoroughly understood.

In this system, E. coli strain MG1655 expresses both sfGFP (GFP) and

monomeric RFP (mRFP) chromosomally under constitutive promoters. The result we

were most interested in, reproduced in Figure 3.2, used sgRNA that targeted different

areas within the mRFP gene.

Figure 3.2: CRISPRi blocking transcription elongation. 6 different sgRNA spacers are shown targeting both the non-template (NT) and template (T) strands in the mRFP gene. The control shows fluorescence of cells with dCas9 only and no sgRNA. The spacer NT1 achieved 300-fold repression (Qi, et al., 2013).

Many target sites both upstream of and inside the mRFP gene were characterized, but the

sgRNA spacer sequence with the most repression, NT1, targeted the beginning of the

mRFP gene and bound the CRISPRi complex to the non-template strand. When dCas9

was co-expressed with sgRNA with the NT1 spacer (sgRNA (NT1)), it was able to

repress the expression of mRFP by 300-fold compared to cells expressing dCas9 but

without any sgRNA (Qi, et al., 2013). We knew from the BITElacO-C7 results that once

we inserted a binding domain into TALE, the new protein might not repress to the same

degree as the wildtype. For the system on which we chose to base our switch, we wanted

a large baseline degree of repression so that even if we did not achieve the same

repression-fold with the un-induced switch, we could still have a chance to have a

relatively high degree of repression. The 300-fold repression the NT1-directed, mRFP

repressing CRISPRi system offered fit this criterion. We also wanted a system in which

we could perform positive and negative selections. This system allowed for that. We

could use fluorescence activated cell sorting (FACS) to select for high-fluorescence or

low-fluorescence cells (or anything in between). The negative screen in the absence of inducer

would enrich for cells with low levels of mRFP (dCas9 still repressing). Cells enriched in

this screen would move to the positive screen in the presence of the inducer, which would

enrich for cells that expressed high levels of mRFP (dCas9 lost its ability to repress

due to the presence of the inducer). With this system, it would also be relatively easy to

change the sgRNA spacer because it is on a different plasmid (pgRNA) than dCas9

(pdCas9) (Figure 3.3).
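
The logic of the two-step screen can be summarized in a short Python sketch. The variant names, fluorescence values, and gate thresholds below are invented for illustration and do not correspond to measured data or to the gates used on the cell sorter.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """A hypothetical library member with made-up mRFP readings (arbitrary units)."""
    name: str
    rfp_without_inducer: float
    rfp_with_inducer: float


def two_step_screen(variants, low_gate, high_gate):
    """Negative screen (no inducer, keep low mRFP), then positive screen
    (inducer present, keep high mRFP). Returns the names that pass both."""
    # Step 1: without inducer, a functional switch still represses mRFP.
    survivors = [v for v in variants if v.rfp_without_inducer <= low_gate]
    # Step 2: with inducer, a functional switch releases repression.
    return [v.name for v in survivors if v.rfp_with_inducer >= high_gate]


library = [
    Variant("switch", 10.0, 900.0),             # represses only without inducer
    Variant("constitutive repressor", 10.0, 12.0),
    Variant("non-functional", 950.0, 940.0),
]
print(two_step_screen(library, low_gate=50.0, high_gate=500.0))
```

Only a variant that passes both gates behaves as a switch; a variant that always represses is lost in the positive screen, and a variant that never represses is lost in the negative screen.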

Figure 3.3: Components of the CRISPRi system and reporters in MG1655 fluorescent cells. pdCas9

encodes for dCas9 under control of the tet promoter and chloramphenicol resistance (CmR). pgRNA

encodes for sgRNA under control of the constitutive J23119 promoter and ampicillin resistance (AmpR). In

the chromosome, GFP and mRFP are expressed under constitutive promoters with strong terminators at the

end of each gene. Spacer NT1 targets dCas9 to the beginning of the mRFP gene.

Plasmid pgRNA encodes for sgRNA, which is expressed constitutively with the J23119

promoter, and the ampicillin resistance gene. A separate plasmid, pdCas9, encodes for

dCas9, expressed from the bidirectional tet promoter, and also the chloramphenicol

resistance gene. Production of dCas9 from the tet promoter is repressed by TetR

(the upstream product of the tet promoter). TetR, an inducible repressor itself, is induced by

anhydrotetracycline (aTc), a tetracycline derivative without antibiotic activity, to allow

for dCas9 transcription.

After we decided on the system, Shaun Spisak joined the project. We worked

on the project in parallel; I focused on a GB-triggered switch, using GB-H2O, while he

focused on a propionate-triggered switch.

3.2 Materials and Methods

Materials

Strains, Plasmids, and Reagents

Strains used were E. coli K-12 strain MG1655 with genomic mRFP and sfGFP, obtained

from Stanley Qi, UC-Berkeley (as described in (Qi, et al., 2013)) and referred to as

MG1655 fluorescent cells in this chapter and used in all fluorescence assays; NEB 5-

alpha Competent E. coli (NEB C2087), used for cloning; NEB Turbo Competent E. coli

(NEB C2984), used for library creations. Plasmids used were pdCas9 (pdCas9-bacteria

was a gift from Stanley Qi (Addgene plasmid # 44249; http://n2t.net/addgene:44249;

RRID:Addgene_44249)) and pgRNA (pgRNA-bacteria was a gift from Stanley Qi

(Addgene plasmid # 44251; http://n2t.net/addgene:44251; RRID:Addgene_44251)), both

from (Qi, et al., 2013); pgRNA-T5-TetR, modified from pgRNA backbone with TetR

gene from pdCas9 inserted between the ampicillin resistance gene and the sgRNA gene,

downstream of the T5 promoter, taken from pQE-30-HyPer3 (pQE-30-HyPer3 was a gift

from Vsevolod Belousov (Addgene plasmid # 42132; http://n2t.net/addgene:42132;

RRID:Addgene_42132)) (Bilan, et al., 2012). All chemicals used were from Fisher

Scientific and Sigma Aldrich, unless otherwise noted. All enzymes used were from New

England Biolabs. Oligonucleotides were purchased from Integrated DNA Technologies.

All Sanger sequencing was performed by Genewiz.

Media

LB media and agar were used unless otherwise noted: 10 g/L BD Bacto tryptone, 5 g/L

BD yeast extract, and 10 g/L NaCl for media and with 15 g/L BD Bacto agar for agar.

TB, described in Chapter 2, was used for minimal media fluorescence experiments as was

M9: 0.001% w/v Thiamine, 0.1 mM CaCl2, 2 mM MgSO4, 0.5% w/v D-Glucose, 0.5%

w/v Difco Casamino acids, 6 g Na2HPO4-7H2O, 3 g KH2PO4, 0.5 g NaCl, and water to 1

L total volume. MOPS EZ Rich Defined media (EZ-RDM) (Teknova M2105) was used

in the minimal media fluorescence assay as well and also when trying new conditions for

transformations; EZ-RDM agar was made by autoclaving 15 g/L BD Bacto agar in

ddH2O then adding the required EZ-RDM supplements to the cooled, but still liquid agar.

Methods

Fluorescence Assays

For each condition tested (no plasmids, pdCas9 only, pdCas9 and pgRNA (NT1),

etc.), previously frozen overnight cultures of MG1655 fluorescent cells with the desired

plasmids were thawed on ice and 5 µL transferred to 5 mL of LB with appropriate

antibiotics and incubated shaking at 37ºC for 2-4 hours. These were then diluted in two

separate tubes for each condition to an OD600 of 0.001 in LB with antibiotics in a final

volume of 2 mL. To one of these tubes, 1 µL 100% ethanol was added, and to the other 1

µL of aTc ethanol solution (at various concentrations, as noted) was added. These were

then incubated shaking at 37ºC for 4 hours, unless otherwise noted. After incubation,

three 200 µL samples of each culture were transferred to a clear-bottomed 96-well plate and

OD700, GFP fluorescence (λEx 485 nm, λEm 525 nm, Filter 515 nm), and mRFP

fluorescence (λEx 585 nm, λEm 620 nm, Filter 610 nm) were measured using the

SpectraMAX M3 plate reader. OD700 was used rather than OD600 (unless otherwise noted) due

to slight interference from mRFP.
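
The dilution and induction volumes in this assay follow from simple C1V1 = C2V2 arithmetic, sketched below in Python. The starting OD600 of 0.5 is an assumed example value (the measured densities of the starter cultures are not given here), and the function names are hypothetical.

```python
def culture_volume_ul(od_start, od_target=0.001, final_volume_ul=2000.0):
    """Volume of starter culture (in µL) needed to reach od_target in the
    final volume, from C1 * V1 = C2 * V2."""
    return od_target * final_volume_ul / od_start


def stock_concentration(final_conc, added_ul=1.0, culture_ul=2000.0):
    """Stock concentration (same units as final_conc) such that adding
    added_ul of stock to culture_ul of culture gives final_conc."""
    return final_conc * (culture_ul + added_ul) / added_ul


# Assumed example: a starter culture at OD600 = 0.5 diluted into 2 mL.
print(culture_volume_ul(0.5))          # µL of starter culture
# aTc stock needed so that 1 µL in 2 mL gives 2 µM or 2 nM final.
print(stock_concentration(2.0))        # in µM, roughly 4 mM
print(stock_concentration(2.0) / 1e3)  # the same factor applied to 2 nM, in µM
```

Adding 1 µL to 2 mL is roughly a 2000-fold dilution, so each final aTc concentration requires an ethanol stock about 2000 times more concentrated.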

Fluorescence Activated Cell Sorting

Fluorescence activated cell sorting (FACS) was conducted by Hao Zhang at the

Johns Hopkins Bloomberg School of Public Health’s cell sorting laboratory using the

Beckman Coulter MoFlo XDP cell sorter.

Competent Cell Creation, Transformation, Plasmid Preparations, and Visualization

Competent cells, transformations, and plasmid preparations were performed in the

same way as described in Chapter 2. All electrophoresis gels used are 0.8% with 0.006

g/L ethidium bromide and were run at 110 V with constant voltage. The same materials,

buffers, and imaging equipment were used as described in Chapter 2.

Library Creation

For multiplex inverse PCR, abutting oligonucleotide primer pairs were created at

each insertion site via a custom MATLAB script to optimize Tm for each set of primers

at each desired position (Kanwar M. R., Wright, Date, Tullman, & Ostermeier, 2013).

The PCR reactions were carried out in a 96-well PCR plate using a 96-well PCR

thermocycler. Reactions with similar primer melting temperatures were grouped together

and a gradient of temperatures used on the thermocycler to optimize yield. These

groupings can be viewed in Tables S2, S3, and S4. A mixture of Phusion HF master mix,

pdCas9 template DNA (10 ng for each reaction), and required water was prepared and 50

µL was added to each well of the 96-well plates. 1 µL of 10 µM forward and reverse

primers was added to each well. A typical Phusion protocol was followed on the

thermocycler. 5 µL of each product was electrophoresed to determine if the amplification

was successful. The rest of the products were pooled together and stored at -20ºC. For

unsuccessful amplification reactions, 3% DMSO and 1 M Betaine were added for new

reactions with the same primers and amounts of template. After all of the amplification

reactions were successful, the pooled products were electrophoresed and the band of

desired size (6.7 kb) cut out, extracted, digested with DpnI, and stored at -20ºC.
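
To illustrate what is meant by abutting primers for inverse PCR, the Python sketch below opens a circular template at a chosen position and extends each primer until a target melting temperature is reached. It is not the MATLAB script referenced above: the function names are hypothetical, and the Wallace rule (2 ºC per A/T, 4 ºC per G/C) is substituted here as a deliberately crude Tm estimate.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Crude Tm estimate in ºC: 2 per A/T plus 4 per G/C (an assumption here)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def extend_to_tm(seq, target_tm, min_len=18, max_len=35):
    """Shortest prefix of seq (at least min_len nt) whose estimated Tm reaches
    target_tm; falls back to the longest allowed prefix."""
    longest = min(max_len, len(seq))
    for n in range(min_len, longest + 1):
        if wallace_tm(seq[:n]) >= target_tm:
            return seq[:n]
    return seq[:longest]


def abutting_primers(template, site, target_tm=60):
    """Primers that linearize a circular template between positions
    site - 1 and site (0-based). Their 5' ends meet at the insertion site."""
    template = template.upper()
    # Rotate the circular sequence so that the opening is at index 0.
    rotated = template[site:] + template[:site]
    forward = extend_to_tm(rotated, target_tm)                      # reads downstream
    reverse = extend_to_tm(reverse_complement(rotated), target_tm)  # reads upstream
    return forward, reverse
```

Because the two primers point away from each other with their 5' ends adjacent, amplification copies the entire plasmid and leaves a linear product that is open exactly at the chosen insertion site.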

The cpGBBP insert reactions were performed in the same manner as the pdCas9

reactions but using Ribeiro’s primers and the template pUC backbone plasmid with 2 copies of the

GBBP gene linked with 2 glycine codons inserted into it. After extraction of purified

cpGBBP products, the product was phosphorylated with the NEB quick blunting kit.

The phosphorylated cpGBBP insert and opened pdCas9 vector were then ligated

using T4 ligase, T4 ligase buffer, 5 µL of 30% w/v PEG-8000, a 3:1 molar insert to

vector ratio with 50 ng of vector (unless otherwise noted), and up to 20 µL of water.

Several ligation reactions could be performed at one time. The reactions were covered

with foil and incubated at room temperature overnight (unless otherwise noted). After

incubation, products were purified and transformed into NEB turbo cells.
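
The mass of insert required for a given molar ratio follows from the fragment lengths. The Python sketch below works this out for the 6.7 kb opened pdCas9 vector and the 0.9 kb cpGBBP insert; the function name is hypothetical, and the calculation assumes both fragments are double-stranded DNA so that mass is proportional to length.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio):
    """Insert mass (ng) for a molar insert:vector ratio, assuming both
    fragments are double-stranded DNA (mass proportional to length)."""
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio


# 50 ng of 6.7 kb vector with a 0.9 kb insert at a 3:1 molar ratio.
print(round(insert_mass_ng(50, 6700, 900, 3), 1))  # about 20 ng of insert
```

The same function gives the insert mass for any other insert to vector ratio by changing the last argument.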

Colony PCR

Colony PCR was performed to determine the length of dCas9 (if present) in

selected colonies without the need to perform a plasmid preparation. Primers to amplify

the gene, along with Promega GoTaq Polymerase mix and water, were added to a 96-well

plate. Colonies were directly added to this mixture and run on the thermocycler according

to the GoTaq protocol.

3.3 Results and Discussion

Verification of CRISPRi and Screening Methods

After acquiring the MG1655 fluorescent strain and dCas9- and NT1-sgRNA-

containing plasmids (pdCas9 and pgRNA, respectively) from the Qi lab, our first step

was to verify the CRISPRi system worked to repress fluorescence as described. To verify

the plasmids received were complete, we performed diagnostic digests and sequenced the

plasmids. Both showed that the plasmids matched the sequences on the Addgene website.

With these results, we transformed the plasmids into the MG1655 fluorescent strain for

characterization of the system. We then began initial fluorescence assays using the plate

reader to assess the CRISPRi system’s initial degree of repression. Because the dCas9 gene was

under control of the Tet repressor, anhydrotetracycline (aTc) was added to

induce dCas9 expression. We first induced with the same concentration of aTc the Qi lab used, 2

µM. We tested fluorescence of the MG1655 strain with no plasmids, pdCas9 only,

pgRNA only, and both plasmids together—each condition with and without aTc (Figure

3.4).

Figure 3.4: Initial fluorescence assay of MG1655 fluorescent cells with no plasmids, only expressing dCas9, and expressing dCas9 and sgRNA (NT1). 2 µM aTc was used for induction. Values are mean mRFP fluorescence normalized by OD600, error bars are standard deviation (n=3). Characterized in LB media.

From these results, we saw that even without the addition of aTc, repression of mRFP

was occurring. We concluded that the tet promoter was leaky. In addition, growth was

slowed dramatically when aTc was added. We chose to test the CRISPRi system at lower

concentrations of aTc to see how growth rate and repression were affected (Figure 3.5).
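
For reference, the quantities plotted in these assays (fluorescence normalized by optical density, with the mean and standard deviation of three replicates) and the resulting fold repression can be computed as in the Python sketch below. The readings are invented placeholder numbers, not data from this work, and the function names are hypothetical.

```python
from statistics import mean, stdev


def normalized(fluorescence, od):
    """Per-well fluorescence divided by the optical density of the same well."""
    return [f / o for f, o in zip(fluorescence, od)]


def fold_repression(control_norm, repressed_norm):
    """Ratio of mean normalized fluorescence: control over repressed."""
    return mean(control_norm) / mean(repressed_norm)


# Placeholder triplicate readings (arbitrary units), for illustration only.
control = normalized([3100.0, 2950.0, 3020.0], [0.50, 0.48, 0.49])
repressed = normalized([21.0, 19.5, 20.4], [0.51, 0.49, 0.50])

print(mean(control), stdev(control))
print(mean(repressed), stdev(repressed))
print(fold_repression(control, repressed))
```

Normalizing each well by its own optical density before averaging keeps differences in growth between conditions from being mistaken for differences in repression.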

Figure 3.5: (A) Effect of different aTc concentrations on mRFP fluorescence in fluorescent MG1655 cells with and without CRISPRi system. Values are mean mRFP fluorescence normalized by OD700, error bars are standard deviations (n=3). (B) Effect of different aTc concentrations on OD700 in fluorescent MG1655 cells with CRISPRi system. Values are OD700, error bars are standard deviation (n=3). Characterized in LB media.

These results show that decreasing aTc concentration does not significantly impact mRFP

fluorescence. Growth rate, however, appeared to decrease as aTc concentration increased. We

therefore chose to continue with 1 µM aTc for the following characterizations because of the slight

increase in growth rate.

In the characterization of CRISPRi via flow cytometry by Qi et al, EZ-RDM

media was used (Qi, et al., 2013). We decided to also test the effect of media on

fluorescence, repression of mRFP, and leakiness of the tet promoter. We also tested TB and

M9 media. For all three media, we induced with 1 µM aTc (Figure 3.6).

Figure 3.6: Effect of different media (TB, M9, and EZ-RDM) on (A) fluorescence, CRISPRi efficacy, and tet promoter leakiness and (B) OD700, compared with LB OD700 values from Figure 3.5. 1 µM aTc used for induction where indicated. Values are (A) mean mRFP fluorescence normalized to OD700 and (B) mean OD700 values; error bars are standard deviation (n=3).

All three media, as expected, slowed growth significantly. The lack of yeast extract in

TB did not seem to affect repression, as compared to LB, which made sense since yeast

extract should not contain any tetracycline derivatives. M9 also produced similar degrees

of repression, though the induced conditions did not repress to the same degree as the

previous two experiments. Growth was very much reduced, even compared to TB. EZ-

RDM also slowed growth, but it did not provide any significant difference in repression.

Interestingly, however, it reduced the fluorescence of the control which did not contain

the CRISPRi plasmids. Due to the similarity of these results to the results in LB, as well

as the dramatically reduced growth, we decided to continue using LB.

Since screening would take place using FACS, not the plate reader, we needed to

determine the ideal conditions at which to sort the cells. We began to examine the best

growth conditions for sorting. At this time, we learned that dCas9 can exhibit toxicity

when overexpressed and that a previous study had optimized growth using roughly 1000-fold less

aTc than we had been using, while still maintaining similar repression (Nielsen & Voigt,

2014). Though we had not tested aTc concentrations this low, the results of Figure 3.5

supported this trend. We chose to perform a 12-hour fluorescence assay with samples

every half hour to examine the effect of aTc concentration on both growth and repression

of mRFP (Figure 3.7).

Figure 3.7: Effect of aTc concentration on (A) growth and (B) mRFP repression over time. All conditions carried out in dCas9 and sgRNA-NT1 expressing MG1655 fluorescent cells. Values are (A) mean optical density at 700 nm and (B) mean mRFP fluorescence. Error bars for both are standard deviation (n=3). Cells induced with 2 nM aTc grow at a similar rate to 0 nM aTc, yet repress mRFP to the same degree as cells induced with 1 µM aTc. Characterized in LB media.

From this result, we confirmed 2 nM aTc was a high enough concentration to achieve

adequate repression of mRFP, and therefore adequate expression of dCas9, without

impacting cell growth. Once we established this concentration of aTc, we sought to

determine which growth phase the best repression occurred in. We compared mid-log

phase (4 hours of growth based on Figure 3.7) and stationary phase (18 hours of growth)

(Figure 3.8).

Figure 3.8: Comparison of repression of RFU in mid-log (blue) and stationary phase (orange) in MG1655 fluorescent cells with no plasmids, cells with pdCas9 only, and cells with pdCas9 and pgRNA targeted to NT1. Values are mean mRFP fluorescence normalized to the mean mRFP fluorescence of MG1655 fluorescent cells with no plasmids with the same growth time. Error bars represent standard deviation (n=3). Logarithmic scale. All samples contained 2 nM aTc. Characterized in LB media.

Cells in stationary phase repressed mRFP to a greater degree than cells in mid-log phase.

Due to this, we decided to perform FACS testing with cells grown overnight to stationary

phase and dCas9 induced with 2 nM aTc.

We prepared overnight cultures of MG1655 fluorescent cells with no plasmids,

pdCas9 only, and pdCas9 and pgRNA (NT1). All of these were grown with 2 nM aTc.

Prior to FACS, they were all diluted to an OD600 of 0.5. In addition to these samples,

aliquots of MG1655 fluorescent cells expressing dCas9 only and dCas9 with sgRNA

(NT1) were combined in ratios of 100,000:1, 1,000:1, 1:1,000, and 1:100,000 (dCas9

only:dCas9+sgRNA (NT1)) to perform a test sort.

Once we arrived at the flow cytometry center with our samples, the technician Hao Zhang

prepared dilutions of roughly 20-30 million cells/mL of the single-construct samples

provided and ran the samples in the flow cytometer to examine the GFP fluorescence,

mRFP fluorescence, forward scatter, and side scatter profiles of each cell type. With these

profiles, he set up gates to identify the characteristics of each of the cell types. Side

scatter is examined in Figure 3.9.

Figure 3.9: Side scatter width vs side scatter height profiles for 10,000 individual MG1655 fluorescent cells with (A) pdCas9 only, (B) pdCas9 and pgRNA (NT1), and (C) no plasmids. Each dot represents a member of the population. The bounding box R1 contains 97-99% of the population, as indicated by “% Hist” below each profile.

Because all three conditions take place in MG1655 cells, it makes sense that they have

similar side scatter profiles, and therefore similar internal complexities. More

importantly, even though the E. coli cells are small, side scatter could be

used to separate cells from debris. Because the MG1655 cells express GFP

constitutively, we examined log mRFP fluorescence to log GFP fluorescence profiles

(Figure 3.10).

Figure 3.10: Log mRFP fluorescence vs log GFP fluorescence for 10,000 MG1655 fluorescent cells with

(A) pdCas9 only, (B) pdCas9 and pgRNA (NT1), and (C) no plasmids. Each dot represents an event.

Bounding box R2 bounds populations not repressing mRFP, bounding box R5 bounds populations

repressing mRFP.

These results show a clear distinction between cells repressing mRFP and cells

expressing mRFP. This distinction is important because it will allow us to distinguish

between non-repressing dCas9 and repressing dCas9 when we sort through library

members after domain insertion. We also examined the number of cells vs mRFP

fluorescence (Figure 3.11).

Figure 3.11: Number of cells vs log mRFP fluorescence for 10,000 MG1655 fluorescent cells with (A) pdCas9 only, (B) pdCas9 and pgRNA (NT1), and (C) no plasmids. Each dot represents an event. Bounding boxes R3 and R6 bound populations not repressing mRFP, bounding box R4 bounds populations repressing mRFP.

These results mirrored the results in Figure 3.10 and provided the information in a more

intuitive manner.

With the baseline profiles established and the gates set based on them, we

performed test sorts using the premade, known-ratio mixtures of cells. The goal of these

was not only to determine the accuracy of the sorting, but also to determine if it was possible

to select small populations with mRFP fluorescence profiles of interest that are

located within larger populations with a different mRFP fluorescence profile. This scenario

would arise during the library screenings: a viable switch candidate may only be found at

a frequency of one in a million, or even rarer. We ran the four mixed dCas9 and dCas9

with pgRNA (NT1) samples we prepared and sorted based on mRFP fluorescence (Figure

3.12).
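
To give a sense of scale for such rare events, the Python sketch below estimates how many cells must pass through the sorter to have a given chance of encountering at least one candidate. It assumes each cell is an independent draw with a fixed candidate frequency, which is an idealization; the function name is hypothetical.

```python
import math


def cells_needed(frequency, confidence=0.95):
    """Smallest number of cells N such that the probability of sampling at
    least one rare cell, 1 - (1 - frequency)**N, is at least `confidence`.
    Assumes independent sampling at a fixed frequency."""
    return math.ceil(math.log(1 - confidence) / math.log1p(-frequency))


# A candidate present at one in a million cells.
print(cells_needed(1e-6))  # on the order of three million cells
```

Under these assumptions, a one-in-a-million candidate requires sorting roughly three million cells for a 95% chance of seeing it even once, which is why the sorter's ability to recover the minority population at 1:100,000 was an important test.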

Figure 3.12: Cell count vs log mRFP fluorescence for a mixture of MG1655 fluorescent cells with pdCas9 only and pdCas9 and pgRNA (NT1) at ratios of (A) 1:1000, (B) 1:100000, (C) 1000:1, and (D) 100000:1. Different numbers of cells were used for each condition, as shown. Cells with different fluorescence levels could be distinguished.

These results show that the flow cytometer was able to detect and sort the smaller

population of cells from the larger population. Compared to the expected percentages of

the rare cells collected based on OD600, the percentages of actual rare events collected

were much higher. This suggests a high false positive percentage. However, events

collected are not necessarily all cells, and during the time between creating the mixtures

and running the samples, the cells likely grew to some degree, and one construct could

have grown more quickly than the other. To truly determine the percentage of false

negatives and false positives, we plated portions of the sorted cell samples onto

chloramphenicol only plates. In addition to resistance as an indicator of genotype, we

could easily observe which colonies were red and which were green through simple

inspection of the plates, which indicated whether only dCas9 was present in cells or if

both dCas9 and sgRNA (NT1) were present. Figure 3.13 examines the sort for rare mRFP

expressing cells.

Figure 3.13: Results of plated colonies from the test sort. Here, dCas9-only expressing MG1655 fluorescent cells were mixed with dCas9 and sgRNA (NT1) expressing MG1655 fluorescent cells at (Top) 1:1,000 and (Bottom) 1:100,000 ratios and sorted with FACS to enrich the dCas9-only expressing cells, which were red due to expression of mRFP. These correspond to the sorts performed in A and B, respectively, from Figure 3.12. Samples from the sort were plated on chloramphenicol plates and grown to observe the color of the resulting colonies. The right column represents these results. After the sort, the (Top) 1:1,000 sorted mixture yielded a red to green ratio of 840:16 and the (Bottom) 1:100,000 sorted mixture yielded a red to green ratio of 270:8.

Figure 3.14 examines the sort for rare mRFP repressing cells.

Figure 3.14: Results of plated colonies from the test sort. Here, dCas9-only expressing MG1655 fluorescent cells were mixed with dCas9 and sgRNA (NT1) expressing MG1655 fluorescent cells at (Top) 1,000:1 and (Bottom) 100,000:1 ratios and sorted with FACS to enrich the dCas9 and sgRNA (NT1) expressing cells, which were green due to repression of mRFP. These correspond to the sorts performed in C and D, respectively, from Figure 3.12. Samples from the sort were plated on chloramphenicol plates and grown to observe the color of the resulting colonies. The right column represents these results. After the sort, the (Top) 1,000:1 sorted mixture yielded a red to green ratio of 0:500 and the (Bottom) 100,000:1 sorted mixture yielded a red to green ratio of 0:740.

From these results, we determined the rate of false positives is much lower than

presumed from the events collected compared to the mixtures made via OD600. Due to

this, we determined that FACS would be adequate for library screening after domain

insertion into dCas9 as a part of the mRFP targeting NT1 CRISPRi system. We began

work on creating the dCas9 with inserted GBBP library.
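
The colony counts reported in Figures 3.13 and 3.14 can be converted into a sorted-sample purity and an enrichment factor, as in the Python sketch below. The function names are hypothetical, and the enrichment figures assume the input ratios were exactly as mixed by OD600, which, as noted above, is only approximate.

```python
def purity(target_colonies, other_colonies):
    """Fraction of plated colonies that are the intended (rare) cell type."""
    return target_colonies / (target_colonies + other_colonies)


def enrichment(target_colonies, other_colonies, input_target, input_other):
    """Fold change in the target:other ratio between the sorted sample and
    the input mixture. Returns infinity if no off-target colonies grew."""
    if other_colonies == 0:
        return float("inf")
    return (target_colonies / other_colonies) / (input_target / input_other)


# Sorts for rare red (dCas9-only) cells, red:green colony counts.
print(purity(840, 16), enrichment(840, 16, 1, 1_000))
print(purity(270, 8), enrichment(270, 8, 1, 100_000))
# Sorts for rare green (dCas9 + sgRNA) cells, green:red colony counts.
print(purity(500, 0), purity(740, 0))
```

On these counts, about 98% and 97% of colonies from the two red sorts were the intended type, and every colony from the two green sorts was.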

Initial Library Creation

With the fluorescent reporter system behaving properly and the ability to use

FACS for screening confirmed, we began work on creating the library insertions. For

these, we needed the primers to circularly permute the GBBP domain and open pdCas9

between desired residues of dCas9. Fortunately, we were able to reuse Ribeiro’s GBBP

primers from the BITE project. He had also created primers for 61 insertion positions in

dCas9. However, we wished to choose more insertion positions. Referencing the Oakes,

et al. characterization of the degree of tolerance to insertion at each position in dCas9, we

were able to make informed decisions about where in dCas9 would be best for new

insertion sites (Oakes, et al., 2016). The positions Ribeiro chose were mainly those of

high tolerance, though he also left out many with high tolerances to insertion, as well as

those with medium tolerance. We decided on selection criteria to include additional

insertion positions. We wanted to look at not only sites with high tolerance, but also

weaker tolerated sites, sites between areas of increased and decreased tolerance, sites in

and near areas of secondary structure, and six sites where Cas9 had been split in other

switch constructions. With these criteria, we chose the six split sites, as well as ninety

additional sites above a log2-fold enrichment of 3.41. We then designed and ordered the

forward and reverse abutting primers to open dCas9 at each of these new positions.

Before performing multiplex inverse PCR on all 157 positions, we chose five

reactions to ensure our protocol worked and to optimize the annealing temperature. All of

the primers used in this experiment had melting temperatures at or near 60ºC. We ran

each reaction at three different annealing temperatures, 55ºC, 57ºC, and 60ºC, otherwise

with standard conditions (Figure 3.15).

Figure 3.15: Test run of inverse PCR at 5 positions of dCas9 and three different temperatures. 1 corresponds to linearization between residues 1367 and 1368, 2 corresponds to linearization between residues 1368 and 1369, 3 corresponds to linearization between residues 0 and 1, and 4 corresponds to linearization between residues 4 and 5. The three temperatures tested were a, 55ºC, b, 57ºC, and c, 60ºC.

The results of this experiment showed that inverse PCR worked as anticipated, and

temperatures within the range tested showed little variation in results. With the test

inverse PCR working, we moved to the full multiplex inverse PCR of all 157 positions of

dCas9 and the amplification of cpGBBP at 137 positions (Figures 3.16-21).

Figure 3.16: The first gel of multiplex PCR using Ribeiro’s existing primers. The lane identifiers correlate to the well plate positions. See Table S2 for corresponding positions within dCas9. Amplifications expected at 6.7 kb.

Figure 3.17: The second gel of multiplex PCR using Ribeiro’s existing primers. The lane identifiers correlate to the well plate positions. See Table S2 for corresponding positions within dCas9. Amplifications expected at 6.7 kb.

Figure 3.18: The third gel of multiplex PCR using the newly designed, additional primers. The lane identifiers correlate to the well plate positions shown in Table S3. Amplifications expected at 6.7 kb.

Figure 3.19: Top row and bottom left: the fourth gel of multiplex PCR using the newly designed, additional primers. The lane identifiers correlate to the well plate positions. See Table S3 for corresponding positions within dCas9. Bottom right (Boxed): first gel of cpGBBP amplification. See Table S4 for corresponding positions of cpGBBP. Amplifications expected at 6.7 kb for dCas9 and 0.9 kb for cpGBBP.

Figure 3.20: Second gel of cpGBBP amplification. See Table S4 for corresponding positions of cpGBBP.

Amplifications expected at 0.9 kb.

Figure 3.21: Third gel of cpGBBP amplification. See Table S4 for corresponding positions of cpGBBP.

Amplifications expected at 0.9 kb.

Amplification failed at a few of these sites, so we repeated those reactions with added betaine and DMSO. This

new protocol succeeded for all previously failed reactions (Figure 3.22).

Figure 3.22: Repeats of failed PCRs with DMSO and betaine. Top row: retry of failed positions from initial dCas9 multiplex reactions. Non-asterisked columns represent dCas9 sites selected by Ribeiro, asterisked columns represent the newly selected sites. Bottom row: retry of failed positions from initial cpGBBP amplifications. Non-asterisked columns represent GBBP from the first plate and asterisked columns represent GBBP from the second plate. See Tables S2, S3, and S4 for the location within dCas9 and GBBP of these sites. Amplifications expected at 6.7 kb for dCas9 (top row) and 0.9 kb for cpGBBP (bottom row).

We prepared the dCas9 vector and cpGBBP insert for transformation and

transformed 68.2 ng of purified ligation product into NEB Turbo cells. The

transformation yielded 1444 colonies, a much smaller library than anticipated. We

selected thirty colonies to grow and isolated their plasmid DNA. We then performed a

diagnostic digest with BamHI, which would cut correctly constructed plasmids twice,

once in the GBBP insert and once in the pdCas9 plasmid, giving two bands on an

electrophoresis gel. The sizes of the two bands were expected to sum to 7.6 kb. When the digest

was run on the gel, however, we did not observe the expected result for any of the

selected colonies (Figure 3.23).
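
The shortfall in library size can be made concrete with a short Python sketch comparing the number of transformants with the number of possible constructs (157 insertion positions in dCas9 combined with 137 cpGBBP variants). The coverage estimate assumes that every colony carries one correctly assembled construct drawn uniformly at random, which is an idealization and an upper bound given the digest results; the function names are hypothetical.

```python
import math


def library_size(n_insertion_sites, n_permutants):
    """Number of distinct site-by-permutant combinations."""
    return n_insertion_sites * n_permutants


def expected_coverage(n_transformants, size):
    """Expected fraction of distinct constructs sampled at least once,
    assuming uniform, independent sampling (Poisson approximation)."""
    return 1 - math.exp(-n_transformants / size)


size = library_size(157, 137)
print(size)                           # theoretical number of constructs
print(expected_coverage(1444, size))  # fraction covered by 1444 colonies
```

Even under these favorable assumptions, 1444 colonies would sample only a few percent of the possible constructs.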

Figure 3.23: Diagnostic digests of the 30 selected colonies from library. Columns with asterisks represent undigested plasmid prep product, columns without asterisks represent BamHI digested plasmid prep product. 0 columns represent the control, pdCas9 without insert. Successful insertions would contain 2 bands with sizes summing to 7.6 kb.

From these results, we concluded that there was a very low likelihood that any of the

colonies from the library had the expected insertion.

Because we had a low transformation efficiency and no sampled colonies

contained a successful insertion, we concluded that we were dealing with an

issue of dCas9 toxicity. The cells that did survive under the selective pressure of

chloramphenicol appeared to retain the resistance gene from pdCas9 while recombining the

plasmid to remove the toxic dCas9 gene itself. We needed to find a solution to

minimize the toxicity to increase the likelihood of successful insertion.

Attempts to Minimize Toxicity

From the initial results obtained when we evaluated the system, we knew that

even without aTc in the system, the tet promoter did not completely repress gene

expression. We believed this initial burst of expression of dCas9 following transformation

was leading to off-target repression within the genome, resulting in cell death. This

behavior explained the deletions we were seeing; cells were recombining the transformed

plasmids to reduce toxicity. To mitigate this, we tried several strategies, starting with the

protocols used to create the library.

We first examined the ligation protocol to make sure its optimization would not

solve the problem. We varied the vector to insert ratio (1:2, 1:3, 1:4, and 1:5) and tested

different incubation conditions (18 hours at room temperature, 2 hours at room

temperature, 18 hours at 16ºC, and 30 minutes of 30 second alternations between 10ºC

and 30ºC). With a 1:5 vector to insert ratio and 18 hours at room temperature, we

achieved a visible ligation product on an electrophoresis gel (Figure 3.24).

Figure 3.24: Ligation reactions of phosphorylated amplified cpGBBP product (GBBP only),

un-phosphorylated inverse PCR pdCas9 product (dCas9 only), and phosphorylated amplified cpGBBP product

and un-phosphorylated inverse PCR pdCas9 product (dCas9 + GBBP). Successful ligation is represented by a

band at 7.6 kb.

With the possibility of unsuccessful ligation ruled out, we examined the recovery media used after transformation and the agar we plated on. We tested three conditions: SOC recovery media with LB agar, SOC recovery media with EZ-RDM agar, and EZ-RDM recovery media with EZ-RDM agar. We performed three identical electroporations with similar pulse times using lab-made electrocompetent cells and 76.2 µg of new ligation product, recovered the cells for 1 hour, then plated them and grew them overnight. The SOC/LB plate had 316 colonies, SOC/EZ-RDM had 220, and EZ-RDM/EZ-RDM had 143. We selected 16 colonies from each plate for colony PCR designed to identify successful insertions in the dCas9 gene and to report on whether the dCas9 gene was intact (Figure 3.25).

Figure 3.25: Colony PCR results for three combinations of transformation recovery medium and agar, with 16 colonies per condition. C represents the control, wild-type pdCas9. Each condition is named for the medium in which cells were recovered after electroporation and the agar on which they were plated: SOC media + LB agar, SOC media + EZ-RDM agar, and EZ-RDM media + EZ-RDM agar. The band size for a successful insertion is expected to be approximately 5 kb.

All 48 colonies had deletions of the dCas9 gene, regardless of the recovery medium or the agar used for plating. Since growth was diminished with EZ-RDM and it did not result in a lower frequency of deletions, we decided against using it for this purpose in the future.
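Screening 48 colonies without finding an insertion does not show that the insertion frequency is zero, only that it is low. As a rough illustration, the Python sketch below computes the largest true insertion frequency that is still compatible with observing zero insertions, using the exact binomial bound for zero successes. It assumes that the colonies are independent random samples and that pooling the three plates is reasonable. Neither assumption is established in this work.

```python
def zero_success_upper_bound(n_screened, confidence=0.95):
    """One-sided upper confidence bound on a frequency when 0 of
    `n_screened` samples are positive.

    If the true frequency is p, the chance of seeing no positives is
    (1 - p) ** n. Setting that equal to alpha = 1 - confidence and
    solving for p gives p = 1 - alpha ** (1 / n).
    """
    if n_screened <= 0:
        raise ValueError("n_screened must be a positive integer")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_screened)


for label, n in (("one plate", 16), ("all three plates pooled", 48)):
    bound = zero_success_upper_bound(n)
    print(f"{label}: 0/{n} insertions, 95% upper bound = {bound:.1%}")
```

Under these assumptions, the pooled screen of 48 colonies bounds the insertion frequency at roughly 6%, so a much larger screen would be needed to recover rare insertions by colony PCR alone.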

At this point, we decided to adjust the expression of dCas9 and sgRNA in an attempt to reduce toxicity. The cause of dCas9 toxicity is poorly understood, though one proposed explanation is non-specific binding of dCas9 to PAM sites (Zhang & Voigt, 2018). Non-specific binding in the genome has the potential to inhibit transcription of genes important for growth, and this effect is worsened when dCas9 is not bound to sgRNA. With this knowledge, Spisak had begun co-transforming with pgRNA and transforming into electrocompetent cells already carrying pgRNA, which marginally improved transformation efficiency and the frequency of insertion, perhaps due to stabilization of dCas9. Despite this slight improvement, we needed a way to further reduce toxicity. We chose to manipulate the expression of dCas9 by targeting the tet promoter. We believed that if we increased the amount of TetR in the cell, we could achieve stronger repression of dCas9 when cells were not exposed to aTc. To do this, we cloned the TetR gene into pgRNA, a higher copy number plasmid than pdCas9. Instead of using the native tet promoter in the new pgRNA construct, we used the stronger, constitutive T5 promoter to express TetR. This formed the plasmid pgRNA-T5-TetR. We believed that if we used electrocompetent cells already carrying pgRNA-T5-TetR, dCas9 production would be repressed immediately because TetR would already be present. After confirming successful cloning of pgRNA-T5-TetR via Sanger sequencing and diagnostic digest, we performed a fluorescence assay to determine the degree to which dCas9 was repressed with TetR on both plasmids (Figure 3.26).

Figure 3.26: Effect of pgRNA-T5-TetR on repression of dCas9 without aTc in comparison to pgRNA and pdCas9, together and separately, all in MG1655 fluorescent cells. pgRNA-T5-TetR results in marginally decreased repression of mRFP in the absence of the inducer aTc, though the tet promoter still exhibits leaky expression without aTc, as evidenced by the low RFU compared to controls. 2 nM aTc was used for induction where indicated. Values are mean mRFP fluorescence normalized to OD700; error bars are standard deviation (n=3). Characterized in LB media.

The additional TetR provided by pgRNA-T5-TetR apparently reduced expression of dCas9, since mRFP fluorescence in the absence of aTc was higher than when the original pgRNA plasmid was used. However, fluorescence was still greatly diminished by dCas9/CRISPRi compared to cells not expressing dCas9, indicating that dCas9 was still being expressed in the absence of aTc. The additional TetR introduced by the pgRNA-T5-TetR construct did not, however, affect the level of mRFP repression when aTc was added.
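The comparison above rests on fluorescence normalized to cell density. A minimal Python sketch of that calculation follows. The replicate readings are hypothetical placeholders, not data from Figure 3.26, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def normalize(rfu_values, od_values):
    """Divide each raw fluorescence reading by its matching OD700 reading."""
    if len(rfu_values) != len(od_values):
        raise ValueError("each fluorescence reading needs a matching OD reading")
    return [rfu / od for rfu, od in zip(rfu_values, od_values)]


def summarize(rfu_values, od_values):
    """Return (mean, sample standard deviation) of the normalized readings."""
    normalized = normalize(rfu_values, od_values)
    return mean(normalized), stdev(normalized)


def fold_repression(control_mean, sample_mean):
    """How many times lower the sample signal is than the control signal."""
    return control_mean / sample_mean


# Hypothetical triplicate readings (raw mRFP fluorescence, OD700).
no_dcas9 = summarize([9100.0, 9400.0, 8800.0], [0.60, 0.61, 0.59])
with_pgrna = summarize([610.0, 650.0, 580.0], [0.58, 0.60, 0.57])
with_t5_tetr = summarize([930.0, 990.0, 900.0], [0.59, 0.60, 0.58])

for name, (avg, sd) in (("pgRNA", with_pgrna), ("pgRNA-T5-TetR", with_t5_tetr)):
    fold = fold_repression(no_dcas9[0], avg)
    print(f"{name}: {avg:.0f} +/- {sd:.0f} RFU/OD700, {fold:.1f}-fold below no-dCas9 control")
```

A lower fold value for pgRNA-T5-TetR than for pgRNA in the absence of aTc would correspond to the weaker leaky repression described above.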

Although the additional TetR did not substantially reduce expression of dCas9, we did not know how much expression needed to be repressed in order to reduce toxicity. To test this, we tried to create an insertion of non-circularly permuted GBBP at only one position within dCas9. The position we chose was between residues 573 and 574, due to the observed increase in toxicity when attempting insertions at this position by Spisak while attempting to minimize toxicity through other means. Inverse PCR of pdCas9 with primers that open the plasmid between those residues, amplification of non-circularly permuted GBBP, ligation of the two fragments, and transformation into electrocompetent pgRNA-T5-TetR DH5α cells resulted in approximately 104 colonies. We selected ten to characterize. Diagnostic digest, however, revealed that none were successful insertions (Figure 3.27).

Figure 3.27: Diagnostic digestion with XhoI of selected colonies from transformation of pdCas9-GBBP-573. XhoI cuts pdCas9 and pgRNA-T5-TetR once each. Expected band sizes for a successful insertion are 7.6 kb for pdCas9 with the GBBP insertion and 3.3 kb for pgRNA-T5-TetR. The controls are in the first two lanes: unmodified pdCas9 at 6.7 kb and pgRNA-T5-TetR at 3.3 kb.

Additionally, these results show changes in the size of pgRNA-T5-TetR, suggesting that recombination of the two plasmids occurred, most likely because the TetR gene present on both pdCas9 and pgRNA-T5-TetR gives them 642 base pairs of sequence similarity. Given this result and the high likelihood of recombination, we did not pursue this tactic further.
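The band-size reasoning used to read this digest can be sketched in Python as follows. The expected sizes come from the Figure 3.27 caption. The 0.3 kb matching tolerance and the example lanes are hypothetical and are not measurements from the gel.

```python
# Expected linearized sizes (kb) after a single XhoI cut, per the Figure 3.27 caption.
EXPECTED_KB = {
    "pdCas9 + GBBP insertion": 7.6,
    "unmodified pdCas9": 6.7,
    "pgRNA-T5-TetR": 3.3,
}
TOLERANCE_KB = 0.3  # hypothetical allowance for reading band sizes off a gel


def label_band(size_kb):
    """Name the expected fragment a band matches, or flag it as unexpected."""
    for name, expected in EXPECTED_KB.items():
        if abs(size_kb - expected) <= TOLERANCE_KB:
            return name
    return "unexpected size"


def interpret_lane(band_sizes_kb):
    """Summarize one lane: is an insertion band present, and is every band expected?"""
    labels = [label_band(size) for size in band_sizes_kb]
    return {
        "labels": labels,
        "insertion_present": "pdCas9 + GBBP insertion" in labels,
        "all_bands_expected": "unexpected size" not in labels,
    }


# Hypothetical lanes: an ideal insertion clone and a clone with rearranged plasmids.
print(interpret_lane([7.6, 3.3]))
print(interpret_lane([5.1, 4.2]))
```

A lane in which neither band matches an expected size, as in the second example, is the pattern that would point toward deletion or recombination rather than a clean insertion.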

The attempts to insert circularly permuted GBBP into dCas9 were hindered by apparent toxicity of dCas9. Several attempts to mitigate this were made, though mine were without success. Spisak and I have each tried different approaches. His most recent attempt replaced the tet promoter with the L-rhamnose-inducible promoter, a catabolite-sensitive promoter that allows transcription of downstream genes in the presence of L-rhamnose and halts it in the presence of glucose. Combined with the insertion of only non-circularly permuted insert, he states that his preliminary results have been promising. Another possible tactic for further diminishing dCas9 expression may be to use an alternative initiation codon that reduces translation, as described by Firnberg et al. (Firnberg, Labonte, Gray, & Ostermeier, 2014). Should these tactics to reduce dCas9 expression fail to reduce toxicity, a transposon insertion, such as that performed by Oakes et al., may be a more reliable method of insertion for dCas9 (Oakes, et al., 2016). Regardless of the method used, once insertions can be properly made, the library can likely be characterized in the MG1655 fluorescent strain via FACS, as evidenced by the cell sorting results presented in this thesis.

References

Alberts, B., Johnson, A., & Lewis, J. (2002). Molecular Biology of the Cell (4th ed.). New York: Garland Science.

Arvidson, D., Arvidson, C., Lawson, C., Miner, J., Adams, C., & Youderian, P. (1994). The tryptophan repressor sequence is highly conserved among the Enterobacteriaceae. Nucleic Acids Research, 1821-1829.

Bilan, D., Pase, L., Joosen, L., Gorokhovatsky, A., Ermakova, Y., . . . Belousov, V. (2012). HyPer-3: a genetically encoded H2O2 probe with improved performance for ratiometric and fluorescence lifetime imaging. ACS Chem Biol, 8(3), 535-542.

Cobb, R. E., Si, T., & Zhao, H. (2012). Directed evolution: an evolving and enabling synthetic biology tool. Current Opinion in Chemical Biology, 285-291.

Copeland, M., Politz, M., & Pfleger, B. (2014). Application of TALEs, CRISPR/Cas, and sRNAs as trans-acting regulators in prokaryotes. Current Opinion in Biotechnology(29), 46-54.

Dagliyan, O., Tarnawski, M., Chu, P.-H., Shirvanyants, D., Schlichting, I., Dokholyan, N., & Hahn, K. (2016). Engineering extrinsic disorder to control protein activity in living cells. Science, 1441–1444.

Firnberg, E., Labonte, J. W., Gray, J. J., & Ostermeier, M. (2014). A comprehensive, high-resolution map of a gene’s fitness landscape. Mol Biol Evol, 31(6), 1581-1592.

Gilbert, W., & Maxam, A. (1973). The Nucleotide Sequence of the lac Operator. PNAS, 3581-3584.

Goeddel, D. V., Kleid, D. G., Bolivar, F., Heyneker, H. L., Yansura, D. G., Crea, R., . . . Riggs, A. D. (1979). Expression in Escherichia coli of chemically synthesized genes for human insulin. PNAS, 106-110.

Guntas, G., Mansell, T. J., Kim, J. R., & Ostermeier, M. (2005). Directed evolution of protein switches and their application to the creation of ligand-binding proteins. PNAS, 11224-11229.

Hudson, P. J. (2003). Engineered antibodies. Nature Medicine, 129-134.

Iyer, V., Boroviak, K., Thomas, M., Doe, B., Riva, L., Ryder, E., & Adams, D. J. (2018, July). No unexpected CRISPR-Cas9 off-target activity revealed by trio sequencing of gene-edited mice. PLOS Genetics.

Jacob, F., & Monod, J. (1961). Genetic Regulatory Mechanisms in the Synthesis of Proteins. J. Mol. Biol., 3, 318-356.

Kanwar, M. R., Wright, C., Date, A., Tullman, J., & Ostermeier, M. (2013). Protein Switch Engineering by Domain Insertion. Methods in Enzymology, 523, 369-388.

Latchman, D. S. (1993). Transcription factors: an overview. Int. J. Exp. Path., 417-422.

Meinhardt, S., Manley, M. W., Becker, N. A., Hessman, J. A., Maher, J. L., & Swint-Kruse, L. (2012). Novel insights from hybrid LacI/GalR proteins: family-wide functional attributes and biologically significant variation in transcription repression. Nucleic Acids Research, 40(21), 11139-11154.

Moore, R., Chandrahas, A., & Bleris, L. (2014). Transcription Activator-like Effectors: A Toolkit for Synthetic Biology. American Chemical Society Synthetic Biology, 708-716.

Nature Methods. (2018, April). CRISPR off-targets: a reassessment. Nature Methods, 15(4), 229-230.

Nielsen, A. A., & Voigt, C. A. (2014). Multi-input CRISPR/Cas genetic circuits that interface host regulatory networks. Molecular Systems Biology, 10(73), 1-11.

Oakes, B. L., Nadler, D. C., Flamholz, A., Fellmann, C., Staahl, B. T., Doudna, J. A., & Savage, D. F. (2016, June). Profiling of engineering hotspots identifies an allosteric CRISPR-Cas9 switch. Nature Biotechnology, 646-651.

Orth, P., Schnappinger, D., Hillen, W., Saenger, W., & Hinrichs, W. (2000). Structural basis of gene regulation by the tetracycline inducible Tet repressor-operator system. Nature Structural and Molecular Biology, 215-219.

Phelan, R., Ostermeier, M., & Townsend, C. (2009). Design and synthesis of a beta-lactamase activated 5-fluorouracil prodrug. Bioorg Med Chem Lett, 19(3), 1261-1263.

Politz, M. C., Copeland, M. F., & Pfleger, B. F. (2013). Artificial repressors for controlling gene expression in bacteria. Chem Commun (Camb), 49(39), 4325-4327.

Qi, L. S., Larson, M. H., Gilbert, L. A., Doudna, J. A., Weissman, J. S., Arkin, A. P., & Lim, W. A. (2013). Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression. Cell, 1173-1183.

Rangel, G. (2015, August 9). From Corgis to Corn: A Brief Look at the Long History of GMO Technology. Boston, MA, United States of America: Harvard University.

Reynolds, K. A., McLaughlin, R. N., & Ranganathan, R. (2011). Hotspots for allosteric regulation on protein surfaces. Cell, 1564–1575.

Ribeiro, L. F., Alperson, S. Z., Pfleger, B. F., & Ostermeier, M. (2016). Modular Inducible Repressors Derived from Transcription Activator-like Effectors (TALEs). Nucleic Acids Research, UNPUBLISHED.

Ribeiro, L. F., Warren, T. D., & Ostermeier, M. (2017). Construction of Protein Switches by Domain Insertion and Directed Evolution. In V. Stein (Ed.), Synthetic Protein Switches: Methods in Molecular Biology (Vol. 1596, pp. 43-55). New York, NY: Humana Press.

Rice University. (2016, March 23). Prokaryotic Gene Regulation. Retrieved from OpenStax CNX: https://cnx.org/contents/[email protected]:drSgdNIj@5/Prokaryotic-Gene-Regulation

Romero, P. A., & Arnold, F. H. (2009). Exploring protein fitness landscapes by directed evolution. Nature Reviews Molecular Cell Biology, 866-876.

Schiefner, A., Breed, J., Bosser, L., Kneip, S., Gade, J., Holtmann, G., . . . Bremer, E. (2004). The Journal of Biological Chemistry, 279(7), 5588-5596.

Scholz, O., Köstner, M., Reich, M., Gastiger, S., & Hillen, W. (2003). Teaching TetR to Recognize a New Inducer. J. Mol. Biol., 329, 217-227.

Schreiber, G., & Keating, A. E. (2011, February). Protein Binding Specificity versus Promiscuity. Current Opinion in Structural Biology, 21(1), 50-61.

Schumacher, M., Choi, K., & Brennan, R. (1994). Crystal structure of LacI member, PurR, bound to DNA: minor groove binding by alpha helices. Science, 763-770.

Sohka, T., Heins, R., Phelan, R., Greisler, J., Townsend, C., & Ostermeier, M. (2009, June 23). An Externally Tunable Bacterial Band-Pass Filter. (F. H. Arnold, Ed.) PNAS, 106(25), 10135-10140.

Stein, V., & Alexandrov, K. (2015, February). Synthetic Protein Switches: Design Principles and Applications. Trends in Biotechnology, 33(2), 101-110.

Taylor, N. D., Garruss, A. S., Moretti, R., Chan, S., Arbing, M. A., Cascio, D., . . . Raman, S. (2016). Engineering an allosteric transcription factor to respond to new ligands. Nature Methods, 13(2), 177-183.

Waryah, C. B., Moses, C., Arooj, M., & Blancafort, P. (2018, March). Zinc Fingers, TALEs, and CRISPR Systems: A Comparison of Tools for Epigenome Editing. Epigenetic Editing, 19-63.

Wells, J. A., & Estell, D. A. (1988). Subtilisin - an enzyme designed to be engineered. TIBS, 290-297.

Younger, A., Dalvie, N., Rottinghaus, A., & Leonard, J. (2016). Engineering Modular Biosensors to Confer Metabolite-Responsive Regulation of Transcription. ACS Synthetic Biology.

Yu, Y., & Lutz, S. (2011). Circular permutation: a different way to engineer enzyme structure and function. Trends in Biotechnology, 18-25.

Zhang, S., & Voigt, C. A. (2018). Engineered dCas9 with reduced toxicity in bacteria: implications for genetic circuit design. Nucleic Acids Research, 46(20), 11115–11125.

Appendices

Figure S1: First plate reader MIC with GB-HCl. Normalized cell survival vs ampicillin concentration at 22

hours after addition of GB-HCl for BITElacO-C7 expressing SNO301 cells. These values are normalized to

the mean absorbance of cells without GB-HCl at 0 µg/mL ampicillin. Cells induced with 5 mM GB-HCl

for induced condition. Normalized cell survival values are the mean, error bars represent the standard

deviation (n=3).

Figure S2: Second plate reader MIC with GB-HCl, repeat of experiment shown in Figure S1. Normalized

cell survival vs ampicillin concentration at 22 hours after addition of GB-HCl for BITElacO-C7 expressing

SNO301 cells. These values are normalized to the mean absorbance of cells without GB-HCl at 0 µg/mL

ampicillin. Cells induced with 5 mM GB-HCl for induced condition. Normalized cell survival values are

the mean, error bars represent the standard deviation (n=3).

Table S1: Strains and plasmids used in each figure for Chapter 2.

RW Label / Label | E. coli strain | Plasmid 1 | Resistance, plasmid 1 | Plasmid 2 | Resistance, plasmid 2 | Description | Figure(s)
pDIM MG1655 | K12 MG1655 ΔlacI | pDIMC8-TEM1bla | Cm 50 µg/mL | | | BLA not repressed | 2.3, 2.4, 2.5
pTS-TALE pDIM MG1655 | K12 MG1655 ΔlacI | pDIMC8-TEM1bla | Cm 50 µg/mL | pTS-TALElaco | Spec 50 µg/mL | TALElacO targeted to lac operator | 2.3, 2.4, 2.5
pTS pDIM MG1655 | K12 MG1655 ΔlacI | pDIMC8-TEM1bla | Cm 50 µg/mL | pTS1 | Spec 50 µg/mL | LacI control | 2.3, 2.4, 2.5
pTS-BITE-C7 pDIM MG1655 | K12 MG1655 ΔlacI | pDIM-BLA | Cm 50 µg/mL | pTS-BITElacO-C7 | Spec 50 µg/mL | BITElacO-C7 targeted to lac operator | 2.3, 2.4, 2.5
pDIM GFP | K12 MG1655 | pDIM1-sfGFP | Cm 50 µg/mL | | | Background fluorescence control | 2.3, 2.6
pTS pDIM GFP | K12 MG1655 ΔlacI | pDIM1-sfGFP | Cm 50 µg/mL | pTS1 | Spec 50 µg/mL | LacI targeted to lac operator; plasmid-borne sfGFP | 2.3, 2.6
pTS-TALE pDIM GFP | K12 MG1655 ΔlacI | pDIM1-sfGFP | Cm 50 µg/mL | pTS-TALElaco | Spec 50 µg/mL | TALElacO targeted to lac operator; plasmid-borne sfGFP | 2.3, 2.6
pTS-BITE-C7 pDIM GFP | pTS1-BITElacO-C7 | pDIM1-sfGFP | Cm 50 µg/mL | pTS-BITElacO-C7 | Spec 50 µg/mL | BITElacO-C7 targeted to lac operator; plasmid-borne sfGFP | 2.3, 2.6
pLR-BITElacO-C7 pDIM MG1655 | K12 MG1655 ΔlacI | pDIMC8-TEM1bla | Cm 50 µg/mL | pLR-BITElacO-C7 | Spec 50 µg/mL | BITElacO-C7 targeted to lac operator using pLR-BITElacO-C7 | 2.7
pTS-BITE-C7 pDIM MG1655 | K12 MG1655 ΔlacI | pDIMC8-TEM1bla | Cm 50 µg/mL | pTS1-BITElacO-C7 | Spec 50 µg/mL | BITElacO-C7 targeted to lac operator using pTS1-BITElacO-C7 | 2.7
pDIM SNO301 | SNO301 | pDIMC8-TEM1bla | Cm 50 µg/mL | | | SNO301 expressing BLA, not repressed | 2.8, 2.9
pTS-BITE-C7 pDIM SNO301 | SNO301 | pDIMC8-TEM1bla | Cm 50 µg/mL | pTS1-BITElacO-C7 | Spec 50 µg/mL | BITElacO-C7 targeted to lac operator; SNO301 | 2.8, 2.9, 2.10
pTS-TALE pDIM SNO301 | SNO301 | pTS-TALElaco | Spec 50 µg/mL | pDIMC8-TEM1bla | Cm 50 µg/mL | TALElacO targeted to lac operator; SNO301 | 2.8, 2.9, 2.10
pTS pDIM SNO301 | SNO301 | pTS1 | Spec 50 µg/mL | pDIMC8-TEM1bla | Cm 50 µg/mL | LacI control; SNO301 | 2.8, 2.9, 2.10

Table S2: Corresponding insert sites from well plate numbers for Ribeiro’s dCas9 previously selected sites. The setting below represents the gradient setting on the thermocycler used, and the temperature column represents the temperatures of each column of the thermocycler.

Table S3: Corresponding insert sites from well plate numbers for the additional selected sites of insertion in dCas9. The setting below represents the gradient setting on the thermocycler used, and the temperature column represents the temperatures of each column of the thermocycler.

Table S4: Corresponding split sites from well plate numbers for Ribeiro’s previously made primers to circularly permute and amplify GBBP. Site numbers represent the starting position of GBBP at that circular permutation. Ribeiro cpGBBP 1 is plate 1 and is plate 2. The setting below represents the gradient setting on the thermocycler used, and the temperature column represents the temperatures of each column of the thermocycler.

Biography

Gareth Evans was born in 1996 in western Pennsylvania. In 2014, he began his

undergraduate studies at Johns Hopkins University in Baltimore, Maryland. In 2017, he

joined the Ostermeier lab after becoming interested in molecular evolution while taking

Marc Ostermeier’s course. In 2018, he received his Bachelor of Science in Chemical and

Biomolecular Engineering. He stayed at Hopkins to earn his Master of Science in

Chemical and Biomolecular Engineering in May 2019.
