The challenges of informatics in synthetic biology: from biomolecular networks to artificial organisms
Gil Alterovitz, Taro Muso and Marco F. Ramoni
Submitted: 5th August 2009; Received (in revised form): 23rd September 2009
Abstract
The field of synthetic biology holds an inspiring vision for the future; it integrates computational analysis, biological data and the systems engineering paradigm in the design of new biological machines and systems. These biological machines are built from basic biomolecular components analogous to electrical devices, and the information flow among these components requires the augmentation of biological insight with the power of a formal approach to information management. Here we review the informatics challenges in synthetic biology along three dimensions: in silico, in vitro and in vivo. First, we describe the state of the art of the in silico support of synthetic biology, from the specific data exchange formats to the most popular software platforms and algorithms. Next, we cast in vitro synthetic biology in terms of information flow, and discuss genetic fidelity in DNA manipulation, development strategies of biological parts and the regulation of biomolecular networks. Finally, we explore how the engineering chassis can manipulate biological circuitries in vivo to give rise to future artificial organisms.
Keywords: informatics; synthetic biology; systems biology; networks
INTRODUCTION
The processing and management of information is a
critical part of synthetic biology, a field that
approaches the design of biologically based machines
from a systems engineering perspective, as a comple-
ment to systems biology. Whereas systems biology
studies how biological parts give rise to the emergent
properties and functions of a unified organism,
the main goal of synthetic biology is to start with
a set of functions and properties, and build a suit-
able system out of biological components. In other
words, systems biology and synthetic biology repre-
sent two sides of the same coin: analysis and
design [1].
The development of biologically based solutions
to human problems is as old as mankind. For thou-
sands of years, man has been breeding plants for agri-
culture, horses for transportation and pets for
companionship. Genetic engineering pioneered the
use of natural genes to modify organisms. Synthetic
biologists also alter natural systems for human con-
sumption, but with a different approach: they
engineer biological systems starting from artificial
components. As in systems engineering, biological
modules could be developed from an eclectic set
of natural sources and rapidly combined to arrive at
innovations that would be far beyond incremental,
time-consuming adjustments of natural organisms.
The imminent departure from traditional biological
engineering inspires novel ways to solve age-old
problems, such as those in alternative energy [2],
drug manufacture [3, 4], therapeutics [5] and green
chemistry [6]. In other words, synthetic biology
opens the door to unprecedented biochemical
flexibility—a marked departure from an incremental
pattern of progress.
Gil Alterovitz is a Harvard Medical School faculty member in the Children’s Hospital Informatics Program at the Harvard/MIT
Division of Health Sciences and Technology (HST).
Taro Muso is a graduate of the Harvard/MIT Division of Health Sciences and Technology (HST) and an affiliate of the Partners
Healthcare Center for Personalized Genetic Medicine.
Corresponding author. Taro Muso. E-mail: [email protected]
Marco F. Ramoni is the Associate Professor of Pediatrics and Medicine at Harvard Medical School, and the Director of the
Biomedical Cybernetics Laboratory at the Partners Healthcare Center for Personalized Genetic Medicine.
BRIEFINGS IN BIOINFORMATICS. VOL 11. NO 1. 80-95 doi:10.1093/bib/bbp054
Advance Access published on 11 November 2009
© The Author 2009. Published by Oxford University Press. For Permissions, please email: [email protected]
In theory, the synthetic biologist should be able
to start with a set of desired features, design a bio-
logical circuitry that meets those requirements, and
implement that design in vivo. The reality is not
so straightforward (Figure 1). The current practice
of producing complex biological systems usually
requires an iterative optimization, partly because bio-
logical parts are subject to apoptosis, crosstalk, muta-
tions and perturbations. In addition, a biological
component can exhibit context dependence—
it can stop working when it is transplanted from its
native context into another cell type. Synthesized
biological circuitry also suffers from biological noise
and undesirable initial conditions. The issues inherent
in this field become most apparent when one
considers that biological components, when put together,
give rise to emergent properties in the whole. The
existence of emergent properties indicates that our
biological knowledge and design capabilities are not
yet at the level of sophistication needed for a priori
design and production of a prototype with a fair
shot at success.
It is clear that the acknowledgement of the exis-
tence of emergent properties implies the need for
a better understanding of systems biology. What
is less obvious is that efficiently building a robust
infrastructure for synthetic biology requires a careful
management of relevant information by the research
community. Such information would include bio-
logical device data exchanged by collaborators, net-
work models exported by software and signals
transduced from one biological device to another.
The complexity and amount of information
needed implies an opportunity for synergy through
standardized communication. However, reviews
on synthetic biology from an informatics perspective
are rare. This review addresses this gap in the
literature.
IN SILICO
Computer-based design and simulation are key
elements of synthetic biology, and there is a need
for efficient communication between both human
beings and software programs. Taken together,
these facts imply the need for standardization of
synthetic biology data in silico.
Information standards
Most of the efforts in synthetic biology computer
data standardization can be grouped into two areas.
One starts with a network perspective, and the
other has a ‘bottom-up’ approach that emphasizes
the fundamental building block of synthetic biology,
or the biological part. The dominant parts format
appears to be the BioBrick Standard [7] (Figure 2),
which is used by the Registry of Standard Biological
Parts (http://partsregistry.org) and the international
Genetically Engineered Machine (iGEM) competition
[8]. The BioBrick Standard is a set of rules
that define features of a DNA sequence so that
each BioBrick can be easily combined into larger
compositions in vitro. In other words, each BioBrick
is an easily clonable DNA sequence which codes for
a biological part.

Figure 1: The synthetic biology infrastructure. Solid lines indicate the components of synthetic biology and the connections among them. Bold solid lines emphasize the main path from given requirements to finished product. Boxes with thin solid lines indicate support structures that need to be developed in order to make synthetic biology a practical reality. The cycles within the graph convey that the current technology requires an iterative approach to arrive at a useful biological system. In a larger context, synthetic biology (design) and systems biology (analysis) feed into each other (see dashed lines). For example, in vivo tests result in data which feed back into synthetic biology. The Database box represents organized knowledge from systems biology and quantitative data on biological parts.

While the ease of DNA construction
is addressed, extending the format to support
the functional composition of these modules remains
an important challenge [9]. The BioBrick format
bases its parts characterization on promoter structure
and sequences, and this is not easily translated
into functional characterization within the context
of interacting networks [10]. Sequence-based
descriptions of parts would be appropriate in design-
ing small systems where potential interactions
could be intuitively processed (for example, by
ignoring ‘nonessential’ DNA segments), but this
becomes impractical for the design of large networks.
This is because even ‘nonessential’ portions of bio-
logical sequences still affect functional efficiency
in DNA promoters, RNA, and proteins [11]. (This
paper [11] not only published new biological parts
but also proposed a general strategy that addresses
problems of emergent properties and design inaccu-
racy. This paper convincingly argued for a new
way to develop and characterize components, and
will likely influence the way future biological parts
are presented in databases and publications.) Minor
changes of nonessential sequences affect individual
components in minor amounts that are only quanti-
tatively noticeable, but small changes to one compo-
nent can still have a dramatic impact on network
behavior. Therefore, quantitative characterizations
of component functions are necessary for efficient
network design. Canton et al. [12] proposed to
augment the BioBrick documentation standards,
extending the BioBrick Standard
by adding quantified descriptions formatted into
datasheets akin to those common in electrical engi-
neering. However, different biological parts may
require different types of information [9]. In other
words, the Registry may require more than one
datasheet format.
Other enhancements of the BioBrick Standard
have also been proposed. Recent experimental
tests to confirm the validity of plasmid inserts for
a collection of clones have resulted in unexpected
discrepancies, so a quality control scheme for the
Registry of Standard Biological Parts has been
proposed [13]. A provisional BioBrick language (PoBoL)
was created to define a data exchange standard
(http://pobol.org) [14]. More specifically, PoBoL
aims to define minimal information requirements
for BioBricks, provide annotation methods for
BioBricks, maintain interlinking possibilities and
set the stage for further language extensions.
Of equal importance to biological parts standard-
ization is an agreement on how network designs
should be described. To model biological
systems, it seems logical to start with conventions
developed in the systems biology community, such
as the Systems Biology Markup Language (SBML)
[15–18], Cellular Markup Language (CellML)
[19, 20], MIRIAM [21], Systems Biology
Graphical Notation (SBGN) [22, 23] ([23] formally
presents a set of conventions in graphical notation
that will help biologists communicate clearly and
efficiently) and BioPAX [24].

Figure 2: The BioBrick Standard [7]. (a) Basic sequence template of the BioBrick Standard. The insert of the BioBrick is flanked upstream and downstream with restriction sites. EcoRI and XbaI restriction sites are at the 5′-end (prefix). SpeI and PstI restriction sites are at the 3′-end (suffix). Each insert is a genetic component that can code for a promoter, ribosome-binding site, open reading frame, transcription termination sequence, or any combination of these. Restriction site sequences are not allowed within the genetic component. (b) Schematic of the joining process. To attach insert 1 upstream of insert 2, use restriction enzymes SpeI and XbaI respectively. The ends can then be joined together to form a scar, which cannot be cleaved again by either one of the restriction enzymes.
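The two BioBrick rules stated in the Figure 2 caption (no restriction sites inside a part, and a SpeI/XbaI junction that neither enzyme can cut again) can be checked mechanically. The following is a minimal sketch, not part of the BioBrick Standard itself: the function names are hypothetical, only the six-base core recognition sequences are modeled, and only one strand is scanned, which is sufficient here because all four recognition sequences are palindromic.

```python
# Recognition sequences of the four restriction enzymes used by the BioBrick format.
SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
}


def illegal_sites(part):
    """Return {enzyme: [0-based positions]} for every site found inside a part.

    An empty dict means the part passes this (hypothetical) format check.
    """
    part = part.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = [i for i in range(len(part) - len(site) + 1)
                     if part[i:i + len(site)] == site]
        if positions:
            hits[enzyme] = positions
    return hits


def join_scar():
    """Core sequence left where a SpeI-cut end is ligated to an XbaI-cut end.

    SpeI cuts A^CTAGT and XbaI cuts T^CTAGA; both leave a CTAG overhang,
    so ligation yields A + CTAG + A.
    """
    upstream_stub = "A"      # base left of the SpeI cut
    overhang = "CTAG"        # sticky end shared by both enzymes
    downstream_stub = "A"    # last base of the XbaI site
    return upstream_stub + overhang + downstream_stub


if __name__ == "__main__":
    print(illegal_sites("ATGGAATTCTAA"))   # contains an EcoRI site
    print(join_scar(), illegal_sites(join_scar()))
```

Because the scar matches neither the XbaI nor the SpeI recognition sequence, the composite part keeps exactly one prefix and one suffix and can itself be used in the next round of assembly.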
SBML was developed to exchange biological
process information in the systems biology commu-
nity [15–18]. It can be used to model a variety of
phenomena, such as metabolic pathways, gene regu-
lation and cell signaling pathways. Its success can
be attributed to a number of factors. First, SBML
has incorporated a number of other useful standards:
MathML 2.0 [25], which provides a common math-
ematical expression language; the Resource
Description Framework (RDF) [26], which allows
for machine-readable metadata; and the Systems
Biology Ontology (SBO) [27, 28], which is a set of six
controlled vocabularies. Second, SBML provides
community-driven software support [29]
(http://sbml.org/SBML_Software_Guide). A particularly
useful software platform is an application programming
interface (API) library called LIBSBML [30],
which makes SBML file manipulation accessible
to scripting languages. Current translation scripts
have bridged SBML-structured data and
other formats [31]. Third, the SBML format
is used in the BioModels Database [32]
(http://www.ebi.ac.uk/biomodels-main/). Recent
developments demonstrate both language extensions
and applications. Its utility has been extended for
stochastic simulations [33]. SBML has been used in
the analysis of iron metabolism [34] and the RB/E2F
pathway [35].
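Because SBML is XML, even a general-purpose XML parser can read the basic structure of a model file. The sketch below is an illustration only: the embedded model is a hand-written toy fragment (its identifiers are hypothetical and it has not been validated against the SBML schema), and it uses the Python standard library rather than LIBSBML, which would be the more appropriate tool for real work.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 2 Version 4 documents.
SBML_NS = "http://www.sbml.org/sbml/level2/version4"

# Hand-written toy model; ids and values are made up for illustration.
MODEL_XML = f"""<sbml xmlns="{SBML_NS}" level="2" version="4">
  <model id="toy_expression">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="mRNA" compartment="cell" initialAmount="0"/>
      <species id="protein" compartment="cell" initialAmount="0"/>
    </listOfSpecies>
  </model>
</sbml>"""


def species_ids(xml_text):
    """Return the ids of all species declared in an SBML document string."""
    root = ET.fromstring(xml_text)
    ns = {"sbml": SBML_NS}
    path = "./sbml:model/sbml:listOfSpecies/sbml:species"
    return [species.get("id") for species in root.findall(path, ns)]


if __name__ == "__main__":
    print(species_ids(MODEL_XML))
```

A dedicated library adds what this sketch lacks: schema validation, unit checking and access to the MathML kinetic laws.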
CellML, an alternative to SBML, is an extensible
markup language that models the cell as a set of
ordinary differential equations [19, 20]. Its more
modular structure is convenient for multi-scale
modeling and reuse of parts but has less emphasis
on the biochemistry. CellML also incorporates
MathML and RDF. It also has some community-driven
software support [36] (http://www.cellml.org/tools).
There are translators that bridge
SBML and CellML [37]. Community adoption
of this standard has resulted in the CellML
Model Repository, which is a publicly accessible
database of curated biological models [38] (this
paper [38] presents the current state of the model
repository). CellML’s flexibility stems from its
ability to represent biological phenomena through
mathematical and model building constructs,
but sometimes it is useful to have explicit biological
descriptions. To this end, Wimalaratne et al. [39]
have developed a biophysical annotation
framework.
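To make concrete what "a set of ordinary differential equations" means for a cellular model, here is a minimal sketch of a two-variable gene expression model integrated with the forward-Euler method. The model, the parameter names and the numerical values are all assumptions chosen for illustration; they are not taken from CellML or from any model in its repository, and production tools use more robust integrators.

```python
def simulate(k_tx, k_tl, d_m, d_p, t_end, dt=0.01):
    """Integrate a toy expression model with forward Euler.

        dm/dt = k_tx - d_m * m          (mRNA: transcription, decay)
        dp/dt = k_tl * m - d_p * p      (protein: translation, decay)

    Starts from m = p = 0 and returns (m, p) at time t_end.
    """
    m = p = 0.0
    for _ in range(int(round(t_end / dt))):
        dm = k_tx - d_m * m
        dp = k_tl * m - d_p * p
        m += dt * dm
        p += dt * dp
    return m, p


if __name__ == "__main__":
    # Hypothetical rate constants in arbitrary units.
    m, p = simulate(k_tx=2.0, k_tl=5.0, d_m=1.0, d_p=0.5, t_end=50.0)
    # Analytical steady state: m* = k_tx / d_m, p* = k_tl * m* / d_p.
    print(m, p)
```

Comparing the simulated end point with the analytical steady state is a quick sanity check on both the equations and the step size.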
MIRIAM, or minimal information requested
in the annotation of biochemical networks, is a
scheme to provide extensive documentation in the
model file in a structured manner [21]. Models can
only be useful if there is enough annotation.
Controlled annotations are achieved with the help
of uniform resource identifiers (URIs) [40]. The
MIRIAM approach provides a common annotation
format as well as controlled vocabularies and data-
bases [41].
SBGN is a recent attempt at standardizing
the visual representation of biological networks
(Figure 3) [22]. Recently, automatic equation gen-
eration for SBML from SBGN diagrams was
made possible [42].
BioPAX is an effort to represent pathway data
with ontological annotations [24, 43]. BioPAX com-
plements formats like CellML and SBML because it
focuses on the integration of large qualitative
pathways rather than on mathematical modeling
[10, 44].
The synthetic biology community also has other
approaches that border on standardization. For
example, Pedersen et al. [45] introduced a formal
language called Genetic Engineering of Cells
(GEC), which allows a modular modeling of inter-
actions between potentially undetermined proteins
and genes.
Ideally, a synthetic biology design approach
would have the versatility to employ both network-
and component-centric standards so that multiple
levels of detail could be considered at the same
time. In addition to importing publicly accessible
data in common formats, the workflow would
integrate problem-specific data and formats as well.
Integration of the network and component perspec-
tives is occurring or anticipated on multiple fronts.
The BioBricks format is expected to support the
design of ever more complex networks by incorpor-
ating integration approaches akin to BioPAX [10]
that allow for ontological annotations. In contrast,
standards like CellML and SBML that already
allow mathematical network modeling would
benefit from extending their formalisms to leverage
synthetic biology constructs, such as DNA sequences
and device-level information [10]. A third front
is composed of integration efforts that proceed not
through explicit dialogue on standards but through
software development. OpenCell (PCEnv), a CellML-based
platform, can model both quantitative networks and
synthetic biology constructs [19].
The result of these efforts would be a compre-
hensive description framework, but the classic trade-
off between detail-driven accuracy and analytical
efficiency will persist. Because a tradeoff naturally
implies numerous possible approaches to addressing
both accuracy and efficiency, each subgroup
within synthetic biology may opt to pursue
its own specialized formats for data management.
For example, a network that depends on transcrip-
tional regulation and a model that depends on
protein–protein interaction may have different
description requirements for modules and control
kinetics equations. Such specializations may be
easily achieved through the custom tag facility of
XML [46], which is already familiar to developers
of SBML and CellML.
Figure 3: SBGN network example of inter-cellular signaling near the neuromuscular junction [22, 23]. Biological concepts are organized with glyphs, or named containers. Some glyphs represent entity pool nodes, each of which is a population of entities that are not distinguished from one another in the current SBGN framework. Circular (not ellipsoid) glyphs represent ‘simple molecules’ like ATP and calcium ions. Rectangles with four rounded corners represent ‘macromolecules’ such as myosin. Glyphs can be adorned with additional information, such as the nicotinic acetylcholine receptors (nAChR), which are attached to the ‘state variable’ glyphs ‘open’ and ‘closed’. Note that the transition from ‘closed’ to ‘open’ is designated by an arrow with the ‘transition’ glyph, a small square. Another process glyph used here is ‘association’, where two lines converge to form one arrow, and a filled disk is placed downstream of the connection. By having a carefully planned set of conventions for depicting biological processes, collaborators can communicate with each other with minimal ambiguity in graphical notation.
Databases and software tools
No single data standard in synthetic biology has
yet achieved the scope necessary to account for all
useful information, such as epigenetic data [9].
Nevertheless, the current data formats are still
useful for organizing biological information in data-
bases and software. Synthetic Biology Software
Suite (SynBioSS), designed for modeling synthetic
genetic constructs, uses the Registry of Standard
Biological Parts as well as a kinetic parameter data-
base [47]. GenoCAD aims to streamline the design
of synthetic DNA sequences [48]. This program
appears to imply a debate in the synthetic biology
community about the need for well-formatted
ends for easy connection of coding sequences. The
software takes advantage of the BioBrick-formatted
DNA registry, but it also aims to do away with
the standardization of the means by which the parts
are connected. This implies a BioBrick-independent,
general means of producing long stretches of error-
free DNA (discussed later). CellML has software
support through OpenCell (formerly PCEnv),
Cellular Open Resource (COR) [49], InsilicoIDE
and JSIM [19]. Cytoscape can visualize and ana-
lyze complex networks for biological research [50].
Plug-ins, which confer additional features, are
actively being developed [51–54]. Funahashi’s
CellDesigner [55], an editor for SBML, was designed
as a tool to model network dynamics. It has a
plug-in facility that enables third parties to extend
the software capability. CellDesigner’s utility has
been extended for stochastic simulations [33] and
automatic equation generation from SBGN diagrams
[42]. CellDesigner has been used in the analyses
of iron metabolism [34] and the RB/E2F pathway
[35]. The Process Modeling Tool (ProMoT) is
a ‘drag and drop’ design platform [29]. Other
software developments can be found at format-
specific resource pages [29, 36, 56]. In short, con-
current with the efforts to reach consensus on
information standards are attempts to employ
data and standards in the design of synthetic
networks.
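Stochastic simulation, mentioned above as an extension of these tools, treats each reaction as a random event instead of a continuous rate. One widely used approach is Gillespie's direct method; the sketch below applies it to a single species that is produced at a constant rate and degraded in proportion to its copy number. It is an assumption that this is a representative example: the source does not say which algorithm the cited extension [33] uses, and the rate constants here are hypothetical.

```python
import random


def gillespie_birth_death(k, d, t_end, seed=0):
    """Simulate 0 -> X (rate k) and X -> 0 (rate d * x) with the direct method.

    Requires k > 0 so that the total event rate is never zero.
    Returns a list of (time, copy_number) points, starting at (0.0, 0).
    """
    rng = random.Random(seed)
    t, x = 0.0, 0
    trajectory = [(t, x)]
    while True:
        birth, death = k, d * x
        total = birth + death
        t += rng.expovariate(total)          # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * total < birth:     # choose which reaction fires
            x += 1
        else:
            x -= 1
        trajectory.append((t, x))
    return trajectory


if __name__ == "__main__":
    traj = gillespie_birth_death(k=10.0, d=1.0, t_end=50.0, seed=1)
    print(len(traj), traj[-1])
```

For this simple system the copy number fluctuates around k / d, which gives an easy check that the simulation behaves sensibly.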
Algorithms and heuristics
Computer-based informatics also has the advantage
of relatively low-cost, quick simulations prior to
in vitro implementation. Loewe [57] proposed a
framework that combined systems biology and evo-
lutionary theory to simulate mutations whose effects
are too subtle to be detected in vitro. Chen et al. [58]
proposed a stochastic game theory-based approach
to address complications due to uncertain initial
conditions and extra-cellular disturbances. They
also proposed managing uncertainties by addressing
four design specifications [59]. Banga [60] has
recently reviewed optimization in computational
systems biology. Computational limits make model
simplification a useful strategy. To this end enzyme
kinetic models are translated in a number of formats
to reduce the model complexity. Hadlich et al. [61]
developed an algorithm to automate the process of
kinetic format translation. Bentley [36] proposes
methods called systemic computation (SC) and
fractal proteins for improving the simulations of bio-
logical systems. OptCircuit is an optimization-based
method for automatically identifying the required
circuits from a database of components and kinetic
parameters [62]; this method may work well
with Ellis et al.’s strategy of designing networks
from quantitatively characterized libraries of diversi-
fied components [11]. Cantone et al. [63] developed
a small synthetic gene network to assess current
modeling and reverse-engineering algorithms.
Models based on ordinary differential equations and
Bayesian networks were qualitatively accurate, but it
is not yet clear if these conclusions are generalizable
to the analysis of larger networks. We see that
the need for an unambiguous, quantitative, and
collaborative exchange of digital, computerized
information is currently being addressed by a variety
of standards, databases and software.
Improvements in algorithms for analyzing
networks in synthetic and systems biology are
needed, because our current, relatively simple
models do not have the capacity to handle the abun-
dant data acquired from complex biological systems
[31]. Issues in network analysis are exemplified by
the fact that inferences from small-sized networks
cannot be simply extrapolated to larger networks,
as Stumpf et al. [64] have shown that sub-networks of
a scale-free network are not necessarily scale free.
In general, a rigorous statistical analysis of network
data is difficult because there are numerous correla-
tions [31].
IN VITRO
The informatics approach can also reframe the
in vitro aspects of synthetic biology. In this
light, DNA synthesis from computer-aided design
is essentially a format conversion from bytes
to basepairs. Biological parts development often
involves a refinement of signal transduction, or
data flow within a biological circuit. Protein
complexes can be modeled as instances of noisy
communication channels [65, 66]. Indeed,
information-processing devices such as logic gates
have already been implemented in vitro (Figure 4).
In other words, critical informatics technology
in synthetic biology resides not only in computers
but also in biological circuitry.

Figure 4: Transcription-based logic gates constructed from modular transcription units [67]. Electronic logic gates are the fundamental building blocks of computational ability. For each logic gate, the table presents the Boolean logic (column 2), the design of a biological module (column 3) and an expression profile that emulates the electronic counterpart (column 4). Each network architecture represents a synthetically designed component.
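One way to see how transcriptional regulation can emulate Boolean logic is to model promoter activity with Hill functions and threshold the result. The sketch below is a phenomenological illustration only: it does not reproduce the designs in [67], and the threshold constant K, the Hill coefficient n and the LOW/HIGH input levels are hypothetical values.

```python
def hill_activation(x, K=1.0, n=4):
    """Fraction of maximal expression driven by an activator at level x."""
    return x**n / (K**n + x**n)


def hill_repression(x, K=1.0, n=4):
    """Fraction of maximal expression remaining under a repressor at level x."""
    return K**n / (K**n + x**n)


def not_gate(a):
    """Output gene repressed by input a."""
    return hill_repression(a)


def and_gate(a, b):
    """Output gene needs both activators (independent-binding assumption)."""
    return hill_activation(a) * hill_activation(b)


def nor_gate(a, b):
    """Output gene is shut off by either repressor."""
    return hill_repression(a) * hill_repression(b)


LOW, HIGH = 0.1, 10.0   # hypothetical input concentrations


def as_bit(level, threshold=0.5):
    """Read an expression level as a logical 0 or 1."""
    return int(level > threshold)


if __name__ == "__main__":
    for a in (LOW, HIGH):
        for b in (LOW, HIGH):
            print(as_bit(a), as_bit(b),
                  "AND", as_bit(and_gate(a, b)),
                  "NOR", as_bit(nor_gate(a, b)))
```

The gates behave digitally only when inputs sit well away from K; inputs near the threshold give intermediate outputs, which is one reason quantitative characterization of parts matters.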
DNA synthesis
Following a successful simulation, the computer-based
network design must be translated into an
in vitro DNA sequence. BioBrick-formatted synthetic
genes can provide a set of required, proofread
sequences that one can splice together (Figure 5).
Combined, the much longer sequence codes for
the synthetic biological circuitry. On the other
hand, doing away with the BioBrick parts connec-
tion formats can streamline the design of synthetic
DNA sequences [48], as long as sequence proofread-
ing can still be done. In other words, an approach
independent of the build-by-parts strategy requires
a high-fidelity method for writing the basepair
sequence, because even a single basepair mutation
has been shown to cause system-wide disorders
such as sickle-cell anemia. Linshiz et al. [68] (this
paper proposes a strategy to make large, error-free
DNA target molecules) developed a method for
writing long, error-free DNA from potentially
faulty building blocks (Figure 6). Gibson et al. [69]
(this paper demonstrates that it is possible to handle
an entire Mycoplasma genome with high fidelity)
developed a method for constructing large DNA
molecules, such as a 582 970-basepair Mycoplasma genitalium genome.
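A short calculation shows why fidelity becomes the central problem as synthetic sequences grow. Under the simplifying assumption that each base is written incorrectly with the same independent probability, the chance that a whole molecule is error-free falls geometrically with its length. The error rate used below is a hypothetical figure chosen for illustration, not a measured value from [68] or [69].

```python
def p_error_free(length_bp, per_base_error):
    """Probability that every base is correct, assuming independent errors."""
    return (1.0 - per_base_error) ** length_bp


def expected_clones_to_screen(length_bp, per_base_error):
    """Mean number of molecules examined before finding an error-free one.

    This is the mean of a geometric distribution with success
    probability p_error_free(...).
    """
    return 1.0 / p_error_free(length_bp, per_base_error)


if __name__ == "__main__":
    error_rate = 0.002   # hypothetical: one error per 500 bases
    for length in (100, 1000, 5000):
        print(length,
              p_error_free(length, error_rate),
              expected_clones_to_screen(length, error_rate))
```

Short segments are often correct while long ones almost never are, which is the motivation for building long molecules from verified error-free pieces instead of synthesizing them in one pass.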
Biological component design
Just as electrical circuits need devices that control
data flow, biological networks need biological parts
that modulate signal transduction. Informatics issues
in components and the network overlap with
each other. We will start with components and
transition into network informatics.

Figure 5: Assembling DNA molecules with BioBrick parts [70]. Gene A is to be added to the standardized plasmid p1. Neither Gene A nor any gene within p1 may have sequences that can be recognized by the four restriction enzymes used during the main assembly process. Gene A is flanked by ‘prefix’ and ‘suffix’ sequences which are delivered by primers during PCR. Alternatively, one can acquire a plasmid pA that already has Gene A with the necessary prefix and suffix. Plasmid p1 and Gene A undergo separate restriction enzyme digests, and are later combined to form p1A. The plasmid p1A is now ready to receive another gene.
Figure 6: Recursive construction of error-free DNA molecules from imperfect oligonucleotides [68]. (A) GFP DNA construction. The entire sequence is divided into overlapping pieces in silico. These pieces are synthesized conventionally. Assembly by overlapping ssDNA results in target molecules, which are then sequenced to find errors. Error-free segments are derived and amplified, and assembly by overlapping ssDNA takes place again. This loop continues until an error-free target molecule is formed. (B) Construction of ssDNA from two overlapping sequences. During PCR, one primer is a phosphorylated primer, which becomes a degradation target of Lambda exonuclease.

Synthetic biological devices are often made
from natural devices with evolutionary optimization.
Natural components may therefore have context
dependence that precludes them from compatible
connection points with other devices. One example
is the codon mismatch that occurs when a biological
part is transferred from one organism to a host of
a different kingdom [71]. In order to adapt natural
parts to the needs of synthetic biology, they must
be standardized. Lucks [72] proposed a set of general
features to consider when developing a biological
device. An ideal part would be independent, reliable,
tunable, orthogonal and composable. In other words,
it does not interfere with other circuitry, functions
as intended (context independent), can function
in a range of selectable modes, can be tuned so
that it does not interfere with similar devices, and
can be combined to function in a system predictably.
In addition, DNA sequences must adhere to the rules
of transcription control [73]. Suarez et al. [74]
discuss the challenges in the computational design
of proteins. Martin et al. [71] review guidelines for
engineering synthetic enzymes. Recent synthetic
biology devices include a cellular counter in
Escherichia coli [75], a tunable synthetic mammalian
oscillator [76], an aptazyme-based riboswitch [77],
a tunable synthetic gene oscillator [78] and a
double inversion recombination switch [79].
Incidentally, Tsai et al. [80] argue that biological
oscillators sometimes contain positive feedback
loops in order to achieve frequency control without
amplitude change. Dawid et al. [81] designed syn-
thetic RNA regulatory elements based on transcrip-
tion attenuator control.
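The codon mismatch mentioned earlier in this section is commonly handled by recoding a gene with synonymous codons that the new host uses more readily, leaving the protein sequence unchanged. The sketch below illustrates the idea on a tiny subset of the standard genetic code. The host preference table is hypothetical and chosen only for illustration; real codon optimization draws on measured codon usage and many additional constraints.

```python
# Subset of the standard genetic code covering the codons used below.
CODON_TO_AA = {
    "ATG": "M",
    "CTA": "L", "CTG": "L",
    "AGA": "R", "CGT": "R",
    "AAA": "K", "AAG": "K",
    "TAA": "*",
}

# Hypothetical host preference: one chosen codon per amino acid.
HOST_PREFERRED = {"M": "ATG", "L": "CTG", "R": "CGT", "K": "AAA", "*": "TAA"}


def split_codons(cds):
    """Split a coding sequence into codons, rejecting partial codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence using the subset table above."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(cds.upper()))


def recode(cds):
    """Replace each codon with the host-preferred synonymous codon."""
    return "".join(HOST_PREFERRED[CODON_TO_AA[codon]]
                   for codon in split_codons(cds.upper()))


if __name__ == "__main__":
    original = "ATGCTAAGAAAGTAA"
    adapted = recode(original)
    print(original, translate(original))
    print(adapted, translate(adapted))
```

The recoded sequence differs at the DNA level but translates to the same protein, which is the property any such adaptation must preserve.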
Arkin [79] proposed developing a group of
devices from a common core structure by altering
a particular key property. Calling them a ‘family of
parts’, Arkin argued that related devices are likely
to share characterization protocols. Common proto-
cols for a versatile set of devices would simplify the
physical composition process, and this would have
important ramifications on design strategies as well
as parts organization within the Registry. However,
it is important to keep in mind that similar devices
raise the risk of crosstalk and interference with each
other [10]. Unlike electrical circuits, the same ‘logic
gate’ probably cannot be used in the same space.
Ellis et al. [11] proposed the development of
libraries of diversified components—parts that are
functionally equivalent but have differences in the
nonessential sequences—for improving design strat-
egy. Differences in nonessential sequences affect
quantitative functional efficiencies of components,
and this in turn can have a large impact on overall
network behavior. If required documented libraries
are established prior to design, then one can
accurately simulate and fine-tune a system by picking
the components with appropriate functional efficiencies.
In other words, Ellis et al. [11] propose to move
component ‘tweaking’ to the front-end of the
synthetic biology infrastructure and upstream of
software-based network design. Such ‘diversified’
parts would address issues of emergent properties,
biological noise and tunability. They may also address
the need for compatible inputs and outputs in serial
connectivity. Ellis et al. [11] successfully employed
the above strategy in the development of a feed-
forward loop network and a gene timer network.
Establishment of such libraries will probably occur
not only for DNA but RNA and proteins as well.
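The library-based strategy described above amounts to a search problem: given parts with documented quantitative strengths, pick the combination whose predicted behavior is closest to a design target. The sketch below shows the simplest version of that search. Everything in it is assumed for illustration: the part names, the strength values, and the model in which output is the product of a promoter strength and a ribosome-binding-site strength. It is not the procedure used by Ellis et al. [11].

```python
from itertools import product

# Hypothetical characterized libraries: part name -> relative strength.
PROMOTERS = {"P1": 0.1, "P2": 0.4, "P3": 1.0, "P4": 2.5}
RBS = {"R1": 0.5, "R2": 1.0, "R3": 3.0}


def predicted_output(promoter, rbs, promoters=PROMOTERS, rbss=RBS):
    """Assumed, simplistic model: output = promoter strength * RBS strength."""
    return promoters[promoter] * rbss[rbs]


def pick_parts(target, promoters=PROMOTERS, rbss=RBS):
    """Exhaustively search all pairs for the output closest to the target."""
    return min(
        product(promoters, rbss),
        key=lambda pair: abs(predicted_output(pair[0], pair[1],
                                              promoters, rbss) - target),
    )


if __name__ == "__main__":
    print(pick_parts(1.2))
    print(pick_parts(0.95))
```

Exhaustive search is workable only for small libraries; larger design spaces are the setting for optimization-based methods such as those discussed under Algorithms and heuristics.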
Biological noise presents problems for informa-
tion flow through biological parts. A digital step-
like interface between components may reduce
the effect that noise would have on an analog
system [82].
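The effect of a step-like interface on noise can be illustrated numerically by passing the same noisy input through a steep transfer curve and through a proportional one, then comparing the spread of the outputs. This is a sketch under stated assumptions: the Hill function stands in for a "digital" interface, the linear function for an "analog" one, and the Hill coefficient, gain and noise levels are hypothetical values not taken from [82].

```python
import random
import statistics


def hill(x, K=1.0, n=8):
    """Steep, step-like transfer curve with threshold K."""
    return x**n / (K**n + x**n)


def linear(x, gain=0.25):
    """Analog interface: output proportional to input."""
    return gain * x


def output_spread(transfer, mean_input, noise_sd, samples=5000, seed=0):
    """Standard deviation of the output under Gaussian input noise."""
    rng = random.Random(seed)
    outputs = [transfer(max(0.0, rng.gauss(mean_input, noise_sd)))
               for _ in range(samples)]
    return statistics.pstdev(outputs)


if __name__ == "__main__":
    # Input well above the threshold: the step-like curve is saturated.
    print(output_spread(hill, 3.0, 0.3), output_spread(linear, 3.0, 0.3))
    # Input at the threshold: the steep curve amplifies fluctuations.
    print(output_spread(hill, 1.0, 0.05), output_spread(linear, 1.0, 0.05))
```

The comparison cuts both ways: a steep response suppresses noise when the input is firmly on one side of the threshold, but amplifies it when the input sits near the threshold, so component levels must be matched to the switching point.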
Network informatics
Information flow can also be addressed from the
perspective of networks. The oldest synthetic bio-
logical circuits were based on transcriptional regula-
tion. Within the transcriptional network, two genes
were connected by having one gene code for the
transcription factor of the promoter of the other
gene. Carrera et al. [83] (this paper demonstrates
a method to model and modify the transcription
regulation network of E. coli ) proposed to rewire
the transcription regulation network by exchanging
the endogenous promoters. Other biological circuit
experiments have involved RNA-based regulation
and metabolism [84]. Recently, Bashor et al. [85]
[this paper introduces and demonstrates the idea
of using protein scaffolds (and hence protein–
protein interactions) to control synthetic regulatory
networks] constructed a biological network
through protein–protein interactions. Compared to
translation-dependent regulatory circuits, protein-
level connections have the potential for quicker
response with lower cellular resource consumption
rates. Engineering of protein–protein interactions
becomes a tractable problem if system design
leverages well-characterized protein domains [86]
that enable a combinatorial strategy for generating
synthetic proteins and signaling pathways. In antici-
pation of multi-cellular assemblies with synthetic
signaling requirements, Weber et al. developed
a metabolite-controlled intercellular signaling
method [87]. To achieve transient system dynamics,
Yin et al. [88] argued for augmenting target structure
sequences with the capability to automatically
construct self-assembly and disassembly pathways.
Yin et al. [88] implemented such a system with a DNA
hairpin motif.
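The promoter-exchange idea mentioned above can be pictured as a rewiring of incoming edges in a directed graph. The sketch below is only an illustration of that idea, not the modeling method of Carrera et al. [83]: it assumes that a gene's promoter fully determines which transcription factors regulate it, and the gene names are hypothetical placeholders.

```python
def swap_promoters(regulators, gene_a, gene_b):
    """Return a new network in which gene_a and gene_b have exchanged promoters.

    `regulators` maps each gene to the set of transcription factors acting on
    its promoter, so exchanging promoters exchanges the incoming edges.
    """
    rewired = {gene: set(tfs) for gene, tfs in regulators.items()}
    rewired[gene_a], rewired[gene_b] = rewired[gene_b], rewired[gene_a]
    return rewired


def edges(regulators):
    """List the network as sorted (transcription factor, target gene) pairs."""
    return sorted((tf, gene) for gene, tfs in regulators.items() for tf in tfs)


# Hypothetical three-gene network; the names are placeholders.
network = {
    "geneA": set(),
    "geneB": {"geneA"},
    "geneC": {"geneA", "geneB"},
}
rewired = swap_promoters(network, "geneB", "geneC")
print(edges(network))
print(edges(rewired))
```

In this toy case the exchange keeps the number of regulatory edges constant but changes the topology: geneB now sits behind a promoter that geneB itself activates, so the rewired network contains an autoregulatory loop that the original did not.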
Biological noise is also a problem at the net-
work level. Studying noise in complex networks tra-
ditionally involves computational perturbation
methods, because an in vitro implementation of
an arbitrary noise source is not always trivial. To
bridge this gap, Lu et al. [89] have developed a
means of implementing simple in silico perturbation
sources as in vitro molecular noise generators.
IN VIVO
Whereas in vitro synthetic biology enables biochemical
flexibility, in vivo synthetic biology endows a
biological network with large-scale production capacity
[90]. The first step in the transition from in vitro
to in vivo is the insertion of the constructed DNA
into a biological chassis where transcription and
translation could take place, such as a bacterium’s
genome. Itaya et al. [91] addressed physicochemical
stability issues of large DNA by developing
the Bacillus subtilis genome (BGM) vector, which
accommodates large DNA as part of the B. subtilis
genome and might combine well with cell-free
expression systems in the future [92]. Shao
et al. [93] developed a method for assembling a
19 kb recombinant DNA molecule in Saccharomyces
cerevisiae. Minaeva et al. [56] integrated two
recombination methods (phage site-specific and
Red/ET-mediated) into a straightforward, convenient
protocol. This method, called the Dual-In/Out
strategy, was applied successfully to plasmid-less,
marker-less E. coli strains.
When a biological network is expressed by synthetic
DNA sequences within the host, or engineering
chassis, crosstalk between the host and synthetic
circuitry can adversely affect performance. For
example, endogenous carotenoid pathways in
higher plants seem to resist synthetic alterations
[94]. Emergent problems from crosstalk are not
surprising, even for commonly studied organisms like
E. coli, because significant portions of organismal gene
regulatory networks are not yet known [95].
Hence, minimizing or at least controlling crosstalk
is a desired goal in network information control.
One approach is to reach community consensus on
a ‘standard’ organism in which developed ‘standard’
parts exhibit negligible crosstalk and other desired
properties. The obvious candidates are those that
already have methods for accommodating large
DNA molecules: S. cerevisiae [93] and E. coli [56].
However, both species will probably require cross-
talk reduction through numerous deletions of non-
essential genes.
The logical endpoint of systematic nonessential
gene deletion is the concept of the minimal cell
[96, 97], which in theory is composed only of
genetic material critical to survival. Natural minimal
cells like Pelagibacter ubique that thrive in resource-
deficient environments may also be good starting
points for the development of a standard artificial
organism [97]. The standard artificial organism,
however, is not necessarily a minimal cell, because
effective crosstalk elimination may occur before
all nonessential genes are deleted. In addition, the
genomes of parasitic minimal cells and artificially
minimized cells may present fastidious habits and
lack the reliability of a bulkier genome [82].
Synthetic biology needs a host that minimizes inter-
ference while providing robust cellular infrastructure,
and minimal cells do not guarantee that.
Another way to address crosstalk is to develop
orthogonal ribosomes and mRNA that interact
only with each other and with neither the ribosome
nor the genetic material of the host organism [98].
Evolved ribosome–mRNA pairs can then be
used to construct cellular networks [99]. With this
approach, a synthetic type 1 coherent feed-forward
loop was developed in E. coli [100] (this paper
demonstrates that synthetic circuits can be based
on orthogonal transcription–translation networks).
With enough orthogonal components, it may be
possible to build a parallel metabolism within the
cell [101].
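As background for the circuit type named above, the following sketch simulates a generic type 1 coherent feed-forward loop, in which X activates Y and both X and Y are required to activate Z. It is a textbook-style toy model under assumed parameters (AND logic at the Z promoter, step-like activation, unit production and decay rates, simple Euler integration) and does not reproduce the orthogonal transcription–translation implementation of [100].

```python
def simulate_c1_ffl(t_end=10.0, dt=0.01, threshold=0.5, decay=1.0):
    """Euler-integrate a coherent type 1 feed-forward loop with AND logic.

    X is switched on at t = 0 and activates Y; Z is produced only while both
    X and Y exceed `threshold`. Returns (times, y_values, z_values).
    """
    steps = int(round(t_end / dt))
    y, z = 0.0, 0.0
    times, ys, zs = [], [], []
    for i in range(steps + 1):
        times.append(i * dt)
        ys.append(y)
        zs.append(z)
        x = 1.0  # input signal present for the whole run
        y_production = 1.0 if x > threshold else 0.0
        z_production = 1.0 if (x > threshold and y > threshold) else 0.0
        y += dt * (y_production - decay * y)
        z += dt * (z_production - decay * z)
    return times, ys, zs


def first_time_above(times, values, level):
    """First sampled time at which `values` exceeds `level`, or None."""
    for t, v in zip(times, values):
        if v > level:
            return t
    return None


times, ys, zs = simulate_c1_ffl()
y_on = first_time_above(times, ys, 0.5)
z_on = first_time_above(times, zs, 0.0)
print(f"Y crosses the threshold at t = {y_on:.2f}")
print(f"Z starts to accumulate at t = {z_on:.2f}")
```

In this model Z stays at zero until Y has accumulated past the threshold, so the output responds to the input only after a delay. That delay is the property usually associated with this motif under AND logic.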
Ultimately, however, it may be necessary to
implement physicochemical partitions with the
phospholipid bilayer, whose adoption in natural
modules poses a convincing argument for its use in
synthetic biology. The bilayer can form a liposome
into which one can incorporate several biochemical
modules; Forster and Church [96] roughly outline the series
of steps needed. This is essentially a ‘ground-up’
approach to the minimal cell, and the option to
use artificial, low-interference modules suggests
a higher chance of success than the ‘top-down’
approach of multiple gene deletions. Recently,
Kuruma et al. [102] (this paper represents the latest
progress in the development of the liposome into
a viable chassis) developed a liposome-based system
that synthesizes phosphatidic acid, a major constitu-
ent of cell membranes. A cell-free translation sys-
tem was encapsulated in a liposome, in which
functional membrane enzymes were synthesized.
This represents a significant step toward liposome-
encapsulated phospholipid bilayer biosynthesis and
points toward synthetic modules with autopoietic
capabilities.
At the border of in vitro and in vivo synthetic
biology is the cell-free system, a platform for
implementing complex biological processes outside
a cell membrane. Historically, it has been difficult
to activate more than one biochemical network
in a single platform, but Jewett et al. [103]
(this paper represents the latest progress on
integrating multiple biochemical networks in a
single cell-free system) have recently developed
a cell-free system capable of co-activating central
catabolism, oxidative phosphorylation, and protein
synthesis.
Once a synthetic network has been fully imple-
mented in vivo, the combined host-guest network
must be characterized for performance and poten-
tial crosstalk. However, experimental perturbations
inevitably lead to data noise. In fact, for protein
interaction networks the rate of false-positive
and false-negative results may be as high as
40% [104, 105]. To address this problem Lappe
and Holm [106] have devised a means of efficiently
deriving interaction networks. Cantone et al. [63]
found that reverse-engineering methods based on
ordinary differential equations and Bayesian net-
works were effective at inferring the structure of a
small, synthetic gene regulatory network.
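To give a feel for what error rates of this size mean for a measured network, the sketch below works through the arithmetic. The text does not define the two rates precisely, so the definitions used here are assumptions: the false-negative rate is taken as the fraction of true interactions that the assay misses, and the false-positive rate is treated as a false-discovery rate, that is, the fraction of reported interactions that are spurious. The network size of 1000 true interactions is hypothetical.

```python
def expected_overlap(true_edges, false_negative_rate, false_discovery_rate):
    """Expected composition of a measured interaction network.

    false_negative_rate: fraction of true interactions the assay misses.
    false_discovery_rate: fraction of reported interactions that are spurious.
    Returns (true_detected, spurious_reported, total_reported).
    """
    if not 0.0 <= false_negative_rate < 1.0:
        raise ValueError("false_negative_rate must lie in [0, 1)")
    if not 0.0 <= false_discovery_rate < 1.0:
        raise ValueError("false_discovery_rate must lie in [0, 1)")
    true_detected = true_edges * (1.0 - false_negative_rate)
    total_reported = true_detected / (1.0 - false_discovery_rate)
    spurious_reported = total_reported - true_detected
    return true_detected, spurious_reported, total_reported


# Hypothetical network with 1000 true interactions and both rates at 40%.
detected, spurious, reported = expected_overlap(1000, 0.4, 0.4)
print(f"true interactions detected: {detected:.0f}")
print(f"spurious interactions:      {spurious:.0f}")
print(f"total reported:             {reported:.0f}")
```

Under these assumed definitions, a measured network of about 1000 edges would share only about 600 edges with the true network. This is one way to see why efficient and robust strategies for deriving interaction networks matter.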
CONCLUSION
The survey of the role of information processing
in synthetic biology reveals how future develop-
ments may be influenced by current ones
(Table 1). Consolidation of and additions to data
exchange formats are needed to enable efficient
communication between people and software. The
likely improvement in quantitative precision of
component functional data will reduce network
design unpredictability and post hoc tweaking.
Current hosts for in vivo synthetic biology include
E. coli and S. cerevisiae, but future hosts may take
a more minimalist approach and incorporate ortho-
gonal metabolic systems.
Synthetic biology is the next step in the progress
of engineering biological systems. The key infor-
matics challenges (some of which overlap with
those of systems biology) are standardization, devel-
opment of appropriate statistical analysis methods,
digital data integrity, biological noise control and
limitation of crosstalk (Table 2). When these issues
are properly addressed, the result will be artificial
organisms unrivaled in their biochemical
sophistication.
Table 1: Recent major developments in synthetic biology. For each development, the row indicates its immediate impact niche, and the column indicates the informatics scope. However, all items noted have the potential to deeply influence the progress of synthetic biology in the next decade.
Columns: In silico; In vitro; In vivo
Biological part: Proposals to extend parts documentation standards [11, 12]; Proposal for a revised quality control scheme for the Parts Registry [13]; Proposal to develop libraries of diversified components [11]
Network: SBGN [23]; CellML Model Repository [38]; Increased size of high-fidelity DNA [68, 69]; Synthetic networks based on protein–protein interactions [85]
Organism: Redesigning global transcription regulation [83]; Semi-synthetic minimal cells [102]; Orthogonal transcription–translation networks [100]; Integrated cell-free metabolic platform [103]
Table 2: Idealized recipe for synthetic biology. For each step, potentially useful Tools are identified. All steps exhibit Issues at this time. Emergent properties can be thought of as the result of biological noise. Note that many of the problems can be traced to just a couple of infrastructure issues. For example, having many choices for parts with very precise characterizations would address issues of emergent properties at the in vitro level. If gene regulatory networks of the chassis were to be better documented, then crosstalk issues (and therefore in vivo emergent properties) would probably pose less of a problem.
Step: Network design and simulation
Tools: OpenCell [19], Cytoscape [50], CellDesigner [55], and other software [29, 36, 56]; Parts Registry database
Major issues: No comprehensive data standard [9]; Insufficient parts documentation [9]
Proposals: Design-friendly parts documentation [9–12]; Parts Registry quality control [13]; Computational design methods [29, 36, 62]; Libraries of diversified components [11]
Step: DNA synthesis
Tools: BioBrick ligation scheme (Figure 2) [7]
Major issues: The integrity of large DNA target molecules
Proposals: High fidelity construction methods for large DNA molecules [68, 69]; Streamlined sequence design method [48]
Step: In vitro testing
Tools: PCR to infer network structure [93]
Major issues: Emergent properties; Lack of robust statistical tools to analyze networks
Proposals: Strategy to derive interaction networks [106]; Simulation strategies to increase predictability [36]; Increased precision in parts selection and documentation [11]
Step: Chassis loading
Tools: E. coli [56, 75, 100]; S. cerevisiae [93]
Major issues: Crosstalk [10]; Codon mismatch [71]; Largely unmapped gene regulatory networks [95]
Proposals: B. subtilis [91]; Cell-free system [92, 103]; Gene deletions of host genome; Modeling [58]; Orthogonal expression systems [98–101]
Step: In vivo testing
Tools: Fluorescent protein expression [75]
Major issues: Mutations; Apoptosis; Emergent properties
Proposals: Strategy to study subtle mutations [57]; Strategy to derive interaction networks [106]
Key Points
• The main goal of synthetic biology is to start with a set of functions and properties, and build a suitable system out of biological components.
• Component data standards (such as BioBrick) will likely require extensions to account for quantitative performance data, so that network design can become more predictable.
• Data standards for networks and components will likely consolidate in order to increase the accuracy of design simulations and efficiency of collaborations.
• Biological parts development will likely employ the strategy of building quantitatively characterized libraries of diversified components, because these libraries will increase the accuracy of network-level simulations.
• Host interference of synthetic networks might be effectively addressed by gene deletions and the use of orthogonal protein expression systems.
FUNDING
This work was supported in part by the National
Library of Medicine (NLM/NIH) under grant K99
LM009826 and the National Human Genome
Research Institute (NHGRI/NIH) under grants
1R01HG003354 and 1R01HG004836.
References
1. Barrett CL, Kim TY, Kim HU, et al. Systems biology as a foundation for genome-scale synthetic biology. Curr Opin Biotechnol 2006;17:488–92.
2. Lee SK, Chou H, Ham TS, et al. Metabolic engineering of microorganisms for biofuels production: from bugs to synthetic biology to fuels. Curr Opin Biotechnol 2008;19:556–63.
3. Chang MCY, Keasling JD. Production of isoprenoid pharmaceuticals by engineered microbes. Nat Chem Biol 2006;2:674–81.
4. Weber W, Schoenmakers R, Keller B, et al. A synthetic mammalian gene circuit reveals antituberculosis compounds. Proc Natl Acad Sci 2008;105:9994–8.
5. Lu TK, Collins JJ. Engineered bacteriophage targeting gene networks as adjuvants for antibiotic therapy. Proc Natl Acad Sci 2009;106:4629–34.
6. Marguet P, Balagadde F, Tan C, et al. Biology by design: reduction and synthesis of cellular components and behaviour. J R Soc Interface 2007;4:607–23.
7. Knight T. Idempotent Vector Design for Standard Assembly of Biobricks. MIT Synth Biol Wkg Grp 2003;1:1–11. http://hdl.handle.net/1721.1/21168 (23 October 2009, date last accessed).
8. Brown J. The iGEM competition: building with biology. IET Synth Biol 2007;1:3–6.
9. Purnick PEM, Weiss R. The second wave of synthetic biology: from modules to systems. Nat Rev Mol Cell Biol 2009;10:410–22.
10. Matsuoka Y, Ghosh S, Kitano H. Consistent design schematics for biological systems: standardization of representation in biological engineering. J R Soc Interface (Advance online version) 2009, doi: 10.1098/rsif.2009.0046.focus (23 October 2009, date last accessed).
11. Ellis T, Wang X, Collins JJ. Diversity-based, model-guided construction of synthetic gene networks with predicted functions. Nat Biotech 2009;27:465–71.
12. Canton B, Labno A, Endy D. Refinement and standardization of synthetic biological parts and devices. Nat Biotech 2008;26:787–93.
13. Peccoud J, Blauvelt MF, Cai Y, et al. Targeted development of registries of biological parts. PLoS ONE 2008;3:e2671.
14. Participants. PoBoL: provisional BioBrick language. In: Standards and Specifications in Synthetic Biology Workshop April 26–27; Seattle, WA, USA, 2008.
15. Hucka M, Finney A, Sauro HM, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 2003;19:524–31.
16. Finney A, Hucka M. Systems biology markup language: Level 2 and beyond. Biochem Soc Trans 2003;31:1472–3.
17. Hucka M, Finney A, Bornstein BJ, et al. Evolving a lingua franca and associated software infrastructure for computational systems biology: the Systems Biology Markup Language (SBML) project. Syst Biol (Stevenage) 2004;1:41–53.
18. Endler L, Rodriguez N, Juty N, et al. Designing and encoding models for synthetic biology. J R Soc Interface 2009;6:S405–17.
19. Beard DA, Britten R, Cooling MT, et al. CellML metadata standards, associated tools and repositories. Philos Transact A Math Phys Eng Sci 2009;367:1845–67.
20. Lloyd CM, Halstead MD, Nielsen PF. CellML: its future, present and past. Prog Biophys Mol Biol 2004;85:433–50.
21. Novere NL, Finney A, Hucka M, et al. Minimum information requested in the annotation of biochemical models (MIRIAM). Nat Biotech 2005;23:1509–15.
22. Le Novere N, Moodie S, Sorokin A, et al. Systems biology graphical notation: process diagram level 1. Nature Precedings 2008;1–75. http://hdl.handle.net/10101/npre.2008.2320.1 (23 October 2009, date last accessed).
23. Novere NL, Hucka M, Mi H, et al. The systems biology graphical notation. Nat Biotech 2009;27:735–741.
24. Workgroup B. BioPAX – biological pathways exchange language, level 3, release candidate 3 (version 0.92) documentation. BioPAX Workgroup 2007.
25. Ausbrooks R, Buswell S, Carlisle D, et al. Mathematical Markup Language (MathML) Version 2.0, 2nd edn. Edited by (NAG) DC, Patrick Ion (Mathematical Reviews AMS, Robert Miner (Design Science I, Scope) NPP: W3C; REC-MathML2-20031021, 2003. http://dret.net/biblio/reference/mathml2sec (23 October 2009, date last accessed).
26. RDF/XML Syntax Specification (Revised) on World Wide Web URL: http://www.w3.org/TR/rdf-syntax-grammar/ (23 October 2009, date last accessed).
27. Le Novere N, Courtot M, Laibe C. Adding semantics in kinetics models of biochemical pathways. In: Proceedings of the 2nd International Symposium on experimental standard conditions of enzyme characterizations (ESEC 2006) 19–23 March 2006; Beilstein Institute, Frankfurt am Main, Germany, 2007, pp. 137–53. http://www.beilstein-institut.de/escec2006/proceedings/LeNovere/LeNovere.pdf (23 October 2009, date last accessed).
28. Le Novere N. Model storage, exchange and integration. BMC Neurosci 2006;7(Suppl 1):S11.
29. Marchisio MA, Stelling J. Computational design of synthetic gene circuits with composable parts. Bioinformatics 2008;24:1903–10.
30. Bornstein BJ, Keating SM, Jouraku A, et al. LibSBML: an API library for SBML. Bioinformatics 2008;24:880–1.
31. de Silva E, Stumpf MPH. Complex networks and simple models in biology. J R Soc Int 2005;2:419–30.
32. Le Novere N, Bornstein B, Broicher A, et al. BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Res 2006;34:D689–91.
33. Erhard F, Friedel CC, Zimmer R. FERN - a Java framework for stochastic simulation and evaluation of reaction networks. BMC Bioinformatics 2008;9:356.
34. Hower V, Mendes P, Torti FM, et al. A general map of iron metabolism and tissue-specific subnetworks. Mol Biosyst 2009;5:422–43.
35. Calzone L, Gelay A, Zinovyev A, et al. A comprehensive modular map of molecular interactions in RB/E2F pathway. Mol Syst Biol 2008;4:173.
36. Bentley PJ. Methods for improving simulations of biological systems: systemic computation and fractal proteins. J R Soc Int 2009;6:S451–66.
37. Schilstra MJ, Li L, Matthews J, et al. CellML2SBML: conversion of CellML into SBML. Bioinformatics 2006;22:1018–20.
38. Lloyd CM, Lawson JR, Hunter PJ, et al. The CellML model repository. Bioinformatics 2008;24:2122–3.
39. Wimalaratne SM, Halstead MDB, Lloyd CM, et al. Biophysical annotation and representation of CellML models. Bioinformatics 2009;25:2263–70.
40. Berners-Lee T, Fielding R, Masinter L. Uniform Resource Identifier (URI): Generic Syntax. Request For Comments Archive 2005, RFC3986: http://www.ietf.org/rfc/rfc3986.txt (23 October 2009, date last accessed).
41. Laibe C, Le Novere N. MIRIAM resources: tools to generate and resolve robust cross-references in systems biology. BMC Syst Biol 2007;1:58.
42. Drager A, Hassis N, Supper J, et al. SBMLsqueezer: a CellDesigner plug-in to generate kinetic rate equations for biochemical networks. BMC Syst Biol 2008;2:39.
43. Luciano JS. PAX of mind for pathway researchers. Drug Discov Today 2005;10:937–42.
44. Stromback L, Lambrix P. Representations of molecular pathways: an evaluation of SBML, PSI MI and BioPAX. Bioinformatics 2005;21:4401–07.
45. Pedersen M, Phillips A. Towards programming languages for genetic engineering of living cells. J R Soc Interface 2009;6:S437–50.
46. Bray T, Paoli J, Sperberg-McQueen CM, et al. Extensible Markup Language (XML) 1.0. 5th edn. 2008. http://www.w3.org/TR/xml11/ (23 October 2009, date last accessed).
47. Hill AD, Tomshine JR, Weeding EMB, et al. SynBioSS: the synthetic biology modeling suite. Bioinformatics 2008;24:2551–3.
48. Czar MJ, Cai Y, Peccoud J. Writing DNA with GenoCAD. Nucleic Acids Res 2009;37:W40–W47.
49. Garny A, Noble D, Hunter PJ, et al. Cellular Open Resource (COR): current status and future directions. Philos T Roy Soc A: Math Phys Eng Sci 2009;367:1885–905.
50. Cline MS, Smoot M, Cerami E, et al. Integration of biological networks and gene expression data using Cytoscape. Nat Protocols 2007;2:2366–82.
51. Guziolowski C, Bourde A, Moreews F, et al. BioQuali Cytoscape plugin: analysing the global consistency of regulatory networks. BMC Genomics 2009;10:244.
52. Bindea G, Mlecnik B, Hackl H, et al. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 2009;25:1091–3.
53. Clement-Ziza M, Malabat C, Weber C, et al. Genoscape: a Cytoscape plug-in to automate the retrieval and integration of gene expression data and molecular networks. Bioinformatics 2009;25:2617–8.
54. Gao J, Ade AS, Tarcea VG, et al. Integrating and annotating the interactome using the MiMI plugin for cytoscape. Bioinformatics 2009;25:137–8.
55. Funahashi A, Morohashi M, Kitano H, et al. CellDesigner: a process diagram editor for gene-regulatory and biochemical networks. BIOSILICO 2003;1:159–62.
56. Minaeva NI, Gak ER, Zimenkov DV, et al. Dual-In/Out strategy for genes integration into bacterial chromosome: a novel approach to step-by-step construction of plasmid-less marker-less recombinant E. coli strains with predesigned genome structure. BMC Biotechnol 2008;8:63.
57. Loewe L. A framework for evolutionary systems biology. BMC Syst Biol 2009;3:27.
58. Chen BS, Chang CH, Lee HC. Robust synthetic biology design: stochastic game theory approach. Bioinformatics 2009;25:1822–30.
59. Chen B-S, Wu C-H. A systematic design method for robust synthetic biology to satisfy design specifications. BMC Systems Biology 2009;3:66.
60. Banga J. Optimization in computational systems biology. BMC Syst Biol 2008;2:47.
61. Hadlich F, Noack S, Wiechert W. Translating biochemical network models between different kinetic formats. Metab Engineering 2009;11:87–100.
62. Dasika M, Maranas C. OptCircuit: An optimization based method for computational design of genetic circuits. BMC Systems Biology 2008;2:24.
63. Cantone I, Marucci L, Iorio F, et al. A yeast synthetic network for in vivo assessment of reverse-engineering and modeling approaches. Cell 2009;137:172–81.
64. Stumpf MP, Wiuf C, May RM. Subnets of scale-free networks are not scale-free: sampling properties of networks. Proc Natl Acad Sci USA 2005;102:4221–4.
65. Lenaerts T, Ferkinghoff-Borg J, Schymkowitz J, et al. Information theoretical quantification of cooperativity in signalling complexes. BMC Syst Biol 2009;3:9.
66. Lenaerts T, Ferkinghoff-Borg J, Stricher F, et al. Quantifying information transfer by protein domains: analysis of the Fyn SH2 domain structure. BMC Struct Biol 2008;8:43.
67. Greber D, Fussenegger M. Mammalian synthetic biology: engineering of sophisticated gene networks. J Biotechnol 2007;130:329–45.
68. Linshiz G, Yehezkel TB, Kaplan S, et al. Recursive construction of perfect DNA molecules from imperfect oligonucleotides. Mol Syst Biol 2008;4:191.
69. Gibson DG, Benders GA, Andrews-Pfannkoch C, et al. Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science 2008;319:1215–20.
70. Leonard E, Nielsen D, Solomon K, et al. Engineering microbes with synthetic biology frameworks. Trends Biotechnol 2008;26:674–81.
71. Martin CH, Nielsen DR, Solomon KV, et al. Synthetic metabolism: engineering biology at the protein and pathway scales. Chemistry & Biology 2009;16:277–86.
72. Lucks JB, Qi L, Whitaker WR, et al. Toward scalable parts families for predictable design of biological circuits. Curr Opin Microbiol 2008;11:567–73.
73. Welch M, Villalobos A, Gustafsson C, et al. You’re one in a googol: optimizing genes for protein expression. J R Soc Interface 2009;6:S467–76.
74. Suarez M, Jaramillo A. Challenges in the computational design of proteins. J R Soc Interface 2009;6:S477–91.
75. Friedland AE, Lu TK, Wang X, et al. Synthetic gene networks that count. Science 2009;324:1199–202.
76. Tigges M, Marquez-Lago TT, Stelling J, et al. A tunable synthetic mammalian oscillator. Nature 2009;457:309–312.
77. Atsushi O, Mizuo M. An Artificial Aptazyme-Based Riboswitch and its Cascading System in E. coli. ChemBioChem 2008;9:206–9.
78. Stricker J, Cookson S, Bennett MR, et al. A fast, robust and tunable synthetic gene oscillator. Nature 2008;456:516–9.
79. Ham TS, Lee SK, Keasling JD, et al. Design and construction of a double inversion recombination switch for heritable sequential genetic memory. PLoS ONE 2008;3:e2815.
80. Tsai TY-C, Choi YS, Ma W, et al. Robust, Tunable Biological Oscillations from Interlinked Positive and Negative Feedback Loops. Science 2008;321:126–9.
81. Dawid A, Cayrol B, Isambert H. RNA synthetic biology inspired from bacteria: construction of transcription attenuators under antisense regulation. Physical Biology 2009;6:025007.
82. Andrianantoandro E, Basu S, Karig DK, et al. Synthetic biology: new engineering rules for an emerging discipline. Mol Syst Biol 2006;2:2006.0028.
83. Carrera J, Rodrigo G, Jaramillo A. Model-based redesign of global transcription regulation. Nucleic Acids Res 2009;37:e38.
84. Guye P, Weiss R. Customized signaling with reconfigurable protein scaffolds. Nat Biotech 2008;26:526–8.
85. Bashor CJ, Helman NC, Yan S, et al. Using engineered scaffold interactions to reshape MAP kinase pathway signaling dynamics. Science 2008;319:1539–43.
86. Pawson T, Nash P. Assembly of Cell Regulatory Systems Through Protein Interaction Domains. Science 2003;300:445–52.
87. Weber W, Schuetz M, Denervaud N, et al. A synthetic metabolite-based mammalian inter-cell signaling system. Mol Biosyst 2009;5:757–63.
88. Yin P, Choi HMT, Calvert CR, et al. Programming biomolecular self-assembly pathways. Nature 2008;451:318–22.
89. Lu T, Ferry M, Weiss R, et al. A molecular noise generator. Phys Biol 2008;5:036006.
90. Forster AC, Church GM. Synthetic biology projects in vitro. Genome Res 2007;17:1–6.
91. Itaya M, Fujita K, Kuroki A, et al. Bottom-up genome assembly using the Bacillus subtilis genome vector. Nat Meth 2008;5:41–3.
92. Yoshihiro S, Yutetsu K, Bei-Wen Y, et al. Cell-free translation systems for protein engineering. FEBS J 2006;273:4133–40.
93. Shao Z, Zhao H, Zhao H. DNA assembler, an in vivo genetic method for rapid construction of biochemical pathways. Nucleic Acids Res 2009;37:e16.
94. Fraser PD, Enfissi EMA, Bramley PM. Genetic engineering of carotenoid formation in tomato fruit and the potential application of systems and synthetic biology approaches. Arch Biochem Biophys 2009;483:196–204.
95. Keseler IM, Bonavides-Martinez C, Collado-Vides J, et al. EcoCyc: a comprehensive view of Escherichia coli biology. Nucleic Acids Res 2009;37:D464–70.
96. Forster AC, Church GM. Towards synthesis of a minimal cell. Mol Syst Biol 2006;2:45.
97. Moya A, Gil R, Latorre A, et al. Toward minimal bacterial cells: evolution vs. design. FEMS Microbiol Rev 2009;33:225–35.
98. Rackham O, Chin JW. A network of orthogonal ribosome-mRNA pairs. Nat Chem Biol 2005;1:159–66.
99. Rackham O, Chin JW. Synthesizing cellular networks from evolved ribosome-mRNA pairs. Biochem Soc Trans 2006;34:328–9.
100. An W, Chin JW. Synthesis of orthogonal transcription-translation networks. Proc Natl Acad Sci 2009;106:8477–82.
101. Filipovska A, Rackham O. Building a parallel metabolism within the cell. ACS Chem Biol 2008;3:51–63.
102. Kuruma Y, Stano P, Ueda T, et al. A synthetic biology approach to the construction of membrane proteins in semi-synthetic minimal cells. Biochimica et Biophysica Acta (BBA) - Biomembranes 2009;1788:567–74.
103. Jewett MC, Calhoun KA, Voloshin A, et al. An integrated cell-free metabolic platform for protein production and synthetic biology. Mol Syst Biol 2008;4:220.
104. Tong AH, Lesage G, Bader GD, et al. Global mapping of the yeast genetic interaction network. Science 2004;303:808–13.
105. Bader JS, Chaudhuri A, Rothberg JM, et al. Gaining confidence in high-throughput protein interaction networks. Nat Biotechnol 2004;22:78–85.
106. Lappe M, Holm L. Unraveling protein interaction networks with near-optimal efficiency. Nat Biotechnol 2004;22:98–103.