Upload
weismuler
View
218
Download
0
Embed Size (px)
Citation preview
8/4/2019 Back to the Biology in Systems Biology
1/19
Sui Huang
received his MD and PhD from
the University of Zurich and is
an assistant professor atHarvard Medical School. His
research focuses on c omplexity
in multicellularity and cancer.
Keywords: systems biology,
biocomplexity, genetic
networks, intrinsic constraints,adaptation, attractors
Sui Huang,
Department of Surgery,
Childrens Hospital,
Harvard Medical School,
300 Longwood Avenue,
Boston,
MA 02115
USA
Tel: +1 617 919 2393
e-mail: [email protected]
Back to the biology in systems
biology: What can we learnfrom biomolecular networks?Sui HuangDate received (in revised form): 1st December 2003
AbstractGenome-scale molecular networks, including protein interaction and gene regulatory
networks, have taken centre stage in the investigation of the burgeoning disciplines of systems
biology and biocomplexity. What do networks tell us? Some see in networks simply the
comprehensive, detailed description of all cellular pathways, others seek in networks simple,
higher-order qualities that emerge from the collective action of the individual pathways. This
paper discusses networks from an encompassing category of thinking that will hopefully help
readers to bridge the gap between these polarised viewpoints. Systems biology so far has
emphasised the characterisation of large pathway maps. Now one has to ask: where is the
actual biology in systems biology? As structures midway between genome and phenome, and
by serving as an extended genotype or an elementary phenotype, molecular networks open
a new window to the study of evolution and gene function in complex living systems. For the
study of evolution, features in network topology offer a novel starting point for addressing the
old debate on the relative contributions of natural selection versus intrinsic constraints to a
particular trait. To study the function of genes, it is necessary not only to see them in the
context of gene networks, but also to reach beyond describing network topology and to
embrace the global dynamics of networks that will reveal higher-order, collective behaviour of
the interacting genes. This will pave the way to understanding how the complexity of genome-
wide molecular networks collapses to produce a robust whole-cell behaviour that manifests astightly-regulated switching between distinct cell fates the basis for multicellular life.
INTRODUCTION: MINDTHE GAP OF MINDSEver since molecular biology came into
existence as a discipline devoted to
studying the function and functioning of
genes some 40 years ago, scientists have
entertained the idea of gene regulatory
circuits,1,2 and even of genome-scale
genetic networks,3 for understanding how
the activity of genes by influencingeach other ultimately translates into
the phenotypic behaviour of an organism.
In the absence of gene-specific data, the
concepts of genetic networks were
explored mostly at the level of abstract,
generic models, without molecule-
specific information.4 Despite intriguing
insights into the fundamental properties of
the collective behaviour of a large number
of interacting components,3 such work
remained largely unnoticed by
practitioners of molecular biology, who
emphasised the cloning, characterising
and categorising of individual genes.
Molecular biologists both
experimental and computational used
to studying specific, measurable
biochemical details, were not ready for
the abstraction required to handle
networks as an object of study. It was onlythe recent availability of gene-specific
data for genome-scale networks of
molecular interactions, made possible by
high-throughput genomic technologies,
that revived the interest in the idea of
large gene networks.
The arrival of genome-scale
information on networks, ranging from
networks of metabolic reactions5 to gene
regulatory6,7 and protein interaction
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 27 9
8/4/2019 Back to the Biology in Systems Biology
2/19
networks,810 has led to a recent spate of
publications in which the topology
(wiring diagram) of molecular networks
was analysed.1118 When reading the
literature, however, it is important to be
aware of two camps of researchers with
philosophically opposite mindsets and,
hence, disparate motivation. These two
camps can be labelled the globalists (or,
generalists) and the localists (or,
particularists) (Table 1). Historically, a
similar polarisation existed in brain
research, with globalists emphasising the
idea that the entire network of neurons in
the brain collectively determine
behaviour by distributed information
processing, while the localists believedmore in localising a brain function to a
particular anatomical region.19 It is now
known that both were right.
The first wave of work on genome-
wide networks was conducted mostly by
physicists in the globalists perspective, so
that networks were analysed in a more
general sense, ie as an entity in their own
right.11,12,14,15,20 In this globalists
abstraction the identification of individual
genes and their relationships and functions
were not the primary objectives of
analysis. Thus, this early work on
networks failed to attract the attention of
scientists in the localists camp, which
consists largely of mainstream molecular
biologists habituated to characterising
specific genetic pathways one at a time.
There is a sharp, natural, but
unarticulated intellectual disconnect
between the two camps because of the
fundamentally different motivation of the
two groups of scientists. The globalists are
typically interested in deeper principlesof complex systems,2124 they are attracted
to biology as a new source of complexity,
which led to the notion of
biocomplexity.2528 They now seek
molecule-specific data, made possible by
technological advances, to validate their
ideas.
A similar polarisation
between localists and
globalists existed in
brain research
Table 1: Polarity of two points of view in network biology. The table represents a caricature of extreme positions forillustration purposes.
The localist (particularist) view
(Those who see the trees first)
The globalist (generalist) view
(Those who see the forest first)
Level of original focus Gene- and pathway-centric Network-centricNew field created Systems biology BiocomplexityUse of hypothesis Hypothesis at level of individual pathways. No systems-
level hypothesis: research becomes discovery-driven.Example hypotheses:Gene A inhibits Gene B, is required for function X, etc.
Hypothesis at systems level concerning generic designof network and network position of genesExample hypotheses:Hub proteins are importantPower-law architecture favours ordered dynamics
Philosophy System is complicatedProperties of systems lie in the property of thecomponentsComprehensiveness.The whole equals the sum of the parts
System is complexHigher-order system properties emerge fromcollective behaviour of componentsHolismThe whole is different from sum of parts
Practical aims of study To characterise exhaustively the biochemistry of (all)
individual pathways and their functionsTo describe idiosyncrasy
To understand generic aspects of genome-scale
networks as an entity with its higher-order propertiesTo understand universality
Gene identity in models Of primary interest. Specific models with nominal genesand their idiosyncratic properties
Of secondary interest. Models with anonymous genesas generic entities may sometime suffice
Network topology Precise biochemical characterisation and categorising ofphysical and regulatory interactions in specific pathways
Analysis of large scale features, based on globalstatistics of local network features (degree distribution,modularity, clustering)
Network dynamics Detailed modelling of individual small circuits (modules)in separation as a low-dimensional dynamic system
Global dynamics of networkmaps into whole cell behaviour (cell fates)
Function Focus on local cellular functions (eg protein synthesis,vesicle transport, filopodia extension, DNA repair, etc)associated with a specific pathway
Emphasise emergent whole-cell behaviour, such asswitch between discrete cell phenotypes (cell fates)
Typical non-biologist partners Computer scientists, engineers Physicists
2 8 0 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004
Huang
8/4/2019 Back to the Biology in Systems Biology
3/19
By contrast, the localists view is rooted
in classical molecular biology, hence is
shaped by decades of devotion to the
study of individual cellular pathways that
represent to them linear causal
relationships. But the prevalence of
pleiotropy and convergence in cell
signalling, and of crosstalk between
pathways, has led to the increasing
awareness that understanding gene
function requires that one reaches beyond
the narrow focus on individual pathways.
The advent of massively-parallel analysis
and high-throughput genomics and
proteomics technologies29,30 finally led to
the acceptance of the idea that molecular
pathways are just parts embedded in onegenome-scale system,31 leading to the
explicit inception of systems biology.32
Nevertheless, despite considering
networks and quantitative modelling, the
pathway-centric ideology shines through
in this new initiative towards
integration.3337 To the localists,
integration essentially means the
combination of different kinds of large
datasets in order to find correlations to
or at least to facilitate the generation of
hypotheses on relationships between
individual genes, proteins and biologicalfunctions.3740 Thus, the systems biology
approach seeks comprehensive, qualitative
and quantitative descriptions of all
constituent parts, rather than an
understanding of a higher-order quality
that may arise from the collective action
of the individual parts. The implicit
notion is that extension of the existing
ways of analysing individual pathways to
the exhaustive, simultaneous
characterisation of all the molecular
interactions in the genome will suffice forunderstanding how living organisms
work.27,35,41 Concretely, the emphasis on
the analysis of vast amounts of
biochemical detail has led to the
involvement of computer scientists for
developing computational tools that have
become indispensable for representing,
mining and modelling the bewildering
amount and heterogeneity of collected
information.4246 Another natural
extension of pathway-centricism to the
entire system, which has become a major
research focus of systems biology, is to
reconstruct the topology of a network by
systematic experimental identification of
interactions69,33,47 or to reverse engineer
it by inference from measurement of
network dynamics.4851
In summary, whereas globalists are
attracted by the complex, seeking to
understand general principles giving rise
to the whole, and are ready to abstract
away specific details, localists-turned-
systems biologists are more inclined to
attack the complicated, and seek
comprehensiveness of detailed description.
Because of the unique blend in biology ofuniversality and idiosyncrasy, both
approaches are complementary to each
other and will be necessary in spite of a
persisting, unarticulated mental divide
(Table 1).
This paper discusses molecular
networks in a broadly encompassing
category of view, going beyond the
characterisation of network maps, and
addresses the fundamental questions that
investigators of both ideologies studying
living systems will ultimately have to
address: what do networks tell us aboutthe biology of an organism?
FROM NETWORKS INBIOLOGY TO THEBIOLOGY OF NETWORKSAs with the information in DNA and
amino acid sequences, what networks can
tell us about the biology of an organism
can be divided into two broad domains:
evolution and function (Table 2).
While comparative sequence analysis
proved to be a powerful tool fordetermining taxonomic relationships,52
and can shed light on the history and
mechanism of genome evolution,5355 it
remains of limited use for predicting gene
function. Although conservation can
point to functional importance, and
sequence homology (eg catalytic motifs)
allows the prediction of the biochemical
function of a protein, the road from
sequence to gene function and,
Understanding living
systems requires both
detailed description andabstraction
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 1
Back to the biology in systems biology
8/4/2019 Back to the Biology in Systems Biology
4/19
subsequently, to drug development
turned out to be bumpier than was
anticipated in the dawn of the genomicrevolution. It is often overseen that the
regulation and the context of expression
of a gene is part of what defines its
biological function. For example, based
on sequence analysis, one readily finds
that the protein cyclo-oxygenase-2
(COX-2) has the same enzymatic activity
as COX-1, namely, a cyclo-oxygenase in
the synthesis of prostaglandins.56,57 This
information in itself was of little use until
it was established that COX-2 is regulated
differently than COX-1, and thus has a
different biological function.56 Beinginduced in inflammatory tissues, as
opposed to the constitutively active
COX-1, COX-2 obviously serves a
different function and thus provides a
more specific drug target than its non-
regulated paralogue. Thus, what
determines the distinctive biological
functions of COX-1 and COX-2 is
encoded in their regulation, not in their
enzymatic activity.57,58 Furthermore, a
regulatory protein, such as Ras, myc or
NFkB, can have multiple, disparate, if notopposite, functions (eg either cell
proliferation or differentiation/quiescence
or apoptosis), depending on the cellular
context, ie the presence of other proteins,
and the cell state and type.27,59
Hence, despite great efforts spent in
functional annotation,60,61 it is only in
rather exceptional cases that a specific
physiological function can unconditionally
and unambiguously be assigned to a given
protein as a separate entity, defined just by
the coding region of its gene. It is likely
that most proteins, notably regulatoryproteins, exert their biological function as
a member of a molecular network.
What is less obvious is what networks
can tell us about biological evolution
(Table 2). Representing an intermediate
level of information, between the level of
DNA sequence and that of organismal
phenotype, network structures are more
complex than DNA sequences, yet simple
enough for a mathematical representation
using the formalism from graph theory.
Thus, by serving, to all intents and
purposes, as an extended genotype,networks open an entirely new window
on the workings of evolution. Before
discussing what networks can teach about
the evolution of the organism, the author
will firstly summarise a few recent, salient
findings on network topology obtained
from the globalist point of view.
GLOBAL TOPOLOGY OFNETWORKSThe wiring diagram of the entire
network can readily be drawn from dataon relationships between biomolecules,
including metabolic reactions, physical
protein interactions (heterodimeric
complexes) and transcriptional gene
regulation. Network topology
information is available for metabolic
networks in various microorganisms5 and,
more recently, for a genome-wide
proteinprotein interaction network of
the yeast Saccharomyces cerevisiae.810,62,63
Context of gene
expression is part of
gene function
Table 2: Molecular networks as intermediate level of description complementing information from easily formalisablegenotype (gene sequence) and phenotype (organism).
Gene sequence Molecular network Organism
Formalisation Linear array easy to compare
Graph Morphometry, etc difficult to compare
Clues onevolution
Clues onfunction
Simple: relationship from sequencecomparison
Difficult: heuristics of protein domainprediction, comparison
New opportunity:intermediate betweengenotype and phenotype
Difficult: common ancestry orconvergence?
Simple: evident from observationor experiment
2 8 2 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004
Huang
8/4/2019 Back to the Biology in Systems Biology
5/19
Genome-wide gene regulatory network
data are currently available forEscherichia
coli.7 In the terminology of graph theory,
a network consists of nodes (vertices) that
can be connected by links (edges) (Figure
1). Genome-wide networks consist of
thousands of nodes, and thus are referred
to as complex networks,64 by contrast
with the networks with low node
numbers studied in traditional graph
theory. Moreover, an important aspect of
biological networks is that they grow over
time, by contrast with the networks with
constant node numbers seen in traditional
graph theory.65 Thus, biomolecular
networks have a history of experiencing
an increase in the number of nodes.66
What catches the investigators interest is
every topological feature that deviates
from that of a random network, in which
Nnodes are randomly connected by L
links.65 The non-random structures found
in complex networks, then, provide the
starting point for further research into
their origin and functional significance.
To illustrate this, focus on the results
obtained for the most complete
eukaryotic protein interaction network
available, the protein interaction network
ofS. cerevisiae, which is based on genome-
wide data of pair-wise proteinprotein
interactions (Figure 1).810 It is important
to note that the incompleteness and
inaccuracies in the yeast protein
interaction dataset, mostly due to
potentially false-positive links,62,63 do not
essentially affect the global statistics, as
comparisons of various datasets by the
authors in the cited works have shown.
Many questions posed by the globalist can
be addressed with incomplete and not
perfectly accurate datasets, while thelocalist agenda requires precise and
detailed information. One elementary
global feature that was striking for
biologists especially those who think in
categories of isolated pathway modules
named after their putative cellular
functions is that, despite the overall
sparseness of the network connectivity
Figure 1: Global picture of the topology of large (complex) networks. Upper panels: graph ofthe network, displayed using Pajek software (http://vlado.fmf.uni-lj.si/pub/networks/pajek/). A.Computer generated, not growing random network,65 N 500 nodes, mean connectivitydegree ,k. $ 4. Note the dominating giant component (containing 487 connected nodes). B.Computer generated, growing network using the generative algorithm with preferentialattachment of new nodes,20 N 483, ,k. $ 4. (The algorithm necessarily leads to one singlecomponent). C. Yeast protein interaction network consisting of N 2,165 proteins, with,k. $ 4, based on the CORE data set.63 Lower panels: probability density distribution P(k) inloglog plots. The density distribution exhibits a straight line for B, Cindicating a powerlaw(scale-free) distribution
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 3
Back to the biology in systems biology
8/4/2019 Back to the Biology in Systems Biology
6/19
(three to four connections per protein, on
average), the vast majority (.75 per cent)
of the proteins in yeast participate in a
single, large sub-network of connected
nodes, a so-called giant component12
(Figure 1). Although counterintuitive at
first sight, this finding is not unexpected
from a graph theory point of view. Given
some minimal connectivity (for a
growing, random graph, at least k 1/8
links per node on average66), such a giant
component is practically unavoidable in
protein interaction networks (where
average k % 4), and is thus robust in a
fundamental way. Hence, this feature
does not even qualify as a non-random
feature that would point to a particularbiological purposefulness.
By contrast, Barabasi and coworkers
describe an interesting global topology
feature that indeed deviates from that of a
random graph.12,20 If all the nodes iof the
network are treated as one population,
and ki of each node i(ie the numberk of
interaction partners that node ihas) is
measured and displayed as a distribution,
then the occurrence probability P(k) of
each value ofk exhibits, approximately, a
power-law distribution, P(k) $ kg, rather
than as would be the case in randomnetworks a Poisson distribution. The
exponent gis a characteristic constant
(Figure 1). A power-law distribution
implies that the more nodes that are
sampled, the higher is the mean value of k
for the sampled population. Thus, there is
no stable mean that is characteristic for an
ensemble of proteins (nodes), and such
protein interaction networks are called
scale-free. Although it remains debated
as to whether the degree of distribution
precisely follows a power-law, its heavy-tailed nature implies that there are more
proteins with a high-degree k than one
would expect in a random network: the
protein interaction network is biased to
contain more hubs than one would
expect.
Other non-random topological features
that deviate from those of a random graph
have been identified in the yeast
interaction network. For example, hub
proteins tend not to interact with each
other.14 Moreover, the proteins or
metabolites in real networks have a higher
tendency than the nodes in random
networks to be organised into clusters
(cliques) in which the partners of a given
protein are also likely to interact with
each other.13,17 Together with the scale-
free property, these features give rise to
hierarchical modularity, as has been
described for metabolic networks.15 The
yeast protein network also contains a
significant number of proteins that have
low connectivity degrees, yet are central
in that they lie on many of the shortest
paths that connect any pair of proteins.67
Of interest also is the recent study ofnetwork motifs across the network.
Network motifs are subgraphs, ie locally
defined patterns of interaction between a
small number of nodes, such as triangles,
branching structures, etc.6,16,18,68 In gene
regulatory (transcription) networks, links
represent transactivation and are,
therefore, arrows rather than undirected
connections; thus, transcription networks
form directed graphs. Such digraphs exhibit
a higher diversity of network motifs than
the networks from protein interaction
databases, which yield undirectedconnections. Specifically, Shen and
coworkers found that in bacterial gene
regulatory networks, certain motifs of
directed links were significantly over-
represented, such as the feed-forward
loop (Figure 2).18
In view of all of these non-random
features of network topology, two
interdependent questions immediately
come to mind: how do such structures
originate, and how do they affect
biological function?
NETWORKS ANDEVOLUTION:ADAPTATION VERSUSINTRINSIC CONSTRAINTSA fundamental question in biology is why
organisms are the way they are: why does
a given trait exist, how does it arise? It is
obvious that the non-random network
features discussed above can now be
Topology of biological
networks deviate from
that of random
networks
2 8 4 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004
Huang
8/4/2019 Back to the Biology in Systems Biology
7/19
treated as elementary phenotypic traits. In
an encompassing view of evolution, one
can distinguish between two
fundamentally distinct, but not mutually
exclusive, groups of mechanisms that can
contribute to the very existence and
nature of a given trait in living systems.
Because authors use a large variety of
terms with essentially identical or
overlapping core meaning, here they will
simply be called Type I and Type II
mechanisms, to avoid semantic ambiguity
or bias (Table 3).
The Type I mechanism concerns the
(neo-)Darwinian principles ofselection and
adaptation. Traits can (smoothly) vary in a
random fashion due to mutations; andselection (in a given environment) of the
fittest (of the inheritable) variants drives
the features towards a(n) (local) optimum
in terms of functionality. Because of the
underlying connotation of directedness or
finalism (even if the mutations themselves
are not), the adaptation mechanism is in
essence equivalent to the engineers
notion of optimisation of a task by
deliberate design, or the tinkerers trial
and error approach. The vast majority of
biologists think in categories of this
mechanism to explain almost everyfeature. The Type II mechanism
comprises a variety of alternative, ie non-
adaptionist mechanisms, summarised here
under the general term intrinsic
constraints. The mechanisms have in
common the fact that they embody a set
of rules that impose constraints so as to
make particular features unavoidable, ie
intrinsically robust, independent of a notion
of usefulness or self-maintenance in the
face of external perturbation.69,70 This
implies that no external force, likeadaptation or any other notion of
functionality or purposeful design, is
required. Constraint is a creative rather
than inhibitory force, for it is that which
prevents (constrains) a system from
settling down in a random,
undifferentiated, pattern less state. The
intrinsic constraints can be imposed by
elementary physical laws (eg spherical
shape), allometric relationships,71,72
X
A
B
XRE Gene X
XRE Gene X
Gene A ARE
1
2
3
4
GENE DUPLICATION
DIVERGING MUTATIONS
XRE Gene XGene A ARE
Duplicated segment
INSERTION
XRE Gene X
Gene B ARE
Gene A
XREDuplicated gene A,mutated to gene B
Figure 2: Hypothetical mechanism for the generation of a feed-forwardloop in the absence of selection for functionality. 1. Primordial, auto-regulative gene Xencodes a transcription factor that binds to its own cis-regulatory element, X-responsive element (XRE). 2. A second auto-regulative gene A, together with its cis-regulatory region, ARE, (eg, aregulon) is inserted into the 59 region ofX(eg mediated by transposable
elements) in the reverse direction. If XRE is a bidirectional promoterelement, XRE can contribute to transactivation of A. (This is the case inthe arabinose operon, which is part of a feedforward-loop.18) 3.Duplication of the DNA segment Gene A-ARE-XREcreates a new genethat is responsive to both A and X. Due to subsequent diversifyingmutations, Gene A becomes Gene B; ARE is partially deleted because it isredundant. Now factor Xregulates itself and Gene A via XRE, while bothproducts, A and R, regulate gene B. On top of this structural andprocedural scheme, which facilitates the spontaneous creation of a feed-forward loop, natural selection may then do the fine tuning, eg changestrength and modality of regulation by point mutations in the respectivecis or trans regulatory regions
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 5
Back to the biology in systems biology
8/4/2019 Back to the Biology in Systems Biology
8/19
architectural69,7375 (eg symmetry, space
filling) and mechanical principles (eg
forcebalance rules in tensegrity76) and,
importantly, self-organisation processes,
such as symmetry-breaking patterns24 (see
below). Other non-adaptive mechanisms
include historical (phylogenic inertia,
genetic drift, frozen accidents, etc) and
developmental constraints (restrictions in
the manufacturing path).69,70,75 Adistinct school of thinking explicitly
opposite to adaptionism is structuralism,
which stresses the study of structures
independent of selection for
functionality.70,77
Self-organisation is part of complexity
theory, which emerged as a field of study
in its own right in the 1980s24,78,79 but
which still lacks a rigorous definition, its
novelty and necessity is still being
questioned by some scientists. (This paper
will not refer to theories that address thequestion of macroscopic evolution and
the origin of the complexity of systems
itself, which is a different matter.80)
Nevertheless, by uniting various
theoretical concepts aimed at
understanding the origin of ordered, ie
non-random, patterns through self-
organisation, the approach of the complex
system theories has added substantial
momentum to the community of
researchers that emphasise Type II
explanations for particular features. By
providing astonishingly simple
explanations for the inherent robustness of
many counterintuitive patterns, such as
stripes, spirals, fractals, power-law
relationships, etc, which are found in
living as well as non-living systems,24
complexity theory may provide some of
the deeper principles of organisationwhich embody the design constraints and
which structuralists were seeking.
It is important to note that the Type I
and Type II mechanisms are not mutually
exclusive, but are consistent with each
other and can cooperate in some rather
convoluted way. For example, a particular
intrinsically robust structure, such as the
fractal branching patterns of pulmonary
bronchi81 or a network motif in
molecular networks (see below), may
simply have been exploited becausethey happen to be advantageous rather
than created by natural selection.
Variability and selection pressure may not
have been strong enough to mould such
structures from scratch by stepwise
exploration of many alternative, similar
but less functional forms in the
phenotype space. In other words,
functional adaptation may be a by-
product of an inherently robust process
Table 3: Major points of the two mechanisms for explaining the existence of distinct features in biology. Although the twomechanisms are not mutually exclusive, but complementary and perhaps synergistic, they have become polarised meansof explanation among biologists, with the majority today leaning towards Type I
Type I mechanism: Adaptation (natural selection) Type II mechanism: Intrinsic constraints
(Neo)-Darwinian evolution: random mutation generates phenotypicvariation, selection of a distinct feature based on fitness in a givenenvironment
Intrinsic inevitability (robustness) of a feature due to constraints in themechanism of its genesis or design: architecture constraints physical (mechanical, energetic, allometric) constraints self-organisation, pattern formation developmental constraints historical constraints (phylogenetic inertia, genetic drift, etc)
Biology as a historical science Biology can be historical or ahistorical(Local) optimisation of a trait for functionality No optimisation necessary, but can secondarily shape traitNotion of purposefulness is centralGradual, cumulative evolution used to explain unlikely complex trait
No notion of purposefulnessSelf-organisation used to explain unlikely trait
Adaptation is a powerful moulding force that generates traits fromscratch
Adaptation just maintains favourable traits that are inevitable due toconstraints
Natural selection is creative Natural selection is conservativeAdaptation is cause for a trait Adaptation is consequence of a trait
2 8 6 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004
Huang
8/4/2019 Back to the Biology in Systems Biology
9/19
that creates structure based on internal
design principles. Selection then does the
fine-tuning.
Unfortunately, the above dichotomy
a matter of two non-competing and
overlapping explanations becomes a
matter of competing, polarised points of
view. The majority of modern molecular
biologists gravitate towards the adaptionist
perspective and put all faith in the
extraordinary power of natural selection
in the creation and moulding of traits.
This unquestioned belief in natural
selection as the core, irreducible principle
of explanation in biology hinders the
pluralistic endeavour to disentangle and
assess the relative contribution to a trait ofeither mechanism. Even Darwin himself
cautioned us in the introduction to The
Origin of Species: I am convinced that
Natural Selection has been the main, but
not the exclusive, means of
modification.82
EVOLUTION OF NETWORKSTRUCTURES: INTRINSICCONSTRAINTSThe dualistic scheme of explanation can
now be applied to the non-random
architectural features of networks. In thespirit of the prevailing bias towards
adaptionism, the identification of
particular network structures was almost
always automatically followed by the
interpretation that they must reflect some
optimisation of functionality by
evolution.18,83,84 Such statements,
although particularly attractive to the
engineers mind accustomed to
purposeful, optimal design, are dangerous.
When invoking the survival of the fittest
in the absence of rigorous analysis of theactual mechanism of the fitness associated
with a trait and of its history, one can
easily succumb to the inherently self-
referential logic of this statement (if what
is fit survives, then what survives must be
fit). Although pure Darwinian
adaptionism applies in many specific
scenarios, eg in one of its purest forms
the generation of antibiotic resistance, in
most cases both mechanisms appear to
contribute, in a synergistic manner, to the
prevalence of a given trait.
Topological features of networks are
ideal objects of study for determining the
relative contribution of either type of
mechanism to a distinct trait. The
deviation from randomness of a network
structure can be precisely determined and
a mechanism for the structural and
historical constraint formalised. A
network feature represents an elementary
phenotype, but unlike classical
phenotypic traits it does not undergo
ontogenesis, which complicates debates
on evolution. Thus, developmental
constraints do not have to be considered.
From the discussion presented above, itbecomes obvious that before attributing a
feature to adaptation, one needs to
exclude any constraint accounting for the
inherent inevitability of that structure.
The demonstration that a network feature
is significantly more abundant than one
would expect if the network were to have
been generated randomly, by comparing
it with a random graph, does not suffice
to support natural selection. Furthermore,
the demonstration of convergence, ie
similarity of a trait in the absence of
genetic evidence for common ancestry, isoften used as an argument in favour of
adaptation.85 Other than dispelling the
idea of a common ancestry, however,
convergence does not exclude the role of
various kinds of intrinsic constraints in
enforcing a particular feature, but may
instead point precisely to some
universality of the constraining design
rules.
The origin of a network motif needs to
be examined in more detail. The feed-
forward loop has been shown, based on acomparison with the random graphs, to
be a prominent, hence adaptive feature
in the E. coligene regulatory network
(Figure 2).18 This conclusion is
problematic, since real gene regulatory
networks cannot be randomly redrawn
like the nodes and links in a graph. On
the contrary, evolution of the genome
and, hence, of the network that it
encodes, is subject to a series of
Not all traits are
evolved by natural
selection alone
Structural constraints
may have helped shape
network topology
features
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 7
Back to the biology in systems biology
8/4/2019 Back to the Biology in Systems Biology
10/19
mechanistic constraints that may
significantly reduce and bias the space of
the possible random network
architectures represented by the
immaterial, random graphs. Genes are
embodied by a linear physical structure,
the DNA, which rearranges in particular
ways. Genome evolution occurs via a
finite set of physicochemical mechanisms,
including chromosome and tandem gene
duplication, followed by diversification of
the duplicated gene pair due to mutations
of the redundant genes.5355 Moreover,
rearrangements of DNA are caused by
DNA segment inversion or deletion, or
insertion of DNA fragments via
transposable elements, crossovers,etc.54,55,86 These processes, as well as
promoter-generating mutations,87 drive
the rewiring of gene regulatory networks
by shuffling cis- and trans-regulatory
regions. For example, the duplication of a
gene, with subsequent mutations and
acquisition of new properties in the
coding region but not in the promoter
region,53 would create a branching
network motif (one gene activates two
genes). These mechanisms could facilitate
the genesis of particular network motifs
over others, independent of functionality.Figure 2 shows a hypothetical mechanism
that could explain the genesis of a feed-
forward loop18 via only a gene insertion
and a gene duplication event, plus a few
minor mutations.
Mechanistic and historical
considerations help to place the potential
effect of adaptation by selection in the
appropriate light. Unlike the ad hoc
assumption of an effective selective
advantage, the assumption of mechanistic
and historical constraints can be rigorouslytested.75 Analysis of the DNA sequence to
trace back the history of the cis- and trans-
regulatory elements within a specific
network motif, combined with the study
of generative algorithms constrained by
mechanistic considerations of DNA
rearrangement, could reveal the
contribution by Type II mechanisms to a
network structure.
In the case of global structures, like the
power-law degree distribution ofk
discussed above, Barabasi and coworkers
proposed a generative algorithm for
network growth (increase in node
number) that inevitably results in scale-
free networks.20 An essential ingredient of
their algorithm is that genes newly added
to the existing, growing network
preferentially establish a connection to an
already highly connected gene. Although
the model of genesis is cast in generic
terms, one can now examine whether
simulations of the actual physical
mechanism (increase in genome size by
gene duplication, rewiring by mutations,
particular exon shuffling, etc) are
compatible with the generic model.88
Infact, protein sequence comparisons that
allowed the determination of the
evolutionary age of a protein appear to
suggest that the hubs are older and that
preferential attachment might have taken
place during evolution.89 A power-law
architecture has also been found in a large
number of non-biological networks, such
as social and computer networks,90,91
suggesting that adaptive evolution is not
likely to be the creator of this
architecture.
This section on the evolution ofnetwork features concludes by
emphasising again that each of the types
of explanation for a particular structure
adaptation and intrinsic constraints
does not exclude the other. Evolution
will secondarily, by chance, coopt an
intrinsically robust feature that happens to
have desirable properties and maintain it.
For example, a giant component of the
network which is an unavoidable
structure may coincide with functional
advantages: in the case of the metabolicnetwork, it might facilitate global
coordination of energy utilisation upon
change of nutritional source; and in the
case of the regulatory network, global
connectivity allows regulation of whole-
cell behaviour, such as cell fate switches,
as discussed below. Similarly, structural
modularity of networks, for which several
advantages have been attributed in terms
of functionality and evolvability,92 may
Natural selection and
historical/structural
constraints cooperate in
network evolution
2 8 8 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004
Huang
8/4/2019 Back to the Biology in Systems Biology
11/19
simply have been a fortuitous accident of
horizontal (vector-mediated) transfer of
entire clusters of genes. Such groups
of genes could collectively confer a
biochemical function to the host,
allowing it to invade a new metabolic
niche, and hence be maintained as a
module by selection.87 Since the
expression of these genes is coordinated
inter sesuch as to perform a metabolic
task, they necessarily form a network
module. Finally, as proposed below, the
power-law feature of regulatory networks
may have the functional benefit of
generating ordered global network
dynamics. Thus, adaptation is part of the
game, but it may be conservative, or justdoing the fine tuning, rather than
creative.
NETWORK FUNCTION:FROM TOPOLOGY TODYNAMICSMost functional interpretation of
networks has been based on their
topology alone. For example, it has been
suggested that highly connected hubs in
networks are more likely to be essential
genes because they interact with many
proteins. In fact, for the yeast proteomethere is a weak association between
degree k of a protein and the probability
that it is essential.12 The power-law
network has been shown to be generically
more tolerant to random failure of
individual nodes, but vulnerable to
targeted attacks on the hubs.93 This
would be in line with the suggestion that
hub proteins have a slower rate of
evolution.94
Much of the topology-based reasoning
about function rests on the unarticulatedpremise that the molecular network acts
like a communication network in which
some information flows in the links from
node to node. Although this may be
appropriate for metabolic reaction
networks, it certainly does not apply to
networks of regulation, like the protein or
transcription networks, where a link
represents an influence rather than a flow.
To understand functionality, it is,
therefore, necessary to move from the
domain oftopology to that ofdynamics.
Network dynamics come into play when
considering two essential properties of
molecular regulation networks:
(i) A node iin the network (gene or
protein) takes a value, xi(t), which
represents its expression or activation
level, which changes with time in
response to incoming influences from
other nodes j, k, l, . . . , represented by
the links.
(ii) The links of the network represent
influences (rather than flows) and have
a modality (eg inhibition or
stimulation). Together withinteraction rules (eg additive effects, or
some Boolean functions), they
determine how the level xi(t) of a
given protein i(output) is jointly
affected by the values xj,k,l (t) of its
interaction partners j, k, l, etc
(inputs).
The dynamics of the system is the
concerted change in the levels of xi(t) for
all of the nodes iof the network, and can
be represented by the N-dimensional state
vectorS(t) [x1(t), x2(t),. . .
xN(t)] for anetwork with Nnodes. While the
topology is represented by a graph, the
dynamics can be imagined as a more
abstract N-dimensional state space in
which S(t) moves its position with time t
as defined by its components [x1(t), x2(t),
. . . xN(t)] (see Figure 3 for details). The
interactions of the network now impose
restrictions on the movement of the state
vectorS(t): a given state can only move in
certain directions in the state space
because its components xi(t) cannotchange independently but influence each
other. This frustration of the freedom of
movement is where the unfathomable
complexity of large networks collapses. If
the wiring diagram is right, this
restriction will give rise to simple, robust,
higher-order behaviour, as will be seen
below.
Mathematically, the dynamics of the
network can be described and modelled
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 9
Back to the biology in systems biology
8/4/2019 Back to the Biology in Systems Biology
12/19
using a large variety of formalisms that
differ in the level of coarse-graining. For
example, systems of ordinary differential
equations, with continuous values of
concentrations forxi(t) that are subjected
to mass action kinetics (like chemical
reactions), are often used for smaller
circuits consisting of a handful of proteins
or genes.9598 This is already an
abstraction, in particular when used to
model gene regulatory or signal
transduction, since xi(t) represents a
hypothetical average of the stochastically
fluctuating activity of node i, taken over
time and over an ensemble of
heterogeneous cells. At the even more
Figure 3: From topology to dynamics; illustrated for a small circuit and a large network. Toppanel: Small circuit (signalling module) consisting of two proteins A and B, which are mutuallyinhibitory, and promote their own decay. For a wide range of kinetic parameters, this topologygives rise to bistability. The middle diagram shows a phase plane (two-dimensional state space)that represents the dynamics. Each dot represents a state S(t) [xA(t), xB(t)] of the circuit attime t, defined by the expression or activity values x(t) of the two nodes A and B (see text). Theticks emanating from each dot point to the direction in state space in which that respectivestate will move in the next time unit to comply with the interactions (ie, systems equations) ofthe circuit. There are two stable states, Xand Y, to which the ticks point. The right-handdiagram shows an intuitive projection of the state space (along the line p), emphasising thebistability of the system, with states Xand Yas potential wells (attractors), and a hill inbetween representing an unstable region. Bottom panel: Analogous illustration of topology anddynamics, but for a larger network (N 24). Now the state space is N-dimensional; it containsall the network states, defined by S(t) [x1(t), x2(t), . . . xN(t)], and cannot be drawn, however,it can be schematically considered as a landscape in which the pits represent (high-dimensional)attractor states (right-hand diagram). In this projection, every point in the xy plane representsa network state S(t), arranged so as to generate the landscape, where the vertical axis wouldrepresent some state change potential value reflecting the dissimilarity of the state Sxy atposition (x,y) to that of the attractor state that it will eventually reach. Each attractor statecorresponds to a distinct phenotypic cell state, the cell fate. The landscape can be thought of asthe formal and molecular equivalent of Waddingtons epigenetic landscape (see Figure 4)
A B
A
B
low Ahigh B
high Alow B
1
State X
State X
State Y
State Y
2D phase plane
( N-dimensionalstate space )
Projection as potential wells
SMALLCIRCUITS
LARGE
NETWORKS
TOPOLOGY
Schematic projection as3D epigenetic landscape
DYNAMICS PHENOTYPIC BEHAVIOUR
p
p
N = 2
N = 24 attractors = cell fates
2 9 0 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004
Huang
8/4/2019 Back to the Biology in Systems Biology
13/19
abstract end of the spectrum of models are
the discrete networks in which genes are
represented by binary, ON-OFF
switches, with Boolean functions
determining the interaction modalities.
Because of their computational simplicity,
such Boolean networks have been used as
a convenient, tractable caricature for
studying fundamental aspects of the global
behaviour of large networks using
statistical ensembles of networks of
varying topologies.4,22 In fact, Boolean
functions (eg AND, OR, . . .) capture the
link modalities and their cooperation
rather well/although these are, in reality,
embodied by the molecular interactions at
promoters and in protein proteincomplexes.47,99,100
Quantitative modelling of local
modules of signalling and gene regulatory
pathways that consider non-linear aspects
(such as positive and negative feedback
loops) can predict a variety of interesting
behaviours, such as robustness, multi-
stability, oscillation and switch-like
behaviour.101 These emergent qualities
are not evident from study of the
topology, nor can they intuitively be
predicted by the linear upstream
downstream schemes of theexperimentalist (Figure 3). This pathway-
centred modelling, however, predicts the
dynamics only of the components of a
particular signalling module, not of
phenotypic cell behaviour.
CELL FATES ASATTRACTORS INGENOME-SCALENETWORKSA globalist view of the dynamics of the
network may also, therefore, be necessary.Two findings suggest that global network
dynamics may be relevant. First, the
existence of a giant component in the
protein interaction network12 (and
probably also in the gene regulatory
network6) raises the question of whether
there is a collective, higher-order
behaviour that arises from the large
number (N 1,00010,000) of
interdependent genes and proteins, or
whether their activity changes simply
average out and get lost over the large
population of the nodes, so that no
interesting global pattern emerges. In
other words, is the genome-wide
network (or large portions of it) wired in
such a way that it acts coherently as one
entity with a distinct macroscopic
dynamics and, thus, contains additional
information which would not be
accessible in the study of pathway
modules in separation? Secondly, cells in
multicellular organisms exhibit a simple,
coherent whole-cell behaviour which
may precisely reflect a higher-order
dynamics of the global network: the
switching between cell fates. This strictlyregulated, rule-based systems behaviour is
robust and remarkably simple compared
with the complexity of the underlying
molecular network.100
Cells in multicellular organisms are able
to rapidly and reliably integrate a
multitude of simultaneous and often
conflicting signals from their tissue
environment, and respond by selecting
just one of a few discrete cell fates, such as
quiescence, proliferation, migration or
differentiation into distinct cell types.100
Tight regulation of the balance betweencell fates is required for the development
and homeostasis of the tissue. Cell fates
are discrete, mutually exclusive
behavioural types, as Waddington
described back in the 1940s.102 What is
particularly counterintuitive from the
localist, pathway-centred perspective, is
that the same cell fate switch can be
triggered by a broad variety of unrelated
biochemical and physical stimuli, while
the same biochemical stimulus can elicit
entirely different cell fate switches depending on the context.59,100 This
defies the localist notion of modular
pathways that serve a specific cellular
function. The mutual exclusivity of cell
fates is often taken for granted but is a
systems property that requires that the
pathways that have been associated with
particular cell fates be globally
coordinated. This would be consistent
with the function of a giant component in
The dynamics of largenetworks is highly
constrained
Cells undergo almost
discrete transitions
between a few distinct
cell phenotypes
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 29 1
Back to the biology in systems biology
8/4/2019 Back to the Biology in Systems Biology
14/19
the regulatory networks that spans almost
the entire genome (even if this structural
feature is statistically almost unavoidable,
and thus may not have evolved for this
function, as discussed above).
Do the dynamics of large networks in
fact generate robust, simple, macroscopic
behaviour that is manifest as regulated cell
fate switching? To overcome the lack of
specific information on genes and the
precise network topology, yet be able to
study the universal behaviour of large
networks, more than 30 years ago
Kauffman used random Boolean networks
as models to show that, given some
minimal topological constraints (such as
sparse connectivity), the global dynamicsof a network with Ngenes will be far
from chaotic (disordered), but that due
to the constraints imposed by the
particular genegene interactions, it will
inevitably exhibit some ordered collective
behaviour: most states S(t) are not stable,
thus forcing S(t) to move in the state
space until it arrives at a stable state to
which unstable states will be attracted (see
Figure 3 for details).3,22 These attractor
states can now be designated as the cell
fates or, in Kauffmans more general view,
as the various cell types in a multicellularorganism.22,100 In fact, the dynamics of a
network with attractor states naturally
captures the essential properties of cell fate
dynamics, including mutual exclusivity,
robustness and all-or-none transitions
between cell fates in response not to a
single specific instructive signal but to a
large variety of signals.
Although Kauffman used randomly
connected networks (with restrictions
only with regard to connectivity degree
and type of Boolean function), he wasable to show some universal dynamic
properties, such as the existence of robust
attractors.22 Kauffmans pioneering work
receives little attention from modern day
experimental and computational
molecular biologists, however,34,35 who
in their localist point of view emphasise
the description of specific, idiosyncratic
details over abstraction and analysis of
higher-order behaviour. Interestingly, it
turned out that the scale-free network
topology that is now found in real
networks is even more likely than
randomly connected networks to generate
a reasonable, ordered dynamics, in the
sense of Kauffman (ie containing stable,
compact attractors, consistent with cell
fate behaviour).103 Thus, the self-
organised scale-free topology relaxes
Kauffmans restrictions on gene network
topology for producing ordered
behaviour.
The idea that attractors correspond to
distinct, stable phenotypic states is just one
aspect of a more general view. The
genome-wide molecular network and,
hence, the cell do not randomly andindifferently visit all possible network
states S(t). Instead, because of the
interactions between the molecular
constituents, and the particularity of their
wiring diagram, the state space is not
isotropic, but highly sub structured, or
compartmentalised, into stable attractor
states and their basins of attraction. Thus,
the interactions between genes and
proteins are wired such as to impose
restrictions on the dynamics of the
network S(t). The complexity of the
genomic network collapses into simple,robust behaviour with relatively few
stable network states the biological cell
fates. The attractors in the state space
landscape (Figure 3) may thus represent
a molecular and formal explanation for
Waddingtons epigenetic landscape
(Figure 4), which he proposed as an
intuitive metaphor to explain the
discreteness of cell fates more than 60
years ago, without the notion of genes
and networks.
CONCLUSIONIn this paper the author has discussed
disparate but complementary viewpoints
in biology that the burgeoning notion of
system-wide networks has exposed. On
the one hand, a localist view that stresses
the importance of detailed pathway
analysis at the level of nominal genes and
proteins, and that considers a network as
the sum of all the pathways; and on the
2 9 2 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004
Huang
8/4/2019 Back to the Biology in Systems Biology
15/19
other hand, a globalist view that
emphasises the additional emergentinformation harboured by a network as an
entity. Since living systems are both
complicated and complex,27 both
viewpoints are equally important.
Network structures provide an ideal
handle with which to determine the
relative contribution of natural selection
versus intrinsic constraints to particular
traits. Global network dynamics,
however, open a new perspective to cell
fate decisions in multicellular organisms
which will foster understanding ofphenomena such as the multi-potency of
stem cells.
In conclusion, however, it is important
to remember that networks are just a
convenient abstraction, which fail to
capture some essential properties of living
systems. First, they represent a level of
description that is devoid of space and
physicality, such as the subcellular
localisation of the proteins representing
the network nodes, as well as mechanical
forces within cells.76 Secondly, in
multicellular organisms, every cell
contains its own individual network. One
forgets that in a theoretical treatment one
analyses and models a single proxy
network whose behaviour is then
mentally multiplied to billions of identical
replicate cells in order to represent a tissue
with its billions of cells. This
subconscious, amplifying projection is
problematic, since there is no
representative average cell104 because of
molecular noise105107 and other,
epigenetic variations.108 Instead, a
population, even of clonal cells, can
dramatically vary in its network state, asevidenced by the dispersion of behaviour
of the same genes in individual cells
within a population.104,109 Perhaps this
population heterogeneity itself is a desired
feature, because the dispersion of single
cell behaviour establishes robust,
characteristic macroscopic response
dynamics for the tissue as an entity. In this
case, the detailed behaviour of a single
cellular network would not matter, while
sloppiness within the network of
individual cells may even be beneficial for
the behaviour of tissues. Finally, at thetissue level, there is another network: the
cellcell communication network,
mediated by secreted factors, extracellular
matrix and mechanical forces.
Understanding intracellular molecular
network is, therefore, all but a first step
towards this network of networks that
makes multicellular living systems work.
Acknowledgment
This work was funded by the Air Force Office of
Scientific Research (AFOSR/USAF) and thePatterson Trust. The author also wishes to thank
Joy P. Maliackal for helpful discussions on
networks and Donald E. Ingber for his continuous
support.
References
1. Monod, J. and Jacob, F. (1961), Generalconclusions: Teleonomic mechanisms incellular metabolism, growth anddifferentiation, Cold Spring Harbor Symp.Quant. Biol., Vol. 26, pp. 389 401.
Figure 4: Epigenetic landscape proposedby Waddington in the 1940s as a metaphorto capture the observable dynamics of cellfates. The marble represents a cell state.Since Waddington was studyingdevelopment, the overall slope representsthe driving force of development. As themarble rolls downwards, the cell is forced tochoose between discrete fates represented by the valleys. Reprinted fromWaddington (1957), The Strategy of theGenes, Allen and Unwin, London.
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 29 3
Back to the biology in systems biology
8/4/2019 Back to the Biology in Systems Biology
16/19
2. Delbruck, M. (1949), Discussion, in CNRS,Eds, Unites biologiques douees de continuitegenetique, Colloques Internationaux duCentre National de la RechercheScientifique, Paris.
3. Kauffman, S. A. (1969), Metabolic stabilityand epigenesis in randomly constructedgenetic nets, J. Theor. Biol., Vol. 22,pp. 437467.
4. Huang, S. (2001), Genomics, complexityand drug discovery: Insights from Booleannetwork models of cellular regulation,Pharmacogenomics, Vol. 2, pp. 203 222.
5. Overbeek, R., Larsen, N., Pusch, G. D. et al.(2000), WIT: integrated system for high-throughput genome sequence analysis andmetabolic reconstruction, Nucleic Acids Res.,Vol. 28, pp. 123 125.
6. Lee, T. I., Rinaldi, N. J., Robert, F. et al.
(2002), Transcriptional regulatory networksin Saccharomyces cerevisiae, Science, Vol. 298,pp. 799804.
7. Salgado, H., Santos-Zavaleta, A., Gama-Castro, S. et al. (2001), RegulonDB (version3.2): transcriptional regulation and operonorganization in Escherichia coliK-12, NucleicAcids Res., Vol. 29, pp. 72 74.
8. Mewes, H. W., Frishman, D., Guldener, U.et al. (2002), MIPS: A database for genomesand protein sequences, Nucleic Acids Res.,Vol. 30, pp. 3134.
9. Uetz, P., Giot, L., Cagney, G. et al. (2000),A comprehensive analysis of proteinprotein
interactions in Saccharomyces cerevisiae, Nature,Vol. 403, pp. 623627.
10. Ito, T., Chiba, T., Ozawa, R. et al. (2001), Acomprehensive two-hybrid analysis toexplore the yeast protein interactome, Proc.Natl. Acad. Sci. USA, Vol. 98, pp. 45694574.
11. Jeong, H., Tombor, B., Albert, R. et al.(2000), The large-scale organization ofmetabolic networks, Nature, Vol. 407, pp.651654.
12. Jeong, H., Mason, S. P., Barabasi, A.-L. andOltvai, Z. N. (2001), Lethality and centralityin protein networks, Nature, Vol. 411,pp. 4142.
13. Wagner, A. and Fell, D. A. (2001), Thesmall world inside large metabolic networks,Proc. R. Soc. Lond., B, Biol. Sci., Vol. 268,pp. 1803 1810.
14. Maslov, S. and Sneppen, K. (2002),Specificity and stability in topology ofprotein networks, Science, Vol. 296,pp. 910913.
15. Ravasz, E., Somera, A. L., Mongru, D. A.et al. (2002), Hierarchical organization ofmodularity in metabolic networks, Science,Vol. 297, pp. 15511555.
16. Milo, R., Shen-Orr, S., Itzkovitz, S. et al.(2002), Network motifs: Simple buildingblocks of complex networks, Science, Vol.298, pp. 824 827.
17. Wuchty, S. (2002), Interaction and domainnetworks of yeast, Proteomics, Vol. 2,pp. 1715 1723.
18. Shen-Orr, S. S., Milo, R., Mangan, S. andAlon, U. (2002), Network motifs in thetranscriptional regulation network ofEscherichia coli, Nat. Genet., Vol. 31,pp. 6468.
19. Sacks, O. (2002), Forgetting and Neglect inScience, in Silver, R. B., Ed., HiddenHistories of Science, New York Review ofBooks, pp. 141 179.
20. Barabasi, A. L. and Albert, R. (1999),Emergence of scaling in random networks,Science, Vol. 286, pp. 509512.
21. Goodwin, B. (1993), How the LeopardChanged Its Spots: The Evolution ofComplexity, Princeton University Press,Princeton, NJ.
22. Kauffman, S. A. (1993), The Origins ofOrder, Oxford University Press, New York,NY.
23. Kauffman, S. A. (2000), Investigations,Oxford University Press, New York, NY.
24. Ball, P. (1998), The Self-made Tapestry:Pattern Formation in Nature, OxfordUniversity Press, New York, NY.
25. Coffey, D. S. (1998), Self-organization,
complexity and chaos: The new biology formedicine, Nat. Med., Vol. 4, pp. 882 885.
26. Strohman, R. C. (1997), The comingKuhnian revolution in biology, Nat.Biotechnol., Vol. 15, pp. 194200.
27. Huang, S. (2000), The practical problems ofpost-genomic biology, Nat. Biotechnol., Vol.18, pp. 471 472.
28. Oltvai, Z. N. and Barabasi, A. L. (2002),Systems biology. Lifes complexity pyramid,Science, Vol. 298, pp. 763764.
29. Murphy, D. (2002), Gene expression studiesusing microarrays: Principles, problems, andprospects, Adv. Physiol. Educ., Vol. 26, pp.256270.
30. Mollay, M. P. and Witzmann, F. A. (2002),Proteomics: Technologies and applications,Brief. Funct. Genom. Proteom., Vol. 1, pp. 2329.
31. Marcotte, E. M. (2001), The path not taken,Nat. Biotechnol., Vol. 19, pp. 626627.
32. Aebersold, R., Hood, L. E. and Watts, J. D.(2000), Equipping scientists for the newbiology, Nat. Biotechnol., Vol. 18, p. 359.
33. Ideker, T., Thorsson, V., Ranish, J. A. et al.(2002), Integrated genomic and proteomic
2 9 4 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004
Huang
8/4/2019 Back to the Biology in Systems Biology
17/19
analyses of a systematically perturbedmetabolic network, Science, Vol. 292,pp. 929934.
34. Endy, D. and Brent, R. (2001), Modelling
cellular behaviour, Nature, Vol. 409(Suppl.),pp. 391395.
35. Bray, D. (2003), Molecular networks: Thetop-down view, Science, Vol. 301, pp. 18641865.
36. Nurse, P. (2003), Systems biology:Understanding cells, Nature, Vol. 424,p. 883.
37. Stuart, J. M., Segal, E., Koller, D. and Kim.S. K. (2003), A gene-coexpression networkfor global discovery of conserved geneticmodules, Science, Vol. 302, pp. 249255.
38. Tornow, S. and Mewes, H. W. (2003),Functional modules by relating protein
interaction networks and gene expression,Nucleic Acids Res., Vol. 31, pp. 6283 6289.
39. Bader, G. D., Heilbut, A., Andrews, B. et al.(2003), Functional genomics andproteomics: Charting a multidimensionalmap of the yeast cell, Trends Cell. Biol., Vol.13, pp. 344 356.
40. Ge, H., Walhout, A. J. and Vidal, M. (2003),Integrating omic information: a bridgebetween genomics and systems biology,Trends Genet., Vol. 19, pp. 551560.
41. Evans, G. (2000), Designer science and theomic revolution, Nat. Biotechnol., Vol. 18,p. 127.
42. Eisen, M. B., Spellman, P. T., Brown, P. O.and Botstein, D. (1998), Cluster analysis anddisplay of genome-wide expression patterns,Proc. Natl. Acad. Sci. USA, Vol. 95,pp. 1486314868.
43. Steffen, M., Petti, A., Aach, J. et al. (2002),Automated modelling of signal transductionnetworks, BMC Bioinformatics, Vol. 3, pp. 34.
44. Shannon, P., Markiel, A., Ozier, O. et al.(2003), Cytoscape: a software environmentfor integrated models of biomolecularinteraction networks, Genome Res., Vol. 13,pp. 2498 2504.
45. Bader, G. D. and Hogue, C. W. (2000),BIND-a data specification for storing and
describing biomolecular interactions,molecular complexes and pathways,Bioinformatics, Vol. 16, pp. 465 477.
46. Chen, Y. and Xu, D. (2003), Computationalanalyses of high-throughput proteinproteininteraction data, Curr. Protein Pept. Sci., Vol.4, pp. 159 181.
47. Davidson, E. H., Rast, J. P., Oliveri, P. et al.(2002), A genomic regulatory network fordevelopment, Science, Vol. 295, pp. 16691678.
48. Hasty, J., McMillen, D., Isaacs, F. and
Collins, J. J. (2001), Computational studiesof gene regulatory networks: in numeromolecular biology, Nat. Rev. Genet., Vol. 2,pp. 268279.
49. Dhaeseleer, P., Liang, S. and Somogyi, R.(2000), Genetic network inference: Fromco-expression clustering to reverseengineering, Bioinformatics, Vol. 16,pp. 707726.
50. Gardner, T. S., di Bernardo, D., Lorenz, D.and Collins, J. J. (2003), Inferring geneticnetworks and identifying compound mode ofaction via expression profiling, Science, Vol.301, pp. 102105.
51. Tamada, Y., Kim, S., Bannai, H. et al. (2003),Estimating gene networks from geneexpression data by combining Bayesiannetwork model with promoter elementdetection, Bioinformatics, Vol. 19 (Suppl. 2),
pp. II227 II236.52. Tatusov, R. L., Galperin, M. Y., Natale,
D. A. and Koonin, E. V. (2000), The COGdatabase: A tool for genome-scale analysis ofprotein functions and evolution, Nucleic AcidsRes., Vol. 28, pp. 33 36.
53. Lynch, M., Conery, J. S. (2000), Theevolutionary fate and consequences ofduplicate genes, Science, Vol. 290, pp. 11511155.
54. Eichler, E. E. and Sankoff, D. (2003),Structural dynamics of eukaryoticchromosome evolution, Science, Vol. 301,pp. 793797.
55. Long, M., Deutsch, M., Wang, W. et al.(2003), Origin of new genes: Evidence fromexperimental and computational analyses,Genetica, Vol. 118, pp. 171 182.
56. Williams, C. S. and DuBois, R. N. (1996),Prostaglandin endoperoxide synthase: whytwo isoforms?, Am. J. Physiol., Vol. 270,pp. G393 G400.
57. Tanabe, T. and Tohnai, N. (2002),Cyclooxygenase isozymes and their genestructures and expression, Prostaglandins OtherLipid Mediat., Vol. 6869, pp. 95114.
58. Kam, P. C. and See, A. U. (2000), Cyclo-oxygenase isoenzymes: Physiological andpharmacological role, Anaesthesia, Vol. 55,
pp. 442449.
59. Huang, S. and Ingber, D. E. (2000), Shape-dependent control of cell growth,differentiation and apoptosis: Switchingbetween attractors in cell regulatorynetworks, Exp. Cell Res., Vol. 261, pp.91103.
60. Ashburner, M. and Lewis, S. (2002), Onontologies for biologists. The Gene Ontology
untangling the web, Novartis Found Symp.,Vol. 247, pp. 6680.
61. Wu, C. H., Huang, H., Arminski, L. et al.
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 29 5
Back to the biology in systems biology
8/4/2019 Back to the Biology in Systems Biology
18/19
(2002), The protein information resource:An integrated public resource of functionalannotation of proteins, Nucleic Acids Res.,Vol. 30, pp. 3537.
62. von Mering, C., Krause, R., Snel, B. et al.(2002), Comparative assessment of large-scale data sets of protein-protein interactions,Nature, Vol. 417, pp. 399403.
63. Deane, C. M., Salwinski, L., Xenarios, I. andEisenberg, D. (2002), Protein interactions:Two methods for assessment of the reliabilityof throughput observations, Mol. Cell.Proteomics, Vol. 1, pp. 349 356.
64. Strogatz, S. H. (2001), Exploring complexnetworks, Nature, Vol. 410, pp. 268 276.
65. Erdos, P. and Renyi, A. (1959), On randomgraphs, Publ. Math., Vol. 6, p. 290.
66. Callaway, D. S., Hopcroft, J. E., Kleinberg,
J. M. et al. (2001), Are randomly growngraphs really random?, Phys. Rev. E Stat.Nonlin. Soft Matter Phys., Vol. 64, 041902.
67. Joy, M. P., Brock, A. and Huang, S.,manuscript in preparation.
68. Itzkovitz, S., Milo, R., Kashtan, N. et al.(2003), Subgraphs in random networks,Phys. Rev. E Stat. Nonlin. Soft Matter Phys.,Vol. 68, 026127.
69. Gould, S. J. and Lewontin, R. C. (1979),The spandrels of San Marco and thePanglossian paradigm: A critique of theadaptationist programme, Proc. R. Soc. Lond.,B, Biol. Sci., Vol. 205, pp. 581 598.
70. Webster, G. and Goodwin, B. (1981),History and structure in biology, Perspect.Biol. Med., Vol. 25, pp. 39 62.
71. Gould, S. J. (1966), Allometry and size inontogeny and phylogeny, Biol. Rev. Camb.Philos. Soc., Vol. 41, pp. 587 640.
72. West, G. B., Brown, J. H. and Enquist, B. J.(1999), The fourth dimension of life: fractalgeometry and allometric scaling oforganisms, Science, Vol. 284, pp. 16771679.
73. Thompson, D. W. (1942), Growth andForm, Macmillian, New York, NY.
74. Seilacher, A. (1970), Arbeitskonzept zurKonstruktionsmorphologie, Lethaia, Vol. 3,
pp. 393396.
75. Autumn, K., Ryan, M. J. and Wake, D. B.(2002), Integrating historical and mechanisticbiology enhances the study of adaptation,Q. Rev. Biol., Vol. 77, pp. 383408.
76. Ingber, D. E. (2003), Tensegrity I. Cellstructure and hierarchical systems biology,J. Cell Sci., Vol. 116, pp. 1157 1173.
77. Hughes, A. J. and Lambert, D. M. (1984),Functionalism and structuralism, and Waysof Seeing , J. Theor. Biol., Vol. 111,pp. 787800.
78. Mitchell Waldrop, M. (1992), Complexity.The emerging science at the edge of orderand chaos, Simon and Schuster, New York,NY.
79. Bar-Yam, Y. (1997), Dynamics in complexsystems (studies in nonlinearity), PerseusPublishing New York, NY.
80. Carlson, J. M. and Doyle, J. (2002),Complexity and robustness, Proc. Natl. Acad.Sci. USA, Vol. 99(Suppl. 1), pp. 25382545.
81. Weibel, E. R. (1991), Fractal geometry: Adesign principle for living organisms, Am. J.Physiol., Vol. 1, pp. L361 L369.
82. Darwin, C. (1964), On the origin of species:A facsimile of the first edition, HarvardUniversity Press, Cambridge, MA.
83. Alon, U. (2003), Biological networks: Thetinkerer as an engineer, Science, Vol. 301,
pp. 1866 1867.84. Wuchty, S., Oltvai, Z. N. and Barabasi, A. L.
(2003), Evolutionary conservation of motifconstituents in the yeast protein interactionnetwork, Nat. Genet., Vol. 35, pp. 176179.
85. Conant, G. C. and Wagner, A. (2003),Convergent evolution of gene circuits, Nat.Genet. Vol. 34, pp. 264 266.
86. Lawrence, J. G. (2002), Shared strategies ingene organization among prokaryotes andeukaryotes, Cell, Vol. 110, pp. 407 413.
87. Dabizzi, S., Ammannato, S. and Fani, R.(2001), Expression of horizontally transferredgene clusters: Activation by promoter-generating mutations, Res. Microbiol., Vol.152, pp. 539 549.
88. Berg, J., Lassig, M. and Wagner, A. (2003),Structure and evolution of protein networks(in review). Available at http://arxiv.org/abs/cond-mat/0207711.
89. Eisenberg, E. and Levanon, E. Y. (2003),Preferential attachment in the proteinnetwork evolution, Phys. Rev. Lett., Vol. 91,138701.
90. Amaral, L. A., Scala, A., Barthelemy, M. andStanley, H. E. (2000), Classes of small-worldnetworks, Proc. Natl. Acad. Sci. USA, Vol.97, pp. 1114911152.
91. Barabasi, A.-L. (2002), Linked: How
everything is connected to everything andwhat it means, Perseus Publishing New
York, NY.
92. Hartwell, L. H., Hopfield, J. J., Leibler, S.and Murray, A. W. (1999), From molecularto modular cell biology, Nature, Vol.402(Suppl.), pp. C47C52.
93. Albert, R., Jeong, H. and Barabasi, A.-L.(2000), Error and attack tolerance ofcomplex networks, Nature, Vol. 406, pp.378382.
94. Fraser, H. B., Wall, D. P. and Hirsh, A. E.
2 9 6 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004
Huang
8/4/2019 Back to the Biology in Systems Biology
19/19
(2003), A simple dependence betweenprotein evolution rate and the number ofproteinprotein interactions, BMC Evol.Biol., Vol. 23, p. 11.
95. Tyson, J. J., Chen, K. and Novak, B. (2001),Network dynamics and cell physiology, Nat.Rev. Mol. Cell. Biol., Vol. 2, pp. 908 916.
96. Ferrell Jr, J. E. and Machleder, E. M. (1998),The biochemical basis of an all-or-none cellfate switch in xenopus oocytes, Science, Vol.280, pp. 895898.
97. Bhalla, U. S., Ram, P. T. and Iyengar, R.(2002), MAP kinase phosphatase as a locus offlexibility in a mitogen-activated proteinkinase signaling network, Science, Vol. 297,pp. 1018 1023.
98. Hoffmann, A., Levchenko, A., Scott, M. L.and Baltimore, D. (2002), The IkappaB-NF-kappaB signaling module: temporal control
and selective gene activation, Science, Vol.295, pp. 12411245.
99. Dueber, J. E., Yeh, B. J., Chak, K. and Lim,W. A. (2003), Reprogramming control of anallosteric signaling switch through modularrecombination, Science, Vol. 301, pp. 19041908.
100. Huang, S (2002), Regulation of cellularstates in mammalian cells from a genome-wide view, in Collado-Vides, J. andHofestadt, R., Eds, Gene Regulation andMetabolism: Post-Genomic ComputationalApproach, MIT Press, Cambridge, MA.
101. Tyson, J. J., Chen, K. C. and Novak, B.
(2003), Sniffers, buzzers, toggles andblinkers: Dynamics of regulatory andsignaling pathways in the cell, Curr. Opin.Cell Biol., Vol. 15, pp. 221231.
102. Waddington, C. H. (1956), Principles ofEmbryology, Allen and Unwin Ltd, London.
103. Fox, J. J. and Hill, C. C. (2001), Fromtopology to dynamics in biochemicalnetworks, Chaos, Vol. 11, pp. 809 815.
104. Levsky, J. M. and Singer, R. H. (2003),Gene expression and the myth of the averagecell, Trends Cell Biol., Vol. 13, pp. 46.
105. McAdams, H. H. and Arkin, A. (1999), Its anoisy business! Genetic regulation at thenanomolar scale, Trends Genet., Vol. 15, pp.6569.
106. Elowitz, M. B., Levine, A. J., Siggia, E. D.and Swain, P. S. (2002), Stochastic gene
expression in a single cell, Science, Vol. 297,pp. 1183 1186.
107. Blake, W. J., Kaern, M., Cantor, C. R. andCollins, J. J. (2003), Noise in eukaryoticgene expression, Nature, Vol. 422, pp.633637.
108. Spudich, J. L. and Koshland Jr, D. E. (1976),Non-genetic individuality: Chance in thesingle cell, Nature, Vol. 262, pp. 467471.
109. Takasuka, N., White, M. R., Wood, C. D.et al. (1998), Dynamic changes in prolactinpromoter activation in individual livinglactotrophic cells, Endocrinology, Vol. 139,pp. 1361 1368.
& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 29 7
Back to the biology in systems biology