Back to the Biology in Systems Biology

8/4/2019 Back to the Biology in Systems Biology

1/19

Sui Huang

received his MD and PhD from

the University of Zurich and is

an assistant professor atHarvard Medical School. His

research focuses on c omplexity

in multicellularity and cancer.

Keywords: systems biology,

biocomplexity, genetic

networks, intrinsic constraints,adaptation, attractors

Sui Huang,

Department of Surgery,

Childrens Hospital,

Harvard Medical School,

300 Longwood Avenue,

Boston,

MA 02115

USA

Tel: +1 617 919 2393

e-mail: [email protected]

Back to the biology in systems

biology: What can we learnfrom biomolecular networks?Sui HuangDate received (in revised form): 1st December 2003

AbstractGenome-scale molecular networks, including protein interaction and gene regulatory

networks, have taken centre stage in the investigation of the burgeoning disciplines of systems

biology and biocomplexity. What do networks tell us? Some see in networks simply the

comprehensive, detailed description of all cellular pathways, others seek in networks simple,

higher-order qualities that emerge from the collective action of the individual pathways. This

paper discusses networks from an encompassing category of thinking that will hopefully help

readers to bridge the gap between these polarised viewpoints. Systems biology so far has

emphasised the characterisation of large pathway maps. Now one has to ask: where is the

actual biology in systems biology? As structures midway between genome and phenome, and

by serving as an extended genotype or an elementary phenotype, molecular networks open

a new window to the study of evolution and gene function in complex living systems. For the

study of evolution, features in network topology offer a novel starting point for addressing the

old debate on the relative contributions of natural selection versus intrinsic constraints to a

particular trait. To study the function of genes, it is necessary not only to see them in the

context of gene networks, but also to reach beyond describing network topology and to

embrace the global dynamics of networks that will reveal higher-order, collective behaviour of

the interacting genes. This will pave the way to understanding how the complexity of genome-

wide molecular networks collapses to produce a robust whole-cell behaviour that manifests astightly-regulated switching between distinct cell fates the basis for multicellular life.

INTRODUCTION: MINDTHE GAP OF MINDSEver since molecular biology came into

existence as a discipline devoted to

studying the function and functioning of

genes some 40 years ago, scientists have

entertained the idea of gene regulatory

circuits,1,2 and even of genome-scale

genetic networks,3 for understanding how

the activity of genes by influencingeach other ultimately translates into

the phenotypic behaviour of an organism.

In the absence of gene-specific data, the

concepts of genetic networks were

explored mostly at the level of abstract,

generic models, without molecule-

specific information.4 Despite intriguing

insights into the fundamental properties of

the collective behaviour of a large number

of interacting components,3 such work

remained largely unnoticed by

practitioners of molecular biology, who

emphasised the cloning, characterising

and categorising of individual genes.

Molecular biologists both

experimental and computational used

to studying specific, measurable

biochemical details, were not ready for

the abstraction required to handle

networks as an object of study. It was onlythe recent availability of gene-specific

data for genome-scale networks of

molecular interactions, made possible by

high-throughput genomic technologies,

that revived the interest in the idea of

large gene networks.

The arrival of genome-scale

information on networks, ranging from

networks of metabolic reactions5 to gene

regulatory6,7 and protein interaction

& HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 27 9


2/19

networks,810 has led to a recent spate of

publications in which the topology

(wiring diagram) of molecular networks

was analysed.1118 When reading the

literature, however, it is important to be

aware of two camps of researchers with

philosophically opposite mindsets and,

hence, disparate motivation. These two

camps can be labelled the globalists (or,

generalists) and the localists (or,

particularists) (Table 1). Historically, a

similar polarisation existed in brain

research, with globalists emphasising the

idea that the entire network of neurons in

the brain collectively determine

behaviour by distributed information

processing, while the localists believedmore in localising a brain function to a

particular anatomical region.19 It is now

known that both were right.

The first wave of work on genome-

wide networks was conducted mostly by

physicists in the globalists perspective, so

that networks were analysed in a more

general sense, ie as an entity in their own

right.11,12,14,15,20 In this globalists

abstraction the identification of individual

genes and their relationships and functions

were not the primary objectives of

analysis. Thus, this early work on

networks failed to attract the attention of

scientists in the localists camp, which

consists largely of mainstream molecular

biologists habituated to characterising

specific genetic pathways one at a time.

There is a sharp, natural, but

unarticulated intellectual disconnect

between the two camps because of the

fundamentally different motivation of the

two groups of scientists. The globalists are

typically interested in deeper principlesof complex systems,2124 they are attracted

to biology as a new source of complexity,

which led to the notion of

biocomplexity.2528 They now seek

molecule-specific data, made possible by

technological advances, to validate their

ideas.

A similar polarisation

between localists and

globalists existed in

brain research

Table 1: Polarity of two points of view in network biology. The table represents a caricature of extreme positions forillustration purposes.

The localist (particularist) view

(Those who see the trees first)

The globalist (generalist) view

(Those who see the forest first)

Level of original focus Gene- and pathway-centric Network-centricNew field created Systems biology BiocomplexityUse of hypothesis Hypothesis at level of individual pathways. No systems-

level hypothesis: research becomes discovery-driven.Example hypotheses:Gene A inhibits Gene B, is required for function X, etc.

Hypothesis at systems level concerning generic designof network and network position of genesExample hypotheses:Hub proteins are importantPower-law architecture favours ordered dynamics

Philosophy System is complicatedProperties of systems lie in the property of thecomponentsComprehensiveness.The whole equals the sum of the parts

System is complexHigher-order system properties emerge fromcollective behaviour of componentsHolismThe whole is different from sum of parts

Practical aims of study To characterise exhaustively the biochemistry of (all)

individual pathways and their functionsTo describe idiosyncrasy

To understand generic aspects of genome-scale

networks as an entity with its higher-order propertiesTo understand universality

Gene identity in models Of primary interest. Specific models with nominal genesand their idiosyncratic properties

Of secondary interest. Models with anonymous genesas generic entities may sometime suffice

Network topology Precise biochemical characterisation and categorising ofphysical and regulatory interactions in specific pathways

Analysis of large scale features, based on globalstatistics of local network features (degree distribution,modularity, clustering)

Network dynamics Detailed modelling of individual small circuits (modules)in separation as a low-dimensional dynamic system

Global dynamics of networkmaps into whole cell behaviour (cell fates)

Function Focus on local cellular functions (eg protein synthesis,vesicle transport, filopodia extension, DNA repair, etc)associated with a specific pathway

Emphasise emergent whole-cell behaviour, such asswitch between discrete cell phenotypes (cell fates)

Typical non-biologist partners Computer scientists, engineers Physicists

2 8 0 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

Huang


3/19

By contrast, the localists view is rooted

in classical molecular biology, hence is

shaped by decades of devotion to the

study of individual cellular pathways that

represent to them linear causal

relationships. But the prevalence of

pleiotropy and convergence in cell

signalling, and of crosstalk between

pathways, has led to the increasing

awareness that understanding gene

function requires that one reaches beyond

the narrow focus on individual pathways.

The advent of massively-parallel analysis

and high-throughput genomics and

proteomics technologies29,30 finally led to

the acceptance of the idea that molecular

pathways are just parts embedded in onegenome-scale system,31 leading to the

explicit inception of systems biology.32

Nevertheless, despite considering

networks and quantitative modelling, the

pathway-centric ideology shines through

in this new initiative towards

integration.3337 To the localists,

integration essentially means the

combination of different kinds of large

datasets in order to find correlations to

or at least to facilitate the generation of

hypotheses on relationships between

individual genes, proteins and biologicalfunctions.3740 Thus, the systems biology

approach seeks comprehensive, qualitative

and quantitative descriptions of all

constituent parts, rather than an

understanding of a higher-order quality

that may arise from the collective action

of the individual parts. The implicit

notion is that extension of the existing

ways of analysing individual pathways to

the exhaustive, simultaneous

characterisation of all the molecular

interactions in the genome will suffice forunderstanding how living organisms

work.27,35,41 Concretely, the emphasis on

the analysis of vast amounts of

biochemical detail has led to the

involvement of computer scientists for

developing computational tools that have

become indispensable for representing,

mining and modelling the bewildering

amount and heterogeneity of collected

information.4246 Another natural

extension of pathway-centricism to the

entire system, which has become a major

research focus of systems biology, is to

reconstruct the topology of a network by

systematic experimental identification of

interactions69,33,47 or to reverse engineer

it by inference from measurement of

network dynamics.4851

In summary, whereas globalists are

attracted by the complex, seeking to

understand general principles giving rise

to the whole, and are ready to abstract

away specific details, localists-turned-

systems biologists are more inclined to

attack the complicated, and seek

comprehensiveness of detailed description.

Because of the unique blend in biology ofuniversality and idiosyncrasy, both

approaches are complementary to each

other and will be necessary in spite of a

persisting, unarticulated mental divide

(Table 1).

This paper discusses molecular

networks in a broadly encompassing

category of view, going beyond the

characterisation of network maps, and

addresses the fundamental questions that

investigators of both ideologies studying

living systems will ultimately have to

address: what do networks tell us aboutthe biology of an organism?

FROM NETWORKS INBIOLOGY TO THEBIOLOGY OF NETWORKSAs with the information in DNA and

amino acid sequences, what networks can

tell us about the biology of an organism

can be divided into two broad domains:

evolution and function (Table 2).

While comparative sequence analysis

proved to be a powerful tool fordetermining taxonomic relationships,52

and can shed light on the history and

mechanism of genome evolution,5355 it

remains of limited use for predicting gene

function. Although conservation can

point to functional importance, and

sequence homology (eg catalytic motifs)

allows the prediction of the biochemical

function of a protein, the road from

sequence to gene function and,

Understanding living

systems requires both

detailed description andabstraction


Back to the biology in systems biology


4/19

subsequently, to drug development

turned out to be bumpier than was

anticipated in the dawn of the genomicrevolution. It is often overseen that the

regulation and the context of expression

of a gene is part of what defines its

biological function. For example, based

on sequence analysis, one readily finds

that the protein cyclo-oxygenase-2

(COX-2) has the same enzymatic activity

as COX-1, namely, a cyclo-oxygenase in

the synthesis of prostaglandins.56,57 This

information in itself was of little use until

it was established that COX-2 is regulated

differently than COX-1, and thus has a

different biological function.56 Beinginduced in inflammatory tissues, as

opposed to the constitutively active

COX-1, COX-2 obviously serves a

different function and thus provides a

more specific drug target than its non-

regulated paralogue. Thus, what

determines the distinctive biological

functions of COX-1 and COX-2 is

encoded in their regulation, not in their

enzymatic activity.57,58 Furthermore, a

regulatory protein, such as Ras, myc or

NFkB, can have multiple, disparate, if notopposite, functions (eg either cell

proliferation or differentiation/quiescence

or apoptosis), depending on the cellular

context, ie the presence of other proteins,

and the cell state and type.27,59

Hence, despite great efforts spent in

functional annotation,60,61 it is only in

rather exceptional cases that a specific

physiological function can unconditionally

and unambiguously be assigned to a given

protein as a separate entity, defined just by

the coding region of its gene. It is likely

that most proteins, notably regulatoryproteins, exert their biological function as

a member of a molecular network.

What is less obvious is what networks

can tell us about biological evolution

(Table 2). Representing an intermediate

level of information, between the level of

DNA sequence and that of organismal

phenotype, network structures are more

complex than DNA sequences, yet simple

enough for a mathematical representation

using the formalism from graph theory.

Thus, by serving, to all intents and

purposes, as an extended genotype,networks open an entirely new window

on the workings of evolution. Before

discussing what networks can teach about

the evolution of the organism, the author

will firstly summarise a few recent, salient

findings on network topology obtained

from the globalist point of view.

GLOBAL TOPOLOGY OFNETWORKSThe wiring diagram of the entire

network can readily be drawn from dataon relationships between biomolecules,

including metabolic reactions, physical

protein interactions (heterodimeric

complexes) and transcriptional gene

regulation. Network topology

information is available for metabolic

networks in various microorganisms5 and,

more recently, for a genome-wide

proteinprotein interaction network of

the yeast Saccharomyces cerevisiae.810,62,63

Context of gene

expression is part of

gene function

Table 2: Molecular networks as intermediate level of description complementing information from easily formalisablegenotype (gene sequence) and phenotype (organism).

Gene sequence Molecular network Organism

Formalisation Linear array easy to compare

Graph Morphometry, etc difficult to compare

Clues onevolution

Clues onfunction

Simple: relationship from sequencecomparison

Difficult: heuristics of protein domainprediction, comparison

New opportunity:intermediate betweengenotype and phenotype

Difficult: common ancestry orconvergence?

Simple: evident from observationor experiment


Huang


5/19

Genome-wide gene regulatory network

data are currently available forEscherichia

coli.7 In the terminology of graph theory,

a network consists of nodes (vertices) that

can be connected by links (edges) (Figure

1). Genome-wide networks consist of

thousands of nodes, and thus are referred

to as complex networks,64 by contrast

with the networks with low node

numbers studied in traditional graph

theory. Moreover, an important aspect of

biological networks is that they grow over

time, by contrast with the networks with

constant node numbers seen in traditional

graph theory.65 Thus, biomolecular

networks have a history of experiencing

an increase in the number of nodes.66

What catches the investigators interest is

every topological feature that deviates

from that of a random network, in which

Nnodes are randomly connected by L

links.65 The non-random structures found

in complex networks, then, provide the

starting point for further research into

their origin and functional significance.

To illustrate this, focus on the results

obtained for the most complete

eukaryotic protein interaction network

available, the protein interaction network

ofS. cerevisiae, which is based on genome-

wide data of pair-wise proteinprotein

interactions (Figure 1).810 It is important

to note that the incompleteness and

inaccuracies in the yeast protein

interaction dataset, mostly due to

potentially false-positive links,62,63 do not

essentially affect the global statistics, as

comparisons of various datasets by the

authors in the cited works have shown.

Many questions posed by the globalist can

be addressed with incomplete and not

perfectly accurate datasets, while thelocalist agenda requires precise and

detailed information. One elementary

global feature that was striking for

biologists especially those who think in

categories of isolated pathway modules

named after their putative cellular

functions is that, despite the overall

sparseness of the network connectivity

Figure 1: Global picture of the topology of large (complex) networks. Upper panels: graph ofthe network, displayed using Pajek software (http://vlado.fmf.uni-lj.si/pub/networks/pajek/). A.Computer generated, not growing random network,65 N 500 nodes, mean connectivitydegree ,k. $ 4. Note the dominating giant component (containing 487 connected nodes). B.Computer generated, growing network using the generative algorithm with preferentialattachment of new nodes,20 N 483, ,k. $ 4. (The algorithm necessarily leads to one singlecomponent). C. Yeast protein interaction network consisting of N 2,165 proteins, with,k. $ 4, based on the CORE data set.63 Lower panels: probability density distribution P(k) inloglog plots. The density distribution exhibits a straight line for B, Cindicating a powerlaw(scale-free) distribution




6/19

(three to four connections per protein, on

average), the vast majority (.75 per cent)

of the proteins in yeast participate in a

single, large sub-network of connected

nodes, a so-called giant component12

(Figure 1). Although counterintuitive at

first sight, this finding is not unexpected

from a graph theory point of view. Given

some minimal connectivity (for a

growing, random graph, at least k 1/8

links per node on average66), such a giant

component is practically unavoidable in

protein interaction networks (where

average k % 4), and is thus robust in a

fundamental way. Hence, this feature

does not even qualify as a non-random

feature that would point to a particularbiological purposefulness.

By contrast, Barabasi and coworkers

describe an interesting global topology

feature that indeed deviates from that of a

random graph.12,20 If all the nodes iof the

network are treated as one population,

and ki of each node i(ie the numberk of

interaction partners that node ihas) is

measured and displayed as a distribution,

then the occurrence probability P(k) of

each value ofk exhibits, approximately, a

power-law distribution, P(k) $ kg, rather

than as would be the case in randomnetworks a Poisson distribution. The

exponent gis a characteristic constant

(Figure 1). A power-law distribution

implies that the more nodes that are

sampled, the higher is the mean value of k

for the sampled population. Thus, there is

no stable mean that is characteristic for an

ensemble of proteins (nodes), and such

protein interaction networks are called

scale-free. Although it remains debated

as to whether the degree of distribution

precisely follows a power-law, its heavy-tailed nature implies that there are more

proteins with a high-degree k than one

would expect in a random network: the

protein interaction network is biased to

contain more hubs than one would

expect.

Other non-random topological features

that deviate from those of a random graph

have been identified in the yeast

interaction network. For example, hub

proteins tend not to interact with each

other.14 Moreover, the proteins or

metabolites in real networks have a higher

tendency than the nodes in random

networks to be organised into clusters

(cliques) in which the partners of a given

protein are also likely to interact with

each other.13,17 Together with the scale-

free property, these features give rise to

hierarchical modularity, as has been

described for metabolic networks.15 The

yeast protein network also contains a

significant number of proteins that have

low connectivity degrees, yet are central

in that they lie on many of the shortest

paths that connect any pair of proteins.67

Of interest also is the recent study ofnetwork motifs across the network.

Network motifs are subgraphs, ie locally

defined patterns of interaction between a

small number of nodes, such as triangles,

branching structures, etc.6,16,18,68 In gene

regulatory (transcription) networks, links

represent transactivation and are,

therefore, arrows rather than undirected

connections; thus, transcription networks

form directed graphs. Such digraphs exhibit

a higher diversity of network motifs than

the networks from protein interaction

databases, which yield undirectedconnections. Specifically, Shen and

coworkers found that in bacterial gene

regulatory networks, certain motifs of

directed links were significantly over-

represented, such as the feed-forward

loop (Figure 2).18

In view of all of these non-random

features of network topology, two

interdependent questions immediately

come to mind: how do such structures

originate, and how do they affect

biological function?

NETWORKS ANDEVOLUTION:ADAPTATION VERSUSINTRINSIC CONSTRAINTSA fundamental question in biology is why

organisms are the way they are: why does

a given trait exist, how does it arise? It is

obvious that the non-random network

features discussed above can now be

Topology of biological

networks deviate from

that of random

networks


Huang


7/19

treated as elementary phenotypic traits. In

an encompassing view of evolution, one

can distinguish between two

fundamentally distinct, but not mutually

exclusive, groups of mechanisms that can

contribute to the very existence and

nature of a given trait in living systems.

Because authors use a large variety of

terms with essentially identical or

overlapping core meaning, here they will

simply be called Type I and Type II

mechanisms, to avoid semantic ambiguity

or bias (Table 3).

The Type I mechanism concerns the

(neo-)Darwinian principles ofselection and

adaptation. Traits can (smoothly) vary in a

random fashion due to mutations; andselection (in a given environment) of the

fittest (of the inheritable) variants drives

the features towards a(n) (local) optimum

in terms of functionality. Because of the

underlying connotation of directedness or

finalism (even if the mutations themselves

are not), the adaptation mechanism is in

essence equivalent to the engineers

notion of optimisation of a task by

deliberate design, or the tinkerers trial

and error approach. The vast majority of

biologists think in categories of this

mechanism to explain almost everyfeature. The Type II mechanism

comprises a variety of alternative, ie non-

adaptionist mechanisms, summarised here

under the general term intrinsic

constraints. The mechanisms have in

common the fact that they embody a set

of rules that impose constraints so as to

make particular features unavoidable, ie

intrinsically robust, independent of a notion

of usefulness or self-maintenance in the

face of external perturbation.69,70 This

implies that no external force, likeadaptation or any other notion of

functionality or purposeful design, is

required. Constraint is a creative rather

than inhibitory force, for it is that which

prevents (constrains) a system from

settling down in a random,

undifferentiated, pattern less state. The

intrinsic constraints can be imposed by

elementary physical laws (eg spherical

shape), allometric relationships,71,72

X

A

B

XRE Gene X

XRE Gene X

Gene A ARE

1

2

3

4

GENE DUPLICATION

DIVERGING MUTATIONS

XRE Gene XGene A ARE

Duplicated segment

INSERTION

XRE Gene X

Gene B ARE

Gene A

XREDuplicated gene A,mutated to gene B

Figure 2: Hypothetical mechanism for the generation of a feed-forwardloop in the absence of selection for functionality. 1. Primordial, auto-regulative gene Xencodes a transcription factor that binds to its own cis-regulatory element, X-responsive element (XRE). 2. A second auto-regulative gene A, together with its cis-regulatory region, ARE, (eg, aregulon) is inserted into the 59 region ofX(eg mediated by transposable

elements) in the reverse direction. If XRE is a bidirectional promoterelement, XRE can contribute to transactivation of A. (This is the case inthe arabinose operon, which is part of a feedforward-loop.18) 3.Duplication of the DNA segment Gene A-ARE-XREcreates a new genethat is responsive to both A and X. Due to subsequent diversifyingmutations, Gene A becomes Gene B; ARE is partially deleted because it isredundant. Now factor Xregulates itself and Gene A via XRE, while bothproducts, A and R, regulate gene B. On top of this structural andprocedural scheme, which facilitates the spontaneous creation of a feed-forward loop, natural selection may then do the fine tuning, eg changestrength and modality of regulation by point mutations in the respectivecis or trans regulatory regions




8/19

architectural69,7375 (eg symmetry, space

filling) and mechanical principles (eg

forcebalance rules in tensegrity76) and,

importantly, self-organisation processes,

such as symmetry-breaking patterns24 (see

below). Other non-adaptive mechanisms

include historical (phylogenic inertia,

genetic drift, frozen accidents, etc) and

developmental constraints (restrictions in

the manufacturing path).69,70,75 Adistinct school of thinking explicitly

opposite to adaptionism is structuralism,

which stresses the study of structures

independent of selection for

functionality.70,77

Self-organisation is part of complexity

theory, which emerged as a field of study

in its own right in the 1980s24,78,79 but

which still lacks a rigorous definition, its

novelty and necessity is still being

questioned by some scientists. (This paper

will not refer to theories that address thequestion of macroscopic evolution and

the origin of the complexity of systems

itself, which is a different matter.80)

Nevertheless, by uniting various

theoretical concepts aimed at

understanding the origin of ordered, ie

non-random, patterns through self-

organisation, the approach of the complex

system theories has added substantial

momentum to the community of

researchers that emphasise Type II

explanations for particular features. By

providing astonishingly simple

explanations for the inherent robustness of

many counterintuitive patterns, such as

stripes, spirals, fractals, power-law

relationships, etc, which are found in

living as well as non-living systems,24

complexity theory may provide some of

the deeper principles of organisationwhich embody the design constraints and

which structuralists were seeking.

It is important to note that the Type I

and Type II mechanisms are not mutually

exclusive, but are consistent with each

other and can cooperate in some rather

convoluted way. For example, a particular

intrinsically robust structure, such as the

fractal branching patterns of pulmonary

bronchi81 or a network motif in

molecular networks (see below), may

simply have been exploited becausethey happen to be advantageous rather

than created by natural selection.

Variability and selection pressure may not

have been strong enough to mould such

structures from scratch by stepwise

exploration of many alternative, similar

but less functional forms in the

phenotype space. In other words,

functional adaptation may be a by-

product of an inherently robust process

Table 3: Major points of the two mechanisms for explaining the existence of distinct features in biology. Although the twomechanisms are not mutually exclusive, but complementary and perhaps synergistic, they have become polarised meansof explanation among biologists, with the majority today leaning towards Type I

Type I mechanism: Adaptation (natural selection) Type II mechanism: Intrinsic constraints

(Neo)-Darwinian evolution: random mutation generates phenotypicvariation, selection of a distinct feature based on fitness in a givenenvironment

Intrinsic inevitability (robustness) of a feature due to constraints in themechanism of its genesis or design: architecture constraints physical (mechanical, energetic, allometric) constraints self-organisation, pattern formation developmental constraints historical constraints (phylogenetic inertia, genetic drift, etc)

Biology as a historical science Biology can be historical or ahistorical(Local) optimisation of a trait for functionality No optimisation necessary, but can secondarily shape traitNotion of purposefulness is centralGradual, cumulative evolution used to explain unlikely complex trait

No notion of purposefulnessSelf-organisation used to explain unlikely trait

Adaptation is a powerful moulding force that generates traits fromscratch

Adaptation just maintains favourable traits that are inevitable due toconstraints

Natural selection is creative Natural selection is conservativeAdaptation is cause for a trait Adaptation is consequence of a trait


Huang


9/19

that creates structure based on internal

design principles. Selection then does the

fine-tuning.

Unfortunately, the above dichotomy

a matter of two non-competing and

overlapping explanations becomes a

matter of competing, polarised points of

view. The majority of modern molecular

biologists gravitate towards the adaptionist

perspective and put all faith in the

extraordinary power of natural selection

in the creation and moulding of traits.

This unquestioned belief in natural

selection as the core, irreducible principle

of explanation in biology hinders the

pluralistic endeavour to disentangle and

assess the relative contribution to a trait ofeither mechanism. Even Darwin himself

cautioned us in the introduction to The

Origin of Species: I am convinced that

Natural Selection has been the main, but

not the exclusive, means of

modification.82

EVOLUTION OF NETWORKSTRUCTURES: INTRINSICCONSTRAINTSThe dualistic scheme of explanation can

now be applied to the non-random

architectural features of networks. In thespirit of the prevailing bias towards

adaptionism, the identification of

particular network structures was almost

always automatically followed by the

interpretation that they must reflect some

optimisation of functionality by

evolution.18,83,84 Such statements,

although particularly attractive to the

engineers mind accustomed to

purposeful, optimal design, are dangerous.

When invoking the survival of the fittest

in the absence of rigorous analysis of theactual mechanism of the fitness associated

with a trait and of its history, one can

easily succumb to the inherently self-

referential logic of this statement (if what

is fit survives, then what survives must be

fit). Although pure Darwinian

adaptionism applies in many specific

scenarios, eg in one of its purest forms

the generation of antibiotic resistance, in

most cases both mechanisms appear to

contribute, in a synergistic manner, to the

prevalence of a given trait.

Topological features of networks are

ideal objects of study for determining the

relative contribution of either type of

mechanism to a distinct trait. The

deviation from randomness of a network

structure can be precisely determined and

a mechanism for the structural and

historical constraint formalised. A

network feature represents an elementary

phenotype, but unlike classical

phenotypic traits it does not undergo

ontogenesis, which complicates debates

on evolution. Thus, developmental

constraints do not have to be considered.

From the discussion presented above, itbecomes obvious that before attributing a

feature to adaptation, one needs to

exclude any constraint accounting for the

inherent inevitability of that structure.

The demonstration that a network feature

is significantly more abundant than one

would expect if the network were to have

been generated randomly, by comparing

it with a random graph, does not suffice

to support natural selection. Furthermore,

the demonstration of convergence, ie

similarity of a trait in the absence of

genetic evidence for common ancestry, isoften used as an argument in favour of

adaptation.85 Other than dispelling the

idea of a common ancestry, however,

convergence does not exclude the role of

various kinds of intrinsic constraints in

enforcing a particular feature, but may

instead point precisely to some

universality of the constraining design

rules.

The origin of a network motif needs to

be examined in more detail. The feed-

forward loop has been shown, based on acomparison with the random graphs, to

be a prominent, hence adaptive feature

in the E. coligene regulatory network

(Figure 2).18 This conclusion is

problematic, since real gene regulatory

networks cannot be randomly redrawn

like the nodes and links in a graph. On

the contrary, evolution of the genome

and, hence, of the network that it

encodes, is subject to a series of

Not all traits are

evolved by natural

selection alone

Structural constraints

may have helped shape

network topology

features




10/19

mechanistic constraints that may

significantly reduce and bias the space of

the possible random network

architectures represented by the

immaterial, random graphs. Genes are

embodied by a linear physical structure,

the DNA, which rearranges in particular

ways. Genome evolution occurs via a

finite set of physicochemical mechanisms,

including chromosome and tandem gene

duplication, followed by diversification of

the duplicated gene pair due to mutations

of the redundant genes.5355 Moreover,

rearrangements of DNA are caused by

DNA segment inversion or deletion, or

insertion of DNA fragments via

transposable elements, crossovers,etc.54,55,86 These processes, as well as

promoter-generating mutations,87 drive

the rewiring of gene regulatory networks

by shuffling cis- and trans-regulatory

regions. For example, the duplication of a

gene, with subsequent mutations and

acquisition of new properties in the

coding region but not in the promoter

region,53 would create a branching

network motif (one gene activates two

genes). These mechanisms could facilitate

the genesis of particular network motifs

over others, independent of functionality.Figure 2 shows a hypothetical mechanism

that could explain the genesis of a feed-

forward loop18 via only a gene insertion

and a gene duplication event, plus a few

minor mutations.

Mechanistic and historical

considerations help to place the potential

effect of adaptation by selection in the

appropriate light. Unlike the ad hoc

assumption of an effective selective

advantage, the assumption of mechanistic

and historical constraints can be rigorouslytested.75 Analysis of the DNA sequence to

trace back the history of the cis- and trans-

regulatory elements within a specific

network motif, combined with the study

of generative algorithms constrained by

mechanistic considerations of DNA

rearrangement, could reveal the

contribution by Type II mechanisms to a

network structure.

In the case of global structures, like the

power-law degree distribution ofk

discussed above, Barabasi and coworkers

proposed a generative algorithm for

network growth (increase in node

number) that inevitably results in scale-

free networks.20 An essential ingredient of

their algorithm is that genes newly added

to the existing, growing network

preferentially establish a connection to an

already highly connected gene. Although

the model of genesis is cast in generic

terms, one can now examine whether

simulations of the actual physical

mechanism (increase in genome size by

gene duplication, rewiring by mutations,

particular exon shuffling, etc) are

compatible with the generic model.88

Infact, protein sequence comparisons that

allowed the determination of the

evolutionary age of a protein appear to

suggest that the hubs are older and that

preferential attachment might have taken

place during evolution.89 A power-law

architecture has also been found in a large

number of non-biological networks, such

as social and computer networks,90,91

suggesting that adaptive evolution is not

likely to be the creator of this

architecture.

This section on the evolution ofnetwork features concludes by

emphasising again that each of the types

of explanation for a particular structure

adaptation and intrinsic constraints

does not exclude the other. Evolution

will secondarily, by chance, coopt an

intrinsically robust feature that happens to

have desirable properties and maintain it.

For example, a giant component of the

network which is an unavoidable

structure may coincide with functional

advantages: in the case of the metabolicnetwork, it might facilitate global

coordination of energy utilisation upon

change of nutritional source; and in the

case of the regulatory network, global

connectivity allows regulation of whole-

cell behaviour, such as cell fate switches,

as discussed below. Similarly, structural

modularity of networks, for which several

advantages have been attributed in terms

of functionality and evolvability,92 may

Natural selection and

historical/structural

constraints cooperate in

network evolution


Huang


11/19

simply have been a fortuitous accident of

horizontal (vector-mediated) transfer of

entire clusters of genes. Such groups

of genes could collectively confer a

biochemical function to the host,

allowing it to invade a new metabolic

niche, and hence be maintained as a

module by selection.87 Since the

expression of these genes is coordinated

inter sesuch as to perform a metabolic

task, they necessarily form a network

module. Finally, as proposed below, the

power-law feature of regulatory networks

may have the functional benefit of

generating ordered global network

dynamics. Thus, adaptation is part of the

game, but it may be conservative, or justdoing the fine tuning, rather than

creative.

NETWORK FUNCTION:FROM TOPOLOGY TODYNAMICSMost functional interpretation of

networks has been based on their

topology alone. For example, it has been

suggested that highly connected hubs in

networks are more likely to be essential

genes because they interact with many

proteins. In fact, for the yeast proteomethere is a weak association between

degree k of a protein and the probability

that it is essential.12 The power-law

network has been shown to be generically

more tolerant to random failure of

individual nodes, but vulnerable to

targeted attacks on the hubs.93 This

would be in line with the suggestion that

hub proteins have a slower rate of

evolution.94

Much of the topology-based reasoning

about function rests on the unarticulatedpremise that the molecular network acts

like a communication network in which

some information flows in the links from

node to node. Although this may be

appropriate for metabolic reaction

networks, it certainly does not apply to

networks of regulation, like the protein or

transcription networks, where a link

represents an influence rather than a flow.

To understand functionality, it is,

therefore, necessary to move from the

domain oftopology to that ofdynamics.

Network dynamics come into play when

considering two essential properties of

molecular regulation networks:

(i) A node iin the network (gene or

protein) takes a value, xi(t), which

represents its expression or activation

level, which changes with time in

response to incoming influences from

other nodes j, k, l, . . . , represented by

the links.

(ii) The links of the network represent

influences (rather than flows) and have

a modality (eg inhibition or

stimulation). Together withinteraction rules (eg additive effects, or

some Boolean functions), they

determine how the level xi(t) of a

given protein i(output) is jointly

affected by the values xj,k,l (t) of its

interaction partners j, k, l, etc

(inputs).

The dynamics of the system is the

concerted change in the levels of xi(t) for

all of the nodes iof the network, and can

be represented by the N-dimensional state

vectorS(t) [x1(t), x2(t),. . .

xN(t)] for anetwork with Nnodes. While the

topology is represented by a graph, the

dynamics can be imagined as a more

abstract N-dimensional state space in

which S(t) moves its position with time t

as defined by its components [x1(t), x2(t),

. . . xN(t)] (see Figure 3 for details). The

interactions of the network now impose

restrictions on the movement of the state

vectorS(t): a given state can only move in

certain directions in the state space

because its components xi(t) cannotchange independently but influence each

other. This frustration of the freedom of

movement is where the unfathomable

complexity of large networks collapses. If

the wiring diagram is right, this

restriction will give rise to simple, robust,

higher-order behaviour, as will be seen

below.

Mathematically, the dynamics of the

network can be described and modelled




12/19

using a large variety of formalisms that

differ in the level of coarse-graining. For

example, systems of ordinary differential

equations, with continuous values of

concentrations forxi(t) that are subjected

to mass action kinetics (like chemical

reactions), are often used for smaller

circuits consisting of a handful of proteins

or genes.9598 This is already an

abstraction, in particular when used to

model gene regulatory or signal

transduction, since xi(t) represents a

hypothetical average of the stochastically

fluctuating activity of node i, taken over

time and over an ensemble of

heterogeneous cells. At the even more

Figure 3: From topology to dynamics; illustrated for a small circuit and a large network. Toppanel: Small circuit (signalling module) consisting of two proteins A and B, which are mutuallyinhibitory, and promote their own decay. For a wide range of kinetic parameters, this topologygives rise to bistability. The middle diagram shows a phase plane (two-dimensional state space)that represents the dynamics. Each dot represents a state S(t) [xA(t), xB(t)] of the circuit attime t, defined by the expression or activity values x(t) of the two nodes A and B (see text). Theticks emanating from each dot point to the direction in state space in which that respectivestate will move in the next time unit to comply with the interactions (ie, systems equations) ofthe circuit. There are two stable states, Xand Y, to which the ticks point. The right-handdiagram shows an intuitive projection of the state space (along the line p), emphasising thebistability of the system, with states Xand Yas potential wells (attractors), and a hill inbetween representing an unstable region. Bottom panel: Analogous illustration of topology anddynamics, but for a larger network (N 24). Now the state space is N-dimensional; it containsall the network states, defined by S(t) [x1(t), x2(t), . . . xN(t)], and cannot be drawn, however,it can be schematically considered as a landscape in which the pits represent (high-dimensional)attractor states (right-hand diagram). In this projection, every point in the xy plane representsa network state S(t), arranged so as to generate the landscape, where the vertical axis wouldrepresent some state change potential value reflecting the dissimilarity of the state Sxy atposition (x,y) to that of the attractor state that it will eventually reach. Each attractor statecorresponds to a distinct phenotypic cell state, the cell fate. The landscape can be thought of asthe formal and molecular equivalent of Waddingtons epigenetic landscape (see Figure 4)

A B

A

B

low Ahigh B

high Alow B

1

State X

State X

State Y

State Y

2D phase plane

( N-dimensionalstate space )

Projection as potential wells

SMALLCIRCUITS

LARGE

NETWORKS

TOPOLOGY

Schematic projection as3D epigenetic landscape

DYNAMICS PHENOTYPIC BEHAVIOUR

p

p

N = 2

N = 24 attractors = cell fates


Huang


13/19

abstract end of the spectrum of models are

the discrete networks in which genes are

represented by binary, ON-OFF

switches, with Boolean functions

determining the interaction modalities.

Because of their computational simplicity,

such Boolean networks have been used as

a convenient, tractable caricature for

studying fundamental aspects of the global

behaviour of large networks using

statistical ensembles of networks of

varying topologies.4,22 In fact, Boolean

functions (eg AND, OR, . . .) capture the

link modalities and their cooperation

rather well/although these are, in reality,

embodied by the molecular interactions at

promoters and in protein proteincomplexes.47,99,100

Quantitative modelling of local

modules of signalling and gene regulatory

pathways that consider non-linear aspects

(such as positive and negative feedback

loops) can predict a variety of interesting

behaviours, such as robustness, multi-

stability, oscillation and switch-like

behaviour.101 These emergent qualities

are not evident from study of the

topology, nor can they intuitively be

predicted by the linear upstream

downstream schemes of theexperimentalist (Figure 3). This pathway-

centred modelling, however, predicts the

dynamics only of the components of a

particular signalling module, not of

phenotypic cell behaviour.

CELL FATES ASATTRACTORS INGENOME-SCALENETWORKSA globalist view of the dynamics of the

network may also, therefore, be necessary.Two findings suggest that global network

dynamics may be relevant. First, the

existence of a giant component in the

protein interaction network12 (and

probably also in the gene regulatory

network6) raises the question of whether

there is a collective, higher-order

behaviour that arises from the large

number (N 1,00010,000) of

interdependent genes and proteins, or

whether their activity changes simply

average out and get lost over the large

population of the nodes, so that no

interesting global pattern emerges. In

other words, is the genome-wide

network (or large portions of it) wired in

such a way that it acts coherently as one

entity with a distinct macroscopic

dynamics and, thus, contains additional

information which would not be

accessible in the study of pathway

modules in separation? Secondly, cells in

multicellular organisms exhibit a simple,

coherent whole-cell behaviour which

may precisely reflect a higher-order

dynamics of the global network: the

switching between cell fates. This strictlyregulated, rule-based systems behaviour is

robust and remarkably simple compared

with the complexity of the underlying

molecular network.100

Cells in multicellular organisms are able

to rapidly and reliably integrate a

multitude of simultaneous and often

conflicting signals from their tissue

environment, and respond by selecting

just one of a few discrete cell fates, such as

quiescence, proliferation, migration or

differentiation into distinct cell types.100

Tight regulation of the balance betweencell fates is required for the development

and homeostasis of the tissue. Cell fates

are discrete, mutually exclusive

behavioural types, as Waddington

described back in the 1940s.102 What is

particularly counterintuitive from the

localist, pathway-centred perspective, is

that the same cell fate switch can be

triggered by a broad variety of unrelated

biochemical and physical stimuli, while

the same biochemical stimulus can elicit

entirely different cell fate switches depending on the context.59,100 This

defies the localist notion of modular

pathways that serve a specific cellular

function. The mutual exclusivity of cell

fates is often taken for granted but is a

systems property that requires that the

pathways that have been associated with

particular cell fates be globally

coordinated. This would be consistent

with the function of a giant component in

The dynamics of largenetworks is highly

constrained

Cells undergo almost

discrete transitions

between a few distinct

cell phenotypes




14/19

the regulatory networks that spans almost

the entire genome (even if this structural

feature is statistically almost unavoidable,

and thus may not have evolved for this

function, as discussed above).

Do the dynamics of large networks in

fact generate robust, simple, macroscopic

behaviour that is manifest as regulated cell

fate switching? To overcome the lack of

specific information on genes and the

precise network topology, yet be able to

study the universal behaviour of large

networks, more than 30 years ago

Kauffman used random Boolean networks

as models to show that, given some

minimal topological constraints (such as

sparse connectivity), the global dynamicsof a network with Ngenes will be far

from chaotic (disordered), but that due

to the constraints imposed by the

particular genegene interactions, it will

inevitably exhibit some ordered collective

behaviour: most states S(t) are not stable,

thus forcing S(t) to move in the state

space until it arrives at a stable state to

which unstable states will be attracted (see

Figure 3 for details).3,22 These attractor

states can now be designated as the cell

fates or, in Kauffmans more general view,

as the various cell types in a multicellularorganism.22,100 In fact, the dynamics of a

network with attractor states naturally

captures the essential properties of cell fate

dynamics, including mutual exclusivity,

robustness and all-or-none transitions

between cell fates in response not to a

single specific instructive signal but to a

large variety of signals.

Although Kauffman used randomly

connected networks (with restrictions

only with regard to connectivity degree

and type of Boolean function), he wasable to show some universal dynamic

properties, such as the existence of robust

attractors.22 Kauffmans pioneering work

receives little attention from modern day

experimental and computational

molecular biologists, however,34,35 who

in their localist point of view emphasise

the description of specific, idiosyncratic

details over abstraction and analysis of

higher-order behaviour. Interestingly, it

turned out that the scale-free network

topology that is now found in real

networks is even more likely than

randomly connected networks to generate

a reasonable, ordered dynamics, in the

sense of Kauffman (ie containing stable,

compact attractors, consistent with cell

fate behaviour).103 Thus, the self-

organised scale-free topology relaxes

Kauffmans restrictions on gene network

topology for producing ordered

behaviour.

The idea that attractors correspond to

distinct, stable phenotypic states is just one

aspect of a more general view. The

genome-wide molecular network and,

hence, the cell do not randomly andindifferently visit all possible network

states S(t). Instead, because of the

interactions between the molecular

constituents, and the particularity of their

wiring diagram, the state space is not

isotropic, but highly sub structured, or

compartmentalised, into stable attractor

states and their basins of attraction. Thus,

the interactions between genes and

proteins are wired such as to impose

restrictions on the dynamics of the

network S(t). The complexity of the

genomic network collapses into simple,robust behaviour with relatively few

stable network states the biological cell

fates. The attractors in the state space

landscape (Figure 3) may thus represent

a molecular and formal explanation for

Waddingtons epigenetic landscape

(Figure 4), which he proposed as an

intuitive metaphor to explain the

discreteness of cell fates more than 60

years ago, without the notion of genes

and networks.

CONCLUSIONIn this paper the author has discussed

disparate but complementary viewpoints

in biology that the burgeoning notion of

system-wide networks has exposed. On

the one hand, a localist view that stresses

the importance of detailed pathway

analysis at the level of nominal genes and

proteins, and that considers a network as

the sum of all the pathways; and on the


Huang


15/19

other hand, a globalist view that

emphasises the additional emergentinformation harboured by a network as an

entity. Since living systems are both

complicated and complex,27 both

viewpoints are equally important.

Network structures provide an ideal

handle with which to determine the

relative contribution of natural selection

versus intrinsic constraints to particular

traits. Global network dynamics,

however, open a new perspective to cell

fate decisions in multicellular organisms

which will foster understanding ofphenomena such as the multi-potency of

stem cells.

In conclusion, however, it is important

to remember that networks are just a

convenient abstraction, which fail to

capture some essential properties of living

systems. First, they represent a level of

description that is devoid of space and

physicality, such as the subcellular

localisation of the proteins representing

the network nodes, as well as mechanical

forces within cells.76 Secondly, in

multicellular organisms, every cell

contains its own individual network. One

forgets that in a theoretical treatment one

analyses and models a single proxy

network whose behaviour is then

mentally multiplied to billions of identical

replicate cells in order to represent a tissue

with its billions of cells. This

subconscious, amplifying projection is

problematic, since there is no

representative average cell104 because of

molecular noise105107 and other,

epigenetic variations.108 Instead, a

population, even of clonal cells, can

dramatically vary in its network state, asevidenced by the dispersion of behaviour

of the same genes in individual cells

within a population.104,109 Perhaps this

population heterogeneity itself is a desired

feature, because the dispersion of single

cell behaviour establishes robust,

characteristic macroscopic response

dynamics for the tissue as an entity. In this

case, the detailed behaviour of a single

cellular network would not matter, while

sloppiness within the network of

individual cells may even be beneficial for

the behaviour of tissues. Finally, at thetissue level, there is another network: the

cellcell communication network,

mediated by secreted factors, extracellular

matrix and mechanical forces.

Understanding intracellular molecular

network is, therefore, all but a first step

towards this network of networks that

makes multicellular living systems work.

Acknowledgment

This work was funded by the Air Force Office of

Scientific Research (AFOSR/USAF) and thePatterson Trust. The author also wishes to thank

Joy P. Maliackal for helpful discussions on

networks and Donald E. Ingber for his continuous

support.

References

1. Monod, J. and Jacob, F. (1961), Generalconclusions: Teleonomic mechanisms incellular metabolism, growth anddifferentiation, Cold Spring Harbor Symp.Quant. Biol., Vol. 26, pp. 389 401.

Figure 4: Epigenetic landscape proposedby Waddington in the 1940s as a metaphorto capture the observable dynamics of cellfates. The marble represents a cell state.Since Waddington was studyingdevelopment, the overall slope representsthe driving force of development. As themarble rolls downwards, the cell is forced tochoose between discrete fates represented by the valleys. Reprinted fromWaddington (1957), The Strategy of theGenes, Allen and Unwin, London.




16/19

2. Delbruck, M. (1949), Discussion, in CNRS,Eds, Unites biologiques douees de continuitegenetique, Colloques Internationaux duCentre National de la RechercheScientifique, Paris.

3. Kauffman, S. A. (1969), Metabolic stabilityand epigenesis in randomly constructedgenetic nets, J. Theor. Biol., Vol. 22,pp. 437467.

4. Huang, S. (2001), Genomics, complexityand drug discovery: Insights from Booleannetwork models of cellular regulation,Pharmacogenomics, Vol. 2, pp. 203 222.

5. Overbeek, R., Larsen, N., Pusch, G. D. et al.(2000), WIT: integrated system for high-throughput genome sequence analysis andmetabolic reconstruction, Nucleic Acids Res.,Vol. 28, pp. 123 125.

6. Lee, T. I., Rinaldi, N. J., Robert, F. et al.

(2002), Transcriptional regulatory networksin Saccharomyces cerevisiae, Science, Vol. 298,pp. 799804.

7. Salgado, H., Santos-Zavaleta, A., Gama-Castro, S. et al. (2001), RegulonDB (version3.2): transcriptional regulation and operonorganization in Escherichia coliK-12, NucleicAcids Res., Vol. 29, pp. 72 74.

8. Mewes, H. W., Frishman, D., Guldener, U.et al. (2002), MIPS: A database for genomesand protein sequences, Nucleic Acids Res.,Vol. 30, pp. 3134.

9. Uetz, P., Giot, L., Cagney, G. et al. (2000),A comprehensive analysis of proteinprotein

interactions in Saccharomyces cerevisiae, Nature,Vol. 403, pp. 623627.

10. Ito, T., Chiba, T., Ozawa, R. et al. (2001), Acomprehensive two-hybrid analysis toexplore the yeast protein interactome, Proc.Natl. Acad. Sci. USA, Vol. 98, pp. 45694574.

11. Jeong, H., Tombor, B., Albert, R. et al.(2000), The large-scale organization ofmetabolic networks, Nature, Vol. 407, pp.651654.

12. Jeong, H., Mason, S. P., Barabasi, A.-L. andOltvai, Z. N. (2001), Lethality and centralityin protein networks, Nature, Vol. 411,pp. 4142.

13. Wagner, A. and Fell, D. A. (2001), Thesmall world inside large metabolic networks,Proc. R. Soc. Lond., B, Biol. Sci., Vol. 268,pp. 1803 1810.

14. Maslov, S. and Sneppen, K. (2002),Specificity and stability in topology ofprotein networks, Science, Vol. 296,pp. 910913.

15. Ravasz, E., Somera, A. L., Mongru, D. A.et al. (2002), Hierarchical organization ofmodularity in metabolic networks, Science,Vol. 297, pp. 15511555.

16. Milo, R., Shen-Orr, S., Itzkovitz, S. et al.(2002), Network motifs: Simple buildingblocks of complex networks, Science, Vol.298, pp. 824 827.

17. Wuchty, S. (2002), Interaction and domainnetworks of yeast, Proteomics, Vol. 2,pp. 1715 1723.

18. Shen-Orr, S. S., Milo, R., Mangan, S. andAlon, U. (2002), Network motifs in thetranscriptional regulation network ofEscherichia coli, Nat. Genet., Vol. 31,pp. 6468.

19. Sacks, O. (2002), Forgetting and Neglect inScience, in Silver, R. B., Ed., HiddenHistories of Science, New York Review ofBooks, pp. 141 179.

20. Barabasi, A. L. and Albert, R. (1999),Emergence of scaling in random networks,Science, Vol. 286, pp. 509512.

21. Goodwin, B. (1993), How the LeopardChanged Its Spots: The Evolution ofComplexity, Princeton University Press,Princeton, NJ.

22. Kauffman, S. A. (1993), The Origins ofOrder, Oxford University Press, New York,NY.

23. Kauffman, S. A. (2000), Investigations,Oxford University Press, New York, NY.

24. Ball, P. (1998), The Self-made Tapestry:Pattern Formation in Nature, OxfordUniversity Press, New York, NY.

25. Coffey, D. S. (1998), Self-organization,

complexity and chaos: The new biology formedicine, Nat. Med., Vol. 4, pp. 882 885.

26. Strohman, R. C. (1997), The comingKuhnian revolution in biology, Nat.Biotechnol., Vol. 15, pp. 194200.

27. Huang, S. (2000), The practical problems ofpost-genomic biology, Nat. Biotechnol., Vol.18, pp. 471 472.

28. Oltvai, Z. N. and Barabasi, A. L. (2002),Systems biology. Lifes complexity pyramid,Science, Vol. 298, pp. 763764.

29. Murphy, D. (2002), Gene expression studiesusing microarrays: Principles, problems, andprospects, Adv. Physiol. Educ., Vol. 26, pp.256270.

30. Mollay, M. P. and Witzmann, F. A. (2002),Proteomics: Technologies and applications,Brief. Funct. Genom. Proteom., Vol. 1, pp. 2329.

31. Marcotte, E. M. (2001), The path not taken,Nat. Biotechnol., Vol. 19, pp. 626627.

32. Aebersold, R., Hood, L. E. and Watts, J. D.(2000), Equipping scientists for the newbiology, Nat. Biotechnol., Vol. 18, p. 359.

33. Ideker, T., Thorsson, V., Ranish, J. A. et al.(2002), Integrated genomic and proteomic


Huang


17/19

analyses of a systematically perturbedmetabolic network, Science, Vol. 292,pp. 929934.

34. Endy, D. and Brent, R. (2001), Modelling

cellular behaviour, Nature, Vol. 409(Suppl.),pp. 391395.

35. Bray, D. (2003), Molecular networks: Thetop-down view, Science, Vol. 301, pp. 18641865.

36. Nurse, P. (2003), Systems biology:Understanding cells, Nature, Vol. 424,p. 883.

37. Stuart, J. M., Segal, E., Koller, D. and Kim.S. K. (2003), A gene-coexpression networkfor global discovery of conserved geneticmodules, Science, Vol. 302, pp. 249255.

38. Tornow, S. and Mewes, H. W. (2003),Functional modules by relating protein

interaction networks and gene expression,Nucleic Acids Res., Vol. 31, pp. 6283 6289.

39. Bader, G. D., Heilbut, A., Andrews, B. et al.(2003), Functional genomics andproteomics: Charting a multidimensionalmap of the yeast cell, Trends Cell. Biol., Vol.13, pp. 344 356.

40. Ge, H., Walhout, A. J. and Vidal, M. (2003),Integrating omic information: a bridgebetween genomics and systems biology,Trends Genet., Vol. 19, pp. 551560.

41. Evans, G. (2000), Designer science and theomic revolution, Nat. Biotechnol., Vol. 18,p. 127.

42. Eisen, M. B., Spellman, P. T., Brown, P. O.and Botstein, D. (1998), Cluster analysis anddisplay of genome-wide expression patterns,Proc. Natl. Acad. Sci. USA, Vol. 95,pp. 1486314868.

43. Steffen, M., Petti, A., Aach, J. et al. (2002),Automated modelling of signal transductionnetworks, BMC Bioinformatics, Vol. 3, pp. 34.

44. Shannon, P., Markiel, A., Ozier, O. et al.(2003), Cytoscape: a software environmentfor integrated models of biomolecularinteraction networks, Genome Res., Vol. 13,pp. 2498 2504.

45. Bader, G. D. and Hogue, C. W. (2000),BIND-a data specification for storing and

describing biomolecular interactions,molecular complexes and pathways,Bioinformatics, Vol. 16, pp. 465 477.

46. Chen, Y. and Xu, D. (2003), Computationalanalyses of high-throughput proteinproteininteraction data, Curr. Protein Pept. Sci., Vol.4, pp. 159 181.

47. Davidson, E. H., Rast, J. P., Oliveri, P. et al.(2002), A genomic regulatory network fordevelopment, Science, Vol. 295, pp. 16691678.

48. Hasty, J., McMillen, D., Isaacs, F. and

Collins, J. J. (2001), Computational studiesof gene regulatory networks: in numeromolecular biology, Nat. Rev. Genet., Vol. 2,pp. 268279.

49. Dhaeseleer, P., Liang, S. and Somogyi, R.(2000), Genetic network inference: Fromco-expression clustering to reverseengineering, Bioinformatics, Vol. 16,pp. 707726.

50. Gardner, T. S., di Bernardo, D., Lorenz, D.and Collins, J. J. (2003), Inferring geneticnetworks and identifying compound mode ofaction via expression profiling, Science, Vol.301, pp. 102105.

51. Tamada, Y., Kim, S., Bannai, H. et al. (2003),Estimating gene networks from geneexpression data by combining Bayesiannetwork model with promoter elementdetection, Bioinformatics, Vol. 19 (Suppl. 2),

pp. II227 II236.52. Tatusov, R. L., Galperin, M. Y., Natale,

D. A. and Koonin, E. V. (2000), The COGdatabase: A tool for genome-scale analysis ofprotein functions and evolution, Nucleic AcidsRes., Vol. 28, pp. 33 36.

53. Lynch, M., Conery, J. S. (2000), Theevolutionary fate and consequences ofduplicate genes, Science, Vol. 290, pp. 11511155.

54. Eichler, E. E. and Sankoff, D. (2003),Structural dynamics of eukaryoticchromosome evolution, Science, Vol. 301,pp. 793797.

55. Long, M., Deutsch, M., Wang, W. et al.(2003), Origin of new genes: Evidence fromexperimental and computational analyses,Genetica, Vol. 118, pp. 171 182.

56. Williams, C. S. and DuBois, R. N. (1996),Prostaglandin endoperoxide synthase: whytwo isoforms?, Am. J. Physiol., Vol. 270,pp. G393 G400.

57. Tanabe, T. and Tohnai, N. (2002),Cyclooxygenase isozymes and their genestructures and expression, Prostaglandins OtherLipid Mediat., Vol. 6869, pp. 95114.

58. Kam, P. C. and See, A. U. (2000), Cyclo-oxygenase isoenzymes: Physiological andpharmacological role, Anaesthesia, Vol. 55,

pp. 442449.

59. Huang, S. and Ingber, D. E. (2000), Shape-dependent control of cell growth,differentiation and apoptosis: Switchingbetween attractors in cell regulatorynetworks, Exp. Cell Res., Vol. 261, pp.91103.

60. Ashburner, M. and Lewis, S. (2002), Onontologies for biologists. The Gene Ontology

untangling the web, Novartis Found Symp.,Vol. 247, pp. 6680.

61. Wu, C. H., Huang, H., Arminski, L. et al.




18/19

(2002), The protein information resource:An integrated public resource of functionalannotation of proteins, Nucleic Acids Res.,Vol. 30, pp. 3537.

62. von Mering, C., Krause, R., Snel, B. et al.(2002), Comparative assessment of large-scale data sets of protein-protein interactions,Nature, Vol. 417, pp. 399403.

63. Deane, C. M., Salwinski, L., Xenarios, I. andEisenberg, D. (2002), Protein interactions:Two methods for assessment of the reliabilityof throughput observations, Mol. Cell.Proteomics, Vol. 1, pp. 349 356.

64. Strogatz, S. H. (2001), Exploring complexnetworks, Nature, Vol. 410, pp. 268 276.

65. Erdos, P. and Renyi, A. (1959), On randomgraphs, Publ. Math., Vol. 6, p. 290.

66. Callaway, D. S., Hopcroft, J. E., Kleinberg,

J. M. et al. (2001), Are randomly growngraphs really random?, Phys. Rev. E Stat.Nonlin. Soft Matter Phys., Vol. 64, 041902.

67. Joy, M. P., Brock, A. and Huang, S.,manuscript in preparation.

68. Itzkovitz, S., Milo, R., Kashtan, N. et al.(2003), Subgraphs in random networks,Phys. Rev. E Stat. Nonlin. Soft Matter Phys.,Vol. 68, 026127.

69. Gould, S. J. and Lewontin, R. C. (1979),The spandrels of San Marco and thePanglossian paradigm: A critique of theadaptationist programme, Proc. R. Soc. Lond.,B, Biol. Sci., Vol. 205, pp. 581 598.

70. Webster, G. and Goodwin, B. (1981),History and structure in biology, Perspect.Biol. Med., Vol. 25, pp. 39 62.

71. Gould, S. J. (1966), Allometry and size inontogeny and phylogeny, Biol. Rev. Camb.Philos. Soc., Vol. 41, pp. 587 640.

72. West, G. B., Brown, J. H. and Enquist, B. J.(1999), The fourth dimension of life: fractalgeometry and allometric scaling oforganisms, Science, Vol. 284, pp. 16771679.

73. Thompson, D. W. (1942), Growth andForm, Macmillian, New York, NY.

74. Seilacher, A. (1970), Arbeitskonzept zurKonstruktionsmorphologie, Lethaia, Vol. 3,

pp. 393396.

75. Autumn, K., Ryan, M. J. and Wake, D. B.(2002), Integrating historical and mechanisticbiology enhances the study of adaptation,Q. Rev. Biol., Vol. 77, pp. 383408.

76. Ingber, D. E. (2003), Tensegrity I. Cellstructure and hierarchical systems biology,J. Cell Sci., Vol. 116, pp. 1157 1173.

77. Hughes, A. J. and Lambert, D. M. (1984),Functionalism and structuralism, and Waysof Seeing , J. Theor. Biol., Vol. 111,pp. 787800.

78. Mitchell Waldrop, M. (1992), Complexity.The emerging science at the edge of orderand chaos, Simon and Schuster, New York,NY.

79. Bar-Yam, Y. (1997), Dynamics in complexsystems (studies in nonlinearity), PerseusPublishing New York, NY.

80. Carlson, J. M. and Doyle, J. (2002),Complexity and robustness, Proc. Natl. Acad.Sci. USA, Vol. 99(Suppl. 1), pp. 25382545.

81. Weibel, E. R. (1991), Fractal geometry: Adesign principle for living organisms, Am. J.Physiol., Vol. 1, pp. L361 L369.

82. Darwin, C. (1964), On the origin of species:A facsimile of the first edition, HarvardUniversity Press, Cambridge, MA.

83. Alon, U. (2003), Biological networks: Thetinkerer as an engineer, Science, Vol. 301,

pp. 1866 1867.84. Wuchty, S., Oltvai, Z. N. and Barabasi, A. L.

(2003), Evolutionary conservation of motifconstituents in the yeast protein interactionnetwork, Nat. Genet., Vol. 35, pp. 176179.

85. Conant, G. C. and Wagner, A. (2003),Convergent evolution of gene circuits, Nat.Genet. Vol. 34, pp. 264 266.

86. Lawrence, J. G. (2002), Shared strategies ingene organization among prokaryotes andeukaryotes, Cell, Vol. 110, pp. 407 413.

87. Dabizzi, S., Ammannato, S. and Fani, R.(2001), Expression of horizontally transferredgene clusters: Activation by promoter-generating mutations, Res. Microbiol., Vol.152, pp. 539 549.

88. Berg, J., Lassig, M. and Wagner, A. (2003),Structure and evolution of protein networks(in review). Available at http://arxiv.org/abs/cond-mat/0207711.

89. Eisenberg, E. and Levanon, E. Y. (2003),Preferential attachment in the proteinnetwork evolution, Phys. Rev. Lett., Vol. 91,138701.

90. Amaral, L. A., Scala, A., Barthelemy, M. andStanley, H. E. (2000), Classes of small-worldnetworks, Proc. Natl. Acad. Sci. USA, Vol.97, pp. 1114911152.

91. Barabasi, A.-L. (2002), Linked: How

everything is connected to everything andwhat it means, Perseus Publishing New

York, NY.

92. Hartwell, L. H., Hopfield, J. J., Leibler, S.and Murray, A. W. (1999), From molecularto modular cell biology, Nature, Vol.402(Suppl.), pp. C47C52.

93. Albert, R., Jeong, H. and Barabasi, A.-L.(2000), Error and attack tolerance ofcomplex networks, Nature, Vol. 406, pp.378382.

94. Fraser, H. B., Wall, D. P. and Hirsh, A. E.


Huang


19/19

(2003), A simple dependence betweenprotein evolution rate and the number ofproteinprotein interactions, BMC Evol.Biol., Vol. 23, p. 11.

95. Tyson, J. J., Chen, K. and Novak, B. (2001),Network dynamics and cell physiology, Nat.Rev. Mol. Cell. Biol., Vol. 2, pp. 908 916.

96. Ferrell Jr, J. E. and Machleder, E. M. (1998),The biochemical basis of an all-or-none cellfate switch in xenopus oocytes, Science, Vol.280, pp. 895898.

97. Bhalla, U. S., Ram, P. T. and Iyengar, R.(2002), MAP kinase phosphatase as a locus offlexibility in a mitogen-activated proteinkinase signaling network, Science, Vol. 297,pp. 1018 1023.

98. Hoffmann, A., Levchenko, A., Scott, M. L.and Baltimore, D. (2002), The IkappaB-NF-kappaB signaling module: temporal control

and selective gene activation, Science, Vol.295, pp. 12411245.

99. Dueber, J. E., Yeh, B. J., Chak, K. and Lim,W. A. (2003), Reprogramming control of anallosteric signaling switch through modularrecombination, Science, Vol. 301, pp. 19041908.

100. Huang, S (2002), Regulation of cellularstates in mammalian cells from a genome-wide view, in Collado-Vides, J. andHofestadt, R., Eds, Gene Regulation andMetabolism: Post-Genomic ComputationalApproach, MIT Press, Cambridge, MA.

101. Tyson, J. J., Chen, K. C. and Novak, B.

(2003), Sniffers, buzzers, toggles andblinkers: Dynamics of regulatory andsignaling pathways in the cell, Curr. Opin.Cell Biol., Vol. 15, pp. 221231.

102. Waddington, C. H. (1956), Principles ofEmbryology, Allen and Unwin Ltd, London.

103. Fox, J. J. and Hill, C. C. (2001), Fromtopology to dynamics in biochemicalnetworks, Chaos, Vol. 11, pp. 809 815.

104. Levsky, J. M. and Singer, R. H. (2003),Gene expression and the myth of the averagecell, Trends Cell Biol., Vol. 13, pp. 46.

105. McAdams, H. H. and Arkin, A. (1999), Its anoisy business! Genetic regulation at thenanomolar scale, Trends Genet., Vol. 15, pp.6569.

106. Elowitz, M. B., Levine, A. J., Siggia, E. D.and Swain, P. S. (2002), Stochastic gene

expression in a single cell, Science, Vol. 297,pp. 1183 1186.

107. Blake, W. J., Kaern, M., Cantor, C. R. andCollins, J. J. (2003), Noise in eukaryoticgene expression, Nature, Vol. 422, pp.633637.

108. Spudich, J. L. and Koshland Jr, D. E. (1976),Non-genetic individuality: Chance in thesingle cell, Nature, Vol. 262, pp. 467471.

109. Takasuka, N., White, M. R., Wood, C. D.et al. (1998), Dynamic changes in prolactinpromoter activation in individual livinglactotrophic cells, Endocrinology, Vol. 139,pp. 1361 1368.



Documents

Back to the Biology in Systems Biology