Back to the Biology in Systems Biology

Embed Size (px)

Citation preview

  • 8/4/2019 Back to the Biology in Systems Biology

    1/19

    Sui Huang

    received his MD and PhD from

    the University of Zurich and is

    an assistant professor atHarvard Medical School. His

    research focuses on c omplexity

    in multicellularity and cancer.

    Keywords: systems biology,

    biocomplexity, genetic

    networks, intrinsic constraints,adaptation, attractors

    Sui Huang,

    Department of Surgery,

    Childrens Hospital,

    Harvard Medical School,

    300 Longwood Avenue,

    Boston,

    MA 02115

    USA

    Tel: +1 617 919 2393

    e-mail: [email protected]

    Back to the biology in systems

    biology: What can we learnfrom biomolecular networks?Sui HuangDate received (in revised form): 1st December 2003

    AbstractGenome-scale molecular networks, including protein interaction and gene regulatory

    networks, have taken centre stage in the investigation of the burgeoning disciplines of systems

    biology and biocomplexity. What do networks tell us? Some see in networks simply the

    comprehensive, detailed description of all cellular pathways, others seek in networks simple,

    higher-order qualities that emerge from the collective action of the individual pathways. This

    paper discusses networks from an encompassing category of thinking that will hopefully help

    readers to bridge the gap between these polarised viewpoints. Systems biology so far has

    emphasised the characterisation of large pathway maps. Now one has to ask: where is the

    actual biology in systems biology? As structures midway between genome and phenome, and

    by serving as an extended genotype or an elementary phenotype, molecular networks open

    a new window to the study of evolution and gene function in complex living systems. For the

    study of evolution, features in network topology offer a novel starting point for addressing the

    old debate on the relative contributions of natural selection versus intrinsic constraints to a

    particular trait. To study the function of genes, it is necessary not only to see them in the

    context of gene networks, but also to reach beyond describing network topology and to

    embrace the global dynamics of networks that will reveal higher-order, collective behaviour of

    the interacting genes. This will pave the way to understanding how the complexity of genome-

    wide molecular networks collapses to produce a robust whole-cell behaviour that manifests astightly-regulated switching between distinct cell fates the basis for multicellular life.

    INTRODUCTION: MINDTHE GAP OF MINDSEver since molecular biology came into

    existence as a discipline devoted to

    studying the function and functioning of

    genes some 40 years ago, scientists have

    entertained the idea of gene regulatory

    circuits,1,2 and even of genome-scale

    genetic networks,3 for understanding how

    the activity of genes by influencingeach other ultimately translates into

    the phenotypic behaviour of an organism.

    In the absence of gene-specific data, the

    concepts of genetic networks were

    explored mostly at the level of abstract,

    generic models, without molecule-

    specific information.4 Despite intriguing

    insights into the fundamental properties of

    the collective behaviour of a large number

    of interacting components,3 such work

    remained largely unnoticed by

    practitioners of molecular biology, who

    emphasised the cloning, characterising

    and categorising of individual genes.

    Molecular biologists both

    experimental and computational used

    to studying specific, measurable

    biochemical details, were not ready for

    the abstraction required to handle

    networks as an object of study. It was onlythe recent availability of gene-specific

    data for genome-scale networks of

    molecular interactions, made possible by

    high-throughput genomic technologies,

    that revived the interest in the idea of

    large gene networks.

    The arrival of genome-scale

    information on networks, ranging from

    networks of metabolic reactions5 to gene

    regulatory6,7 and protein interaction

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 27 9

  • 8/4/2019 Back to the Biology in Systems Biology

    2/19

    networks,810 has led to a recent spate of

    publications in which the topology

    (wiring diagram) of molecular networks

    was analysed.1118 When reading the

    literature, however, it is important to be

    aware of two camps of researchers with

    philosophically opposite mindsets and,

    hence, disparate motivation. These two

    camps can be labelled the globalists (or,

    generalists) and the localists (or,

    particularists) (Table 1). Historically, a

    similar polarisation existed in brain

    research, with globalists emphasising the

    idea that the entire network of neurons in

    the brain collectively determine

    behaviour by distributed information

    processing, while the localists believedmore in localising a brain function to a

    particular anatomical region.19 It is now

    known that both were right.

    The first wave of work on genome-

    wide networks was conducted mostly by

    physicists in the globalists perspective, so

    that networks were analysed in a more

    general sense, ie as an entity in their own

    right.11,12,14,15,20 In this globalists

    abstraction the identification of individual

    genes and their relationships and functions

    were not the primary objectives of

    analysis. Thus, this early work on

    networks failed to attract the attention of

    scientists in the localists camp, which

    consists largely of mainstream molecular

    biologists habituated to characterising

    specific genetic pathways one at a time.

    There is a sharp, natural, but

    unarticulated intellectual disconnect

    between the two camps because of the

    fundamentally different motivation of the

    two groups of scientists. The globalists are

    typically interested in deeper principlesof complex systems,2124 they are attracted

    to biology as a new source of complexity,

    which led to the notion of

    biocomplexity.2528 They now seek

    molecule-specific data, made possible by

    technological advances, to validate their

    ideas.

    A similar polarisation

    between localists and

    globalists existed in

    brain research

    Table 1: Polarity of two points of view in network biology. The table represents a caricature of extreme positions forillustration purposes.

    The localist (particularist) view

    (Those who see the trees first)

    The globalist (generalist) view

    (Those who see the forest first)

    Level of original focus Gene- and pathway-centric Network-centricNew field created Systems biology BiocomplexityUse of hypothesis Hypothesis at level of individual pathways. No systems-

    level hypothesis: research becomes discovery-driven.Example hypotheses:Gene A inhibits Gene B, is required for function X, etc.

    Hypothesis at systems level concerning generic designof network and network position of genesExample hypotheses:Hub proteins are importantPower-law architecture favours ordered dynamics

    Philosophy System is complicatedProperties of systems lie in the property of thecomponentsComprehensiveness.The whole equals the sum of the parts

    System is complexHigher-order system properties emerge fromcollective behaviour of componentsHolismThe whole is different from sum of parts

    Practical aims of study To characterise exhaustively the biochemistry of (all)

    individual pathways and their functionsTo describe idiosyncrasy

    To understand generic aspects of genome-scale

    networks as an entity with its higher-order propertiesTo understand universality

    Gene identity in models Of primary interest. Specific models with nominal genesand their idiosyncratic properties

    Of secondary interest. Models with anonymous genesas generic entities may sometime suffice

    Network topology Precise biochemical characterisation and categorising ofphysical and regulatory interactions in specific pathways

    Analysis of large scale features, based on globalstatistics of local network features (degree distribution,modularity, clustering)

    Network dynamics Detailed modelling of individual small circuits (modules)in separation as a low-dimensional dynamic system

    Global dynamics of networkmaps into whole cell behaviour (cell fates)

    Function Focus on local cellular functions (eg protein synthesis,vesicle transport, filopodia extension, DNA repair, etc)associated with a specific pathway

    Emphasise emergent whole-cell behaviour, such asswitch between discrete cell phenotypes (cell fates)

    Typical non-biologist partners Computer scientists, engineers Physicists

    2 8 0 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

    Huang

  • 8/4/2019 Back to the Biology in Systems Biology

    3/19

    By contrast, the localists view is rooted

    in classical molecular biology, hence is

    shaped by decades of devotion to the

    study of individual cellular pathways that

    represent to them linear causal

    relationships. But the prevalence of

    pleiotropy and convergence in cell

    signalling, and of crosstalk between

    pathways, has led to the increasing

    awareness that understanding gene

    function requires that one reaches beyond

    the narrow focus on individual pathways.

    The advent of massively-parallel analysis

    and high-throughput genomics and

    proteomics technologies29,30 finally led to

    the acceptance of the idea that molecular

    pathways are just parts embedded in onegenome-scale system,31 leading to the

    explicit inception of systems biology.32

    Nevertheless, despite considering

    networks and quantitative modelling, the

    pathway-centric ideology shines through

    in this new initiative towards

    integration.3337 To the localists,

    integration essentially means the

    combination of different kinds of large

    datasets in order to find correlations to

    or at least to facilitate the generation of

    hypotheses on relationships between

    individual genes, proteins and biologicalfunctions.3740 Thus, the systems biology

    approach seeks comprehensive, qualitative

    and quantitative descriptions of all

    constituent parts, rather than an

    understanding of a higher-order quality

    that may arise from the collective action

    of the individual parts. The implicit

    notion is that extension of the existing

    ways of analysing individual pathways to

    the exhaustive, simultaneous

    characterisation of all the molecular

    interactions in the genome will suffice forunderstanding how living organisms

    work.27,35,41 Concretely, the emphasis on

    the analysis of vast amounts of

    biochemical detail has led to the

    involvement of computer scientists for

    developing computational tools that have

    become indispensable for representing,

    mining and modelling the bewildering

    amount and heterogeneity of collected

    information.4246 Another natural

    extension of pathway-centricism to the

    entire system, which has become a major

    research focus of systems biology, is to

    reconstruct the topology of a network by

    systematic experimental identification of

    interactions69,33,47 or to reverse engineer

    it by inference from measurement of

    network dynamics.4851

    In summary, whereas globalists are

    attracted by the complex, seeking to

    understand general principles giving rise

    to the whole, and are ready to abstract

    away specific details, localists-turned-

    systems biologists are more inclined to

    attack the complicated, and seek

    comprehensiveness of detailed description.

    Because of the unique blend in biology ofuniversality and idiosyncrasy, both

    approaches are complementary to each

    other and will be necessary in spite of a

    persisting, unarticulated mental divide

    (Table 1).

    This paper discusses molecular

    networks in a broadly encompassing

    category of view, going beyond the

    characterisation of network maps, and

    addresses the fundamental questions that

    investigators of both ideologies studying

    living systems will ultimately have to

    address: what do networks tell us aboutthe biology of an organism?

    FROM NETWORKS INBIOLOGY TO THEBIOLOGY OF NETWORKSAs with the information in DNA and

    amino acid sequences, what networks can

    tell us about the biology of an organism

    can be divided into two broad domains:

    evolution and function (Table 2).

    While comparative sequence analysis

    proved to be a powerful tool fordetermining taxonomic relationships,52

    and can shed light on the history and

    mechanism of genome evolution,5355 it

    remains of limited use for predicting gene

    function. Although conservation can

    point to functional importance, and

    sequence homology (eg catalytic motifs)

    allows the prediction of the biochemical

    function of a protein, the road from

    sequence to gene function and,

    Understanding living

    systems requires both

    detailed description andabstraction

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 1

    Back to the biology in systems biology

  • 8/4/2019 Back to the Biology in Systems Biology

    4/19

    subsequently, to drug development

    turned out to be bumpier than was

    anticipated in the dawn of the genomicrevolution. It is often overseen that the

    regulation and the context of expression

    of a gene is part of what defines its

    biological function. For example, based

    on sequence analysis, one readily finds

    that the protein cyclo-oxygenase-2

    (COX-2) has the same enzymatic activity

    as COX-1, namely, a cyclo-oxygenase in

    the synthesis of prostaglandins.56,57 This

    information in itself was of little use until

    it was established that COX-2 is regulated

    differently than COX-1, and thus has a

    different biological function.56 Beinginduced in inflammatory tissues, as

    opposed to the constitutively active

    COX-1, COX-2 obviously serves a

    different function and thus provides a

    more specific drug target than its non-

    regulated paralogue. Thus, what

    determines the distinctive biological

    functions of COX-1 and COX-2 is

    encoded in their regulation, not in their

    enzymatic activity.57,58 Furthermore, a

    regulatory protein, such as Ras, myc or

    NFkB, can have multiple, disparate, if notopposite, functions (eg either cell

    proliferation or differentiation/quiescence

    or apoptosis), depending on the cellular

    context, ie the presence of other proteins,

    and the cell state and type.27,59

    Hence, despite great efforts spent in

    functional annotation,60,61 it is only in

    rather exceptional cases that a specific

    physiological function can unconditionally

    and unambiguously be assigned to a given

    protein as a separate entity, defined just by

    the coding region of its gene. It is likely

    that most proteins, notably regulatoryproteins, exert their biological function as

    a member of a molecular network.

    What is less obvious is what networks

    can tell us about biological evolution

    (Table 2). Representing an intermediate

    level of information, between the level of

    DNA sequence and that of organismal

    phenotype, network structures are more

    complex than DNA sequences, yet simple

    enough for a mathematical representation

    using the formalism from graph theory.

    Thus, by serving, to all intents and

    purposes, as an extended genotype,networks open an entirely new window

    on the workings of evolution. Before

    discussing what networks can teach about

    the evolution of the organism, the author

    will firstly summarise a few recent, salient

    findings on network topology obtained

    from the globalist point of view.

    GLOBAL TOPOLOGY OFNETWORKSThe wiring diagram of the entire

    network can readily be drawn from dataon relationships between biomolecules,

    including metabolic reactions, physical

    protein interactions (heterodimeric

    complexes) and transcriptional gene

    regulation. Network topology

    information is available for metabolic

    networks in various microorganisms5 and,

    more recently, for a genome-wide

    proteinprotein interaction network of

    the yeast Saccharomyces cerevisiae.810,62,63

    Context of gene

    expression is part of

    gene function

    Table 2: Molecular networks as intermediate level of description complementing information from easily formalisablegenotype (gene sequence) and phenotype (organism).

    Gene sequence Molecular network Organism

    Formalisation Linear array easy to compare

    Graph Morphometry, etc difficult to compare

    Clues onevolution

    Clues onfunction

    Simple: relationship from sequencecomparison

    Difficult: heuristics of protein domainprediction, comparison

    New opportunity:intermediate betweengenotype and phenotype

    Difficult: common ancestry orconvergence?

    Simple: evident from observationor experiment

    2 8 2 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

    Huang

  • 8/4/2019 Back to the Biology in Systems Biology

    5/19

    Genome-wide gene regulatory network

    data are currently available forEscherichia

    coli.7 In the terminology of graph theory,

    a network consists of nodes (vertices) that

    can be connected by links (edges) (Figure

    1). Genome-wide networks consist of

    thousands of nodes, and thus are referred

    to as complex networks,64 by contrast

    with the networks with low node

    numbers studied in traditional graph

    theory. Moreover, an important aspect of

    biological networks is that they grow over

    time, by contrast with the networks with

    constant node numbers seen in traditional

    graph theory.65 Thus, biomolecular

    networks have a history of experiencing

    an increase in the number of nodes.66

    What catches the investigators interest is

    every topological feature that deviates

    from that of a random network, in which

    Nnodes are randomly connected by L

    links.65 The non-random structures found

    in complex networks, then, provide the

    starting point for further research into

    their origin and functional significance.

    To illustrate this, focus on the results

    obtained for the most complete

    eukaryotic protein interaction network

    available, the protein interaction network

    ofS. cerevisiae, which is based on genome-

    wide data of pair-wise proteinprotein

    interactions (Figure 1).810 It is important

    to note that the incompleteness and

    inaccuracies in the yeast protein

    interaction dataset, mostly due to

    potentially false-positive links,62,63 do not

    essentially affect the global statistics, as

    comparisons of various datasets by the

    authors in the cited works have shown.

    Many questions posed by the globalist can

    be addressed with incomplete and not

    perfectly accurate datasets, while thelocalist agenda requires precise and

    detailed information. One elementary

    global feature that was striking for

    biologists especially those who think in

    categories of isolated pathway modules

    named after their putative cellular

    functions is that, despite the overall

    sparseness of the network connectivity

    Figure 1: Global picture of the topology of large (complex) networks. Upper panels: graph ofthe network, displayed using Pajek software (http://vlado.fmf.uni-lj.si/pub/networks/pajek/). A.Computer generated, not growing random network,65 N 500 nodes, mean connectivitydegree ,k. $ 4. Note the dominating giant component (containing 487 connected nodes). B.Computer generated, growing network using the generative algorithm with preferentialattachment of new nodes,20 N 483, ,k. $ 4. (The algorithm necessarily leads to one singlecomponent). C. Yeast protein interaction network consisting of N 2,165 proteins, with,k. $ 4, based on the CORE data set.63 Lower panels: probability density distribution P(k) inloglog plots. The density distribution exhibits a straight line for B, Cindicating a powerlaw(scale-free) distribution

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 3

    Back to the biology in systems biology

  • 8/4/2019 Back to the Biology in Systems Biology

    6/19

    (three to four connections per protein, on

    average), the vast majority (.75 per cent)

    of the proteins in yeast participate in a

    single, large sub-network of connected

    nodes, a so-called giant component12

    (Figure 1). Although counterintuitive at

    first sight, this finding is not unexpected

    from a graph theory point of view. Given

    some minimal connectivity (for a

    growing, random graph, at least k 1/8

    links per node on average66), such a giant

    component is practically unavoidable in

    protein interaction networks (where

    average k % 4), and is thus robust in a

    fundamental way. Hence, this feature

    does not even qualify as a non-random

    feature that would point to a particularbiological purposefulness.

    By contrast, Barabasi and coworkers

    describe an interesting global topology

    feature that indeed deviates from that of a

    random graph.12,20 If all the nodes iof the

    network are treated as one population,

    and ki of each node i(ie the numberk of

    interaction partners that node ihas) is

    measured and displayed as a distribution,

    then the occurrence probability P(k) of

    each value ofk exhibits, approximately, a

    power-law distribution, P(k) $ kg, rather

    than as would be the case in randomnetworks a Poisson distribution. The

    exponent gis a characteristic constant

    (Figure 1). A power-law distribution

    implies that the more nodes that are

    sampled, the higher is the mean value of k

    for the sampled population. Thus, there is

    no stable mean that is characteristic for an

    ensemble of proteins (nodes), and such

    protein interaction networks are called

    scale-free. Although it remains debated

    as to whether the degree of distribution

    precisely follows a power-law, its heavy-tailed nature implies that there are more

    proteins with a high-degree k than one

    would expect in a random network: the

    protein interaction network is biased to

    contain more hubs than one would

    expect.

    Other non-random topological features

    that deviate from those of a random graph

    have been identified in the yeast

    interaction network. For example, hub

    proteins tend not to interact with each

    other.14 Moreover, the proteins or

    metabolites in real networks have a higher

    tendency than the nodes in random

    networks to be organised into clusters

    (cliques) in which the partners of a given

    protein are also likely to interact with

    each other.13,17 Together with the scale-

    free property, these features give rise to

    hierarchical modularity, as has been

    described for metabolic networks.15 The

    yeast protein network also contains a

    significant number of proteins that have

    low connectivity degrees, yet are central

    in that they lie on many of the shortest

    paths that connect any pair of proteins.67

    Of interest also is the recent study ofnetwork motifs across the network.

    Network motifs are subgraphs, ie locally

    defined patterns of interaction between a

    small number of nodes, such as triangles,

    branching structures, etc.6,16,18,68 In gene

    regulatory (transcription) networks, links

    represent transactivation and are,

    therefore, arrows rather than undirected

    connections; thus, transcription networks

    form directed graphs. Such digraphs exhibit

    a higher diversity of network motifs than

    the networks from protein interaction

    databases, which yield undirectedconnections. Specifically, Shen and

    coworkers found that in bacterial gene

    regulatory networks, certain motifs of

    directed links were significantly over-

    represented, such as the feed-forward

    loop (Figure 2).18

    In view of all of these non-random

    features of network topology, two

    interdependent questions immediately

    come to mind: how do such structures

    originate, and how do they affect

    biological function?

    NETWORKS ANDEVOLUTION:ADAPTATION VERSUSINTRINSIC CONSTRAINTSA fundamental question in biology is why

    organisms are the way they are: why does

    a given trait exist, how does it arise? It is

    obvious that the non-random network

    features discussed above can now be

    Topology of biological

    networks deviate from

    that of random

    networks

    2 8 4 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

    Huang

  • 8/4/2019 Back to the Biology in Systems Biology

    7/19

    treated as elementary phenotypic traits. In

    an encompassing view of evolution, one

    can distinguish between two

    fundamentally distinct, but not mutually

    exclusive, groups of mechanisms that can

    contribute to the very existence and

    nature of a given trait in living systems.

    Because authors use a large variety of

    terms with essentially identical or

    overlapping core meaning, here they will

    simply be called Type I and Type II

    mechanisms, to avoid semantic ambiguity

    or bias (Table 3).

    The Type I mechanism concerns the

    (neo-)Darwinian principles ofselection and

    adaptation. Traits can (smoothly) vary in a

    random fashion due to mutations; andselection (in a given environment) of the

    fittest (of the inheritable) variants drives

    the features towards a(n) (local) optimum

    in terms of functionality. Because of the

    underlying connotation of directedness or

    finalism (even if the mutations themselves

    are not), the adaptation mechanism is in

    essence equivalent to the engineers

    notion of optimisation of a task by

    deliberate design, or the tinkerers trial

    and error approach. The vast majority of

    biologists think in categories of this

    mechanism to explain almost everyfeature. The Type II mechanism

    comprises a variety of alternative, ie non-

    adaptionist mechanisms, summarised here

    under the general term intrinsic

    constraints. The mechanisms have in

    common the fact that they embody a set

    of rules that impose constraints so as to

    make particular features unavoidable, ie

    intrinsically robust, independent of a notion

    of usefulness or self-maintenance in the

    face of external perturbation.69,70 This

    implies that no external force, likeadaptation or any other notion of

    functionality or purposeful design, is

    required. Constraint is a creative rather

    than inhibitory force, for it is that which

    prevents (constrains) a system from

    settling down in a random,

    undifferentiated, pattern less state. The

    intrinsic constraints can be imposed by

    elementary physical laws (eg spherical

    shape), allometric relationships,71,72

    X

    A

    B

    XRE Gene X

    XRE Gene X

    Gene A ARE

    1

    2

    3

    4

    GENE DUPLICATION

    DIVERGING MUTATIONS

    XRE Gene XGene A ARE

    Duplicated segment

    INSERTION

    XRE Gene X

    Gene B ARE

    Gene A

    XREDuplicated gene A,mutated to gene B

    Figure 2: Hypothetical mechanism for the generation of a feed-forwardloop in the absence of selection for functionality. 1. Primordial, auto-regulative gene Xencodes a transcription factor that binds to its own cis-regulatory element, X-responsive element (XRE). 2. A second auto-regulative gene A, together with its cis-regulatory region, ARE, (eg, aregulon) is inserted into the 59 region ofX(eg mediated by transposable

    elements) in the reverse direction. If XRE is a bidirectional promoterelement, XRE can contribute to transactivation of A. (This is the case inthe arabinose operon, which is part of a feedforward-loop.18) 3.Duplication of the DNA segment Gene A-ARE-XREcreates a new genethat is responsive to both A and X. Due to subsequent diversifyingmutations, Gene A becomes Gene B; ARE is partially deleted because it isredundant. Now factor Xregulates itself and Gene A via XRE, while bothproducts, A and R, regulate gene B. On top of this structural andprocedural scheme, which facilitates the spontaneous creation of a feed-forward loop, natural selection may then do the fine tuning, eg changestrength and modality of regulation by point mutations in the respectivecis or trans regulatory regions

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 5

    Back to the biology in systems biology

  • 8/4/2019 Back to the Biology in Systems Biology

    8/19

    architectural69,7375 (eg symmetry, space

    filling) and mechanical principles (eg

    forcebalance rules in tensegrity76) and,

    importantly, self-organisation processes,

    such as symmetry-breaking patterns24 (see

    below). Other non-adaptive mechanisms

    include historical (phylogenic inertia,

    genetic drift, frozen accidents, etc) and

    developmental constraints (restrictions in

    the manufacturing path).69,70,75 Adistinct school of thinking explicitly

    opposite to adaptionism is structuralism,

    which stresses the study of structures

    independent of selection for

    functionality.70,77

    Self-organisation is part of complexity

    theory, which emerged as a field of study

    in its own right in the 1980s24,78,79 but

    which still lacks a rigorous definition, its

    novelty and necessity is still being

    questioned by some scientists. (This paper

    will not refer to theories that address thequestion of macroscopic evolution and

    the origin of the complexity of systems

    itself, which is a different matter.80)

    Nevertheless, by uniting various

    theoretical concepts aimed at

    understanding the origin of ordered, ie

    non-random, patterns through self-

    organisation, the approach of the complex

    system theories has added substantial

    momentum to the community of

    researchers that emphasise Type II

    explanations for particular features. By

    providing astonishingly simple

    explanations for the inherent robustness of

    many counterintuitive patterns, such as

    stripes, spirals, fractals, power-law

    relationships, etc, which are found in

    living as well as non-living systems,24

    complexity theory may provide some of

    the deeper principles of organisationwhich embody the design constraints and

    which structuralists were seeking.

    It is important to note that the Type I

    and Type II mechanisms are not mutually

    exclusive, but are consistent with each

    other and can cooperate in some rather

    convoluted way. For example, a particular

    intrinsically robust structure, such as the

    fractal branching patterns of pulmonary

    bronchi81 or a network motif in

    molecular networks (see below), may

    simply have been exploited becausethey happen to be advantageous rather

    than created by natural selection.

    Variability and selection pressure may not

    have been strong enough to mould such

    structures from scratch by stepwise

    exploration of many alternative, similar

    but less functional forms in the

    phenotype space. In other words,

    functional adaptation may be a by-

    product of an inherently robust process

    Table 3: Major points of the two mechanisms for explaining the existence of distinct features in biology. Although the twomechanisms are not mutually exclusive, but complementary and perhaps synergistic, they have become polarised meansof explanation among biologists, with the majority today leaning towards Type I

    Type I mechanism: Adaptation (natural selection) Type II mechanism: Intrinsic constraints

    (Neo)-Darwinian evolution: random mutation generates phenotypicvariation, selection of a distinct feature based on fitness in a givenenvironment

    Intrinsic inevitability (robustness) of a feature due to constraints in themechanism of its genesis or design: architecture constraints physical (mechanical, energetic, allometric) constraints self-organisation, pattern formation developmental constraints historical constraints (phylogenetic inertia, genetic drift, etc)

    Biology as a historical science Biology can be historical or ahistorical(Local) optimisation of a trait for functionality No optimisation necessary, but can secondarily shape traitNotion of purposefulness is centralGradual, cumulative evolution used to explain unlikely complex trait

    No notion of purposefulnessSelf-organisation used to explain unlikely trait

    Adaptation is a powerful moulding force that generates traits fromscratch

    Adaptation just maintains favourable traits that are inevitable due toconstraints

    Natural selection is creative Natural selection is conservativeAdaptation is cause for a trait Adaptation is consequence of a trait

    2 8 6 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

    Huang

  • 8/4/2019 Back to the Biology in Systems Biology

    9/19

    that creates structure based on internal

    design principles. Selection then does the

    fine-tuning.

    Unfortunately, the above dichotomy

    a matter of two non-competing and

    overlapping explanations becomes a

    matter of competing, polarised points of

    view. The majority of modern molecular

    biologists gravitate towards the adaptionist

    perspective and put all faith in the

    extraordinary power of natural selection

    in the creation and moulding of traits.

    This unquestioned belief in natural

    selection as the core, irreducible principle

    of explanation in biology hinders the

    pluralistic endeavour to disentangle and

    assess the relative contribution to a trait ofeither mechanism. Even Darwin himself

    cautioned us in the introduction to The

    Origin of Species: I am convinced that

    Natural Selection has been the main, but

    not the exclusive, means of

    modification.82

    EVOLUTION OF NETWORKSTRUCTURES: INTRINSICCONSTRAINTSThe dualistic scheme of explanation can

    now be applied to the non-random

    architectural features of networks. In thespirit of the prevailing bias towards

    adaptionism, the identification of

    particular network structures was almost

    always automatically followed by the

    interpretation that they must reflect some

    optimisation of functionality by

    evolution.18,83,84 Such statements,

    although particularly attractive to the

    engineers mind accustomed to

    purposeful, optimal design, are dangerous.

    When invoking the survival of the fittest

    in the absence of rigorous analysis of theactual mechanism of the fitness associated

    with a trait and of its history, one can

    easily succumb to the inherently self-

    referential logic of this statement (if what

    is fit survives, then what survives must be

    fit). Although pure Darwinian

    adaptionism applies in many specific

    scenarios, eg in one of its purest forms

    the generation of antibiotic resistance, in

    most cases both mechanisms appear to

    contribute, in a synergistic manner, to the

    prevalence of a given trait.

    Topological features of networks are

    ideal objects of study for determining the

    relative contribution of either type of

    mechanism to a distinct trait. The

    deviation from randomness of a network

    structure can be precisely determined and

    a mechanism for the structural and

    historical constraint formalised. A

    network feature represents an elementary

    phenotype, but unlike classical

    phenotypic traits it does not undergo

    ontogenesis, which complicates debates

    on evolution. Thus, developmental

    constraints do not have to be considered.

    From the discussion presented above, itbecomes obvious that before attributing a

    feature to adaptation, one needs to

    exclude any constraint accounting for the

    inherent inevitability of that structure.

    The demonstration that a network feature

    is significantly more abundant than one

    would expect if the network were to have

    been generated randomly, by comparing

    it with a random graph, does not suffice

    to support natural selection. Furthermore,

    the demonstration of convergence, ie

    similarity of a trait in the absence of

    genetic evidence for common ancestry, isoften used as an argument in favour of

    adaptation.85 Other than dispelling the

    idea of a common ancestry, however,

    convergence does not exclude the role of

    various kinds of intrinsic constraints in

    enforcing a particular feature, but may

    instead point precisely to some

    universality of the constraining design

    rules.

    The origin of a network motif needs to

    be examined in more detail. The feed-

    forward loop has been shown, based on acomparison with the random graphs, to

    be a prominent, hence adaptive feature

    in the E. coligene regulatory network

    (Figure 2).18 This conclusion is

    problematic, since real gene regulatory

    networks cannot be randomly redrawn

    like the nodes and links in a graph. On

    the contrary, evolution of the genome

    and, hence, of the network that it

    encodes, is subject to a series of

    Not all traits are

    evolved by natural

    selection alone

    Structural constraints

    may have helped shape

    network topology

    features

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 7

    Back to the biology in systems biology

  • 8/4/2019 Back to the Biology in Systems Biology

    10/19

    mechanistic constraints that may

    significantly reduce and bias the space of

    the possible random network

    architectures represented by the

    immaterial, random graphs. Genes are

    embodied by a linear physical structure,

    the DNA, which rearranges in particular

    ways. Genome evolution occurs via a

    finite set of physicochemical mechanisms,

    including chromosome and tandem gene

    duplication, followed by diversification of

    the duplicated gene pair due to mutations

    of the redundant genes.5355 Moreover,

    rearrangements of DNA are caused by

    DNA segment inversion or deletion, or

    insertion of DNA fragments via

    transposable elements, crossovers,etc.54,55,86 These processes, as well as

    promoter-generating mutations,87 drive

    the rewiring of gene regulatory networks

    by shuffling cis- and trans-regulatory

    regions. For example, the duplication of a

    gene, with subsequent mutations and

    acquisition of new properties in the

    coding region but not in the promoter

    region,53 would create a branching

    network motif (one gene activates two

    genes). These mechanisms could facilitate

    the genesis of particular network motifs

    over others, independent of functionality.Figure 2 shows a hypothetical mechanism

    that could explain the genesis of a feed-

    forward loop18 via only a gene insertion

    and a gene duplication event, plus a few

    minor mutations.

    Mechanistic and historical

    considerations help to place the potential

    effect of adaptation by selection in the

    appropriate light. Unlike the ad hoc

    assumption of an effective selective

    advantage, the assumption of mechanistic

    and historical constraints can be rigorouslytested.75 Analysis of the DNA sequence to

    trace back the history of the cis- and trans-

    regulatory elements within a specific

    network motif, combined with the study

    of generative algorithms constrained by

    mechanistic considerations of DNA

    rearrangement, could reveal the

    contribution by Type II mechanisms to a

    network structure.

    In the case of global structures, like the

    power-law degree distribution ofk

    discussed above, Barabasi and coworkers

    proposed a generative algorithm for

    network growth (increase in node

    number) that inevitably results in scale-

    free networks.20 An essential ingredient of

    their algorithm is that genes newly added

    to the existing, growing network

    preferentially establish a connection to an

    already highly connected gene. Although

    the model of genesis is cast in generic

    terms, one can now examine whether

    simulations of the actual physical

    mechanism (increase in genome size by

    gene duplication, rewiring by mutations,

    particular exon shuffling, etc) are

    compatible with the generic model.88

    Infact, protein sequence comparisons that

    allowed the determination of the

    evolutionary age of a protein appear to

    suggest that the hubs are older and that

    preferential attachment might have taken

    place during evolution.89 A power-law

    architecture has also been found in a large

    number of non-biological networks, such

    as social and computer networks,90,91

    suggesting that adaptive evolution is not

    likely to be the creator of this

    architecture.

    This section on the evolution ofnetwork features concludes by

    emphasising again that each of the types

    of explanation for a particular structure

    adaptation and intrinsic constraints

    does not exclude the other. Evolution

    will secondarily, by chance, coopt an

    intrinsically robust feature that happens to

    have desirable properties and maintain it.

    For example, a giant component of the

    network which is an unavoidable

    structure may coincide with functional

    advantages: in the case of the metabolicnetwork, it might facilitate global

    coordination of energy utilisation upon

    change of nutritional source; and in the

    case of the regulatory network, global

    connectivity allows regulation of whole-

    cell behaviour, such as cell fate switches,

    as discussed below. Similarly, structural

    modularity of networks, for which several

    advantages have been attributed in terms

    of functionality and evolvability,92 may

    Natural selection and

    historical/structural

    constraints cooperate in

    network evolution

    2 8 8 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

    Huang

  • 8/4/2019 Back to the Biology in Systems Biology

    11/19

    simply have been a fortuitous accident of

    horizontal (vector-mediated) transfer of

    entire clusters of genes. Such groups

    of genes could collectively confer a

    biochemical function to the host,

    allowing it to invade a new metabolic

    niche, and hence be maintained as a

    module by selection.87 Since the

    expression of these genes is coordinated

    inter sesuch as to perform a metabolic

    task, they necessarily form a network

    module. Finally, as proposed below, the

    power-law feature of regulatory networks

    may have the functional benefit of

    generating ordered global network

    dynamics. Thus, adaptation is part of the

    game, but it may be conservative, or justdoing the fine tuning, rather than

    creative.

    NETWORK FUNCTION:FROM TOPOLOGY TODYNAMICSMost functional interpretation of

    networks has been based on their

    topology alone. For example, it has been

    suggested that highly connected hubs in

    networks are more likely to be essential

    genes because they interact with many

    proteins. In fact, for the yeast proteomethere is a weak association between

    degree k of a protein and the probability

    that it is essential.12 The power-law

    network has been shown to be generically

    more tolerant to random failure of

    individual nodes, but vulnerable to

    targeted attacks on the hubs.93 This

    would be in line with the suggestion that

    hub proteins have a slower rate of

    evolution.94

    Much of the topology-based reasoning

    about function rests on the unarticulatedpremise that the molecular network acts

    like a communication network in which

    some information flows in the links from

    node to node. Although this may be

    appropriate for metabolic reaction

    networks, it certainly does not apply to

    networks of regulation, like the protein or

    transcription networks, where a link

    represents an influence rather than a flow.

    To understand functionality, it is,

    therefore, necessary to move from the

    domain oftopology to that ofdynamics.

    Network dynamics come into play when

    considering two essential properties of

    molecular regulation networks:

    (i) A node iin the network (gene or

    protein) takes a value, xi(t), which

    represents its expression or activation

    level, which changes with time in

    response to incoming influences from

    other nodes j, k, l, . . . , represented by

    the links.

    (ii) The links of the network represent

    influences (rather than flows) and have

    a modality (eg inhibition or

    stimulation). Together withinteraction rules (eg additive effects, or

    some Boolean functions), they

    determine how the level xi(t) of a

    given protein i(output) is jointly

    affected by the values xj,k,l (t) of its

    interaction partners j, k, l, etc

    (inputs).

    The dynamics of the system is the

    concerted change in the levels of xi(t) for

    all of the nodes iof the network, and can

    be represented by the N-dimensional state

    vectorS(t) [x1(t), x2(t),. . .

    xN(t)] for anetwork with Nnodes. While the

    topology is represented by a graph, the

    dynamics can be imagined as a more

    abstract N-dimensional state space in

    which S(t) moves its position with time t

    as defined by its components [x1(t), x2(t),

    . . . xN(t)] (see Figure 3 for details). The

    interactions of the network now impose

    restrictions on the movement of the state

    vectorS(t): a given state can only move in

    certain directions in the state space

    because its components xi(t) cannotchange independently but influence each

    other. This frustration of the freedom of

    movement is where the unfathomable

    complexity of large networks collapses. If

    the wiring diagram is right, this

    restriction will give rise to simple, robust,

    higher-order behaviour, as will be seen

    below.

    Mathematically, the dynamics of the

    network can be described and modelled

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 28 9

    Back to the biology in systems biology

  • 8/4/2019 Back to the Biology in Systems Biology

    12/19

    using a large variety of formalisms that

    differ in the level of coarse-graining. For

    example, systems of ordinary differential

    equations, with continuous values of

    concentrations forxi(t) that are subjected

    to mass action kinetics (like chemical

    reactions), are often used for smaller

    circuits consisting of a handful of proteins

    or genes.9598 This is already an

    abstraction, in particular when used to

    model gene regulatory or signal

    transduction, since xi(t) represents a

    hypothetical average of the stochastically

    fluctuating activity of node i, taken over

    time and over an ensemble of

    heterogeneous cells. At the even more

    Figure 3: From topology to dynamics; illustrated for a small circuit and a large network. Toppanel: Small circuit (signalling module) consisting of two proteins A and B, which are mutuallyinhibitory, and promote their own decay. For a wide range of kinetic parameters, this topologygives rise to bistability. The middle diagram shows a phase plane (two-dimensional state space)that represents the dynamics. Each dot represents a state S(t) [xA(t), xB(t)] of the circuit attime t, defined by the expression or activity values x(t) of the two nodes A and B (see text). Theticks emanating from each dot point to the direction in state space in which that respectivestate will move in the next time unit to comply with the interactions (ie, systems equations) ofthe circuit. There are two stable states, Xand Y, to which the ticks point. The right-handdiagram shows an intuitive projection of the state space (along the line p), emphasising thebistability of the system, with states Xand Yas potential wells (attractors), and a hill inbetween representing an unstable region. Bottom panel: Analogous illustration of topology anddynamics, but for a larger network (N 24). Now the state space is N-dimensional; it containsall the network states, defined by S(t) [x1(t), x2(t), . . . xN(t)], and cannot be drawn, however,it can be schematically considered as a landscape in which the pits represent (high-dimensional)attractor states (right-hand diagram). In this projection, every point in the xy plane representsa network state S(t), arranged so as to generate the landscape, where the vertical axis wouldrepresent some state change potential value reflecting the dissimilarity of the state Sxy atposition (x,y) to that of the attractor state that it will eventually reach. Each attractor statecorresponds to a distinct phenotypic cell state, the cell fate. The landscape can be thought of asthe formal and molecular equivalent of Waddingtons epigenetic landscape (see Figure 4)

    A B

    A

    B

    low Ahigh B

    high Alow B

    1

    State X

    State X

    State Y

    State Y

    2D phase plane

    ( N-dimensionalstate space )

    Projection as potential wells

    SMALLCIRCUITS

    LARGE

    NETWORKS

    TOPOLOGY

    Schematic projection as3D epigenetic landscape

    DYNAMICS PHENOTYPIC BEHAVIOUR

    p

    p

    N = 2

    N = 24 attractors = cell fates

    2 9 0 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

    Huang

  • 8/4/2019 Back to the Biology in Systems Biology

    13/19

    abstract end of the spectrum of models are

    the discrete networks in which genes are

    represented by binary, ON-OFF

    switches, with Boolean functions

    determining the interaction modalities.

    Because of their computational simplicity,

    such Boolean networks have been used as

    a convenient, tractable caricature for

    studying fundamental aspects of the global

    behaviour of large networks using

    statistical ensembles of networks of

    varying topologies.4,22 In fact, Boolean

    functions (eg AND, OR, . . .) capture the

    link modalities and their cooperation

    rather well/although these are, in reality,

    embodied by the molecular interactions at

    promoters and in protein proteincomplexes.47,99,100

    Quantitative modelling of local

    modules of signalling and gene regulatory

    pathways that consider non-linear aspects

    (such as positive and negative feedback

    loops) can predict a variety of interesting

    behaviours, such as robustness, multi-

    stability, oscillation and switch-like

    behaviour.101 These emergent qualities

    are not evident from study of the

    topology, nor can they intuitively be

    predicted by the linear upstream

    downstream schemes of theexperimentalist (Figure 3). This pathway-

    centred modelling, however, predicts the

    dynamics only of the components of a

    particular signalling module, not of

    phenotypic cell behaviour.

    CELL FATES ASATTRACTORS INGENOME-SCALENETWORKSA globalist view of the dynamics of the

    network may also, therefore, be necessary.Two findings suggest that global network

    dynamics may be relevant. First, the

    existence of a giant component in the

    protein interaction network12 (and

    probably also in the gene regulatory

    network6) raises the question of whether

    there is a collective, higher-order

    behaviour that arises from the large

    number (N 1,00010,000) of

    interdependent genes and proteins, or

    whether their activity changes simply

    average out and get lost over the large

    population of the nodes, so that no

    interesting global pattern emerges. In

    other words, is the genome-wide

    network (or large portions of it) wired in

    such a way that it acts coherently as one

    entity with a distinct macroscopic

    dynamics and, thus, contains additional

    information which would not be

    accessible in the study of pathway

    modules in separation? Secondly, cells in

    multicellular organisms exhibit a simple,

    coherent whole-cell behaviour which

    may precisely reflect a higher-order

    dynamics of the global network: the

    switching between cell fates. This strictlyregulated, rule-based systems behaviour is

    robust and remarkably simple compared

    with the complexity of the underlying

    molecular network.100

    Cells in multicellular organisms are able

    to rapidly and reliably integrate a

    multitude of simultaneous and often

    conflicting signals from their tissue

    environment, and respond by selecting

    just one of a few discrete cell fates, such as

    quiescence, proliferation, migration or

    differentiation into distinct cell types.100

    Tight regulation of the balance betweencell fates is required for the development

    and homeostasis of the tissue. Cell fates

    are discrete, mutually exclusive

    behavioural types, as Waddington

    described back in the 1940s.102 What is

    particularly counterintuitive from the

    localist, pathway-centred perspective, is

    that the same cell fate switch can be

    triggered by a broad variety of unrelated

    biochemical and physical stimuli, while

    the same biochemical stimulus can elicit

    entirely different cell fate switches depending on the context.59,100 This

    defies the localist notion of modular

    pathways that serve a specific cellular

    function. The mutual exclusivity of cell

    fates is often taken for granted but is a

    systems property that requires that the

    pathways that have been associated with

    particular cell fates be globally

    coordinated. This would be consistent

    with the function of a giant component in

    The dynamics of largenetworks is highly

    constrained

    Cells undergo almost

    discrete transitions

    between a few distinct

    cell phenotypes

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 29 1

    Back to the biology in systems biology

  • 8/4/2019 Back to the Biology in Systems Biology

    14/19

    the regulatory networks that spans almost

    the entire genome (even if this structural

    feature is statistically almost unavoidable,

    and thus may not have evolved for this

    function, as discussed above).

    Do the dynamics of large networks in

    fact generate robust, simple, macroscopic

    behaviour that is manifest as regulated cell

    fate switching? To overcome the lack of

    specific information on genes and the

    precise network topology, yet be able to

    study the universal behaviour of large

    networks, more than 30 years ago

    Kauffman used random Boolean networks

    as models to show that, given some

    minimal topological constraints (such as

    sparse connectivity), the global dynamicsof a network with Ngenes will be far

    from chaotic (disordered), but that due

    to the constraints imposed by the

    particular genegene interactions, it will

    inevitably exhibit some ordered collective

    behaviour: most states S(t) are not stable,

    thus forcing S(t) to move in the state

    space until it arrives at a stable state to

    which unstable states will be attracted (see

    Figure 3 for details).3,22 These attractor

    states can now be designated as the cell

    fates or, in Kauffmans more general view,

    as the various cell types in a multicellularorganism.22,100 In fact, the dynamics of a

    network with attractor states naturally

    captures the essential properties of cell fate

    dynamics, including mutual exclusivity,

    robustness and all-or-none transitions

    between cell fates in response not to a

    single specific instructive signal but to a

    large variety of signals.

    Although Kauffman used randomly

    connected networks (with restrictions

    only with regard to connectivity degree

    and type of Boolean function), he wasable to show some universal dynamic

    properties, such as the existence of robust

    attractors.22 Kauffmans pioneering work

    receives little attention from modern day

    experimental and computational

    molecular biologists, however,34,35 who

    in their localist point of view emphasise

    the description of specific, idiosyncratic

    details over abstraction and analysis of

    higher-order behaviour. Interestingly, it

    turned out that the scale-free network

    topology that is now found in real

    networks is even more likely than

    randomly connected networks to generate

    a reasonable, ordered dynamics, in the

    sense of Kauffman (ie containing stable,

    compact attractors, consistent with cell

    fate behaviour).103 Thus, the self-

    organised scale-free topology relaxes

    Kauffmans restrictions on gene network

    topology for producing ordered

    behaviour.

    The idea that attractors correspond to

    distinct, stable phenotypic states is just one

    aspect of a more general view. The

    genome-wide molecular network and,

    hence, the cell do not randomly andindifferently visit all possible network

    states S(t). Instead, because of the

    interactions between the molecular

    constituents, and the particularity of their

    wiring diagram, the state space is not

    isotropic, but highly sub structured, or

    compartmentalised, into stable attractor

    states and their basins of attraction. Thus,

    the interactions between genes and

    proteins are wired such as to impose

    restrictions on the dynamics of the

    network S(t). The complexity of the

    genomic network collapses into simple,robust behaviour with relatively few

    stable network states the biological cell

    fates. The attractors in the state space

    landscape (Figure 3) may thus represent

    a molecular and formal explanation for

    Waddingtons epigenetic landscape

    (Figure 4), which he proposed as an

    intuitive metaphor to explain the

    discreteness of cell fates more than 60

    years ago, without the notion of genes

    and networks.

    CONCLUSIONIn this paper the author has discussed

    disparate but complementary viewpoints

    in biology that the burgeoning notion of

    system-wide networks has exposed. On

    the one hand, a localist view that stresses

    the importance of detailed pathway

    analysis at the level of nominal genes and

    proteins, and that considers a network as

    the sum of all the pathways; and on the

    2 9 2 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

    Huang

  • 8/4/2019 Back to the Biology in Systems Biology

    15/19

    other hand, a globalist view that

    emphasises the additional emergentinformation harboured by a network as an

    entity. Since living systems are both

    complicated and complex,27 both

    viewpoints are equally important.

    Network structures provide an ideal

    handle with which to determine the

    relative contribution of natural selection

    versus intrinsic constraints to particular

    traits. Global network dynamics,

    however, open a new perspective to cell

    fate decisions in multicellular organisms

    which will foster understanding ofphenomena such as the multi-potency of

    stem cells.

    In conclusion, however, it is important

    to remember that networks are just a

    convenient abstraction, which fail to

    capture some essential properties of living

    systems. First, they represent a level of

    description that is devoid of space and

    physicality, such as the subcellular

    localisation of the proteins representing

    the network nodes, as well as mechanical

    forces within cells.76 Secondly, in

    multicellular organisms, every cell

    contains its own individual network. One

    forgets that in a theoretical treatment one

    analyses and models a single proxy

    network whose behaviour is then

    mentally multiplied to billions of identical

    replicate cells in order to represent a tissue

    with its billions of cells. This

    subconscious, amplifying projection is

    problematic, since there is no

    representative average cell104 because of

    molecular noise105107 and other,

    epigenetic variations.108 Instead, a

    population, even of clonal cells, can

    dramatically vary in its network state, asevidenced by the dispersion of behaviour

    of the same genes in individual cells

    within a population.104,109 Perhaps this

    population heterogeneity itself is a desired

    feature, because the dispersion of single

    cell behaviour establishes robust,

    characteristic macroscopic response

    dynamics for the tissue as an entity. In this

    case, the detailed behaviour of a single

    cellular network would not matter, while

    sloppiness within the network of

    individual cells may even be beneficial for

    the behaviour of tissues. Finally, at thetissue level, there is another network: the

    cellcell communication network,

    mediated by secreted factors, extracellular

    matrix and mechanical forces.

    Understanding intracellular molecular

    network is, therefore, all but a first step

    towards this network of networks that

    makes multicellular living systems work.

    Acknowledgment

    This work was funded by the Air Force Office of

    Scientific Research (AFOSR/USAF) and thePatterson Trust. The author also wishes to thank

    Joy P. Maliackal for helpful discussions on

    networks and Donald E. Ingber for his continuous

    support.

    References

    1. Monod, J. and Jacob, F. (1961), Generalconclusions: Teleonomic mechanisms incellular metabolism, growth anddifferentiation, Cold Spring Harbor Symp.Quant. Biol., Vol. 26, pp. 389 401.

    Figure 4: Epigenetic landscape proposedby Waddington in the 1940s as a metaphorto capture the observable dynamics of cellfates. The marble represents a cell state.Since Waddington was studyingdevelopment, the overall slope representsthe driving force of development. As themarble rolls downwards, the cell is forced tochoose between discrete fates represented by the valleys. Reprinted fromWaddington (1957), The Strategy of theGenes, Allen and Unwin, London.

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 29 3

    Back to the biology in systems biology

  • 8/4/2019 Back to the Biology in Systems Biology

    16/19

    2. Delbruck, M. (1949), Discussion, in CNRS,Eds, Unites biologiques douees de continuitegenetique, Colloques Internationaux duCentre National de la RechercheScientifique, Paris.

    3. Kauffman, S. A. (1969), Metabolic stabilityand epigenesis in randomly constructedgenetic nets, J. Theor. Biol., Vol. 22,pp. 437467.

    4. Huang, S. (2001), Genomics, complexityand drug discovery: Insights from Booleannetwork models of cellular regulation,Pharmacogenomics, Vol. 2, pp. 203 222.

    5. Overbeek, R., Larsen, N., Pusch, G. D. et al.(2000), WIT: integrated system for high-throughput genome sequence analysis andmetabolic reconstruction, Nucleic Acids Res.,Vol. 28, pp. 123 125.

    6. Lee, T. I., Rinaldi, N. J., Robert, F. et al.

    (2002), Transcriptional regulatory networksin Saccharomyces cerevisiae, Science, Vol. 298,pp. 799804.

    7. Salgado, H., Santos-Zavaleta, A., Gama-Castro, S. et al. (2001), RegulonDB (version3.2): transcriptional regulation and operonorganization in Escherichia coliK-12, NucleicAcids Res., Vol. 29, pp. 72 74.

    8. Mewes, H. W., Frishman, D., Guldener, U.et al. (2002), MIPS: A database for genomesand protein sequences, Nucleic Acids Res.,Vol. 30, pp. 3134.

    9. Uetz, P., Giot, L., Cagney, G. et al. (2000),A comprehensive analysis of proteinprotein

    interactions in Saccharomyces cerevisiae, Nature,Vol. 403, pp. 623627.

    10. Ito, T., Chiba, T., Ozawa, R. et al. (2001), Acomprehensive two-hybrid analysis toexplore the yeast protein interactome, Proc.Natl. Acad. Sci. USA, Vol. 98, pp. 45694574.

    11. Jeong, H., Tombor, B., Albert, R. et al.(2000), The large-scale organization ofmetabolic networks, Nature, Vol. 407, pp.651654.

    12. Jeong, H., Mason, S. P., Barabasi, A.-L. andOltvai, Z. N. (2001), Lethality and centralityin protein networks, Nature, Vol. 411,pp. 4142.

    13. Wagner, A. and Fell, D. A. (2001), Thesmall world inside large metabolic networks,Proc. R. Soc. Lond., B, Biol. Sci., Vol. 268,pp. 1803 1810.

    14. Maslov, S. and Sneppen, K. (2002),Specificity and stability in topology ofprotein networks, Science, Vol. 296,pp. 910913.

    15. Ravasz, E., Somera, A. L., Mongru, D. A.et al. (2002), Hierarchical organization ofmodularity in metabolic networks, Science,Vol. 297, pp. 15511555.

    16. Milo, R., Shen-Orr, S., Itzkovitz, S. et al.(2002), Network motifs: Simple buildingblocks of complex networks, Science, Vol.298, pp. 824 827.

    17. Wuchty, S. (2002), Interaction and domainnetworks of yeast, Proteomics, Vol. 2,pp. 1715 1723.

    18. Shen-Orr, S. S., Milo, R., Mangan, S. andAlon, U. (2002), Network motifs in thetranscriptional regulation network ofEscherichia coli, Nat. Genet., Vol. 31,pp. 6468.

    19. Sacks, O. (2002), Forgetting and Neglect inScience, in Silver, R. B., Ed., HiddenHistories of Science, New York Review ofBooks, pp. 141 179.

    20. Barabasi, A. L. and Albert, R. (1999),Emergence of scaling in random networks,Science, Vol. 286, pp. 509512.

    21. Goodwin, B. (1993), How the LeopardChanged Its Spots: The Evolution ofComplexity, Princeton University Press,Princeton, NJ.

    22. Kauffman, S. A. (1993), The Origins ofOrder, Oxford University Press, New York,NY.

    23. Kauffman, S. A. (2000), Investigations,Oxford University Press, New York, NY.

    24. Ball, P. (1998), The Self-made Tapestry:Pattern Formation in Nature, OxfordUniversity Press, New York, NY.

    25. Coffey, D. S. (1998), Self-organization,

    complexity and chaos: The new biology formedicine, Nat. Med., Vol. 4, pp. 882 885.

    26. Strohman, R. C. (1997), The comingKuhnian revolution in biology, Nat.Biotechnol., Vol. 15, pp. 194200.

    27. Huang, S. (2000), The practical problems ofpost-genomic biology, Nat. Biotechnol., Vol.18, pp. 471 472.

    28. Oltvai, Z. N. and Barabasi, A. L. (2002),Systems biology. Lifes complexity pyramid,Science, Vol. 298, pp. 763764.

    29. Murphy, D. (2002), Gene expression studiesusing microarrays: Principles, problems, andprospects, Adv. Physiol. Educ., Vol. 26, pp.256270.

    30. Mollay, M. P. and Witzmann, F. A. (2002),Proteomics: Technologies and applications,Brief. Funct. Genom. Proteom., Vol. 1, pp. 2329.

    31. Marcotte, E. M. (2001), The path not taken,Nat. Biotechnol., Vol. 19, pp. 626627.

    32. Aebersold, R., Hood, L. E. and Watts, J. D.(2000), Equipping scientists for the newbiology, Nat. Biotechnol., Vol. 18, p. 359.

    33. Ideker, T., Thorsson, V., Ranish, J. A. et al.(2002), Integrated genomic and proteomic

    2 9 4 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

    Huang

  • 8/4/2019 Back to the Biology in Systems Biology

    17/19

    analyses of a systematically perturbedmetabolic network, Science, Vol. 292,pp. 929934.

    34. Endy, D. and Brent, R. (2001), Modelling

    cellular behaviour, Nature, Vol. 409(Suppl.),pp. 391395.

    35. Bray, D. (2003), Molecular networks: Thetop-down view, Science, Vol. 301, pp. 18641865.

    36. Nurse, P. (2003), Systems biology:Understanding cells, Nature, Vol. 424,p. 883.

    37. Stuart, J. M., Segal, E., Koller, D. and Kim.S. K. (2003), A gene-coexpression networkfor global discovery of conserved geneticmodules, Science, Vol. 302, pp. 249255.

    38. Tornow, S. and Mewes, H. W. (2003),Functional modules by relating protein

    interaction networks and gene expression,Nucleic Acids Res., Vol. 31, pp. 6283 6289.

    39. Bader, G. D., Heilbut, A., Andrews, B. et al.(2003), Functional genomics andproteomics: Charting a multidimensionalmap of the yeast cell, Trends Cell. Biol., Vol.13, pp. 344 356.

    40. Ge, H., Walhout, A. J. and Vidal, M. (2003),Integrating omic information: a bridgebetween genomics and systems biology,Trends Genet., Vol. 19, pp. 551560.

    41. Evans, G. (2000), Designer science and theomic revolution, Nat. Biotechnol., Vol. 18,p. 127.

    42. Eisen, M. B., Spellman, P. T., Brown, P. O.and Botstein, D. (1998), Cluster analysis anddisplay of genome-wide expression patterns,Proc. Natl. Acad. Sci. USA, Vol. 95,pp. 1486314868.

    43. Steffen, M., Petti, A., Aach, J. et al. (2002),Automated modelling of signal transductionnetworks, BMC Bioinformatics, Vol. 3, pp. 34.

    44. Shannon, P., Markiel, A., Ozier, O. et al.(2003), Cytoscape: a software environmentfor integrated models of biomolecularinteraction networks, Genome Res., Vol. 13,pp. 2498 2504.

    45. Bader, G. D. and Hogue, C. W. (2000),BIND-a data specification for storing and

    describing biomolecular interactions,molecular complexes and pathways,Bioinformatics, Vol. 16, pp. 465 477.

    46. Chen, Y. and Xu, D. (2003), Computationalanalyses of high-throughput proteinproteininteraction data, Curr. Protein Pept. Sci., Vol.4, pp. 159 181.

    47. Davidson, E. H., Rast, J. P., Oliveri, P. et al.(2002), A genomic regulatory network fordevelopment, Science, Vol. 295, pp. 16691678.

    48. Hasty, J., McMillen, D., Isaacs, F. and

    Collins, J. J. (2001), Computational studiesof gene regulatory networks: in numeromolecular biology, Nat. Rev. Genet., Vol. 2,pp. 268279.

    49. Dhaeseleer, P., Liang, S. and Somogyi, R.(2000), Genetic network inference: Fromco-expression clustering to reverseengineering, Bioinformatics, Vol. 16,pp. 707726.

    50. Gardner, T. S., di Bernardo, D., Lorenz, D.and Collins, J. J. (2003), Inferring geneticnetworks and identifying compound mode ofaction via expression profiling, Science, Vol.301, pp. 102105.

    51. Tamada, Y., Kim, S., Bannai, H. et al. (2003),Estimating gene networks from geneexpression data by combining Bayesiannetwork model with promoter elementdetection, Bioinformatics, Vol. 19 (Suppl. 2),

    pp. II227 II236.52. Tatusov, R. L., Galperin, M. Y., Natale,

    D. A. and Koonin, E. V. (2000), The COGdatabase: A tool for genome-scale analysis ofprotein functions and evolution, Nucleic AcidsRes., Vol. 28, pp. 33 36.

    53. Lynch, M., Conery, J. S. (2000), Theevolutionary fate and consequences ofduplicate genes, Science, Vol. 290, pp. 11511155.

    54. Eichler, E. E. and Sankoff, D. (2003),Structural dynamics of eukaryoticchromosome evolution, Science, Vol. 301,pp. 793797.

    55. Long, M., Deutsch, M., Wang, W. et al.(2003), Origin of new genes: Evidence fromexperimental and computational analyses,Genetica, Vol. 118, pp. 171 182.

    56. Williams, C. S. and DuBois, R. N. (1996),Prostaglandin endoperoxide synthase: whytwo isoforms?, Am. J. Physiol., Vol. 270,pp. G393 G400.

    57. Tanabe, T. and Tohnai, N. (2002),Cyclooxygenase isozymes and their genestructures and expression, Prostaglandins OtherLipid Mediat., Vol. 6869, pp. 95114.

    58. Kam, P. C. and See, A. U. (2000), Cyclo-oxygenase isoenzymes: Physiological andpharmacological role, Anaesthesia, Vol. 55,

    pp. 442449.

    59. Huang, S. and Ingber, D. E. (2000), Shape-dependent control of cell growth,differentiation and apoptosis: Switchingbetween attractors in cell regulatorynetworks, Exp. Cell Res., Vol. 261, pp.91103.

    60. Ashburner, M. and Lewis, S. (2002), Onontologies for biologists. The Gene Ontology

    untangling the web, Novartis Found Symp.,Vol. 247, pp. 6680.

    61. Wu, C. H., Huang, H., Arminski, L. et al.

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 29 5

    Back to the biology in systems biology

  • 8/4/2019 Back to the Biology in Systems Biology

    18/19

    (2002), The protein information resource:An integrated public resource of functionalannotation of proteins, Nucleic Acids Res.,Vol. 30, pp. 3537.

    62. von Mering, C., Krause, R., Snel, B. et al.(2002), Comparative assessment of large-scale data sets of protein-protein interactions,Nature, Vol. 417, pp. 399403.

    63. Deane, C. M., Salwinski, L., Xenarios, I. andEisenberg, D. (2002), Protein interactions:Two methods for assessment of the reliabilityof throughput observations, Mol. Cell.Proteomics, Vol. 1, pp. 349 356.

    64. Strogatz, S. H. (2001), Exploring complexnetworks, Nature, Vol. 410, pp. 268 276.

    65. Erdos, P. and Renyi, A. (1959), On randomgraphs, Publ. Math., Vol. 6, p. 290.

    66. Callaway, D. S., Hopcroft, J. E., Kleinberg,

    J. M. et al. (2001), Are randomly growngraphs really random?, Phys. Rev. E Stat.Nonlin. Soft Matter Phys., Vol. 64, 041902.

    67. Joy, M. P., Brock, A. and Huang, S.,manuscript in preparation.

    68. Itzkovitz, S., Milo, R., Kashtan, N. et al.(2003), Subgraphs in random networks,Phys. Rev. E Stat. Nonlin. Soft Matter Phys.,Vol. 68, 026127.

    69. Gould, S. J. and Lewontin, R. C. (1979),The spandrels of San Marco and thePanglossian paradigm: A critique of theadaptationist programme, Proc. R. Soc. Lond.,B, Biol. Sci., Vol. 205, pp. 581 598.

    70. Webster, G. and Goodwin, B. (1981),History and structure in biology, Perspect.Biol. Med., Vol. 25, pp. 39 62.

    71. Gould, S. J. (1966), Allometry and size inontogeny and phylogeny, Biol. Rev. Camb.Philos. Soc., Vol. 41, pp. 587 640.

    72. West, G. B., Brown, J. H. and Enquist, B. J.(1999), The fourth dimension of life: fractalgeometry and allometric scaling oforganisms, Science, Vol. 284, pp. 16771679.

    73. Thompson, D. W. (1942), Growth andForm, Macmillian, New York, NY.

    74. Seilacher, A. (1970), Arbeitskonzept zurKonstruktionsmorphologie, Lethaia, Vol. 3,

    pp. 393396.

    75. Autumn, K., Ryan, M. J. and Wake, D. B.(2002), Integrating historical and mechanisticbiology enhances the study of adaptation,Q. Rev. Biol., Vol. 77, pp. 383408.

    76. Ingber, D. E. (2003), Tensegrity I. Cellstructure and hierarchical systems biology,J. Cell Sci., Vol. 116, pp. 1157 1173.

    77. Hughes, A. J. and Lambert, D. M. (1984),Functionalism and structuralism, and Waysof Seeing , J. Theor. Biol., Vol. 111,pp. 787800.

    78. Mitchell Waldrop, M. (1992), Complexity.The emerging science at the edge of orderand chaos, Simon and Schuster, New York,NY.

    79. Bar-Yam, Y. (1997), Dynamics in complexsystems (studies in nonlinearity), PerseusPublishing New York, NY.

    80. Carlson, J. M. and Doyle, J. (2002),Complexity and robustness, Proc. Natl. Acad.Sci. USA, Vol. 99(Suppl. 1), pp. 25382545.

    81. Weibel, E. R. (1991), Fractal geometry: Adesign principle for living organisms, Am. J.Physiol., Vol. 1, pp. L361 L369.

    82. Darwin, C. (1964), On the origin of species:A facsimile of the first edition, HarvardUniversity Press, Cambridge, MA.

    83. Alon, U. (2003), Biological networks: Thetinkerer as an engineer, Science, Vol. 301,

    pp. 1866 1867.84. Wuchty, S., Oltvai, Z. N. and Barabasi, A. L.

    (2003), Evolutionary conservation of motifconstituents in the yeast protein interactionnetwork, Nat. Genet., Vol. 35, pp. 176179.

    85. Conant, G. C. and Wagner, A. (2003),Convergent evolution of gene circuits, Nat.Genet. Vol. 34, pp. 264 266.

    86. Lawrence, J. G. (2002), Shared strategies ingene organization among prokaryotes andeukaryotes, Cell, Vol. 110, pp. 407 413.

    87. Dabizzi, S., Ammannato, S. and Fani, R.(2001), Expression of horizontally transferredgene clusters: Activation by promoter-generating mutations, Res. Microbiol., Vol.152, pp. 539 549.

    88. Berg, J., Lassig, M. and Wagner, A. (2003),Structure and evolution of protein networks(in review). Available at http://arxiv.org/abs/cond-mat/0207711.

    89. Eisenberg, E. and Levanon, E. Y. (2003),Preferential attachment in the proteinnetwork evolution, Phys. Rev. Lett., Vol. 91,138701.

    90. Amaral, L. A., Scala, A., Barthelemy, M. andStanley, H. E. (2000), Classes of small-worldnetworks, Proc. Natl. Acad. Sci. USA, Vol.97, pp. 1114911152.

    91. Barabasi, A.-L. (2002), Linked: How

    everything is connected to everything andwhat it means, Perseus Publishing New

    York, NY.

    92. Hartwell, L. H., Hopfield, J. J., Leibler, S.and Murray, A. W. (1999), From molecularto modular cell biology, Nature, Vol.402(Suppl.), pp. C47C52.

    93. Albert, R., Jeong, H. and Barabasi, A.-L.(2000), Error and attack tolerance ofcomplex networks, Nature, Vol. 406, pp.378382.

    94. Fraser, H. B., Wall, D. P. and Hirsh, A. E.

    2 9 6 & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O N A L G E N O M I C S A N D P R O T E O M I C S. VOL 2. NO 4. 279297. FEBRUARY 2004

    Huang

  • 8/4/2019 Back to the Biology in Systems Biology

    19/19

    (2003), A simple dependence betweenprotein evolution rate and the number ofproteinprotein interactions, BMC Evol.Biol., Vol. 23, p. 11.

    95. Tyson, J. J., Chen, K. and Novak, B. (2001),Network dynamics and cell physiology, Nat.Rev. Mol. Cell. Biol., Vol. 2, pp. 908 916.

    96. Ferrell Jr, J. E. and Machleder, E. M. (1998),The biochemical basis of an all-or-none cellfate switch in xenopus oocytes, Science, Vol.280, pp. 895898.

    97. Bhalla, U. S., Ram, P. T. and Iyengar, R.(2002), MAP kinase phosphatase as a locus offlexibility in a mitogen-activated proteinkinase signaling network, Science, Vol. 297,pp. 1018 1023.

    98. Hoffmann, A., Levchenko, A., Scott, M. L.and Baltimore, D. (2002), The IkappaB-NF-kappaB signaling module: temporal control

    and selective gene activation, Science, Vol.295, pp. 12411245.

    99. Dueber, J. E., Yeh, B. J., Chak, K. and Lim,W. A. (2003), Reprogramming control of anallosteric signaling switch through modularrecombination, Science, Vol. 301, pp. 19041908.

    100. Huang, S (2002), Regulation of cellularstates in mammalian cells from a genome-wide view, in Collado-Vides, J. andHofestadt, R., Eds, Gene Regulation andMetabolism: Post-Genomic ComputationalApproach, MIT Press, Cambridge, MA.

    101. Tyson, J. J., Chen, K. C. and Novak, B.

    (2003), Sniffers, buzzers, toggles andblinkers: Dynamics of regulatory andsignaling pathways in the cell, Curr. Opin.Cell Biol., Vol. 15, pp. 221231.

    102. Waddington, C. H. (1956), Principles ofEmbryology, Allen and Unwin Ltd, London.

    103. Fox, J. J. and Hill, C. C. (2001), Fromtopology to dynamics in biochemicalnetworks, Chaos, Vol. 11, pp. 809 815.

    104. Levsky, J. M. and Singer, R. H. (2003),Gene expression and the myth of the averagecell, Trends Cell Biol., Vol. 13, pp. 46.

    105. McAdams, H. H. and Arkin, A. (1999), Its anoisy business! Genetic regulation at thenanomolar scale, Trends Genet., Vol. 15, pp.6569.

    106. Elowitz, M. B., Levine, A. J., Siggia, E. D.and Swain, P. S. (2002), Stochastic gene

    expression in a single cell, Science, Vol. 297,pp. 1183 1186.

    107. Blake, W. J., Kaern, M., Cantor, C. R. andCollins, J. J. (2003), Noise in eukaryoticgene expression, Nature, Vol. 422, pp.633637.

    108. Spudich, J. L. and Koshland Jr, D. E. (1976),Non-genetic individuality: Chance in thesingle cell, Nature, Vol. 262, pp. 467471.

    109. Takasuka, N., White, M. R., Wood, C. D.et al. (1998), Dynamic changes in prolactinpromoter activation in individual livinglactotrophic cells, Endocrinology, Vol. 139,pp. 1361 1368.

    & HENRY STEWART PUBLICATIONS 1473-9550. B R I E F I N G S I N F U N C T I O NA L G E N O M I C S A N D P R O T E O M I C S . VOL 2. NO 4. 279297. FEBRUARY 2004 29 7

    Back to the biology in systems biology