6
Leading Edge Essay Statistical Mechanics of Pluripotency Ben D. MacArthur 1, * and Ihor R. Lemischka 2 1 Centre for Human Development, Stem Cells and Regeneration, Institute of Developmental Sciences, School of Mathematics, and Institute for Life Sciences, University of Southampton, SO17 1BJ, UK 2 Department of Developmental and Regenerative Biology, Department of Pharmacology and Systems Therapeutics, Black Family Stem Cell Institute, Mount Sinai School of Medicine, New York, NY 10029, USA *Correspondence: [email protected] http://dx.doi.org/10.1016/j.cell.2013.07.024 Recent reports using single-cell profiling have indicated a remarkably dynamic view of pluripotent stem cell identity. Here, we argue that the pluripotent state is not well defined at the single-cell level but rather is a statistical property of stem cell populations, amenable to analysis using the tools of statistical mechanics and information theory. Pluripotent stem cells (PSCs) are able to self-renew indefinitely in culture and pro- duce all embryonic cell lineages in vivo. Experimentally, pluripotency is assessed using a variety of functional criteria including the ability to differentiate to all three germ layers in vitro and form tera- tomas in vivo and, in mice, the ability to contribute substantially to development when introduced into embryos (Martı´ et al., 2013). These assays reliably dis- tinguish functionally pluripotent from nonpluripotent cell populations and are essential to stem cell research. However, it is notable that they do not generally assess the potency of individual stem cells, but rather the regenerative potential of stem cell derived populations. For homogeneous populations, in which all cells are qualitatively the same, the infer- ence of single-cell identity from popu- lation behavior is reasonable. However, for heterogeneous populations in which qualitatively different subpopulations of cells coexist, the validity of this inference is not clear. Here, we will argue that this distinction is critical and that appropri- ately defined variability is an essential feature of pluripotent cell populations, reflective of the latent potential of individ- ual PSCs to rapidly increase cellular diver- sity during early development in vivo. Recent studies using high-throughput single-cell gene expression profiling have uncovered a surprising degree of cell-to-cell variability within apparently functionally homogeneous PSC popula- tions. In particular, a number PSC identity regulators including key transcription factors such as Nanog, Rex1, and Klf4 have been found to exhibit significant expression level variability within single cells (Canham et al., 2010; Chambers et al., 2007; Hayashi et al., 2008; Kalmar et al., 2009; Kobayashi et al., 2009; MacArthur et al., 2012; Macfarlan et al., 2012; Niwa et al., 2009; Singh et al., 2007; Toyooka et al., 2008; Trott et al., 2012; Zalzman et al., 2010). Interestingly, these variations do not simply define distinct static subpopulations. Rather, intracel- lular expression fluctuations appear to give rise to a state of ‘‘dynamic equilib- rium’’ in which individual cells transit sto- chastically between distinct metastable states yet the overall structure of the population remains stable (Hayashi et al., 2008). Because individual cells interconvert between distinct metastable states, this extraordinary dynamic vari- ability can endow stem cell populations with remarkable robustness to perturba- tion—for instance, temporary deletion of any particular subpopulation—and may therefore be a significant functional prop- erty of PSC communities (Chambers et al., 2007; Silva and Smith, 2008). Although PSC population variability may be decreased in vitro using defined culture conditions (for instance, in which Mek and Gsk3b activity are selectively inhibited; conditions thought to more faithfully mimic the in vivo environment of the inner cell mass) (Macfarlan et al., 2012; Marks et al., 2012; Miyanari and Torres-Padilla, 2012), it is nevertheless thought to have physiological importance in the in vivo developmental program (Hayashi et al., 2008). For example, the homeodomain transcription factor Nanog, one of the most widely studied yet still incompletely understood regula- tors of pluripotency (Chambers et al., 2003; Chambers et al., 2007; Mitsui et al., 2003; Silva et al., 2009), has been observed to exhibit large, apparently stochastic, temporal fluctuations in expression within individual Oct4 positive embryonic stem cells (ESCs) (Chambers et al., 2007; Kalmar et al., 2009; Miyanari and Torres-Padilla, 2012). These fluctua- tions apparently temporarily sensitize in- dividual ESCs to differentiation-inducing signals, transiently priming them for differentiation without marking definitive commitment. Expression variations of other pluripotency markers such as Rex1 (Toyooka et al., 2008) and Stella (Hayashi et al., 2008) as well as lineage regulators such as Hex (Canham et al., 2010) and Hes1 (Kobayashi et al., 2009) have similarly been observed to confer transient lineage biases to PSC subpopu- lations. These fluctuations may be impor- tant because collectively they combine to allow the entire population to respond appropriately to a wide variety of persis- tent environmental signals while avoid- ing premature lineage commitment in response to transitory stimuli (Graf and Stadtfeld, 2008). However, the origin and ultimate biological significance of such fluctuations are currently unclear (see Note Added in Proof). Taken together, these reports indicate a remarkably dynamic view in which indi- vidual stem cells are apparently not closely regulated, yet well-defined and robust population-level behavior emerges from stochastic dynamics at the single- cell level (Silva and Smith, 2008). This interdependence of the population and the individual cell levels is reminiscent of 484 Cell 154, August 1, 2013 ª2013 Elsevier Inc.

Statistical Mechanics of Pluripotency

  • Upload
    ihor-r

  • View
    217

  • Download
    2

Embed Size (px)

Citation preview

Page 1: Statistical Mechanics of Pluripotency

Leading Edge

Essay

Statistical Mechanics of Pluripotency

Ben D. MacArthur1,* and Ihor R. Lemischka21Centre for HumanDevelopment, StemCells andRegeneration, Institute of Developmental Sciences, School ofMathematics, and Institute for

Life Sciences, University of Southampton, SO17 1BJ, UK2Department of Developmental and Regenerative Biology, Department of Pharmacology and Systems Therapeutics, Black Family Stem Cell

Institute, Mount Sinai School of Medicine, New York, NY 10029, USA

*Correspondence: [email protected]

http://dx.doi.org/10.1016/j.cell.2013.07.024

Recent reports using single-cell profiling have indicated a remarkably dynamic view of pluripotentstem cell identity. Here, we argue that the pluripotent state is not well defined at the single-cell levelbut rather is a statistical property of stem cell populations, amenable to analysis using the tools ofstatistical mechanics and information theory.

Pluripotent stem cells (PSCs) are able to

self-renew indefinitely in culture and pro-

duce all embryonic cell lineages in vivo.

Experimentally, pluripotency is assessed

using a variety of functional criteria

including the ability to differentiate to all

three germ layers in vitro and form tera-

tomas in vivo and, in mice, the ability to

contribute substantially to development

when introduced into embryos (Martı

et al., 2013). These assays reliably dis-

tinguish functionally pluripotent from

nonpluripotent cell populations and are

essential to stem cell research. However,

it is notable that they do not generally

assess the potency of individual stem

cells, but rather the regenerative potential

of stem cell derived populations. For

homogeneous populations, in which all

cells are qualitatively the same, the infer-

ence of single-cell identity from popu-

lation behavior is reasonable. However,

for heterogeneous populations in which

qualitatively different subpopulations of

cells coexist, the validity of this inference

is not clear. Here, we will argue that this

distinction is critical and that appropri-

ately defined variability is an essential

feature of pluripotent cell populations,

reflective of the latent potential of individ-

ual PSCs to rapidly increase cellular diver-

sity during early development in vivo.

Recent studies using high-throughput

single-cell gene expression profiling

have uncovered a surprising degree of

cell-to-cell variability within apparently

functionally homogeneous PSC popula-

tions. In particular, a number PSC identity

regulators including key transcription

factors such as Nanog, Rex1, and Klf4

have been found to exhibit significant

484 Cell 154, August 1, 2013 ª2013 Elsevier

expression level variability within single

cells (Canham et al., 2010; Chambers

et al., 2007; Hayashi et al., 2008; Kalmar

et al., 2009; Kobayashi et al., 2009;

MacArthur et al., 2012; Macfarlan et al.,

2012; Niwa et al., 2009; Singh et al., 2007;

Toyooka et al., 2008; Trott et al., 2012;

Zalzman et al., 2010). Interestingly, these

variations do not simply define distinct

static subpopulations. Rather, intracel-

lular expression fluctuations appear to

give rise to a state of ‘‘dynamic equilib-

rium’’ in which individual cells transit sto-

chastically between distinct metastable

states yet the overall structure of the

population remains stable (Hayashi

et al., 2008). Because individual cells

interconvert between distinct metastable

states, this extraordinary dynamic vari-

ability can endow stem cell populations

with remarkable robustness to perturba-

tion—for instance, temporary deletion of

any particular subpopulation—and may

therefore be a significant functional prop-

erty of PSC communities (Chambers

et al., 2007; Silva and Smith, 2008).

Although PSC population variability

may be decreased in vitro using defined

culture conditions (for instance, in which

Mek and Gsk3b activity are selectively

inhibited; conditions thought to more

faithfully mimic the in vivo environment

of the inner cell mass) (Macfarlan et al.,

2012; Marks et al., 2012; Miyanari and

Torres-Padilla, 2012), it is nevertheless

thought to have physiological importance

in the in vivo developmental program

(Hayashi et al., 2008). For example,

the homeodomain transcription factor

Nanog, one of the most widely studied

yet still incompletely understood regula-

Inc.

tors of pluripotency (Chambers et al.,

2003; Chambers et al., 2007; Mitsui

et al., 2003; Silva et al., 2009), has been

observed to exhibit large, apparently

stochastic, temporal fluctuations in

expression within individual Oct4 positive

embryonic stem cells (ESCs) (Chambers

et al., 2007; Kalmar et al., 2009; Miyanari

and Torres-Padilla, 2012). These fluctua-

tions apparently temporarily sensitize in-

dividual ESCs to differentiation-inducing

signals, transiently priming them for

differentiation without marking definitive

commitment. Expression variations of

other pluripotency markers such as

Rex1 (Toyooka et al., 2008) and Stella

(Hayashi et al., 2008) as well as lineage

regulators such as Hex (Canham et al.,

2010) and Hes1 (Kobayashi et al., 2009)

have similarly been observed to confer

transient lineage biases to PSC subpopu-

lations. These fluctuations may be impor-

tant because collectively they combine

to allow the entire population to respond

appropriately to a wide variety of persis-

tent environmental signals while avoid-

ing premature lineage commitment in

response to transitory stimuli (Graf and

Stadtfeld, 2008). However, the origin and

ultimate biological significance of such

fluctuations are currently unclear (see

Note Added in Proof).

Taken together, these reports indicate

a remarkably dynamic view in which indi-

vidual stem cells are apparently not

closely regulated, yet well-defined and

robust population-level behavior emerges

from stochastic dynamics at the single-

cell level (Silva and Smith, 2008). This

interdependence of the population and

the individual cell levels is reminiscent of

Page 2: Statistical Mechanics of Pluripotency

statistical mechanics, one of the most

successful and powerful theories in mod-

ern physics.

Statistical MechanicsStatistical mechanics was largely devel-

oped around the turn of the 20th century

by some of the great physicists of that

time (Maxwell, Boltzmann, Planck, etc.).

It provides a formal framework that relates

macroscopic bulk properties of matter,

or macrostates (such as temperature,

pressure, etc.), to the stochastic kinetics

of the microscopic elements (atoms and

molecules) of which matter is composed

(Pathria, 1996). For instance, an early

result from the kinetic theory of gasses

(a forerunner of modern statistical me-

chanics) demonstrated that the pressure

of an ideal gas in a closed container is

due to the collective impact of its con-

stitutive molecules with the walls of the

container, depending on the average

kinetic energy of the molecules. By

contrast, the detailed description of all

the molecules in the gas at any instant

(i.e., their instantaneous positions and

momenta) is an example of a microstate.

Because pressure only depends upon

average properties of the collection of

gas molecules, there are clearly very

many different molecular arrangements

that give rise to the same pressure and,

in general, numerous different inter-

changeable microstates may realize the

same macrostate.

This distinction between microstates

and macrostates is informative when

considering pluripotency. The standard

(albeit often implicit) view, based upon

population expression profiling, is that

pluripotency can, in principle, be associ-

ated with a unique, static, molecular pro-

file (Ivanova et al., 2002; Ramalho-Santos

et al., 2002). This approach has proven to

be remarkably successful at dissecting

the molecular circuitry of pluripotency

(Jaenisch and Young, 2008). However,

recent observations of dynamic variability

of single cells within PSC populations

indicate that a more nuanced per-

spective, which distinguishes between

pluripotency as a molecular state and

pluripotency as a function, is required

(Blau et al., 2001; Cahan and Daley,

2013). These reports indicate that the

pluripotent state is not unique but rather

appears to be analogous to a cellular

macrostate, compatible with a wide

variety of interchangeable molecular

microstates (patterns of gene/protein

expression). In this view, individual cells

within the population independently

explore a variety of different expression

states, in a manner regulated both by

genetic regulatory and signaling networks

and intrinsic gene/protein expression

noise (Paulsson, 2004). This continual

dynamic exploration transiently primes

each individual cell to respond to range

of different differentiation-inducing stim-

uli, depending upon its instantaneous

molecular state. Such dynamic variability

at the single-cell level naturally gives rise

to a diverse population that is continually

able to rapidly respond to a range of

environmental signals. Thus, in this

perspective, while population structure is

robustly maintained, there is no unique

pluripotent state at the single-cell level.

Rather, functional pluripotency emerges

spontaneously from the dynamic vari-

ability intrinsic to the pluripotent state.

Here, we will explore this idea, and its

consequences, further and consider the

ways in which tools from statistical

mechanics, as well as related methods

from information theory, may be used to

better understand the molecular and

cellular basis of pluripotency.

Connecting Dynamic Variabilitywith Population DiversityProtein and mRNA expression are inher-

ently stochastic processes regulated by

complex transcriptional, epigenetic, and

signaling networks (Paulsson, 2004).

The collective action of these multiple

interacting mechanisms can give rise to

robust cell-cell variability within a popu-

lation in complex ways that are difficult

to understand using experiment and

intuition alone. However, mathematical

models and methods can help dissect

this complexity (Liberali and Pelkmans,

2012).

In order to investigate cell-cell vari-

ability quantitatively it is useful to consider

the probability mass function p(x), which

denotes the likelihood of finding a cell at

position x = [x1, x2, . xn] at equilibrium,

where the vector x enumerates the copy

numbers of all the mRNAs/proteins of

interest (estimating p(x) from single-cell

data and extensions to the nonequilibrium

case present challenges that we will not

Cell 1

consider here; we assume for simplicity

that p(x) is accurately known). We will

refer to the set of all possible mRNA/pro-

tein expression patterns as the state

space. Theoretically, if there exists a

unique equilibrium distribution p(x), to

which every initial condition converges

sufficiently rapidly, then the underlying

stochastic process is said to be ergodic

and each cell in the population will inde-

pendently explore the same patterns of

expression within the state space in

accordance with p(x). In this case the

population is intrinsically robust to tar-

geted removal of any particular subpopu-

lation because the remaining cells in the

population will eventually ‘‘recolonize’’

(in state space) the removed subpopula-

tion and reconstitute the equilibrium dis-

tribution p(x). Similar reconstitution has

been experimentally observed with

respect to a number of key PSC markers,

including Nanog (Chambers et al., 2007;

Kalmar et al., 2009), Rex1 (Toyooka

et al., 2008), and Stella (Hayashi et al.,

2008) as well as with respect to the

stem cell surface marker Sca-1 in he-

matopoietic progenitor cells (Chang

et al., 2008) and between distinct pheno-

typic states within cancer cell populations

(Gupta et al., 2011), suggesting that

that the underlying stochastic processes

that regulate expression variation are

‘‘ergodic-like.’’ It remains to be seen

how widespread this property is. Howev-

er, it may be important because it con-

nects dynamic variability at the single-

cell level with diversity at the population

level. Indeed, this connection may

be central to functional pluripotency

because it theoretically endows every

individual cell with the latent ability to

reproduce the entire diverse population

through self-renewal divisions.

Statistical Mechanics andInformation TheoryClearly, not all patterns of mRNA/protein

expression will be equally likely and p(x)

may exhibit rich heterogeneity according

to the various expression patterns present

within the population. But how should

such ‘‘heterogeneity’’ actually be quanti-

fied? For complex multivariate distri-

butions a full description of variation

cannot be captured in a single number;

therefore, it depends on precisely how

‘‘heterogeneity’’ is defined. For example,

54, August 1, 2013 ª2013 Elsevier Inc. 485

Page 3: Statistical Mechanics of Pluripotency

if heterogeneity is taken to be synony-

mous with variability we might examine

the covariance structure of p(x), while if

heterogeneity is taken to mean ‘‘the num-

ber of qualitatively different subpopula-

tions’’ then we might examine the number

of distinct modes (or ‘‘peaks’’) in p(x) and

so forth. Each such method is useful and

each sheds light on different aspects of

what is meant by ‘‘heterogeneity.’’ A num-

ber of authors have considered such ap-

proaches (Buganim et al., 2012; Guo

et al., 2010; MacArthur et al., 2012). We

suggest that, in addition to such statistical

methods, another measure, the entropy,

may also be useful for assessing vari-

ability in stem cell populations.

Informally, entropy is used in statistical

mechanics and information theory as a

measure of ‘‘disorder’’ or ‘‘uncertainty’’

(Cover and Thomas, 2006; Pathria, 1996).

Formally, the entropy H(q) of a discrete

probabilitymass functionq(x) isdefinedas:

HðqÞ= � kX

x

qðxÞ log qðxÞ;

where the sum is taken over the entire

state space. In physical applications, k is

Boltzmann’s constant and natural loga-

rithms are used, whereas in information

theory k = 1, and the logarithm is usually

to the base 2 (in which case, entropy is

measured in bits). Clausius introduced

the modern notion of entropy in the mid

19th century to quantify the loss of useful

energy to heat in mechanical engines.

The statistical form given above was

derived by Gibbs in the 1870s and was

argued to be a general measure of uncer-

tainty by Shannon in the 1940s (Cover

and Thomas, 2006; Pathria, 1996). The

connection between entropy and uncer-

tainty is not immediately apparent from

the mathematical definition above but

can be understood using a simple

example. Consider flipping a (possibly

biased) coin, which has probability q of

obtaining a head and probability (1 � q)

of obtaining a tail. Applying the above for-

mula (and adopting the information-theo-

retic definition) the entropy of the coin

flip is H = � q log2q�ð1� qÞ log2ð1� qÞbits, which is zero when q = 0 or q = 1

(the coin is completely biased) and rea-

ches its maximum H = 1 bit, when q =

1/2 (the coin is completely fair). In other

words, the entropy increases as the

outcome of the coin flip becomes more

486 Cell 154, August 1, 2013 ª2013 Elsevier

uncertain. For our purposes, the entropy

H(p) of a cell population is the amount of

uncertainty concerning the molecular

state of a cell drawn at random from the

population. If the population has high

entropy then there is a large amount of

uncertainty concerning the identity of a

randomly selected cell, whereas if the

population has low entropy then there is

less uncertainty. Thus, entropy is a useful

measure of ‘‘variability’’: populations with

low entropy exhibit well-defined patterns

of mRNA/protein expression, whereas

those with high entropy exhibit diverse

patterns of expression.

Entropy is also an extremely useful tool

for analyzing the behavior of stochastic

processes, such as those defined by the

genetic regulatory networks that control

the mRNA and protein expression pat-

terns that ultimately define cell population

heterogeneity (Paulsson, 2004). In clas-

sical statistical mechanics the second

law of thermodynamics states that the

entropy of an isolated system increases

over time toward the maximum entropy

uniform distribution (in which all states

are equally likely). However, for biologi-

cally plausible systems, which interact

with their environment and are likely to

be subject to numerous complex regula-

tory constraints, the second law is not

directly applicable. Rather, subject to

certain reasonable assumptions, a related

quantity known as the relative entropy or

Kullback-Leibler divergence (with respect

to the equilibrium distribution) decreases

with time (Cover and Thomas, 2006;

Gardiner, 2009). For biologically relevant

dynamics, the entropy may not always

increase, and the equilibrium distribution

is not expected to be uniform except

in certain very specific circumstances

(Cover and Thomas, 2006; Gardiner,

2009). Rather, in general, regulatory inter-

actions between genes and proteins

introduce correlations in expression that

reduce uncertainty in expression patterns

and therefore reduce the entropy of the

population at equilibrium. Informally, the

equilibrium distribution maximizes en-

tropy subject to satisfying any imposed

regulatory constraints. This connection

suggests a broad principle: at equilibrium,

cell populations that are subject to

strict regulatory constraints should exhibit

well-defined and low entropy expression

patterns, whereas those that are subject

Inc.

to weaker regulatory constraints should

exhibit more diverse, higher entropy

expression patterns. Viewing variability

in this light indicates that PSC populations

may be more diverse than differentiated

populations because they are subject to

weaker regulatory constraints.

Statistical Mechanics ofPluripotencyThis reasoning above is in agreement with

recent reports that PSCs have a remark-

ably permissive and dynamic chromatin

structure that is globally enriched for

histone marks indicative of active tran-

scription, such as H3K4me3 (Meshorer

and Misteli, 2006). Although widespread

activation of gene expression is moder-

ated by numerous mechanisms, including

coexpression of repressive marks such

as H3K27me3 (at so-called bivalent

domains) and repression by polycomb

group complex proteins, this globally

hyperdynamic chromatin provides a

uniquely open regulatory environment in

which lineage markers are sporadically

expressed at low levels, allowing PSCs

to remain poised for differentiation while

keeping ‘‘all options open’’ (Efroni et al.,

2008; Gaspar-Maia et al., 2011). Although

this gene expression ‘‘noise’’ is buffered,

for instance by targeting of preinitiation

complexes for proteasomal degradation

(Szutorisz et al., 2006), it may naturally

give rise to variation in expression pat-

terns within the population (Efroni et al.,

2009). However, upon differentiation, this

transcriptionally hyperactive poised state

is gradually lost. Increasingly numerous

regions of heterochromatin form and

chromatin architectural proteins, such as

linker histone H1 and the heterochromatin

component HP1, become more tightly

bound, resulting in less expression noise

and more closely regulated patterns of

gene expression (Efroni et al., 2008).

Taken together, these results suggest

that the characteristically loose regulatory

architecture of PSCs imposes only weak

constraints on mRNA/protein expression,

which in turn naturally results in high-

entropy expression patterns within the

population. However, as differentiation

progresses and chromatin condenses,

expression patterns become more tightly

constrained and population entropy

decreases. Thus, this model predicts

that cell population entropy is positively

Page 4: Statistical Mechanics of Pluripotency

Figure 1. Entropy and Developmental PotencyThe permissive regulatory architecture of PSCs imposes weak constraints onmRNA/protein expression, giving rise to high-entropy expression patternswithin the population. As differentiation progresses expression patternsbecome more tightly constrained and population entropy decreases.

related to developmental po-

tency (see Figure 1).

ConclusionsHere, we have argued that it is

useful to think of pluripotency

as a statistical property,

similar to a macrostate in

statistical physics. In this

view, although individual

PSCs may have the potential

to produce pluripotent popu-

lations through self-renewal

divisions, the pluripotent state

is not well defined at the sin-

gle-cell level. Rather, func-

tional pluripotency emerges

spontaneously from the dy-

namic variability intrinsic to

the pluripotent state. During

normal development in vivo, this dynamic

variability is strictly spatiotemporally

regulated (Soriano and Jaenisch, 1986).

However, in vitro these restrictions are

largely released and, dependent upon

culture conditions, the intrinsic variability

in the population becomes apparent.

Intuitively, this view is quite simple: highly

variable expression patterns are equiva-

lent to, and responsible for, the neces-

sarily large number of developmental

‘‘choices’’ characteristic of PSCs. Indeed,

from this perspective, diverse expression

patterns may be the defining property of

PSC populations. In contrast, differenti-

ated cells have defined fates and limited

choices and therefore well-defined, less

diverse, expression patterns.

Despite recent advances in single-cell

profiling, a number of important issues

remain to be addressed. Foremost among

these is the extent to which observed

in vitro expression variability in PSC pop-

ulations reflects genuinely important as-

pects of development rather than the ef-

fects of poorly defined culture conditions

or artifacts due to the use of reporter cell

lines (Marks et al., 2012). Refinement of

PSC culture protocols, development of

procedures to reliably and robustly

assess the developmental potency of

individual cells, and generation of tech-

niques to profile the entire transcrip-

tome/proteome of large numbers of indi-

vidual cells will clarify these issues

(Newman et al., 2006; Schubert, 2011;

Tang et al., 2009; Ying et al., 2008). It will

be particularly interesting to use such

technical advances to compare patterns

of variability in different stem populations

in a range of physiologically relevant cul-

ture conditions and thereby determine

the extent to which quantitative mea-

sures of population diversity, such as

the expression entropy, relate to devel-

opmental potency. We anticipate that

the use of high-throughput screening

technologies, such as RNA interference

screens (Boutros and Ahringer, 2008),

for factors that regulate population vari-

ability in different stem cell populations

will also greatly improve our understand-

ing in this area. Similarly, advances

in recently developed techniques that

are able to noninvasively track multi-

dimensional expression changes over

time in large numbers of individual

cells will be particularly important to

dissect the molecular mechanisms, and

ultimate functional significance, of dy-

namic expression variability in single

cells (Schroeder, 2008; Sigal et al.,

2006; Spiller et al., 2010). Such experi-

mental advances will produce large

amounts of complex data. Therefore

concurrent with these experimental de-

velopments, new bioinformatic and

mathematical tools for analyzing and in-

terpreting high-throughput single-cell

expression data will also be required

(Tang et al., 2011). Ultimately, such com-

bined experimental and theoretical de-

velopments will advance our understand-

ing of how the intracellular molecular

circuitry of pluripotency not only controls

the behavior of individual cells but also

Cell 154, August 1

regulates variability within

PSC populations.

The notion that the function

of a stem cell population may

be controlled by the interplay

between deterministic and

stochastic mechanisms at

the single-cell level is concor-

dant with the extensive body

of work on the role of stochas-

ticity in regulating hematopoi-

etic stem cell fate decisions

(particularly the pioneering

work of McCulloch, Till and

Siminovitch in the 1960s and

Suda, Ogawa, and coworkers

in the 1980s) and recent ob-

servations of dynamic vari-

ability in cancer cell popula-

tions (Gupta et al., 2011;

MacArthur et al., 2009). This suggests

that a statistical view of cell identity and

function may also be appropriate in a

wide variety of different systems (Pelk-

mans, 2012). It remains to be seen if this

is the case. However, a more rigorous

evaluation of population variability and

its underlying causes and consequences,

as well as the development of methods to

dissect and control variability, will be crit-

ical to understand the molecular mecha-

nisms of pluripotency and early develop-

ment. Such an understanding will, in

turn, be necessary to take pluripotent

stem cells from the lab to the clinic.

ACKNOWLEDGMENTS

We thank Tom Fleming and Tim Sluckin for useful

feedback on earlier versions of this article.

REFERENCES

Blau, H.M., Brazelton, T.R., and Weimann, J.M.

(2001). The evolving concept of a stem cell: entity

or function? Cell 105, 829–841.

Boutros, M., and Ahringer, J. (2008). The art and

design of genetic screens: RNA interference. Nat.

Rev. Genet. 9, 554–566.

Buganim, Y., Faddah, D.A., Cheng, A.W., Itsko-

vich, E., Markoulaki, S., Ganz, K., Klemm, S.L.,

van Oudenaarden, A., and Jaenisch, R. (2012).

Single-cell expression analyses during cellular

reprogramming reveal an early stochastic and a

late hierarchic phase. Cell 150, 1209–1222.

Cahan, P., and Daley, G.Q. (2013). Origins and

implications of pluripotent stem cell variability

and heterogeneity. Nat. Rev. Mol. Cell Biol. 14,

357–368.

, 2013 ª2013 Elsevier Inc. 487

Page 5: Statistical Mechanics of Pluripotency

Canham, M.A., Sharov, A.A., Ko, M.S.H., and

Brickman, J.M. (2010). Functional heterogeneity

of embryonic stem cells revealed through transla-

tional amplification of an early endodermal tran-

script. PLoS Biol. 8, e1000379.

Chambers, I., Colby, D., Robertson, M., Nichols, J.,

Lee, S., Tweedie, S., and Smith, A. (2003). Func-

tional expression cloning of Nanog, a pluripotency

sustaining factor in embryonic stem cells. Cell 113,

643–655.

Chambers, I., Silva, J., Colby, D., Nichols, J., Nij-

meijer, B., Robertson, M., Vrana, J., Jones, K.,

Grotewold, L., and Smith, A. (2007). Nanog safe-

guards pluripotency and mediates germline devel-

opment. Nature 450, 1230–1234.

Chang, H.H., Hemberg, M., Barahona, M., Ingber,

D.E., and Huang, S. (2008). Transcriptome-wide

noise controls lineage choice in mammalian pro-

genitor cells. Nature 453, 544–547.

Cover, T.M., and Thomas, J.A. (2006). Elements of

information theory (Hoboken, NJ: Wiley-inter-

science).

Efroni, S., Duttagupta, R., Cheng, J., Dehghani, H.,

Hoeppner, D.J., Dash, C., Bazett-Jones, D.P., Le

Grice, S., McKay, R.D.G., Buetow, K.H., et al.

(2008). Global transcription in pluripotent embry-

onic stem cells. Cell Stem Cell 2, 437–447.

Efroni, S., Melcer, S., Nissim-Rafinia, M., and

Meshorer, E. (2009). Stem cells do play with dice:

a statistical physics view of transcription. Cell

Cycle 8, 43–48.

Gardiner, C.W. (2009). Stochastic methods (Berlin:

Springer).

Gaspar-Maia, A., Alajem, A., Meshorer, E., and

Ramalho-Santos, M. (2011). Open chromatin in

pluripotency and reprogramming. Nat. Rev. Mol.

Cell Biol. 12, 36–47.

Graf, T., and Stadtfeld, M. (2008). Heterogeneity of

embryonic and adult stem cells. Cell Stem Cell 3,

480–483.

Guo, G.J., Huss, M., Tong, G.Q., Wang, C.Y., Li

Sun, L., Clarke, N.D., and Robson, P. (2010).

Resolution of cell fate decisions revealed by

single-cell gene expression analysis from zygote

to blastocyst. Dev. Cell 18, 675–685.

Gupta, P.B., Fillmore, C.M., Jiang, G.Z., Shapira,

S.D., Tao, K., Kuperwasser, C., and Lander, E.S.

(2011). Stochastic state transitions give rise to

phenotypic equilibrium in populations of cancer

cells. Cell 146, 633–644.

Hayashi, K., Lopes, S.M.C.D., Tang, F., and Surani,

M.A. (2008). Dynamic equilibrium and heterogene-

ity of mouse pluripotent stem cells with distinct

functional and epigenetic states. Cell Stem Cell 3,

391–401.

Ivanova, N.B., Dimos, J.T., Schaniel, C., Hackney,

J.A., Moore, K.A., and Lemischka, I.R. (2002).

A stem cell molecular signature. Science 298,

601–604.

Jaenisch, R., and Young, R. (2008). Stem cells, the

molecular circuitry of pluripotency and nuclear re-

programming. Cell 132, 567–582.

488 Cell 154, August 1, 2013 ª2013 Elsevier

Kalmar, T., Lim, C., Hayward, P., Munoz-Descalzo,

S., Nichols, J., Garcia-Ojalvo, J., and Martinez

Arias, A. (2009). Regulated fluctuations in nanog

expression mediate cell fate decisions in embry-

onic stem cells. PLoS Biol. 7, e1000149.

Kobayashi, T., Mizuno, H., Imayoshi, I., Furusawa,

C., Shirahige, K., and Kageyama, R. (2009). The

cyclic gene Hes1 contributes to diverse differen-

tiation responses of embryonic stem cells. Genes

Dev. 23, 1870–1875.

Liberali, P., and Pelkmans, L. (2012). Towards

quantitative cell biology. Nat. Cell Biol. 14, 1233.

MacArthur, B.D., Ma’ayan, A., and Lemischka, I.R.

(2009). Systems biology of stem cell fate and

cellular reprogramming. Nat. Rev. Mol. Cell Biol.

10, 672–681.

MacArthur, B.D., Sevilla, A., Lenz, M., Muller, F.J.,

Schuldt, B.M., Schuppert, A.A., Ridden, S.J.,

Stumpf, P.S., Fidalgo, M., Ma’ayan, A., et al.

(2012). Nanog-dependent feedback loops regulate

murine embryonic stem cell heterogeneity. Nat.

Cell Biol. 14, 1139–1147.

Macfarlan, T.S., Gifford, W.D., Driscoll, S., Lettieri,

K., Rowe, H.M., Bonanomi, D., Firth, A., Singer, O.,

Trono, D., and Pfaff, S.L. (2012). Embryonic stem

cell potency fluctuates with endogenous retrovirus

activity. Nature 487, 57–63.

Marks, H., Kalkan, T., Menafra, R., Denissov, S.,

Jones, K., Hofemeister, H., Nichols, J., Kranz, A.,

Stewart, A.F., Smith, A., and Stunnenberg, H.G.

(2012). The transcriptional and epigenomic foun-

dations of ground state pluripotency. Cell 149,

590–604.

Martı, M., Mulero, L., Pardo, C., Morera, C., Car-

rio, M., Laricchia-Robbio, L., Esteban, C.R., and

Izpisua Belmonte, J.C. (2013). Characterization

of pluripotent stem cells. Nat. Protoc. 8,

223–253.

Meshorer, E., and Misteli, T. (2006). Chromatin in

pluripotent embryonic stem cells and differentia-

tion. Nat. Rev. Mol. Cell Biol. 7, 540–546.

Mitsui, K., Tokuzawa, Y., Itoh, H., Segawa, K.,

Murakami, M., Takahashi, K., Maruyama, M.,

Maeda, M., and Yamanaka, S. (2003). The home-

oprotein Nanog is required for maintenance of

pluripotency in mouse epiblast and ES cells. Cell

113, 631–642.

Miyanari, Y., and Torres-Padilla, M.E. (2012).

Control of ground-state pluripotency by allelic

regulation of Nanog. Nature 483, 470–473.

Newman, J.R.S., Ghaemmaghami, S., Ihmels, J.,

Breslow, D.K., Noble, M., DeRisi, J.L., and Weiss-

man, J.S. (2006). Single-cell proteomic analysis of

S. cerevisiae reveals the architecture of biological

noise. Nature 441, 840–846.

Niwa, H., Ogawa, K., Shimosato, D., and Adachi,

K. (2009). A parallel circuit of LIF signalling path-

ways maintains pluripotency of mouse ES cells.

Nature 460, 118–122.

Pathria, R. (1996). Statistical mechanics (Oxford,

UK: Butterworth Heinemann).

Paulsson, J. (2004). Summing up the noise in gene

networks. Nature 427, 415–418.

Inc.

Pelkmans, L. (2012). Cell Biology. Using cell-to-cell

variability—a new era in molecular biology. Sci-

ence 336, 425–426.

Ramalho-Santos, M., Yoon, S., Matsuzaki, Y., Mul-

ligan, R.C., and Melton, D.A. (2002). ‘‘Stemness’’:

transcriptional profiling of embryonic and adult

stem cells. Science 298, 597–600.

Schroeder, T. (2008). Imaging stem-cell-driven

regeneration in mammals. Nature 453, 345–351.

Schubert, C. (2011). Single-cell analysis: The

deepest differences. Nature 480, 133–137.

Sigal, A., Milo, R., Cohen, A., Geva-Zatorsky, N.,

Klein, Y., Liron, Y., Rosenfeld, N., Danon, T.,

Perzov, N., and Alon, U. (2006). Variability and

memory of protein levels in human cells. Nature

444, 643–646.

Silva, J., and Smith, A. (2008). Capturing pluripo-

tency. Cell 132, 532–536.

Silva, J., Nichols, J., Theunissen, T.W., Guo, G.,

van Oosten, A.L., Barrandon, O., Wray, J., Yama-

naka, S., Chambers, I., and Smith, A. (2009).

Nanog is the gateway to the pluripotent ground

state. Cell 138, 722–737.

Singh, A.M., Hamazaki, T., Hankowski, K.E., and

Terada, N. (2007). A heterogeneous expression

pattern for Nanog in embryonic stem cells. Stem

Cells 25, 2534–2542.

Soriano, P., and Jaenisch, R. (1986). Retroviruses

as probes for mammalian development: allocation

of cells to the somatic and germ cell lineages. Cell

46, 19–29.

Spiller, D.G., Wood, C.D., Rand, D.A., and White,

M.R.H. (2010). Measurement of single-cell dy-

namics. Nature 465, 736–745.

Szutorisz, H., Georgiou, A., Tora, L., and Dillon, N.

(2006). The proteasome restricts permissive tran-

scription at tissue-specific gene loci in embryonic

stem cells. Cell 127, 1375–1388.

Tang, F.C., Barbacioru, C., Wang, Y.Z., Nordman,

E., Lee, C., Xu, N.L., Wang, X.H., Bodeau, J.,

Tuch, B.B., Siddiqui, A., et al. (2009). mRNA-Seq

whole-transcriptome analysis of a single cell. Nat.

Methods 6, 377–382.

Tang, F., Lao, K., and Surani, M.A. (2011). Devel-

opment and applications of single-cell transcrip-

tome analysis. Nat. Methods 8(4, Suppl),

S6–S11.

Toyooka, Y., Shimosato, D., Murakami, K., Taka-

hashi, K., and Niwa, H. (2008). Identification

and characterization of subpopulations in undif-

ferentiated ES cell culture. Development 135,

909–918.

Trott, J., Hayashi, K., Surani, A., Babu, M.M., and

Martinez-Arias, A. (2012). Dissecting ensemble

networks in ES cell populations reveals micro-het-

erogeneity underlying pluripotency. Mol. Biosyst.

8, 744–752.

Ying, Q.L., Wray, J., Nichols, J., Batlle-Morera, L.,

Doble, B., Woodgett, J., Cohen, P., and Smith, A.

(2008). The ground state of embryonic stem cell

self-renewal. Nature 453, 519–523.

Zalzman, M., Falco, G., Sharova, L.V., Nishiyama,

A., Thomas, M., Lee, S.L., Stagg, C.A., Hoang,

Page 6: Statistical Mechanics of Pluripotency

H.G., Yang, H.T., Indig, F.E., et al. (2010). Zscan4

regulates telomere elongation and genomic stabil-

ity in ES cells. Nature 464, 858–863.

Note Added in Proof

Since the submission of this manuscript, several

reports have been published which question the

biological significance of previously reported

heterogeneity in Nanog expression. Using single-

molecule mRNA-FISH to quantify transcript

expression in single cells Faddah et al. observed

that coexpression patterns of a range of pluripo-

tency factors, including Nanog, are more uniformly

distributed within embryonic stem cell populations

than has previously been reported using heterozy-

gous loss-of-function knockin reporters. They

attribute previously described heterogeneity to

artifacts associated with reporter disruption of

endogenous gene expression. Because entropy

measures how uniformly expression probability is

spread over the available state space, these new

results are in accordance with the entropic classi-

fication of PSC diversity that we suggest, and

highlight the need to properly quantify, and distin-

guish between, ‘‘diversity’’ and ‘‘heterogeneity’’ of

expression patterns.

Cell 1

Additional references include the following:

Faddah, D.A., Wang, H., Cheng, A.W., Katz, Y.,

Buganim, Y., Jaenisch, R. (2013). Single-Cell Anal-

ysis Reveals that Expression of Nanog Is Biallelic

and Equally Variable as that of Other Pluripotency

Factors in Mouse ESCs. Cell Stem Cell 13, 23–29.

Filipczyk, A., Gkatzis, K., Fu, J., Hoppe, P.S.,

Lickert, H., Anastassiadis, K., and Schroeder, T.

(2013). Biallelic Expression of Nanog Protein in

Mouse Embryonic Stem Cells. Cell Stem Cell 13,

12–13.

Smith, A. (2013). Nanog Heterogeneity: Tilting at

Windmills? Cell Stem Cell 13, 6–7.

54, August 1, 2013 ª2013 Elsevier Inc. 489