15
Supplementary Methods Sections: 1. Barcode distance matrix computation 2. Analysis of sources of error in MEMOIR operation, and their impact on reconstruction accuracy 3. Simulation of MEMOIR operation for missing cells and sparse trees 4. Inference of the switching rates between the two Esrrb gene expression states 1. Barcode distance matrix computation In this section we describe how we computed the matrix of pairwise barcode distances between cells (Figure 3e). These distances are based on the complete barcoded scratchpad profile of each cell, including both the total number of transcripts, denoted as , and the fraction of those transcripts that colocalized with scratchpad, denoted as . First, to account for statistical fluctuations and errors in the detected fraction of colocalized transcripts, we converted the measured value of to a distribution of possible values for the colocalization fraction. To account for measurement errors, we imposed a lower bound of 0.04 and an upper bound of 0.96 on the measured value of . To determine the distribution of possible values of , we assumed that fluctuations follow a binomial distribution with mean and variance (1 − ). This assumption follows from the physical reasoning that the probability of colocalization of one transcript is independent from that of other transcripts in the cell. The distribution of possible values of the colocalization fraction, , takes the form, () = *+ 1− -.* + To keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. ! ≈ ( + 1). To compute the cell-to-cell distance for a single barcode, we quantified how the colocalization fraction distribution differs for a given barcode between two cells. We denote the distribution of colocalization WWW.NATURE.COM/NATURE | 1 SUPPLEMENTARY INFORMATION doi:10.1038/nature20777

SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

1

SupplementaryMethods

Sections:1. Barcodedistancematrixcomputation2. AnalysisofsourcesoferrorinMEMOIRoperation,andtheirimpactonreconstruction

accuracy3. SimulationofMEMOIRoperationformissingcellsandsparsetrees4. InferenceoftheswitchingratesbetweenthetwoEsrrbgeneexpressionstates

1. Barcodedistancematrixcomputation

In this sectionwedescribehowwecomputedthematrixofpairwisebarcodedistancesbetweencells

(Figure 3e). These distances are based on the complete barcoded scratchpad profile of each cell,

includingboththetotalnumberoftranscripts,denotedas𝑁,andthefractionofthosetranscriptsthat

colocalizedwithscratchpad,denotedas𝑝.

First,toaccountforstatisticalfluctuationsanderrorsinthedetectedfractionofcolocalizedtranscripts,

weconvertedthemeasuredvalueof𝑝toadistributionofpossiblevaluesforthecolocalizationfraction.

Toaccountformeasurementerrors,weimposedalowerboundof0.04andanupperboundof0.96on

the measured value of 𝑝. To determine the distribution of possible values of 𝑝, we assumed that

fluctuations follow a binomial distribution with mean𝑁𝑝 and variance𝑁𝑝(1 − 𝑝). This assumption

follows from the physical reasoning that the probability of colocalization of one transcript is

independentfromthatofothertranscriptsinthecell.

Thedistributionofpossiblevaluesofthecolocalizationfraction,𝑥,takestheform,

𝐹(𝑥𝑁) = 𝑁𝑥

𝑝*+ 1 − 𝑝 -.* +

Tokeepnumerical computations tractable,weapproximated the factorials in thebinomial coefficient

usinggammafunctions,i.e.𝑁! ≈ 𝛤(𝑁 + 1).

Tocomputethecell-to-celldistanceforasinglebarcode,wequantifiedhowthecolocalizationfraction

distributiondiffersforagivenbarcodebetweentwocells.Wedenotethedistributionofcolocalization

WWW.NATURE.COM/NATURE | 1

SUPPLEMENTARY INFORMATIONdoi:10.1038/nature20777

Page 2: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

fractionofbarcode𝑖incell𝑎as𝐹56(𝑥)andthecomparabledistributionincell𝑏as𝐹86(𝑦).Wedefinethe

barcode𝑑6(𝑎, 𝑏)distancebetweenthetwocellsas,

𝑑6 𝑎, 𝑏 = 𝐹56 𝑥 𝐹86 𝑦 |𝑥 − 𝑦|𝑑𝑥𝑑𝑦-

=

-

=

Finally,theoverallcell-to-celldistancebasedonallbarcodeinformationwascalculatedastheaverage

over the distances derived for each barcode weighted for each barcode by the smaller number of

barcode transcripts ineitherof the twocells. Thisensures thatbarcodes thatarehighlyexpressed in

bothcellsareweightedmoreheavily thanbarcodes thatareweaklyexpressed ineitherorbothcells.

Denotingthenumberoftranscriptsofbarcode𝑖 incell𝑎as𝑁56 ,andsimilarlyforcell𝑏,thecell-to-cell

distancematrix,showninFig.3e,thentakestheform,

𝑑 𝑎, 𝑏 = min 𝑁56 , 𝑁86 𝑑6(𝑎, 𝑏)6

2. AnalysisofsourcesoferrorinMEMOIRoperation,andtheirimpactonreconstructionaccuracy

2a.IncorporatingnoiseintoMEMOIRsimulations

To understand what factors impact the overall reconstruction errors in MEMOIR, we simulated the

recording, readout, and reconstruction processes, incorporating relevant sources of stochastic

fluctuations(noise).Sourcesoferrorcanbedividedintothreecategories:(1)noiseinherenttothebasic

mechanismofstochasticscratchpadcollapse,(2)recordingnoise,and(3)readoutnoise(seeExtended

DataFig.6a).

Noise inherent to the system. The first noise category captures fluctuations due to the general

approachofrecording lineage information indiscretestochasticmutagenic(collapse)events,butdoes

not include noise arising from the specific experimental implementation of the system (those noise

sourcesaredescribedbelow).Reconstructionerrorsinthiscategoryresultfromcollapseeventsthatby

chance occur too frequently or too infrequently to provide lineage information, as well as identical

collapse events occurring in different cells (coincidences). For example, if the exact same scratchpad

collapseevent(s)occur independently in twosistercells, the resultingcladeswillbe indistinguishable.

The simulations shown in Figure 3g and Extended Data Figure 8 assume irreversible collapse of an

idealized set of binary scratchpads, occurring at a constant rate and therefore incorporate only this

WWW.NATURE.COM/NATURE | 2

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 3: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

source of error. Analysis of these simulations thus shows how the stochasticity of collapse events

impactsreconstructionaccuracyintheabsenceofadditionalsourcesofnoise.

Recording noise. Recording noise accounts for fluctuations in the collapse rate due to stochastic

fluctuationsintheexpressionlevelsofgRNAandCas9.Toaccuratelyrepresentthemagnitudeofthese

effects,wefirstmeasuredthesefluctuationsexperimentally,andthenincorporatedempiricalvaluesfor

thesefluctuationsintothesimulations.

First,wequantifiedgRNAexpressionvariabilitybothwithincolonies(intra-colony)andbetweencolonies

(inter-colony)using the co-expressed fluorescent reporter in individual cells inall of the108MEM-01

colonies.Wedenote the gRNAexpression in cell 𝑖 in colony𝑚 as𝑔6C. (ExtendedData Fig. 6b,c). The

intra-colony noise was computed by first determining the variance of the single-cell measurements

amongstcellsineachindividualcolonyandthenaveragingthatvarianceacrossallcolonies.Namely,

𝜎EF =1𝑁

1𝑛C

𝑔6C −1𝑛C

𝑔6CHI

6J-

FHI

6J-

+

CJ-

Here,𝑁isthetotalnumberofcolonies,and𝑛Cisthenumberofcellsincolony𝑚.

Similarly,the inter-colonynoisewascomputedbytakingthemeanof𝑔6Camongstthecells inagiven

colonyandthencomputingthevarianceofthecolonymeansacrossallcolonies.

𝜏EF =1𝑁

1𝑛C

𝑔6CHI

6J-

− 𝑔

F+

CJ-

where𝑔isthemeangRNAexpressionacrossallcellsinallcolonies,

𝑔 =1𝑁

1𝑛C

𝑔6CHI

6J-

+

CJ-

WWW.NATURE.COM/NATURE | 3

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 4: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

Second,wequantifiedvariabilityinCas9expressionusingthesameanalysis,performedonthenumber

ofCas9 transcripts,𝑐, readoutusingsmFISH in the final roundofhybridization.Wedenote the intra-

colony and inter-colony variation in Cas9 expression as 𝜎M and 𝜏M, respectively, and themean Cas9

expressionlevelacrossallcellinallcoloniesas𝑐.

We next incorporated the empirically determined noise measurements for Cas9 and gRNA into the

simulations.Wesettheeffectivecollapserateineachcell,𝜆,proportionaltotheproductofthenumber

ofCas9 transcripts,𝑐, and the gRNAactivity,𝑔:𝜆 = 𝑔𝑐. To incorporate recordingnoise,weassumed

each colony had its own values for 𝑐 and𝑔, drawn from the observed intercolony distributionswith

means𝑐and𝑔andstandarddeviations𝜏M and𝜏Erespectively.Next,weincorporatedtheintra-colony

noise,byaddingfluctuationstovaluesof𝑐and𝑔 ineachcell inagivencolony,withthemagnitudeof

fluctuationssetbytheempiricalintra-colonyvariations,𝜎M and𝜎E.Theprobabilityofcollapseforeach

scratchpadpergeneration,𝑝,wascalculatedfrom𝜆assumingaPoissonprocess,namely,𝑝 = 1 − 𝑒.PQ.

The normalization constant, 𝜂, was selected to ensure an average collapse probability of 0.1 per

scratchpadpergeneration,consistentwiththeempiricallyestimatedaveragecollapserateintheMEM-

01 colonies (seeFig. 2c andMethods).Weverified that the resulting fluctuations in the collapse rate

(from roughly 0.05 to 0.2 per scratchpad per generation) were consistent with the experimentally

observedfluctuationsinthelossofcolocalizationfractionbetweenindividualcellsacrossthe108MEM-

01coloniesusedintheanalysisshowninFigure3.Overall,asshowninFig.3iandExtendedDataFig.6d,

this recording noise produced only a relatively small decrease in the reconstruction accuracy of the

simulatedcolonies.

Readoutnoise.Thiscategoryincludes(a)stochasticexpressionofindividualbarcodedscratchpads,(b)

the effects of multiple integrations of the same barcoded scratchpad type, and (c) errors in smFISH

imagingreadout.Tosimulatetheimpactofreadoutnoiseonreconstructionaccuracy,wefirstobtained

theempiricaldistributionofsingle-celltranscriptcountsofeachofthe13scratchpadsinthecellsinall

108analyzedMEM-01colonies(seeFigurebelow).Thetranscriptcountdistributionsforthebarcoded

scratchpadswerewell-approximatedasgammadistributions(withanaveragemeanof5.4).Thegamma

distributionisexpectedwhenthereisaconstantprobabilityperunittimeofinitiatingatranscriptional

burst,andthenumberofmRNAsproducedperburstfollowsanexponentialdistribution,andhasbeen

observedempiricallytodescribediversestochasticgeneexpressionprocesses43,44.

WWW.NATURE.COM/NATURE | 4

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 5: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

Distributionofthenumberoftranscriptsofeachbarcodeinindividualcellsinthe108MEM-01coloniesanalyzedinFigure3.All13detectedbarcodesareshown.BlacklinesarefitstoGammadistributions.

We quantified the intra-colony and inter-colony variability in the expression levels of the barcoded

scratchpadsusing thesameproceduredescribedabove forgRNAandCas9. In this case,variancewas

calculatedforeachbarcodeandthensummedover13barcodespercelltoestimateeachcell’soverall

expression variance. These empirical parameterizations for the transcript count distributions were

incorporatedinthesimulations.

For readout, we first associatedwith each barcode a random number of transcripts drawn from the

gamma distribution associated with that barcode. Second, we incorporated additional noise from

smFISH imaging, based on experimental quantification of the accuracy of smFISH colocalization

detectionrates(Fig.2c).Ifascratchpadwasintact,thentheprobabilityofcolocalization,𝑝M,wassetto

0.87; if a scratchpad was collapsed, the probability of erroneous colocalization, 𝑝S, was set to 0.05.

Theseestimateswereobtainedfromourmeasurementsofcolocalizationfractionsbeforeactivationof

theMEM-01cells(Fig.2c),andareconsistentwithpreviousreportsonsmFISHfidelity16,17.Thenumber

ofscratchpadsthatcolocalizedforeachbarcodewasdrawnfrombinomialdistributionsbasedonthese

colocalizationdetectionprobabilities.Namely,thenumberofcolocalizedtranscripts𝑞fromatotalof𝑇

transcripts,foranintactintegrationofabarcode,wasdrawnfromthedistribution,

0 500

100

200

Cel

l cou

nts

BC: 2

0 50 1000

100

200BC: 9

0 10 20 300

100

200

300BC: 10

0 500

100

200BC: 12

0 500

100

200

300BC: 13

0 10 20 300

100

200

300

Cel

l cou

nts

BC: 14

0 10 200

200

400

BC: 16

0 10 20 300

100

200

300BC: 20

0 5 10 15Transcripts

0

200

400

600BC: 21

0 20 40Transcripts

0

100

200

300BC: 22

0 10 20 30Transcripts

0

100

200

300

Cel

l cou

nts

BC: 23

0 50Transcripts

0

50

100

150BC: 26

0 10 20Transcripts

0

200

400BC: 27

WWW.NATURE.COM/NATURE | 5

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 6: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

𝑝6HV5MV 𝑞; 𝑇 =𝑇𝑞𝑝MX 1 − 𝑝M Y.X

Similarly,thenumberofcolocalizedtranscriptsforacollapsedintegrationofabarcodewasdrawnfrom

thedistribution,

𝑝MZ[[5\]S^ 𝑞; 𝑡 =𝑇𝑞𝑝SX 1 − 𝑝S Y.X

Finally,weaccountedformultiple integrationsofthesamebarcodeasfollows: Ifabarcodehadmore

thanoneintegration,thetotalnumberoftranscriptsandthenumberofcolocalizedtranscriptsforthat

barcode were determined by summing the relevant transcripts from each individual integration.

Importantly,thedistributionoftheexpressionlevelofeachindividualintegrationwasselectedsuchthat

the distribution of the total number of transcripts (obtained by taking the convolution of the single-

integration distributions) matched the experimentally observed distribution for that barcode. For

example,ifabarcodehadtwointegrationsandanexpressiondistributioncharacterizedbyameanof10

transcriptsandshapeparameter2(fromthegammafit),theneachintegrationwasassumedtohavean

expressiondistributioncharacterizedbyameanof5transcriptsandshapeparameter1.

This procedure produced an ensemble of simulated colonies, where every cell was assigned a total

number of transcripts and a number of colocalized transcripts for each barcode (same as the

experimentaldata).Theresultingsimulatedcolonieswerethenreconstructedwiththesameneighbor-

joiningalgorithmusedontheexperimentaldata(seeMethods).

Impactof readoutnoiseonoverall reconstructionaccuracy. Figure3iandExtendedDataFigure6e,f

showthatnoiseinscratchpadexpressionandmultipleincorporationsofthesamebarcodedscratchpad

account formostof the reconstructionerror.Bycontrast,othercomponents suchas recordingnoise,

noiseintrinsictotheMEMOIRdesign,andnoisefromsmFISHimagingfidelitycontributedminimallyto

theoverallreconstructionerror.

WWW.NATURE.COM/NATURE | 6

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 7: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

Finally, to determine if the addition of fluctuations to the simulations matched the experimentally

observeddecrease in reconstructionaccuracy,wesimulated treeswithall threecomponentsofnoise

for various numbers of integrations per barcode and compared the distribution of reconstruction

accuracytothatobservedforthe108experimentalcolonies.AsshowninExtendedDataFigure6f,the

distribution of reconstruction accuracy obtained from simulations is generally consistent with the

experimentally observed distribution, with an effective value of 2 integration sites for each barcode.

Note that no fitting parameters were used besides the empirically measured fluctuations in the

recording and readout components. Together, these results indicate that this implementation of

MEMOIRbehavedasexpectedofasystemwith13scratchpadsandrealisticsourcesofnoise.Themain

sources of error, the stochastic expression of scratchpad transcripts and variable expression from

multiple integrations of the same barcoded scratchpad types, can be improved in future

implementations.

2b.Potentialsolutionstoaddresssourcesofnoiseandreduceerrors

HerewedescribewaystomitigatethekeysourcesofnoisecurrentlyaffectingMEMOIRreconstruction

accuracy:

● Noiseinherenttothesystem.Becausecollapseeventsarestochastic,similarcollapsepatterns

canarisebychanceindependently,creatingambiguities.Thelikelihoodofsucherrorsdecreases

rapidly as the number and diversity of barcoded scratchpads is increased. Further, tunable

control of Cas9 and gRNA expression or scratchpad affinity (e.g., by varying gRNA target

complementarity)allowsaveragecollapseratetobeoptimizedfordifferentregimes, including

accessingmore generations (ExtendedData Fig. 8b) or low collapse rates for studying sparse

trees(ExtendedDataFig.9).

● Readout noise: Multiple incorporations of barcoded scratchpad types. This is currently a

significant source of noise, as shown in Extended Data Fig. 6, and it is readily addressed as

above,e.g.,bystartingfromalargerandmorediversepoolofbarcodes.

● Readout noise: Stochastic scratchpad expression. Individual scratchpads are read out at the

RNAlevel,andthereforesubjecttofluctuationsintranscription.Thissourceofnoisethatcanbe

addressed by using 1) stronger promoters, 2) site-specific integration in well-expressed loci,

and/or 3) stabilized RNA transcripts (in order to build up signal over longer periods).

Alternatively,barcodeexpressionnoisecanbecircumventedbyemployingotherinsitulabeling

methods,likesingle-moleculeDNAFISH.

WWW.NATURE.COM/NATURE | 7

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 8: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

3. SimulationofMEMOIRoperationformissingcellsandsparsetrees

3a.Missingcells

To simulate reconstruction for treeswithmissing cells (ExtendedDataFig. 8c), the forwardalgorithm

described inMethods was used to generate a full binary tree of three generations, which was then

pruned by removing a given number of randomly chosen endpoint cells. To reconstruct the lineage

relationshipsoftheremainingcells,weassignedeachofthepossible315reconstructionsascoregiven

by 𝐻6a − 𝑇6ab6,acF,where𝐻6a is theHammingdistancebetween thebarcodedscratchpadcollapse

patterns of cells 𝑖 and 𝑗, 𝑇6a is the lineage distance between cells 𝑖 and 𝑗 from that particular

reconstruction(1forsisters,2forfirstcousins,and3forsecondcousins),andthesummationrunsover

all pairs of cells. Matrices 𝐻 and 𝑇 were normalized so that their elements summed to one. This

reconstruction score format was useful not only in scoring the topology of possible lineage

arrangements,butalsoinestimatingbranchlengths(e.g.,whethertwocellswerelikelycloselyrelated

sistersormoredistantlyrelatedcousins).Thereconstructionwiththelowestscorewasselectedasthe

reconstructedtreeandusedtocomputethefractionofrelationshipsthatwerecorrectlyidentified(i.e.,

fraction of correct relationships) shown in the heat-maps. If multiple reconstructions had the same

lowestscore,thefractionofcorrectrelationshipswasaveragedoverallofthem.

3b.SimulationofMEMOIRoperationforsparsetrees

Tosimulatesparse trees,weassumedcollapseevents inagiven lineageoccurata specifiedconstant

collapse rate λ (per cell per generation). The sparse regime occurs when λ < 1. Because generating

hundreds of collapse events requires large trees of many generations, we did not simulate the

underlying proliferating binary tree. Instead, we directly generated a tree of collapse events with

variable branch lengths corresponding to the stochastic time intervals between collapse events.

Assuming a constant rate of collapse (a Poisson process),we randomly drewbranch lengths froman

exponentialdistributionwiththetimescalesetbythecollapserate,i.e.meanbranchlength∼1/λ.At

anygivenlevelofthetree,ifthetreehad𝑁branches,thetimeintervaltothenextbranchingeventwas

selectedfromanexponentialdistributionwithrate𝑁𝜆,andthebranchingeventwasrandomlyassigned

tooneofthe𝑁branches.Thebranchingprocesswascontinueduntilthedesirednumberofleaveswas

generated. ExtendedData Figure 9c,d showsexamplesof sparse treeswith variousnumberof leaves

and tree depths (defined as the cumulative number of collapse events since the common ancestor

experienced by each leaf averaged over all the leaves of the tree). For clarity, the approximate

WWW.NATURE.COM/NATURE | 8

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 9: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

correspondence between these sparse trees (characterized by number of leaves and tree depth) and

underlyingfullbinarytrees(characterizedbycollapserateandnumberofgenerations)wasdetermined

bysimulatingfullbinarytreesinthesparseregimeandthenfindingparametersthatgeneratedeffective

sparse treeswith thedesirednumberof leavesanddepth (seethecartoon inExtendedDataFig.9a).

The collapse rate and number of generations of the full binary trees corresponding to the example

sparse trees are reported in Extended Data Figure 9c,d. To reconstruct sparse trees, we used the

generatedcollapsepatternattheleavesandaSankoffmaximumparsimonyalgorithm45.Thefractionof

correct partitions identified by the reconstructed trees was calculated using the Robinson-Foulds

distancemetric40.SimulationswereimplementedinMatlab,andPythonusingtheBiopythonpackage46

andETE3toolkit47.

4. InferenceoftheswitchingratesbetweenthetwoEsrrbgeneexpressionstates

Inthissection,wedescribehowweinferredthetransitionratesbetweenthelowandhighexpression

states of Esrrb. In 4a, we compute the occurrence frequencies of pairs of cells in the same gene

expressionstate.In4b,wepresentastraightforwardinferenceofthetransitionratesfromthedecayin

theobservedfrequencyofpairsofcellsinthesameexpressionstate.Finally,in4c,wepresentamore

involvedbutmoreaccurateinferencealgorithm.

4a.OccurrencefrequencyofrelatedpairsofcellsinthesameEsrrbgeneexpressionstateonMEMOIR

trees

Inthissubsection,wedescribehowtheco-occurrencefrequenciesofpairsofrelatedcellsinthesame

Esrrb expression state can be extracted from theMEMOIR reconstructed lineage trees and endpoint

geneexpressionmeasurements.

Tocomputethefrequencies,wefirstassigneachcellaprobabilityofbeinginoneofthetwoEsrrbgene

expressionstatesbasedonitsEsrrbmRNAcount.Todoso,wefitthebimodaldistributionofsingle-cell

Esrrb transcript counts (Figure 4b)with aweighted sum of two negative binomial distributions, each

correspondingtoonegeneexpressionstate.WedenotethedistributioncorrespondingtotheEsrrblow

expression state as𝑃.(𝑚), and the distribution corresponding to the Esrrb high expression state as

𝑃h(𝑚), where𝑚 is the number of transcripts in an individual cell. The bimodal distribution is fit to

𝑓𝑃. 𝑚 + 1 − 𝑓 𝑃h(𝑚),where𝑓isanadditionalfittingparameterthatcorrespondstothepopulation

fractionofcellsinthelowexpressionstate.

WWW.NATURE.COM/NATURE | 9

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 10: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

Withthefitparametersdetermined,theprobabilitythatanindividualcellwith𝑛EsrrbmRNAmolecules

isinthelowexpressionstate(E-)is

𝑝j.(𝑛) =𝑓𝑃. 𝑛

𝑓𝑃. 𝑛 + (1 − 𝑓)𝑃h 𝑛

Similarly,theprobabilitythatthecellisinthehighgeneexpressionstate(E+)issimply𝑝jh 𝑛 = 1 −

𝑝j.(𝑛).

Next,wegeneralize thisapproach to jointlyobserving thestatesofapairofcells. If cell1 is in theE-

statewithprobability𝑝j.- andcell2isintheE-statewithprobability𝑝j.F thentheprobabilitythatboth

cells are simultaneously in E- states is𝑝j.- 𝑝j.F . Similarly, the probability of observing cell 1 in the E+

stateandcell2intheE-stateis𝑝jh- 𝑝j.F .Wecandenotetheprobabilityofobservingthetwocellsinany

ofthefourpossiblepairsofstatesbya2×2matrix,

𝑝j.- 𝑝j.F 𝑝jh- 𝑝j.F

𝑝j.- 𝑝jhF 𝑝jh- 𝑝jhF

The entries in thismatrixmust sum to1. Finally, to compute the occurrence frequency of a pair of

sisters cells ineachof the fourpossiblepairsof states,weaverage theabovematrixoverallpairsof

sisterscells.

𝑪 1 =1

𝑁]6]VSm]𝑝j.6 𝑝j.

a 𝑝jh6 𝑝j.a

𝑝j.6 𝑝jha 𝑝jh6 𝑝jh

a6,a

wherethesummationrunsoverall𝑁]6]VSm]pairsofsisterscellsinthe30coloniesusedintheanalysis.

Wenotethatsincetheorderingofthesisterpairisarbitrary,weturntheresultingmatrixsymmetricby

averagingtheoff-diagonalterms.

𝑪 1 isa2x2matrix forsisters.Theargument ‘1’denotesthefact thatsistersare1generationapart.

Similarly, we define𝑪 2 as the frequency of observing a pair of first-cousin cells in a given pair of

WWW.NATURE.COM/NATURE | 10

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 11: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

states. 𝑪(2), is determinedby taking the average over all the observedpairs of first-cousin cells. The

occurrence frequencyof sisters, first cousins,andsecondcousins (𝑪(3)) in thesameEsrrbexpression

state(eitherbothE-,orbothE+)isplottedinFigure4d.

Finally,weestimatedthestatisticalerrorontheobservedfrequenciesintwoways(errorbarsinFigure

4d).First,fromthetotalnumberofobservedpairsofcellsandthecomputedfrequencies,weestimated

thecountingerrorassumingamultinomialdistribution.Thatis,foranobservedfrequency𝑓computed

using𝑁pairsofcells,thestatisticalerrorwasestimatedas 𝑓 1 − 𝑓 𝑁.Tounderstandhowmuchof

the similarity observed between states of closely related cells was just due to chance, we randomly

permutedthelineagerelationshipswithineachcolonyandrecomputedthefrequencies.Thestatistical

errorwasestimatedasthevarianceofthefrequenciescomputedover1000iterationsofsuchrandom

permutations. The first error estimate only accounts for the finite size of the total number of

observations.Thesecondestimateaccountsforcorrelationsbetweenthestatesofcloselyrelatedcells

duetochancewiththenumberofdifferentgeneexpressionstates ineachcolonyheldfixed.Thetwo

methods produced similar estimates reflecting the level intra-colony heterogeneity. Because the two

approaches do not capture independent statistical fluctuations, we combined them by taking the

maximumofthetwoestimates.

4b. Inferenceofswitchingdynamics fromreconstructed lineagetreesandendpointgeneexpression

measurements

Here,wedescribetheinferenceoftransitionratesfromthefrequenciesderivedintheprevioussection.

To do so, we first derive the relationship between transitions rates and the expected probability of

observingapairofrelatedcellsinthesamegeneexpressionstate.

We define a minimal model of stochastic and memoryless transitions between the two expression

states.WeassumethattheprobabilityoftransitioningfromtheE-statetotheE+statepergenerationis

constantanddenoteitas𝑇(𝐸h|𝐸.).TheprobabilitythatacellintheE-statewillretainitsstatefrom

one generation to the next is give by 𝑇 𝐸.|𝐸. = 1 − 𝑇(𝐸h|𝐸.). Similarly, the probability of

transitioningoutoftheE+statepergenerationisdenotedas𝑇(𝐸.|𝐸h).Thetransitionprobabilitiescan

becombinedtogetherintoatransitionratematrix,

WWW.NATURE.COM/NATURE | 11

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 12: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

𝑻 =𝑇(𝐸.|𝐸.) 𝑇(𝐸.|𝐸h)𝑇(𝐸h|𝐸.) 𝑇(𝐸h|𝐸h)

First, we present a simple approach for inferring the transition rate matrix from the decay in the

observed frequencies of pairs of cells in the same gene expression state going from first-cousins to

second-cousins.Next,wepresentamore involvedderivationthatalsotakes intoaccountthevalueof

the observed frequencies in addition to how they decay from closely related cells to more distant

relatives.

Wecanrelatethefrequencyofobservingsecond-cousinsinthesamestate(eitherbothinE-orbothE+)

to the frequencyof observing the first-cousins in a particular pair of statesby taking into account all

possiblestatetransitionsgoingfromonegenerationtothenext.Itfollows,

𝑪-- 3 = 1 − 𝑇-F 1 − 𝑇-F 𝑪-- 2 + 2𝑇F- 1 − 𝑇-F 𝑪-F 2 + 𝑇F-𝑇F-𝑪FF(2)

𝑪FF 3 = 1 − 𝑇F- 1 − 𝑇F- 𝑪FF 2 + 2𝑇-F 1 − 𝑇F- 𝑪F- 2 + 𝑇-F𝑇-F𝑪--(2)

Wenumericallysolvedtheabovetwonon-linearequationsfor𝑇-Fand𝑇F- fortheobservedvaluesof

𝑪(2) and𝑪(3) (theobserved frequencieswere first corrected formisclassificationsof the two states

caused by measurement uncertainties, see below). To estimate the statistical error on the inferred

transitionsrates,weproducedsimulatedfrequencymatricesbasedontheestimatedstatisticalerrorof

the frequencies (see previous section).We solved for the transition rates for 10000 iterations of the

simulateddataandcomputedthestandarddeviationovertherates.Wefound𝑇-F = 0.10 ± 0.07and

𝑇F- = 0.04 ± 0.03.

4c.Amoreaccurateinferenceofthetransitionrates

Next, to get a more accurate set of inferred transition rates, we incorporated the values of the

frequencies in addition to their decay over the generations in our calculations. To do so, we first

calculatedthejoint-probabilityoffindingapairofsistercells instates𝑖and𝑗usingthetransitionrate

matrix.

𝑪6a 1 = 𝑻 𝑖 𝑠 𝑻 𝑗 𝑠 𝑝(𝑠)]Jjv,jw

WWW.NATURE.COM/NATURE | 12

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 13: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

Thesummation inaboveequationrunsover thetwopossiblestatesof theparentcell.Assumingthat

thepopulationfractionsoftheE-andE+cellsdonotchangesignificantlyduringthethreegenerations

spanned by the experiment, 𝑝(𝐸.) and 𝑝 𝐸h can be set approximately equal to the population

fractions derived from fitting the bimodal Esrrb transcript-count distribution (see previous section).

Namely,𝑝 𝐸. = 𝑓 and𝑝 𝐸h = 1 − 𝑓.We experimentally validated the assumption that theEsrrb

transcript-countdistributiondoesnotchangesignificantlyinourexperimentalconditionsoveraperiod

of48hoursorapproximatelyfourgenerations(seeExtendedDataFig.10).

Tokeepthepopulationfractionsconstant,thenumberofcellstransitioningfromtheE-statetotheE+

statemust be equal to the number transitioning in the reverse direction. Hence,𝑇 𝐸h 𝐸. 𝑝 𝐸. =

𝑇 𝐸. 𝐸h 𝑝(𝐸h).Withthissimplification,

𝑪6a 1 = 𝑻 𝑖 𝑠 𝑻 𝑠 𝑗 𝑝 𝑗 = 𝑝 𝑗 𝑻F(𝑖|𝑗)]Jjv,jw

where𝑻F(𝑖|𝑗) denotes the 𝑖, 𝑗thelementof the squareof the transitionmatrix. Similarly, the joint-

probabilityofobservingapairofcousins,andsecond-cousins instates 𝑖and𝑗 is respectivelyequal to

𝑪6a 2 = 𝑝 𝑗 𝑻x(𝑖|𝑗)and𝑪6a 3 = 𝑝(𝑗)𝑻y(𝑖|𝑗).

To infer the state transition rates, we used the measured frequencies and population fractions and

solvedtheaboveequationsfor𝑻.

However, we first need to introduce a correction to the measured frequencies to account for

misclassificationofexpressionstateswhentheyareprobabilisticallyassigned.Namely, it ismorelikely

to misclassify a cell in the E- state as E+ than vice versa because of the asymmetrical shape of the

distributions𝑃±(𝑚) (seethe fits inFigure4b).Tocorrect formisclassifications,wefirstcomputedthe

probabilityofassigningacellinstate𝑗mistakenlytostate𝑖bysimplyintegratingtheprobabilitythata

a cell with 𝑛 Esrrb transcripts is assigned to state 𝑖 (computed in the previous section) over the

probabilitydistributionofobserving𝑛transcriptsinacellinstate𝑗.

𝑸6a = 𝑓𝑃6 𝑛

𝑓𝑃. 𝑛 + 1 − 𝑓 𝑃h 𝑛𝑃a(𝑛)𝑑𝑛

WWW.NATURE.COM/NATURE | 13

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 14: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

Forouranalysis,E-cellsaremistakenlyassignedtoE+cellswithprobability0.12,whereasE+cellsare

mistakenlyassignedtotheE-statewithprobability0.04.

Matrix𝑸effectivelyintroducesapparenttransitionsbetweenthetwostatesatthemeasurementstage.

Theseadditionaltransitionscanbeexplicitlyincludedinthecalculationofthejoint-probabilityoffinding

apairofsistercellsinstates𝑖and𝑗,

𝑪6a 1 = 𝑸𝒊𝒌𝑻 𝑘 𝑠 𝑸𝒋𝒍𝑻 𝑙 𝑠 𝑝 𝑠 =𝒍𝒌]Jjv,jw

𝑸𝒊𝒌𝑻 𝑘 𝑠 𝑻 𝑠 𝑙 𝑸𝒍𝒋𝑻 𝑝 𝑙𝒍𝒌]Jjv,jw

= 𝑸𝒊𝒌𝑻𝟐 𝑘 𝑙 𝑝(𝑙)𝑸𝒍𝒋𝑻

𝒍𝒌

However, theapparent transitions representedby𝑸donotcorrespondtoactual transitionsbetween

theEsrrbexpression states.Rather, they representmisclassificationsdue touncertainties in the state

assignment.Tocorrecttheestimate,weremovedthecontributionsoftheseapparenttransitionstothe

measured frequencies, by applying the inverse of matrix 𝑸 to the frequency matrices, 𝑪 =

𝑸.-𝑪 𝑸Y .-.

Finally,toinferthetransitionratesshowninFigure4d,wesolvedforthetransitionratematrix𝑻using

the corrected frequencymatrix for second-cousins , 𝑪(3).𝑪(3) exhibits a lower statistical error than

𝑪(2)and𝑪(1)becausethenumberofpairsofsecondcousinsislargerthanthenumberofpairsoffirst-

cousins and sisters. To validate the inferred rates, we checked that inferring 𝑻 using the corrected

frequencymatricesforfirst-cousins𝑪(2)andsisters𝑪(1)producedconsistenttransitionrates.Finally,

to estimate the statistical error of the inferred rates we used the estimated statistical error in the

measuredfrequencies(seeprevioussection)andproducedsimulatedfrequencymatrices.Theerroron

theinferredrateswasestimatedasthestandarddeviationoftheratesinferredover10000iterationsof

simulateddata.ResultsareshowninFigure4d.

WWW.NATURE.COM/NATURE | 14

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777

Page 15: SU PPLEMENTAR Y INFORMATIONTo keep numerical computations tractable, we approximated the factorials in the binomial coefficient using gamma functions, i.e. !! ≈ 1(! + 1). To compute

WWW.NATURE.COM/NATURE | 15

SUPPLEMENTARY INFORMATIONRESEARCHdoi:10.1038/nature20777