B D T E Tracy Heath Ecology, Evolution, & Organismal Biology Iowa State University @trayc7 http://phyloworks.org 2015 SSB Workshop Ann Arbor, MI USA

Bayesian Divergence Time Estimation


Page 1: Bayesian Divergence Time Estimation

Bayesian Divergence Time Estimation

Tracy Heath

Ecology, Evolution, & Organismal Biology
Iowa State University

@trayc7
http://phyloworks.org

2015 SSB Workshop
Ann Arbor, MI USA

Page 2: Bayesian Divergence Time Estimation

Outline

Lecture & Demo: Bayesian Fundamentals
• Bayes theorem, priors and posteriors, MCMC
• Example: the birth-death process
• RevBayes Demonstration

break

Lecture: Bayesian Inference of Species Divergence Times
• Relaxed clock models: accounting for variation in substitution rates among lineages
• Tree priors and fossil calibration

break

BEAST v2 Tutorial: dating species divergences under the fossilized birth-death process

lunch

Course materials: http://phyloworks.org/resources/ssbws.html

Page 3: Bayesian Divergence Time Estimation

P B

For each of the topics in the pre-course survey, most of you feel that you “have some experience, but there’s much more to learn”.

[Figure: survey responses for five topics (diversification models, relaxed-clock models, Bayesian inference, model-based phylo, probability theory) on a scale of “unfamiliar”, “I’ve heard of it”, “familiar, but more to learn”, “pretty comfortable”, and “expert”.]

(29 participants responded to the survey)

Page 4: Bayesian Divergence Time Estimation

B I

Estimate the probability of a hypothesis (model) conditional on observed data.

The probability represents the researcher’s degree of belief.

Bayes’ Theorem specifies the conditional probability of the hypothesis given the data.

Page 5: Bayesian Divergence Time Estimation

B’ T

Bayesian Fundamentals

Page 10: Bayesian Divergence Time Estimation

B’ T

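The posterior probability of a discrete parameter δ conditional on the data D is

$$\Pr(\delta \mid D) = \frac{\Pr(D \mid \delta)\,\Pr(\delta)}{\sum_{\delta} \Pr(D \mid \delta)\,\Pr(\delta)}$$

The denominator, $\sum_{\delta} \Pr(D \mid \delta)\,\Pr(\delta)$, is the likelihood marginalized over all possible values of δ.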
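The discrete form of Bayes' theorem can be sketched in a few lines of Python. This is only an illustration: the function name `discrete_posterior`, the two hypotheses "A" and "B", and their prior and likelihood values are made up for the example and do not come from the lecture.

```python
def discrete_posterior(prior, likelihood):
    """Return Pr(delta | D) for every value of a discrete parameter delta."""
    # Numerator of Bayes' theorem, Pr(D | delta) * Pr(delta), for each value.
    joint = {d: likelihood[d] * prior[d] for d in prior}
    # Denominator: the likelihood marginalized (summed) over all values of delta.
    marginal = sum(joint.values())
    return {d: joint[d] / marginal for d in joint}


# Hypothetical inputs: two candidate values with equal prior probability.
prior = {"A": 0.5, "B": 0.5}
likelihood = {"A": 0.8, "B": 0.2}  # Pr(D | delta), invented for illustration
posterior = discrete_posterior(prior, likelihood)
print(posterior)
```

Because the prior is flat in this toy case, the posterior is simply the likelihood rescaled to sum to one.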

Bayesian Fundamentals

Page 11: Bayesian Divergence Time Estimation

B’ T

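The posterior probability density of a continuous parameter θ conditional on the data D is

$$f(\theta \mid D) = \frac{f(D \mid \theta)\,f(\theta)}{\int_{\theta} f(D \mid \theta)\,f(\theta)\,d\theta}$$

The denominator, $\int_{\theta} f(D \mid \theta)\,f(\theta)\,d\theta$, is the likelihood marginalized over all possible values of θ.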

Bayesian Fundamentals

Page 12: Bayesian Divergence Time Estimation

P

The distribution of θ before any data are collected is the prior

f(θ)

The prior describes your uncertainty in the parameters of your model.

Bayesian Fundamentals

Page 13: Bayesian Divergence Time Estimation

P

We may assume a gamma prior distribution on θ with a shape parameter α and a scale parameter β.

θ ∼ Gamma(α, β)

$$f(\theta \mid \alpha, \beta) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\,\theta^{\alpha-1}\,e^{-\theta/\beta}$$

[Figure: gamma prior density (Density) plotted against θ, for θ from 0 to 3.]

This requires us to assign values for α and β based on our prior belief, or we can place hyperpriors on these parameters if we are uncertain about their values.
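As a quick check on the shape/scale parameterization above, the density can be written directly in Python. This is a minimal sketch; the function name `gamma_pdf` and the test values of α and β are illustrative assumptions, not values used in the lecture.

```python
import math


def gamma_pdf(theta, alpha, beta):
    """Gamma density with shape alpha and scale beta (support: theta > 0)."""
    if theta <= 0:
        return 0.0
    return (theta ** (alpha - 1) * math.exp(-theta / beta)
            / (math.gamma(alpha) * beta ** alpha))


# Hypothetical hyperparameter choices, evaluated at a few values of theta.
for theta in (0.5, 1.0, 2.0):
    print(theta, gamma_pdf(theta, alpha=2.0, beta=0.5))
```

With α = 1 the expression reduces to an exponential density with mean β, which is a convenient sanity check.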

Bayesian Fundamentals

Page 14: Bayesian Divergence Time Estimation

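Example: The Birth-Death Process

A time machine allowed us to observe the dates of each speciation event in the history of extant bears

[Figure: time-tree of extant bears on a time axis with ticks at 0, 10, 20, and 30, spanning the Oligocene, Miocene, Pliocene, and Pleistocene (Paleogene, Neogene, Quaternary), with speciation events at 33.24, 27.9, 15.9, 13.1, 11.6, 9.95, and 2.8.]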

Example

Page 15: Bayesian Divergence Time Estimation

Example: The Birth-Death Process

We assume that the diversification of bears matches a birth-death process with parameters:

λ = speciation rate
μ = extinction rate

[Figure: time-tree of extant bears with speciation events at 33.24, 27.9, 15.9, 13.1, 11.6, 9.95, and 2.8.]

Example

Page 16: Bayesian Divergence Time Estimation

Example: The Birth-Death Process

The birth-death process allows us to compute the probability density of our observed time-tree (Ψ) conditional on any value of speciation (λ) and any value of extinction (μ).

f(Ψ | λ, μ)

This model states that Ψ ∼ BD(λ, μ)

Example

Page 17: Bayesian Divergence Time Estimation

Example: The Birth-Death Process

Another way of expressing Ψ ∼ BD(λ, μ) is with a probabilistic graphical model

Our “observed” time-tree is conditioned on some constant value of λ and μ.

Example

Page 18: Bayesian Divergence Time Estimation

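Example: The Birth-Death Process

If our time machine also allowed us to know the rates of speciation and extinction, we could easily calculate the likelihood of our observed tree:

$$f(\Psi \mid \lambda, \mu) = \frac{N!\,(\lambda-\mu)\,\lambda^{N-1}\,e^{-(\lambda-\mu)x_1}}{\lambda-\mu e^{-(\lambda-\mu)x_1}}\;\prod_{i=1}^{N-1}\frac{(\lambda-\mu)^2\,e^{-(\lambda-\mu)x_i}}{\left(\lambda-\mu e^{-(\lambda-\mu)x_i}\right)^2}$$

f(Ψ | λ = 0.5, μ = 0.1) = 3.49215e−32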

Example

Page 19: Bayesian Divergence Time Estimation

RevBayes Demo: The Birth-Death Process

RevBayes

Fully integrative Bayesian inference of phylogenetic parameters using probabilistic graphical models and an interpreted language

http://RevBayes.com

Example

Page 20: Bayesian Divergence Time Estimation

G M RB

Graphical models provide tools for visually & computationally representing complex, parameter-rich probabilistic models

We can depict the conditional dependence structure of various parameters and other random variables

Höhna, Heath, Boussau, Landis, Ronquist, Huelsenbeck. 2014. Probabilistic Graphical Model Representation in Phylogenetics. Systematic Biology. (doi: 10.1093/sysbio/syu039)

Page 21: Bayesian Divergence Time Estimation

RevBayes Demo: The Birth-Death Process

The Rev language specifying a birth-death model on the bear phylogeny with fixed values for λ and μ.

Example

Page 22: Bayesian Divergence Time Estimation

Example: The Birth-Death Process

What if we do not know λ and μ?

We can use frequentist or Bayesian methods for estimating their values.

Frequentist methods require us to find the values of λ and μ that maximize f(Ψ | λ, μ).

Bayesian methods use prior distributions to describe our uncertainty in λ and μ and estimate f(λ, μ | Ψ).

Example

Page 23: Bayesian Divergence Time Estimation

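Example: The Birth-Death Process

We must define prior distributions for λ and μ to estimate the posterior probability density

$$f(\lambda, \mu \mid \Psi, y, z) = \frac{f(\Psi \mid \lambda, \mu)\,f(\lambda \mid y)\,f(\mu \mid z)}{\int_{\lambda}\int_{\mu} f(\Psi \mid \lambda, \mu)\,f(\lambda \mid y)\,f(\mu \mid z)\,d\lambda\,d\mu}$$

Now y and z are the parameters of the prior distributions on λ and μ.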

Example

Page 24: Bayesian Divergence Time Estimation

Example: The Birth-Death Process

We can choose exponential prior distributions for λ and μ. Now y and z represent the rate parameters of the exponential priors.

λ ∼ Exponential(y)
μ ∼ Exponential(z)

$$f(\lambda \mid y) = y\,e^{-y\lambda}$$

$$f(\mu \mid z) = z\,e^{-z\mu}$$
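The two exponential priors combine into a joint prior by multiplication, or by addition on the log scale, which is how samplers usually handle them. The sketch below assumes independent priors as written above; the function names and the hyperparameter values y = z = 10 are hypothetical choices for illustration only.

```python
import math


def log_exponential_pdf(x, rate):
    """Log density of an Exponential(rate) distribution: log(rate) - rate * x."""
    if x < 0:
        return -math.inf
    return math.log(rate) - rate * x


def log_prior(lam, mu, y, z):
    """Joint log prior for independent lambda ~ Exp(y) and mu ~ Exp(z)."""
    return log_exponential_pdf(lam, y) + log_exponential_pdf(mu, z)


# Hypothetical hyperparameters y = z = 10 (prior mean of 0.1 for each rate).
print(log_prior(lam=0.5, mu=0.1, y=10.0, z=10.0))
```

Negative rates receive a log density of negative infinity, so a sampler using this prior would never accept them.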

Example

Page 25: Bayesian Divergence Time Estimation

RevBayes Demo: The Birth-Death Process

The Rev language specifying a birth-death model on the bear phylogeny with exponential priors on λ and μ.

Example

Page 26: Bayesian Divergence Time Estimation

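Example: The Birth-Death Process

Now that we have a defined model, how do we estimate the posterior probability density?

λ ∼ Exponential(y)
μ ∼ Exponential(z)

$$f(\lambda, \mu \mid \Psi, y, z) = \frac{f(\Psi \mid \lambda, \mu)\,f(\lambda \mid y)\,f(\mu \mid z)}{\int_{\lambda}\int_{\mu} f(\Psi \mid \lambda, \mu)\,f(\lambda \mid y)\,f(\mu \mid z)\,d\lambda\,d\mu}$$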

Example

Page 27: Bayesian Divergence Time Estimation

Markov Chain Monte Carlo (MCMC)

An algorithm for approximating the posterior distribution

Metropolis, Rosenbluth, Rosenbluth, Teller, Teller. 1953. Equation of state calculations by fast computing machines. J. Chem. Phys.

Hastings. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika.

Bayesian Fundamentals

Page 28: Bayesian Divergence Time Estimation

Markov Chain Monte Carlo (MCMC)

More on MCMC from Paul Lewis, our esteemed SSB President, and his lecture on Bayesian phylogenetics

Slides source: https://molevol.mbl.edu/index.php/Paul_Lewis

Bayesian Fundamentals

Page 29: Bayesian Divergence Time Estimation

Paul O. Lewis (2014 Woods Hole Molecular Evolution Workshop) 42

MCMC robot’s rules

Uphill steps are always accepted

Slightly downhill steps are usually accepted

Drastic “off the cliff” downhill steps are almost never accepted

With these rules, it is easy to see why the robot tends to stay near the tops of hills

Page 30: Bayesian Divergence Time Estimation

Paul O. Lewis (2014 Woods Hole Molecular Evolution Workshop) 43

(Actual) MCMC robot rules

Uphill steps are always accepted because R > 1

Slightly downhill steps are usually accepted because R is near 1

Drastic “off the cliff” downhill steps are almost never accepted because R is near 0

[Figure: hilly landscape with an elevation axis from 0 to 10, showing three proposed steps:]

Currently at 1.0 m, proposed at 2.3 m: R = 2.3/1.0 = 2.3

Currently at 6.2 m, proposed at 5.7 m: R = 5.7/6.2 = 0.92

Currently at 6.2 m, proposed at 0.2 m: R = 0.2/6.2 = 0.03

The robot takes a step if it draws a Uniform(0,1) random deviate that is less than or equal to R
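The robot's acceptance rule can be sketched as follows, using the three elevation pairs from the slide. The function names and the seed are illustrative assumptions; the elevations stand in for posterior densities, so this is a toy version of the Metropolis acceptance step rather than a full sampler.

```python
import random


def acceptance_ratio(current_height, proposed_height):
    """R = proposed / current, as in the robot example."""
    return proposed_height / current_height


def robot_takes_step(current_height, proposed_height, rng):
    """Accept the step if a Uniform(0,1) draw is less than or equal to R."""
    return rng.random() <= acceptance_ratio(current_height, proposed_height)


rng = random.Random(2015)  # arbitrary seed, used only for reproducibility
for current, proposed in [(1.0, 2.3), (6.2, 5.7), (6.2, 0.2)]:
    R = acceptance_ratio(current, proposed)
    n_accepted = sum(robot_takes_step(current, proposed, rng) for _ in range(10000))
    print(f"R = {R:.2f}: accepted {n_accepted / 10000:.1%} of 10000 proposals")
```

Uphill proposals have R > 1, so the uniform draw can never exceed R and the step is always taken; for downhill proposals the long-run acceptance frequency approaches R.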

Page 31: Bayesian Divergence Time Estimation

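Paul O. Lewis (2014 Woods Hole Molecular Evolution Workshop) 44

Cancellation of marginal likelihood

When calculating the ratio R of posterior densities, the marginal probability of the data cancels.

$$\underbrace{\frac{f(\theta^{*} \mid D)}{f(\theta \mid D)}}_{\text{Posterior odds}} = \frac{\dfrac{f(D \mid \theta^{*})\,f(\theta^{*})}{f(D)}}{\dfrac{f(D \mid \theta)\,f(\theta)}{f(D)}} = \underbrace{\frac{f(D \mid \theta^{*})}{f(D \mid \theta)}}_{\text{Likelihood ratio}} \times \underbrace{\frac{f(\theta^{*})}{f(\theta)}}_{\text{Prior odds}}$$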

Page 32: Bayesian Divergence Time Estimation

Paul O. Lewis (2014 Woods Hole Molecular Evolution Workshop) 45

Target vs. Proposal Distributions

Pretend this proposal distribution allows good mixing. What does good mixing mean?

Page 33: Bayesian Divergence Time Estimation

Paul O. Lewis (2014 Woods Hole Molecular Evolution Workshop) 46

Trace plots

[Figure: trace plot from default2.TXT of log(posterior) (y-axis ticks from -10 to 0) against State (0 to 17500).]

“White noise” appearance is a sign of good mixing

I used the program Tracer to create this plot: http://tree.bio.ed.ac.uk/software/tracer/

AWTY (Are We There Yet?) is useful for investigating convergence: http://king2.scs.fsu.edu/CEBProjects/awty/awty_start.php

Page 34: Bayesian Divergence Time Estimation

Paul O. Lewis (2014 Woods Hole Molecular Evolution Workshop) 47

Target vs. Proposal Distributions

Proposal distributions with smaller variance...

Disadvantage: robot takes smaller steps, more time required to explore the same area

Advantage: robot seldom refuses to take proposed steps

Page 35: Bayesian Divergence Time Estimation

Paul O. Lewis (2014 Woods Hole Molecular Evolution Workshop) 48

[Figure: trace plot from smallsteps.TXT of log(posterior) (y-axis ticks from -6 to 0) against State (0 to 17500).]

If step size is too small, large-scale trends will be apparent

Page 36: Bayesian Divergence Time Estimation

Paul O. Lewis (2014 Woods Hole Molecular Evolution Workshop) 49

Target vs. Proposal Distributions

Proposal distributions with larger variance...

Disadvantage: robot often proposes a step that would take it off a cliff, and refuses to move

Advantage: robot can potentially cover a lot of ground quickly

Page 37: Bayesian Divergence Time Estimation

Paul O. Lewis (2014 Woods Hole Molecular Evolution Workshop) 50

[Figure: trace plot from bigsteps2.TXT of log(posterior) (y-axis ticks from -12 to -2) against State (0 to 17500).]

Chain is spending long periods of time “stuck” in one place

“Stuck” robot is indicative of step sizes that are too large (most proposed steps would take the robot “off the cliff”)
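The trade-off between small and large proposal steps can be reproduced with a toy Metropolis sampler. This sketch makes several assumptions that are not from the slides: the target is a standard normal density, the proposal is a uniform sliding window of half-width `step_size`, and the three step sizes are arbitrary.

```python
import math
import random


def log_target(x):
    """Unnormalized log density of a standard normal (a stand-in target)."""
    return -0.5 * x * x


def acceptance_rate(step_size, n_iter=20000, seed=1):
    """Run a Metropolis sampler and return the fraction of accepted proposals."""
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    for _ in range(n_iter):
        proposal = x + rng.uniform(-step_size, step_size)
        log_R = log_target(proposal) - log_target(x)
        # Accept with probability min(1, R); the marginal likelihood never appears.
        if rng.random() < math.exp(min(0.0, log_R)):
            x = proposal
            accepted += 1
    return accepted / n_iter


for step_size in (0.1, 2.0, 50.0):
    print(step_size, acceptance_rate(step_size))
```

Very small windows are accepted almost every time but move slowly, while very large windows are rejected most of the time, which is the "stuck" behaviour seen in the trace plot.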

Page 38: Bayesian Divergence Time Estimation

Markov Chain Monte Carlo (MCMC)

Thanks, Paul!

Slides source: https://molevol.mbl.edu/index.php/Paul_Lewis

See MCMCRobot, a helpful software program for learning MCMC by Paul Lewis

http://www.mcmcrobot.org

Bayesian Fundamentals

Page 39: Bayesian Divergence Time Estimation

RevBayes Demo: The Birth-Death Process

The Rev language specifying the MCMC sampler for the birth-death model.

Example

Page 40: Bayesian Divergence Time Estimation

RevBayes Demo: The Birth-Death Process

The trace plot of the MCMC samples for speciation rate

[Figure: trace plot of speciation (y-axis ticks from 0 to 0.175) against State (0 to 30000).]

Example

Page 41: Bayesian Divergence Time Estimation

RevBayes Demo: The Birth-Death Process

Marginal posterior densities of the speciation rate and extinction rate.

[Figure: combined marginal posterior densities (Density, 0 to 50) of extinction and speciation, over rate values from 0 to 0.25.]

Example

Page 42: Bayesian Divergence Time Estimation

Example: The Birth-Death Process

Alas, we do not have a time machine and we do not know lineage divergence times without error.

[Figure: time-tree of extant bears with speciation events at 33.24, 27.9, 15.9, 13.1, 11.6, 9.95, and 2.8.]

Example

Page 43: Bayesian Divergence Time Estimation

Divergence Time Estimation

Goal: Estimate the ages of interior nodes to understand the timing and rates of evolutionary processes

Model how rates are distributed across the tree

Describe the distribution of speciation events over time

External calibration information for estimates of absolute node times

[Figure: dated tree of Ursidae (giant panda, spectacled bear, sun bear, Am. black bear, Asian black bear, brown bear, polar bear, sloth bear) with gray wolf and spotted seal, on a Time (My) axis from 60 to 0 spanning the Paleocene to the Pleistocene. Calibration fossils: Ailuropodinae, Arctodus, Ursus, pinnipeds, canids, and stem fossil Ursidae.]

(Figure from Heath et al., PNAS 2014)

Page 44: Bayesian Divergence Time Estimation

Outline

Lecture & Demo: Bayesian Fundamentals
• Bayes theorem, priors and posteriors, MCMC
• Example: the birth-death process
• RevBayes Demonstration

break

Lecture: Bayesian Inference of Species Divergence Times
• Relaxed clock models: accounting for variation in substitution rates among lineages
• Tree priors and fossil calibration

break

BEAST v2 Tutorial: dating species divergences under the fossilized birth-death process

lunch

Course materials: http://phyloworks.org/resources/ssbws.html

Page 47: Bayesian Divergence Time Estimation

A T-S E

Phylogenetic trees can provide both topological information and temporal information

[Figure: time-calibrated phylogeny of mammals, from Equus to Cheirogaleus, with the clades Simiiformes and Microcebus labeled, on an axis of Time (Millions of years) from 100 to 0 spanning the Cretaceous, Paleogene, Neogene, and Q.]

Understanding Evolutionary Processes (Yang & Yoder Syst. Biol. 2003; Heath et al. MBE 2012)

Page 48: Bayesian Divergence Time Estimation

The Global Molecular Clock

Assume that the rate of evolutionary change is constant over time

(branch lengths equal percent sequence divergence)

[Figure: three-taxon tree (A, B, C) with branch lengths of 20% and 10%, and time marks at 200 My and 400 My.]

(Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)

Page 49: Bayesian Divergence Time Estimation

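The Global Molecular Clock

We can date the tree if we know the rate of change is 1% divergence per 10 My

[Figure: three-taxon tree (A, B, C) with node N labeled, branch lengths of 20% and 10%, a 10% branch marked as 200 My, and time marks at 200 My and 400 My.]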

(Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)

Page 50: Bayesian Divergence Time Estimation

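The Global Molecular Clock

If we found a fossil of the MRCA of B and C, we can use it to calculate the rate of change & date the root of the tree

[Figure: three-taxon tree (A, B, C) with node N labeled, branch lengths of 20% and 10%, a 10% branch marked as 200 My, and the root at 400 My.]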

(Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)
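The fossil-calibration arithmetic for a strict clock can be written out explicitly. The sketch below reads the figure as follows, which is an interpretation rather than something stated in the text: node N is the MRCA of B and C, the fossil dates N at 200 My, the branches from N to the tips have length 10%, and the root-to-tip distance is 20%. The function names are hypothetical.

```python
def clock_rate(branch_divergence, elapsed_time):
    """Rate under a strict clock: change accumulated per unit time."""
    return branch_divergence / elapsed_time


def node_age(divergence_to_tip, rate):
    """Age of a node given its distance to the present-day tips and the rate."""
    return divergence_to_tip / rate


# The fossil dates node N at 200 My; the branch from N to a tip is 10%.
rate = clock_rate(10.0, 200.0)   # percent change per My along one lineage
# The root lies 20% from the tips, so the same rate dates it.
root_age = node_age(20.0, rate)
print(f"rate = {rate} % per My, root age = {root_age} My")
```

Under these assumptions the root comes out at about 400 My, matching the figure. The calculation only works because a single rate is assumed to apply to every branch.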

Page 51: Bayesian Divergence Time Estimation

R G M C

Rates of evolution vary across lineages and over time

Mutation rate: Variation in
• metabolic rate
• generation time
• DNA repair

Fixation rate: Variation in
• strength and targets of selection
• population sizes

[Figure: three-taxon tree (A, B, C) with branch lengths of 20% and 10%, and time marks at 200 My and 400 My.]

Page 52: Bayesian Divergence Time Estimation

U A

Sequence data provide information about branch lengths

In units of the expected # of substitutions per site

branch length = rate × time

[Figure: sequence data and the inferred phylogenetic relationships, with a scale bar of 0.2 expected substitutions/site.]

Page 53: Bayesian Divergence Time Estimation

R T

The sequence data provide information about branch length

For any possible rate, there’s a time that fits the branch length perfectly

[Figure: Branch Rate (0 to 5) plotted against Branch Time (0 to 5); the curve for branch length = 0.5 passes through the point time = 0.8, rate = 0.625.]

(based on Thorne & Kishino, 2005)
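The confounding of rate and time is easy to see numerically: every point on the curve gives the same branch length. This sketch uses the branch length of 0.5 and the point (time = 0.8, rate = 0.625) from the figure; the other time values and the function name are arbitrary illustrations.

```python
def rate_for(branch_length, time):
    """The rate that reproduces a given branch length over a given time."""
    return branch_length / time


branch_length = 0.5  # expected substitutions per site, from the figure
for time in (0.25, 0.8, 2.0, 4.0):
    rate = rate_for(branch_length, time)
    # Every (rate, time) pair gives back the same branch length.
    print(f"time = {time}, rate = {rate:.4f}, rate x time = {rate * time:.4f}")
```

Sequence data alone cannot choose among these pairs, which is why priors on rates and on node ages are both needed.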

Page 54: Bayesian Divergence Time Estimation

R T

The expected # of substitutions/site occurring along a branch is the product of the substitution rate and time

[Figure: three trees with branches drawn as length = rate × time, length = rate, and length = time.]

Methods for dating species divergences estimate the substitution rate and time separately

Page 55: Bayesian Divergence Time Estimation

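Bayesian Divergence Time Estimation

[Figure: two trees with branches drawn as length = rate and length = time.]

$$R = (r_1, r_2, r_3, \ldots, r_{2N-2})$$

$$A = (a_1, a_2, a_3, \ldots, a_{N-1})$$

N = number of tips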

Page 57: Bayesian Divergence Time Estimation

Bayesian Divergence Time Estimation

Posterior probability

f(R, A, θR, θA, θs | D, Ψ)

R: Vector of rates on branches
A: Vector of internal node ages
θR, θA, θs: Model parameters
D: Sequence data
Ψ: Tree topology

Page 58: Bayesian Divergence Time Estimation

Bayesian Divergence Time Estimation

$$f(R, A, \theta_R, \theta_A, \theta_s \mid D) = \frac{f(D \mid R, A, \theta_s)\,f(R \mid \theta_R)\,f(A \mid \theta_A)\,f(\theta_s)}{f(D)}$$

f(D | R, A, θR, θA, θs): Likelihood
f(R | θR): Prior on rates
f(A | θA): Prior on node ages
f(θs): Prior on substitution parameters
f(D): Marginal probability of the data

Page 59: Bayesian Divergence Time Estimation

B D T E

Estimating divergence times relies on 2 main elements:

• Branch-specific rates: f (R | θR)

• Node ages: f (A | θA,C)

Page 60: Bayesian Divergence Time Estimation

M R V

Some models describing lineage-specific substitution rate variation:

• Global molecular clock (Zuckerkandl & Pauling, 1962)
• Local molecular clocks (Hasegawa, Kishino & Yano 1989; Kishino & Hasegawa 1990; Yoder & Yang 2000; Yang & Yoder 2003; Drummond and Suchard 2010)
• Punctuated rate change model (Huelsenbeck, Larget and Swofford 2000)
• Log-normally distributed autocorrelated rates (Thorne, Kishino & Painter 1998; Kishino, Thorne & Bruno 2001; Thorne & Kishino 2002)
• Uncorrelated/independent rates models (Drummond et al. 2006; Rannala & Yang 2007; Lepage et al. 2007)
• Mixture models on branch rates (Heath, Holder, Huelsenbeck 2012)

Models of Lineage-specific Rate Variation

Page 61: Bayesian Divergence Time Estimation

Global Molecular Clock

The substitution rate is constant over time

All lineages share the same rate

[Figure: tree with branch length = substitution rate, on a scale from low to high.]

Models of Lineage-specific Rate Variation (Zuckerkandl & Pauling, 1962)

Page 62: Bayesian Divergence Time Estimation

Relaxed-Clock Models

To accommodate variation in substitution rates, ‘relaxed-clock’ models estimate lineage-specific substitution rates

• Local molecular clocks
• Punctuated rate change model
• Log-normally distributed autocorrelated rates
• Uncorrelated/independent rates models
• Mixture models on branch rates

Page 63: Bayesian Divergence Time Estimation

Local Molecular Clocks

Rate shifts occur infrequently over the tree

Closely related lineages have equivalent rates (clustered by sub-clades)

[Figure: tree with branch length = substitution rate, on a scale from low to high.]

Models of Lineage-specific Rate Variation (Yang & Yoder 2003, Drummond and Suchard 2010)

Page 64: Bayesian Divergence Time Estimation

Local Molecular Clocks

Most methods for estimating local clocks required specifying the number and locations of rate changes a priori

Drummond and Suchard (2010) introduced a Bayesian method that samples over a broad range of possible random local clocks

[Figure: tree with branch length = substitution rate, on a scale from low to high.]

Models of Lineage-specific Rate Variation (Yang & Yoder 2003, Drummond and Suchard 2010)

Page 65: Bayesian Divergence Time Estimation

Autocorrelated Rates

Substitution rates evolve gradually over time: closely related lineages have similar rates

The rate at a node is drawn from a lognormal distribution with a mean equal to the parent rate

[Figure: tree with branch length = substitution rate, on a scale from low to high.]

Models of Lineage-specific Rate Variation (Thorne, Kishino & Painter 1998; Kishino, Thorne & Bruno 2001)
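Drawing a descendant rate whose expectation equals the parent rate requires a small correction on the log scale, which the sketch below makes explicit. It is a simplified illustration, not the published model: the log-scale standard deviation `sigma` is held fixed here (an assumption made for brevity), and the function name, starting rate, and seed are hypothetical.

```python
import math
import random


def draw_child_rate(parent_rate, sigma, rng):
    """Draw a lognormal rate whose mean equals the parent rate."""
    # A lognormal with log-scale mean mu and log-scale sd sigma has
    # expectation exp(mu + sigma**2 / 2), so this mu gives mean = parent_rate.
    mu = math.log(parent_rate) - 0.5 * sigma ** 2
    return rng.lognormvariate(mu, sigma)


rng = random.Random(7)  # arbitrary seed
rate = 0.01             # hypothetical rate at the root
for node in range(1, 6):
    rate = draw_child_rate(rate, sigma=0.3, rng=rng)
    print(f"node {node}: rate = {rate:.5f}")
```

Because each rate is centred on its parent, rates drift gradually down a lineage and close relatives end up with similar values.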

Page 66: Bayesian Divergence Time Estimation

Punctuated Rate Change

Rate changes occur along lineages according to a point process

At rate-change events, the new rate is a product of the parent’s rate and a Γ-distributed multiplier

[Figure: tree with branch length = substitution rate, on a scale from low to high.]

Models of Lineage-specific Rate Variation (Huelsenbeck, Larget and Swofford 2000)

Page 67: Bayesian Divergence Time Estimation

Independent/Uncorrelated Rates

Lineage-specific rates are uncorrelated when the rate assigned to each branch is independently drawn from an underlying distribution

[Figure: tree with branch length = substitution rate, on a scale from low to high.]

Models of Lineage-specific Rate Variation (Drummond et al. 2006)
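By contrast with the autocorrelated case, an uncorrelated model ignores the tree structure when assigning rates. The sketch below assumes an exponential underlying distribution purely for illustration (other distributions could be substituted), and a rooted, fully bifurcating tree with 2N − 2 branches; the function name, mean rate, and seed are hypothetical.

```python
import random


def draw_uncorrelated_rates(n_tips, mean_rate, rng):
    """Draw one independent rate per branch from a shared distribution."""
    n_branches = 2 * n_tips - 2  # rooted, fully bifurcating tree
    # Assumed underlying distribution: exponential with the given mean.
    return [rng.expovariate(1.0 / mean_rate) for _ in range(n_branches)]


rng = random.Random(42)  # arbitrary seed
rates = draw_uncorrelated_rates(n_tips=10, mean_rate=0.01, rng=rng)
print(len(rates), [round(r, 4) for r in rates[:5]])
```

Neighbouring branches share nothing except the parameters of the underlying distribution, so a fast branch can sit directly next to a slow one.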

Page 68: Bayesian Divergence Time Estimation

I M M

Dirichlet process prior: Branches are partitioned into distinct rate categories

Random variables under the DPP informed by the data:
• the number of rate classes
• the assignment of branches to classes
• the rate value for each class

[Figure: tree with branch length = substitution rate, with branches assigned to substitution rate classes c1 to c5.]

Models of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE)

Page 69: Bayesian Divergence Time Estimation

M R V

These are only a subset of the available models for branch-rate variation

• Global molecular clock
• Local molecular clocks
• Punctuated rate change model
• Log-normally distributed autocorrelated rates
• Uncorrelated/independent rates models
• Dirichlet process prior

Models of Lineage-specific Rate Variation

Page 70: Bayesian Divergence Time Estimation

M R V

Are our models appropriate across all data sets?

[Figure: dated phylogeny of extinct and extant bears (cave bear, American black bear, sloth bear, Asian black bear, brown bear, polar bear, American giant short-faced bear, giant panda, sun bear, spectacled bear) with harbor seal, showing mean node ages (Ma) with 95% CIs and MP•MLu•MLp•Bayesian support values, against epochs from the Eocene to the Holocene; annotations mark the global expansion of C4 biomass, a major temperature drop and increasing seasonality, and faunal turnover.]

Krause et al., 2008. Mitochondrial genomes reveal an explosive radiation of extinct and extant bears near the Miocene-Pliocene boundary. BMC Evol. Biol. 8.

[Figure: time-calibrated phylogeny of ray-finned fish clades (Polypteriformes to Percomorpha, with Ostariophysi, Acanthomorpha, and Teleostei labeled) on an axis from 300 to 0 MYA, with clade richness (Taxa, 1 to 20000) and the table below.]

Clade   r       ε        ΔAIC
1.      0.041   0.0017   25.3
2.      0.081   *        25.5
3.      0.067   0.37     45.1
4.      0       *        3.1
Bg.     0.011   0.0011

Santini et al., 2009. Did genome duplication drive the origin of teleosts? A comparative study of diversification in ray-finned fishes. BMC Evol. Biol. 9.

Page 71: Bayesian Divergence Time Estimation

M R V

These are only a subset of the available models for branch-rate variation

• Global molecular clock
• Local molecular clocks
• Punctuated rate change model
• Log-normally distributed autocorrelated rates
• Uncorrelated/independent rates models
• Dirichlet process prior

Model selection and model uncertainty are very important for Bayesian divergence time analysis

Models of Lineage-specific Rate Variation

Page 72: Bayesian Divergence Time Estimation

B D T E

Estimating divergence times relies on 2 main elements:

• Branch-specific rates: f (R | θR)

• Node ages: f (A | θA,C)

http://bayesiancook.blogspot.com/2013/12/two-sides-of-same-coin.html

Page 73: Bayesian Divergence Time Estimation

P N T

Relaxed clock Bayesian analyses require a prior distribution on node times

f(A | θA)

Different node-age priors make different assumptions about the timing of divergence events

Node Age Priors

Page 74: Bayesian Divergence Time Estimation

G N T P

Assumed to be vague or uninformative by not making assumptions about biological processes

Uniform prior: the time at a given node has equal probability across the interval between the time of the parent node and the time of the oldest daughter node (conditioned on root age)

Node Age Priors

Page 75: Bayesian Divergence Time Estimation

S B P

Node-age priors based on stochastic models of lineage diversification

Yule process: assumes a constant rate of speciation across lineages

A pure birth process: every node leaves extant descendants (no extinction)

Node Age Priors
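A pure-birth (Yule) process is simple to simulate forward in time, which helps build intuition for how a constant speciation rate shapes the distribution of node times. This is an illustrative sketch only: the function name, the rate of 0.1, the duration of 10 time units, and the seeds are all arbitrary assumptions.

```python
import math
import random


def simulate_yule(speciation_rate, duration, n_start=1, seed=None):
    """Simulate a pure-birth process; return (lineage count, speciation times)."""
    rng = random.Random(seed)
    n_lineages, t, events = n_start, 0.0, []
    while True:
        # With n lineages each speciating at the same rate, the waiting time
        # to the next speciation is exponential with rate n * speciation_rate.
        t += rng.expovariate(n_lineages * speciation_rate)
        if t > duration:
            break
        events.append(t)
        n_lineages += 1  # no extinction: lineages are only ever added
    return n_lineages, events


n, times = simulate_yule(speciation_rate=0.1, duration=10.0, seed=3)
print(n, [round(t, 2) for t in times])
print("expected lineages:", math.exp(0.1 * 10.0))
```

Averaged over many replicates, the number of lineages grows exponentially with the product of rate and time; adding an extinction rate μ turns this into the birth-death process.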

Page 76: Bayesian Divergence Time Estimation

S B P

Node-age priors based on stochastic models of lineage diversification

Constant-rate birth-death process: at any point in time a lineage can speciate at rate λ or go extinct with a rate of μ

Node Age Priors

Page 79: Bayesian Divergence Time Estimation

S B P

Different values of λ and μ lead to different trees

Bayesian inference under these models can be very sensitive to the values of these parameters

Using hyperpriors on λ and μ accounts for uncertainty in these hyperparameters

Node Age Priors

Page 80: Bayesian Divergence Time Estimation

S B P

Node-age priors based on stochastic models of lineage diversification

Birth-death-sampling process: an extension of the constant-rate birth-death model that accounts for random sampling of tips

Conditions on a probability of sampling a tip, ρ

Node Age Priors

Page 81: Bayesian Divergence Time Estimation

P N T

Sequence data are only informative on relative rates & times

Node-time priors cannot give precise estimates of absolute node ages

We need external information (like fossils) to calibrate or scale the tree to absolute time

Node Age Priors

Page 82: Bayesian Divergence Time Estimation

C D T

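Fossils (or other data) are necessary to estimate absolute node ages

There is no information in the sequence data for absolute time

Uncertainty in the placement of fossils

[Figure: three-taxon tree (A, B, C) with node N labeled, branch lengths of 20% and 10%, a 10% branch marked as 200 My, and the root at 400 My.]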

Page 83: Bayesian Divergence Time Estimation

C D

Bayesian inference is well suited to accommodating uncertainty in the age of the calibration node

Divergence times are calibrated by placing parametric densities on internal nodes offset by age estimates from the fossil record

[Figure: three-taxon tree (A, B, C) with a prior density (Density against Age) placed on node N, offset at 200 My.]

Page 84: Bayesian Divergence Time Estimation

A F C

Misplaced fossils can affect node age estimates throughout the tree, if the fossil is older than its presumed MRCA

Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)

Page 85: Bayesian Divergence Time Estimation

A F C

Crown clade: all living species and their most-recent common ancestor (MRCA)

Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)

Page 86: Bayesian Divergence Time Estimation

A F C

Stem lineages: purely fossil forms that are closer to their descendant crown clade than any other crown clade

Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)

Page 87: Bayesian Divergence Time Estimation

A F C

Fossiliferous horizons: the sources in the rock record for relevant fossils

Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)

Page 88: Bayesian Divergence Time Estimation

F C

Age estimates from fossils can provide minimum time constraints for internal nodes

Reliable maximum bounds are typically unavailable

[Figure: Time (My) axis with the minimum age marked.]

Calibrating Divergence Times

Page 89: Bayesian Divergence Time Estimation

P D C N

Common practice in Bayesian divergence-time estimation:

Parametric distributions are typically offset by the age of the oldest fossil assigned to a clade

These prior densities do not (necessarily) require specification of maximum bounds

[Figure: four calibration densities plotted against Time (My), each starting at the minimum age: Uniform (min, max), Exponential (λ), Gamma (α, β), Log Normal (µ, σ²).]

Calibrating Divergence Times

Page 90: Bayesian Divergence Time Estimation

P D C N

Calibration densities describe the waiting time between the divergence event and the age of the oldest fossil

[Figure: Exponential (λ) density plotted against Time (My), starting at the minimum age.]

Calibrating Divergence Times
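An offset exponential calibration density can be sketched as a density on the waiting time between the node and its oldest fossil. The function name and the example values (a fossil age of 12 and a rate of 0.5) are hypothetical and chosen only to show the behaviour.

```python
import math


def offset_exponential_pdf(node_age, fossil_age, rate):
    """Density of a node age under an exponential offset by the fossil age."""
    waiting_time = node_age - fossil_age
    if waiting_time < 0:
        return 0.0  # the node cannot be younger than its oldest fossil
    return rate * math.exp(-rate * waiting_time)


# Hypothetical calibration: oldest fossil at 12 My, exponential rate 0.5.
for age in (10.0, 12.0, 14.0, 20.0):
    print(age, offset_exponential_pdf(age, fossil_age=12.0, rate=0.5))
```

The fossil age acts as a hard minimum, while the rate controls how quickly prior density falls off for older node ages; no maximum bound is needed.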

Page 91: Bayesian Divergence Time Estimation

P D C N

Common practice in Bayesian divergence-time estimation:

Estimates of absolute node ages are driven primarily by the calibration density

Specifying appropriate densities is a challenge for most molecular biologists

[Figure: four calibration densities plotted against Time (My), each starting at the minimum age: Uniform (min, max), Exponential (λ), Gamma (α, β), Log Normal (µ, σ²).]

Calibration Density Approach

Page 92: Bayesian Divergence Time Estimation

I F C

We would prefer to eliminate the need for ad hoc calibration prior densities

Calibration densities do not account for diversification of fossils

[Figure: phylogeny of fossil and extant bears. Extant taxa: domestic dog, spotted seal, giant panda, spectacled bear, sun bear, Am. black bear, Asian black bear, brown bear, polar bear, sloth bear. Fossil taxa: Zaragocyon daamsi, Ballusia elmensis, Ursavus brevirhinus, Ailurarctos lufengensis, Ursavus primaevus, Agriarctos spp., Kretzoiarctos beatrix, Indarctos vireti, Indarctos arctoides, Indarctos punjabiensis, giant short-faced bear, cave bear.]

Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)

Page 93: Bayesian Divergence Time Estimation

I F C

We want to use all of the available fossils

Example: Bears

12 fossils are reduced to 4 calibration ages with calibration density methods

[Figure: phylogeny of fossil and extant bears, with 10 extant taxa (domestic dog to sloth bear) and 12 fossil taxa (Zaragocyon daamsi to cave bear).]

Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)

Page 95: Bayesian Divergence Time Estimation

Improving Fossil Calibration

Because fossils are part of the diversification process, we can combine fossil calibration with birth-death models

Figure: Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)

Page 96: Bayesian Divergence Time Estimation

Improving Fossil Calibration

This relies on a branching model that accounts for speciation, extinction, and rates of fossilization, preservation, and recovery

Figure: Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)

Page 97: Bayesian Divergence Time Estimation

The Fossilized Birth-Death Process (FBD)

Improving statistical inference of absolute node ages

Eliminates the need to specify arbitrary calibration densities

Better captures our statistical uncertainty in species divergence dates

All reliable fossils associated with a clade are used

Useful for calibration or ‘total-evidence’ dating

Figure: tree on a time axis from 150 to 0 (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 98: Bayesian Divergence Time Estimation

The Fossilized Birth-Death Process (FBD)

Recovered fossil specimens provide historical observations of the diversification process that generated the tree of extant species

Figure: Diversification of Fossil & Extant Lineages, on a time axis from 150 to 0 (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 99: Bayesian Divergence Time Estimation

The Fossilized Birth-Death Process (FBD)

The probability of the tree and fossil observations under a birth-death model with rate parameters:

λ = speciation
μ = extinction
ψ = fossilization/recovery

Figure: Diversification of Fossil & Extant Lineages, on a time axis from 150 to 0 (Heath, Huelsenbeck, Stadler. PNAS 2014)
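As a rough illustration of how the three rates act together, the sketch below forward-simulates lineage counts and fossil recovery ages. It is a minimal sketch, not the FBD probability calculation from the cited paper: it tracks only how many lineages are alive, not the tree topology, and the function name, arguments, and example rate values are all hypothetical.

```python
import random


def simulate_fbd_counts(lam, mu, psi, t_origin, seed=None, max_lineages=100_000):
    """Forward-simulate lineage counts under birth, death, and fossil sampling.

    Time is measured as age before the present: the process starts with one
    lineage at age t_origin and stops at age 0 (or when all lineages die).
    Returns (n_extant, fossil_ages). Hypothetical helper for illustration.
    """
    total = lam + mu + psi
    if min(lam, mu, psi) < 0 or total <= 0:
        raise ValueError("rates must be non-negative and not all zero")
    rng = random.Random(seed)
    age = t_origin
    n = 1                      # lineages currently alive
    fossil_ages = []
    while n > 0:
        if n > max_lineages:
            raise RuntimeError("lineage count exceeded max_lineages")
        # Waiting time to the next event among n independent lineages.
        age -= rng.expovariate(n * total)
        if age <= 0:
            break              # reached the present
        u = rng.random() * total
        if u < lam:
            n += 1             # speciation (rate lambda)
        elif u < lam + mu:
            n -= 1             # extinction (rate mu)
        else:
            fossil_ages.append(age)  # fossil recovered (rate psi)
    return n, fossil_ages


if __name__ == "__main__":
    # Hypothetical rate values, chosen only to make the example run.
    n_extant, fossils = simulate_fbd_counts(0.05, 0.02, 0.01, 150.0, seed=42)
    print(n_extant, len(fossils))
```

A full simulator would also record which lineage each fossil falls on, which is what lets fossils inform the tree and node ages.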

Page 101: Bayesian Divergence Time Estimation

The Fossilized Birth-Death Process (FBD)

We use MCMC to sample realizations of the diversification process, integrating over the topology (including placement of the fossils) and speciation times

Figure: Diversification of Fossil & Extant Lineages, on a time axis (My) from 250 to 0 (Heath, Huelsenbeck, Stadler. PNAS 2014)
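The MCMC used for the FBD model moves through tree topologies, fossil placements, and node ages, which is far more than a short example can show. The sketch below is a deliberately reduced toy under stated assumptions: the tree is treated as fixed, fossils are assumed to be recovered as a Poisson process with rate ψ along a known total lineage duration, and ψ has an exponential prior. The function names, the fossil count, and the duration are hypothetical.

```python
import math
import random


def log_posterior(psi, n_fossils, total_duration, prior_rate):
    """Unnormalized log posterior for psi: Poisson likelihood, exponential prior."""
    if psi <= 0:
        return -math.inf
    log_like = n_fossils * math.log(psi) - psi * total_duration
    log_prior = -prior_rate * psi
    return log_like + log_prior


def metropolis_psi(n_fossils, total_duration, prior_rate=1.0,
                   n_iter=20000, step=0.05, seed=0):
    """Random-walk Metropolis sampler for a single rate parameter."""
    rng = random.Random(seed)
    psi = 1.0 / prior_rate                      # start at the prior mean
    lp = log_posterior(psi, n_fossils, total_duration, prior_rate)
    samples = []
    for _ in range(n_iter):
        proposal = psi + rng.gauss(0.0, step)   # symmetric proposal
        lp_new = log_posterior(proposal, n_fossils, total_duration, prior_rate)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            psi, lp = proposal, lp_new
        samples.append(psi)
    return samples


if __name__ == "__main__":
    # Hypothetical data: 12 fossils over 100 units of summed branch length.
    draws = metropolis_psi(12, 100.0, seed=1)[2000:]  # drop burn-in
    print(sum(draws) / len(draws))
```

With this likelihood and prior the posterior is a gamma distribution with mean (k + 1) / (T + prior_rate), so the sampler's average can be checked against a known answer.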

Page 102: Bayesian Divergence Time Estimation

I FBD T

Extensions of the fossilized birth-death process accommodate variation in fossil sampling, non-random species sampling, & shifts in diversification rates.

Figure: time axis from 200 to 0 with geological periods (Jurassic, Cretaceous, Paleogene, Neogene, Quaternary) and epochs (Paleocene, Eocene, Oligocene, Miocene, Pliocene, Pleistocene)

With character data for both fossil & extant species, we account for uncertainty in fossil placement

Page 103: Bayesian Divergence Time Estimation

S B-D P

A piecewise shifting model where parameters change over time

Used to estimate epidemiological parameters of an outbreak

Figure: time axis in days, from 175 to 0 (see Stadler et al. PNAS 2013 and Stadler et al. PLoS Currents Outbreaks 2014)
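One way to picture a piecewise shifting model is as a lookup: the time axis is cut into intervals and each interval has its own parameter value. The sketch below assumes the interval boundaries are sorted and that a shift time belongs to the later interval. The function name and the example values are hypothetical.

```python
from bisect import bisect_right


def piecewise_value(t, shift_times, values):
    """Return the parameter value in force at time t.

    shift_times: sorted times at which the parameter changes.
    values: one value per interval, so len(values) == len(shift_times) + 1.
    """
    if len(values) != len(shift_times) + 1:
        raise ValueError("need exactly one more value than shift times")
    if list(shift_times) != sorted(shift_times):
        raise ValueError("shift_times must be sorted")
    # bisect_right counts how many shift times are <= t, which is the
    # index of the interval containing t.
    return values[bisect_right(shift_times, t)]


if __name__ == "__main__":
    # Hypothetical example: three intervals over 175 days.
    shifts = [50, 100]
    r_values = [2.0, 1.2, 0.8]
    for day in (0, 50, 75, 150):
        print(day, piecewise_value(day, shifts, r_values))
```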

Page 104: Bayesian Divergence Time Estimation

S B-D P

l is the number of parameter intervals
R_i is the effective reproductive number for interval i ∈ {1, …, l}
δ is the rate of becoming non-infectious
s is the probability of sampling an individual after becoming non-infectious

Page 105: Bayesian Divergence Time Estimation

S B-D P

l is the number of parameter intervals
λ_i is the transmission rate for interval i ∈ {1, …, l}
μ is the viral lineage death rate
ψ is the rate at which each individual is sampled
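The two parameter sets on these slides describe the same process. The sketch below converts between them for a single interval, assuming the usual relationships R = λ / (μ + ψ), δ = μ + ψ, and s = ψ / (μ + ψ), which hold when a sampled individual stops being infectious at sampling. That assumption, the function names, and the example values are not taken from the slides.

```python
def rates_to_epi(lam, mu, psi):
    """Convert (lambda, mu, psi) to (R, delta, s) for one interval."""
    delta = mu + psi            # total rate of becoming non-infectious
    if delta <= 0:
        raise ValueError("mu + psi must be positive")
    R = lam / delta             # effective reproductive number
    s = psi / delta             # probability that removal is by sampling
    return R, delta, s


def epi_to_rates(R, delta, s):
    """Convert (R, delta, s) back to (lambda, mu, psi) for one interval."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s is a probability and must lie in [0, 1]")
    lam = R * delta
    psi = s * delta
    mu = delta - psi
    return lam, mu, psi


if __name__ == "__main__":
    # Hypothetical rates, chosen only to make the example run.
    R, delta, s = rates_to_epi(lam=0.3, mu=0.15, psi=0.05)
    print(R, delta, s)
    print(epi_to_rates(R, delta, s))
```

Under this mapping, R below 1 means each infection produces fewer than one new infection on average before it is removed.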

Page 106: Bayesian Divergence Time Estimation

S B-D P

Page 107: Bayesian Divergence Time Estimation

S B-D P

A decline in R over the history of HIV-1 in the UK is consistent with the introduction of effective drug therapies

After 1998 R decreased below 1, indicating a declining epidemic

(Stadler et al. PNAS 2013)

Page 108: Bayesian Divergence Time Estimation

Outline

Lecture & Demo: Bayesian Fundamentals
• Bayes' theorem, priors and posteriors, MCMC
• Example: the birth-death process
• RevBayes Demonstration

break

Lecture: Bayesian Inference of Species Divergence Times
• Relaxed clock models: accounting for variation in substitution rates among lineages
• Tree priors and fossil calibration

break

BEAST v2 Tutorial: dating species divergences under the fossilized birth-death process

lunch

Course materials: http://phyloworks.org/resources/ssbws.html

Page 109: Bayesian Divergence Time Estimation

Divergence Time Estimation Software

Program        Models/Method
r8s            Strict clock, local clocks, NPRS, PL
ape (R)        NPRS, PL
multidivtime   log-n autocorrelated (plus some others)
PhyBayes       OU, log-n autocorrelated (plus some others)
PhyloBayes     CIR, white noise (uncorrelated) (plus some others)
BEAST          Uncorrelated (log-n & exp), local clocks (plus others)
TreeTime       Dirichlet model, CPP, uncorrelated
MrBayes 3.2    CPP, strict clock, autocorrelated, uncorrelated
DPPDiv         DPP, strict clock, uncorrelated
RevBayes       “the limit is the sky”∗

∗and other methods

Page 110: Bayesian Divergence Time Estimation

Exercise: Choose Your Own Adventure

Dating Bear Divergence Times with the Fossilized Birth-Death Process

Figure: dated tree of bears on a time axis from 50 to 0, with geological stages (Ypresian, Lutetian, Bartonian, Priabonian, Rupelian, Chattian, Aquitanian, Burdigalian, Langhian, Serravallian, Tortonian, Messinian, Zanclean, Piacenzian, Gelasian, Calabrian, Middle, Upper), epochs (Eocene, Oligocene, Miocene, Pliocene, Pleistocene, Holocene), and periods (Paleogene, Neogene, Quaternary)
Tip labels (X as marked in the figure): Agriarctos spp X, Ailurarctos lufengensis X, Ailuropoda melanoleuca, Arctodus simus X, Ballusia elmensis X, Helarctos malayanus, Indarctos arctoides X, Indarctos punjabiensis X, Indarctos vireti X, Kretzoiarctos beatrix X, Melursus ursinus, Parictis montanus X, Tremarctos ornatus, Ursavus brevihinus X, Ursavus primaevus X, Ursus abstrusus X, Ursus americanus, Ursus arctos, Ursus maritimus, Ursus spelaeus X, Ursus thibetanus, Zaragocyon daamsi X
Clade and node labels: Stem Bears, Crown Bears, Pandas, Tremarctinae, Brown Bears, Ursinae, Origin, Root (Total Group), Crown

Estimating Epidemiological Parameters of an Ebola Outbreak

Figure: time axis in days, from 175 to 0

Course materials: http://phyloworks.org/resources/ssbws.html