Finding good models of molecular evolution in phylogenetics Rob Lanfear Australian National University, National Evolutionary Synthesis Centre, USA

Page 1: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

Finding good models of molecular evolution in phylogenetics

Rob Lanfear

Australian National University,

National Evolutionary Synthesis Centre, USA

Page 2: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

Acknowledgements

Simon Ho

Stephane Guindon

Brett Calcott

Alexis Stamatakis

Page 3: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

Page 4: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

A G

C T

Rate Matrix

πA + πC + πG + πT = 1

Base Frequencies

+ I + G

Site Rates

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

Page 5: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

A G

C T

Rate Matrix

πA + πC + πG + πT = 1

Base Frequencies

+ I + G

Site Rates

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

[Diagram: the six substitution rates a, b, c, d, e, f label the pairwise exchanges among A, C, G and T]

Model      Rate matrix         Base frequencies    Site rates    Free parameters
JC         a=b=c=d=e=f         πA=πC=πG=πT         No I or G     0
HKY        a=c=d=f, b=e        πA, πC, πG, πT      No I or G     4
GTR        a, b, c, d, e, f    πA, πC, πG, πT      No I or G     8
GTR+I+G    a, b, c, d, e, f    πA, πC, πG, πT      I, G          10
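
The free-parameter counts on this slide can be reproduced with a simple tally. The sketch below is illustrative only: the function name and its arguments are hypothetical, and it assumes the usual conventions that one substitution rate is fixed as a reference and that the four base frequencies sum to 1.

```python
# Hypothetical helper: count the free parameters of a nucleotide model.
def free_parameters(rate_classes, equal_frequencies, invariant_sites, gamma):
    """rate_classes is the number of distinct values among the rates a..f."""
    # One rate class is fixed as the reference, so it is not free.
    rate_params = rate_classes - 1
    # Four frequencies constrained to sum to 1 leave three free values.
    freq_params = 0 if equal_frequencies else 3
    # +I and +G each add one parameter for among-site rate variation.
    return rate_params + freq_params + int(invariant_sites) + int(gamma)

models = {
    "JC":      free_parameters(1, True,  False, False),
    "HKY":     free_parameters(2, False, False, False),
    "GTR":     free_parameters(6, False, False, False),
    "GTR+I+G": free_parameters(6, False, True,  True),
}

for name, k in models.items():
    print(name, k)
```

The tally gives 0, 4, 8 and 10 free parameters for JC, HKY, GTR and GTR+I+G respectively, matching the slide.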

Page 6: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

Model Selection

Compare all models.

1. Calculate the likelihood of each model

2. Penalise models with more parameters
   e.g. Bayesian Information Criterion (BIC)

3. Use the model with the smallest BIC score
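
The three steps can be sketched in a few lines using the standard BIC formula, BIC = k ln(n) - 2 ln(L). This is a minimal illustration, not the procedure of any particular program: the log-likelihoods below are made-up numbers, the alignment length is used as the sample size n, and branch-length parameters are ignored for simplicity.

```python
import math

def bic(log_likelihood, n_params, n_sites):
    """Standard BIC: k * ln(n) - 2 * ln(L). Smaller is better."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood

# Step 1: (log-likelihood, free parameters) per model.
# These log-likelihoods are invented for illustration only.
candidates = {
    "JC":      (-5200.0, 0),
    "HKY":     (-5050.0, 4),
    "GTR":     (-5040.0, 8),
    "GTR+I+G": (-5035.0, 10),
}
n_sites = 1000  # assumed alignment length

# Step 2: penalise each model for its parameters.
scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}

# Step 3: pick the model with the smallest BIC score.
best = min(scores, key=scores.get)
print(best, round(scores[best], 2))
```

With these invented numbers the small likelihood gains of GTR and GTR+I+G do not pay for their extra parameters, so HKY has the smallest score.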

Page 7: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

The Problem

Model selection almost always picks GTR+I+G (the most complex model)

“like an overweight man shopping in the women's petites department”
Gatesy J, Trends Ecol Evol 2007, 22:509-10

Page 8: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

Partitioning

Page 9: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

GTR+I+G: a, b, c, d, e, f; πA, πC, πG, πT; I, G (10 free parameters)

Page 10: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

actgactgactgactgactgactgactgactgactgactgactgac

GTR+I+G: a, b, c, d, e, f; πA, πC, πG, πT; I, G (10 free parameters)

GTR+I+G: a, b, c, d, e, f; πA, πC, πG, πT; I, G (10 free parameters)

Page 11: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

Spp1 actgactgactgactgactgactgactgactgactgactgactga

Spp2 actgactgactgactgactgactgactgactgactgactgactga

Spp3 actgactgactgactgactgactgactgactgactgactgactga

Spp4 actgactgactgactgactgactgactgactgactgactgactga

Spp5 actgactgactgactgactgactgactgactgactgactgactga

Gene 1    Gene 2    Gene 3

Subsets: 9, 6, 2

Page 12: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

A Solution

Compare all possible partitioning schemes.

1. Calculate the likelihood of each scheme

2. Penalise schemes with more parameters
   e.g. Bayesian Information Criterion (BIC)

3. Use the scheme with the smallest BIC score
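
Scoring a whole partitioning scheme works the same way as scoring a single model, with the totals taken across subsets. The sketch below is a simplification under stated assumptions: it treats the scheme's log-likelihood as the sum of its subsets' log-likelihoods, counts only substitution-model parameters (no branch lengths), and uses made-up numbers. The function name `scheme_bic` is hypothetical.

```python
import math

def scheme_bic(subsets, n_sites):
    """subsets: list of (log_likelihood, n_params) pairs, one per subset."""
    total_lnl = sum(lnl for lnl, _ in subsets)   # step 1: likelihood
    total_k = sum(k for _, k in subsets)         # step 2: parameter count
    return total_k * math.log(n_sites) - 2.0 * total_lnl

# Invented values: one GTR+I+G model for everything, or one per gene.
schemes = {
    "one subset": [(-9000.0, 10)],
    "by gene":    [(-2900.0, 10), (-3000.0, 10), (-2950.0, 10)],
}
n_sites = 3000  # assumed total alignment length

scores = {name: scheme_bic(s, n_sites) for name, s in schemes.items()}

# Step 3: the scheme with the smallest BIC score wins.
best = min(scores, key=scores.get)
print(best, round(scores[best], 2))
```

Here the per-gene scheme pays for 20 extra parameters but gains enough likelihood to score better. With different numbers the simpler scheme could win, which is the point of the penalty.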

Page 13: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear
Page 14: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

The Problem

Many models (HKY, GTR) for each subset

Many ways to partition a dataset
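
To give a sense of how many ways there are to partition a dataset: if any grouping of data blocks into subsets is allowed, the number of schemes for n blocks is the Bell number B(n). The sketch below computes Bell numbers with the Bell triangle. It is a counting illustration only, and it assumes every grouping of blocks counts as a distinct scheme.

```python
def bell_numbers(n_max):
    """Return [B(0), ..., B(n_max)] using the Bell triangle."""
    bells = [1]   # B(0) = 1
    row = [1]     # first row of the triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row,
        # and each further entry adds the entry above-left.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

bells = bell_numbers(87)
for n in (3, 6, 9):
    print(n, "data blocks:", bells[n], "possible schemes")

# For a dataset with 87 data blocks, report only the size of the number.
print("87 data blocks: a number with", len(str(bells[87])), "digits")
```

Three blocks give 5 schemes and nine blocks already give 21,147, so exhaustively comparing every scheme quickly becomes impractical as the number of blocks grows.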

Page 15: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

PartitionFinder

www.robertlanfear.com/partitionfinder

Page 16: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear
Page 17: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

15,404 sites from whole mitochondrial genomes

87 data blocks

8,000 unit improvement in BIC score

Page 18: Finding Good Models of Molecular Evolution in Phylogenetics - Rob Lanfear

Future directions

1. Genome scale analyses (finished)

2. Cloud computing (started)

3. GUI (planned)

4. Better algorithms (planned)