How to date
Xuhua Xia
http://dambe.bio.uottawa.ca
Slide 2
Objectives
• Two major objectives of molecular phylogenetics
  – Branching patterns (speciation or gene duplication events)
  – Dating of the speciation or gene duplication events
• Classification of methods
  – Criteria used
    • Maximum likelihood (ML) method (e.g., PAML)
    • Bayesian methods (e.g., BEAST)
    • Least-squares (LS) method based on distance-based matrices (e.g., DAMBE)
  – Hard- or soft-bound
  – Global or local clock
• Data needed
  – A topology
  – A set of aligned sequences OR a distance matrix satisfying the molecular clock hypothesis (either globally or locally)
  – Calibration points
    • One or more from fossil record
    • From sampling time for rapidly evolving species (e.g., RNA viruses)
Slide 3
The LS method in linear regression
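Model: Y = a + bX, with residual R = a + bX - Y

X    Y      R (residual)
3    11.5   a + b*3 - 11.5
2    7.5    a + b*2 - 7.5
1    5      a + b*1 - 5
4    14     a + b*4 - 14

$$
RSS = \sum_{i=1}^{4} R_i^2 = (a + 3b - 11.5)^2 + (a + 2b - 7.5)^2 + (a + b - 5)^2 + (a + 4b - 14)^2
$$

$$
RSS = 4a^2 + 20ab - 76a + 30b^2 - 221b + 409.5
$$

$$
\frac{\partial RSS}{\partial a} = 8a + 20b - 76 = 0, \qquad
\frac{\partial RSS}{\partial b} = 20a + 60b - 221 = 0
$$

$$
a = 1.75, \qquad b = 3.10
$$

In general:

$$
b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}, \qquad a = \bar{Y} - b\bar{X}
$$

[Figure: scatter plot of Y against X with the fitted line y = 3.1x + 1.75, R² = 0.9907]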
RSS = 0 means a perfect fit of the linear model to the data. A large RSS means a poor fit.
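The regression on this slide can be checked numerically. The sketch below is only an illustration in plain Python: the function name `fit_line` is an assumption, not something from the slides, and it uses the general formulas for a and b with the four (X, Y) pairs from the table.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, rss, r2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squares and cross-products around the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual sum of squares for the fitted line
    rss = sum((a + b * x - y) ** 2 for x, y in zip(xs, ys))
    r2 = 1.0 - rss / syy
    return a, b, rss, r2


xs = [3, 2, 1, 4]
ys = [11.5, 7.5, 5, 14]
a, b, rss, r2 = fit_line(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}, RSS = {rss:.2f}, R2 = {r2:.4f}")
# a = 1.75, b = 3.10, RSS = 0.45, R2 = 0.9907
```

The small RSS (0.45) relative to the total variation in Y is what the R² of 0.9907 summarises.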
Slide 4
The rationale of the LS method
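Distance matrix:

       Sp1   Sp2   Sp3
Sp2    d12
Sp3    d13   d23
Sp4    d14   d24   d34

[Figure: four-species tree with node times t3, t2 and T1; T1 is the calibration point]

$$
RSS = (d_{12} - 2rt_3)^2 + (d_{13} - 2rt_2)^2 + (d_{23} - 2rt_2)^2 + \dots + (d_{34} - 2rT_1)^2
$$

$$
r = \frac{d_{14} + d_{24} + d_{34}}{6T_1}
$$

$$
t_2 = \frac{3T_1(d_{13} + d_{23})}{2(d_{14} + d_{24} + d_{34})} = \frac{d_{13} + d_{23}}{4r}
$$

$$
t_3 = \frac{3T_1 d_{12}}{d_{14} + d_{24} + d_{34}} = \frac{d_{12}}{2r}
$$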
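As a rough numerical check of the single-calibration solution, the sketch below evaluates the closed-form r, t2 and t3 for the topology (((Sp1, Sp2), Sp3), Sp4). The function name and the dictionary layout are assumptions made for illustration. The distances and T1 = 10 are the four-OTU values that appear later in the deck.

```python
def date_single_calibration(d, T1):
    """Closed-form LS estimates for the tree (((1,2),3),4) with root time T1.

    d maps a pair (i, j) to the distance between species i and j.
    """
    r = (d[1, 4] + d[2, 4] + d[3, 4]) / (6.0 * T1)
    t2 = (d[1, 3] + d[2, 3]) / (4.0 * r)
    t3 = d[1, 2] / (2.0 * r)
    # Divergence time of each pair, used to evaluate RSS at the solution
    node_time = {(1, 2): t3, (1, 3): t2, (2, 3): t2,
                 (1, 4): T1, (2, 4): T1, (3, 4): T1}
    rss = sum((d[p] - 2.0 * r * node_time[p]) ** 2 for p in d)
    return r, t2, t3, rss


d = {(1, 2): 7, (1, 3): 10, (2, 3): 7, (1, 4): 16, (2, 4): 13, (3, 4): 12}
r, t2, t3, rss = date_single_calibration(d, T1=10)
print(f"r = {r:.4f}, t2 = {t2:.4f}, t3 = {t3:.4f}, RSS = {rss:.4f}")
# r = 0.6833, t2 = 6.2195, t3 = 5.1220, RSS = 13.1667
```

These values match the single-rate result reported for the same matrix in the local-clock example.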
Slide 5
Multiple calibration points
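Distance matrix:

       Sp1   Sp2   Sp3
Sp2    d12
Sp3    d13   d23
Sp4    d14   d24   d34

[Figure: four-species tree with node times T3, t2 and T1; T3 and T1 are calibration points]

$$
RSS = (d_{12} - 2rT_3)^2 + (d_{13} - 2rt_2)^2 + (d_{23} - 2rt_2)^2 + \dots + (d_{34} - 2rT_1)^2
$$

$$
r = \frac{T_3 d_{12} + T_1 d_{14} + T_1 d_{24} + T_1 d_{34}}{2(T_3^2 + 3T_1^2)}
$$

$$
t_2 = \frac{(d_{13} + d_{23})(T_3^2 + 3T_1^2)}{2(T_3 d_{12} + T_1 d_{14} + T_1 d_{24} + T_1 d_{34})} = \frac{d_{13} + d_{23}}{4r}
$$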
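A minimal sketch of the two-calibration case follows, assuming the same topology (((Sp1, Sp2), Sp3), Sp4). The function names are illustrative, and the value T3 = 5 is a hypothetical calibration chosen only to exercise the formula; the slides give no numeric example for this case. The distances are the four-OTU values used elsewhere in the deck.

```python
def rss_two_calibrations(d, r, t2, T1, T3):
    """RSS for the tree (((1,2),3),4) with T1 and T3 fixed by calibration."""
    node_time = {(1, 2): T3, (1, 3): t2, (2, 3): t2,
                 (1, 4): T1, (2, 4): T1, (3, 4): T1}
    return sum((d[p] - 2.0 * r * node_time[p]) ** 2 for p in d)


def date_two_calibrations(d, T1, T3):
    """Closed-form LS estimates of r and t2 given calibrations T1 and T3."""
    numerator = T3 * d[1, 2] + T1 * (d[1, 4] + d[2, 4] + d[3, 4])
    r = numerator / (2.0 * (T3 ** 2 + 3.0 * T1 ** 2))
    t2 = (d[1, 3] + d[2, 3]) / (4.0 * r)
    return r, t2, rss_two_calibrations(d, r, t2, T1, T3)


d = {(1, 2): 7, (1, 3): 10, (2, 3): 7, (1, 4): 16, (2, 4): 13, (3, 4): 12}
T1, T3 = 10, 5  # T3 = 5 is a hypothetical second calibration point
r, t2, rss = date_two_calibrations(d, T1, T3)
print(f"r = {r:.4f}, t2 = {t2:.4f}, RSS = {rss:.4f}")
```

Both calibrated nodes now contribute to the estimate of r, each weighted by its own time, which is why T3 appears in the numerator and the denominator.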
Slide 6
[Figure: dated ape trees under soft and hard calibration; node times are labelled t1, T2, t3, T4, t5 and t6]

Taxa: 1. human, 2. chimpanzee, 3. bonobo, 4. gorilla, 5. Bornean orangutan, 6. Sumatran orangutan, 7. gibbon

Calibration points: 14 million years and 7 million years (soft in one tree, hard in the other)

Node dates (million years):

Node                                    Soft calibration   Hard calibration
chimpanzee / bonobo                     1.818±0.180        1.754±0.184
human / (chimpanzee, bonobo)            5.487±0.434        7±0
gorilla / (human, chimpanzee, bonobo)   7.258±0.530        7.079±0.527
Bornean / Sumatran orangutan            3.206±0.280        3.104±0.273
orangutans / African apes               14.757±0.217       14±0
gibbon / great apes                     20.903±1.503       20.655±1.221
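Dating with local clocks

[Figure, panels a) to c): four-OTU tree with branch lengths 5 (rate r1), 6, 3, 3, 2 and 2 (rate r2), and the corresponding dated trees]

Distance matrix:

        OTU1   OTU2   OTU3
OTU2    7
OTU3    10     7
OTU4    16     13     12

Single rate: T1 = 10, t2 = 6.2195, t3 = 5.1220, r = 0.6833, RSS = 13.1667

Local clocks: T1 = 10, t2 = 5, t3 = 1.6667, r0 = 0.6, r1 = 3, r2 = 1.2, RSS = 0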
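The two sets of results for the four-OTU matrix can be compared with a short sketch. The assignment of rates to branches is an assumption inferred from the numbers on the slide: r1 on the OTU1 tip branch, r2 on the OTU2 tip branch, and r0 everywhere else. The function names are illustrative. Passing the same rate three times reproduces the single-rate model.

```python
def predicted_distances(T1, t2, t3, r0, r1, r2):
    """Pairwise distances implied by node times and rates on (((1,2),3),4).

    Assumed rate layout: r1 on the OTU1 tip branch, r2 on the OTU2 tip
    branch, r0 on every other branch.
    """
    b1 = r1 * t3             # OTU1 tip branch
    b2 = r2 * t3             # OTU2 tip branch
    b12 = r0 * (t2 - t3)     # internal branch above (OTU1, OTU2)
    b3 = r0 * t2             # OTU3 tip branch
    b123 = r0 * (T1 - t2)    # internal branch above ((OTU1, OTU2), OTU3)
    b4 = r0 * T1             # OTU4 tip branch
    return {(1, 2): b1 + b2,
            (1, 3): b1 + b12 + b3,
            (2, 3): b2 + b12 + b3,
            (1, 4): b1 + b12 + b123 + b4,
            (2, 4): b2 + b12 + b123 + b4,
            (3, 4): b3 + b123 + b4}


def rss(observed, predicted):
    """Residual sum of squares between observed and predicted distances."""
    return sum((observed[p] - predicted[p]) ** 2 for p in observed)


d = {(1, 2): 7, (1, 3): 10, (2, 3): 7, (1, 4): 16, (2, 4): 13, (3, 4): 12}

# Single rate: closed-form LS solution with T1 = 10
r = (d[1, 4] + d[2, 4] + d[3, 4]) / 60.0
rss_global = rss(d, predicted_distances(10, 17 / (4 * r), 7 / (2 * r), r, r, r))

# Local clocks: values from the slide (t3 = 5/3, shown there as 1.6667)
rss_local = rss(d, predicted_distances(10, 5, 5 / 3, 0.6, 3, 1.2))

print(f"single rate RSS = {rss_global:.4f}, local clocks RSS = {rss_local:.4f}")
```

With three rates the model reproduces every observed distance, so the RSS drops to 0 and the estimated node times change substantially.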
Slide 9
Method comparison
[Figure: scatter plot of dates from RY07 and BEAST (y-axis: T, Myr) against LS dates (x-axis: T (LS), Myr); both axes run from 0 to 21 Myr]
[Figure: dated primate tree; node dates in Myr with ± values]

Calibration times: 77 Myr, 35 Myr, 10 Myr

Taxa: Galago, Loris, Varecia, Eulemur, Lemur, Hapalemur, Propithecus, Daubentonia, Callithrix, Macaca, Pongo, Gorilla, Homo, Pan, Lepilemur, M.murinus, M.griseorufus, M.sambiranensis, M.rufus2, M.rufus1, M.myoxinus, M.berthae, M.tavaratra, M.ravelobensis, Mirza, Cheirogaleus

Node dates (Myr), in figure order: 46.073±5.575, 9.608±1.533, 14.668±1.861, 18.125±2.285, 26.049±2.955, 49.231±4.101, 66.992±5.038, 7.890±1.308, 9.450±1.522, 12.988±2.053, 32.564±4.103, 56.059±5.436, 78.210±1.871, 7.089±1.119, 4.348±0.845, 2.3±0.4, 2.2±0.4, 4.617±0.808, 5.3±0.9, 7.9±1.4, 9.7±1.3, 21.639±2.906, 26.761±3.528, 37.682±3.785, 36.351±2.849
[Figure: dated primate tree with interval estimates (Myr) at the nodes]

Calibration times: 77 Myr, 35 Myr, 10 Myr

Taxa: Homo, Macaca, Daubentonia, M.myoxinus, Gorilla, Loris, M.murinus, M.rufus1, M.sambiranensis, Mirza, Galago, Lemur, M.tavaratra, Varecia, Cheirogaleus, M.ravelobensis, Hapalemur, M.griseorufus, Pan, Propithecus, M.rufus2, Lepilemur, Eulemur, Callithrix, Pongo, M.berthae

Node intervals (Myr), in figure order: [6.8351, 11.8498], [20.9195, 33.1815], [11.2396, 17.6519], [1.3, 2.6], [3.7, 6.0], [53.26, 79.827], [10.9426, 17.9562], [18.7945, 29.488], [68.927, 92.9419], [3.3, 5.3], [8.059, 12.1832], [13.8197, 21.595], [2.7447, 4.8025], [1.6314, 2.9896], [27.9235, 37.5614], [39.2193, 60.6649], [22.6131, 34.6002], [4.8645, 8.6395], [31.6677, 53.3297], [46.4962, 69.6653], [25.534, 38.8127], [7.3, 11.7], [14.5445, 23.4733], [5.7784, 9.3783], [5.5629, 9.3534]
[Figure: scatter plot of T (BEAST), Myr (y-axis) against T (LS), Myr (x-axis); both axes run from 0 to 90 Myr]
Slide 13
Rationale of Tip-Dating
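$$
\begin{aligned}
RSS ={}& (d_{12}/r + 15 - 2t_1)^2 + (d_{13}/r + 10 - 2t_3)^2 + (d_{14}/r + 20 - 2t_3)^2 \\
&+ (d_{15}/r + 30 - 2t_5)^2 + (d_{16}/r + 25 - 2t_5)^2 + (d_{23}/r + 15 + 10 - 2t_3)^2 \\
&+ (d_{24}/r + 15 + 20 - 2t_3)^2 + (d_{25}/r + 15 + 30 - 2t_5)^2 + (d_{26}/r + 15 + 25 - 2t_5)^2 \\
&+ (d_{34}/r + 10 + 20 - 2t_2)^2 + (d_{35}/r + 10 + 30 - 2t_5)^2 + (d_{36}/r + 10 + 25 - 2t_5)^2 \\
&+ (d_{45}/r + 20 + 30 - 2t_5)^2 + (d_{46}/r + 20 + 25 - 2t_5)^2 + (d_{56}/r + 30 + 25 - 2t_4)^2
\end{aligned}
$$

[Figure: tree of six dated samples with unknown node times t1 to t5 and r = 0.01]

Samples and sampling years: s1@1990, s2@1975, s3@1980, s4@1970, s5@1960, s6@1965

Sampling offsets relative to s1: 15 yr (s2), 10 yr (s3), 20 yr (s4), 30 yr (s5), 25 yr (s6)

Other numeric labels on the tree: 20, 50, 40, 30, 40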
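The tip-dating RSS is linear in 1/r and in the node times, so it can be minimised by ordinary linear least squares. The sketch below is an illustration under stated assumptions, not the DAMBE implementation. It simulates noise-free distances from r = 0.01 and node times of 20, 30, 40, 40 and 50 years before 1990 (an assumed assignment of the node years shown on the final dated tree), then recovers them. The helper names `mrca`, `solve` and `tip_date` are hypothetical.

```python
from itertools import combinations

# Sampling offsets in years before 1990, from the sampling years on the slide
offsets = {1: 0, 2: 15, 3: 10, 4: 20, 5: 30, 6: 25}


def mrca(i, j):
    """Index k of the node time t_k paired with d_ij in the RSS (i < j)."""
    if (i, j) == (1, 2):
        return 1
    if (i, j) == (3, 4):
        return 2
    if (i, j) == (5, 6):
        return 4
    return 3 if j <= 4 else 5


def solve(M, v):
    """Gauss-Jordan elimination with partial pivoting for M x = v."""
    n = len(v)
    aug = [row[:] + [v[k]] for k, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(aug[k][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for k in range(n):
            if k != c:
                f = aug[k][c] / aug[c][c]
                aug[k] = [x - f * y for x, y in zip(aug[k], aug[c])]
    return [aug[k][n] / aug[k][k] for k in range(n)]


def tip_date(d, offsets):
    """Minimise the sum of (d_ij*u + o_i + o_j - 2*t_k)^2 over u = 1/r and t_1..t_5."""
    rows, rhs = [], []
    for (i, j), dij in d.items():
        row = [dij] + [0.0] * 5      # columns: u, t1, t2, t3, t4, t5
        row[mrca(i, j)] = -2.0
        rows.append(row)
        rhs.append(-(offsets[i] + offsets[j]))
    m = 6
    # Normal equations (A^T A) x = A^T b
    ata = [[sum(row[a] * row[b] for row in rows) for b in range(m)]
           for a in range(m)]
    atb = [sum(row[a] * y for row, y in zip(rows, rhs)) for a in range(m)]
    x = solve(ata, atb)
    return 1.0 / x[0], x[1:]


# Simulate distances under a strict clock (assumed true values)
true_r = 0.01
true_t = {1: 20, 2: 30, 3: 40, 4: 40, 5: 50}
d = {(i, j): true_r * (2 * true_t[mrca(i, j)] - offsets[i] - offsets[j])
     for i, j in combinations(range(1, 7), 2)}

r_hat, t_hat = tip_date(d, offsets)
node_years = [1990 - t for t in t_hat]
print(f"r = {r_hat:.4f}")
print("node years:", [f"{y:.1f}" for y in node_years])
```

The estimation only works because the samples were collected at different times: with identical sampling years every offset is zero, the right-hand side vanishes, and r cannot be separated from the node times without an external calibration.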
Slide 14
Final dated tree
[Figure: final dated tree]

Tips: s1@1990, s2@1975, s3@1980, s4@1970, s5@1960, s6@1965

Internal node dates, in figure order: 1970, 1960, 1950, 1950, 1940
Slide 15
Dates with standard deviation
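[Figure: dated tree with standard deviations on the node dates]

Tips: S1@1980, S2@1965, S6@1970, S5@2000, S3@1945, S4@1968, S7@1962, S8@1985

Node dates (year ± standard deviation), in figure order: 1902.81±8.42, 1875.51±13.38, 1817.23±16.71, 1791.98±21.08, 1792.77±22.00, 1770.96±24.30, 1766.78±25.64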
Slide 16
Dating and cospeciation
[Figure: host tree (H1 to H14) facing parasite tree (P1 to P14) on mirrored time axes running from 12 to 0 and from 0 to 12]