Terminology
The root
Internal nodes
A branch = an edge
External node = leaf
[Figure: rooted tree with leaves Human, Chimp, Chicken, Gorilla]
The maximum parsimony principle.
(The shortest path)
Modified from Inferring Phylogenies (book) by Prof. Joe Felsenstein
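Gene number 1, Option number 2.
[Figure: tree with leaves in the order s1, s4, s3, s2, s5 and gene 1 states 1, 1, 1, 0, 0]
Minimal number of changes for gene 1 (character 1) = 1
Gene number 2, Option number 3.
[Figure: tree with leaves in the order s1, s4, s3, s2, s5 and gene 2 states 0, 1, 1, 0, 0]
Number of changes for gene 2 (character 2) = 2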
Genes: 0 = absence, 1 = presence

species  g1  g2  g3  g4  g5  g6
s1       1   0   0   1   1   0
s2       0   0   1   0   0   0
s3       1   1   0   0   0   0
s4       1   1   0   1   1   1
s5       0   0   1   1   1   0

Total number of changes given the tree:
         1   2   1   2   2   1
Sum of changes = 9
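The per-gene counts in the table can be reproduced by brute force: try every 0/1 labeling of the internal nodes and keep the labeling with the fewest branches whose two ends differ. The sketch below assumes the topology (((s1,s4),s3),(s2,s5)), which is a guess consistent with the leaf order s1, s4, s3, s2, s5 on the slides (the figure itself is not available). The node names n1, n2, n3, root and the function min_changes are illustrative, not from the slides.

```python
from itertools import product

# Presence/absence table from the slide, columns g1..g6.
GENES = {
    "s1": "100110",
    "s2": "001000",
    "s3": "110000",
    "s4": "110111",
    "s5": "001110",
}

# ASSUMED topology (the original figure is lost): (((s1,s4),s3),(s2,s5)).
# Each internal node maps to its two children.
INTERNAL = {
    "n1": ("s1", "s4"),
    "n2": ("n1", "s3"),
    "n3": ("s2", "s5"),
    "root": ("n2", "n3"),
}


def min_changes(column):
    """Fewest branch changes over all 0/1 labelings of the internal nodes."""
    names = list(INTERNAL)
    best = None
    for states in product("01", repeat=len(names)):
        label = dict(column)             # leaf states
        label.update(zip(names, states))  # one candidate internal labeling
        changes = sum(
            label[parent] != label[child]
            for parent, children in INTERNAL.items()
            for child in children
        )
        best = changes if best is None else min(best, changes)
    return best


per_gene = [
    min_changes({species: row[i] for species, row in GENES.items()})
    for i in range(6)
]
print(per_gene, sum(per_gene))
```

Under the assumed topology this prints [1, 2, 1, 2, 2, 1] and 9, matching the table. Exhaustive labeling grows exponentially with the number of internal nodes, so it is only useful as a check of the definition on small trees.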
Intermediate Summary
The MP tree is the one that needs the minimal number of changes to explain the data.
We can now search for the best tree under the MP criterion.
Challenges
Evaluating a big tree “by hand” is impractical. We want the computer to do it.
Going over all the trees? How many trees are there?
Can we generalize to nucleotides? To amino acids?
Is the parsimony criterion ideal?
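On the question of how many trees there are: the slides do not give a formula, but a standard counting result is that n labeled leaves admit (2n-5)!! unrooted and (2n-3)!! rooted binary topologies, where k!! is the product of every second integer down to 1. The sketch below only illustrates how fast these numbers grow. The function names are illustrative.

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1; returns 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def unrooted_trees(n):
    """Unrooted binary topologies on n >= 3 labeled leaves: (2n - 5)!!"""
    return double_factorial(2 * n - 5)


def rooted_trees(n):
    """Rooted binary topologies on n >= 2 labeled leaves: (2n - 3)!!"""
    return double_factorial(2 * n - 3)


for n in (3, 4, 5, 10, 20):
    print(n, unrooted_trees(n), rooted_trees(n))
```

For 3 taxa this gives 3 rooted topologies and for 4 taxa 3 unrooted trees, the same counts used later in the slides. By 10 taxa there are already about two million unrooted trees, which is why going over all the trees quickly stops being feasible.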
Positions:
species  p1  p2  p3  p4  p5  p6
s1       A   A   G   T   A   A
s2       C   A   A   A   A   C
s3       C   A   G   G   A   A
s4       A   A   A   T   A   C
s5       G   C   G   C   C   A
Exercise
Find the MP score of the tree for these sequences
GACA GGGA CAAG GCGA GAAA
[Figure: tree with leaves Human, Chimp, Chicken, Gorilla, Duck, one sequence per leaf]
How to efficiently compute the MP score of a tree
The Fitch algorithm (1971):
Postorder tree scan. At each internal node, if the intersection of its children's sets is empty, we apply a union operator. Otherwise, an intersection.
[Figure: tree with leaves Human, Chimp, Chicken, Gorilla, Duck; leaf states A, G, C, C, A; internal node sets {A,G}, {A,C,G}, {A,C}, {A,C}]
Total number of changes = number of union operators.
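The postorder scan can be sketched in a few lines. The tree shape and the assignment of states to leaves below are assumptions, since the figure is not available: they were chosen so that the internal sets come out as {A,G}, {A,C,G}, {A,C} and {A,C}, as listed on the slide. The nested-tuple tree encoding and the function name fitch are illustrative.

```python
def fitch(node, leaf_state):
    """Return (state_set, changes) for the subtree rooted at node.

    A leaf is a string name; an internal node is a (left, right) tuple.
    """
    if isinstance(node, str):
        return {leaf_state[node]}, 0
    left_set, left_changes = fitch(node[0], leaf_state)
    right_set, right_changes = fitch(node[1], leaf_state)
    common = left_set & right_set
    if common:
        # Non-empty intersection: keep it, no extra change.
        return common, left_changes + right_changes
    # Empty intersection: take the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# ASSUMED topology and leaf states (reconstructed, not taken from the figure).
TREE = ((("Human", "Chimp"), "Chicken"), ("Gorilla", "Duck"))
STATES = {"Human": "A", "Chimp": "G", "Chicken": "C", "Gorilla": "C", "Duck": "A"}

root_set, score = fitch(TREE, STATES)
print(sorted(root_set), score)
```

With these assumptions the root set is {A, C} and the score is 3: unions are applied at (Human, Chimp), at its join with Chicken, and at (Gorilla, Duck), while the root takes an intersection. Each node is visited once, so one character costs time linear in the number of leaves.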
Rooting the tree
[Figure from Wikimedia Commons]
Positions:
species  p1  p2  p3  p4  p5  p6
Human    A   A   G   T   A   A
Chimp    A   A   T   T   A   C
Gorilla  A   C   A   T   A   A

Position p1 (A, A, A):
[Figure: the 3 rooted topologies for Chimp (C), Human (H), Gorilla (G)]
Total number of changes = 0
For all 3 possible tree topologies
Position p2 (Human A, Chimp A, Gorilla C):
[Figure: the 3 rooted topologies for Chimp (C), Human (H), Gorilla (G)]
Total number of changes = 1
For all 3 possible tree topologies
Position p3 (Human G, Chimp T, Gorilla A):
[Figure: the 3 rooted topologies for Chimp (C), Human (H), Gorilla (G)]
Total number of changes = 2
For all 3 possible tree topologies
[Figure: the 3 rooted topologies for Chimp (C), Human (H), Gorilla (G)]
Total number of changes is always the same
for all 3 possible tree topologies
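The reason the three topologies always tie: with only three leaves, any rooted topology joins two leaves and then adds the third, and the minimal number of changes is the number of distinct states in the column minus one, whichever pair is joined first. A minimal sketch over the Human/Chimp/Gorilla alignment (the variable names are illustrative):

```python
# Alignment from the slide, positions p1..p6.
ALIGNMENT = {
    "Human": "AAGTAA",
    "Chimp": "AATTAC",
    "Gorilla": "ACATAA",
}

# One tuple of three states per position.
columns = list(zip(*ALIGNMENT.values()))

# With three leaves the MP score of a column does not depend on the topology:
# it is the number of distinct states minus one.
changes = [len(set(column)) - 1 for column in columns]
print(changes, sum(changes))
```

This gives 0, 1 and 2 changes for p1, p2 and p3, as on the slides, and the same total for every topology, so the parsimony criterion cannot prefer one of the three trees.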
With 4 taxa
[Figure: the 15 rooted tree topologies for Gorilla (G), Orangutan (O), Human (H), Chimp (C)]
Rooting does not change MP score
[Figure: one tree on Chimp, Orangutan, Gorilla, Human drawn with different root positions; nodes labeled with states A, C, G]
After “bending” the trees, the association of changes and branches does not change!
[Figure: a second example on Chimp, Orangutan, Gorilla, Human with states C and G, again drawn with different root positions]
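The claim that rooting does not change the MP score can be checked numerically: take one unrooted tree on the four taxa, place the root on each of its five branches in turn, and count changes with the Fitch rule for every rooting. The leaf states below are hypothetical (the values in the figure cannot be recovered), and fitch_score and ROOTINGS are illustrative names.

```python
def fitch_score(tree, states):
    """Fitch change count for a rooted binary tree given as nested 2-tuples."""
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        (a, changes_a), (b, changes_b) = visit(node[0]), visit(node[1])
        if a & b:
            return a & b, changes_a + changes_b
        return a | b, changes_a + changes_b + 1
    return visit(tree)[1]


# HYPOTHETICAL leaf states for one character.
STATES = {"Human": "C", "Chimp": "G", "Gorilla": "C", "Orangutan": "G"}

# The unrooted tree Human,Chimp | Gorilla,Orangutan rooted on each of its 5 branches.
ROOTINGS = {
    "internal branch": (("Human", "Chimp"), ("Gorilla", "Orangutan")),
    "Human branch": ("Human", ("Chimp", ("Gorilla", "Orangutan"))),
    "Chimp branch": ("Chimp", ("Human", ("Gorilla", "Orangutan"))),
    "Gorilla branch": ("Gorilla", ("Orangutan", ("Human", "Chimp"))),
    "Orangutan branch": ("Orangutan", ("Gorilla", ("Human", "Chimp"))),
}

scores = {name: fitch_score(tree, STATES) for name, tree in ROOTINGS.items()}
print(scores)
```

All five rootings give the same count (2 for these states), which is why the search can be done over unrooted trees and the root placed afterwards.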
Back to solving the relationships between human, chimp and gorilla…
Using an outgroup
[Figure: unrooted tree with three leaves labeled 1, 2, 3]
No MP with 3 species: every rooting needs the same number of changes.
With 4 taxa, there are 3 different unrooted trees.
[Figure: the 3 unrooted trees: Human, Chimp | Chicken, Gorilla; Human, Gorilla | Chimp, Chicken; Human, Chicken | Chimp, Gorilla]
One tree gets a better score (fewer changes) than the other trees.
[Figure: the best-scoring unrooted tree on Human, Chimp, Chicken, Gorilla]
We then use external knowledge, that chicken is the outgroup, and get a rooted tree.