Perfect Phylogeny Tutorial #10 © Ilan Gronau Original slides by Shlomo Moran


Perfect Phylogeny

The underlying model:
• A character vector is given for every species in S.
• Each character represents some observable trait.
• Each character takes values from a finite set.
• Basic underlying assumption: characters are homoplasy-free.

Homoplasy-Free Characters

• No reversals
• No convergence

Homoplasy-free characters induce a convex coloring of the phylogenetic tree.
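A minimal sketch of what "convex" means for one character, assuming the tree is given as an adjacency map and every vertex (not only the leaves) already has a state. The function name `is_convex` and the tiny path-shaped tree are illustrative assumptions, not part of the slides.

```python
from collections import defaultdict

def is_convex(adjacency, state):
    """True if, for every state, the vertices carrying it form a connected subtree."""
    groups = defaultdict(set)
    for vertex, value in state.items():
        groups[value].add(vertex)
    for members in groups.values():
        # Depth-first search restricted to vertices of this state.
        start = next(iter(members))
        seen = {start}
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adjacency[v]:
                if w in members and w not in seen:
                    seen.add(w)
                    stack.append(w)
        if seen != members:
            return False
    return True

# Hypothetical path a - b - c.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(is_convex(adjacency, {"a": 0, "b": 1, "c": 1}))  # state 1 appears once along the path
print(is_convex(adjacency, {"a": 1, "b": 0, "c": 1}))  # state 1 is split by a 0 in between
```

The second coloring is the kind of pattern that a reversal or a convergence would produce: the same state shows up in two separated parts of the tree.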

The Perfect Phylogeny Problem:

Given character vectors for S, find (if they exist):
- a phylogenetic tree T over S (S is the leaf set of T);
- convex character assignments to all vertices of T.

! This problem is NP-hard in general !

Directed Binary Perfect Phylogeny

Directed binary characters:
• 1: property exists
• 0: property doesn’t exist
• Initially (at the root) no property exists.

Input: a binary coloring (C1,…,Cm) of a set S (an n×m binary matrix M).

Problem: Find a phylogenetic tree T over S (if one exists), s.t.
1. For j=1,…,m, the partial coloring induced by Cj is convex in T.
2. The root has state 0 in all characters.

We will present a polynomial-time solution.

Example

Input (one row per species, m characters):

     C1 C2 C3 C4 C5
A     1  1  0  0  0
B     0  0  1  0  0
C     1  1  0  0  1
D     0  0  1  1  0
E     0  1  0  0  0

Possible output:

[Figure: a tree with zero-root (00000), internal vertices (01000), (00100) and (11000), and leaves (11000), (00100), (11001), (00110), (01000).]

An Important Observation

A tree is a directed perfect phylogeny for a given 0/1 matrix iff we can map each character to an edge/vertex on which this character was “turned on”.

Example: the matrix of the previous example.

[Figure: the example tree, with the edge on which C2 is turned on marked as the origin of C2.]
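The observation can be checked on the example with a short sketch. The encoding below is an assumption made for illustration: each vertex stores its parent and the characters turned on along the edge leading into it, and the internal vertex names `u`, `v`, `w` are hypothetical. The tree is one tree consistent with the example matrix, not necessarily the exact drawing on the slide.

```python
# parent[x] = (parent of x, characters turned on along the edge into x)
parent = {
    "u": ("root", [2]),   # C2 is turned on here
    "w": ("root", [3]),   # C3 is turned on here
    "v": ("u", [1]),      # C1 is turned on below C2
    "E": ("u", []),
    "A": ("v", []),
    "C": ("v", [5]),
    "B": ("w", []),
    "D": ("w", [4]),
}

def leaf_vector(leaf, parent, m):
    """Collect the characters turned on along the path from the leaf up to the root."""
    vector = [0] * m
    vertex = leaf
    while vertex in parent:
        up, turned_on = parent[vertex]
        for c in turned_on:
            vector[c - 1] = 1   # characters are numbered from 1
        vertex = up
    return vector

for name in "ABCDE":
    print(name, leaf_vector(name, parent, 5))
```

Because every character is turned on exactly once and never turned off, the vector of a leaf is determined by the edges on its root path, which is what makes the tree a directed perfect phylogeny for the matrix.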

Laminar Matrices

Definitions:
• Oj: the set of objects that have character Cj (Oj = {i : Mij = 1}).
• A collection of sets {S1,…,Sk} is laminar if for all i, j, either Si and Sj are disjoint, or one includes the other.

Theorem: A binary matrix M has a perfect phylogenetic tree iff the collection {O1,…,Om} is laminar.

Laminar:

     C1 C2 C3 C4 C5
A     1  1  0  0  0
B     0  0  1  0  0
C     1  1  0  0  1
D     0  0  1  1  0
E     0  1  0  0  0

Not laminar (O2 = {A,C,E} and O5 = {B,C,E} intersect, but neither includes the other):

     C1 C2 C3 C4 C5
A     1  1  0  0  0
B     0  0  1  0  1
C     1  1  0  0  1
D     0  0  1  1  0
E     0  1  0  0  1
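The definition can be tested directly by comparing every pair of object sets. This is a naive sketch (roughly O(m²n) time), not the efficient procedure discussed later; the function names `object_sets` and `is_laminar` are illustrative assumptions.

```python
from itertools import combinations

def object_sets(matrix):
    """O_j = set of row indices i with M[i][j] == 1, for every column j."""
    m = len(matrix[0])
    return [{i for i, row in enumerate(matrix) if row[j] == 1} for j in range(m)]

def is_laminar(sets):
    """Every pair of sets is either disjoint or nested."""
    return all(a.isdisjoint(b) or a <= b or b <= a
               for a, b in combinations(sets, 2))

# The two example matrices (rows A..E, columns C1..C5).
laminar = [[1, 1, 0, 0, 0],
           [0, 0, 1, 0, 0],
           [1, 1, 0, 0, 1],
           [0, 0, 1, 1, 0],
           [0, 1, 0, 0, 0]]
not_laminar = [[1, 1, 0, 0, 0],
               [0, 0, 1, 0, 1],
               [1, 1, 0, 0, 1],
               [0, 0, 1, 1, 0],
               [0, 1, 0, 0, 1]]

print(is_laminar(object_sets(laminar)))
print(is_laminar(object_sets(not_laminar)))
```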

Proof of Theorem

(⇒) Assume M has a perfect phylogeny. Consider the edges labeled Ci and Cj:
• If there is a root-to-leaf path containing both edges (C1, C2 in the example tree), then Oi includes Oj or vice versa.
• Otherwise, Oi and Oj are disjoint (C1, C3 in the example tree).

(⇐) Assume that the collection {O1,…,Ok} is laminar. We prove by induction on the number of characters k that M has a perfect phylogenetic tree.

Basis: one character. There are at most two (distinct) objects, one with and one without this character.

     C1
A     1
B     0

[Figure: the corresponding tree on the root, A and B.]

Proof of Theorem (cont)

Assume that the collection {O1,…,Ok} is laminar.

Induction step: assume correctness for k-1 characters. Consider a matrix with k characters (non-zero columns), and assume WLOG that O1 is not contained in Oj for any j > 1.
• S1: the set of objects i for which Mi1 = 1.
• S2: the remaining objects.

Claim: each character Cj (j > 1) belongs to objects in S1 or in S2, but not to both. (Why is this? S1 = O1, and by laminarity Oj is either disjoint from O1 or nested with it; since O1 is not contained in Oj, nesting means Oj is included in O1.)

By induction there are trees T1 and T2 for S1 and S2. Joining the root of T2 to the root of T1 by an edge labeled C1 gives a tree for M.

Example:

     C1 C2 C3 C4 C5
A     1  1  0  0  0
B     0  0  1  0  0
C     1  1  0  0  1
D     0  0  1  1  0
E     1  0  0  0  0

C1 splits the objects into S1 = {A,C,E} and S2 = {B,D}.
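The induction can be read as a recursive procedure. The sketch below is one possible reading under stated assumptions: the input is assumed to be laminar (this is not checked), the character with the largest object set plays the role of C1, and the nested-dict tree format with `leaves` and `children` keys is a hypothetical encoding chosen for brevity. The helper `vectors` only serves to check the result.

```python
def build(rows, chars):
    """rows: name -> 0/1 vector; chars: column indices not yet placed.
    Returns {"leaves": objects sitting at this vertex,
             "children": [(character turned on along the edge, subtree), ...]}."""
    chars = [c for c in chars if any(rows[r][c] for r in rows)]  # drop zero columns
    if not chars:
        return {"leaves": sorted(rows), "children": []}
    # A largest object set cannot be properly contained in another one.
    first = max(chars, key=lambda c: sum(rows[r][c] for r in rows))
    s1 = {r: v for r, v in rows.items() if v[first] == 1}
    s2 = {r: v for r, v in rows.items() if v[first] == 0}
    rest = [c for c in chars if c != first]
    t1 = build(s1, rest)
    t2 = build(s2, rest)
    t2["children"].append((first, t1))  # hang T1 below the root of T2
    return t2

def vectors(tree, m, on=()):
    """Recover the character vector of every object from the tree."""
    out = {name: [1 if c in on else 0 for c in range(m)] for name in tree["leaves"]}
    for c, sub in tree["children"]:
        out.update(vectors(sub, m, on + (c,)))
    return out

# The matrix used in the proof example (E has only C1).
rows = {"A": [1, 1, 0, 0, 0], "B": [0, 0, 1, 0, 0], "C": [1, 1, 0, 0, 1],
        "D": [0, 0, 1, 1, 0], "E": [1, 0, 0, 0, 0]}
tree = build(rows, range(5))
print(vectors(tree, 5) == rows)
```

On a matrix that is not laminar this sketch would silently place a character in both subtrees, so it should only be used after a laminarity check.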

Efficient Implementation

1. Sort the columns (characters) according to decreasing binary value.

Claim: If the binary value of column i is larger than that of column j, then Oi is not a proper subset of Oj.

Proof (why is this?): if Oi were a proper subset of Oj, column j would have a 1 in every row where column i has one, plus at least one more, so the binary value of column j would be the larger one.

Before sorting:

     C1 C2 C3 C4 C5
A     1  1  0  0  0
B     0  0  1  0  0
C     1  1  0  0  1
D     0  0  1  1  0
E     0  1  0  0  0

After sorting:

     C2 C1 C3 C5 C4
A     1  1  0  0  0
B     0  0  1  0  0
C     1  1  0  1  0
D     0  0  1  0  1
E     1  0  0  0  0
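Step 1 can be sketched as follows, reading each column top to bottom as a binary number (the top row is the most significant bit). This version uses Python's built-in comparison sort for brevity rather than the radix sort mentioned in the complexity analysis; the function name `sort_columns` is an illustrative assumption.

```python
def sort_columns(matrix, names):
    """Return the matrix and column names reordered by decreasing binary value."""
    def binary_value(j):
        # Read column j from the top row (most significant) to the bottom row.
        return int("".join(str(row[j]) for row in matrix), 2)

    order = sorted(range(len(names)), key=binary_value, reverse=True)
    sorted_matrix = [[row[j] for j in order] for row in matrix]
    return sorted_matrix, [names[j] for j in order]

matrix = [[1, 1, 0, 0, 0],   # A
          [0, 0, 1, 0, 0],   # B
          [1, 1, 0, 0, 1],   # C
          [0, 0, 1, 1, 0],   # D
          [0, 1, 0, 0, 0]]   # E
sorted_matrix, sorted_names = sort_columns(matrix, ["C1", "C2", "C3", "C4", "C5"])
print(sorted_names)
```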

Efficient Implementation (cont)

2. Make a backwards linked list of the 1’s in each row (each 1 points at the nearest 1 to its left).

Claim: If the columns are sorted, then the set of columns is laminar iff for each column i, all the links leaving column i point at the same column.

If the matrix is laminar then these pointers define the inclusion hierarchy.

Laminar (all the links leaving each column agree):

     C2 C1 C3 C5 C4
A     1  1  0  0  0
B     0  0  1  0  0
C     1  1  0  1  0
D     0  0  1  0  1
E     1  0  0  0  0

Not laminar (the links leaving C5 point at C1 in row C but at C3 in row E):

     C2 C1 C3 C5 C4
A     1  1  0  0  0
B     0  0  1  0  0
C     1  1  0  1  0
D     0  0  1  0  1
E     0  0  1  1  0
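A minimal sketch of step 2, assuming the columns are already sorted. Instead of storing explicit linked lists it records, for each column, the column its links point at, and stops at the first disagreement. The sentinel `-1` for "no 1 to the left" (a link to the root) and the function name `inclusion_parents` are assumptions made for illustration.

```python
ROOT = -1  # assumed sentinel: the first 1 in a row links to the root

def inclusion_parents(sorted_matrix):
    """Return parent[j] = the column all links leaving column j point at,
    or None if two links leaving the same column disagree (not laminar).
    All-zero columns keep the entry None inside the returned list."""
    m = len(sorted_matrix[0])
    parent = [None] * m
    for row in sorted_matrix:
        previous = ROOT
        for j, bit in enumerate(row):
            if bit == 1:
                if parent[j] is None:
                    parent[j] = previous       # first link seen for column j
                elif parent[j] != previous:
                    return None                # links disagree
                previous = j
    return parent

# Columns in the order C2 C1 C3 C5 C4.
laminar = [[1, 1, 0, 0, 0], [0, 0, 1, 0, 0], [1, 1, 0, 1, 0],
           [0, 0, 1, 0, 1], [1, 0, 0, 0, 0]]
not_laminar = [[1, 1, 0, 0, 0], [0, 0, 1, 0, 0], [1, 1, 0, 1, 0],
               [0, 0, 1, 0, 1], [0, 0, 1, 1, 0]]
print(inclusion_parents(laminar))
print(inclusion_parents(not_laminar))
```

Each matrix entry is visited once, so this pass takes O(mn) time.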

Efficient Implementation (cont)

3. If the matrix is laminar, compute the inclusion hierarchy.
4. Reconstruct the topology of the phylogenetic tree and the ancestral character states.

     C2 C1 C3 C5 C4
A     1  1  0  0  0
B     0  0  1  0  0
C     1  1  0  1  0
D     0  0  1  0  1
E     1  0  0  0  0

[Figure: the reconstructed tree, with zero-root (00000) and the character vectors of its internal vertices and leaves.]
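Steps 3 and 4 might look like the sketch below. It assumes the inclusion hierarchy is already available as a list `parent`, where `parent[j]` is the column that includes column j, or `-1` for the root; this encoding, the vertex names `v0`, `v1`, … and the function name `reconstruct` are all hypothetical. Each character gets one internal vertex, and each object hangs off the vertex of the rightmost 1 in its row.

```python
def reconstruct(sorted_matrix, names, parent):
    """Return (vertices, edges): vertices maps each vertex to its character
    vector (in sorted column order); edges is a list of (upper, lower) pairs."""
    m = len(parent)

    def state(j):
        # Ancestral state: column j and every column that includes it are on.
        vec = [0] * m
        while j != -1:
            vec[j] = 1
            j = parent[j]
        return vec

    vertices = {"root": [0] * m}
    edges = []
    for j in range(m):
        vertices[f"v{j}"] = state(j)
        up = "root" if parent[j] == -1 else f"v{parent[j]}"
        edges.append((up, f"v{j}"))           # character j is turned on along this edge
    for name, row in zip(names, sorted_matrix):
        ones = [j for j, bit in enumerate(row) if bit]
        attach = f"v{ones[-1]}" if ones else "root"
        edges.append((attach, name))          # no character changes along this edge
        vertices[name] = list(row)
    return vertices, edges

# Sorted example matrix (columns C2 C1 C3 C5 C4) and its inclusion hierarchy.
sorted_matrix = [[1, 1, 0, 0, 0], [0, 0, 1, 0, 0], [1, 1, 0, 1, 0],
                 [0, 0, 1, 0, 1], [1, 0, 0, 0, 0]]
names = ["A", "B", "C", "D", "E"]
parent = [-1, 0, -1, 1, 2]
vertices, edges = reconstruct(sorted_matrix, names, parent)
for up, low in edges:
    print(up, "->", low, vertices[low])
```

The resulting tree is not necessarily binary, and an internal vertex may carry a single object; contracting the zero-change edges would merge such an object into its vertex.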

Efficient Implementation - Summary

1. Sort the columns (characters) according to decreasing binary value.
2. Make a backwards linked list of the 1’s in each row.
3. If the matrix is laminar, compute the inclusion hierarchy.
4. Reconstruct the topology of the phylogenetic tree and the ancestral character states.

Complexity: O(mn), using radix (bucket) sort in stage 1.

Example: after sorting, the column order C1 C2 C3 C4 C5 becomes C2 C1 C3 C5 C4.
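The O(mn) bound for stage 1 relies on radix sort. A minimal sketch, assuming a least-significant-digit radix sort over the rows: the bottom row is processed first, and each pass stably moves the columns with a 1 in the current row ahead of those with a 0. The function name `radix_sort_columns` is an illustrative assumption.

```python
def radix_sort_columns(matrix):
    """Return the column indices ordered by decreasing binary value.
    There are n passes, each touching m columns, hence O(mn) time."""
    order = list(range(len(matrix[0])))
    for row in reversed(matrix):                  # least significant row first
        ones = [j for j in order if row[j] == 1]
        zeros = [j for j in order if row[j] == 0]
        order = ones + zeros                      # stable: earlier ordering breaks ties
    return order

matrix = [[1, 1, 0, 0, 0],   # A
          [0, 0, 1, 0, 0],   # B
          [1, 1, 0, 0, 1],   # C
          [0, 0, 1, 1, 0],   # D
          [0, 1, 0, 0, 0]]   # E
order = radix_sort_columns(matrix)
print(["C%d" % (j + 1) for j in order])
```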
