31
More ideas about new orthology inference methods Manuela Geiß Bioinformatics Group University of Leipzig 34 th TBI Winterseminar Bled, 12 th February 2019 Manuela Geiß More ideas about new orthology inference methods

More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

More ideas about new orthology inference methods

Manuela Geiß

Bioinformatics GroupUniversity of Leipzig

34th TBI WinterseminarBled, 12th February 2019

Manuela Geiß More ideas about new orthology inference methods

Page 2: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Why Orthology Analysis?

Orthology analysis is an important part of data analysis in manyareas such as comparative genomics and molecular phylogenetics

source: https://en.wikipedia.org/wiki/Tree of life (biology)

Idea: There is only one true tree of life – we just need goodmethods to detect it!

Manuela Geiß More ideas about new orthology inference methods

Page 3: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Tree-based vs. graph-based methods

Tree-based:

species tree must be known,gene tree via sequencealignments→ tree reconciliation givesorthology relation

accuracy highly depends onquality of trees

high computational costs

Graph-based:

construction of the treesfrom sequence data

lower computational costs

many tools restricted tosmall number of species(except ProteinOrtho1)

some tools even includemanual correction

→ Our overall-goal: improve orthology inference/develop newmethods

1Lechner M, Findeiß S, Steiner L, Marz M, Stadler PF, Prohaska SJ, 2011.Proteinortho: detection of (co-)orthologs in large-scale analysis. BMCBioinformatics 12:124

Manuela Geiß More ideas about new orthology inference methods

Page 4: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Tree-based vs. graph-based methods

Tree-based:

species tree must be known,gene tree via sequencealignments→ tree reconciliation givesorthology relation

accuracy highly depends onquality of trees

high computational costs

Graph-based:

construction of the treesfrom sequence data

lower computational costs

many tools restricted tosmall number of species(except ProteinOrtho1)

some tools even includemanual correction

→ Our overall-goal: improve orthology inference/develop newmethods

1Lechner M, Findeiß S, Steiner L, Marz M, Stadler PF, Prohaska SJ, 2011.Proteinortho: detection of (co-)orthologs in large-scale analysis. BMCBioinformatics 12:124

Manuela Geiß More ideas about new orthology inference methods

Page 5: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

What are best matches?

True divergence times of genes/species often not known → manytools use Best Match Heuristics

Definition

The sequence y of species Y is a best match of the sequence x ofspecies X if y is “closest” to x among all genes in species Y .

Definition

The sequences x and y are reciprocal best matches if y is closestto x and x is closest to y .

→ Goal: Deeper understanding of (reciprocal) Best Match Graphsto make the process more efficient

Manuela Geiß More ideas about new orthology inference methods

Page 6: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

What are best matches?

True divergence times of genes/species often not known → manytools use Best Match Heuristics

Definition

The sequence y of species Y is a best match of the sequence x ofspecies X if y is “closest” to x among all genes in species Y .

Definition

The sequences x and y are reciprocal best matches if y is closestto x and x is closest to y .

→ Goal: Deeper understanding of (reciprocal) Best Match Graphsto make the process more efficient

Manuela Geiß More ideas about new orthology inference methods

Page 7: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

What are best matches?

True divergence times of genes/species often not known → manytools use Best Match Heuristics

Definition

The sequence y of species Y is a best match of the sequence x ofspecies X if y is “closest” to x among all genes in species Y .

Definition

The sequences x and y are reciprocal best matches if y is closestto x and x is closest to y .

→ Goal: Deeper understanding of (reciprocal) Best Match Graphsto make the process more efficient

Manuela Geiß More ideas about new orthology inference methods

Page 8: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

What are best matches?

True divergence times of genes/species often not known → manytools use Best Match Heuristics

Definition

The sequence y of species Y is a best match of the sequence x ofspecies X if y is “closest” to x among all genes in species Y .

Definition

The sequences x and y are reciprocal best matches if y is closestto x and x is closest to y .

→ Goal: Deeper understanding of (reciprocal) Best Match Graphsto make the process more efficient

Manuela Geiß More ideas about new orthology inference methods

Page 9: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Best Match Graphs (BMGs)

here: “closest” = closest last common ancestor (lca)

Definition

The leaf y is a best match of the leaf x in a tree T if σ(x) 6= σ(y),and

(i) lca(x , y) � lca(x , y ′) for all leaves y ′ from speciesσ(y ′) = σ(y).

We write x → y .

σ = colors (= species)

a1 b2b1 a2c1

c2

a1

b2

b1

a2

c1

c2

Manuela Geiß More ideas about new orthology inference methods

Page 10: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Reciprocal Best Match Graphs (RBMGs)

Definition

The leaf y is a reciprocal best match of the leaf x in a tree T ifσ(x) 6= σ(y), and

(i) lca(x , y) � lca(x , y ′) for all leaves y ′ from speciesσ(y ′) = σ(y), and

(ii) lca(x , y) � lca(y , x ′) for all leaves x ′ from speciesσ(x ′) = σ(x).

σ = colors (= species)

a1 b2b1 a2c1

c2

a1

b2

b1

a2

c1

c2

→ Which (un-)directed graphs are (Reciprocal) Best Match Graphs, i.e.,have a tree representation?

Manuela Geiß More ideas about new orthology inference methods

Page 11: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Reciprocal Best Match Graphs (RBMGs)

Definition

The leaf y is a reciprocal best match of the leaf x in a tree T ifσ(x) 6= σ(y), and

(i) lca(x , y) � lca(x , y ′) for all leaves y ′ from speciesσ(y ′) = σ(y), and

(ii) lca(x , y) � lca(y , x ′) for all leaves x ′ from speciesσ(x ′) = σ(x).

σ = colors (= species)

a1 b2b1 a2c1

c2

a1

b2

b1

a2

c1

c2

→ Which (un-)directed graphs are (Reciprocal) Best Match Graphs, i.e.,have a tree representation?

Manuela Geiß More ideas about new orthology inference methods

Page 12: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Mathematical Results about (Reciprocal) Best MatchGraphs

BMGs1

Two characterizations for2-cBMGs via triples andneighborhoods→ Recognition in polynomialtime

Characterization for n-cBMGsvia Aho-Tree from 2-cBMGs→ Recognition and treereconstruction in polynomialtime

Unique least resolved tree

RBMGs2

Classification of three distinctgroups of 3-cRBMGs→ Recognition in polynomialtime

Characterization for n-cRBMGsvia supertree from 3-cRBMGs→ Recognition and treereconstruction presumably notin polynomial time

No unique least resolved tree→ Much information lost byonly looking at RBMGs!

1M. Geiß, E. Chavez, M. Gonzalez, A. Lopez, D. Valdivia, M. H. Rosales, B.M.R. Stadler, Marc Hellmuth,

P.F. Stadler, 2018, Best Match Graphs, J. Math. Biology (accepted - to appear)2

M. Geiß, Marc Hellmuth, P.F. Stadler, 2019, Reciprocal Best Match Graphs.. (manuscript in preparation)

Manuela Geiß More ideas about new orthology inference methods

Page 13: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Mathematical Results about (Reciprocal) Best MatchGraphs

BMGs1

Two characterizations for2-cBMGs via triples andneighborhoods→ Recognition in polynomialtime

Characterization for n-cBMGsvia Aho-Tree from 2-cBMGs→ Recognition and treereconstruction in polynomialtime

Unique least resolved tree

RBMGs2

Classification of three distinctgroups of 3-cRBMGs→ Recognition in polynomialtime

Characterization for n-cRBMGsvia supertree from 3-cRBMGs→ Recognition and treereconstruction presumably notin polynomial time

No unique least resolved tree

→ Much information lost byonly looking at RBMGs!

1M. Geiß, E. Chavez, M. Gonzalez, A. Lopez, D. Valdivia, M. H. Rosales, B.M.R. Stadler, Marc Hellmuth,

P.F. Stadler, 2018, Best Match Graphs, J. Math. Biology (accepted - to appear)2

M. Geiß, Marc Hellmuth, P.F. Stadler, 2019, Reciprocal Best Match Graphs.. (manuscript in preparation)

Manuela Geiß More ideas about new orthology inference methods

Page 14: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Mathematical Results about (Reciprocal) Best MatchGraphs

BMGs1

Two characterizations for2-cBMGs via triples andneighborhoods→ Recognition in polynomialtime

Characterization for n-cBMGsvia Aho-Tree from 2-cBMGs→ Recognition and treereconstruction in polynomialtime

Unique least resolved tree

RBMGs2

Classification of three distinctgroups of 3-cRBMGs→ Recognition in polynomialtime

Characterization for n-cRBMGsvia supertree from 3-cRBMGs→ Recognition and treereconstruction presumably notin polynomial time

No unique least resolved tree→ Much information lost byonly looking at RBMGs!

1M. Geiß, E. Chavez, M. Gonzalez, A. Lopez, D. Valdivia, M. H. Rosales, B.M.R. Stadler, Marc Hellmuth,

P.F. Stadler, 2018, Best Match Graphs, J. Math. Biology (accepted - to appear)2

M. Geiß, Marc Hellmuth, P.F. Stadler, 2019, Reciprocal Best Match Graphs.. (manuscript in preparation)

Manuela Geiß More ideas about new orthology inference methods

Page 15: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

How can we use all this?

Theorem 1

In pure DL scenarios (i.e. in the absence of HGT events) thereciprocal best match graph can only produce false positive but notfalse negative orthology assignments.⇒ The true orthology relation has to be contained in the RBMG.

→ Some false positive edges can be identified using best match graphs

a b c1c2

c2

c1

b

a

Remove middle edge:P4-Editing (P4E)

1M. Geiß, A. Lopez, D. Valdivia, M. H. Rosales, Marc Hellmuth, P.F. Stadler, 2019, Best Match Graphs and

Reconciliation of Gene Trees with Species Trees. J. Math. Biology (manuscript in preparation)

Manuela Geiß More ideas about new orthology inference methods

Page 16: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

How can we use all this?

Theorem 1

In pure DL scenarios (i.e. in the absence of HGT events) thereciprocal best match graph can only produce false positive but notfalse negative orthology assignments.⇒ The true orthology relation has to be contained in the RBMG.

→ Some false positive edges can be identified using best match graphs

a b c1c2

c2

c1

b

a

Remove middle edge:P4-Editing (P4E)

1M. Geiß, A. Lopez, D. Valdivia, M. H. Rosales, Marc Hellmuth, P.F. Stadler, 2019, Best Match Graphs and

Reconciliation of Gene Trees with Species Trees. J. Math. Biology (manuscript in preparation)

Manuela Geiß More ideas about new orthology inference methods

Page 17: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Simulation results with 0 HGT events

Number of Duplications

Num

ber

of L

osse

s

5

10

15

20

25

10 20 30 400.0

0.1

0.2

0.3

0.4

0.5

FPRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s

5

10

15

20

25

10 20 30 400.0

0.1

0.2

0.3

0.4

0.5

Number of Duplications

Num

ber

of L

osse

s

5

10

15

20

25

10 20 30 400.0

0.1

0.2

0.3

0.4

0.5

FNRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s

5

10

15

20

25

10 20 30 400.0

0.1

0.2

0.3

0.4

0.5

Manuela Geiß More ideas about new orthology inference methods

Page 18: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Simulation results with 0 HGT events

Number of Duplications

Num

ber

of L

osse

s

5

10

15

20

25

10 20 30 400.0

0.1

0.2

0.3

0.4

0.5

FPRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s

5

10

15

20

25

10 20 30 400.0

0.1

0.2

0.3

0.4

0.5

Number of Duplications

Num

ber

of L

osse

s

5

10

15

20

25

10 20 30 400.0

0.1

0.2

0.3

0.4

0.5

FNRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s5

10

15

20

25

10 20 30 400.0

0.1

0.2

0.3

0.4

0.5

Manuela Geiß More ideas about new orthology inference methods

Page 19: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Simulation results with 1 HGT event

Number of Duplications

Num

ber

of L

osse

s

0

2

4

6

5 10 150.0

0.2

0.4

0.6

0.8

1.0

FPRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s

0

2

4

6

5 10 150.0

0.2

0.4

0.6

0.8

1.0

Number of Duplications

Num

ber

of L

osse

s

0

2

4

6

5 10 150.0

0.2

0.4

0.6

0.8

1.0

FNRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s

0

2

4

6

5 10 150.0

0.2

0.4

0.6

0.8

1.0

Manuela Geiß More ideas about new orthology inference methods

Page 20: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Simulation results with 1 HGT event

Number of Duplications

Num

ber

of L

osse

s

0

2

4

6

5 10 150.0

0.2

0.4

0.6

0.8

1.0

FPRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s

0

2

4

6

5 10 150.0

0.2

0.4

0.6

0.8

1.0

Number of Duplications

Num

ber

of L

osse

s

0

2

4

6

5 10 150.0

0.2

0.4

0.6

0.8

1.0

FNRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s

0

2

4

6

5 10 150.0

0.2

0.4

0.6

0.8

1.0

Manuela Geiß More ideas about new orthology inference methods

Page 21: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Simulation results with 4 HGT events

Number of Duplications

Num

ber

of L

osse

s

5

10

15

10 15 20 250.0

0.2

0.4

0.6

0.8

1.0

FPRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s

5

10

15

10 15 20 250.0

0.2

0.4

0.6

0.8

1.0

Number of Duplications

Num

ber

of L

osse

s

5

10

15

10 15 20 250.0

0.2

0.4

0.6

0.8

1.0

FNRbefore

vs.afterP4E

Number of Duplications

Num

ber

of L

osse

s5

10

15

10 15 20 250.0

0.2

0.4

0.6

0.8

1.0

Manuela Geiß More ideas about new orthology inference methods

Page 22: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Summary & Outlook

Results so far:

Characterization and tree reconstruction algorithms for BMGsand RBMGs

RBMG contains no false positive orthologs in the absence ofHGT

P4-Editing in the absence of HGT

RBMG contains false negative orthologs in the presence ofHGT

→ RBMG loses much information that is still contained in theBMG!

Next steps:

BMGs might help to detect HGT events

Improved graph editing based on characterization of BMGsand RBMGs

Manuela Geiß More ideas about new orthology inference methods

Page 23: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Summary & Outlook

Results so far:

Characterization and tree reconstruction algorithms for BMGsand RBMGs

RBMG contains no false positive orthologs in the absence ofHGT

P4-Editing in the absence of HGT

RBMG contains false negative orthologs in the presence ofHGT

→ RBMG loses much information that is still contained in theBMG!

Next steps:

BMGs might help to detect HGT events

Improved graph editing based on characterization of BMGsand RBMGs

Manuela Geiß More ideas about new orthology inference methods

Page 24: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Special Thanks to:Peter F. StadlerMarc HellmuthNicolas WiesekeEdgar ChavezMarcos GonzalezMaribel Hernandez RosalesAlitzel LopezDulce Valdivia

Thank youfor your attention!

Manuela Geiß More ideas about new orthology inference methods

Page 25: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

Some basics: Rooted Trees and Triples

Rooted Tree T :

a cb edacyclic, connected graph

Triples:

T displays a triple ab|c if the pathfrom c to the root is notintersected by the path from a to b.

R(T ) = {ab|c , ab|d , ab|e}A set of triples R is said to beconsistent if there is a tree T withR ⊆ R(T ).

Consistency-check viaBUILD-algorithm in polynomialtime. In case of consistency, itreturns a tree T (the ”Aho Tree”)with R ⊆ R(T ).

Manuela Geiß More ideas about new orthology inference methods

Page 26: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

How do n-cBMGs look like?

Theorem

A colored digraph (G , σ) is a n-cBMG if and only if all inducedsubgraphs on two colors are 2-cBMG’s and the union of the triplesobtained from their least resolved trees forms a consistent set.

least resolved = ”lowest possible resolution”

→ The unique least resolved tree for (G , σ) can be reconstructedin cubic time

→ All information that is needed, is contained in the 2-coloredbest match graphs!

Manuela Geiß More ideas about new orthology inference methods

Page 27: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

The case of two colors: Characterization via triples

Some 2-colored subgraphs on 3 vertices give us constraints on thetree topology:

a b

c

a b

c

a b

c

a b

c

X1, X2, X3, and X4 all give the informative triple ab|c.

Theorem

A connected 2-colored digraph (G , σ) is a 2-cBMG if and only if(G , σ) = G (Aho(R(G , σ)), σ), where R(G , σ) is the set of allinformative triples of (G , σ).

Manuela Geiß More ideas about new orthology inference methods

Page 28: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

The case of two colors: Characterization viaout-neighborhoods

Augenkratze-Theorem

A connected 2-colored digraph (G , σ) is a 2-cBMG if and only if(G , σ) satisfies properties (N0), (N1), (N2), and (N3). Moreover,the tree T defined by the H′ := {R ′(α) | α ∈ N} is the uniqueleast resolved tree that explains (G , σ).

(N0) β ⊆ N(α) or β ∩ N(α) = ∅(N1) α ∩ N(β) = β ∩ N(α) = ∅ implies

N(α) ∩ N(N(β)) = N(β) ∩ N(N(α)) = ∅.(N2) N(N(N(α))) ⊆ N(α)

(N3) α ∩ N(N(β)) = β ∩ N(N(α)) = ∅ and N(α) ∩ N(β) 6= ∅implies N−(α) = N−(β) and N(α) ⊆ N(β) or N(β) ⊆ N(α)

properties can be nicely checked by an algorithm

Manuela Geiß More ideas about new orthology inference methods

Page 29: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

The three classes of 3-cRBMGs

There are exactly three classes of 3-cRBMGs:

a1 b1

c

a1 b1

c

a1 b1 a2 c1

a2a1 b1 c1

a1 b1 a2 c1 b2c2

a1 b1

a2

c1

b2

c2

(A) (B) (C)

Theorem

A graph (G , σ) is a 3-cRBMG if and only if the constructionalgorithm returns a tree that explains (G , σ).

Manuela Geiß More ideas about new orthology inference methods

Page 30: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

The three classes of 3-cRBMGs

There are exactly three classes of 3-cRBMGs:

a1

a2

a3

b1

b2

b3 c

a1

a2

b1

b2c

a3 b3

a1 b1 a2 c1

a4 c2

b2

c3

a3b3

a2

c2

a1 b1

b3

a3

c1

b2 a4

c3

a1 b1 a2 c1

a4 c4

b4

c3

a3b3

b2c2c5

a3

b3

a1 b1

a2

c1

a4

c4

b4

c3

b2

c2

c5

(A) (B) (C)

Theorem

A graph (G , σ) is a 3-cRBMG if and only if the constructionalgorithm returns a tree that explains (G , σ).

Manuela Geiß More ideas about new orthology inference methods

Page 31: More ideas about new orthology inference methods · !Our overall-goal: improve orthology inference/develop new methods 1Lechner M, Findeiˇ S, Steiner L, Marz M, Stadler PF, Prohaska

What can we say so far about n-cRBMGs?

Idea: Similarly to the case of BMGs, all information needed iscontained in the 3-colored induced subgraphs of (G , σ)

Conjecture

An undirected colored graph (G , σ) is an n-cRBMG if and only iffor any (Grst , σrst) there exists a tree (Trst , σrst) that explains(Grst , σrst), such that P :=

⋃r ,s,t Trst is compatible.

(Grst , σrst) := induced subgraph on colors r , s, t of (G , σ)

→ It looks like there is no polynomial-time construction algorithmfor n-cRBMGs

Manuela Geiß More ideas about new orthology inference methods