Faster Sorting by Reversals
Eric Tannier, Marie-France Sagot
INRIA, Lyon, France
Motivations: Genome Rearrangements
Human
Mouse
Sorting by Reversals
0 7 5 3 -1 -6 -2 4 8 (HS)
0 1 2 3 4 5 6 7 8 (MM)
0 1 -3 -5 -7 -6 -2 4 8
0 1 -3 -5 -4 2 6 7 8
0 1 -3 -2 4 5 6 7 8
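The sequence above can be replayed with a few lines of Python. This is a minimal sketch: the helper name `reverse` and the index intervals in `steps` are not from the slides; the intervals were read off by comparing consecutive rows, and the last one is the extra reversal that completes the sort.

```python
def reverse(perm, i, j):
    """Return a copy of perm with perm[i..j] (inclusive) reversed and every sign flipped."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]

human = [0, 7, 5, 3, -1, -6, -2, 4, 8]          # (HS) row of the slide
# Intervals inferred from the slide rows (an assumption, not given explicitly);
# the final interval turns -3 -2 into 2 3 and reaches the (MM) row.
steps = [(1, 4), (4, 7), (3, 5), (2, 3)]

history = [human]
for i, j in steps:
    history.append(reverse(history[-1], i, j))

for row in history:
    print(" ".join(str(x) for x in row))
```

Each call copies the list, so a single reversal costs time linear in the length of the permutation.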
History
1995 Hannenhalli and Pevzner: first polynomial algorithm, O(n^4)
1996 Berman and Hannenhalli: complexity improvement, O(n^2 α(n))
1997 Kaplan, Shamir and Tarjan: complexity improvement, O(n^2)
1997 Caprara: NP-completeness of the unsigned problem
2003 Bergeron: simple presentation
2003 Ozery-Flato and Shamir: "It is a central problem in the study of genome rearrangements whether one can obtain a subquadratic algorithm for sorting by reversals"
The Breakpoint Graph
Reality: 0 7 5 3 -1 -6 -2 4 8
Desire: 0 -1 -2 3 4 5 -6 7 8
The Breakpoint Graph
4 5: 1-cycle, adjacency
3 -4 5: 2-cycle
3 -4 5 6: 3-cycle
3 -4 -4.5 5 6: two 2-cycles
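The slides show the breakpoint graph only as a figure. The sketch below counts its cycles using the usual textbook construction, which is an assumption here: each signed element x is split into two endpoints, reality edges join neighbours in the current order, and desire edges join neighbours in the target order. The function name `cycle_lengths` is hypothetical, and the input leaves out the framing elements 0 and n+1.

```python
def cycle_lengths(perm):
    """perm: signed permutation of 1..n, without the framing 0 and n+1.
    Returns the number of reality edges in each cycle of the breakpoint graph."""
    n = len(perm)
    # Split x into endpoints (2x-1, 2x) if x > 0, or (2|x|, 2|x|-1) if x < 0.
    points = [0]
    for x in perm:
        a = abs(x)
        points += [2 * a - 1, 2 * a] if x > 0 else [2 * a, 2 * a - 1]
    points.append(2 * n + 1)

    # Reality edges: endpoints at positions (0,1), (2,3), ... of the current order.
    reality = {}
    for k in range(0, len(points), 2):
        u, v = points[k], points[k + 1]
        reality[u], reality[v] = v, u

    # Desire edges: values 2k and 2k+1, i.e. v and v ^ 1.
    seen, lengths = set(), []
    for start in points:
        if start in seen:
            continue
        length, v = 0, start
        while v not in seen:
            seen.add(v)
            w = reality[v]          # follow a reality edge
            seen.add(w)
            length += 1
            v = w ^ 1               # then a desire edge
        lengths.append(length)
    return lengths

print(cycle_lengths([7, 5, 3, -1, -6, -2, 4]))
```

Under this construction the running example 0 7 5 3 -1 -6 -2 4 8 has four cycles with two reality edges each, and a sorted permutation of size n has n + 1 cycles of length 1.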
The effect of a reversal on the cycles
0 7 5 3 -1 -6 -2 4 8
0 -1 -2 3 4 5 -6 7 8
0 1 -2 -3 4 -5 -6 -7 8
0 1 -3 -5 -7 -6 -2 4 8
0 7 5 3 -1 -6 -2 4 8
0 -1 -2 3 4 5 -6 7 8
0 7 -4 2 6 1 -3 -5 8
Oriented cycle
0 1 2 3 -4 -5 6 7 8
Non-oriented cycle
In the Breakpoint Graph
Oriented cycle = cycle with a blue edge joining elements of different signs
Component = set of cycles that do not cross any cycle outside the set
Oriented component = component with an oriented cycle
Unoriented component = component with no oriented cycle
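One way to read "joining different signs" is through oriented pairs: two elements whose absolute values are consecutive and whose signs differ. This is the formulation used in Bergeron's presentation listed under History; treating it as equivalent to the blue-edge definition here is an assumption. A small sketch, with the hypothetical helper `oriented_pairs`:

```python
def oriented_pairs(perm):
    """perm includes the framing elements 0 and n+1.
    Returns pairs (x, y) with |y| = |x| + 1 and opposite signs (0 counts as positive)."""
    signed = {abs(x): x for x in perm}      # absolute value -> signed element
    pairs = []
    for k in range(len(perm) - 1):
        x, y = signed[k], signed[k + 1]
        if (x < 0) != (y < 0):
            pairs.append((x, y))
    return pairs

print(oriented_pairs([0, 7, 5, 3, -1, -6, -2, 4, 8]))
```

For the running example this reports (0, -1), (-2, 3), (5, -6) and (-6, 7).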
The theorem of Hannenhalli and Pevzner
d = n + 1 - c + t
d: minimum number of reversals
n: size of the permutation
c: number of cycles in the breakpoint graph
t: number of reversals to clear unoriented components
With no unoriented component (t = 0):
d = n + 1 - c
0 -1 -2 3 4 5 -6 7 8
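As a worked instance for the running example 0 7 5 3 -1 -6 -2 4 8, take n = 7 (assuming the framing elements 0 and 8 are not counted) and c = 4 (under the usual breakpoint-graph construction the example has four cycles, and the slides state there is no unoriented component). The formula then gives:

```latex
\[
d \;=\; n + 1 - c \;=\; 7 + 1 - 4 \;=\; 4 .
\]
```

This agrees with the four-reversal solution A, B, D, C reached at the end of the algorithm walkthrough.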
A bad choice among oriented cycles
0 1 -2 -3 4 5 6 7 8
0 7 5 6 1 -3 -2 4 8
0 7 5 3 -1 -6 -2 4 8
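The bad choice can be checked by hand or with the short sketch below. The intervals are an interpretation of the slide, not stated on it: reversing the block 3 -1 -6 gives the row 0 7 5 6 1 -3 -2 4 8, and the only sign difference left is then removed by reversing -3 -2.

```python
def reverse(perm, i, j):
    """Reverse perm[i..j] (inclusive) and flip the signs; returns a new list."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]

start = [0, 7, 5, 3, -1, -6, -2, 4, 8]
after_bad = reverse(start, 3, 5)     # 3 -1 -6  ->  6 1 -3
stuck = reverse(after_bad, 5, 6)     # -3 -2    ->  2 3

all_positive = all(x >= 0 for x in stuck)
is_sorted = stuck == sorted(stuck)
print(after_bad, stuck, all_positive, is_sorted)
```

After these two reversals every element is positive, so no cycle can be oriented, yet the permutation is not sorted. This is the situation the slide warns about.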
Different approaches
Naive: choose any oriented cycle and apply the corresponding reversal; if it creates an unoriented component, choose another one. O(n^3)
Better: test properties of oriented cycles to find one whose reversal cannot create an unoriented component. O(n^2)
Our method: Bad oriented cycles are good ones... later
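For comparison, here is a sketch of one published selection rule of the "test a property" kind. It is the maximal-score rule from Bergeron's presentation cited under History, not the method of this talk: among the reversals induced by oriented pairs, apply one whose result has the most oriented pairs. All function names are hypothetical. The sketch assumes the input has no unoriented component and stops when no oriented pair is left.

```python
def reverse(perm, i, j):
    """Reverse perm[i..j] (inclusive) and flip the signs; returns a new list."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]

def oriented_reversals(perm):
    """Intervals (i, j) of the reversals induced by oriented pairs.
    perm includes the framing elements 0 and n+1."""
    pos = {abs(x): idx for idx, x in enumerate(perm)}
    intervals = []
    for k in range(len(perm) - 1):
        i, j = sorted((pos[k], pos[k + 1]))
        a, b = perm[i], perm[j]
        if (a < 0) == (b < 0):
            continue                          # same sign: not an oriented pair
        # a + b = +1: reverse perm[i..j-1]; a + b = -1: reverse perm[i+1..j]
        intervals.append((i, j - 1) if a + b == 1 else (i + 1, j))
    return intervals

def score(perm):
    """Number of oriented pairs (counted through their induced reversals)."""
    return len(oriented_reversals(perm))

def greedy_sort(perm):
    """Apply maximal-score oriented reversals until none is left."""
    perm = list(perm)
    steps = [perm]
    while True:
        candidates = oriented_reversals(perm)
        if not candidates:
            return steps
        perm = max((reverse(perm, i, j) for i, j in candidates), key=score)
        steps.append(perm)

for row in greedy_sort([0, 7, 5, 3, -1, -6, -2, 4, 8]):
    print(row)
```

On the running example this sorts in four reversals and passes through the same rows as the "Sorting by Reversals" slide. Every step rescans the whole permutation for each candidate, so this naive version is far from the running times discussed in the talk.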
The algorithm
1. 0 -1 -2 3 4 5 -6 7 8    Solution: empty          Labels: A B C D
2. 0 1 -2 -3 4 5 6 7 8     Solution: D              Labels: A B C
3. 0 1 2 3 4 5 6 7 8       Solution: D,C            Labels: A B
4. 0 1 -2 -3 4 5 6 7 8     Solution: (D,C)          Labels: A B C
5. 0 -1 -2 3 4 5 -6 7 8    Solution: (D,C)          Labels: A B C D
6. 0 1 -2 -3 4 -5 -6 -7 8  Solution: A...(D,C)      Labels: B C D
7. 0 1 2 -3 -4 -5 6 7 8    Solution: A,B...(D,C)    Labels: C D
8. 0 1 2 3 4 5 6 7 8       Solution: A,B,D,C
Time complexity
With any classical data structure, it takes linear time to perform a reversal, so at least quadratic time to sort.
Kaplan and Verbin (2003) invented a data structure for representing permutations that allows picking an oriented cycle and performing a reversal in time O(sqrt(n log(n))).
We use the same data structure to sort by reversals in time O(n sqrt(n log(n))).
The data structure
(Figure: the elements of 0 7 5 3 -1 -6 -2 4 8 arranged in the data structure)
Future work
Can we do better in time complexity?
Can the method give ideas to
- sort with several (>2) permutations? (NP-hard, Caprara, 2002)
- sort by transpositions? (unknown complexity)