Linear-time computation of local periods

Gregory Kucherov
INRIA/LORIA, Nancy, France

joint work with Roman Kolpakov (Moscow) and Jean-Pierre Duval, Thierry Lecroq, Arnaud Lefebvre (Rouen)

Haifa Stringology Workshop, April 3-8 2005


Periodicities (repetitions) in strings

period: $p$ is a period of $w$ if $w[i] = w[i+p]$ for all $1 \le i \le |w| - p$
the (global) period $p(w)$: the minimal period of $w$
periodicity = word $w$ of period $p(w) \le |w|/2$
Example: square, cube; fractional periodicity
periodicities = “runs” of squares
(cyclic) root, exponent (e.g. 8/3 for a word of length 8 with period 3)
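The definitions above can be checked on small words with a brute-force sketch. The function names and the sample word `abaabaab` are illustrative assumptions, not taken from the slides; words are assumed non-empty and indexed from 0.

```python
from fractions import Fraction

def periods(w):
    """All periods p of w (1 <= p <= |w|): w[j] == w[j+p] wherever both exist."""
    n = len(w)
    return [p for p in range(1, n + 1)
            if all(w[j] == w[j + p] for j in range(n - p))]

def minimal_period(w):
    """The (global) period of a non-empty word: its smallest period."""
    return periods(w)[0]

def exponent(w):
    """Length divided by minimal period, kept as an exact fraction."""
    return Fraction(len(w), minimal_period(w))

def is_periodicity(w):
    """A periodicity (repetition) is a word of exponent at least 2."""
    return exponent(w) >= 2

if __name__ == "__main__":
    for word in ["abab", "ababab", "abaabaab", "abc"]:
        print(word, minimal_period(word), exponent(word), is_periodicity(word))
```

Here `abab` is a square, `ababab` a cube, and `abaabaab` a fractional periodicity of exponent 8/3.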


Finding periodicities

CGCGGCAGTTTTGCCGACTGTTTGGGACTTGCTCGAACTTGCCTATGCCAAGCTGCCGACGATTCCGCCCACCCTGTTGGAACGCGATTTTAATTTCCCGCCTTTTTCCGAACTCGAAGCCGAAGTCGCCAAAATCGCCGATTATCAAACGCGTGCCGGAAAGGAATGCCGCCGTGCAGCCTGAAACCTCCGCCCAATACCAGCACCGTTTCGCCCAAGCCATACGCGGGGGCGAAGCCGCAGACGGTCTGCCGCAAGACCGACTGAACGTCTATATCCGCCTGATACGCAACAATATCTACAGCTTTATCGACCGTTGTTATACCGAAACGCTGCAATACTTTGACCGCGAAGAATGGGGCCGTCTGAAAGAAGGTTTCGTCCGCGACGCGTGCGCCCAAACGCCCTATTTTCAAGAAATCCCCGGCGAGTTCCTCCAATATTGCCAAAGCCTGCCGCTTTTAGACGGCATTTTGGCACTGATGGATTTTGAATATACCCAATTGCTGGCAGAAGTTGCTCAAATTCCGGATATTCCCGACATTCATTATTCAAATGACAGCAAATACACACCTTCCCCTGCGGCCTTTATCCGGCAATATCGATATGATGTTACCGATGATTTGCATGAAGCGGAAACAGCCTTGTTAATATGGCGAAACGCCGAAGATGATGTGATGTACCAAACATTGGACGGCTTCGATATGATGCTGCTAGAAATAATGGGGTTCTCCGCGCTTTCGTTTGACACCCTCGCCCAAACCCTTGTCGAATTTATGCCTGAGGACGATAATTGGAAAAATATTTTGCTTGGGAAATGGTCAGGCTGGACTGAACAAAGGATTATCATCCCCTCCTTGTCCGCCATATCCGAAAATATGGAAGACAATTCCCCGGGCC


Some work has been done ...

... see R. Kolpakov, G. Kucherov, Periodic structures in words, chapter of the 3rd Lothaire volume Applied Combinatorics on Words, Cambridge University Press, 2005


different results based on common simple techniques: extension functions and s-factorization


Rest of this talk

Basics
– extension functions
– computing periodicities in time $O(n \log n)$
– s-factorization (Lempel-Ziv factorization)
– computing periodicities in time $O(n)$

Computing all local periods in time $O(n)$


Extension function: simplest definition

all values can be computed in time $O(n)$ [Main&Lorentz 84]
a refined algorithm is presented in [Lothaire 05] (inspired by Manacher’s linear-time algorithm for computing palindromes)
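As an illustration, assume the simplest extension function maps each position $i$ to the length of the longest common prefix of $w$ and its suffix starting at $i$ (an assumption; the slide's exact definition is not reproduced here). This is the function often called the Z-function, and a standard linear-time sketch follows. It is not claimed to be the refined algorithm of [Lothaire 05]; the name `z_function` is illustrative.

```python
def z_function(s):
    """z[i] = length of the longest common prefix of s and s[i:]; z[0] = len(s)."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0          # s[left:right] is the rightmost known match with a prefix
    for i in range(1, n):
        if i < right:
            # reuse the value computed for the mirrored position inside the window
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1         # extend by direct comparison past the window
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z

if __name__ == "__main__":
    print(z_function("aabaab"))
```

Each successful comparison in the inner loop moves `right` forward, which is why the total work is linear.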


Extension function: variants


Using extension functions to compute periodicities

Lemma: There exists a square of period iff
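One common way to state a lemma of this kind, given here as an assumption rather than as the slide's own formulation: fix a position $b$ and a period $p$, let `forward` be how far $w[b..]$ and $w[b+p..]$ agree, and let `backward` be how far $w[..b)$ and $w[..b+p)$ agree going left. Then a square of period $p$ whose first root contains or touches $b$ exists iff `forward + backward >= p`. The sketch below computes the two extensions by direct comparison (so it is not linear-time) and returns all such squares; `squares_through` is a hypothetical name.

```python
def squares_through(w, b, p):
    """Start positions s of squares w[s:s+2p] of period p with s <= b <= s + p.

    Extensions are computed naively here; a linear-time extension function
    would supply the same two numbers.
    """
    n = len(w)
    if b + p > n:
        return []                       # no such square can fit inside w
    forward = 0
    while b + p + forward < n and w[b + forward] == w[b + p + forward]:
        forward += 1
    backward = 0
    while b - 1 - backward >= 0 and w[b - 1 - backward] == w[b + p - 1 - backward]:
        backward += 1
    if forward + backward < p:
        return []                       # the condition of the lemma fails
    return list(range(max(b - p, b - backward), min(b, b + forward - p) + 1))

if __name__ == "__main__":
    w = "atacgaacgaacggtacgaacga"
    for s in squares_through(w, 7, 4):
        print(s, w[s:s + 8])
```

On the word `atacgaacgaacggtacgaacga` with `b = 7` and `p = 4`, the squares found include `cgaacgaa` and `gaacgaac`.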


Example:

a t a c g a a c g a a c g g t a c g a a c g a

c g a a c g a a
g a a c g a a c


This implies (using binary division) that
– one can compute a compact representation of all squares (maximal periodicities) in time $O(n \log n)$
– one can compute all squares in time $O(n \log n)$ [Crochemore 81, Main&Lorentz 84]
– one can test the square-freeness in time $O(n \log n)$
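The binary-division idea can be sketched for the square-freeness test: split $w = uv$ in the middle, recurse on both halves, and look for squares crossing the border in linear time with extension functions. The code below is a minimal sketch in the spirit of [Main&Lorentz 84], not a transcription of it; the helper names are illustrative, and `z_function` is the longest-common-prefix extension function, repeated so that the block runs on its own.

```python
def z_function(s):
    """z[i] = length of the longest common prefix of s and s[i:]; z[0] = len(s)."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z

def has_square_across(u, v):
    """True if uv has a square whose first root contains or touches the u|v border."""
    zv = z_function(v) + [0]             # zv[p]: common prefix of v and v[p:]
    ru = u[::-1]
    zt = z_function(ru + v[::-1] + ru)   # gives common suffixes of u and u + v[:p]
    for p in range(1, len(v) + 1):
        forward = zv[p]
        backward = min(len(u), zt[len(u) + len(v) - p])
        if forward + backward >= p:
            return True
    return False

def is_square_free(w):
    """Divide and conquer: O(n) work per level, O(log n) levels."""
    if len(w) < 2:
        return True
    mid = len(w) // 2
    u, v = w[:mid], w[mid:]
    return (is_square_free(u) and is_square_free(v)
            and not has_square_across(u, v)
            # squares centered in u are found on the reversed word
            and not has_square_across(v[::-1], u[::-1]))

if __name__ == "__main__":
    print(is_square_free("abcacb"), is_square_free("atacgaacgaacggtacgaacga"))
```

Squares centered in the left half are handled by applying the same border test to the reversed word, which keeps the sketch short.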


s-factorization (Lempel-Ziv factorization)

$w = s_1 s_2 \cdots s_k$, where:
– if the letter $a$ which immediately follows $s_1 \cdots s_{i-1}$ does not occur in $s_1 \cdots s_{i-1}$, then $s_i = a$
– otherwise $s_i$ is the longest subword occurring at least twice in $s_1 \cdots s_i$

s-factorization (Lempel-Ziv factorization) can be computed in linear time using suffix tree or DAWG
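The definition can be illustrated with a quadratic-time sketch that follows it literally; the linear-time construction with a suffix tree or DAWG mentioned above is not attempted here. The function name and the sample words are illustrative assumptions.

```python
def s_factorization(w):
    """Factors s_1, ..., s_k of w, computed naively in quadratic time.

    Each factor is the longest prefix of the remaining text that also starts
    at an earlier position (occurrences may overlap), or a single new letter.
    """
    factors = []
    n = len(w)
    i = 0
    while i < n:
        best = 0
        for j in range(i):                  # candidate earlier start positions
            k = 0
            while i + k < n and w[j + k] == w[i + k]:
                k += 1
            best = max(best, k)
        length = max(best, 1)               # a letter not seen before stands alone
        factors.append(w[i:i + length])
        i += length
    return factors

if __name__ == "__main__":
    print(s_factorization("abaabab"))
    print(s_factorization("aaaa"))
```

For `aaaa` the second factor is `aaa`: it occurs twice in `aaaa` because the two occurrences are allowed to overlap.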


Why s-factorization is useful here


lemma of [Main 89]


Computing (a compact representation of) all squares in linear time

1. compute the s-factorization of $w$ (in $O(n)$)
2. for each factor $s_i$
A. compute all maximal periodicities ending inside $s_i$ and crossing the border between $s_{i-1}$ and $s_i$ (in $O(|s_{i-1}| + |s_i|)$)
B. recover all maximal periodicities occurring inside $s_i$ from a left copy of $s_i$ (in time linear in $|s_i|$ plus the number of periodicities recovered)

Important: the number of maximal periodicities is $O(n)$ while the number of squares can be $\Omega(n \log n)$
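To see why maximal periodicities form a compact representation of squares, the brute-force sketch below lists them as triples `(start, end, period)` and expands each one into the squares it contains. It is a quadratic illustration only, not the linear-time algorithm outlined above; all names are illustrative.

```python
def minimal_period(w):
    """Smallest p with w[j] == w[j+p] wherever both positions exist."""
    n = len(w)
    return next(p for p in range(1, n + 1)
                if all(w[j] == w[j + p] for j in range(n - p)))

def maximal_periodicities(w):
    """Triples (start, end, p): w[start:end] has minimal period p, length >= 2p,
    and cannot be extended to the left or right with the same period."""
    n = len(w)
    runs = []
    for p in range(1, n // 2 + 1):
        j = 0
        while j + p < n:
            if w[j] != w[j + p]:
                j += 1
                continue
            start = j
            while j + p < n and w[j] == w[j + p]:
                j += 1
            end = j + p                      # w[start:end] has period p
            if end - start >= 2 * p and minimal_period(w[start:end]) == p:
                runs.append((start, end, p))
    return runs

def squares_from_runs(runs):
    """Every square (start, period) with primitive root, read off the runs."""
    return [(s, p) for (a, e, p) in runs for s in range(a, e - 2 * p + 1)]

if __name__ == "__main__":
    w = "atacgaacgaacggtacgaacga"
    for a, e, p in maximal_periodicities(w):
        print(a, e, p, w[a:e])
```

A run of length $L$ and period $p$ stands for $L - 2p + 1$ squares, so one triple can encode many squares.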


Using extension functions + s-factorization to compute periodicities

This implies that
– one can compute a compact representation of all squares (maximal periodicities) in time $O(n)$ [Kolpakov,Kucherov 99]
– one can compute all squares (but also cubes, ...) in time $O(n + S)$, where $S$ is the output size
– one can test the square-freeness in time $O(n)$ [Crochemore 83, Main&Lorentz 85]


Local periods

minimal (local) square at $i$ = minimal square centered at $i$
local period at $i$ (denoted $LP(i)$) = root length of the minimal square at $i$

internal square

right-external square

left- and right-external square
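The three kinds of minimal squares can be told apart with a direct sketch. It assumes positions are the gaps $i = 1, \dots, n-1$ between letters of a 0-indexed word, and that a square of period $p$ centered at $i$ only has to agree with $w$ where it overlaps $w$; both conventions are assumptions, as is the name `minimal_square`.

```python
def minimal_square(w, i):
    """Return (p, kind) for the minimal square centered at gap i (1 <= i <= len(w) - 1).

    A square of period p centered at i is accepted if w[j] == w[j+p] for every j
    in [i-p, i) such that both positions lie inside w.
    """
    n = len(w)
    for p in range(1, n + 1):
        if all(w[j] == w[j + p] for j in range(max(0, i - p), min(i, n - p))):
            sticks_out_left = i - p < 0
            sticks_out_right = i + p > n
            if sticks_out_left and sticks_out_right:
                kind = "left- and right-external"
            elif sticks_out_left:
                kind = "left-external"
            elif sticks_out_right:
                kind = "right-external"
            else:
                kind = "internal"
            return p, kind

if __name__ == "__main__":
    w = "abaab"
    for i in range(1, len(w)):
        print(w[:i] + "|" + w[i:], minimal_square(w, i))
```

The loop always returns, since $p = n$ imposes no constraint at all.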


Critical Factorization Theorem

for any position $i$, the local period $LP(i) \le$ global period of $w$

Critical Factorization Theorem: For every $w$, there exists a position $i$ such that $LP(i)$ = global period of $w$
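Both statements can be checked exhaustively on short words. The sketch below uses brute-force definitions, with positions taken as the gaps $1, \dots, n-1$ of a 0-indexed word of length at least 2 (an assumption about the indexing); the names are illustrative.

```python
from itertools import product

def global_period(w):
    n = len(w)
    return next(p for p in range(1, n + 1)
                if all(w[j] == w[j + p] for j in range(n - p)))

def local_period(w, i):
    """Smallest p such that w[j] == w[j+p] for all j in [i-p, i) lying inside w."""
    n = len(w)
    return next(p for p in range(1, n + 1)
                if all(w[j] == w[j + p] for j in range(max(0, i - p), min(i, n - p))))

def critical_positions(w):
    """Positions whose local period equals the global period."""
    g = global_period(w)
    return [i for i in range(1, len(w)) if local_period(w, i) == g]

if __name__ == "__main__":
    print(critical_positions("abaab"))
    # exhaustive check of both statements on binary words of length 2..8
    for n in range(2, 9):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            assert all(local_period(word, i) <= global_period(word)
                       for i in range(1, n))
            assert critical_positions(word)
```

For `abaab` the local periods are 2, 3, 1, 3 and the global period is 3, so positions 2 and 4 are critical.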


Computing local periods (minimal squares)

compute separately
– internal minimal squares
– left-external and right-external minimal squares
– both left- and right-external minimal squares
focus on internal minimal squares
compute s-factorization
for each factor $s_i$, compute minimal squares ending in this factor


Minimal squares inside a factor


Minimal squares crossing factor border


focus on squares crossing the left border of $s_i$
focus on those of them centered inside $s_i$
general idea: compute squares and pick the minimal ones
be careful, the number of squares can be super-linear!!
compute maximal periodicities in increasing order of periods
only a linear number of squares need to be tested for minimality!!


Sketch of the proof


assume we are looking at squares of period
consider largest period for which squares have been found
if , then test all squares of period (at most )
if , then either , or
at most squares need to be tested


Computing (right-)external squares


use extension functions!
for each , find minimal such that
can be done in time $O(n)$
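One way this step could look, given as a sketch under stated assumptions rather than as the talk's actual procedure: a square centered at gap $i$ with period $p$ is right-external (and not left-external) when $n - i < p \le i$, which means the suffix $w[i..]$ has an occurrence ending strictly before $i$. With a suffix-oriented extension function `ls[e]` (longest common suffix of $w$ and $w[..e)$), the rightmost such occurrence for every $i$ can be read off a bucket array in linear total time. The names and the 0-indexed gap convention are assumptions.

```python
def z_function(s):
    """z[i] = length of the longest common prefix of s and s[i:]; z[0] = len(s)."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z

def right_external_periods(w):
    """Map each gap i to the smallest p with n - i < p <= i and w[i-p:n-p] == w[i:]."""
    n = len(w)
    z = z_function(w[::-1])
    ls = [0] * (n + 1)                  # ls[e]: longest common suffix of w and w[:e]
    for e in range(1, n + 1):
        ls[e] = z[n - e]
    # best[t] = largest end e of an occurrence of the length-t suffix with e < n - t
    best = [-1] * (n + 1)
    for e in range(1, n):
        cap = min(ls[e], n - e - 1)     # e serves every suffix length t <= cap
        best[cap] = max(best[cap], e)
    for t in range(n - 1, -1, -1):
        best[t] = max(best[t], best[t + 1])
    result = {}
    for i in range(1, n):
        t = n - i                       # length of the suffix w[i:]
        if best[t] >= 0:
            result[i] = i - best[t] + t
    return result

if __name__ == "__main__":
    print(right_external_periods("abaab"))
```

The local period at $i$ is then the smaller of this value and the period of the minimal internal square at $i$, when the latter exists.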


Conclusions

All local periods can be computed in $O(n)$

note that the global period of $w$ is the maximum of its local periods