117
Karp-Rabin and Streaming Pattern Matching Jakub Radoszewski University of Warsaw, Poland Text Algorithms Karp-Rabin and Streaming Pattern Matching 1/24

 · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Karp-Rabin and Streaming Pattern Matching

Jakub Radoszewski

University of Warsaw, Poland

Text Algorithms

Karp-Rabin and Streaming Pattern Matching 1/24

Page 2:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Pattern Matching

Problem (Pattern Matching)

Given two strings, a text t[0..n − 1] and a pattern p[0..m − 1], find alloccurrences of the pattern p in the text t.

a b a b a a a b a b a a a b a b a

a b a a a b a b

a b a a a b a b aa b a a a b a b

a b a a a b a b

a b a b a a a b aa b a a a b a b

a b a a a b a b

Karp-Rabin and Streaming Pattern Matching 2/24

Page 3:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Pattern Matching

Problem (Pattern Matching)

Given two strings, a text t[0..n − 1] and a pattern p[0..m − 1], find alloccurrences of the pattern p in the text t.

a b a b a a a b a b a a a b a b a

a b a a a b a b

a b a a a b a b aa b a a a b a b

a b a a a b a b

a b a b a a a b aa b a a a b a b

a b a a a b a b

Karp-Rabin and Streaming Pattern Matching 2/24

Page 4:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Pattern Matching

Problem (Pattern Matching)

Given two strings, a text t[0..n − 1] and a pattern p[0..m − 1], find alloccurrences of the pattern p in the text t.

a b a b a a a b a b a a a b a b a

a b a a a b a b

a b a a a b a b aa b a a a b a b

a b a a a b a b

a b a b a a a b aa b a a a b a b

a b a a a b a b

Karp-Rabin and Streaming Pattern Matching 2/24

Page 5:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

String Hashing

For a string s of length m, we use the Karp-Rabin hash function(polynomial hash, rolling hash):

h(s) = (s[0]rm−1 + s[1]rm−2 + . . .+ s[m − 2]r + s[m − 1]) mod q.

Here r and q are integer parameters.

Theorem

If q is a prime number, the probability that for a random r two strings ofthe same length m receive the same hash (i.e., h(s) = h(t) for s 6= t) is¬ m

q−1 .

Conclusion: One needs to properly choose q and r .

Karp-Rabin and Streaming Pattern Matching 3/24

Page 6:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

String Hashing

For a string s of length m, we use the Karp-Rabin hash function(polynomial hash, rolling hash):

h(s) = (s[0]rm−1 + s[1]rm−2 + . . .+ s[m − 2]r + s[m − 1]) mod q.

Here r and q are integer parameters.

Theorem

If q is a prime number, the probability that for a random r two strings ofthe same length m receive the same hash (i.e., h(s) = h(t) for s 6= t) is¬ m

q−1 .

Conclusion: One needs to properly choose q and r .

Karp-Rabin and Streaming Pattern Matching 3/24

Page 7:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

String Hashing

For a string s of length m, we use the Karp-Rabin hash function(polynomial hash, rolling hash):

h(s) = (s[0]rm−1 + s[1]rm−2 + . . .+ s[m − 2]r + s[m − 1]) mod q.

Here r and q are integer parameters.

Theorem

If q is a prime number, the probability that for a random r two strings ofthe same length m receive the same hash (i.e., h(s) = h(t) for s 6= t) is¬ m

q−1 .

Conclusion: One needs to properly choose q and r .

Karp-Rabin and Streaming Pattern Matching 3/24

Page 8:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Computing String Hashes

Denote:Hi = h(s[0..i ]).

Then:

H0 = s[0] mod q

H1 = (s[0]r + s[1]) mod q

H2 = (s[0]r 2 + s[1]r + s[2]) mod q

. . .

Hi+1 = (Hi r + s[i + 1]) mod q

Thus we can compute Hm−1 = h(s) in O(m) time and O(1) space.

Karp-Rabin and Streaming Pattern Matching 4/24

Page 9:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Computing String Hashes

Denote:Hi = h(s[0..i ]).

Then:

H0 = s[0] mod q

H1 = (s[0]r + s[1]) mod q

H2 = (s[0]r 2 + s[1]r + s[2]) mod q

. . .

Hi+1 = (Hi r + s[i + 1]) mod q

Thus we can compute Hm−1 = h(s) in O(m) time and O(1) space.

Karp-Rabin and Streaming Pattern Matching 4/24

Page 10:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Computing String Hashes

Denote:Hi = h(s[0..i ]).

Then:

H0 = s[0] mod q

H1 = (s[0]r + s[1]) mod q

H2 = (s[0]r 2 + s[1]r + s[2]) mod q

. . .

Hi+1 = (Hi r + s[i + 1]) mod q

Thus we can compute Hm−1 = h(s) in O(m) time and O(1) space.

Karp-Rabin and Streaming Pattern Matching 4/24

Page 11:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Rolling Property

Denote:Hi,j = h(s[i ..j ]).

Then:

Hi,j = (s[i ]r j−i + s[i + 1]r j−i−1 . . .+ s[j ]) mod q

Hi+1,j+1 = (s[i + 1]r j−i + . . .+ s[j ]r + s[j + 1]) mod q

Hi+1,j+1 = ((Hi,j − s[i ]r j−i ) · r + s[j + 1]) mod q

The power r a can be computed in O(log a) time using doubling.

Karp-Rabin and Streaming Pattern Matching 5/24

Page 12:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Rolling Property

Denote:Hi,j = h(s[i ..j ]).

Then:

Hi,j = (s[i ]r j−i + s[i + 1]r j−i−1 . . .+ s[j ]) mod q

Hi+1,j+1 = (s[i + 1]r j−i + . . .+ s[j ]r + s[j + 1]) mod q

Hi+1,j+1 = ((Hi,j − s[i ]r j−i ) · r + s[j + 1]) mod q

The power r a can be computed in O(log a) time using doubling.

Karp-Rabin and Streaming Pattern Matching 5/24

Page 13:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Karp-Rabin Pattern Matching

Idea of the algorithm:1 Compute the hash of the pattern, h(p)

2 Compute the rolling hashes of all substrings of the text of length m,h(t[i −m + 1..i ]) = Hi−m+1,i

3 If any of the hashes is equal to h(p):(a) Either report an occurrence at position i (may incur false matches

with a small probability)(b) Or check if this is indeed an occurrence assiduously (has O(nm)-time

behavior in the worst case)

Karp-Rabin and Streaming Pattern Matching 6/24

Page 14:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

A Property of String Hashes

i j ku = s[i ..j ] v = s[j + 1..k]

w = s[i ..k]

h(w) = (h(u) · rk−j + h(v)) mod q

h(v) = (h(w)− h(u) · rk−j) mod q

h(u) = ((h(w)− h(v)) · (r−1)k−j) mod q

r−1 mod q can be precomputed in O(log q) time.

Lemma

Knowing the hashes of any two of the three strings u, v ,w , we cancompute the hash of the third in constant time.

Trick: For a string x of length m, together with its hash, we storerm mod q and (r−1)m mod q.

Karp-Rabin and Streaming Pattern Matching 7/24

Page 15:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

A Property of String Hashes

i j ku = s[i ..j ] v = s[j + 1..k]

w = s[i ..k]

h(w) = (h(u) · rk−j + h(v)) mod q

h(v) = (h(w)− h(u) · rk−j) mod q

h(u) = ((h(w)− h(v)) · (r−1)k−j) mod q

r−1 mod q can be precomputed in O(log q) time.

Lemma

Knowing the hashes of any two of the three strings u, v ,w , we cancompute the hash of the third in constant time.

Trick: For a string x of length m, together with its hash, we storerm mod q and (r−1)m mod q.

Karp-Rabin and Streaming Pattern Matching 7/24

Page 16:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

A Property of String Hashes

i j ku = s[i ..j ] v = s[j + 1..k]

w = s[i ..k]

h(w) = (h(u) · rk−j + h(v)) mod q

h(v) = (h(w)− h(u) · rk−j) mod q

h(u) = ((h(w)− h(v)) · (r−1)k−j) mod q

r−1 mod q can be precomputed in O(log q) time.

Lemma

Knowing the hashes of any two of the three strings u, v ,w , we cancompute the hash of the third in constant time.

Trick: For a string x of length m, together with its hash, we storerm mod q and (r−1)m mod q.

Karp-Rabin and Streaming Pattern Matching 7/24

Page 17:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

A Property of String Hashes

i j ku = s[i ..j ] v = s[j + 1..k]

w = s[i ..k]

h(w) = (h(u) · rk−j + h(v)) mod q

h(v) = (h(w)− h(u) · rk−j) mod q

h(u) = ((h(w)− h(v)) · (r−1)k−j) mod q

r−1 mod q can be precomputed in O(log q) time.

Lemma

Knowing the hashes of any two of the three strings u, v ,w , we cancompute the hash of the third in constant time.

Trick: For a string x of length m, together with its hash, we storerm mod q and (r−1)m mod q.

Karp-Rabin and Streaming Pattern Matching 7/24

Page 18:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

A Property of String Hashes

i j ku = s[i ..j ] v = s[j + 1..k]

w = s[i ..k]

h(w) = (h(u) · rk−j + h(v)) mod q

h(v) = (h(w)− h(u) · rk−j) mod q

h(u) = ((h(w)− h(v)) · (r−1)k−j) mod q

r−1 mod q can be precomputed in O(log q) time.

Lemma

Knowing the hashes of any two of the three strings u, v ,w , we cancompute the hash of the third in constant time.

Trick: For a string x of length m, together with its hash, we storerm mod q and (r−1)m mod q.

Karp-Rabin and Streaming Pattern Matching 7/24

Page 19:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

A Property of String Hashes

i j ku = s[i ..j ] v = s[j + 1..k]

w = s[i ..k]

h(w) = (h(u) · rk−j + h(v)) mod q

h(v) = (h(w)− h(u) · rk−j) mod q

h(u) = ((h(w)− h(v)) · (r−1)k−j) mod q

r−1 mod q can be precomputed in O(log q) time.

Lemma

Knowing the hashes of any two of the three strings u, v ,w , we cancompute the hash of the third in constant time.

Trick: For a string x of length m, together with its hash, we storerm mod q and (r−1)m mod q.

Karp-Rabin and Streaming Pattern Matching 7/24

Page 20:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

A Property of String Hashes

i j ku = s[i ..j ] v = s[j + 1..k]

w = s[i ..k]

h(w) = (h(u) · rk−j + h(v)) mod q

h(v) = (h(w)− h(u) · rk−j) mod q

h(u) = ((h(w)− h(v)) · (r−1)k−j) mod q

r−1 mod q can be precomputed in O(log q) time.

Lemma

Knowing the hashes of any two of the three strings u, v ,w , we cancompute the hash of the third in constant time.

Trick: For a string x of length m, together with its hash, we storerm mod q and (r−1)m mod q.

Karp-Rabin and Streaming Pattern Matching 7/24

Page 21:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Pattern Matching in the Streaming Model

Streaming model:

It was introduced for the purposes of processing massive data insmall space

We cannot afford to store the whole input data

The input data arrives as a stream of items. It is read only once

Karp-Rabin and Streaming Pattern Matching 8/24

Page 22:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Example 1: Missing Number

Problem.A sequence of numbers is given; it contains all numbers from {1, . . . , n}(in some order) except for one number. Find the missing number!

Example. n = 76, 3, 4, 1, 7, 2

−→ 5 is missing!

Karp-Rabin and Streaming Pattern Matching 9/24

Page 23:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Example 1: Missing Number

Problem.A sequence of numbers is given; it contains all numbers from {1, . . . , n}(in some order) except for one number. Find the missing number!

Example. n = 76, 3, 4, 1, 7, 2 −→ 5 is missing!

Karp-Rabin and Streaming Pattern Matching 9/24

Page 24:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Solution

Goal: Use O(1) space.

a — Missing numberS — Sum of input numbers

S + a = 1 + . . .+ nSo a = n(n+1)

2 − S

Karp-Rabin and Streaming Pattern Matching 10/24

Page 25:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Solution

Goal: Use O(1) space.

a — Missing numberS — Sum of input numbers

S + a = 1 + . . .+ nSo a = n(n+1)

2 − S

Karp-Rabin and Streaming Pattern Matching 10/24

Page 26:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Solution

Goal: Use O(1) space.

a — Missing numberS — Sum of input numbers

S + a = 1 + . . .+ n

So a = n(n+1)2 − S

Karp-Rabin and Streaming Pattern Matching 10/24

Page 27:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Solution

Goal: Use O(1) space.

a — Missing numberS — Sum of input numbers

S + a = 1 + . . .+ nSo a = n(n+1)

2 − S

Karp-Rabin and Streaming Pattern Matching 10/24

Page 28:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Example 2: Two Missing Numbers

Problem.A sequence of numbers is given; it contains all numbers from {1, . . . , n}(in some order) except for two numbers. Find the two missing numbers!

Example. n = 76, 4, 1, 7, 2

−→ 3 and 5 are missing

Karp-Rabin and Streaming Pattern Matching 11/24

Page 29:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Example 2: Two Missing Numbers

Problem.A sequence of numbers is given; it contains all numbers from {1, . . . , n}(in some order) except for two numbers. Find the two missing numbers!

Example. n = 76, 4, 1, 7, 2 −→ 3 and 5 are missing

Karp-Rabin and Streaming Pattern Matching 11/24

Page 30:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Solution

a, b — missing numbersS1 — sum of input numbers

S2 — sum of squares of input numbers

S1 + a + b = 1 + . . .+ n = n(n+1)2

S2 + a2 + b2 = 12 + . . .+ n2 = n(n+1)(2n+1)6

Then:

a + b = n(n+1)2 − S1

= X

a2 + b2 = n(n+1)(2n+1)6 − S2

= Y

Karp-Rabin and Streaming Pattern Matching 12/24

Page 31:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Solution

a, b — missing numbersS1 — sum of input numbers

S2 — sum of squares of input numbers

S1 + a + b = 1 + . . .+ n = n(n+1)2

S2 + a2 + b2 = 12 + . . .+ n2 = n(n+1)(2n+1)6

Then:

a + b = n(n+1)2 − S1

= X

a2 + b2 = n(n+1)(2n+1)6 − S2

= Y

Karp-Rabin and Streaming Pattern Matching 12/24

Page 32:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Solution

a, b — missing numbersS1 — sum of input numbersS2 — sum of squares of input numbers

S1 + a + b = 1 + . . .+ n = n(n+1)2

S2 + a2 + b2 = 12 + . . .+ n2 = n(n+1)(2n+1)6

Then:

a + b = n(n+1)2 − S1

= X

a2 + b2 = n(n+1)(2n+1)6 − S2

= Y

Karp-Rabin and Streaming Pattern Matching 12/24

Page 33:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Solution

a, b — missing numbersS1 — sum of input numbersS2 — sum of squares of input numbers

S1 + a + b = 1 + . . .+ n = n(n+1)2

S2 + a2 + b2 = 12 + . . .+ n2 = n(n+1)(2n+1)6

Then:

a + b = n(n+1)2 − S1

= X

a2 + b2 = n(n+1)(2n+1)6 − S2

= Y

Karp-Rabin and Streaming Pattern Matching 12/24

Page 34:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Solution

a, b — missing numbersS1 — sum of input numbersS2 — sum of squares of input numbers

S1 + a + b = 1 + . . .+ n = n(n+1)2

S2 + a2 + b2 = 12 + . . .+ n2 = n(n+1)(2n+1)6

Then:

a + b = n(n+1)2 − S1

= X

a2 + b2 = n(n+1)(2n+1)6 − S2

= Y

Karp-Rabin and Streaming Pattern Matching 12/24

Page 35:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Solution

a, b — missing numbersS1 — sum of input numbersS2 — sum of squares of input numbers

S1 + a + b = 1 + . . .+ n = n(n+1)2

S2 + a2 + b2 = 12 + . . .+ n2 = n(n+1)(2n+1)6

Then:

a + b = n(n+1)2 − S1 = X

a2 + b2 = n(n+1)(2n+1)6 − S2 = Y

Karp-Rabin and Streaming Pattern Matching 12/24

Page 36:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

System of Equations

{a + b = X

a2 + b2 = Y

{b = X − a

a2 + (X − a)2 = Y

{b = X − a

2a2 − 2aX + X 2 = Y

{b = X − a

2a2 − (2X )a + (X 2 − Y ) = 0

∆ = 4X 2 − 8(X 2 − Y ) = 8Y − 4X 2{a1 = 2X−

√∆

4

b1 = X − a1

{a2 = 2X+

√∆

4

b2 = X − a2

Why two solutions?

Karp-Rabin and Streaming Pattern Matching 13/24

Page 37:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

System of Equations

{a + b = X

a2 + b2 = Y

{b = X − a

a2 + (X − a)2 = Y

{b = X − a

2a2 − 2aX + X 2 = Y

{b = X − a

2a2 − (2X )a + (X 2 − Y ) = 0

∆ = 4X 2 − 8(X 2 − Y ) = 8Y − 4X 2{a1 = 2X−

√∆

4

b1 = X − a1

{a2 = 2X+

√∆

4

b2 = X − a2

Why two solutions?

Karp-Rabin and Streaming Pattern Matching 13/24

Page 38:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

System of Equations

{a + b = X

a2 + b2 = Y

{b = X − a

a2 + (X − a)2 = Y

{b = X − a

2a2 − 2aX + X 2 = Y

{b = X − a

2a2 − (2X )a + (X 2 − Y ) = 0

∆ = 4X 2 − 8(X 2 − Y ) = 8Y − 4X 2{a1 = 2X−

√∆

4

b1 = X − a1

{a2 = 2X+

√∆

4

b2 = X − a2

Why two solutions?

Karp-Rabin and Streaming Pattern Matching 13/24

Page 39:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

System of Equations

{a + b = X

a2 + b2 = Y

{b = X − a

a2 + (X − a)2 = Y

{b = X − a

2a2 − 2aX + X 2 = Y

{b = X − a

2a2 − (2X )a + (X 2 − Y ) = 0

∆ = 4X 2 − 8(X 2 − Y ) = 8Y − 4X 2{a1 = 2X−

√∆

4

b1 = X − a1

{a2 = 2X+

√∆

4

b2 = X − a2

Why two solutions?

Karp-Rabin and Streaming Pattern Matching 13/24

Page 40:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

System of Equations

{a + b = X

a2 + b2 = Y

{b = X − a

a2 + (X − a)2 = Y

{b = X − a

2a2 − 2aX + X 2 = Y

{b = X − a

2a2 − (2X )a + (X 2 − Y ) = 0

∆ = 4X 2 − 8(X 2 − Y ) = 8Y − 4X 2

{a1 = 2X−

√∆

4

b1 = X − a1

{a2 = 2X+

√∆

4

b2 = X − a2

Why two solutions?

Karp-Rabin and Streaming Pattern Matching 13/24

Page 41:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

System of Equations

{a + b = X

a2 + b2 = Y

{b = X − a

a2 + (X − a)2 = Y

{b = X − a

2a2 − 2aX + X 2 = Y

{b = X − a

2a2 − (2X )a + (X 2 − Y ) = 0

∆ = 4X 2 − 8(X 2 − Y ) = 8Y − 4X 2{a1 = 2X−

√∆

4

b1 = X − a1

{a2 = 2X+

√∆

4

b2 = X − a2

Why two solutions?

Karp-Rabin and Streaming Pattern Matching 13/24

Page 42:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

System of Equations

{a + b = X

a2 + b2 = Y

{b = X − a

a2 + (X − a)2 = Y

{b = X − a

2a2 − 2aX + X 2 = Y

{b = X − a

2a2 − (2X )a + (X 2 − Y ) = 0

∆ = 4X 2 − 8(X 2 − Y ) = 8Y − 4X 2{a1 = 2X−

√∆

4

b1 = X − a1

{a2 = 2X+

√∆

4

b2 = X − a2

Why two solutions?

Karp-Rabin and Streaming Pattern Matching 13/24

Page 43:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Pattern Matching in the Streaming Model

First the pattern arrives, symbol by symbol. Then the text arrives,symbol by symbol

For convenience we assume that we know the lengths of the pattern(m) and of the text (n)

We are aiming at total space O(logm)

The time of processing a letter from the pattern/text should beO(logm)

Is this possible at all?

No, unless randomization is allowed!

Karp-Rabin and Streaming Pattern Matching 14/24

Page 44:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Pattern Matching in the Streaming Model

First the pattern arrives, symbol by symbol. Then the text arrives,symbol by symbol

For convenience we assume that we know the lengths of the pattern(m) and of the text (n)

We are aiming at total space O(logm)

The time of processing a letter from the pattern/text should beO(logm)

Is this possible at all?

No, unless randomization is allowed!

Karp-Rabin and Streaming Pattern Matching 14/24

Page 45:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Pattern Matching in the Streaming Model

First the pattern arrives, symbol by symbol. Then the text arrives,symbol by symbol

For convenience we assume that we know the lengths of the pattern(m) and of the text (n)

We are aiming at total space O(logm)

The time of processing a letter from the pattern/text should beO(logm)

Is this possible at all?

No, unless randomization is allowed!

Karp-Rabin and Streaming Pattern Matching 14/24

Page 46:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching, Main Idea

1 Let pj be the prefix of the pattern of length 2j .(We assume pk = p for k = dlogme)Compute h(pi ) when reading the pattern.

pp4

p3

p2

p1

p0

2 If i is a position in the text, let ti,j be the suffix of length |pj | oft[0..i ].Store (in a compact way) all the positions where pj occurs in ti,j+1.

t[0..i ]i − 2j+1 iti,j+1

pj

Karp-Rabin and Streaming Pattern Matching 15/24

Page 47:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching, Main Idea

1 Let pj be the prefix of the pattern of length 2j .(We assume pk = p for k = dlogme)Compute h(pi ) when reading the pattern.

pp4

p3

p2

p1

p0

2 If i is a position in the text, let ti,j be the suffix of length |pj | oft[0..i ].Store (in a compact way) all the positions where pj occurs in ti,j+1.

t[0..i ]i − 2j+1 iti,j+1

pj

Karp-Rabin and Streaming Pattern Matching 15/24

Page 48:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Compact Representation of Occurrences

Fact from Combinatorics on Words

Assume that |t| ¬ 2|p| and that string p has at least three occurrences instring t. Then the occurrences of p in t form an arithmetic sequence withdifference equal to the shortest period of p.

b a a b a b a a b a b a a b a b a a b a b a a b b

a a b a b a a b a b a a b

b a b a a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

1

b a a b a b a b a a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

6

b a a b a b a a b a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

11

Karp-Rabin and Streaming Pattern Matching 16/24

Page 49:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Compact Representation of Occurrences

Fact from Combinatorics on Words

Assume that |t| ¬ 2|p| and that string p has at least three occurrences instring t. Then the occurrences of p in t form an arithmetic sequence withdifference equal to the shortest period of p.

b a a b a b a a b a b a a b a b a a b a b a a b b

a a b a b a a b a b a a b

b a b a a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

1

b a a b a b a b a a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

6

b a a b a b a a b a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

11

Karp-Rabin and Streaming Pattern Matching 16/24

Page 50:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Compact Representation of Occurrences

Fact from Combinatorics on Words

Assume that |t| ¬ 2|p| and that string p has at least three occurrences instring t. Then the occurrences of p in t form an arithmetic sequence withdifference equal to the shortest period of p.

b a a b a b a a b a b a a b a b a a b a b a a b b

a a b a b a a b a b a a b

b a b a a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

1

b a a b a b a b a a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

6

b a a b a b a a b a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

11

Karp-Rabin and Streaming Pattern Matching 16/24

Page 51:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Compact Representation of Occurrences

Fact from Combinatorics on Words

Assume that |t| ¬ 2|p| and that string p has at least three occurrences instring t. Then the occurrences of p in t form an arithmetic sequence withdifference equal to the shortest period of p.

b a a b a b a a b a b a a b a b a a b a b a a b b

a a b a b a a b a b a a b

b a b a a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

1

b a a b a b a b a a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

6

b a a b a b a a b a b a b a a b ba a b a b a a b a b a a b

a a b a b a a b a b a a b

11

Karp-Rabin and Streaming Pattern Matching 16/24

Page 52:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Compact representation of endpoints of occurrences of pj in ti,j+1(Occ(j)):

If there are at most two occurrences, store their positions explicitly.

t[0..i ]ti,j+1

pj

Otherwise the occurrences form an arithmetic progression:a, a + b, a + 2b, . . . , a + (`− 1)b. It suffices to store: a, b, and `.

t[0..i ]ti,j+1

pj

Karp-Rabin and Streaming Pattern Matching 17/24

Page 53:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Compact representation of endpoints of occurrences of pj in ti,j+1(Occ(j)):

If there are at most two occurrences, store their positions explicitly.

t[0..i ]ti,j+1

pj

Otherwise the occurrences form an arithmetic progression:a, a + b, a + 2b, . . . , a + (`− 1)b. It suffices to store: a, b, and `.

t[0..i ]ti,j+1

pj

Karp-Rabin and Streaming Pattern Matching 17/24

Page 54:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,10,1

a a

11

x

x

x

x

a

1,21,21,2

a a

1,21,2

x

x

x

x

a

2,32,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 55:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,10,1

a a

11

x

x

x

x

a

1,21,21,2

a a

1,21,2

x

x

x

x

a

2,32,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 56:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,10,1

a a

11

x

x

x

x

a

1,21,21,2

a a

1,21,2

x

x

x

x

a

2,32,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 57:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,1

0,10,1

a a

11

x

x

x

x

a

1,21,21,2

a a

1,21,2

x

x

x

x

a

2,32,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 58:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,1

0,1

0,1

a a

1

1

x

x

x

x

a

1,21,21,2

a a

1,21,2

x

x

x

x

a

2,32,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 59:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,21,2

a a

1,21,2

x

x

x

x

a

2,32,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 60:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,2

1,21,2

a a

1,21,2

x

x

x

x

a

2,32,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 61:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,2

1,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 62:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 63:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,3

2,32,3

a a

1,2,31,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 64:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,3

2,3

2,3

a a

1,2,3

1,2,31,2,3

a a a a

33

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 65:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,3

1,2,3

1,2,3

a a a a

3

3

xx

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 66:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 67:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,4

3,43,4

a a

2,3,42,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 68:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,4

3,4

3,4

a a

2,3,4

2,3,42,3,4

a a a a

3,43,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 69:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,4

2,3,4

2,3,4

a a a a

3,4

3,4

xx

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 70:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 71:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,5

4,54,5

a a

3,4,53,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 72:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,5

4,5

4,5

a a

3,4,5

3,4,53,4,5

a a a a

3,4,53,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 73:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,5

3,4,5

3,4,5

a a a a

3,4,5

3,4,5

xx

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 74:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

555

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 75:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

5

55

a a

4,54,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 76:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

5

5

5

a a

4,5

4,54,5

a a a a

3,4,53,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 77:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,5

4,5

4,5

a a a a

3,4,5

3,4,5

xx

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 78:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

77

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 79:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

555

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 80:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

5

55

a a a a

3,4,53,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 81:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

5

5

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

xx

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 82:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,5

3,4,5

a a a a b a a a

x

x

a

7,87,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 83:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

x

x

a

7,8

7,87,8

a a

87

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 84:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

x

x

a

7,8

7,8

7,8

a a

8

7

a a a a

4,54,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 85:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

x

x

a

7,87,8

7,8

a a

8

7

a a a a

4,5

4,54,5

a a a a b a a a

xx

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 86:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

x

x

a

7,87,8

7,8

a a

8

7

a a a a

4,5

4,5

4,5

a a a a b a a a

x

x

a

8,98,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 87:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

x

x

a

7,87,8

7,8

a a

8

7

a a a a

4,5

4,54,5

a a a a b a a a

x

x

a

8,9

8,98,9

a a

8,98,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 88:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

x

x

a

7,87,8

7,8

a a

8

7

a a a a

4,5

4,54,5

a a a a b a a a

x

x

a

8,9

8,9

8,9

a a

8,9

8,9

a a a a

555

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 89:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

x

x

a

7,87,8

7,8

a a

8

7

a a a a

4,5

4,54,5

a a a a b a a a

x

x

a

8,98,9

8,9

a a

8,9

8,9

a a a a

5

55

a a a a b a a a

99

Karp-Rabin and Streaming Pattern Matching 18/24

Page 90:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

x

x

a

7,87,8

7,8

a a

8

7

a a a a

4,5

4,54,5

a a a a b a a a

x

x

a

8,98,9

8,9

a a

8,9

8,9

a a a a

5

5

5

a a a a b a a a

9

9

Karp-Rabin and Streaming Pattern Matching 18/24

Page 91:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

First update Occ(0). Then update Occ(j), for each j :1 Remove the first element of Occ(j) if it is equal to i ′ = i − 2j

2 If the first element of Occ(j) is i ′ + 1, check if it can be extendedinto an occurrence of pj+1 and, if so, add it to Occ(j + 1)

a a a a a a b a a a at =0 1 2 3 4 5 6 7 8 9 10

a a a a b a a ap3 = p =

a a a ap2 =

a ap1 =

ap0 =

Occ(0)

Occ(1)

Occ(2)

Occ(3)

0

x

x

x

a

0

x

x

x

a

0,10,1

0,1

a a

1

1

x

x

x

x

a

1,21,2

1,2

a a

1,2

1,2

x

x

x

x

a

2,32,3

2,3

a a

1,2,31,2,3

1,2,3

a a a a

3

3

x

x

a

3,43,4

3,4

a a

2,3,42,3,4

2,3,4

a a a a

3,4

3,4

x

x

a

4,54,5

4,5

a a

3,4,53,4,5

3,4,5

a a a a

3,4,5

3,4,5

x

x

a

55

5

a a

4,54,5

4,5

a a a a

3,4,5

3,4,5

x

x

a

7

7

a a

55

5

a a a a

3,4,5

3,4,53,4,5

a a a a b a a a

x

x

a

7,87,8

7,8

a a

8

7

a a a a

4,5

4,54,5

a a a a b a a a

x

x

a

8,98,9

8,9

a a

8,9

8,9

a a a a

5

55

a a a a b a a a

9

9

Karp-Rabin and Streaming Pattern Matching 18/24

Page 92:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 1: When an occurrence of pj can be extended to anoccurrence of pj+1?

t[0..i ]ti,j+1

pj

h(pj+1)

h(u)

h(t[0..i ])

Stored data:

h(t[0..i ])

The hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 19/24

Page 93:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 1: When an occurrence of pj can be extended to anoccurrence of pj+1?

t[0..i ]ti,j+1

pj

h(pj+1)

h(u)

h(t[0..i ])

Stored data:

h(t[0..i ])

The hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 19/24

Page 94:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 1: When an occurrence of pj can be extended to anoccurrence of pj+1?

t[0..i ]ti,j+1

pj

h(pj+1)

h(u)

h(t[0..i ])

Stored data:

h(t[0..i ])

The hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 19/24

Page 95:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 1: When an occurrence of pj can be extended to anoccurrence of pj+1?

t[0..i ]ti,j+1

pj

h(pj+1)

h(u)

h(t[0..i ])

Stored data:

h(t[0..i ])

The hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 19/24

Page 96:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 1: When an occurrence of pj can be extended to anoccurrence of pj+1?

t[0..i ]ti,j+1

pj

h(pj+1)

h(u)

h(t[0..i ])

Stored data:

h(t[0..i ])

The hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 19/24

Page 97:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 2: How to handle deletion of the first stored occurrence of pj?

pj

t[0..i ]ti,j+1

pj

t[0..i + 1]ti+1,j+1

h(u)h(v)

h(w)

Stored data:

The hash of the period of pj (with special handling of the cases of 0,1, or 2 occurrences of pj in ti,j+1)

Karp-Rabin and Streaming Pattern Matching 20/24

Page 98:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 2: How to handle deletion of the first stored occurrence of pj?

pj

t[0..i ]ti,j+1

pj

t[0..i + 1]ti+1,j+1

h(u)h(v)

h(w)

Stored data:

The hash of the period of pj (with special handling of the cases of 0,1, or 2 occurrences of pj in ti,j+1)

Karp-Rabin and Streaming Pattern Matching 20/24

Page 99:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 2: How to handle deletion of the first stored occurrence of pj?

pj

t[0..i ]ti,j+1

pj

t[0..i + 1]ti+1,j+1

h(u)

h(v)

h(w)

Stored data:

The hash of the period of pj (with special handling of the cases of 0,1, or 2 occurrences of pj in ti,j+1)

Karp-Rabin and Streaming Pattern Matching 20/24

Page 100:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 2: How to handle deletion of the first stored occurrence of pj?

pj

t[0..i ]ti,j+1

pj

t[0..i + 1]ti+1,j+1

h(u)h(v)

h(w)

Stored data:

The hash of the period of pj (with special handling of the cases of 0,1, or 2 occurrences of pj in ti,j+1)

Karp-Rabin and Streaming Pattern Matching 20/24

Page 101:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 2: How to handle deletion of the first stored occurrence of pj?

pj

t[0..i ]ti,j+1

pj

t[0..i + 1]ti+1,j+1

h(u)h(v)

h(w)

Stored data:

The hash of the period of pj (with special handling of the cases of 0,1, or 2 occurrences of pj in ti,j+1)

Karp-Rabin and Streaming Pattern Matching 20/24

Page 102:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 2: How to handle deletion of the first stored occurrence of pj?

pj

t[0..i ]ti,j+1

pj

t[0..i + 1]ti+1,j+1

h(u)h(v)

h(w)

Stored data:

The hash of the period of pj (with special handling of the cases of 0,1, or 2 occurrences of pj in ti,j+1)

Karp-Rabin and Streaming Pattern Matching 20/24

Page 103:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pj

h(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pjOr set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 104:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pj

h(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pjOr set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 105:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pj

h(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pjOr set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 106:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pj

h(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pjOr set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 107:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pjh(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pjOr set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 108:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pjh(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pjOr set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 109:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pjh(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pjOr set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 110:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pj

h(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pj

Or set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 111:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pj

h(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pj

Or set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 112:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pj

h(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pj

Or set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 113:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Streaming Pattern Matching

Situation 3: How to compute the desired hashes on a new occurrence ofpj?

t[0..i ]ti,j+1

pj

h(u)

h(w)

h(v)

h(w)

Either do nothing

Or compute the hash of the period of pjOr set the hash from t[0] until the first occurrence of pj in ti,j+1

Karp-Rabin and Streaming Pattern Matching 21/24

Page 114:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Summary

Stored data:

h(t[0..i ])

And for each j = 0, . . . , dlogme:a, b, ` of the progression

h(pj)

The hash from t[0] until the first occurrence of pj in ti,j+1

The hash of the period of pj (with special handling of the cases of 0,1, or 2 occurrences of pj in ti,j+1)

We make O(1) updates for each j when advancing i .

Theorem

Pattern matching can be solved in the streaming model with O(logm)space and O(logm) time per arriving symbol. The algorithm may incursmall probability errors.

Karp-Rabin and Streaming Pattern Matching 22/24

Page 115:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

Summary

Stored data:

h(t[0..i ])

And for each j = 0, . . . , dlogme:a, b, ` of the progression

h(pj)

The hash from t[0] until the first occurrence of pj in ti,j+1

The hash of the period of pj (with special handling of the cases of 0,1, or 2 occurrences of pj in ti,j+1)

We make O(1) updates for each j when advancing i .

Theorem

Pattern matching can be solved in the streaming model with O(logm)space and O(logm) time per arriving symbol. The algorithm may incursmall probability errors.

Karp-Rabin and Streaming Pattern Matching 22/24

Page 116:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

References

First Algorithm

B. Porat, E. Porat, Exact and approximate pattern matching in the streamingmodel. Proceedings of the 50th Annual Symposium on Foundations ofComputer Science, FOCS’09 (2009)

O(logm) space, O(logm) time per symbol

Better Algorithm

D. Breslauer, Z. Galil, Real-time streaming string-matching. Proceedings of the22nd Annual Symposium on Combinatorial Pattern Matching, CPM’11 (2011)

O(logm) space, O(1) time per symbol;or O(logm) time per symbol with a simpler algorithm

Karp-Rabin and Streaming Pattern Matching 23/24

Page 117:  · String Hashing For a string s of length m, we use the Karp-Rabin hash function (polynomial hash, rolling hash): h(s) = (s[0]rm−1 + s[1]rm−2 + ...+ s[m −2]r + s[m −1])

References

What if up to k mismatches are allowed in occurrence?

First Algorithm

B. Porat, E. Porat, FOCS’09

O(k3 polylogn) space, O(k2 polylogn) time per symbol

Better Algorithm

R. Clifford et al., The k-mismatch problem revisited. 27th Annual ACM-SIAMSymposium on Discrete Algorithms, SODA’16 (2016)

O(k2 polylogn) space, O(√k polylogn) time per symbol

Most Recent Algorithm

R. Clifford, T. Kociumaka, E. Porat, The streaming k-mismatch problem, 30thACM-SIAM Symposium on Discrete Algorithms, SODA’19 (2019)

O(k polylogn) space and O(√k polylogn) time per symbol

Karp-Rabin and Streaming Pattern Matching 24/24