43
Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1 / 24

Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

  • Upload
    others

  • View
    10

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Mehlhorn-Tsakalidis Revisited

Christos Zaroliagis

[Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas]

1 / 24

Page 2: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Mehlhorn-Tsakalidis - once upon a time ...

2 / 24

Page 3: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Mehlhorn-Tsakalidis - once upon a time ...

2 / 24

Page 4: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Dynamic Dictionary Search - Problem

Given set of keys S = {X1, . . . , Xn}, Xi ∈ [a, b] ⊂ IR

Maintain (under key insertions and deletions) a non-decreasing ordering

P = {X(1), . . . , X(n)} of S such that:

given a query element y

find largest X(j) ∈ P : X(j) ≤ y

3 / 24

Page 5: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Dynamic Dictionary Search - Classical methods

Use arbitrary rule to select splitting element X(k) ∈ P that splits P into two

subsets, and recurse

e.g., in binary search, splitting element = middle element

Balanced search trees (AVL-trees, red-black trees, (a, b)-trees, etc):

search & update time: O(log n)

Bounds

• optimal for Pointer Machine

• RAM: Θ(√

log n

log log n

)[Andersson & Thorup, 2001]

4 / 24

Page 6: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Dynamic Dictionary Search - Classical methods

Use arbitrary rule to select splitting element X(k) ∈ P that splits P into two

subsets, and recurse

e.g., in binary search, splitting element = middle element

Balanced search trees (AVL-trees, red-black trees, (a, b)-trees, etc):

search & update time: O(log n)

Bounds

• optimal for Pointer Machine

• RAM: Θ(√

log n

log log n

)[Andersson & Thorup, 2001]

4 / 24

Page 7: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Dynamic Dictionary Search - Classical methods

Use arbitrary rule to select splitting element X(k) ∈ P that splits P into two

subsets, and recurse

e.g., in binary search, splitting element = middle element

Balanced search trees (AVL-trees, red-black trees, (a, b)-trees, etc):

search & update time: O(log n)

Bounds

• optimal for Pointer Machine

• RAM: Θ(√

log n

log log n

)[Andersson & Thorup, 2001]

4 / 24

Page 8: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Surpassing the Bounds - Expected Complexities

Interpolation Search [Peterson,57]

Select splitting elements by taking advantage of the statistical properties of

the keys

Splitting elements are spread closer to query key y

Expected search time (static problem):

Θ(log log n), uniform distribution

[Yao & Yao,76], [Gonnet,77], [Perl & Reingold,77], [Perl, Itai & Avni,78],

[Gonnet, Rogers & George, 80]

Θ(log log n), regular (non-uniform) distribution

[Willard,85]

5 / 24

Page 9: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Surpassing the Bounds - Expected Complexities

Interpolation Search [Peterson,57]

Select splitting elements by taking advantage of the statistical properties of

the keys

Splitting elements are spread closer to query key y

Expected search time (static problem):

Θ(log log n), uniform distribution

[Yao & Yao,76], [Gonnet,77], [Perl & Reingold,77], [Perl, Itai & Avni,78],

[Gonnet, Rogers & George, 80]

Θ(log log n), regular (non-uniform) distribution

[Willard,85]

5 / 24

Page 10: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Surpassing the Bounds - Expected Complexities

Interpolation Search [Peterson,57]

Select splitting elements by taking advantage of the statistical properties of

the keys

Splitting elements are spread closer to query key y

Expected search time (static problem):

Θ(log log n), uniform distribution

[Yao & Yao,76], [Gonnet,77], [Perl & Reingold,77], [Perl, Itai & Avni,78],

[Gonnet, Rogers & George, 80]

Θ(log log n), regular (non-uniform) distribution

[Willard,85]

5 / 24

Page 11: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Surpassing the Bounds - Expected Complexities

Interpolation Search [Peterson,57]

Select splitting elements by taking advantage of the statistical properties of

the keys

Splitting elements are spread closer to query key y

Expected search time (static problem):

Θ(log log n), uniform distribution

[Yao & Yao,76], [Gonnet,77], [Perl & Reingold,77], [Perl, Itai & Avni,78],

[Gonnet, Rogers & George, 80]

Θ(log log n), regular (non-uniform) distribution

[Willard,85]

5 / 24

Page 12: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Dynamic Interpolation Search

Scenario: µ-random insertions, random deletions

µ uniform [Frederickson,83], [Itai,Konheim & Rodeh, 81]

search: O(log log n) expected time

update: O(nε) time, 0 < ε < 1

6 / 24

Page 13: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Dynamic Interpolation Search

µ (f1, f2)-smooth (peak-less or peak-adjusting) [Mehlhorn & Tsakalidis,85]

Pr

[c2 −

c3 − c1

f1(n)≤ X ≤ c2 | c1 ≤ X ≤ c3

]≤ βf2(n)

n

smooth ⊃ {uniform, regular, bounded, other non-uniform}→ any probability distribution is (f1,Θ(n))-smooth

search: O(log log n) expected time

update:

{O(log log n) expected [Mehlhorn & Tsakalidis,85]

O(1) time (position given) [Andersson & Mattson,93]

7 / 24

Page 14: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Dynamic Interpolation Search

µ (f1, f2)-smooth (peak-less or peak-adjusting) [Mehlhorn & Tsakalidis,85]

Pr

[c2 −

c3 − c1

f1(n)≤ X ≤ c2 | c1 ≤ X ≤ c3

]≤ βf2(n)

n

smooth ⊃ {uniform, regular, bounded, other non-uniform}→ any probability distribution is (f1,Θ(n))-smooth

search: O(log log n) expected time

update:

{O(log log n) expected [Mehlhorn & Tsakalidis,85]

O(1) time (position given) [Andersson & Mattson,93]

7 / 24

Page 15: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Dynamic Interpolation Search

µ (f1, f2)-smooth (peak-less or peak-adjusting) [Mehlhorn & Tsakalidis,85]

Pr

[c2 −

c3 − c1

f1(n)≤ X ≤ c2 | c1 ≤ X ≤ c3

]≤ βf2(n)

n

smooth ⊃ {uniform, regular, bounded, other non-uniform}→ any probability distribution is (f1,Θ(n))-smooth

search: O(log log n) expected time

update:

{O(log log n) expected [Mehlhorn & Tsakalidis,85]

O(1) time (position given) [Andersson & Mattson,93]

7 / 24

Page 16: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Static/Dynamic Interpolation Search

Key AssumptionConditional distribution on the subinterval dictated by an arbitrary interpolation

step remains unaffected (i.e., µ-random)

8 / 24

Page 17: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

This Work (Revisiting Mehl-Tsak DIS) - Result 1

Key assumption is valid

• only when elements are distinct (indeed assumed in all previous work)

V produced under some continuous (or discrete) distribution with zero probability

of collision

• otherwise, it fails

Probabilistic analyses of previous Interpolation Search structures are

inapplicable to sequences of non-distinct elements

V produced by discrete probability distributions with measurable (non-zero)

probability of key collision

9 / 24

Page 18: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

This Work (Revisiting Mehl-Tsak DIS) - Result 1

Key assumption is valid

• only when elements are distinct (indeed assumed in all previous work)

V produced under some continuous (or discrete) distribution with zero probability

of collision

• otherwise, it fails

Probabilistic analyses of previous Interpolation Search structures are

inapplicable to sequences of non-distinct elements

V produced by discrete probability distributions with measurable (non-zero)

probability of key collision

9 / 24

Page 19: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Interpolation Search - Lack of Generalization

∃ applications where we must store duplicates

Secondary indices in databases

Searching tables with alphabetic keys (names, dictionary entries, etc)

B keys follow a non-uniform, discrete distribution and collisions do occur

B Empirical results showed that Interpolation Search has a very poor

performance on such data

[Perl & Reingold,77], [Burton & Lewis,80], [Santorno & Sidney,85], [Perl & Gabriel,92]

B Heuristics; no rigorous performance analysis

Storing duplicates once: statistical properties are destroyed

10 / 24

Page 20: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

This Work (Revisiting Mehl-Tsak DIS) - Result 2

New dynamic interpolation search data structure

search: O(log log n) expected time

update: O(1) time (position given)

Bounds hold with high probability

Analysis

− always valid irrespectively of the distinctness or not of the elements

− applies to

(n

(log log n)1+ε , nδ)

-smooth input distributions, δ ∈ (0, 1), ε > 0

Robust (no apriori knowledge of the input distribution is required)

11 / 24

Page 21: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

This Work (Revisiting Mehl-Tsak DIS) - Other Results

Modifications to our data structure yield

O(1) expected search time w.h.p. for a (largest so far) subclass of smooth

distributions

O(log log n) expected search time w.h.p. for Power-law and Binomialdistributions

B Power-law and Binomial: @ proper choices of f1, f2 to achieve O(log log n)expected search time

12 / 24

Page 22: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Interpolation Search [Mehlhorn & Tsakalidis,85], [Andersson & Mattson,93]

eFirst Interpolation Step

Children of the root node:

Root���������

""""

XXXXXXXXXe1

e2

. . . e

. . . v . . .

. . . eR(n)

Root’s ID array

ID[1] ID[2] ID[j] ID[I(n)]

Root’s REP array

a

a

b

b

pppppppppppp

pppppppppppp

pppppppppppp

ppppppppp

pppppppppppp

pppppppppppp pppppppp

ppppa+ 1 b−aI(n)

rREP[1]

rREP[2]

rREP[3]

r. . .

. . .

. . .REP[t] . . .

. . .

rREP[R(n)]

rREP[t+ l]

a+ 2 b−aI(n) a+ (j − 1) b−a

I(n) a+ j b−aI(n) a+ (f1(n)− 1) b−a

I(n)

µ is (f1(n), f2(n)) = (I(n), n/R(n))-smooth

root has R(n) children; a root-child has R(n/R(n)) children, etc.

ID array: partitions [a, b] into I(n) equal-length parts

REP array: partitions P into R(n) equal-sized subsets, each of size n/R(n)REP[i]↔ Pi = {X ∈ P | X((i−1) n

R(n)) < X ≤ X(i n

R(n))}, i = 1, . . . , R(n)

13 / 24

Page 23: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Interpolation Search [Mehlhorn & Tsakalidis,85], [Andersson & Mattson,93]

eFirst Interpolation Step

Children of the root node:

Root���������

""""

XXXXXXXXXe1

e2

. . . e

. . . v . . .

. . . eR(n)

Root’s ID array

ID[1] ID[2] ID[j] ID[I(n)]

Root’s REP array

a

a

b

b

pppppppppppp

pppppppppppp

pppppppppppp

ppppppppp

pppppppppppp

pppppppppppp pppppppp

ppppa+ 1 b−aI(n)

rREP[1]

rREP[2]

rREP[3]

r. . .

. . .

. . .REP[t] . . .

. . .

rREP[R(n)]

rREP[t+ l]

a+ 2 b−aI(n) a+ (j − 1) b−a

I(n) a+ j b−aI(n) a+ (f1(n)− 1) b−a

I(n)

Query element y V

j =⌊

y−a

b−aI(n)⌋

+ 1 V Ij =[a + (j − 1) b−a

I(n) , a + jb−a

I(n)

]

Search for appropriate REP within Ij

− O(1) expected number of REPs

− distribution of elements between consecutive REPs: µ-random (smooth)

14 / 24

Page 24: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Analysis and the Subtle Case

[a, b] µ-random and smooth, (a′, b′] ⊆ [a, b]

Pr[X = λ | a′ < X ≤ b′] =

Pr[X = λ]

Pr[a′ < X < b′], a

′ < λ < b′ (1)

Consider X1, X2, X3 ∈ [a, b], and their ordering X(1) ≤ X(2) ≤ X(3)

Pr[X = λ | X(1) = a′∩X(3) = b

′] =Pr[X = λ ∩ X(1) = a′ ∩ X(3) = b′]

Pr[X(1) = a′ ∩ X(3) = b′](2)

Is (2) µ-random and smooth ??

15 / 24

Page 25: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Analysis and the Subtle Case

[a, b] µ-random and smooth, (a′, b′] ⊆ [a, b]

Pr[X = λ | a′ < X ≤ b′] =

Pr[X = λ]

Pr[a′ < X < b′], a

′ < λ < b′ (1)

Consider X1, X2, X3 ∈ [a, b], and their ordering X(1) ≤ X(2) ≤ X(3)

Pr[X = λ | X(1) = a′∩X(3) = b

′] =Pr[X = λ ∩ X(1) = a′ ∩ X(3) = b′]

Pr[X(1) = a′ ∩ X(3) = b′](2)

Is (2) µ-random and smooth ??

15 / 24

Page 26: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Analysis and the Subtle Case

[a, b] µ-random and smooth, (a′, b′] ⊆ [a, b]

Pr[X = λ | a′ < X ≤ b′] =

Pr[X = λ]

Pr[a′ < X < b′], a

′ < λ < b′ (1)

Consider X1, X2, X3 ∈ [a, b], and their ordering X(1) ≤ X(2) ≤ X(3)

Pr[X = λ | X(1) = a′∩X(3) = b

′] =Pr[X = λ ∩ X(1) = a′ ∩ X(3) = b′]

Pr[X(1) = a′ ∩ X(3) = b′](2)

Is (2) µ-random and smooth ??

15 / 24

Page 27: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Analysis and the Subtle Case

Distinct keys (Pr[key collision] = 0)

Event E1 = {X = λ ∩ X(1) = a′ ∩ X(3) = b′} occurs if ≥ 1 occurs

{X2 = λ, X3 = a′, X1 = b

′}, {X1 = λ, X2 = a′, X3 = b

′},{X3 = λ, X1 = a

′, X2 = b′}, {X1 = λ, X3 = a

′, X2 = b′},

{X3 = λ, X2 = a′, X1 = b

′}, {X2 = λ, X1 = a′, X3 = b

′}

Event E2 = {X(1) = a′ ∩ X(3) = b′} occurs if ≥ 1 occurs

{X1 = a′, X2 = b

′, a′ < X3 < b

′}, {X2 = a′, X1 = b

′, a′ < X3 < b

′},{X1 = a

′, X3 = b′, a

′ < X2 < b′}, {X3 = a

′, X1 = b′, a

′ < X2 < b′},

{X2 = a′, X3 = b

′, a′ < X1 < b′}, {X3 = a

′, X2 = b′, a

′ < X1 < b′}

(2) = Pr[E1]Pr[E2]

= 6Pr[X=λ] Pr[X=a′] Pr[X=b′]6Pr[X=a′] Pr[X=b′] Pr[a′<X<b′] = Pr[X=λ]

Pr[a′<X<b′] = (1)

16 / 24

Page 28: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Analysis and the Subtle Case

Non-distinct keys (Pr[key collision] 6= 0)

Event E1 = {X = λ ∩ X(1) = a′ ∩ X(3) = b′} occurs if ≥ 1 occurs

{X2 = λ, X3 = a′, X1 = b

′}, {X1 = λ, X2 = a′, X3 = b

′},{X3 = λ, X1 = a

′, X2 = b′}, {X1 = λ, X3 = a

′, X2 = b′},

{X3 = λ, X2 = a′, X1 = b

′}, {X2 = λ, X1 = a′, X3 = b

′}

Event E2 = {X(1) = a′ ∩ X(3) = b′} occurs if ≥ 1 occurs

{X1 = a′, X2 = b

′, a′ < X3 < b

′}, {X2 = a′, X1 = b

′, a′ < X3 < b

′},{X1 = a

′, X3 = b′, a

′ < X2 < b′}, {X3 = a

′, X1 = b′, a

′ < X2 < b′},

{X2 = a′, X3 = b

′, a′ < X1 < b′}, {X3 = a

′, X2 = b′, a

′ < X1 < b′}

{X1,2 = a′, X3 = b

′}, {X1,2 = b′, X3 = a

′}, {X1,3 = a′, X2 = b

′},{X1,3 = b

′, X2 = a′}, {X2,3 = a

′, X1 = b′}, {X2,3 = b

′, X1 = a′}

(2) = Pr[E1]Pr[E2]

= Pr[X=λ]Pr[X=a′]+Pr[X=b′]

2+Pr[a′<X<b′]

6= (1) !!

17 / 24

Page 29: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

The New Data Structure

n elements drawn µ-randomly from [a, b]µ is (f1, f2) = (nα, nδ)-smooth, α, δ ∈ (0, 1)

Key idea

Forget about REPs

Partition [a, b] into f1(n) intervals, each of size (b − a)/f1(n)

Recurse on each interval (BIN) until |BIN| = O(poly log n)

18 / 24

Page 30: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

The New Data Structure

1st Layer of f1(n) bins.

Each bin receives a ball with probability O(f2(n)n

)= O

(nδ

n

).

BIN(1) BIN(2) BIN(j1) BIN(f1(n))

a b

r r r . . .

n1 balls r r . . .

n2 balls r r r . . .

nj1 balls r r r r . . .

nf1(n) balls r. . .

. . .

. . .

. . .

a+ 1 b−af1(n)

a+ 2 b−af1(n)

a+ (j1 − 1) b−af1(n)

a+ j1b−af1(n)

a+ (f1(n)− 1) b−af1(n)

✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥✥

✦✦✦✦✦✦✦✦✦✦✦✦✦✦✦✦✦

✄✄✄✄✄

PPPPPPPPPPPPPPPP

BIN(j1)’s 2nd Layer of f1(nj1) bins.

Each bin receives a ball with probability O(f2(nj1

)

nj1

)= O

(nδ2

).

BIN(j1, 1) BIN(j1, 2) BIN(j1, j2) BIN(j1, f1(nj1))

aj1 bj1

r . . .

nj1,1 balls r r r r . . .

nj1,2 balls r r r . . .

nj1,j2 balls r r . . .

nj1,f1(nj1) ballsr

. . .

. . .

. . .aj1 + 1bj1−aj1f1(nj1

) aj1 + 2bj1−aj1f1(nj1

) aj1 + (j2 − 1)bj1−aj1f1(nj1

) aj1 + j2bj1−aj1f1(nj1

)

Leaf BIN: q∗-heap [Willard,93]⇒ O(1) search & update time

19 / 24

Page 31: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

The New Data Structure

Lem. 1 The elements of each BIN (subinterval) are µ-randomly distributed

Lem. 2 |BIN| = f2(|parent-BIN|) w.h.p.

⇓− 1st layer: |BIN| = O(nδ)

.

.

.

− k-th layer: |BIN| = O(nδk

)⇓

I Number of layers: O(log log n)

20 / 24

Page 32: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

O(1) Search Time

First layer of f1(n) BINS.

Each BIN receives a ball with probability O(f2(n)n

)= O

(lnO(1) n

n

).

BIN(11) BIN(21) BIN(j1) BIN(f1(n)1)

a b

r r r . . .

n11 balls r r r r . . .

n21 balls r r r r . . .

nj1 balls r r r r . . .

nf1(n)1 balls r. . .

. . .

. . .

. . .

a+ 1 b−af1(n)

a+ 2 b−af1(n)

a+ (j1 − 1) b−af1(n)

a+ j1b−af1(n)

a+ (f1(n)− 1) b−af1(n)

µ: (f1(n), f2(n)) =(

n

g, lnO(1)

n

)-smooth, g: constant

|BIN| = O(lnO(1)n) w.h.p.

Implement each bin as q∗-heap V O(1) search time

21 / 24

Page 33: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Power-law Distribution

I1 I21 bnα

I1 is dense V van Emde Boas structure, O(log log |I1|) search time

I2 is sparse V distribution is (f1(n), poly log n)-smooth

Similar idea for Binomial

22 / 24

Page 34: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Revisiting Mehl-Tsak DIS - Conclusions

New dynamic interpolation search data structure

Supports non-distinct (duplicate) elements

Expected search time O(log log n) w.h.p. for smooth and other

distributions (e.g., power law)

O(1) expected search time for a subclass (largest so far) of smooth

distributions

23 / 24

Page 35: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Thanks

Thank you Thanassi for

introducing me (and others) to the intricacies of elementary data structures

your contributions to CS&E and for establishing it within Greece

your huge contribution in elevating the Dept of Computer Engineering &

Informatics in all respects (infrastructure, procedures, etc) within and out of

our University

the artistic touch you gave to CEID

... your support and friendship

Wish you all the best for the future

24 / 24

Page 36: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Thanks

Thank you Thanassi for

introducing me (and others) to the intricacies of elementary data structures

your contributions to CS&E and for establishing it within Greece

your huge contribution in elevating the Dept of Computer Engineering &

Informatics in all respects (infrastructure, procedures, etc) within and out of

our University

the artistic touch you gave to CEID

... your support and friendship

Wish you all the best for the future

24 / 24

Page 37: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Thanks

Thank you Thanassi for

introducing me (and others) to the intricacies of elementary data structures

your contributions to CS&E and for establishing it within Greece

your huge contribution in elevating the Dept of Computer Engineering &

Informatics in all respects (infrastructure, procedures, etc) within and out of

our University

the artistic touch you gave to CEID

... your support and friendship

Wish you all the best for the future

24 / 24

Page 38: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Thanks

Thank you Thanassi for

introducing me (and others) to the intricacies of elementary data structures

your contributions to CS&E and for establishing it within Greece

your huge contribution in elevating the Dept of Computer Engineering &

Informatics in all respects (infrastructure, procedures, etc) within and out of

our University

the artistic touch you gave to CEID

... your support and friendship

Wish you all the best for the future

24 / 24

Page 39: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Thanks

Thank you Thanassi for

introducing me (and others) to the intricacies of elementary data structures

your contributions to CS&E and for establishing it within Greece

your huge contribution in elevating the Dept of Computer Engineering &

Informatics in all respects (infrastructure, procedures, etc) within and out of

our University

the artistic touch you gave to CEID

... your support and friendship

Wish you all the best for the future

24 / 24

Page 40: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Thanks

Thank you Thanassi for

introducing me (and others) to the intricacies of elementary data structures

your contributions to CS&E and for establishing it within Greece

your huge contribution in elevating the Dept of Computer Engineering &

Informatics in all respects (infrastructure, procedures, etc) within and out of

our University

the artistic touch you gave to CEID

... your support and friendship

Wish you all the best for the future

24 / 24

Page 41: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Thanks

Thank you Thanassi for

introducing me (and others) to the intricacies of elementary data structures

your contributions to CS&E and for establishing it within Greece

your huge contribution in elevating the Dept of Computer Engineering &

Informatics in all respects (infrastructure, procedures, etc) within and out of

our University

the artistic touch you gave to CEID

... your support and friendship

Wish you all the best for the future

24 / 24

Page 42: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Thanks

Thank you Thanassi for

introducing me (and others) to the intricacies of elementary data structures

your contributions to CS&E and for establishing it within Greece

your huge contribution in elevating the Dept of Computer Engineering &

Informatics in all respects (infrastructure, procedures, etc) within and out of

our University

the artistic touch you gave to CEID

... your support and friendship

Wish you all the best for the future

24 / 24

Page 43: Mehlhorn-Tsakalidis Revisited · Mehlhorn-Tsakalidis Revisited Christos Zaroliagis [Joint work with A. Kaporis, C. Makris, S. Sioutas, A. Tsakalidis, & K. Tsichlas] 1/24

Thanks

Thank you Thanassi for

introducing me (and others) to the intricacies of elementary data structures

your contributions to CS&E and for establishing it within Greece

your huge contribution in elevating the Dept of Computer Engineering &

Informatics in all respects (infrastructure, procedures, etc) within and out of

our University

the artistic touch you gave to CEID

... your support and friendship

Wish you all the best for the future

24 / 24