CS 221 Guest lecture: Cuckoo Hashing Shannon Larson March 11, 2011

Learning Goals

• Describe the cuckoo hashing principle
• Analyze the space and time complexity of cuckoo hashing
• Apply the insert and lookup algorithms in a cuckoo hash table
• Construct the graph for a cuckoo table

Remember Graphs?

• A set of nodes
• A set of edges

• Here: [Figure: example graph]

Graph Cycles

• A graph cycle is a path of edges such that the first and last vertices are the same

$v_1, v_2, v_5, v_3, v_4, v_1$
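To make the definition concrete, here is a small Python check. It is only a sketch: the slide's figure is not available, so the edge set below is an assumption chosen to match the path above, and `is_cycle` is a hypothetical helper name.

```python
def is_cycle(path, edges):
    """Return True if path is closed and every consecutive pair is joined by an edge."""
    closed = len(path) > 1 and path[0] == path[-1]
    connected = all(frozenset(pair) in edges for pair in zip(path, path[1:]))
    return closed and connected


# Assumed undirected edges, chosen so that v1, v2, v5, v3, v4, v1 is a valid walk.
edges = {frozenset(e) for e in [(1, 2), (2, 5), (5, 3), (3, 4), (4, 1)]}

print(is_cycle([1, 2, 5, 3, 4, 1], edges))  # True: ends where it started
print(is_cycle([1, 2, 5, 3], edges))        # False: first and last vertices differ
```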

Recall Hashing

• A hash function h
  – Takes the target x
  – Hashes x to a bucket h(x)

• Perfect hashing is ideal:
  – O(1) lookup
  – O(1) insert

• Perfect hashing is not realistic!
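As a quick illustration of why perfect hashing is hard to get, here is a toy hash function in Python. The modulo hash and the name `bucket_of` are assumptions made for this sketch, not something from the lecture.

```python
def bucket_of(x: int, num_buckets: int) -> int:
    """Hash an integer key x to a bucket index in the range [0, num_buckets)."""
    return x % num_buckets


print(bucket_of(12, 5))  # 2
print(bucket_of(7, 5))   # 2 as well: a collision, so this hash is not perfect
```

Any fixed hash function into a small table will collide on some pair of keys like this, which is why a collision strategy is needed.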

Cuckoo Hashing: the idea

• Remember the cuckoo bird?
  – Shares a nest with other species…
  – …then kicks the other species out!

• Same idea with cuckoo hashing
  – When we insert x, we “kick out” what occupies the nest, y
  – Then y finds a new, alternate home

Why is this cool?

• Perfect hashing guarantees
  – O(1) lookup, O(1) insert

• Cuckoo hashing guarantees
  – O(1) lookup
  – O(1) insert**

• Other hashing strategies can’t guarantee this!

• Also, it’s an option for your final project

** There’s a caveat here, but we’ll see it later

Cuckoo Hashing: Two Nests

• Suppose we have TWO hash tables, T1 and T2
  – they each have a hash function, h1 and h2
  – we prefer T1, but if we have to move we’ll go to T2
  – if we’re in T2 and have to move, we’ll go back to T1

• This is our collision strategy for cuckoo hashing
  – Different from linear probing/open addressing
  – Different from trees
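A minimal Python sketch of the two-nest strategy, assuming integer keys. The two tables are called `t1` and `t2` here. The hash functions `h1` and `h2`, the table size, and the `max_kicks` cap are all illustrative assumptions.

```python
SIZE = 5  # assumed number of slots in each table


def h1(x: int) -> int:
    return x % SIZE            # hypothetical hash for the first table


def h2(x: int) -> int:
    return (x // SIZE) % SIZE  # hypothetical hash for the second table


def insert(t1: list, t2: list, x: int, max_kicks: int = 10) -> bool:
    """Put x in its preferred nest in t1, bouncing evicted items between t1 and t2."""
    for _ in range(max_kicks):
        pos = h1(x)
        x, t1[pos] = t1[pos], x   # take the nest in t1, evicting any occupant
        if x is None:
            return True
        pos = h2(x)
        x, t2[pos] = t2[pos], x   # the evicted item moves to its nest in t2
        if x is None:
            return True
        # Whatever was evicted from t2 goes back to t1 on the next pass.
    return False  # too many displacements; the last evicted item is left homeless


t1, t2 = [None] * SIZE, [None] * SIZE
for key in (3, 8, 13):   # all three keys prefer slot 3 of t1
    insert(t1, t2, key)
print(t1)  # [None, None, None, 13, None]
print(t2)  # [3, 8, None, None, None]
```

Each new key takes slot 3 of `t1`, and the previous occupant is pushed to its alternate slot in `t2`.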

Cuckoo Hashing: Example

• We want to insert x
• There are no conflicts anywhere

[Figure: the two tables, with x at h1(x) and its alternate position h2(x) marked]

Cuckoo Hashing: Example

• Now we want to insert y
• There are no conflicts anywhere

[Figure: the two tables holding x and y]

Cuckoo Hashing: Example

• To insert z, we have a conflict with x (oh no!)
• Move x to h2(x)

[Figure: z arriving at the slot occupied by x]

Cuckoo Hashing: Example

• Now we insert z into h1(z)
• NOW we’re fine!

[Figure: the two tables holding x, y and z]

Cuckoo Hashing: Example

• The final table after inserting x, y, z in order

[Figure: final tables holding x, y and z]

Why two tables?

• Two tables, one for each hash function
• Simple to visualize, simple to implement

• But, why two?
• One table works just as well!
• Just as simple to implement (all one table)

One Table Example

• Let’s insert x again, with h1(x) and h2(x)
• Again, h1(x) preferred

[Figure: one table, with x at h1(x) and its alternate position h2(x) marked]

One Table Example

• Now insert y
• No conflicts, no problem

[Figure: one table holding x and y, with h1(y) and h2(y) marked]

One Table Example

• Now insert z
• But, another conflict with x: oh no!

[Figure: one table holding x and y, with h1(z) and h2(z) marked]

One Table Example

• First, move x to h2(x)

[Figure: x moving to h2(x), freeing h1(z) for z]

One Table Example

• Now we move z to h1(z)

[Figure: one table holding x, y and z]

One Table Example

• Final table after inserting x, y, z in order

[Figure: final table holding x, y and z]

Graph Representation

• How can we represent our table?

• Why not a graph?
  – Nodes are every possible table entry
  – Edges are inserted entries
• This is a directed graph
• Direction from current location TO alternate location
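The construction described above can be sketched in Python as follows. This is an illustration under assumptions: slots are numbered from 0 rather than 1, and the hash functions `h1` and `h2` and the helper name `cuckoo_graph` are made up for the example.

```python
def cuckoo_graph(table, h1, h2):
    """Return one directed edge (current slot -> alternate slot) per stored item."""
    edges = {}
    for pos, item in enumerate(table):
        if item is None:
            continue  # empty slots are nodes with no outgoing edge
        alternate = h2(item) if pos == h1(item) else h1(item)
        edges[item] = (pos, alternate)
    return edges


# Hypothetical hash functions for a 4-slot table.
h1 = lambda x: x % 4
h2 = lambda x: (x // 4) % 4

table = [4, 9, None, None]   # 4 sits at h1(4) = 0, 9 sits at h1(9) = 1
print(cuckoo_graph(table, h1, h2))  # {4: (0, 1), 9: (1, 2)}
```

Every slot index is a node, and each stored item contributes exactly one edge pointing at the slot it would move to if displaced.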

Graph Example

• Remember our one-table example?

[Figure: the one-table example holding x, y and z, drawn as a directed graph on nodes 1, 2, 3 and 4]

Infinite Insert

• Suppose we insert something, and we end up in an infinite loop
  – Or, “too many” displacements
  – Some pre-defined maximum based on table size

Example: Loops

• Remember our one-table example?

[Figure: the one-table example holding x, y and z, drawn as a directed graph on nodes 1, 2, 3 and 4]

Example: Loops

• Let’s insert w: no conflicts still

[Figure: table and graph on nodes 1 to 4 after adding w]

Example: Loops

• Now let’s insert a: displace x

[Figure: table and graph on nodes 1 to 4, before a displaces x]

Example: Loops

• Now x is placed, and z is displaced (put in 4)

[Figure: table now holding a, y and x; graph on nodes 1 to 4]

Example: Loops

• Now z is placed, and w is displaced (put in 3)

[Figure: table and graph on nodes 1 to 4 after w is displaced]

Example: Loops

• Notice what happens to the graph
• We keep going and going and going…

[Figure: graph on nodes 1 to 4]

Analysis: Loops

• Remember infinite loops in a new insert?

• In the graph, this is a closed loop
  – We might forever re-do the same displacements

• The probability of getting a loop increases dramatically once we’ve inserted N/2 elements
  – N is the number of buckets (size of table)
  – This is from the research on cuckoo hashing

Analysis: Loops

• What can we do once we get a loop?
  – Rebuild, same size (ok solution)
  – Double table size (better solution)

• We’ll need new hash functions for both

Analysis

• Lookup has O(1) time
  – At MOST two places to look, ever
  – One location per hash function

• Insert has amortized O(1) time
  – Think of this as “in the long run”
  – In practice we see O(1) time insert
  – You’ll see amortized analysis in CPSC 320

• Remember the “grass and trees” analysis?

Lookup: The Code

Return the position of x (either h1(x) or h2(x))
Otherwise, return false

lookup(x)
    return T[h1(x)] = x or T[h2(x)] = x
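The description asks for a position while the pseudocode returns a truth value. Here is one possible Python reading that returns the slot index, or `None` when x is absent. The name `find_slot`, the 0-based slots and the example hash functions are assumptions for this sketch.

```python
from typing import Callable, Optional, Sequence


def find_slot(table: Sequence, x: int,
              h1: Callable[[int], int], h2: Callable[[int], int]) -> Optional[int]:
    """Check the only two slots x may occupy; return its index, or None if absent."""
    for pos in (h1(x), h2(x)):
        if table[pos] == x:
            return pos
    return None


# Hypothetical hash functions for a 4-slot table.
h1 = lambda x: x % 4
h2 = lambda x: (x // 4) % 4

table = [None, 13, 9, None]          # 13 sits at h1(13) = 1, 9 sits at h2(9) = 2
print(find_slot(table, 13, h1, h2))  # 1
print(find_slot(table, 9, h1, h2))   # 2
print(find_slot(table, 6, h1, h2))   # None
```

At most two slots are inspected regardless of table size, which is where the O(1) lookup comes from.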

Insert: The Code

Given a table (array) T and item x to insert:

insert(x)
    if lookup(x) return;            // if it’s already here, done
    pos <- h1(x);                   // store h1(x)
    for i <- 1 to M                 // loop at most M times
        if T[pos] empty
            T[pos] <- x
            return;                 // if T[pos] empty, done
        swap x and T[pos];          // put x in T[pos]
        if pos = h1(x)              // now we’re displacing
            pos <- h2(x)
        else
            pos <- h1(x)
    rehash();                       // if we couldn’t stop, rehash
    insert(x);                      // then insert currently displaced
end
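One way the pseudocode above might look in Python, as a sketch for integer keys only. Several choices here are assumptions and not from the lecture: the multiply-and-mod hash family, using the table size as the limit M, and doubling the table on every rehash.

```python
import random


class CuckooTable:
    """One-table cuckoo hash set for integers; a sketch, not a tuned implementation."""

    PRIME = 2_147_483_647  # modulus for the illustrative hash family below

    def __init__(self, size=8, seed=0):
        self.rng = random.Random(seed)
        self.size = size
        self.slots = [None] * size
        self._pick_hashes()

    def _pick_hashes(self):
        # Draw fresh multipliers so h1 and h2 change on every rehash.
        self.a1, self.a2 = (self.rng.randrange(1, self.PRIME) for _ in range(2))

    def h1(self, x):
        return (self.a1 * x) % self.PRIME % self.size

    def h2(self, x):
        return (self.a2 * x + 1) % self.PRIME % self.size

    def lookup(self, x):
        return self.slots[self.h1(x)] == x or self.slots[self.h2(x)] == x

    def insert(self, x):
        if self.lookup(x):
            return                      # already present, nothing to do
        pos = self.h1(x)
        for _ in range(self.size):      # M = table size caps the displacements
            if self.slots[pos] is None:
                self.slots[pos] = x
                return
            x, self.slots[pos] = self.slots[pos], x   # kick out the occupant
            # The evicted item goes to whichever of its two homes it was not in.
            pos = self.h2(x) if pos == self.h1(x) else self.h1(x)
        self.rehash()                   # gave up: rebuild with new hash functions
        self.insert(x)                  # x is the item still without a home

    def rehash(self):
        old = [item for item in self.slots if item is not None]
        self.size *= 2                  # assumed policy: double on every rehash
        self.slots = [None] * self.size
        self._pick_hashes()
        for item in old:
            self.insert(item)


table = CuckooTable(size=8)
for key in range(20):
    table.insert(key)
print(all(table.lookup(key) for key in range(20)))  # True
```

The item left in `x` when the loop gives up is the one currently displaced, so it is the one re-inserted after the rebuild.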

Analysis: Load Factor

• What is load?
  – The average fill factor: how full (in %) the table is

• What about cuckoo hash tables?
  – For two hash functions, load factor ≤ 50%
    • Remember loops?
  – For three hash functions, we get about 91%
    • That’s pretty great, actually!
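Load is simple to measure. A minimal sketch, assuming the table is a Python list with `None` marking empty buckets (`load_factor` is a hypothetical helper name):

```python
def load_factor(table):
    """Fraction of buckets in use: stored items divided by number of buckets."""
    return sum(slot is not None for slot in table) / len(table)


table = ["x", "y", None, None]
print(load_factor(table))  # 0.5: two of the four buckets are full
```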

More hash functions

• What would this look like?
• We would have three tables (simple case)
  – One hash function per table

• Or, we would have two alternates (one table)

More hash functions

• What would this look like?
• Each entry has TWO alternates, not one

[Figure: one table holding x, y and z, each with two alternate positions]

More hash functions

• When something comes in new (insert)
  – Put it in h1(x)

• If it’s displaced, check h2(x)
  – If that’s full, go to h3(x)

• To lookup, we just look in h1(x), h2(x) or h3(x)
  – Still constant time!
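A rough Python sketch of the one-table, three-hash-function variant. The three hash functions, the table size and the `max_kicks` cap are assumptions for illustration. The rule for a displaced item (take the first empty alternate, otherwise displace at the last one) is one reading of the bullets above.

```python
SIZE = 7  # assumed table size

# Three hypothetical hash functions into the same table.
HASHES = (
    lambda x: x % SIZE,
    lambda x: (x // SIZE) % SIZE,
    lambda x: (3 * x + 1) % SIZE,
)


def lookup3(table, x):
    """x can only live in one of its three candidate slots."""
    return any(table[h(x)] == x for h in HASHES)


def insert3(table, x, max_kicks=20):
    pos = HASHES[0](x)                   # a new item always starts at its first slot
    for _ in range(max_kicks):
        if table[pos] is None:
            table[pos] = x
            return True
        x, table[pos] = table[pos], x    # evict the occupant
        # The evicted item checks its other slots: first empty one, else the last.
        others = [h(x) for h in HASHES if h(x) != pos] or [pos]
        pos = next((p for p in others if table[p] is None), others[-1])
    return False  # gave up; a full implementation would rehash here


table = [None] * SIZE
for key in (1, 8, 15):   # all three keys share the same first slot
    insert3(table, key)
print(table)  # [1, 15, None, None, 8, None, None]
print(lookup3(table, 8), lookup3(table, 22))  # True False
```

Lookup still touches a constant number of slots (three instead of two).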

Even better load?

• Currently we’ve only put one item per bucket

• What if we had two cells per bucket?

[Figure: buckets with two cells each, holding (x, w), (y, a) and (z)]

• What about collision strategies?
  – Round-robin (cells take turns swapping out)
  – FIFO (oldest resident gets kicked out)
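A small Python sketch of two cells per bucket with the FIFO strategy. Using a `deque` per bucket, the hash functions, the sizes and the `max_kicks` cap are all assumptions made for the example.

```python
from collections import deque

SIZE, CELLS = 4, 2   # assumed: 4 buckets, 2 cells per bucket


def h1(x):
    return x % SIZE            # hypothetical preferred bucket


def h2(x):
    return (x // SIZE) % SIZE  # hypothetical alternate bucket


def insert(buckets, x, max_kicks=10):
    pos = h1(x)
    for _ in range(max_kicks):
        bucket = buckets[pos]
        bucket.append(x)
        if len(bucket) <= CELLS:
            return True
        x = bucket.popleft()   # FIFO: the oldest resident gets kicked out
        pos = h2(x) if pos == h1(x) else h1(x)
    return False  # too many displacements; the last evicted item is homeless


def lookup(buckets, x):
    return x in buckets[h1(x)] or x in buckets[h2(x)]


buckets = [deque() for _ in range(SIZE)]
for key in (1, 5, 9):   # all three keys prefer bucket 1
    insert(buckets, key)
print([list(b) for b in buckets])  # [[1], [5, 9], [], []]
```

The third key overflows bucket 1, so the oldest resident (1) moves to its alternate bucket 0. A round-robin policy would differ only in which cell is chosen for eviction.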

Links & Resources

• http://en.wikipedia.org/wiki/Cuckoo_hashing
• http://www.ru.is/faculty/ulfar/CuckooHash.pdf
• http://www.it-c.dk/people/pagh/papers/cuckoo-undergrad.pdf

• No neat animations on the internet…yet!
  – Possible personal project?
  – Brownie points?
  – Pre-coop project?