ECE250: Algorithms and Data Structures
Hash Tables (Part A)
Materials from CLRS: Chapter 11.1, 11.2, 11.4

Ladan Tahvildari, PEng, SMIEEE Professor
Software Technologies Applied Research (STAR) Group
Dept. of Elect. & Comp. Eng.
University of Waterloo

Algorithms and Data Structures - STAR - Homestargroup.uwaterloo.ca/~ece250/...HashTables-PartA.pdf · Ø Introduction To Algorithms (CLRS Book) Ø Data Structures and Algorithm Analysis




Acknowledgements

v The following resources have been used to prepare materials for this course:
  Ø MIT OpenCourseWare
  Ø Introduction To Algorithms (CLRS Book)
  Ø Data Structures and Algorithm Analysis in C++ (M. Weiss)
  Ø Data Structures and Algorithms in C++ (M. Goodrich)

v Thanks to the many people who pointed out mistakes, provided suggestions, or helped to improve the quality of this course over the last ten years:
  Ø http://www.stargroup.uwaterloo.ca/~ece250/acknowledgment/

Lecture 8 ECE250 2


The Problem

v RT&T is a large phone company, and they want to provide caller ID capability:

Ø given a phone number, return the caller’s name

Ø phone numbers range from 0 to $r = 10^8 - 1$

Ø want to do this as efficiently as possible


A Potential Solution

v A suboptimal way to design this dictionary is an array indexed by key

§  takes O(1) time

§  O(r) space - huge amount of wasted space

[Figure: an array indexed by phone number. Every slot holds (null) except slot 9635-8904, which holds "Jens Jensen".]


Symbol-Table Problem

v Symbol table $T$ holding $n$ records
  Ø each record $x$ has a key $key[x]$

v Operations on $T$:
  Ø INSERT(T, x)
  Ø DELETE(T, x)
  Ø SEARCH(T, k)

How should the data structure T be organized?


Direct Access Table

IDEA: Suppose that the set of keys is $K \subseteq \{0, 1, \ldots, m-1\}$ and keys are distinct. Set up an array $T[0 \,..\, m-1]$:

$$T[k] = \begin{cases} x & \text{if } k \in K \text{ and } key[x] = k, \\ \text{NIL} & \text{otherwise.} \end{cases}$$

Operations take $\Theta(1)$ time.

Problem: The range of keys can be large, e.g. 64-bit numbers (which represent 18,446,744,073,709,551,616 different keys).
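As a rough illustration of the direct access idea, the following Python sketch stores each record in the array slot named by its key. The class name `DirectAddressTable` and the use of `None` for NIL are assumptions made for this example, not part of the lecture.

```python
class DirectAddressTable:
    """Direct-address table for distinct integer keys in the range 0..m-1."""

    def __init__(self, m):
        # One slot per possible key; None plays the role of NIL.
        self.slots = [None] * m

    def insert(self, key, record):
        self.slots[key] = record      # Theta(1): index directly by key

    def search(self, key):
        return self.slots[key]        # Theta(1): returns None if absent

    def delete(self, key):
        self.slots[key] = None        # Theta(1)


table = DirectAddressTable(10)
table.insert(3, "record with key 3")
print(table.search(3))   # the stored record
print(table.search(4))   # None, since nothing was stored under key 4
```

The space cost is the point to notice: the array has one slot per possible key, whether or not that key is ever used.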


Hash Functions

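Solution: Use a hash function $h$ to map the universe $U$ of all keys into $\{0, 1, \ldots, m-1\}$.

[Figure: keys $k_1, k_2, k_3, k_4$ from the set $K \subseteq U$ are mapped by $h$ to slots of the table $T[0 \,..\, m-1]$. The marked slots are $h(k_1)$, $h(k_4)$, and $h(k_2) = h(k_3)$.]

When a record to be inserted maps to an already occupied slot in $T$, a collision occurs.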


Collision Resolution

v How to deal with two keys which hash to the same spot in the array?

Ø Use chaining, which sets up an array of links (a table), indexed by hash value, to lists of the items whose keys hash to that slot


An Example

Given the following input

{3171, 4123, 5973, 2699, 5344, 5879, 9789}

and the following hash function

$h(x) = x \bmod 10$

show the resulting hash table using Chaining.


Collision Resolution by Chaining

Slot  Chain
0     (empty)
1     3171
2     (empty)
3     5973 → 4123
4     5344
5     (empty)
6     (empty)
7     (empty)
8     (empty)
9     2699 → 5879 → 9789


Dictionary Operations with Chaining

v Search: CHAINED-HASH-SEARCH(T, k)
  Ø search for an element with key k in list T[h(k)]

v Insertion: CHAINED-HASH-INSERT(T, x)
  Ø insert x at the head of list T[h(key[x])]

v Deletion: CHAINED-HASH-DELETE(T, x)
  Ø delete x from the list T[h(key[x])]
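The three operations can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class name `ChainedHashTable` is hypothetical, each chain is a `collections.deque` standing in for a linked list, records are bare integer keys, and the hash function is $h(k) = k \bmod m$ as in the chaining example.

```python
from collections import deque


class ChainedHashTable:
    """Hash table with chaining; each slot holds a deque used as the chain."""

    def __init__(self, m):
        self.m = m
        self.slots = [deque() for _ in range(m)]

    def _h(self, key):
        return key % self.m            # assumed hash: h(k) = k mod m

    def insert(self, key):
        # Insert at the head of the chain; assumes key is not already present.
        self.slots[self._h(key)].appendleft(key)

    def search(self, key):
        # Scan only the chain that key hashes to.
        return key in self.slots[self._h(key)]

    def delete(self, key):
        # deque.remove scans the chain, like deletion from a singly linked list.
        self.slots[self._h(key)].remove(key)


table = ChainedHashTable(10)
for key in [3171, 4123, 5973, 2699, 5344, 5879, 9789]:
    table.insert(key)

# Non-empty chains, head first.
chains = {slot: list(chain) for slot, chain in enumerate(table.slots) if chain}
print(chains)
```

Because insertion is at the head, the order inside each chain is the reverse of the insertion order for that slot.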


Analysis of Hashing

v Assumption (Simple Uniform Hashing): Each key is equally likely to be hashed into any slot of table $T$, independent of where other keys are hashed

v Given hash table $T$ with $m$ slots holding $n$ elements, the load factor is defined as $\alpha = n/m$ (Average Number of Keys per Slot)

v Assume time to compute $h(k)$ is $\Theta(1)$

v Search Time: $\Theta(1 + \alpha)$
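To make the load factor concrete, here is a small simulation sketch. It assumes simple uniform hashing can be imitated by picking a slot uniformly at random for each key; the function name `simulate_chain_lengths` and the sizes used are illustrative choices only.

```python
import random


def simulate_chain_lengths(n, m, seed=0):
    """Throw n keys into m slots uniformly at random; return the chain lengths."""
    rng = random.Random(seed)
    lengths = [0] * m
    for _ in range(n):
        lengths[rng.randrange(m)] += 1   # each slot is equally likely
    return lengths


n, m = 7000, 10000
lengths = simulate_chain_lengths(n, m)

alpha = n / m                  # load factor
average = sum(lengths) / m     # average number of keys per slot
longest = max(lengths)         # length of the longest single chain

print(alpha, average, longest)
```

The average chain length equals $\alpha$ by construction, which is the quantity appearing in the $\Theta(1 + \alpha)$ search time. The longest chain is reported separately because an individual search can still be slower than the average.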


Analysis of Operations with Chaining

v Assuming the number of hash table slots is proportional to the number of elements in the table: $n = O(m)$, so $\alpha = n/m = O(m)/m = O(1)$

  Ø Search:
    § takes constant time on average

  Ø Insertion:
    § takes O(1) worst-case time
      o Assumes that the element being inserted isn’t already in the list
      o It would take an additional search to check if it was already inserted

  Ø Deletion:
    § takes O(1) worst-case time when the lists are doubly linked
    § If the lists are singly linked, then deletion takes as long as searching, because we must find x’s predecessor in its list in order to correctly update next pointers


More on Collisions

v A key is mapped to an already occupied table location
  Ø what to do?!?

v Use a collision handling technique

v We have seen Chaining

v Can also use Open Addressing
  Ø Linear Probing
  Ø Double Hashing


Open Addressing

v All elements are stored in the hash table ($n \le m$)

v Insertion systematically probes the table until an empty slot is found → The table may fill up!

v Modify hash function to take the probe number $i$ as the second parameter (depends on both the key and the probe number):

$$h : U \times \underbrace{\{0, 1, \ldots, m-1\}}_{\text{probe number}} \to \underbrace{\{0, 1, \ldots, m-1\}}_{\text{slot number}}$$

v Hash function $h$ determines the sequence of slots examined for a given key

v Probe sequence for a given key $k$:

$$h(k, 0), h(k, 1), \ldots, h(k, m-1) \quad \text{(a permutation of } 0, 1, \ldots, m-1\text{)}$$
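The permutation requirement can be checked with a short sketch. It assumes, purely for illustration, a probe function of the form $h(k, i) = (k \bmod m + i) \bmod m$; the names `linear_probe` and `probe_sequence` are hypothetical.

```python
def linear_probe(k, i, m):
    """One illustrative h(k, i): start at k mod m and advance one slot per probe."""
    return (k % m + i) % m


def probe_sequence(k, m, h=linear_probe):
    """List the slots h(k, 0), h(k, 1), ..., h(k, m-1) examined for key k."""
    return [h(k, i, m) for i in range(m)]


seq = probe_sequence(5879, 10)
print(seq)                              # [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]
print(sorted(seq) == list(range(10)))   # True: every slot appears exactly once
```

Since every slot appears exactly once in the sequence, an insertion is guaranteed to find an empty slot whenever one exists.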


Linear Probing

v  If the current location is used, try the next table location

v Uses less memory than chaining
  Ø one does not have to store all those links

v Slower than chaining (Primary Clustering)
  Ø one might have to walk along the table for a long time

LinearProbingInsert(k)
    if (table is full) error
    probe = h(k)
    while (table[probe] occupied)
        probe = (probe + 1) mod m
    table[probe] = k

$$h(k, i) = (h'(k) + i) \bmod m$$
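A runnable version of the insertion pseudocode, together with a matching search, might look like the sketch below. It assumes $h'(k) = k \bmod m$, uses `None` to mark an empty slot, and does not support deletion; the function names are illustrative.

```python
EMPTY = None


def linear_probing_insert(table, k):
    """Insert key k using linear probing; return the slot it was stored in."""
    m = len(table)
    # O(m) fullness check; a real table would keep a count of stored keys.
    if all(slot is not EMPTY for slot in table):
        raise OverflowError("hash table is full")
    probe = k % m                       # h'(k) = k mod m (assumed)
    while table[probe] is not EMPTY:
        probe = (probe + 1) % m         # try the next location, wrapping around
    table[probe] = k
    return probe


def linear_probing_search(table, k):
    """Return the slot holding k, or None if k is absent (no deletions assumed)."""
    m = len(table)
    for i in range(m):
        probe = (k % m + i) % m         # h(k, i) = (h'(k) + i) mod m
        if table[probe] is EMPTY:
            return None                 # an empty slot ends the probe sequence
        if table[probe] == k:
            return probe
    return None


table = [EMPTY] * 10
for key in [3171, 4123, 5973, 2699, 5344, 5879, 9789]:
    linear_probing_insert(table, key)

print(table)
print(linear_probing_search(table, 9789))
```

Key 9789 hashes to slot 9 but ends up in slot 2 after probing slots 9, 0, and 1, which shows how a run of occupied slots lengthens later probe sequences (primary clustering).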


Hash Tables - Example 1

Given the following input

{3171, 4123, 5973, 2699, 5344, 5879, 9789}

and the following hash function

$h(x) = x \bmod 10$

show the resulting hash table using
  Ø Chaining
  Ø Linear Probing


Collision Resolution by Chaining

Slot  Chain
0     (empty)
1     3171
2     (empty)
3     5973 → 4123
4     5344
5     (empty)
6     (empty)
7     (empty)
8     (empty)
9     2699 → 5879 → 9789


Collision Resolution by Linear Probing

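Resulting table when the keys 3171, 4123, 5973, 2699, 5344, 5879, 9789 are inserted in that order with $h(x) = x \bmod 10$:

Slot  Key
0     5879
1     3171
2     9789
3     4123
4     5973
5     5344
6     (empty)
7     (empty)
8     (empty)
9     2699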


Hash Tables – Example 2

Show the resulting hash table using Linear Probing when the following keys:

{One, Two, Three, Four, Five, Six, Seven, Eight, Nine, Ten, Eleven, Twelve}

are inserted one-by-one, in the order given, into an initially empty table. Assume that the table size is $m = 16$. Use the division method of hashing. Use the following table of values for each key:

x       value (hexadecimal)
One     0x6EBE5
Two     0x75DAF
Three   0x75A73925
Four    0x19EED32
Five    0x19E8DE5
Six     0x72A38
Seven   0x7293792E
Eight   0x64A26A74
Nine    0x1BE8BE5
Ten     0x7592E
Eleven  0x4993792E
Twelve  0x292DDE5


Hash Tables – Example 2: Solution

Slot  Key
0     Ten
1     Eleven
2     Four
3     (empty)
4     Eight
5     One
6     Three
7     Five
8     Six
9     Nine
A     Twelve
B     (empty)
C     (empty)
D     (empty)
E     Seven
F     Two
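The solution can be reproduced with a short sketch. It assumes the division method means $h(x) = x \bmod m$ applied to the hexadecimal value of each key, so with $m = 16$ the home slot is simply the last hex digit. The names `VALUES` and `build_table` are illustrative.

```python
# Hexadecimal values for each key, in insertion order (from the example).
VALUES = {
    "One": 0x6EBE5, "Two": 0x75DAF, "Three": 0x75A73925, "Four": 0x19EED32,
    "Five": 0x19E8DE5, "Six": 0x72A38, "Seven": 0x7293792E, "Eight": 0x64A26A74,
    "Nine": 0x1BE8BE5, "Ten": 0x7592E, "Eleven": 0x4993792E, "Twelve": 0x292DDE5,
}
M = 16


def build_table(values, m):
    """Insert the keys in order using the division method and linear probing."""
    table = [None] * m
    for name, value in values.items():     # dicts preserve insertion order
        probe = value % m                   # division method: h(x) = x mod m
        while table[probe] is not None:     # safe here: 12 keys, 16 slots
            probe = (probe + 1) % m
        table[probe] = name
    return table


table = build_table(VALUES, M)
for slot, name in enumerate(table):
    print(f"{slot:X}: {name if name is not None else '(empty)'}")
```

Five of the twelve keys (One, Three, Five, Nine, Twelve) share home slot 5, which is why they occupy slots 5, 6, 7, 9, and A, with Six sitting in its own home slot 8 in the middle of the run.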