annabel-chapman
Hashing, Hashing Tables
Chapter 8
Class Hierarchy
Introduction
• Definition:
– Key: a key is a field or composite of fields that uniquely identifies an entry in a table.
Example
• Table of students in a course sorted by name
--------------------------------------------------------------
Name                Year    Mark
--------------------------------------------------------------
Adams, Keith        3       94
Davis, Susan        1       75
Jordan, Ann         1       86
Patterson, Lynn     4       73
Williams, George    1       65
Insert Function of ListAsArray
Find Function of ListAsArray
Insert Function of ListAsLinkedList
Find Function of ListAsLinkedList
Insert Function of SortedListAsArray
Binary Search
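As a point of comparison for what hashing improves on, here is a minimal sketch of binary search over the student table above, using the name as the key. The names Student, findByName and exampleTable are hypothetical and chosen for illustration; this is not the SortedListAsArray code the chapter refers to.

```cpp
#include <string>
#include <vector>

struct Student {
    std::string name;  // the key: assumed unique within the course
    int year;
    int mark;
};

// Returns the index of the entry whose name equals key, or -1 if absent.
// Precondition: table is sorted by name in ascending order.
int findByName(const std::vector<Student>& table, const std::string& key)
{
    int lo = 0;
    int hi = static_cast<int>(table.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (table[mid].name == key) return mid;
        if (table[mid].name < key) lo = mid + 1;  // key lies in the upper half
        else hi = mid - 1;                        // key lies in the lower half
    }
    return -1;
}

// The table from the example, already sorted by name.
std::vector<Student> exampleTable()
{
    return {
        {"Adams, Keith", 3, 94},
        {"Davis, Susan", 1, 75},
        {"Jordan, Ann", 1, 86},
        {"Patterson, Lynn", 4, 73},
        {"Williams, George", 1, 65},
    };
}
```

Each probe halves the remaining range, so a find takes O(log n) comparisons, while an insert into a sorted array still has to shift entries.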
Hashing
• The implementation of hash tables is called hashing.
• Hashing is a technique used for performing insertions and finds in constant average time.
• Efficient removal of items is not required.
The General Idea
– Array of some fixed size, containing items.
Example
Keys and Hash Functions
• Each key is mapped into some number in the range 0 to TableSize-1 and placed in the appropriate cell.
• The mapping is called a hash function.
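The idea can be sketched as a fixed-size array plus a function that turns a key into an index. The table size of 11, the EMPTY marker and the names hashKey and SimpleTable are assumptions made for this illustration, and the sketch deliberately does not resolve collisions.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t TableSize = 11;  // assumed size, for illustration only
constexpr int EMPTY = -1;              // assumed marker for an unused cell

// Hash function: maps a non-negative key into the range 0 .. TableSize-1.
std::size_t hashKey(int key)
{
    return static_cast<std::size_t>(key) % TableSize;
}

struct SimpleTable {
    std::array<int, TableSize> cells;

    SimpleTable() { cells.fill(EMPTY); }

    // Places key in its cell. Returns false if the cell is already taken,
    // i.e. on a collision, which this sketch does not handle.
    bool insert(int key)
    {
        std::size_t i = hashKey(key);
        if (cells[i] != EMPTY) return false;
        cells[i] = key;
        return true;
    }

    // A find is a single array access at the hashed index.
    bool contains(int key) const { return cells[hashKey(key)] == key; }
};
```

With this table size, keys 25 and 14 both map to cell 3, so inserting the second one fails. That is the collision problem a real hash table has to deal with.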
Keys and Hash Functions
• Characteristics of a good hash function
– Avoids collisions
– Spreads keys evenly in the array
– Is easy to compute
Avoid Collisions
• Ideal situation
– Given a set of n<=M distinct keys {k1,k2,…,kn}, where M is the table size, the set of hash values {h(k1),h(k2),…,h(kn)} contains no duplicates
• We can only try to reduce the likelihood of a collision using knowledge about the keys
• E.g. if we know the telephone numbers are all from the same district, the district number will be of little use in our hash function
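The telephone example can be made concrete with a small sketch. The number format (10 digits, of which the first 3 are the district code), the table size and both function names are assumptions made for illustration.

```cpp
#include <cstdint>

// Assumed for illustration: 10-digit numbers, first 3 digits = district code.
constexpr std::uint64_t kLocalRange = 10000000ULL;  // 10^7: the 7 local digits
constexpr unsigned int M = 1031;                    // assumed table size

// Poor choice: uses only the district code, which is the same for every key.
unsigned int hashByDistrict(std::uint64_t phone)
{
    return static_cast<unsigned int>((phone / kLocalRange) % M);
}

// Better choice: ignores the district code and hashes the digits that vary.
unsigned int hashByLocalDigits(std::uint64_t phone)
{
    return static_cast<unsigned int>((phone % kLocalRange) % M);
}
```

If every number starts with the same district code, hashByDistrict sends all of them to one cell, while hashByLocalDigits depends only on the part of the key that actually differs between entries.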
Spreading Keys Evenly
• We need to know the distribution of the keys
• An equal number of keys should map into each array position
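One way to check how evenly a hash function spreads a given set of keys is to count how many keys land in each array position. The sketch below does this for h(x) = x mod M; the function name bucketCounts is hypothetical.

```cpp
#include <vector>

// Counts how many of the keys land in each of the M array positions
// under h(x) = x mod M. Precondition: M > 0.
std::vector<int> bucketCounts(const std::vector<unsigned int>& keys, unsigned int M)
{
    std::vector<int> counts(M, 0);
    for (unsigned int x : keys) {
        ++counts[x % M];
    }
    return counts;
}
```

For the keys 0 to 9 and M = 5, every position receives two keys. For the keys 10, 20, 30, 40 and the same M, all four keys land in position 0, which shows why the distribution of the keys matters.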
Ease of Computation
• The running time of the hash function should be O(1). (Jumping immediately to the desired record is a direct access approach, much like direct access of data on a disk.)
Hashing Methods
• We deal with integer keys first, K = Z (the set of keys is the set of integers)
• The value of the hash function falls between 0 and M-1
Division Method
• The simplest method of hashing an integer
• The division method of hashing:
h(x) = x mod M.
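Choice of M
• Generally, almost any M works
– We often choose M to be a prime number, so that patterns in the keys (for example, all keys being even) do not leave some array positions unused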
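A small experiment shows why a prime M is a common choice. The sketch counts how many distinct array positions a set of even keys reaches under h(x) = x mod M. The key set and the names distinctSlots and evenKeys are hypothetical, and M = 8 versus M = 7 is used only to keep the numbers small.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Number of distinct slots hit by the keys under h(x) = x mod M.
std::size_t distinctSlots(const std::vector<unsigned int>& keys, unsigned int M)
{
    std::set<unsigned int> used;
    for (unsigned int x : keys) {
        used.insert(x % M);
    }
    return used.size();
}

// Hypothetical key set: the even numbers 0, 2, ..., 26.
std::vector<unsigned int> evenKeys()
{
    std::vector<unsigned int> keys;
    for (unsigned int x = 0; x <= 26; x += 2) {
        keys.push_back(x);
    }
    return keys;
}
```

With M = 8 the even keys reach only the 4 even positions, so half of the array is never used. With M = 7 the same keys reach all 7 positions.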
Implementation
unsigned int const M = 1031; // a prime
unsigned int h(unsigned int x)
{ return x % M; } // division method: h(x) = x mod M
Middle Square Method
• Avoids division
• Makes use of the fact that a computer does finite-precision integer arithmetic
– All arithmetic is done modulo W, where W = 2^w and w is the word size of the computer
• M = 2^k, W = 2^w
• Meaning:
– Multiply x by itself (modulo W), then shift the result right by w - k bits, which keeps its top k bits.
Implementation
#include <climits> // CHAR_BIT
unsigned int const k = 10; // M == 1024
unsigned int const w = CHAR_BIT * sizeof (unsigned int); // word size in bits
unsigned int h (unsigned int x)
{ return (x * x) >> (w - k); } // keep the top k bits of x*x mod 2^w
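To see the method at work on concrete keys, here is a sketch that fixes the word size at 32 bits by using std::uint32_t, which is an assumption made so that the results do not depend on the platform. The name middleSquareHash is hypothetical.

```cpp
#include <cstdint>

constexpr unsigned int kBits = 10;  // k: table size M = 2^10 = 1024
constexpr unsigned int wBits = 32;  // w: word size, fixed at 32 bits here

// Middle square method: square the key modulo 2^32, keep the top k bits.
std::uint32_t middleSquareHash(std::uint32_t x)
{
    std::uint32_t square = x * x;     // unsigned arithmetic wraps modulo 2^32
    return square >> (wBits - kBits); // shift right by w - k = 22 bits
}
```

For example, x = 2048 gives x*x = 2^22 and a hash value of 1, and x = 50000 gives 596. Any key below 2048 has a square smaller than 2^22 and therefore hashes to 0, so with these parameters small keys all collide.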
Multiplication Method
• We multiply the key by a constant a (instead of by itself), then shift right by w - k bits
Implementation
#include <climits> // CHAR_BIT
unsigned int const k = 10; // M == 1024
unsigned int const w = CHAR_BIT * sizeof (unsigned int); // word size in bits
unsigned int const a = 2654435769U;
unsigned int h (unsigned int x)
{ return (x * a) >> (w - k); } // keep the top k bits of x*a mod 2^w
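The following sketch evaluates the multiplication method for a few small keys. It fixes the word size at 32 bits with std::uint32_t, which is an assumption made for portability, and reuses the constant a = 2654435769 from the implementation above. The name multiplicativeHash is hypothetical.

```cpp
#include <cstdint>

constexpr unsigned int kBits = 10;             // k: table size M = 2^10 = 1024
constexpr unsigned int wBits = 32;             // w: word size, fixed at 32 bits here
constexpr std::uint32_t kMultiplier = 2654435769U;  // the constant a

// Multiplication method: multiply by a modulo 2^32, keep the top k bits.
std::uint32_t multiplicativeHash(std::uint32_t x)
{
    std::uint32_t product = x * kMultiplier;  // wraps modulo 2^32
    return product >> (wBits - kBits);        // shift right by w - k = 22 bits
}
```

The consecutive keys 1, 2 and 3 hash to 632, 241 and 874, so unlike the middle square method small keys are scattered across the table instead of all landing in position 0.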
Hash Tables
HashTable Class Definition
Separate Chaining
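A minimal sketch of separate chaining, assuming unsigned integer keys and the division method as the hash function: each array position holds a list of all keys that hash to it. The class name ChainedSet and its interface are hypothetical and are not the HashTable class definition the chapter refers to.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

class ChainedSet {
public:
    // m is the number of array positions (M); it must be greater than 0.
    explicit ChainedSet(std::size_t m) : buckets_(m) {}

    // Inserts x unless it is already present; returns true if inserted.
    bool insert(unsigned int x)
    {
        std::list<unsigned int>& chain = buckets_[h(x)];
        if (std::find(chain.begin(), chain.end(), x) != chain.end()) return false;
        chain.push_back(x);  // colliding keys simply share the same chain
        return true;
    }

    // Searches only the chain that x hashes to.
    bool find(unsigned int x) const
    {
        const std::list<unsigned int>& chain = buckets_[h(x)];
        return std::find(chain.begin(), chain.end(), x) != chain.end();
    }

private:
    // Division method: h(x) = x mod M.
    std::size_t h(unsigned int x) const { return x % buckets_.size(); }

    std::vector<std::list<unsigned int>> buckets_;
};
```

With M = 7, the keys 10 and 3 both hash to position 3 and end up in the same chain, so both can be stored and found. A find costs time proportional to the length of one chain, which stays short when the hash function spreads the keys evenly.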