Upload
pratik-gupta
View
2.415
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Citation preview
Hashing 1
Hashing …
To ‘hash’ (is to ‘chop into pieces’ or to ‘mince’), is to make a ‘map’ or a ‘transform’ …
Hashing provides a means for accessing data without the use of an index structure.
Data is addressed on disk by computing a function on a search key instead.
Math's …A set of N keys, compute the index, then use an array of size N
Key k at k -> direct address, now key k at h(k) -> hashing
Hashing 2
how??? ! A bucket in a hash file is unit of storage
(typically a disk block) that can hold one or more records.
The hash function, h, is a function from the set of all search-keys, K, to the set of all bucket addresses, B.
Insertion, deletion, and lookup are done in constant time*.
Hashing 3
Hash Table
Hash table is a data structure that support Finds, insertions, deletions (deletions
may be unnecessary in some applications)
The implementation of hash tables is called hashing A technique which allows the
executions of above operations in constant average time
Hashing 4
Implementation :P
The ideal hash table data structure is an array of some fixed size, containing the item
A search is performed based on key
Each key is mapped into some position in the range 0 to TableSize-1
Hashing 5
Practical Solution
T[k] corresponds to an element with key k
If the set contains no element with key k, then T[k]=NULL
Each position (slot) corresponds to a key in the universe of keys
Hashing 6
What’s the key FUNDA!!!
Usually, m << N
h(K i) = an integer in [0, …, m-1] called the hash
value of K i
Hashing 7
#@ Hash Function @# Ideally :-> The distribution should be
uniform. Hash function should assign the same
number of records in each bucket. Practically :-> The distribution can be
random. Regardless of the actual search-keys, the
each bucket has the same number of records on average
Hash values should not depend on any ordering or the search-keys
Hashing 8
Data Structure Incorporation
Hash indices are typically a prefix of the entire hash value.
More than one consecutive index can point to the same bucket. The indices have the same hash prefix which
can be shorter then the length of the index.
Hashing 9
Hashing 10
Example of Hash Index
Hashing 11 Hash Function – Strings of Characters
Folding Method:int h(String x, int D) {int i, sum;for (sum=0, i=0; i<x.length(); i++)
sum+= (int)x.charAt(i);return (sum%D);}
sums the ASCII values of the letters in the string
ASCII value for “A” =65; sum will be in range 650-900 for 10 upper-case letters; good when D around 100, for example
order of chars in string has no effect
Hashing 12 Hash Function – Strings of Characters
Much better: Cyclic Shiftstatic long hashCode(String key, int D) { int h=0; for (int i=0, i<key.length(); i++){
h = (h << 4) | ( h >> 27);h += (int) key.charAt(i);}
return h%D;}
Hashing 13
Use of Extendable Hash Structure: Example
Initial Hash structure, bucket size = 2
Hashing 14
Example (Cont.) Hash structure after insertion of one
Brighton and two Downtown records
Hashing 15
Example (Cont.)Hash structure after insertion of Mianus record
Hashing 16
Example (Cont.)
Hash structure after insertion of three Perryridge records
Hashing 17
Example (Cont.) Hash structure after insertion of
Redwood and Round Hill records
Hashing 18Comparison to Other Hashing Methods
Advantage: performance does not
decrease as the database size increases
Space is conserved by adding and removing
as necessary
Disadvantage: additional level of indirection for operations Complex implementation