Hashing

Preview:

DESCRIPTION

 

Citation preview

Hashing 1

Hashing …

To ‘hash’ (is to ‘chop into pieces’ or to ‘mince’), is to make a ‘map’ or a ‘transform’ …

Hashing provides a means for accessing data without the use of an index structure.

Data is addressed on disk by computing a function on a search key instead.

Math's …A set of N keys, compute the index, then use an array of size N

Key k at k -> direct address, now key k at h(k) -> hashing

Hashing 2

how??? ! A bucket in a hash file is unit of storage

(typically a disk block) that can hold one or more records.

The hash function, h, is a function from the set of all search-keys, K, to the set of all bucket addresses, B.

Insertion, deletion, and lookup are done in constant time*.

Hashing 3

Hash Table

Hash table is a data structure that support Finds, insertions, deletions (deletions

may be unnecessary in some applications)

The implementation of hash tables is called hashing A technique which allows the

executions of above operations in constant average time

Hashing 4

Implementation :P

The ideal hash table data structure is an array of some fixed size, containing the item

A search is performed based on key

Each key is mapped into some position in the range 0 to TableSize-1

Hashing 5

Practical Solution

T[k] corresponds to an element with key k

If the set contains no element with key k, then T[k]=NULL

Each position (slot) corresponds to a key in the universe of keys

Hashing 6

What’s the key FUNDA!!!

Usually, m << N

h(K i) = an integer in [0, …, m-1] called the hash

value of K i

Hashing 7

#@ Hash Function @# Ideally :-> The distribution should be

uniform. Hash function should assign the same

number of records in each bucket. Practically :-> The distribution can be

random. Regardless of the actual search-keys, the

each bucket has the same number of records on average

Hash values should not depend on any ordering or the search-keys

Hashing 8

Data Structure Incorporation

Hash indices are typically a prefix of the entire hash value.

More than one consecutive index can point to the same bucket. The indices have the same hash prefix which

can be shorter then the length of the index.

Hashing 9

Hashing 10

Example of Hash Index

Hashing 11 Hash Function – Strings of Characters

Folding Method:int h(String x, int D) {int i, sum;for (sum=0, i=0; i<x.length(); i++)

sum+= (int)x.charAt(i);return (sum%D);}

sums the ASCII values of the letters in the string

ASCII value for “A” =65; sum will be in range 650-900 for 10 upper-case letters; good when D around 100, for example

order of chars in string has no effect

Hashing 12 Hash Function – Strings of Characters

Much better: Cyclic Shiftstatic long hashCode(String key, int D) { int h=0; for (int i=0, i<key.length(); i++){

h = (h << 4) | ( h >> 27);h += (int) key.charAt(i);}

return h%D;}

Hashing 13

Use of Extendable Hash Structure: Example

Initial Hash structure, bucket size = 2

Hashing 14

Example (Cont.) Hash structure after insertion of one

Brighton and two Downtown records

Hashing 15

Example (Cont.)Hash structure after insertion of Mianus record

Hashing 16

Example (Cont.)

Hash structure after insertion of three Perryridge records

Hashing 17

Example (Cont.) Hash structure after insertion of

Redwood and Round Hill records

Hashing 18Comparison to Other Hashing Methods

Advantage: performance does not

decrease as the database size increases

Space is conserved by adding and removing

as necessary

Disadvantage: additional level of indirection for operations Complex implementation