Hash Table Collision Handling

There are two basic methods: separate chaining and open addressing.

Separate Chaining

Separate chaining hangs an additional data structure off each bucket. For example, the bucket array becomes an array of linked lists. To find an item, we first go to its bucket and then compare keys along the list. This is a popular method, and if linked lists are used the hash table never fills up.
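
As a rough sketch of the idea, the class below keeps one chain per bucket. It is only an illustration: the names ChainedHashTable, put and num_buckets are made up for this example, Python's built-in hash() stands in for the hash function, and a Python list stands in for the linked list.

```python
class ChainedHashTable:
    """Hash table that resolves collisions with one chain per bucket."""

    def __init__(self, num_buckets=8):
        # Bucket array A; a Python list stands in for each linked list.
        self.buckets = [[] for _ in range(num_buckets)]
        self.size = 0  # number of items stored, n

    def _index(self, key):
        # Compress Python's built-in hash into the range 0 .. N-1.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        chain = self.buckets[self._index(key)]
        for pos, (k, _) in enumerate(chain):
            if k == key:                  # key already present: overwrite
                chain[pos] = (key, value)
                return
        chain.append((key, value))        # new key: the chain simply grows
        self.size += 1

    def get(self, key):
        # Go to the bucket first, then compare keys along its chain.
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)


table = ChainedHashTable(num_buckets=4)
for i in range(10):      # 10 items in 4 buckets, so chains must form
    table.put(i, i * i)
```

Because each chain can keep growing, put never fails for lack of room; the price is that get slows down as the chains get longer.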

Illustrate

Load factor: f = n/N, where n is the number of items stored in the hash table and N is the number of buckets. We would like the load factor to be less than 1.

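The cost of get(k) is on average O(1 + n/N): one step to reach the bucket, plus an expected chain length of n/N when the hash function spreads the keys evenly.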
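
A small numeric check of the load factor, under assumed values: N = 8 buckets, the keys 0 to 19, and a hypothetical hash function h(k) = k mod N. None of these values come from the notes.

```python
def load_factor(n, N):
    """Load factor f = n / N for n stored items and N buckets."""
    return n / N


N = 8
chains = [[] for _ in range(N)]
for k in range(20):
    chains[k % N].append(k)            # hypothetical hash: h(k) = k mod N

n = sum(len(c) for c in chains)        # 20 items in total
f = load_factor(n, N)                  # 20 / 8 = 2.5
mean_chain = sum(len(c) for c in chains) / N   # average chain length
longest = max(len(c) for c in chains)
```

Here the mean chain length equals f exactly, since both are n/N. The longest chain holds 3 items, so a lookup in this table compares at most 3 keys.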

Open Addressing

The problem with separate chaining is that the data structure can grow without bound. Sometimes this is not appropriate because storage is finite, for example in embedded processors.

Open addressing does not introduce a new structure. If a collision occurs, we look for a free bucket at the next position generated by a probing algorithm. Open addressing is generally used where storage space is at a premium, e.g. in embedded processors. Open addressing is not necessarily faster than separate chaining.

Methods for Open Addressing:

Linear Probing:

We try to insert Item = (k, e) into bucket A[i] and find it full, so the next bucket we try is:

A[(i + 1) mod N]
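
A minimal sketch of insertion with linear probing, assuming a fixed-size Python list as the bucket array A, None as the marker for an empty bucket, and Python's built-in hash() as the hash function. The name linear_probe_insert is made up for this example.

```python
def linear_probe_insert(A, key, value):
    """Insert (key, value) into the fixed-size bucket array A by linear probing."""
    N = len(A)
    i = hash(key) % N                  # home bucket
    for _ in range(N):                 # visit each bucket at most once
        if A[i] is None or A[i][0] == key:
            A[i] = (key, value)
            return i
        i = (i + 1) % N                # bucket full: try A[(i + 1) mod N]
    raise OverflowError("hash table is full")


A = [None] * 5
# Keys 3, 8 and 13 all have home bucket 3 when N = 5.
slots = [linear_probe_insert(A, k, str(k)) for k in (3, 8, 13, 4)]
# slots == [3, 4, 0, 1]
```

Key 4 ends up in bucket 1 even though its home bucket is 4, because the run of colliding keys has already spilled over into buckets 4 and 0.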

Quadratic Probing:

A[(i + f(j)) mod N], where j = 0, 1, 2, ... and f(j) = j²
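
The probe sequence can be sketched as follows, under the same assumptions as before: a fixed-size list A, None for an empty bucket, and Python's hash(). The function names are made up for this example, and stopping after N probes is a choice made here, not something the formula requires.

```python
def quadratic_probe_sequence(i, N):
    """Yield the buckets (i + j*j) mod N for j = 0, 1, ..., N - 1."""
    for j in range(N):
        yield (i + j * j) % N


def quadratic_probe_insert(A, key, value):
    """Insert (key, value) into A, trying buckets in quadratic-probe order."""
    N = len(A)
    for slot in quadratic_probe_sequence(hash(key) % N, N):
        if A[slot] is None or A[slot][0] == key:
            A[slot] = (key, value)
            return slot
    raise OverflowError("no free bucket found on the probe sequence")


seq = list(quadratic_probe_sequence(3, 7))
# seq == [3, 4, 0, 5, 5, 0, 4]

A = [None] * 7
slots = [quadratic_probe_insert(A, k, str(k)) for k in (3, 10, 17)]
# slots == [3, 4, 0]
```

With N = 7 the sequence reaches only 4 distinct buckets, so an insertion can fail even when the table still has free buckets.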

Double Hashing:

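Use a second hash function h' to set the step size: the j-th bucket tried is A[(i + j·h'(k)) mod N] for j = 0, 1, 2, ..., where h'(k) must never evaluate to zero.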
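
A sketch of one common way to do this. The notes do not specify h', so everything here is an assumption: the step rule h'(k) = q - (k mod q) with a prime q smaller than N, the name double_hash_insert, and Python's hash() as the primary hash function.

```python
def double_hash_insert(A, key, value, q=5):
    """Insert by double hashing; q is an assumed prime smaller than len(A)."""
    N = len(A)
    i = hash(key) % N                   # primary hash h(k)
    step = q - (hash(key) % q)          # secondary hash h'(k), always in 1 .. q
    for j in range(N):
        slot = (i + j * step) % N       # j-th bucket on the probe sequence
        if A[slot] is None or A[slot][0] == key:
            A[slot] = (key, value)
            return slot
    raise OverflowError("no free bucket found on the probe sequence")


A = [None] * 7
# Keys 3, 10 and 17 share home bucket 3 but get steps 2, 5 and 3.
slots = [double_hash_insert(A, k, str(k)) for k in (3, 10, 17)]
# slots == [3, 1, 6]
```

Keys that collide in their home bucket usually get different step sizes, so they do not follow each other along the same probe sequence. Choosing N prime, with the step between 1 and N - 1, makes the sequence visit every bucket.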