DS Menu


Collision Resolution




In hash tables, collision resolution is a method used to handle situations where two or more keys hash to the same index. There are several techniques for collision resolution, each with its advantages and disadvantages. The most commonly used methods are:

1. Separate Chaining

Description: In separate chaining, each slot of the hash table contains a pointer to a linked list (or another secondary data structure) that stores all the keys mapping to that index.

Advantages: Simple to implement, less sensitive to hash function, and can store an unlimited number of collisions.

Disadvantages: Requires additional memory outside the table for the linked lists. Performance degrades if many elements hash to the same index (long chains).


2. Open Addressing

In open addressing, all elements are stored directly in the hash table, and the table must have space for each key-value pair. It includes several sub-methods:

a) Linear Probing

Description: When a collision occurs, linear probing searches for the next available slot linearly in the table.

Advantages: Easy to implement and doesn’t require additional memory.

Disadvantages: Subject to primary clustering, where continuous occupied slots build up, increasing the average search time.

b) Quadratic Probing

Description: Similar to linear probing, but instead of searching sequentially, it searches at intervals of 1^2, 2^2, 3^2, etc., from the point of collision.

Advantages: Reduces primary clustering compared to linear probing.

Disadvantages: Can still suffer from secondary clustering and does not explore all table slots, potentially leaving parts of the table unused.

c) Double Hashing

Description: Uses a second hash function to determine the interval between probes. This interval is fixed for each key but differs between keys.

Advantages: Minimizes clustering and provides better performance than linear and quadratic probing.

Disadvantages: Requires two hash functions, and the second hash function must be carefully chosen.


3. Coalesced Chaining

Description: A hybrid of separate chaining and open addressing. Colliding items are stored in the next available slot, and a linked list is formed to connect elements hashing to the same index.

Advantages: Utilizes space more efficiently than separate chaining and avoids clustering issues of open addressing.

Disadvantages: Implementation is more complex, and searching can be slower than in pure open addressing.

4. Robin Hood Hashing

Description: A variant of open addressing where, during insertion, if the new key has probed longer than the key in the current slot, they are swapped. This aims to balance the probe lengths.

Advantages: Can reduce worst-case search times and improve fairness in probing lengths.

Disadvantages: More complex to implement, and insertions can be slower.

5. Cuckoo Hashing

Description: Uses two or more hash functions. Each key can reside in one of the possible positions dictated by these functions. If a new key collides, it "kicks out" the existing key and takes its place, and the displaced key is then rehashed with its other hash functions.

Advantages: Provides constant worst-case lookup time and high space efficiency.

Disadvantages: Complex to implement, and insertion can be challenging, especially in a high-load factor.


Next Topic :Separate chaining