

Extendible hashing




Extendible hashing is a dynamic hashing technique used in computer science and database systems to organize and search data efficiently. It addresses the main weakness of static hashing, where the number of buckets is fixed and growing the table means rehashing every key. Extendible hashing instead grows incrementally: a directory indexed by some bits of the hash value points to the buckets, and when a bucket overflows only that bucket is split, so only its keys are redistributed.

Here's how extendible hashing works:

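  1. Initialization: Initially, there is a single directory whose entries point to buckets. The directory stores no records itself; it acts as an index of the available buckets. It has 2^d entries, where d is the global depth, the number of hash bits currently used to index it. Each bucket holds a fixed maximum number of records.
  2. Hashing: When a key-value pair needs to be inserted or searched for, a hash function is applied to the key. The d low-order bits of the hash value form an index into the directory (some implementations use the high-order bits instead), and the entry at that index identifies the bucket.
  3. Directory Structure: Each directory entry is a pointer to a bucket, and its index can be read as a binary string of d bits. Several entries may point to the same bucket; in the smallest starting configuration, all entries point to a single bucket. Each bucket records a local depth, the number of hash bits shared by all keys stored in it, which is never greater than the global depth.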
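  4. Bucket Splitting: When an insertion targets a bucket that already holds its maximum number of records, the bucket is split in two. Its local depth (the number of hash bits its keys share) increases by one, a new bucket is allocated, and the records are redistributed between the two according to the newly significant hash bit. A split adds exactly one bucket; no other bucket is touched.
  5. Directory Expansion: If the bucket being split has a local depth equal to the global depth (the number of hash bits used to index the directory), the directory is doubled first: the global depth increases by one, and each new entry initially points to the same bucket as the existing entry it mirrors. If the local depth is smaller than the global depth, the directory keeps its size and only the entries that pointed to the split bucket are updated.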
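  6. Search and Insertion: Both operations start the same way: the key is hashed, the directory index is taken from the hash bits, and the directory entry at that index leads to the bucket. A search then examines that one bucket; an insertion adds the pair to it, splitting the bucket first if it is full. A lookup therefore costs one directory access plus one bucket access, regardless of how large the table has grown.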
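  7. Merging: After deletions, a bucket can be merged with its split partner (the bucket that was created from the same split, sometimes called its buddy) when both have the same local depth and their combined records fit in one bucket. The merged bucket's local depth, the number of hash bits its keys share, decreases by one. When every bucket's local depth is smaller than the number of bits used to index the directory, the directory can be halved by dropping one bit.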
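
The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the names `Bucket`, `ExtendibleHashTable`, and `bucket_capacity` are made up for this example, the directory is indexed by the low-order bits of Python's built-in `hash`, and deletion with merging is left out. It also assumes that no more than `bucket_capacity` keys share exactly the same hash value, since such keys could never be separated by splitting.

```python
class Bucket:
    """A fixed-capacity bucket (hypothetical helper for this sketch)."""

    def __init__(self, local_depth, capacity):
        self.local_depth = local_depth  # hash bits shared by all keys stored here
        self.capacity = capacity
        self.items = {}

    def is_full(self):
        return len(self.items) >= self.capacity


class ExtendibleHashTable:
    def __init__(self, bucket_capacity=2):
        self.bucket_capacity = bucket_capacity
        self.global_depth = 0  # the directory has 2**global_depth entries
        self.directory = [Bucket(0, bucket_capacity)]

    def _index(self, key):
        # The low-order global_depth bits of the hash select the directory entry.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._index(key)].items.get(key, default)

    def insert(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or not bucket.is_full():
                bucket.items[key] = value
                return
            # Assumption: keys with identical hashes never exceed the capacity,
            # otherwise this loop would keep splitting forever.
            self._split(bucket)

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Directory expansion: the new upper half mirrors the lower half.
            self.directory = self.directory + self.directory
            self.global_depth += 1

        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth, self.bucket_capacity)
        new_bit = 1 << (bucket.local_depth - 1)  # the newly significant hash bit

        # Entries that pointed to the old bucket and have the new bit set
        # are redirected to the sibling.
        for i, entry in enumerate(self.directory):
            if entry is bucket and i & new_bit:
                self.directory[i] = sibling

        # Only the records of the split bucket are redistributed.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            target = sibling if hash(k) & new_bit else bucket
            target.items[k] = v


table = ExtendibleHashTable(bucket_capacity=2)
for key in (0, 1, 2, 4):
    table.insert(key, f"value-{key}")
```

With small integer keys, Python's `hash` returns the integer itself, so the run is easy to follow by hand: inserting 2 forces the first split and doubles the directory to two entries, and inserting 4 doubles it again to four. The bucket holding key 1 is never split, so directory entries 1 and 3 end up pointing to the same bucket.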

Extendible hashing offers several advantages:

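  • It adapts dynamically as the data grows: an overflow splits a single bucket, so the table is never rehashed as a whole.
  • It provides a good balance between space usage and search efficiency, because buckets are allocated only as they are needed and the directory stores only pointers.
  • It offers predictable performance: a search or deletion touches one directory entry and one bucket, and an insertion does the same except when it triggers a split or a directory doubling.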
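
The first advantage rests on a simple property of the bit-based index, which the short sketch below illustrates. It assumes the directory is indexed by the low-order bits of the hash value; `directory_index` is a hypothetical helper name. When the directory doubles from d to d + 1 bits, every key's new index is either its old index or its old index plus 2^d, so pointing each new entry at the bucket of the entry it mirrors keeps every existing key reachable without moving it.

```python
def directory_index(hash_value, global_depth):
    """Return the directory slot chosen by the low-order global_depth bits."""
    return hash_value & ((1 << global_depth) - 1)


hash_value = 0b101101  # an arbitrary example hash value

for depth in range(1, 5):
    index = directory_index(hash_value, depth)
    print(depth, format(index, f"0{depth}b"))

# Check the doubling property for a range of hash values and depths.
for h in range(256):
    for d in range(6):
        old = directory_index(h, d)
        new = directory_index(h, d + 1)
        assert new in (old, old + (1 << d))
```

The loop prints the index of the same hash value at depths 1 to 4, each one extending the previous index by one bit on the left.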
