Hashing Techniques
In hash table data structures, several hashing techniques are commonly used. Here are some of the most popular ones:
Division Hashing:
- Uses the modulo operation to obtain a hash code. The formula is
hash_code = key % table_size
. - Simple and computationally efficient.
- Prone to clustering if keys have a pattern that aligns with the table size.
- Uses the modulo operation to obtain a hash code. The formula is
Multiplication Hashing:
- Involves multiplying the key by a constant A in the range (0, 1), extracting the fractional part, and then multiplying it by the size of the hash table.
- Formula:
hash_code = int(table_size * ((key * A) % 1))
. - Provides a more uniform distribution compared to division hashing.
Universal Hashing:
- Uses a family of hash functions, and a random hash function is selected for each instance of the hash table.
- Reduces the likelihood of collisions, even for worst-case key distributions.
- Achieves good average-case performance.
Double Hashing:
- Involves using a secondary hash function to resolve collisions.
- The formula is
hash_code = (hash1(key) + i * hash2(key)) % table_size
, wherei
is the number of collision resolution attempts. - Effective in minimizing clustering and providing good distribution.
Cuckoo Hashing:
- Uses multiple hash functions and two or more hash tables.
- Each key is assigned to one of the hash tables based on multiple hash functions.
- If a collision occurs, the existing key is moved to the other hash table.
- Repeats the process until there are no collisions.
- Effective for resolving collisions with a constant worst-case lookup time.
Perfect Hashing:
- Aims to eliminate collisions entirely by designing a hash function that guarantees no collisions for a specific set of keys.
- Involves two levels of hashing: the first level hashes keys into buckets, and within each bucket, a second-level hash function is used.
- Ideal for situations where the set of keys is fixed.
Quadratic Probing:
- A form of open addressing that resolves collisions by incrementing the probe sequence quadratically.
- The formula is
hash_code = (hash1(key) + c1 * i + c2 * i^2) % table_size
, wherec1
andc2
are constants. - Helps avoid clustering and provides good distribution.
Chaining (Separate Chaining):
- Each bucket in the hash table contains a linked list of key-value pairs.
- Collisions are resolved by appending key-value pairs to the linked list at the corresponding bucket.
- Provides a simple and effective way to handle collisions.
These hashing techniques vary in their approach to handling collisions and distributing keys. The choice of a hashing technique depends on factors such as the nature of the data, expected workload, and performance requirements for specific applications.
Comments
Post a Comment