Collision Resolution Techniques- Open Addressing and Chaining
Collision resolution is a crucial aspect of hashing because it addresses the situation where two different keys hash to the same index in a hash table. There are various collision resolution techniques, each with its own advantages and disadvantages. Here are some common collision resolution techniques:
A. Open Addressing:
In open addressing, when a collision occurs, the algorithm looks for the next available slot in the hash table until an empty slot is found. In Open Addressing, all elements are stored in the hash table itself. So at any point, the size of the table must be greater than or equal to the total number of keys (Note that we can increase table size by copying old data if needed). This approach is also known as closed hashing. This entire procedure is based upon probing. We will understand the types of probing ahead:
# Linear Probing
hash_code = (hash1(key) + i) % table_size
2.Quadratic Probing:
# Quadratic Probing
hash_code = (hash1(key) + c1 * i + c2 * i^2) % table_size
hash_code = (hash1(key) + i * hash2(key)) % table_size
Disadvantages: Increased likelihood of clustering (groups of consecutive filled slots), which can impact performance.
- Implementation:Hash Function: A hash function is used to determine the index for each key. The goal is to distribute the keys as uniformly as possible across the available slots.Linked Lists: Each slot in the hash table contains a linked list (or sometimes another data structure like a binary search tree) to handle multiple elements that hash to the same index.
Collision Handling: When a collision occurs (i.e., two keys hash to the same index), the new key-value pair is inserted at the end of the linked list at that index.Searching: To retrieve a value, the hash function is used to find the index, and then a search is performed in the linked list at that index.
# Collision resolution using Separate Chaining
hash_table[index].append((key, value))
Advantages:
- Simple Implementation: Chaining is relatively straightforward to implement compared to some other collision resolution techniques.
- Efficient for Multiple Collisions: It handles scenarios where multiple keys hash to the same index efficiently by creating a linked list at each slot.
- Memory Overhead: Requires additional memory for maintaining pointers in the linked lists, which can lead to increased memory usage.
- Performance: The efficiency of chaining can degrade if the linked lists become too long, leading to longer search times.
- Performance Considerations:
- The performance of chaining is influenced by the length of the linked lists. If the lists become too long, it can lead to increased search times.
- Choosing a good hash function is crucial to achieving a uniform distribution of keys and minimizing collisions.
- Conclusion:Chaining is a practical and intuitive method for handling collisions in hashing. Its simplicity makes it a good choice for scenarios where the number of collisions is expected to be reasonably low or when dealing with variable-length keys. Understanding the characteristics of the data and designing an effective hash function are essential for optimizing the performance of the chaining method.
S.No. Separate Chaining Open Addressing 1. Chaining is Simpler to implement. Open Addressing requires more computation. 2. In chaining, Hash table never fills up, we can always add more elements to chain. In open addressing, table may become full. 3. Chaining is Less sensitive to the hash function or load factors. Open addressing requires extra care to avoid clustering and load factor. 4. Chaining is mostly used when it is unknown how many and how frequently keys may be inserted or deleted. Open addressing is used when the frequency and number of keys is known. 5. Cache performance of chaining is not good as keys are stored using linked list. Open addressing provides better cache performance as everything is stored in the same table. 6. Wastage of Space (Some Parts of hash table in chaining are never
used).In Open addressing, a slot can be used even if an input doesn’t map to it. 7. Chaining uses extra space for links. No links in Open addressing
Let the size of a hash table is 10. The index of the hash table varies from 0 to 9. Assume the keys 73, 54, 15, 48, 89, 66, 37, 18, 41, 22, 62 are mapped using modulo operator.
Show how the keys are distributed using chaining method.
Apply the hash function h(x) = x mod 7 for linear probing on the data 2341, 4234,2839, 430, 22, 397, 3920 and show the resulting hash table
Comments
Post a Comment