Chaining
Picture each bucket in your hash table as a coat hook rather than a single locker. You can hang multiple coats on the same hook β they just form a little chain dangling down. That's exactly what separate chaining does: each bucket is the head of a linked list, and any keys that hash to the same bucket are appended to that list.
When you look up a key, you hash to the bucket (\(O(1)\)), then walk the chain at that bucket until you find the key or reach the end. If chains are short β which they will be if the load factor is low β this is effectively \(O(1)\).
Chaining vs Open Addressing
Chaining pros:
- Simple to implement
- Handles high load factors gracefully β chains just get longer
- Deletion is straightforward (unlink the node)
- Each chain can use any data structure (sorted list, BST, even another hash table)
Open addressing pros:
- Everything stays inside the array β better cache locality
- No extra memory for pointers
- Works better when load factor stays below ~0.5
Java's HashMap uses chaining. Python's dict uses open addressing with a custom probing scheme. Both work β the tradeoff is memory vs. cache performance.
β β arrow keys to step Β· click elements to interact
Hash Table with Chaining
When to resize
With chaining, the load factor Ξ± = n / m (n keys, m buckets) directly predicts average chain length. When Ξ± = 1, on average each chain has one node β that's fine. When Ξ± = 3, chains average 3 nodes per lookup β still manageable but slowing down.
The typical resize threshold for chaining is Ξ± > 0.75 (75% full). At that point the table doubles its bucket count and every key is rehashed into the new, larger array. The operation is \(O(n)\) but amortized \(O(1)\) per insert.
After resizing, Ξ± drops to roughly 0.375 (half of 0.75), giving plenty of breathing room before the next resize.
Treeifying long chains (Java 8+)
In Java's HashMap, if a single chain grows beyond 8 nodes, it automatically converts that chain into a red-black tree. Lookup in that bucket then becomes \(O(\log n)\) instead of \(O(n)\) β a safety net against adversarial inputs that hammer one bucket.