Chaining
Picture each bucket in your hash table as a coat hook rather than a single locker. You can hang multiple coats on the same hook — they just form a little chain dangling down. That's exactly what separate chaining does: each bucket is the head of a linked list, and any keys that hash to the same bucket are appended to that list.
When you look up a key, you hash to the bucket (\(O(1)\)), then walk the chain at that bucket until you find the key or reach the end. If chains are short — which they will be if the load factor is low — this is effectively \(O(1)\).
Chaining vs Open Addressing
Chaining pros:
- Simple to implement
- Handles high load factors gracefully — chains just get longer
- Deletion is straightforward (unlink the node)
- Each chain can use any data structure (sorted list, BST, even another hash table)
Open addressing pros:
- Everything stays inside the array — better cache locality
- No extra memory for pointers
- Works better when load factor stays below ~0.5
Java's HashMap uses chaining. Python's dict uses open addressing with a custom probing scheme. Both work — the tradeoff is memory vs. cache performance.
← → arrow keys to step · click elements to interact
Hash Table with Chaining
When to resize
With chaining, the load factor α = n / m (n keys, m buckets) directly predicts average chain length. When α = 1, on average each chain has one node — that's fine. When α = 3, chains average 3 nodes per lookup — still manageable but slowing down.
The typical resize threshold for chaining is α > 0.75 (75% full). At that point the table doubles its bucket count and every key is rehashed into the new, larger array. The operation is \(O(n)\) but amortized \(O(1)\) per insert.
After resizing, α drops to roughly 0.375 (half of 0.75), giving plenty of breathing room before the next resize.
Treeifying long chains (Java 8+)
In Java's HashMap, if a single chain grows beyond 8 nodes, it automatically converts that chain into a red-black tree. Lookup in that bucket then becomes \(O(\log n)\) instead of \(O(n)\) — a safety net against adversarial inputs that hammer one bucket.