Skip List
Picture a huge bookshelf with 1,000 books sorted by title. To find a specific book, you could start at the left and check every title — \(O(n)\). Or you could use a smarter system.
Some books have a tall colored tab sticking up. The tallest tabs (red) mark every 128th book. Medium tabs (blue) mark every 16th book. Short tabs (green) mark every other book. To find a book, you start with the tallest tabs — jump in big chunks — then use progressively shorter tabs to zero in. That's a skip list.
In a skip list, each element lives in a sorted linked list at level 0. Some elements are also promoted to level 1 (a sparser, "express lane" list), some to level 2, and so on. Each level is a subset of the one below, and higher levels allow bigger jumps.
The Probabilistic Promotion Rule
Here's what makes skip lists beautiful: you don't need complex rotation logic to keep it balanced. When inserting a new element, you just flip a fair coin. Heads? Promote to the next level. Keep flipping until you get tails (or hit the max level cap). This means:
- About 1/2 of elements appear at level 1
- About 1/4 at level 2
- About 1/8 at level 3
- ... and so on
The expected height of any node is 2, and the expected total number of pointers is \(O(n)\). The structure stays balanced on average, without any bookkeeping.
How Search Works
Start at the top-left sentinel node (the header of the highest level). Move right as long as the next node's key is less than your target. When you'd overshoot, drop down one level. Repeat until you reach level 0, then check the current node. Each level roughly halves the remaining search space — so the expected number of levels visited is \(O(\log n)\).
How Insert and Delete Work
For insertion: first search, recording the predecessor at each level (the nodes you dropped down from). Then insert the new node after each level-0 through level-k predecessor, where k is determined by coin flips. Delete: find predecessors at all levels and splice the node out. Both operations inherit the \(O(\log n)\) expected cost of the search.
Skip Lists vs Balanced BSTs
AVL trees and Red-Black trees guarantee \(O(\log n)\) worst case through careful rotations. Skip lists give \(O(\log n)\) expected through randomness — simpler code, but no worst-case guarantee (though the probability of significant deviations is negligible).
The real advantage of skip lists is in concurrent programming. Tree rotations touch many nodes, requiring locks on large subtrees. Skip list insertions only modify a handful of local pointers, making lock-free implementations straightforward. Java's ConcurrentSkipListMap uses skip lists for exactly this reason.