Randomized Algorithms7 min read

Skip List

Probabilistic BST alternative — \(O(\log n)\) expected
search:Fast · \(O(\log n)\) expectedinsert:Fast · \(O(\log n)\) expectedspace:\(O(n \log n)\) expected

Picture a huge bookshelf with 1,000 books sorted by title. To find a specific book, you could start at the left and check every title — \(O(n)\). Or you could use a smarter system.

Some books have a tall colored tab sticking up. The tallest tabs (red) mark every 128th book. Medium tabs (blue) mark every 16th book. Short tabs (green) mark every other book. To find a book, you start with the tallest tabs — jump in big chunks — then use progressively shorter tabs to zero in. That's a skip list.

In a skip list, each element lives in a sorted linked list at level 0. Some elements are also promoted to level 1 (a sparser, "express lane" list), some to level 2, and so on. Each level is a subset of the one below, and higher levels allow bigger jumps.

The Probabilistic Promotion Rule

Here's what makes skip lists beautiful: you don't need complex rotation logic to keep it balanced. When inserting a new element, you just flip a fair coin. Heads? Promote to the next level. Keep flipping until you get tails (or hit the max level cap). This means:

  • About 1/2 of elements appear at level 1
  • About 1/4 at level 2
  • About 1/8 at level 3
  • ... and so on

The expected height of any node is 2, and the expected total number of pointers is \(O(n)\). The structure stays balanced on average, without any bookkeeping.

How Search Works

Start at the top-left sentinel node (the header of the highest level). Move right as long as the next node's key is less than your target. When you'd overshoot, drop down one level. Repeat until you reach level 0, then check the current node. Each level roughly halves the remaining search space — so the expected number of levels visited is \(O(\log n)\).

How Insert and Delete Work

For insertion: first search, recording the predecessor at each level (the nodes you dropped down from). Then insert the new node after each level-0 through level-k predecessor, where k is determined by coin flips. Delete: find predecessors at all levels and splice the node out. Both operations inherit the \(O(\log n)\) expected cost of the search.

Skip Lists vs Balanced BSTs

AVL trees and Red-Black trees guarantee \(O(\log n)\) worst case through careful rotations. Skip lists give \(O(\log n)\) expected through randomness — simpler code, but no worst-case guarantee (though the probability of significant deviations is negligible).

The real advantage of skip lists is in concurrent programming. Tree rotations touch many nodes, requiring locks on large subtrees. Skip list insertions only modify a handful of local pointers, making lock-free implementations straightforward. Java's ConcurrentSkipListMap uses skip lists for exactly this reason.

Note: Skip lists give you sorted-set operations (search, insert, delete, range queries) with the simplicity of linked lists and the speed of balanced trees — no rotations, no color flags, no height calculations. Just a coin flip per level.

Skip List Implementation

import random
MAX_LEVEL = 4
P = 0.5
class Node:
def __init__(self, key, level):
self.key = key
self.forward = [None] * (level + 1)
class SkipList:
def __init__(self):
self.head = Node(float('-inf'), MAX_LEVEL)
self.level = 0
def random_level(self):
lvl = 0
while random.random() < P and lvl < MAX_LEVEL:
lvl += 1
return lvl
def insert(self, key):
update = [None] * (MAX_LEVEL + 1)
curr = self.head
for i in range(self.level, -1, -1):
while curr.forward[i] and curr.forward[i].key < key:
curr = curr.forward[i]
update[i] = curr
lvl = self.random_level()
if lvl > self.level:
for i in range(self.level + 1, lvl + 1):
update[i] = self.head
self.level = lvl
new_node = Node(key, lvl)
for i in range(lvl + 1):
new_node.forward[i] = update[i].forward[i]
update[i].forward[i] = new_node
def search(self, key):
curr = self.head
for i in range(self.level, -1, -1):
while curr.forward[i] and curr.forward[i].key < key:
curr = curr.forward[i]
curr = curr.forward[0]
return curr and curr.key == key
sl = SkipList()
for x in [3, 6, 7, 9, 12, 19, 17, 26, 21, 25]:
sl.insert(x)
print(sl.search(19)) # True
print(sl.search(15)) # False
Output
True
False

Complexity

🔍 SearchEach level roughly halves remaining elements — expected \(O(\log n)\) levels visited
\(O(\log n)\) on average\(O(\log n)\)
➕ InsertSearch cost plus \(O(1)\) pointer updates per promoted level — coin flips determine height
\(O(\log n)\) on average\(O(\log n)\)
🗑️ DeleteFind predecessors at all levels then splice out — same cost as search
\(O(\log n)\) on average\(O(\log n)\)
💾 SpaceEach node appears at expected 2 levels total — \(O(n)\) expected, \(O(n \log n)\) with the cap
More than a plain list\(O(n \log n)\)

Quick check

When inserting a new key into a skip list, how do you decide how many levels the node occupies?

Continue reading

Binary Search
Cut the problem in half — every single time
Bloom Filter
Probably yes, definitely no — a memory-sipping membership checker
Las Vegas vs Monte Carlo
Correct-always vs probably-correct — two flavors of randomized algorithms