Heaps & Priority Queues5 min read

Two-Heap Pattern

Split the data in half — find the median in \(O(1)\) always
add number: \(O(\log n)\)find median: \(O(1)\)space: \(O(n)\)

The median of a sorted list of numbers is the middle value. Finding it naively requires sorting — \(O(n \log n)\). But what if numbers arrive one at a time in a stream and you need the median after every insertion?

The two-heap pattern solves this beautifully. Picture a sorted line of people split in the middle: the left half are all shorter than everyone in the right half. You keep two heaps:

  • Max-heap (lower half): the largest element is instantly accessible at the root — that's the middle-left value
  • Min-heap (upper half): the smallest element is instantly accessible at the root — that's the middle-right value

The median is either the root of the larger heap (odd total count) or the average of both roots (even total count).

Invariant to maintain

After every insertion you must restore two invariants:

  1. Size balance: |max_heap.size - min_heap.size| ≤ 1 — the heaps are at most 1 apart in size
  2. Value ordering: max_heap.top() ≤ min_heap.top() — every element in the lower half is ≤ every element in the upper half
Watch two-heap pattern

← → arrow keys to step · click elements to interact

Median Finder (Two Heaps)

import heapq
class MedianFinder:
def __init__(self):
self.lo = [] # max-heap (negated)
self.hi = [] # min-heap
def add_num(self, num):
heapq.heappush(self.lo, -num)
# Balance: push lo's max into hi
heapq.heappush(self.hi, -heapq.heappop(self.lo))
# Keep lo >= hi in size
if len(self.hi) > len(self.lo):
heapq.heappush(self.lo, -heapq.heappop(self.hi))
def find_median(self):
if len(self.lo) > len(self.hi):
return float(-self.lo[0])
return (-self.lo[0] + self.hi[0]) / 2.0
mf = MedianFinder()
for n in [1, 2, 3, 4, 5]:
mf.add_num(n)
print(mf.find_median())
Output
1.0
1.5
2.0
2.5
3.0

Algorithm

pseudocode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class MedianFinder:
def __init__(self):
self.lo = [] # max-heap (store negatives for Python)
self.hi = [] # min-heap
def addNum(self, num):
# Step 1: push to max-heap (lower half)
heapq.heappush(self.lo, -num)
# Step 2: fix ordering — ensure lo.top <= hi.top
if self.hi and (-self.lo[0]) > self.hi[0]:
heapq.heappush(self.hi, -heapq.heappop(self.lo))
# Step 3: fix size balance
if len(self.lo) > len(self.hi) + 1:
heapq.heappush(self.hi, -heapq.heappop(self.lo))
elif len(self.hi) > len(self.lo):
heapq.heappush(self.lo, -heapq.heappop(self.hi))
def findMedian(self):
if len(self.lo) > len(self.hi):
return -self.lo[0] # odd count: lower half has one extra
return (-self.lo[0] + self.hi[0]) / 2.0 # even count: average roots

Why two heaps and not one?

A single heap gives \(O(1)\) access to only one extreme — the min or the max. The median is in the middle, which a single heap can't expose efficiently. Two heaps partition the data so both "halves" of the middle are at heap roots — \(O(1)\) median always.

Variant: Sliding window median

LeetCode 480 extends this to a sliding window: maintain the two-heap invariant while also removing elements that leave the window. Lazy deletion (mark elements as deleted, skip them when they reach the root) keeps the amortized cost at \(O(\log n)\) per step.

Note: Any time a problem asks for a running median, median of a stream, or the k-th percentile in a dynamic dataset — the two-heap pattern is the intended solution. It turns \(O(n \log n)\) sort-and-scan into \(O(n \log n)\) total with \(O(1)\) per query.

Complexity

Find medianJust read both heap roots and possibly average them
Instant\(O(1)\)
Add numberAt most 2-3 heap operations to restore invariants
Log n\(O(\log n)\)
SpaceAll n numbers stored across the two heaps
Linear\(O(n)\)

Sliding Window Median

import heapq
from sortedcontainers import SortedList
def sliding_median(nums, k):
window = SortedList()
result = []
for i, v in enumerate(nums):
window.add(v)
if i >= k:
window.remove(nums[i - k])
if i >= k - 1:
m = k // 2
if k % 2 == 1:
result.append(float(window[m]))
else:
result.append((window[m-1] + window[m]) / 2.0)
return result
print(sliding_median([1, 3, -1, -3, 5, 3, 6, 7], 3))
# [1.0, -1.0, -1.0, 3.0, 5.0, 6.0]
Output
[1.0, -1.0, -1.0, 3.0, 5.0, 6.0]
Challenge

Quick check

After inserting [1, 2, 3, 4, 5] into a MedianFinder, which heaps hold which values?

Continue reading

Heaps
Always know the smallest (or largest) in \(O(1)\) — with \(O(\log n)\) updates
Priority Queue Patterns
K-th largest, merge K lists, top-K frequent — solved with a small heap
Sliding Window
A picture frame scanning over an array to solve sub-array problems in \(O(n)\) time