Mo's Algorithm
Imagine you're a museum guard whose job is to check specific sections of a long painting for damage. You have a list of 1,000 sections to inspect, each covering some range [l, r]. You could walk left-right across the entire painting for each request — but that's 1,000 full traversals.
The smarter move: sort the requests by location so you're never backtracking far. If request 1 covers [10, 50] and request 2 covers [12, 55], just slide your window from the first to the second — four steps, not 55.
That's Mo's Algorithm: by sorting range queries in a particular clever order, we minimize total pointer movement and answer all queries efficiently.
The sliding window idea
Maintain a current window [cur_l, cur_r] and the aggregate value for that window. To answer query [l, r], expand or shrink the window to match — one element at a time. Each expansion/contraction must update the aggregate in \(O(1)\).
The cost of all queries combined = total pointer movement. Mo's job is to minimize that total movement.
Block-based query sorting
Divide the array into blocks of size B = ⌊√n⌋ (the same idea behind sqrt decomposition). Sort queries by:
1. Primary key: the block containing the left endpoint.
2. Secondary key: the right endpoint (ascending for even-numbered blocks, descending for odd-numbered — the "alternating" or "Hilbert" variant halves constant factors).
In the basic version, within each block all queries are sorted by right endpoint (ascending). The right pointer only moves forward within a block. When moving to the next block, the left pointer jumps at most B steps per query.
Complexity analysis — where √n comes from
Right pointer movement: within each block, queries are sorted by r, so the right pointer moves monotonically right — at most n total steps per block. With n/B blocks, total right pointer movement is \(O(n × n/B)\) = \(O(n^2/B)\).
Left pointer movement: within one block, the left pointer can jump at most 2B = 2√n per query. With q queries, total left movement is \(O(q × B)\).
Total = \(O(n^2/B + q×B)\). Minimized when \(n^2\)/B = q×B → B = n/√q. When q ≈ n, optimal B = √n, giving \(O((n+q)\)√n).
Requirements — when Mo's doesn't apply
Offline queries: all queries must be known before processing starts. We reorder them — you can't do that online (when queries arrive one by one).
Easy add and remove: adding an element to or removing one from a window boundary must take \(O(1)\). If your aggregate can't be maintained in \(O(1)\) per update, Mo's loses its advantage.
Count-distinct example
Maintain freq[v] = count of value v in current window, and a counter 'distinct'. Add element a[i]: if freq[a[i]] == 0 before incrementing → distinct++; then freq[a[i]]++. Remove a[i]: freq[a[i]]--; if freq[a[i]] == 0 after decrementing → distinct--. Both are \(O(1)\). Total: \(O((n+q)\)√n).
Mo's on trees
A variant handles path queries on trees by linearizing the tree via Euler tour. Paths become range queries on the linearized sequence (with LCA-based inclusion/exclusion). Same asymptotic complexity.