Mo's Algorithm
Imagine you're a museum guard whose job is to check specific sections of a long painting for damage. You have a list of 1,000 sections to inspect, each covering some range [l, r]. You could walk left-right across the entire painting for each request β but that's 1,000 full traversals.
The smarter move: sort the requests by location so you're never backtracking far. If request 1 covers [10, 50] and request 2 covers [12, 55], just slide your window from the first to the second β four steps, not 55.
That's Mo's Algorithm: by sorting range queries in a particular clever order, we minimize total pointer movement and answer all queries efficiently.
The sliding window idea
Maintain a current window [cur_l, cur_r] and the aggregate value for that window. To answer query [l, r], expand or shrink the window to match β one element at a time. Each expansion/contraction must update the aggregate in \(O(1)\).
The cost of all queries combined = total pointer movement. Mo's job is to minimize that total movement.
Block-based query sorting
Divide the array into blocks of size B = ββnβ (the same idea behind sqrt decomposition). Sort queries by:
1. Primary key: the block containing the left endpoint.
2. Secondary key: the right endpoint (ascending for even-numbered blocks, descending for odd-numbered β the "alternating" or "Hilbert" variant halves constant factors).
In the basic version, within each block all queries are sorted by right endpoint (ascending). The right pointer only moves forward within a block. When moving to the next block, the left pointer jumps at most B steps per query.
Complexity analysis β where βn comes from
Right pointer movement: within each block, queries are sorted by r, so the right pointer moves monotonically right β at most n total steps per block. With n/B blocks, total right pointer movement is \(O(n Γ n/B)\) = \(O(n^2/B)\).
Left pointer movement: within one block, the left pointer can jump at most 2B = 2βn per query. With q queries, total left movement is \(O(q Γ B)\).
Total = \(O(n^2/B + qΓB)\). Minimized when \(n^2\)/B = qΓB β B = n/βq. When q β n, optimal B = βn, giving \(O((n+q)\)βn).
Requirements β when Mo's doesn't apply
Offline queries: all queries must be known before processing starts. We reorder them β you can't do that online (when queries arrive one by one).
Easy add and remove: adding an element to or removing one from a window boundary must take \(O(1)\). If your aggregate can't be maintained in \(O(1)\) per update, Mo's loses its advantage.
Count-distinct example
Maintain freq[v] = count of value v in current window, and a counter 'distinct'. Add element a[i]: if freq[a[i]] == 0 before incrementing β distinct++; then freq[a[i]]++. Remove a[i]: freq[a[i]]--; if freq[a[i]] == 0 after decrementing β distinct--. Both are \(O(1)\). Total: \(O((n+q)\)βn).
Mo's on trees
A variant handles path queries on trees by linearizing the tree via Euler tour. Paths become range queries on the linearized sequence (with LCA-based inclusion/exclusion). Same asymptotic complexity.