Closest Pair of Points
Imagine you have a map with hundreds of cities scattered across it, and you want to find the two closest cities. The lazy approach: measure every possible pair. With 1,000 cities that's 500,000 measurements. With a million cities? Half a trillion. Way too slow.
The smart approach uses divide and conquer — splits the map in half, solves each half separately, then cleverly merges — and it finishes in \(O(n \log n)\) time, the same speed as sorting. Here's how.
The Naive Way — \(O(n^2)\)
Check every pair (i, j) and keep track of the shortest distance seen. Simple, correct, and painfully slow for large inputs. We need something smarter.
Divide & Conquer — the Big Idea
Step 1 — Divide: Sort all points by x-coordinate. Draw a vertical line through the median, splitting the set into a left half L and a right half R.
Step 2 — Conquer: Recursively find the closest pair in L (distance δ_L) and in R (distance δ_R). Let δ = min(δ_L, δ_R). We now know the best answer within each half.
Step 3 — Combine: Can a pair that straddles the dividing line beat δ? Only if both points are within δ of the line — otherwise their horizontal gap alone exceeds δ. This thin vertical strip is all we need to check.
The Magic of the Strip — Why Only 7 Neighbors?
Consider the strip: points within distance δ of the dividing line, sorted by y-coordinate. Within any δ × 2δ rectangle in the strip, at most 8 points can exist. Why? Each half of that rectangle is a δ × δ square. Any two points in the same half are ≥ δ apart (we already found the best in each half). You can fit at most 4 points in a δ × δ square under that constraint. Two squares → at most 8 points total.
So for each point in the strip, you only need to compare it against the next 7 points in y-order. That makes the strip scan \(O(n)\) — not \(O(n^2)\)!
Getting to \(O(n \log n)\)
If you sort the strip by y at each recursive level, the combine step costs \(O(n \log n)\), giving T(n) = 2T(n/2) + \(O(n \log n)\) → \(O(n \log^2 n)\). But pre-sort all points by y once before recursing (just like merge sort maintains a sorted list during merging), and the strip scan becomes \(O(n)\). Now T(n) = 2T(n/2) + \(O(n)\) → \(O(n \log n)\). Same trick as merge sort!