Closest Pair of Points
Imagine you have a map with hundreds of cities scattered across it, and you want to find the two closest cities. The lazy approach: measure every possible pair. With 1,000 cities that's 500,000 measurements. With a million cities? Half a trillion. Way too slow.
The smart approach uses divide and conquer β splits the map in half, solves each half separately, then cleverly merges β and it finishes in \(O(n \log n)\) time, the same speed as sorting. Here's how.
The Naive Way β \(O(n^2)\)
Check every pair (i, j) and keep track of the shortest distance seen. Simple, correct, and painfully slow for large inputs. We need something smarter.
Divide & Conquer β the Big Idea
Step 1 β Divide: Sort all points by x-coordinate. Draw a vertical line through the median, splitting the set into a left half L and a right half R.
Step 2 β Conquer: Recursively find the closest pair in L (distance Ξ΄_L) and in R (distance Ξ΄_R). Let Ξ΄ = min(Ξ΄_L, Ξ΄_R). We now know the best answer within each half.
Step 3 β Combine: Can a pair that straddles the dividing line beat Ξ΄? Only if both points are within Ξ΄ of the line β otherwise their horizontal gap alone exceeds Ξ΄. This thin vertical strip is all we need to check.
The Magic of the Strip β Why Only 7 Neighbors?
Consider the strip: points within distance Ξ΄ of the dividing line, sorted by y-coordinate. Within any Ξ΄ Γ 2Ξ΄ rectangle in the strip, at most 8 points can exist. Why? Each half of that rectangle is a Ξ΄ Γ Ξ΄ square. Any two points in the same half are β₯ Ξ΄ apart (we already found the best in each half). You can fit at most 4 points in a Ξ΄ Γ Ξ΄ square under that constraint. Two squares β at most 8 points total.
So for each point in the strip, you only need to compare it against the next 7 points in y-order. That makes the strip scan \(O(n)\) β not \(O(n^2)\)!
Getting to \(O(n \log n)\)
If you sort the strip by y at each recursive level, the combine step costs \(O(n \log n)\), giving T(n) = 2T(n/2) + \(O(n \log n)\) β \(O(n \log^2 n)\). But pre-sort all points by y once before recursing (just like merge sort maintains a sorted list during merging), and the strip scan becomes \(O(n)\). Now T(n) = 2T(n/2) + \(O(n)\) β \(O(n \log n)\). Same trick as merge sort!