
Count Inversions

Measure how shuffled a sequence is, in \(O(n \log n)\)

Time: \(O(n \log n)\) · Space: \(O(n)\)

Imagine you made a playlist and your friend rearranged it. How different is their version from yours? One way to measure that: count how many pairs of songs are in the wrong relative order. If song A came before song B in your original, but B is now before A — that's one inversion. The more inversions, the more shuffled things got.

Formally, an inversion is a pair of positions (i, j) where i < j but A[i] > A[j] — two elements are out of their natural order. A fully sorted array has 0 inversions. A completely reversed array has the maximum: n(n−1)/2 inversions (every possible pair is out of order).

The naive approach — too slow

Check every pair (i, j) with i < j. If A[i] > A[j], count it. Simple, but \(O(n^2)\): for a million songs, that's roughly half a trillion comparisons.
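As a sketch, the naive check is just a few lines of Python (the function name here is made up for illustration):

```python
from itertools import combinations

def count_inversions_naive(arr):
    # Check every pair (i, j) with i < j and count those with arr[i] > arr[j].
    # combinations() yields element pairs in index order, so x comes before y.
    return sum(x > y for x, y in combinations(arr, 2))

print(count_inversions_naive([1, 2, 3, 4]))  # sorted: 0 inversions
print(count_inversions_naive([4, 3, 2, 1]))  # reversed: 4*3/2 = 6, the maximum
print(count_inversions_naive([3, 1, 2]))     # 2
```

Note how the two extremes match the formulas above: a sorted array gives 0, a fully reversed array of length n gives n(n−1)/2.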

The clever insight: count during merge sort

Here's the aha moment: merge sort's merge step already detects inversions — it just doesn't count them yet.

When merging two sorted halves [Left] and [Right], if we pick an element from Right because it's smaller than the current Left element — every remaining element in Left is larger than it and would have appeared before it in the original array. Each one of those is an inversion with the Right element we just picked.

So: when you copy from Right, add (number of remaining Left elements) to the inversion count. This piggybacks on merge sort at zero extra asymptotic cost — still \(O(n \log n)\) total.

Example walkthrough

Array: [3, 1, 2]. Inversions are (3,1) and (3,2) — count = 2.

Merge sort splits into [3] and [1, 2]. Left = [3], Right = [1, 2]. First we pick 1 from Right (1 < 3); one element (the 3) remains in Left, so count += 1. Then we pick 2 from Right (2 < 3); the 3 still remains in Left, so count += 1 again. Total = 2. Correct!

Note: Inversions aren't just a curiosity — they're used in collaborative filtering (how similar are two people's rankings?), measuring sorting algorithm performance, and even in genetics to compare gene sequences.
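To make the playlist framing concrete, here's a small sketch of the ranking-comparison use case (the song names are made up): map each item in the friend's ordering to its position in your original, then count inversions in that position sequence. This inversion count between two rankings is known as the Kendall tau distance.

```python
from itertools import combinations

original = ["intro", "ballad", "banger", "outro"]  # your order
friend = ["banger", "intro", "outro", "ballad"]    # friend's rearrangement

# Position of each song in the original ordering
pos = {song: i for i, song in enumerate(original)}
seq = [pos[s] for s in friend]  # [2, 0, 3, 1]

# Brute-force inversion count of the position sequence
distance = sum(x > y for x, y in combinations(seq, 2))
print(distance)  # 3
```

A distance of 0 means identical rankings; the larger the count, the more the friend shuffled your playlist.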

Count Inversions via Modified Merge Sort

def count_inversions(arr):
    if len(arr) <= 1:
        return arr, 0
    mid = len(arr) // 2
    left, left_inv = count_inversions(arr[:mid])
    right, right_inv = count_inversions(arr[mid:])
    merged, split_inv = merge_count(left, right)
    return merged, left_inv + right_inv + split_inv

def merge_count(left, right):
    result = []
    inversions = 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            # All remaining left elements form inversions with right[j]
            inversions += len(left) - i
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result, inversions

arr = [3, 1, 4, 1, 5, 9, 2, 6]
_, count = count_inversions(arr)
print(count)  # 8
Output
8

Complexity

🐌 Naive pair check
Check all pairs (i, j) with i < j. Fine for small arrays, painful for large ones.
Quadratic: \(O(n^2)\)

⚡ Modified merge sort
Piggybacks on merge sort; counting adds no extra asymptotic cost.
Log-linear: \(O(n \log n)\)

💾 Space (auxiliary array)
Temporary merge buffer, same as standard merge sort.
Linear: \(O(n)\)

📊 Max possible inversions
Every pair out of order: a reverse-sorted array, the most shuffled possible arrangement.
n(n−1)/2

Quick check

How many inversions does the array [4, 3, 2, 1] have?

Continue reading

Merge Sort
Split, sort each half, merge — guaranteed \(O(n \log n)\) every time
D&C Pattern
Split the problem, solve the pieces, combine the results