Prefix Sums

Imagine tracking how much money you spend each day over a year. If someone asks, "How much did you spend from day 15 to day 60?", the naive way is to add up all 45 days. This takes \(O(n)\) time for every single question. If they ask a million questions, you'll be doing a lot of addition!

Instead, you can keep a Running Total array, also known as a Prefix Sum array. At the end of every day, you just write down the total amount of money you've spent so far this year.

Prefix Sum: Build & Range Query

def build_prefix(arr):
    prefix = [0] * (len(arr) + 1)  # prefix[0] = 0
    for i, v in enumerate(arr):
        prefix[i+1] = prefix[i] + v
    return prefix

def range_sum(prefix, l, r):
    """Sum of arr[l..r] (inclusive) in O(1)."""
    return prefix[r+1] - prefix[l]

arr = [3, 1, 4, 1, 5, 9, 2, 6]
prefix = build_prefix(arr)
print(range_sum(prefix, 1, 4))  # 1+4+1+5 = 11
print(range_sum(prefix, 0, 6))  # 3+1+4+1+5+9+2 = 25
print(range_sum(prefix, 3, 5))  # 1+5+9 = 15

Output

11
25
15

How it works (The Math)

Let's say your daily spending array is A = [2, 4, 1, 5, 3].

You build a Prefix Sum array P where every element is the sum of all elements in A up to that point.

P[0] = A[0] = 2
P[1] = A[0]+A[1] = 6
P[2] = A[0]+A[1]+A[2] = 7
P[3] = A[0]+A[1]+A[2]+A[3] = 12
P[4] = A[0]+A[1]+A[2]+A[3]+A[4] = 15

So your Prefix Sum array is [2, 6, 7, 12, 15].

Now for the magic. Someone asks: "What is the sum from index 1 to index 3?" (In A, that's 4 + 1 + 5 = 10).

Instead of looping, look at the Prefix Sums! You know the total sum up to index 3 is P[3] = 12. And you know the total sum before index 1 is P[0] = 2. If you subtract the "before" from the "total", you are left with exactly the middle chunk!

Sum(i to j) = P[j] - P[i - 1]
Sum(1 to 3) = P[3] - P[0] = 12 - 2 = 10.

Instantly calculated! One subtraction! \(O(1)\) time.

The "1-Indexed Offset" Trick

In code, P[i - 1] causes a problem when i is 0 (an "out of bounds" error). To fix this, most engineers make the Prefix Sum array one element larger than the original array, starting it with a 0.

P = [0, 2, 6, 7, 12, 15]

Now the math becomes safer and cleaner: Sum(i to j) = P[j + 1] - P[i].

Complexity

Building the Prefix Array

Only done once at the beginning

Linear \(O(n)\)

Range Sum Query

Just one subtraction: P[j] - P[i-1]

Instant \(O(1)\)

Updating a Value

If A[i] changes, every prefix sum after it must be recalculated

Slow \(O(n)\)

Space Complexity

Requires storing the new Prefix Sum array