Dynamic Programming9 min read

Bitmask DP

Track every possible subset of choices in a single integer
time:\(O(2^n Β· n)\)space:\(O(2^n Β· n)\)feasible n:≀ 20

Imagine a light switch panel with n switches. Each switch is either ON or OFF. The combination of all switches at any moment β€” say, switches 1, 3, and 4 are on β€” describes a subset of items you've chosen to include.

Now here's the clever trick: instead of storing a list of which switches are on, you write a single binary number. Switch 0 on = bit 0 is 1. Switch 3 on = bit 3 is 1. All switches off = the number 0. All switches on = the number \(2^n\) βˆ’ 1.

That integer is called a bitmask, and dynamic programming over all possible bitmasks is Bitmask DP.

Why Does This Help?

Some problems ask: "what's the best way to assign items to slots, visiting every item exactly once?" Brute force tries every ordering β€” that's n! possibilities, which explodes for even n = 13. Bitmask DP instead tracks which subset of items has been handled rather than the exact order, collapsing the state space from n! down to \(2^n\) Γ— n β€” still exponential, but manageable up to n β‰ˆ 20.

The Core State

The DP state is dp[mask][i], meaning: "I've visited exactly the cities (or items) described by mask, and I'm currently at city i." The bit tricks you need are:

  • Check if city j is in mask: (mask >> j) & 1
  • Add city j to mask: mask | (1 << j)
  • Remove city j from mask: mask & ~(1 << j)
  • Is mask fully visited? mask == (1 << n) - 1

The Travelling Salesman Problem (TSP)

The most famous bitmask DP application (closely related to the Hamiltonian Path problem): given n cities and the travel cost between every pair, find the cheapest tour that visits every city exactly once and returns home.

Base case: dp[1][0] = 0 β€” we start at city 0, having visited only city 0 (bit 0 is set), at zero cost.

Transition: for every unvisited city j, try extending the current tour: dp[mask | (1<<j)][j] = min(dp[mask][i] + cost[i][j]).

Answer: the cheapest way to visit all cities and return: min over all i of (dp[full_mask][i] + cost[i][0]).

Note: The bitmask is just a compact way to say "here's the exact set of items I've dealt with so far." Every bit is a yes/no switch. Two states with the same bitmask are indistinguishable β€” that's the whole point.

TSP with Bitmask DP

import math
def tsp(cost):
n = len(cost)
INF = math.inf
full = (1 << n) - 1
# dp[mask][i] = min cost to visit cities in mask, ending at i
dp = [[INF] * n for _ in range(1 << n)]
dp[1][0] = 0 # start at city 0
for mask in range(1 << n):
for i in range(n):
if dp[mask][i] == INF:
continue
if not (mask >> i) & 1:
continue
for j in range(n):
if (mask >> j) & 1:
continue # already visited
new_mask = mask | (1 << j)
dp[new_mask][j] = min(dp[new_mask][j], dp[mask][i] + cost[i][j])
return min(dp[full][i] + cost[i][0] for i in range(n))
# 4 cities, symmetric costs
cost = [
[0, 10, 15, 20],
[10, 0, 35, 25],
[15, 35, 0, 30],
[20, 25, 30, 0]
]
print(tsp(cost)) # 80
Output
80

Assignment Problem

Assign n workers to n tasks, minimizing total cost. The state is simpler: dp[mask] = minimum cost to complete the tasks in mask, where we assign task by task in order.

If mask has k bits set, we're assigning worker k (0-indexed). For each task j already in mask, we could have assigned it last: dp[mask] = min(dp[mask ^ (1<<j)] + cost[k-1][j]).

Iterating Over Subsets

A powerful pattern: iterate over all subsets of a given mask. The loop for sub = mask; sub > 0; sub = (sub-1) & mask visits every non-empty subset of mask exactly once. The trick: (sub-1) & mask clears the lowest set bit within mask, stepping through all subsets in \(O(2^popcount(mask)\)) time. Summing over all masks gives \(3^n\) total steps.

Complexity

🐌 Brute-force TSP
Infeasible for n > 12 β€” 13! is already 6 billion
All orderings \(O(n!)\)
⚑ Bitmask DP (time)
\(2^n\) masks Γ— n current cities Γ— n transitions; ~400M ops for n=20
Exponential but tractable \(O(2^n Β· n^2)\)
πŸ’Ύ Bitmask DP (space)
~20 MB for n=20 with 32-bit integers β€” watch your memory limit
Exponential table \(O(2^n Β· n)\)
πŸ” Subset enumeration
Each element has 3 cases: not in mask, in mask but not sub, in both
Sum over all masks \(O(3^n)\)
πŸ“ Practical limit
\(2^{20}\) Γ— 20 β‰ˆ 20 million states β€” feasible within a few seconds
Hard ceiling n ≀ 20

Quick check

In TSP bitmask DP, what does dp[mask][i] store?

Continue reading