Dynamic Programming7 min read

Edit Distance

How many typo-fixes does it take to turn one word into another?
time:\(O(m·n)\)space:\(O(m·n)\) or \(O(min(m,n)\))

You type "kitten" and autocorrect suggests "sitting". How does it know they're close? It counts the minimum number of single-character changes needed to turn one into the other. Insert a letter, delete a letter, or replace one letter with another — each costs 1. That minimum count is the Edit Distance (also called Levenshtein distance). It is closely related to the Longest Common Subsequence problem.

kitten → sitten (replace k with s) → sittin (replace e with i) → sitting (insert g). Three operations. Edit distance = 3.

The Three Operations

Insert: Add a character anywhere. "cat" → "coat" = 1 insertion.

Delete: Remove a character. "cats" → "cat" = 1 deletion.

Replace: Swap one character for another. "cat" → "bat" = 1 replacement.

The 2D DP Table

Define dp[i][j] = minimum edits to transform the first i characters of s into the first j characters of t. Table is (m+1) × (n+1).

Base cases: dp[i][0] = i (delete all i characters of s) and dp[0][j] = j (insert j characters to build t from nothing).

The Recurrence

If s[i-1] == t[j-1]: dp[i][j] = dp[i-1][j-1]. Characters match — no operation needed, just inherit from the diagonal.

If s[i-1] != t[j-1]: dp[i][j] = 1 + min(dp[i-1][j-1], dp[i-1][j], dp[i][j-1]). One operation, plus the minimum of: replacing (diagonal), deleting s[i-1] (up), or inserting t[j-1] (left).

Space Optimization

dp[i][j] only depends on the cell diagonally up-left, directly above, and directly left — so one row of storage plus one extra variable is enough. Space drops from \(O(m·n)\) to \(O(min(m,n)\)). You lose reconstruction ability but keep the distance value.

Note: Spell checkers suggest corrections with edit distance 1 or 2 — words that are 'one typo away' or 'two typos away'. DNA alignment tools use the same idea to measure how genetically similar two sequences are.

Edit Distance — classic DP

def edit_distance(s, t):
m, n = len(s), len(t)
dp = [[0]*(n+1) for _ in range(m+1)]
for i in range(m+1): dp[i][0] = i
for j in range(n+1): dp[0][j] = j
for i in range(1, m+1):
for j in range(1, n+1):
if s[i-1] == t[j-1]:
dp[i][j] = dp[i-1][j-1]
else:
dp[i][j] = 1 + min(dp[i-1][j-1], dp[i-1][j], dp[i][j-1])
return dp[m][n]
print(edit_distance('kitten', 'sitting')) # 3
print(edit_distance('horse', 'ros')) # 3
Output
3
3

Complexity

⏱️ Timem = length of s, n = length of t; each cell is \(O(1)\)
Fill m×n table\(O(m·n)\)
📊 Space (full table)Full table needed to trace back the exact sequence of edits
For reconstruction\(O(m·n)\)
🪄 Space (distance only)Keep only the previous row — loses reconstruction ability
One row + one variable\(O(min(m,n)\))
🐌 Brute forceDP is exponentially faster than exhaustively trying all insert/delete/replace sequences
All operation sequences\(O(3^(m+n)\))

Quick check

What is the edit distance between 'kitten' and 'sitting'?

Continue reading

Longest Common Subsequence
Find the longest story two sequences share — skipping is allowed
DP Pattern
Never solve the same puzzle piece twice — write it down and look it up next time