Rabin-Karp

Imagine you're searching for a friend's name in a 500-page book. Checking every single letter of every page would take forever. But what if you could create a quick fingerprint of your friend's name — a tiny summary number — and compare fingerprints first? Only when a fingerprint matches do you bother reading the actual letters.

That's exactly what Rabin-Karp does. It computes a fingerprint (called a hash) for the pattern, then slides a window across the text and compares fingerprints. Most windows fail the fingerprint test instantly. Only rare matches require a full character comparison.

The Rolling Hash Trick

Here's the clever part: when you slide the window one step to the right, you don't recompute the whole fingerprint from scratch. Instead, you roll it — subtract the leftmost character's contribution, shift everything left, and add the new rightmost character. This makes each slide cost \(O(1)\) instead of \(O(m)\). That's the rolling hash.

Mathematically, for a window of length m starting at position i:
hash = (s[i]·b^(m-1) + s[i+1]·b^(m-2) + ... + s[i+m-1]) mod p
Sliding one step: new_hash = (old_hash - s[i]·b^(m-1)) × b + s[i+m]) mod p

Spurious Hits: When Fingerprints Lie

Sometimes two different strings have the same fingerprint — a hash collision. When this happens, Rabin-Karp does a full character-by-character comparison to confirm. If they don't actually match, it's called a spurious hit. With a good hash and a large prime modulus, spurious hits are extremely rare and the algorithm stays near \(O(n+m)\) in practice.

Searching for Multiple Patterns at Once

This is where Rabin-Karp really shines. If you want to search for k different patterns simultaneously, compute all k hashes and store them in a hash set. Each window only needs one \(O(1)\) lookup against the set. Total time: \(O(n + Σm_i)\) — linear in the combined input size. KMP would need k separate passes.

When Should You Use Rabin-Karp?

Searching for multiple patterns at once — this is its superpower
When you want a simple implementation with good average-case behavior
2D pattern matching (extend hashing to grids)

For a single pattern, KMP or Boyer-Moore may be faster in the worst case. But when you have dozens or hundreds of patterns, Rabin-Karp's hash-set trick makes it hard to beat.

Rabin-Karp

The Rolling Hash Trick

Spurious Hits: When Fingerprints Lie

Searching for Multiple Patterns at Once

Rabin-Karp Pattern Search

When Should You Use Rabin-Karp?

Complexity

Quick check