
Consistent Hashing

Adding a server shouldn't reshuffle everything

The Problem with Simple Hashing

Imagine you have 4 servers and you want to distribute data across them. The simple approach: server = hash(key) % 4. Key "alice" hashes to 2, so it goes to Server 2. Easy.

But what happens when you add a 5th server? Now it's hash(key) % 5. Suddenly, most keys map to a different server. "alice" might now map to Server 3 instead of Server 2. You'd have to move most of your data — a massive reshuffling.

In real systems, this reshuffling is catastrophic. Imagine a cache cluster with 100 servers and billions of keys. Adding one server would invalidate nearly all cached data, causing a cache stampede that could bring down your database.

Consistent hashing solves this elegantly.

The problem with mod N hashing
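The damage from a mod-N resize is easy to measure directly. Here's a small sketch; the key names and server counts are made up for illustration:

```python
import hashlib

def bucket(key: str, n: int) -> int:
    # Stable hash (md5) so results are reproducible across runs,
    # unlike Python's built-in hash() which is salted per process.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"key-{i}" for i in range(10_000)]

# Count keys whose bucket changes when the cluster grows from 4 to 5 servers
moved = sum(1 for k in keys if bucket(k, 4) != bucket(k, 5))
print(f"{moved / len(keys):.0%} of keys moved")  # about 80% remap going 4 -> 5
```

Only keys whose hash happens to land in the same slot modulo both 4 and 5 stay put, which works out to roughly 1 key in 5; everything else moves.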

The Hash Ring

Instead of a line of numbers, imagine a circle (a ring) of positions from 0 to 2^32 - 1. Both servers and keys are placed on this ring using a hash function.

Here's how it works:

  1. Place servers on the ring. Hash each server's name/IP to get its position. Server A lands at position 10,000, Server B at 40,000, Server C at 80,000.
  2. Place keys on the ring. Hash each key to get its position. Key "alice" lands at position 25,000.
  3. Find the owner. Walk clockwise from the key's position until you hit a server. "alice" at 25,000 walks clockwise and hits Server B at 40,000. So Server B owns "alice."

Now the magic: when you add a server, only the keys between it and its counterclockwise neighbor need to move (they transfer from the new server's clockwise successor); when you remove a server, its keys shift to its clockwise successor. Instead of reshuffling everything, only a small fraction of keys is affected.

Servers placed on a hash ring
Keys find the nearest server clockwise

Consistent Hashing Implementation

import hashlib
from bisect import bisect_right


class ConsistentHash:
    def __init__(self, num_virtual_nodes=150):
        self.num_vnodes = num_virtual_nodes
        self.ring = {}          # hash_value -> server_name
        self.sorted_keys = []   # sorted hash positions on the ring

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_server(self, server: str):
        """Add a server with virtual nodes to the ring."""
        for i in range(self.num_vnodes):
            vnode_key = f"{server}:vnode{i}"
            h = self._hash(vnode_key)
            self.ring[h] = server
            self.sorted_keys.append(h)
        self.sorted_keys.sort()

    def remove_server(self, server: str):
        """Remove a server and all its virtual nodes."""
        for i in range(self.num_vnodes):
            vnode_key = f"{server}:vnode{i}"
            h = self._hash(vnode_key)
            del self.ring[h]
            self.sorted_keys.remove(h)

    def get_server(self, key: str) -> str:
        """Find which server owns this key (walk clockwise)."""
        if not self.ring:
            raise Exception("No servers available")
        h = self._hash(key)
        idx = bisect_right(self.sorted_keys, h)
        if idx == len(self.sorted_keys):
            idx = 0  # Wrap around the ring
        return self.ring[self.sorted_keys[idx]]


# Demo
ch = ConsistentHash(num_virtual_nodes=150)
ch.add_server("server-A")
ch.add_server("server-B")
ch.add_server("server-C")

# Check where keys land
for key in ["user:1", "user:2", "user:3", "photo:42", "session:abc"]:
    print(f"{key} -> {ch.get_server(key)}")

print("\n--- Adding server-D ---")
ch.add_server("server-D")
for key in ["user:1", "user:2", "user:3", "photo:42", "session:abc"]:
    print(f"{key} -> {ch.get_server(key)}")
# Only ~25% of keys should remap (1/4 with 4 servers)
Output
user:1 -> server-A
user:2 -> server-C
user:3 -> server-B
photo:42 -> server-A
session:abc -> server-C

--- Adding server-D ---
user:1 -> server-A
user:2 -> server-C
user:3 -> server-D
photo:42 -> server-A
session:abc -> server-C
Adding a server — minimal key movement

Virtual Nodes

There's a problem with basic consistent hashing: servers might be placed unevenly on the ring. If Server A and Server B land next to each other, Server A might own a tiny arc while Server C owns a huge arc. The load distribution is unbalanced.

Virtual nodes fix this. Instead of placing each server once on the ring, you place it many times (100-200 copies). Each copy is called a virtual node. Server A gets positions for "A:vnode0", "A:vnode1", ..., "A:vnode149".

With 150 virtual nodes per server, the positions are much more evenly spread around the ring. The load distribution becomes nearly uniform — typically within 5-10% variance between servers.

Virtual nodes also make it easy to handle heterogeneous servers. A powerful server can get 300 virtual nodes while a smaller one gets 100, naturally receiving proportionally more traffic.

Virtual nodes for even load distribution
Note: Think of virtual nodes like musical chairs: instead of each kid claiming a single chair, every kid drapes a jacket over many chairs scattered around the circle. No kid ends up owning one long empty stretch of the circle, and the chairs are evenly "claimed" across all players.
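You can check the evenness claim with a rough sketch. The server names, vnode count, and key count below are arbitrary, chosen only to mirror the numbers in the text:

```python
import hashlib
from bisect import bisect_right
from collections import Counter

def h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

servers = ["A", "B", "C", "D"]
VNODES = 150  # virtual nodes per server (illustrative)

# Build the ring: every server appears VNODES times, at hashed positions
ring = sorted((h(f"{s}:vnode{i}"), s) for s in servers for i in range(VNODES))
positions = [p for p, _ in ring]

def owner(key: str) -> str:
    # Clockwise walk = first ring position at or after the key's hash
    idx = bisect_right(positions, h(key)) % len(ring)
    return ring[idx][1]

counts = Counter(owner(f"key-{i}") for i in range(20_000))
for s in servers:
    print(s, counts[s])  # each server should land near 5,000 keys
```

Dropping VNODES to 1 and rerunning makes the imbalance of the basic scheme obvious: one server typically grabs a far larger share than the others.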

What Happens When Servers Change?

Adding a server: The new server takes over some keys from its clockwise neighbor. Only those keys need to move. With N servers, adding one server only moves ~1/N of the total keys. Compare this to modular hashing where nearly ALL keys move.

Removing a server: The removed server's keys are taken over by its clockwise neighbor. Again, only ~1/N keys are affected.

Let's see the math with 100 servers and 1 million keys:

  • Modular hashing: Add one server → ~990,000 keys move (99%!)
  • Consistent hashing: Add one server → ~10,000 keys move (1%)

That's a 99x improvement. In a cache cluster, this means 99% of your cache stays valid instead of being invalidated. Your database barely notices.
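The 1/N property can be demonstrated with a smaller simulation than the 100-server example (the server names and counts here are illustrative, and a bare ring is rebuilt inline rather than reusing the class above):

```python
import hashlib
from bisect import bisect_right

def h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def make_ring(servers, vnodes=100):
    # Ring of (position, server) pairs plus a parallel sorted position list
    ring = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(vnodes))
    return ring, [p for p, _ in ring]

def owner(ring, positions, key):
    return ring[bisect_right(positions, h(key)) % len(ring)][1]

keys = [f"k{i}" for i in range(20_000)]
old_ring, old_pos = make_ring([f"s{j}" for j in range(10)])
new_ring, new_pos = make_ring([f"s{j}" for j in range(11)])  # add server s10

moved = sum(owner(old_ring, old_pos, k) != owner(new_ring, new_pos, k)
            for k in keys)
moved_mod = sum(h(k) % 10 != h(k) % 11 for k in keys)

print(f"consistent hashing: {moved / len(keys):.1%} of keys moved")  # ~1/11
print(f"mod-N hashing:      {moved_mod / len(keys):.1%} of keys moved")  # ~10/11
```

The consistent-hashing figure hovers near 1/11 (about 9%), while mod-N remaps roughly 10 keys out of 11.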

Real-World Use Cases

Consistent hashing is everywhere in distributed systems:

Amazon DynamoDB — Uses consistent hashing to distribute data across storage nodes. When a node is added or removed, only neighboring data partitions are affected.

Apache Cassandra — Each node is assigned a position on the hash ring. Data is automatically distributed based on the partition key's hash. Adding a node triggers minimal data migration.

Akamai CDN — Actually one of the earliest users! Uses consistent hashing to decide which edge server caches which content. This was described in the original 1997 paper.

Discord — Uses consistent hashing to route users to specific server processes. When they add capacity, only a fraction of users are rebalanced.

Load Balancers — Some load balancers use consistent hashing for session affinity. The same key (user ID, session ID) always maps to the same backend server, without the reshuffling problems of simple hash-mod.

Note: Interview tip: When you mention consistent hashing in an interview, always explain WHY you need it (avoid massive reshuffling), HOW it works (hash ring, clockwise walk), and don't forget virtual nodes (even distribution). Bonus points for mentioning the 1/N key movement property.

Key Metrics

Key lookup:                      binary search on sorted ring      O(log N)
Add server:                      V = virtual nodes per server      O(V log N)
Remove server:                   remove V virtual nodes            O(V log N)
Keys moved on add (modular):     almost everything moves           ~(N-1)/N keys
Keys moved on add (consistent):  minimal disruption                ~1/N keys
Space per server:                150-200 virtual nodes typical     O(V)

Quick check

With 10 servers using consistent hashing, approximately what percentage of keys need to move when you add an 11th server?
