
Caching

Keep the good stuff close — skip the slow trip

Why Cache?

Imagine your favorite snack is in the kitchen downstairs. Every time you want a bite, you walk downstairs, open the pantry, grab a chip, walk back upstairs. That's exhausting.

Now imagine you just bring the whole bag to your desk. That's caching.

In system design, a cache is a fast, temporary storage layer that sits between your application and the slow data source (usually a database). Instead of hitting the database for every request, you check the cache first. If the data is there (cache hit), you return it instantly. If not (cache miss), you fetch from the database, store it in the cache, and return it.

Caches are fast because they use RAM (memory), which is 100-1000x faster than reading from disk. The trade-off? RAM is limited and expensive, so you can only cache a subset of your data.

Without cache — every request hits the database

Caching Strategies

There are several patterns for how your cache and database work together. Each has different trade-offs.

Cache-Aside (Lazy Loading)

  • App checks the cache first.
  • On a miss, app reads from the database, then writes the result to the cache.
  • On a hit, app returns the cached data directly.
  • Pros: Only caches data that's actually requested. Simple to implement.
  • Cons: First request for each piece of data is slow (cache miss). Data can become stale.

Write-Through

  • Every write goes to both the cache and the database at the same time.
  • Cache is always up-to-date.
  • Pros: Data is never stale. Reads are always fast.
  • Cons: Writes are slower (you're writing twice). You might cache data nobody ever reads.
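The two bullets above can be sketched in a few lines. This is a minimal illustration using plain dicts to stand in for Redis and the database; in production both writes would go over the network, which is why write-through writes are slower.

```python
class WriteThroughCache:
    """Sketch of write-through: every write updates the DB and the cache together."""

    def __init__(self):
        self.cache = {}  # stand-in for Redis
        self.db = {}     # stand-in for the database

    def write(self, key: str, value: dict):
        self.db[key] = value     # write to the source of truth...
        self.cache[key] = value  # ...and to the cache, so it is never stale

    def read(self, key: str):
        # The cache is always up-to-date, so reads never touch the DB.
        return self.cache.get(key)

store = WriteThroughCache()
store.write("user:42", {"name": "Alice"})
store.read("user:42")  # served from cache, guaranteed fresh
```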

Write-Back (Write-Behind)

  • Writes go to the cache only. The cache writes to the database later in batches.
  • Pros: Super-fast writes. Good for write-heavy workloads.
  • Cons: Risk of data loss if the cache crashes before flushing to the database. More complex.
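A write-back cache can be sketched by buffering "dirty" keys and flushing them to the database in batches. The threshold-based trigger here is an assumption for illustration; real systems often flush on a timer or queue depth instead.

```python
class WriteBackCache:
    """Sketch of write-back: writes hit memory only; the DB is updated in batches."""

    def __init__(self, flush_threshold: int = 2):
        self.cache = {}
        self.db = {}
        self.dirty = set()  # keys written but not yet persisted
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.cache[key] = value  # fast: memory only
        self.dirty.add(key)
        if len(self.dirty) >= self.flush_threshold:
            self.flush()         # persist the whole batch at once

    def flush(self):
        for key in self.dirty:
            self.db[key] = self.cache[key]  # one batched DB write
        self.dirty.clear()

buf = WriteBackCache(flush_threshold=2)
buf.write("a", 1)  # cached, not yet in the DB
buf.write("b", 2)  # threshold reached: both keys flushed
```

If the process crashes while `dirty` is non-empty, those writes are lost — exactly the risk the bullet above describes.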

Read-Through

  • Similar to cache-aside, but the cache itself is responsible for loading from the database on a miss.
  • The application only ever talks to the cache, never directly to the database.
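The difference from cache-aside is that the loader lives inside the cache. A minimal sketch, with a plain dict standing in for the cache and a hypothetical `load_from_db` function standing in for the database query:

```python
class ReadThroughCache:
    """Sketch of read-through: the cache itself knows how to load on a miss."""

    def __init__(self, loader):
        self.cache = {}
        self.loader = loader  # the cache, not the app, talks to the DB

    def get(self, key):
        if key not in self.cache:
            self.cache[key] = self.loader(key)  # load from the DB on a miss
        return self.cache[key]

calls = []
def load_from_db(key):
    calls.append(key)      # track DB hits for the demo
    return {"id": key}

users = ReadThroughCache(load_from_db)
users.get("42")  # miss: loader runs once
users.get("42")  # hit: served from cache, loader not called again
```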
Cache-aside pattern — the app checks cache first, falls back to DB on miss, then populates the cache for next time
Cache hit — data served from fast memory
Cache miss — fetch from DB, then populate cache

Cache-Aside Pattern Implementation

import redis
import json
import time

class CacheAside:
    def __init__(self):
        self.cache = redis.Redis(host='localhost', port=6379)
        self.ttl = 3600  # Cache entries expire after 1 hour

    def get_user(self, user_id: int) -> dict:
        # Step 1: Check the cache
        cache_key = f"user:{user_id}"
        cached = self.cache.get(cache_key)
        if cached:
            print(f"Cache HIT for user {user_id}")
            return json.loads(cached)

        # Step 2: Cache miss — fetch from database
        print(f"Cache MISS for user {user_id} — querying DB...")
        user = self._query_database(user_id)

        # Step 3: Store in cache for next time
        self.cache.setex(cache_key, self.ttl, json.dumps(user))
        return user

    def update_user(self, user_id: int, data: dict):
        # Update database first
        self._update_database(user_id, data)
        # Then invalidate (delete) the cache entry
        self.cache.delete(f"user:{user_id}")
        # Next read will fetch fresh data from DB

    def _query_database(self, user_id):
        time.sleep(0.05)  # Simulated 50ms DB query
        return {"id": user_id, "name": "Alice", "email": "[email protected]"}

    def _update_database(self, user_id, data):
        time.sleep(0.05)  # Simulated 50ms DB write

# Usage
service = CacheAside()
service.get_user(42)  # Cache MISS — slow (50ms)
service.get_user(42)  # Cache HIT — fast (<1ms)

Output
Cache MISS for user 42 — querying DB...
Cache HIT for user 42

Cache Eviction Policies

Your cache has limited space. When it's full and a new item needs to go in, which old item gets kicked out? That's the eviction policy.

LRU (Least Recently Used) — Evict the item that hasn't been accessed for the longest time. "If you haven't touched it in a while, you probably don't need it." The most popular eviction policy.

LFU (Least Frequently Used) — Evict the item that's been accessed the fewest times overall. "The item nobody visits much." Good for workloads with stable popularity patterns.

TTL (Time To Live) — Each item has an expiration timer. After X seconds/minutes, it's automatically removed. Not exactly eviction, but it prevents stale data from living forever.

FIFO (First In, First Out) — Evict the oldest item. Simple but not always smart — the oldest item might still be very popular.

Random Eviction — Pick a random item to evict. Sounds silly, but it's fast and surprisingly effective in some workloads.

In practice, most systems use LRU + TTL together. LRU handles the limited space, and TTL ensures freshness.
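LRU is easy to sketch in Python with `collections.OrderedDict`, which remembers insertion order and lets us move a key to the end on every access; evicting then means popping from the front. This is a teaching sketch — real caches (Redis uses an approximated LRU) are more sophisticated.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()  # front = least recently used

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # touch "a", so "b" is now least recently used
cache.put("c", 3)      # full: evicts "b"
print(cache.get("b"))  # None — evicted
```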

Cache eviction — LRU removes the least recently used item
Note: Cache invalidation is one of the two hardest problems in computer science (the other being naming things and off-by-one errors). The challenge: when does cached data become stale? Too aggressive invalidation means low cache hit rates. Too lazy invalidation means users see outdated data. TTL is usually your best friend here.

Cache Invalidation

Stale data in a cache can cause real problems. Imagine a user updates their profile picture, but everyone else still sees the old one because it's cached. How do you keep the cache fresh?

Purge on write: When data is updated, immediately delete (or update) the cached version. Simple and effective for single-server setups.

TTL-based expiry: Set a time-to-live on every cached item. After 5 minutes, the data automatically expires and the next read fetches fresh data. You accept a small window of staleness.

Event-driven invalidation: The database publishes events (via message queues) when data changes. Cache listeners consume these events and invalidate the relevant entries. More complex but more precise.

Versioned keys: Instead of user:42, use user:42:v3. When data changes, increment the version. Old cached entries naturally become irrelevant.
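The versioned-keys idea fits in a few lines: keep a version counter per logical key and bake it into the cache key, so an update makes the old entry unreachable rather than explicitly deleting it. A sketch with a dict standing in for the cache (in practice the version counter itself must live somewhere fast and shared, such as the cache):

```python
class VersionedCache:
    def __init__(self):
        self.cache = {}
        self.versions = {}  # current version per logical key

    def _key(self, base: str) -> str:
        return f"{base}:v{self.versions.get(base, 1)}"

    def get(self, base: str):
        return self.cache.get(self._key(base))

    def put(self, base: str, value):
        self.cache[self._key(base)] = value

    def bump(self, base: str):
        # On update, increment the version; stale entries are simply never read again
        self.versions[base] = self.versions.get(base, 1) + 1

profiles = VersionedCache()
profiles.put("user:42", {"avatar": "old.png"})
profiles.bump("user:42")   # profile was updated
profiles.get("user:42")    # miss: the v1 entry is now unreachable
profiles.put("user:42", {"avatar": "new.png"})
```

Old versions linger in memory until evicted by LRU or TTL, which is the usual price of this strategy.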

Cache invalidation strategies — write-through keeps cache consistent, write-back optimizes for speed, write-around avoids caching infrequently-read writes

Redis and Memcached

The two most popular caching solutions:

Redis — An in-memory data structure store. Supports strings, lists, sets, hashes, sorted sets, and more. Has persistence options (RDB snapshots, AOF logs) so data survives restarts. Supports pub/sub, Lua scripting, and transactions. The Swiss Army knife of caching.

Memcached — A simpler, pure key-value cache. Faster for simple get/set operations because it's more lightweight. No persistence, no fancy data structures. Best for straightforward caching needs.

When to choose which?

  • Use Redis if you need data structures (sorted sets for leaderboards, lists for queues), persistence, or pub/sub.
  • Use Memcached if you need a simple, blazing-fast cache and don't need persistence.

CDN Caching and the Thundering Herd

CDN caching is a special form of caching where static content (images, CSS, JavaScript, videos) is cached on edge servers around the world. When a user in Tokyo requests an image, it's served from a nearby CDN node instead of your server in Virginia. We cover this in depth in the CDN lesson.

The Thundering Herd Problem: Imagine a popular cache entry expires. Suddenly, 10,000 simultaneous requests all get a cache miss and hit the database at the same time. Your database collapses under the load. This is the "thundering herd."

Solutions:

  • Cache locking: Only one request fetches from the database. Others wait for the cache to be repopulated.
  • Stale-while-revalidate: Serve the stale cached data while one background request refreshes it.
  • Jittered TTL: Add random variation to TTL values so not all entries expire at the same time. Instead of TTL=3600, use TTL=3600 + random(0, 300).
Where to cache — from browser to database
Note: Interview tip: When discussing caching, always mention the trade-off between consistency and performance. More aggressive caching = faster reads but potentially stale data. Also mention your cache's size limitations — you can't cache everything, so the eviction policy matters.

Key Metrics

Operation                  What it involves                  Typical latency   Complexity
Cache read (Redis)         In-memory lookup                  ~0.1-0.5 ms       \(O(1)\)
Cache write (Redis)        In-memory write                   ~0.1-0.5 ms       \(O(1)\)
DB read (uncached)         Disk I/O + query execution        ~5-50 ms          \(O(\log n)\)
CDN hit                    Served from edge server           ~1-20 ms          \(O(1)\)
CDN miss (origin fetch)    Cross-region round trip           ~50-200 ms        \(O(1)\)
Typical cache hit ratio    Well-tuned read-heavy workloads   85-99%            —

