Real-World Systems12 min read

Design a News Feed

Delivering personalized content to millions — instantly
scope:Real-World Systemdifficulty:Intermediate-Advanced

Understanding the Problem

Open Instagram, Twitter, or Facebook. That scrolling list of posts from people you follow — that's the news feed. It looks simple, but behind the scenes it's one of the most challenging systems to design at scale.

Think about it: when a celebrity with 100 million followers posts a photo, those 100 million people need to see it in their feed — quickly. Meanwhile, thousands of other people are also posting. Each user's feed is unique — a personalized mix of content from everyone they follow.

Functional Requirements:

  • Users can create posts (text, images, videos).
  • Users can view a personalized feed of posts from people they follow.
  • Feed is sorted by relevance or time (or a mix).
  • Support pagination (infinite scroll).
  • New posts appear in near-real-time.

Non-Functional Requirements:

  • Low latency: Feed should load in < 200ms.
  • Scale: 500M DAU, each user follows ~200 people on average.
  • Availability: The feed should always load — showing a slightly stale feed is better than showing nothing.
User posts content into the system

The Core Challenge: Fan-Out

When Alice posts something, all her followers need to see it. Getting that post into everyone's feed is called fan-out. There are two opposite approaches, and choosing between them is the key architectural decision.

Fan-Out on Write (Push Model)

When Alice posts, immediately push the post to the feed cache of every follower. Like a newspaper delivery — the paper is delivered to your doorstep before you wake up.

  • Pros: Reading the feed is lightning fast — the data is pre-computed and waiting. Just fetch from cache.
  • Cons: Writing is expensive. If Alice has 10M followers, that's 10M cache writes for a single post. For celebrities, this is a massive fan-out that takes time and resources.
  • Good for: Users with a moderate number of followers (< 10K).

Fan-Out on Read (Pull Model)

When Bob opens his feed, the system fetches the latest posts from everyone Bob follows and merges them on the fly. Like going to a buffet — you assemble your own plate when you're hungry.

  • Pros: Writing is instant — just store the post. No expensive fan-out.
  • Cons: Reading is slower — you must query hundreds of people's posts and merge them in real-time.
  • Good for: Celebrity/influencer posts where fan-out would be too expensive.
Fan-out on write pre-computes feeds at post time (fast reads, expensive writes). Fan-out on read assembles feeds at view time (cheap writes, slower reads).
Fan-out on write: push to every follower's cache
Fan-out on read: fetch and merge at read time
Note: The hybrid approach is what most real systems use: Fan-out on write for regular users (fast reads, manageable fan-out). Fan-out on read for celebrities/power users (avoid the 100M-write problem). Instagram and Twitter both use this hybrid model. In interviews, always mention this hybrid approach — it shows you understand the trade-off.
Hybrid approach: write for regular users, read for celebrities

Feed Generation: Fan-Out on Write vs Read

import heapq
import time
from typing import Optional
class FeedService:
def __init__(self, post_store, follow_store, feed_cache, celebrity_threshold=10000):
self.posts = post_store # user_id -> [posts]
self.follows = follow_store # user_id -> [followed_ids]
self.feed_cache = feed_cache # user_id -> [feed items]
self.celebrity_threshold = celebrity_threshold
def create_post(self, user_id: str, content: str) -> dict:
post = {
"post_id": f"{user_id}_{int(time.time() * 1000)}",
"user_id": user_id,
"content": content,
"timestamp": time.time(),
}
# Store the post
self.posts.store(post)
# Fan-out decision: is this user a celebrity?
follower_count = self.follows.get_follower_count(user_id)
if follower_count < self.celebrity_threshold:
# PUSH: Fan-out on write for regular users
followers = self.follows.get_followers(user_id)
for follower_id in followers:
self.feed_cache.prepend(follower_id, post)
# Trim cache to last 500 posts
self.feed_cache.trim(follower_id, max_size=500)
else:
# Celebrity — skip fan-out, will be fetched on read
pass
return post
def get_feed(self, user_id: str, cursor: Optional[str] = None,
limit: int = 20) -> list:
# Step 1: Get pre-computed feed from cache (fan-out on write items)
cached_feed = self.feed_cache.get(user_id, limit=limit, cursor=cursor)
# Step 2: Fetch celebrity posts on-demand (fan-out on read)
celebrity_followings = self.follows.get_celebrity_followings(
user_id, self.celebrity_threshold
)
celebrity_posts = []
for celeb_id in celebrity_followings:
recent = self.posts.get_recent(celeb_id, limit=5)
celebrity_posts.extend(recent)
# Step 3: Merge and sort by timestamp (k-way merge)
all_posts = cached_feed + celebrity_posts
all_posts.sort(key=lambda p: p["timestamp"], reverse=True)
return all_posts[:limit]
# The magic: regular posts are pre-cached (fast!)
# Celebrity posts are fetched on-demand (only a few to merge)
Output
# get_feed returns a merged, sorted list of posts
# ~80% from cache (fast), ~20% from celebrity queries (on-demand)

Feed Ranking and Scoring

A purely chronological feed (newest first) is simple but often not the best experience. What if you haven't checked your feed in 8 hours? The most interesting posts might be buried under hundreds of less relevant ones.

Modern feeds use ranking algorithms that score each post based on signals:

  • Recency: Newer posts score higher. A post from 5 minutes ago beats one from yesterday.
  • Engagement: Posts with lots of likes, comments, and shares score higher — social proof that it's interesting.
  • Relationship: Posts from close friends (people you message, react to, visit often) rank higher than acquaintances.
  • Content type: If you engage more with videos than text, video posts get a boost.
  • Diversity: Avoid showing 5 posts from the same person in a row. Mix it up.

A simple scoring formula might look like:

score = recency_weight × time_decay + engagement_weight × (likes + 2×comments + 3×shares) + relationship_weight × interaction_score

In practice, companies like Facebook and TikTok use complex machine learning models that consider hundreds of signals to predict what each user wants to see.

Media Handling

Posts with images and videos are far more complex than text-only posts:

Image uploads:

  1. Client uploads image to an object store (S3) via a pre-signed URL.
  2. A processing pipeline generates multiple resolutions: thumbnail (150px), medium (600px), full (1200px).
  3. Images are served through a CDN for fast global delivery.
  4. The post record stores the image URL, not the image itself.

Video uploads:

  1. Raw video is uploaded to S3.
  2. A transcoding pipeline (using FFmpeg or a service like AWS MediaConvert) creates multiple quality versions: 360p, 720p, 1080p.
  3. Adaptive bitrate streaming (HLS/DASH) delivers the right quality based on the user's connection speed.
  4. Videos are cached at CDN edge servers worldwide.

Key insight: media processing is asynchronous. The post is created immediately with a "processing" status, and the media becomes available seconds or minutes later. Use a message queue to coordinate this pipeline.

Pagination with Cursor

Users scroll endlessly through their feed. How do you paginate efficiently?

Offset-based (bad): "Give me posts 100-120." But what if 5 new posts were added since you loaded page 1? You'll see duplicates or miss posts. Also, fetching page 500 requires the DB to skip 10,000 rows — slow.

Cursor-based (good): "Give me 20 posts older than this timestamp/ID." The cursor is an opaque token (usually the last post's timestamp or ID) that marks exactly where you left off. New posts don't affect your scroll position.

For the cache, store feeds as sorted sets in Redis (sorted by score/timestamp). Cursor-based pagination becomes a simple ZRANGEBYSCORE query — fast and consistent.

Full architecture: complete news feed system

Cache Strategy

Caching is the backbone of feed performance. Here's the multi-layer approach:

  • Feed cache (Redis): Each user has a pre-computed feed stored as a sorted set. This is the primary read path — feeds load from cache, not from the database. Max 500-800 items per user.
  • Post cache: Individual post data (author, content, media URLs, like count) cached separately. When rendering a feed, fetch post details from this cache.
  • Social graph cache: Who follows whom, cached for the fan-out service.
  • CDN: All media (images, videos) served through CDN edge servers.

For users who haven't logged in recently, their feed cache may be empty (evicted). When they open the app, rebuild the feed on-demand by pulling recent posts from their followings. This "cold start" is slower but only happens once.

Real-Time Updates

When a new post is published, active users should see it without refreshing. Two approaches:

Pull-to-refresh: The user manually refreshes to check for new posts. Simple but not truly real-time. Most social apps use this for the main feed.

Server-push notifications: Use WebSockets (as in a chat system) or Server-Sent Events (SSE) to push a "new posts available" indicator. The client then shows a "3 new posts" banner. Tapping it fetches and inserts the new posts. This avoids jarring auto-insertion while keeping things fresh.

Full real-time insertion (posts appearing as you scroll) is usually reserved for live events or chat-like features, not the main feed — it's disorienting to have your scroll position jump around.

Note: Interview tip: The news feed question tests your understanding of fan-out trade-offs, caching, and ranking. Always structure your answer: (1) clarify requirements, (2) estimate scale, (3) choose fan-out strategy (mention the hybrid), (4) discuss ranking, (5) explain the cache layer, (6) address media handling. Don't forget to mention cursor-based pagination — it shows attention to detail.

Key Metrics

Load feed from cachek = items per page (20)
~1-10 ms\(O(k)\)
Fan-out on write (1K followers)n = follower count
~50-200 ms\(O(n)\)
Fan-out on write (1M followers)Use fan-out on read instead
~minutes\(O(n)\)
Fan-out on read (merge 200 feeds)k-way merge of n followings
~50-200 ms\(O(k \log n)\)
Feed ranking (ML model)Score k candidate posts
~10-50 ms\(O(k)\)
Image processing (3 sizes)Async, queued
~2-10 sec\(O(1)\)
Video transcoding (3 qualities)Async, CPU-intensive
~30-300 sec\(O(duration)\)

Quick check

A celebrity with 50 million followers creates a post. Which fan-out strategy should be used?

Continue reading

Caching
Keep the good stuff close — skip the slow trip
Message Queues
Don't do everything right now — put it in line
Design a Chat System
Real-time messaging for millions — delivered instantly
Min-Heap & Max-HeapData Structure
Heapify, sift up/down
Design a Notification System
The right message, to the right person, at the right time