
Message Queues

Don't do everything right now — put it in line

Why Message Queues?

Imagine a restaurant. When you order, the waiter doesn't stand at the grill waiting for your burger. They write your order on a ticket and clip it to the line. The cook grabs tickets in order and works through them. The waiter is free to take more orders.

That ticket line is a message queue. It decouples the sender (waiter) from the receiver (cook) so they can work at their own pace.

In software, message queues sit between services. Instead of Service A directly calling Service B and waiting for a response, Service A drops a message in the queue and moves on. Service B picks it up whenever it's ready.

This pattern solves three critical problems:

  • Decoupling — Services don't need to know about each other. Service A just sends a message; it doesn't care who processes it.
  • Buffering — If traffic spikes, the queue absorbs the burst (complementing load balancing). Consumers process at their own pace without being overwhelmed.
  • Reliability — If Service B crashes, messages wait safely in the queue until it recovers. No data lost.
[Diagram: direct service-to-service calls vs. adding a queue to decouple services]

Messaging Patterns

Point-to-Point (Queue)

One producer sends a message, and exactly one consumer receives it. Like a task queue — once a worker picks up a task, no other worker gets it. Perfect for job processing, order handling, or any work that should happen exactly once.

Publish/Subscribe (Pub/Sub)

One producer publishes a message to a topic, and all subscribers receive a copy. Like a radio broadcast — everyone tuned to the channel hears the message. Perfect for event notifications, real-time updates, or fan-out processing.

For example, when a user uploads a photo:

  • The image service publishes a "photo-uploaded" event to a topic
  • The thumbnail service subscribes and generates thumbnails
  • The notification service subscribes and alerts followers
  • The analytics service subscribes and logs the event

Each service works independently. Adding a new subscriber (like a moderation service) doesn't require changing the publisher at all.
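The fan-out above can be sketched in a few lines. This is a minimal in-process stand-in for a real broker (Kafka, Redis Pub/Sub, etc.), with illustrative topic and handler names:

```python
from collections import defaultdict

# Minimal in-process pub/sub sketch. In production, a broker holds
# the subscriptions and delivers messages over the network.
subscribers = defaultdict(list)

def subscribe(topic, handler):
    """Register a handler to be called for every message on a topic."""
    subscribers[topic].append(handler)

def publish(topic, event):
    """Deliver a copy of the event to every subscriber of the topic."""
    for handler in subscribers[topic]:
        handler(event)

# Each service registers independently; the publisher never changes.
subscribe("photo-uploaded", lambda e: print(f"[thumbnails] resizing {e['photo_id']}"))
subscribe("photo-uploaded", lambda e: print(f"[notifications] alerting followers of {e['user']}"))
subscribe("photo-uploaded", lambda e: print(f"[analytics] logging upload {e['photo_id']}"))

publish("photo-uploaded", {"photo_id": "p123", "user": "alice"})
```

Adding the moderation service is one more `subscribe` call; `publish` stays untouched.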

[Diagram: the queue buffering a traffic spike]
[Diagram: point-to-point, where the producer sends a message, the queue stores it, and one consumer processes and acknowledges it]
[Diagram: pub/sub fan-out, where one event is delivered to all subscribed services independently]

Producer/Consumer Pattern with a Simple Queue

```python
import json
import time
import threading
from queue import Queue

# Simulating a message queue (in production, use Kafka/RabbitMQ/SQS)
message_queue = Queue()

def producer(orders: list[dict]):
    """Restaurant waiter — takes orders and puts them in the queue."""
    for order in orders:
        message = json.dumps(order)
        message_queue.put(message)
        print(f"[Producer] Order placed: {order['item']}")
        time.sleep(0.1)

def consumer(name: str):
    """Cook — pulls orders from the queue and processes them."""
    while True:
        message = message_queue.get()
        if message == "STOP":
            message_queue.task_done()  # acknowledge the sentinel so join() can't hang
            break
        order = json.loads(message)
        print(f"  [{name}] Cooking: {order['item']} for {order['customer']}")
        time.sleep(0.3)  # Simulate cooking time
        print(f"  [{name}] Done: {order['item']}!")
        message_queue.task_done()

# Start 2 cooks (consumers)
for i in range(2):
    t = threading.Thread(target=consumer, args=(f"Cook-{i+1}",), daemon=True)
    t.start()

# Place orders (producer)
orders = [
    {"item": "Burger", "customer": "Alice"},
    {"item": "Pizza", "customer": "Bob"},
    {"item": "Salad", "customer": "Charlie"},
    {"item": "Taco", "customer": "Diana"},
]
producer(orders)
message_queue.join()  # Wait for all orders to be processed
print("\nAll orders completed!")
```
Output:

```
[Producer] Order placed: Burger
[Producer] Order placed: Pizza
  [Cook-1] Cooking: Burger for Alice
  [Cook-2] Cooking: Pizza for Bob
[Producer] Order placed: Salad
[Producer] Order placed: Taco
  [Cook-1] Done: Burger!
  [Cook-1] Cooking: Salad for Charlie
  [Cook-2] Done: Pizza!
  [Cook-2] Cooking: Taco for Diana
  [Cook-1] Done: Salad!
  [Cook-2] Done: Taco!

All orders completed!
```

Popular Message Brokers

Apache Kafka — A distributed streaming platform built for high throughput. Messages are stored in ordered, immutable logs organized into topics and partitions. Consumers track their position (offset) and can replay messages. Kafka handles millions of messages per second and retains data for days or weeks. Best for: event streaming, log aggregation, real-time analytics.

RabbitMQ — A traditional message broker that excels at complex routing. Supports multiple messaging patterns out of the box: direct, topic, fanout, and headers exchanges. Messages are typically deleted after consumption. Best for: task queues, request-reply patterns, complex routing logic.

Amazon SQS — A fully managed queue service from AWS. No infrastructure to manage. Two flavors: Standard (best-effort ordering, at-least-once delivery) and FIFO (strict ordering, exactly-once processing). Best for: simple cloud workloads, serverless architectures, teams that don't want to manage infrastructure.

Quick comparison:

  • Throughput: Kafka >> RabbitMQ > SQS
  • Message replay: Kafka (yes) vs RabbitMQ/SQS (no, messages deleted after consumption)
  • Complexity: Kafka (high) vs RabbitMQ (medium) vs SQS (low)
  • Managed option: SQS (fully), Kafka (Confluent Cloud, AWS MSK), RabbitMQ (Amazon MQ)

Delivery Guarantees

How many times does a consumer receive each message? This is one of the trickiest problems in distributed systems.

At-most-once: The message is delivered zero or one times. If something goes wrong, the message is lost. Fast but unreliable. Like sending a postcard — it might get there, it might not, you'll never know.

At-least-once: The message is delivered one or more times. If the consumer crashes before acknowledging, the message is redelivered. You might process it twice, so your consumer must be idempotent (processing the same message twice has the same result as once). This is the most common guarantee.

Exactly-once: The holy grail — every message is processed exactly one time. Extremely hard to achieve in distributed systems. Kafka achieves it through a combination of idempotent producers and transactional consumers, but it comes with a performance cost.

In practice, most systems use at-least-once delivery with idempotent consumers. For example, when processing a payment, check if the payment ID has already been processed before charging the card again.

Note: Making your consumer idempotent is the most important thing to get right with message queues. A simple strategy: store a set of processed message IDs. Before processing a message, check if you've seen it before. If yes, skip it. This turns at-least-once delivery into effectively-exactly-once processing.
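The processed-IDs strategy from the note can be sketched as follows. This is a minimal in-memory version (the names `handle_payment` and `processed_ids` are illustrative); production systems would keep the seen-ID set in a database or Redis so it survives restarts:

```python
# Sketch of an idempotent consumer under at-least-once delivery.
processed_ids = set()
charges = []  # the side effect we must not duplicate

def handle_payment(message):
    """Process a payment message; safe to call more than once per message."""
    if message["id"] in processed_ids:
        return "skipped"                   # duplicate delivery: do nothing
    charges.append(message["amount"])      # the real work (charge the card)
    processed_ids.add(message["id"])       # record only after success
    return "processed"

# At-least-once delivery can hand us the same message twice:
msg = {"id": "pay-42", "amount": 19.99}
print(handle_payment(msg))  # processed
print(handle_payment(msg))  # skipped; the card is charged exactly once
```

Note the ordering: the ID is recorded only after the work succeeds, so a crash mid-processing leads to a retry, not a lost payment.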
[Diagram: dead letter queue for failed messages]

Dead Letter Queues

What happens when a message can't be processed? Maybe the data is malformed, or a dependent service is permanently down. If you keep retrying forever, the bad message blocks the entire queue — a poison pill.

The solution: a Dead Letter Queue (DLQ). After a message fails N times (say, 3 retries), it's moved to a separate queue for inspection. The main queue keeps flowing, and an engineer can later examine the DLQ to fix the issue.

DLQs are essential for production systems. They prevent one bad message from bringing your entire pipeline to a halt.
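The retry-then-DLQ flow can be sketched with two plain queues. The names and the retry count are illustrative; managed brokers such as SQS offer this via a redrive policy rather than hand-written code:

```python
from queue import Queue

MAX_RETRIES = 3
main_queue = Queue()
dead_letter_queue = Queue()

def process(message):
    """Stand-in for real work; a malformed payload always fails."""
    if message["payload"] == "malformed":
        raise ValueError("cannot parse payload")

def consume_with_dlq():
    while not main_queue.empty():
        message = main_queue.get()
        try:
            process(message)
        except ValueError:
            message["attempts"] = message.get("attempts", 0) + 1
            if message["attempts"] >= MAX_RETRIES:
                dead_letter_queue.put(message)   # park the poison pill
            else:
                main_queue.put(message)          # requeue for another try

main_queue.put({"payload": "order-1"})
main_queue.put({"payload": "malformed"})
consume_with_dlq()
print(f"DLQ size: {dead_letter_queue.qsize()}")  # DLQ size: 1
```

The good message is processed normally; the bad one is retried three times, then parked, and the main queue keeps flowing.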

Backpressure

What if producers generate messages faster than consumers can process them? The queue grows and grows until you run out of memory or disk. This is a backpressure problem.

Strategies to handle it:

  • Drop messages: When the queue is full, reject new messages. Simple but you lose data. OK for metrics or logs, not OK for orders.
  • Block the producer: Make the producer wait until there's space. This naturally slows down the system but can cause cascading slowdowns.
  • Scale consumers: Automatically add more consumers when the queue depth exceeds a threshold. The most common cloud approach.
  • Set queue limits: Configure a max queue size with an overflow policy (dead letter, oldest-first eviction, etc.).

Monitoring queue depth is critical. If it keeps growing, you either need more consumers or fewer producers. A steadily increasing queue is a ticking time bomb.
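The first two strategies fall out of a bounded queue. A sketch of the drop-on-full variant, with an illustrative `maxsize` (passing `block=True` instead would give the block-the-producer variant):

```python
from queue import Queue, Full

q = Queue(maxsize=3)  # bounded queue: backpressure kicks in when full
dropped = 0

def try_produce(item):
    """Drop-on-full strategy: acceptable for metrics or logs, not for orders."""
    global dropped
    try:
        q.put(item, block=False)  # raises Full instead of waiting
        return True
    except Full:
        dropped += 1
        return False

for i in range(5):  # 5 messages into a queue that holds 3
    try_produce(i)

print(f"queued={q.qsize()} dropped={dropped}")  # queued=3 dropped=2
```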

Note: Interview tip: When designing a system, identify which operations can be asynchronous. Sending a confirmation email after signup? Queue it. Generating a report? Queue it. Processing a video upload? Definitely queue it. This is essential for scalability. Anything that doesn't need an immediate response to the user is a candidate for a message queue.

Key Metrics

  • Kafka write (per msg): ~0.5-2 ms, O(1) (append to log, batched)
  • Kafka read (per msg): ~1-5 ms, O(1) (sequential disk read)
  • Kafka throughput: 1M+ msg/sec per cluster
  • RabbitMQ throughput: ~50K msg/sec per node
  • SQS throughput: ~3K msg/sec per FIFO queue (standard queues are effectively unlimited)
  • Message retention (Kafka): configurable, days to weeks
  • Message retention (SQS): messages auto-deleted after up to 14 days
