Module 05

Message Queues
& Streaming

Decouple services, absorb traffic spikes, and enable event-driven architectures that scale.

High Impact Deep Concepts

Queues 01

Why Message Queues?

Imagine your payment service directly calls the email service, the analytics service, and the fraud detection service after every purchase. If any of them is slow or down, your payment flow slows down or fails. Message queues break this coupling: the payment service publishes an event, and each downstream service consumes it independently.

A message queue is like a post office. The sender drops a letter in the box and moves on; they don't wait for delivery. The recipient picks up their mail when ready. Both parties operate at their own pace. If the recipient is on holiday, letters pile up and are delivered when they return.

Problems Queues Solve

  • Decoupling: Producer and consumer don't need to know about each other. You can replace the email service without changing the payment service.
  • Traffic absorption: A Black Friday spike hits 100,000 orders/minute while your fulfilment system can only process 10,000/minute. The queue absorbs the spike; fulfilment drains it at its own rate.
  • Reliability: If the consumer crashes, messages are held in the queue. When it recovers, it picks up where it left off. No messages are lost.
  • Fan-out: One payment event triggers email notification, analytics update, inventory deduction, fraud check, and loyalty points, all in parallel.
  • Async processing: Move slow work out of the request path. Resize images, send emails, and generate PDFs asynchronously, after responding to the user.
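To make that last point concrete, here is a minimal sketch of async processing using Python's standard library. This is a toy in-process queue, not a real broker: the request handler enqueues work and responds immediately, while a background worker does the slow part.

```python
import queue
import threading

jobs = queue.Queue()
results = []

def worker():
    # Background worker: drains the queue at its own pace.
    while True:
        job = jobs.get()
        if job is None:                    # sentinel: shut down
            break
        results.append(f"sent email for order {job}")  # the "slow" work
        jobs.task_done()

def handle_purchase(order_id):
    jobs.put(order_id)                     # fire and forget: don't block the user
    return f"order {order_id} accepted"

t = threading.Thread(target=worker)
t.start()
print(handle_purchase(456))                # responds instantly
jobs.put(None)
t.join()
print(results)
```

With a real queue, the worker would live in a separate process or machine, but the shape of the code is the same: enqueue, respond, process later.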
โฆ

Queues 02

Kafka Deep Dive

Kafka is a distributed, partitioned, replicated commit log. It's not a traditional message queue โ€” it's an event streaming platform. Messages are stored durably on disk and can be replayed.

Core Concepts

  • Topic: A named stream of events. Like a database table, but for events. Examples: user-signups, payment-events.
  • Partition: Topics are split into partitions for parallelism. Each partition is an ordered, immutable log. Messages within a partition are strictly ordered.
  • Offset: Each message in a partition has a sequential offset (0, 1, 2, ...). Consumers track their position by storing the last committed offset.
  • Broker: A Kafka server. Brokers form a cluster. Each partition has a leader broker and follower brokers.
  • Producer: Writes events to topics. Can choose which partition (by key hash or round-robin).
  • Consumer: Reads events from topics. Pulls from brokers (not pushed to). Controls its own read pace.
  • ZooKeeper / KRaft: Manages cluster metadata, leader election. Kafka 3.x replaced ZooKeeper with KRaft (built-in Raft consensus).
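Producer-side partition choice can be sketched as hash-mod-N. The hash function below is md5 purely for illustration; Kafka's default partitioner actually uses murmur2, but the principle is the same:

```python
import hashlib

def choose_partition(key: str, num_partitions: int) -> int:
    # Deterministic: the same key always maps to the same partition,
    # which is what preserves per-key ordering.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

p1 = choose_partition("user-42", 4)
p2 = choose_partition("user-42", 4)
print(p1 == p2)  # True: all of user-42's events land on one partition
```

Messages without a key are typically spread round-robin (or sticky-batched) across partitions instead.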

Partition Layout

Topic: payment-events | Partition 0

  off:0    off:1    off:2    off:3    off:4    off:5
  paid     paid     refund   paid     paid     (next write)

Offsets 0-2 consumed (committed offset: 2) | Offsets 3-4 unconsumed | Consumer reads from offset 3 next
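The layout above can be modeled in a few lines. This is a toy in-memory partition, not Kafka's client API, but it shows how offsets and commits interact:

```python
class Partition:
    def __init__(self):
        self.log = []                     # ordered, append-only

    def append(self, event) -> int:
        self.log.append(event)
        return len(self.log) - 1          # the event's offset

class Consumer:
    def __init__(self, partition):
        self.partition = partition
        self.committed = -1               # last offset fully processed

    def poll(self):
        # Read everything after the committed offset.
        start = self.committed + 1
        return list(enumerate(self.partition.log[start:], start))

    def commit(self, offset):
        self.committed = offset

p = Partition()
for e in ["paid", "paid", "refund", "paid", "paid"]:
    p.append(e)

c = Consumer(p)
c.commit(2)                               # offsets 0-2 already processed
print(c.poll())                           # [(3, 'paid'), (4, 'paid')]
```

Because the committed offset is just a number, a crashed consumer can restart and resume from exactly where it left off.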

Why Kafka is Fast

  • Sequential disk writes: Kafka appends to partition logs sequentially. Sequential I/O on modern SSDs is ~500MB/s vs ~100MB/s for random I/O.
  • Zero-copy: Uses OS sendfile() syscall to transfer data from disk to network without copying through user space. Massive CPU saving.
  • Batching: Producers and brokers batch messages. Fewer, larger writes. Consumers read in batches.
  • Compression: Entire batches compressed together (lz4, snappy, zstd). Better compression ratio than per-message.
  • Pull-based consumers: Consumers pull when ready. No broker needs to track consumer state for throttling.
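Batching is easy to sketch: buffer messages and hand them to the I/O layer as one larger write. This toy version flushes on size only; real producers (e.g. Kafka's linger.ms setting) also flush on a time limit so small batches don't wait forever:

```python
class BatchingProducer:
    def __init__(self, batch_size, send):
        self.batch_size = batch_size
        self.send = send                  # callback that does the actual I/O
        self.buffer = []

    def produce(self, msg):
        self.buffer.append(msg)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send(self.buffer)        # one write carries many messages
            self.buffer = []

writes = []
producer = BatchingProducer(batch_size=3, send=writes.append)
for i in range(7):
    producer.produce(f"msg-{i}")
producer.flush()                          # drain the partial last batch
print([len(b) for b in writes])           # [3, 3, 1]
```

Seven messages became three writes; at scale this is the difference between thousands of small syscalls and a handful of large ones.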
โฆ

Queues 03

Delivery Guarantees

This is one of the most important, and most misunderstood, concepts in distributed systems. Every messaging system must choose how it handles failures.

At-Most-Once


Messages are delivered zero or one times: fire and forget. If the consumer crashes after receiving but before processing, the message is lost. Use when occasional data loss is acceptable: metrics collection, logging, analytics where losing 0.01% of events doesn't matter. Fastest option, since there is no retry overhead.

Producer: send message → no acknowledgement needed
Consumer: receive → ack → process
          (if crash between ack and process: message lost)

At-Least-Once


Messages are delivered one or more times. Retried until acknowledged. If consumer crashes before acking, message is redelivered. Consumer may process the same message multiple times. Use when: you can't lose messages but can tolerate duplicates. Requires: idempotent consumers. This is the most common setting in production.

Producer: send → wait for ack → retry if no ack (may duplicate)
Consumer: receive → process → ack
          (if crash before ack: redelivered, processed twice)

Solution: make processing idempotent
  e.g., "process order:456" twice has same result as once

Use a deduplication table: store processed message IDs. Before processing, check if ID exists. If yes, skip. This converts at-least-once into effectively-exactly-once.
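A minimal sketch of that deduplication table, using an in-memory set where production code would use a database table or Redis:

```python
processed_ids = set()          # in production: a DB table or Redis set
ledger = []                    # stand-in for the real side effect

def handle(message):
    msg_id = message["id"]
    if msg_id in processed_ids:
        return "skipped duplicate"       # redelivery is now harmless
    ledger.append(message["payload"])    # do the side effect once
    processed_ids.add(msg_id)
    return "processed"

msg = {"id": "order:456", "payload": "charge $10"}
print(handle(msg))   # processed
print(handle(msg))   # skipped duplicate
print(ledger)        # ['charge $10']: the charge happened exactly once
```

One subtlety: the side effect and the dedup-table write should happen in the same transaction, otherwise a crash between them reintroduces the duplicate.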

Exactly-Once


Messages are delivered and processed exactly once. The holy grail, and the most expensive to achieve. Requires coordination between producer, broker, and consumer. Kafka has supported exactly-once semantics (EOS) via transactions and idempotent producers since version 0.11.

Exactly-once in Kafka requires: idempotent producer (dedup on broker), Kafka transactions (atomic write across partitions), and transactional consumers (read-process-write as an atomic unit). Significant performance overhead (~20% lower throughput). Only use when business requirements truly demand it.
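The idempotent-producer ingredient can be sketched as broker-side dedup keyed on (producer id, sequence number). This is a simplified model of the idea, not Kafka's actual protocol:

```python
class Broker:
    def __init__(self):
        self.log = []
        self.last_seq = {}                # producer_id -> highest sequence seen

    def append(self, producer_id, sequence, event):
        # A retried send reuses its sequence number, so the broker
        # can recognise and drop it instead of writing a duplicate.
        if self.last_seq.get(producer_id, -1) >= sequence:
            return False                  # duplicate of an already-stored send
        self.log.append(event)
        self.last_seq[producer_id] = sequence
        return True

broker = Broker()
broker.append("p1", 0, "paid")
broker.append("p1", 1, "refund")
broker.append("p1", 1, "refund")          # retry after a lost ack: deduped
print(broker.log)                         # ['paid', 'refund']
```

This removes duplicates on the write path; transactions and transactional consumers are still needed to make the read-process-write cycle atomic end to end.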

โฆ

Queues 04

Consumer Groups & Scaling

A consumer group is a set of consumers that cooperatively consume a topic. Each partition is assigned to exactly one consumer in the group. This enables horizontal scaling of consumers.

Topic: orders (4 partitions)
Consumer Group: order-processor (3 consumers)

Partition 0 → Consumer A
Partition 1 → Consumer B
Partition 2 → Consumer C
Partition 3 → Consumer A   (A gets 2 partitions since 4 partitions > 3 consumers)

Adding Consumer D: partition 3 moves from A to D (rebalance)
Removing Consumer B: partition 1 moves to C or D (rebalance)

Key rule: You cannot have more consumers in a group than partitions. Extra consumers sit idle. If you need 10x throughput, create 10 partitions. Plan partitions at topic creation โ€” changing partition count on an existing topic requires rebalancing and can break ordering guarantees.
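Round-robin assignment of partitions to group members can be sketched as follows. This is a toy assignor; Kafka ships several real strategies (range, round-robin, cooperative-sticky):

```python
def assign(partitions, consumers):
    # Each partition goes to exactly one consumer in the group;
    # when there are more partitions than consumers, some get extras.
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

print(assign([0, 1, 2, 3], ["A", "B", "C"]))
# {'A': [0, 3], 'B': [1], 'C': [2]}  -> A gets two, matching 4 > 3

# Adding consumer D triggers a rebalance: partition 3 moves off A.
print(assign([0, 1, 2, 3], ["A", "B", "C", "D"]))
# {'A': [0], 'B': [1], 'C': [2], 'D': [3]}
```

It also shows the key rule directly: with a fifth consumer and only four partitions, one consumer's list would simply be empty.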

Multiple Consumer Groups

Multiple independent consumer groups can read the same topic. Each group maintains its own offset โ€” so the analytics service and the email service can both read the same payment-events topic independently, at their own pace, without interfering.
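A sketch of why groups don't interfere: each group keeps its own position in the same log (toy in-memory version; real offsets live in Kafka's __consumer_offsets topic):

```python
log = ["paid", "paid", "refund", "paid"]
group_offsets = {"analytics": 0, "email": 0}   # next offset to read, per group

def poll(group, max_records=2):
    start = group_offsets[group]
    records = log[start:start + max_records]
    group_offsets[group] = start + len(records)  # commit this group's position
    return records

print(poll("analytics"))      # ['paid', 'paid']
print(poll("analytics"))      # ['refund', 'paid']
print(poll("email"))          # ['paid', 'paid']: unaffected by analytics
```

Because the log is never consumed destructively, a brand-new group can even start from offset 0 and replay history.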

โฆ

Queues 05

Dead Letter Queues (DLQ)

A Dead Letter Queue is a holding area for messages that failed processing after N retries. Instead of blocking the entire queue on a "poison pill" message, bad messages are moved aside for manual inspection.

Message → Consumer → fail ×3 → DLQ → Alert

Without a DLQ: a malformed message causes the consumer to crash. The message is redelivered, the consumer crashes again, and the cycle repeats forever. The queue is blocked and no other messages are processed. The DLQ breaks this cycle: bad messages are quarantined while normal processing continues.

DLQ Best Practices

  • Alert on messages arriving in DLQ (PagerDuty, Slack)
  • Store enough context to understand why it failed (original message, error stack trace, retry count)
  • Build a replay mechanism: fix the bug, then replay DLQ messages
  • Set a retention period on the DLQ (7-30 days)
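The retry-then-quarantine flow can be sketched as a wrapper around the message handler. Names like dead_letters and MAX_RETRIES are illustrative, not from any particular library:

```python
MAX_RETRIES = 3
dead_letters = []

def consume(message, handler):
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return handler(message)
        except Exception as exc:
            last_error = exc
    # Poison pill: quarantine with enough context to debug later.
    dead_letters.append({
        "message": message,
        "error": str(last_error),
        "retries": MAX_RETRIES,
    })

def handler(message):
    return message["amount"] * 2          # raises KeyError on malformed input

consume({"bad": "payload"}, handler)      # malformed: lands in the DLQ
print(dead_letters[0]["error"])           # 'amount' (the missing key)
print(consume({"amount": 5}, handler))    # 10: normal processing continues
```

A real implementation would also add backoff between attempts and publish the dead letter to an actual DLQ topic with an alert attached.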
โฆ

Queues 06

Backpressure

Backpressure is a mechanism to signal an upstream producer to slow down when a downstream consumer can't keep up. Without it, unbounded queues grow until OOM, or slow consumers cause producers to overload.

Think of a garden hose running full blast into a funnel. If you pour faster than the funnel drains, water overflows. Backpressure is putting your thumb partly over the hose opening, slowing the input to match the output capacity.

Backpressure Strategies

  • Bounded queues: Set a max queue size. When full, block the producer (backpressure) or drop messages (load shedding).
  • Rate limiting producers: Slow down ingestion at the source rather than letting queues grow unboundedly.
  • Consumer autoscaling: Monitor queue depth (lag). When lag grows, spin up more consumer instances. Kafka + Kubernetes KEDA can do this automatically.
  • Reactive streams: Pull-based consumption where consumers request N items when ready. Producers only send what consumers can handle.
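The bounded-queue strategy from the first bullet can be sketched with Python's standard queue module, here showing the load-shedding variant:

```python
import queue

q = queue.Queue(maxsize=3)    # bounded: the queue cannot grow without limit
shed = []

def produce(msg):
    try:
        q.put_nowait(msg)     # load shedding: drop (record) when full
    except queue.Full:
        shed.append(msg)

for i in range(5):
    produce(i)

print(q.qsize(), shed)        # 3 [3, 4]: two messages were shed
# Blocking variant: q.put(msg, timeout=1) would make the producer wait
# (true backpressure) instead of dropping.
```

Which variant to choose depends on the data: shed metrics freely, but block (or buffer durably) for payments.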
โฆ

Queues 07

RabbitMQ vs Kafka

RabbitMQ: Smart Broker

  • Traditional message broker (AMQP)
  • Messages deleted after consumption
  • Complex routing (exchanges, bindings)
  • Push-based delivery to consumers
  • Per-message acknowledgement
  • Good for: task queues, RPC, complex routing
  • Throughput: ~50k msg/sec

Kafka: Dumb Broker, Smart Consumer

  • Distributed log (event streaming)
  • Messages retained for days/weeks (replayable)
  • Simple topic/partition routing
  • Pull-based consumption
  • Offset-based acknowledgement
  • Good for: event streaming, audit logs, replay
  • Throughput: millions msg/sec

Choose RabbitMQ when: you need complex routing, message priorities, per-message acknowledgement, and messages can be deleted after processing. Choose Kafka when: you need replay, high throughput, multiple independent consumers reading the same events, or event sourcing. In practice, many teams use both.