Module 07

Scalability
Patterns

Architecture patterns that let systems grow from thousands to billions of users.

Scalability 01

CQRS – Command Query Responsibility Segregation

CQRS separates read and write models. Commands (writes) go to one model optimised for writes. Queries (reads) go to a separate model optimised for reads. This lets you scale reads and writes independently, and optimise each for its purpose.

A bank has tellers for transactions (writes) and an ATM network for balance checks (reads). The systems are separate, scale independently, and are optimised for their specific jobs. The ATM doesn't need the same infrastructure as the teller system.

CQRS Flow

Write side:
  User → Command (CreateOrder) → Command Handler
       → Validates, updates Write DB
       → Emits domain event (OrderCreated)
       → Event updates Read Model (async)

Read side:
  User → Query (GetOrderSummary) → Query Handler
       → Reads from denormalised Read DB (fast!)
       → Returns response immediately

Tradeoff: the read model is eventually consistent (updated asynchronously after writes). There is a lag, from milliseconds to seconds, between a write and the read model reflecting it. Acceptable for most use cases, but not where a read must reflect the latest write, such as enforcing account balances or reserving inventory.
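The flow above fits in a few lines. This is a minimal single-process sketch, not a framework: `write_db`, `read_db`, and the in-memory queue stand in for a real write store, read store, and message bus, and all names are illustrative.

```python
import queue

write_db = {}        # write model (normalised, validated)
read_db = {}         # read model (denormalised, eventually consistent)
bus = queue.Queue()  # stand-in for a real message bus

def handle_create_order(order_id, items):
    """Command handler: validate, update the write DB, emit a domain event."""
    if not items:
        raise ValueError("order must contain at least one item")
    write_db[order_id] = {"items": items}
    bus.put({"type": "OrderCreated", "orderId": order_id, "items": items})

def run_projector():
    """Projector: applies pending events to the read model (async in reality)."""
    while not bus.empty():
        event = bus.get()
        read_db[event["orderId"]] = {"itemCount": len(event["items"])}

def get_order_summary(order_id):
    """Query handler: reads only the denormalised read DB."""
    return read_db.get(order_id)  # may lag behind the write model
```

Until `run_projector()` has applied the event, `get_order_summary` returns nothing for a freshly created order, which is exactly the eventual-consistency lag described above.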

When to Use CQRS

  • Read-to-write ratio is very high (100:1 or more)
  • Read and write workloads have very different scaling needs
  • Complex domain logic on writes, simple projections on reads
  • You need multiple read models (mobile app view vs analytics view)
โฆ

Scalability 02

Event Sourcing

Instead of storing current state (balance = £500), store all events that led to that state (deposited £300, withdrew £100, deposited £300). The current state is derived by replaying events.

Traditional: users table
  id=1, name='Alice', balance=500

Event Sourced: events table
  {userId:1, type:'AccountOpened',  amount:0,   ts: t1}
  {userId:1, type:'MoneyDeposited', amount:300, ts: t2}
  {userId:1, type:'MoneyWithdrawn', amount:100, ts: t3}
  {userId:1, type:'MoneyDeposited', amount:300, ts: t4}
  → replay → balance = 500

Event Sourcing Benefits

  • Complete audit trail – every change recorded
  • Time travel – replay to any point in time
  • Event-driven integration – publish events
  • Easy debugging – see exactly what happened

Event Sourcing Costs

  • Querying current state requires replay (use snapshots)
  • Schema evolution is hard – old events must still be valid
  • Increased storage
  • Mental model shift – unfamiliar to most teams
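The first cost is usually mitigated with snapshots: persist the folded state every N events and replay only the tail. A hedged sketch of the idea, reusing the bank-account event shapes above (the function name and snapshot representation are our own):

```python
def replay_from(snapshot_balance, tail_events):
    """Replay only the events recorded after the snapshot was taken,
    starting from the balance the snapshot captured."""
    balance = snapshot_balance
    for e in tail_events:
        if e["type"] == "MoneyDeposited":
            balance += e["amount"]
        elif e["type"] == "MoneyWithdrawn":
            balance -= e["amount"]
    return balance
```

A snapshot of £300 taken after the first deposit, plus the remaining two events, yields the same £500 as a full replay from t1.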
โฆ

Scalability 03

Geo-Distribution & Multi-Region

Running your application in multiple geographic regions reduces latency for global users and provides disaster recovery. A user in Tokyo gets served from Tokyo, not US-East.

Data Residency Challenge

The hardest part of geo-distribution is data. You want user data close to the user, but you also need global consistency. Strategies:

  • Data locality: Pin user data to their region (EU users' data stays in EU). GDPR compliance + lower latency. Challenge: what if user travels?
  • Global replicated DB: Google Spanner, CockroachDB – replicate across regions with strong consistency. Higher write latency (cross-region round trips), extreme durability.
  • Async cross-region replication: Primary region handles writes, replicates asynchronously to other regions. Low latency writes, possible staleness on reads in non-primary regions.
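The data-locality strategy boils down to a routing lookup: resolve the user's pinned region before touching any data. A minimal sketch; the pinning table, endpoints, and user IDs are all made up for illustration.

```python
# Hypothetical pinning table and per-region endpoints.
USER_HOME_REGION = {
    "u-alice": "eu-west-1",        # EU user: data stays in the EU
    "u-kenji": "ap-northeast-1",   # Tokyo user: served from Tokyo
}
REGION_ENDPOINT = {
    "eu-west-1":      "https://eu.api.example.com",
    "ap-northeast-1": "https://tokyo.api.example.com",
    "us-east-1":      "https://us.api.example.com",
}

def endpoint_for(user_id, default_region="us-east-1"):
    """Route a request to the region that owns this user's data."""
    region = USER_HOME_REGION.get(user_id, default_region)
    return REGION_ENDPOINT[region]
```

Note the travelling-user challenge from the text: a pinned EU user visiting Tokyo is still routed to eu-west-1 and pays the cross-region latency.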

In interviews: "How would you make this globally available?" → Discuss: which data is user-specific (can be local), which is global (e.g., follower counts, trending feeds), and how you'd handle cross-region writes. Latency numbers matter: cross-US ~40ms, US-EU ~80ms, US-Asia ~150ms.

โฆ

Scalability 04

Active-Active vs Active-Passive

Active-Active

  • All nodes serve traffic simultaneously
  • No wasted capacity
  • Immediate failover (no promotion delay)
  • Write conflicts if multiple masters
  • Complex data synchronisation
  • Good for: stateless services, read-heavy workloads

Active-Passive

  • One primary serves traffic; others on standby
  • Standby capacity is wasted (or used for reads)
  • Failover requires promotion (seconds of downtime)
  • No write conflicts – single writer
  • Simpler consistency model
  • Good for: stateful databases, write-heavy workloads
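The active-passive failover decision itself is small; the promotion step is where the seconds of downtime live. A toy sketch of that decision (the class and field names are ours; real systems gate promotion on a lease or quorum to avoid split-brain):

```python
class Node:
    def __init__(self, name, role):
        self.name = name
        self.role = role      # "primary" or "standby"
        self.healthy = True

def check_and_failover(primary, standby):
    """Promote the standby if the primary fails its health check.
    Real systems fence the old primary first to prevent two writers."""
    if primary.healthy:
        return primary        # no change: single writer keeps serving
    primary.role = "standby"  # demote / fence the failed primary
    standby.role = "primary"  # promote: writes are unavailable until this lands
    return standby
```

Contrast with active-active, where there is no promotion step at all: every node already serves traffic, at the cost of conflict handling.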
โฆ

Scalability 05

Bulkhead Pattern

The bulkhead pattern isolates different parts of a system into pools so that if one fails, the others continue. Named after the bulkheads in ships: watertight compartments that prevent one leak from sinking the whole ship.

Without bulkheads: your app has one shared thread pool (200 threads). A slow third-party payment API ties up all 200 threads waiting for responses, leaving none for other requests. The entire app becomes unresponsive – even user login, which doesn't touch payments.

Implementation

  • Thread pool isolation: Separate thread pool per downstream dependency. Payment pool: 50 threads. Email pool: 20 threads. Core app pool: 130 threads. Payment slowdown only exhausts payment pool.
  • Connection pool isolation: Separate DB connection pool for analytics vs. transactional queries.
  • Process isolation: Run risky operations in separate processes/containers. A crash in one doesn't affect others.
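Thread-pool isolation maps directly onto one executor per dependency. A sketch using Python's standard library with the pool sizes from the text; the `submit` wrapper and pool names are our own.

```python
from concurrent.futures import ThreadPoolExecutor

# One isolated pool per downstream dependency (sizes from the text).
POOLS = {
    "payments": ThreadPoolExecutor(max_workers=50),
    "email":    ThreadPoolExecutor(max_workers=20),
    "core":     ThreadPoolExecutor(max_workers=130),
}

def submit(dependency, fn, *args, **kwargs):
    """Run work on the pool owned by its dependency. If the payment API
    hangs, only the 50 payment threads block; core and email keep serving."""
    return POOLS[dependency].submit(fn, *args, **kwargs)
```

A real service would also put a timeout on `.result()` so callers fail fast instead of queueing behind a saturated pool.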
โฆ

Scalability 06

Service Mesh & Sidecars

A service mesh handles cross-cutting concerns (auth, observability, retries, circuit breaking) at the infrastructure layer, not in application code. A sidecar proxy (e.g., Envoy) runs next to each service instance and intercepts all traffic.

  • mTLS: All service-to-service traffic is automatically encrypted and mutually authenticated.
  • Observability: Metrics, traces, and logs collected automatically for every request.
  • Traffic management: Canary deployments, A/B testing, circuit breaking – configured centrally.
  • Examples: Istio (uses Envoy), Linkerd, Consul Connect.
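As one concrete example of centrally configured traffic management, an Istio VirtualService can split traffic for a canary. A sketch, assuming a `payments` service whose `v1` and `v2` subsets are already defined in a DestinationRule:

```yaml
# Hypothetical canary: 90% of traffic to v1, 10% to the new v2.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payments
spec:
  hosts:
    - payments
  http:
    - route:
        - destination:
            host: payments
            subset: v1
          weight: 90
        - destination:
            host: payments
            subset: v2
          weight: 10
```

The application code never changes: shifting the weights rolls the canary forward or back.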

Service meshes add 1–5ms latency per hop (sidecar overhead). For very latency-sensitive systems, this matters. They also add operational complexity. Only adopt one when you genuinely need the capabilities, not just because it's fashionable.