Module 10

Real System Design
Interviews

20 complete system designs, each with requirements, capacity estimation, architecture, database design, scaling strategy, and tradeoff analysis.

How to use this module: First, spend 35 minutes designing each system yourself without looking at the solution. Then compare. The thinking process matters more than the answer. Interviewers are evaluating your reasoning, not whether you got the "right" answer.

Covered systems: URL Shortener · YouTube · WhatsApp · Uber · Instagram · Twitter/X · Netflix · Google Drive · Spotify · Discord · Notification System · Search Engine · Payment System · News Feed · Chat System · Rate Limiter · Web Crawler · Pastebin · Ticketmaster · Zoom

Design a URL Shortener (bit.ly / TinyURL)

Classic starter problem · Hashing · Redirects · High read traffic


Functional Requirements

  • Given a long URL, return a short URL
  • Redirect short URL to original long URL
  • Custom aliases (optional)
  • Expiry time per URL
  • Analytics: click count, geo, referrer

Non-Functional Requirements

  • 100M URLs created per day
  • 10:1 read:write ratio
  • URLs must be unique
  • Redirect latency < 10ms
  • 99.99% availability
  • Short URLs: 7 characters

Capacity Estimation

Write QPS: 100M / 86,400 ≈ 1,200 writes/sec
Read QPS: 10 × 1,200 = 12,000 reads/sec
Storage: 100M/day × 365 × 10 years ≈ 365B URLs
Each URL ≈ 500 bytes → 365B × 500 bytes ≈ 182 TB

Core Design: Short URL Generation

Option 1 - Hash the URL: MD5(longURL) → base62-encode the digest → take the first 7 characters. Problem: collisions (different URLs can map to the same 7-character prefix). Handle collisions by appending a suffix and retrying.

Option 2 - Auto-increment ID: DB auto-increment ID → base62 encode. 7 chars of base62 = 62⁷ ≈ 3.5 trillion URLs. Simple, no collisions. Problem: IDs are predictable (enumerable). Mitigate by adding a random offset or bijectively scrambling the ID bits before encoding.

Option 3 - Key Generation Service (KGS): pre-generate 7-char keys in advance, store them in a Key DB, and assign one per request. Eliminates the hotspot on ID generation and needs no collision handling at request time. (Twitter's Snowflake solves a related problem: generating unique, time-sortable 64-bit IDs across machines without coordination.) Used in production.
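A minimal sketch of the base62 encoding step from Option 2, assuming the integer ID comes from a database sequence (the alphabet ordering here is an arbitrary choice):

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

# Example: encode an auto-increment ID; anything above 62^6 (~56.8 billion) yields 7 characters
print(base62_encode(125_000_000_000))   # a 7-character code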

High-Level Architecture

Client → Load Balancer → URL Service (stateless, N instances)

Write path: fetch a pre-generated key from the Key DB → store {shortKey, longURL} in MySQL
Read path:  Redis cache (CDN edge) → on miss → MySQL

Short URL redirect: GET /abc1234 → Redis lookup → 301/302 redirect to longURL
Cache hit rate: ~99% (popular URLs cached at the edge)
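A sketch of the hot redirect path under these assumptions: a Flask handler, a local Redis cache, and a hypothetical db_lookup_long_url() stub standing in for the MySQL query:

import redis
from flask import Flask, abort, redirect

app = Flask(__name__)
cache = redis.Redis(decode_responses=True)

def db_lookup_long_url(short_code: str) -> str | None:
    return None   # stub: query MySQL for the short_code → longURL mapping here

@app.route("/<short_code>")
def follow(short_code: str):
    long_url = cache.get(f"url:{short_code}")
    if long_url is None:
        long_url = db_lookup_long_url(short_code)
        if long_url is None:
            abort(404)
        cache.set(f"url:{short_code}", long_url, ex=3600)   # cache hot URLs for an hour
    # 302 keeps analytics on your servers; 301 offloads repeat hits to browsers
    return redirect(long_url, code=302)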

Key Design Decisions

  • 301 vs 302 redirect: 301 (permanent) is cached by browsers, which reduces server load but loses analytics (browsers don't re-request). 302 (temporary) hits your server every time, giving better analytics. Choose based on requirements.
  • Database: MySQL with key-value access pattern. Could use Cassandra for global scale. Redis for hot URLs (99% of reads will be top 20% of URLs).
  • Analytics: Stream click events to Kafka → Flink/Spark → analytics DB (ClickHouse/BigQuery). Don't put analytics logic in the hot redirect path.
  • Custom aliases: Store in same table, check uniqueness on create. Rate limit per user to prevent squatting.

Design YouTube / Netflix Video Streaming

Video upload · Transcoding pipeline · CDN streaming · Adaptive bitrate


Functional Requirements

  • Upload videos (up to 10GB)
  • Stream videos globally
  • Multiple resolutions (360p-4K)
  • Recommendations, search, comments
  • Like/dislike, subscriptions

Scale

  • 2 billion users, 500 hours/min uploaded
  • 1 billion hours watched per day
  • Read:write ≈ 1000:1
  • Peak: 10M concurrent viewers

Video Upload & Processing Pipeline

User → Upload Service → raw video in blob storage (S3)
        ↓
Message Queue (Kafka/SQS)
        ↓
Transcoding Workers (many, in parallel)
  ├── 360p  → CDN
  ├── 720p  → CDN
  ├── 1080p → CDN
  └── 4K    → CDN
        ↓
Metadata DB: {videoId, title, userId, urls[], status}
        ↓
Notify user: "Video ready"

Adaptive Bitrate Streaming (ABR)

Videos are split into 2-10 second segments. The player downloads a manifest file (HLS/DASH) listing all available quality levels. Based on measured network speed, it switches between qualities mid-stream: if bandwidth drops, it steps down to 360p; if it improves, it steps back up to 1080p, seamlessly.
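A toy version of the player-side decision, with an assumed bitrate ladder (real players such as hls.js or ExoPlayer also smooth their bandwidth estimates and factor in buffer level):

RENDITIONS = [           # (name, bitrate in bits/sec) - illustrative ladder
    ("360p", 1_000_000),
    ("720p", 3_000_000),
    ("1080p", 6_000_000),
    ("4k", 16_000_000),
]

def pick_rendition(measured_bps: float, safety: float = 0.8) -> str:
    """Pick the highest rendition whose bitrate fits inside the measured bandwidth."""
    usable = measured_bps * safety        # leave headroom so the buffer doesn't drain
    best = RENDITIONS[0][0]
    for name, bitrate in RENDITIONS:
        if bitrate <= usable:
            best = name
    return best

print(pick_rendition(5_200_000))   # "720p": only 4.16 Mbps of usable bandwidth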

CDN Strategy

99% of views are on videos uploaded >24h ago. Pre-warm CDN edge nodes with popular videos. New video uploads go to origin, then are pushed/pulled to CDN as popularity grows. Long-tail videos stay on origin storage (rarely accessed, no need to cache everywhere).

Database Design

Videos:   {videoId, userId, title, description, thumbnail_url, segment_urls[], duration, status, created_at}
Users:    {userId, name, channel_name, subscriber_count}
Comments: {commentId, videoId, userId, text, likes, created_at} → Cassandra (write-heavy)
Likes:    {videoId, userId} → Redis sorted set for counts
Views:    Kafka → Flink aggregation → analytics store (not a real-time DB)

Key Tradeoffs

  • View count: Don't update the DB on every view. Buffer increments in Redis and flush to the DB every minute (sketch after this list). Accept slight inaccuracy.
  • Recommendations: Offline ML pipeline. Pre-compute recommendations per user. Store in Redis. Don't compute in real-time for each request.
  • Search: Elasticsearch for full-text title/description search. Index updated asynchronously after video is published.
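A hedged sketch of that view-count buffering, with illustrative key names and an assumed DB-API style handle passed in by the flush worker:

import redis

r = redis.Redis(decode_responses=True)

def record_view(video_id: str) -> None:
    r.hincrby("view_buffer", video_id, 1)        # O(1) Redis write, no DB hit on the hot path

def flush_view_buffer(db) -> None:
    """Run every minute from a worker; moves buffered counts into the videos table."""
    buffered = r.hgetall("view_buffer")
    if not buffered:
        return
    r.delete("view_buffer")                      # tiny race window, acceptable for view counts
    for video_id, count in buffered.items():
        db.execute(
            "UPDATE videos SET view_count = view_count + %s WHERE video_id = %s",
            (int(count), video_id),
        )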

Design WhatsApp / Chat System

Real-time messaging · Message delivery guarantees · Group chat · Media sharing


Functional Requirements

  • 1:1 real-time messaging
  • Group chats (up to 1000 members)
  • Message delivery receipts (sent/delivered/read)
  • Media sharing (images, videos, documents)
  • Online presence indicator
  • End-to-end encryption

Scale

  • 2 billion active users
  • 100 billion messages/day ≈ 1.2M msg/sec
  • Each user: ~50 messages/day avg
  • Max message size: 64KB text, 2GB media

Connection Architecture

Client โ†โ†’ WebSocket โ†โ†’ Chat Server (stateful) โ†“ Message Queue (Kafka) โ†“ Message Processor โ”œโ”€โ”€ Persist to Cassandra โ”œโ”€โ”€ Route to recipient's Chat Server โ””โ”€โ”€ Push notification if offline Service Registry (ZooKeeper): User 123 โ†’ connected to Chat Server A:8001 User 456 โ†’ connected to Chat Server B:8003 Routing: User A โ†’ sends to User B Chat Server A looks up: "User B is on Server B" Chat Server A โ†’ publishes to Kafka partition(userId_B) Chat Server B consumes โ†’ delivers to User B's WebSocket

Message Storage

Cassandra table: messages
  PRIMARY KEY (chat_id, message_id)   -- chat_id partitions; message_id (timeuuid) orders within the partition
  Fields: sender_id, content, type (text/image/video), status, created_at

Why Cassandra:
  - Write-heavy (1.2M msg/sec)
  - Append-only (messages are immutable)
  - Time-ordered within a partition
  - Horizontal scale, no single point of failure

Group Messaging at Scale

For a group of 1000 members: when a message is sent, the server must deliver to all 1000 members. Fan-out approaches:

  • Fan-out on write: When message received, immediately deliver to all 1000 members' queues. Low read latency but high write amplification. OK for small groups (<100).
  • Fan-out on read: Store message once. Each member fetches on next poll. Lower write load but higher read latency. Good for large groups.
  • Hybrid: WhatsApp uses a hybrid: deliver immediately to online members (fan-out on write) and store for offline members, who fetch when they reconnect (fan-out on read). See the sketch after this list.
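A toy sketch of that hybrid delivery, using in-memory dicts as stand-ins for the service registry, the WebSocket layer, and the offline store:

online_sessions = {"alice": "chat-server-a", "bob": "chat-server-b"}   # user -> server
offline_inbox: dict[str, list[str]] = {}                               # user -> pending message IDs

def deliver_group_message(message_id: str, members: list[str]) -> None:
    for member in members:
        server = online_sessions.get(member)
        if server is not None:
            print(f"push {message_id} to {member} via {server}")       # fan-out on write
        else:
            offline_inbox.setdefault(member, []).append(message_id)    # fetched on reconnect

deliver_group_message("m1", ["alice", "bob", "carol"])
print(offline_inbox)   # {'carol': ['m1']}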

End-to-End Encryption

WhatsApp uses the Signal Protocol. Each device has a public/private key pair. Messages are encrypted with the recipient's public key, and only the recipient's private key can decrypt them. WhatsApp servers see only encrypted bytes; even they cannot read your messages.


Design Uber / Ride Matching System

Geo-spatial queries · Real-time location tracking · Matching algorithm · Surge pricing


Functional Requirements

  • Rider requests a ride with pickup/dropoff
  • Match rider to nearest available driver
  • Real-time driver location tracking
  • ETA estimation
  • Trip tracking during ride
  • Pricing (surge based on supply/demand)

Scale

  • 5M rides/day globally
  • 3M active drivers
  • Driver location update every 4 seconds
  • 3M × 15 updates/min ≈ 750K location updates/sec

Location Storage - The Core Problem

The challenge: Given a rider's location, find all available drivers within 5km, in milliseconds, updated every 4 seconds, for 3M drivers globally. A naive SQL query WHERE distance(lat, lon, driver_lat, driver_lon) < 5000 requires computing the distance for every driver: O(3M) per request.

Solution: Geohash / S2 cells. Divide the world into a grid. Each cell has an ID. A driver at lat/lon belongs to cell "9q8yy". "Find nearby drivers" becomes "find all drivers in cell 9q8yy and adjacent cells": a simple key lookup.

Redis geospatial index:
  GEOADD active_drivers -73.9857 40.7484 driver_123     (args: longitude latitude member; these coords are NYC)
  GEORADIUS active_drivers -73.9857 40.7484 5 km
    → [driver_123, driver_456, ...]

Location update every 4 seconds per driver:
  GEOADD active_drivers {new_lon} {new_lat} driver_123

That is 750K writes/sec → you need a Redis cluster and pipelined writes.
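A sketch of the same commands from Python; execute_command is used to sidestep differences between client-library geo helpers, and the key name is illustrative:

import redis

r = redis.Redis(decode_responses=True)

def update_driver_location(driver_id: str, lon: float, lat: float) -> None:
    # In production these 750K updates/sec go through pipelines on a Redis cluster.
    r.execute_command("GEOADD", "active_drivers", lon, lat, driver_id)

def nearby_drivers(lon: float, lat: float, radius_km: float = 5.0) -> list[str]:
    return r.execute_command("GEORADIUS", "active_drivers", lon, lat, radius_km, "km", "ASC")

update_driver_location("driver_123", -73.9857, 40.7484)
print(nearby_drivers(-73.9857, 40.7484, 2.0))   # ['driver_123']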

Matching Algorithm

1. Rider requests a ride → Matching Service
2. Get nearby drivers (Redis GEORADIUS, r = 2km; expand to 5km if too few results)
3. Filter: available, correct vehicle type, acceptable rating
4. Rank by: ETA (traffic-aware, not straight-line distance), driver rating, surge zone
5. Send the ride offer to the top driver (5-second timeout)
6. If rejected or timed out → offer to the next driver
7. If accepted → create the trip, notify the rider, start tracking
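A toy version of steps 5-7, the offer loop; send_offer is an illustrative stand-in for pushing the offer to the driver app and waiting up to 5 seconds for a reply:

import random

def send_offer(driver_id: str, timeout_s: int = 5) -> bool:
    """Stand-in for the offer push + acceptance wait."""
    return random.random() < 0.5        # pretend roughly half of drivers accept

def match(ranked_driver_ids: list[str]) -> str | None:
    for driver_id in ranked_driver_ids:
        if send_offer(driver_id):
            return driver_id            # accepted: create the trip, notify the rider
    return None                         # nobody accepted: widen the search radius and retry

print(match(["driver_123", "driver_456", "driver_789"]))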

Surge Pricing

Surge multiplier = f(demand/supply in a geohash cell), computed every 5 minutes. Demand = ride requests; supply = available drivers. If demand/supply > threshold → trigger surge. The multiplier increases until supply meets demand (a higher price attracts more drivers). Implemented as a background job, with the result stored in Redis keyed by geohash.
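A toy surge formula per cell; the threshold, slope, and cap below are assumptions, not Uber's actual numbers:

def surge_multiplier(requests: int, available_drivers: int,
                     threshold: float = 1.2, slope: float = 0.5, cap: float = 3.0) -> float:
    if available_drivers == 0:
        return cap
    ratio = requests / available_drivers          # demand / supply in this geohash cell
    if ratio <= threshold:
        return 1.0                                # no surge
    return min(cap, round(1.0 + (ratio - threshold) * slope, 1))

print(surge_multiplier(requests=90, available_drivers=30))   # 1.9 with these assumed constants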


Design Instagram / Photo Sharing

Media upload · News feed · Follow graph · Stories


Functional Requirements

  • Upload photos/videos with captions
  • Follow/unfollow users
  • Home feed (posts from followed users)
  • Like, comment on posts
  • Stories (24h expiry)
  • Explore/discovery page

Scale

  • 1 billion MAU, 500M DAU
  • 100M photos uploaded/day
  • 4.2B likes/day
  • Read:write ≈ 100:1
  • Celebrity accounts: 100M+ followers

Feed Generation - The Hard Part

Fan-out on write (push model): When a user posts, push to all followers' feed caches immediately. Reading your feed is fast (pre-computed). Problem: a celebrity with 100M followers posting = 100M cache writes. Celebrities can't use the push model.

Fan-out on read (pull model): When you open your feed, fetch the latest posts from everyone you follow. Merge and rank. Problem: if you follow 5000 people, you need 5000 DB reads per feed load. Too slow.

Hybrid (Instagram's actual approach): Regular users: push model (pre-computed feed). Celebrity users: pull model on read, merged with your pre-computed feed. Best of both worlds.
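A toy sketch of that hybrid read path, using in-memory dicts in place of the Redis feed lists and the celebrity post store:

precomputed_feed = {"user_42": [("post_9", 1700000300), ("post_7", 1700000100)]}   # pushed at write time
celebrity_posts  = {"celeb_1": [("post_8", 1700000200)]}                            # pulled at read time
follows_celebs   = {"user_42": ["celeb_1"]}

def home_feed(user_id: str, limit: int = 20) -> list[str]:
    merged = list(precomputed_feed.get(user_id, []))
    for celeb in follows_celebs.get(user_id, []):
        merged.extend(celebrity_posts.get(celeb, []))
    merged.sort(key=lambda item: item[1], reverse=True)    # rank by recency (real ranking is ML-driven)
    return [post_id for post_id, _ in merged[:limit]]

print(home_feed("user_42"))   # ['post_9', 'post_8', 'post_7']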

Architecture

Post upload:
  User → Upload Service → S3 (raw image)
            ↓ async
  Image Processing (resize, push to CDN)
            ↓
  PostgreSQL (metadata) + Cassandra (feed events)
            ↓
  Kafka → Feed Worker → push post IDs to followers' Redis feed lists

Feed read:
  User → Feed Service → Redis (precomputed feed list of post IDs)
            ↓ batch fetch
  PostgreSQL (post metadata, like counts)
            ↓
  CDN URLs for images → response assembled

Follow Graph

Store follow relationships in a graph-like structure. For feed generation, you need: "give me the user IDs of everyone that user 123 follows." Use a column-family DB (Cassandra) or a denormalised Redis set. Avoid JOINs on a large social graph; they don't scale.

๐Ÿฆ

Design Twitter/X

Tweets · Timeline · Trending · Search · Notifications


Functional Requirements

  • Post tweets (280 chars, media)
  • Follow users
  • Home timeline (followed users' tweets)
  • Retweet, like, reply
  • Trending hashtags
  • Search tweets

Scale

  • 300M MAU, 100M DAU
  • 500M tweets/day ≈ 6K tweets/sec
  • Read:write ≈ 1000:1
  • Timeline read: <100ms p99
  • Max followers: Elon Musk ~150M

Timeline Architecture

Tweet posted by user 123 (5K followers):
  → written to the Tweets table (MySQL/Cassandra)
  → Kafka event: {tweetId, authorId, timestamp}
  → Timeline Worker: for each follower_id, push tweetId to their Redis timeline list
      LPUSH timeline:{follower_id} {tweetId}   (keep the last 800 tweets)

User reads the home timeline:
  → LRANGE timeline:{user_id} 0 19   (get 20 tweet IDs from Redis)
  → batch-fetch tweet data from the Tweets cache/DB
  → merge with "celebrity tweets" fetched directly at read time

Trending hashtags:
  → count #hashtag occurrences in real time with Redis ZINCRBY
  → ZREVRANGE trending 0 9 → top 10 trending
  → recompute every 5 minutes, cache the result
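A hedged sketch of those Redis operations with the redis-py client (key names follow the scheme above):

import redis

r = redis.Redis(decode_responses=True)

def fan_out_tweet(tweet_id: str, follower_ids: list[str]) -> None:
    pipe = r.pipeline()
    for follower_id in follower_ids:
        key = f"timeline:{follower_id}"
        pipe.lpush(key, tweet_id)
        pipe.ltrim(key, 0, 799)                 # keep only the newest 800 entries
    pipe.execute()

def read_timeline(user_id: str, count: int = 20) -> list[str]:
    return r.lrange(f"timeline:{user_id}", 0, count - 1)

def record_hashtag(tag: str) -> None:
    r.zincrby("trending", 1, tag)               # ZINCRBY trending 1 {tag}

def top_trending(n: int = 10) -> list[str]:
    return r.zrevrange("trending", 0, n - 1)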

Design Google Drive / Dropbox

File storage · Sync · Sharing · Conflict resolution · Versioning


Functional Requirements

  • Upload/download files (any size)
  • File sync across devices
  • Share files/folders with others
  • File versioning (restore previous versions)
  • Offline access
  • Collaborative editing

Scale

  • 1 billion users, 15GB free storage each
  • Total: 1B × 15GB = 15 EB of data
  • Strong consistency required (no conflicting edits)

File Upload - Block-Level Chunking

Large file upload (e.g., a 4GB video):
  1. Client splits the file into 4MB chunks
  2. For each chunk: hash(chunk) → check whether it is already stored (deduplication)
  3. Upload only the missing chunks to S3 (parallel, resumable)
  4. Once all chunks are uploaded: send a manifest {file_id, chunk_ids[]} to the Metadata DB

Benefits:
  - Resume interrupted uploads (only re-upload the missing chunks)
  - Deduplication: if 1,000 users upload the same PDF, store it once
  - Delta sync: only upload the changed chunks when a file is updated
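A minimal sketch of the client-side chunk hashing; already_stored stands in for the Block DB lookup that drives deduplication and delta sync:

import hashlib
from pathlib import Path

CHUNK_SIZE = 4 * 1024 * 1024   # 4MB chunks

def chunk_hashes(path: Path) -> list[str]:
    hashes = []
    with path.open("rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            hashes.append(hashlib.sha256(chunk).hexdigest())   # content-addressed chunk ID
    return hashes

def chunks_to_upload(path: Path, already_stored: set[str]) -> list[int]:
    """Indexes of the chunks the server does not already have."""
    return [i for i, h in enumerate(chunk_hashes(path)) if h not in already_stored]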

Sync Algorithm

Client maintains local file metadata + last_sync_version.
On change: upload the delta → server stores a new version and broadcasts to the other clients via WebSocket.
Other clients: receive the notification → download the changed chunks → apply them to the local copy.

Conflict resolution (two clients edit the same file offline):
  → both upload their changes
  → server creates a conflict copy: "file (John's copy).docx"
  → the user resolves it manually (Google Docs uses CRDTs for real-time editing, so conflicts don't arise)

Storage Architecture

  • Blob storage: S3/GCS for actual file chunks. Content-addressed by hash.
  • Metadata DB: PostgreSQL for file hierarchy, permissions, version history, user accounts.
  • Block DB: Mapping of chunk hashes to storage locations. Enables deduplication across users.
  • CDN: For frequently accessed files: thumbnails, profile pictures, shared documents.

Design a Notification System

Push · Email · SMS · In-app · Prioritisation · Throttling


Functional Requirements

  • Support push (iOS/Android), email, SMS, in-app
  • User notification preferences
  • Priority levels (critical vs marketing)
  • Deduplication (don't send same notification twice)
  • Scheduled notifications
  • Template management

Scale

  • 10M notifications/day baseline
  • Spikes: 100M/day for breaking news
  • Email SLA: delivered within 1 minute
  • SMS SLA: delivered within 30 seconds
  • Push: delivered within 5 seconds

Architecture

Event Sources (any service) → Notification Service API
        ↓
Notification DB (stores every notification and its status)
        ↓
Kafka topics (partitioned by user_id for ordering):
  ├── push-notifications
  ├── email-notifications
  └── sms-notifications
        ↓
Workers (per channel):
  ├── Push Worker  → APNs (Apple) / FCM (Google)
  ├── Email Worker → SendGrid / SES
  └── SMS Worker   → Twilio / SNS
        ↓
Delivery status → update the Notification DB
Failed → DLQ → retry with exponential backoff

Key Design Decisions

  • Deduplication: Before sending, check Redis: SET sent:{notifId} 1 NX EX 86400. NX = only set if the key does not already exist. If it was already set, skip sending (see the sketch after this list).
  • User preferences: Check opt-out before sending. Cache preferences in Redis to avoid DB lookup per notification.
  • Throttling: Don't send 50 emails in 5 minutes to one user. Use a rate limiter per user per channel per time window.
  • Priority queues: Critical alerts (security, transaction) jump the queue. Marketing notifications are deprioritised during high load.
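A small sketch of that dedup check with redis-py; the key prefix and TTL mirror the deduplication bullet above:

import redis

r = redis.Redis(decode_responses=True)

def should_send(notification_id: str) -> bool:
    # SET ... NX EX: only the first caller gets True; duplicates are blocked for 24h.
    return bool(r.set(f"sent:{notification_id}", 1, nx=True, ex=86_400))

if should_send("notif_123"):
    print("send it")
else:
    print("duplicate, skip")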

Design a Distributed Rate Limiter

Token bucket · Redis · Per-user limits · Multi-region


Requirements

  • Limit requests per user/IP per time window
  • Different limits per API endpoint
  • Return rate limit headers (X-RateLimit-*)
  • Distributed (multiple servers, same limits)
  • Low latency (<5ms overhead)

Scale

  • 10,000 QPS per server
  • 100 server instances
  • Rate limit state must be shared
  • Redis cluster for shared state

Rate-Limit Counter in Redis (Lua)

-- Fixed-window counter, run atomically in Redis as a Lua script
-- (a token bucket is a refinement that smooths bursts; the skeleton is the same).
-- KEYS[1] = "ratelimit:{user_id}:{endpoint}"
-- ARGV[1] = limit (e.g. 100 requests), ARGV[2] = window in seconds (e.g. 60)
local limit  = tonumber(ARGV[1])
local window = tonumber(ARGV[2])

local count = redis.call('INCR', KEYS[1])
if count == 1 then
  redis.call('EXPIRE', KEYS[1], window)   -- start the window on the first request
end
if count > limit then
  return 0  -- rate limited
end
return 1    -- allowed
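A hedged sketch of calling that script from the middleware with redis-py's EVAL; in production you would load it once and use EVALSHA, and the file name here is illustrative:

import redis

r = redis.Redis(decode_responses=True)
RATE_LIMIT_LUA = open("ratelimit.lua").read()   # the script above, saved to a file

def allow_request(user_id: str, endpoint: str, limit: int = 100, window: int = 60) -> bool:
    key = f"ratelimit:{user_id}:{endpoint}"
    return r.eval(RATE_LIMIT_LUA, 1, key, limit, window) == 1

if not allow_request("user_42", "/search"):
    print("429 Too Many Requests")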

Architecture

Client request → API Gateway
        ↓
Rate Limiter Middleware
        ↓
Redis cluster (shared counter per user + endpoint)
  ├── INCR + EXPIRE per window
  └── returns: {allowed: true/false, remaining: N, reset_at: T}
        ↓
If allowed → forward to the backend
If denied  → 429 Too Many Requests + Retry-After: 30

Design Ticketmaster / High-Concurrency Booking

Seat reservation · Race conditions · Virtual queue · Inventory management


Functional Requirements

  • Browse events, view seat map
  • Reserve seats (10-min hold)
  • Purchase (payment + confirmation)
  • No double-booking (critical)
  • Handle traffic spikes (Taylor Swift = 10M concurrent)

The Core Challenge

  • Concert with 50,000 seats
  • 5M people try to buy at 10am
  • Must not oversell even 1 seat
  • Response must be fast (or queue)
  • Seat holds must expire correctly

Seat Reservation - Race Condition Prevention

Option 1 - Pessimistic locking (DB):
  BEGIN TRANSACTION
  SELECT * FROM seats WHERE id = 123 AND status = 'available' FOR UPDATE
  -- lock held: no one else can select this seat
  UPDATE seats SET status = 'held', held_by = user_id, held_until = now() + 600
  COMMIT
  ✓ Correct, but locks queue up under high concurrency

Option 2 - Redis atomic hold:
  SET seat:123 user_456 NX EX 600
  -- NX = only if not exists, EX = expire in 600 seconds
  -- atomic: exactly one user wins the race
  ✓ Fast (sub-ms), expires automatically, no DB lock needed
  ✗ If Redis fails, holds may be lost (mitigate with persistence + replicas)

Option 3 - Virtual queue (Ticketmaster's approach):
  Under extreme load, put users in a virtual queue.
  Issue a queue-position token and show the estimated wait time.
  Process N users at a time from the queue.
  Prevents the stampede and gives fair access.
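A sketch of the Option 2 hold with redis-py; key names and the unlock check are illustrative (a small Lua script would make the release fully atomic):

import redis

r = redis.Redis(decode_responses=True)
HOLD_SECONDS = 600   # 10-minute payment window

def hold_seat(event_id: str, seat_id: str, user_id: str) -> bool:
    key = f"hold:{event_id}:{seat_id}"
    return bool(r.set(key, user_id, nx=True, ex=HOLD_SECONDS))   # exactly one buyer wins

def release_seat(event_id: str, seat_id: str, user_id: str) -> None:
    key = f"hold:{event_id}:{seat_id}"
    if r.get(key) == user_id:       # only the current holder may release early
        r.delete(key)

print(hold_seat("concert_1", "A12", "user_456"))   # True for the first caller
print(hold_seat("concert_1", "A12", "user_789"))   # False: the seat is already held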

Booking Flow

1. User views the event → cached seat map (Redis/CDN)
2. User selects a seat → reserve: Redis SET seat:X user:Y NX EX 600
3. If reserved: proceed to payment (10-minute window)
4. Payment succeeds → atomically release the Redis hold and mark the seat SOLD in the DB
5. Payment fails or times out → the Redis key expires → the seat becomes available again
6. Confirm the ticket → generate the PDF, send the email notification
Interview Strategy

How to Structure Any System Design Answer

The 35-minute framework interviewers use at FAANG:
  Requirements (5 min) → capacity estimation (3 min) → high-level design (10 min) → deep dive on 2-3 components (15 min) → wrap up with tradeoffs, bottlenecks, and improvements (2 min)

Red Flags in Interviews