Module 02

Networking Fundamentals

Understanding the pipes your data travels through – and exactly where latency hides.


Networking 01

TCP vs UDP

Every byte your application sends travels over one of two transport protocols. Choosing the wrong one for your use case is a common architecture mistake.

TCP – Transmission Control Protocol

  • Connection-oriented (3-way handshake)
  • Guaranteed delivery – retransmits lost packets
  • Ordered delivery โ€” data arrives in sequence
  • Flow control & congestion control
  • Higher overhead, higher latency
  • Use: HTTP, databases, file transfer, email

UDP – User Datagram Protocol

  • Connectionless – no handshake
  • No delivery guarantee – packets may be lost
  • No ordering – packets can arrive out of order
  • No flow control
  • Lower overhead, lower latency
  • Use: video streaming, gaming, DNS, VoIP

TCP is like sending a registered letter – you get confirmation of delivery, it arrives in order, and if it's lost, it's resent. UDP is like shouting across a room – fast, but some words might not reach everyone. For live video, a slightly choppy frame is better than waiting for a resend.
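The contrast shows up directly in the sockets API. Below is a minimal Python sketch over loopback (payloads and variable names are illustrative): TCP requires connect()/accept() before any data moves, while UDP's sendto() just fires a datagram with no handshake.

```python
import socket

# TCP: connection-oriented -- the 3-way handshake happens inside connect()
tcp_srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_srv.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
tcp_srv.listen(1)
port = tcp_srv.getsockname()[1]

tcp_cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_cli.connect(("127.0.0.1", port))  # handshake completes here
conn, _ = tcp_srv.accept()
tcp_cli.sendall(b"ordered, reliable")
data = conn.recv(1024)                # delivered reliably, in order

# UDP: connectionless -- no handshake, no delivery guarantee
udp_srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_srv.bind(("127.0.0.1", 0))
udp_cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_cli.sendto(b"fast, best-effort", udp_srv.getsockname())
datagram, _ = udp_srv.recvfrom(1024)  # may be lost on a real network; loopback is safe

for s in (tcp_cli, conn, tcp_srv, udp_cli, udp_srv):
    s.close()
```

Note that the UDP side never "connects": each datagram carries its destination, which is exactly why there is no setup latency and no ordering.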

The TCP 3-Way Handshake

Client                    Server
  |                          |
  |--- SYN (seq=x) -------->|   Client initiates
  |                          |
  |<-- SYN-ACK (seq=y,      |   Server acknowledges
  |    ack=x+1) ------------|
  |                          |
  |--- ACK (ack=y+1) ------>|   Client confirms
  |                          |
  |=== Connection Open ======|
  |                          |

This handshake adds one round-trip time (RTT) before any data flows. At 150ms cross-continent latency, that's 150ms just to open the connection – before a single byte of your HTTP request is sent. This is why HTTP keep-alive and connection pooling matter so much.

When asked "what happens when you open a WebSocket?": TCP handshake (1 RTT) → TLS handshake (1–2 RTT) → HTTP Upgrade request → WebSocket connection established. Total: 3–4 RTTs before real-time messaging starts. This is why proximity to users matters for latency-sensitive apps.
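The RTT budget above is simple arithmetic. A sketch, assuming the 150ms cross-continent RTT from the text and a 1-RTT TLS 1.3 handshake:

```python
RTT_MS = 150  # assumed cross-continent round-trip time

setup_steps = {
    "TCP handshake": 1,           # SYN / SYN-ACK / ACK -> 1 RTT before data
    "TLS 1.3 handshake": 1,       # 2 RTTs if you're stuck on TLS 1.2
    "HTTP Upgrade to WebSocket": 1,
}
total_rtts = sum(setup_steps.values())
total_ms = total_rtts * RTT_MS
print(f"{total_rtts} RTTs = {total_ms} ms before the first real-time message")
```

Nearly half a second of pure waiting, which is why latency-sensitive services place servers close to users.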

โฆ

Networking 02

HTTP/1.1 vs HTTP/2 vs HTTP/3

Each version of HTTP was invented to solve performance bottlenecks of the previous one. Understanding why they were created reveals deep insight into how the web works.

HTTP/1.1 – The Baseline


One request per TCP connection at a time. The connection can be reused (keep-alive), but only one outstanding request is allowed. Problem: loading a page with 100 assets means those requests are serialized on each connection. Browsers work around this by opening 6 parallel connections per domain – a hack, not a solution.

  • Head-of-line blocking: If request #1 stalls, requests #2โ€“6 wait behind it on that connection.
  • No header compression: Same headers (cookies, user-agent) are sent with every request – can be kilobytes of overhead.
  • Text-based protocol: Human-readable but inefficient to parse.
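The header-overhead point is easy to quantify. A back-of-envelope sketch (the header values below are made up for illustration):

```python
# Rough size of headers resent on EVERY HTTP/1.1 request (illustrative values)
headers = {
    "Host": "example.com",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Encoding": "gzip, deflate, br",
    "Cookie": "session=" + "x" * 300,  # one fat session cookie
}
# Each header line on the wire is "Name: value\r\n"
per_request = sum(len(f"{k}: {v}\r\n") for k, v in headers.items())
page_assets = 100
total_kib = per_request * page_assets / 1024
print(f"{per_request} bytes/request -> {total_kib:.1f} KiB of repeated headers "
      f"for a {page_assets}-asset page")
```

Tens of kilobytes of pure repetition per page load – the waste HTTP/2's HPACK compression was designed to eliminate.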

HTTP/2 – Multiplexing


HTTP/2 sends multiple requests over a single TCP connection simultaneously using streams. Each request is a stream; they're multiplexed. This eliminates the need for 6 parallel connections.

  • Multiplexing: Many requests in parallel over one connection. No more 6-connection hack.
  • Header compression (HPACK): Headers are compressed and deduplicated – massive bandwidth saving for repeated requests.
  • Server Push: Server can proactively send resources the client hasn't requested yet (e.g., push CSS before browser parses HTML). Rarely used well in practice.
  • Binary framing: Efficient machine-readable format instead of text.
  • Still TCP: TCP-level head-of-line blocking remains. One lost packet stalls all streams.

HTTP/3 – Built on QUIC (UDP)


HTTP/3 replaces TCP with QUIC – a protocol built on UDP that reimplements TCP's reliability features but eliminates TCP's head-of-line blocking.

  • QUIC solves TCP HOL blocking: Each stream is independent. A lost packet only stalls its stream, not all streams.
  • 0-RTT connection resumption: Reconnecting to a known server can send data immediately – no handshake wait. (0-RTT data is replayable, so it should be limited to idempotent requests.)
  • Built-in TLS 1.3: Connection establishment and encryption happen in parallel.
  • Connection migration: Switching from WiFi to mobile doesn't drop the connection – identified by connection ID, not IP.

QUIC runs on UDP, so firewalls and middleboxes that block UDP traffic break HTTP/3. Many corporate networks block UDP port 443. Clients fall back to HTTP/2 in those cases – always support both.

Feature              HTTP/1.1    HTTP/2            HTTP/3
Multiplexing         No          Yes (TCP)         Yes (QUIC)
HOL Blocking         Yes         At TCP level      No
Header Compression   No          HPACK             QPACK
Transport            TCP         TCP               UDP (QUIC)
0-RTT Reconnect      No          No                Yes
TLS Required         Optional    Effectively yes   Always
โฆ

Networking 03

TLS & HTTPS

TLS (Transport Layer Security) provides encryption, authentication, and integrity for network communication. HTTPS = HTTP over TLS.

TLS 1.3 Handshake (Simplified)

Client                              Server
  |                                    |
  |-- ClientHello (supported ciphers,  |
  |   random, key_share) ------------->|
  |                                    |
  |<-- ServerHello (chosen cipher,     |
  |    random, key_share, cert,        |
  |    Finished) ----------------------|
  |                                    |
  |-- Finished + HTTP Request -------->|   ← 1 RTT only in TLS 1.3!
  |                                    |
  |<-- HTTP Response ------------------|

TLS 1.3 reduced the handshake from 2 RTTs (TLS 1.2) to 1 RTT – and 0 RTTs for session resumption. This was a major latency improvement. Always use TLS 1.3 in new systems.

Where TLS Termination Happens

  • At the load balancer / reverse proxy: Most common. NGINX/ALB decrypts traffic, forwards plain HTTP to backend. Simpler certificate management, backend servers don't need TLS config. Traffic inside your network is unencrypted.
  • End-to-end (mTLS): Traffic is encrypted all the way to the backend service. Required for zero-trust architectures. Each service has a certificate and verifies the other (mutual TLS). More secure, more operational overhead.
โฆ

Networking 04

DNS Resolution

DNS translates human-readable domain names (google.com) into IP addresses (142.250.80.46). Every network request begins with DNS – it's the phone book of the internet.

Resolution Flow

Browser → OS DNS cache?  → Yes: use IP
                         → No:
Browser → Local Resolver (ISP / 8.8.8.8)
  Resolver → Root Name Server (knows TLDs)
  Resolver → TLD Name Server (.com, .net)
  Resolver → Authoritative Name Server (owns google.com records)
  Authoritative → returns IP + TTL
  Resolver → caches + returns to browser
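The resolver's cache-then-expire behaviour can be sketched as a tiny TTL cache (the class name and example IPs are illustrative):

```python
import time

class TTLCache:
    """Minimal sketch of resolver-style caching: entries expire after their TTL."""
    def __init__(self):
        self._store = {}  # name -> (ip, expires_at)

    def put(self, name, ip, ttl_seconds):
        self._store[name] = (ip, time.monotonic() + ttl_seconds)

    def get(self, name):
        entry = self._store.get(name)
        if entry is None:
            return None            # cache miss: walk root -> TLD -> authoritative
        ip, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[name]  # TTL expired: must re-resolve
            return None
        return ip

cache = TTLCache()
cache.put("example.com", "93.184.216.34", ttl_seconds=60)
hit = cache.get("example.com")       # served from cache, no upstream query
cache.put("old.example.com", "10.0.0.1", ttl_seconds=0)
miss = cache.get("old.example.com")  # already expired -> forces re-resolution
```

This is also why DNS changes "propagate" slowly: every resolver between you and the authoritative server holds the old answer until its TTL runs out.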

DNS Record Types

Record   Purpose                              Example
A        IPv4 address                         google.com → 142.250.80.46
AAAA     IPv6 address                         google.com → 2a00:1450:...
CNAME    Alias to another name                www → google.com
MX       Mail server                          gmail.com → smtp.google.com
TXT      Arbitrary text (SPF, verification)   v=spf1 include:...
NS       Name servers for domain              ns1.example.com
SOA      Start of Authority (zone metadata)   Serial, refresh intervals

DNS for System Design

  • DNS load balancing: Return multiple A records for the same domain. Client picks one (often first). Low-overhead global LB, but no health checks – dead servers still get traffic until TTL expires.
  • GeoDNS: Return different IPs based on client's location. Users in Asia get Asia region IPs. Used by CDNs and multi-region apps.
  • Low TTL for failover: Set TTL to 60 seconds for services that need fast failover. Cost: more DNS queries, more load on resolvers.
  • DNS propagation: Changing a DNS record takes up to TTL seconds to propagate. Plan deployments around this.
โฆ

Networking 05

WebSockets

WebSockets provide full-duplex, persistent communication between client and server over a single TCP connection. Unlike HTTP (request โ†’ response), WebSockets allow the server to push data to the client at any time.

HTTP is like a walkie-talkie – one party talks, the other listens, then they switch. A WebSocket is like a phone call – both parties can talk simultaneously, any time, without one needing to "initiate".

Upgrade Handshake

Client → Server:
  GET /chat HTTP/1.1
  Upgrade: websocket
  Connection: Upgrade
  Sec-WebSocket-Key: dGhlIHNhbXBsZQ==

Server → Client:
  HTTP/1.1 101 Switching Protocols
  Upgrade: websocket
  Connection: Upgrade
  Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

// Connection is now a WebSocket. Both sides can send frames anytime.

WebSockets at Scale: The Hard Parts

  • Sticky sessions required: A WebSocket connection is stateful – it lives on one server. Your load balancer must route a client to the same server every time (sticky sessions / IP hash).
  • Horizontal scaling problem: If User A is on Server 1 and User B is on Server 2, how does a message from A reach B? You need a pub/sub broker (Redis Pub/Sub, Kafka) to broadcast between servers.
  • Connection limits: Each open WebSocket consumes a file descriptor, and the default per-process limit on Linux is often just 1024 – raise it with ulimit. (The oft-quoted ~65k figure is the ephemeral port range, which caps connections per client/server address pair, not a server's total.) Use event-driven servers (Node.js, Go), not one thread per connection.
  • Heartbeats: Idle connections are killed by NATs and load balancers after 60–90 seconds. Send ping/pong frames every 30s to keep connections alive.
โฆ

Networking 06

REST APIs

REST (Representational State Transfer) is an architectural style for building APIs over HTTP. It's not a protocol – it's a set of constraints. When followed correctly, REST gives you a predictable, cacheable, scalable interface.

REST Constraints

  • Stateless: Each request contains all information needed. Server holds no client state. Enables horizontal scaling.
  • Uniform interface: Resources identified by URLs. Standard HTTP verbs (GET, POST, PUT, DELETE, PATCH). Standard status codes.
  • Cacheable: GET responses can be cached by browsers, CDNs, proxies. Must be explicit about cache headers.
  • Client-Server: Clear separation of concerns. UI and data storage evolve independently.

HTTP Status Codes That Matter

Code                       Meaning             When to Use
200 OK                     Success             Successful GET, PUT, PATCH
201 Created                Resource created    Successful POST
204 No Content             Success, no body    Successful DELETE
400 Bad Request            Invalid input       Validation errors
401 Unauthorized           Not authenticated   Missing/invalid token
403 Forbidden              Not authorized      Valid token, wrong permissions
404 Not Found              Resource missing    ID doesn't exist
409 Conflict               State conflict      Duplicate create, optimistic lock fail
429 Too Many Requests      Rate limited        Rate limit exceeded
500 Internal Server Error  Server bug          Unhandled exceptions
503 Service Unavailable    Overloaded/down     Circuit open, health check fail
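One way to keep these codes consistent across an API is a single mapping from domain errors to statuses. A framework-agnostic sketch (the exception names here are hypothetical):

```python
# Hypothetical domain errors mapped to HTTP status codes -- a sketch,
# not tied to any particular web framework.
class NotAuthenticated(Exception): pass
class Forbidden(Exception): pass
class NotFound(Exception): pass
class Conflict(Exception): pass

STATUS_BY_ERROR = {
    NotAuthenticated: 401,
    Forbidden: 403,
    NotFound: 404,
    Conflict: 409,   # duplicate create, optimistic lock failure
    ValueError: 400, # validation failure
}

def status_for(exc: Exception) -> int:
    """Translate a raised exception into an HTTP status code."""
    for err_type, code in STATUS_BY_ERROR.items():
        if isinstance(exc, err_type):
            return code
    return 500  # anything unhandled is a server bug

print(status_for(NotFound()))      # 404
print(status_for(RuntimeError()))  # 500
```

Centralizing the mapping means handlers raise domain errors and one layer decides the wire-level response – no ad-hoc status codes scattered through the codebase.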
โฆ

Networking 07

GraphQL

GraphQL is a query language for APIs. Instead of the server deciding what data to return, the client specifies exactly what it needs. This eliminates over-fetching and under-fetching.

REST over-fetching problem

  • GET /users/123 returns 40 fields
  • You need 3 fields
  • 37 fields are wasted bandwidth
  • Mobile clients on 3G are hurt most

REST under-fetching (N+1)

  • GET /users returns 20 users
  • For each user, GET /users/id/posts
  • = 21 HTTP requests for one screen
  • Each is an extra RTT
# GraphQL: client asks for exactly what it needs
query {
  user(id: "123") {
    name
    email
    posts(last: 5) {
      title
      createdAt
    }
  }
}
# One request → exactly the data you asked for

GraphQL tradeoffs: More complex caching (POST requests, dynamic queries → harder to CDN cache). N+1 queries on the server side without DataLoader. Schema complexity grows over time. Not always the right choice – REST is simpler for internal APIs. GraphQL shines for mobile apps and when multiple clients need different data shapes from the same API.
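The DataLoader idea boils down to "collect ids during a resolution pass, then fetch once". A minimal sketch with an in-memory stand-in for the database (all names are illustrative):

```python
# DataLoader-style batching to avoid server-side N+1 queries.
FAKE_DB = {1: "alice", 2: "bob", 3: "carol"}
QUERY_LOG = []  # records one entry per "database query" issued

def batch_fetch_users(ids):
    QUERY_LOG.append(tuple(ids))  # one batched query instead of N
    return {i: FAKE_DB[i] for i in ids}

class UserLoader:
    def __init__(self):
        self._pending = []

    def load(self, user_id):
        # Resolvers call this one id at a time; we only enqueue.
        self._pending.append(user_id)

    def dispatch(self):
        # Deduplicate and fetch everything in a single round trip.
        results = batch_fetch_users(sorted(set(self._pending)))
        self._pending.clear()
        return results

loader = UserLoader()
for uid in [1, 2, 2, 3]:      # resolvers request users individually
    loader.load(uid)
users = loader.dispatch()     # one batched query, not four
print(len(QUERY_LOG))         # 1
```

Real DataLoader implementations do the same thing per request tick (and add per-request caching), but the core trick is exactly this enqueue-then-batch step.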

โฆ

Networking 08

gRPC

gRPC is a high-performance RPC framework from Google. It uses HTTP/2 as its transport and Protocol Buffers for serialisation, and is designed for internal service-to-service communication.

gRPC Advantages

  • Much faster than JSON/REST (compact binary Protobuf; benchmarks often cite up to ~10x for serialisation)
  • Strongly typed schema (proto files)
  • Auto-generated client SDKs in any language
  • Streaming support (client, server, bidirectional)
  • HTTP/2 multiplexing

gRPC Disadvantages

  • Not human-readable (binary)
  • Browser support requires gRPC-Web proxy
  • Harder to debug without tooling
  • Schema evolution needs care
  • Not appropriate for public APIs

In system design: use REST or GraphQL for external/public APIs (developer-friendly, cacheable, browser-native). Use gRPC for internal microservice communication (performance, type safety, streaming). This is what Google, Netflix, and Uber do.

โฆ

Networking 09

Connection Pooling

Opening a TCP connection plus a TLS handshake takes 2–3 RTTs (one for TCP, one to two for TLS) and significant CPU. Creating a database connection is even more expensive (auth, session setup). Connection pooling reuses existing connections instead of creating a new one per request.

Classic failure: An application opens a new DB connection per request. At 1000 req/s, it tries to open 1000 DB connections. PostgreSQL has a default max of 100. Connections are refused. The app fails. A connection pool of 20–50 connections handles this easily – requests wait briefly in the pool queue rather than failing.

Sizing a Connection Pool

A common formula: pool_size = (core_count * 2) + effective_spindle_count. More connections don't always mean more throughput – too many connections cause context switching overhead. Start small (10–20), measure, then tune.
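A pool is conceptually just a bounded queue of pre-opened connections. A minimal sketch, with a fake connect function standing in for a real database driver:

```python
import queue

class ConnectionPool:
    """Minimal blocking pool sketch; `factory` stands in for a driver's connect()."""
    def __init__(self, factory, size):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())  # connections created once, up front

    def acquire(self, timeout=5.0):
        # Blocks briefly if all connections are in use, instead of opening
        # a new connection (and blowing past the database's max).
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

made = []  # track every connection ever created
def fake_connect():
    made.append(object())
    return made[-1]

pool = ConnectionPool(fake_connect, size=3)
c1 = pool.acquire()
pool.release(c1)
c2 = pool.acquire()   # reuses an existing connection, no new connect
print(len(made))      # 3 connections ever created, regardless of request count
```

Under load, requests that arrive while all connections are busy simply wait in `acquire()` – exactly the "wait briefly in the pool queue" behaviour described above.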

โฆ

Networking 10

Long Polling & Server-Sent Events

Long Polling

  • Client sends request, server holds it open
  • Server responds when data is available
  • Client immediately re-connects after response
  • Works through all firewalls (plain HTTP)
  • Higher latency than WebSockets
  • Good for: chat fallback, notifications

Server-Sent Events (SSE)

  • One-way: server โ†’ client only
  • Single long-lived HTTP connection
  • Auto-reconnect built in
  • Text-based (easy to debug)
  • Works through HTTP/2 multiplexing
  • Good for: live dashboards, stock tickers, notifications

Choosing the right real-time protocol: SSE for one-way server push (simpler), WebSockets for bidirectional (chat, gaming, collaboration), Long Polling as a fallback when WebSockets are blocked. Many production systems (Slack, GitHub) use WebSockets with Long Polling fallback.
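SSE's wire format is plain text: optional id: and event: fields, one or more data: lines, and a blank line terminating each event. A small formatter sketch (the payload is illustrative):

```python
def sse_event(data, event=None, event_id=None):
    """Format one Server-Sent Events frame for a text/event-stream response."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")   # lets clients resume via Last-Event-ID
    if event is not None:
        lines.append(f"event: {event}")
    for chunk in data.splitlines() or [""]:
        lines.append(f"data: {chunk}")    # multi-line data becomes multiple data: fields
    return "\n".join(lines) + "\n\n"      # blank line terminates the event

frame = sse_event('{"price": 101.5}', event="tick", event_id="42")
print(frame)
```

The id: field is what powers SSE's built-in auto-reconnect: after a drop, the browser resends the last id in a Last-Event-ID header so the server can resume the stream.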

Module 02 Quiz

Test Your Networking Knowledge

Scenario-based questions. Select the best answer – then read the explanation.

Q1. Your API uses HTTP/1.1. Users complain about slow page loads – the browser makes 40 parallel requests to load a page. You switch to HTTP/2. What is the PRIMARY improvement?

Q2. You're building a multiplayer game that sends 60 position updates per second per player. Which protocol should you use?

Q3. A client profiling tool shows: DNS 50ms, TCP connect 80ms, TLS handshake 160ms, server processing 2,600ms, transfer 110ms – total 3s. Where do you focus optimization?

Q4. A response arrives with Cache-Control: no-cache. What does this actually mean?

Q5. You need real-time server→client notifications (new messages, alerts) for 500K concurrent users. Lowest server resource footprint?

Q6. Your gRPC service handles 50K RPC calls/second. New requirement: add a browser client. Browsers cannot use gRPC natively. Standard solution?