```mermaid
flowchart LR
A["<b>DHCP</b><br/>Get IP address,<br/>router IP, DNS IP<br/><i>→ broadcast on LAN</i>"]
B["<b>ARP</b><br/>Resolve router's<br/>MAC address<br/><i>→ broadcast query</i>"]
C["<b>DNS</b><br/>Resolve hostname<br/>to IP address<br/><i>→ hierarchical lookup</i>"]
D["<b>TCP</b><br/>Establish reliable<br/>connection<br/><i>→ 3-way handshake</i>"]
E["<b>HTTP</b><br/>Fetch web page<br/>content<br/><i>→ GET request</i>"]
A --> B --> C --> D --> E
style A fill:#4477AA,color:#fff,stroke:#4477AA
style B fill:#66CCEE,color:#000,stroke:#66CCEE
style C fill:#228833,color:#fff,stroke:#228833
style D fill:#CCBB44,color:#000,stroke:#CCBB44
style E fill:#EE6677,color:#fff,stroke:#EE6677
```
2 First Principles for Networked Systems
2.1 The Question Behind Every System
In October 1986, throughput on some Internet paths dropped by a factor of 1000. TCP — the protocol that had worked reliably for over a decade — was the cause. The engineers who built TCP had answered three structural questions well but left a fourth unanswered. For ten years, this gap remained invisible. Then the Internet grew, and the missing answer became a catastrophe. This chapter teaches you what those four questions are, why every networked system must answer all of them, and how to use them as a design tool that goes beyond description.
In an introductory networking course, you learn how individual protocols work: TCP delivers bytes reliably, DNS resolves names, HTTP fetches web pages. Each protocol has its own rules, its own header format, its own behavior. What becomes visible at a deeper level is that every one of these protocols is a networked system — a collection of components that must make design choices about the same four questions:
- What information should the system maintain? (State)
- When do things happen? (Time)
- Who makes decisions? (Coordination)
- How do components talk to each other? (Interface)
These four questions define the design space of networked systems. They are structural invariants: irreducible questions that every system must answer, regardless of whether it operates at the physical layer, the transport layer, or the application layer. A WiFi access point answers them. TCP answers them. A video streaming client answers them. A broadband measurement system answers them.
The answers differ radically — but the questions do not.
You can study each invariant on its own — asking “what state does TCP maintain?” is a coherent question independent of “who coordinates?” But the answers interact. TCP’s choice of distributed coordination (no central authority) forces it to build state from local observations only. Its interface inheritance from IP (unreliable datagrams) rules out centralized scheduling. The four invariants are separable for analysis but coupled in practice: the answer to one shapes what answers are feasible for the others.
This chapter introduces the analytical framework that organizes the rest of the book. The framework has three components:
- Four invariants that define what must be answered: State, Time, Coordination, Interface.
- Three design principles that describe recurring strategies for constructing good answers under constraints: Disaggregation, Closed-loop reasoning, Decision placement.
- A method — the anchored dependency graph — that traces how one design choice constrains the rest.
The framework explains failures and generates design alternatives. Together, these components give you a vocabulary for describing any networked system, reasoning tools for predicting system behavior, and a method for producing new designs when environmental constraints shift. The framework is a design tool, distinct from a protocol catalog (a chapter-per-protocol walkthrough of TCP, DNS, and BGP belongs in an introductory course) and distinct from a layered description in the OSI tradition — it cuts across layers, because the same invariant questions arise at every layer. It is a design tool: given any system, answer four questions, trace their dependencies, and you can predict its failure modes and generate its alternatives.
Three intellectual traditions underpin this framework. Wiener's cybernetic program (Wiener 1948) identified the feedback loop as the organizing principle of adaptive systems, the foundation of our Closed-loop reasoning principle. Saltzer, Reed, and Clark's end-to-end arguments (Saltzer et al. 1984) formalized where functionality belongs in layered architectures, the question our Interface invariant and Decision placement principle address. Shannon's information theory (Shannon 1948) established fundamental limits on what any observer can learn about a noisy channel, the constraint that shapes every system's State invariant, because measurement signals are always partial and delayed. This book's contribution is the specific four-invariant decomposition, the anchored dependency graph as an analytical tool, and the separation of invariants from principles as a pedagogical structure for teaching networked systems.
2.2 A Motivating Example: What Happens When You Load a Web Page
You open your laptop, connect to WiFi, and type www.google.com into a browser. Before the page renders, five protocols must fire in sequence: DHCP, ARP, DNS, TCP, and HTTP. Each protocol solves a different problem, uses a different strategy, and makes different structural choices. Figure 2.1 shows the network scenario — a laptop on a local network, a WiFi router that serves as the first hop, and servers reachable through the Internet. This scenario — familiar from an introductory networking course — is the framework’s first test case. Walk through it concretely, then ask: why do these five protocols look so different from each other?
The laptop starts with zero configuration — it lacks an IP address, a router address, and a DNS server address. Sending a targeted request requires knowing whom to ask, and the laptop lacks that knowledge. DHCP (Dynamic Host Configuration Protocol) (Droms 1997) resolves this bootstrap problem. The client constructs a DHCP Discover message and broadcasts it on the LAN using destination MAC address FF:FF:FF:FF:FF:FF — the Ethernet broadcast address that every device on the local network receives. The encapsulation stack is DHCP → UDP → IP → Ethernet. A router on the LAN runs the DHCP server, receives the broadcast, and replies with three critical pieces of information: the client’s assigned IP address, the first-hop router’s IP address, and the DNS server’s IP address. Why UDP and not TCP? TCP requires an established connection, and establishing a connection requires the client to already have an IP address — the very thing it lacks. UDP’s connectionless delivery sidesteps the circular dependency.
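As a concrete sketch, the Discover message's fixed-format header can be assembled with a few `struct.pack` calls. This is an illustrative Python reconstruction of the RFC 2131 wire format, not production client code; the MAC address and transaction ID below are invented, and sending the packet (commented out) would require broadcast privileges.

```python
import struct

def build_dhcp_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCP Discover payload (the DHCP layer only;
    the UDP/IP/Ethernet encapsulation is added by the OS and NIC)."""
    assert len(mac) == 6
    # Fixed header: op=1 (request), htype=1 (Ethernet), hlen=6, hops=0
    packet = struct.pack("!BBBB", 1, 1, 6, 0)
    packet += struct.pack("!I", xid)         # transaction ID, echoed by the server
    packet += struct.pack("!HH", 0, 0x8000)  # secs=0; broadcast flag set
    packet += b"\x00" * 16                   # ciaddr/yiaddr/siaddr/giaddr: all zero
    packet += mac + b"\x00" * 10             # chaddr, padded to 16 bytes
    packet += b"\x00" * 192                  # sname + file fields (unused here)
    packet += b"\x63\x82\x53\x63"            # DHCP magic cookie
    packet += b"\x35\x01\x01"                # option 53 (message type) = 1: Discover
    packet += b"\xff"                        # end-of-options marker
    return packet

# Hypothetical MAC and transaction ID, for illustration only:
discover = build_dhcp_discover(bytes.fromhex("aabbccddeeff"), xid=0x12345678)
# Sending it (requires privileges on most systems):
#   s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
#   s.bind(("0.0.0.0", 68)); s.sendto(discover, ("255.255.255.255", 67))
```

Note that the client binds port 68 and broadcasts to port 67 before it has any IP address of its own, which is precisely why UDP works here and TCP cannot.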
The client now knows the router’s IP address (say 10.0.0.1) but still needs the router’s MAC address to send an Ethernet frame. Ethernet frames are addressed by MAC, and IP addresses alone are insufficient. ARP (Address Resolution Protocol) (Plummer 1982) bridges this gap. The client broadcasts an ARP query: “Who has 10.0.0.1? Tell me your MAC address.” The router replies with its MAC address (say 00:1A:2B:3C:4D:5E). The client caches this mapping and can now address Ethernet frames directly to the router. Without ARP, every data frame would need to be broadcast to the entire LAN — extremely wasteful on a network with dozens or hundreds of devices.
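The cache's behavior can be sketched in a few lines; this is a minimal illustration with an injectable clock for testing, and the 60-second timeout is a placeholder, not a value mandated by ARP.

```python
import time

class ArpCache:
    """IP -> MAC cache with per-entry expiry."""
    def __init__(self, timeout_s: float = 60.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable for testing
        self.entries = {}           # ip -> (mac, inserted_at)

    def insert(self, ip: str, mac: str) -> None:
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip: str):
        """Return the cached MAC, or None (meaning: broadcast an ARP query)."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, inserted_at = entry
        if self.clock() - inserted_at > self.timeout_s:
            del self.entries[ip]    # stale: force a fresh query
            return None
        return mac
```

A `lookup` miss is what triggers the broadcast query described above; a hit is what lets every subsequent frame go directly to the router's MAC.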
The client can reach the router. Now it needs to resolve www.google.com to an IP address. The client constructs a DNS query (Mockapetris 1987a, 1987b) and encapsulates it in a UDP segment destined for port 53 on the DNS server. The query leaves the local network, traversing potentially multiple routed networks — forwarded by OSPF (Open Shortest Path First) within an administrative domain, by BGP (Border Gateway Protocol) across domain boundaries. The DNS server resolves the name through a hierarchy of lookups (root server → .com TLD server → Google’s authoritative server) and replies with an IP address: 216.58.193.196.
The client knows the server’s IP address. Now it establishes a reliable transport connection. TCP’s (Postel 1981) three-way handshake fires: the client sends a SYN segment to 216.58.193.196 port 80, the server replies with SYN-ACK, and the client completes the handshake with ACK. The connection is established. TCP will handle reliability (retransmitting lost segments), ordering (reassembling out-of-order arrivals), and congestion control (adapting the sending rate to available capacity) for everything that follows.
The client sends an HTTP GET request through the TCP socket: GET / HTTP/1.1. The web server processes the request and replies with the HTML content of the page. The browser parses the HTML, requests additional resources (CSS, JavaScript, images) through additional HTTP requests on the same or parallel TCP connections, and renders the page. The entire sequence — DHCP, ARP, DNS, TCP, HTTP — fired in under a second.
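The last two steps can be sketched with Python's standard `socket` module; `socket.create_connection` performs the three-way handshake before returning. This is a minimal illustration, not a real HTTP client (no redirects, no chunked decoding).

```python
import socket

def build_get_request(host: str, path: str = "/") -> bytes:
    # HTTP/1.1 requires a Host header; Connection: close ends after one response
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"Connection: close\r\n\r\n").encode("ascii")

def http_get(host: str, port: int = 80) -> bytes:
    """Fetch a page over a fresh TCP connection. create_connection
    completes the SYN / SYN-ACK / ACK exchange before returning."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(build_get_request(host))
        chunks = []
        while (data := s.recv(4096)):   # empty read: server closed (FIN)
            chunks.append(data)
        return b"".join(chunks)

# http_get("example.com") would return the status line, headers, and HTML.
```

Everything TCP does for this request (retransmission, reordering, congestion control) is invisible at this level, which is exactly the abstraction gap the Interface section below analyzes.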
Figure 2.2 summarizes the five-step sequence. Each protocol solves a different problem and leaves the laptop with a new piece of state that the next protocol requires.
Five protocols, five different designs. DHCP broadcasts; ARP broadcasts; DNS traverses a hierarchy; TCP handshakes between two endpoints; HTTP issues a stateless request-response. Why such diversity? The framework’s claim: the differences are systematic. Each protocol faces different constraints, and the constraints determine the invariant answers. Table 2.1 previews how DHCP, DNS, and TCP answer the four invariants — the rest of the chapter unpacks why.
| Invariant | DHCP | DNS | TCP |
|---|---|---|---|
| State | Centralized — server tracks address pool and leases | Hierarchically distributed — partitioned namespace with caching | Distributed — endpoints maintain cwnd, SRTT, sequence numbers |
| Time | Lease-based — prescribed duration, client renews or loses address | TTL-based — authority prescribes expiration; caches obey | Inferred — RTT estimated from ACK arrivals; RTO governs loss detection |
| Coordination | Centralized — server decides, client accepts | Hierarchical delegation — each zone authoritative for its own names | Distributed — each sender adjusts independently via AIMD |
| Interface | UDP broadcast (forced by bootstrap: client has no IP yet) | UDP on port 53 (TCP fallback for large responses) | Reliable byte stream above (socket API); unreliable datagrams below (IP) |
Why these particular answers? The constraints shaped the designs:
- DHCP centralizes because IP address uniqueness on a subnet requires a single allocator — two independent allocators risk assigning the same IP to different clients.
- TCP distributes because the Internet’s administrative decentralization means no single entity can observe all flows traversing a bottleneck link.
- DNS adopts a hierarchy because global name resolution is too large for one server and too coordination-intensive for pure distribution — the hierarchy trades query latency for scalability.
These three protocols will serve as recurring examples throughout this chapter, analyzed in depth in each invariant section below. Later chapters apply the same invariant analysis to new systems: wireless medium access (Chapter 3), cellular architecture (Chapter 4), congestion control beyond AIMD (Chapter 8), queue management (Chapter 7), multimedia applications (Chapter 11), and network measurement (Chapter 12).
2.3 The Four Invariants
The web page example demonstrated that different protocols make different structural choices. The invariants explain why: each protocol faces different constraints, and the constraints determine the answers. The framework’s analytical core is a decomposition into four irreducible questions that every networked system must answer. They are structural invariants. A system that leaves one unanswered has answered it by default, usually badly. Figure 2.3 visualizes how the four invariants and their solution spaces interact.
Answering these four questions well requires understanding each invariant’s dimensions and the constraints that limit feasible answers. The sections below unfold each invariant in detail.
Critically, the four invariants are coupled — they form a dependency chain that starts from the system's binding constraint. For TCP, that chain runs Interface → Coordination → State → Time: IP's interface (unreliable datagrams) combined with administrative decentralization forces distributed coordination; distributed coordination forces each endpoint to run its own finite state machine with endpoint-local state; and the state model's measurement needs determine timing mechanisms (event-driven ACK processing, inferred retransmission timeouts). For other systems, the chain starts from a different invariant entirely. WiFi's binding constraint is the shared wireless medium — a physics constraint that shapes Coordination first (distributed contention), from which State and Time follow. Video streaming's binding constraint is human perceptual time — a Time constraint that shapes what State (playback buffer) and Coordination (server-driven adaptation) must look like. The dependency chain always starts from the binding constraint — the invariant answer hardest to change — and cascades through the others. Section 2.5 (The Anchored Dependency Graph) formalizes this cascade as an analytical method.
2.3.1 State — What Exists?
Every system maintains state — objects, resources, configuration, histories. TCP maintains a connection state machine. A DNS resolver maintains a cache of recent name-to-address mappings. A router maintains a forwarding table built from routing protocol advertisements.
Among the four invariants, State is the most diagnostically useful, because it decomposes into three layers:
Environment state is what actually exists, independent of any observer. The true bottleneck queue occupancy on a network path. The actual available bandwidth. The true congestion level at every router along a path. The environment exists whether or not anyone measures it.
Measurement signal is what an agent can observe about the environment, and at what cost. ACK arrival times in TCP. Timeout events in TCP. DNS TTL expiration as a signal that a cached record is stale. Every measurement is partial, delayed, and sometimes misleading.
Internal belief is the agent’s model of the environment, constructed from measurement signals. TCP’s congestion window is a belief about available capacity. A DNS cache entry is a belief that a domain still maps to a particular IP address. A routing table entry is a belief about the best next hop toward a destination.
The gap between environment state and internal belief is where most system failures live. Bufferbloat occurs when TCP’s belief about available capacity diverges from reality because large buffers delay the loss signal. DNS cache poisoning occurs when a resolver’s belief about a domain’s address diverges from reality because a malicious response corrupted the measurement signal. In both cases, the measurement signal fails to track the environment — and the system makes decisions based on a stale or incorrect model.
This three-layer decomposition — environment, measurement, belief — is one of the sharpest diagnostic tools in the framework. When a system fails, ask: where did the belief diverge from the environment? Was the measurement signal delayed? Ambiguous? Missing entirely? The answer identifies not just the failure but the class of fix required.
Beyond diagnosis, the decomposition is generative — it defines the innovation frontier for any adaptive system. Three questions drive every redesign: what to measure (loss events? RTT changes? delivery rate? explicit router marks?), how to infer (exponentially weighted averages? windowed extrema? machine-learned models?), and how to validate the belief against reality (does the model's prediction match observed behavior?). TCP's 40-year evolution is a sequence of answers to these three questions. Jacobson (Jacobson 1988) measured ACK timing and inferred via EWMA. CUBIC (Ha et al. 2008) kept the same measurements but changed the inference model from linear to cubic window growth. BBR (Bottleneck Bandwidth and RTT) (Cardwell et al. 2017) changed what is measured — delivery rate and minimum RTT rather than loss — and how it is inferred — windowed estimators rather than smoothed averages. Each redesign changes what is measured or how it is modeled; the environment itself is the same bottleneck queue, the same competing traffic, the same lossy links. The innovation is always in the observation and the inference — and this remains an active frontier, because no belief model yet tracks the environment perfectly under all conditions.
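A toy comparison makes the inference choice concrete. Both estimators below watch the same synthetic RTT trace (values in milliseconds, invented for illustration); they are simplified stand-ins for Jacobson-style smoothing and a BBR-style windowed-minimum filter, not the real algorithms.

```python
from collections import deque

def ewma_track(samples, gain=0.125):
    """Smoothed belief: move a fraction `gain` toward each new sample."""
    est = samples[0]
    beliefs = []
    for s in samples:
        est = (1 - gain) * est + gain * s
        beliefs.append(est)
    return beliefs

def windowed_min_track(samples, window=10):
    """Windowed-extremum belief: treat the smallest recent RTT as the
    propagation delay, ignoring queue-induced inflation."""
    win = deque(maxlen=window)
    beliefs = []
    for s in samples:
        win.append(s)
        beliefs.append(min(win))
    return beliefs

# Environment: 20 ms base RTT, then a queue builds and RTT jumps to 200 ms
trace = [20.0] * 20 + [200.0] * 20
ewma = ewma_track(trace)
wmin = windowed_min_track(trace)
```

After the jump, the EWMA belief drifts upward sample by sample, while the windowed minimum holds at 20 ms until every sample in its window is inflated. Same environment, same measurement signal, two different beliefs: the divergence is entirely a property of the inference model.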
Later chapters apply this three-layer decomposition to systems where the environment-measurement-belief gap produces different failure modes: wireless medium access (Chapter 3), queue management (Chapter 7), and adaptive video streaming (Chapter 11).
2.3.2 Time — When Do Things Happen?
Every system must handle time and ordering. Some systems prescribe time: Ethernet defines a fixed slot time for collision detection, and DHCP assigns lease durations that clients must respect. Other systems infer time: TCP estimates round-trip time from ACK arrivals, and DNS resolvers use TTL values to decide when a cached record has expired. The distinction matters.
The choice between prescribed and inferred time has architectural consequences. Prescribed time requires shared infrastructure — a common clock or a common medium. Ethernet’s collision detection requires all stations on a segment to agree on a slot time. Inferred time requires only local observation — TCP estimates RTT from its own ACK arrivals, making it deployable across any path without shared infrastructure. The tradeoff: prescribed time enables precise coordination but demands shared infrastructure; inferred time works anywhere but the inference is noisy, delayed, and sometimes wrong.
Time governs failure detection, synchronization, and adaptation speed. TCP’s retransmission timeout determines how quickly it detects loss. Ethernet’s slot time determines collision detection range. DNS’s TTL determines how quickly clients learn about address changes. At a deeper level, every timeout in a networked system — TCP’s RTO, DHCP’s lease duration, OSPF’s hello interval — exists because distributed systems require timing assumptions to distinguish a slow process from a dead one. Fischer, Lynch, and Paterson (Fischer et al. 1985) proved this impossibility rigorously: deterministic agreement in an asynchronous system with even one crash failure is impossible. Timeouts are the engineering workaround — they introduce timing assumptions that make the problem solvable at the cost of occasional false positives (declaring a slow process dead) or false negatives (waiting too long to detect a genuine failure).
Many protocol designs are fundamentally about making time tractable. Jacobson’s RTT estimator (Jacobson 1988) transforms noisy per-packet timing observations into a smoothed estimate that TCP can act on. Lamport clocks replace physical time with logical ordering (Lamport 1978). In each case, the redesign changes what “time” means for the system — and the consequences cascade through the other invariants.
Walk through the three focal protocols to see how different timing strategies arise from different constraints.
DHCP: prescribed time. The server sets the lease duration — the client obeys. Short leases (1 hour at a coffee shop) reclaim addresses quickly from departed clients but increase renewal traffic. Long leases (24 hours on a campus) reduce overhead but leave addresses tied to clients that departed hours ago. The failure mode is stark: a client whose lease expires mid-session loses all active connections. The timing parameters encode a single tradeoff: freshness of address allocation vs. overhead of renewal.
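The renewal schedule is simple enough to sketch directly. The 50% and 87.5% thresholds are the RFC 2131 defaults, and the state names mirror the protocol's client states; this is an illustrative model, not a full client implementation.

```python
def dhcp_timers(lease_s: float) -> dict:
    """RFC 2131 default renewal schedule for a lease granted at t=0."""
    return {"T1_renew": 0.5 * lease_s,     # unicast renewal to granting server
            "T2_rebind": 0.875 * lease_s,  # broadcast: any server may extend
            "expiry": lease_s}             # address must be abandoned

def client_state(t: float, lease_s: float) -> str:
    timers = dhcp_timers(lease_s)
    if t < timers["T1_renew"]:
        return "BOUND"       # address valid, no action needed
    if t < timers["T2_rebind"]:
        return "RENEWING"    # trying the original server
    if t < timers["expiry"]:
        return "REBINDING"   # desperate: asking any server on the LAN
    return "EXPIRED"         # all connections using this address drop

# A one-hour coffee-shop lease: renew at 30 min, rebind at 52.5 min
```

The escalation from unicast to broadcast is the prescribed-time tradeoff in miniature: the client spends most of the lease doing nothing, then grows progressively louder as expiry approaches.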
DNS: prescribed by authority, inferred by resolver. Each DNS record carries a TTL set by the zone administrator — the resolver obeys passively. The tension between freshness and load is fundamental: short TTLs (30–60 seconds) enable fast updates but cause query storms at authoritative servers; long TTLs (hours to days) reduce load but serve stale mappings. Unlike DHCP, where client and server have a direct relationship, DNS timing is set unilaterally by the authority — the resolver accepts the TTL as given.
TCP: fully inferred. TCP has no prescribed time. The sender infers timing from the only signal available: when ACKs arrive. Jacobson's RTT estimator (Jacobson 1988) maintains a smoothed estimate (SRTT) and a smoothed mean deviation (RTTVAR), updated with each ACK. The retransmission timeout is RTO = SRTT + 4*RTTVAR — a bet that the next RTT will fall within four mean deviations of the smoothed estimate.
The failure modes are symmetric:
- RTO too low → spurious retransmissions. The sender retransmits segments that are still in flight, wasting bandwidth and potentially triggering unnecessary congestion responses.
- RTO too high → stalls after genuine loss. The connection freezes for hundreds of milliseconds while the application waits.
At scale, both costs are enormous: a timeout miscalibrated by 10ms across millions of connections either wastes measurable latency on every lost packet or triggers millions of spurious retransmissions per second. The difficulty is that RTT varies continuously — with queue occupancy, route changes, and competing traffic. A path that had a 20ms RTT can spike to 200ms when a buffer fills. Jacobson’s estimator is a low-pass filter on a noisy signal; sudden shifts take multiple RTTs to propagate into the belief.
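A minimal sketch of the estimator, following the RFC 6298 formulation (gains of 1/8 and 1/4; the RFC's 1-second minimum RTO and clock-granularity term are omitted for clarity):

```python
class RttEstimator:
    """Jacobson/Karels RTT estimation as standardized in RFC 6298."""
    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def update(self, rtt_sample: float) -> float:
        if self.srtt is None:
            # First measurement: seed the estimate and its deviation
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            # Deviation is updated first, against the OLD smoothed estimate
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt_sample)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt_sample
        return self.rto()

    def rto(self) -> float:
        return self.srtt + 4 * self.rttvar
```

On a steady path the deviation decays and the RTO hugs the true RTT; one 200 ms sample on a formerly 20 ms path inflates RTTVAR immediately, so the RTO jumps within a single update even though SRTT moves slowly. The variance term, not the mean, is what keeps the low-pass filter from causing spurious retransmissions during sudden shifts.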
| Protocol | Prescribed vs. Inferred | What Governs Timing | Failure Mode When Timing Goes Wrong |
|---|---|---|---|
| DHCP | Prescribed — server sets lease duration, T1/T2 renewal thresholds | Lease duration (hours); T1 at 50%, T2 at 87.5% | Lease expires → client loses IP, all connections drop; too-short leases → excessive renewal traffic |
| DNS | Prescribed by authority, inferred by resolver — TTL set at authoritative server, passively obeyed by caches | TTL value per record (seconds to days) | TTL too long → stale records served, users reach wrong server; TTL too short → query storms at authoritative servers |
| TCP | Inferred — RTT estimated from ACK arrivals via Jacobson’s algorithm | RTO = SRTT + 4 * RTTVAR, updated per ACK | RTO too low → spurious retransmissions waste bandwidth; RTO too high → connection stalls after genuine loss |
The Time invariant generalizes well beyond these three protocols. OSPF's hello interval (default 10 seconds) is prescribed time — routers declare a neighbor dead after missing a fixed number of hellos, and the interval determines how quickly the network detects a link failure. BGP's hold timer (default 90 seconds) is similarly prescribed but far more conservative, reflecting the protocol's preference for stability over rapid convergence. ARP cache entries expire after a prescribed timeout (implementation-dependent, ranging from well under a minute to hours across operating systems and routers) that balances stale-mapping risk against broadcast overhead. Ethernet's slot time (51.2 microseconds on 10 Mbps Ethernet) prescribes the collision detection window and thereby constrains maximum segment length. In every case, the same question applies: is the timing prescribed or inferred, and what fails when the timing is wrong?
Later chapters examine time in systems with tighter constraints: microsecond-scale slot times in wireless (Chapter 3), millisecond-scale scheduling in cellular networks (Chapter 4), and sojourn-time measurement in queue management (Chapter 7).
2.3.3 Coordination — Who Decides?
Whenever multiple actors share a resource, the system must define coordination rules. Who has authority? How do components agree? How do conflicts resolve?
TCP congestion control is distributed — each sender adjusts independently, with coordination emerging from shared bottleneck effects. DNS uses hierarchical coordination — root servers delegate to TLD servers, which delegate to authoritative servers, with caching at every level reducing the load on higher tiers. OSPF routing uses distributed coordination — each router computes shortest paths independently from link-state advertisements that all routers share. These are different points on the coordination spectrum, shaped by different constraints.
Neither distributed nor centralized coordination dominates. Centralized coordination enables global optimization: a DHCP server sees the entire address pool and allocates without conflict — but it is a single point of failure, and all clients must be able to reach it. Distributed coordination enables scale and fault tolerance: TCP works across arbitrary paths without prior arrangement — but convergence is slow and fairness is approximate. The right answer depends primarily on the anchoring constraints. The Internet’s administrative decentralization pushes toward distributed endpoints.
The three focal protocols span the coordination spectrum:
DHCP: centralized. A single server holds the address pool and has sole authority to allocate. This is forced by the constraint that IP addresses on a subnet must be unique — two independent allocators could assign the same IP to different clients. Centralization eliminates the coordination problem at the cost of a single point of failure. Engineering extensions (relay agents for multi-subnet reach, failover protocols for availability) mitigate the costs without distributing the decision.
DNS: hierarchical. Authority is partitioned: the root zone delegates .com to Verisign’s TLD servers; Verisign delegates google.com to Google’s authoritative servers. Each level is authoritative for its own zone. The delegation chain defines who decides what, without requiring any two authorities to agree on anything outside their own zone. Caching adds an implicit second layer: resolvers absorb query load by serving cached records, reducing root server traffic by orders of magnitude. Coordination failure manifests as inconsistency — a resolver with a stale cache directs users to a decommissioned server — and persists until the TTL expires.
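The delegation walk can be sketched with a toy zone table. The zone labels below are invented stand-ins for the real root, TLD, and authoritative servers, and TTL handling is omitted; the point is that each hop consults only its own zone of authority.

```python
# Toy delegation tree: each "server" either delegates or answers authoritatively.
ZONES = {
    ".":           {"delegate": {"com.": "com-tld"}},
    "com-tld":     {"delegate": {"google.com.": "google-auth"}},
    "google-auth": {"answer": {"www.google.com.": "216.58.193.196"}},
}

def resolve(name: str, cache: dict) -> str:
    """Iterative resolution starting at the root, with a resolver-side cache."""
    if name in cache:
        return cache[name]              # cached answer absorbs the whole walk
    server = "."
    while True:
        zone = ZONES[server]
        if "answer" in zone and name in zone["answer"]:
            cache[name] = zone["answer"][name]
            return cache[name]
        # Follow the matching delegation one level down the hierarchy
        server = next(s for suffix, s in zone["delegate"].items()
                      if name.endswith(suffix))
```

The first query walks root → TLD → authoritative; every later query for the same name is served from the cache without touching any authority — which is exactly how caching reduces root traffic by orders of magnitude, and exactly why a stale cache entry persists until it is evicted.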
TCP: distributed. No central authority, no delegation hierarchy, no explicit communication between competing flows. Each sender adjusts its rate based solely on local observations. Chiu and Jain (Chiu and Jain 1989) proved that AIMD converges to fairness through a geometric argument: additive increase moves all flows toward efficiency at 45 degrees; multiplicative decrease moves them toward fairness along a ray through the origin. But convergence is slow and approximate:
- RTT bias: a flow with 10ms RTT increases its window 10x faster than one with 100ms RTT — no mechanism to compensate without a central scheduler.
- Signal ambiguity: a packet dropped by congestion and one dropped by a bit error look identical to the sender — both trigger multiplicative decrease, even though only congestion warrants rate reduction.
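A small simulation shows the Chiu-Jain dynamics. The model below assumes synchronized rounds and identical RTTs — which is exactly the idealization that hides the RTT bias noted above.

```python
def aimd(rates, capacity, rounds, alpha=1.0, beta=0.5):
    """Simulate synchronized AIMD flows sharing one bottleneck.
    When demand exceeds capacity, every flow sees loss and backs off;
    otherwise every flow probes upward additively."""
    rates = list(rates)
    for _ in range(rounds):
        if sum(rates) > capacity:
            rates = [beta * r for r in rates]    # multiplicative decrease
        else:
            rates = [r + alpha for r in rates]   # additive increase
    return rates

# Start wildly unfair: 80 vs 5 units on a 100-unit link
final = aimd([80.0, 5.0], capacity=100.0, rounds=2000)
```

Additive increase preserves the gap between the two flows, while each multiplicative decrease halves it, so the imbalance decays geometrically toward equal shares with no message ever exchanged between the flows. Giving one flow a shorter round (a larger effective `alpha`) breaks this convergence, which is the RTT bias in miniature.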
| Protocol | Coordination Style | Who Decides | What Happens When Coordination Fails |
|---|---|---|---|
| DHCP | Centralized — single server holds address pool | DHCP server; client requests, server grants or denies | Server crash → no new leases, renewals fail; dual servers without failover → address conflicts |
| DNS | Hierarchical delegation — each zone authoritative for its names | Each authoritative server for its zone; caching resolvers obey TTLs | Stale caches → inconsistent answers across resolvers; root compromise → global trust failure |
| TCP | Distributed — no central authority, no inter-flow communication | Each sender independently, based on local ACK observations | RTT-biased fairness; congestion/non-congestion loss conflation; slow convergence with many flows |
The coordination spectrum extends across every protocol students encounter in an introductory course. OSPF uses distributed coordination — every router floods link-state advertisements and independently computes shortest paths, converging to consistent forwarding tables without any central authority. BGP adds policy-driven coordination — each autonomous system makes routing decisions based on local preferences, coordinating only through route announcements, which means global convergence remains unguaranteed and routing anomalies persist indefinitely. ARP uses no coordination at all — any host can claim any IP-to-MAC mapping, which is why ARP spoofing is trivially easy. Ethernet switching uses a hybrid: the spanning tree protocol elects a root bridge (centralized decision) but each switch independently computes its forwarding state from the elected topology (distributed execution). In every case, the coordination structure reflects the constraints: who has authority, what information is available to each decision-maker, and what is the cost of disagreement.
Later chapters show how shared-medium physics pushes wireless systems toward centralization at scale (Chapters 3-4).
2.3.4 Interface — How Do Components Interact?
Every system defines interfaces between layers or modules. TCP presents a reliable byte stream to applications and uses unreliable datagrams from IP. The socket API is one of the most successful interfaces in computing — applications operate without awareness of packets, loss, or reordering.
Interfaces are the most path-dependent design choice. The IP packet interface was inherited from the Internet’s original architecture (Clark 1988). It constrains everything built above it — TCP must infer congestion from endpoint observations precisely because IP routers expose only drop/forward behavior. Changing this interface is expensive: ECN took decades to deploy (Ramakrishnan et al. 2001). But the interface evolves at the margins — QUIC moved transport over UDP to avoid middlebox ossification (Langley et al. 2017).
The thin waist — a narrow interface decoupling diverse things above from diverse things below — is among the most consequential architectural patterns in networking (Deering 1998). IP’s datagram interface allows any link technology below and any application above to evolve independently. But the same stability that makes an interface valuable makes it resistant to change. Middleboxes that inspect TCP headers ossify the transport layer; adding a new TCP option becomes a multi-year deployment battle. QUIC’s response — encrypt the transport header and run over UDP — is an interface renegotiation forced by the ossification of the previous one.
The three focal protocols illustrate how constraints force interface choices — and how interfaces, once deployed, resist change.
DHCP: broadcast forced by bootstrap. The client lacks an IP address and knowledge of the server’s address — so the only available interface is Ethernet broadcast. UDP is chosen over TCP because TCP requires IP addresses on both sides. This is the chicken-and-egg problem of network bootstrapping: the client needs IP to communicate, but needs DHCP to get IP. The broadcast interface confines discovery to the local subnet, which is why relay agents are needed for multi-subnet deployments.
DNS: UDP optimized for speed, evolving under privacy pressure. Queries are small, latency-sensitive, and stateless — UDP on port 53 is a natural fit. Responses originally had to fit in 512 bytes; EDNS0 relaxed this limit. The more dramatic renegotiation is DNS-over-HTTPS (DoH) and DNS-over-TLS (DoT), driven by privacy constraints: the technical content is unchanged, but the interface shifts from plaintext to encrypted, moving trust from the network path to the resolver operator.
TCP: the largest abstraction gap in networking. Above, the socket API exposes a reliable, ordered byte stream — applications operate oblivious to packets, loss, or reordering. Below, IP delivers unreliable, unordered datagrams. TCP bridges this gap through segmentation, sequence numbering, acknowledgment, retransmission, and reordering. The abstraction hides information some applications need — a video client benefits more from skipping a lost frame than waiting for retransmission — which is why UDP exists as an alternative. The socket API is also the site of the Internet’s most consequential ossification: implemented in the OS kernel, changing TCP requires kernel updates across millions of devices. QUIC moves transport to user space over UDP to escape this ossification.
The table below summarizes where each protocol sits: what it exposes above, what it relies on below, and how exposed it is to ossification.
| Protocol | Abstraction Above | Abstraction Below | Ossification Risk |
|---|---|---|---|
| DHCP | OS network configuration (implicit) | Ethernet broadcast + UDP | Low — operates only at boot/renewal |
| DNS | getaddrinfo() stub resolver API | UDP port 53 (TCP fallback); evolving to DoH/DoT | Moderate — plaintext DNS inspected by ISPs |
| TCP | Socket API: reliable ordered byte stream | IP datagrams: unreliable, unordered | High — kernel + middlebox ossification |
2.4 Design Principles: Strategies for Answering Well
The four invariants define the questions. Three recurring principles describe how good answers are constructed — solution patterns that successful systems employ under constraints. Invariants are structural: every system answers them. Principles are strategic: they capture what works. The distinction matters: invariants define the design space; principles guide navigation within it.
2.4.1 Disaggregation — Separation of Concerns Under Constraints
Separating coupled concerns into independently controllable dimensions. The Internet’s protocol stack is the canonical example: it separates naming (DNS) from addressing (IP) from routing (OSPF/BGP) from reliable delivery (TCP) from application logic (HTTP). Each layer can evolve independently — HTTP moved from 1.0 to 1.1 to HTTP/2 without changing TCP or IP, and HTTP/3 swapped the transport entirely, which is possible precisely because the layers are separable. IPv4 is being replaced by IPv6 without rewriting applications. This independence would be impossible if the entire stack were a monolithic design.
The cost is interface overhead. Every separation boundary introduces a coordination signal, and that signal can degrade. The layered stack forces TCP to infer congestion from endpoint observations because IP’s interface hides router queue state. A richer interface — routers explicitly signaling congestion via ECN (Explicit Congestion Notification) — took decades to deploy (Ramakrishnan et al. 2001), precisely because changing an interface between independently evolving layers is expensive. Later chapters show disaggregation at other scales: 5G base station decomposition (Chapter 4) and control/data plane separation via SDN (Software-Defined Networking, Chapter 7).
The focal protocols demonstrate disaggregation — and what would break without it:
- DNS separates naming from addressing from routing. Google can change the IP behind www.google.com by updating one DNS record — no routing change, no IP renumbering. If naming and addressing were fused, every IP change would require updating every name record. CDN load balancing, failover, and migration would be impossible at scale.
- TCP separates reliable delivery from routing. TCP is path-agnostic — it operates independently of which route its segments traverse; IP is content-agnostic — it forwards packets without knowing whether they belong to a reliable stream. If TCP required path knowledge, every route change would break active connections. The cost: TCP optimizes for path characteristics only through endpoint inference.
- DHCP separates address allocation from address use. The server manages the pool; the client uses the assigned address without knowing pool state. If clients had to participate in address management, the bootstrap problem would be intractable — coordination requires the very configuration the client is trying to obtain.
2.4.2 Closed-Loop Reasoning — How Decisions Adapt Over Time
The dynamical discipline that evaluates whether the combined invariant answers produce stable, convergent behavior. Closed-loop reasoning is an analytical lens present throughout design, active from the first invariant answer onward.
Every adaptive protocol you studied in an introductory course is a feedback loop. TCP’s AIMD is the canonical example: observe ACK stream → infer congestion → adjust sending rate → observe again. The loop period is one RTT; the gain is 1 MSS per RTT on increase, 50% on decrease. DNS caching is a simpler feedback loop: resolve a name → cache the result → serve from cache until TTL expires → re-resolve. If the TTL is too long, the cache serves stale data — the belief diverges from the environment.
The question closed-loop reasoning asks is always the same: given these invariant answers, will the system converge? What is the loop’s bandwidth — how fast can it track environmental changes? Where are the oscillation modes? TCP’s congestion control oscillates: if many flows sharing a bottleneck all detect loss simultaneously, they all halve their windows together, then ramp up together — a phenomenon called global synchronization that wastes link capacity. Later chapters examine feedback loops in systems with different dynamics: exponential backoff in wireless medium access (Chapter 3), queue management algorithms that reshape the congestion signal (Chapter 7), and model-based congestion control that restructures the loop entirely (Chapter 8).
Each focal protocol implements a different feedback loop:
- TCP AIMD — observe ACK stream → infer congestion → adjust cwnd → observe again. Loop period: one RTT (10–200ms). The asymmetric gain — slow increase (linear), fast decrease (multiplicative) — probes cautiously but retreats aggressively. When many flows share a bottleneck, they all probe and retreat together, causing global synchronization — the bottleneck alternates between overload and underutilization. RED and its successors (Chapter 7) address this by desynchronizing loss signals.
- DNS TTL — query → cache → serve from cache → TTL expires → re-query. Loop period: the TTL (30 seconds to 24 hours), set unilaterally by the zone administrator. The resolver has no control over the loop speed. Too slow → stale records served for hours. Too fast → query storms at authoritative servers. The loop has no adaptive gain — a static parameter, not a dynamic control.
- DHCP lease renewal — an escalating loop. At T1 (50% of lease), unicast renewal to original server. At T2 (87.5%), escalate to broadcast. At lease expiry, relinquish address — all connections drop. The loop period is the lease duration, and the failure mode is catastrophic.
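The AIMD sawtooth in the first bullet can be reproduced with a toy model in a few lines. This sketch (our simplification) treats the bottleneck as a fixed capacity and loss as deterministic overflow; real TCP infers loss from the ACK stream and competes with other flows.

```python
# A toy AIMD loop: additive increase of 1 MSS per RTT, multiplicative
# decrease of 50% whenever cwnd exceeds a fixed "bottleneck capacity".
# One iteration of the loop corresponds to one RTT.
def aimd(capacity: float, rtts: int, cwnd: float = 1.0):
    trace = []
    for _ in range(rtts):
        if cwnd > capacity:   # simulated loss: queue overflow at the bottleneck
            cwnd /= 2         # multiplicative decrease: retreat aggressively
        else:
            cwnd += 1         # additive increase: probe cautiously
        trace.append(cwnd)
    return trace

trace = aimd(capacity=40, rtts=200)
tail = trace[100:]            # after convergence: a sawtooth between
print(min(tail), max(tail))   # roughly capacity/2 and capacity
```

Running several copies of this loop against a shared capacity, all halving in the same RTT, reproduces the global synchronization pattern described above.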
TCP’s loop is well designed for a single flow but exhibits emergent oscillation when multiple loops interact through a shared bottleneck — a recurring theme in Chapters 7 and 8.
2.4.4 A Preview: Local Measurement, Remote Reality
The three design principles interact most visibly when local observation must stand in for remote conditions. Carrier sense in wireless networks — where a sender listens to the medium to infer whether the receiver’s medium is free — is the canonical example. Chapter 3 examines how this limitation produces the hidden terminal and exposed terminal problems. The pattern recurs throughout: TCP infers congestion from endpoint observations that diverge from actual queue state. DNS caching serves records that have changed at the authoritative server since the last query. The design question is always the same: how much can local observation tell you about conditions elsewhere?
2.5 The Anchored Dependency Graph
The invariants define what to answer; the principles describe what works. In any given system, one or two anchoring choices constrain the feasible answers to the remaining invariants. The dependency chain always starts from the binding constraint — the invariant answer that is hardest to change — and cascades through the others. The binding constraint differs across systems. For TCP, the binding constraint is Interface (IP’s unreliable datagram service). For WiFi, the binding constraint is a physics constraint on the shared medium that shapes Coordination first (distributed contention via CSMA/CA). For video streaming, the binding constraint is human perceptual time — a Time constraint (< 150ms for interactive, ~1 second for buffered playback) that dictates what State (playback buffer, bitrate model) and Coordination (server-driven adaptation) must look like. The dependency chain runs from constraint to consequence; its starting point depends on the system’s specific binding constraint.
Anchors come from several sources: physics (shared-medium propagation in WiFi), legacy interfaces (IP’s best-effort datagram service), institutional boundaries (administrative decentralization of the Internet), or deployment realities (commodity hardware constraints). What makes something an anchor is that it is harder to change than the invariant answers it constrains.
Anchor choice(s) (physics, architecture, legacy, deployability)
→ Constrain feasible answers to other invariants
→ Design principles guide choices within constraints
→ Closed-loop dynamics shape system behavior
→ Emergent properties (fairness, scaling, failure modes)
This cascade of constraints — anchor constraining invariants, invariants shaping principles, principles guiding implementation — is visualized in Figure 2.6. The anchored dependency graph is the method: given any system, identify the anchor, trace the constraints, understand how principles guide the choices within those constraints, and evaluate the resulting closed-loop dynamics.
Three worked examples demonstrate the method by tracing, step by step, how the anchor forces each subsequent invariant answer.
TCP worked example. Start from the binding constraint: IP provides unreliable datagrams across administratively decentralized paths. No single entity controls the paths that TCP segments traverse; no router is obligated to report its queue state; no central scheduler allocates bandwidth. The Interface invariant — what IP chose to hide (reliability, ordering, congestion state) — is the constraint that forced TCP to exist. From this anchor, trace the cascade:
- Interface (the binding constraint): IP delivers best-effort datagrams — unreliable, unordered, with no congestion feedback. TCP must provide reliability above this interface. TCP’s own interface above — the socket API (connect, send, recv) — has been essentially unchanged since 4.2BSD (1983), ossified by decades of application dependence and middlebox header inspection. QUIC’s decision to tunnel transport over UDP is a direct response to this ossification.
- Coordination (forced by the interface + admin decentralization): Administrative decentralization means no central scheduler can observe all flows at a bottleneck. TCP must coordinate in a distributed fashion — each sender adjusts independently, with no knowledge of competing flows. This distributed coordination has a critical consequence: each endpoint must run its own finite state machine (CLOSED → SYN_SENT → ESTABLISHED → FIN_WAIT → CLOSED). The FSM makes every decision — whether to send data, how much, whether to retransmit, when to terminate. If TCP were centralized, endpoints would not need their own state machines; a controller would decide for them. The FSM is an artifact of the distributed coordination choice.
- State (forced by the FSM and distributed coordination): Because no central entity aggregates information, each endpoint’s FSM must build its own model of the path. The congestion window (cwnd) is the sender’s belief about available capacity, constructed from a measurement signal (ACK arrivals and their timing) that partially and noisily reflects the environment (bottleneck queue occupancy, competing traffic). SRTT (smoothed round-trip time) is the sender’s belief about path delay, constructed from the same ACK-based measurement signal. This is endpoint-local state — the only kind available when coordination is distributed. The FSM dictates what state variables are needed and what measurement mechanisms update them.
- Time (forced by the state model): The FSM makes decisions in two fundamentally different ways. Event-driven decisions react immediately to observed signals: an ACK arrives → update cwnd, advance the send window, transmit new data. Timer-driven decisions fire after a deadline: RTO expires → retransmit the unacknowledged segment. Not all FSM transitions are time-based — most are event-driven. But when timers are needed, the values must be inferred: Jacobson’s RTT estimator computes SRTT and RTTVAR from ACK arrivals, then derives RTO = SRTT + 4*RTTVAR. No server prescribes this timeout; it emerges from measurement. RTT estimation serves a dual role, bridging the State and Time invariants: it enables capacity inference (RTT changes signal congestion, informing the State invariant’s belief model) and timeout calibration (RTO must track actual delay to avoid spurious retransmissions or stalls, serving the Time invariant’s decision-timing needs).
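Jacobson's estimator, as later standardized in RFC 6298, is short enough to state in full. The sketch below (class name ours) omits the RFC's clock-granularity term and its 1-second minimum RTO; note how a single delayed sample inflates RTTVAR and widens the timeout margin.

```python
# Jacobson's RTT estimator: an EWMA of the RTT (SRTT) and of its
# deviation (RTTVAR), combined into the retransmission timeout.
# ALPHA = 1/8 and BETA = 1/4 are the gains recommended by RFC 6298.
ALPHA, BETA = 1 / 8, 1 / 4

class RttEstimator:
    def __init__(self) -> None:
        self.srtt = None
        self.rttvar = None

    def sample(self, r: float) -> float:
        """Feed one RTT measurement (seconds); return the new RTO."""
        if self.srtt is None:          # first sample: seed both estimators
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated first, against the old SRTT, per the RFC
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        return self.srtt + 4 * self.rttvar   # RTO = SRTT + 4*RTTVAR

est = RttEstimator()
for r in (0.100, 0.100, 0.300):   # a delay spike in the third sample
    rto = est.sample(r)           # inflates RTTVAR, widening the margin
print(round(rto, 4))
```

The same samples feed both roles described above: the SRTT trend informs the capacity belief, while the RTO calibrates timer-driven decisions.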
The dependency chain for TCP runs Interface → Coordination → State → Time. Each step is forced by the previous one. A different binding constraint — say, a single-administrator datacenter — would produce a completely different cascade (relaxed coordination constraints, richer measurement signals, tighter margins), which is exactly what we see in datacenter TCP variants like DCTCP (Data Center TCP).
DNS worked example. Start from the anchor: the global namespace is too large for one server (billions of records) and administratively decentralized (millions of independent organizations control their own names). From this anchor:
- Coordination (forced by the anchor): No single entity has the authority or capacity to resolve all names. Coordination must be hierarchical — the root zone delegates to TLD servers, TLD servers delegate to authoritative servers. Each level is authoritative for its own zone and delegates everything below. This hierarchy partitions the namespace to match administrative reality.
- State (forced by hierarchical coordination): Because resolution traverses multiple levels, and because queries are frequent while namespace changes are rare, caching at every level is essential. Each resolver maintains a cache of recent query results. Each cached record is a belief — a snapshot of a name-to-address mapping that was true at query time but drifts as the environment changes. State is distributed across thousands of caching resolvers worldwide, with no central cache.
- Time (forced by distributed cached state): Cached records must expire; otherwise stale beliefs persist indefinitely. The TTL mechanism provides prescribed expiration — the authoritative server sets the TTL, and every cache obeys it. The authority controls the tradeoff between freshness (short TTL) and load (long TTL). Resolvers have no ability to negotiate or override.
- Interface (forced by all prior choices): UDP queries on port 53 for speed and simplicity, with TCP fallback for large responses. The hierarchical resolution process — iterative queries from resolver to root to TLD to authoritative — is the interface between the resolver and the distributed namespace. DNS-over-HTTPS and DNS-over-TLS represent interface renegotiations driven by privacy constraints absent from the original design era.
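The prescribed-expiration loop in the Time invariant reduces to a few lines of cache logic. In this sketch (class name, simulated clock, and sample record are all ours), the resolver serves a cached belief until the authority's TTL elapses, then re-queries:

```python
import time

# A minimal TTL-bounded cache: the authority prescribes the lifetime,
# the resolver obeys it. `resolve` stands in for an upstream query.
class TtlCache:
    def __init__(self, resolve, clock=time.monotonic):
        self.resolve = resolve      # upstream lookup: name -> (address, ttl)
        self.clock = clock
        self.cache = {}             # name -> (address, expiry time)

    def lookup(self, name: str) -> str:
        entry = self.cache.get(name)
        if entry and self.clock() < entry[1]:
            return entry[0]                       # fresh belief: serve it
        address, ttl = self.resolve(name)         # expired or missing: re-query
        self.cache[name] = (address, self.clock() + ttl)
        return address

# Simulated clock and authority, so the demonstration is deterministic.
now = [0.0]
queries = []
def authority(name):
    queries.append(name)
    return ("93.184.216.34", 60.0)                # sample record, TTL 60s

c = TtlCache(authority, clock=lambda: now[0])
c.lookup("example.com"); c.lookup("example.com")  # second hit: served from cache
now[0] = 61.0                                     # TTL elapsed: belief is stale
c.lookup("example.com")                           # forces a re-query
print(len(queries))
```

The resolver has no knob here: the 60-second lifetime is the authority's choice, exactly the unilateral control the Time invariant describes.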
DHCP worked example. Start from the anchor: IP addresses must be unique on a subnet, and the client has no IP address at bootstrap. From this anchor:
- Coordination (forced by the uniqueness constraint): Two independent allocators could assign the same address to different clients, breaking both. A single allocator — the DHCP server — eliminates the coordination problem entirely. Coordination is centralized.
- State (forced by centralized coordination): The server maintains the address pool and a lease database — which addresses are available, which are assigned, to whom, and until when. This is centralized state, co-located with the centralized decision-maker.
- Time (forced by centralized state and the need for address reclamation): Addresses must be reclaimed when clients depart. The server lacks visibility into client departures (the client crashes without notifying the server). Lease duration solves this: the server prescribes a time limit, and the client must renew or lose the address. The timing is fully prescribed by the authority.
- Interface (forced by the bootstrap constraint): The client lacks an IP address, so unicast is unavailable. The only available interface is UDP broadcast on the LAN — the most primitive communication mechanism available, but the only one accessible to an unconfigured client.
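The prescribed lease timeline can be sketched as a pure function of elapsed time. RFC 2131's defaults place T1 at 50% and T2 at 87.5% of the lease; the state names follow the RFC, while the function itself is our illustration.

```python
# DHCP renewal timing: at T1 (50% of lease) the client unicasts a
# renewal to the original server; at T2 (87.5%) it escalates to
# broadcast; at expiry it must stop using the address.
def lease_phase(elapsed: float, lease: float) -> str:
    """Classify the client's state at `elapsed` seconds into the lease."""
    if elapsed >= lease:
        return "EXPIRED"      # address relinquished, connections drop
    if elapsed >= 0.875 * lease:
        return "REBINDING"    # T2 passed: broadcast to any server
    if elapsed >= 0.5 * lease:
        return "RENEWING"     # T1 passed: unicast to the original server
    return "BOUND"            # address in use, no renewal traffic

lease = 3600.0                # a one-hour lease
print([lease_phase(t, lease) for t in (0, 1800, 3150, 3600)])
```

The escalation order is the point: the cheap unicast path is tried first, and the expensive broadcast fallback only when the authority stops answering.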
Notice the structural difference: TCP’s cascade starts from an Interface constraint (IP’s unreliable datagrams) and runs Interface → Coordination → State → Time. DHCP’s cascade starts from a Coordination constraint (uniqueness requires a single allocator) and runs Coordination → State → Time → Interface. DNS’s cascade starts from a scaling-plus-decentralization constraint and produces hierarchical answers throughout.
The binding constraint determines the character of the entire system — and the starting point of the chain.
This is why the dependency graph method allows any starting invariant. The chain starts wherever the constraint is hardest to change:
- WiFi: the shared wireless medium is a physics constraint → shapes Coordination (distributed contention) → shapes State (carrier sensing, backoff counters) → shapes Time (microsecond slot times, prescribed by the standard)
- Video streaming: human perceptual time (< 150ms for interactive, ~1s buffer for playback) is the binding constraint → shapes State (playback buffer, bitrate estimation) → shapes Coordination (server-driven ABR adaptation)
- SDN: starts from a coordination goal — centralized visibility
- BBR: starts from a state redesign — explicit path model replacing loss-based inference
- QUIC: starts from an interface renegotiation — transport over UDP to escape TCP header ossification
The method is always the same: identify the binding constraint, trace the cascade through the invariants, apply design principles within the constrained space, and evaluate the resulting closed-loop dynamics. Later chapters trace the dependency graph for systems with different binding constraints: shared-medium physics in wireless (Chapter 3), finite buffers in queue management (Chapter 7), and human perceptual requirements in multimedia applications (Chapter 11).
2.5.1 How the Pioneers Discovered the Invariants
The four-invariant framework postdates TCP’s engineers. They discovered these structural questions the hard way — by building a system that worked, watching it break, and spending six years diagnosing what was missing.
RFC 793 (Postel 1981) answered three invariants well — State (connection FSM, sequence numbers), Coordination (distributed, each endpoint runs independently), and Interface (byte stream above, datagrams below) — but left the fourth unanswered. The sending rate tracked only the receiver’s advertised window, assuming the network had sufficient capacity. For a decade, on the lightly loaded ARPANET, that assumption held. By 1986, with 5,000 hosts, throughput on some paths collapsed by a factor of 1,000. The measurement signal (ACKs) conflated receiver state with network state; when congestion became the binding constraint, the belief diverged from the environment, and the system entered a positive feedback loop that Nagle (Nagle 1984) named congestion collapse.
Four researchers each diagnosed a different invariant failure. Nagle saw the missing network-load model (State). Zhang (Zhang 1986) showed the RTT estimator was too crude to track variance (Time). Karn (Karn and Partridge 1987) found that retransmission ambiguity corrupted RTT samples (State — measurement signal). Chiu and Jain (Chiu and Jain 1989) proved AIMD is the only linear policy that converges to fairness and efficiency (Coordination). Jacobson (Jacobson 1988) synthesized all four insights: a second state variable (cwnd) tracking network capacity, a variance-aware RTT estimator, slow start and congestion avoidance — all endpoint-only, requiring zero router changes. RFC 1122 (Braden 1989) mandated the fix one year later.
| Researcher | Year | What they diagnosed | Invariant |
|---|---|---|---|
| Nagle | 1984 | No model of network load | State — environment underspecified |
| Zhang | 1986 | RTT estimator ignores variance | Time — belief too crude |
| Karn | 1987 | Retransmission ambiguity corrupts samples | State — measurement corrupted |
| Chiu & Jain | 1989 | No convergence theory for distributed rate control | Coordination |
| Jacobson | 1988 | Synthesized all four | State + Time + Coordination (Interface unchanged) |
The pioneers converged on the same decomposition that the four invariants formalize — because the structure of the problem forced it. Chapter 8 traces this intellectual lineage in full detail. The framework gives you the questions upfront; future networked systems should converge on their design faster.
2.6 Decomposition: Why Systems Have Parts
The Internet is one system with a single global purpose: deliver data between endpoints. Yet no one designs the Internet as a single system. It decomposes into subsystems — link-layer medium access, network-layer routing, transport-layer reliability, application-layer adaptation — each answering the four invariants independently, coordinated through well-defined interfaces.
The decomposition runs along multiple axes simultaneously: administrative (Autonomous Systems under independent control), functional (control plane vs. data plane), protocol stack (layers hiding complexity behind interfaces), endpoint (intelligence at edges, stateless forwarding in the middle), traffic (per-flow vs. aggregate queuing), and temporal (routing reconverges over minutes; congestion control adapts per RTT; scheduling allocates per slot). Each axis produces a different cut, and each cut determines what coordination signals flow between the pieces.
Each decomposition yields subsystems. Each subsystem answers the four invariants. The framework in this chapter — invariants, principles, dependency graphs — is the analytical toolkit applied within each subsystem. Decomposition also reveals connections: the Interface invariant is precisely the coordination signal between subsystems. When we analyze queue management (Chapter 7), the loss signal it sends to transport is an interface coupling two subsystems. When that signal degrades — an oversized buffer absorbing many RTTs’ worth of traffic before dropping a packet — the subsystems diverge, and we get bufferbloat. Understanding why subsystems are separate and how they are coupled is as important as understanding how each one works internally.
2.7 The Structure of the Book
The rest of the book applies the framework to the major subsystems of the Internet, organized in six parts. Each part answers a distinct engineering question; each chapter within a part traces the anchor, invariants, principles, and dependency graph for a specific system.
| Part | Chapters | Engineering Question |
|---|---|---|
| Link Layer | Ch 2-4 | How to share a physical medium? |
| Network Layer | Ch 5-7 | How to address, route, and manage queues? |
| Transport | Ch 8-9 | How to deliver reliably end-to-end? |
| Application | Ch 10-11 | How to serve users over a best-effort network? |
| Cross-Cutting | Ch 12 | How to observe what is happening? |
| Agentic Systems | Ch 13 | How to build intelligent systems on this infrastructure? |
A real networked system — a video call over WiFi, a web page load over LTE — instantiates several parts simultaneously. Each part’s output shapes the next part’s environment: transport generates the arrival rate that queue management must handle; queue management produces the signals that transport interprets as congestion feedback; applications receive whatever throughput and delay the layers below deliver.
These systems are familiar ground examined through a new lens. You have already seen medium access in Ethernet’s CSMA/CD. You have already seen transport in TCP’s sequence numbers and acknowledgments. You have already seen queue management implicitly — every router has buffers, and buffer overflow is the packet loss that TCP interprets as congestion. This book examines each system through the invariants lens, revealing design choices that introductory treatments leave invisible: why does Ethernet use exponential backoff rather than a fixed retry interval? Why does TCP halve its window on loss rather than reduce it by a fixed amount? Why do routers use FIFO queuing rather than per-flow scheduling? Each choice is an answer to one or more invariants, shaped by the anchor constraints that system faces.
2.8 Objectives, Failure, and Meta-Constraints
Different systems optimize different objectives, which is why the same invariant answer is good in one setting and bad in another. The four invariants describe structural anatomy — but every system also has a purpose. Objectives — throughput maximization, fairness, bounded latency, energy efficiency — define what “answering well” means. Failure semantics — what happens when state is stale, measurement is wrong, coordination breaks, or interfaces are violated — define the cost of answering badly.
Objectives and failure are evaluation criteria, distinct from the four invariants. A dependency graph without objectives is a description. A dependency graph with objectives is a design argument. This framework supplies the structural vocabulary; objectives and failure supply the evaluative vocabulary. Both are necessary.
Meta-constraints shape all invariant answers from outside the technical design space. They explain why the technically superior design sometimes loses:
- Incremental deployability. A protocol that requires coordinated upgrades across the Internet will not be deployed, regardless of its technical merits. TCP survived and evolved because it is endpoint-only — a new congestion control algorithm (Cubic, BBR) can be deployed on a single server without any router upgrade. DNS survived because new record types (AAAA for IPv6, TLSA for DANE, CAA for certificate authority authorization) require zero resolver upgrades — resolvers that encounter an unknown record type simply pass it through. DHCP survived because its option space is extensible — new options (vendor-specific information, SIP server addresses, NTP server addresses) can be added without breaking old clients, which simply ignore unrecognized options. The common pattern: all three protocols were designed with extension points that allow incremental change without coordinated deployment.
- Backward compatibility. Every successful protocol must coexist with its predecessors. IPv6 must coexist with IPv4 on the same Internet — dual-stack deployment, tunneling (6to4, Teredo), and translation (NAT64) mechanisms exist solely because of this constraint. DNS’s EDNS0 extension is explicitly backward-compatible: a resolver encountering an unknown OPT pseudo-record ignores it and responds within the traditional 512-byte UDP limit. TCP options are negotiated during the three-way handshake — the SYN segment advertises supported options (window scaling, SACK, timestamps), and the peer ignores unrecognized options rather than rejecting them. This design allows TCP to evolve without breaking connections to older implementations.
- Administrative boundaries. The Internet is a federation of tens of thousands of independently administered networks. This reality constrains every coordination model. BGP exists because no single routing protocol can span administrative boundaries where operators have conflicting economic incentives and refuse to share internal topology. DHCP is scoped to individual subnets because each network administers its own address space — a DHCP server on one organization’s network has no authority over addresses on another’s. DNS’s hierarchical delegation mirrors administrative hierarchy: ICANN controls the root, registries control TLDs, and organizations control their own zones. Every protocol requiring global state has failed to deploy at Internet scale, because the authority to maintain it would have to span all administrative domains.
- Hardware economics. Router hardware constrains what processing can happen per packet. Simple FIFO queuing dominates because it requires no per-flow state and no per-packet computation — alternatives like per-flow fair queuing are more expensive to implement at line rate.
- Standardization history. Some separations reflect organizational boundaries (who was in the room at the IETF), not analytical independence. The TCP/UDP split at the transport layer is as much historical artifact as principled disaggregation.
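The extension-point pattern behind the first two meta-constraints, ignore what you do not recognize, can be sketched as a type-length-value parser in the DHCP style. The option numbers below are real DHCP assignments (1 = subnet mask, 3 = router, 120 = SIP servers), but which options count as "known" to this particular old client is our illustrative choice.

```python
# Type-length-value parsing with the "ignore unknown" rule that makes
# incremental deployment possible: an old client skips options it does
# not recognize instead of rejecting the whole message. Layout is the
# DHCP style: 1-byte type, 1-byte length, then the value.
KNOWN = {1: "subnet_mask", 3: "router"}   # what this (old) client understands

def parse_options(data: bytes) -> dict:
    out, i = {}, 0
    while i < len(data):
        t = data[i]
        if t == 255:                       # END option terminates the list
            break
        length = data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if t in KNOWN:
            out[KNOWN[t]] = value
        # unknown type: skip silently; this is the extension point
        i += 2 + length
    return out

msg = bytes([1, 4, 255, 255, 255, 0,      # subnet mask 255.255.255.0
             120, 2, 0, 1,                # option 120, unknown to this client
             3, 4, 192, 168, 1, 1,        # router 192.168.1.1
             255])                        # end marker
opts = parse_options(msg)
print(opts)
```

The length byte is what makes skipping possible: a client can step over a value it cannot interpret, which is why new options deploy without coordinated upgrades.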
2.9 What the Framework Predicts
The framework so far describes and explains: four invariants, three principles, dependency graphs, decomposition. But a framework that only describes is a taxonomy. The test of a design tool is whether it predicts. This framework makes concrete predictions — testable claims that follow from tracing the dependency graph.
Prediction 1: Direct coupling + distributed coordination produces destructive scaling. Any system where agents share environment state, act through direct coupling (my action destroys your action), and coordinate without central authority will face destructive scaling as agent count grows. Ethernet’s shared bus confirmed this; Aloha confirmed it earlier (Abramson 1970). WiFi confirms it in a setting where dedicated links are physically impossible: collision probability rises superlinearly with station count, which is why WiFi 6 introduced centralized scheduling via OFDMA. Chapter 3 traces this arc in full.
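Prediction 1 can be checked analytically for slotted random access. If each of N stations transmits in a slot with probability p, the slot succeeds only when exactly one transmits. The closed form below is the textbook slotted-ALOHA-style model, not any specific standard; it peaks near N = 1/p and collapses beyond it, which is the destructive-scaling pattern.

```python
# Slotted random access: each of N stations transmits in a slot with
# probability p; the slot succeeds iff exactly one station transmits.
# P(success) = N * p * (1 - p)**(N - 1), maximized near N = 1/p and
# collapsing toward zero as N grows for any fixed p.
def success_prob(n: int, p: float) -> float:
    return n * p * (1 - p) ** (n - 1)

p = 0.1
probs = {n: round(success_prob(n, p), 3) for n in (2, 10, 50, 100)}
print(probs)
```

Adapting p downward as N grows (which is what exponential backoff approximates) only slows the decline; it cannot remove the collision coupling itself, which is why WiFi 6 reached for centralized scheduling.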
Prediction 2: When environmental constraints shift, the invariant under greatest pressure restructures first, and the change propagates through the dependency graph. High bandwidth-delay product paths pressured TCP’s State invariant (loss signals arrive too late); BBR restructured State by building an explicit path model from delivery-rate and min-RTT measurements, changing Time (windowed estimators) and closed-loop dynamics (pacing instead of window inflation) while leaving Coordination distributed. The delay-based approach itself dates to TCP Vegas (Brakmo et al. 1994), which failed in mixed ecosystems because loss-based flows consumed queue space that Vegas voluntarily ceded — a Coordination problem BBR sidestepped through single-administrator deployment. Chapter 8 traces this evolution in full.
Prediction 3: Relaxing interface or coordination constraints enables tighter belief-environment coupling and operation at tighter margins. A single-administrator datacenter can mandate ECN marking on every switch and DCTCP on every sender, achieving near-zero queue occupancy at 90%+ utilization — margins impossible on the open Internet where mandating universal ECN support is infeasible. Chapter 8, Act 5 analyzes this in depth.
These predictions are falsifiable. A system with direct coupling and distributed coordination that scales gracefully would challenge Prediction 1. An environmental shift that restructures an invariant other than the most-pressured one would challenge Prediction 2. Falsifiability gives the framework empirical teeth.
2.10 The Method
Given any system — one you are designing, studying, or reviewing — follow five steps:
Step 1. Identify the anchor. What constraints are inherited from physics, architecture, or deployment history? These are the hardest to change and the most consequential for all other choices.
Step 2. Answer the four invariants. For each: what state exists (environment, measurement, belief)? How is time handled? Who coordinates? What are the interfaces? Be specific — name the actual state variables, the actual signals, the actual decision rules.
Step 3. Trace the dependency graph. Which invariant answers constrain which others? How does the anchor narrow the feasible space? Where do principles (disaggregation, closed-loop, decision placement) explain the choices?
Step 4. Evaluate the closed-loop dynamics. Given these invariant answers, will the system converge? What happens when the measurement signal degrades? When the environment changes? Where are the failure modes?
Step 5. Check the meta-constraints. Is this design deployable? Backward-compatible? Economically viable? What standardization or organizational factors shape the design beyond technical merit?
A useful question for reviewing any systems paper: which invariant does this system fundamentally improve, and how does that change ripple through the dependency graph?
Worked example: applying the five steps to DNS. To see the method in action, apply all five steps to a protocol you already know.
Step 1. Identify the anchor. The global namespace is too large for a single server (billions of records), and the Internet’s administrative decentralization means no single entity has authority over all names. These two constraints — scale and administrative fragmentation — anchor the design.
Step 2. Answer the four invariants.
- State: hierarchically distributed. The root zone stores pointers to TLD servers; TLD servers store pointers to authoritative servers; authoritative servers store the actual records. Caching resolvers at every level store copies with TTL-bounded lifetimes. The environment state is the current set of name-to-address mappings. The measurement signal is the query-response exchange. The belief is the cached record — which drifts from reality between TTL refreshes.
- Time: prescribed by the authority. Each record carries a TTL set by the zone administrator. Resolvers count down the TTL and re-query when it expires. The authority controls the freshness-load tradeoff unilaterally.
- Coordination: hierarchical delegation. Each level delegates authority to the level below. The root delegates .com to Verisign; Verisign delegates google.com to Google. No two authorities need to agree on anything outside their own zone.
- Interface: UDP on port 53, with TCP fallback for large responses. EDNS0 extends the UDP size limit. DoH and DoT encrypt queries for privacy.
Step 3. Trace the dependency graph. The anchor (scale + decentralization) forces hierarchical coordination — no single server can handle all queries, and no single entity has universal authority. Hierarchical coordination forces distributed state — each level caches independently, with no central cache. Distributed cached state forces TTL-based time — without a mechanism to invalidate caches, expiration is the only way to bound staleness. The interface (UDP) reflects the query-response pattern and latency sensitivity of name resolution.
Step 4. Evaluate the closed-loop dynamics. The feedback loop is: query → cache → serve from cache → TTL expires → re-query. The loop period is the TTL. When the loop is too slow (TTL too long), resolvers serve stale records — users reach decommissioned servers. When the loop is too fast (TTL too short), authoritative servers face query storms. The loop has no adaptive gain — the TTL is a static parameter. The system converges to correct state after one TTL period, but during that period, inconsistency is tolerated. Global synchronization is possible: if a popular record’s TTL expires at the same time across many resolvers, the authoritative server faces a coordinated burst.
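The TTL feedback loop described above can be sketched as a toy stub-resolver cache. This is a minimal illustration, not a DNS-compliant resolver; the class and method names are hypothetical, and the clock is injectable so the loop period can be observed directly.

```python
import time

class TTLCache:
    """Toy stub-resolver cache: the cached record is the resolver's
    belief, which may drift from reality until the TTL expires."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock              # injectable for testing
        self._cache = {}                # name -> (record, expiry time)
        self.authoritative_queries = 0  # load seen by the authority

    def resolve(self, name, authoritative_lookup, ttl):
        now = self.clock()
        entry = self._cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # serve the (possibly stale) belief
        # Belief absent or expired: re-measure by querying the authority.
        self.authoritative_queries += 1
        record = authoritative_lookup(name)
        self._cache[name] = (record, now + ttl)
        return record
```

During the TTL window every query is answered locally; the authority sees at most one query per name per TTL period. That ratio is exactly the freshness-load tradeoff the zone administrator controls by setting the TTL.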
Step 5. Check the meta-constraints. DNS is incrementally deployable — new record types can be added without upgrading resolvers (resolvers pass through any type they encounter). EDNS0 is backward-compatible — old servers ignore the OPT record. Administrative boundaries are respected — each organization controls its own zone. The root zone’s governance (ICANN) is a political meta-constraint that shapes the entire hierarchy. DNS-over-HTTPS raises a new meta-constraint: it shifts trust from the network path to the resolver operator, a change with political and regulatory dimensions that the technical design alone leaves open.
2.11 Generative Exercise: QUIC and the Interface Renegotiation
QUIC moves transport over UDP to avoid middlebox ossification (Langley et al. 2017). The Interface invariant is under pressure: middleboxes inspect and modify TCP headers, making new transport features undeployable. QUIC renegotiates the interface — and the framework predicts the cascade:
- Interface: TCP → UDP encapsulation. Encrypted headers are opaque to middleboxes.
- State: Connection state moves to user-space. Connection migration (changing IP mid-session) becomes feasible — the identifier is a QUIC-level concept, not an IP 5-tuple.
- Time: 0-RTT handshake becomes possible. TCP’s three-way handshake is tied to kernel SYN processing; QUIC caches credentials and resumes immediately.
- Coordination: Unchanged — still distributed, still endpoint-driven.
The framework predicts that an interface renegotiation cascades through State and Time but leaves Coordination intact — because the environmental pressure was on interface ossification, not on who decides. QUIC’s integrated design — combining transport, encryption, and stream multiplexing in a single protocol — also realizes the Application Level Framing principle of Clark and Tennenhouse (1990), who argued that protocol processing should be organized around application-visible data units rather than rigid layer boundaries. Where ALF was an architectural argument for why strict layering wastes performance, QUIC is the engineering proof: by integrating functions that the traditional stack separated across layers, QUIC achieves 0-RTT handshakes, connection migration, and independent stream processing that the layered TCP + TLS + HTTP/2 stack could not.
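The connection-migration point can be made concrete with a toy demultiplexer — a sketch under simplifying assumptions, not real QUIC or TCP internals. A session table keyed on the 5-tuple loses the session when the client's address changes; a table keyed on a connection ID carried in every packet does not.

```python
# Toy session tables. A TCP-style endpoint demultiplexes on the 5-tuple,
# so a client address change (e.g., WiFi -> cellular) orphans the session.
# A QUIC-style endpoint demultiplexes on a connection ID, so the same
# session survives the address change.

tcp_sessions = {}   # (src_ip, src_port, dst_ip, dst_port, proto) -> session
quic_sessions = {}  # connection_id -> session

def tcp_lookup(src_ip, src_port, dst_ip, dst_port):
    return tcp_sessions.get((src_ip, src_port, dst_ip, dst_port, "tcp"))

def quic_lookup(conn_id):
    return quic_sessions.get(conn_id)

# Establish one session under each model (illustrative values).
tcp_sessions[("10.0.0.1", 5000, "192.0.2.1", 443, "tcp")] = "session-A"
quic_sessions["cid-42"] = "session-A"

# After the client migrates from 10.0.0.1 to 172.16.0.9, the TCP-style
# lookup misses, while the QUIC-style lookup still finds the session.
```

The design choice is visible in the dictionary keys alone: moving the session identifier out of the IP header and into the transport payload is what relocates the State invariant.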
Exercise for the reader: HTTP/3 runs exclusively over QUIC. What invariant answers change at the application layer compared to HTTP/2 over TCP? Trace the dependency graph.
2.11.1 In-Class Discussion Exercises
These exercises are designed for group work during class. Each works well in 10–15 minutes.
Exercise 1: What if DNS had no caching? Suppose every DNS query had to traverse the full hierarchy — root server, TLD server, authoritative server — every time. Trace the impact on each invariant. How does the State invariant change? (No local belief — every query produces a fresh answer, but load on authoritative servers increases by orders of magnitude.) How does the Time invariant change? (No TTL, no staleness — but no buffering against server unreachability either.) How does the Coordination invariant change? (The hierarchy still delegates, but every resolver must contact every level for every query — the hierarchy bears the full query load.) What would happen to root server traffic? (Caching currently absorbs the overwhelming majority of queries before they reach the root; without it, the 13 root server identities — each an anycast cluster — would face nearly every DNS query on the Internet, orders of magnitude beyond their present load.)
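The order-of-magnitude claim in this exercise can be checked with a one-line model. The numbers below are illustrative assumptions, not measurements: a nominal global query rate and a nominal cache hit rate.

```python
def origin_load(client_qps, cache_hit_rate):
    """Queries per second that escape the caches and reach the
    authoritative hierarchy (root, TLD, authoritative servers)."""
    return client_qps * (1.0 - cache_hit_rate)

# Illustrative (made-up) numbers: if caches answer 99.9% of queries,
# removing them multiplies the load on the hierarchy by ~1000x.
with_caching = origin_load(1e9, 0.999)
without_caching = origin_load(1e9, 0.0)
```

The point is not the specific figures but the structure: the hierarchy's load is linear in the cache miss rate, so caching is what makes hierarchical coordination scale.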
Exercise 2: What if DHCP leases never expired? Suppose the server grants an address permanently — no lease duration, no renewal, no reclamation. What problems arise? (Address exhaustion: departed clients never return their addresses. A coffee shop with a /24 subnet — 254 usable addresses — would exhaust its pool in a day. The server would have no mechanism to distinguish a client that left five minutes ago from one that will return tomorrow.) Which invariant is most affected? (Time — the lease duration is the mechanism that couples address allocation to actual usage. Without it, the state invariant degrades: the server’s belief about who holds each address diverges from reality, with no measurement signal to correct it.)
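The belief-drift point in this exercise can be sketched as a toy lease pool — a hypothetical class, not the RFC 2131 state machine. Expiry is the only mechanism that returns addresses from the server's belief back to the free pool.

```python
class LeasePool:
    """Toy DHCP address pool: leases expire so the server's belief
    about who holds each address cannot drift from reality forever."""

    def __init__(self, addresses, lease_duration, clock):
        self.free = set(addresses)
        self.leases = {}                # address -> (client_id, expiry)
        self.duration = lease_duration
        self.clock = clock              # injectable for testing

    def allocate(self, client_id):
        self._reclaim_expired()
        if not self.free:
            return None                 # pool exhausted
        addr = self.free.pop()
        self.leases[addr] = (client_id, self.clock() + self.duration)
        return addr

    def _reclaim_expired(self):
        # The Time invariant at work: expiry corrects the State invariant.
        now = self.clock()
        for addr, (_, expiry) in list(self.leases.items()):
            if now >= expiry:
                del self.leases[addr]
                self.free.add(addr)
```

Setting `lease_duration` to infinity disables `_reclaim_expired` in effect — which is precisely the failure mode the exercise asks about: allocation becomes a one-way flow from free pool to leases.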
Exercise 3: Compare ARP and DHCP through the four invariants. Both protocols operate at bootstrap time on a LAN. Both use broadcast. But they solve different problems and make different structural choices. ARP maps IP addresses to MAC addresses; DHCP assigns IP addresses. Apply the four invariants to ARP: State (distributed — every host caches its own ARP table), Time (prescribed timeout, typically 20 minutes), Coordination (none — any host can claim any mapping, which is why ARP spoofing is trivially easy), Interface (broadcast query, unicast reply). Why is ARP’s coordination model so different from DHCP’s? (ARP maps an already-allocated resource — MAC addresses are pre-assigned by manufacturers — so the uniqueness constraint that forces DHCP’s centralization is absent.)
Exercise 4: Explicit rate feedback. TCP uses loss as a congestion signal. Suppose routers could send explicit rate feedback — “you may send at 50 Mbps on this path.” Which invariants change? (State changes: the sender receives an authoritative measurement signal instead of inferring capacity from loss. Time changes: the feedback loop tightens — the sender learns the available rate in one RTT rather than probing over many RTTs. Coordination changes: the router becomes an active participant in rate allocation, shifting from distributed to hybrid coordination.) What are the deployment obstacles? (Every router on the path must participate — a single legacy router that does not send rate feedback breaks the mechanism. This is the incremental deployability meta-constraint.)
2.11.2 Independent Reading Exercises
These exercises are designed for deeper self-study. Each requires applying the framework to a system not covered in class.
Exercise 5: Apply the five-step method to HTTP/2. Identify the anchor constraint that motivated HTTP/2’s design (head-of-line blocking in HTTP/1.1, where a single slow response blocks all subsequent responses on the same connection). Answer the four invariants: what state does HTTP/2 maintain (stream multiplexing state, HPACK header compression state, flow control windows per stream)? How is time handled (stream priorities, flow control window updates)? Who coordinates (client and server negotiate settings; server can push resources)? What is the interface (binary framing layer over a single TCP connection)? Trace the dependency graph from the anchor. Evaluate the closed-loop dynamics — in particular, how does TCP’s head-of-line blocking at the transport layer undermine HTTP/2’s stream multiplexing at the application layer?
Exercise 6: OSPF vs. BGP — why different coordination choices? Both OSPF and BGP solve routing. Apply the framework to explain why they make radically different coordination choices. Start with the anchor: OSPF operates within a single administrative domain (one operator controls all routers); BGP operates across administrative boundaries (each AS has independent, often competing, interests). Trace how this anchor difference produces different answers for every invariant. Why does OSPF share complete topology information while BGP shares only reachability? Why does OSPF converge to a globally consistent view while BGP may never converge? Why does OSPF use link-state flooding while BGP uses path-vector announcements?
Exercise 7: Design a configuration distribution protocol. You need to distribute configuration files to 1,000 servers in a datacenter. The files change once per hour on average. Use the framework: identify your anchor constraint (single administrative domain, low latency requirement, file consistency across all servers). Answer the four invariants — who holds the authoritative copy? How do servers learn about updates? How is time handled (polling interval vs. push notification)? What interface do servers use to retrieve files? Justify each choice by tracing it to the anchor. Compare your design to a pull-based approach (servers poll periodically) and a push-based approach (a central server notifies all servers on every change). Which invariant answers differ, and why?
Exercise 8: QUIC and middlebox dependency graphs. QUIC encrypts its transport headers, making them opaque to middleboxes (firewalls, NATs, load balancers, traffic shapers) that previously inspected TCP headers to make forwarding, filtering, or shaping decisions. Predict how this changes the dependency graph for these middleboxes. What state did middleboxes maintain about TCP connections (sequence numbers for stateful firewalling, connection tracking for NAT)? How did they obtain that state (passive observation of TCP headers)? What happens when the measurement signal disappears (encrypted QUIC headers)? How must the middlebox’s coordination model change? Consider specifically: how does a stateful firewall track QUIC connections without seeing transport headers? How does a load balancer distribute QUIC connections across backend servers without inspecting the transport layer?
2.12 References
- Abramson, N. (1970). “The ALOHA System — Another Alternative for Computer Communications.” Proc. AFIPS Fall Joint Computer Conference.
- Alizadeh, M. et al. (2010). “Data Center TCP (DCTCP).” Proc. ACM SIGCOMM.
- Braden, R. (1989). “Requirements for Internet Hosts — Communication Layers.” RFC 1122.
- Brakmo, L., O’Malley, S., and Peterson, L. (1994). “TCP Vegas: New Techniques for Congestion Detection and Avoidance.” Proc. ACM SIGCOMM.
- Cardwell, N., Cheng, Y., Gunn, C.S., Yeganeh, S.H., and Jacobson, V. (2016). “BBR: Congestion-Based Congestion Control.” ACM Queue, 14(5).
- Cerf, V.G. and Kahn, R.E. (1974). “A Protocol for Packet Network Intercommunication.” IEEE Trans. Communications, COM-22(5).
- Chiu, D.-M. and Jain, R. (1989). “Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks.” Computer Networks and ISDN Systems, 17(1).
- Clark, D. (1988). “The Design Philosophy of the DARPA Internet Protocols.” Proc. ACM SIGCOMM.
- Clark, D.D. and Tennenhouse, D.L. (1990). “Architectural Considerations for a New Generation of Protocols.” Proc. ACM SIGCOMM.
- Deering, S. (1998). “Watching the Waist of the Protocol Hourglass.” Keynote, IETF 43.
- Demers, A., Keshav, S., and Shenker, S. (1989). “Analysis and Simulation of a Fair Queueing Algorithm.” Proc. ACM SIGCOMM.
- Droms, R. (1997). “Dynamic Host Configuration Protocol.” RFC 2131.
- Fischer, M.J., Lynch, N.A., and Paterson, M.S. (1985). “Impossibility of Distributed Consensus with One Faulty Process.” Journal of the ACM, 32(2).
- Ha, S., Rhee, I., and Xu, L. (2008). “CUBIC: A New TCP-Friendly High-Speed TCP Variant.” ACM SIGOPS Operating Systems Review, 42(5).
- Jacobson, V. (1988). “Congestion Avoidance and Control.” Proc. ACM SIGCOMM.
- Karn, P. and Partridge, C. (1987). “Improving Round-Trip Time Estimates in Reliable Transport Protocols.” Proc. ACM SIGCOMM.
- Lamport, L. (1978). “Time, Clocks, and the Ordering of Events in a Distributed System.” Communications of the ACM, 21(7).
- Langley, A. et al. (2017). “The QUIC Transport Protocol: Design and Internet-Scale Deployment.” Proc. ACM SIGCOMM.
- McKeown, N. et al. (2008). “OpenFlow: Enabling Innovation in Campus Networks.” ACM SIGCOMM Computer Communication Review, 38(2).
- Mockapetris, P. (1987). “Domain Names — Concepts and Facilities.” RFC 1034; “Domain Names — Implementation and Specification.” RFC 1035.
- Nagle, J. (1984). “Congestion Control in IP/TCP Internetworks.” RFC 896.
- Nichols, K. and Jacobson, V. (2012). “Controlling Queue Delay.” ACM Queue, 10(5).
- Postel, J. (1981). “Transmission Control Protocol.” RFC 793.
- Ramakrishnan, K., Floyd, S., and Black, D. (2001). “The Addition of Explicit Congestion Notification (ECN) to IP.” RFC 3168.
- Saltzer, J.H., Reed, D.P., and Clark, D.D. (1984). “End-to-End Arguments in System Design.” ACM Trans. Computer Systems, 2(4).
- Shannon, C.E. (1948). “A Mathematical Theory of Communication.” Bell System Technical Journal, 27(3).
- Wiener, N. (1948). Cybernetics: Or Control and Communication in the Animal and the Machine. MIT Press.
- Zhang, L. (1986). “Why TCP Timers Don’t Work Well.” Proc. ACM SIGCOMM.
- 3GPP (2018). “Study on New Radio Access Technology.” 3GPP TR 38.912.
This chapter is part of “A First-Principles Approach to Networked Systems” by Arpit Gupta, UC Santa Barbara, licensed under CC BY-NC-SA 4.0.
The multiplier 4 approximates the 95th percentile of RTT variation under the assumption that deviations are roughly normally distributed. Jacobson (1988) showed empirically that 4× RTTVAR provides a good balance between premature timeouts (multiplier too small) and delayed loss detection (multiplier too large).↩︎
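The estimator behind this footnote can be written in a few lines. This sketch uses the standard EWMA constants (α = 1/8, β = 1/4, K = 4, as in RFC 6298); the class name is hypothetical, and initialization of RTTVAR to half the first sample follows the RFC's convention.

```python
class RTOEstimator:
    """Jacobson-style retransmission timeout: RTO = SRTT + 4 * RTTVAR."""
    ALPHA, BETA, K = 1 / 8, 1 / 4, 4

    def __init__(self, first_rtt):
        self.srtt = first_rtt
        self.rttvar = first_rtt / 2

    def update(self, rtt_sample):
        # RTTVAR is updated against the *old* SRTT, then SRTT is smoothed.
        self.rttvar = ((1 - self.BETA) * self.rttvar
                       + self.BETA * abs(self.srtt - rtt_sample))
        self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt_sample

    @property
    def rto(self):
        return self.srtt + self.K * self.rttvar
```

With steady RTT samples, RTTVAR decays toward zero and the RTO converges to SRTT; a sudden delay spike inflates RTTVAR immediately (β = 1/4 weights new deviations heavily), backing the timer off before a spurious retransmission fires.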
T1 = 50% of the lease duration gives the client a 50% probability of reaching the original server before T2. T2 = 87.5% (7/8) allows a final broadcast attempt before lease expiration. These fractions are specified in RFC 2131 (Droms 1997) as engineering compromises between renewal frequency and server load.↩︎
The multiplicative decrease factor of 1/2 is constrained by the Chiu-Jain proof (Chiu and Jain 1989): to converge toward fairness, the decrease must be multiplicative (not additive) and strictly less than 1. The factor 1/2 was chosen empirically — aggressive enough to clear congestion quickly, conservative enough to avoid throughput collapse.↩︎
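The convergence argument in this footnote can be observed in a toy simulation — an idealized illustration of the Chiu-Jain dynamics (synchronized feedback, fluid flows), not a model of real TCP.

```python
def aimd_step(rates, capacity, alpha=1.0, beta=0.5):
    """One control interval of AIMD for flows sharing one bottleneck."""
    if sum(rates) > capacity:
        return [r * beta for r in rates]  # multiplicative decrease
    return [r + alpha for r in rates]     # additive increase

# Two flows start far from fairness. Additive increase preserves their
# difference; each multiplicative decrease halves it. Over repeated
# congestion cycles the difference shrinks toward zero.
rates = [90.0, 10.0]
for _ in range(300):
    rates = aimd_step(rates, capacity=100.0)
```

Replacing the multiplicative decrease with a subtractive one (`r - c`) preserves the difference between flows forever — which is exactly why the Chiu-Jain result requires the decrease to be multiplicative.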