Lecture 2: Chapter 1 — First Principles for Networked Systems (The Four Invariants)

Course: CS176C — Advanced Topics in Internet Computing, Spring 2026
Instructor: Arpit Gupta, UC Santa Barbara
Date: April 2, 2026
Slides: Deployed slide deck
Pre-requisite: L1 (Welcome & Course Overview)

Five protocols, five different designs — why?

Last lecture, you loaded a web page. Five protocols fired in sequence: DHCP obtained an IP address from a server, ARP resolved the router’s MAC address, DNS resolved google.com to an IP, TCP established a reliable connection, and HTTP fetched the page content. These five protocols share a network, share a purpose (move data from here to there), and yet their internal architectures look nothing alike. DHCP and DNS use centralized or hierarchical coordination; ARP and TCP do not. The difference is not historical accident — it is forced by structural constraints.

This observation raises the question the entire lecture answers. Lecture 1 showed a table with four rows (State, Time, Coordination, Interface) and TCP’s answers to each. Today the goal is to learn the tool that generates those answers for any system.

The tool from Lecture 1 — now we learn to use it

Lecture 1’s teaser table gave TCP’s invariant answers:

Invariant	Question	TCP’s answer
Interface	What’s exposed, what’s hidden?	Socket API above, IP datagrams below
Coordination	Who decides — and why?	Distributed — each sender alone
State	What does it believe? Is that correct?	cwnd, SRTT — inferred from ACKs
Time	When are decisions made — and how?	Event-driven (ACKs) + inferred timers (RTO)

Those were the answers. The question now is why each answer is forced — and how to generate answers for any system.

The dependency chain starts from the binding constraint. For TCP, the binding constraint is Interface (IP’s unreliable datagrams). That shapes Coordination, which shapes State, which shapes Time. For other systems, the chain may start from a different invariant. WiFi’s binding constraint is the shared medium (physics forces Coordination first). Video streaming’s binding constraint is human perception (Time first). The starting point depends on the system. The tool is universal.

Interface: what did TCP inherit?

The Interface invariant asks: what’s exposed, what’s hidden — and can it ever change?

TCP sits between two interfaces it did not choose. Below it lies IP’s contract: best-effort datagrams that are unreliable and unordered, with no guarantees [1]. IP deliberately leaves reliability out of the network layer, so TCP must supply it at the endpoints. Above TCP sits the socket API: connect(), send(), recv(), essentially unchanged since 1983, forty-three years of frozen interface [2]. Applications depend on this contract, so TCP cannot change its own API without breaking everything above it. QUIC had to run over UDP to escape this ossification [3].

Interface is the most permanent invariant. IP’s choice forced TCP to exist. TCP’s API froze its evolution.

Coordination: why is TCP on its own?

The Coordination invariant asks: who decides — and what constraint forced that choice?

Given IP’s interface, could a central server schedule all TCP flows and tell each sender exactly how fast to send? Three constraints prohibit it. First, administration: no single entity controls the Internet, because thousands of autonomous systems operate under different ownership [4]. Second, scale: billions of flows need decisions on microsecond timescales, and no single server can keep up. Third, trust: who gets authority over your bandwidth? Would you accept a server in another country throttling your connection?

TCP must be distributed: each endpoint runs independently, making its own decisions with only local information. This is not a design preference — it is forced by the constraints [4].

The consequence: each endpoint needs its own machine

Because TCP is distributed, each endpoint runs its own finite state machine (FSM). A simplified view of the path taken by the side that opens and then closes the connection:

CLOSED → SYN_SENT → ESTABLISHED → FIN_WAIT → CLOSED

The FSM makes every decision: whether to send data, how much, whether to retransmit, when to give up, how to tear down [2]. No coordinator tells TCP what to do — the FSM reacts to local events (ACK arrived, data ready, timeout fired) using only its own internal state. Coordination dictated the FSM design: if TCP were centralized, endpoints would not need their own state machines; a controller would decide for them.
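To make the "own machine" idea concrete, here is a minimal sketch of the simplified four-state path above as a transition table. The event names (app_connect, syn_ack_received, app_close, teardown_complete) are hypothetical labels chosen for this illustration, not names from RFC 793, and the real TCP state machine has more states and transitions than are shown here.

```python
from enum import Enum, auto


class TcpState(Enum):
    CLOSED = auto()
    SYN_SENT = auto()
    ESTABLISHED = auto()
    FIN_WAIT = auto()


# Transition table: (current state, local event) -> next state.
# Event names are hypothetical labels for this sketch.
TRANSITIONS = {
    (TcpState.CLOSED, "app_connect"): TcpState.SYN_SENT,
    (TcpState.SYN_SENT, "syn_ack_received"): TcpState.ESTABLISHED,
    (TcpState.ESTABLISHED, "app_close"): TcpState.FIN_WAIT,
    (TcpState.FIN_WAIT, "teardown_complete"): TcpState.CLOSED,
}


class Endpoint:
    """One endpoint's state machine. It consults no external coordinator."""

    def __init__(self):
        self.state = TcpState.CLOSED

    def handle(self, event):
        # Events with no entry for the current state leave it unchanged.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state


def run_events(events):
    """Feed a sequence of local events to a fresh endpoint; return state names."""
    endpoint = Endpoint()
    return [endpoint.handle(event).name for event in events]
```

Every input to handle() is a local event, and the only thing consulted is the endpoint's own state. A centralized design would replace this table with a message from a controller.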

The FSM is not a separate concept from coordination. It is a consequence of distributed coordination. This link is critical — students who learn State and Coordination separately miss it.

State: what must the FSM track?

The State invariant asks: what does the system believe? Is that belief correct?

The FSM needs internal variables to make decisions. TCP’s primary goal is to figure out how fast to send without overwhelming the network. Understanding how belief forms requires distinguishing three layers:

Layer	What it is	TCP
Environment	What’s actually happening	Bottleneck capacity, queue occupancy, competing traffic
Measurement	What TCP can observe	ACK arrivals (timing + sequence), duplicate ACKs
Belief	The FSM’s internal model	cwnd (capacity estimate), SRTT (delay estimate)

The FSM decides based on belief, never environment — it cannot observe the network directly. The congestion window (cwnd) is TCP’s estimate of how many bytes the network can handle in flight; it grows via additive increase (one segment per RTT on ACK success) and shrinks via multiplicative decrease (halve on loss) [5]. The smoothed round-trip time (SRTT) is computed via Jacobson’s algorithm [6]: an exponentially weighted moving average of observed RTT samples, plus a variance estimate. When belief diverges from environment, the system fails — even if the FSM logic is perfect.
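The additive-increase, multiplicative-decrease rule can be sketched in a few lines. This is an illustration under stated assumptions, not a TCP implementation: the segment size MSS = 1460 bytes and the starting window are assumed values, slow start and fast recovery are omitted, and each event stands for one full round trip.

```python
MSS = 1460  # assumed segment size in bytes


def on_rtt_without_loss(cwnd):
    """Additive increase: grow by one segment per loss-free round trip."""
    return cwnd + MSS


def on_loss(cwnd):
    """Multiplicative decrease: halve, but never drop below one segment."""
    return max(cwnd // 2, MSS)


def trace(events, cwnd=10 * MSS):
    """Return the cwnd history for a sequence of 'ok' / 'loss' round trips."""
    history = [cwnd]
    for event in events:
        cwnd = on_loss(cwnd) if event == "loss" else on_rtt_without_loss(cwnd)
        history.append(cwnd)
    return history
```

For example, trace(["ok", "ok", "loss", "ok"]) yields [14600, 16060, 17520, 8760, 10220]. Note that the function never sees the bottleneck itself. It only sees "ok" or "loss", which is the measurement layer.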

When does the belief go wrong?

TCP’s most infamous failure mode illustrates the gap between belief and environment: bufferbloat. In the environment, a router queue has filled to 500 ms of delay, so the path is congested. The measurement signal, however, shows ACKs still arriving with nothing dropped, which reads as “all clear.” The belief responds accordingly: cwnd grows, because TCP thinks there is available capacity.

Which layer broke? Not the algorithm — AIMD is doing exactly what it was designed to do. Not the belief logic — cwnd responds correctly to the signals it receives. The measurement signal failed: large buffers absorbed packets without generating a loss signal.
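A toy round-by-round model makes the three layers visible. It is a sketch under strong assumptions: the path holds bdp_pkts packets in flight, any excess window sits in the bottleneck queue, loss occurs only when the queue exceeds buffer_pkts, and the sender uses the same increase-by-one, halve-on-loss rule in both runs. All names and numbers are illustrative.

```python
def simulate(buffer_pkts, rounds, bdp_pkts=10, base_rtt_ms=20.0):
    """Return one (cwnd, queueing_delay_ms, loss) tuple per round trip."""
    cwnd = 1
    log = []
    for _ in range(rounds):
        # Environment: packets beyond what the pipe holds wait in the queue.
        queue = max(0, cwnd - bdp_pkts)
        # Measurement: the sender sees a loss only if the buffer overflows.
        loss = queue > buffer_pkts
        if loss:
            queue = buffer_pkts
        # The bottleneck drains bdp_pkts per base RTT, so each queued
        # packet adds base_rtt_ms / bdp_pkts of delay.
        delay_ms = queue * base_rtt_ms / bdp_pkts
        log.append((cwnd, delay_ms, loss))
        # Belief: same AIMD logic regardless of buffer size.
        cwnd = max(1, cwnd // 2) if loss else cwnd + 1
    return log


small_buffer = simulate(buffer_pkts=5, rounds=100)
large_buffer = simulate(buffer_pkts=250, rounds=100)
```

With the small buffer, losses arrive early and queueing delay stays bounded. With the large buffer, no loss occurs within the 100 rounds, so cwnd keeps climbing while delay grows. The algorithm is identical in both runs, and only the measurement signal differs.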

In this framework, a protocol failure is a gap between belief and environment, usually caused by an inadequate measurement signal. This is not a TCP-specific insight. It is a structural pattern that recurs in DNS stale caches, WiFi rate adaptation, cellular power control, and video bitrate selection. The State invariant’s three-layer model (environment, measurement, belief) pinpoints a failure to a specific layer. “The measurement layer failed” is more actionable than “TCP doesn’t handle large buffers well”: it tells you what to fix, and fixing the signal is what Active Queue Management does (covered in Chapter 7).

Time: when does the FSM act?

The Time invariant asks: when are decisions made — and how?

The FSM makes decisions in two ways:

Trigger	Example	How the timing works
Event-driven	ACK arrives → update cwnd, advance send window	React immediately to observed signal
Timer-driven	RTO expires → retransmit lost segment	Wait for a deadline, then act

Most FSM transitions are event-driven: the ACK arrival is itself the trigger. But when a timer is needed (the retransmission timeout), its value is inferred, not prescribed. No authority tells TCP what its timeout should be. Each endpoint computes its own using Jacobson’s algorithm (1988) [6]: it estimates RTT from ACK round-trip samples, maintains a smoothed average (SRTT) and a deviation estimate (RTTVAR), and derives RTO = SRTT + 4 × RTTVAR. A path with a 10 ms RTT gets a tight RTO; a path with a 200 ms RTT gets a longer one.
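The RTO inference can be sketched as a small estimator. The formula RTO = SRTT + 4 × RTTVAR is from the text. The smoothing gains (1/8 for SRTT, 1/4 for RTTVAR) and the first-sample initialization are assumptions taken from the standardized form of the algorithm in RFC 6298. Clock granularity and the minimum-RTO floor are left out to keep the sketch short.

```python
ALPHA = 1 / 8  # assumed gain for the smoothed RTT
BETA = 1 / 4   # assumed gain for the RTT deviation


class RtoEstimator:
    """Per-endpoint RTO inference from RTT samples (units are up to the caller)."""

    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def sample(self, rtt):
        """Fold one measured RTT into the belief and return the new RTO."""
        if self.srtt is None:
            # First measurement: no history to smooth against yet.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Update the deviation first, using the previous SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt
        return self.rto()

    def rto(self):
        return self.srtt + 4 * self.rttvar


short_path = RtoEstimator()
long_path = RtoEstimator()
short_path.sample(10.0)   # milliseconds
long_path.sample(200.0)
```

Each endpoint owns its estimator and feeds it only its own ACK timings, which is why two paths end up with different timeouts without anyone prescribing them.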

RTT estimation serves two purposes simultaneously: capacity inference (helps TCP assess path congestion, which is State) and timeout calibration (sets the retransmission timer, which is Time). One measurement bridges two invariants. The connection to Coordination is direct: TCP’s time is inferred because its coordination is distributed. If there were a central authority, it could prescribe timeouts.

TCP through all four invariants — the dependency chain

With all four invariants established, TCP’s complete architecture becomes visible as a single dependency chain:

Invariant	TCP’s answer	Forced by
Interface	IP datagrams (unreliable) ↔ Socket API (ossified)	IP’s design decision; 43 years of deployment [1]
Coordination	Distributed — each endpoint alone	Interface + admin decentralization → no central controller [4]
State	FSM + cwnd + SRTT (belief from ACKs)	Distributed → each endpoint needs its own FSM and belief model [5][6]
Time	Event-driven (ACKs) + inferred timers (RTO from RTT)	No authority → must infer; RTT bridges State and Time [6]

For TCP, the chain runs Interface → Coordination → State → Time. Each row follows from the previous one because the binding constraint is Interface. For other systems, the chain starts elsewhere: WiFi’s binding constraint is the shared medium (physics forces Coordination), and video streaming’s binding constraint is human perception (Time). The chain always starts from the constraint — not always from Interface.

Applying the tool to DNS

DNS resolves google.com → 142.250.80.46. Applying all four invariants:

Invariant	DNS’s answer	Why
Interface	UDP port 53 (query-response). Evolving: DoH/DoT [7]	Plaintext DNS inspected by ISPs → ossification pressure
Coordination	Hierarchical — root → TLD → authoritative	Namespace too large for one server; too structured for pure distribution
State	Cache = belief. Zone file = environment. TTL = measurement timer	When record changes but TTL has not expired → stale belief
Time	Timer-driven: TTL expiry triggers re-query. Prescribed — zone admin sets TTL	Authority exists → can prescribe

A common mistake is to call DNS centralized. It is hierarchical: the root delegates to TLD servers, which delegate to authoritative servers. That is three levels of delegation.

The structural connection between DNS and TCP is revealing. DNS stale cache and TCP bufferbloat are the same structural failure: belief diverges from environment because the measurement signal is delayed or missing. Two protocols, zero shared mechanisms, identical failure pattern diagnosed by the same tool. In DNS, the zone admin sets the TTL and the resolver does not re-measure until it expires, so the signal is delayed and the cache goes stale. In TCP, large buffers absorb packets, so the loss signal is absent and cwnd grows unchecked. The difference in Time traces to Coordination: DNS has hierarchical authority (can prescribe TTL), TCP has no authority (must infer RTO) [6].
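The stale-cache case fits the same three-layer model in a few lines. This is a hypothetical sketch, not a real resolver: the zone dictionary stands in for the authoritative zone file (environment), the cache holds the resolver's belief, and a re-query on TTL expiry is the only measurement. The class name, the record layout, and the 192.0.2.x documentation addresses are all assumptions for illustration.

```python
class CachingResolver:
    """Toy resolver: believes its cache until the prescribed TTL expires."""

    def __init__(self, zone):
        self.zone = zone   # environment: name -> (address, ttl_seconds)
        self.cache = {}    # belief: name -> (address, expires_at)

    def resolve(self, name, now):
        entry = self.cache.get(name)
        if entry is not None and now < entry[1]:
            # Belief is trusted without checking the environment.
            return entry[0]
        # Measurement: re-query the authoritative data.
        address, ttl = self.zone[name]
        self.cache[name] = (address, now + ttl)
        return address


zone = {"example.com": ("192.0.2.1", 300)}
resolver = CachingResolver(zone)
first = resolver.resolve("example.com", now=0)

# The environment changes, but the TTL has not expired yet.
zone["example.com"] = ("192.0.2.2", 300)
stale = resolver.resolve("example.com", now=100)
fresh = resolver.resolve("example.com", now=300)
```

Here stale still returns the old address because the belief has not been re-measured, while fresh picks up the change once the TTL runs out. The timer value comes from the zone data, not from anything the resolver infers.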

ARP versus DHCP — same LAN, why different?

Both ARP and DHCP run at boot. Both use broadcast. Both operate on the same LAN. Yet their architectures diverge sharply:

Invariant	ARP	DHCP
Interface	Broadcast query, unicast reply	Broadcast discover; offer/ack unicast or broadcast (client flag)
Coordination	None — any host can claim any mapping	Centralized — server decides
State	Distributed cache per host	Centralized allocation table
Time	Prescribed timeout (~20 min, OS-dependent)	Prescribed lease (server sets)

The difference is the resource. ARP maps pre-assigned addresses: MACs are burned in by manufacturers, globally unique by construction. No scarcity means no authority needed [8]. DHCP allocates scarce addresses: IPs are limited per subnet and reusable. Uniqueness is critical, so centralized authority is required [9].

This difference carries a security consequence. ARP’s lack of coordination makes ARP spoofing trivially easy — an attacker on the same LAN can claim to be the router, intercepting all traffic. Coordination choices have security consequences, a theme that recurs throughout the course.
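Why spoofing is so easy follows directly from the cache-update rule. The sketch below is a deliberately naive model of a host's ARP cache with hypothetical names and made-up addresses. Real stacks add details such as entry aging, but the absence of any authority check is the point being illustrated.

```python
class ArpCache:
    """Toy per-host ARP cache: no authority, so the latest claim wins."""

    def __init__(self):
        self.table = {}  # IP address -> MAC address

    def on_reply(self, ip, mac):
        # Nothing to verify the claim against, so it is simply accepted.
        self.table[ip] = mac

    def lookup(self, ip):
        return self.table.get(ip)


ROUTER_IP = "192.0.2.1"
host_cache = ArpCache()
host_cache.on_reply(ROUTER_IP, "aa:aa:aa:aa:aa:aa")  # genuine router
before = host_cache.lookup(ROUTER_IP)
host_cache.on_reply(ROUTER_IP, "ee:ee:ee:ee:ee:ee")  # attacker's claim
after = host_cache.lookup(ROUTER_IP)
```

After the second reply, traffic for the router's IP is framed to the attacker's MAC. Nothing in on_reply could have rejected the claim, because the design has no coordinator to ask.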

The dependency chains make the contrast explicit. For DHCP: Interface (Ethernet broadcast — no IP yet) → Coordination (scarce resource → centralized) → State (server tracks allocation table) → Time (lease prescribed by server). For ARP: Interface (same broadcast) → Coordination (no scarcity → no coordinator) → State (each host caches independently) → Time (fixed timeout, OS-dependent). Same LAN, same broadcast, completely different designs — explained by one difference in the resource constraint.
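The DHCP chain above can be sketched as a single allocation table with prescribed leases. This is a hypothetical model, not the protocol from RFC 2131: there is no message exchange, the class and method names are invented, and the pool uses documentation addresses. It only shows why scarcity pushes State and Time onto one server.

```python
class LeaseServer:
    """Toy centralized allocator: one table, server-prescribed lease time."""

    def __init__(self, pool, lease_seconds):
        self.free = list(pool)            # scarce resource
        self.lease_seconds = lease_seconds
        self.leases = {}                  # client MAC -> (ip, expires_at)

    def request(self, mac, now):
        """Grant or renew a lease; return None when the pool is exhausted."""
        self._reclaim(now)
        if mac in self.leases:
            ip, _ = self.leases[mac]      # renewal keeps the same address
        elif self.free:
            ip = self.free.pop(0)
        else:
            return None
        self.leases[mac] = (ip, now + self.lease_seconds)
        return ip

    def _reclaim(self, now):
        # Expired leases return to the pool so addresses can be reused.
        for mac, (ip, expires_at) in list(self.leases.items()):
            if now >= expires_at:
                del self.leases[mac]
                self.free.append(ip)


server = LeaseServer(["192.0.2.10", "192.0.2.11"], lease_seconds=60)
ip_a = server.request("aa", now=0)
ip_b = server.request("bb", now=0)
denied = server.request("cc", now=10)
reused = server.request("cc", now=60)
```

Uniqueness holds because every grant passes through the one table. Compare this with the ARP case, where each host keeps its own cache and no table exists to consult.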

The skill this lecture builds

Given any networked system, the four invariants and the dependency chain produce a complete architectural diagnosis:

Invariant	The question	The chain
Interface	What’s exposed/hidden? Can it change?	For TCP, this is the binding constraint — but not always
Coordination	Who decides? Why that entity?	Forced by interface + resource constraints → dictates FSM design
State	What does it believe? Where does belief fail?	FSM tracks state; failures = belief diverges from environment
Time	When are decisions made? Event or timer? Prescribed or inferred?	FSM transitions; timers prescribed (if authority) or inferred (if not)

The method is: identify the binding constraint, then trace the chain. For TCP it was Interface. For WiFi it will be the shared medium. For video streaming it will be human perception. The starting invariant changes; the method does not.

As a transfer exercise, consider applying the four invariants to HTTP. HTTP’s Interface is the TCP socket API, the byte-stream contract TCP provides. Its Coordination is request-response: the client initiates, and the server holds authority over what is returned. Its State is minimal: HTTP is stateless by default, so the server holds no belief about the client between requests unless cookies or sessions add it. Its Time is mostly event-driven (request arrives, respond) with some prescribed timers (keep-alive timeout).

Forward to Lecture 3

This lecture established the four invariants as a diagnostic tool: given any networked system, trace its dependency chain and the reasons behind each design choice become visible. TCP’s chain ran Interface → Coordination → State → Time. In tracing it, we saw Jacobson separate the congestion window from the receiver window, saw AIMD as a feedback loop, saw why the finite-state machine had to be distributed. Those were design decisions, not just architectural facts — but this lecture never named the patterns that produced them or asked whether those patterns recur elsewhere.

Lecture 3 completes the toolkit. Three design principles (disaggregation, closed-loop reasoning, and decision placement) explain how solutions get constructed. The same principles that shaped TCP also shaped routing, which was built by different pioneers solving different problems under different constraints. If the same patterns appear in both, they are structural principles, not TCP-specific tricks. Before Tuesday, re-read Chapter 1 of the textbook, especially the Design Principles and Dependency Graph sections [10].

References

[1] J. Postel, “Internet Protocol,” RFC 791, September 1981.

[2] J. Postel, “Transmission Control Protocol,” RFC 793, September 1981.

[3] J. Iyengar and M. Thomson, “QUIC: A UDP-Based Multiplexed and Secure Transport,” RFC 9000, May 2021.

[4] J. H. Saltzer, D. P. Reed, and D. D. Clark, “End-to-End Arguments in System Design,” ACM Trans. Computer Systems, vol. 2, no. 4, pp. 277–288, November 1984.

[5] D. M. Chiu and R. Jain, “Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks,” Computer Networks and ISDN Systems, vol. 17, no. 1, pp. 1–14, 1989.

[6] V. Jacobson, “Congestion Avoidance and Control,” Proc. ACM SIGCOMM, pp. 314–329, 1988.

[7] P. Hoffman and P. McManus, “DNS Queries over HTTPS (DoH),” RFC 8484, October 2018.

[8] D. C. Plummer, “An Ethernet Address Resolution Protocol,” RFC 826, November 1982.

[9] R. Droms, “Dynamic Host Configuration Protocol,” RFC 2131, March 1997.

[10] J. F. Kurose and K. W. Ross, Computer Networking: A Top-Down Approach, 8th ed., Pearson, 2021.