Design Principles & The Dependency Graph
2026-04-07
Invariants (last lecture)
What must be answered
Every system answers these. No choice.
Principles (today)
How to answer well under constraints
These are strategies — you can violate them, but you pay a cost.
Imagine a single system that resolves names, assigns IP addresses, AND computes routes. One protocol, one server, one database.
What goes wrong?
The Internet disaggregates: DNS handles naming, IP handles addressing, BGP handles routing. Each evolves independently. IPv4→IPv6 doesn’t require renaming every domain.
Principle: Decompose a system into independent components that can evolve separately.
| System | What’s disaggregated | Boundary it aligns with |
|---|---|---|
| DNS/IP/BGP | Naming from addressing from routing | Administrative (ICANN vs. IANA vs. ISPs) |
| Protocol stack | Application / Transport / Network / Link / Physical | Functional (each layer provides a service) |
| 5G | CU / DU / RU split | Temporal (seconds vs. ms vs. sub-ms decisions) |
Cost: Every interface between components adds overhead, latency, and complexity.
When NOT to disaggregate: When interface overhead dominates the benefit (e.g., kernel TCP vs. DPDK bypass — bypassing the kernel merges transport and application for speed).
TCP sends a packet. An ACK comes back. TCP adjusts cwnd. TCP sends more packets.
That’s a feedback loop. Signal: ACK. Decision: adjust cwnd. Period: ~1 RTT.
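That loop can be sketched in a few lines. This is a toy AIMD model, not real TCP: the constants and the ack/loss schedule are illustrative.

```python
# Toy AIMD sketch: additive increase on each ACKed RTT,
# multiplicative decrease on loss. Constants are illustrative.

def aimd(events, cwnd=1.0, alpha=1.0, beta=0.5):
    """Evolve cwnd over a sequence of per-RTT events ('ack' or 'loss')."""
    trace = [cwnd]
    for ev in events:
        if ev == "ack":
            cwnd += alpha                   # additive increase: +alpha per RTT
        else:
            cwnd = max(1.0, cwnd * beta)    # multiplicative decrease on loss
        trace.append(cwnd)
    return trace

# Ten RTTs of ACKs, then one loss: one edge of the classic sawtooth.
print(aimd(["ack"] * 10 + ["loss"]))
```

Run it with longer event sequences and the sawtooth pattern of the table below emerges: linear climbs punctuated by halvings.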
Now: Is DNS caching a feedback loop?
Yes. Cache a record → serve it → TTL expires → re-query the authority → update cache. Signal: query-response. Period: TTL value.
Is DHCP leasing a feedback loop?
Yes. Allocate address → client uses it → lease expires → client renews or server reclaims. Signal: renewal request. Period: lease duration.
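A minimal lease-pool sketch makes the static timer visible; the class name, API, and constants are hypothetical, not real DHCP server code.

```python
# Minimal DHCP-style lease pool: addresses are reclaimed only when the
# fixed lease timer expires. Lease duration never adapts to demand.

class LeasePool:
    def __init__(self, addresses, lease_seconds):
        self.free = list(addresses)
        self.leases = {}                    # addr -> (client_id, expiry)
        self.lease_seconds = lease_seconds

    def allocate(self, client_id, now):
        self._reclaim(now)
        if not self.free:
            return None                     # exhaustion: leases too long
        addr = self.free.pop()
        self.leases[addr] = (client_id, now + self.lease_seconds)
        return addr

    def renew(self, addr, client_id, now):
        if self.leases.get(addr, (None,))[0] == client_id:
            self.leases[addr] = (client_id, now + self.lease_seconds)
            return True
        return False

    def _reclaim(self, now):
        expired = [a for a, (_, exp) in self.leases.items() if exp <= now]
        for addr in expired:
            del self.leases[addr]
            self.free.append(addr)
```

The failure modes in the table below fall out directly: a long `lease_seconds` exhausts `free` under churn; a short one floods the server with renewals.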
Every adaptive protocol is a feedback loop. The question is: will it converge? What happens when the signal is delayed or wrong?
| Protocol | Signal | Period | Adaptive? | Failure mode |
|---|---|---|---|---|
| TCP AIMD | ACK arrivals / loss | ~1 RTT | Yes — cwnd adjusts continuously | Oscillation (sawtooth), bufferbloat (delayed signal) |
| DNS TTL | Query-response | TTL value (minutes to hours) | No — TTL is static | Staleness (too long) or query storms (too short) |
| DHCP lease | Renewal requests | Lease duration | No — lease is static | Address exhaustion (too long) or churn (too short) |
Key difference: TCP’s loop adapts (cwnd changes based on feedback). DNS and DHCP loops are static timers — they expire and refresh, but don’t adjust their period based on conditions.
A static timer can’t adapt to changing conditions. This is why a stale or poisoned DNS record persists until its TTL expires: the cache has no signal that the record is wrong, only a timer.
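The static-timer loop can be sketched as a small cache; the class name and API are hypothetical, with an injectable clock so the behavior is testable.

```python
import time

class TTLCache:
    """Static-timer cache: entries expire after a fixed TTL. The timer
    never adapts to how often the underlying record actually changes."""

    def __init__(self, ttl_seconds, now=time.monotonic):
        self.ttl = ttl_seconds
        self.now = now              # injectable clock
        self.store = {}             # name -> (value, expiry_time)

    def get(self, name, resolve):
        entry = self.store.get(name)
        if entry and self.now() < entry[1]:
            return entry[0]         # fresh by the timer, even if wrong
        value = resolve(name)       # re-query the authority
        self.store[name] = (value, self.now() + self.ttl)
        return value
```

If the authoritative record changes mid-TTL, `get` keeps serving the old value: the loop refreshes on expiry, never on change.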
Original WiFi (802.11 DCF): every station decides independently when to transmit. Distributed.
WiFi 6 (802.11ax OFDMA): the access point schedules who transmits when. Centralized.
What changed? The protocol reversed its coordination model. Why?
The environment changed. A coffee shop in 2000: 5 laptops. A lecture hall in 2025: 200 devices.
Distributed contention with 200 devices → collision probability explodes → throughput collapses.
Dense deployments made distributed coordination destructive. The anchor shifted, and the decision placement had to follow.
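The collapse can be made concrete with a toy slotted-contention model. The transmit probability and station counts are illustrative, and real DCF adapts its backoff, but the qualitative trend holds.

```python
# Toy slotted-contention model: each of n stations transmits in a slot
# with fixed probability p. A slot is useful iff exactly one transmits.

def success_prob(n, p=0.05):
    """P(exactly one of n stations transmits in a slot)."""
    return n * p * (1 - p) ** (n - 1)

for n in (5, 50, 200):
    print(f"{n:3d} stations: P(useful slot) = {success_prob(n):.4f}")
```

With p fixed, a handful of stations contend efficiently, but at 200 stations almost every slot is a collision or silence, which is exactly the regime where handing the schedule to the access point wins.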
Principle: Decide where decisions are made based on what information is available where.
Centralized placement: one point gathers the relevant information and decides for everyone.
Examples: DHCP server, 4G eNodeB scheduler, SDN controller
Distributed placement: each node decides from the information it sees locally.
Examples: TCP endpoints, WiFi DCF, BGP routers
The question isn’t “which is better” — it’s “what does your anchor constraint force?”
Disaggregation answers: how to divide the system. Closed-loop reasoning answers: how decisions adapt. Decision placement answers: where decisions are made.
The invariants tell you what to answer. The principles tell you how to answer well.
An anchor is a constraint that is harder to change than the invariant answers it constrains.
| Source | Example | What it forces |
|---|---|---|
| Physics | Wireless medium is shared | Carrier sensing, contention-based access |
| Legacy interface | IP delivers unreliable datagrams | TCP must infer congestion, build reliability |
| Admin boundaries | No single entity controls the Internet | Distributed coordination, no central scheduler |
| Hardware economics | Commodity switches use FIFO queues | Fair queuing is expensive → most routers don’t do it |
| Deployment reality | Billions of devices speak TCP | New transport must tunnel through UDP (QUIC) |
What makes something an anchor: it is harder to change than the design choices it constrains.
Anchor → constrains feasible invariant answers → principles guide choices within constraints → choices produce closed-loop dynamics → dynamics produce emergent properties
Anchor: IP provides unreliable datagrams + no single entity controls the Internet
Tracing the cascade:
Principles at work:
Every design choice in TCP traces back to the anchor. Change the anchor → the design restructures.
Apply the method: what constraint is hardest to change for DNS, and how does it force the design?
Anchor: global namespace too large for one server + administrative fragmentation (no single entity owns all names)
Cascade:
Same method, different anchor, entirely different design. That’s the framework.
Given any system — one you are designing, studying, or reviewing:
| Step | Action | What you produce |
|---|---|---|
| 1 | Identify the anchor | The constraint hardest to change |
| 2 | Answer the four invariants | Specific state variables, signals, decision rules |
| 3 | Trace the dependency graph | Which answers constrain which others |
| 4 | Evaluate closed-loop dynamics | Convergence? Failure modes? What if signal degrades? |
| 5 | Check meta-constraints | Deployable? Backward-compatible? Economically viable? |
A useful question for reviewing any systems paper: which invariant does this system fundamentally improve, and how does that change ripple through the dependency graph?
A dependency graph without objectives is a description. With objectives, it’s a design argument.
Objectives: throughput, latency, fairness, reliability — what the system tries to optimize
Failure: what happens when the loop breaks — measurement signal degrades, environment changes faster than belief can track
Meta-constraints — forces beyond technical merit:
| Meta-constraint | Example |
|---|---|
| Incremental deployability | ECN took decades — every router on the path must participate |
| Backward compatibility | IPv6 adoption stalled for 20+ years |
| Administrative boundaries | Can’t mandate DCTCP on the open Internet |
| Hardware economics | Fair queuing is technically superior but FIFO dominates because it’s cheaper |
| Standardization politics | IETF consensus process shapes what gets deployed |
| System | Core Question | Anchor | Chapters |
|---|---|---|---|
| Medium Access | How to share the transmission medium fairly? | Medium physics (shared, destructive) | Ch 3–4 |
| Transport | Deliver reliably across an uncontrolled path? | IP interface (unreliable datagrams) | Ch 2 |
| Queue Management | What to do when packets arrive faster than they leave? | Finite buffer at bottleneck | Ch 6 |
| Multimedia Apps | Deliver time-sensitive content over best-effort? | Human perceptual time constraints | Ch 8 |
| Network Mgmt | Allocate resources and enforce policy? | Need for visibility across admin domains | Ch 9 |
| Measurement | Observe what’s happening in an opaque system? | Information asymmetry | Ch 9 |
The framework is the same for all six. The anchor changes → the answers change → the design changes.
Suppose every DNS query had to traverse the full hierarchy — root → TLD → authoritative — every single time. No local cache, no TTL, no stored answers.
Your task (5 minutes, work in pairs):
This is another midterm-style question. The midterm asks you to trace what-if scenarios through the dependency graph.
| Invariant | With caching | Without caching |
|---|---|---|
| State | Distributed cache at every resolver | No local belief — every query produces a fresh answer |
| Time | TTL-based expiry (minutes to hours) | No TTL needed — but also no latency savings |
| Coordination | Hierarchy + local autonomy (cache serves most queries) | Hierarchy bears full query load at every level |
| Interface | Same (UDP port 53) | Same — but now every query hits the network |
Under most pressure: State. Removing caching eliminates the local belief layer entirely. Every resolver must contact every level of the hierarchy for every query.
Scaling consequence: root servers go from ~10,000 queries/sec to billions. The caching IS the disaggregation that makes DNS work at scale.
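The scaling consequence can be sanity-checked with back-of-envelope arithmetic. The global query volume and cache hit rate below are assumptions chosen to match the figures above, not measurements.

```python
# Back-of-envelope: root-server load with vs. without resolver caching.
# Both inputs are illustrative assumptions, not measured values.

global_queries_per_sec = 1e9    # assumed worldwide client DNS query rate
cache_hit_rate = 0.99999        # assumed fraction answered from local caches

root_with_cache = global_queries_per_sec * (1 - cache_hit_rate)
root_without_cache = global_queries_per_sec   # every query walks the hierarchy

print(f"with caching:    ~{root_with_cache:,.0f} root queries/sec")
print(f"without caching: ~{root_without_cache:,.0f} root queries/sec")
print(f"amplification:   ~{root_without_cache / root_with_cache:.0f}x")
```

The exact numbers don't matter; the point is that removing the cache multiplies root load by roughly the inverse of the miss rate.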
The framework makes concrete, falsifiable predictions about how systems behave and fail:
Plus: the QUIC generative exercise — tracing a full interface renegotiation through the dependency graph.
Before Thursday: review the “What the Framework Predicts” section of Ch 2.