Course: CS176C — Advanced Topics in Internet Computing, Spring 2026
Instructor: Arpit Gupta, UC Santa Barbara
Date: April 9, 2026
Slides: Deployed slide deck
Pre-requisite: L3 (Design Principles in Action)
The framework becomes a design tool
By the end of Lecture 3, the toolkit was complete: four invariants (State, Time, Coordination, Interface), three design principles (decision placement, disaggregation, closed-loop reasoning), and the Estimate-Measure-Believe decomposition that classifies every gap between what a protocol knows and what the environment actually is. Until now, the framework has been analytic — applied after the fact to systems that already exist. Today it becomes generative. Starting from a single baseline system — OSPF — two different environmental changes produce two entirely different architectures: BGP and SDN. The framework predicts both from constraint shifts alone.
The structure is two design exercises from the same starting point. In the first, trust disappears and OSPF’s State answer breaks across commercial boundaries; the result is BGP. In the second, scale and cost explode and OSPF’s Coordination answer becomes economically unsustainable; the result is SDN/OpenFlow. Both exercises demonstrate that when the binding constraint shifts, the framework generates the correct architectural response — not by guessing, but by tracing which invariant answer can no longer hold.
OSPF as baseline: the dependency graph
OSPF’s binding constraint is survivability — the network must continue operating despite failures [1]. From that single constraint, the four invariant answers follow in a dependency chain:
| Invariant | OSPF’s answer | Forced by | Design principle |
|---|---|---|---|
| Coordination | Distributed — each router computes independently | Survivability → no central point of failure | Decision placement |
| State | Full topology — every router knows every link and cost | Distributed → need shared truth to avoid loops | Disaggregation: measurement from belief [3] |
| Time | Event-driven flooding, sub-second convergence | Full topology → changes trigger immediate reflooding | Closed-loop: fast, honest feedback |
| Interface | LSAs — the format for exchanging raw link measurements | Cooperative trust → share everything, hide nothing | Honest measurement signal |
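
The Coordination and State rows above can be made concrete with a minimal sketch: every router holds the same link-state database and runs the same shortest-path computation on its own. The topology, the name `LSDB`, and the function `shortest_paths` are illustrative assumptions, not taken from the OSPF specification.

```python
import heapq

# Hypothetical link-state database: every router holds this same full map.
LSDB = {
    "A": {"B": 10, "C": 5},
    "B": {"A": 10, "D": 1},
    "C": {"A": 5, "D": 20},
    "D": {"B": 1, "C": 20},
}

def shortest_paths(lsdb, source):
    """Dijkstra over the shared topology; returns {dest: (cost, next_hop)}."""
    best = {source: (0, None)}
    heap = [(0, source, None)]  # (cost so far, node, first hop from source)
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale heap entry, a cheaper route was already found
        for neighbor, link_cost in lsdb[node].items():
            new_cost = cost + link_cost
            hop = neighbor if node == source else first_hop
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, neighbor, hop))
    return best

# Each router runs the same computation independently on the same data.
routes_at_a = shortest_paths(LSDB, "A")
routes_at_d = shortest_paths(LSDB, "D")
```

Because both routers compute over identical data, their answers agree: A reaches D through B at cost 11, and D reaches A through B at the same cost. That agreement is what makes loop-free forwarding possible without a central coordinator.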
The assumption holding this together is cooperation. Every router shares everything honestly. No secrets, no filtering, no policy. This works because OSPF operates within a single trust domain — one organization, one administrative authority [6]. The question is what happens when that assumption breaks.
Exercise 1: commercialization breaks the State invariant
By the time ARPANET was decommissioned in 1990, the Internet was becoming an interconnection of independent commercial networks. The environment shifted from a single cooperative research community to thousands of organizations (AT&T, Sprint, universities, corporations) with competing business interests [7]. The tool at hand was still OSPF. The question is whether it remains the right tool.
| | ARPANET | Commercial Internet |
|---|---|---|
| Who operates it | One cooperative research community | Thousands of organizations with disparate interests |
| Relationship | Collaborative — shared goals | Commercial — competing business interests |
| The problem | Route within one backbone | Organizations must exchange traffic despite commercial competition |
The invariant that breaks is State. OSPF’s State answer requires sharing full topology honestly — every link, every cost, via Link-State Advertisements. AT&T will never share its 500-router internal topology with Sprint. Internal topology is a trade secret. The cooperative trust that made full-topology sharing possible no longer exists across organizational boundaries [7].
The consequence is a forced disaggregation. OSPF continues to work inside each organization — single administrative domain, full trust. But it fails across organizations. Routing must split: one system inside (intra-domain), a different system between (inter-domain). This separation creates gateway routers — routers at the AS boundary that speak both protocols. Inside: OSPF. Outside: the new inter-domain protocol that must be designed [7][10].
What the gateway router can advertise
The State failure constrains what information crosses organizational boundaries. OSPF advertises link states — raw topology like “Link A-B, cost 10” at router-level granularity. But across trust boundaries, internal routers are hidden. The outside world cannot know AT&T’s router names, link capacities, or traffic engineering policies.
The right unit of advertisement is prefixes — groups of IP addresses the AS is responsible for (e.g., 208.65.152.0/22) — plus the AS-level path for loop detection (e.g., [AS7018, AS3356, AS15169]). This is a path vector: more information than a distance vector (which hides the path entirely and enables loops), less information than full topology (which no competitor will disclose). It represents the maximum that commercial entities will share [7].
| What distance-vector shares | What path vector shares | What OSPF shares |
|---|---|---|
| Distance only | Prefix + AS-level path | Every link, every cost |
| Hides path → loops | Shows AS path → loop detection | Hides nothing |
| Too little | Just enough | Too much (across trust boundaries) |
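
A minimal sketch of the path-vector idea follows, assuming a simplified route record that carries only a prefix and an AS path. The `Route` class and the functions `advertise` and `accept` are hypothetical names for illustration; real BGP UPDATE messages carry many more attributes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple  # AS numbers, most recently traversed AS first

def advertise(route, local_as):
    """Prepend the local AS before passing the route to a neighboring AS."""
    return Route(route.prefix, (local_as,) + route.as_path)

def accept(route, local_as):
    """Reject any route whose AS path already contains the local AS (a loop)."""
    return local_as not in route.as_path

# The origin AS starts with an empty path and prepends itself when advertising.
originated = Route("208.65.152.0/22", ())
from_15169 = advertise(originated, 15169)
from_3356 = advertise(from_15169, 3356)
from_7018 = advertise(from_3356, 7018)
```

After three hops the path reads `(7018, 3356, 15169)`, matching the example in the text. If that route were ever offered back to AS3356, `accept` would reject it, which is the loop detection that a plain distance vector cannot perform. No router names or link costs ever leave an AS.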
The E-M-B gap becomes structural
The inter-domain protocol has a fundamentally different E-M-B gap than any system encountered previously. In OSPF, measurement equals environment — every router sees every link honestly, and the gap is zero. In the inter-domain case, the gap is permanent and by design. Commercial organizations deliberately hide internal topology, capacity, congestion, and cost. The protocol must be designed to work despite this permanent information deficit [7][8].
| Gap type | System | Cause | Fixable? |
|---|---|---|---|
| Accidentally noisy | TCP bufferbloat | Measurement honest but delayed | Yes — better estimators |
| Circular belief | DV count-to-infinity | Stale belief echoes back | Yes — share raw measurement [3] |
| Structurally filtered | Inter-domain routing | Sender deliberately hides information | No — filtering is intentional |
This is a new category: the structurally filtered gap. The system cannot be fixed by improving measurement fidelity or breaking circular dependencies. The information is absent because the source refuses to provide it. The protocol must make correct decisions with permanently incomplete knowledge [7][8].
Time and selection under structural filtering
The Time answer for the inter-domain protocol reflects a lesson learned from distance-vector’s instability. DV ran at 128 ms update intervals and oscillated [2]. OSPF floods within seconds inside one administrative domain [6]. The inter-domain protocol operates across thousands of ASes exchanging prefix-plus-path updates. Speed is dangerous at this scale, so a 30-second minimum between updates for the same prefix (the value RFC 4271 suggests for its MinRouteAdvertisementInterval timer on external sessions) prioritizes stability over speed. The cost is 3–15 minutes of convergence after failures. This is closed-loop reasoning applied to the update frequency itself: the DV experience showed that fast, unrestricted updates can produce oscillation rather than convergence, and that risk only grows when the updates cross thousands of administrative domains [7].
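The rate limit can be sketched as a small per-prefix timer. This is a simplified illustration under stated assumptions: the class `UpdateLimiter` and its methods are hypothetical, time is passed in explicitly so the behavior is deterministic, and real implementations commonly apply the timer per peer and add jitter.

```python
class UpdateLimiter:
    """Suppress updates for a prefix sent less than `interval` seconds apart."""

    def __init__(self, interval=30.0):
        self.interval = interval
        self.last_sent = {}  # prefix -> time of the last advertisement
        self.pending = {}    # prefix -> newest route held back by the timer

    def offer(self, prefix, route, now):
        """Return the route if it may be sent now; otherwise hold it, return None."""
        last = self.last_sent.get(prefix)
        if last is not None and now - last < self.interval:
            self.pending[prefix] = route  # only the latest state survives
            return None
        self.last_sent[prefix] = now
        self.pending.pop(prefix, None)
        return route

    def flush(self, now):
        """Release held routes whose interval has expired."""
        ready = {}
        for prefix, route in list(self.pending.items()):
            if now - self.last_sent[prefix] >= self.interval:
                ready[prefix] = route
                self.last_sent[prefix] = now
                del self.pending[prefix]
        return ready

limiter = UpdateLimiter()
first = limiter.offer("208.65.152.0/22", "path-1", now=0.0)
second = limiter.offer("208.65.152.0/22", "path-2", now=5.0)
third = limiter.offer("208.65.152.0/22", "path-3", now=10.0)
early = limiter.flush(now=20.0)
late = limiter.flush(now=30.0)
```

Two intermediate path changes collapse into one advertisement: neighbors see `path-1` immediately and `path-3` thirty seconds later, and never see the transient `path-2`. Damping transient states this way is the stability benefit; the delay is the convergence cost.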
Selection presents a different challenge. OSPF picks shortest path — an objective metric optimized cooperatively. But commercial ASes have business preferences that override objective metrics. A paying customer’s longer path beats a competitor’s shorter path. LOCAL_PREF overrides shortest-path selection. The protocol enforces business relationships, not optimal routing. This is decision placement at work: each AS applies local policy autonomously because no global authority exists to impose a unified objective [7][8].
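A minimal sketch of policy-first selection, assuming only the two attributes named above plus a stand-in for the remaining tie-breakers. The `Candidate` class, the neighbor names, and the LOCAL_PREF values are illustrative assumptions; the full BGP decision process has more steps.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    neighbor: str
    local_pref: int
    as_path: tuple

def best_path(candidates):
    """Highest LOCAL_PREF wins; AS_PATH length only breaks ties.

    The neighbor name stands in for the remaining tie-breakers so that
    the result is deterministic.
    """
    return min(candidates, key=lambda c: (-c.local_pref, len(c.as_path), c.neighbor))

# Hypothetical policy: routes via a paying customer get a higher LOCAL_PREF.
via_customer = Candidate("customer-1", local_pref=200, as_path=(64501, 64502, 64503, 64510))
via_peer = Candidate("peer-1", local_pref=100, as_path=(64520, 64510))

chosen = best_path([via_customer, via_peer])
```

The customer route wins even though its AS path is twice as long. Shortest-path logic would only matter if both candidates carried the same LOCAL_PREF.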
The result is BGP
The design that emerges from tracing the framework through the trust failure matches BGP, first specified in RFC 1105 (1989) and currently defined by RFC 4271 [7]:
| Invariant | Design from framework | BGP (RFC 4271) [7] | Design principle |
|---|---|---|---|
| Coordination | Each AS decides by local business policy | LOCAL_PREF > AS_PATH length > tie-breakers | Decision placement: maximally distributed |
| State | Path vector — prefix + AS path (structurally filtered) | AS_PATH + policy attributes; permanent E-M-B gap | Closed-loop: design measurement signal given privacy |
| Time | Slow updates for stability | 30-second minimum; 3–15 min convergence | Closed-loop: stability over speed (DV’s lesson) |
| Interface | Disaggregated from internal routing | BGP ↔ OSPF/IS-IS boundary | Disaggregation: inter-domain from intra-domain |
The summary: OSPF to BGP is a State failure. Trust changed, the State invariant answer could no longer hold, and every other answer adapted to function under the structurally filtered gap that resulted.
BGP’s deeper lessons
Three consequences follow from the structurally filtered gap.
First, stability is institutional, not algorithmic. Griffin et al. (2002) proved that deciding whether a set of arbitrary BGP policies admits a stable routing is NP-complete, and they exhibited policy combinations for which no stable solution exists at all [8]. Gao and Rexford (2001) showed that BGP is guaranteed to converge if policies follow the customer-provider hierarchy, the economic relationships that structure the commercial Internet [9]. That condition is sufficient rather than necessary, and nothing in the protocol enforces it. The E-M-B gap is permanent, and the system works not because the algorithm guarantees convergence but because economic incentives prevent pathological policy combinations.
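The export side of the customer-provider hierarchy can be sketched in a few lines. This is a simplified reading of the guidelines in [9], assuming three relationship types; the names `CUSTOMER`, `PEER`, `PROVIDER`, and `may_export` are illustrative, and routes an AS originates itself are left out.

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"

def may_export(learned_from, send_to):
    """Routes learned from customers go to every neighbor.

    Routes learned from peers or providers go only to customers,
    because carrying that traffic for anyone else earns nothing.
    """
    return learned_from == CUSTOMER or send_to == CUSTOMER

# Enumerate the full export matrix for inspection.
export_matrix = {
    (src, dst): may_export(src, dst)
    for src in (CUSTOMER, PEER, PROVIDER)
    for dst in (CUSTOMER, PEER, PROVIDER)
}
```

Four of the nine combinations are refused: an AS never relays a peer's or provider's routes to another peer or provider. The rule is an economic one (no free transit), yet it is also what rules out the policy cycles that make convergence fail.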
Second, trust and verification remain unsolved. In 2008, Pakistan Telecom announced YouTube’s prefix. BGP accepted it — no origin authentication existed. YouTube went dark globally for two hours. BGP inherited cooperative trust assumptions from its ARPANET heritage without verification mechanisms. RPKI (2012) partially fixes origin validation, but deployment remains incomplete — the same deployability meta-constraint that slows every Internet-wide upgrade.
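A minimal sketch of route origin validation in the style RPKI enables follows. The record list `ROAS`, the function `validate_origin`, and the AS numbers (drawn from the range reserved for documentation) are hypothetical; only the prefix comes from the text, and the more-specific /24 is an illustrative hijack, not a record of the actual announcement.

```python
import ipaddress

# Hypothetical authorization records: (prefix, max prefix length, origin AS).
ROAS = [(ipaddress.ip_network("208.65.152.0/22"), 22, 64500)]

def validate_origin(prefix, origin_as, roas=ROAS):
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_net, max_len, roa_as in roas:
        # subnet_of raises TypeError across IP versions, so check first.
        if net.version == roa_net.version and net.subnet_of(roa_net):
            covered = True
            if origin_as == roa_as and net.prefixlen <= max_len:
                return "valid"
    # Covered by a record but matching none of them means a likely hijack.
    return "invalid" if covered else "not-found"

legitimate = validate_origin("208.65.152.0/22", 64500)
hijack = validate_origin("208.65.153.0/24", 64501)
unknown = validate_origin("192.0.2.0/24", 64502)
```

The third outcome is why partial deployment limits the benefit: a prefix with no published record yields `not-found`, and most operators must still accept such routes.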
Third, a tension emerges that bridges to the second exercise. BGP selects one best path per prefix — stability demands simplicity. But operators want finer control: route video one way, bulk transfers another. Every additional policy rule (communities, route maps, prefix-list filters) adds complexity per device. Precision competes with stability. Control competes with cost. This same tension exists inside organizations — and it will break the architecture.
Exercise 2: scale and cost break the Coordination invariant
Return to OSPF operating inside a single organization — datacenter, campus, WAN. The trust assumption still holds (single administrative domain), so State is fine. But three forces converged to make OSPF’s architecture economically unsustainable:
Scale exploded. Cloud providers built datacenters with tens of thousands of switches. Each device running link-state computation plus Dijkstra plus line-rate forwarding required expensive compute and memory. Routers cost \$500K or more. With the market dominated by a few vendors such as Cisco and Juniper, lock-in pricing trapped organizations in a hardware cost spiral.
Precision demanded bloat. OSPF routes by shortest path to IP prefix. Organizations wanted per-application routing: video on path A, conferencing on path B, bulk transfers on path C. Per-application rules require more FIB/TCAM entries per device — more memory, higher cost. Every additional rule inflates every device in the network.
Policy required touching every device. Each router owns its own control plane. Changing routing policy means reconfiguring every router individually. Casado et al. (2007) documented that human error accounted for 62% of network downtime — because network-wide policy was expressed through thousands of lines of local configuration on individual devices [10].
Identifying the invariant under pressure
All three forces trace back to one architectural choice: every router computes its own control plane. Route computation, policy expression, traffic engineering — all distributed across every device. The invariant under pressure is Coordination. If distributed coordination is the problem, the alternative is to centralize path computation.
But Baran showed in 1964 that distributed coordination was essential for survivability [1]. Centralizing routing creates a single point of failure. This is exactly the tension the SDN pioneers faced. The resolution lies in recognizing what changed between 1964 and 2004:
| | Baran’s context (1964) | SDN’s context (2004) |
|---|---|---|
| Domain | National network across hostile territory | Datacenter/campus you fully control |
| Threat | Nuclear attack — centralization = one bomb destroys routing | Operational complexity — distributed config = 62% downtime |
| Central point | Physically vulnerable, unreplicable | Replicable — 3 controllers in different racks, millisecond failover |
Inside your own domain, you can replicate controllers, add fast failover, and monitor health continuously. Logically centralized does not mean physically centralized. The data showed that distributed configuration caused more downtime (62% from human error [10]) than centralization risked. The binding constraint shifted from survivability against an external adversary to manageability of internal complexity.
Disaggregation: separating control from data
The control plane and data plane are coupled in every router — the same monolithic pattern that Heart identified in the original IMP design [2]. Heart separated forwarding from routing in 1969 — same box, different processes. SDN applies the same disaggregation principle more aggressively: pull the control plane out of every router entirely [11][12].
Switches become simple match-action engines. A separate controller computes routes, policies, and traffic engineering centrally. Heart separated processes; SDN separates devices. The disaggregation is deeper because the cost constraint demands it. Feamster et al. (2004) asked the question directly: “Why should network-wide routing decisions be implemented through thousands of lines of local configuration on individual, distributed devices?” [12]. The answer, increasingly, was that they should not.
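The device-level split can be sketched as two classes: a switch that only stores rules and a controller that holds the topology and computes paths. All names here (`Switch`, `Controller`, `install_path`) and the four-switch topology are illustrative assumptions, and hop count stands in for whatever path objective a real controller would apply.

```python
from collections import deque

class Switch:
    """Data plane only: a destination -> next hop table, no route computation."""
    def __init__(self, name):
        self.name = name
        self.flow_table = {}

class Controller:
    """Control plane: sees the whole topology and installs rules on switches."""
    def __init__(self, links):
        self.adj = {}
        for a, b in links:
            self.adj.setdefault(a, set()).add(b)
            self.adj.setdefault(b, set()).add(a)
        self.switches = {name: Switch(name) for name in self.adj}

    def install_path(self, src, dst):
        # Breadth-first search from dst gives every node its next hop toward dst.
        parent = {dst: None}
        queue = deque([dst])
        while queue:
            node = queue.popleft()
            for neighbor in sorted(self.adj[node]):
                if neighbor not in parent:
                    parent[neighbor] = node
                    queue.append(neighbor)
        if src not in parent:
            return []  # dst is unreachable from src
        path = [src]
        while path[-1] != dst:
            node = path[-1]
            self.switches[node].flow_table[dst] = parent[node]  # push the rule
            path.append(parent[node])
        return path

ctrl = Controller([("s1", "s2"), ("s2", "s3"), ("s1", "s4")])
path = ctrl.install_path("s4", "s3")
```

Only the controller ever touches the adjacency map. Each switch on the path ends up with a single table entry, which is the cost argument in miniature: the expensive computation happens once, centrally, instead of on every device.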
The SDN design through the framework
The binding constraint is per-device cost plus operational complexity — manageability, not survivability. From this constraint, the four invariant answers follow:
| Invariant | Design | Rationale |
|---|---|---|
| Coordination | Centralized controller | Single admin, full authority. Survivability tradeoff acceptable — replicate for redundancy. |
| State | Global in controller; switches hold only forwarding rules | No secrets — you own everything. No Dijkstra on the switch. |
| Time | Sub-second — controller pushes rules directly | No distributed convergence. No path exploration. |
| Interface | Match-action rules on any header field (not just IP prefix) | Video traffic? Match on port 443 + specific server IPs. The flexibility OSPF lacked — without inflating every device [11]. |
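
The Interface row can be illustrated with a minimal match-action lookup. The `Rule` class, the field names (`ip_dst`, `tcp_dst`), the addresses, and the action strings are hypothetical and do not follow the OpenFlow wire format; the sketch only shows priority-ordered matching on arbitrary header fields.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    priority: int
    match: dict   # header field -> required value; absent fields are wildcards
    action: str

def lookup(table, packet):
    """Return the action of the highest-priority rule whose fields all match."""
    for rule in sorted(table, key=lambda r: -r.priority):
        if all(packet.get(field) == value for field, value in rule.match.items()):
            return rule.action
    return "send_to_controller"  # table miss: ask the control plane

flow_table = [
    Rule(100, {"ip_dst": "203.0.113.10"}, "output:1"),
    Rule(200, {"ip_dst": "203.0.113.10", "tcp_dst": 443}, "output:2"),
]

video = lookup(flow_table, {"ip_dst": "203.0.113.10", "tcp_dst": 443})
bulk = lookup(flow_table, {"ip_dst": "203.0.113.10", "tcp_dst": 22})
miss = lookup(flow_table, {"ip_dst": "198.51.100.7", "tcp_dst": 443})
```

Traffic to the same server takes different ports depending on the TCP destination port, something destination-prefix forwarding cannot express. The switch itself stays simple: it compares fields and applies an action, and anything it does not recognize goes to the controller.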
The result is SDN/OpenFlow
The design matches OpenFlow as described by McKeown et al. (2008), who present it as an open protocol for programming the flow tables of different switches and routers [11].
| Invariant | OSPF (coupled) | Design from framework | SDN/OpenFlow (2008) |
|---|---|---|---|
| Coordination | Distributed — each router decides | Centralized controller | NOX, ONOS, ODL |
| State | Full topology in every router | Global in controller; switches hold only rules | Network Information Base |
| Time | Distributed convergence (seconds) | Controller pushes rules (ms) | Flow setup: milliseconds |
| Interface | Per-prefix forwarding (limited) | Match on any header field (flexible) | OpenFlow match-action tables |
The intellectual lineage traces through three papers, each addressing a different invariant:
- Feamster et al. (2004) — “The Case for Separating Routing from Routers” — focused on State: centralize the routing database [12].
- Casado et al. (2007) — Ethane — focused on Coordination: centralize policy enforcement [10].
- McKeown et al. (2008) — OpenFlow — focused on Interface: standardize the controller-switch abstraction [11].
All three were motivated by the same constraint shift: from survivability to manageability.
Two evolutions from one baseline
The two exercises reveal how a single framework generates radically different architectures from different constraint shifts:
| | OSPF (baseline) | BGP | SDN |
|---|---|---|---|
| Context | Intra-domain, cooperative | Inter-domain, commercial | Intra-domain, scale + flexibility |
| What broke | (baseline) | State: topology becomes a trade secret | Coordination: control/data coupling → cost bloat + inflexibility |
| Binding constraint | Survivability | Commercial sovereignty | Per-device cost + policy precision |
| Key principle | Closed-loop (LS flooding) | Closed-loop (slow updates, measurement under privacy) | Disaggregation (control from data plane) |
| Coordination | Distributed | Distributed (sovereign) | Centralized |
| State | Full topology | Filtered paths (permanent gap) | Global (controller) |
BGP: trust changes, the State answer changes, coordination stays distributed. SDN: cost and flexibility change, deeper disaggregation occurs, coordination centralizes. Same framework diagnosed both, generated both, from constraint shifts alone.
Forward: from logical networks to physical media
Both BGP and SDN operate on wired networks where communication is point-to-point — a packet sent on a fiber or copper link reaches exactly one destination. The shared-medium problem does not arise because the medium is not shared. Starting in Lecture 5, the domain shifts to wireless, where transmission is inherently broadcast and the medium is shared by all devices within range. Physics denies the most basic form of feedback: a wireless transmitter cannot hear what is happening to its own transmission. The four-invariant framework carries forward unchanged, but the binding constraint becomes physical — shared spectrum, not commercial trust or economic cost. The question becomes: how do you coordinate access to a medium that everyone hears but no one fully observes?
References
[1] P. Baran, “On Distributed Communications Networks,” IEEE Trans. Communications Systems, vol. CS-12, no. 1, pp. 1–9, March 1964.
[2] F. E. Heart, R. E. Kahn, S. M. Ornstein, W. R. Crowther, and D. C. Walden, “The Interface Message Processor for the ARPA Computer Network,” Proc. AFIPS Spring Joint Computer Conference, pp. 551–567, 1970.
[3] J. M. McQuillan, I. Richer, and E. C. Rosen, “The New Routing Algorithm for the ARPANET,” IEEE Trans. Communications, vol. COM-28, no. 5, pp. 711–719, May 1980.
[4] R. Bellman, “On a Routing Problem,” Quarterly of Applied Mathematics, vol. 16, no. 1, pp. 87–90, 1958.
[5] C. Hedrick, “Routing Information Protocol,” RFC 1058, June 1988.
[6] J. Moy, “OSPF Version 2,” RFC 2328, April 1998.
[7] Y. Rekhter, T. Li, and S. Hares, “A Border Gateway Protocol 4 (BGP-4),” RFC 4271, January 2006.
[8] T. G. Griffin, F. B. Shepherd, and G. Wilfong, “The Stable Paths Problem and Interdomain Routing,” IEEE/ACM Trans. Networking, vol. 10, no. 2, pp. 232–243, April 2002.
[9] L. Gao and J. Rexford, “Stable Internet Routing Without Global Coordination,” IEEE/ACM Trans. Networking, vol. 9, no. 6, pp. 681–692, December 2001.
[10] M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. McKeown, and S. Shenker, “Ethane: Taking Control of the Enterprise,” Proc. ACM SIGCOMM, pp. 1–12, 2007.
[11] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, “OpenFlow: Enabling Innovation in Campus Networks,” ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, pp. 69–74, April 2008.
[12] N. Feamster, H. Balakrishnan, J. Rexford, A. Shaikh, and J. van der Merwe, “The Case for Separating Routing from Routers,” Proc. ACM SIGCOMM Workshop on Future Directions in Network Architecture (FDNA), 2004.