Lecture 1: Welcome & Course Overview

Course: CS176C — Advanced Topics in Internet Computing, Spring 2026
Instructor: Arpit Gupta, UC Santa Barbara
Date: March 31, 2026
Slides: Deployed slide deck
Pre-requisite: CS176A or equivalent introductory networking course

Logistics

Item	Detail
Lectures	Tue & Thu, 2:00–3:15 PM, TD-W 1701 (Sanjay covers when Arpit is traveling)
Discussion	Fri, 1:00–1:50 & 2:00–2:50 PM, NH 1109 — attendance required
Communication	Slack — join today
Grading	PAs (40%), Midterm (20%), Project (20%), Participation (20%)
Participation	Discussion attendance + PA oral check-ins (can you explain your own code?)
Midterm	May 5, in-class, closed-device — no LLMs, no notes
Book	First Principles of Networking — free, online, work in progress
Full details	Course website

Course team, all PhD students in the Systems & Networking Lab (SNL) who build the tools you will use:

Person	Role
Sanjay Chandrasekaran	TA — runs discussion, covers lectures when Arpit travels
Sylee Beltiukov	TA — runs discussion
Jaber Daneshamooz	Staff — lead developer of NetReplica and NetGent; supports PA4 + project
Manni Moghimi	Staff — broadband quality data infrastructure

Course schedule:

Weeks	Chapter
1–2	Ch 1
3–5	Ch 3 (Wireless Link) + Ch 4 (Wireless Infra)
6–7	Ch 7 (Queue Management)
7–8	Ch 11 (Multimedia)
8–10	Ch 12 (Measurement)

Before Thursday: join Slack and read Ch 1: First Principles. PA1 drops Wednesday.

The north star: performant Internet for all

This course is taught by Arpit Gupta — Associate Professor at UC Santa Barbara, UC Presidential Faculty Fellow, Faculty Scientist at Lawrence Berkeley National Laboratory, and Fellow of the Benton Institute for Broadband & Society. The research group behind this course is the Systems & Networking Lab (SNL), and the north star driving every project, every tool, and this course is a single commitment: make performant Internet accessible for all.

The Internet transformed work, healthcare, education, and government. But connectivity remains deeply unequal across four dimensions:

Dimension	What it means	The gap
Availability	Can you get broadband service at your address?	30M Americans lack access; 40% in rural/tribal areas
Quality	Does the service actually deliver what’s advertised?	ISP self-reported data is unreliable and noisy
Affordability	Can you afford the service that’s available?	10% of the population cannot afford Internet
Adoption	Do people actually use the service when it’s available?	Seniors and low-income households lag behind

Two research paths flow from this north star. The first is building systems that make high-quality connectivity cheaper to deliver. The second is building data infrastructure to measure whether we are succeeding. The United States has spent over $100 billion in the past 30 years on broadband programs and still cannot reliably answer whether those programs worked [1].

The data gap: what we can’t see, we can’t fix

Policymakers allocate billions based on ISP self-reported data. The problem is that ISPs have incentives to overstate coverage and quality. The FCC’s National Broadband Map relied on ISP certifications that nobody independently verified. Crowdsourced speed tests from Ookla and M-Lab lack context — a “slow” result might be a cheap plan working correctly, or a premium plan failing. No independent dataset for broadband affordability existed at all. “What we can’t see, we can’t fix” [1].

BQT (Broadband Quality Tool) automates querying ISP websites to extract advertised speeds and prices at individual street addresses. The insight is that ISP websites already contain accurate data — because they need to sell to customers [2][3]. BQT enabled querying over one million addresses for more than fifty ISPs, bridging three critical data gaps:

Gap	Before BQT	With BQT
Availability	ISP self-reports (unreliable)	Independent spot-checking of ISP claims
Quality	Crowdsourced tests (noisy, no context)	Subscription tier context to denoise speed tests
Affordability	No dataset	First-ever street-address pricing data

Subscription tier context matters: median speed can vary 3–10x depending on the plan [4].
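To make the role of tier context concrete, here is a minimal sketch in Python. It is not the method of [4]. The records, the 0.5 threshold, and the function names are all invented for illustration, and it assumes each speed test can be paired with the advertised speed of the subscriber's plan.

```python
from statistics import median

# Hypothetical (measured Mbps, advertised plan Mbps) pairs, invented for illustration.
speed_tests = [
    (22.0, 25.0), (24.0, 25.0),
    (90.0, 100.0), (95.0, 100.0),
    (240.0, 1000.0), (880.0, 1000.0),
]

def median_by_tier(tests):
    """Group measured speeds by advertised tier and return each tier's median."""
    tiers = {}
    for measured, advertised in tests:
        tiers.setdefault(advertised, []).append(measured)
    return {tier: median(values) for tier, values in sorted(tiers.items())}

def underperforming(tests, threshold=0.5):
    """Tests delivering less than `threshold` of the advertised speed.

    The threshold is an arbitrary assumption, not a regulatory standard.
    """
    return [(m, a) for m, a in tests if m / a < threshold]

raw_median = median(m for m, _ in speed_tests)  # ignores plan context
per_tier = median_by_tier(speed_tests)          # compares like with like
flagged = underperforming(speed_tests)

print(raw_median)
print(per_tier)
print(flagged)
```

Without tier context, the 22 Mbps test looks slow and the 240 Mbps test looks fast. With it, the 22 Mbps test is a 25 Mbps plan working roughly as sold, and the 240 Mbps test is a gigabit plan falling well short.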

A $10 billion audit

The Connect America Fund (2011–2021) spent $10 billion subsidizing ISPs to serve high-cost rural areas. BQT audited what was actually delivered [3]:

Metric	Finding
Addresses actually served	~55% of what ISPs certified to the FCC
Plans meeting quality standards	~33% compliance among served addresses
Largest recipient (~$1B)	Systematically ignored remote rural areas

A decade-long, $10 billion investment did not deliver as promised, and nobody had the data to know.
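As a rough back-of-the-envelope reading of the table above, the two approximate rates can be combined. This assumes the compliance rate applies only to served addresses, as the table states, so the rates multiply. The variable names are illustrative and the inputs are the rounded figures quoted above, so the result is indicative only and is not a number reported in [3].

```python
# Rounded figures from the audit table above.
served_fraction = 0.55         # share of certified addresses actually served
compliant_given_served = 0.33  # share of served addresses meeting quality standards

# Assumption: the two rates compose multiplicatively.
served_and_compliant = served_fraction * compliant_given_served

print(f"{served_and_compliant:.0%} of certified addresses served with a compliant plan")
```

Under that assumption, fewer than one in five certified addresses received service that met the program's quality standards.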

The intellectual impact includes publications at SIGCOMM ’23 [2], SIGCOMM ’24 [3], and IMC ’22 Best Paper [4], alongside the IRTF Applied Networking Research Prize ’25 and the SIGCOMM Doctoral Dissertation Award ’24. The policy impact spans multiple levels of government: the City of LA now requires subscription tier context for speed test measurements; the FCC reclassified DSL-served locations as underserved; Virginia’s JCOTS delivered a broadband affordability report to the legislature; and the research contributed evidence to an amicus brief before the U.S. Supreme Court in Wisconsin Bell v. United States. One research tool bridged the gap between measurement and accountability at every level of government.

BQT continues to shape legislative decisions. California’s CPUC is using BQT data to audit ISP compliance and support digital discrimination cases. In New York City, the group partnered with the NYC Office of Technology & Innovation on a broadband equity assessment. Next, BQT is positioned to track the $42 billion BEAD program, providing independent accountability at national scale. A continuous broadband quality monitoring platform called NetVibe, which captures application-level quality of experience, will be discussed later in the course.

Self-driving networks

The second research path addresses why community networks cannot afford large operations teams. Self-driving networks bring down the cost of delivering performant connectivity — and represent the networking instance of a broader challenge in the AI era: embedding ML models inside reliable, trustworthy operational loops. Current work with Google and the Department of Energy focuses on building the representation layer enabling agentic systems to manage complex networks with minimal human intervention. The core technical challenge is that ML models trained for networking do not generalize [5]. The root cause is data quality, not model architecture — which is where the tools come from.

The self-driving networks pipeline required a data generation substrate, and that substrate became the tools students use this quarter:

Tool	What it does	Reference
NetUnicorn	Data-generation thin waist: connects learning problems to network infrastructure	CCS ’23 [6]
NetReplica	Programmable control over real access network conditions	arXiv [7]
NetGent	Agent-based automation of application workflows (the Agentic Thin Waist)	NeurIPS MLforSys ’25 [8]

These are not textbook exercises. They are research infrastructure built by the same PhD students who are the course TAs.

From research to first principles

Every breakthrough described above came from asking the right first question — not from knowing the right mechanism. BQT did not start with “how do I build a better speed test.” It started with: “What is the fundamental data gap in broadband policymaking?” Self-driving networks did not start with “which ML model is best.” It started with: “Why don’t existing ML models for networking generalize?” The mechanism came later. The first-principles question came first. This pattern — identify the structural constraint before optimizing the mechanism — is exactly what this course teaches.

Computing as a generative discipline

Computing is a generative discipline — it produces the abstractions from which entirely new problem spaces emerge [9]:

Decade	Abstraction	What it is	What it created
1960s	Packet switching	Breaking data into discrete, independently routed units	Networking — resource sharing, routing, congestion theory
1970–80s	The IP thin waist	A single narrow interface (best-effort datagrams) between heterogeneous networks	The Internet — any network below, any application above
1990s	HTTP + URL + HTML	Universal naming, transport, and rendering for hyperlinked documents	The World Wide Web — content creation, search, e-commerce
2000s	Virtualization + cloud APIs	Hardware abstraction + on-demand elastic compute via programmatic interfaces	Cloud computing — MapReduce, containers, large-scale ML training

Each row names a specific computing abstraction that opened a problem space with its own theory, methods, and communities. The IP thin waist is critical for this course: it is why TCP must infer congestion rather than being told [10].

AI itself emerged from decades of computing abstractions stacked on top of each other: packet switching enabled resource sharing, the IP thin waist enabled an interoperable global network, HTTP and the web enabled data at scale, cloud APIs enabled elastic GPU clusters, and distributed training at scale produced LLMs. Remove any link and LLMs do not exist. Computing generated AI — not the other way around.

The frontier has moved: from LLMs to agentic systems

Two years ago, the consensus was that LLMs would subsume everything. Today the frontier is agentic systems — AI models embedded in larger frameworks that plan, use tools, maintain state, and interact with environments [11]:

Dimension	LLMs	Agentic Systems
Core abstractions	Statistical learning, optimization, parallel computation	Distributed systems, planning, state management, security
Key challenge	Scale model training	Build reliable systems around models
Foundation required	Linear algebra, probability, GPU programming	Systems design, formal reasoning, software architecture

The shift happened in under two years and demanded entirely different foundational knowledge [11]. Whatever follows agentic systems will require yet another combination of computing abstractions. The value of your education is learning to reason at the level of the abstractions that generate paradigms. Students who understood decomposition, resource allocation, and formal specification navigated the paradigm shift — because they were operating at that generative level.

First-principles thinking amplified by AI

An LLM can recite TCP, describe AIMD, and list 802.11 frame fields. That recall is now automated. But a human trained in first-principles reasoning combined with an LLM produces qualitatively better outcomes than either one alone. The trained human knows what to ask, how to structure the problem, and when the LLM is wrong. The LLM accelerates execution. Someone without first-principles training struggles to outperform AI-only solutions. Someone with it turns AI into a force multiplier.

By the end of CS176C, you will have built three capabilities: the ability to analyze any networked system by identifying its anchor constraint and tracing how it shapes the answer to each invariant; the ability to predict how a system must restructure when a constraint shifts, before looking at what engineers actually did; and the ability to design a new system under novel constraints using the same framework. These are capability objectives, not recall objectives. The midterm tests them on novel systems. The PAs build them progressively.

Three puzzles this course will solve

Three puzzles will recur throughout the course. First: in March 2020, the EU asked Netflix to reduce streaming quality to prevent network collapse. Why did that help, and which three systems are coupled in the dependency chain? Second: in 2020, Zoom surged with remote work and chose UDP instead of TCP. What constraint forces that choice? Third: in 2025, 5G radio delivers 1 ms latency, so why does your phone still buffer? You already know these events happened. This course teaches why they had to happen by tracing the structural constraints that forced each outcome. We revisit each puzzle when covering the relevant system: Netflix and bufferbloat in Chapters 7 and 9, Zoom and time constraints in Chapter 11, and 5G latency in Chapter 4.

The four invariants

Every networked system must answer four structural questions:

Invariant	Question	TCP’s Answer
State	What does the system know? How does it learn?	cwnd — belief about capacity, from ACK arrivals
Time	When do things happen? How fast?	RTT inferred via Jacobson’s algorithm [12]
Coordination	Who decides? One entity or many?	Distributed — each sender adjusts via AIMD
Interface	What’s exposed? What’s hidden?	Byte stream above; datagrams below

The answers differ radically across systems. The questions never change.
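TCP's Time answer in the table can be sketched as a small estimator. This is a minimal sketch, assuming the commonly used form of Jacobson's algorithm [12] with gains 1/8 and 1/4 and a multiplier of 4, as later standardized in RFC 6298. The class name and the sample values are illustrative, and the sketch omits the minimum-RTO clamp and timer backoff that real stacks apply.

```python
class RttEstimator:
    """Smoothed RTT and variance tracker yielding a retransmission timeout (RTO)."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variation
    K = 4          # how many variations of headroom the timeout adds

    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def update(self, sample):
        """Fold one RTT sample (in seconds) into the estimate and return the RTO."""
        if self.srtt is None:
            # First measurement: seed both estimates from the sample.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # The variation is updated first, using the previous smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.srtt + self.K * self.rttvar


estimator = RttEstimator()
first_rto = estimator.update(0.100)   # one ACK arrives 100 ms after its segment
second_rto = estimator.update(0.200)  # the next sample is twice as slow
print(first_rto, second_rto)
```

The sender is never told the path delay. It only observes ACK arrivals and widens the timeout when the samples become more variable.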

The anchor constraint

Every system has one constraint that shapes all four answers. TCP’s anchor is unreliable IP with no central authority — from which distributed state (cwnd per sender), inferred timing (RTT from ACKs), distributed coordination (AIMD), and the byte-stream/datagram interface split all follow. WiFi’s anchor is the shared wireless medium — from which carrier sensing, microsecond timing (slots, DIFS/SIFS), distributed contention, and frame-level collision handling all follow. Find the anchor and the design becomes predictable.
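The distributed coordination that follows from TCP's anchor can be illustrated with a toy AIMD simulation. This is a sketch under strong assumptions: two senders share one link, both see a loss signal in the same round whenever total demand exceeds capacity, and windows are real numbers. The function names, capacity, and starting windows are invented for illustration.

```python
def aimd_step(cwnds, capacity, increase=1.0, decrease=0.5):
    """Advance every sender by one round.

    The capacity check stands in for the network dropping packets. Each sender
    then applies its own rule using only that signal, never the other's window.
    """
    if sum(cwnds) > capacity:
        return [w * decrease for w in cwnds]  # multiplicative decrease on loss
    return [w + increase for w in cwnds]      # additive increase otherwise


def simulate(cwnds, capacity, rounds):
    """Run AIMD for a fixed number of rounds and return the final windows."""
    for _ in range(rounds):
        cwnds = aimd_step(cwnds, capacity)
    return cwnds


start = [1.0, 40.0]  # one sender starts far ahead of the other
final = simulate(start, capacity=50.0, rounds=200)

gap_start = abs(start[0] - start[1])
gap_final = abs(final[0] - final[1])
print(gap_start, gap_final)
```

Additive increase leaves the gap between the two windows unchanged, while each multiplicative decrease halves it. The senders therefore drift toward equal shares without any central scheduler.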

Loading a web page: the dependency chain

Consider what happens when you open your laptop, connect to WiFi, and type www.google.com. Your laptop starts with nothing: no IP address, no idea where the router is, no idea where Google is. Five protocols fire in sequence, each solving one problem and producing state the next one needs: DHCP obtains an IP address and learns the router and DNS server IPs; ARP resolves the router’s IP address to its MAC address so that frames can be sent to it; DNS resolves www.google.com to an IP address; TCP establishes a reliable connection to Google’s server; and HTTP fetches the page content.

The topology from Chapter 1 of the book shows this dependency chain spatially. DHCP and ARP are local — confined to the LAN. DNS queries traverse the Internet to the DNS server. TCP and HTTP reach Google’s web server through ISP routing. Each protocol operates at a different scope because each solves a different structural problem.
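The dependency chain described above can be modeled as steps that consume and produce state. This is a simplified sketch, not a protocol implementation: the state names such as `router_mac` and the exact `needs` sets are assumptions chosen for illustration, and the scopes follow the local versus Internet split described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    protocol: str
    scope: str           # "LAN" or "Internet", following the text
    needs: frozenset     # state that must already exist
    produces: frozenset  # state this step adds


CHAIN = [
    Step("DHCP", "LAN", frozenset(),
         frozenset({"own_ip", "router_ip", "dns_server_ip"})),
    Step("ARP", "LAN", frozenset({"own_ip", "router_ip"}),
         frozenset({"router_mac"})),
    Step("DNS", "Internet", frozenset({"own_ip", "router_mac", "dns_server_ip"}),
         frozenset({"server_ip"})),
    Step("TCP", "Internet", frozenset({"own_ip", "router_mac", "server_ip"}),
         frozenset({"connection"})),
    Step("HTTP", "Internet", frozenset({"connection"}),
         frozenset({"page"})),
]


def run(chain):
    """Execute steps in order, failing if a step lacks state it depends on."""
    state = set()  # the laptop starts with nothing
    for step in chain:
        missing = step.needs - state
        if missing:
            raise RuntimeError(f"{step.protocol} cannot run; missing {sorted(missing)}")
        state |= step.produces
    return state


final_state = run(CHAIN)
print(sorted(final_state))
```

Reordering the list, for example placing DNS before DHCP, makes `run` raise an error. The order of the five protocols is forced by what each one needs from the ones before it.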

Three protocols, three design patterns

Placing three of these protocols side by side through the four invariants reveals how different constraints force different designs:

	DHCP	DNS	TCP
State	Centralized — server tracks pool	Hierarchical — partitioned namespace + caching	Distributed — independent cwnd per endpoint
Time	Lease-based — fixed duration	TTL-based — each record expires	Inferred — RTT from ACKs (Jacobson) [12]
Coordination	Centralized — server decides	Hierarchical delegation — root → TLD → auth	Distributed — AIMD, no central scheduler
Interface	UDP broadcast (client has no IP address yet)	UDP port 53, over routed IP	Byte stream above, datagrams below

DHCP centralizes because unique addresses need one authority. TCP distributes because the Internet has no central authority that could act as a scheduler. DNS delegates because the namespace is too large for one server and too structured for full distribution. Same four questions, radically different answers, each forced by different constraints [10][13].
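The Time row for DHCP and DNS shares one idea: state is valid only until it expires. Here is a minimal sketch of that idea, assuming a single table type can stand in for both a DNS cache (TTL) and a DHCP lease table (lease duration). The class name, the lifetimes, and the addresses, which come from documentation ranges, are illustrative, and the clock is passed in explicitly to keep the example deterministic.

```python
class ExpiringTable:
    """Key-value store whose entries vanish after a fixed lifetime."""

    def __init__(self):
        self._entries = {}

    def put(self, key, value, lifetime, now):
        """Store a value that stays valid for `lifetime` seconds after `now`."""
        self._entries[key] = (value, now + lifetime)

    def get(self, key, now):
        """Return the value if it is still valid at time `now`, else None."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._entries[key]  # expired state is discarded, not trusted
            return None
        return value


# DNS-style use: a resolver caches an answer for the record's TTL.
dns_cache = ExpiringTable()
dns_cache.put("www.example.com", "192.0.2.10", lifetime=300, now=0)
hit = dns_cache.get("www.example.com", now=299)
miss = dns_cache.get("www.example.com", now=300)

# DHCP-style use: a server tracks which address a client holds, and until when.
leases = ExpiringTable()
leases.put("aa:bb:cc:dd:ee:ff", "192.0.2.50", lifetime=3600, now=0)
lease_active = leases.get("aa:bb:cc:dd:ee:ff", now=1800)

print(hit, miss, lease_active)
```

Neither system needs to be told when remote state changes. Expiry bounds how stale a cached answer or an address assignment can become.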

What comes next

Thursday’s lecture (L2) takes the four invariants from a descriptive table to a generative tool. Where today’s lecture showed TCP’s answers, L2 teaches how to derive those answers from a binding constraint — and how to apply the same derivation to any networked system. Read Chapter 1 of the book before then.

References

[1] A. Gupta, “What We Can’t See, We Can’t Fix,” Benton Institute for Broadband & Society OpEd, January 2026.

[2] V. Padmanabhan, R. Pang, A. Gupta et al., “BQT: Broadband Quality Tool,” Proc. ACM SIGCOMM, 2023.

[3] A. Gupta et al., “Auditing the Connect America Fund,” Proc. ACM SIGCOMM, 2024.

[4] A. Gupta et al., “Subscription Tier Context for Speed Test Denoising,” Proc. ACM IMC, 2022. (Best Paper Award)

[5] A. Gupta et al., “Why ML Models for Networking Don’t Generalize,” Proc. ACM CCS, 2022.

[6] R. Rivera et al., “NetUnicorn: A Data-Collection Thin Waist for Network Measurement,” Proc. ACM CCS, 2023.

[7] S. Chandrasekaran et al., “NetReplica: Programmable Control over Real Access Network Conditions,” arXiv, 2025.

[8] S. Beltiukov et al., “NetGent: Agent-Based Automation of Application Workflows,” NeurIPS MLforSys Workshop, 2025.

[9] A. Gupta, “Computing Is a Generative Discipline,” Blog post, 2025.

[10] J. F. Kurose and K. W. Ross, Computer Networking: A Top-Down Approach, 8th ed., Pearson, 2021.

[11] A. Gupta, “Systems for Agents, Agents for Systems,” Blog post, 2025.

[12] V. Jacobson, “Congestion Avoidance and Control,” Proc. ACM SIGCOMM, pp. 314–329, 1988.

[13] J. H. Saltzer, D. P. Reed, and D. D. Clark, “End-to-End Arguments in System Design,” ACM Trans. Computer Systems, vol. 2, no. 4, pp. 277–288, November 1984.