CS176C — Advanced Topics in Internet Computing
2026-05-19
L10–L12: Netflix built a 60-second buffer. Zoom shrank it to 50 ms. Cloud gaming pushed below 50 ms.
Every technique — buffering, ABR, jitter management, FEC, loss concealment — was the application compensating for a network that treats every packet identically.
Today we go inside the router.
Question: If you could redesign the router, what is the first decision you would change?
The Architectural Anchor
A campus router: 1 Gbps inbound, 100 Mbps outbound. Traffic arrives 10x faster than it can leave.
Packets accumulate in a buffer. The buffer is finite. Someone must decide:
Who gets transmitted next?
That is the scheduling question.
This is not a subplot of transport. Queue management is a peer system to TCP — it creates the environment transport operates in, and transport reshapes queue behavior in return.
Order Is Destiny
One queue. Packets served in arrival order. Head pointer, tail pointer. Nothing else.
At 100 Gbps line rates, with up to roughly 150 million minimum-size packets per second, “cheap and fast” matters enormously.
Question: A Netflix client downloads a 4 MB video segment — roughly 2,700 packets. A VoIP packet (200 bytes, 20 ms of voice, 150 ms end-to-end deadline) arrives one packet later. What happens under FIFO?
The VoIP packet waits behind all 2,700 Netflix packets.
At 100 Mbps, each 1,500-byte packet takes 0.12 ms to transmit:
2,700 × 0.12 ms = 324 ms of queuing delay
The VoIP budget for the entire end-to-end path is 150 ms. This single queue alone consumes more than twice that budget.
The call is unusable — not because the network lacks capacity, but because the scheduler is indifferent to urgency.
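The arithmetic above can be checked with a short sketch. The constant and function names are illustrative, and the numbers are the ones from the example.

```python
LINK_BPS = 100e6        # outbound link rate from the example: 100 Mbps
PACKET_BYTES = 1500     # size of each Netflix packet
PACKETS_AHEAD = 2700    # packets queued ahead of the VoIP packet
VOIP_BUDGET_MS = 150    # end-to-end deadline for the VoIP packet


def transmission_ms(size_bytes, link_bps):
    """Time to serialize one packet onto the link, in milliseconds."""
    return size_bytes * 8 / link_bps * 1000


per_packet_ms = transmission_ms(PACKET_BYTES, LINK_BPS)
queuing_delay_ms = PACKETS_AHEAD * per_packet_ms

print(f"per-packet transmission time: {per_packet_ms:.2f} ms")
print(f"queuing delay behind {PACKETS_AHEAD} packets: {queuing_delay_ms:.0f} ms")
print(f"multiple of the VoIP budget: {queuing_delay_ms / VOIP_BUDGET_MS:.2f}x")
```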
Question: Two flows share the link. Flow A: 1,000 packets/sec. Flow B: 10 packets/sec. Is FIFO fair?
FIFO is fair per-packet — every packet waits its turn.
But Flow A consumes 99% of the link. Flow B’s packets land behind Flow A’s long train. Flow B’s latency is determined by Flow A’s sending rate.
Per-packet fairness ≠ per-flow fairness. FIFO punishes the innocent along with the aggressive.
FIFO’s coordination answer: no one decides. Arrival order is destiny.
Let Important Traffic Cut the Line
Instead of one queue, maintain multiple queues at different priority levels. Always serve the highest-priority non-empty queue first.
| Priority | Queue | Traffic type |
|---|---|---|
| High | Queue 1 | VoIP, emergency services, network control |
| Medium | Queue 2 | Interactive web, video conferencing |
| Low | Queue 3 | Bulk transfer, email, software updates |
Classify packets using the DiffServ field in the IP header (its 6-bit code point, the DSCP, was designed for exactly this).
Now the VoIP packet enters Queue 1, gets served immediately. Problem solved?
Question: Priority queuing works when high-priority traffic is light. What happens when it is heavy?
If the high-priority queue is always full — misconfigured server, attacker marking bulk traffic as high priority, or a legitimate surge — lower queues never get served.
Low-priority packets wait indefinitely. They starve.
Starvation is not a bug — it is a feature of strict priority. “Always serve highest priority” means always, even if it means other traffic never gets through.
The coordination evolved from “no one decides” (FIFO) to “the operator decides statically” (Priority). But static decisions break under dynamic traffic.
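A minimal sketch of strict priority and its starvation behavior. The DSCP values are standard code points (EF = 46, AF41 = 34), but the mapping of code points to the three queues is an assumed policy, and the class and function names are hypothetical.

```python
from collections import deque

# Assumed operator policy: EF (46) -> high, AF41 (34) -> medium, all else -> low.
DSCP_TO_QUEUE = {46: 0, 34: 1}
LOW = 2


def classify(dscp):
    """Return the queue index for a packet's DSCP value."""
    return DSCP_TO_QUEUE.get(dscp, LOW)


class StrictPriorityScheduler:
    def __init__(self, levels=3):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, packet, dscp):
        self.queues[classify(dscp)].append(packet)

    def dequeue(self):
        # Always serve the highest-priority non-empty queue.
        for q in self.queues:
            if q:
                return q.popleft()
        return None


sched = StrictPriorityScheduler()
sched.enqueue("bulk-1", dscp=0)
served = []
for slot in range(5):
    # One high-priority packet arrives per slot, so queue 0 is never empty.
    sched.enqueue(f"voip-{slot}", dscp=46)
    served.append(sched.dequeue())

print(served)                   # only voip packets are served
print(list(sched.queues[LOW]))  # bulk-1 is still waiting
```

As long as the high-priority arrivals keep pace with the link, `bulk-1` is never dequeued, which is the starvation described above.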
Everybody Gets a Turn
Round-robin: one queue per flow. Scheduler visits each non-empty queue, transmits one packet from each.
Queue A: [pkt] [pkt] [pkt] [pkt]
Queue B: [pkt] [pkt]
Queue C: [pkt]
Transmission order: A, B, C, A, B, A, A
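The transmission order above can be reproduced with a few lines. This is a sketch; the function name is illustrative.

```python
from collections import deque


def round_robin(queues):
    """Serve one packet from each non-empty queue in turn; return the flow order."""
    queues = {name: deque(pkts) for name, pkts in queues.items()}
    order = []
    while any(queues.values()):
        for name, q in queues.items():
            if q:
                q.popleft()
                order.append(name)
    return order


order = round_robin({"A": ["pkt"] * 4, "B": ["pkt"] * 2, "C": ["pkt"]})
print(", ".join(order))  # A, B, C, A, B, A, A
```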
Question: Is this fair?
Two flows share the link: Flow A sends 1,500-byte packets, Flow B sends 500-byte packets.
Round-robin gives each flow one packet per turn.
Flow A transmits 1,500 bytes per turn. Flow B transmits 500 bytes per turn.
Flow A gets 3× the bandwidth. Fair in packet count, unfair in byte count.
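A quick check of the byte shares when each flow sends one packet per turn. The helper name is illustrative.

```python
def byte_shares(packet_sizes):
    """Fraction of link bytes each flow gets with one packet per turn."""
    total = sum(packet_sizes.values())
    return {flow: size / total for flow, size in packet_sizes.items()}


shares = byte_shares({"A": 1500, "B": 500})
print(shares)                      # {'A': 0.75, 'B': 0.25}
print(shares["A"] / shares["B"])   # 3.0
```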
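Different applications routinely use different packet sizes: the VoIP packet from the FIFO example is 200 bytes, while a bulk transfer fills 1,500-byte packets.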
We want per-byte fairness, not per-packet fairness.
Proportional Shares via Virtual Time
Imagine an impossible scheduler: transmit one bit from each flow in turn. Perfect bitwise fairness regardless of packet sizes.
Of course, this is physically impossible — networks are packet-switched.
But: if we could compute which packet the ideal bit-by-bit scheduler would finish transmitting first, we could serve that packet first in the real system.
This is the insight of Weighted Fair Queuing (WFQ) — Demers, Keshav, and Shenker, 1989.
Each flow gets a queue and a weight \(w_i\) (its fair share of the link).
Virtual time: a clock that advances by bits transmitted, normalized by active flow weights.
Virtual finish time: when would this packet finish in the ideal bit-by-bit scheduler?
The scheduler serves the packet with the smallest virtual finish time — the one that would finish first in the ideal system.
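One standard way to write these two definitions, as a sketch. The notation is assumed here rather than taken from the slides: \(L_i^k\) is the length in bits of the \(k\)-th packet of flow \(i\), \(a_i^k\) its arrival time, \(F_i^k\) its virtual finish time, \(V(t)\) the virtual time, and \(C\) the link rate.

\[\frac{dV}{dt} = \frac{C}{\sum_{j \in \text{active}(t)} w_j}\]

\[F_i^k = \max\left(F_i^{k-1},\ V(a_i^k)\right) + \frac{L_i^k}{w_i}\]

The \(\max\) term says a packet starts either when the flow's previous packet finishes (flow is backlogged) or at the current virtual time (flow was idle). Dividing by \(w_i\) makes a higher-weight flow's packets finish sooner in virtual time.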
Over any sufficiently long interval, flow \(i\) receives:
\[\text{bandwidth share} = \frac{w_i}{\sum_j w_j}\]
A flow with weight 2 gets twice the bandwidth of a flow with weight 1.
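A simplified sketch of smallest-finish-time scheduling. It assumes every flow is continuously backlogged from time 0, so each packet's virtual start is just the previous packet's finish and no virtual clock needs to be tracked; a full WFQ implementation must also handle flows that go idle and return. The function and variable names are hypothetical.

```python
import heapq
from collections import deque


def wfq_order(flows):
    """flows: {name: (weight, [packet sizes in bytes])}, all queued at t = 0.

    Returns the transmission order as (flow, size) pairs.
    """
    queues = {name: deque(sizes) for name, (_, sizes) in flows.items()}
    weights = {name: w for name, (w, _) in flows.items()}

    # The heap holds one entry per flow: the finish tag of its head packet.
    heap = []
    for name, q in queues.items():
        if q:
            heapq.heappush(heap, (q[0] / weights[name], name))

    order = []
    while heap:
        finish, name = heapq.heappop(heap)      # O(log F) per packet
        size = queues[name].popleft()
        order.append((name, size))
        if queues[name]:
            next_finish = finish + queues[name][0] / weights[name]
            heapq.heappush(heap, (next_finish, name))
    return order


order = wfq_order({
    "A": (2, [1000] * 6),   # weight 2
    "B": (1, [1000] * 6),   # weight 1
})
first_six = [name for name, _ in order[:6]]
print(first_six)  # ['A', 'A', 'B', 'A', 'A', 'B']
```

While both flows are backlogged, A sends two packets for every one of B's, matching the 2:1 weights.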
100 Mbps link, three flows, equal weights:
| Flow | Needs | Weight | WFQ share |
|---|---|---|---|
| V (VoIP) | 64 kbps | 1 | 1/3 (~33 Mbps) |
| N (Netflix) | 5 Mbps | 1 | 1/3 (~33 Mbps) |
| B (Bulk) | greedy | 1 | 1/3 (~33 Mbps) |
Flow V only needs 64 kbps. Flow N only needs 5 Mbps. Unused share redistributed to Flow B.
Result: V gets 64 kbps with near-zero queuing delay (its queue is almost always empty). N gets 5 Mbps comfortably. B gets ~95 Mbps.
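The redistribution of unused share can be sketched as a water-filling computation with equal weights. This computes the long-run allocation only, not WFQ's packet-by-packet behavior, and the function name is illustrative.

```python
def max_min_share(capacity, demands):
    """Max-min fair allocation with equal weights.

    demands: {flow: demand in Mbps}; use float('inf') for a greedy flow.
    """
    alloc = {}
    remaining = capacity
    pending = dict(demands)
    while pending:
        fair = remaining / len(pending)
        # Flows that need no more than the current fair share are fully satisfied.
        satisfied = {f: d for f, d in pending.items() if d <= fair}
        if not satisfied:
            # Everyone left wants more than the fair share: split evenly.
            for f in pending:
                alloc[f] = fair
            break
        for f, d in satisfied.items():
            alloc[f] = d
            remaining -= d
            del pending[f]
    return alloc


alloc = max_min_share(100.0, {"V": 0.064, "N": 5.0, "B": float("inf")})
for flow, mbps in alloc.items():
    print(f"{flow}: {mbps:.3f} Mbps")
```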
Isolation: one flow’s behavior cannot degrade another flow’s performance. The VoIP packet no longer waits behind 2,700 Netflix packets.
Question: WFQ sounds ideal. Why isn’t it used everywhere?
For each flow, the router must maintain a queue, a weight, and the virtual finish time of its head-of-line packet.
Per-packet cost: O(log F) — must find the smallest virtual finish time among F flows.
| Deployment | Flows | Packet budget | WFQ feasible? |
|---|---|---|---|
| Campus router | hundreds | microseconds | Yes |
| Core Internet router | millions | ~8 nanoseconds | No |
Millions of queues. Millions of virtual finish times. Log-million comparison per packet at 100 Gbps.
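A rough sketch of where the per-packet budget comes from. The packet sizes are illustrative assumptions; a budget of about 8 ns corresponds to packets of roughly 100 bytes at 100 Gbps.

```python
import math


def packet_time_ns(size_bytes, line_rate_gbps):
    """Serialization time of one packet in nanoseconds (bits / Gbps = ns)."""
    return size_bytes * 8 / line_rate_gbps


for size in (64, 100, 1500):
    print(f"{size:>5} B at 100 Gbps: {packet_time_ns(size, 100):.2f} ns")

# Depth of a binary heap over one million flows: roughly the number of
# comparisons a sorted structure needs per packet.
heap_levels = math.ceil(math.log2(1_000_000))
print(f"heap levels for 1,000,000 flows: {heap_levels}")
```

About 20 dependent comparisons, each likely touching memory, have to fit inside a budget of a few nanoseconds for small packets.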
A Practical Compromise — O(1) per Packet
Deficit Round Robin (DRR): Shreedhar and Varghese, 1996. Each flow gets a queue and a deficit counter.
Each round: add quantum \(Q\) bytes to the deficit. Transmit packets while head-of-line size ≤ deficit. Subtract each packet’s size. Carry leftover deficit to the next round.
Example (\(Q = 1000\) bytes):
| Round | Flow A (1,500 B pkts) | Flow B (500 B pkts) |
|---|---|---|
| 1 | deficit = 1,000. Pkt = 1,500. Can’t send. Keeps 1,000. | deficit = 1,000. Send 2 pkts (1,000 B). Deficit = 0. |
| 2 | deficit = 2,000. Send 1 pkt (1,500 B). Deficit = 500. | deficit = 1,000. Send 2 pkts. Deficit = 0. |
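Over 2 rounds: A sent 1,500 B, B sent 2,000 B. Not perfectly equal, but the deficit self-corrects: A carries its leftover 500 B into round 3 and sends another packet, so after 3 rounds both flows have sent 3,000 B.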
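The rounds in the table can be replayed with a minimal DRR sketch. Resetting the deficit of an empty queue follows the usual DRR description; the function and variable names are illustrative.

```python
from collections import deque


def drr(flows, quantum, rounds):
    """flows: {name: [packet sizes in bytes]}.

    Returns (bytes sent per flow, leftover deficit per flow) after `rounds`.
    """
    queues = {name: deque(sizes) for name, sizes in flows.items()}
    deficit = {name: 0 for name in flows}
    sent = {name: 0 for name in flows}
    for _ in range(rounds):
        for name, q in queues.items():
            if not q:
                continue                      # idle flows earn no credit
            deficit[name] += quantum
            # Send while the head-of-line packet fits in the deficit.
            while q and q[0] <= deficit[name]:
                size = q.popleft()
                deficit[name] -= size
                sent[name] += size
            if not q:
                deficit[name] = 0             # empty queue: drop leftover credit
    return sent, deficit


flows = {"A": [1500] * 10, "B": [500] * 10}
sent2, deficit2 = drr(flows, quantum=1000, rounds=2)
sent3, deficit3 = drr(flows, quantum=1000, rounds=3)
print(sent2, deficit2)  # {'A': 1500, 'B': 2000} {'A': 500, 'B': 0}
print(sent3, deficit3)  # {'A': 3000, 'B': 3000} {'A': 0, 'B': 0}
```

The work per visit is an add, a compare, and a subtract, with no sorting across flows.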
DRR never sorts. Never computes virtual finish times. Visits each active flow once per round, adds a constant, compares, transmits.
Per-packet cost: O(1) — independent of the number of flows.
At 100 Gbps, a 1,500 B packet occupies the link for 120 nanoseconds, and a 100 B packet for only 8 nanoseconds. No time for log-F comparisons.
DRR sacrifices exact packet-by-packet ordering (off by one packet per flow per round). Over any reasonable interval, fairness converges.
DRR and its variants are the most widely deployed fair scheduling algorithms in real networks.
More State → More Fairness → More Cost
| Discipline | Per-flow state | Per-packet cost | Fairness |
|---|---|---|---|
| FIFO | None | O(1) | None — arrival order |
| Priority | Per-class (few queues) | O(1) | Class preference; starvation risk |
| Round-Robin | Per-flow queue | O(1) | Per-packet fair; byte-unfair |
| WFQ | Queue + virtual time | O(log F) | Max-min fair (per-byte) |
| DRR | Queue + deficit counter | O(1) | ≈ Max-min fair |
Each step up adds state. More state → more fairness. More state → more memory, computation, and complexity at line rate.
Question: Given this tradeoff, what scheduling discipline do you expect at the Internet core? At a cellular base station? Why?
| | Internet core | Cellular base station |
|---|---|---|
| Line rate | 100–400 Gbps | Hundreds of Mbps |
| Active flows | Millions | Tens to hundreds |
| Per-packet budget | ~nanoseconds | ~microseconds |
| Existing per-user state | None | Full (auth, handoff, power) |
| Scheduler | FIFO | WFQ / proportional-fair |
The core uses FIFO not because it is fair — we just showed it is not — but because it is the only discipline that scales to the core’s demands.
The base station uses WFQ because it already maintains per-user state. Adding a scheduling weight is trivial.
Same Coordination invariant, different resource:
| Medium access | Scheduling | State |
|---|---|---|
| ALOHA — no coordination | FIFO — no fairness | Zero |
| CSMA/CA — distributed, minimal | Round-Robin — per-flow turns | Per-flow |
| OFDMA — centralized, per-user | WFQ — proportional allocation | Per-flow + weights |
In medium access, the resource is the wireless channel. In scheduling, the resource is the output link.
The tradeoff between simplicity and fairness is identical.
Stochastic Fairness Queuing: hash flows into a fixed number of queues (e.g., 1,024). Apply DRR across those queues.
Tradeoff: O(1) state per queue (not per flow). Probabilistic fairness, not deterministic. Deployable at high line rates.
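A sketch of the hashing step. The choice of the 5-tuple as the flow key, CRC32 as the hash, and the `salt` parameter (standing in for periodically changing the hash so colliding flows do not stay paired forever) are all assumptions for illustration. The addresses come from the documentation ranges.

```python
import zlib

NUM_QUEUES = 1024


def queue_index(src_ip, dst_ip, src_port, dst_port, proto, salt=0):
    """Map a flow's 5-tuple to one of NUM_QUEUES buckets."""
    key = f"{salt}|{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % NUM_QUEUES


# Every packet of one flow lands in the same queue.
a = queue_index("192.0.2.1", "198.51.100.7", 50000, 443, "tcp")
b = queue_index("192.0.2.1", "198.51.100.7", 50000, 443, "tcp")
print(a == b)  # True

# With more flows than queues, some flows must share a queue.
indices = [
    queue_index("192.0.2.1", "198.51.100.7", port, 443, "tcp")
    for port in range(40000, 42000)
]
print(len(indices), "flows in", len(set(indices)), "queues")
```

Flows that collide share one queue's bandwidth, which is why the fairness is probabilistic rather than guaranteed.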
Who gets served next?
The fundamental tradeoff: fairness requires state, and state has a cost that depends on deployment context.
This is the Coordination invariant applied to the queue — the same invariant we traced through medium access (L5–L8), transport (L3–L4), and multimedia (L10–L12).
Today: who gets served next? We left a question untouched.
Under every discipline we discussed, when a packet arrives and the buffer is full, the router drops the arriving packet (tail-drop).
Question: What happens when TCP senders all hit a full buffer at the same time?
Every sender loses packets simultaneously → every sender backs off simultaneously → buffer drains → every sender ramps up simultaneously → buffer fills → repeat.
Global synchronization — a destructive oscillation between full utilization and collapse.
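The oscillation can be illustrated with a deliberately crude toy model. It assumes identical senders that each add one packet per RTT and all halve in the same RTT when the buffer overflows. Real TCP dynamics are far richer, and every parameter below is made up for illustration.

```python
def simulate_tail_drop(num_senders=10, capacity=100, buffer=20, rtts=60):
    """Toy AIMD model; returns link utilization per RTT (0.0 to 1.0)."""
    cwnd = [1.0] * num_senders
    utilization = []
    for _ in range(rtts):
        offered = sum(cwnd)
        utilization.append(min(offered, capacity) / capacity)
        if offered > capacity + buffer:
            # Tail-drop hits every sender in the same RTT: all halve together.
            cwnd = [w / 2 for w in cwnd]
        else:
            # No loss: every sender grows by one packet per RTT.
            cwnd = [w + 1 for w in cwnd]
    return utilization


util = simulate_tail_drop()
steady = util[15:]   # skip the initial ramp-up
print(f"peak utilization: {max(steady):.2f}")
print(f"trough utilization: {min(steady):.2f}")
```

Because every sender backs off in the same RTT, utilization repeatedly falls well below the link rate before climbing back.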
Could the router drop packets before the buffer is full? Choose which packets to drop? Use the drop as a signal to senders?
That is Active Queue Management — L14.