Who Gets Served Next?

CS176C — Advanced Topics in Internet Computing

Arpit Gupta

2026-05-19

Three Lectures of Application Coping

L10–L12: Netflix built a 60-second buffer. Zoom shrank it to 50 ms. Cloud gaming pushed below 50 ms.

Every technique — buffering, ABR, jitter management, FEC, loss concealment — was the application compensating for a network that treats every packet identically.

Today we go inside the router.

Question: If you could redesign the router, what is the first decision you would change?

The Bottleneck Queue

A campus router: 1 Gbps inbound, 100 Mbps outbound. Traffic arrives 10x faster than it can leave.

Packets accumulate in a buffer. The buffer is finite. Someone must decide:

Who gets transmitted next?

That is the scheduling question.

This is not a subplot of transport. Queue management is a peer system to TCP — it creates the environment transport operates in, and transport reshapes queue behavior in return.

FIFO: The Default

Order Is Destiny

How FIFO Works

One queue. Packets served in arrival order. Head pointer, tail pointer. Nothing else.

  • No per-flow tracking. No classification. No weights. No priorities.
  • Work-conserving: the link never idles when packets are available.
  • The default everywhere. The vast majority of Internet routers use FIFO today.

At 100 Gbps line rates (up to roughly 150 million minimum-size packets per second), “cheap and fast” matters enormously.

The VoIP Packet Behind Netflix

Question: A Netflix client downloads a 4 MB video segment — roughly 2,700 packets. A VoIP packet (200 bytes, 20 ms of voice, 150 ms end-to-end deadline) arrives one packet later. What happens under FIFO?

The VoIP packet waits behind all 2,700 Netflix packets.

At 100 Mbps, each 1,500-byte packet takes 0.12 ms to transmit:

2,700 × 0.12 ms = 324 ms of queuing delay

The VoIP budget for the entire end-to-end path is 150 ms. This one queue alone consumes more than twice that budget.
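The arithmetic above can be checked with a short sketch. The function names are hypothetical, and the model assumes every queued packet is a full 1,500 bytes and ignores propagation and processing delay.

```python
def transmission_time_ms(packet_bytes: int, link_bps: float) -> float:
    """Time to serialize one packet onto the link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000


def fifo_wait_ms(packets_ahead: int, packet_bytes: int, link_bps: float) -> float:
    """Queuing delay for a packet that arrives behind `packets_ahead` packets."""
    return packets_ahead * transmission_time_ms(packet_bytes, link_bps)


per_packet = transmission_time_ms(1500, 100e6)   # about 0.12 ms
wait = fifo_wait_ms(2700, 1500, 100e6)           # about 324 ms
print(f"{per_packet:.2f} ms per packet, {wait:.0f} ms queued, "
      f"{wait / 150:.2f}x the 150 ms budget")
```

Changing the link rate or the number of packets ahead shows how quickly a single FIFO queue can exhaust a latency budget.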

The call is unusable — not because the network lacks capacity, but because the scheduler is indifferent to urgency.

FIFO’s Fairness Problem

Question: Two flows share the link. Flow A: 1,000 packets/sec. Flow B: 10 packets/sec. Is FIFO fair?

FIFO is fair per-packet — every packet waits its turn.

But Flow A accounts for about 99% of the packets and, with equal packet sizes, 99% of the link. Flow B’s packets land behind Flow A’s long train, so Flow B’s latency is determined by Flow A’s sending rate.

Per-packet fairness ≠ per-flow fairness. FIFO punishes the innocent along with the aggressive.

FIFO’s coordination answer: no one decides. Arrival order is destiny.

Priority Queuing

Let Important Traffic Cut the Line

The Fix: Multiple Queues, Strict Priority

Instead of one queue, maintain multiple queues at different priority levels. Always serve the highest-priority non-empty queue first.

Priority  Queue    Traffic type
High      Queue 1  VoIP, emergency services, network control
Medium    Queue 2  Interactive web, video conferencing
Low       Queue 3  Bulk transfer, email, software updates

Classify packets using the DiffServ field in the IP header (6 bits, designed for exactly this).
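A minimal sketch of that classification step follows. The DSCP code points (EF = 46, CS6 = 48, AF41 = 34) are standard values, but the mapping from code point to queue is a hypothetical policy chosen for illustration, as are the function names.

```python
# Hypothetical DSCP-to-queue policy for illustration; operators choose their own.
DSCP_EF = 46     # Expedited Forwarding, commonly used for voice
DSCP_CS6 = 48    # Class Selector 6, commonly used for network control
DSCP_AF41 = 34   # Assured Forwarding 41, often used for interactive video

QUEUE_FOR_DSCP = {DSCP_EF: 1, DSCP_CS6: 1, DSCP_AF41: 2}
DEFAULT_QUEUE = 3  # unmarked traffic is treated as low priority


def dscp_from_tos(tos_byte: int) -> int:
    """The DSCP is the upper 6 bits of the IPv4 ToS / IPv6 Traffic Class byte."""
    return (tos_byte & 0xFF) >> 2


def classify(tos_byte: int) -> int:
    """Return the queue number (1 = high, 3 = low) for a packet."""
    return QUEUE_FOR_DSCP.get(dscp_from_tos(tos_byte), DEFAULT_QUEUE)


print(classify(0xB8))  # ToS byte 0xB8 carries DSCP 46 (EF) -> queue 1
print(classify(0x00))  # unmarked, best effort -> queue 3
```

The router trusts whatever marking the packet carries, which is why mis-marked traffic matters for the next slide.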

Now the VoIP packet enters Queue 1, gets served immediately. Problem solved?

The Starvation Problem

Question: Priority queuing works when high-priority traffic is light. What happens when it is heavy?

If the high-priority queue never empties (a misconfigured server, an attacker marking bulk traffic as high priority, or a legitimate surge), lower queues never get served.

Low-priority packets wait indefinitely. They starve.

Starvation is not a bug — it is a feature of strict priority. “Always serve highest priority” means always, even if it means other traffic never gets through.
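Starvation is easy to reproduce in a toy model. The sketch below is an illustrative strict-priority scheduler, not a router implementation; the class and packet names are hypothetical, and it assumes one high-priority arrival per transmission slot.

```python
from collections import deque


class StrictPriorityScheduler:
    """Serve the lowest-numbered non-empty queue; 1 is the highest priority."""

    def __init__(self, levels: int = 3):
        self.queues = {level: deque() for level in range(1, levels + 1)}

    def enqueue(self, level: int, packet: str) -> None:
        self.queues[level].append(packet)

    def dequeue(self):
        for level in sorted(self.queues):
            if self.queues[level]:
                return self.queues[level].popleft()
        return None  # all queues are empty


sched = StrictPriorityScheduler()
sched.enqueue(3, "bulk-0")             # one low-priority packet waits throughout
served = []
for slot in range(5):
    sched.enqueue(1, f"voip-{slot}")   # high-priority traffic arrives every slot
    served.append(sched.dequeue())

print(served)                  # only voip packets were served
print(list(sched.queues[3]))   # ['bulk-0'] is still waiting
```

As long as the loop keeps feeding queue 1, the low-priority packet is never transmitted, no matter how long it has waited.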

The coordination evolved from “no one decides” (FIFO) to “the operator decides statically” (Priority). But static decisions break under dynamic traffic.

Round-Robin

Everybody Gets a Turn

Separate Queue Per Flow, Take Turns

Round-robin: one queue per flow. Scheduler visits each non-empty queue, transmits one packet from each.

Queue A:  [pkt] [pkt] [pkt] [pkt]
Queue B:  [pkt] [pkt]
Queue C:  [pkt]

Transmission order: A, B, C, A, B, A, A

  • No flow starves — every flow gets its turn.
  • Coordination answer: take turns.
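The queue diagram above can be reproduced with a few lines. This is a sketch under the assumption that queues are visited in a fixed order and no new packets arrive during the run; `round_robin` is a hypothetical helper name.

```python
from collections import deque


def round_robin(queues: dict) -> list:
    """Visit queues in a fixed order, sending one packet from each non-empty one."""
    pending = {name: deque(pkts) for name, pkts in queues.items()}
    order = []
    while any(pending.values()):
        for name, q in pending.items():
            if q:
                q.popleft()
                order.append(name)
    return order


order = round_robin({"A": ["pkt"] * 4, "B": ["pkt"] * 2, "C": ["pkt"]})
print(", ".join(order))  # A, B, C, A, B, A, A
```

Empty queues are skipped rather than given an idle slot, so the scheduler stays work-conserving.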

Question: Is this fair?

The Packet-Size Problem

Two flows share the link:

  • Flow A: 1,500-byte packets (bulk transfer, full MTU)
  • Flow B: 500-byte packets (VoIP / interactive)

Round-robin gives each flow one packet per turn.

Flow A transmits 1,500 bytes per turn. Flow B transmits 500 bytes per turn.

Flow A gets 3× the bandwidth. Fair in packet count, unfair in byte count.
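The 3× figure follows directly from the packet sizes. A small sketch, assuming both flows are always backlogged and each sends exactly one packet per turn (`byte_shares` is a hypothetical helper):

```python
def byte_shares(packet_sizes: dict) -> dict:
    """Fraction of link bytes each flow gets when every flow sends one packet per turn."""
    total = sum(packet_sizes.values())
    return {flow: size / total for flow, size in packet_sizes.items()}


shares = byte_shares({"A": 1500, "B": 500})
print(shares)                      # {'A': 0.75, 'B': 0.25}
print(shares["A"] / shares["B"])   # 3.0
```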

Different applications routinely use different packet sizes:

  • Bulk transfers: 1,500 B   |   VoIP: 200 B   |   DNS: ~100 B   |   ACKs: 40–60 B

We want per-byte fairness, not per-packet fairness.

Weighted Fair Queuing

Proportional Shares via Virtual Time

The Ideal: Bit-by-Bit Fairness

Imagine an impossible scheduler: transmit one bit from each flow in turn. Perfect bitwise fairness regardless of packet sizes.

This cannot be built: a packet-switched link transmits whole packets, not interleaved bits.

But: if we could compute which packet the ideal bit-by-bit scheduler would finish transmitting first, we could serve that packet first in the real system.

This is the insight of fair queuing (Demers, Keshav, and Shenker, 1989), whose weighted form is Weighted Fair Queuing (WFQ).

How WFQ Works

Each flow gets a queue and a weight \(w_i\) (its fair share of the link).

Virtual time: a clock that advances by bits transmitted, normalized by active flow weights.

Virtual finish time: when would this packet finish in the ideal bit-by-bit scheduler?

The scheduler serves the packet with the smallest virtual finish time — the one that would finish first in the ideal system.
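One common way to write the finish-time rule is sketched below. The symbols are assumptions introduced here rather than notation from the lecture: \(F_i^k\) is the virtual finish time of the \(k\)-th packet of flow \(i\), \(a_i^k\) its arrival time, \(L_i^k\) its length in bits, \(V(t)\) the virtual time, \(C\) the link rate, and \(B(t)\) the set of currently backlogged flows.

\[F_i^k = \max\!\left(F_i^{k-1},\; V(a_i^k)\right) + \frac{L_i^k}{w_i}, \qquad \frac{dV}{dt} = \frac{C}{\sum_{j \in B(t)} w_j}\]

The max term means a flow that has been idle restarts from the current virtual time instead of banking credit for the period it was silent.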

Over any sufficiently long interval, flow \(i\) receives:

\[\text{bandwidth share} = \frac{w_i}{\sum_j w_j}\]

A flow with weight 2 gets twice the bandwidth of a flow with weight 1.
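The weight-2 versus weight-1 claim can be illustrated with a simplified sketch. It assumes every packet is already queued at time zero, so the virtual-clock term drops out and each finish tag is the previous tag plus size divided by weight. A real WFQ scheduler must also track virtual time as flows come and go; `wfq_order` and the flow names are hypothetical.

```python
import heapq
from collections import defaultdict


def wfq_order(backlog: dict, weights: dict) -> list:
    """Serve packets in increasing virtual finish time.

    Simplifying assumption: all packets are queued at time zero, so each
    finish tag is just the flow's previous tag plus size / weight.
    """
    heap = []
    seq = 0  # tie-breaker so equal tags are served in insertion order
    for flow, sizes in backlog.items():
        finish = 0.0
        for size in sizes:
            finish += size / weights[flow]
            heapq.heappush(heap, (finish, seq, flow, size))
            seq += 1
    order = []
    while heap:
        _, _, flow, size = heapq.heappop(heap)
        order.append((flow, size))
    return order


backlog = {"heavy": [1000] * 12, "light": [1000] * 12}
weights = {"heavy": 2, "light": 1}
sent = defaultdict(int)
for flow, size in wfq_order(backlog, weights)[:9]:
    sent[flow] += size
print(dict(sent))  # {'heavy': 6000, 'light': 3000}
```

After the first nine packets the weight-2 flow has sent twice as many bytes as the weight-1 flow, matching the proportional-share formula.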

WFQ in Action: VoIP + Netflix + Bulk

100 Mbps link, three flows, equal weights:

Flow         Needs    Weight  WFQ share
V (VoIP)     64 kbps  1       1/3 (~33 Mbps)
N (Netflix)  5 Mbps   1       1/3 (~33 Mbps)
B (Bulk)     greedy   1       1/3 (~33 Mbps)

Flow V needs only 64 kbps and Flow N only 5 Mbps, so the unused part of their shares is redistributed to Flow B.

Result: V gets 64 kbps with near-zero queuing delay (its queue is almost always empty). N gets 5 Mbps comfortably. B gets ~95 Mbps.

Isolation: one flow’s behavior cannot degrade another flow’s performance. The VoIP packet no longer waits behind 2,700 Netflix packets.
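The redistribution of unused share is a weighted max-min fair allocation. A minimal sketch of that calculation, assuming fixed demands in Mbps and treating the greedy flow as having infinite demand (`max_min_allocation` is a hypothetical helper, not part of any router API):

```python
def max_min_allocation(capacity: float, demands: dict, weights: dict) -> dict:
    """Weighted max-min fair rates: satisfy small demands, then re-split the leftover."""
    alloc = {}
    remaining = capacity
    active = dict(demands)
    while active:
        total_weight = sum(weights[f] for f in active)
        fair = {f: remaining * weights[f] / total_weight for f in active}
        satisfied = [f for f in active if active[f] <= fair[f]]
        if not satisfied:
            alloc.update(fair)   # everyone left is capped at its fair share
            break
        for f in satisfied:
            alloc[f] = active.pop(f)
            remaining -= alloc[f]
    return alloc


demands = {"V": 0.064, "N": 5.0, "B": float("inf")}  # Mbps; B is greedy
weights = {"V": 1, "N": 1, "B": 1}
rates = max_min_allocation(100.0, demands, weights)
print({f: round(r, 3) for f, r in rates.items()})
```

With these inputs V and N receive exactly what they ask for, and B receives the remaining capacity of roughly 95 Mbps.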

The Cost: Per-Flow State

Question: WFQ sounds ideal. Why isn’t it used everywhere?

For each flow, the router must maintain:

  • A separate queue
  • A weight
  • A virtual finish time for the head-of-line packet
  • The virtual time clock

Per-packet cost: O(log F) — must find the smallest virtual finish time among F flows.

Deployment            Flows     Packet budget                    WFQ feasible?
Campus router         hundreds  microseconds                     Yes
Core Internet router  millions  ~5–120 nanoseconds at 100 Gbps   No

Millions of queues. Millions of virtual finish times. About 20 comparisons (log2 of a million) per packet at 100 Gbps.
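The per-packet budget is simply the time it takes to serialize one packet, so it depends on both line rate and packet size. A rough sketch, with hypothetical helper names and a binary heap assumed as the structure that holds finish times:

```python
import math


def packet_time_ns(packet_bytes: int, link_bps: float) -> float:
    """Serialization time of one packet, which bounds the per-packet work budget."""
    return packet_bytes * 8 / link_bps * 1e9


def heap_comparisons(flows: int) -> int:
    """Rough depth of a binary heap holding one finish time per flow."""
    return math.ceil(math.log2(flows))


print(packet_time_ns(1500, 100e6))   # 100 Mbps campus link: about 120,000 ns (120 microseconds)
print(packet_time_ns(1500, 100e9))   # 100 Gbps, full-size packet: about 120 ns
print(packet_time_ns(64, 100e9))     # 100 Gbps, 64-byte packet: about 5 ns
print(heap_comparisons(1_000_000))   # about 20 levels for a million flows
```

At the campus scale, twenty comparisons fit easily inside the budget. At 100 Gbps with small packets, each comparison would have to finish in a fraction of a nanosecond.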

Deficit Round-Robin

A Practical Compromise — O(1) per Packet

DRR: The Key Idea

Shreedhar and Varghese, 1996. Each flow gets a queue and a deficit counter.

Each round: add quantum \(Q\) bytes to the deficit. Transmit packets while head-of-line size ≤ deficit. Subtract each packet’s size. Carry leftover deficit to the next round.

Example (\(Q = 1000\) bytes):

Round  Flow A (1,500 B pkts)                                    Flow B (500 B pkts)
1      Deficit = 1,000. Pkt = 1,500. Can’t send. Keeps 1,000.   Deficit = 1,000. Send 2 pkts (1,000 B). Deficit = 0.
2      Deficit = 2,000. Send 1 pkt (1,500 B). Deficit = 500.    Deficit = 1,000. Send 2 pkts. Deficit = 0.

Over 2 rounds: A sent 1,500 B, B sent 2,000 B. Not perfectly equal, but the deficit self-corrects.
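The two-round example can be replayed, and extended to more rounds, with a short sketch. It assumes both flows stay backlogged and no packets arrive mid-run. Resetting the deficit of an empty queue follows the usual DRR description, and the function name `drr` is hypothetical.

```python
from collections import deque


def drr(flows: dict, quantum: int, rounds: int) -> dict:
    """Run deficit round-robin and return the bytes sent per flow."""
    queues = {name: deque(sizes) for name, sizes in flows.items()}
    deficit = {name: 0 for name in flows}
    sent = {name: 0 for name in flows}
    for _ in range(rounds):
        for name, q in queues.items():
            if not q:
                deficit[name] = 0   # idle flows do not bank credit
                continue
            deficit[name] += quantum
            while q and q[0] <= deficit[name]:
                size = q.popleft()
                deficit[name] -= size
                sent[name] += size
            if not q:
                deficit[name] = 0
    return sent


flows = {"A": [1500] * 10, "B": [500] * 20}
print(drr(flows, quantum=1000, rounds=2))   # {'A': 1500, 'B': 2000}
print(drr(flows, quantum=1000, rounds=6))   # {'A': 6000, 'B': 6000}
```

After two rounds the byte counts match the table above. After six rounds the carried-over deficit has evened them out.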

Why O(1) Matters

DRR never sorts. Never computes virtual finish times. Visits each active flow once per round, adds a constant, compares, transmits.

Per-packet cost: O(1) — independent of the number of flows.

At 100 Gbps, a 1,500 B packet takes 120 nanoseconds to transmit, and a minimum-size packet takes only a few nanoseconds. No time for log-F comparisons.

DRR sacrifices exact packet-by-packet ordering (off by one packet per flow per round). Over any reasonable interval, fairness converges.

DRR and its variants are the most widely deployed fair scheduling algorithms in real networks.

The Scalability–Fairness Tradeoff

More State → More Fairness → More Cost

The Full Progression

Discipline   Per-flow state           Per-packet cost  Fairness
FIFO         None                     O(1)             None (arrival order)
Priority     Per-class (few queues)   O(1)             Class preference; starvation risk
Round-Robin  Per-flow queue           O(1)             Per-packet fair; byte-unfair
WFQ          Queue + virtual time     O(log F)         Max-min fair (per-byte)
DRR          Queue + deficit counter  O(1)             ≈ Max-min fair

Each step up adds state. More state → more fairness. More state → more memory, computation, and complexity at line rate.

Internet Core vs. Cellular Edge

Question: Given this tradeoff, what scheduling discipline do you expect at the Internet core? At a cellular base station? Why?

                         Internet core  Cellular base station
Line rate                100–400 Gbps   Hundreds of Mbps
Active flows             Millions       Tens to hundreds
Per-packet budget        ~nanoseconds   ~microseconds
Existing per-user state  None           Full (auth, handoff, power)
Scheduler                FIFO           WFQ / proportional-fair

The core uses FIFO not because it is fair — we just showed it is not — but because it is the only discipline that scales to the core’s demands.

The base station uses WFQ because it already maintains per-user state. Adding a scheduling weight is trivial.

The Medium Access Parallel

Same Coordination invariant, different resource:

Medium access                  Scheduling                    State
ALOHA: no coordination         FIFO: no fairness             Zero
CSMA/CA: distributed, minimal  Round-Robin: per-flow turns   Per-flow
OFDMA: centralized, per-user   WFQ: proportional allocation  Per-flow + weights

In medium access, the resource is the wireless channel. In scheduling, the resource is the output link.

The tradeoff between simplicity and fairness is identical.

SFQ: A Middle Ground (Summary)

Stochastic Fairness Queuing: hash flows into a fixed number of queues (e.g., 1,024). Apply DRR across those queues.

  • Flows in different hash buckets → isolated
  • Flows in the same bucket → share FIFO within that bucket
  • Periodically re-hash to prevent permanent collisions

Tradeoff: state scales with the fixed number of queues, not with the number of flows. Probabilistic fairness, not deterministic. Deployable at high line rates.
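The hashing step can be sketched as follows. This is an illustration only: the 5-tuple layout, the CRC32 hash, and the `perturbation` salt are assumptions standing in for whatever hash a real implementation uses.

```python
import zlib

NUM_QUEUES = 1024


def bucket(flow: tuple, perturbation: int) -> int:
    """Map a flow 5-tuple to one of NUM_QUEUES buckets.

    `perturbation` is a salt that the scheduler changes periodically so that
    two colliding flows do not stay in the same bucket forever.
    """
    key = f"{perturbation}|{'|'.join(map(str, flow))}".encode()
    return zlib.crc32(key) % NUM_QUEUES


flow_a = ("10.0.0.1", 5004, "10.0.0.9", 5004, "udp")
flow_b = ("10.0.0.2", 443, "10.0.0.9", 51000, "tcp")

for salt in (0, 1):
    # The same flow always lands in the same bucket for a given salt.
    print(salt, bucket(flow_a, salt), bucket(flow_b, salt))
```

The router stores only `NUM_QUEUES` queues regardless of how many flows exist. The price is that two flows hashing to the same bucket share one FIFO until the salt changes.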

The Grand Arc: Five Coordination Answers

Who gets served next?

  1. FIFO: “Whoever arrived first.” Zero state. Deployable anywhere.
  2. Priority: “Whoever the operator designated.” Static coordination. Starvation risk.
  3. Round-Robin: “Everyone takes turns.” Fair in packets, unfair in bytes.
  4. WFQ: “Everyone’s proportional share.” Max-min fair. O(log F).
  5. DRR: “Approximately everyone’s share.” Nearly WFQ-fair. O(1).

The fundamental tradeoff: fairness requires state, and state has a cost that depends on deployment context.

This is the Coordination invariant applied to the queue — the same invariant we traced through medium access (L5–L8), transport (L3–L4), and multimedia (L10–L12).

Bridge to L14: What Happens When the Buffer Is Full?

Today: who gets served next? We left a question untouched.

Under every discipline we discussed, when a packet arrives and the buffer is full, the router drops the arriving packet (tail-drop).

Question: What happens when TCP senders all hit a full buffer at the same time?

Every sender loses packets simultaneously → every sender backs off simultaneously → buffer drains → every sender ramps up simultaneously → buffer fills → repeat.

Global synchronization — a destructive oscillation between full utilization and collapse.

Could the router drop packets before the buffer is full? Choose which packets to drop? Use the drop as a signal to senders?

That is Active Queue Management — L14.