Breaking the Contention Ceiling

CS176C — Advanced Topics in Internet Computing

Arpit Gupta

2026-04-23

Where We Left Off

Last lecture: we proved CSMA/CA hits a structural throughput ceiling under density.

  • Bianchi (2000): ~30% utilization at \(n \approx 50\) — no BEB tuning fixes it
  • State grew at every step (ALOHA → CSMA → CSMA/CA → DCF)
  • Coordination stayed distributed the entire time

The ceiling is not a state limit. It is a coordination limit.

Today’s question: WiFi shipped this protocol in 1997 and survived 20 years. What did it actually do about the ceiling — and why did it take until 2021 to break it?

The binding constraint: WiFi operates in unlicensed spectrum. No entity owns the channel. From that single fact, the invariant answers cascade: Coordination → distributed, State → local, Time → coarse, Interface → minimal. CSMA/CA is not a design choice — it is the only viable MAC under these constraints.

Part 1: WiFi’s Co-Evolution

Making the Pipe Wider — and Discovering It Wasn’t Enough

The Pressure Chain

WiFi’s first instinct: if CSMA/CA wastes most of the channel, make the channel so large that even 30% is enough.

Three simultaneous advances, each enabling the next:

1. Modulation — more bits per symbol

  • 802.11b (1999): 11 Mbps via Complementary Code Keying (CCK)
  • 802.11a (1999): 54 Mbps via Orthogonal Frequency Division Multiplexing (OFDM) — splits the channel into many narrow subcarriers, each carrying data; robust against indoor multipath
  • Modulation up to 64-Quadrature Amplitude Modulation (QAM): 6 bits per symbol
  • 802.11ac (2013): 256-QAM → 8 bits/symbol; 802.11ax (2021): 1024-QAM → 10 bits/symbol

2. Antennas — Multiple-Input Multiple-Output (MIMO)

  • Spatial diversity: combine signal copies from different paths → better Signal-to-Noise Ratio (SNR) without more power
  • Spatial multiplexing: 4 antennas → up to 4 independent data streams simultaneously

3. Channel width — 20 MHz → 40 → 80 → 160 MHz

Multiply together: 8 streams × 256-QAM × 160 MHz = 6.9 Gbps peak PHY rate (802.11ac)

MU-MIMO: The Promise

Downlink Multi-User MIMO (MU-MIMO) in 802.11ac: the AP uses beamforming — shaping the antenna pattern to focus energy toward specific clients — to serve up to 4 users simultaneously on different spatial beams.

The question: if the AP can serve 4 users in parallel, does throughput quadruple?

MU-MIMO: The Half-Victory

Partly. 802.11ac cracked the downlink: the AP wins a CSMA/CA contention round, then beamforms to 4 users in parallel during that single Transmit Opportunity.

Downlink delivery is genuinely parallel. 802.11ac DID break the one-user-at-a-time paradigm — for the downlink.

But: the uplink is untouched — every client still contends one at a time. And the AP itself must fight those same clients just to get a turn.

MU-MIMO: What Remains Broken

Downlink parallelism exists, but it’s gated by two bottlenecks:

1. Serial access. The AP must win CSMA/CA contention against every client before starting its MU-MIMO burst. At n=250, winning is the problem.

2. One direction only. Uplink is still one client at a time. The 6.9 Gbps peak assumes the AP has the channel — getting the channel is still a contention problem.

Even the downlink win is taxed: MU-MIMO requires a channel-sounding handshake before every burst — overhead that can negate the parallelism gain for small packets.

The Overhead Tax

Your pre-lecture exercise used 100 µs as protocol overhead. The real number — including average backoff — is closer to 160 µs. That makes the picture worse.

Component                     | Value   | Why it’s fixed
DIFS (DCF Interframe Space)   | 34 µs   | Radio turnaround physics
Average backoff               | 67.5 µs | \(CW_{\min}=15\), avg = 7.5 slots × 9 µs/slot
PHY preamble                  | 20 µs   | OFDM training symbols
SIFS (Short Interframe Space) | 16 µs   | Radio RX-to-TX switch
ACK at 6 Mbps basic rate      | ~24 µs  | 14 bytes + preamble at lowest mandatory rate
Total                         | ~160 µs | None scale with PHY rate improvements

What happens as the PHY gets faster?

PHY rate | Frame TX (1500 B) | Overhead | Data fraction
54 Mbps  | 222 µs            | 160 µs   | 58%
600 Mbps | 20 µs             | 160 µs   | 11%
6.9 Gbps | 1.74 µs           | 160 µs   | 1.1%

The PHY engineers gave us 6.9 Gbps. The MAC wastes 99% of it on protocol gaps. Making the pipe wider made the overhead ratio catastrophically worse.
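The collapsing data fraction can be reproduced in a few lines, assuming the ~160 µs fixed overhead from the component table:

```python
# Fixed per-frame MAC/PHY overhead (µs), from the component table above.
OVERHEAD_US = 160.0
FRAME_BITS = 1500 * 8  # one 1500-byte frame

def data_fraction(phy_rate_mbps: float) -> float:
    """Fraction of airtime carrying payload at a given PHY rate."""
    tx_us = FRAME_BITS / phy_rate_mbps  # bits / Mbps gives µs
    return tx_us / (tx_us + OVERHEAD_US)

for rate in (54, 600, 6900):
    print(f"{rate:>5} Mbps: TX {FRAME_BITS / rate:7.2f} µs, "
          f"data fraction {data_fraction(rate):.1%}")
```

The overhead term never shrinks, so the fraction falls as the PHY rate rises: that is the whole pathology in one function.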

Aggregation: The MAC’s First Response

The overhead tax is 160 µs per frame. What if you could pay it once for many frames?

802.11n introduced Aggregated MAC Protocol Data Unit (A-MPDU): bundle up to 64 frames under a single PHY preamble.

  • Without aggregation: 1 preamble, 1 DIFS, 1 backoff, 1 ACK → per frame
  • With aggregation: 1 preamble, 1 DIFS, 1 backoff → 64 frames → 1 Block ACK

Block ACK: a bitmap where each bit marks one frame as received or lost — selective retransmission of only failed frames.

Transmit Opportunity (TXOP) (from earlier 802.11e amendment): the winner holds the channel for a burst duration instead of re-contending after every frame.

Result: per-frame overhead drops from 160 µs to ~2.5 µs. Data fraction recovers.
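The amortization arithmetic checks out under the same simplified overhead model (real A-MPDU efficiency also depends on TXOP limits and Block ACK size):

```python
# A-MPDU amortization: pay the ~160 µs overhead once per aggregate,
# not once per frame. Simplified model from the overhead table above.
OVERHEAD_US = 160.0
FRAME_BITS = 1500 * 8

def data_fraction(phy_rate_mbps: float, frames: int) -> float:
    """Data fraction when `frames` frames share one overhead payment."""
    tx_us = frames * FRAME_BITS / phy_rate_mbps
    return tx_us / (tx_us + OVERHEAD_US)

print(f"600 Mbps, no aggregation:  {data_fraction(600, 1):.1%}")
print(f"600 Mbps, 64-frame A-MPDU: {data_fraction(600, 64):.1%}")
print(f"per-frame overhead: {OVERHEAD_US / 64:.2f} µs")
```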

In the framework: aggregation is a Time invariant fix — it changes the effective timescale of the protocol. But Coordination remains distributed. State remains local. The binding constraint hasn’t changed. The ceiling is still there; aggregation raised the floor beneath it.

Density Breaks the Fix

Aggregation fixed per-frame overhead. It did not fix contention. Let’s quantify why.

Define \(\tau\) = probability a station transmits in any given slot. Not “per attempt” — per slot. If we freeze time at a random slot boundary, \(\tau\) is the chance station \(i\)’s backoff counter is at zero.

If every station is at \(CW_{\min}\) (no collisions yet): \(\tau \approx \frac{2}{CW_{\min} + 1}\). For \(CW_{\min} = 15\): \(\tau = 0.125\).

In this one slot, what’s the probability exactly one station transmits?

\[P(\text{success}) = n \cdot \tau \cdot (1 - \tau)^{n-1}\]

Pick one transmitter (\(\tau\)), require all other \(n-1\) to stay silent (\((1-\tau)^{n-1}\)), any of \(n\) could be the one.

  • \(n = 10\): \(P \approx\) 0.38 — 38% of slots succeed. Workable.
  • \(n = 50\): \(P \approx\) 0.009 — less than 1%.
  • \(n = 250\): \(P \approx\) 0. Virtually no successful slot.

Caveat: we assumed \(\tau = 0.125\) — all stations at \(CW_{\min}\), no collisions yet. At high \(n\), collisions are constant, BEB grows CW, and the real stationary \(\tau\) is much smaller. Our numbers are pessimistic. But Bianchi’s full model (which solves for the true equilibrium \(\tau\)) still shows throughput dropping to 40-55% at \(n=50\) and falling further at higher density. The collapse is real — our approximation just makes it look worse than steady state.
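The three values come straight from the formula; a minimal script under the same fixed-\(\tau\) assumption:

```python
# Slot-success probability under the fixed-tau approximation:
# every station at CW_min = 15, so tau = 2/(CW_min + 1) = 0.125.
TAU = 2 / (15 + 1)

def p_success(n: int, tau: float = TAU) -> float:
    """P(exactly one of n stations transmits in a given slot)."""
    return n * tau * (1 - tau) ** (n - 1)

for n in (10, 50, 250):
    print(f"n = {n:3d}: P(success) = {p_success(n):.2e}")
```

The exponential term \((1-\tau)^{n-1}\) does the damage: each added station multiplies the silence requirement by another factor of 0.875.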

WiFox: A Step Toward Centralization

This wasn’t just theory — it was observable in the wild.

In 2012, researchers measuring WiFi in packed lecture halls and conferences saw the symptoms directly: pages wouldn’t load, video buffered endlessly — even at hundreds of Mbps PHY rate.

Root cause: the AP was starving for channel access. Massive downlink queue, but it could only transmit when it won contention against hundreds of clients — each with almost no data but equal contention rights.

The protocol’s fairness, designed for 5 devices, was the source of failure at 500.

WiFox (ACM CoNEXT 2012): software-only fix. Dynamically shorten the AP’s backoff window when its queue depth grows — giving the AP priority over clients.

  • 400-700% improvement in downlink throughput
  • 30-40% reduction in response time
  • Deployable as firmware update — no hardware or client changes

In the framework: WiFox was a partial Coordination shift. We gave the AP privileged access — it no longer competed on equal terms. We broke CSMA/CA’s per-station fairness because that fairness was the source of failure.

WiFox was a step toward centralization. The full step came with 802.11ax.

802.11ax: WiFi Centralizes

802.11ax (WiFi 6, 2021) changed the MAC architecture fundamentally.

Instead of stations contending for the whole channel one at a time, the AP divides it into Resource Units (RUs) — time-frequency slots assigned to individual clients.

How many RUs? Each RU = 26 Orthogonal Frequency Division Multiple Access (OFDMA) subcarriers ≈ 2 MHz. In a 160 MHz channel: \(160/2 = 80\) naively, but 74 RUs in practice. The missing 6 RUs’ worth of subcarriers go to guard bands, DC null subcarriers, and inter-RU gaps.

The mechanism: Trigger Frame.

A specialized AP broadcast carrying per-client metadata — which RU, what modulation, how much power, how long. Clients cannot transmit on OFDMA RUs without a Trigger Frame. All assigned clients begin transmitting exactly one SIFS (16 µs) later.

Multiple clients transmit simultaneously on non-overlapping frequency slices. Zero contention on the data path.
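Conceptually, a Trigger Frame is a broadcast schedule. A sketch of its payload as a data structure, with field names that are illustrative only, not the actual 802.11ax User Info field layout:

```python
from dataclasses import dataclass

# Illustrative sketch of Trigger Frame contents. Field names are
# simplified stand-ins, NOT the real 802.11ax field names.
@dataclass
class UserAllocation:
    client_id: int       # which client this entry schedules
    ru_index: int        # which of the 74 RUs (160 MHz channel)
    mcs: int             # modulation/coding the client must use
    tx_power_dbm: float  # target transmit power
    duration_us: int     # how long the uplink burst lasts

@dataclass
class TriggerFrame:
    allocations: list[UserAllocation]  # up to 74 parallel clients

# The AP broadcasts one frame; every listed client starts transmitting
# exactly one SIFS (16 µs) after it ends, each on its own RU.
tf = TriggerFrame(allocations=[
    UserAllocation(client_id=1, ru_index=0, mcs=7, tx_power_dbm=10.0, duration_us=500),
    UserAllocation(client_id=2, ru_index=1, mcs=5, tx_power_dbm=12.0, duration_us=500),
])
print(len(tf.allocations), "clients scheduled in parallel")
```

The design point: all the per-client decisions (RU, modulation, power, timing) live in one AP-issued message, which is what "centralized coordination" means concretely.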

Follow the logic: Coordination must be centralized → someone needs a global view → that’s the AP → the AP must partition resources deterministically → that structure is OFDMA. WiFi didn’t borrow OFDMA from cellular Long-Term Evolution (LTE) out of convenience — the constraint shift forced the same invariant answers.

What WiFi Lost — and Why 20 Years

What WiFi lost: any station can transmit whenever it wants, no permission needed, no infrastructure required. Under 802.11ax, the high-throughput path requires an AP scheduler. No AP, no OFDMA. Ad-hoc and mesh networks fall back to CSMA/CA.

What WiFi gained: channel utilization above 70%, multi-user parallel transmission, deterministic latency, elimination of hidden terminals on the OFDMA path.

Why did centralization take twenty years? Three forces:

1. No operator. Cellular has a carrier that owns spectrum and runs the scheduler. WiFi has no single entity — the AP-as-scheduler only became obvious when APs became ubiquitous infrastructure.

2. Backward compatibility. An 802.11ax AP must still serve an 802.11b client from 1999. You cannot remove CSMA/CA — only layer scheduling on top.

3. Deployment context. Before ~2012, a home AP served 3-5 devices. The ceiling was theoretical, not lived. It took lecture halls, airports, and stadiums to turn the math into a visible problem. Technology adoption follows deployment pressure, not theoretical possibility.

Part 2: The Other Path

Cellular — Licensed Spectrum, Centralized from Day One

Licensed Spectrum: The Mirror Constraint

WiFi’s binding constraint: unlicensed spectrum → distributed coordination forced.

Cellular’s binding constraint: licensed spectrum — exclusive operator ownership, bought at auction for billions → centralized coordination from day one.

The base station is the sole authority over every transmission. There is no contention for data, ever.

  • State can be global (the BS knows every registered device)
  • Time can be precise (the BS synchronizes all transmissions)
  • Interface can carry rich scheduling information

Cellular’s evolution question was never “when to centralize” — it was “how to schedule efficiently as traffic changed from voice to data.”

Voice → Data: the scheduling challenge

  • Frequency Division Multiple Access (FDMA), 1G (1980s): 30 kHz channels, one per call, ~60 per cell. Hard capacity — call 61 is blocked. Average voice activity ~35%, so two-thirds of spectrum carries silence.
  • Time Division Multiple Access (TDMA) / GSM, 2G: 8 time slots per 200 kHz carrier, 576.9 µs each in a 4.615 ms frame. More users, better codecs. Still hard capacity.
  • Code Division Multiple Access (CDMA), 3G: all users share the same frequency, separated by orthogonal spreading codes. Adding a user raises the noise floor — soft capacity instead of hard. But power control becomes the MAC: the base station adjusts each mobile’s transmit power 800 times/second to prevent the near-far problem (a phone 100 m away arrives 10,000× stronger than one 1 km away).
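The 10,000× near-far figure follows from an assumed path-loss exponent of 4, typical for dense urban environments (free space would be 2, giving only 100×):

```python
# Near-far problem: received power falls off as distance^(-exponent).
# exponent = 4 is an assumed urban path-loss value; free space is 2.
def power_ratio(d_near_m: float, d_far_m: float, exponent: float = 4.0) -> float:
    """How much stronger the near transmitter arrives than the far one."""
    return (d_far_m / d_near_m) ** exponent

print(power_ratio(100, 1000))       # urban model: 10,000x
print(power_ratio(100, 1000, 2.0))  # free space: 100x
```

Without per-user power control, the near phone’s signal buries the far one, which is why the base station must close this loop 800 times per second.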

From Voice to Data: The Scheduling Bridge

Voice: steady, symmetric — well-served by dedicated circuits.

Data: loading a webpage involves a few hundred milliseconds of transfer, then seconds of idle reading. Under the circuit model, you hold a dedicated channel the entire time — carrying data for a fraction of a second, then silence for ten seconds.

What scheduling concept bridges circuit-switched voice to packet-switched data?

High Speed Downlink Packet Access (HSDPA), deployed ~2005: replaced dedicated per-user channels with a shared pool scheduled every 2 milliseconds — one Transmission Time Interval (TTI).

  • Each TTI, the scheduler picks users with the best channel conditions
  • Channel conditions reported via Channel Quality Indicator (CQI) feedback — the cellular answer to the State invariant
  • TTI = the cellular answer to the Time invariant: how often does the scheduler re-evaluate?

HSDPA proved the concept: per-TTI packet scheduling works at millisecond timescales. LTE built its entire architecture around OFDMA — the same OFDMA that WiFi borrowed in 802.11ax, twelve years later.

The Universal Bootstrap

GSM is fully centralized — zero contention for data. But there’s one moment when centralization fails.

When? Think about it. When can’t the base station schedule a device?

When a brand-new device powers on. The BS doesn’t know it exists. It can’t schedule what it hasn’t discovered.

Solution: Random Access Channel (RACH) — the device announces itself using slotted ALOHA. The 1972 protocol, running inside a 1991 TDMA system.

Could you eliminate RACH entirely? Could you build a system with zero contention at every level?

No. Any system with dynamic membership needs at least one contention-based moment: the discovery moment. You can centralize everything after discovery, but discovery itself is irreducibly contention-based.

802.11ax has the same pattern: OFDMA for data, CSMA/CA for initial association. Contention is the tax you pay for letting new participants join without prior arrangement.

Feedback Loop Speed: What Unifies the Landscape

Every wireless MAC is a control loop: measure → allocate → transmit → measure outcome → reallocate.

How fast can that loop run? This determines the architecture’s ceiling.

System             | What it measures              | Loop period | Coordination
CSMA/CA            | Carrier sense (local, binary) | ~12 ms      | Distributed
GSM TDMA           | Slot assignment               | ~4.6 ms     | Centralized
CDMA power control | Received power per user       | ~1.25 ms    | Centralized
LTE OFDMA          | CQI per user per RB           | ~1 ms       | Centralized
802.11ax OFDMA     | Per-client feedback           | ~1-5 ms     | Centralized

Faster loop → tighter coordination → higher utilization.

CSMA/CA had the slowest loop in the landscape. That’s the structural reason contention broke — and why centralization, with its faster and richer feedback, was the destination for both WiFi and cellular.

Part 3: Convergence

Two Paths, One Destination

Convergence: Each Side Borrowed the Other’s Technique

After thirty years of divergence, WiFi and cellular arrived at the same architecture.

WiFi started fully distributed → fought through 20 years of PHY improvements and MAC patches → centralized under density pressure → borrowed OFDMA from LTE.

Cellular started centralized from day one → entered unlicensed 5 GHz band through Licensed-Assisted Access (LAA) in 2016 → couldn’t schedule spectrum it didn’t own → adopted Listen Before Talk (LBT) — carrier sensing, the WiFi mechanism.

Each side borrowed the other’s technique for the exact reason the other had adopted it originally.

  • WiFi borrowed scheduling because density demanded it
  • Cellular borrowed contention because shared spectrum demanded it

Both retain contention for exactly one purpose: bootstrap.

  • Cellular: RACH for device discovery
  • WiFi: CSMA/CA for initial association

The universal architecture: schedule everything you can, contend only for discovery. The destination was determined by physics and density. The path depended on where you started.

In-Class Exercise: CSMA/CA vs. OFDMA at Scale

Setup: You’re an AP in a stadium section. 250 phones. 160 MHz channel. PHY rate: 600 Mbps.

Part 1 — Using the formulas from earlier in the lecture:

\(\tau = 2/(CW_{\min} + 1) = 0.125\) and \(P(\text{success}) = n \cdot \tau \cdot (1 - \tau)^{n-1}\)

Compute P(success) at \(n = 10\), \(n = 50\), \(n = 250\). (3 minutes with your neighbor.)

  • \(n = 10\): 0.38 — 38% of slots succeed.
  • \(n = 50\): 0.009 — less than 1%.
  • \(n = 250\): ≈ 0 — virtually no successful slot.
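These answers can be checked in code, extended with the expected number of slots burned per successful transmission (the reciprocal of P):

```python
# Exercise check: slot-success probability and its reciprocal,
# the average number of slots the cell burns per successful frame.
def p_success(n: int, cw_min: int = 15) -> float:
    tau = 2 / (cw_min + 1)          # all stations at CW_min
    return n * tau * (1 - tau) ** (n - 1)

for n in (10, 50, 250):
    p = p_success(n)
    print(f"n = {n:3d}: P = {p:.3g}, ~{1 / p:.3g} slots per success")
```

At n = 50 the cell already needs over a hundred slots per delivered frame; at n = 250 the count is astronomical.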

Exercise Part 2: OFDMA Parallelism

Same 160 MHz channel. Now with 802.11ax OFDMA.

Each RU = 26 subcarriers ≈ 2 MHz. How many parallel RUs?

\(160 / 2 = 80\) naively → 74 RUs in practice (guard bands, DC null, inter-RU gaps).

Each Trigger Frame serves up to 74 users simultaneously — zero contention, zero collision.

Side by side:

                        | CSMA/CA (\(n = 250\))             | OFDMA
Users served per access | 1 (if it succeeds — almost never) | Up to 74 (every time)
Contention              | Catastrophic                      | None
Per-user bandwidth      | Full 160 MHz (winner-take-all)    | ~2 MHz per RU

The tradeoff: under CSMA/CA, the winner gets the full channel. Under OFDMA, each user gets ~2 MHz. At 5 devices, CSMA/CA’s winner-take-all is fine. At 250, 2 MHz of guaranteed collision-free access beats fighting over 160 MHz and getting nothing.
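A deliberately crude model of the per-user tradeoff at n = 250 (it ignores overhead, retries, and scheduler policy, so treat the numbers as orders of magnitude only):

```python
# Toy per-user throughput comparison at n = 250. Assumptions:
# CSMA/CA winner gets the full PHY rate for a successful slot;
# OFDMA round-robins 74 RUs (~2 MHz each) across all 250 clients.
PHY_MBPS, N, RUS, CH_MHZ, RU_MHZ = 600, 250, 74, 160, 2

tau = 2 / 16                                  # all stations at CW_min = 15
p_success = N * tau * (1 - tau) ** (N - 1)    # chance a slot succeeds at all
csma_per_user = PHY_MBPS * p_success / N      # winner-take-all, rarely won

ofdma_ru_rate = PHY_MBPS * RU_MHZ / CH_MHZ    # rate inside one 2 MHz RU
ofdma_per_user = ofdma_ru_rate * RUS / N      # time-shared across 250 clients

print(f"CSMA/CA per user: {csma_per_user:.2e} Mbps")
print(f"OFDMA   per user: {ofdma_per_user:.2f} Mbps")
```

A guaranteed couple of Mbps beats an effectively zero expected share of the full channel, which is exactly the flip the table above describes.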

Density is what flips the tradeoff. Which invariant changed? Coordination — distributed → centralized. What forced it? Density shifted the binding constraint from “no authority” to “coordination cost exceeds capacity.”

The Invariant Shift: 1997 → 2021

Invariant    | CSMA/CA + DCF (1997)                                     | 802.11ax OFDMA (2021)
State        | Local only — each station tracks its own backoff counter | AP-aggregated — per-client channel quality, buffer status, scheduling decisions
Time         | Frame-time-bounded feedback (~12 ms)                     | Scheduling-cycle feedback (~1-5 ms)
Coordination | Fully distributed — each station decides independently   | Centralized — AP schedules all transmissions
Interface    | 802.11 frames (ACK, RTS/CTS, NAV)                        | 802.11 frames + Trigger Frame + OFDMA RU assignments

The pattern:

  • State, Time, Coordination all shifted. Breaking Coordination is what broke the ceiling.
  • Interface didn’t change — 802.11ax still speaks 802.11 frames. Backward compatibility locked the interface, and it’s the reason the transition took twenty years, not five.

The lesson is structural: when density shifts the binding constraint, the invariant answers converge — regardless of starting point. WiFi and cellular arrived at the same State, same Time, same Coordination, same Interface. Two different paths, one destination. The framework predicted it.

But centralization creates its own problem: the scheduler is a single point of complexity — and of failure. How do you build, scale, and open up that scheduler? That’s the infrastructure question, and it’s next.