4 Wireless Architecture — Disaggregation at Infrastructure Scale
4.1 The Monolith and Its Escape
In 2G cellular networks, the base station was a monolith: radio transmission, channel coding, mobility control, and billing all coupled in one box connected to one mobile switching center (MSC). The MSC itself was monolithic—voice switching, call routing, subscriber lookup, charging, all in the same appliance. This architecture answered the four invariants with solutions tightly bound to hardware: state was tied to specific cells, time was prescribed (circuit switching: fixed bandwidth, long holding times), coordination was centralized (one MSC per region), interfaces were proprietary (vendor lock-in).
The architecture worked for voice—a stable, symmetric, long-holding service. But when demand shifted toward bursty packet data, the monolith cracked. Voice calls assume fixed bandwidth reservations; packet data is asymmetric, variable-rate, bursty. Forcing bursty packet traffic into a circuit-switched model was inefficient and structurally wrong. The escape was disaggregation: separate radio access from mobility from core switching from billing. By 5G, disaggregation applies at every architectural layer. The central narrative is simple: disaggregation is how you transform a monolithic system constrained by physics and history into a composable platform.
Wireless architecture sits at the boundary between two decomposition axes: the functional split (control plane vs. data plane) and the temporal split (fast PHY decisions vs. slow management decisions). The 5G RAN disaggregation (CU/DU/RU) is an explicit re-decomposition driven by changing latency and cost constraints.
The anchor constraint that drives disaggregation is convergence on a unified transport. Once all traffic—voice, video, data—is carried as IP packets (not fixed-bandwidth circuits), you can separate concerns: who manages the radio? (RAN). Who routes packets? (core network). Who enforces policy? (separate from routing). Who anchors the user’s IP address? (separate from policy). Each separation is an opportunity for independent scaling, independent optimization, independent deployment. This is disaggregation as a principle: partition coupled concerns so each can be answered independently, constrained only by its interface.
4.2 Cellular Architecture Evolution: From Monolith to Microservices
4.2.1 The Three Architectures
2G (GSM, 1991): Monolithic base station (BTS) + monolithic switching center (MSC). The BTS knows radio (channel assignment, power control, handoff). The MSC knows subscribers (HLR lookup), call routing, billing. Tight coupling: a subscriber’s registration is in the MSC’s HLR; the BTS cannot function independently.
3G (UMTS, early 2000s): Bifurcation emerges. Voice still uses the circuit MSC. But packet data bypasses it: a new path SGSN (Serving GPRS Support Node) → GGSN (Gateway GPRS Support Node) handles IP packets. The base station (Node-B) now branches: voice packets go to MSC, data packets go to SGSN. State bifurcates: circuit state (MSC path) and packet state (SGSN/GGSN path) coexist but do not interact. The architecture admits packet data by adding parallel infrastructure, not replacing the circuit core. This is incremental disaggregation under backward-compatibility constraints.
4G (LTE, 2010s): Unification. All traffic is IP—voice becomes VoIP, data is native IP packets. The circuit MSC vanishes. The core network unifies around IP: eNodeB (evolved base station) connects to MME (Mobility Management Entity), S-GW (Serving Gateway), P-GW (Packet Gateway), HSS (Home Subscriber Server). Each function specializes: MME handles attachment and handoff; S-GW anchors the data path as users move between base stations; P-GW anchors the user's IP address and is the gateway to the Internet; HSS stores subscriptions. State is distributed across functions, but traffic is unified.
5G (2020s): Atomization. Core functions disaggregate into stateless microservices—AMF (Access and Mobility Management Function), SMF (Session Management Function), UPF (User Plane Function), UDM (Unified Data Management), PCF (Policy Control Function), NRF (Network Repository Function). Each is independent software running on commodity cloud infrastructure. Functions are invoked via REST APIs, not proprietary protocols. State lives in databases, not in memory. This is the endpoint of disaggregation: functions are no longer coupled to hardware; they are pure software on commodity infrastructure.
4.2.2 The Invariant Answers Shift at Each Generation
State: Progressively distributed and decoupled from hardware.
- 2G: state bound to a specific MSC (home network) and BTS (current serving cell).
- 3G: radio state (Node-B) separate from packet state (SGSN/GGSN).
- 4G: radio state (eNB), access state (MME), session state (S-GW), subscription state (HSS)—distributed across functions but still bound to specific hardware appliances.
- 5G: state is function-independent. One AMF instance holds registration state for some users; another holds it for others. State is stored in databases (UDM, policy stores). If an AMF crashes, another instance resumes with state reloaded.
Time: Feedback loops accelerate.
- 2G: attachment takes seconds; handoff is rare (minutes apart). Billing is computed offline, hours post-call.
- 3G: attachment sub-second; handoff more frequent (seconds apart). Policy changes within minutes.
- 4G: attachment <1 s; handoff frequent (every few seconds). Policy changes in seconds.
- 5G: attachment <1 s; handoff <100 ms; policy changes in real time. Distribution enables parallelism—policy changes to one function do not block others.
Coordination: Centralization gives way to distributed choreography.
- 2G: the home MSC decides everything.
- 3G: HSS centralizes subscription data; all nodes query it.
- 4G: MME coordinates attachment/handoff; other nodes follow instructions.
- 5G: decentralized by design. AMF makes attachment decisions independently; SMF makes session decisions independently; PCF sets policy, which UPF enforces independently. Functions use a service registry (NRF) to discover each other and make localized decisions without central coordination.
Interface: Proprietary protocols give way to cloud-native APIs.
- 2G: proprietary circuit protocols (A-bis, MAP).
- 3G: added packet protocols (Gn, Gi); MAP persists.
- 4G: all data is IP; the control plane uses Diameter and GTP signaling.
- 5G: service-based interfaces (HTTP/REST APIs). Functions call each other's APIs (N10, N11, etc.). This is not a networking innovation—it is the adoption of web architecture in telecom.
4.3 CU/DU/RU Disaggregation: The Radio Functional Split
While the core network disaggregates into microservices, the radio access network (RAN) disaggregates along the time axis. The base station—which contains radio transmission (RF), signal processing (Layer 1: modulation, coding), and control logic (Layer 2-3: scheduling, mobility)—splits into three units based on latency sensitivity.
4.3.1 The Split Points
The split is not arbitrary—it is constrained by the physical requirement that RF transmission stay near the antenna, while control can migrate centrally if transport delays permit. This creates a spectrum of options: aggressive centralization (Option 7) requires premium fronthaul; conservative splits (Option 6) work with microwave backhaul. Figure 4.1 compares the split options, showing the tradeoff between centralization benefit and fronthaul cost.
RU (Radio Unit): At the antenna. Handles RF transmission, digital-to-analog conversion, the lowest-latency operations. Cannot be centralized—physics requires it near the antenna. Must operate at 1-10 microsecond latencies.
DU (Distributed Unit): Layer 1-2 processing—modulation, channel coding, scheduling decisions, MAC (medium access control). Can be centralized to an edge data center (100-200 km from cells) if fronthaul latency is <5 ms. Answers: which users get which resource blocks (PRBs) this TTI? At what modulation-coding scheme?
CU (Central Unit): Layer 3 control—RRC (radio resource control), mobility management, session setup. Tolerates latencies up to ~100 ms, so can be further centralized (regional data center, hundreds of km away).
4.3.2 The Anchor: Fronthaul Latency
The split is constrained by transport latency—the time for signals to travel between units. A 1 ms LTE subframe (the TTI) means Layer 1 processing must complete within a round-trip budget of roughly 5 ms. I/Q samples (raw radio signals) are high-bandwidth (100-500 Mbps per cell). If the DU is too far, fronthaul latency exceeds processing deadlines and the system cannot meet frame deadlines—calls drop, capacity collapses.
The result is not one split, but a menu of options (3GPP splits 1, 2, 6, 7, 8):
Option 7 (aggressive): I/Q samples flow over fronthaul to DU. Fronthaul: 100-500 Mbps, latency <5 ms. Requires premium fiber. Benefit: maximum centralization (one regional DU pool serves many cells). Cost: expensive fronthaul.
Option 2 (moderate): Layer 1 (modulation) stays at RU. Layer 2-3 go to DU. Fronthaul: 10-50 Mbps, latency <10 ms. Benefit: moderate centralization. Cost: lower fronthaul burden.
Option 6 (conservative): DU and RU co-located (RAN stays on-site). CU is centralized. Latency <100 ms. Benefit: minimal backhaul upgrade needed. Cost: no radio processing centralization.
The choice is economic and operational. Dense urban: fiber is available, Option 7 is affordable. Rural: microwave backhaul, Option 6 is pragmatic. The anchor is not physics alone—it is physics + cost + deployability constraints.
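The latency side of this choice reduces to simple arithmetic. A rough feasibility check, assuming ~5 μs/km one-way propagation in fiber and a fixed processing allowance (both illustrative figures, not from any specification):

```python
# Fronthaul feasibility sketch: which split options can a DU at
# distance d serve, given each option's round-trip latency budget?
# Assumption: light in fiber travels at ~2/3 c, i.e. ~5 us/km one way.

US_PER_KM = 5.0  # one-way propagation delay in fiber (approximate)

# round-trip latency budgets (ms) for the options discussed above
SPLIT_BUDGETS_MS = {"option7": 5.0, "option2": 10.0, "option6": 100.0}

def feasible_splits(distance_km: float, processing_ms: float = 1.0):
    """Return the split options whose budget covers round-trip
    propagation plus a fixed processing allowance."""
    rtt_ms = 2 * distance_km * US_PER_KM / 1000.0
    return [opt for opt, budget in SPLIT_BUDGETS_MS.items()
            if rtt_ms + processing_ms <= budget]

# A DU pool 200 km away: 2 ms round-trip propagation + 1 ms processing
print(feasible_splits(200))   # all three budgets still hold
print(feasible_splits(2000))  # 20 ms RTT rules out options 7 and 2
```

Within ~200 km, the budget rather than the geography is the binding constraint; at continental distances, only the conservative split survives.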
The three-unit architecture answers the state-time-coordination invariants distinctly. State is distributed: RU maintains RF hardware state; DU maintains channel state and scheduling history; CU maintains mobility state. Time horizons differ: RU operates in microseconds, DU in milliseconds (TTI = 1 ms), CU in seconds. Coordination flows vertically: CU makes session decisions (is this user attached?), DU makes per-TTI scheduling decisions, RU executes transmission. The functional split enables independent operation while maintaining vertical coordination through well-defined interfaces (fronthaul for RU↔DU, F1 for DU↔CU).
4.4 Base Station Scheduling: Centralized Coordination at the Air Interface
While the RAN disaggregates functionally (CU/DU/RU), it remains centralized in decision-making. Every millisecond (LTE TTI = 1 ms), the scheduler makes the most critical allocation decision: which users get which radio resources?
4.4.1 The Problem
A base station has finite spectrum: ~50-100 physical resource blocks (PRBs). At any TTI, hundreds or thousands of users compete for these PRBs. Each user's channel quality differs (a nearby user with clear line-of-sight can use 64-QAM modulation—6 bits per symbol; a distant user with obstruction uses QPSK—2 bits per symbol). If you allocate all PRBs to the user with the best channel, you maximize instantaneous throughput but starve the poor-channel user. If you allocate equally, you achieve fairness but sacrifice spectral efficiency (the good-channel user's potential goes unused). The scheduler must answer: given channel quality and queue state, which allocation maximizes throughput while ensuring fairness? Figure 4.2 shows the two-dimensional resource grid (frequency × time) that the scheduler partitions across users, with color intensity reflecting per-user channel quality.
OFDMA (Orthogonal Frequency Division Multiple Access) disaggregates the spectrum into discrete, non-overlapping resource elements—time-frequency blocks that can be allocated independently to different users. The base station scheduler partitions a 20 MHz channel into 100 physical resource blocks (PRBs), each 180 kHz wide and 1 millisecond deep (in LTE), creating a two-dimensional resource grid. This grid eliminates the binary decision (“transmit or defer”) that CSMA/CA faces on an undefined medium. Instead, the scheduler makes explicit allocation decisions: “User A gets PRBs 1–5 in TTI 10; User B gets PRBs 6–12 in TTI 10; User C gets PRBs 13–25 in TTI 10.” Collisions are impossible by construction because resource elements are strictly partitioned.
The scheduler works with resource elements (RE: 15 kHz × ~0.07 ms) aggregated into physical resource blocks (PRB: 12 subcarriers × 7 OFDM symbols = 84 REs). The scheduler makes one coding/modulation decision per PRB, not per RE—aggregating decisions reduces complexity but still enables fine-grained allocation across users.
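These figures are internally consistent, as a quick check shows (the 18 MHz of usable bandwidth out of a 20 MHz channel is an assumption reflecting LTE guard bands):

```python
# Resource-grid arithmetic from the definitions above.
SUBCARRIER_KHZ = 15
SUBCARRIERS_PER_PRB = 12
SYMBOLS_PER_SLOT = 7

prb_width_khz = SUBCARRIERS_PER_PRB * SUBCARRIER_KHZ   # 180 kHz per PRB
res_per_prb = SUBCARRIERS_PER_PRB * SYMBOLS_PER_SLOT   # 84 resource elements
usable_khz = 18_000                                    # 20 MHz channel minus guard bands
prbs = usable_khz // prb_width_khz                     # 100 PRBs

print(prb_width_khz, res_per_prb, prbs)  # 180 84 100
```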
4.4.2 The Measurement Signal and Closed Loop
Every 5-40 ms, each user sends a Channel Quality Indicator (CQI) report—a compact measurement (a 4-bit index in LTE) of the effective signal quality on each frequency band. The base station receives ~1000 CQI reports per cell per second. From this measurement signal, the scheduler infers what modulation-coding scheme (MCS) each user can reliably support. Reporting granularity balances measurement accuracy (report per PRB for precision) against uplink overhead (cost in spectrum). Most systems report summaries: CQI per frequency group (~10 groups instead of 100 PRBs).
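The overhead side of that granularity tradeoff is simple arithmetic. A sketch (the 4-bit CQI index, user count, and reporting rate are illustrative assumptions):

```python
# Uplink control overhead from CQI reporting, per cell.
# Assumptions (illustrative): 4-bit CQI index, 100 active users,
# one report every 20 ms (the middle of the 5-40 ms range).

CQI_BITS = 4
USERS = 100
REPORTS_PER_SEC = 50  # one report per 20 ms

def cqi_overhead_bps(reports_per_user: int) -> int:
    """Total uplink bits/s spent on CQI across all users."""
    return USERS * reports_per_user * CQI_BITS * REPORTS_PER_SEC

per_prb = cqi_overhead_bps(100)   # one CQI per PRB
per_band = cqi_overhead_bps(10)   # one CQI per frequency group
print(per_prb, per_band)  # 2000000 200000: summaries cut overhead 10x
```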
The feedback loop is tight:
- Users report CQI → (5-40 ms latency)
- Scheduler allocates PRBs and selects MCS for each user → (1 ms decision)
- Users transmit on allocated PRBs with allocated MCS → (1 ms transmission)
- Base station observes success/failure (ACK/NACK) → (1 ms feedback)
- Retransmit failed packets on next opportunity → (loop closes)
This is dramatically tighter than 802.11, where rate adaptation gathers transmission statistics over many frames and adjusts on timescales of tens to hundreds of milliseconds. The tight feedback loop enables rapid adaptation: if the channel degrades, the scheduler can select a lower MCS immediately (next TTI). If the channel improves, it can attempt higher modulation (higher risk, higher reward if successful).
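One common way this loop is closed is outer-loop link adaptation: an offset applied to the CQI-derived MCS choice is stepped down sharply on each NACK and up slowly on each ACK, so the realized error rate converges to a target. A minimal sketch (the step sizes and the 10% BLER target are illustrative, not from any standard):

```python
# Outer-loop link adaptation sketch: nudge an offset so the realized
# block error rate converges to a target. On NACK, back off sharply;
# on ACK, creep up slowly. Steps are chosen so the loop balances
# exactly at the target error rate.

TARGET_BLER = 0.10
STEP_DOWN = 0.5                                        # dB penalty per NACK
STEP_UP = STEP_DOWN * TARGET_BLER / (1 - TARGET_BLER)  # dB gain per ACK

def adjust_offset(offset_db: float, ack: bool) -> float:
    return offset_db + STEP_UP if ack else offset_db - STEP_DOWN

offset = 0.0
for ack in [True] * 9 + [False]:  # 9 ACKs then 1 NACK: exactly 10% BLER
    offset = adjust_offset(offset, ack)
print(abs(round(offset, 6)))  # 0.0: the loop balances at the target BLER
```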
4.4.3 Invariant Answers
State: Each user has queue state (backlog waiting transmission), channel state (CQI estimate), and scheduling history (was this user allocated last TTI, did it succeed?). The scheduler maintains a coarse model: (user, channel quality estimate, queue depth). Tracking per-PRB CQI for every user would consume too much state and reporting bandwidth—instead, the scheduler works with summaries.
Time: Allocation decisions happen every TTI (1 ms in LTE). CQI feedback is periodic (5-40 ms intervals). Retransmission is reactive (NACK triggers immediate re-queuing, but transmission waits for next available opportunity). The tight 1 ms cycle for allocation is the binding constraint—the scheduler must compute thousands of allocation decisions per millisecond, so algorithms are greedy heuristics, not exhaustive optimization.
Coordination: Fully centralized. The base station scheduler is the single decider. Users have no negotiating power—they report CQI (feedback), but the scheduler makes unilateral allocation decisions. This is the opposite of WiFi’s distributed CSMA/CA.
Interface: Downlink control information (DCI) tells each user which PRBs are allocated and which MCS to use. Uplink CQI reports inform the scheduler. The interface is tightly defined—3GPP specifies the CQI format and feedback intervals—but NOT the scheduling algorithm. Each vendor's base station uses proprietary scheduling logic. All must achieve fairness (no user starvation), but the heuristic (proportional fair, max-throughput, latency-based priority) is a competitive differentiator.
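As a concrete instance of such a heuristic, proportional fairness gives each PRB to the user with the highest ratio of achievable instantaneous rate to smoothed served rate. A minimal sketch (the rates and smoothing factor are illustrative; real schedulers add queue state, HARQ, and QoS weights):

```python
# Proportional-fair scheduling sketch: each TTI, give each PRB to the
# user with the highest inst_rate / avg_rate ratio, then smooth the
# averages by what was actually served.

def pf_schedule(inst_rate, avg_rate, n_prbs):
    """inst_rate[u][p]: rate user u would get on PRB p this TTI.
    avg_rate[u]: smoothed served rate. Returns {prb: user}."""
    return {p: max(range(len(avg_rate)),
                   key=lambda u: inst_rate[u][p] / avg_rate[u])
            for p in range(n_prbs)}

def update_avg(avg_rate, alloc, inst_rate, beta=0.1):
    """Exponentially smooth each user's average by its served rate."""
    served = [0.0] * len(avg_rate)
    for p, u in alloc.items():
        served[u] += inst_rate[u][p]
    return [(1 - beta) * a + beta * s for a, s in zip(avg_rate, served)]

# Two users, two PRBs: user 0 has the better channel on every PRB,
# yet PF still serves user 1 once its average falls behind.
inst = [[10.0, 10.0], [4.0, 4.0]]
avg = [1.0, 1.0]
counts = [0, 0]
for _ in range(20):
    alloc = pf_schedule(inst, avg, 2)
    for u in alloc.values():
        counts[u] += 1
    avg = update_avg(avg, alloc, inst)
print(counts)  # both users get PRBs; neither is starved
```

A max-throughput scheduler would give user 0 every PRB forever; PF's normalization by the served average is exactly what prevents starvation.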
4.4.4 Principles at Work
Disaggregation: Channel measurement (CQI reports) is separated from allocation decisions. A user measures locally and reports asynchronously; the scheduler makes synchronous decisions every TTI. This separation allows users to measure independently without waiting for scheduler acknowledgment.
Closed-Loop Reasoning: UEs report CQI → scheduler allocates → UEs transmit → success/failure observed → scheduler adjusts the next allocation based on outcomes. The loop is fast (1 ms) and tight. If CQI predictions prove pessimistic (transmissions consistently succeed), the scheduler probes higher modulation; if optimistic (retransmission rate is high), it backs off. This forms a control loop, similar to TCP congestion control but at the radio layer and on much faster timescales.
Decision Placement: Centralization is anchored by resource scarcity and coordination complexity. With thousands of users competing for 100 PRBs, a distributed approach (each user negotiates for PRBs) would create collision and overhead. Centralization enables coordinated allocation and prevents collisions. The cost is that the base station must be computationally powerful enough to make thousands of allocation decisions per millisecond—a real architectural constraint.
4.6 Network Slicing and Virtualization: Platforms that Constrain
Network slicing enables multiple independent logical networks to coexist on shared physical infrastructure. An operator allocates "slices" to different tenants (automotive, enterprise, video), each with its own QoS guarantees, failover policies, and security domains. A slice is a constrained version of the network: "your slice gets 50 PRBs, guaranteed 10 Mbps, <20 ms latency, 99.99% reliability."
4.6.1 The Abstraction and Its Constraints
Slicing is a platform—a layer of abstraction that constrains what invariant answers are feasible.
State constraints: each slice maintains independent resource pools (RBs, power budget, security tokens). State is distributed: the orchestrator tracks global allocation; each base station tracks local consumption and which slice each RB belongs to.
Time constraints: slices operate on multiple timescales. Long-term (hours-days): SLA negotiation and resource provisioning. Medium-term (seconds): traffic-aware reallocation. Short-term (milliseconds): per-TTI scheduling within the slice's resource budget.
Coordination constraints: a centralized orchestrator makes long-term decisions (allocate these RBs to the automotive slice, those to video); distributed schedulers make per-TTI decisions within those boundaries.
Interface constraints: tenants interact via an SLA API ("guarantee 10 Mbps, <20 ms latency"), not radio details. The abstraction hides PRBs and scheduling—tenants see a virtual network.
4.6.2 The Tradeoff: Isolation vs. Efficiency
Slicing promises isolation (your SLA is guaranteed) and efficiency (pack multiple tenants on shared resources). These are in tension. If every tenant gets dedicated RBs, isolation is perfect but efficiency is low (RBs idle in low-demand slices). If RBs are dynamically shared, efficiency is high but isolation is weak (one tenant’s burst can starve another until orchestrator reallocates, which takes seconds). Real systems navigate this tradeoff: most slices get reservations (guaranteed minimums) + oversubscription (shared excess capacity, first-come-first-served if available). An automotive slice gets guaranteed low-latency path (reserved short buffer, immediate scheduling); a video slice gets bulk path (lower priority, variable latency). SLA design reflects this: “guaranteed minimum 5 Mbps; best-effort up to 20 Mbps with lower latency guarantee.”
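The reservation-plus-oversubscription policy can be sketched directly (the slice names and figures are illustrative):

```python
# Slice allocation sketch: each slice first gets its guaranteed
# minimum (capped by demand), then leftover PRBs are handed out
# first-come-first-served to slices with unmet demand.

def allocate_prbs(total_prbs, slices):
    """slices: name -> (guaranteed_min, demand). Returns name -> PRBs."""
    alloc = {name: min(g, d) for name, (g, d) in slices.items()}
    leftover = total_prbs - sum(alloc.values())
    for name, (g, d) in slices.items():
        extra = min(d - alloc[name], leftover)
        alloc[name] += extra
        leftover -= extra
    return alloc

slices = {"automotive": (20, 15), "video": (30, 80), "enterprise": (10, 25)}
print(allocate_prbs(100, slices))
# automotive takes only its demand (15); video absorbs the shared excess
```

Note the tension in miniature: video's burst absorbs all 45 leftover PRBs, so enterprise is held to its guaranteed 10 even though it wanted 25—isolation holds at the minimums, efficiency comes from the shared excess.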
4.6.3 Principles at Work
Disaggregation: Slicing disaggregates resources by tenant. Traditional network: one scheduler for all users. Sliced network: per-slice scheduler, or one global scheduler that respects slice boundaries. This disaggregation enables independent SLAs: each slice has its own QoS requirements, and failure of one slice does not affect others (resource isolation).
Closed-Loop Reasoning: The orchestrator monitors slice performance (are we meeting SLAs?) and adjusts allocations. If automotive slice’s latency budget is threatened (load too high), the orchestrator can reallocate video slice’s RBs to automotive (preemption). If video slice is consistently underutilized, the orchestrator reclaims its RBs for other slices (cost optimization). The measurement (SLA compliance, resource utilization) drives reallocation decisions.
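A toy version of that reallocation loop (the utilization threshold, the halving policy, and the slice figures are all illustrative assumptions):

```python
# Orchestrator closed-loop sketch: compare measured slice latency
# against its SLA and shift PRBs from an underutilized slice.

def rebalance(slices):
    """slices: name -> dict(prbs, latency_ms, sla_ms, utilization)."""
    # candidate donors: slices using less than 30% of their allocation
    victims = [n for n, s in slices.items() if s["utilization"] < 0.3]
    for name, s in slices.items():
        if s["latency_ms"] > s["sla_ms"] and victims:
            donor = victims.pop(0)
            moved = slices[donor]["prbs"] // 2  # take half the donor's PRBs
            slices[donor]["prbs"] -= moved
            s["prbs"] += moved
    return slices

state = {
    "automotive": {"prbs": 20, "latency_ms": 25, "sla_ms": 20, "utilization": 0.95},
    "video": {"prbs": 60, "latency_ms": 80, "sla_ms": 100, "utilization": 0.2},
}
print(rebalance(state)["automotive"]["prbs"])  # 50: took half of video's PRBs
```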
Decision Placement: Centralization at the orchestrator level (who gets which RBs) combined with distribution at the scheduler level (how to use those RBs). This hybrid approach is practical: the orchestrator makes slower decisions (timescale: seconds), while schedulers make fast decisions (timescale: milliseconds) within constraints.
4.7 The 5G Core: Microservices and Service-Based Architecture
4.7.1 From Monolith to Microservices
The 4G core was monolithic: one HSS appliance stored all subscriptions, one MME handled all attachments, one S-GW anchored all sessions. The core was vertically integrated—a vendor sold a complete box; operators couldn’t pick and choose. Scaling one function meant buying bigger appliances; new features required vendor upgrades.
The 5G core disaggregates: functions become stateless microservices running on commodity cloud infrastructure (Kubernetes, AWS, Azure, on-premises clouds). Services: AMF (Access and Mobility Management Function), SMF (Session Management Function), UPF (User Plane Function), PCF (Policy Control Function), UDM (Unified Data Management), NRF (Network Repository Function). Each is independent: if session load spikes, add SMF instances; if user-plane throughput is the bottleneck, add UPF instances. With the monolithic 4G core, you'd scale the entire core (wasting resources on lightly loaded functions like subscription lookup).
4.7.2 Service-Based Interfaces and Choreography
Services communicate via REST APIs (Service Based Interface, SBI). When a user attaches:
- UE → AMF (attach request)
- AMF → NRF (discover SMF) → returns SMF address
- AMF → SMF (create session)
- SMF → NRF (discover policy function) → returns PCF address
- SMF → PCF (get policy) → returns rate limits, charging info
- SMF → UPF (establish data path) → configures packet forwarding
- SMF → AMF (session created) → AMF confirms to UE
This is choreography, not orchestration. No central conductor decides the flow; services react to messages and call each other’s APIs. Each service owns a piece of state: AMF owns attachment registrations, SMF owns sessions, UDM owns subscriptions, PCF owns policy. Consistency is loose (eventual consistency)—if SMF crashes, another SMF reloads from persistent storage and resumes; brief session loss, but system recovers (RTO: 10-30 seconds).
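The choreography above can be mimicked in a few lines: a registry dict stands in for the NRF, and plain function calls stand in for REST requests. The names mirror the 5G functions, but the flow is a simplified sketch, not the full 3GPP sequence:

```python
# Choreographed attach sketch: each function discovers its peer via a
# registry (the NRF) and calls it directly; no central conductor.

registry = {}  # NRF: service name -> handler (stands in for an endpoint)

def register(name):
    def wrap(fn):
        registry[name] = fn
        return fn
    return wrap

@register("pcf")
def get_policy(user):
    return {"rate_limit_mbps": 20}

@register("upf")
def establish_path(user, policy):
    return {"user": user, "forwarding": "configured", **policy}

@register("smf")
def create_session(user):
    policy = registry["pcf"](user)        # SMF -> NRF lookup -> PCF
    return registry["upf"](user, policy)  # SMF -> UPF

@register("amf")
def attach(user):
    return registry["smf"](user)          # AMF -> NRF lookup -> SMF

print(attach("ue-42"))  # session context assembled AMF -> SMF -> PCF/UPF
```

The key property: AMF never knows about PCF or UPF; each service knows only its immediate peers and the registry, which is what makes functions independently replaceable.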
4.7.3 Invariant Answers in Microservice Architecture
State: Distributed and decoupled from hardware. Each service holds minimal state needed for its function. AMF doesn’t care about sessions; SMF doesn’t care about attachments. State is stored in databases (UDM, policy stores), enabling stateless service instances. If an SMF instance crashes, another instance reads state from database and continues. This enables horizontal scaling: add SMF instances without losing state.
Time: Multiple timescales. Session events (setup, deletion): milliseconds to seconds. Service discovery (NRF lookup): milliseconds. Policy updates: seconds to minutes. Each service operates asynchronously; messages are delivered with some latency (network hops add milliseconds). A user attach involves ~5-7 service calls, and each hop adds latency: a complex attach sequence totals 50-200 ms (acceptable for session setup; it would be problematic per-packet).
Coordination: Distributed via choreography. NRF is the service registry but not a centralized decider. Each service makes local decisions (AMF decides where to attach user; SMF decides which UPF to use; PCF decides rate limits). No single orchestrator coordinates; services coordinate implicitly through message passing. Advantage: resilient (no single point of failure at the control level). Disadvantage: harder to guarantee global consistency (if messages are lost or delayed).
Interface: REST APIs (N10, N11, etc.). Each service exposes endpoints. Contrast with 4G’s proprietary protocols (Diameter, GTP)—REST is simpler, language-agnostic, and leverages web infrastructure. Debugging is easier (standard HTTP tools work); monitoring is standard (Prometheus/Grafana); security is standard (TLS).
4.7.4 Cloudification: Software, Not Hardware
Cloudification is the realization of disaggregation. Disaggregate functions (conceptual separation); virtualize them (run on commodity servers); commoditize infrastructure (Kubernetes, cloud APIs). The result: the network is software running on hardware you don't own (cloud) or on hardware you choose (commodity servers). This shifts economics from CapEx (buy specialized appliances upfront) to OpEx (pay per compute hour used).
An operator wanting 5G network has two paths:
Path A (Traditional): Buy integrated RAN+core from vendor (Nokia, Ericsson). CapEx: $10M upfront for hardware, software licenses, integration. Constraint: locked into vendor’s roadmap and pricing.
Path B (Cloud-Native): Deploy open-source 5G (ONAP, OSM) on cloud (AWS, Azure, or on-premises). CapEx: $2M upfront for software engineering. OpEx: $0.5M/month cloud compute. Benefit: flexibility (swap vendors, update faster, pay only for used capacity). Cost: higher OpEx if load is constant (cloud compute is more expensive than dedicated hardware at scale).
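Under the figures above, the two paths can be compared as cumulative cost over time (a sketch; real comparisons add staffing, licenses, and load variability, and the zero ongoing OpEx for Path A is a simplifying assumption):

```python
# Cumulative cost ($M) of Path A (CapEx-heavy) vs Path B (OpEx-heavy).

def path_a_cost(months, capex=10.0, opex=0.0):
    """Traditional: big upfront spend, ongoing OpEx ignored here."""
    return capex + opex * months

def path_b_cost(months, capex=2.0, opex=0.5):
    """Cloud-native: small upfront spend, steady cloud bill."""
    return capex + opex * months

# Path B starts cheaper; Path A wins once B's OpEx has eaten the gap.
breakeven = (10.0 - 2.0) / 0.5
print(breakeven)                         # 16.0 months
print(path_a_cost(36), path_b_cost(36))  # 10.0 vs 20.0 at three years
```

This is why the text's caveat matters: with constant load and a long horizon, dedicated hardware amortizes; the cloud path pays off through flexibility, not raw cost.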
4.7.5 The Anchor: Load Heterogeneity and Optimization
The anchor constraint that drives cloudification is heterogeneous load profiles. Session setup (SMF) load spikes with attachment storms; policy enforcement (PCF) load spikes with quota checks; user plane (UPF) load is continuous. A monolithic core must be scaled as a unit: to give the busiest function its peak capacity, you replicate the whole appliance, over-provisioning the lightly loaded functions bundled with it. A microservice core is sized per function: if SMF peaks at 100 load units, PCF at 30, and UPF at 500, you provision SMF for 100, PCF for 30, UPF for 500. Combined CapEx is lower (no over-provisioning of low-demand functions).
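The sizing argument can be made concrete (the load units, per-appliance capacities, and fixed bundling are illustrative assumptions):

```python
# Provisioning sketch: a monolithic core scales in whole-appliance
# units that bundle every function at a fixed capacity; microservices
# provision each function to its own peak.
import math

peaks = {"smf": 100, "pcf": 30, "upf": 500}   # peak load per function
unit = {"smf": 50, "pcf": 50, "upf": 50}      # capacity per appliance

# Monolith: enough appliances that every bundled function meets its peak.
appliances = max(math.ceil(p / unit[f]) for f, p in peaks.items())
monolith_capacity = {f: appliances * unit[f] for f in peaks}

# Microservices: provision each function independently.
micro_capacity = dict(peaks)

print(appliances, monolith_capacity)
# 10 appliances: PCF ends up with 500 units of capacity for a peak of 30
```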
Additionally, cloudification enables faster innovation: push a new SMF version without restarting the core; route traffic gradually to the new version; roll back if issues arise. With monolithic cores, feature updates require planned downtime (every 6-12 months). With microservices, updates are continuous (daily or hourly).
4.9 Generative Exercises
4.9.1 Exercise 1: Private 5G Factory Floor
A manufacturing company deploys a private 5G network on its factory floor. Unlike a public network (millions of users, massive scale), this network has ~1000 employees, controlled environment, and single administrative domain. The question: how do the invariant answers change when administrative boundaries collapse?
Analysis:
State: In a public network, state must be distributed (users roam, administrative borders are hard). In a private network, state can be centralized (all users belong to one company, all devices registered at one place). How might the core architecture change if you had one central UDM, one central AMF, one UPF at the factory center?
Time: In a public network, handoff must be fast (users move between operators’ networks, delays are visible). In a private network, handoff can be slower (all base stations are within company control, coordination is simpler). Could you reduce handoff latency if there were no roaming delays? What if handoff were coordinated across all cells centrally (one orchestrator)?
Coordination: In a public network, no entity controls all cells (multi-vendor, multi-operator). In a private network, a single company controls all infrastructure. What if the factory deployed one global scheduler for all cells (versus distributed schedulers at each cell)? What would be the benefits (optimal spectrum allocation across cells) and costs (central scheduler bottleneck)?
Interface: In a public network, standardized interfaces (X2 for eNB-eNB, N11 for SMF-UPF) enable interoperability. In a private network, you could use proprietary interfaces if you control all equipment. Would that accelerate performance or introduce unnecessary coupling?
Hypothesis: Collapsed administrative boundaries enable centralization (state, coordination, control). This trades off resilience (no fallback if central element fails) for optimization (global visibility enables better decisions). The design should reflect the risk tolerance: a factory can tolerate brief outages (production paused for repair); a public network cannot (millions of users affected).
4.9.2 Exercise 2: 5G RAN Disaggregation Economics
An operator has 100 cells in a dense city. Today, it uses monolithic base stations ($100k each) = $10M hardware cost. It is considering disaggregation: 100 RUs ($10k each) at cells, 2 centralized DU/CU data centers. What is the tradeoff?
CapEx Analysis:
- RU hardware: 100 × $10k = $1M
- Fronthaul deployment (fiber to each RU): $40k per RU × 100 = $4M
- DU/CU data center equipment: $1M
- Total: $6M (lower than the $10M monolith)
But there are implicit costs:
- Fronthaul latency is a tight constraint (5 ms max for the Layer 1 split). Is every site reachable with <5 ms? If not, fall back to a Layer 2 or Layer 3 split, reducing the optimization benefit.
- Operational complexity: orchestration, monitoring distributed DUs/CUs, debugging failures spread across geography.
Hypothesis: Disaggregation is cheaper at scale (as in the 100-cell example). At small scale (on the order of 10 cells), the fronthaul cost dominates. The breakeven point depends on fiber availability and labor costs.
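Under the exercise's unit prices, the breakeven falls out directly (this uses only the stated hardware costs, ignoring OpEx and fronthaul feasibility):

```python
# Monolithic vs disaggregated RAN hardware cost ($k) vs cell count,
# using the exercise's unit prices.

def monolith_cost_k(cells, per_bts_k=100):
    """Monolithic base stations only."""
    return cells * per_bts_k

def disagg_cost_k(cells, per_ru_k=10, fronthaul_per_ru_k=40, dc_total_k=1000):
    """RUs + per-site fronthaul fiber + shared DU/CU data centers."""
    return cells * (per_ru_k + fronthaul_per_ru_k) + dc_total_k

print(monolith_cost_k(100), disagg_cost_k(100))  # 10000 6000 ($10M vs $6M)
# Breakeven: 100n = 50n + 1000  =>  n = 20 cells
print(monolith_cost_k(20), disagg_cost_k(20))    # 2000 2000
```

Notably, the 20-cell operator in the design exercise sits exactly at breakeven under these figures, which is why the non-cost factors become decisive there.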
Design exercise: For a smaller operator with 20 cells (where disaggregation isn’t obviously cheaper), what other factors might justify it? (Vendor flexibility? Faster feature deployment? Energy efficiency through centralized cooling?)
4.10 Summary: Disaggregation as the Organizing Principle
The entire 5G architecture story is disaggregation applied at infrastructure scale. Starting from a monolithic 2G base station and MSC, each generation peeled away coupling:
- 3G: voice and data paths separate (first disaggregation)
- 4G: all-IP unification, distributed core functions (MME, S-GW, P-GW), but still hardware-bound
- 5G: complete atomization into stateless microservices + RAN split (CU/DU/RU) + cloud deployment
At each step, the anchor constraint shifted. 2G: circuit switching (rigidity). 3G: parallel voice+data (incremental change). 4G: unified IP transport (opens door to disaggregation). 5G: cloud commodity hardware (enables software-defined networks).
The principles drive design:
Disaggregation: Separating coupled concerns enables independent scaling, updating, deployment. CU/DU/RU split isolates radio processing from control. Core microservices isolate session management from policy from user data. Each separation creates a new interface; each interface is an opportunity for decoupling and optimization.
Closed-Loop Reasoning: Base station schedulers adapt MCS based on CQI feedback (1 ms loop). Mobility applies hysteresis to avoid ping-pong oscillation. 5G core uses measurement-driven policy updates (PCF adjusts limits based on traffic metrics). All loops operate at different timescales; all require stable feedback signals.
Decision Placement: From centralized (4G MME controls mobility, S-GW anchors sessions) to distributed (5G AMF, SMF, UPF make localized decisions via choreography). Distribution reduces single points of failure but complicates consistency. The sweet spot varies by constraint: admission control (centralized, requires global view) vs. packet forwarding (distributed, local decisions sufficient).
Network slicing is the platform that constrains what invariant answers are feasible. It is not a new invariant; it is a constraint layer that partitions available resources and enforces SLA boundaries. Functions beneath the slice (scheduler, handoff) must respect the slice's resource budget.
The central insight: architecture is how you answer the four invariants under constraints. As environmental constraints shift (from voice to data, from monolithic hardware to commodity servers, from single operator to shared infrastructure), the answers shift. Disaggregation is the strategy that keeps the system decomposable—each function can evolve independently, limited only by its interface.
4.11 References
- 3GPP (2018). “3GPP TR 38.912: Study on New Radio (NR) Access Technology Physical Layer Aspects.” 3GPP Technical Report 38.912.
- 3GPP (2022). “TS 23.501: System Architecture for 5G.” 3GPP Technical Specification 23.501.
- Bria, A., Gessler, F., Queseth, O., Stridh, R., Unbehaun, M., Wu, J., and Zander, J. (2001). "4th Generation Wireless Infrastructures – Scenarios and Research Challenges." Proc. 12th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC).
- Fettweis, G. P. and Alamouti, S. (2014). “5G: Personal Mobile Internet and the Internet of Things.” IEEE Wireless Communications Magazine, 21(2):64-75.
- Hoang, H., Harada, H., Mori, K., and Sato, Y. (2009). “Aspects of Mobile Broadband Wireless Access.” IEEE Wireless Communications Magazine, 16(5):36-42.
- Jacobson, V. (1988). “Congestion Avoidance and Control.” Proc. ACM SIGCOMM.
- Kurose, J. F. and Ross, K. W. (2020). Computer Networking: A Top-Down Approach, 8th ed. Pearson.
- Lamport, L. (1978). “Time, Clocks, and the Ordering of Events in a Distributed System.” Communications of the ACM, 21(7):558-565.
- McKeown, N., Anderson, T., Balakrishnan, H., et al. (2008). “OpenFlow: Enabling Innovation in Campus and Enterprise Networks.” ACM SIGCOMM Computer Communication Review, 38(2):69-74.
- NGMN (Next Generation Mobile Networks) Alliance (2015). “5G White Paper.” NGMN.
- Open RAN Alliance (2022). “Open RAN Explained.” O-RAN Alliance White Paper.
- Richter, F., Fehske, A. J., and Fettweis, G. P. (2009). “Energy Efficiency Aspects of Base Station Deployment.” Proc. IEEE VTC 2009 Fall.
- Saltzer, J. H., Reed, D. P., and Clark, D. D. (1984). “End-to-End Arguments in System Design.” ACM Trans. Computer Systems, 2(4):277-288.
- Shannon, C. E. (1948). “A Mathematical Theory of Communication.” Bell System Technical Journal, 27(3):379-423.
- Wiener, N. (1948). Cybernetics: Or Control and Communication in the Animal and the Machine. MIT Press.
This chapter is part of “A First-Principles Approach to Networked Systems” by Arpit Gupta, UC Santa Barbara, licensed under CC BY-NC-SA 4.0.