I am currently an assistant professor in computer science at UC Santa Barbara. At UCSB, I co-direct the Systems and Networking Lab (SNL). I am also affiliated with UCSB's Center for Responsible Machine Learning (CRML). I received my Ph.D. in computer science from Princeton University.

As a systems researcher, I design and build flexible, scalable, and deployable systems that solve real-world problems at the intersection of networking, security, and machine learning. Currently, my focus is on making distributed network-telemetry systems scalable and robust to traffic dynamics, and building self-driving network-management systems for access networks. Please find more details about my current research here.

Note: I am always on the lookout for motivated students. Please see these notes before reaching out to me over email.

Selected Publications

Flexible and Scalable Systems for Network Management
Arpit Gupta
Ph.D. Thesis
Received honorable mention for SIGCOMM Doctoral Dissertation Award
abstract Paper Talk

Our daily lives are heavily reliant upon Internet-connected devices, services, and applications. This reliance makes it more critical than ever that the underlying networks they depend on be reliable, performant, and secure. At the same time, the increasing complexity and diversity of today's devices, services, and applications have made network management tasks more complicated than ever. Modern network management mandates that operators can systematically monitor what is going on in their networks (network monitoring) and use this information to take real-time preventive or corrective actions (network control). Achieving these goals while also adhering to the limited compute and storage resources available on modern network devices poses significant challenges.

The contribution of this dissertation is the design and implementation of two systems that enable flexible and scalable network monitoring and control. The network-monitoring system, Sonata, collects and analyzes network traffic to infer various network events in real time. The network-control system, SDX, enables fine-grained reactive control actions for interdomain traffic without disrupting the existing routing protocols. For each of these two systems, the dissertation focuses on (i) the abstractions that allow network operators to express flexible programs for both network monitoring and control; (ii) the algorithms that make the best use of limited compute and storage resources; and (iii) the systems that combine the high-level abstractions and the low-level algorithms and can be deployed in production settings.

The lessons learned from this dissertation can help us develop next-generation network-management systems. More concretely, unlike existing systems that rely solely on a single device-type, this dissertation shows that designing systems that can pool resources from a heterogeneous set of devices (targets) is critical for building flexible and scalable network-management systems. It also demonstrates that as the networking technologies and protocols evolve rapidly with time, it is imperative to design modular systems that can swiftly catch up with these changes. Finally, this research also illustrates that it is crucial to select strategic locations (e.g., Internet exchange points) for deployment to drive innovations in Internet-wide traffic monitoring and control.

Detecting Ephemeral Optical Events with OpTel
Congcong Miao, Minggang Chen, Arpit Gupta, Zili Meng, Lianjin Ye, Jingyu Xiao, Jie Chen, Zekun He, Xulong Luo, Jilong Wang, Heng Yu
abstract Paper Talk
Degradation or failure events in optical backbone networks affect the service level agreements for cloud services. It is critical to detect and troubleshoot these events promptly to minimize their impact. Existing telemetry systems rely on arcane tools (e.g., SNMP) and vendor-specific controllers to collect optical data, which limits both the flexibility and scale of these systems. As a result, they fail to collect the required data in time to detect and troubleshoot degradation or failure events promptly. This paper presents the design and implementation of OpTel, an optical telemetry system that uses a centralized vendor-agnostic controller to collect optical data in a streaming fashion. More specifically, it offers flexible vendor-agnostic interfaces between the optical devices and the controller and offloads data-management tasks (e.g., creating a queryable database) from the devices to the controller. As a result, OpTel enables the collection of fine-grained optical telemetry data at one-second granularity. It has been running in Tencent's optical backbone network for the past six months. The fine-grained data collection enables the detection of short-lived events (i.e., ephemeral events). Compared to existing telemetry systems, OpTel accurately detects 2x more optical events. It also enables troubleshooting of these optical events in a few seconds, which is orders of magnitude faster than the state of the art.

Sonata: Query-Driven Streaming Network Telemetry
Arpit Gupta, Rob Harrison, Marco Canini, Nick Feamster, Jennifer Rexford, Walter Willinger
ACM SIGCOMM, Budapest, Hungary
abstract Paper Talk Code
Managing and securing networks requires collecting and analyzing measurement data. Current technologies do not make it easy to do so, typically because they separate data collection (e.g., packet capture or flow monitoring) from the analysis, producing either too much data to answer a general question or too little data to answer a detailed question. This paper presents Sonata, a network telemetry system that exposes a query interface that directs the joint collection and analysis of network traffic. Sonata allows operators to directly express queries in a high-level language, partitions each query into a portion that runs on the switch and another that runs on the streaming analytics platform, and refines the query to capture only the traffic that satisfies it. Sonata allows operators to express real network monitoring tasks using dataflow operators, a compact, familiar programming idiom. Evaluation using traffic traces from a large ISP backbone shows that Sonata's ability to compile portions of these queries to the data plane can reduce traffic rates at the stream processor by up to seven orders of magnitude.

iSDX: An Industrial-Scale Software Defined Internet Exchange Point
Arpit Gupta, Robert MacDavid, Rüdiger Birkner, Marco Canini, Nick Feamster, Jennifer Rexford, Laurent Vanbever
USENIX NSDI, Santa Clara, CA
Winner of Community Award
Selected in the Best of the Rest session at USENIX ATC, 2016
Media Articles: CircleID, ONF Blog, NewIP

abstract Paper Talk Code
Software-Defined Internet Exchange Points (SDXes) promise to significantly increase the flexibility and function of interdomain traffic delivery on the Internet. Unfortunately, current SDX designs cannot yet achieve the scale required for large Internet exchange points (IXPs), which can host hundreds of participants exchanging traffic for hundreds of thousands of prefixes. Existing platforms are indeed too slow and inefficient to operate at this scale, typically requiring minutes to compile policies and millions of forwarding rules in the data plane. We motivate, design, and implement iSDX, the first SDX architecture that can operate at the scale of the largest IXPs. We show that iSDX reduces both policy compilation time and forwarding table size by two orders of magnitude compared to current state-of-the-art SDX controllers. Our evaluation against a trace from one of the largest IXPs in the world found that iSDX can compile a realistic set of policies for 500 IXP participants in less than three seconds. Our public release of iSDX, complete with tutorials and documentation, is already spurring early adoption in operational networks.

SDX: A Software Defined Internet Exchange
Arpit Gupta, L. Vanbever, M. Shahbaz, S. Donovan, B. Schlinker, N. Feamster, J. Rexford, S. Shenker, R. Clark, E. Katz-Bassett
210+ citations, one of the highest for SIGCOMM 2014
abstract Paper Talk Code
BGP severely constrains how networks can deliver traffic over the Internet. Today's networks can only forward traffic based on the destination IP prefix, by selecting among routes offered by their immediate neighbors. We believe Software Defined Networking (SDN) could revolutionize wide-area traffic delivery, by offering direct control over packet-processing rules that match on multiple header fields and perform a variety of actions. Internet exchange points (IXPs) are a compelling place to start, given their central role in interconnecting many networks and their growing importance in bringing popular content closer to end users. To realize a Software Defined IXP (an SDX), we must create compelling applications, such as application-specific peering, where two networks peer only for (say) streaming video traffic. We also need new programming abstractions that allow participating networks to create and run these applications and a runtime that both behaves correctly when interacting with BGP and ensures that applications do not interfere with each other. Finally, we must ensure that the system scales, both in rule-table size and computational overhead. In this paper, we tackle these challenges and demonstrate the flexibility and scalability of our solutions through controlled and in-the-wild experiments. Our experiments demonstrate that our SDX implementation can implement representative policies for hundreds of participants who advertise full routing tables while achieving sub-second convergence in response to configuration changes and routing updates.