ONRG Kicks Off Work on Programmable Switch Hardware with DynATOS at NSDI 2022

Chris Misa, a fourth-year PhD student and member of the Oregon Networking Research Group (ONRG), presented a paper titled “Dynamic Scheduling of Approximate Telemetry Queries” at the USENIX Symposium on Networked Systems Design and Implementation (NSDI) on April 5th. The paper describes a project called DynATOS, which is the first principled approach to runtime programmable network telemetry on switch hardware. The project also marks the ONRG’s first chapter in developing systems on programmable switches and an exciting opportunity to share this work with a global audience at a top-tier networked systems conference.

Network Telemetry and Programmable Switches

As the performance and security demands on computer networks continue to grow, so too does the complexity of administering and operating such networks. Modern networks carry massive volumes of data between millions of computers using dedicated hardware switching devices. Monitoring this data traffic (a process known as network telemetry) is critical for network administrators to ensure network users are able to use connected services smoothly, without interruption, and without being impacted by malicious attacks. However, monitoring traffic flowing through a network is somewhat like monitoring water flowing through the pipes of a very large high-rise building: a specialized system must be built into the network itself to enable monitoring.

Recently, several companies have developed hardware switching devices that allow more flexible processing of network traffic directly in the network hardware. These devices, known as programmable switches, are like miniature water processing plants that have a large number of knobs to direct water on different paths or to insert more complex water-driven “devices” or “programs” into the pipes of the network. In particular, network administrators can write programs for programmable switches that compute high-level monitoring results directly in the network without needing to divert network traffic to expensive software systems. For example, a network administrator could create a specialized “water wheel” kind of program that only “spins” when some malicious traffic is flowing through the pipe, then monitor for when this attack-indicator water wheel spins.
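To make the “water wheel” idea a little more concrete, here is a minimal software sketch of the kind of filter-and-count logic such a monitoring program might express. It is written in ordinary Python rather than a switch programming language, and the `Packet` fields, the `syn_flood_suspects` function, and the threshold are all hypothetical names chosen for illustration; they are not taken from DynATOS.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Packet(NamedTuple):
    """Hypothetical, heavily simplified view of a packet header."""
    src: str
    dst: str
    dst_port: int
    syn: bool  # True if this packet opens a new TCP connection


def syn_flood_suspects(packets: Iterable[Packet], threshold: int) -> Set[str]:
    """Return destinations that received more than `threshold` SYN packets."""
    # Filter: keep only SYN packets. Count: tally them per destination.
    syn_counts = Counter(p.dst for p in packets if p.syn)
    # The "water wheel spins" for any destination over the threshold.
    return {dst for dst, count in syn_counts.items() if count > threshold}


# Toy traffic: five SYNs to one host, three ordinary packets to another.
packets = (
    [Packet("10.0.0.1", "10.0.0.9", 80, True)] * 5
    + [Packet("10.0.0.2", "10.0.0.8", 443, False)] * 3
)
suspects = syn_flood_suspects(packets, threshold=3)
```

On a programmable switch, the same filter and count would be computed by the hardware as packets pass through, so only the small result would need to leave the switch.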

A number of recent research efforts demonstrate that using programmable switches enables network traffic monitoring systems that are orders of magnitude cheaper and use orders of magnitude less power than software-based alternatives.

Scheduling Telemetry Operations on Programmable Switches

Despite the promise of programmable switches, there are still a number of critical challenges that must be solved before such systems can be widely adopted. Due to physical constraints imposed by the large volume of traffic flowing through the pipe, programmable switches only allow specific kinds of programs with strict resource limitations. Moreover, the programs running on programmable switches need to be adjusted periodically while traffic flows in order to adapt to different monitoring tasks or to changes in the underlying flow of traffic. This ability to adapt programs while traffic flows is known as “runtime programmability,” and the lack of it is a key limitation of prior efforts to build network monitoring systems.

The DynATOS project addresses the need for runtime programmable monitoring systems head-on by treating monitoring systems as resource schedulers. Rather than simply inserting the monitoring programs into the pipe as in prior efforts, DynATOS carefully considers a number of different submitted monitoring programs and comes up with a strategic way (i.e., a schedule) to run these programs. In doing so, DynATOS also exposes the ability to trade off monitoring result accuracy and/or latency for reduced hardware resource requirements.
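The scheduling view can be illustrated with a toy example. The sketch below is not the DynATOS algorithm; it is a simple round-robin packing, chosen only to show how a fixed resource budget forces some monitoring programs to run in a fraction of time slots (here called epochs), which is one way accuracy can be traded for resources. The `Query` class, the `cost` units, and the `schedule` and `coverage` functions are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Query:
    name: str
    cost: int  # hypothetical switch resource units needed per epoch


def schedule(queries: List[Query], budget: int, epochs: int) -> List[List[str]]:
    """Pack queries into each epoch in rotating order without exceeding budget."""
    if not queries:
        return [[] for _ in range(epochs)]
    plan = []
    start = 0  # rotate the starting query so every query gets a turn
    for _ in range(epochs):
        used, chosen = 0, []
        for i in range(len(queries)):
            q = queries[(start + i) % len(queries)]
            if used + q.cost <= budget:
                chosen.append(q.name)
                used += q.cost
        plan.append(chosen)
        start = (start + 1) % len(queries)
    return plan


def coverage(plan: List[List[str]], name: str) -> float:
    """Fraction of epochs in which the named query actually ran."""
    return sum(name in epoch for epoch in plan) / len(plan)


queries = [Query("ddos", 3), Query("heavy_hitters", 3), Query("port_scan", 2)]
plan = schedule(queries, budget=5, epochs=3)
```

With a budget of 5 units, the three queries (8 units in total) cannot all run at once, so the expensive ones take turns. A query that runs in only some epochs sees only part of the traffic and therefore yields an approximate result, which is the kind of accuracy-for-resources trade-off the paragraph above describes.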

ONRG Breaks New Ground

The ONRG has several long-standing projects in networking and Internet measurement. However, the NSDI presentation of the DynATOS paper marks the first chapter of the ONRG’s research on data plane telemetry. NSDI is a top-tier conference that presents cutting-edge research in the design, implementation, and evaluation of networked and distributed systems, making it an ideal platform for the DynATOS project to reach a global audience. The ONRG is actively working to expand the impact of DynATOS through live deployments of the system on several representative networks and by developing advanced traffic monitoring methods based on the capabilities offered by this system. Work on DynATOS is supported by an active collaboration with the ONRG’s industry partner Broadcom, Inc., a major designer and vendor of programmable switch hardware chips.

About Chris

Chris came to the UO in 2018 with little formal training in computer science but with a passion for building things and a vested interest in the Internet. He has led the ONRG’s work on programmable switch hardware since 2019 under the guidance of patient faculty advisors, Ram and Reza, and with help from Master’s student Walt O’Connor and several undergraduate students. Despite the challenges of debugging hardware behaviors, Chris thoroughly enjoys working at the intersection of networking, statistics, and formal methods where the DynATOS project has taken shape. He is excited for several ongoing projects in the same area, in particular developing novel attack detection methods on top of DynATOS and exploring more expressive language models for network monitoring programs. During his time at UO, Chris has received a VPRI undergraduate research fellowship, two Ripple graduate fellowships, the Gurdeep Pall graduate student fellowship, and several other departmental awards.

See https://onrg.gitlab.io/projects/dynatos/ for more details about DynATOS.