Listen in on Jane Street’s Ron Minsky as he has conversations with engineers working on everything from clock synchronization to reliable multicast, build systems to reconfigurable hardware. Get a peek at how Jane Street approaches problems, and how those ideas relate to tech more broadly.
Nate Foster is a professor at EPFL in Switzerland in the Networked Systems Abstractions Lab, and a visiting researcher at Jane Street on the Networking team. In this episode, he and Ron consider what happens when you bring a software mindset to network engineering. Can you use programming language theory and formal methods to realize the dream of software-defined networks? Along the way, they discuss how hyperscalers have shaped networking hardware; the return (or not) of multicast; the ways ML workloads are reshaping the networking layer; and the success Jane Street has had using a foundational Internet protocol, BGP, together with a more declarative high-level specification language.
Welcome to Signals and Threads, in-depth conversations about every layer of the tech stack from Jane Street. I’m Ron Minsky. It’s my great pleasure to introduce Nate Foster. Nate Foster is both an old friend and an accomplished researcher at the intersection of programming languages and networking. He’s both a professor at EPFL and a visiting researcher here at Jane Street, where he spends a day a week mostly focusing on work with our networking team. So thanks for joining me.
Thanks. I’m a big fan of this podcast. It’s great to be here.
So to start out, I’d love to hear a little bit more about your origin story. How did you actually get into this whole computer science world in the first place and also into the place that you’ve gotten over time in this area of research that you think about?
Yeah. So I didn’t actually get into computer science really until partway through college. I started out as a physics major and I like to say I was a failed physics major. I was taking the sort of standard sequence. You take some mechanics, electricity and magnetism, you do quantum. And as I was going through that sequence, I found I was kind of liking physics a little bit less. And on the side, I was taking CS classes and I just loved them. I was kind of fully engaged. I was doing well. And I really liked how all the pieces kind of fit together. I could understand after a couple years how a program gets implemented down through a compiler, down through the ISA, through the chip, all the way down to gates or even lower. And that was very pleasing. And I also really liked the idea that unlike physics say, there’s a lot of creativity. A lot of these abstractions that we have in computing are really designed by people. So things like lambda calculus and functional programming, like who would think of that as a basis for computation? And yet it runs on a computer. I just found that really beautiful.
Yeah. It’s kind of an amazing thing about computer science. It’s not like a science in a traditional sense, or at least not totally like a science, in that you’re not studying the natural world. You’re studying human creations, and yet those human creations are very bounded by both all sorts of things about the physical world of what you can reasonably implement and bounded in important ways by mathematics. It’s something kind of amazing about the field.
So I can say a bit about how I got into research. That was a bit of an accident. I think many things in my career were sort of an accident. I had the idea that I might want to go to grad school, but I was maybe a little too shy to articulate that. And there was an ad when I was in my junior year, an ad for a summer research project working on a Java compiler. And I thought, well, I might want to get a master’s or maybe even a PhD. I’ll spend a summer doing research. And my mentor was a guy named Kim Bruce, who’s a programming languages researcher, and I spent the summer hacking on that Java compiler, working on type systems, and just fell in love.
So actually, what were you doing in Java? What were you trying to do with the type system?
Yeah, so this is way back. Java originally was sort of a big advance. It sort of brought things like garbage collection to the mainstream for the first time. I mean, garbage collection had existed, but in a language that was being used really widely, that was new.
And just to be clear, shockingly, 40 years after the invention of garbage collection.
That’s right. Garbage collection existed decades before.
It’s like the single biggest advance in programming language productivity and something like 40 years between the time of invention and the time of mainstream use, which I’ve always found a really shocking fact.
Yep. And Java also had this type system. It had a static, believed to be sound or mostly sound type system. And that was very exciting to the academic programming languages crowd because we love type systems. And yet, again, the mainstream languages at the time, C and C++ don’t really have sound static type systems. So the research we were actually doing was adding polymorphism to Java. Now Java has generics, but at the time it didn’t. And there was a process by which the community was allowed to sort of propose ways of bringing generics to Java. So my advisor, Kim Bruce, had such a proposal and he’s at a small teaching college, so he didn’t have grad students, so he found undergrads to work on implementing this type system.
So is the one that you worked on, the one that was in the end adopted?
No. So the one that was adopted was a very nice design by Phil Wadler and Martin Odersky called GJ, which was the name of the original paper. And that’s the one that eventually made it through the community process. But there were actually lots of teams working on different proposals. There was one out of MIT from Barbara Liskov and Andrew Myers, one out of Rice by Corky Cartwright, Kim Bruce had his, and they all varied in their notation and how expressive they were and different features.
Oh, that’s super cool. And I don’t think I’ve ever heard of another language process that quite worked like that.
Yeah. I don’t actually, I was too young to know the reason why Sun Microsystems, who was sort of controlling Java at the time, why they did it through this community process. But perhaps the feeling was type systems are kind of this thing that seems very simple, but actually can be quite subtle. It’s very easy to design a type system that has a soundness bug or has some maybe runtime cost. So they tried to leverage the smarts of the community to help them design it.
Okay. So that’s how you got your first taste of programming languages. What happened then?
Yep. So then after undergrad, I spent a couple years in England doing not computer science actually, but I knew I wanted to come back and do a PhD. And I ended up at Penn where I worked with Benjamin Pierce, who’s a friend of yours as well. And I thought that I would work on, I don’t know, more type systems or maybe something in semantics, the kind of really macho kind of big topics in programming languages. And when I got to Penn, Benjamin said, “I’m working on this data synchronizer. That might be a good project for you to start with.”
Unison.
Yeah. So Benjamin has this tool built in OCaml called Unison. It’s a file synchronizer. And at the time he was trying to make it generic so that you could synchronize data that was in different formats and the project was called Harmony. So I’ve never told this story publicly. I was actually a little disappointed because I thought I’m going to work on a really crazy type system or some new fancy denotational model. And instead I was working on a data synchronizer, which sounded like a kind of grubby systems problem. Yeah. But embedded in this project was this really beautiful programming languages problem. And Benjamin and his postdoc at the time, Alan Schmitt, had discovered that in order to synchronize data in different formats, you need to convert between those formats in two directions. So say you’re synchronizing maybe a version of your calendar in the standard iCal format and maybe another version at the time XML was all the rage, so in XML. So it has somehow the same information, but there’s some differences. And if you’re going to synchronize them, you need to basically convert from one into the other and vice versa. And so they’d been writing these conversions, writing sort of functions that go from version A to version B and from version B to version A, and very quickly realized these two functions that we’re writing separately, they’re like really close to each other. They’re almost inverses. And as programming languages researchers, they said, “Well, maybe we should have some kind of abstractions for writing these two mappings.” And that became what’s called lenses. So the idea was to have one abstraction from which you can derive both of these functions.
Right. And actually, you worked on this lens idea early on and in fact it’s had a whole big life afterwards. There’s all sorts of people in the Haskell community who have all sorts of variations on the lens idea and use it kind of all over the place and seem to be very excited about it.
Yeah. So lenses are one of these ideas that I think was sort of in the air. Benjamin and Alan were not the first to discover it. There was prior work in the database community. There was prior work actually in the functional programming community. Someone named Lambert Meertens had worked on a very similar idea, but their definition was particularly sort of clean and elegant, and so it kind of caught people’s attention. And yeah, I worked on it for my PhD. We worked on a variety of lenses, playing with what we could express, how complicated could we make the functions, using types to make sure that the lenses had good properties. For example, you generally want to know that if you’re making changes to one side and then you use a lens to push the changes to the other side, somehow those changes should be reflected accurately and you can try to characterize that using some kind of laws or some kind of specification. And then building this out and trying to figure out how this could turn into useful tools. So I spent six years working on lenses, had a lot of fun. And you mentioned that lenses have kind of caught fire. Other people, much smarter than us, took the idea and sort of generalized it. So in Haskell, there are versions of lenses that are not exactly our definition. They’ve sort of loosened some of the mathematical definitions, but in ways that allow them to have lenses for all kinds of different structures, and it’s really cool.
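To make the "laws" mentioned above a little more concrete, here is a minimal Python sketch, assuming the simplest asymmetric form of a lens: a `get` function from a source to a view and a `put` function that pushes an edited view back into the source. The two checks correspond to the round-trip laws usually called GetPut and PutGet. The `Lens` class, `title_lens`, and the calendar entry are hypothetical names made up for illustration; they are not taken from Harmony or from any Haskell lens library.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

S = TypeVar("S")  # source: the full structure
V = TypeVar("V")  # view: the part exposed to the other side


@dataclass(frozen=True)
class Lens(Generic[S, V]):
    """Hypothetical asymmetric lens: one value bundling both directions."""
    get: Callable[[S], V]       # source -> view
    put: Callable[[V, S], S]    # edited view + old source -> new source


def get_put_holds(lens, source) -> bool:
    """GetPut: putting back an unmodified view leaves the source unchanged."""
    return lens.put(lens.get(source), source) == source


def put_get_holds(lens, view, source) -> bool:
    """PutGet: a view pushed into the source can be read back exactly."""
    return lens.get(lens.put(view, source)) == view


# Illustrative lens that focuses on the "title" field of a calendar entry.
title_lens = Lens(
    get=lambda entry: entry["title"],
    put=lambda title, entry: {**entry, "title": title},
)

entry = {"title": "Standup", "start": "09:00", "room": "4A"}
updated = title_lens.put("Planning", entry)  # other fields are preserved
```

In this sketch the laws are only checked on sample values. The type-based approach described in the conversation aims to guarantee them for every input, which a spot check like this cannot do.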
Okay. But that was still you doing a little bit systemsy, but still like what feels like pretty straight ahead PL research, but that’s not what you do today. So how did that happen?
So I have this funny kind of right turn in my career. As I was finishing my PhD, I decided to apply for faculty jobs, got hired at Cornell where I spent 15 years on the faculty. And then I decided to take a gap year, really for personal reasons. My wife was finishing grad school. I wanted to stay near her. But also I think although programming languages and functional programming is my home, I kind of felt like I’ve thought about lenses for six years, I’ve kind of had all my good ideas, I want to have something else to work on. And so I decided to take a leap and go to Princeton where I worked with Dave Walker and Jen Rexford on problems in networking. And I described this as a leap of faith for me. It was really a leap of faith, I think, for both sides. Jen in particular kind of hired me almost blind and thought that it might be fun to have a project at the intersection of programming languages and networking and hired me as a postdoc with really no background in networking. And that was the start of my current career push.
So maybe this is a good moment to stop and just talk about what is actually your current research like? What’s the overall thrust of your research? What are the kind of core ideas that you’re trying to explore there?
Yeah. So I’ve spent really the last 15 years working at the intersection of programming languages and networking. And maybe before I answer my own goals, it’s worth giving a little bit of context. Starting around 15 years ago, there was a big change that happened in the networking community. It’s become known as software defined networking, and it was driven by lots of factors, some kind of economics, some based on changes in hardware, but the real technical changes were really twofold. Some of the big organizations like tech companies, cloud companies, they wanted to have more freedom to change how their networks work. And that was very difficult for them with how networks were built, say circa 2005 to
So just to make sure I’m getting this right. So in some sense, the situation ex ante was, there were a bunch of switch manufacturers. There was a kind of preexisting notion that was kind of built out of the early internet days of like, what does a switch do and what is IP and what are the protocols? And then there were actually programming languages of a kind. There were like configuration languages that you could use to choose the behavior of an individual switch. And then the kind of classic way of managing a big network is you figure out what the layout of this physically should be, how you want to wire things together. You buy a bunch of these devices, you wire them up, and then you hire a bunch of network engineers whose job is to kind of very carefully configure all of this stuff so that it doesn’t break the network and you get the properties out of it that you want. And then SDN in some sense is like, no, no, no, no, we’re going to like … I guess you still have to do a lot of the physical layer work, but then the configuration part is very different. Now, instead of configuring each individual piece, you try and have an overall program that tells you how the network as a whole works. That’s right. And then the advantages, I guess one advantage you’re saying is much more configurable so that you can get more behaviors out of the hardware than you could have gotten out of the stock thing that you would get from the vendor before. So there’s like a kind of faster cycle to be able to iterate on new things that you want to do. And then maybe another big thing is the ability to reason about the thing that you’re doing at a higher level. All the things you said about composition. The thing I hear from that is like, oh, I can actually predict how the thing is going to behave and understand based on a relatively small program, what the behavior of this sprawling network is going to be and make sure that properties that I care about are enforced.
Yep. And I think one thing that’s different though, we should understand that what counts as a program that you might want to run in a network is going to be different than the kinds of programs that you might run on a server. So there still are some things about hardware, for example, due to the speeds and the sort of scale that networks operate at, you’re not going to write an algorithm that sort of has a heap and is sort of allocating memory on every packet at every hop or something that’s going to be crazy. So you are in somewhat of a special domain. There’s also some differences just from the fact that networks are part of the infrastructure. So even in, say, a single organization, there’s usually a desire to know things about how different pieces of the network are isolated from each other. Or again, even in a single organization, you may have multiple units that are responsible for controlling the network. And so you still have, in some form, federated or distributed control. So this is what makes it not just let’s take Lambda calculus or let’s take Java and that’s our way of writing network algorithms. There’s still some interesting domain specific structure that needs to be explored.
Yeah. And in a lot of ways, this echoes to me the story around hardware synthesis, right? Where again, it’s like there’s a big graph structured computation thing and you need some kind of language for expressing it. And then there’s a lot of play there of like how sophisticated is the language and how much power does the language give you to kind of reason about the thing you’re doing and to flexibly compose bigger designs out of smaller designs. But again, there are these very profound limitations on what you can do in a hardware design. You’re not going to write the same kind of code at all. I guess it’s a question of like how much of traditional PL theory applies in this world because it’s like the constraints are very different.
Yeah. So maybe I can tell you about a couple of projects so we can work through them, but one thing that’s been very exciting and kind of cool for me is despite all these differences, there are some things from classical PL theory that match up quite well. So one of the languages that we discovered early on is a framework we call NetKAT, and this is a language that we’ve designed for describing not the whole network. So we’re not describing the behavior of the end host. We’re not describing TCP and congestion control. We’re not even describing what’s called the control plane, which is the sort of brains of the network that decides which routes you’re going to use and how to respond to failures. So we’re just describing really how the network processes packets at the forwarding level. And we’d been working on a sort of series of DSLs for several years and following on from my postdoc at Princeton, and we had, I would say, ideas about how to make these languages, what features we might need, how to do it in an elegant way. But there was this exciting moment when we realized that all of these languages lined up really well with a system called KAT, which is shorthand for Kleene algebra with tests, which had been around for a couple decades. In fact, Dexter Kozen at Cornell had discovered this framework and had worked on it for a long time. So KAT is really, it’s a pretty high level mathematical abstract framework that’s meant to be kind of a model of kind of standard imperative programming. We can talk more about it, but that actually lines up really well with at least how you can think of the forwarding behavior of packets through a network. 
And so the reason I say this is exciting is, although you start from a domain that seems to have all these weird primitives and weird constraints, you can sort of extract from it this thing that looks a lot like finite automata and state machines and all the things you learn about as a second year computer science student, and that alignment is quite cool to discover.
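As a rough illustration of how forwarding behavior can be read as a small imperative language, here is a Python sketch in the spirit of Kleene algebra with tests. It assumes a heavily simplified model: a packet is an immutable set of header fields, and a policy is a function from one packet to a set of packets. The names `pkt`, `filter_eq`, `mod`, `union`, `seq`, and `star`, and the two-switch example, are hypothetical and chosen for this sketch; this is not NetKAT's actual syntax or implementation, and it omits features such as packet histories.

```python
from typing import Callable, FrozenSet, Tuple

Packet = FrozenSet[Tuple[str, int]]            # immutable set of (field, value)
Policy = Callable[[Packet], FrozenSet[Packet]]


def pkt(**fields: int) -> Packet:
    """Build a packet from keyword arguments, e.g. pkt(sw=1, pt=1)."""
    return frozenset(fields.items())


def filter_eq(field: str, value: int) -> Policy:
    """Test: keep the packet if field == value, otherwise drop it."""
    return lambda p: frozenset({p}) if (field, value) in p else frozenset()


def mod(field: str, value: int) -> Policy:
    """Assignment: overwrite one header field."""
    def run(p: Packet) -> FrozenSet[Packet]:
        rest = {(f, v) for (f, v) in p if f != field}
        return frozenset({frozenset(rest | {(field, value)})})
    return run


def union(a: Policy, b: Policy) -> Policy:
    """Choice: the outputs of both policies."""
    return lambda p: a(p) | b(p)


def seq(a: Policy, b: Policy) -> Policy:
    """Sequencing: feed every output of a into b."""
    return lambda p: frozenset(q for mid in a(p) for q in b(mid))


def star(a: Policy) -> Policy:
    """Iteration: apply a zero or more times until nothing new appears."""
    def run(p: Packet) -> FrozenSet[Packet]:
        seen = frozenset({p})
        frontier = seen
        while frontier:
            nxt = frozenset(q for x in frontier for q in a(x)) - seen
            seen |= nxt
            frontier = nxt
        return seen
    return run


# Hypothetical two-switch network: switch 1 sends dst=2 out of port 2,
# and a link carries packets from (sw=1, pt=2) to (sw=2, pt=1).
forward = seq(filter_eq("sw", 1), seq(filter_eq("dst", 2), mod("pt", 2)))
link = seq(filter_eq("sw", 1),
           seq(filter_eq("pt", 2), seq(mod("sw", 2), mod("pt", 1))))
network = seq(forward, link)

start = pkt(sw=1, pt=1, dst=2)
delivered = network(start)           # one hop through the network
reachable = star(network)(start)     # every packet state reachable from start
```

The operators here are the familiar ones from regular expressions (choice, sequencing, iteration) plus tests, which is the connection to finite automata mentioned above.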
Yeah. I sort of think of this as the one weird trick of programming language theory, which is that there’s this complicated and messy and very human process of writing programs. And then it turns out a lot of the best ideas in programming languages come from relating the thing you’re doing to very simple mathematical models. And there’s something nice about languages that have this tight relationship. Not all of them do. There are languages that are kind of much messier mathematically, but the ones that have a kind of tighter and simpler mathematical foundation, at least my sense is, tend to be better at generalizing, like the features that you add to solve one problem, turn out to solve lots of different problems and compose nicely with other ideas. Having this foundation gives you this kind of nice playground with the different ideas that you can come up with, end up integrating better with each other because they kind of all fit into this relatively simple worldview.
Yeah, I completely agree. And actually, in our academic paper on NetKAT, that’s exactly what we said in the introduction. We sort of said, the value of this framework is not so much that we can do something we couldn’t do before, but we’ve now aligned ourselves with this theory that is actually backed by work going all the way back to Kleene in the 1950s. And that gives us both some confidence that what we’re doing is maybe right or at least sensible. It also gives us a whole bunch of constructions and tools that we can pull from formal language theory. And in our work, we’ve actually used a bunch of those tools to build compilers, to build verification tools, but especially it’s what you said. So as we’ve extended NetKAT with new features, now we’re kind of back into being confused researchers, just playing with examples and trying to make things work, but having this structure of Kleene algebra with tests has provided guidance to us. So for example, we’ve worked on a probabilistic version of this language, which is very useful in networking because you have unexpected things. You have traffic and you don’t know how much traffic there’s going to be, you have failures, you don’t know when the failures are going to happen, and you may also have randomized algorithms that are used to load balance across the network. And so reasoning about the behavior of all these things requires reasoning about probabilities. So we extended NetKAT with probabilities, worked out the semantics, and that was very not obvious and a bit subtle. And I think if we didn’t have the structure of KAT and other associated theories, we would’ve very easily ended up with a language that was kind of incoherent.
We’ve been talking about a mix of some about what software defined networks is and kind of in the broader sense, and then also about your research. How would you explain the particular take that you and the other people that you work with have on this field?
Yeah. I mean, I think the main slogan is the network should just be thought of as another program. And that’s a short sentence that maybe sounds very trivial, but to networking people, it’s a really different way of thinking about things. For many years, innovation in networks has been driven either by things that happen at the hardware level. A vendor comes along with a new router that has a hardware pipeline that has some extra feature, and then that turns into something new that you do in load balancing or congestion control or something or queuing or something else, or driven by standards bodies. For a long time, if you want to do something new, you have to have a good problem, have a solution, and then go convince a vendor to implement it, go convince a bunch of other users of networks that we should standardize this, get it ratified by the standards bodies. It’s a very long, slow process. And so thinking of networks just as programs is like, well, just as we don’t go ask permission from Intel or NVIDIA when we want to deploy a new algorithm on hardware, we just write a new program, we should have the same freedom with networks.
How much of this is basically gated on the sort of existence of the hyperscalers, of huge companies like Google and Amazon and stuff that have enormous networks that they can configure this way? In some sense, it’s an issue around the domain of administrative control. If I need to primarily build a thing that interacts with things I don’t have any control over, then I can sort of see how I’m gated by the standards bodies because I have to get everyone to agree to share the language with which we communicate. Whereas if I have my own enormous network that I’m going to configure, I can just think of the whole thing as a program and then I maybe have to think about the standards bodies on the edges, but within the network, I get to do that.
Yeah, I think that’s right. I mean, if you, again, go back and tell the intellectual history of software defined networking, it very much did emerge when these really large private networks became a thing and the companies that were building them wanted to have the freedom to basically define new features at software timescales. And it’s true, the thing I was just sort of throwing shade at of vendors and standards bodies, well, that’s what built the internet. And the internet worked because you took tens of thousands of autonomous systems, tens of thousands of networks built by different organizations, different people on different hardware in the, originally, like even all the way down to the physical layer, they were using different ways of moving bits around and you connect it all and make it all work and interoperate. So the internet was designed for really connecting up all these different networks and making it work worldwide. And that’s why we ended up with certain solutions. But in these really large networks with hundreds of thousands or even millions of computers and a comparable number of switches and routers, you may also want to have the ability to define those networks and to optimize them and to have them implement certain features. And so that’s a big part of the story, more economic than technical, but I think it’s an important change that you do have sort of … Again, even in a large organization, you do have often multiple teams or multiple units that are involved in this, but still there’s sort of one ultimate unit of control that gets to define the goal.
Do you think the idea of the network as a program generalizes to the open internet?
This is one of the big challenges that the community’s been thinking about for decades. So one of the kind of paradoxes of the internet is it’s so successful that you can’t change it. And this is something that the community’s been very worried about going back quite a number of years, at least to the ’90s. So there’ve been a bunch of efforts to think about, well, the internet works really well at today’s scale, for today’s applications, but there are things that come along that we’d like it to do and how are you going to change these tens of thousands of ASes? How would you decide to move to a different routing protocol for the internet or a different way of moving packets around? You can’t turn it off and turn it back on tomorrow with a big flag day. And so there was this sense, and you go back and read papers from 20 years ago, people would talk about ossification, the idea that the internet structure and its kind of skeleton were kind of setting in and it was impossible to change. So there’s a whole other community that’s been thinking about how we could design an internet that is extensible and evolvable. And that’s a sort of very rich, cool space. And there are people with cool ideas. Some of them involve programming languages, but a lot of them also involve different architectures, different ways of getting extensibility.
And to what degree do you think this idea of your network as a program has caught on, has been influential, has kind of changed how people build networks in practice, both among … There’s like an academic audience for this, but there’s also lots of practitioners and companies and also all of these hardware vendors. How is this idea propagated over time?
Yeah. It’s funny, if you track ideas or these trends, I mean, my view is this idea has become just the way things are done. In fact, I can back this up with a little bit of evidence. A few years ago, some of my collaborators, including Jen Rexford and Nick McKeown and some others, we wrote a paper basically arguing that the network as a program was here and it was a sort of vision paper for a short conference and it got rejected and we were a little bit miffed. I mean, we get papers rejected all the time, but we’re really proud of this paper. But the reviews actually said, “You’re describing the way the world works. Your ideas aren’t spicy enough. This is how things work.” So I think to some extent, the network as program or software defined networks just is how things work. Now, for folks who are familiar with the sort of original articulation of software defined networking, of course, there are some ideas that 15 years ago people were saying, “Well, we should build centralized algorithms or logically centralized algorithms.” And that idea has not so much caught on in practice. Or another example is there was a big push, I played a small part in it in making network routers sort of truly programmable. Almost every piece of their functionality could be specified in a program. And again, for mostly economic reasons, that idea has not caught on. But at the same time, the sort of ability to change the, say, hardware pipelines of these routers is coming, just not in the way that it was originally articulated. So again, I’m biased, but I think it really is the way the networks are going and the path towards this vision of you can write code and get the rich behaviors you want. It’s not exactly smooth in the way that was predicted at all times, but the general trend is in that direction.
So the SDN idea is interesting to me because I feel like it falls along another theme that’s a little different from the way you framed it, where you’re kind of talking about this basic, being able to specify things in cleaner ways, getting better abstractions out of it. And I think that’s all part of it. But another thing that I think has been very valuable in this kind of work is just adopting the culture of software and the kinds of tools of software. There’s all these domains in computer science that just have picked up different approaches and techniques for building things. And if you look at the way in which people think about management of databases or doing hardware synthesis or networks or building traditional software, the techniques are actually all really different. And there’s a bunch of really good ideas that have come up in software that I think aren’t as clearly expressed in the other domains and things around things like the way that you do like version control and code review and testing and things like that. And in the old world of networking where you just go and configure the switches to do the thing that you want, you kind of don’t have this centralized place where you can do all of these pieces. And so in some ways, separate from the nice semantic improvements, which I think are super important, this thing of allowing the kinds of tools that people use and the kind of engineering approaches that people use in software to apply that to domains like networking seems like another advantage of this whole thing, which I guess is maybe hard to summarize nicely in an academic paper, but I feel like in practice is a big part of where the advantage comes from.
Yeah, I completely agree. And the other, so if we’re talking about trends that got a bit of maybe hype and a lot of attention and maybe the hype wasn’t quite deserved or things didn’t quite play out, this area … Well, you’re talking about something more broad than just formal reasoning. I think you’re just talking about adopting modern software practices: keeping things in databases or repositories, having not … Humans don’t just log into a router and YOLO a change. They run it through a process. Maybe there’s even some checking. Absolutely. And you can actually, I mean, you can go back. People were thinking about this even at the ISPs in the ’90s. Large ISPs were sort of the equivalent of the hyperscalers back then. They operated big networks that were complicated, expected to work. And they had started to experiment with some of these ideas of having at least sort of centralized specification of the functionality at companies like AT&T and then how to realize that. But the other piece is verification. And this I think is also, it’s maybe not quite as far along, but it’s something that’s becoming quite commonplace. All the hyperscalers are doing it. There’s also some startups. There’s, of course, many academics who are interested in this idea. And here it’s that, well, if you have some representation, maybe it’s not a beautiful representation in NetKAT, but you at least have some program that describes how the network is supposed to behave or how it is configured. You could start to apply all the tools of software engineering, testing tools, validation tools, even verification tools. And this could then become part of your workflow. So before someone decides to push down a change that might change the routes between two data centers, you could check, is this going to break connectivity anywhere else in my network? And that’s a good idea.
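The kind of pre-push check described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any production verifier works: the network is a snapshot of static next-hop tables, failures and the control plane are ignored, and the names `reaches`, `broken_pairs`, and the `dc1`/`core`/`dc2` topology are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical model: tables[node][destination] = next-hop node.
Tables = Dict[str, Dict[str, str]]


def reaches(tables: Tables, src: str, dst: str) -> bool:
    """Follow next hops from src; fail on a missing route or a loop."""
    node, visited = src, set()
    while node != dst:
        if node in visited:
            return False            # forwarding loop
        visited.add(node)
        next_hop = tables.get(node, {}).get(dst)
        if next_hop is None:
            return False            # no route: traffic is dropped
        node = next_hop
    return True


def broken_pairs(before: Tables, after: Tables,
                 endpoints: Sequence[str]) -> List[Tuple[str, str]]:
    """Endpoint pairs that were reachable before the change but not after."""
    return [
        (s, d)
        for s in endpoints for d in endpoints
        if s != d and reaches(before, s, d) and not reaches(after, s, d)
    ]


# Two data centers connected through one core router (illustrative only).
before = {
    "dc1": {"dc2": "core"},
    "core": {"dc1": "dc1", "dc2": "dc2"},
    "dc2": {"dc1": "core"},
}
# Proposed change: the core's route toward dc2 is pointed back at dc1.
after = {**before, "core": {**before["core"], "dc2": "dc1"}}

regressions = broken_pairs(before, after, ["dc1", "dc2"])
```

A workflow built on this idea would refuse to push `after` while `regressions` is non-empty. Real tools have to cope with far larger state spaces and with the distributed behavior discussed in the conversation, which this snapshot model leaves out.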
Yeah. And I guess you actually hear lots of stories of large companies managing large networks where config changes break things and cause huge outages. It’s actually like one of the biggest problems you run into is people having a config change that unexpectedly has some semantic behavior that they didn’t expect.
Yeah. I mean, I think to be clear, this area is really not done. And it’s in part because although this … I’ll take network verification. It sort of took this one layer that became exposed. So the idea that there’s a centralized either database or program that is defining the behavior of the network, and then that gets pushed down to the routers who then realize it, that gives you a place where you can sort of interpose and you can intercept snapshots of the network and start to test them or reason about them. And so that’s what people mostly have done, but networks are much more than that. And they’re distributed systems. So they have all the complexity of distributed systems where you can have failures that you didn’t expect and interactions that weren’t part of your model. And so we’re definitely not done. I mean, you still see outages due to human error or flaws in the model all the time. And I think this will get better, but it is really hard to reason about these complex systems that have components you didn’t even really think about or know about. Multiple control loops, funny interactions. It’s a true puzzle that requires some new ideas to make progress on.
Do you have a good example of a kind of problem in this space that is now pretty well solved, like a thing that people would pretty routinely get wrong in the past, and that now there are at least in some places, good verification checks to help people not make those mistakes?
Yes. I’m going to twist the question slightly and not say that people got it wrong, but that there was also sort of conservatism. People were afraid to make changes because they weren’t sure what the impact of those changes would be.
This is, by the way, just a huge part of the network engineering story, as I’ve experienced it. I think part of the job of a good network engineer is to tell you no. It’s like, that’s too complicated. We’re not going to do that.
Yeah. So I think, I mean, one example that’s, I think it’s no longer research, it’s sort of been fully reduced to practice is reasoning about these snapshots of the so-called forwarding plane of a network. So of course, networks have changes happening all the time. There’s failures, there’s different controllers that are maybe monitoring the system and making changes, but you can pretend that there’s a snapshot and you can … A consistent snapshot maybe that you can extract and then reason about. And that snapshot can be modeled using tools like model checkers or SAT solvers, these automated theorem provers that understand first order logic, or custom tools like NetKAT is such a tool you could put it into NetKAT and then ask questions about the model. So this is something that is pretty widely done and mostly just works. I mean, it does what it’s supposed to do. And so it does catch certain errors in the sense that if you could write a specification like these two hosts should always be connected no matter what routes are being used and I want these two hosts to be able to send traffic to each other, or these two hosts should be isolated. And if something goes wrong and somehow there’s a path between them, that’s bad. So these kinds of properties you can check. And if your control plane, okay, the brains of the network makes a change that would violate that property, you get some kind of signal or exception. And I think that’s useful in, say, cloud companies, but what it doesn’t solve, of course, is what do you do? So if you’re getting a failure of a property because the control plane has done something bad, then what? So it’s not like we’ve sort of made networks perfectly reliable or perfectly able to satisfy their specifications.
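To make the snapshot idea concrete, here is a minimal sketch in Python. It assumes a toy model in which each node's forwarding table maps a destination host to a next hop; the names `delivers` and `check` and the three-node topology are hypothetical, and a real tool would hand the snapshot to a model checker, a SAT solver, or NetKAT rather than walk paths one at a time.

```python
from typing import Dict, List, Tuple

# Assumed toy model: node -> (destination host -> next hop).
Snapshot = Dict[str, Dict[str, str]]

def delivers(snapshot: Snapshot, src: str, dst: str) -> bool:
    """Return True if a packet sent from src arrives at dst in this snapshot."""
    node, seen = src, set()
    while node != dst:
        if node in seen:              # forwarding loop: never delivered
            return False
        seen.add(node)
        next_hop = snapshot.get(node, {}).get(dst)
        if next_hop is None:          # no matching route: packet is dropped
            return False
        node = next_hop
    return True

def check(snapshot: Snapshot,
          must_connect: List[Tuple[str, str]],
          must_isolate: List[Tuple[str, str]]) -> List[str]:
    """Return the violated properties; an empty list means the snapshot passes."""
    violations = []
    for a, b in must_connect:
        if not delivers(snapshot, a, b):
            violations.append(f"{a} cannot reach {b}")
    for a, b in must_isolate:
        if delivers(snapshot, a, b):
            violations.append(f"{a} can reach {b}")
    return violations

# Hypothetical snapshot: h1 reaches h2 through s1 and s2; s1 has no route to h3.
current = {
    "h1": {"h2": "s1", "h3": "s1"},
    "s1": {"h2": "s2"},
    "s2": {"h2": "h2"},
}
# A proposed change that adds a route from s1 straight to h3.
proposed = {**current, "s1": {"h2": "s2", "h3": "h3"}}

spec = dict(must_connect=[("h1", "h2")], must_isolate=[("h1", "h3")])
print(check(current, **spec))
print(check(proposed, **spec))
```

The first call reports no violations, while the second flags that h1 can now reach h3. What to do with that signal is a separate question that the check itself does not answer.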
Well, although if a controller wants to do something and you know it’s bad, can’t you just, I don’t know, in a software context be like, “Oh, we won’t merge that PR.” If you can catch it at the time where the change is proposed before it’s accepted, then there’s something you can do about it, which is just, again, like the network engineer, you can say no.
Yes. Although there are cases where that may be sort of the wrong move. If it has some other effect like you made this change because of a failure, do you keep the failure unsolved because you decided to reject this change that violated your spec? This is where things get a little murky.
Oh, interesting. So like...
There’s a control system sitting on top and merely saying, “I reject your change.” If the control system is not going to then do something better, you may not have actually improved life.
Got it. No, that makes sense. So a lot of the things that you’re talking about seem like they involve a pretty rich connection between a bunch of ideas that you can develop in an academic context and then a bunch of industrial use cases. Some of these are at places like the hyperscalers, some of them at the actual network switch vendors. And I know that you spent some of your career with various kinds of engagements with the kind of industrial side. Can you say a little bit more about how that worked and how you’ve integrated that into your career and research and approach to thinking about this space?
Yeah. So this is something I actually love about the networking research community. And I say the research community and not the academic community because it truly involves the hyperscalers and the switch vendors. Somehow, the particular community that identifies as researchers in networking is not just driven by universities and PhD students, it involves all these different entities. And so it has a really nice mix of, you have people doing pure theory, but you also have people who have designed the wide area backbone for a giant cloud company and they’re all coming together to share ideas. So that’s really cool as a researcher because you have this relatively small group of people who all know each other and they’re working on related ideas and you therefore can have quick transitions of research ideas getting into practice. I think my sort of original home community of programming languages also has this, but as we already said, the timescales are often much longer. We talked about garbage collection from the 1950s till the 1990s, or type systems, similarly many decades. And actually, I mean, Jane Street’s a very prominent company in the functional programming space. So people understand that functional programming is being used industrially, but Jane Street’s a little, not unique, but there’s not as much of a conversation between sort of mainstream programming languages as used by the millions of developers in the world and the academic community. So it’s something I really love about the networking research community.
Although maybe there’s like more now, like Rust is another example of a language where there’s been a lot of very rich connection between academic and industrial.
Yeah, yeah. I do feel like it’s been changing and yeah, similar kind of short timescales to cool academic idea appearing in some mainstream language and then being broadly used. There’s another piece of this that’s kind of interesting, which is when these ideas like software defined networking first came out, a lot of the companies decided deliberately to sort of engage a broader ecosystem, a broader community. Quite famously, Google was sort of very interested in ideas like OpenFlow, which was the early SDN sort of standard, but chose to do it in open source for strategic reasons. But that sort of created an opportunity to build a community around these things.
And were you involved in thinking through and helping set any of those standards?
No, I was not involved in OpenFlow at all. That was already pretty baked by the time I did my postdoc with Jen and Dave at Princeton. I did get involved in this second phase of trying to design languages and associated hardware for describing the behavior of individual routers, switches, network interface cards. We worked on a language called P4, and that was a similar sort of community effort.
Right. And OpenFlow was more like you get to set the routing table, like a little bit more general than that, but it was like there are tables that you get to configure there. And then P4 was more like you kind of get to write the whole switch.
Yeah. OpenFlow people like to bash on because it was kind of cartoony. I think its designers did not intend it to be a cartoon, but there’s sort of a big gap between OpenFlow’s model of how a router works, which is basically there’s one big lookup table and you’re going to cram all your logic into this one big lookup table. And the reality of high speed routers and switches, which have pretty complex pipelines with specialized units that do certain things. So I think the original hope was that somehow OpenFlow would be realized by smart compiler teams, people who would sort of lower it down to these pipelines. That’s actually a pretty hard task. And so P4 had the benefit of being sort of a second mover. It was sort of a second attempt and it just exposes the structure of the pipeline. So you do get to customize what happens, but there are certain things that can be just exposed in the language or in the programs itself.
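One way to see why cramming all the logic into one big lookup table is awkward is to count entries. The sketch below is an illustration under assumed, made-up tables (`routing` and `acl` are hypothetical, and neither OpenFlow nor P4 syntax is used): two small stages looked up in sequence need the sum of their entries, while a single table keyed on every field combination needs the product.

```python
from itertools import product

# Stage 1 (hypothetical): destination prefix -> output port.
routing = {"10.0.0.0/24": 1, "10.0.1.0/24": 2, "10.0.2.0/24": 3}
# Stage 2 (hypothetical): traffic class -> allowed or not.
acl = {"gold": True, "silver": True, "guest": False}

def pipeline(dst, cls):
    """Two small lookups in sequence, as in a multi-stage pipeline."""
    port = routing.get(dst)
    if port is None or not acl.get(cls, False):
        return None  # drop
    return port

# The same behavior flattened into one table keyed on (dst, cls).
single_table = {
    (dst, cls): (port if allowed else None)
    for (dst, port), (cls, allowed) in product(routing.items(), acl.items())
}

def flat(dst, cls):
    """One lookup in the flattened table; a miss means drop."""
    return single_table.get((dst, cls))

print(len(routing) + len(acl), "entries across the two pipeline stages")
print(len(single_table), "entries in the single flattened table")
```

With three entries per stage the gap is only 6 versus 9, but the flattened table grows multiplicatively with each additional stage, which hints at why lowering a one-table model onto a real pipeline is a hard compilation problem.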
Right. And the work with P4, this actually involves some pretty deep engagement on the industry side for you as well, right?
Yeah. So I chose during my first sabbatical at Cornell to go be a part of Barefoot Networks, the company that was developing one of these programmable switches, and then also the P4 community. This was a choice because I think I felt some, maybe, I don’t know if imposter syndrome is the right word, but I felt very much like I was sort of the programming languages academic who was sort of going and cosplaying in networking. And I wanted to understand at a deep level, how does a router really work? And so going to a hardware company seemed like a great way to do that.
And then I’m a little curious both how did it feel making that transition? I feel like going to be a CS academic is like a choice, right? You can go and do computer science in academic context, you can go and do it in industrial context. And I’m kind of curious why you in the end made that move and what you felt like you got out of it and how it affected your thinking and research after that.
Yeah. I mean, I can tell you the kind of personal history. There’s, again, mentors who sort of provided advice. I was very unsure. My colleague, Fred Schneider, who you know well- My advisor. Your advisor. Yeah. So I remember talking to Fred, should I do this? And he was like, “Absolutely. If you have a chance to go deepen your knowledge and expertise in a space, that’s going to pay dividends down the road.” I had other mentors, George Varghese, who’s at UCLA now, he was sort of like, “Go do this.” And then Nick McKeown, who was the co-founder of the company and one of the SDN pioneers, he really sort of opened the door to me to come be at the company. So it did not feel like a big risk. I confess, of course, really not knowing anything about hardware except for my VLSI class from sophomore year, and not that I became a hardware designer, but the hardware is just amazing. And you have just these people who are the best at what they do at designing circuits, at optimizing them, at physical layout, at integrating all the pieces from different vendors, going to the fab and getting it manufactured. I mean, it’s just amazing what is involved in making a chip. And the startup had some real veterans and people who really knew what they were doing. It was some alumni of Texas Instruments and then others from around the Bay Area. And I learned a lot. It was really fun to be with those kinds of experts and learn how hardware works and how it’s built.
Do you have any concrete examples of ways in which your research after was different than your research before for having done the experience?
I think, well, there’s a line of work that came out of the sabbatical that I think I wouldn’t have done. I can’t take credit for it. My colleague at Barefoot, Changhoon Kim, he’s now at Google and does lots of AI infrastructure for them, had this idea that if we have routers that are programmable, like fully, I should say on the router chip that Barefoot was designing, literally almost the entire behavior end to end you could specify in the program. There were a few things that were fixed, but pretty much you sort of receive bits and then you can write a program that parses those bits into some data structures and then you can write some code that interacts with different memories and you can change the bits and you can run them through different hash functions and other functions, and then you can spit them out the other end, the whole thing you could really specify. And so he sort of realized this is just another kind of processor. It’s a processor that looks a lot different than a CPU, but it has a little bit of memory, a little bit of state, and it’s very high throughput. And if you’re thinking about a data center, what would [… elided 2613 characters …] his thing that … I remember when I was eons ago in grad school and learned about networking, multicast was this important thing that was going to be the way that we delivered video to people and that turned out not to happen. And in fact, I think there was a basic confusion wasn’t clear till later, which is that it turns out multicast, which is a mechanism for essentially broadcasting data, which involves laying out trees that you can use for the automatic transmission and take advantage of the network switch’s capability of copying data in parallel down multiple paths concurrently. And it seems like a great way to get the same data to lots of different people. 
But I think the thing that wasn’t clear at the time was that the data plane was going to be super cheap and the control plane was going to be really expensive, meaning you would have a huge amount of bandwidth for sending data around and actually very little space for the control data with which you would lay down the trees. And if lots of people want to consume lots of different data in these different logically separate multicasts, then you just weren’t going to have space to kind of specify all that. But in our world, there’s a relatively small number of channels, a small number of things we want to get to everyone. And so the old multicast idea kind of works in this context, even though it kind of totally failed in the outside world. And now it’s just like in the cloud, there’s just like no multicast. In fact, sometimes cloud vendors will give you things that look like multicast, but they’re implemented really, really badly and slowly. And it turns out those are there for if you have some ancient application that you want to run and it thinks it wants multicast, then you can give it that interface and we will do a thing that kind of delivers the right packets at totally the wrong timescales, but you can get some legacy software to work that otherwise wouldn’t work at all.
Yeah. Well, I want to talk about multicast some more, but I think it’s also interesting, you sort of mentioned a way that your grad school version of networking and certain distributed systems kind of was wrong. And I’ve described a few cases where I worked on things that kind of didn’t end up being as successful as we hoped. And I think that’s actually really healthy. I mean, certainly university researchers should be working on things that don’t work out. And I don’t mean don’t work out like you couldn’t solve the problem, you couldn’t prove the theorem, you couldn’t build the system or whatever, but don’t end up being the way the world works. 
I mean, that’s part of being in a creative, innovative community is like, people are trying wild things and not all of them are going to be the right thing to do or the good thing to do at a particular juncture. And I think when communities get too conservative and you’re supposed to do things the sort of orthodox way, that’s actually a recipe for stagnation. And that’s a little bit of the ossification that happened in the internet community was like, there were these principles, which are good principles. The end-to-end principle’s a good principle, but you should know when to break it. And breaking it may not be the right thing for every scenario, but if we never are allowed to go revisit that rule, somehow the world just got a little smaller. So I love communities that in some unruly way are like advancing and making progress towards greater and greater things, but along the perimeter, there’s just all kinds of chaos and people doing crazy things. I know you watched this video, we just put on a conference called Nines, which is a new conference for the networking community, devoted to this idea of like, let’s explore new ideas and we’re going to be, not going to be wrong, but we’re going to be very accepting of new ideas. That’s going to be the main criterion we use to evaluate papers. And to accompany some great papers that got published, we also invited some sort of luminaries in the field to lend credibility to our effort and also to tell stories about their experience with new ideas. And there’s a video from Scott Shenker, What I got wrong about QoS. So Scott is like, for those who don’t know Scott, he’s one of the giants in computer networking. He’s another, well, Scott’s not the failed physicist. He actually is a successful physicist, but who switched to networking. But he was involved in bringing networks that have support for quality of service in the late ’90s and put his name behind it, wrote lots of papers. 
And he gave this very thoughtful piece about why things didn’t work out and what was wrong, but I mean also what was right. But I just love that piece. I think academics should, A, be doing wild things and then not be so shy about reflecting on, hey, this thing didn’t end up catching fire or being the way the world works, but it was still interesting to explore.
Great. Yeah. I think we’ve had our own kind of evolution within Jane Street where I think early on in some sense we only did things that worked. It’s a small company and there was lots of low hanging fruit and lots of opportunities and we kind of started out with a working business and there were lots of small things you could do to make that business better and almost always make things better in relatively short timeframes. And we still do a lot of that. There’s a lot of plucking of low hanging fruit, but as the organization has grown, we’ve had more opportunities to try bigger things and to do projects that sometimes take years to bear fruit. And there’s something lovely about that as well. And I think academia can and should and does take an even more extreme version of that where you can take more bets that … Some of these bets might not work out for 10 or 20 years, and that’s okay. And I think it’s an exciting way of moving forward the bounds of knowledge and to be able to do that in a way that isn’t as constrained by the needs to get the next practical thing up and running. So you’ve had this kind of industry experience at Barefoot. Now you’re here at Jane Street with a different kind of industry experience. I was sort of involved in the story of how you got here. We’ve known each other for a long time. I think I first met you when you were, I think, a PhD student at Penn in Benjamin Pierce’s office talking about the lens work. So we’ve known each other for a long time, but I’m kind of curious just from your perspective, you spent a long time knowing vaguely of Jane Street as this weird trading firm that uses functional programming languages. And I’m kind of curious how that process felt to you of what you thought about Jane Street in the past and what in the end led you to think, “Ah, actually maybe coming and spending some time here and doing work here would be an interesting thing to do.”
Yeah, I think we had been talking, I mean, Jane Street, for those that don’t know, is sort of quite prominent in the academic, functional programming community, which I consider myself a part of. So Jane Street sometimes publishes papers at ICFP, which is the main conference in functional programming, sends people to the conference. I was sort of aware of, here’s this company that is using a functional language for much of its work, happens in my favorite functional language, and also really takes on hard technical problems and is willing to make these longer term investments in systems and tools. And so I was sort of aware of that. And I think we’d been talking, I’d even had some meetings with some folks in the networking team about some of the efforts that were being made to, I would say, bring SDN-ish ideas to Jane Street’s network. I wasn’t around, but my understanding is Jane Street’s network has emerged from being a smaller human managed network to something that’s, well, maybe not quite as big as the hyperscalers, but getting big enough that you want to use those same ideas of top-down specification, having some tools to understand what’s going on, being able to really make changes with confidence. So to me, the chance to work on those kinds of problems at a place that has, and we should talk more about the firm’s culture, but we talked about ideas that fail. Jane Street has a very open culture. There’s a Wiki page, you can just do things. And I think that really appealed to me as well, the idea that we’re not going to be just doing things a conventional way. We’re going to think about cool, maybe new ways to solve these problems and then get smart people to work on it together. 
The other thing that, I don’t know if I’ve told you, but it was definitely in the back of my mind is, although the research community has this lovely sort of tight embrace with the hardware vendors and the cloud companies and now the ML companies, I was very excited about the idea that, well, maybe financial networks are different, right? There’s certain things like multicast like latency and latency at a different timescale than what the cloud companies care about. And whenever you just take a problem and you tweak it a little bit, you add some new assumption or different constraint, often that leads you to a very different kind of solution. And so I thought it might be fun to understand some of the unique problems that finance and Jane Street has, and then to be at a place where there’s going to be the smart people and resources to go solve some of those problems.
Yeah. And for all that academia is a place where you can work on all sorts of wild ideas, it’s also the case that in lots of contexts, academic work gets kind of caught up by the industry thing of the moment. And I think in networking, the hyperscalers have that shape of it’s a legitimately big and important problem and you see that networking papers kind of overwhelmingly want to think about that. There’s like an older fun example of this in the garbage collection world where there’s like a stretch of years where all the garbage collection papers are about Java garbage collection. And Java was like a very particular kind of language with a very particular approach to garbage collection that skewed the way things were done. One interesting aspect of this is the way in which you tune a garbage collector, like you always have to tune a garbage collector, trading off between space and time: how much time are you going to spend collecting versus how big you’re going to let the heap grow. And then the traditional way of doing this in Java collectors is a kind of roofline model where you’re just going to be like, “How much memory can I use? Well, how much memory do you have?” And you sort of say, it’s like you have this big hulking enterprise application and it will run on a box and it’ll be able to use every bit of RAM on that box. And then when it gets close to using it up, then it’ll have to work harder to collect memory. And that’s not the only, and in many contexts not really the best, way of tuning it. In fact, OCaml has a very different way of tuning its garbage collector, which operates in percentage terms and has a whole different set of heuristics in mind. But for a long time, all the papers were structured around this roofline model and then eventually it breaks out of that. And so anyway, I sort of am open to this idea that it’s often useful to break away from the standard thing that everybody is doing. Yeah. 
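The difference between the two tuning styles can be sketched as two trigger rules. This is a deliberately simplified illustration, not how either runtime actually schedules its collector: the function names, the 90% threshold, and the numbers are all assumptions. The proportional rule is loosely modeled on the idea behind the `space_overhead` field of OCaml's `Gc.control`, which is expressed as a percentage of live data.

```python
def ceiling_policy(heap_used_mb, max_heap_mb, threshold=0.9):
    """Roofline style: work harder only once the heap nears a fixed ceiling."""
    return heap_used_mb >= threshold * max_heap_mb

def proportional_policy(heap_used_mb, live_mb, overhead_percent=100):
    """Percentage style: collect once garbage exceeds a fraction of live data."""
    garbage_mb = heap_used_mb - live_mb
    return garbage_mb * 100 > live_mb * overhead_percent

# Hypothetical program: 100 MB of live data, 250 MB of heap in use,
# running on a box where the heap is allowed to grow to 16 GB.
print(ceiling_policy(250, max_heap_mb=16_000))   # far from the roof
print(proportional_policy(250, live_mb=100))     # garbage already exceeds live data
```

Under the ceiling rule the program keeps growing until it approaches the size of the machine, while the proportional rule bounds wasted space relative to what the program actually uses, independent of how much RAM the box has.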
So now you’ve been here for a while and worked on some interesting problems here. What are examples of problems that you have seen that come up in the Jane Street context that actually do look different from the problems you see in the outside?
And one example, we wrote a short paper on this just getting at this question of multicast. We’ve already talked about a little bit. So I don’t want to summarize the paper, but the basic story is support for multicast hasn’t gone away, but it’s really been sort of diminishing at the hardware level. And although I think most, if not all trading firms, of course, use multicast because that’s what the exchanges are giving us in terms of data. The trend has been that commodity routers have gotten, they have more features, they have a lot more bandwidth, but they’re getting a little bit slower. So the latency is kind of creeping up as you make more complex pipelines that can do more processing of every packet. And then relative to the growth in things like bandwidth, support for multicast has sort of been flat or even getting a little worse. And so we wrote this paper that was just asking, this is not yet, I wouldn’t say it’s like a looming problem, but if you sort of follow the trends out for some years, it could become an issue. And then in the second part of the paper, we sort of asked, well, what are some really different designs that we could think of? Are there different kinds of fabrics that we could build, maybe things based on optical networks or circuit switching, and there’s interesting trade-offs. You can actually build a fabric that sort of delivers lots of traffic simultaneously to lots of places, but then you have sort of a proliferation of traffic everywhere and you have to filter it. So you sort of have this trade-off between easy, cheap delivery with certain kinds of networks versus the ability to inspect, classify, and then split and drop. And so that’s a paper that I think sort of uniquely could be written in this context.
Do you think there are lessons to be learned from places that are in more of the hyperscaler mode from the things that you see in more trading style networks? I’ve looked over time at the kind of designs that people have built for doing all sorts of standard web style problems of … An example that I remember talking to some of the engineers there about is the way in which Twitter does distribution and analysis and transformation of the sequence of tweets, which is like now on the modern scale, a pretty small data problem, but at some point it was a bigger one. And I remember looking at that and thinking multicast would be really useful here. And I wonder to what degree the kind of magic powers of multicast are kind of underappreciated and underused in other kinds of contexts, and that maybe they should pick them up more than they actually do. Another thing that shows up a lot in trading contexts and showed up a lot in my own PhD is state machine replication, which is a kind of core idea for building distributed systems and like shows up a ton everywhere and like cloud providers are also building things with this. But multicast is like a super nice primitive for building efficient state machine replication systems. And it’s not one that seems to show up a lot in practice. And that’s like another just concrete example where I suspect it could be more useful. I think it’s … Multicast gets used for lots of things in a trading context. There’s using it in a very kind of local environment in the middle of building a certain kind of more or less super computer. You can have like lots of systems that are hooked up to each other with multicast and it’s a way of like giving you an efficient bus for just distributing messages to everyone. And then it can also be used as a way of connecting data across different organizations. And that’s the exchange side of this, right? 
They deliver multicast as a way of efficiently and fairly getting their data out to all the many people who are consuming it. And then that same multicast tree kind of extends into the network of the consumer. And maybe that latter one is like very trading specific, the kind of cross-institution version of it. But the inside of the company version of it or the inside of a system, it feels to me dramatically underused.
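As a rough illustration of why multicast pairs well with state machine replication, here is a minimal sketch assuming a sequencer-based design, which is one common approach and not a description of any particular production system. A sequencer stamps each command with a sequence number, the stamped stream is delivered to every replica (simulated here with a shuffled arrival order per replica), and each replica applies commands strictly in sequence order. The `Replica` class and the command format are hypothetical, and the sketch leaves out loss recovery, which a real system needs because multicast delivery is unreliable.

```python
import random

def apply(state, cmd):
    """A tiny deterministic state machine; add and mul do not commute."""
    op, n = cmd
    return state + n if op == "add" else state * n

class Replica:
    def __init__(self):
        self.state = 0
        self.next_seq = 0     # next sequence number we are allowed to apply
        self.pending = {}     # commands that arrived early, keyed by sequence number

    def receive(self, seq, cmd):
        """Buffer the command, then apply every command that is now in order."""
        self.pending[seq] = cmd
        while self.next_seq in self.pending:
            self.state = apply(self.state, self.pending.pop(self.next_seq))
            self.next_seq += 1

commands = [("add", 2), ("mul", 5), ("add", 1)]
stamped = list(enumerate(commands))   # the sequencer's job: assign 0, 1, 2, ...

replicas = [Replica() for _ in range(3)]
rng = random.Random(0)
for replica in replicas:
    arrival = stamped[:]              # every replica sees the same stream...
    rng.shuffle(arrival)              # ...possibly in a different arrival order
    for seq, cmd in arrival:
        replica.receive(seq, cmd)

print([r.state for r in replicas])    # prints [11, 11, 11]
```

Because every replica applies the same commands in the same order, they all reach the same state regardless of arrival order, and the sender pays for one transmission rather than one per replica.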
Again, I don’t have your distributed systems instinct, so I can’t quite spar with you on that. But one thing that I’ve been pondering recently is, and this is inspired by a paper by Nick McKeown and his student, Sundar, that they published just last year. Most networking infrastructure is still based on the good old packet switching model. And there’s reasons that we moved to packet switching in the ’60s. It gives us efficiencies. We don’t have to schedule things. We don’t have to understand reserve capacities and so on. It’s a very simple building block that has really nice properties. And you might wonder, what does this have to do with multicast? Well, the challenge with packet-switched routers is that building support for multicast is actually pretty complicated. So doing that with low latency and building in the heart of a router, a unit that can move packets along any combination of input and output ports at speed, maybe even doing some queuing at that same time, that’s pretty complicated.
I guess, and it’s complicated because if all you had was a single multicast tree to distribute along, then packets would come in and you’d copy them out to multiple outputs and everything would be cool, but you actually have many different things happening concurrently. So you both want the parallelism of being able to emit out of multiple wires at the same time, but also you have to tolerate all this dynamism. And there’s like some fundamental tension there of like, you can’t at the physical layer do things completely in parallel if some of the resources you’re trying to address are busy doing something else.
And this part of a router is sort of the middle part that has to run the fastest. It’s what determines the rest of the performance of your whole router. So that’s generally complicated. So in this paper that Nick and Sundar wrote, they were looking at machine learning workloads, in particular training workloads, which are often very regular. And so why are we doing packet switching at all? Why don’t we just understand what’s going to happen when, and then take those schedules, this data here is going to be delivered according to this permutation and this one on this permutation. And then you can build a much simpler switch that just understands how to implement these permutations on a schedule. And Nick has deep hardware understanding. So the paper explains why this would lead to simpler, cheaper, faster switches. But for that use case, it seems like the right way to do things if you were to be able to boil the ocean and build all the infrastructure from scratch. So to me, the sort of intellectually interesting question is, we have certain kinds of networks that can do these tasks that are otherwise expensive or slow or hard. We know how to build networks that can do those tasks very well, but we’re sort of afraid to build them because there’s so many benefits of packet switching that we can just sort of spray these packets into the network and whatever resources we have will be used efficiently by the herd. But maybe in the future, we’ll start to think about going back to some hybrids where we do a little bit of both.
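The idea of a switch that only implements permutations on a schedule can be sketched in a few lines. This is an illustrative toy, not the design from the paper: the function `run_schedule` and the rotating schedule are assumptions chosen because a rotation gives a simple all-to-all exchange.

```python
def run_schedule(schedule, inputs_per_slot):
    """Apply one fixed permutation per time slot.

    schedule[t][i] is the output port that input port i is wired to in slot t.
    There are no lookups and no queues: the switch just follows the schedule.
    """
    outputs = []
    for perm, inputs in zip(schedule, inputs_per_slot):
        if sorted(perm) != list(range(len(perm))):
            raise ValueError("each slot must be a permutation of the ports")
        out = [None] * len(perm)
        for in_port, data in enumerate(inputs):
            out[perm[in_port]] = data
        outputs.append(out)
    return outputs

# Hypothetical 4-port all-to-all exchange: in slot t, input i sends to
# output (i + t + 1) % 4, so after 3 slots every port has heard from every other.
n = 4
schedule = [[(i + t + 1) % n for i in range(n)] for t in range(n - 1)]
inputs_per_slot = [[f"chunk-from-{i}" for i in range(n)] for _ in range(n - 1)]

for t, out in enumerate(run_schedule(schedule, inputs_per_slot)):
    print(f"slot {t}: {out}")
```

Because each slot is a permutation, no two inputs ever contend for the same output, which is the property that lets the hardware skip the arbitration and buffering a packet switch needs. The price is that the traffic pattern has to be known ahead of time.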
Yeah. And in some sense, we are kind of boiling the ocean or maybe making several new oceans or something because the whole ML world is creating this enormous revolution in networking and you have both much higher demands in terms of throughput and latency just because of the … In fact, in part because of this very regular process, right? A lot of the synchronization in machine learning training is this kind of barrier synchronization where you have a bunch of hardware in parallel doing a thing, that hardware is actually very deterministic. And so it finishes pretty much at the same time across multiple, and then they need to exchange their tensors really quickly. And all that time where they’re exchanging tensors, I mean, you can do some overlapping, but sometimes you can’t do overlapping. And any uncovered communication is just time where these very expensive GPUs are just idle and it’s just wasted money. And so there’s a huge amount of pressure on these networks and people are also increasing the heterogeneity of the networks because now you have the networks on the inside of the … I guess NVIDIA uses, maybe everyone uses this kind of somewhat odd terminology of scale up versus scale out, where scale up is like the really fast little network and then scale out is the network beyond that. So that could be a context where you have the freedom to go and try very different things.
Yeah. It’s actually, I mean, in networking, it’s a pretty exciting time because people are playing with maybe not ideas quite as radical as building just a scheduled crossbar and making that be the building block. But there is a lot of innovation in transport protocols, collectives, co-optimizing the low level CUDA code and the communication code. So things feel very suddenly like, oh, we can sort of play with all these pieces of the design. And then because training and serving AI models is sort of the central problem of the day for systems, you kind of get immediate feedback and when something works, people get very excited.
So another thing you’ve been working on while you’re here is BGP. Maybe you could say a few words about what BGP is and then talk about the problems that we’ve run into that you’re working on making better.
Yeah. So this is maybe one of the areas where Jane Street was sort of, I think, living in the past. We have a by now big, and it’s been growing a lot, worldwide network that connects all of our sites. And although we have a lot of tooling and analysis, we were still expressing what we wanted the wide area network to do in terms of configurations for individual BGP routers. And that very much feels like sort of the dark ages. So maybe I’ll quickly explain what BGP is. BGP is what was originally designed as the routing protocol for the internet. So you have the internet with all these tens of thousands of autonomous systems. Every organization is its own system, gets to decide how it routes traffic to other autonomous systems, and you need some protocol that these so-called AS’s can use to agree on how traffic flows.
What’s an AS?
Autonomous system.
Oh, okay.
And the way BGP works is essentially every AS knows who its neighbors are, and it selectively shares information about certain paths it knows to reach certain destinations. So for example, our routers that connect to our peers on the internet might say, “Well, hey, we’re Jane Street. If you want to reach any Jane Street IP address, come to us.” And then those routers will send to their neighbors a similar advertisement: “If you want to reach Jane Street, I can reach them in one hop.” And you can also share other characteristics about the path. So this is sort of the basics of BGP. It’s a so-called path vector protocol. It’s disseminating information through the internet about paths that reach certain destinations. And what makes it very rich is there are many so-called attributes that you can add to these advertisements. So you can decorate an advertisement, not just with, I know how to reach Jane Street, but I know how to reach Jane Street with this cost. You can add sort of tags, you can add a whole bunch of information. And now when a router receives this advertisement, it can sort of compare maybe a whole bunch of advertisements it has all for reaching Jane Street on different paths, and it can then make a selection and decide which one it thinks is best. So it lets every node kind of make a local choice and express its own preferences, but it also kind of quickly disseminates information about all the paths through the internet.
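To make the advertise-then-select loop concrete, here is a minimal sketch under heavy simplifying assumptions. It models only two attributes (local preference and AS path), uses documentation-only AS numbers and a documentation prefix, and ranks routes by highest local preference and then shortest AS path. Real BGP has a much longer decision process and many more attributes. `Advertisement`, `best_route`, and `readvertise` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Advertisement:
    prefix: str                 # destination, e.g. "192.0.2.0/24"
    as_path: Tuple[int, ...]    # ASes the route passes through, nearest first
    local_pref: int = 100       # local policy knob; higher is preferred
    tags: Tuple[str, ...] = ()  # stand-in for community-style tags


def best_route(ads: Iterable[Advertisement],
               my_asn: int) -> Optional[Advertisement]:
    """Pick one advertisement using a simplified two-step ranking."""
    # Discard any path that already contains us (loop prevention).
    usable = [a for a in ads if my_asn not in a.as_path]
    if not usable:
        return None
    # Highest local_pref first, then shortest AS path.
    return min(usable, key=lambda a: (-a.local_pref, len(a.as_path)))


def readvertise(ad: Advertisement, my_asn: int) -> Advertisement:
    """What we would tell a neighbor: same prefix, our ASN prepended."""
    # local_pref is a local choice, so it is reset rather than passed on.
    return Advertisement(ad.prefix, (my_asn,) + ad.as_path, tags=ad.tags)
```

The selection is purely local: each router sees only what its neighbors chose to share, ranks it by its own policy, and passes its choice along with its own ASN prepended.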
And in some sense, it’s sort of like the opposite of what we were describing in this kind of … I step back and write a big program that lays out a mostly static graph. This is like, instead I have a rich distributed system of individual nodes sharing information and then making local decisions about how to route data. Although hopefully somewhere in there, there’s like some reason to think that those local decisions actually lead to good global outcomes.
That’s right. What I’ve described is basically how BGP works on the internet and it’s actually, it was not known for a long time why BGP seems to work so well. You would think that a bunch of nodes that are making independent decisions —
That would seem like a very important thing to know.
Well, in fact, the internet routes are fairly stable. So things sort of converge to … I mean, the internet’s always in motion, of course, but if we could pretend that we could stop the world, the internet sort of converges to the paths that are sort of at least a local optimum pretty well. And there’s a really nice paper by Jen Rexford, my postdoc advisor, and Lixin Gao that explains why this is the case. And it turns out that the internet has a kind of structure that comes from the economic relationships. You have ISPs and customers, and because of the kinds of BGP choices that different players in this ecosystem tend to make, it turns out there’s sort of latent properties that cause BGP to behave particularly well. These are the so-called Gao-Rexford conditions. This is like ancient stuff, but it’s kind of cool that this unruly distributed system actually works pretty well for these reasons.
Why did this present problems for us?
So one of the things I haven’t said is that BGP is often also used inside of organizations. And it’s a little confusing because it was, again, originally designed for the internet where the nodes that are participating are an entire organization like Jane Street or an entire university like NYU. But of course inside of Jane Street, there are also many thousands of routers and they need to understand how to reach certain destinations, both internal and external. And so there are other protocols that have been used in the past, but for many decades now, it’s been really common to use BGP also internally to share knowledge about what paths exist.
Is part of the reason for that, basically the dynamism that you need, like if nothing else, links can fail and you need to be able to recover from link failure.
Yeah. I think it’s a really expressive protocol. It’s got all these ways that you can cram in information about different routes and make choices and selectively disseminate information. It’s widely supported by vendors. All the network engineers know it because it’s been this way for a long time. So why not? It’s a good tool for disseminating information about the network topology and its paths.
Right. I guess in an alternate universe where you’re totally down the SDN route, you could imagine that you could just look at your overall network and just decide what you want to lay out and then you don’t have to think about the communication part of it, but then there’s no story there for dynamism.
One of the sort of maybe flaws of the original SDN conception is, although you might want to think about your specifications or your program as being truly one program, I have one objective for my network and I’m going to check that into a repository and have people review it and argue about it and test it. But then in the way you realize that, there are good reasons to have distributed protocols. They detect and respond to changes very quickly. They don’t involve lots of coordination. And so if you can map your high level objectives into a distributed implementation, there are good reasons in a large system like one that spans the whole world to do that.
So maybe you want to compile down to something, but maybe you don’t want to compile down to a static graph.
Right. So the system I’ve worked on here is a system called Butane, and really we’re trying to do exactly this. We’re trying to have, well, we have a sort of higher level and in our case, it actually is centralized. It’s like checked into our repositories and if you want to make a change, you go propose a change and it gets reviewed just like our other software. But then it gets compiled into snippets of BGP, one for every router in the network, and then the behavior of the whole thing is somehow the distributed behavior of all of these routers exchanging BGP messages with each other, and then ultimately arriving at some graph that forwards packets through the network.
So I guess the top level of Butane is some kind of specification of what you want the behavior to be, and then that compiles down to the actual concrete configs that land everywhere.
That’s right.
So what can you say in that top level spec?
So the top level spec, I guess one thing that we care a lot about is latency. So we, to a first approximation, would like certain kinds of traffic to absolutely take the fastest paths. And then there’s other traffic that we just want to get to its destination somehow. And we may not actually care that much if it loops around the world to get there, if it takes twice as long or three times as long as it has to, as long as it gets there, that’s okay. And so Butane’s policy abstractions are really designed so that the default case just sort of happens. You don’t have to specify very much. And then where you want to say, no, this traffic should take a fast path, you can do that. And then there’s some other pieces that are kind of a little bit inside baseball, but we internally have certain structures to the network. And so the policy abstractions sort of expose some of those structures, things like where the sites are. We have certain expectations about how traffic may flow or not flow between certain sites. There’s a whole set of ways that we sort of classify and differentiate traffic. So there’s some features for doing that as well.
And then what is the technically hard part of this story? I could write a program that writes a bunch of BGP configs, but I don’t have much of a lock between what I wanted to happen and then what’s the dynamic behavior of it. Is that like the central problem here?
So yeah, let me not answer the question. I’m not going to talk about the hard part. I want to tell you, why has Butane been valuable or what have we found is sort of the most important parts of Butane? And to a first approximation, it’s a little bit something we talked about before. It’s just bringing a sort of software mindset to thinking about the wide area network. So instead of operators having some change they want to do, network engineers might want to move this traffic over to this other path and then go make a bunch of changes expressed at the BGP config level to several routers. Now you modify a little bit of Butane config, the compiler generates the actual BGP code, and then it gets validated. We have some tools for visualizing and validating the changes, and then it gets pushed to the network. So there’s actually not, I would say, giant technical challenges. This is all sort of fairly well understood stuff, but for Jane Street, moving us from a world where we’re making changes to individual routers at the BGP config level to being able to work with these other abstractions has been pretty exciting. And I’ll say, I was actually not sure what the network engineers would think, but so far people seem to really like it. The ability to take these bigger steps has been very exciting. And then the other piece, which maybe surprised me a little bit is all of the tooling that we’ve built, especially tooling for testing and visualizing, that’s what they’re actually so excited about, the idea that I can make a change and then we have this UI where you can see what’s the expected change in latency say between sites. And this is something that maybe they could have worked out on paper or could have pushed and then tested, but now we have a model, again, based on some historical measurements and we have a semantics for both Butane and BGP, so we can sort of compute what’s the change going to be. 
And we can do this both for sort of like propose one change and then see what might happen. And we’re in the midst of making this more powerful. So you can do sort of what if kinds of things like-
Or like, what if we lose a link?
Well, what if we lose n links or I’d really like this hotspot to disappear. Proposed to me a set of changes you could make that would move traffic off here while minimizing other changes. So that’s a different kind of, I don’t know, edit or UI that you might like to have where you’re not specifying a different Butane policy abstraction, you’re really kind of expressing constraints on what you’d like the system to get to.
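As a toy version of the "what if we lose n links" question, the sketch below compares site-to-site latency before and after removing a set of links. It assumes traffic simply follows the lowest-latency path (plain Dijkstra), which is a stand-in for, not a model of, the BGP and Butane semantics described in the conversation. The site names and latencies in the usage note are made up, and `shortest_latency` and `what_if` are hypothetical names.

```python
import heapq
import math
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str]


def shortest_latency(links: Dict[Link, float], src: str, dst: str,
                     failed: Iterable[Link] = ()) -> float:
    """Lowest total latency from src to dst, ignoring failed links."""
    down = {frozenset(link) for link in failed}
    adj: Dict[str, List[Tuple[str, float]]] = {}
    for (a, b), ms in links.items():
        if frozenset((a, b)) in down:
            continue
        adj.setdefault(a, []).append((b, ms))  # links are bidirectional
        adj.setdefault(b, []).append((a, ms))
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf  # unreachable


def what_if(links: Dict[Link, float], pairs: Iterable[Link],
            failed: Iterable[Link]) -> Dict[Link, Tuple[float, float]]:
    """Map each (src, dst) pair to (latency before, latency after)."""
    failed = list(failed)
    return {(s, d): (shortest_latency(links, s, d),
                     shortest_latency(links, s, d, failed))
            for s, d in pairs}
```

For example, with `links = {("A", "B"): 10.0, ("B", "C"): 20.0, ("A", "C"): 50.0}`, calling `what_if(links, [("A", "C")], [("B", "C")])` reports the A to C latency with and without the B-C link. The inverse problem, proposing changes that meet a constraint, is the part described as still being research.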
So do you essentially need a kind of solver which does just like exploration of the space of other configs?
The second thing I described is still research.
Got it.
Yes. So we’re actively working on that, but yeah, it’s going to look very much like ideas from program synthesis or some kind of solver where you can take these constraints and then explore the space of programs that might meet them. This is something that, again, maybe to a programming languages person doesn’t sound all that wild, but to do this to the network infrastructure I think is pretty wild. And what keeps it safe is that we do have this, we have a semantics for both Butane and BGP. It’s extensively tested both kind of mathematically and against the actual hardware. You have to make sure that the way that BGP is realized by the vendors doesn’t somehow deviate from the internet standards. And so we have quite a lot of confidence that our model of BGP is good, and therefore we can do these analyses and give answers to engineers before the market.
Right. And at the end of the day, there’s a kind of formal methods piece where it is trying to do something that gives a provable up to the fact that you don’t know if the underlying model is quite right because the vendors might be doing something differently, but gives you a kind of provable guarantee of the fact it is analytically telling you about the network. Is that right?
Yeah. I think this use of formal methods actually may become a lot more commonplace. So I think formal methods, people often think, I build a tool to verify something to stop bugs. This is going to be the seatbelt of my complex system. But here, of course, we care about stopping bugs and preventing us from making mistakes, but the real power is now we can start to explore. So we can start to take bigger steps and even automate some of the exploration of those steps. And that would be unthinkable if you have to reason about the impact of all these changes network-wide and let alone on latency, on congestion. And so having models that are backed by some kind of mechanical implementation that you believe at least closely corresponds to what’s going to happen, that’s what empowers these tools that take bigger steps.
And rather than it giving you a single notion of correctness of like, I’ve written a spec and does it follow the spec, it’s almost like a kind of observability of like, I get to explore different possible setups and think about how they perform and what their behaviors are so that I can, with more confidence, make trade-offs between different design decisions and figure out how I want to structure the network.
So one surprising use of this that has just happened in the last couple of weeks is some of the folks in the team are starting to use it for not actually routing, but capacity planning. So deciding what future links should we buy. And this is something where there is actually a well-understood mathematical theory. You can model all the demands for bandwidth and the current network and then figure out how to augment it subject to what fiber is available. But to do that connected to our current Butane policies and our historical workloads and latencies is kind of cool. So to put it into not just how do we route today’s traffic, but how do we figure out how to expand the network so that we can do a better job of routing tomorrow’s traffic?
And I guess that’s just all on the back of essentially having a more complete model of the network that includes these kind of dynamic behaviors.
Yeah. I’ll maybe say one more thing. You asked about what was hard about Butane, and we benefited from a lot of academic work. People like Zach Tatlock at UDub had written down a formal semantics for BGP in the Rocq Proof Assistant, and we didn’t use their implementation, but their paper did a really nice job of spelling out, here’s how BGP configurations should be understood. There was also work on designing higher level abstractions for BGP, and some of these were really different, sort of the analog of like OCaml instead of X86. And we actually chose not to adopt those. So our Butane policy abstractions are fairly simple and kind of like OCaml, they compile fairly straightforwardly into BGP configs for different vendors. And we made that choice early and I was, if I’m honest, a little bit disappointed, it would be fun to do something kind of more wild to have something like OCaml as an analogy. But I think it was the right choice to actually pick something where the abstraction abstracts, but in a way that you can, when needed, sort of peel back the abstraction and understand how it might map to all the components of a BGP config. And so there are hard problems we could have solved, like how do you compile some very expressive policy language onto a bunch of distributed router configs? We did not solve that problem or we didn’t solve it in the fanciest way we could have. Maybe in the future, we will start to explore richer policy abstractions, but in this context, at least, I very much now agree it was the right choice to sort of pick something relatively simple and then iterate based on that.
Yeah. There’s something powerful about having an abstraction that isn’t just simple in terms of semantics, but is simple in terms of how it elaborates, like how it goes from the high level thing that you want to the actual thing you’re running. If you’re thinking about people who are trying to in close detail engineer how a system behaves in all sorts of different aspects, just giving them that kind of vision onto the behavior. And I think the point about OCaml, it’s sort of both. It’s both like a fancy, wild, high level language, and also it has a relatively straight ahead compilation story where it’s relatively easy to understand from looking at the OCaml code, how that code is going to execute. And so that’s, I mean, there are trade-offs here, obviously. I think more optimization is good, but also more straight ahead compilation makes it easier to think about what’s happening. How did the process go of taking these ideas that came in part from you and in part from the team and turning this into a thing we could actually roll out into the network and use?
So it’s kind of amazing it’s worked this way. Maybe I could say a bit about how my engagement as a visiting researcher has gone. So I’ve been spending about a day a week here for a few years. I’ve spent some periods of time where I’ve spent more intensely a whole week or several days in a row. And so a lot of the design of the system was done by sitting side by side with some network engineers, some folks on our team that builds networking tools and trying to understand our problems, trying to come up with some solutions and then prototyping them and then seeking feedback from others on the team and continuing. So for me, this has been really fun because in my day job as a professor, I’m teaching and doing research with a team of students. Really the goal is to train students. Here I get to work with really great engineers and that’s a great privilege. Not that my students aren’t also great, but it’s fun to work with sort of really smart software engineers who can solve problems really quickly. For me, it’s always a little bit sad when my day or week ends, we’ll have had a team meeting and maybe I’ll have done a little bit of development and synced with the team. And then I know that by next Thursday when I teleport back in, amazing things will have happened. But that’s very much from my experience. So I get the privilege of being sort of a small player in this team and then I get to work with some really fantastic engineers.
Has the process of actually putting it into production been relatively straightforward or complicated? How’s that played out?
So I think I have had less of a role to play here than in the design of the system. Again, I’ve had the privilege of working with a program manager and some other people from the more operational side that have really helped roll it out. And there’s all kinds of problems that have had to be solved. We had Butane running, its current set of features were more or less done. And then we did a bunch of testing and then we did more testing in the lab and then we started to roll it out. And the first rollout was on a new piece of infrastructure that Jane Street had stood up somewhere else in the country but wasn’t yet using. So we had sort of a living lab that we could roll it out and sort of kick the tires and see how it worked. And then rolling it out more broadly across the firm was yet another step. And this involves doing firmware upgrades to the whole fleet of routers, making sure everything’s on a good version. That’s very hard to do. Building all the tooling for automation so that these things can go into the standard workflows that we have for making changes to the infrastructure. So again, I did almost nothing here, but there’s an amazing set of people and processes. And maybe to say one thing that’s kind of not just … I mean, every large company has these kinds of processes, but one thing that’s really neat about Jane Street is a lot of these tools are really homegrown. And so in some cases, we’ve gotten to work with the team that’s actually building the deployment tool. And if there’s something that we need that the tool doesn’t have, they can build it for us.
Yeah, it’s kind of fun. For both good and ill, like Jane Street has been on its own like 25 year long software adventure that has been pretty different from other places. And some of that had to do with a language choice and some of it has to do with idiosyncrasies of the kind of business that we’re in. But like Jane Street software ecosystem is kind of weird. And I think there’s a lot of great things about that. I think there are some pain points about that, but you certainly get a lot of control all the way through the stack, which is really cool. Yeah. So part of the point of the whole visiting researcher program is that it’s a way for us to kind of build good relationships with researchers. And I think part of the value proposition for the researchers is that connection to the kind of work we’re doing internally is a kind of useful way for them to kind of develop ideas and learn about the world in a way that can influence their research beyond our walls. I’m curious to what degree you feel like that has played out and to what degree like you have learned things that affect how you think about research outside of the stuff you do here.
I’ll confess that part of why I’ve had so much fun spending a day a week in industry is I actually like maybe playing software engineer. So I really enjoy … I love my job as a professor. It’s the best job in the world. There are a lot of interruptions. Even if you’re very careful with your time, you end up spending a lot of time on teaching and service. And working with students is a joy, but it takes a lot of time. So it’s really wonderful to have a day where my calendar’s blocked off. I have meetings with folks here, but Jane Street’s pretty efficient with meetings. My day doesn’t get filled up with one-on-one Zoom calls or anything. And I actually get to sit in front of a terminal and write some code. I have to be very modest about how much I can do in a day, but for me, that gives me a lot of joy. In terms of what I’ve brought back, some of the technical ideas that we used in Butane are things that I’m now working on in my lab. So I didn’t talk too much about this, but under the hood, the semantics for BGP and Butane that we built originally was based on just sort of a simple simulator, a little operational model, but there’s a more powerful mathematical model that’s been studied for some time in the networking community based on a more algebraic approach. So that’s become a topic that I’m working on with my students in my group.
Can I say, what’s the upside of the algebraic approach? How would it be better than this kind of more operational models?
So I think the upside of the algebraic approach is the original vision was to sort of give you the building blocks for building policy DSLs. There’s a paper by Tim Griffin called MetaRouting, and that expresses this idea. So the assumption is choices that you make about how to route traffic through a network are very specific to a particular organization. Every organization has their own policies about how they want traffic to flow, how they want to share information with their neighbors and so on. So you can’t really come up with a one size fits all solution, and even what information you choose to share with your neighbors varies: you might be sharing information about latency or bandwidth or trust. And so Tim anticipated that BGP might be the assembly language of many policy languages that might all look really different. And so how could we design some general building blocks for designing these policy languages? And so the algebraic approach is that you sort of abstract what’s happening in BGP. There’s basically information being exchanged with your peers, and then there’s choices being made about the information you receive from your peers. You make a selection. And you can model those abstractly in terms of some kind of what he calls a rooting algebra because he’s British, but routing algebra. And once you do this, you could sort of write down, here’s what a routing algebra is. Here’s what happens if you take a given routing algebra and you sort of run it in a graph, here’s the set of paths you’ll get. But also, here are constructions you can do on routing algebras. So I could take two routing algebras, maybe one that talks about latency and one that talks about bandwidth, and I could run them together. I could sort of glue them together.
And now I get a routing algebra that shares information about both latency and bandwidth, and it makes choices in some deterministic way about whether it prefers low latency or more bandwidth, but I can glue them together, glue their preference function together, and now I have a more interesting routing protocol. So this sort of becomes a factory for building DSLs. And then the really interesting part is-
So basically more composability in the space of writing policies?
That’s right.
Okay.
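One way to picture the gluing described above is a lexicographic product, sketched here under simplifying assumptions. A "routing algebra" is reduced to two functions: how a link label extends a route's weight, and how weights are ranked. The `RoutingAlgebra` class and the helper names are hypothetical, and this is far looser than the formal definitions in the metarouting literature.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Sequence


@dataclass(frozen=True)
class RoutingAlgebra:
    extend: Callable[[Any, Any], Any]  # (link label, route weight) -> weight
    rank: Callable[[Any], Any]         # smaller rank means more preferred


# Latency adds up along a path; lower is better.
latency = RoutingAlgebra(extend=lambda link, w: link + w,
                         rank=lambda w: w)
# Bandwidth is limited by the narrowest link; higher is better.
bandwidth = RoutingAlgebra(extend=lambda link, w: min(link, w),
                           rank=lambda w: -w)


def lex_product(a: RoutingAlgebra, b: RoutingAlgebra) -> RoutingAlgebra:
    """Glue two algebras: weights are pairs, compared by a first, then b."""
    return RoutingAlgebra(
        extend=lambda link, w: (a.extend(link[0], w[0]),
                                b.extend(link[1], w[1])),
        rank=lambda w: (a.rank(w[0]), b.rank(w[1])),
    )


def path_weight(alg: RoutingAlgebra, links: Sequence[Any], origin: Any) -> Any:
    """Weight of a path, built hop by hop from the destination outward."""
    w = origin
    for link in reversed(links):
        w = alg.extend(link, w)
    return w


def best(alg: RoutingAlgebra, weights: Iterable[Any]) -> Any:
    """The selection step: pick the most preferred weight."""
    return min(weights, key=alg.rank)
```

Swapping the argument order, `lex_product(bandwidth, latency)`, gives a different protocol that prefers wide paths and uses latency only to break ties, which is the composability being described.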
And then you can also study, it’s not the case that every instance of BGP or every routing algebra is going to converge to a unique solution. Sometimes, so BGP in general can have this property that you end up oscillating between multiple solutions. A prefers these paths, so you use those paths for a while, but then B’s unhappy. So then you switch to the other paths and then A’s unhappy. So that’s bad. You would like that not to be the case. And you can study generally what conditions on my algebra do I have to have to ensure convergence.
Got it.
And so anyways, these are kind of old ideas, but we started to play with these in my lab. And I think the grander vision is we’d like to take the sort of original vision of SDN that you can write a sort of top level program, but we’d like to have the distributed implementation based on BGP, which is widely supported and such. And so this could be sort of the IR of that kind of system.
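The oscillation described above can be reproduced in a tiny simulation, sketched here under strong assumptions. Two nodes, A and B, both reach a destination `d`, and each prefers the path through the other over its own direct path. All nodes update simultaneously in rounds, which is an assumption. Real BGP is asynchronous, and with different timing this particular configuration can settle into a stable state. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Path = Tuple[str, ...]
State = Dict[str, Optional[Path]]


def step(state: State, prefs: Dict[str, Sequence[Path]],
         neighbors: Dict[str, Sequence[str]]) -> State:
    """One synchronous round: everyone re-selects from last round's choices."""
    new: State = {}
    for node, ranked in prefs.items():
        available = set()
        for nb in neighbors[node]:
            if nb == "d":
                available.add((node, "d"))      # direct link to destination
            else:
                p = state.get(nb)
                if p is not None and node not in p:  # reject loops
                    available.add((node,) + p)
        # Take the most preferred path that is currently on offer.
        new[node] = next((p for p in ranked if p in available), None)
    return new


def run(prefs: Dict[str, Sequence[Path]],
        neighbors: Dict[str, Sequence[str]], rounds: int) -> List[State]:
    state: State = {n: None for n in prefs}
    history = []
    for _ in range(rounds):
        state = step(state, prefs, neighbors)
        history.append(dict(state))
    return history


NEIGHBORS = {"A": ["B", "d"], "B": ["A", "d"]}
# Each node would rather go through the other than go direct.
DISAGREE = {"A": [("A", "B", "d"), ("A", "d")],
            "B": [("B", "A", "d"), ("B", "d")]}
# If A prefers its direct path instead, the preferences are compatible.
AGREE = {"A": [("A", "d"), ("A", "B", "d")],
         "B": [("B", "A", "d"), ("B", "d")]}
```

Under these assumptions, `run(DISAGREE, NEIGHBORS, 4)` flips between "both direct" and "both via the other" every round, while `run(AGREE, NEIGHBORS, 4)` settles after the second round. Conditions that rule out the first kind of behavior are what the convergence results are about.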
Cool. So like a kind of richer compiler. Can you imagine that work eventually reflecting back into the kind of things that we’re doing here?
Potentially. I think, as I mentioned, we were sort of somewhat modest in our original goals for Butane’s policy abstraction, and we have things we know we could do that are fancier. And so I think to take that next step, it may be that understanding that the system as a whole is working well could be done in terms of routing algebras.
Cool. All right. Well, maybe that’s a good place to stop. Thank you so much.
Thank you. It’s been a lot of fun.
You’ll find a complete transcript of the episode along with show notes and links at signalsandthreads.com.