Listen in on Jane Street’s Ron Minsky as he has conversations with engineers working on everything from clock synchronization to reliable multicast, build systems to reconfigurable hardware. Get a peek at how Jane Street approaches problems, and how those ideas relate to tech more broadly.
Equal parts science and art, programming language design is very much an unsolved problem. This week, Ron speaks with Leo White, from Jane Street's Tools & Compilers team, about cutting-edge language features, future work happening on OCaml, and Jane Street's relationship with the broader open-source community. The conversation covers everything from the paradox of language popularity, to advanced type system features like modular implicits and dependent types. Listen in, no programming languages PhD required!
Welcome to Signals And Threads, in-depth conversations about every layer of the tech stack from Jane Street. I’m Ron Minsky.
Today, we’re going to have a conversation about programming language design with Leo White. Leo works in our compilers team here at Jane Street, and he works mostly on what’s called the frontend of the compiler, which means he spends a lot of time thinking about language features and about the type system that supports those language features, and in particular, he does that work in OCaml, which is the programming language we use on a day-to-day basis.
So, Leo, to get started, I’m curious, can you tell us a little bit about how you got involved in all of this? How you got involved in programming language design, in the first place?
I had some interest in programming languages from a young age. I started programming at 11 or something, and then I think I was fairly interested in how these mysterious things were created. The most formative thing I can think of in this regard is, between my first and second years at university, I remember reading Benjamin Pierce’s Types and Programming Languages. That really clicked. I just thought that stuff was great and I just really wanted to work on it and explore those ideas.
So, I think that’s kind of one of the main things that got me into it. I knew I wanted to do some programming language stuff, so I started doing a PhD in that area. It was more about compilation than programming language design, and I was working on a source-to-source C compiler, and I was writing that compiler in OCaml, because OCaml had a good library for reading C files, and I already knew Standard ML, because it’s what they teach you in the first year at Cambridge. So, anyways, I was using OCaml to write this compiler, and I realized that there was a language feature that I wanted to have that would make it easier for me to write this C compiler, and since I kind of hated the work I was doing on the C compiler and was not that interested in it, I procrastinated by adding this feature to OCaml instead. And that’s kind of how I got into writing code for the OCaml compiler. So, I was writing this language feature and went through the process of trying to upstream it, and whilst I was doing that and my PhD was coming to an end, Anil Madhavapeddy was looking around for people to join OCaml Labs, which was this new project to work on the OCaml compiler that Jane Street was kind of funding in Cambridge. So, he was trying to look for people to work on the OCaml compiler and suddenly this Cambridge address shows up implementing language features for OCaml, and so he thought, well, I should go and see this person. So, he got in touch with me. And that’s kind of how I started working at OCaml Labs as an RA after having finished my PhD.
It’s by the way worth noting that most people have the wisdom to not immediately go and dive into adding features to languages that they’re just starting to use, and in particular, diving into the type system. I think when Anil saw someone who was willing to just go in and add a new type system feature, he was pretty excited.
Yes. I did not realize at the time that this wasn’t something that you just did. Yes, I was somewhat under-supervised in my PhD and clearly, no one told me that this is a terrible idea, and a bad way to spend your time. So, I just went ahead and did it instead, and I really enjoyed it. I mean, it’s quite a terrible…like, the type system is hard to dive into. I would not recommend doing it without some assistance.
The thing you mentioned about how you got involved in OCaml, is the thing that’s always worried me a little bit about how languages are designed, which is to say, languages are almost inevitably designed by people who write compilers, and so, they go around and look at the features they would like for writing compilers and go ahead and add them to the language, and I always wondered, like, how much of a bias is this in how our languages are designed?
I would say, not much of one mostly, because that is probably a fair accusation of OCaml, which is very nice for writing compilers and has all the things you would want for writing a compiler, but that is not true of most languages. Most languages are kind of horrible to write compilers in, so I don’t think it’s applying there.
So, you got excited about programming language design from reading Types and Programming Languages, which is Benjamin Pierce’s extremely famous book in this area, or at least extremely famous in an extremely narrow group of people, and you’ve been doing it now for quite a while. Some of that work during your PhD. Some at OCaml Labs, and a bunch of it here, in the intermediate time. What do you find exciting about language design? Why do you like it?
One of the main things I think I like about language design is that it sits at an interesting intersection between some very interesting pure mathematics, especially on the type system design side of things, and a more aesthetic, kind of, design style. A lot of it is about trying to find things that fit within the mathematical constraints, but are also a design that people like aesthetically and enjoy using.
And this is one of the real challenges of programming language design. It is really not, at its heart, a scientific enterprise, or at least, not at the end. This is a thing you often hear programming language people complain about, is the fact that they can’t do any proper studies, or rather, whether they can or can’t do it, that in practice, that is not how the field is driven.
Some people do try and do studies of how effective a programming language is, but it kind of hits up against all the things that make it hard to do experiments in the social sciences. It just seems like, in principle, you could manage it, but the experiments are just very hard to do, right? What you really need to do is get a load of professional programmers to agree to use all these different languages to solve the same problems for a prolonged period of time, and clearly this isn’t going to happen.
Yeah, it does seem completely and utterly hopeless to me because, if the things you care about are what you can do in a programming language, with large groups of people, solving hard problems, with significant amounts of time being devoted to it, which seems like, to me, at least, the interesting problem, and I think, a problem that a lot of programming languages are not even motivated by, you just have no hope. Like the kind of experiments people can run in practice, are like, they grab the thing they have at hand, which is usually a collection of sophomores, and then they study the thing that they can, which is something like how easy is it to understand the very first errors they see in their language or like, how well do they do when they try and do stuff in their first few weeks or maybe few months of working at it. And that just doesn’t get at the actual thing you care about when you think about what makes programming effective in the real world.
This kind of genuinely bothers quite a lot of people, I think, who work in programming languages. They really would much prefer that it were a proper science. It’s much better thought of as a kind of intersection between some mathematics and design.
And those are both things that we know how to study, right? The scientific method is not the only approach to knowledge. With mathematics, you can try to prove useful theorems and interesting properties, and people successfully do research in design disciplines like, my favorite example is really architecture, which I think is a very similar area to programming language design, in that it’s kind of constrained by mathematics. You know, you have to make sure the building stays up and also, is interesting from a design perspective. You know, you want to make buildings that people enjoy being in and looking at, and so on.
And there’s more than just the raw structural things you have to check. When you think about the design of a building, you think about things like transit times, and people do all sorts of studies to try to get a feel for how effective a design is, but in the end, no one thinks that the final way to figure out whether a building is good is to do a randomized controlled trial, where you make four different buildings, put four equivalent businesses in the four different buildings, and see how it goes. And given the scale of the economic investment required, I’m not sure whether the programming language equivalent is quite as absurd or a little more absurd, but it’s up there.
I think that that’s a better way of looking at programming language design, as a discipline.
You know, there are two ways of handling design problems. One is, you can solve them in the market, right? You can, you know, go off and have people go and design programming languages and see which are successful, and we have that down. There’s a whole system for doing that. And then there’s an academic equivalent of this, and a lot of the work on OCaml, in particular, happens in a context which includes a lot of academic staff. Do you have a feel for how you would make things like evaluation of papers different than it is now in order to accommodate for this?
I don’t have concrete suggestions. I’ve always wanted to, actually seriously, look into disciplines like architecture and so on, and look at how they, I mean they have journals. They have assessment of things. There must be mechanisms by which they do this, but I’ve never actually sat down and put serious time into seeing what that is.
I’m kind of more struck by the fact that when you’re at, say ICFP, which is the main functional programming language academic conference, that there are topics that the people there care about, and think are important, in the design of languages that are not ever the subject of talks. Like they will only talk about the mathematical side of things, or maybe some attempt at some social sciences style experiments, but they’ll never talk that much about the aesthetic aspects of the design in the talks. In the hallway, where they’re all chatting with each other, they’ll talk about it there, and that seems, like that seems wrong to me. Like that doesn’t seem like the best way to do things. Beyond that, I haven’t really given it serious thought… one of the benefits of no longer being an academic, I don’t have to actually solve that problem.
So, let’s switch gears a little. You’ve now spent a bunch of time working on OCaml in particular, how do you think about how OCaml fits into the overall space of programming languages?
So, firstly, I would say, it’s a general-purpose language. Most programming languages that people use get described as general-purpose programming languages, but I would say it’s particularly true of OCaml. It’s not necessarily the best in every area, but it is strong in a lot of areas. You can use it for scripting things. You can use it for low level high-performance programming, and you can get pretty far with it in both of those worlds.
In terms of its identifying characteristics, it has good predictable performance, which I think is very important. So, I think if you take an experienced OCaml programmer, and show them some OCaml code, they have a pretty good shot at being able to tell you roughly what the assembly that gets produced looks like; it is a fairly simple predictable compilation story.
Coming back to where OCaml sits, the other identifying characteristic is it has a strong type system. Quite a lot of languages in the functional family have these strong type systems, but I think it’s a pretty key part of what makes OCaml, OCaml. Personally, I think I could probably no longer manage to write in a language without that strong type system. It really kind of, over time, shapes the way that you think about programming.
It both affects how you design things, and also you get into the habit of offloading certain habits of carefulness onto the type system, and then you get to think about other things, and you get bad at doing all the tracking of those things in your head.
Yeah, and also you spend time doing things like fairly subconsciously structuring the code to make it so that if there’s a mistake, the type system will check it. I noticed recently that I can sense when there are two variables in scope with the same type, because that’s like danger. You know, like, oh no. I might confuse them. And I can really feel when that’s happening, whereas the rest of the time, oh, they’re all different types. It’s fine, I can’t really screw this up.
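This habit of structuring code so the type system catches mistakes can be sketched in a few lines. This example is not from the episode; the names are hypothetical, but it shows the pattern Leo describes: give two values that would otherwise share a type their own distinct types, so confusing them becomes a compile error rather than a silent bug.

```ocaml
(* Two floats in scope is "danger": the compiler can't tell them
   apart. Wrapping each in its own type removes that danger. *)

type price = Price of float
type quantity = Quantity of float

let notional (Price p) (Quantity q) = p *. q

let () =
  let n = notional (Price 2.5) (Quantity 4.0) in
  assert (n = 10.0)
  (* [notional (Quantity 4.0) (Price 2.5)] would be rejected by the
     type checker instead of silently computing the wrong thing. *)
```

The wrappers cost a constructor at each use site, but the payoff is exactly the one described above: the two variables can no longer be swapped by accident.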
Anyway, so, strong type system, and the last thing, which I think is more unique to OCaml or at least more unique to the MLs, is that OCaml takes modularity very seriously.
So, what do I mean by that? I mean the best way to describe what I mean by modularity is probably to talk about abstractions. So, the definition of abstraction is essentially that you have some system, and what you want to be able to do is take one component of the system and replace it with another, and the mechanism by which you do that is that you have some notion of interface that describes the behavior loosely of a component, and when you replace one component with another, you replace it with one that provides the same interface as the other one.
And so, this ability to kind of, swap things out is very powerful, and very pervasive in the design of OCaml.
Right. In particular, there’s this feature called functors, which allow you to essentially write a module of code that is parameterized over a different module of code.
The key thing is in OCaml, everything lives inside modules, and so, functors, which are a module parameterized by another module, so basically, just like a function from module to module, they allow you to parameterize by absolutely anything. So, you can take any part of your program and abstract it over any other part of your program. Very powerful idea, and from a language design perspective, it really forces you to take the idea of an interface seriously.
So, unlike most languages, OCaml has separate files for implementations and interfaces; it has a separate language, essentially, for describing interfaces, and whenever you want to add a language feature to OCaml, when you want to make a change to it, you have to think seriously about how you would express that feature in the interface, as well as in the implementation.
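As a concrete sketch of the interface-and-functor story just described (the module names here are made up for illustration): a signature describes a component abstractly, and a functor is a module parameterized by any module matching that signature.

```ocaml
(* A signature: the "interface language" for a component. *)
module type Ordered = sig
  type t
  val compare : t -> t -> int
end

(* A functor: a module parameterized by another module, i.e. a
   function from module to module. *)
module MakeMax (O : Ordered) = struct
  let max a b = if O.compare a b >= 0 then a else b
end

(* The same functor applied to two different components that
   provide the same interface. *)
module IntMax = MakeMax (struct
  type t = int
  let compare = compare
end)

module StringMax = MakeMax (struct
  type t = string
  let compare = String.compare
end)

let () =
  assert (IntMax.max 3 7 = 7);
  assert (StringMax.max "apple" "pear" = "pear")
```

Swapping one `Ordered` implementation for another changes nothing on the functor side, which is the "replace one component with another" notion of abstraction from earlier in the conversation.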
So, I saw Simon Peyton Jones, who is one of the originators of GHC and Haskell, and so on, and I saw him give a talk where he described laziness in Haskell as wearing a hair shirt. So, laziness in Haskell, means that when you write an expression to compute something, it won’t be evaluated immediately. It’ll wait until you need the result of that computation, before evaluating it, which can be useful, in terms of avoiding doing computations that you don’t actually need the result of, but it means that, you can’t really know when computations happen. They could happen, anytime, basically.
Right and in an ordinary strict language, you can more or less, look at the program and say, I write down this expression, I write down that expression, and kind of, more or less, in the same linear order as the lexical text of the program, that’s the order in which things will execute. And in a lazy language, it’s not like it’s impossible to know, but it’s really complicated. It’s hard to reason about, hard to predict where those evaluations are going to occur.
Exactly. And so, the reason Simon would describe that as wearing the hair shirt, is because that forces Haskell to, essentially be a pure language. Like, you can’t really have computations with side effects in a lazy language, because if you don’t know when the side effects are going to happen, it’s going to be a nightmare to reason about your program. So, the decision to make the language lazy forces them to make sure that it is pure and remains pure, and I think that the modularity aspect of OCaml are very similar. You know, they kind of force you to take abstraction and interfaces seriously.
Just to unwind a bit of terminology. When we talk about purity and side effects, by a pure language we mean a language that doesn’t allow you to express side effects, and by side effects we mean things like mutation, or interacting with the outside world. Haskell puts a premium on being explicit about which parts of your program are pure computation – just computing something without any of this kind of interaction with other parts of the program or the outside world – and keeping to just this pure subset, and keeping it really explicit what that is, is nice because it’s easier to reason about pure computations than stuff with effects in it.
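OCaml itself is strict by default, but it offers opt-in laziness through the standard library’s `Lazy` module, which makes it easy to sketch why laziness plus side effects is hard to reason about: the effect below happens not where the expression is written, but wherever the thunk is first forced.

```ocaml
(* A small sketch (not from the episode) of laziness interacting
   with a side effect, using OCaml's opt-in Lazy module. *)

let log = ref []

let expensive =
  lazy (log := "evaluated!" :: !log; 40 + 2)

let () =
  assert (!log = []);              (* nothing has run yet *)
  assert (Lazy.force expensive = 42);
  assert (!log = ["evaluated!"]);  (* the effect fired at force time *)
  ignore (Lazy.force expensive);
  assert (!log = ["evaluated!"])   (* memoized: forcing again is a no-op *)
```

In a language that is lazy everywhere, every expression behaves like `expensive`, which is why Haskell’s laziness pushes it toward purity: if effects can fire at arbitrary points, the only way to stay sane is to not have them.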
Yes. Exactly. So, I think that the modularity in OCaml is very similar to that and it kind of affects what language features you can reasonably put in the language, or at the very least, it affects the design of those language features, because you have to be able to accurately and conveniently express their interface, which is not the case for a lot of features that a lot of languages have.
So, Simon has actually said about Haskell, that “the next ML should be pure and the next Haskell should be strict,” by which the way I read that as to say, if you were to build a new language now, he would keep the purity, but he thinks the laziness – strictness being the opposite of laziness – the laziness is not a feature he would add to the next language. In other words, he was happy to have laziness in Haskell, because it really got them to invest deeply in purity, but the feature on its own was not necessarily the right default.
I’m curious if you feel more or less the same way about modularity, which is, is OCaml’s focus on modularity useful for its own sake or for the ways it structures the language, or both?
So, I would say both. The idea that you would make an ML and not….
The next ML should not have no module system.
That’s my opinion. So, the most recent thing that looks vaguely ML-ish, is probably Rust, you know, so Rust’s original compiler was written in OCaml, so, I’m pretty sure they had some experience with MLs when they wrote it and you can see some of that coming through in some of the language design, but it lacks a module system, and so, maybe the next ML didn’t have a module system, but I think that’s a mistake. I think that’s a shortcoming in Rust that makes me sad, and makes me less likely to use the language.
It’s a bit of a chicken and egg thing. Like I don’t know whether I use OCaml because I like modularity, or I like modularity because I use OCaml. You know, it’s kind of hard to unwind that.
Yeah, this is the general Stockholm syndrome problem of language design, which is you use a programming language and you get very used to it and it is hard to keep track of whether you like its features because you use them or whether you use it because you like them.
Okay, so you think modularity is useful for its own purposes. What are the ways in which it puts virtuous pressure on the language designer? Why is it good that language designers don’t add features that they don’t have a really clear way of expressing the interfaces of in thinking about how you abstract over pieces of code that utilizes those features?
A good example of a language feature that would not work in a language with a module system would be Haskell’s type classes, which for those who are not familiar with them, are very similar to Rust’s trait system. It’s the same thing. These are basically systems for doing ad-hoc polymorphism. When you have a single function that you would like to work on multiple types, that’s kind of the polymorphism part, but you would like the behavior to vary on those different types. So, a good example is addition. You probably want addition to work on integers and you also want an addition that works on floats, but the operation is actually very different amongst those two things.
And maybe you want addition to work on matrices and vectors, there’s lots of different things on which addition might reasonably be defined.
Yes, but it is actually a different function in all of them, so that’s why it’s kind of ad-hoc polymorphism.
Maybe function overloading is the term that people know best for this?
Yes. Operator overloading is a mechanism that gives you ad-hoc polymorphism. So, I would say that type classes were a major breakthrough in ad-hoc polymorphism, and they’re still kind of the gold standard to measure your ad-hoc polymorphism features against, but it’s a very kind of anti-modular feature. So, let’s say you want an addition function and so you define an addition type class, and then you decide that you want it to work on integers, so you have to, somewhere in your program, write “this is how addition works on integers,” and the important thing there is that it really has to be, just in the entire program, for type classes to work, there has to be just one place where you’ve written, this is how addition works on integers.
And so, if you have one library that’s written an addition instance over here, and another one that’s written another addition instance, you will not be able to link them together. And it’s kind of anti-modular, because you don’t notice this problem until you get to link time. Like you try to link them together and it says, “Oh, no, those two bits have to live in separate worlds. They can’t combine like this.” You have kind of broken some of the compositionality of your language.
I remember being actually quite surprised about this when I first heard about it, because there’s essentially no way of hiding this aspect of your implementation. You cannot in any way wrap it up such that the particular choices you made about type classes in the middle of your program aren’t exposed. It’s sort of anti-modular in a fairly in-your-face way.
Yeah. Like you are very clearly leaking details of your implementation.
You give this example of different ways of doing addition, and that, I think, makes it a little bit easier to understand why people are kind of okay with this, in the sense that, like, are there really multiple different kinds of addition on integers? And well, yes. Actually, there are. There are cases where you really want to have some kind of general operation, where you do want to make different choices in different places. Addition is maybe not the most compelling example, although I’m sure there are good cases where you really do want to have different forms of addition on what is, at an underlying level, the same type.
So, a good example: you start with the notion of addition, but maybe really you want addition as part of some algebraic structure. It seems like you wouldn’t want two additions on ints, but actually, depending on what people have built on top of that, you might find yourself going, actually, I’d really like to say that the addition on the ints is actually maximum, and things like that.
You might just simply have a case where two different parts of the program have made different choices about what implementation they want. Right? It might just be a performance difference, but in this case, I want to try and do the operation this way, in this case, I want to do it that other way, and then again, you’re in this strange place where you literally can’t link things together.
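The modular alternative being discussed can be sketched with ordinary OCaml modules and functors (the module names here are invented for illustration): instead of one global “addition on int” instance, each choice is just a module satisfying a signature, and two different “additions” on `int` – including the “addition is maximum” example above – coexist with no link-time conflict, because you pass in whichever one you want.

```ocaml
(* The interface an "addition" must provide. *)
module type Add = sig
  type t
  val zero : t
  val add : t -> t -> t
end

(* One addition on int: ordinary sum. *)
module SumInt : Add with type t = int = struct
  type t = int
  let zero = 0
  let add = ( + )
end

(* A second, equally valid addition on int: maximum. *)
module MaxInt : Add with type t = int = struct
  type t = int
  let zero = min_int
  let add = max
end

(* Code abstracted over *any* addition, via a functor. *)
module Fold (A : Add) = struct
  let total xs = List.fold_left A.add A.zero xs
end

module Sum = Fold (SumInt)
module Max = Fold (MaxInt)

let () =
  assert (Sum.total [1; 2; 3] = 6);
  assert (Max.total [1; 2; 3] = 3)
```

The cost relative to type classes is that the instance is passed explicitly; modular implicits, discussed next, are an attempt to get the instance passed for you while keeping this modular structure.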
So, I did some work on the language feature called modular implicits, which is kind of based on some mix of type classes and a feature in Scala called implicits, and this is basically a direct attempt to solve, like the question of how do you make type classes work with modules, and you basically, you remove that restriction to only have one of them and you deal with the problem that that solves in a different way. To my mind, the result is a cleaner system – that is certainly subjective and debatable. That for me, is a good example of the modularity kind of pushing the design in a direction that gives you a better end result.
I think this theme of the end result of what is actually a better language feature being subjective and debatable is, I think, a kind of persistent feature of this area of work.
Let me poke in another direction, which is one of the things that you talked about before is this notion of mathematical simplicity. And I think it’s fair to say that there’s a kind of community and approach to language design, which really cares about mathematical simplicity, and there’s a whole wing of people who design programming languages who really don’t think about that at all, and I’m curious what you think about the tradeoff there. What are the reasons why someone who’s a mere user of a programming language should prefer using the language that was designed with mathematical simplicity in mind?
The phrase mathematical simplicity is probably an over-simplification. A lot of people would maybe use the word say elegance or something like that, but I tend to think of the problem in terms of languages having sharp corners, and the interplay of different language features. If your language features are, in some mathematical sense, fairly simple and composable, then there will tend to be many fewer corners that you can cut yourself on in the language.
I think a language that has not got this is something like C++, which I mean, I’ve never really used it professionally. I’ve used it here and there. But it is a language made up of sharp corners. If you want to understand exactly what happens when you use interesting interplays of templates and operator overloading, and this, that and the other, you can very easily end up with code that kind of does what you want mostly, but then you try and use it on a different type and suddenly it breaks for all kinds of confusing reasons, and what’s happening there is that the features of the language are not really compositional. They’re somewhat ad-hoc and complex, and when they interact with each other, it’s hard to reason about what they’re doing.
And it’s worth saying, this is not like a pointy-headed outsider critique of C++. This is how people inside of C++ think about this.
I remember many years ago, I did an internship at Bell Labs and was trying to build a matrix library in C++, and this was pretty early in the language’s development, and I was like I don’t know how to do these two things together. I want to have inheritance and I want to have operators that work in a reasonable way, and like, how am I supposed to make this work, and my advisor said, oh, go over to the 6th floor over there, and talk to the C++ people. Because the people who were writing C++ were right there. And I basically went over and talked to them, and they’re like, “Oh yeah. You shouldn’t use those features together.”
The thing I always hear from people who’ve worked with C++, is always like, oh, it’s a really good language as long as you stick to a subset, but none of them agree on what the subset is.
Anyway, so I think that that’s a good example, and so, I mean, maybe another way to think about mathematical simplicity is really, that, when you’re looking at languages and the behavior of programs mathematically, really what you’re doing is trying to formalize how you reason about programs. As programmers, we reason about programs all the time in many diverse and interesting ways, and people attempt to encode that in mathematics, and if you can encode how to think about your language feature, how to reason about its behavior mathematically, because it is simple – because if it’s complicated, it would be very hard to do the maths there – it’s probably just easier to think about. It’s probably easier for people to reason about it, too.
I’m less certain that this is true, but another thing that I’ve come to think works out better in languages that are kind of mathematically grounded, is that features end up doing more than you might think they would. Like, an example that really struck me many years ago was when there’s this thing that was added to OCaml called GADTs, generalized algebraic data types. Without going into the details of what this language feature is, or how generalized algebraic data types generalize on ordinary algebraic data types, it’s a language feature for which the obvious use cases that I had heard about and seen, seemed to all be about the kind of things that you do in constructing compilers of like representing abstract syntax trees or other representations of program fragments. And it seemed like a great tool for compiler writers, but maybe not a great tool for me, since I’m not really a compiler writer. And it turns out, GADTs have been incredibly useful in a wide variety of ways, in ways that surprised us, and I think, surprised people who added GADTs to the language.
It turns out to have been, for us, very useful for doing performance optimization, and various ways of trying to get more control over the memory representation of objects in the language have been aided by GADTs in a way that, I think, a lot of people found pretty surprising, and I tie some of this uncanny generality of the feature to the fact that the features hold together in a way that goes beyond the initial use cases that were imagined by the people who added that feature.
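For readers who have never seen one, here is the classic compiler-flavored GADT sketch (not an example from the episode): a tiny expression language where the type index on `'a expr` tracks what kind of value the expression produces, so the evaluator is well-typed by construction, with no runtime tags and no impossible cases to handle.

```ocaml
(* A GADT: each constructor fixes the type index of the result. *)
type _ expr =
  | Int  : int -> int expr
  | Bool : bool -> bool expr
  | Add  : int expr * int expr -> int expr
  | If   : bool expr * 'a expr * 'a expr -> 'a expr

(* The annotation [type a. a expr -> a] lets each branch refine the
   type: in the [Int] branch we know a = int, and so on. *)
let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | Add (x, y) -> eval x + eval y
  | If (c, t, e) -> if eval c then eval t else eval e

let () =
  assert (eval (If (Bool true, Add (Int 1, Int 2), Int 0)) = 3)
```

Note that an ill-typed term like `Add (Bool true, Int 1)` simply doesn’t type-check, which is exactly the property that makes GADTs so attractive for representing program fragments, and, as discussed above, for plenty of things beyond compilers.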
Yes. I think that one also kind of goes back to the things you said right at the beginning, about your concern that compiler writers tend to write features that help you write compilers, and GADTs does seem like one of those.
I would tell you that, obviously, all of your programming problems are secretly writing a compiler and that’s why these things generalize. But I would say that, because that brings all of programming into my domain, so I am somewhat biased.
You spend a lot of time working on programming language design, but you never design a new language, or at least very rarely, think about designing an entirely new language. The work you do is mostly about evolving a language forward. How do you think about the difference between designing new language features from scratch and figuring out how to extend an existing language to make it better?
It is harder to evolve an existing language than to work on a new one. I tend to think of writing a language from scratch as doing the whole thing on easy mode, like, where’s the fun there? I genuinely enjoy the challenge of trying to fit features into the existing language, which comes in many forms: from frustration that keyboards only have so many symbols on them, and there just really isn’t room for one more symbol in the language (I’ve yet to try and use Unicode emoji for anything, but we’ll get there eventually), to the interactions of features we talked about earlier, to just thinking hard about making sure that your feature is composable and sensible and simple, and composes with the stuff that’s already there. And also, you tend to hit technical debt in the language: the places where people took a shortcut from the language design perspective and said, “oh, let me just make it do this, it’ll be fine,” often turn out to be the places that are a bit of a nightmare for you later when you write something else. I enjoy the challenge of that more. I also think that it’s really the only feasible way to actually write stuff that’s going to get used by real users to solve real problems.
If I just make a toy language, the odds of anyone using it other than me are almost 0. The mechanics of how languages become popular are confusing and unknown and you are very unlikely to produce the next big language, so, to a certain extent, you don’t really have a choice, if you want to actually solve real problems and actually help real people with their programming problems.
Part of the reason I think you have to spend almost all of your effort on language evolution is that decent languages take forever. Jane Street started using OCaml 18 years ago, and I remember at the time thinking it was already, like, a pretty good language. It had a good implementation – in fact, surprisingly good given that it was an academic language, since academics don't usually produce reasonable software – and we've been working on it for a long time, and there are still a lot of deficiencies and lots of things to fix. Writing a really great programming language is literally a decades-long affair, and so, if you want to do great things, it's much, much easier if you can glom on to some existing language and figure out how to evolve it than to just come up with some grand new idea.
When new languages do break through, they tend to be built around some new idea. Rust breaking through, for instance, is really the idea of linearity-style typing, or ownership typing, breaking through in a language. I doubt that Rust has the best ownership story that will ever be – they're the first one, the big breakthrough. Basically, I just like taking other people's language ideas and moving them on to OCaml, after they've already done the hard work of working out where all the problems are, which you normally do through trial and error: by implementing the wrong thing and then finding out that it was the wrong thing later.
In fact, OCaml has kind of a habit of this, which is that language features tend to arrive, depending on what perspective you have, either late or early. You can look at various features in languages like OCaml and think of OCaml as really cutting edge. Lots of languages have started to get type inference in the last 5-10 years; OCaml had type inference from the mid-90s. Of course, that's a feature that had already existed in ML, and the ML languages that preceded it, for 20 years, right? So even that feature that now seems cutting edge was already kind of old hat when OCaml picked it up.
And OCaml over time, has lots of examples of picking up features after they have been bounced through lots of other language communities, but it takes such an incredibly long period of time for language features to get popular, that you can sort of still be somewhere in the middle and seem both cutting edge and incredibly stodgy at the same time. I guess my favorite example here is garbage collection, because it is the single most obviously useful advance in programming languages. People can argue about how valuable types are and I’m a really big proponent of types and all of that, but garbage collection. Garbage collection is clearly a useful language feature and it was invented in the mid-50s and got popular in like, ‘95 with Java. You could sort of argue that’s the point where it really hit the mainstream.
Yes. Algebraic data types is another one; that's, like, Hope in 1980-something. From a language design perspective, the lack of sum types – the lack of types that represent something being either this or that – is just kind of insane. It's just so obviously dual to records and structs, which have been in everything since forever, that I'm always amazed when a new language comes out without them.
What’s the most recent big language that came out with that… probably Go, like what? How is there not a variant type in there? It’s insane.
Say that again, because I think it's really easy to miss. There's this thing that every language on earth does, which is give you a good way of essentially expressing conjunction: I have this and that and the other. Whether you call it a struct or a record or a class, there's always a way of doing that "and", that conjunction. And it turns out there's the obvious complement to that, which is disjunction: I would like to have a type that represents this, or that, or the other. There are always ways of encoding it – there are union types in C, and you can have things that inherit from the same class in object-oriented languages, and all of that – but it's always clunky and second-class, and doesn't give you the kind of type checking and static guarantees that you really want. It is, I agree, the most screamingly obviously useful feature, and I guess, among mainstream languages, Swift now has it.
And Rust. Swift and Rust.
And that’s it. Like, I guess, you could argue Scala is, again, another one of these things that’s maybe falling into the mainstream-ish.
So Scala does, although, even there, they have to represent it in a way that kind of fits into the JVM’s view of the world, so they kind of look like classes still, which is a bit weird.
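For readers who want to see the "and"/"or" duality concretely, here is a minimal OCaml sketch; the type and function names are purely illustrative, not from the conversation:

```ocaml
(* A record is a conjunction: this AND that. *)
type point = { x : float; y : float }

(* A variant (sum type) is a disjunction: this OR that,
   and the type checker ensures every case is handled. *)
type shape =
  | Circle of float            (* radius *)
  | Rect of point * point      (* two opposite corners *)

let area = function
  | Circle r -> Float.pi *. r *. r
  | Rect (a, b) -> Float.abs ((b.x -. a.x) *. (b.y -. a.y))
```

If a new constructor were added to `shape`, the compiler would flag `area` as non-exhaustive – the static guarantee that encodings via unions or class hierarchies don't give you.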
So another constraint around evolving languages, which you didn't talk about, is backwards compatibility and the pressure from all the existing source code that's out there. Even if you want to make relatively subtle changes in the semantics of a language, it can be very hard to get those changes rolled out, because there are millions or billions of lines of code that depend on the current behavior, and figuring out where that is and how to evolve it forward can be extremely challenging. The most extreme, terrible case of this I know of is the Python 2 to Python 3 transition, where people tried to make a whole bunch of changes to the language, and it is only starting to get finished up now, 15 years or something after it started. I don't know the exact timing, but it's been an incredibly long process.
I’m curious how you think about the role that popularity plays in the ability to evolve your language.
Oh, that’s interesting. Yeah, so, I mean, popularity, I think, does make it harder to evolve your language. I mean, like kind of trivially it does, when the least popular language is used only by me, and then it’s easy. I can just change all my code and then it’s fine. There’s no compatibility issue.
Yeah, so the more popular your language is, the more difficult it is going to be to change it – at least to change it in a backwards-incompatible way. It's worth saying that, especially if you're careful, you can make a lot of changes in a way that just doesn't break anything, and I think, where possible, you should just do that. But sometimes it's unavoidable. There's just something that you implemented earlier that's just wrong, or, you know, there's just some tiny corner that's preventing a new feature from working how it should, and so you really need to fix an old feature before you can add the new one. That definitely does get more difficult if your language is more popular.
A lot of people are kind of fairly obsessed with the popularity of their language. I’m not indifferent, but it’s not a thing that I spend my time thinking about very much. There’s like a certain amount of popularity that is required for you to have an active community that is producing useful libraries and doing good things, and you want to make sure you have that. Beyond that, I’m less clear on the cost versus the benefit. It’s a mixed bag.
So, your goal isn’t to make OCaml the most popular language on Earth, it sounds like. What are your goals for OCaml?
I think one thing that's worth saying is that, to a certain extent, I'm still fairly selfish in my approach to language design. I still kind of think that the point of working on OCaml is to make it easier for me to write programs. I try and generalize that to encompass everyone, or at the very least the other people at Jane Street, but I think that's where the ideas come from.
I said earlier that OCaml was very general purpose, but it’s certainly not great in every niche, in every use case. I’d really like to make it competitive with Rust and C++ and C, for writing low-level hand-tuned code, and that seems plausible to me. I can see a path by which you add certain features and make it so that you can do that in a way that is convenient.
I mean, you can get pretty far right now. You can avoid allocating, and so on, and keep your latencies down – you avoid the garbage collection latencies – and we have some upcoming features, things like unboxed types that we're currently working on here, similar to the unboxed types in Haskell, that will give you better control of your layout.
So, what do you think of the limitations of OCaml as a competitor for languages like C++ and Rust, and to be clear, at Jane Street, we in fact, spend a lot of time writing very high-performance programs in OCaml, so as you said, it is a thing that you can do. But there are certainly ways in which it’s not optimal, so, I’m kind of curious if you can try to lay out what you think the core limitations are.
Sure. So, I guess, control of memory layout is just crucial to performance on a modern processor and you don’t have as much control of your memory layout in OCaml as you do in those languages. There is some cost to using garbage collection. So sometimes you would like to avoid it, which you can do now in OCaml, but when you do it in OCaml, you lose the ability to create things like records and structs of data and the variant types that we were just talking about, you lose the ability to use those, really. So, you go from a language which is generally higher level and easier to program in than C to actually losing some of the things that C has, because OCaml relies on garbage collection to provide those features.
A thing that I think is really worth understanding about OCaml as a language is it has a bunch of lovely abstractions, and then, an almost brutally simple way of representing those on the heap, you get very little choice in how memory layout works, and that’s in some ways, nice. It leads to an overall simpler implementation. There’s lots of benefits from simplicity, but it means that it’s sometimes grotesquely more expensive than it should be.
A fine example is just that every single field in a record takes a full machine word. So, if you would like to represent a single character as a field in the record, that will cost you 64 bits, and if you would like to represent a bool, which is logically a single bit, that's going to cost you 64 bits. It doesn't matter what you do – you always take a word.
Yes. I mean, it's worse than that. My favorite would be that if you use OCaml's normal support for 32-bit integers, it's like 10 words per 32-bit integer, because you've got to box the thing, and then the boxed thing is a few words, so that's clearly room for improvement. That is not the smallest and simplest way to represent 32 bits of data, and so that specific case is something we're looking at quite a lot with the unboxed types – that's our approach to fixing that aspect.
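As a rough illustration of that boxing cost, the standard library's `Obj.reachable_words` counts the heap words (including headers) a value keeps alive. The exact word counts vary by compiler version and platform, so this sketch only checks the boxed-versus-immediate difference:

```ocaml
(* Immediates (int, bool, char) occupy no heap words of their own;
   an int32 is a boxed "custom" block on the heap. *)
let heap_words x = Obj.reachable_words (Obj.repr x)

(* Sys.opaque_identity stops the compiler from turning this into a
   statically allocated constant, forcing a real runtime allocation. *)
let boxed_int32 = Int32.of_int (Sys.opaque_identity 1)
let plain_int = Sys.opaque_identity 1

let () =
  Printf.printf "int:   %d heap words\n" (heap_words plain_int);
  Printf.printf "int32: %d heap words\n" (heap_words boxed_int32)
```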
So another thing that these low-level languages give you is the ability to allocate things on the stack and have pointers into your stack. That gives you a way of allocating data that is much cheaper than garbage collection, without having to copy – you know, you can make a struct on your stack and then pass the pointer down the stack, to functions that you call, which gives you a cheap way of grouping some data together and passing around a reference to that data.
So that’s another feature that you just can’t really do in OCaml.
The key issue there is that when you're relying on the stack frame for storing the data, when the function call that allocated returns, the thing goes away, and you have to make sure at that point that there are no references that have leaked out. So, you have to, in some sense, control where the data moves around at the type level to make that idiom safe.
Yes, so, if you want to keep the kind of safety that OCaml users are used to, then you have to do that. I mean, you can also just be C and let people screw it up, but that’s, you know, not great. That’s kind of the thing that leads to tons of security bugs, so, it’s not ideal.
Yes. Those are probably the main things that I think of that the low-level languages provide that we don't really have. Another one, I suppose, is having facilities for conveniently reasoning about resources that you hold. This is kind of by necessity: those languages lack a garbage collector, so when you allocate memory, that's a resource you have to pay attention to, because you've got to free it yourself, and so they provide a lot of useful mechanisms to help you with that process.
But the mechanisms that are good for helping you manage a piece of memory that you have allocated, are also great for managing like, a file you’ve opened, or a database connection that you’ve made. Things that need closing when you’re done with them, so, those languages have some great features for that, and I think OCaml could definitely benefit from brazenly stealing them.
Those are all examples of making OCaml a better low-level systems programming kind of language. What other ways do you see that the language could be improved?
So, the type system is expressive, but it is not the most expressive. There are more interesting type systems that let you express more properties of your program. In particular, I'm thinking of the dependently typed languages – theorem provers like Coq and, I guess, Lean, and languages like Agda and Idris. Dependently typed languages are maybe a bit difficult to explain in a short period of time. The classic example would be the ability to have a type for lists of length N, where N is not some fixed constant – N is a variable in your program. You know, someone passes you a number N, and for a list that long, the type system knows that it's exactly that long. So, if N is 0, it knows that the thing is empty, and if it's 1, it knows that you don't have to worry: you can just take the first element; you don't have to think about whether it's empty or not.
That’s the quintessential example of a dependent type. But that’s very expressive.
And it’s called dependent because the type depends on the value, essentially? Is that correct?
Yes. So, the type of those lists depends on the value of the integer, or natural number, N, basically.
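OCaml's GADTs, discussed later in the conversation, can already approximate this length-indexed-list idea. A minimal sketch, using made-up type-level naturals rather than real values:

```ocaml
(* Type-level naturals: empty types that exist only to index vec. *)
type zero
type 'n succ

(* A list whose length is tracked in its type. *)
type ('a, 'n) vec =
  | Nil : ('a, zero) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n succ) vec

(* head only accepts non-empty vectors: no option, no exception.
   The Nil case is statically impossible, so the match is exhaustive. *)
let head : type a n. (a, n succ) vec -> a = function
  | Cons (x, _) -> x
```

With real dependent types, the index would be an ordinary integer value rather than these made-up marker types – which is exactly the "less broken" version described below.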
So, anyway, I'd like to push OCaml in some ways in that direction. Haskell is doing this in a big way – there have been long-running projects to make Haskell more expressive like this – and I would like to do similar things in OCaml, especially around the module system, which is already vaguely related to dependently typed languages in some ways.
If you could make a pitch to a grizzled old systems programmer like myself as to why I should care about this fancy, type-theoretic dependent types nonsense – why will this make my life better as an OCaml programmer, to have easier access to dependent types in the language?
It's quite easy to make that argument, since what this gives you is what GADTs give you, but less broken. GADTs are, to some extent, a broken version of dependent types – not actually quite the thing you want – and so, since, as you said earlier, you've already been convinced of the usefulness of GADTs, it's not that hard to take you the rest of the way, and that's probably how I'd do it. I'm not quite sure how you take someone who's never written anything using something like GADTs and show them that this is how they want to write their programs.
Can you maybe come up with an example of a time some Jane Street programmer came to you with a practical problem of "I am trying to achieve X and I can't figure out how to jam it through the type system," that having proper dependent types would have made easier?
Yeah, sure. So, people at Jane Street currently write code where they have a type, and the type is parameterized by some notion of permission – basically, whether this is a read thing or a write thing. A file is an obvious example of this. A file handle, rather: is this a read-only file handle or a write-only file handle, or is it read and write? The way that they achieve that is they make up a type called "read" and a type called "write," and then they have their file type be a parameterized type. It has a type parameter, and that type parameter is either going to be the read type or the write type. That's how they achieve that. But this is clearly kind of nonsense. It's just an encoding to get the type checker to check what they actually want. Read is not a type like Int is a type – there are no values of type read. It's just a fiction they're creating to get the behavior that they're after.
And so, dependent types would let you do that properly: rather than indexing your type by these made-up types, you would index your type by an actual value of a sum type – literally just what would be an enum in C that's either read or write. You would have a read file and a write file, and read and write would be an actual data type that you could write functions on, and things like that.
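The phantom-type encoding being described looks roughly like this in OCaml; all names here are hypothetical, just to show the shape of the trick:

```ocaml
(* "read" and "write" are empty marker types: no values ever inhabit them.
   They exist only to instantiate the phantom parameter 'perm. *)
type read
type write

(* 'perm is a phantom parameter: it appears in the type but not the data. *)
type 'perm handle = { path : string }

let open_readonly path : read handle = { path }
let open_writeonly path : write handle = { path }

(* Only read handles are accepted here; passing a write handle is a
   compile-time type error, which is the whole point of the encoding. *)
let describe (h : read handle) = "readable: " ^ h.path
```

With dependent types, `'perm` could instead be indexed by an ordinary value of a two-constructor sum type, something you can pattern match on at run time, rather than these uninhabited fictions.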
This actually reflects a problem we've run into a lot, which is that there are lots of places where you try to get some extra power out of the type system, and so you do some clever encoding to get the type system to check an extra property that you want. But I think people are often surprised at how error-prone it is to get that encoding just right, because, as you say, it's kind of dissociated from what you're really doing. OCaml programmers are, in some sense, spoiled. They're used to the fact that there's a large class of errors that the type system just automatically catches for them, and they're used to things clicking together. But when you start doing these kinds of encoding tricks to add extra rules into the type system that aren't naturally there, the rules end up being kind of free-floating, and it's very easy to just screw it up: you think you've encoded some permission thing correctly, and you could just get it wrong, and it's very hard to notice that you got it wrong.
So, it sounds like having something that's more organized around dependent types would ground this kind of encoding, so it would be less likely to just be wrong in a silent way.
Yeah, that's right. Every time people are doing that encoding, dependent types are roughly the feature that lets them just directly write what they're trying to do, and it's always going to be easier if you're not trying to encode something – you can just write down what you actually mean. It's just simpler. And that's how I think of that feature. It's really about doing a thing that people already do: as soon as you give them anything that's a bit like GADTs, they straight away start doing this kind of encoding stuff, and really that's telling you that this wasn't the feature you should have given them in the first place. It's dependent types that they're after.
It’s a ton more work than implementing GADTs, so yes, there’s a reason one comes first. I would love to add that to OCaml in some capacity or other.
So I want to switch gears, yet again. Can you talk a little bit about what it's like working with the open-source community around OCaml? I think one thing that's interesting about Jane Street's position is that we're at neither of the two endpoints organizations usually occupy when it comes to working on a programming language. By far the most common situation to be in is that you are a victim of the language that you are using, which is to say, there's a programming language, it has certain properties, and you're going to use it and figure out how to deal with those properties, good and bad, and you have very little ability to affect things. And then there are organizations that are totally on the other end, where they are the designers of a new programming language – think Google with respect to Go, or Mozilla with respect to Rust. In those cases, an organization created a programming language and, even if there is a larger community, they got to set that community up and decide how it was going to run. We had a very different experience.
We started as mere supplicants. We were using the language and, you know, didn't really have the power to make changes ourselves, and would occasionally ask nicely for things. Over time, we've moved to a spot where we are among the significant contributors to the language. I'm curious how you think about negotiating that situation, especially when it really is the case that we don't have the deciding say. It is not, in a sense, our language. It's a language whose community we participate in, but we definitely don't have the last word.
We definitely don't have the last word. I mean, you can go to GitHub and observe us attempt, and fail, to get various things merged upstream, to see that we don't have the authority to just go and put in whatever we like.
Yes, so, I think, firstly, it's worth pointing out how much we benefit from this. If we were to fork and just manage our own language and do our own thing off to the side, we would lose the review from the upstream developers, who've been working on OCaml for 30 years and really are the kind of people you want reviewing your code, and we would lose all the libraries and tools in OCaml that come from the excellent open-source community. So there's a lot of benefit to working with an existing language and an existing community that you don't just own. And if you look at places where companies have spawned a language, they often go to great lengths to try and give the ownership away, essentially, to try and get some of these things. F#, I think, is a good example of that, where Microsoft is trying to let the community take a bigger and bigger role, because the input that you get from doing that is great. At the same time, it obviously slows us down and means that we have to stop and think about things.
So, I think effective communication with upstream is the key to having this process be successful and useful for everyone involved. It's not a thing we've always got right. Sometimes we'll develop something completely internally for far too long, and not really share it upstream, and then just drop this finished thing on people, and I think that that is a lack of communication that you should try to avoid. You can see how it happens – it's very tempting to not show people a thing until you've got it just right. You want to add things to the language and for it to make progress, and the process of upstreaming is obviously the mechanism by which that is slowed down, but I think that's by necessity. Language design is quite a slow process, and should be quite a slow process. The backwards compatibility things that we were talking about earlier – I mean, if you put the wrong thing in, you will be stuck with it kind of indefinitely, or you will have to do some huge amount of work later to undo it. So, you really want to be sure that you're doing the right thing in language design.
So, I think that necessarily makes everything move a bit slowly, and that can be a frustrating process, but the pros outweigh the cons.
You talk about ways in which we've not always done the ideal job in communicating the changes that we want to make upstream, and it sounds like OCaml itself is also evolving in this regard. The larger community is trying to get more organized and get better about the way in which it communicates and discusses new features. Can you talk a little bit more about how that process has evolved?
I mean, so, when I first submitted a feature to OCaml, they used, what was it… Mantis? for bug tracking – an old web application. You would just copy/paste a big patch onto that bug tracker and be like, "Here you go, maybe apply this to…" I think it was SVN; it had switched from CVS to SVN for its version control. So that process was obviously not ideal and did not encourage external contribution very much, I think. Over time, it switched to putting the source on GitHub, using GitHub pull requests and issues for managing the process of contributing code, and I think that's aligned with more people contributing to the compiler.
Recently, we’ve created, an RFCs repo. So, like a “request for change,” I guess, is what it should stand for in this case, so basically, somewhere where people can go and say, here is a design document explaining something I think I would like to add to the language, or add to the standard library, or change about the implementation of the compiler. So, the point of that is to really let you collaborate earlier in the process.
GitHub is very good for collaborating on code: you start by writing all the code, and then you start collaborating. So, the idea here is to use the same workflow, but earlier on – you start writing the design and you collaborate on the design, and then you go away and write some code.
Rust is the great example of how to manage this process of people submitting design ideas. Their process is more open than ours – they really are encouraging just anyone to come along and write up ideas. I don't think we have the manpower to deal with the flow of design proposals that they get, so ours is a little bit more of a tool for the existing compiler developers, but still, it's, I think, a step in the right direction towards increasing collaboration and increasing communication.
So you were talking a little before about forking and why it's not really a great idea to just straight-up fork the language. And indeed, that's something I get asked about a lot. People are like, Jane Street has a few thousand people and lots of resources – why doesn't Jane Street just fork? And I think you did a good job of explaining the downside of that. But I wonder how you think about forking a little, which is to say: some of the constraints you were describing around language evolution have to do with the problems of backwards compatibility, and another constraint that you talked about, essentially, is the need for feedback. You were saying, in some sense, that OCaml's MO has been: wait for someone else to try something out in some other language, work out a theory, write a few papers, try a few awkward implementations in other languages, and then, when you see the dust settle, pick something up and try to get the best version of it into OCaml. But it's super tempting to want to do some of that experimentation yourself, and a place like Jane Street feels like, in some sense, a safer place to do it, because one of the things we can do is fix things. We have these workflows around large monorepos where we have an enormous amount of control, and we can add a feature, expose it only internally, and only turn it on in a few places. Then, when we discover, inevitably, that we did a terrible job of it, we can rip it out and just update all the code that used it, or modify the feature in significant ways, and worry much less about the constraints of backwards compatibility and unbounded usage of the feature, and by dint of that get a lot of extra feedback, which can hopefully help us iterate and innovate faster.
What do you think the tradeoffs are here? Should we do more of that?
I think, basically, it would be good to be able to do this, and the thing that places limits on it is the management of that kind of fork. Rebasing large changes over a moving codebase is a painful and expensive process, and that's the kind of cost you're going to pay to have this forked version of the compiler.
To be honest, I think we are leaning more and more in this direction. We've always had our own branch, because you want to backport a bug fix or something before you move to the next version, and the patches that we have on our own branch are getting bigger, so I think we are leaning in this direction. But yes, the limits on it mostly come from just the technical aspects of how you manage that fork.
It's striking how, in some sense, crude our method of extending another person's software is. We literally represent the changes we want to make as a set of textual patches, right? You just say, oh, let's remove this line and put some other line in instead. The surprise there is not how hard that is to maintain, but that it works at all.
Yes, and the thing you can do to try and improve it is, obviously, to improve the abstractions in the compiler itself – to put some APIs in place, put some decent interfaces on things that don't have to move all the time, which makes it easier for things to evolve independently without breaking each other.
There are whole language communities that work, in some sense, this way, and have easier-to-use ways of extending the compiler. I think of languages like Racket, which is a descendant of Scheme, itself a descendant of LISP, a language that goes back to the mid-50s. If you look at a language ecosystem like Racket, everything is designed around making the compiler extensible, so that people can write their own extensions without having to fork. Can you imagine a future where we would have more capabilities of that kind inside a language like OCaml?
To me, that kind of Scheme-style extensibility – working out how you do that in a typed language, in a way where you maintain the type safety – is the thing I think of as the holy grail. That's what a lot of work in programming language design is, I think, heading towards. I'm not sure that the people doing that work are explicitly thinking in those terms, but certainly, that's how I see it.
Yeah, so I would basically just say, yeah, that’d be great, but that is an open research problem.
And can you give a little more texture as to why it’s hard? Why does a type system make this kind of extensibility more difficult?
So, if you want to add a new type system feature, to a language, without breaking its safety, the code of that feature, like the implementation that the user is writing, is going to need to have certain properties and you’re going to have to show statically that the code that they’ve written has those properties. So, essentially, you’re going to end up needing a type system that is very expressive in order to express the type of language extension that doesn’t break everything, which is kind of roughly what you want to do.
If you've ever tried to implement, or prove correct – prove sound, at least – a type system in a theorem prover, which I have done, it is a painful, slow, and very time-consuming experience. I've been working on a type system implementation in the Coq theorem prover for 4 or 5 years. I proved it sound after about two, but it was still two years of work to get to that bit.
Is maybe a way of saying this that, in order to make the language modular and extensible at the level of the definition of the language, you essentially have to make the proof of correctness of its type system also modular?
Yes, I think that's a reasonable way of thinking about it. And also, what I just described with writing the proof is, to a certain extent, write the thing and then prove it afterward; you probably actually want to do this in a more correct-by-construction way, where you'd only let people express things that actually worked. One day, I am hopeful that we will have that kind of ability. But yes, it's difficult.
So, we talked about this a little earlier, but one of the things I often worry about when I think about how OCaml evolves going forward, is the kind of Stockholm syndrome you get, of just getting used to the features that you have, thinking that these features are great and all you need, and learning how to encode all the things you want to do in them, no matter how awkward the encoding. I think one way of escaping from the Stockholm syndrome is to think about other languages and what they provide, and think about what useful features you can steal. I’m curious if you have any thoughts about features you really like in other languages that you wish OCaml had?
So first of all, we already mentioned a couple of things. We mentioned type classes, dependent types, ad-hoc polymorphism. Oh, and you just mentioned Scheme, so macros.
So, I think of macros in Scheme and Lisp as doing two related things. One is extending the language to add new language features; the other is meta-programming, where you just want some code to generate some code, which is a slightly simpler problem. So, there are ways of doing that. You can write macros that generate correctly typed code, and that’s sufficient for this problem, which is much easier. So, I have some work in that area.
What other kinds of problems do you think are best solved by writing a program that writes a program, rather than the normal one-step thing of writing a program that does the thing that you want?
I think about this a lot in terms of making the abstraction free. Let’s say I have some problem and I want to write some code that does something, and I have a piece of data that represents the solution to that problem, and one way to implement my solution in code is to have a function that reads that data, and executes the solution that it represents. At run time, that’s going to be literally reading in this description and then kind of acting on it, and that’s a nice way to write code, but it has a cost.
Reading and following the instructions live at run time takes time. You’re actually spending computation on that. If you replace that with something that’s staged, something that’s based on macros, you do that reading at compile time and just generate the expanded version of the code, and so the performance improves.
So, it’s kind of a way of being able to write nice high-level code that has the performance of low-level code.
Can you make that example a little more concrete? Like, you say you have some data that represents a way of solving a problem. What’s the example of how that would come up in real life?
I worked with an intern who implemented some macro stuff for OCaml, and there was a good example that he wrote, based on a library that Oleg Kiselyov had written, for writing code operating over lists of data or streams of data. Essentially, you would describe a series of operations that you wanted to do on the list: oh, I want to map over the list, adding one to each element, and I’ll filter out the elements that are even, and then I’ll map over the list again, dividing everything in the list by two. And then, I’ll run a function that takes the maximum number out of all of them. Something like that. This is clearly a nonsense program, but you get the idea. So, that’s the kind of high-level description you would write.
The very specific thing is a nonsense program, but what you’re describing is a very real way of constructing programs, which is, as these kind of chained sequences of transformations, that shows up all the time. That part does not feel made up to me at all.
Yes. So, anyway, what the library would do is read that description and, at compile time, produce code that was always just a single while loop operating on the list. Rather than reading over the list, making a new intermediate list, and then making another intermediate list and another intermediate list, which has an obvious cost, it would all just become a single while loop that read the first list and did the necessary things to work out what the final answer was.
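To make that concrete, here is a hand-written OCaml sketch of the before and after. The chained version allocates an intermediate list at each stage; the fused version is the kind of single-pass code a staged library could generate at compile time. The function names are invented, and a fold stands in for the literal while loop; this is an illustration of the transformation, not the actual library’s output.

```ocaml
(* The idiomatic, chained version: every stage builds a fresh list. *)
let slow xs =
  xs
  |> List.map (fun x -> x + 1)
  |> List.filter (fun x -> x mod 2 <> 0)   (* filter out the even elements *)
  |> List.map (fun x -> x / 2)
  |> List.fold_left max min_int

(* What staging could produce: one traversal, no intermediate lists. *)
let fast xs =
  List.fold_left
    (fun acc x ->
       let x = x + 1 in
       if x mod 2 <> 0 then max acc (x / 2) else acc)
    min_int xs

let () = assert (slow [1; 2; 3; 4; 5] = fast [1; 2; 3; 4; 5])
```

Both compute the same answer; the difference is only in the number of passes and allocations.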
And that was a static guarantee. The tools you were given in the input language are such that no matter how they’re composed, you could always produce a single loop.
And part of the point of this example is that the method of optimization, the fact that you can collapse all of these things together, is something that is under the explicit control of the person who’s providing the library for doing these transformations, right? A thing that comes up in a lot of the different design work you’ve been involved in is this choice between two different ways of thinking about optimization. One kind of optimization is, you have a nice, general, idiomatic way of writing programs, and then you try, at the level of the compiler, to find ways of transforming that into machine code that operates reasonably efficiently. And then there’s this quite different approach of trying to give very precise control to the programmer, where you really take the performance behavior as a first-class part of the semantics that you’re presenting to the programmer, and it sounds like your point here about macros is that it’s an example of the latter. It’s a way of giving control over performance-relevant details to the person who’s writing the program.
Yeah, I think that’s a good description of it, yes.
So I guess the features from other languages would be either delimited continuations, in the Lisps, or the async/await style of thing that’s in almost everything now. So, async/await is specifically a tool for writing concurrency in what I would call direct style: a nice, convenient way of expressing concurrency. Delimited continuations are a more expressive version of that, where you can write more general, what I would call, control effects.
I’ll just give some examples; I think that will give you an idea. So, async/await is a good example; also generators and coroutines.
Generators are a feature that shows up in Python, that people may have seen, for conveniently writing complex programs that generate things that essentially, morally, look like lists, but behind them have some computation that is generating the things on the list.
And so the feature that we want to add to OCaml for that is algebraic effects, and that’s kind of an attempt to steal, basically those kinds of ideas, especially the delimited continuation style of things.
I think algebraic effects are actually quite a good example of a couple of the things that we were talking about. So, on the one hand, they’re a good example of how a mathematically simple approach can be beneficial, because they’re very mathematically grounded; there’s a very elegant mathematical explanation of what algebraic effects are, and that simple mathematical construct is able to express a load of ad-hoc constructs that have been added to other languages. It lets you do async/await, and generators, and coroutines, and, depending on how you implement them, backtracking search and things like that, all of which are provided by a single language feature. So that’s a good example of that.
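As a small illustration of that expressive power, here is a generator written with the effect handlers that eventually shipped in OCaml 5. The `Yield` effect and the `collect` handler are invented names for this sketch; it shows one way to encode generators with effects, not the feature as it existed at the time of this conversation.

```ocaml
open Effect
open Effect.Deep

(* A user-declared effect: performing [Yield x] hands [x] to the handler. *)
type _ Effect.t += Yield : int -> unit Effect.t

let yield x = perform (Yield x)

(* A producer written in direct style, as if it were ordinary code. *)
let squares () = for i = 1 to 5 do yield (i * i) done

(* The handler: each time the producer yields, record the value and
   resume the producer by continuing its captured continuation. *)
let collect producer =
  let acc = ref [] in
  try_with producer ()
    { effc = fun (type a) (eff : a Effect.t) ->
        match eff with
        | Yield x ->
          Some (fun (k : (a, _) continuation) ->
            acc := x :: !acc;
            continue k ())
        | _ -> None };
  List.rev !acc

let () = assert (collect squares = [1; 4; 9; 16; 25])
```

The same handler machinery, with different effects, gives you coroutines, async/await schedulers, or backtracking search.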
Another thing it’s an example of is the idea that you can take an idea from somewhere like Scheme and then work out how to type check it, to give you a language feature.
It’s funny. I think the people who work on languages like Scheme and Racket actually think in a very type-oriented way. In fact, a bunch of those people wrote some key type theory papers, but this kind of superpower they have, of operating in an untyped context, gives you a lot of freedom to do exciting experimentation around language features, which is really quite hard to do in a typeful context. So, yes, the idea is that you see what interesting things they do, wait a decade or two to see which ones turn out to be really valuable, and then spend the effort to figure out how to lift those into a language where you have a type system as well.
Yes. So, I think this is just generally always a good idea. That macro design we talked about was very much about looking at Racket’s macro system and working out how to make it work with OCaml’s module system and typing, and yeah, for algebraic effects, there’s a technical term for it: they’re essentially macro-expressible with delimited continuations. That means you can do a totally local transformation. You could take a program written with delimited continuations and turn it into one written with algebraic effects, and vice versa, entirely locally. You don’t have to look at the whole program to see how to do this. You can just do it locally, which is a good way of saying that they are equally expressive.
But that translation is not type preserving. So, essentially, from a dynamic semantics perspective, in terms of what happens at run time, they’re the same language feature, basically, but one is easy to type and the other is not. And so algebraic effects are kind of that; that’s one way of thinking about them.
One of the things that I’ve always really liked about OCaml is that OCaml is a language which is pretty easy to read, which is to say, I can look at an OCaml program and, without too much effort, understand what it’s doing and why, and part of that ease of reading is that it’s relatively explicit. This is a funny thing to say for a language that has type inference, but it’s still kind of the case that, for the most part, you can pretty straightforwardly trace through OCaml programs and, without a lot of guessing, figure out what they’re doing, and some of the features that you’ve been working on over the years, in particular this kind of ad-hoc polymorphism style feature, modular implicits, in some ways compromise this, which is to say, they add a lot more interesting inference on the part of the compiler to decide what needs to be done. In particular, you write down your program, and then some type inference happens, where it figures out what the types of the different pieces are, and then it decides which concrete code to dispatch based on the types. So, I’m wondering how you think about the tradeoffs around explicitness and implicitness, and what we gain and what we lose, as we add more of these kinds of powerful search-based techniques for deciding what code to dispatch into the language.
Sure. So, yes, with modular implicits, I think there is a certain amount of implicitness necessary for any kind of ad-hoc polymorphism. In a way, the point is being able to write 1 + 2 and 1.5 + 2.5 and having those both do the thing people think they will do, and you’re clearly missing some information there. So, some amount of added implicitness is kind of the point, and then, from there, we try to basically design the system with the least amount of implicitness possible. Keeping things as explicit as you can, whilst getting this little bit of – I don’t like implicitness. Is there a better adjective? Implicity. Yeah. Trying to have the least amount of that that we can get away with. So, I mean, you can tell from the name, the feature was based on some of the work in Scala on implicits, and Scala implicits have a reputation for being extremely implicit and hard to reason about, and I often have to reassure people that whilst we took inspiration from there, we did not copy all of it, and that a lot of the aspects of Scala’s design that make it extremely implicit are not really present in our version.
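For a rough sense of the tradeoff, here is the fully explicit version you can write in OCaml today, passing a first-class module by hand. Modular implicits would let you omit the module argument and have the compiler find it by a type-directed search of the implicit scope. The `Add` signature and module names are invented for this sketch; it shows the encoding, not the proposed syntax.

```ocaml
(* The interface an "addable" type must provide. *)
module type Add = sig
  type t
  val add : t -> t -> t
end

module Int_add = struct type t = int let add = ( + ) end
module Float_add = struct type t = float let add = ( +. ) end

(* Today: the module is an ordinary, explicit argument. With modular
   implicits, the compiler would infer it from the types of [x] and [y]. *)
let add (type a) (module A : Add with type t = a) (x : a) (y : a) =
  A.add x y

let () =
  assert (add (module Int_add) 1 2 = 3);
  assert (add (module Float_add) 1.5 2.5 = 4.0)
```

The implicit version makes the call sites look like ordinary overloaded arithmetic, at the cost of the compiler doing that search for you.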
Can you say a little bit more about that? What are the aspects of how implicitness works in Scala that can be confusing, and how have you tried to defend against that in the context of modular implicits?
Right. So, the basic idea of both language features is that you look for something based on its type. In the OCaml modular implicits case, it’s a module you’re looking for; in Scala, it’s just any value, but either way, you’re looking for something based on its type. So, assume that you’re looking for an int, which is not likely to be the kind of thing you’d want to look for via this mechanism, but it’s easy to think about. You’ve got a list of possible things, you’re trying to find the one that’s an int, and you’re going to use that one. In Scala, the places where they search for that int are complicated, because you have the type of the thing that the method is being called on, so you go to its class and look inside there and see if there’s an int that looks good, and then you go to the classes of the other arguments and see if they’ve got anything, and then you go to the friend classes of those classes and see if maybe there’s something good there. It’s difficult to understand where it’s searching. It’s just a bit more complicated.
I’m not sure there’s much they could do about this. Modular implicits have a very simple notion of scope. You’re basically just looking up the file to see if it’s there. Maybe you can include other files, but again, in a very simple way.
Some of these things are not things that one should exactly blame the Scala designers for, in the sense that they’re, to some degree, solving a harder problem, which is that Scala is a language that has to cope with something like a Java-style object-oriented system and also something like an OCaml-style, Hindley-Milner kind of type system, and doing those together increases the complexity of the language in a way that is, to some degree, unavoidable. Probably there are things that one could have done differently, but there’s a sense in which the problem in OCaml is fundamentally simpler.
Yes. Exactly. There’s not much they could have done on that part. There are some other things they’ve done, where they could have gone another way. If you’re looking for something and you find two possible candidates, there are some rules by which they decide which is the best candidate, whereas in modular implicits, if you find two, you just error and say, “no, it wasn’t obvious which one you meant.” Like, both of these would have worked. And that makes things quite a lot safer.
If you think about reasoning about the code, it’s a lot easier if you don’t have to worry about that problem.
I still suspect it’s the case that, as with lots of language features, how well it works in practice also depends on the approach and discipline with which people write libraries that use it. I think once you have this notion of implicit selection, you could probably design libraries that use it all over the place, or you can design libraries that use it in a small number of places where it’s high-leverage, and I suspect those kinds of choices probably affect, in practice, how easy the language feature is to use.
Yeah. I think there’s always a question when you’re designing a language feature about exactly how much rope you want to give people, and then it’s kind of up to them whether they hang themselves with it or not. You can give them less, and trust them less with it, and often that’s the right decision, but I think there’s always a trade-off there. Algebraic effects, I think, are another thing that adds a certain amount of implicitness, at least… originally, we thought about adding them untyped, without any types tracking the effects. It depends on how you look at it. But one way to think about that is that we looked at GOTO, and thought the problem with it was that the labels weren’t dynamically bound, you know. The problem with GOTO is that you can statically tell where you’re jumping to, where clearly what you need is dynamically-bound GOTO.
“Dynamically-bound GOTO considered awesome.”
Yeah. Exactly. Which sounds absolutely – yes, sorry. There’s a very famous paper “Go To Considered Harmful” about how GOTO was a terrible idea and far too implicit in how it does things, and also dynamic binding is historically thought of as mostly a mistake in a lot of language design things, for again, being really horrendously implicit, and so the idea that combining the two would be a great idea, is amusing.
But actually, in practice, it is a great idea. They only look like that from a particular angle, and if you look at them a different way, you can see that it’s totally fine and that none of the issues you might have with something like that should arise.
Okay. Well, thanks very much. This has been a lot of fun.
You can find links for more information about the topics we’ve discussed including some of Leo’s talks and papers about things like algebraic effects and modular implicits, along with a full transcript of the episode at signalsandthreads.com. Thanks for joining us, and see you next week!
Abstract syntax tree: A tree representation of the structure of source code in a given programming language, often used in program analysis and transformation.
Ad-hoc polymorphism: A form of polymorphism in which polymorphic functions can be applied to arguments of different types, with different behavior based on the types of the arguments; sometimes referred to as function overloading or operator overloading.
Dependent type: A type whose definition depends on a value. A feature of some advanced functional programming languages like Agda and Idris.
Functor: In OCaml and related languages, a module which is parameterized over one or more other modules. A reasonable mental model is to think of a functor as a function that takes one or more modules as arguments and generates a new module.
GADT: Generalized algebraic data types are an extension of algebraic data types (i.e. "sum types") providing more expressive power to programmers.
Garbage collection: A form of automatic memory management.
Hindley-Milner: A type system that supports the ability to infer the type of a given program without the need for explicit annotations. Most functional programming languages have a type system descended from Hindley-Milner.
Laziness: A program evaluation strategy which delays evaluation of an expression until its value is needed. Opposite of "strictness." Haskell is an example of a lazily-evaluated language (and OCaml, a strictly-evaluated language).
Machine word: The natural unit of data handled by a particular computer processor architecture.
Modular implicits: A proposed extension to the OCaml language adding support for ad-hoc polymorphism; inspired by Scala's "implicits" and the concept of type classes.
Modularity: The degree to which a programming language allows a program to be broken down into smaller, composable components.
Module: The main unit of source code structure in an OCaml program.
Polymorphism: In a programming language type system, polymorphism refers to the ability of an entity (e.g. a function) to take on multiple forms, allowing programmers to write general code that can be applied to multiple different cases.
Purity: An attribute of a program that lacks side effects, e.g. one that will always return the same value for given arguments.
Side effect: In a programming language context, the opposite of purity -- a side-effecting operation is one that affects state outside of its local environment (e.g. by doing I/O, or modifying a global variable).
Standard ML: A general-purpose functional programming language popular among compiler writers and academics. Ancestor of OCaml.
Sum type: Also known as a variant type, a data type that represents a value that can take on one of several different types.
Unboxed types: A proposed extension to OCaml which allows for greater flexibility over the memory representation of a given value.