Listen in on Jane Street’s Ron Minsky as he has conversations with engineers working on everything from clock synchronization to reliable multicast, build systems to reconfigurable hardware. Get a peek at how Jane Street approaches problems, and how those ideas relate to tech more broadly.
Ty Overby is a programmer in Jane Street’s web platform group where he works on Bonsai, our OCaml library for building interactive browser-based UIs. In this episode, Ty and Ron consider the functional approach to building user interfaces. They also discuss Ty’s programming roots in Neopets, what development features they crave on the web, the unfairly maligned CSS, and why Excel is “arguably the greatest programming language ever developed.”
Welcome to Signals and Threads, in-depth conversations about every layer of the tech stack, from Jane Street. I’m Ron Minsky. It’s my pleasure to be talking today with Ty Overby. Ty is a software engineer who’s been at Jane Street since 2018. And his work here has been focused on our tools for building web applications and specifically on a library he designed called Bonsai. We’re going to talk a lot about the ideas behind Bonsai and about UI development more generally. But first Ty, can you tell us a bit more about your background before Jane Street?
Yeah, so I think my first experience with web development was probably when I was in middle school writing petpages for my Neopets. After that, I’ve been doing hobby programming ever since I was a kid, went to the University of Washington and worked on compilers for a while at Microsoft. I decided at some point to move to New York, so Jane Street seemed like a great place to continue my love for programming languages and somehow managed to get on a team doing some programming languages and web development stuff, which I think is pretty exciting.
So you got to combine two different loves.
Yeah.
You talked a little bit about your hobby programming. I know that you’re someone who really spends time and really enjoys digging into all sorts of things in your spare time. Can you tell us a little bit more about what kind of things you’ve done on the side?
Yeah, so I guess a running theme of mine is that I wind up taking the hammer that is compiler development and bashing everything that I can get my hands on with it. I think my biggest, longest-running project is probably a computer-aided design library, a suite of software, where in order to describe the shapes for your object you write programs, little functions that go from some point in space to the distance to the outline of your shape. Then there’s all sorts of fun combinators that you can write on top of this and all sorts of fun optimizations that you can do if you somewhat limit the capabilities of the person using it, taking compiler optimizations to the computer-aided design workflow.
So you’re diving right into the abstractions and the math, but to step back for a second, the thing you’re talking about as I understand it is what’s called programmatic CAD, which is instead of having something that’s the moral descendant of MacPaint or something, where you go in and interactively construct some 3D object that’s going to then be printed in the 3D printer or constructed in some other way, you write a program to do that. And there’s actually lots of stuff out there that lets you do this already. One of them that I’ve played around myself and with my kids is called OpenSCAD.
Yeah. So OpenSCAD is programmatic CAD. I believe it operates on a boundary representation. So this is very similar to what you would get in CAD software or Blender or any of the other 3D modeling software, which is that the fundamental primitive is a point or a face. Whereas the style that I’m particularly interested in as a functional programmer is called functional representation or F-rep. And these two different styles of representing shapes lead to some really interesting trade-offs. Some things that are really easy in B-reps or boundary representation are very hard in F-reps and vice versa.
Right. And just to see if I have the description right, an F-rep is something where instead of describing explicitly through a collection of polygons or whatever, the boundary of some surface, you write down a function and then the boundary is something like the zeros of that function. So implicitly you’re describing the boundary by just writing down what the function is.
They’re also called implicit surfaces.
Why is it better to use implicit surfaces versus a boundary representation? You said there are some things that are easier and some things that are harder. Can you give me a sense of someone who might use this? Why I might prefer the implicit representation?
Oh, I think a lot of the combinators are trivial to express. So things like taking two shapes and finding their union is well, the two shapes, they’re two functions, so you apply them both to the same point. And then you take the minimum of the two results, and the function that combines these now describes the shape that is the union of those two shapes. These functions return numbers that represent the distance to the edge. So zero is on the edge, negative numbers are inside and positive numbers are outside. So union would be min, intersection would be max, and negation or taking a shape and making the outside inside and the inside outside is just literally negation. And you can use negation and union to get cuts. And also something that you can’t do in boundary representation stuff is just write some code. And I really enjoy just being able to think of a function and see what kind of shape it would create.
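To make those combinators concrete, here is a minimal OCaml sketch of signed distance functions in two dimensions. It is not taken from any CAD library; the names `point`, `shape`, `circle`, `translate`, `cut` and `inside` are all illustrative assumptions. With negative values meaning "inside", a point belongs to a union when either distance is negative, which is the pointwise minimum, and to an intersection when both are negative, which is the pointwise maximum.

```ocaml
(* A shape is a signed distance function: negative inside, zero on the
   boundary, positive outside. All names here are hypothetical. *)
type point = { x : float; y : float }
type shape = point -> float

let circle ~radius : shape =
  fun p -> sqrt ((p.x *. p.x) +. (p.y *. p.y)) -. radius

let translate ~dx ~dy (s : shape) : shape =
  fun p -> s { x = p.x -. dx; y = p.y -. dy }

(* Inside either shape: the smaller (more negative) distance wins. *)
let union (a : shape) (b : shape) : shape = fun p -> Float.min (a p) (b p)

(* Inside both shapes: the larger distance wins. *)
let intersection (a : shape) (b : shape) : shape =
  fun p -> Float.max (a p) (b p)

(* Swap inside and outside. *)
let negate (a : shape) : shape = fun p -> -.(a p)

(* Cut [b] out of [a] using only negation and union (De Morgan). *)
let cut (a : shape) (b : shape) : shape = negate (union (negate a) b)

let inside (s : shape) (p : point) : bool = s p < 0.0
```

For example, `cut (circle ~radius:2.0) (translate ~dx:1.5 ~dy:0.0 (circle ~radius:1.0))` describes a large disc with a bite taken out of its right side, and it is still just a function from points to distances.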
Right. Although, I mean, you can write code, OpenSCAD lets you write code that generates boundary representations and manipulates those boundary representations. So can’t you do it programmatically in both styles?
Probably. I think some of the shapes that I’m interested in are things like the interference pattern between two sine waves, each of them describing distances to the outside of a shape and cutting that off via a circle. And the cutting off via a circle OpenSCAD can do just fine, but describing a shape implicitly is something that implicit surfaces do pretty well. And the type of programming that you do is very different.
And I can immediately see how your functional programming background leaks into the way you talk about CAD. I think people who are doing CAD don’t normally talk about combinators. They might talk about the operations you use for combining shapes. And I guess those two are one and the same in the way that you’re describing them.
Yeah.
As I understand it, there are other things that are supposed to be good about implicit surface-style representations. I’ve used OpenSCAD a lot and one thing that’s really annoying about it is you have to decide early on when you’re constructing the boundaries how detailed the boundary representation is going to be. So there’s essentially almost tessellation of the surface or an approximation of the surface by a bunch of little tiny polygons. And then when you do things like merge things together, you can get all sorts of weird artifacts that come from exactly the kind of beats in the frequencies of the different polygonalizations of the two shapes and you get all sorts of weird things that happen on the boundaries. And I think the implicit surface approach gives you something where while you’re doing all of your transformations, you have everything in some sense at full resolution. And then finally when you render it, you get to decide how you want to do the final display and how detailed that needs to be.
Yeah, that’s exactly right. Sadly, one of the downsides is that doing that final transformation from F-rep to B-rep, which is useful if you want to 3D print something or pull it into a game engine, that’s really hard. As far as I can tell, no one has done it perfectly yet. There’s an amazing project by Matt Keeter called libfive, which I’ve built a ton of stuff on top of, and I’ve loved using that. And it’s very good, but it still has issues with weird edge cases, in this case, literally edge cases. You can imagine interesting things happening to a shape as points go to infinity or negative infinity. And unless you’re very careful, you can easily create topologies that are hard or impossible to render into a regular 3D file.
Right. And I guess if you have points that go to infinity it’s probably hard to find a 3D printer big enough to actually render the resulting surface.
It’s true.
Okay. So that’s a fun kind of hobby. You said you did a bunch of web development on the side as part of your personal work. Have you done web development in a professional context as well before your work here?
Yeah, so I helped out on a project at Microsoft building an in-browser REPL. I don’t know if any of my code is still alive in there, but I think that’s it actually, I think that was the only brush that I had with web dev professionally before Jane Street.
Got it. And then somehow, despite the lack of any professional background, nonetheless, you ended up building the library that is now Jane Street’s default library for building user interfaces, which is called Bonsai. So let’s talk about that a little bit. So when you arrived here, we already had some foundation for doing web development. There was a library called Incr_dom and this was based on a compiler component, which we still use, called Js_of_ocaml. So Js_of_ocaml is a backend that takes OCaml and converts it into JavaScript, doing actually a surprisingly good job of respecting OCaml semantics. And we had successfully built a number of user interfaces using Incr_dom and Js_of_ocaml, and also run into some trouble with it. Maybe you can tell us a little bit about what Incr_dom looked like to you when you arrived and explain how it works for someone who’s never touched it before?
For people that are familiar with the Elm architecture, Incr_dom would seem quite natural. Let’s back up and talk about the Elm architecture for a little bit. Elm is a programming language built entirely for developing user interfaces in the browser, certainly its standard library is very focused on this. And it also comes with a design pattern and that design pattern is separating the logic of your application into a view function that takes inputs and the model for an application and produces a pure functional representation of its view. And then also an apply-action function, which can take in an existing model for the application and an action that is triggered either by the user interacting with the page in some way or by interactions with a web socket connection or other IO. And it’ll take this action and the current model and produce a new model. And then the render function gets called again with a new model and hopefully the changes that you want to see represented on the screen make their way there.
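The pattern described here can be sketched in a few lines of OCaml. This is only an illustration of the shape of the Elm architecture, not Elm or Incr_dom code; the `view` type below is a hypothetical stand-in for a virtual DOM, and `run` is an assumed, simplified version of the runtime loop.

```ocaml
(* The model: plain immutable data. *)
type model = { count : int }

(* Actions describe what the user (or the network) asked for. *)
type action = Increment | Decrement | Reset

(* Take the current model and an action, produce the next model. *)
let apply_action (model : model) (action : action) : model =
  match action with
  | Increment -> { count = model.count + 1 }
  | Decrement -> { count = model.count - 1 }
  | Reset -> { count = 0 }

(* Hypothetical stand-in for a virtual DOM: a pure description of the UI. *)
type view =
  | Text of string
  | Button of string * action
  | Row of view list

(* A pure function from model to view. *)
let view (model : model) : view =
  Row
    [ Button ("-", Decrement)
    ; Text (string_of_int model.count)
    ; Button ("+", Increment)
    ]

(* Simplified runtime: fold the incoming actions over the model, then
   render the result again. *)
let run (initial : model) (actions : action list) : view =
  view (List.fold_left apply_action initial actions)
```

Each button in the view carries the action it would raise when clicked, which is how the view and the apply-action function are tied together without either one mutating anything.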
Great. So this is in some sense, just another spin on the classic model-view-controller pattern and model is your representation of data, view is what you’re going to display, and the controller is somehow how you interact with the outside world and do transformations to your model, right? And I guess this whole thing will sound very familiar to anyone who’s used React, where React is also based on a notion of a pure functional representation of the DOM. You have some just pure data structure that represents the DOM you wish you had, and it gets slammed into the actual messy, complicated tree of objects that is the actual DOM in the browser in a hopefully efficient way. Can you say maybe a word about how Elm architecture differs from what a React programmer might be used to?
On the surface, they appear to be very similar. The one thing that React users typically reach for is a state management library. And this is changing over time. I think it’s a bit less common now that React has adopted some of this into the library itself, but there were some very popular libraries like Redux or MobX that did the pure functional model updating that Elm has kind of built into its framework. Prior to this, the way that you did state in React was by putting state in your components. And this is handy. It’s private. You don’t need the rest of your application to know about this state, but it also has some problems like that if a component is removed from the component tree and then added back in again, the state is gone. So it’s a bit more like having a mutable variable inside of your component tree. So people reach for state management solutions that provided the model-to-action-to-model transformation that Elm has.
And just to maybe say it in a slightly different way, the Elm approach is to some degree, like a good first-order approximation is, your UI is a function. And the key function is a function from model to view. And it really is just a pure function where the model is just simple data and the view is just simple data. And in fact, immutable data at that. And then there’s this extra action step where inside of the view, there’s some, essentially, description of how actions can be used to update the model, say when you click a button and that leads to an event which asynchronously will later then cause the model to be updated. So you don’t get this whole encapsulation of hiding state inside of components. Instead you have this pure, in some sense, more totalistic like no, the whole UI is a function. And then we compose functions in the ordinary way that we compose functions in the context of functional programming.
Yeah, exactly. And I think a major selling point for libraries like this is your application as a function where you take the state and you transform it into a view. But really I think that that’s the least interesting part. I think the way that you get applications that people want to use is by having really well defined and edge-case-free interactions. And that means transitioning from one model to the next, in a way that makes sense to your users. And I think that modeling things as a state machine like Elm does, and like some of these React state managers is a great way to do that.
So that’s like a new piece of jargon, a state machine. When you say it’s modeled as a state machine, what does that concretely mean?
Yeah. So I guess the abstract notion of a state machine is you have some state and then you have actions that express a desire to change the state, and another function that takes an action and state and produces a new state. This separation allows programs to more accurately reason about when interesting things happen as opposed to direct mutation. So you might notice a few actions coming in all together and you could debounce them to collapse multiple actions into just a single one. You can imagine a user accidentally hitting a button twice or typing something a bit too fast, and you might want to put some guardrails on that and make sure that they aren’t doing anything accidentally. This is the type of thing where if in an action handler, you are directly mutating some state, it might be a bit harder to do this, but if you’re representing everything as transitions between states that are described by actions that are raised by events, then this kind of intentional transition becomes a lot more possible.
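As a small illustration of why explicit actions make guardrails easier, here is a hedged OCaml sketch of a state machine that ignores an accidental double click. The 0.3 second `window`, the timestamped `Click` action and the `last_accepted` field are all assumptions made for this example; it is a simple double-click guard in the spirit of debouncing, not a full debounce implementation.

```ocaml
(* Every click arrives as data, stamped with the time it happened
   (in seconds). The timestamp is an assumption of this sketch. *)
type action = Click of float

type state =
  { clicks : int (* clicks we decided to honor *)
  ; last_accepted : float option (* when the last honored click happened *)
  }

(* Hypothetical threshold: clicks closer together than this are treated
   as accidental repeats. *)
let window = 0.3

let initial = { clicks = 0; last_accepted = None }

(* The transition function is the single place where the guardrail lives. *)
let apply_action (state : state) (Click now : action) : state =
  match state.last_accepted with
  | Some prev when now -. prev < window -> state (* too soon: ignore *)
  | _ -> { clicks = state.clicks + 1; last_accepted = Some now }
```

Because the click handler only raises a `Click` and never touches the state directly, the decision about which clicks count is made in one function that is easy to test.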
Which is to say, you might think about something like this as being primarily about two different types of data, the model that represents the state of your application and the view, which is how you present it to the user, but really there’s three, there’s the model and there’s the view and there’s also the action, which is the thing that represents transitions, ways in which you change your model. And you have some kind of explicit way of talking about the way in which transitions cause the state to update. And it’s that structure that you’re referring to as a state machine.
Yeah, exactly.
Great. So how does this Elm model that you described, how does that relate to what Incr_dom was like?
Incr_dom took the pattern that Elm popularized and left it relatively unchanged with the exception of a single addition, which was combining it with a library that we have developed here internally called Incremental. And the Incremental library is effectively a smart memoization technique where if you write a function using the Incremental library and you run it multiple times where the inputs are changing only slightly between runs, then you could expect some better performance if the output that you’re producing is changing slightly in response to inputs changing slightly.
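The "smart memoization" idea can be approximated with a toy example. The sketch below is not the Incremental library and does not use its API; `memo_last` is a hypothetical helper that remembers only the most recent input (compared by physical equality) so that an unchanged input does not trigger recomputation.

```ocaml
(* Remember the last input and its output; recompute only when the input
   is a physically different value. A toy stand-in for incrementality. *)
let memo_last (f : 'a -> 'b) : 'a -> 'b =
  let cache = ref None in
  fun x ->
    match !cache with
    | Some (x', y) when x' == x -> y
    | _ ->
      let y = f x in
      cache := Some (x, y);
      y

(* Count how often the expensive part actually runs. *)
let sum_calls = ref 0

let sum =
  memo_last (fun rows ->
    incr sum_calls;
    List.fold_left ( + ) 0 rows)

let label = memo_last (fun name -> "Total for " ^ name)

(* The output depends on two inputs. Changing only [name] reuses the
   cached sum of [rows]. *)
let render (rows : int list) (name : string) : string =
  Printf.sprintf "%s: %d" (label name) (sum rows)
```

Calling `render` twice with the same list but different names sums the list only once. A real incremental framework tracks a whole graph of such dependencies rather than one cached value per function.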
Right. And this goes back to a general fact I’ve long noted about UI frameworks, which is that hidden inside of every UI framework is some kind of incrementalization framework as well, because you basically need this incrementalization for performance reasons everywhere. UIs in general, what are they doing? They are computing something to display to the user and those things need to change quickly. And one of the things that can help them change quickly is they don’t change all at once, little bits of them change. You go and click somewhere and some small part of what you’re seeing changes, it’s not everything in the entire view being transformed all at once.
So you need some way of computationally taking advantage of this. Maybe we should mostly leave off those gory details here as well. In some sense, the difference between Elm and Incr_dom is that Elm has a way of doing incrementalization and Incr_dom has Incremental, which is a more powerful way of incrementalizing, meaning you can express more complicated algorithms incrementally, but I think that’s kind of mostly not the thing that’s interesting about the gap between Incr_dom and Bonsai. So maybe we shouldn’t talk too much about Incremental, but I’m curious what you felt was problematic about Incr_dom. The story you’ve told so far is Elm is this nice way of describing things, and then Incr_dom takes the Elm model and adds this more powerful incrementalization framework. That sounds great. What doesn’t work that well about the model?
One of the things that I think Incr_dom does differently than your average React application is that it also incrementally computes the state machine transition functions, the apply-action functions. And this is really nice. It means that you can thread values throughout your application and make use of them in the state machine transition functions just as you would threading values through your view function. But it does mean that combining components becomes a lot harder because you’ve not only got to take the views from multiple components and stitch them together into one parent view, but also you need to compose the apply actions for child components into this super component. And the way that this is done with Incremental is very boilerplate heavy. The downside of having this library that expresses a lot more of the power of incremental computing is that these compositions were quite verbose.
So this whole point about the action application function itself being computed incrementally, on the face of it seems very abstract. Can you give an example of where that gives you something useful?
We have this library that implements a UI component for very large tables. And one of the things about this table that’s quite unique is that it can sort and filter and display an absolutely enormous amount of data, hundreds of thousands of rows. And really a lot of the magic there is in the view where we only actually show a very small subset to the DOM. But during that sorting and filtering, we produce a lot of intermediate data structures that could be very useful in downstream components. You might imagine that these sorted and filtered tables might expose internal structures that allow you to very efficiently implement row selection, where you want to be able to navigate through the table using a keyboard. In a naive implementation, they might have to sort and filter all of that data again, but because we’ve done it incrementally and because the apply-action function is also computed incrementally, we can share that data a lot easier.
It sounds like the problem you see that comes out of all of this is that it makes the system in some sense less composable, it’s harder, at least requires more boilerplate to take little pieces of a UI and combine them together to make larger UIs. Do you think that problem is worse for Incr_dom than it is for Elm? Or do you see the same problem in both approaches?
I see the same problem in both approaches, but I think that the boilerplate that you get from Elm is quite minimal and for Incr_dom, it was absolutely not. I believe that when I made the simplest possible combinator for two Incr_dom components, which was just combine their models or rather take a supermodel that splits into sub-models and apply action that would dispatch to one or the other, and a view that combined both sub-views, this was something like 50 to 70 lines of code. And what that meant was that people were making monoliths where they didn’t want to break their code up into multiple pieces because they knew that they would have to do this recombining at some level. And that recombining was error prone and it was tedious. Maybe in the opposite order, it was tedious and therefore error prone.
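The kind of combinator being described, one that pairs up two models, dispatches actions to one side or the other, and merges the two views, might look roughly like this in plain OCaml. This is a non-incremental sketch with hypothetical names (`component`, `either`, `both`), not the Incr_dom interface; the incremental version discussed above has to thread considerably more through each of these steps.

```ocaml
(* A hypothetical Elm-style component: an initial model, a transition
   function, and a view function. *)
type ('model, 'action, 'view) component =
  { init : 'model
  ; apply_action : 'model -> 'action -> 'model
  ; view : 'model -> 'view
  }

(* An action for the combined component belongs to one child or the other. *)
type ('a, 'b) either = First of 'a | Second of 'b

(* Combine two components: the super-model is a pair of sub-models, actions
   are dispatched to the matching child, and the caller decides how the
   two sub-views are merged. *)
let both a b ~combine_views =
  { init = (a.init, b.init)
  ; apply_action =
      (fun (ma, mb) action ->
        match action with
        | First x -> (a.apply_action ma x, mb)
        | Second y -> (ma, b.apply_action mb y))
  ; view = (fun (ma, mb) -> combine_views (a.view ma) (b.view mb))
  }

(* Two tiny example components. *)
type counter_action = Incr
type toggle_action = Flip

let counter =
  { init = 0; apply_action = (fun n Incr -> n + 1); view = string_of_int }

let toggle =
  { init = false; apply_action = (fun b Flip -> not b); view = string_of_bool }

let app = both counter toggle ~combine_views:(fun c t -> c ^ " / " ^ t)
```

Note that the combined component's model and action types grow with every composition, which is part of why hand-writing this wiring for each pair of components gets tedious.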
Great. So this is a problem that you wanted to solve, right? I think it seemed clear to you that it’s not great to not give people a way of building little reusable components. And if everyone’s building monolithic things, they’re not going to throw off small, useful, reusable bits. And you built this library called Bonsai, how does Bonsai help?
Bonsai started out basically as a single type definition that encompassed what I saw as the right way to build an Incr_dom component. This type signature was almost a reification of a programming pattern. That programming pattern being that there is a function that can incrementally compute a view, arbitrary extra data that might be useful to downstream components, like we talked about earlier, and an apply action. And this function is given a model to act on and a function for raising events so that it can perform the state machine transition. So Bonsai really started out as that function type.
So just a bundle, you just take together the function for computing the view from the model, the function for applying actions to update the model, the way of throwing actions out into the world, just take all of those things and pack them together. And that’s what a Bonsai component was.
Exactly. Yeah. And I was able to write some pretty nice combinators on top of this. So things like given two components, give me a new one that combines their models, combines their apply actions in the trivial manner. And then you could specify how you wanted the views combined as well. But fairly early on, I recognized that the fact that UI components were computing … Well, the fact that we called them UI components at all was a bit of a misnomer, really they’re something-producing. And the fact that they happened to be views most of the time was a bit of an accident really.
So among the first things that I did was remove the hard-coded view from that type. So now instead of a UI component that produces a view and maybe some extra data, it was just, here is a component, not necessarily a UI component, that produced some result incrementally. And frequently, that result type would be a view, but it doesn’t have to be. And I think more than half of the components that I write for a web application won’t be manipulating views at all. They will be state machines that produce some results that I think might be interesting. And maybe they get transformed into a view later down the line or maybe they get passed in as inputs to other components.
And this is a real break with the style of components that you see in systems like React, right? Every React component by its nature is producing some chunk of DOM. And here Bonsai is a kind of more general computational system that you can use for computing all sorts of things.
Yes, absolutely. That is a massive difference. But the difference that matters to me is that Bonsai structures its components and its compositions in the form of a directed acyclic graph rather than as a tree. And this is the thing that allows you to get components that produce values to thread them into inputs for other components. Whereas in a React-style UI, the value that you’re computing incrementally must be a view. So if you want to communicate with other components, you’re almost forced to walk up the tree, pass something onto its parent, and then that parent is responsible for shuffling down whatever value you’ve computed.
So that means that in React, when you break down your components, you essentially have to do it by slicing it along the structure of the generated DOM.
That’s right. The structure of these components winds up being very closely related to the structure of the view that you want to compute in the end. There are workarounds for this. One of the ones that I’m thinking of now is that a parent component might produce a function that gets passed on to its children, and that function produces some more view. And this is a good way of kind of doing a bit of a decoupling, but well, it doesn’t really match my aesthetics for how I want to be programming. The tree style of building components in React also doesn’t let you move values from one child component of a node to another, a sibling.
What’s the case where you’d want to do that?
One motivating example might be a text box and a button, and you want the text box to have some text in it before the button is clickable. Now in a tree-motivated UI component, you might have a component for your text box, a component for your button, a callback that you pass to the text box to be notified when there’s text in it, and then a value passed into the button that says if it’s disabled or not. Whereas with a graph-oriented structure, you could have the value of the text for the text box flow into the button, and the button could check to see if its invariants are upheld. And the button doesn’t need to be a child of the text box, it just needs to depend on the value that is produced.
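The graph-oriented version of that example can be sketched with ordinary functions, where the value produced by one "component" is simply an input to another. This is a deliberately simplified OCaml illustration, not Bonsai code; `text_box`, `submit_button` and the `button` record are hypothetical names.

```ocaml
(* What the button component produces. *)
type button = { label : string; enabled : bool }

(* The text box produces a value: its (trimmed) contents. *)
let text_box (contents : string) : string = String.trim contents

(* The button depends on that value. It is not a child of the text box,
   it just consumes what the text box produced. *)
let submit_button ~(text : string) : button =
  { label = "Submit"; enabled = text <> "" }

(* Wiring: the text flows sideways from one sibling to the other, with no
   callback passed up to a parent and no flag passed back down. *)
let form (contents : string) : string * button =
  let text = text_box contents in
  (text, submit_button ~text)
```

Here `form "   "` yields a disabled button and `form "hello"` yields an enabled one, and neither piece needs to know where the other sits in the rendered page.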
So one thing that’s changed from the original view is that the components aren’t specifically view- or virtual DOM-producing components, they’re more general components. How else does a Bonsai type differ from what you might find in some other UI component framework?
Well, when you’re comparing it to Incr_dom or Elm, one of the major differences is that components don’t actually expose what model they have, or really even if they have a model. And they don’t expose what their action type is either. So really the primary type in Bonsai, a Bonsai.t, which is called a computation, is only parameterized over what value it produces. And any state machine and transitions that it has are local to it.
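One way to picture a type that is parameterized only over its result, while keeping the model and action private, is an existential type. The OCaml sketch below is an assumption about how such hiding could be expressed, not Bonsai's actual definition; `Component`, `map` and `run` are hypothetical, and nothing here is incremental.

```ocaml
(* Only ['result] is visible. ['model] and ['action] are existential:
   callers cannot name them or depend on them. *)
type 'result computation =
  | Component :
      { init : 'model
      ; apply_action : 'model -> 'action -> 'model
      ; compute : 'model -> ('action -> unit) -> 'result
      }
      -> 'result computation

(* Transform the result without knowing anything about the hidden state. *)
let map (type a b) (c : a computation) ~(f : a -> b) : b computation =
  match c with
  | Component { init; apply_action; compute } ->
    Component
      { init
      ; apply_action
      ; compute = (fun model inject -> f (compute model inject))
      }

(* A toy driver: keeps the hidden model in a ref and returns a function
   that computes the current result. *)
let run (type r) (c : r computation) : unit -> r =
  match c with
  | Component { init; apply_action; compute } ->
    let model = ref init in
    let inject action = model := apply_action !model action in
    fun () -> compute !model inject

(* A counter whose result is the current count plus a way to bump it.
   The result is not a view at all. *)
let counter : (int * (unit -> unit)) computation =
  Component
    { init = 0
    ; apply_action = (fun n () -> n + 1)
    ; compute = (fun n inject -> (n, fun () -> inject ()))
    }
```

A consumer of `counter` sees only `(int * (unit -> unit)) computation`. That the model is an `int` and the action is `unit` is an implementation detail that could change without affecting any code that composes with it.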
Part of what’s going on here is you’re basically giving a good way of compartmentalizing the model and the action. So essentially, there’s a part of this which is like a function, which takes some input and produces some output. And part of this which is a state machine, and it gives you a way of composing together functions in a way that lets you combine state machines as well. Is that a fair description?
Yeah. Depending on who I’m talking to, I’ll describe Bonsai as a framework for building incremental state machines.
So given that description, how general do you think Bonsai is? Is it useful outside of building web UIs? Is it useful outside of UI contexts entirely?
I want to say yes, but I’ve been looking for an alternative example for a very long time and I haven’t been able to really come up with anything.
Is that true for both questions, including user interfaces that are not web user interfaces?
Oh, no. I mean, I think Bonsai is broadly applicable to user interfaces, as long as you have some abstraction that exposes the desired view for your UI as a pure value. That is one of the downsides of Bonsai is that effectively everything inside of it has to be pure. It becomes very hard to reason about when you have mutable values. So it might be hard to build a Bonsai UI on top of platform du jour UI component library, which is primarily mutation-based. But it should be possible if someone wrote a virtual DOM-style abstraction on top of it.
And this restriction, I guess, is essentially the same restriction that systems like React and Elm have. And Elm, I guess, at the language level enforces purity, and React, being in JavaScript, doesn’t. And similarly with Bonsai, because it’s in OCaml, which is a language that supports mutation without restricting it at the level of types, it’s also the case that this is just a mistake you can make. If you use Bonsai and you use side effects in the middle, it’s going to be confusing, and that’s kind of that. So early on, you said that a repeated pattern in your way of thinking about the work you do is ways to turn the problems you solve into compiler problems, into programming language problems, and we haven’t really talked about how that affects Bonsai. What is the way in which the Bonsai work also looks like a compiler project?
I’ve been describing Bonsai as an incrementalized directed acyclic graph. And one of the invariants that we enforce about this directed acyclic graph at the type level is that it can’t be dynamic. That is to say, once you’ve constructed the super-component that represents your entire application, all possible combinations of components are known. And this lets us effectively do a number of compiler optimizations. If you look at this DAG as a program, instead of as a forest of nodes communicating with one another, then you can employ some fairly rudimentary compiler optimizations, like constant folding or collapsing multiple nodes into one if you detect that one of them is going to run really fast and you would rather not have the overhead of incrementalization. In some cases, we can even transform the algorithms that are being used on a node by node basis. So we talked earlier a bit about how the Incremental library allows you to express a lot of powerful algorithms incrementally. And one of the places where it really shines is when it deals with map-like data structures. And we are able to pick and choose from a number of these different algorithms at the point where the Bonsai application is first evaluated to pick the one that is the most optimal for that particular node.
Right. And you’re talking about this as being compiler optimizations, but it’s not like you’re hacking the OCaml compiler to do something. Where in the process of doing this does the compilation-like phase fit in?
Yeah, so it happens exactly once. And that is when you start the Bonsai application. You give a Bonsai.t to the functional equivalent of a main method. And it’ll take that, it’ll crunch it, it’ll optimize it, and then it’ll start running. After it starts running, we don’t touch it anymore. But at that point where the user hands off the representation of their application, we are able to fully traverse the entire component graph and see where we want to make changes.
Right. So when someone is building up a Bonsai computation, the thing they’re really building is something that’s closer to the abstract syntax tree of an expression that represents that computation. And then you’re running a compilation step to convert that into the real live running computation.
Exactly. That’s why we call the thing computation. It is an unevaluated expression and that’s what they give to us. And we can look through the entire thing and then start running it.
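As a rough illustration of the "build an unevaluated expression, optimize it once, then run it" idea, here is a minimal OCaml sketch. It is not Bonsai's implementation or API: the `expr` type and the `optimize` and `eval` functions are hypothetical names invented for this example, and the only optimization shown is the constant folding mentioned earlier.

```ocaml
(* A toy "computation": a value that describes work, not the work itself.
   All names here are hypothetical; this is not Bonsai's API. *)
type expr =
  | Const of int      (* a value known before the program runs *)
  | Input of string   (* a value that only arrives at runtime *)
  | Add of expr * expr
  | Mul of expr * expr

(* The one-time "compile" step: fold sub-expressions built only from constants. *)
let rec optimize (e : expr) : expr =
  match e with
  | Const _ | Input _ -> e
  | Add (a, b) ->
    (match optimize a, optimize b with
     | Const x, Const y -> Const (x + y)
     | a', b' -> Add (a', b'))
  | Mul (a, b) ->
    (match optimize a, optimize b with
     | Const x, Const y -> Const (x * y)
     | a', b' -> Mul (a', b'))

(* The "run" step: evaluate against the current runtime inputs. *)
let rec eval (env : (string * int) list) (e : expr) : int =
  match e with
  | Const n -> n
  | Input name -> List.assoc name env
  | Add (a, b) -> eval env a + eval env b
  | Mul (a, b) -> eval env a * eval env b

(* The whole "application" is handed over as data, optimized once, then run. *)
let app = Add (Mul (Const 2, Const 3), Input "clicks")
let compiled = optimize app
```

Here `compiled` is a smaller expression than `app`, but evaluating either against the same inputs gives the same result, which is the property any such optimization pass has to preserve.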
Right. So that’s a great approach, but it also limits you in some ways.
Yeah. The way I think of an application written in Bonsai is really an application written in three different programming languages. The first one is a meta language, this language happens to be OCaml and it manipulates computation.t and composes the structure of the application. But once this meta programming is done, it’s done. You don’t get to add new possibilities for components at runtime. The second language is Bonsai itself, which when you look at code written in Bonsai, you can identify things that really look like programming language constructs. So we have our own if, we have our own way to do recursion, we have functions and ways to apply values to these functions. The third programming language is the one that you alluded to, which was that the actual business logic, the code that’s running inside of these abstract notions of the components is again, OCaml. At this level, you cannot add to the component graph. It’s restricted to performing calculations on values that are being passed through it and applying actions to transition state machines. But yeah, after stage one, you’re kind of set in stone.
And maybe I should go back a little bit and say that when I say that the graph itself is static after this first stage is finished, I don’t mean that the application can’t change itself dynamically. So in Gmail, for example, you’ve got your inbox view and your compose view, and your reading-someone-else’s-email view. And in Bonsai you can absolutely switch between all of these at runtime. But at compile time, well, what I’m calling compile time, it’s known statically that there are those many views possible and it can traverse all of those separate sub-graphs even though there isn’t any data actually flowing through it, even though those components themselves aren’t active right then and there.
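The distinction between a statically known graph and dynamic switching at runtime can be sketched in a few lines of OCaml. This is a toy model under assumed names (`view`, `component`, `all_leaves`, `active_leaves` are all hypothetical), not how Bonsai represents components.

```ocaml
(* Hypothetical toy model; none of these names come from Bonsai. *)
type view = Inbox | Compose | Reading

type component =
  | Leaf of string
  | Switch of (view * component) list  (* every possible branch is listed up front *)
  | Both of component * component

(* "Compile time": traverse every branch, whether or not it is active. *)
let rec all_leaves (c : component) : string list =
  match c with
  | Leaf name -> [ name ]
  | Switch branches -> List.concat_map (fun (_, sub) -> all_leaves sub) branches
  | Both (a, b) -> all_leaves a @ all_leaves b

(* Runtime: only the branch selected by the current data is active. *)
let rec active_leaves (current : view) (c : component) : string list =
  match c with
  | Leaf name -> [ name ]
  | Switch branches -> active_leaves current (List.assoc current branches)
  | Both (a, b) -> active_leaves current a @ active_leaves current b

let app =
  Both
    ( Leaf "sidebar"
    , Switch
        [ Inbox, Leaf "inbox_list"
        ; Compose, Leaf "editor"
        ; Reading, Leaf "message_body"
        ] )
```

`all_leaves app` sees all four leaves before any data flows, while `active_leaves Compose app` only reports the sidebar and the editor. The set of possibilities is fixed; which one is live is not.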
And this is a kind of classic trade-off here where you’re taking away some real freedom from someone who is constructing the UI, there’s some kind of highly dynamic things that are maybe impossible or at least awkward to express in the context of Bonsai. But in exchange, what you have is the ability to do more things like these kind of optimizations that you’re talking about within the framework. So in some ways, you take away power from the user of a library and you give power to the implementer of the library to provide better services to the user.
Exactly. Yeah. And I would say that the services that we provide go deeper than just compiler optimizations. I’ve been saying that I will build a debugger for a very long time and I haven’t yet, but it would be possible to build a very useful debugger that knows the exact structure of your UI ahead of time and allows you to navigate the DAG and poke values and see what the states of various components are in a way that I think would be a lot harder if we didn’t have this upfront, solidified representation of the application.
So one of the things that strikes me about the way that you describe all of this is in some sense, quite abstract, right? There’s all these fairly fancy sounding ideas about how we build computations and compose them together and how we’re going to have our own language inside of the language and our own version of if and our own version of recursion. It kind of sounds like a lot. I wonder, how does that affect the ability to construct a tool that you can give to people and they can easily use? What’s the learning curve like with Bonsai and how hard has it been for people and teams to pick it up and use it?
Yeah. So I think it’s getting a lot better. It’s still not great. I learned a whole lot trying to teach very early versions of Bonsai to people. And in fact, I think we ran about 50 people through a class that included Bonsai in that class when it was still very young, and their frustrated expressions still motivate me today. I think that one of the things that we’ve been leaning on quite heavily is the ability to extend OCaml syntax itself to make expressions in Bonsai look a lot like expressions in OCaml. So here I’m talking about the PPX system which lets me write little extensions to OCaml syntax like “let” for binding variables or “if” or “match” for evaluating different possibilities of data that’s flowing through the Bonsai graph. So in a way, the structure of the Bonsai code looks an awful lot like OCaml, but still has all of the restrictions that we were talking about earlier.
So this is another way of turning this into a programming languages problem. Not only do you do programming-language-style transformations inside of Bonsai itself, but you’ve also moved up into the OCaml syntax itself and added essentially new language features to OCaml to make it easier and more convenient to express Bonsai computations in a way that feels normal to people.
Yeah. And I think that it’s one of the things that I really enjoy about a number of our other libraries that operate monadically in that you don’t really need to know what a monad is in order to use them effectively. But if they have radically different syntax than what you’re used to, it can still be an enormous pain. So by extending the syntax for monads, I think that a lot of people that start at Jane Street don’t know what it is, don’t really need to care. They’ll learn eventually as they see numerous examples of monads in the code, but it looks so much like regular OCaml that you can think of it as though it were regular OCaml.
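To make the point about syntax concrete, here is a small sketch using OCaml's standard binding operators (available since OCaml 4.08) as a stand-in for the PPX-based syntax discussed above. The `parse_int`, `add_explicit`, and `add_sugared` functions are made up for this example.

```ocaml
(* A binding operator for the option monad, using only the standard library. *)
let ( let* ) opt f = Option.bind opt f

(* Hypothetical helper: parse a string, ignoring surrounding whitespace. *)
let parse_int (s : string) : int option = int_of_string_opt (String.trim s)

(* Without syntax support, the plumbing is visible at every step. *)
let add_explicit (a : string) (b : string) : int option =
  Option.bind (parse_int a) (fun x ->
    Option.bind (parse_int b) (fun y -> Some (x + y)))

(* With syntax support, the same logic reads like ordinary let-bindings. *)
let add_sugared (a : string) (b : string) : int option =
  let* x = parse_int a in
  let* y = parse_int b in
  Some (x + y)
```

Both functions behave identically. The second one can be read and written by someone who has never heard the word "monad", which is the property being described.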
Yeah. This kind of just reminds me of the general fact that the ability to transform the syntax of your language is such a great feature. And it’s always sad when you run into languages that don’t have it. Just having some kind of syntactic abstractions where you can just write some piece of code that transforms it to simplify things. There are downsides to various kinds of macro capabilities in that the macros themselves can be hard to reason about and can sometimes give you awkward error messages and all of that. But when done right, it’s just an enormously powerful part of the world, and large parts of the languages out there just never adopted it.
I think they’re moving in that direction though. It’s really hard to think of a programming language that has been designed in the past 10 years that hasn’t at least acknowledged the power of metaprogramming and good metaprogramming in some way. I’m thinking here specifically about like Zig and Rust as examples of languages that are very static and very serious, and they still have excellent metaprogramming capabilities.
I guess Java just recently got closures. So maybe it’ll get metaprogramming next.
Yeah. One would hope. I think C# got source code generators which don’t let you extend existing files, but let you produce new ones based off of the contents of a syntax tree.
Yeah, and it’s maybe worth saying, I think OCaml’s metaprogramming facilities are okay. They’re not great. It’s great that they’re there and they’re extremely useful, but I think people who work in Lisp or Scheme or Racket or something like that would mostly point and laugh at OCaml’s macro system.
Yeah. I think that’s right. It’s simultaneously too powerful in that you can do basically anything and inject arbitrary code, but it’s also because of that, it’s really hard to just make a tiny one-off extension to OCaml that you’re using perhaps only within lexical scope of the definition of this macro. It would be great if we had some kind of macro system in addition to compiler extensions.
Yeah. I agree. And there’s some exciting work in the direction of strengthening OCaml’s support for various kinds of macros. And also I think it would be great to have macros exposed in a modular way. Right now, what macros you’re using is a thing that’s kind of controlled entirely from the build system and is not integrated in a real way in the language. And I think that’s another thing that would make the system more usable and easier to apply in more cases. So a big part of landing a new library like Bonsai is evangelism, right? You can write the nicest library that you want, and if no one uses it, nothing good happens. How have you approached the problem of evangelizing Bonsai and getting people excited and getting people using it inside of Jane Street?
I think one of the things that I’m still very happy that I did was backwards compatibility with Incr_dom in that you can kind of spread throughout an Incr_dom program, providing value to people that have existing applications. And once they see the difference between the code that they’re maintaining and the code that they’re adding, there were a lot of people that were very motivated to convert wholesale and start new applications using just Bonsai.
And how has that gone in practice? How popular now is Bonsai compared to Incr_dom for people who are building new things?
I’m not aware of any new serious projects that aren’t using it, but I’ll also say on the point of evangelism, I really enjoy it. Well, the main goal for Bonsai was to make people want to modularize their applications and break off bits and pieces that other people might find useful. So whenever someone does this, I will be their evangelist. And I maintain an enormous collection of examples using components that I’ve written, but also a lot of the other people have written and I will make sure that any new application knows exactly what powerful features other people at the company have contributed.
What are you dissatisfied about with respect to Bonsai? What are the remaining things about it that are problematic, that you feel need to be much better than they are now?
One of the things that I’m a bit disappointed by that isn’t directly related to Bonsai, but we still get flak for it, is composing components can lead to issues when you are styling things with CSS, because if a component author wants to ship a CSS file alongside their component, there could be issues with name clashes where two people pick the same name for a thing and now they’re overriding each other’s styles. This is largely solved by doing component styling in our virtual DOM library. So styles added directly to the virtual DOM nodes. But I think that CSS is a much maligned pattern-matching language. And the things that you can do in it are incredibly useful, especially for expressing things that are possible in OCaml like highlighting every other row in a table a different color. But also there are just things in CSS that you can do that you couldn’t do with inline styles, like changing the appearance of an item based on whether or not it’s being hovered over, for example.
So this is a pretty weak point when you’re writing something using Bonsai web. And I have a project about to get merged that I hope will fix this, which also uses OCaml’s extension language. And the way that it works is by writing CSS directly in your OCaml file inside of an invocation to this compiler extension. It’ll pull out all of the CSS, look through the names, give them all unique names and then expose those names to your OCaml program so that you can use these uniquified hashed name values that are coming out of a very crisp and clean CSS string.
So the goal here is to specifically get rid of this case where you essentially pun your way into the problem where you had two different classes for nodes, and you happened to name them the same thing, and you’re just going through and making sure that you have no unintended name clashes.
Yes. Well, in fact, we do that by just hashing the contents of your string and also the file name that you’re currently in. And appending that hash will generate something that hopefully doesn’t wind up clashing with anything else. I’ll also say that the CSS language itself is not what people complain about when they complain about CSS. What they’re complaining about is the implementation of layout in the browser which can be very confusing, and the complex interplay between various styling rules. So these things aren’t going away with a PPX, they might go away with some other abstractions where we hide the details of the algorithms that are used to display and lay out content. But CSS itself, no, it’s quite nice I think.
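The hashing scheme described here can be sketched with the standard library's `Digest` module. This is only an illustration of the idea, not the actual PPX: the `unique_class` function, the `_hash_` separator, and the 10-character truncation are assumptions made for this example.

```ocaml
(* Hypothetical helper, not the real PPX: derive a class name that is unique
   to this CSS string in this file by appending a hash of both. *)
let unique_class ~(filename : string) ~(css : string) (name : string) : string =
  (* Hash the file name together with the CSS contents. The "\000" separator
     keeps ("ab", "c") and ("a", "bc") from hashing to the same input. *)
  let hash = Digest.to_hex (Digest.string (filename ^ "\000" ^ css)) in
  (* Keep a short prefix of the hex digest so the class name stays readable. *)
  name ^ "_hash_" ^ String.sub hash 0 10

let css = ".row { color: red }"
let in_table = unique_class ~filename:"table.ml" ~css "row"
let in_list = unique_class ~filename:"list.ml" ~css "row"
```

Two component authors can both write a `.row` rule, and as long as their files differ, the generated class names differ too. The result is deterministic, so rebuilding the same file yields the same names.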
Got it. So you think this does solve the primary problem here, which is this kind of name clash issue?
I think I can come up with other things that I don’t like about Bonsai. Right now, you need to repeat yourself a lot. Moving from one language to another, in this case, the Bonsai component language into the actual OCaml code that manipulates values at runtime is quite verbose. I would really like to get rid of a lot of that repetition. Right now, it’s not repetition that I think would cause people to inadvertently write bugs, but you do wind up repeating a bunch of names in multiple places, which is a bit unfortunate. And it’s something that a very recent addition to OCaml, by our own Stephen Dolan, is going to hopefully make better. And that is “let punning.” This verbosity, the repeating of these names is going to be fixed in a very new version of OCaml that lets you ignore cases where the variable that you’re binding to has the same name as the value that you’re binding.
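A small sketch of what let-punning removes, assuming OCaml 4.13 or later and using a standard binding operator on `option` as a stand-in for Bonsai's own syntax extension. The function names are invented for this example.

```ocaml
(* A binding operator for the option monad (standard OCaml, no PPX). *)
let ( let* ) opt f = Option.bind opt f

(* Before: each name is written twice, once as the variable being bound
   and once as the value being unwrapped. *)
let total_repetitive (price : int option) (quantity : int option) : int option =
  let* price = price in
  let* quantity = quantity in
  Some (price * quantity)

(* After, with let-punning: "let* x in" is shorthand for "let* x = x in". *)
let total_punned (price : int option) (quantity : int option) : int option =
  let* price in
  let* quantity in
  Some (price * quantity)
```

The two functions are equivalent. The punned form only drops the repeated names, so it changes how the code reads rather than what it does.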
So that actually leads to another question. I’m curious what you think the ups and downs are of OCaml as a language for writing web applications? The normal language used is JavaScript, and there’s lots of others: you could write in Elm, you can write in TypeScript, you can write in Dart. How do you think OCaml measures up as a language for doing this kind of work?
I have to say that web development at Jane Street is very different from web development anywhere else in that we have some of the fastest networks that money can buy. And because of that, a lot of the things that would kill Bonsai on arrival at any other company just don’t matter here: a somewhat enormous JavaScript bundle size, not being able to do server-side rendering in order to cut down on initial load times, or not being able to split an application into many files so that you’re only loading the ones that are needed on the initial display. Things like this that are commonplace in other languages and frameworks are simply not present. And that’s because here we primarily use our web browsers as a 2D vector graphics library where we don’t care too much about web pages. Almost all of them are a single page that loads a whole bunch of JavaScript, connects back to the server with a web socket connection and sits there chugging along all day. And the constraints that most web developers have to face are just simply nonexistent.
So for us, the web browser is just another place to run an OCaml program and an OCaml program that happens to generate a UI. But a lot of the ordinary stuff that are the concerns of the web just kind of don’t show up.
Yeah. I would say that OCaml is well-suited for this style of immutable program construction and its tools and its libraries make this very easy in a way that JavaScript, for example, does not.
Is there anything about the language itself that you feel like gets in the way from a kind of expressiveness and convenience point of view? Are you mostly pretty happy with OCaml as a language for this kind of work?
Yeah. Yeah, absolutely. I think if I wasn’t happy with it, we wouldn’t be having this conversation.
Well, you can be happy but have some axes to grind about things that you’d like to make better. I feel like it’s rare to be so happy about a programming language that you don’t have complaints.
Oh, I definitely have complaints. But they’re mostly from a library author’s perspective. I think there’s not a ton that I think I desperately want from my user’s perspective. I mean, I’m going to regret saying this. As soon as we’re done, I’m going to come up with five things that are actually deal breakers.
So to ask a grander question still, in what ways are you dissatisfied about the web as a platform for writing user interfaces?
I think that the one thing that every UI author has to contend with is the only options we really have are JavaScript and the DOM for programming language and UI runtime. And I think that this may change as WebAssembly picks up steam, and as they add the features that allow OCaml to compile down to it, like access to the garbage collector, or I guess the ability for us to implement our own garbage collector. I think we would be happy getting the OCaml GC to run on WebAssembly if that was also possible. But I think that the DOM is going to be here for a while. And it does a whole bunch of really amazing things. The fact that anyone really can create an accessible, styleable user interface with very little thought is unbelievably powerful. And I think that the prevalence of the web goes to show for just how valuable this is. But for people building user interface components that lie outside of what the styling and layout engines of browsers consider normal, it would be nice to be able to tell them, “Hands off, we’re going to be managing what we consider to be the DOM tree from here on out.” And be able to forward on some of the invariants that we maintain to the web browser in order for them to make it as performant as possible.
So how different is this from just drawing your own widgets on a canvas inside of the browser? In fact, I think I heard a week ago that Google is planning on doing this for various things in the Google Docs Suite. Where they have done a lot of work over time to wrestle with the DOM and get it to do all sorts of things that the DOM was never designed to do, display hundreds of thousands of lines of a spreadsheet. And it sounds like they are giving up and they’re like, “Now we’re just going to render to canvas.” Does that give you the flexibility that you need? Is that enough?
So, yes, but at great cost. The great cost being reimplementing text boxes. Clearly there’s more than just text boxes, but I think text boxes are an incredible example of complexity that you really don’t notice until it’s wrong. Things like selection not wrapping around lines correctly, or maintaining cursor position and moving them correctly through strings that aren’t necessarily ASCII. Doing text layout, doing text shaping, these are things that yeah, Google is going to have to do and they have the manpower to do it. But if you told me that I needed to implement a text box on top of canvas, well man, that’s maybe best measured in years.
So what you want is not to have just the raw ability to kind of draw stuff to the canvas, you want something that preserves more of the power of the DOM that exists but still gives people more control. What would you do? What’s the magic feature you would ask for from the web?
I guess direct integration into the layout engine and maybe APIs that allow us to tell the Chrome renderer what things to avoid drawing, direct access to the layout engine, maybe direct access to their underlying 2D graphics renderer. And I think a more powerful integration with incoming user events, things like keyboard and mouse that allow us to take a bit more power.
Is this the kind of thing that people are talking about or is this just stuff you think would be awesome but it’s not on anyone’s radar?
I think that there’s a proposal called Houdini that includes a lot of this. It’s not totally clear whether or not it will be adopted as a standard and when those features will arrive or how powerful they will be. But I think that certainly there are people, probably the same people that are working on Google Docs, that have been bitten by these problems and are eager to solve them.
So you spent all this time on Bonsai, which is a UI toolkit. Are there other tools for building UIs that you are especially jealous of or have a lot of admiration for? What are other great systems that you’ve encountered for building user interfaces?
Yeah. So I think that the Svelte project is a recent UI toolkit but also programming language that has a lot of features that I think make it very desirable for outside-of-Jane-Street UI development. It fundamentally operates on the basis that UI components are specified ahead of time. They are composed dynamically, but the UI components are passed through a compiler first and it can do optimizations, it can find places where bugs might be introduced by composing things incorrectly. And it just feels to me … I also haven’t done any programming in Svelte, but reading over the tutorials, it feels like the type of thing that might’ve been developed for the web with hindsight.
Are there any approaches to building UIs that you’ve seen outside of the web context that you’re excited about? There’s a whole world, I don’t know, there’s wizards for composing UIs together and all sorts of systems for building UIs in different contexts. Is there anything in that world that you’re excited about that you think does something impressive and good?
No. I think that UI-based UI design will almost always fall over. It will not expose enough power to the people that want it. And at that point, you need either very powerful hooks so that developers can attach things into a UI that was built in a GUI, or you need to transition that GUI into code and then hand it off to the dev. And I haven’t seen any of these work particularly well.
Arguably, one of the most successful systems for building user interfaces is Excel. I wonder how you think about that fitting in?
Excel is arguably the greatest programming language ever developed. Certainly its ability to convince non-programmers that they aren’t doing programming is unbelievable. It builds in an entire incremental framework for reducing recomputation of values. And it makes people integrate the data input that they have to do with the viewing and analysis that they want to do in a way that I just really haven’t seen replicated anywhere else.
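The idea of "reducing recomputation of values" can be sketched with a toy spreadsheet in OCaml. This says nothing about how Excel is actually implemented: the `cell` and `sheet` types, the `Sum` formula, and the invalidation strategy are all assumptions chosen to keep the example short.

```ocaml
(* A toy spreadsheet cell: either typed-in data or a formula over other cells. *)
type cell =
  | Value of float
  | Sum of string list  (* formula: add up the named cells *)

type sheet =
  { cells : (string, cell) Hashtbl.t
  ; cache : (string, float) Hashtbl.t  (* results of formulas already computed *)
  ; mutable evaluations : int          (* formula evaluations, for illustration *)
  }

let create () =
  { cells = Hashtbl.create 16; cache = Hashtbl.create 16; evaluations = 0 }

(* Read a cell, reusing a cached formula result when one exists. *)
let rec get (sheet : sheet) (name : string) : float =
  match Hashtbl.find sheet.cells name with
  | Value v -> v
  | Sum deps ->
    (match Hashtbl.find_opt sheet.cache name with
     | Some v -> v
     | None ->
       sheet.evaluations <- sheet.evaluations + 1;
       let v = List.fold_left (fun acc d -> acc +. get sheet d) 0.0 deps in
       Hashtbl.replace sheet.cache name v;
       v)

(* Does the cell [name] depend, directly or transitively, on [target]? *)
let rec depends_on (sheet : sheet) (name : string) (target : string) : bool =
  match Hashtbl.find_opt sheet.cells name with
  | Some (Sum deps) ->
    List.exists (fun d -> d = target || depends_on sheet d target) deps
  | _ -> false

(* Edit a cell and discard only the cached results that the edit can affect. *)
let set (sheet : sheet) (name : string) (cell : cell) : unit =
  Hashtbl.replace sheet.cells name cell;
  let stale =
    Hashtbl.fold
      (fun k _ acc -> if k = name || depends_on sheet k name then k :: acc else acc)
      sheet.cache
      []
  in
  List.iter (Hashtbl.remove sheet.cache) stale
```

After an edit, only formulas downstream of the edited cell are recomputed on the next read. Unrelated formulas keep their cached results, which is the behavior a user sees as the sheet updating instantly.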
Right. I mean Excel is another case where it does the thing where it’s very limiting. You get the best behavior out of Excel when you do the maximum amount possible in the equation language and the equation language is very limited. Although excitingly, they recently got let expressions and closures, but still, it’s a very limited language. But once you accept those limitations, boy oh boy, is it a nice system to work in. I don’t want to build everything inside of Excel. I mean, I feel like there’s a whole separate long podcast one could have about all the things that go wrong when you use Excel. Excel has some imperfections. You can talk about the genomics disaster of various genomic markers being misinterpreted as dates and wildly messing up computations or results. Excel is far from perfect, but as a way of constructing easy-to-use and easy-to-understand interfaces for people doing data analysis, it’s pretty impressive.
Absolutely. And I think that one of the reasons for that is that there’s this very tiny barrier between seeing something on your screen, seeing a visualization or some results for a formula and editing the formula that produced that result. You don’t need to dig through source code and find out where that value is being produced. You click on the thing and it’ll show you exactly how and why and where those values and those computations are coming from. And it’s unbelievably powerful. And I’ve been experiencing a different kind of visual-based programming in the Blender program, which is a 3D modeling, animation, and rendering tool, but I’m using it almost entirely for shading. A typical shader is source code in a language that looks a lot like C. And you write a program that operates on usually a single pixel and it produces a color at the end. Blender has a wonderful setup in the form of what they call nodes, which is this graph-construction UI where you create nodes that represent different combinators over numbers or colors or textures. And you can visually see the output of an individual node in the editor. And as you combine and compose all of these different nodes, you build up something that is practically, well, for me at least, literally impossible to do in code without seeing all these intermediate steps represented so easily.
When you say seeing the intermediate steps, do you mean seeing essentially the instructions in the intermediate steps or do you mean actually being able to interrogate and see the results in the middle of your image transformation pipeline?
The latter. There’s a single key binding that if you have a node selected, it’ll change the output for the entire shader to just be that node, and of course, all of the things that it depends on. But there’s also an incredible plugin that’ll actually inline previews above every node that you care about. So you can see just what does this node contribute to the rest of the program.
So part of the value here is just a really good debugger.
Yeah. Yeah. Oh, absolutely.
Right. And it makes sense. It’s for a kind of programming where a lot of the ordinary things you would do for understanding whether you’re doing the right thing just kind of don’t make sense. When you’re doing this numerical transformation of data, when you make mistakes, your program doesn’t do something like enormously wrong, it just looks wrong or something. And it’s hard to know what’s going on if you can’t visually inspect the intermediate components.
Right. And a lot of the tools that we bring to bear on programs written out in source code just can’t apply here. How do you write a test that passes if your shader looks good? You can’t, you need a human to look at it and to tweak the parameters and make it appease some art director somewhere in a movie studio. The debuggability and inspectability is the feature. It would not be what it is if it were merely a projection into source code or out of source code. And like with Excel, it has managed to convince a whole bunch of artists to do programming without knowing it. And I think that’s really empowering as well.
Well, I look forward to you tricking a bunch of Jane Street people into thinking that they’re not programming while they develop UIs inside of our walls, and also provide them with excellent debugging tools.
That’s right. Yeah, get back to me on that one.
Will do. Okay, well Ty, thanks for joining me. This has been a lot of fun. The world of UIs is enormously complicated, both the stuff we’re doing and just the whole world. It was fun to get your view on it all.
Thank you for having me. This has been a blast.
You can find a full transcript of the episode along with more information about some of the topics we discussed, including Ty’s Bonsai library at signalsandthreads.com. Thanks for joining us and see you next time.