Dependent Haskell with Vladislav Zavialov

Podcast

Dependent Haskell with Vladislav Zavialov

You can also follow our feed. Listen to more episodes in the archives.

Curious about dependent types? Special guest Vladislav Zavialov from Serokell breaks it down for us. We discuss comparisons to refinement types, sigma and pi types, interactions with linear types, and much more!

Episode 51 was published on 2021-08-23.

Transcript

>> Hello and welcome to the Haskell Weekly podcast. This is a show about Haskell, a purely functional programming language. I'm your host, Taylor Fausak. I'm the Director of Software Engineering at ACI Learning. I'm very excited to welcome a special guest onto the podcast this week. With me today is Vlad, or Vladislav Zavialov. Welcome Vlad, thanks for joining me.

>> Hi, I'm very glad to be a part of this podcast.

>> I'm very glad to have you. And we were talking earlier about mispronouncing names. So could you say your own name, just so that we got it on the record?

>> Sure. So my name is Vladislav Zavialov. It's a Russian name, so no worries if you don't get right. I don't worry about it too much myself.

>> All right. Well, thanks again for joining me. And for the listeners that haven't heard of you, how would you describe yourself?

>> So I'm a Haskell developer, and mostly I'm a self-taught Haskeller. So I want to make the language easy for others to learn as well. And my interest in dependent types actually started as a way of simplifying the current Haskell practices. So the more I tried to develop software with static guarantees in Haskell, the more I started using extensions, such as GADTs, type families, and so on. And I got burned bad by them. So if you're not careful with that, then you end up writing basically unmaintainable code. And now I've made my mission to change Haskell into a language where you can make — where you can write code that is simultaneously maintainable and has good static guarantees.

>> That is a good mission. And I'm happy that you're working on it. I've definitely experienced some of that pain you're talking about, where as you add more advanced type level features, the program gets simultaneously easier and harder to maintain. So I'm looking forward to some — to diving into some of the talk on dependent Haskell. But before we get there, I understand that you work for Serokell, which is a Haskell company, more or less. Right?

>> Yeah, exactly. So I represent the research division of Serokell. We have some commercial clients and we work in the areas of financial technology and other areas where high security and correctness are required. But myself, I mostly work with my team of four people and we are focusing on GHC exclusively. And maybe sometimes we help with internal products, and the blog, to write something useful for the public to make our work more visible and maybe help others get into this topic, get more familiar with dependent types in Haskell. So yeah. Basically I wanted to praise my company for giving me this opportunity because it's basically my dream job to be working on GHC. And I get to decide which tasks to work on. I have basically complete freedom of prioritizing what needs to be done in GHC, as long as I can argue why it helps us to get closer to dependent types.

>> Yeah, that sounds fantastic. It's really encouraging to have a private company with so many developers working solely on GHC. I'm merely a consumer. Me and my company are consumers of GHC. So we definitely benefit from your work. So thank you. And thank your company. You're both doing great stuff. So you touched on this already, but dependent Haskell is an area that you're interested in. And as you mentioned, I've seen some posts on the company blog about that. Both authored by you and an interview with you. So, from an extremely high level, could you tell us what are dependent types and more specifically, what is dependent Haskell?

>> Right. So dependent types are a feature of the type system. And for any given language, you could say that it either has them or it doesn't have them. Haskell is currently in a sort of gray area where it technically doesn't have them, but with enough effort and work arounds, you can get similar expressivity. That's called Singleton types. And that's current practice with GADTs and type families. So, dependently typed programming in Haskell is a topic that — well, many people are trying to do it. But one of the ways to get into it is to actually start with an actually dependently typed language such as Agda or Coq or Idris, and then try to translate those concepts back into Haskell and use the Haskell's awkward encoding of them. So that's why basically — I want to add dependent types to Haskell. To make the encoding, instead of it being awkward, to make it straightforward. And the exact description of the feature, what dependent types are, basically boils down to two built in types: sigma types and pi types. Or dependent sums and dependent products. So if we could say that Haskell has those two types, then we say that Haskell has got dependent types. And the way to understand that — these two concepts is that — actually we will also have user defined dependent types. And the analogy I would use is algebraic data types. So you have Either which represents usual, non-dependent sums. And you have tuples which represent normal non-dependent products. And you have functions. And basically when you are defining your own algebraic data type, you're just using — you're just doing it for a better API. You could, in theory, just stack tuples on top of eithers and do this like giant nested structure, where instead of defining like a data type with five constructors, you use Either five times nestedly. Yeah, and that's the same relationship between dependent types that will be user definable, and sigma types, and product types. So those two are the core concepts. If you understand them, you basically get the building blocks for dependent types. But then it's up to the users to define their own and apply them — apply the type system feature to design better APIs. Now, I will explain first dependent sums because they're easier, and then dependent products. So dependent sum, you can think of it as a generalization of normal sums. So if you think — you can think about an Either value actually as a pair. So as the first component you have the constructor, it's either left or right. And as the second component, you have the value attached to it, it's either the value of one type or the other type. And in this sense, the first value in this pair determines the type of the second value. So if you have for example, Either Int Bool. Then Left in the first component means you have an Int in the second component and Right in the first component means that you have a Bool in the second component. And this is basically the dependency. That's how the type of one value can depend on another. And product types like this — well, I mean sum types, dependent sum types — are a generalization, because instead of just Left or Right, you can have arbitrary values in the first component. So the direct correspondence would be between Either and a dependent pair, where the first component is Bool because you have two values. It would say that False corresponds to Left and True corresponds to Right. And then if you have False, then it's one type in the second component. If you have True, it's another type in your second component. But it could also be three possible values, or four possible values. And as long as you have a finite amount of possible values in the first component, you can actually still model it as a normal algebraic data types. But if you have an infinite amount of possible constructors, if you have an Int in the first component, then that's actually — you cannot define an ADT with an infinite amount of constructors. And then you would actually use a dependent pair. For example, you may want to define a vector, which is at least parameterized by its length. And this length is supposedly a compile time parameter. So it's not a part of the structure. And then you have — then maybe in some part of the code, you want to produce a vector that is of unknown length. For example, the user has entered it from the keyboard. Or you have read it from some other sort of file, or downloaded from network, and you are trying to write out this type which is a vector of ints for example of length N. But what is N? We do not know what N is at compile time. And that's when we would have to use a dependent pair, we would say that, okay, here is a pair. The first component is some N, which is known only at run time. But then in the second component we use this N, which is known — which is a value known only at runtime. We actually use it as part of the type. So we say we have a vector of this length. Length stored in the first component. But even though it's known only at runtime, it's still recognized by the type system as some symbolic value. And that's basically what dependent types means. It means that the compiler in the type system starts reasoning about values, which are actually runtime values. And you get this — the distinction between types and values blurs. You basically no longer speak of types and values as two separate worlds in which, like, your program lives. It's no longer two sub languages. And instead you start talking about the compile time and runtime phase separation. So you still know that you have erasure. And you know that these values will be passed around, and they will take memory when your program runs. And these values are only during compile time and only the type system makes use of them. But you also can mix and match those two as much as your use case requires.

>> Right. Okay. Well, thanks for that overview of dependent Haskell. I feel like already, I understand it better than I did 10 minutes ago. I especially appreciate the analogy you drew between kind of the value level ADTs for either and — eithers and tuples, versus the type level sigma and pi types, which are terms I've heard before, but never really knew what they meant. So I appreciate that. I also like drawing the parallel between the ergonomics of being able to define your custom types versus building complex types out of eithers and tuples. And I think most Haskell programmers can probably realize that those two things are isomorphic. You can convert between them without losing any information. But it would be a big pain to always work with the eithers and tuples. Using those custom types is very nice. You also mentioned a couple of things that I want to dig into a little bit. So one of the things is that you noted Haskell is kind of in this weird middle ground between having dependent types and not having them. So what motivated you, or really the larger community, to want to add dependent types to Haskell, versus trying to make another language that already has them, like, let's say Idris — you know, bring all the nice stuff that Haskell has to Idris. So, so why bring dependent types to Haskell rather than the other way around?

>> Yeah, so basically the reason for that is that there is always more than one feature that defines a language, and Idris is not Haskell plus dependent types. It's a different language. And one of the most notable differences is that it has strict evaluation. And it has it by design. There's no plans to make it lazy by default. Which is a fine design choice, but not the one that I subscribe to. I want to be writing in a lazy language. And in this sense, Haskell is very unique. There are very few other lazy languages, and the reason for that is the difficulty of implementation. Basically our hardware is not meant to execute lazy programs, and there has been a very large amount of engineering effort and insight into developing STG, the intermediate representation, the virtual machine,with lazy evaluation semantics and the backend that translates it into machine code. And then the runtime system, which supports all of that. And it's basically, in my view, a miracle that Haskell runs as fast as it does. And it's got lazy evaluation, which has all the nice — all the nice denotational properties. So when I'm writing code, I often make use of them. And if I wanted to have a language that has dependent types and lazy evaluation, Haskell is basically the most practical starting point to get there. Another one would be to work on the performance of Agda and make — because Agda is also a lazy language, by virtue of compiling to Haskell. And that I think is also a very nice thing to have. But I'm starting from Haskell because I'm more comfortable with Haskell and I have a stronger emotional attachment to it. I've been writing it for so long, I just want it to get better.

>> Yeah, that makes a lot of sense, and I can definitely appreciate that Idris is a completely different language. And, you know, compiling down to Haskell maybe isn't the same as having it a part of GHC in the first place. That does make me wonder about related kind of advanced type level efforts that have been landing in GHC recently, most notably linear types. Does dependent Haskell play nicely with linear types, or are they completely orthogonal things? Do they not care about each other at all? How do they fit together?

>> Oh, they care about each other very much. And it's a very difficult research topic of how to combine them. And basically there has been a fresh paper by Richard Eisenberg about using a formalism which unites dependent types with linear types as a property of quantifiers that introduce variables. But as far as I understand it, it currently doesn't handle multiplicity polymorphism. It's closer to what Idris 2 has. So in Idris 2, the second version of Idris, they've also combined linear types and dependent types. And every time you write a function you annotate the — in the type, whether you can use this function parameter zero times, one time, or as many as you want. And zero times corresponds to Haskell's forall, which means it's erased. It has parametricity properties that we care about, and nice performance properties. As many as you want is basically dependent types. It's a normal function value, which you can duplicate, ignore, and, since it's Idris, you can also use it in the rest of the type signature. That's pi types, which I've just realized I've yet to explain. I only explained the sigma types. But anyway, and then there is the one multiplicity annotation, which means that you've got to use it exactly once. And that's linear types. And in Haskell, we are looking towards something very similar. So we are going to have annotations on the variables introduced by a function. But we've also got to worry about multiplicity polymorphism, which has that — we actually don't know if this is a linear or a nonlinear value. For example, if we are writing the map function for lists, we would like to say, in linear types, that if the function provided by the user that we are applying to each list element is linear, then the entire mapping of the list is also linear. So if F is linear, then map F is also linear, but if F is not linear, then map F is not linear. And in this sense, like, the annotation on one function arrow determines the annotation on another, and that's where you get a multiplicity variable that's part of linear types. And the current theory doesn't actually combine those sort of linearity variables with dependent Haskell in a nice way. So we have like a design in which linear types and dependent types are sort of separate — entirely separate features. So if you have like a dependent quantifier, you can not use linearity with it. But so basically that's the design that was established before linear types were introduced into Haskell. And we're roughly following this as our guideline, or more of a fallback plan. Like if we don't figure out how to combine linear types with dependent types, then we still can get both of them. They will just not play too nicely, but there is like a real opportunity that we will figure it out. And that those will actually have a very beautiful interplay between each other.

>> Okay. So it sounds like it's still a bit of an open research problem, but also it sounds like you're hopeful. And linear types landed in GHC 9.0. Is there an expectation for when dependent Haskell may be available to end users?

>> Well, no. The problem is that with dependent Haskell there is still some design questions that need to be resolved. And that's happening as part of the GHC proposals process. So while we have like a general vision of where we want to get, we don't have the specifics because the specifics that were developed as part of the theory, they were developed in a vacuum. Like that's the dependent Haskell that we want. And then there are a million questions about backwards compatibility. Because we don't want to suddenly have, for example, one namespace for variables. If you're defining a function, you can use the same name for a type variable and the term level variable. And that will be like entirely non-problematic currently because we have two sub languages, entirely separated. And it doesn't even matter, but basically there is no direct interaction between them. And if you have two things with the same name, well, no problem. GHC always knows which one to use. But if we want types to depend on values, that suddenly means you can reference values in your types. And then it means you can reference variable names in your types. And that means that suddenly you do care about two variables in two namespaces, which are the same thing. And it's a very tough balancing act, how to introduce this and to have Reasonable error messages, reasonable user experience, instead of like unleashing this monster patch that breaks half of Hackage. And yeah, basically it has terrible error messages. No, we don't want that. So the current efforts are more focused towards figuring out how to integrate the dependently typed features and the features that exist in Haskell today. And then there is also the engineering work required because we will need to modify pretty much every bit of the GHC front-end. So that includes parsing, that includes name resolution, and that includes type checking. We will also need to modify the core language to support dependent quantification. And the core language is heavily used by the optimizer. That's like the representation on which it operates. But hopefully we will not have to change the backend. So STG and everything past that, we will not have to worry about that, it stays the same. But the front-end and the internal language, all of it requires some sort of update. There is already existing work towards that end. For example, the current internal representation that GHC uses is called System FC. Which means System F with coercions. There are very nice papers about it. And there are primitives in it which are used to represent type level equalities, which are used to decode — basically this is the abstraction into which current type level features of Haskell are compiled into. So type families are compiled into equalities. And GADTs are compiled into equalities. Data families are also compiled to — using coercions and equalities. But this is not expressive enough to represent proper dependent types — dependent sums and products. And an alternative formalism was developed, called System DC. System D with coercions. D for dependent, I suppose. And it's a new system, which is — like it's, it exists only on paper. So we will need to rewrite the very like core of GHC. The most basic — the most central representation of Haskell programs, through which the — every program basically compiles through core. We will — we are changing that to include this new feature. And that means we will need to very carefully update the optimizer as well. And, like, that's tricky engineering work. And personally, I haven't even started looking into it. Like, I'm just observing it from a distance and focusing on like the parser and the name resolution and more user-facing aspects, more than internal aspect. That's like beyond my current capability to tackle these super technical topics, although I'm planning to get into it eventually. I'm just like focusing on things that I understand right now and which are equally valuable, I think.

>> Yeah, yeah. That engineering work does sound tricky, to put it mildly. But I'm glad that you're focusing on the user facing stuff first. Cause I think, hammering out those proposals, and figuring out how should this feature even behave, is really important before digging into, how are we going to implement it into GHC? And I'm actually curious about that proposal process. Because I know that you're part of the GHC steering committee, and that you have authored many GHC proposals yourself. And that's not something that we've discussed on the podcast before. So could you tell the listeners a little about the proposal process and how, you know, what you do as part of the committee?

>> Yeah, sure. So the proposal process was conceived as a way of getting more community feedback about Haskell features. Because previously, if you've got someone who is very motivated and wants to get something into Haskell, they talk to Simon and they get it into Haskell. But then it was decided that maybe we are now a mature language which needs — which needs a proper process, and representatives from industry, from education, and from research areas to express their views. Do they actually want this feature? Or maybe they want it, but they want it to be different. And that's how the GHC steering committee's compiled. From people who represent as diverse as possible areas. And then there's also community feedback. So anyone who cares about Haskell can comment on any proposal with constructive feedback. Or even just to support it, or say that it like — it looks very difficult, please change something about its user experience. Every comment is fair game, as long as it helps the proposal author to modify his proposal. And the process is that, there is also — there is always a person who must have enough time and resources to push the proposal forward. So they've got this idea and they've got to develop the design for this feature. They've got to write a convincing motivation, convincing to everyone, both the committee and the community. They've got to research how this feature will affect the ecosystem. So they've got to describe: Will it cause breakage? How much breakage? If it does, do we want a migration strategy? So that's — and that's basically the sections of the proposal. First you've got motivation, then the proposed change specification, and then effects and interactions. And you've got to describe all of that. And hopefully if it's convincing, you submit it to the — you first discuss it with the community, the committee is involved later. You got the feedback, you incorporate it into your proposal, and when you basically get no comments from the community for two weeks or something, you decide, okay, it's time. I'm submitting it. You ping the secretary, who relays your proposal to the committee's mailing list, which is also open. Everybody can subscribe to it and see the deliberations. It's nothing secret. Nothing behind the curtain, everything is in the open. So you can see what the committee members think about this proposal, and their reasoning for either accepting it or sending it back. You don't have to follow the committee's deliberations, because in the end you get a report. Basically we either accept it, or we say, you first need to change this, that and that. And after some back and forth with the author, hopefully we get to some sort of conclusion whether this is the future that we want for Haskell or not. And then once it gets accepted to the GHC proposals repository, it's waiting for someone to implement it. So maybe if somebody wants to contribute to Haskell and they don't know where to start, this person could go through the list of accepted, but not implemented proposals and think — think through if any of those proposals appeals to them and looks implementable. Maybe they could drop the author of this proposal a message, to ask for advice for implementation, and basically start hacking. I've got like this page, GHC.dev, which helps you set up an environment and start working on GHC. But I suppose we can discuss it later.

>> Sure, yeah. Thank you for sharing the proposal process and your role and the committee's role in it. I do want to come back to GHC.dev later, but suffice to say, it's a valuable resource for getting started with GHC development. And as you mentioned, if anyone is interested in making a new proposal or implementing a proposal, that website should be able to help you out with that. But you mentioned something earlier that I want to loop back on, which is, for dependent types you had described what a sigma type is, but not what a pi type is. And you said you wanted to tell us what that is. So could you tell us now?

>> Oh right, exactly. So the sigma types are dependent sums and they generalize Either. And in the same manner, pi types are dependent product types and they generalize tuples. So with the pi type, in order to understand it first, you need to think about tuples as functions. And the isomorphism of tuples as functions works as … you can think about it as follows. So if you have a tuple of two elements, that's basically a function taking a Bool as input. If you — if you pass it True you get the first component. If you pass it False you get the second component. But in current Haskell you can — you can only use this encoding if both components have the same type. And in dependent Haskell, you can actually encode the tuple as a pi type by saying, if you pass True to this function, you will get value of one type. If you get — if you pass False to it, you get a value of another type. So dependent function is a function such that you don't know the result type of this function until you know what value it was applied to. And back to this example with vectors. Maybe you have a function replicate. And on lists you have the replicate function, which takes an integer, then takes some value, and gives you a list with this value repeated this N times. But if you've got the length of the list, in the type, you would say, okay, so replicate on vectors takes an int, then it takes a value, and it produces a vector of N — a vector of values of length N. But what is N? N is the value that we passed in. So we don't — so we know the general shape of the return type. We know it's a vector, and we know the type of its elements, but we don't know its length because the length actually is referencing the runtime value. And the reason it's called a dependent product is basically you can think about it as a giant tuple where … so … and since we are dependent on an integer. It means that, the first component of a tuple is when we have run replicate zero, and we've got an empty vector there. The second component is a vector of one element. Then the third component, a vector of two elements, then three elements, four elements. When you are invoking this function, you're accessing the Nth component of this infinite tuple and that's like the mathematical abstraction behind it. Of course, at runtime there will be no tuples. There will be just functions. But that, like — that's the intuition behind why it's called a product type, even though it's a function. Similarly a dependent sum, is actually a product in the — in the usual sense because you have the first component and second component, but it's called the sum because it generalizes the normal sums. And it's easy to get confused with it, but only because of the terminology. Actually the concepts themselves are not that confusing.

>> Yeah. Again, thank you for explaining. And I feel like I understand it even better. And having these, you know, analogies helps me wrap my head around them. Because again, I've heard of sigma and pi types before, but never really understood what they meant. And now I understand why they're not called, you know, dependent sums or dependent products. Because while they can model those things, maybe that's not the best way to think about them. Or that's not how they're actually implemented.

>> Wait, wait. Maybe I wasn't clear enough, but sigma types are dependent sums, and pi types are dependent products. So that's — those are synonymous.

>> Right. But also you said that the, sum types are implemented as tuples, right? And the tuples are implemented as functions.

>> Yeah, yeah. Okay, yeah. So you — you've got like, internally to represent a dependent sum you actually have a pair, and to represent a dependent product you actually have a function. So you've got like, to — to up the — to have the more expressive, like, more general type to represent a less expressive type, in order to add dependence to it.

>> Right. So a little tricky, but I feel like I'm getting it. That reminds me of a maybe similar effort, which other people may be familiar with, of Liquid Haskell. And I understand that that's refinement types, which are different than dependent types, but it seems to me like there's a significant amount of overlap there. But I do understand that they are different. Can you explain what the differences between, you know, the current implementation of Liquid Haskell and what may be in the future dependent Haskell?

>> Yes. So, liquid Haskell is also — refinement types are basically a limited form of dependent types. Why? Because you are referencing value level variables in your types. You are saying that — when you're taking, for example, you could assert using Liquid Haskell that when you sort a list, this list is of the same length. So in the return type of your — of your function, you can reference the length of the input type using refinement types. And this is in fact dependence, but unlike full-blown dependent types where you've got sigma and pi types, you can only use annotations like this. And this basically means two things. First, for the use cases that — where refinement types are applicable, they've got better ergonomics. So whenever it's possible, I would rather use refinement types over raw dependent types. Because it's this, like — the same thing we've discussed with, using your — user defined data type or encoding everything using primitives? Yeah. So dependent types that are introduced as part of dependent Haskell, they're a more primitive notion. While refinement types are imagined as a more user — like closer to what users will want to write. But that also means that we will — we'll be able to combine them. And eventually Liquid Haskell can be built on top of dependent Haskell as an extension to it. Which basically adds SMT solving. So every proof in dependent Haskell, you would have to write by hand. And if you want to convince the compiler that for example, addition commutes, you will have to go through the first argument of your addition and structurally recurse over it, and then prove associativity or commutativity or whatever property you care about. While with refinement types, you're relying on an SMT solver to do all the proving for you. And that also means that with refinement types when it works, it's nicer because you didn't have to write out the proofs. But when it doesn't work, you're stuck. You either get the SMT solver to do it for you or it's a lost situation. Whereas with dependent types, you just think harder and write a more complex proof.

>> Okay. Thanks for explaining. It sounds like refinement types, as implemented by Liquid Haskell, are a limited subset of dependent types. And they may play nicely together in the future.

>> Yes, definitely. And my only gripe with Liquid Haskell is that it's a separate tool. So it lags a little bit behind like the latest version of GHC. And if GHC shipped refinement types in its standard distribution, I would be a huge proponent of this feature and I would be trying to use it like in as much places as possible. As many places — as many places as possible. Yes.

>> Yeah. So, I think, you mentioned earlier the website you created for getting GHC developers kind of onboarded and up and running. Could you tell us more about that?

>> Yeah, so basically we snatched the domain GHC.dev. And I tried to think of something that would be fit for this domain. And I decided that it's got to be a page about GHC development. And I've got like, these huge grandiose ideas about writing a blog post there with the news about GHC development. And maybe some computed stats about contributions, who is the top contributor, what they're working on, and stuff like that. And the — and I would just start with the instructions, how somebody could — somebody could get hacking, and participate in GHC development. But as it happens, like that's where it's ended. So I've got this static page, but I've also had a lot of people tell me that they found it extremely useful. And it was like the only steps they could find online that worked for them to build GHC. So basically it's a very small page that's got 10 or so sections. And it says like, for documentation go here, for downloading go here, for building go here. Here are two commands to test your program. Here are two commands to run GHC in debug mode to figure out what's wrong with your patch, and so on. And I made sure that every command on this page is copyable. So it's a cheat sheet that I use myself frequently. So GHC, it's got this — an unfortunate, in my opinion, decision. It uses sub-modules. And it makes Git really flaky. And when I try to change branches, it sometimes just tells me no. So my approach to development is to have a separate GHC copy — GHC repository copy for each of the features I'm working on. And I — that means I frequently need to build from scratch. So I think, okay I need to work on yet another feature, I'm building from scratch. And I go to this GHC.dev and start copying stuff from there. So I'm not proud of that one, but it also means that others get to copy paste as well, which I guess is great.

>> Yeah. No, that is great. And I like that you use it yourself routinely because that means these commands are likely to continue to work moving into the future. Because I know one problem, not just with Haskell documentation, but pretty much any documentation, is that they may have been accurate at the time they were written, but things change. And the fact that you continue to use it means that hopefully if one of those commands doesn't work for you, you'll go and update it.

>> Yeah, definitely. And by the way, it's — I also wanted this page to be as small as possible. So, I mean in terms of download time. And I wrote it in Haskell. So it attests to the existence of CSS and the HTML libraries for Haskell. So I didn't use some — so I started thinking about writing like raw HTML and CSS and using some sort of Webpack or whatever other [JS] tool is currently the favorite one in the — in the JS community. And I like, couldn't get the result I wanted from any of them. They keep — they kept inserting things that I didn't actually want in my end product. And I decided: screw it, I'll write it with Haskell. And that was like the best decision. So Haskell is — turns out it's like a good DSL, not only for web services, where you need to generate things dynamically, but also if you need a simple static page, Haskell also comes in handy.

>> Yeah, I agree. The Haskell Weekly website itself is generated or served with Haskell. It used to be generated now it's served. So yeah, if any of the listeners are interested in contributing to GHC, I would recommend going to GHC.dev and seeing Vlad's copy pasteable commands there to get started. So Vlad, we've covered a lot of ground here. And we've talked a lot about dependent Haskell, and your contributions via the proposal process, and the committee, and this website. Is there anything else that I haven't asked you about that you want to talk about?

>> Yeah, I guess you have like a better sense of what the listeners of the podcast would be interested in hearing. So like I'm open to any questions, professional and personal. So fire away.

>> Well, I think we've covered most of the things — I have been taking notes through this whole thing. One thing you mentioned at the start of the episode that I wanted to get a little more clarity on was: you mentioned actually simplifying things with dependent types, which was intriguing to me, because dependent types themselves feel like a very advanced feature, but it does seem like you can use that advanced feature to make your implementation simpler, easier to understand, or have fewer bugs. Could you expand on that?

>> Yeah. So basically it wouldn't be a simplification of the current best practices for Haskell. Because current best practices, in my opinion, is to avoid type level features. Cause it's a rabbit hole. And as soon as you start using them, you need more and more. Because if you like stick to types where not many type parameters are involved, sometimes you just throw runtime errors and like you continue development. And sometimes it shows up in tests and you have to debug it, but like that's life. Even with dependent types, you will not prove every property of your program and you will have runtime failures. And if you want to statically guarantee that everything about your program is correct, currently Haskell isn't the right language for that. But there is like this gray area where you start — where you're trying to get as far as possible with the features that GHC currently provides. And you first start by enabling GADTs. And when you enable GADTs, you get these type indices. And some — and sometimes to compute the type index of some return type, you need to do some computation on the type indices of the input type. And you start writing closed type families in order to do that. Because that's the current practice for writing what would be a term level function, but on the type level. And once you get into type families, you are writing in a language with no evaluation semantics. It's not a lazy and not a strict language. It's like, whatever-the-type-checker-wants-to-do language. And it means that if you write a sort that in terms would have one asymptotic complexity, let's say — let's say it's a quadratic sort, maybe you're bad at algorithms, but — oh! But maybe you are not bad at algorithms. Maybe it's just that the features of type families are so annoying to use that you just write the minimum amount of code possible to get it to compile. Because you don't even have like first class case analysis. You don't have lambdas. You don't have any of that. Anyway, you write your sorting algorithm and you are thinking to yourself, well, maybe it's not the best algorithm, but it's got to work in a reasonable amount of time. But it will not finish compiling until the end of universe because in branches that should not be computed, it actually computes them all. And if you have like a condition if, and it evaluates — the conditional evaluated to true, the type checker will decide to evaluate both branches, both then and false. And as soon as you get something recursive, like it's, hopeless. I mean, it might change the — its asymptotics with the minor compiler version update. So type families: a great tool in a very limited amount of use cases. And I would say the use cases where you are not defining recursive type families. So if you just have like flat instances, associating one type to another without complex computation, then type families are like ready for prime time. Use it in production. But as soon as you start doing it, it's very easy to slip, and start writing just a little bit of recursive type families. And maybe like it works in this limited amount of circumstances, but then you're — for example, you're processing a small type level list. It used to have five elements, but now it has six, and now has 10. And with the bad asymptotics, by the time it gets to 12 elements, uh-oh. We actually had this problem, that we needed 80 gigabytes of RAM to compile our project. Yeah, and we could only use it — we could only do it on our CI machine and locally we had to put some unsafe coerces here and there just to get it to shut up. Yeah, so that wasn't a pleasant experience and I'm hoping that all of that will like go away with dependent Haskell. And one of the things that we are planning is that you would be able to use term level functions at the type level, and they will have term level evaluation semantics. Which means that you get lazy evaluation. You get reasonable asymptotic behavior. Maybe we could even, on platforms that support dynamic linking, maybe we could load the dynamic object and execute the actual term level function at native speeds. As long as you don't have any skolems in your input, and skolems are basically unknown at the current — in the current part of program, it's an unknown variable. And if you don't have any of those in your inputs, then we could do this dynamic linking thing. So it wouldn't cover everything, but it would be a possible speedup. And that's what basically Template Haskell does currently. Because Template Haskell, if you're using some function, of course it needs to have some shared object, it loads it, and then every — so for example, makeLenses. If — when you run it, the generator for your lenses, it runs at native speeds on your machine. And we are basically looking towards making this integration seamless. So instead of writing like Template Haskell splices, and working on the meta level where you are producing ASTs from values. So you get values as input, but you're producing ASTs. We are like removing this unnecessary abstraction barrier and we are producing values from values. Or types from types. And the hope is that it will have the ergonomics of term level of programming, the performance of term level programming, but integrated with the type checker.

>> That all sounds really promising. And I can definitely empathize with some of the problems that you've expressed about type level programming. Just in the last episode, my coworker, Cameron, and I were talking about our tech stack. And one of the tools that we use is Servant to implement our HTTP server. And I think Servant is kind of a poster child for something that's doing — something that's approaching dependent Haskell. Using all of these kind of weird approaches, type families and stuff like that. And we've actually run into some of those problems where either it requires too many resources, whether that's CPU time or RAM or whatever, or it's accidentally quadratic in, you know, the number of routes we have or something like that. And they've been challenging to work around. So I can appreciate that a complicated effort, like dependent Haskell, could make that type of programming easier and more ergonomic in the future. So I look forward to that.

>> Yes. So Servant is a great example. It's a library I absolutely love, and I use it whenever possible. And as long as I have not too many endpoints, it works beautifully. But like it starts to get slower when you get more endpoints. And there is nothing wrong with the library. It's just that GHC doesn't have the facilities to implement this wonderful type level approach in a more direct manner. And I hope that with dependent types, this will be a solved issue.

>> Yeah, I hope so too. It would make our development life easier. Okay, well, that was the last thing that I wanted to loop back on. So I'll throw it back to you, Vlad. Anything else that you wanted to talk about?

>> Well, I guess I will just abuse this platform to say that I'm learning German. So if anyone wants to help me, please do.

>> Maybe they can trade, they'll teach you German and you can teach them some GHC development.

>> That's definitely a trade I would do.

>> All right. Cool. Well, thank you so much for being on the show, Vlad. It's been a pleasure talking with you about dependent Haskell and everything else related to GHC development. I hope you had a good time.

>> I definitely had, and, thank you for having me. So I'll do my best not to disappoint with dependent types. Yeah, so I guess, bye everyone.

>> Oh, before you go, actually. If people want to find you online, where should they look for you?

>> So I open Twitter daily, unfortunately. So you can find me there.

>> All right. And I think your handle is @int_index. We'll put a link to it in the show notes, but yeah. Unfortunately I feel that, with Twitter.

>> Yeah. Find me there.

>> All right. Well, thanks again, Vlad. Thanks for being on the show.

>> Thank you. Bye.

>> Thanks for listening to the Haskell Weekly podcast. I've been your host Taylor Fausak. And with me today was special guest Vladislav Zavialov. If you want to find out more about the podcast, please visit our website, HaskellWeekly.News. This week we're brought to you by our employer, ITProTV, an ACI Learning company. They would like to offer you 30% off the lifetime of your subscription by using the promo code HaskellWeekly30 at checkout. Head over to ITPro.TV to get started today, and use promo code HaskellWeekly30 at checkout to get 30% off the lifetime of your subscription. Thanks again for listening and we'll see you next week.

Podcast

Dependent Haskell with Vladislav Zavialov

Links

Transcript