Podcast
Type Safety
Newtypes let you give things names, but is that type safety? Andres Schmois, Cameron Gera, and Taylor Fausak explore a blog post by Alexis King.
Episode 30 was published on 2020-11-09.
Links
- https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-type-safety/
- https://www.youtube.com/watch?v=MEmRarBL9kw
- https://np.reddit.com/r/haskell/comments/jnwg7i/haskell_foundation_ama/
Transcript
>> Hello and welcome to the Haskell Weekly podcast. I'm your host, Taylor Fausak. I'm the lead engineer at ITProTV. Earlier this week, I was at the Haskell Exchange 2020 conference, learning about the new Haskell Foundation, as announced by Simon Peyton Jones. If you haven't heard about it yet, I encourage you to go check out the talk on YouTube and the AMA on Reddit. There will be links in the show notes.
>> And I'm Cameron Gera, another engineer here at ITProTV. It is November, and the State of Haskell Survey is out there for you guys to check out. We would love to hear your opinions. It is open until November 14th, so that is this coming Saturday for those who listen right away. So please, please, please go check it out. We would love to hear your feedback.
>> And I am Andres Schmois, also an engineer here at ITProTV. I have been doing Haskell now for, I would say, about a year and a half, and I'm very much enjoying it. Uh, today we will be talking about a blog post called Names Are Not Type Safety.
>> It's a hot topic, huh?
>> Yeah. And the blog post is by Alexis King.
>> So, what did you think when you first read this blog post, Andres?
>> Uh, I tend to think blog posts whose titles tell me what something is not seem pretty aggressive from the beginning. Um, it took me a little while to realize what it was trying to say, because I start to see red when things are aggressive. But no, just kidding. Uh, the blog post tries to say that newtypes are potentially not a good thing when dealing with type safety in a language. At first I thought that the blog post tried to tell you not to use them and just to go straight for using the primitive types themselves. Uh, but then I started to realize that that's not exactly what it meant. It was more, you know, just make sure that the newtypes that you use are okay, and remember that they don't make the code type safe just because you're using them.
>> Yeah. I had a similar thought process as you, where when I first read this blog post, I thought it was trying to say we shouldn't use newtypes. But really, I've come to realize that Alexis is saying that newtypes may not give you all of the securities that you think you're getting from them, and you have to keep that in mind when you work with them. What did you think about it, Cam?
>> Yeah, so starting out, I was really intrigued at first because, you know, like Andres said, this is kind of against the grain, right? Like, don't use something that everybody uses. So it kind of felt aggressive in the beginning. But I realized that she's trying to express that newtypes have value, but don't get confused into using them just for, you know, changing the name, right? At that point, you should just use a type alias, or something lighter, less heavyweight, that you can actually change the underlying type of. Whereas with a newtype, you know, if you're exporting the constructor and allowing modification to happen anywhere, you can kind of get yourself into a bind. And, you know, I like her leading example too, which at first I thought was gonna be like, yeah, never use newtypes, always get to the bottom of the construction and make sure every impossible case is gone. She's just saying, like, that's a nice-to-have. That's something that, if you can do it, do it, because it's gonna provide that safety and allow you not to have to worry about handling, quote unquote, the impossible case.
>> Right?
>> So I thought that was good value.
>> So the example that she starts with is a hypothetical data type that represents the numbers from 1 to 5. So a very synthetic example. She shows how you can define it the maybe normal way, as an ADT where you have five constructors named One through Five, and she compares that to using a newtype wrapper around Int and kind of shows, I think, some of the pros and cons of each approach. Um, and it starts to get at the question that the title of the blog post gets at, which is: what is type safety? And it's kind of showing you that with the ADT you are actually excluding impossible cases from your code. And that's type safety, I think. Um, whereas if you use a newtype, you are giving yourself the ability to exclude cases at runtime, but you still have to account for them when you're writing your code.
>> Right, because the compiler is gonna say, Hey, you're missing something here.
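The two encodings being compared can be sketched roughly as follows. This is a minimal illustration rather than the blog post's own code; every name here (`OneToFive`, `OneToFiveInt`, `mkOneToFiveInt`, and so on) is made up for the example.

```haskell
-- Approach 1: an ADT with exactly five constructors.
-- No sixth value can be represented at all.
data OneToFive = One | Two | Three | Four | Five
  deriving (Show, Eq, Enum, Bounded)

-- Total function: every constructor is handled and there is no fallback case.
adtToInt :: OneToFive -> Int
adtToInt n = case n of
  One -> 1
  Two -> 2
  Three -> 3
  Four -> 4
  Five -> 5

-- Approach 2: a newtype around Int (hypothetical name).
-- The range is enforced only by the smart constructor below.
newtype OneToFiveInt = OneToFiveInt Int
  deriving (Show, Eq)

-- Smart constructor: rejects out-of-range values at runtime.
mkOneToFiveInt :: Int -> Maybe OneToFiveInt
mkOneToFiveInt n
  | n >= 1 && n <= 5 = Just (OneToFiveInt n)
  | otherwise = Nothing

-- Consumers of the newtype still have to write the "impossible" case,
-- because the type checker only sees an Int.
intToAdt :: OneToFiveInt -> OneToFive
intToAdt (OneToFiveInt n) = case n of
  1 -> One
  2 -> Two
  3 -> Three
  4 -> Four
  5 -> Five
  _ -> error "impossible: OneToFiveInt out of range"
```

With the ADT, adding a sixth constructor makes the compiler flag `adtToInt` as incomplete (when incomplete-pattern warnings are enabled). With the newtype, the wildcard branch in `intToAdt` silently keeps compiling.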
>> And I think we need to keep in mind exactly what your newtypes are used for. So in this example, uh, the obvious choice is to use the ADT for 1 to 5, because, you know, there are only five cases, and the compiler will tell you, when you add a new case, where you've used it. That doesn't completely remove the possibility that you use, uh, a pattern that matches on multiple cases, but that's normally looked at as, you know, a whole other topic altogether. But, um, it's more type safe to use an ADT than to use a newtype here. But what about, for example, the example that she herself shows here, which is the newtype EmailAddress wrapped around a Text? I think that's what the main question is about: is that type safe or isn't it? In my opinion, right off the top, I think it is type safe. Um, it is not always type safe, but just a wrapper around Text can be type safe.
>> Right. And that's, pardon the pun, the type of thing we use a lot in our code base here at ITProTV, where we have things that are representationally the same, like an email address and somebody's first name. Um, and we want to distinguish those, so we want to give them separate names. But we don't want to actually make a type in our code base that is the correct-by-construction encoding of an email address, because I'm struggling to think of what that would even look like. Emails are horrendously complicated. You know, you start out thinking, oh, it's pretty simple, there's only a couple parts, but, uh, we don't need to get into that. That could be an entirely separate thing. But the point for me is that, yes, you're not actually getting any additional type safety here in the sense of excluding impossible cases, but you are saying that I'm not gonna accidentally use somebody's name as an email address.
>> Right. Yeah. So bottom line is, there still needs to be some layer of validation for the underlying text of the email, if you're worried about that being, you know, something a user misconstrues, or they send in some SQL injection that messes up your entire database, which I would imagine we're gonna escape correctly, but you never know. So I think that is, you know, something to note as well: it may not be, quote unquote, the correct underlying construction of an email, but you can do validation with that type and say, hey, I'm gonna give you, quote unquote, an email from Text, and I want you to validate that it is, quote unquote, still an email. And so, you know, in that way, yes, it's an extra step, but it's a lot less headache.
>> Right. And that validation uses a very common pattern in Haskell, and probably other languages as well, where you have a module that only exposes the type and a smart constructor. The way that she talks about this in the post is, um, newtypes as tokens. So you're saying that since the only way you can construct, for example, an email address is using this blessed function in this module, then you can use the email address type as a token that says, I have validated this thing, and all of the functions that operate on that type are responsible for upholding those invariants.
>> And I'd just like to mention that we say validate a lot, but I think we all really mean parse. So, you know, there's that entire blog post by the same person that says Parse, Don't Validate. Uh, the thing I want to mention is that we don't check to see if the email is valid or, you know, if the name is valid or whatever. Um, we try to parse it, and if it fails to parse, then that is an invalid email. So I think we use validate and parse pretty interchangeably, because, you know, that's the right way, at least that we think, to program those types.
>> Yeah, for us, parsing and validating tend to be the same.
>> Right. And email addresses are maybe a bad example because they are very complicated. Even if you have something that looks like an email address and successfully parses, you know, you may never be able to send an email to that thing. Um, but bringing up the Parse, Don't Validate post is a great point, and she brings it up in this post as well, because maybe a simpler example to grapple with is the non-empty list, where if you take the approach of parsing, you might go with this newtype approach where it's a wrapper around a list, and when you construct it, you prove to yourself that it's not empty, and then you hand the thing back, and every time you operate on it, you have to grapple with that invariant. Or you can validate it and produce a data structure that carries around that validation. Um, and again, I think the crux of this blog post is showing that those approaches have different tradeoffs. Um, one is arguably less type safe, and one is more type safe.
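The smart constructor pattern described here might look something like the sketch below. The names are hypothetical, `String` stands in for `Text` to keep the example dependency-free, and the "parser" is deliberately naive (exactly one `@` with something on both sides); real email address syntax is far more involved.

```haskell
-- In a real code base this would live in its own module whose export list
-- hides the data constructor, for example (hypothetical module name):
--
--   module EmailAddress
--     ( EmailAddress          -- type only, no constructor
--     , parseEmailAddress
--     , emailAddressToString
--     ) where

newtype EmailAddress = EmailAddress String
  deriving (Show, Eq)

-- The only "blessed" way to obtain an EmailAddress.
-- Assumption: a toy rule that requires a non-empty local part, a single '@',
-- and a non-empty domain. This is not a real email grammar.
parseEmailAddress :: String -> Maybe EmailAddress
parseEmailAddress raw =
  case break (== '@') raw of
    (_ : _, '@' : domain@(_ : _))
      | '@' `notElem` domain -> Just (EmailAddress raw)
    _ -> Nothing

-- Unwrapping is always safe, so it can be exported freely.
emailAddressToString :: EmailAddress -> String
emailAddressToString (EmailAddress raw) = raw
```

Code outside the module can then treat a value of type `EmailAddress` as a token that the check has already happened. The guarantee holds only as long as every function inside the module preserves it.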
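For the non-empty list, the two approaches can be put side by side. This is a sketch under assumptions: `NonEmptyList`, `mkNonEmptyList`, `headNewtype`, and `headStructural` are hypothetical names, while `NonEmpty`, `(:|)`, and `nonEmpty` come from `Data.List.NonEmpty` in `base`.

```haskell
import Data.List.NonEmpty (NonEmpty (..), nonEmpty)

-- Newtype approach: a plain list plus a promise that it is not empty.
newtype NonEmptyList a = NonEmptyList [a]

-- The promise is established once, at construction time.
mkNonEmptyList :: [a] -> Maybe (NonEmptyList a)
mkNonEmptyList [] = Nothing
mkNonEmptyList xs = Just (NonEmptyList xs)

-- Every consumer has to deal with the case that "cannot happen",
-- because the underlying type still admits the empty list.
headNewtype :: NonEmptyList a -> a
headNewtype (NonEmptyList xs) = case xs of
  x : _ -> x
  [] -> error "impossible: empty NonEmptyList"

-- Structural approach: NonEmpty always has a first element by construction,
-- so there is no impossible case to write.
headStructural :: NonEmpty a -> a
headStructural (x :| _) = x
```

Both `mkNonEmptyList` and `nonEmpty` return `Nothing` for an empty input. The difference shows up afterwards: only the structural version lets the compiler confirm that `head` is total.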
>> So when, um, you know, we talk about using newtypes versus, you know, constructing something that could never have the impossible case, and we do different types of development, in what situation would we choose to use a newtype versus, you know, the alternative? Well, it's not really the only alternative, but it's the alternative in this discussion: constructing out the impossible case.
>> So I think that using newtypes, um, is usually a preferred way that I like to program because, um, as long as these newtypes are simple and they don't do more than, you know, they need to, I think you're constraining yourself to this, like, module. Not module in the Haskell module sense, but this section. Um, and when you do the opposite, which is to use a type alias or just not use the newtype, you're opening this up to pretty much whatever it wants to be or whatever it can be, which, you know, if we use a Text, could just be a string of characters. Um, now, I don't think it's bad, but the only downside of using a newtype is the feeling. Just because a newtype is there, for some people it might feel as if it's type safe or isn't type safe. And that really should just be dependent on your programming, not the actual code itself. So just because you're using a newtype doesn't necessarily mean it's not type safe. It just means it can be, you know, something of a newtype. So I think we need to focus on what the alternative is to using a newtype. Do we not use them? Do we use type aliases? Do we just, you know, use String, Text, Int, whatever? Do we go even crazier and go into Liquid Haskell or refinement types or anything like that? Um, I don't know, but I prefer to use newtypes, and I don't think this blog tells you what to do. Um, I think that's my critique: this blog doesn't really say anything. It doesn't say don't do this or do that. It just says, you know, things about how Alexis feels when programming, which is totally fine. Uh, but that left me with the question of, like, okay, what do I do? And so, yeah, that's where I am.
>> Yeah, this blog post is not prescriptive. It doesn't say use newtypes in this situation or use ADTs in this other situation. And you mentioned Liquid Haskell, refinement typing, dependent typing. I think that there is a sliding scale here, where at one end you have using the primitive types or the types that are already available to you: String, Text, Int, whatever. And then as you progress, you introduce type aliases, which don't give you any safety at all, but they let you, um, express intent, maybe, with your type signatures. And then you have newtypes, which give you a little bit more type safety than type aliases, but less than ADTs. And then ADTs give you even more, and that's kind of where it stops for Haskell. But as you add Liquid Haskell, you could do refinement types. So with that example we started with, a number between one and five, with Liquid Haskell you could have a function that takes a regular primitive Int, but you can express that there's a refinement on it where it has to be between one and five, and it will check that for you. Um, and then all the way on the far end of the spectrum, you can do dependent typing, like Agda or Idris. Um, and Haskell seems to be inching its way in that direction as well. Um, and for me, I think I'm in the same boat as you, Andres, where my preference is to stay as close to the simpler or primitive side of that until some problem happens. And that, I think, is for me the benefit of newtypes: you can start to see the invariants that you wish your type had. So, like, you can start with a newtype around Text. But if you're always, like with an email address, splitting on the at sign, then you may think, okay, I should make a type where it's already split up, and I don't have to do that all the time. Um, but, you know, clearly this comes with experience, and it's a judgment call. So, like, when we separate email addresses from first names, that's not because we've had a bunch of bugs in our past at ITProTV where we mixed them up. It's that in my professional experience, I have seen, or maybe not even seen but been afraid of, bugs like that. So I just want to exclude them right out of the gate. I don't wanna have them be possible.
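The first few steps of that sliding scale can be sketched in plain Haskell. All names below are hypothetical, and `String` again stands in for `Text` so the example needs nothing beyond `base`.

```haskell
-- Step 1: a type alias. It documents intent in signatures, but any String
-- is accepted wherever a FirstNameAlias is expected.
type FirstNameAlias = String

-- Step 2: newtypes. Same representation, but the compiler now refuses to
-- mix a first name up with an email address.
newtype FirstName = FirstName String
  deriving (Show, Eq)

newtype EmailAddress = EmailAddress String
  deriving (Show, Eq)

greet :: FirstName -> String
greet (FirstName name) = "Hello, " ++ name
-- greet (EmailAddress "user@example.com") would be rejected by the type checker.

-- Step 3: a structured type where the split on '@' has already happened,
-- so no caller has to repeat it.
data SplitEmail = SplitEmail
  { localPart :: String
  , domainPart :: String
  }
  deriving (Show, Eq)

-- Splits on the first '@'. Assumption: no further validation is attempted.
splitEmail :: EmailAddress -> Maybe SplitEmail
splitEmail (EmailAddress raw) =
  case break (== '@') raw of
    (local, '@' : domain) -> Just (SplitEmail local domain)
    _ -> Nothing

renderEmail :: SplitEmail -> String
renderEmail (SplitEmail local domain) = local ++ "@" ++ domain
```

Moving from step 2 to step 3 is the kind of refactoring the newtype makes visible: once every caller is calling `splitEmail`, the structured type is probably the one the code wanted all along.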
>> Yeah, and I would say I'm going to continue to use newtypes, and I think they are very good for what they do. I do think I've been bit a couple of times by, you know, some underscore case that captures, you know, something that I wasn't expecting, which, like Andres said, is a programming error, not necessarily a language error. Um, but I think factoring out the impossible case is ideal at some point, kind of like you said, Taylor. Like, if we're always splitting at the at sign, then let's just make a type, you know, some email with two Text pieces. Ideally, it would be easy to refactor or factor out the impossible case, but sometimes it's not, so newtypes are really sometimes the best you can do. And so, you know, I think she was trying to bring some awareness to the fact that, like, naming something as something else doesn't really guarantee it's always gonna be right. And, you know, I appreciate her going out on a limb in that regard, but it doesn't necessarily change my feeling because, like you said, it wasn't really like a, hey, don't do this, but do this instead, which you see a lot in blog posts. I think it was just saying, hey, just be aware, this doesn't really mean you're more type safe because you're using newtypes.
>> Yeah, and it also does a good job of comparing these things against each other, the data type versus the newtype, and showing you how you work with them, how you produce them, how you consume them. Um, and that highlights, I think, some of the costs of newtypes, where you have to do all of this wrapping and unwrapping and derive all these instances that are already available on the type you're wrapping, so it's kind of like boilerplate. And, um, maybe it's a position of privilege as a Haskell programmer to call two lines of a bunch of type classes boilerplate. You know, you're not actually writing all those instances, but you still have to derive them.
>> Um, you know, we talked a lot about what things are and aren't, and I think one thing that we probably should mention is you can misuse newtypes, and that's true of almost any programming feature. You can most of the time misuse something. And I think one thing in this blog post is the ArgumentName example, where it's just some kind of complex type that will derive a whole bunch of stuff. And at the end of the day, it just turns out that using this newtype is exactly the same as using a type alias. Now, in that case, and probably many other cases, this is not type safe. We are allowing the possibility of constructing the type without actually touching the type. And so that is, I think, something to point out: just because you use a newtype doesn't necessarily mean that you have any type safety. Now, you can use newtypes and have type safety, but you have to make sure that your newtype does what you want it to do. And there are a lot of examples of how to make newtypes into basically not type safe types. And I think a lot of that, you know, you go into coerce, or converting from one to the other, convertible, you know, that kind of stuff, that can get fuzzy between type safe and not type safe. So that's a good argument, and I think that this blog makes very good arguments. But we should be mindful that newtypes are a very useful tool and we shouldn't shy away from them. I mean, I don't think anyone's ever going to shy away from newtypes. They are one of the most used features of Haskell. I just wanted to mention that we shouldn't just start removing all newtypes from our code bases.
>> Newtypes are definitely a mainstay of Haskell programming, and I doubt they'll go anywhere. On the ArgumentName example, I think it's hard to say in a vacuum if it is truly unnecessary and should have been a type alias, or even, like, a Haddock comment on arguments. You know, it's called ArgumentName, so that's confusing, but on function parameters you'd put a comment saying, this is the argument name. Um, but it could be that in this code base, there are lots of different types of names, and it's worthwhile to differentiate between them, like an argument name versus a function name, or an argument name versus a module name. Um, so we can't really say. You know, we're not looking at the code base, so we'll have to take Alexis's word here that this was, in fact, an unnecessary wrapper, where you pay the cost of wrapping and unwrapping it everywhere, but you don't ever get any benefit from it.
>> Yeah, and I think the main thing to look into here is something like OverloadedStrings with, um, IsString deriving. I think that is what makes a newtype potentially useless. Um, it's still not useless, because there's still, um, you know, the value from using it in a function, for example. But in terms of type safety, you've just thrown it out the window. It's no longer just an email address. It is now any string. So any string could become an email address, which I think should have been talked about more, um, you know, in terms of whether or not we should be focusing on type safety as conversion. So if we want to go from something into a newtype, what is the best way? Do we use wrapping functions? Do we use record type wrapping? Do we use instances? You know, I think that's very interesting, and it's a problem that's been plaguing me, you know, since I started, and I would like to have an answer to it. It's still something that I'm struggling with, and I'm sure there are answers, or many answers. But what is the right one?
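The `OverloadedStrings` concern can be reproduced in a few lines. This is an illustrative sketch: the `EmailAddress` type and the `bogus` value are hypothetical, and `String` stands in for `Text`. The extensions and the `IsString` class are standard GHC and `base` features.

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString)

-- Deriving IsString through the newtype means every string literal
-- can silently become an EmailAddress.
newtype EmailAddress = EmailAddress String
  deriving stock (Show, Eq)
  deriving newtype (IsString)

emailAddressToString :: EmailAddress -> String
emailAddressToString (EmailAddress raw) = raw

-- This type checks even though the literal is clearly not an email address.
-- No smart constructor or parser is ever consulted.
bogus :: EmailAddress
bogus = "definitely not an email address"
```

Hiding the data constructor does not help here, because the derived instance is a second, unchecked way to build the value. If the newtype is meant to act as a token for a validated value, leaving out the `IsString` instance keeps the smart constructor as the only entry point.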
>> Yeah, I think we've stumbled upon yet another good topic for another podcast episode, so we'll have to talk about that one of these days.
>> Yes.
>> Well, Cam, do you have any closing thoughts about this Names Are Not Type Safety blog post?
>> I think we've covered it really well. I really appreciate you guys coming on the podcast today to kind of talk about it. I really value you guys's opinions. You're a little more seasoned under the development belt than I am, so, you know, I've learned a lot from your opinions as well as Alexis's here. So yeah, I think it was a good post. It created some good conversation and, it seems like, what, four or five new podcast topics. So, you know, I think we've got a lot going for us here.
>> Yeah, agreed. Andres, how about you?
>> Uh, yeah. I don't have much more to add. I'd just like to reiterate: newtypes have good sides and bad sides. I think we should be very mindful of them. And, uh, you know, the better code you write is always going to be the better answer, regardless of what others think.
>> Yeah. All right. Well, I think that will do it for us today. Thank you for listening to the Haskell Weekly podcast. I have been your host, Taylor Fausak. Um, if you wanna follow us on social media, you can check out our website, which is haskellweekly.news. From there, you can find links to our Twitter, Reddit, GitHub, all the various places we are.
>> Yep. And Haskell Weekly is brought to you by ITProTV, an e-learning platform for IT professionals and also our employer. They want to extend a gracious offer to you guys of 30% off the lifetime of your subscription with the promo code HASKELLWEEKLY30. All one word, all caps. Easy peasy. If you ever have any questions, the Member Services team is pretty bomb here at ITProTV, so they'd love to help you out. But go check it out. We'd love to see you seeing what's going on in the IT world.
>> Stay safe.
>> See you next week.
>> Take it easy.