Podcast
Bad Parts
Even well-designed languages have rough edges. Cameron Gera and Taylor Fausak review some of the bad parts of Haskell.
Episode 29 was published on 2020-11-02.
Transcript
>> Hello and welcome to the Haskell Weekly podcast. I'm your host, Taylor Fausak. I'm the lead engineer at ITProTV. With me today is one of the engineers on my team, Cameron Gera. Thanks for joining me today, Cam.
>> Hey, Taylor. Glad to be here. Excited. It is a new month. It is November, which is going to be a great month. It's also my birthday month, so I'm a little biased towards that. But October was Hacktoberfest. That's behind us. We're moving on. Thanks to all of you who participated. You know, it's good to give that open source community some love. And I am just glad that we all got to participate. But this month is even more exciting. We have the State of Haskell survey that is out. It is live. It lives in this week's edition of Haskell Weekly and also in our show notes. It's gonna give you a chance to use your voice and share how you interact with Haskell, the things you like, and the things you think need some more love. We're really glad that we can offer this to you. You know, it's a Haskell Weekly exclusive thing. So we do this, we collect this data, and we share it with the community. We've done this for four years. And we're just glad to be kicking it off this month. So thank you, Taylor, for kind of heading that up.
>> Yeah, I'm happy to do it. And this, like you mentioned, is the fourth year. So if you're interested in what it looked like previous years, you can go back and take a look at those results.
>> You mean we could go back in time?
>> We can. We have that power through the power of HTML and old pages.
>> Nice.
>> But today we're not only talking about the Haskell survey, we're also talking about this blog post by Michael Snoyman called "Haskell: The Bad Parts, part 1", implying somehow there's more than one bad part of Haskell.
>> It could be, you know, everybody uses Haskell a little differently. So I think this was a great post. I really appreciate his kind of banter in it as well. He was pretty lighthearted, but also really giving you the chance to learn some things you may not have known before, because I know I've learned some things reading this blog post.
>> It's nice to see a critical post that's not written in a really critical tone. You can tell that Michael cares about Haskell, the Haskell community, and the language, and he wants it to be better. So he's writing this post to point out some of the things that he's spotted that maybe could be better. So what's the first thing that he noticed?
>> Yeah. So he first starts off talking about the foldl function, which is lazy, but that doesn't quite make it useful, right? It doesn't really give us what we want. Um, you know, we tend to run away from foldl because we have other alternatives that actually work well for us and don't cause memory leaks, like foldr and foldl', which are strict. Well, foldl' is strict, foldr is still lazy, but you know, when you're folding left, lazy doesn't quite make sense.
>> Yeah, it's funny because foldl is kind of too lazy. foldr is lazy in the spine of the list that it's folding over, and foldl' is strict in the combining function, so foldl, no prime, is just, like, too lazy. Never gets anything done.
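To make the three folds discussed above concrete, here is a minimal sketch. The names `sumLazy`, `sumStrict`, and `firstEven` are hypothetical and only serve the illustration. Whether `sumLazy` actually builds up thunks depends on optimization settings, since GHC's strictness analysis can sometimes rescue it, so treat the comments as describing the unoptimized behavior. Note that `foldl'` forces its accumulator to weak head normal form at each step.

```haskell
import Data.List (foldl')

-- Lazy left fold: without optimization this builds a chain of unevaluated
-- (+) thunks as long as the list before any addition happens.
sumLazy :: [Integer] -> Integer
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is forced at every step, so no chain
-- of thunks accumulates.
sumStrict :: [Integer] -> Integer
sumStrict = foldl' (+) 0

-- Right fold: lazy in the rest of the list, so it can stop early, even on
-- an infinite list.
firstEven :: [Integer] -> Maybe Integer
firstEven = foldr (\x rest -> if even x then Just x else rest) Nothing

main :: IO ()
main = do
  print (sumStrict [1 .. 1000000]) -- 500000500000
  print (sumLazy [1 .. 1000]) -- 500500
  print (firstEven [1 ..]) -- Just 2
```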
>> I mean, I wish I could be that lazy and still get used, right? Um, yeah, but if you wanted to actually have a lazy foldl, Snoyman jokingly says you should name it foldlButLazyIReallyMeanIt. Or else I think he'll probably tell you about yourself a little bit if you create a PR against a code base he maintains, which, I appreciate his brutal honesty there. I think he's a lover of Haskell. So he's very passionate about good code. And I'm thankful that we have people like him in our community.
>> Yeah. Same. And in the Haskell community, we often use the prime to indicate that this is a strict version of a function. But there's no similar thing for saying this is a lazy version of a function. And I like that suffix, ButLazyIReallyMeanIt. That could stand in for prime. It's a little bit wordier, but I think it does the same job.
>> Right, well, I know. Don't we use tilde to enforce laziness when we're enabling the StrictData language extension? Could we use tilde in function names?
>> I wish you could. In the same way that you can use, like you mentioned, tilde on fields to mark them as lazy, or even on pattern matches to mark them as lazy. And on the other side, you can use exclamation points to mark things as strict. But you can't use those in function names, unfortunately. I would love it if you could say foldl~ or foldl!. It would look a little more like Ruby or Clojure or something like that. But I think that would be nice. And we wouldn't have to repurpose prime for all these different things.
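The annotations mentioned above can be sketched as follows. This is only an illustration: `Pair`, `firstField`, `lazyMatch`, and `sumTo` are hypothetical names, and the example assumes the `StrictData` and `BangPatterns` extensions are enabled.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE StrictData #-}

-- With StrictData every field is strict by default. The tilde opts the
-- second field back into laziness.
data Pair = Pair Int ~Int

firstField :: Pair -> Int
firstField (Pair x _) = x

-- A lazy (irrefutable) pattern: the argument is never forced, because
-- nothing bound by the pattern is used.
lazyMatch :: (Int, Int) -> Int
lazyMatch ~(_, _) = 0

-- A bang pattern: the accumulator is forced on every iteration.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go !acc i
      | i > n = acc
      | otherwise = go (acc + i) (i + 1)

main :: IO ()
main = do
  print (firstField (Pair 1 undefined)) -- 1, the lazy field is never forced
  print (lazyMatch undefined) -- 0
  print (sumTo 100) -- 5050
```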
>> Could that be, in part two of this, uh, podcast and blog post?
>> It might be, I think, that if Haskell allowed you to do that, some people might think it was a bad part. But who knows?
>> Nice. Well, you know, everyone's got an opinion that's totally valid. I'm not gonna ever discredit someone's opinion. Most people are smarter than me, so I'm okay with that.
>> Speaking of folding or reducing functions that are lazier than they should be, the next bad part that Snoyman points out is sum and product. They do different things, but they kind of work in the same way. And what is it that he says about those?
>> Yeah, so you know, he kind of talks about how the combining function determines the strictness of the sum and product reducing functions, which is a little odd. You wouldn't expect a function you pass in to determine the strictness. And, you know, it only really works for, you know, hypothetically lazy numeric types, so you could create your own sum and, you know, product functions that are quote unquote lazy. But that doesn't necessarily make a ton of sense, like, you can't add a number if there's not another number to add it to. Right?
>> Right. But this is a little tricky because a lot of times when people talk about this, they're talking, like you mentioned, about some hypothetical lazy numeric type that may or may not actually exist and may or may not actually be used. So one that pops up a lot is the Peano numbers, where you have zero and then successor, and from those you build up everything else. So one is successor zero. Two is successor, successor zero. And this is where you could actually benefit from laziness because you could have some logic that wants to take the sum of some potentially infinite list of Peano numbers. But once that sum goes above, let's say, 100, then you stop doing any work, and if your number is lazy, you could do that without consuming the entire list. But if you're using a much more typical numeric type like Int or Integer, then you're going to have to consume the entire list to produce the sum of that list. Which makes sense.
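The Peano scenario described above can be sketched like this. Everything here (`Nat`, `plus`, `sumNat`, `exceeds`, `toNat`) is a hypothetical illustration rather than code from the blog post. The sum is written with `foldr` so that the result is produced one `Succ` at a time, which lets `exceeds` stop after inspecting just over 100 constructors, even when the list is infinite.

```haskell
-- Peano naturals: zero, or the successor of another natural.
data Nat = Zero | Succ Nat

-- Addition that yields its outermost Succ before looking at the second
-- argument, which keeps the sum below productive.
plus :: Nat -> Nat -> Nat
plus Zero n = n
plus (Succ m) n = Succ (plus m n)

-- A lazy sum: only as much of the list is consumed as the caller demands.
sumNat :: [Nat] -> Nat
sumNat = foldr plus Zero

-- Is the number strictly greater than the limit? Inspects at most
-- limit + 1 constructors.
exceeds :: Int -> Nat -> Bool
exceeds limit n
  | limit < 0 = True
  | otherwise = case n of
      Zero -> False
      Succ rest -> exceeds (limit - 1) rest

toNat :: Int -> Nat
toNat k
  | k <= 0 = Zero
  | otherwise = Succ (toNat (k - 1))

main :: IO ()
main = do
  print (exceeds 100 (sumNat (repeat (toNat 3)))) -- True, despite the infinite list
  print (exceeds 100 (sumNat (map toNat [10, 20, 30]))) -- False, the sum is 60
```

With `Int` or `Integer` there is no partial result to inspect, so the same question would require consuming the whole list first.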
>> Yeah, definitely makes sense. Um, yeah. He also kind of talks about how you can't really rely on GHC's optimizer to know what to make strict versus what not to make strict. And, you know, it makes more sense to say, Hey, I want this function to be strict and write it as such, rather than relying on the compiler to optimize it to be strict.
>> Right. It's not useful to tell the compiler you're using a lazy function and then expect it to notice that it could be strict, or ultimately it is strict, and then optimize it away for you. Just write the strict version in the first place. But the good news here is that there is an open merge request against GHC to fix this problem with sum and product, to make them strict. And this came out of a discussion on one of the mailing lists. Probably Haskell Cafe, but I don't remember right now, and it was actually a pretty quick turnaround. I think it's been like less than a month since someone said, Hey, this doesn't look quite right. Let's get it fixed. And then they opened a merge request, and so yeah, one of the four bad parts of Haskell has already been fixed.
>> Woohoo! Now we just got to cut a new version. We're good to go.
>> Yeah. So what's the next bad part?
>> Yeah, so the next one he talks about is Data.Text.IO. So we talk a lot about: Do we use String? Do we use Text? And the community tends to move away from String in favor of Text. But Data.Text.IO gives us a false sense of security when dealing with, in particular, readFile. That doesn't ensure a specific encoding; it depends on the locale. So I feel like this is a problem in Haskell and a lot of languages, that, you know, you try to read a file and it's in a format or locale you haven't programmed for, and you know you're gonna get an exception that, I mean, honestly, catches you off guard.
>> We've run into this one quite a bit at work because we run a lot of our stuff in Docker containers and those containers are pretty minimal. So at first we weren't setting LANG or LC_ALL. There's a couple of environment variables you can set to control which locale you're in by default. And we weren't setting those. So when we ran some report that produced a CSV that happened to contain a non-ASCII character in it, it would try to print it out, and then it would crash because it says, Hey, I tried to print something to the screen, but the encoding I was using didn't match up with what you wanted. That was really surprising and difficult to work around. And I remember I wasn't the one that ran into it, but somebody asked me for help. And I was like, Oh, yeah, I've seen this exact problem before where it just says, like, invalid character or whatever the error is, and I happened to know what the solution is. But if I didn't already have that experience, I would have been completely lost. So this can be completely mystifying.
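As a rough illustration of the situation described above, the sketch below inspects the encoding GHC picked up from the environment and then explicitly switches stdout to UTF-8 before printing non-ASCII text. This is one possible workaround using only `System.IO` from base, not necessarily the fix the hosts applied (they mention setting `LANG` or `LC_ALL`). The name `forceUtf8Stdout` is hypothetical.

```haskell
import System.IO (hGetEncoding, hSetEncoding, localeEncoding, stdout, utf8)

-- Make stdout encode text as UTF-8 regardless of the process locale.
forceUtf8Stdout :: IO ()
forceUtf8Stdout = hSetEncoding stdout utf8

main :: IO ()
main = do
  -- The encoding derived from the environment (LANG, LC_ALL, and so on).
  putStrLn ("locale encoding: " ++ show localeEncoding)
  forceUtf8Stdout
  current <- hGetEncoding stdout
  putStrLn ("stdout encoding: " ++ show current)
  -- Without the call above, this line can fail in a minimal container
  -- whose locale cannot represent these characters.
  putStrLn "caf\233 na\239ve"
```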
>> Yeah, and I love how he uses the metaphor of a wolf in sheep's clothing, right? Like you think it's gonna be nice. It's going to do what you need it to do. But then it bites you in the butt, and you're like, Holy cow, what just happened? And like, I mean, I probably was that person because I do remember a similar situation, dealing with this script that we were running and getting a file from, you know, some company, and they had special characters and we didn't set the locale encoding correctly, and we were kicking ourselves. So, um.
>> Yep, and we've got a fix these days, so we're good now. But it was confusing at the time, and another place that I've run into this before is when switching between Linux and Windows. In most desktop Linux environments, the default locale is UTF-8, but on Windows, that's not the case. So I'll take software that works perfectly fine on my Linux desktop and on my Mac laptop. But as soon as I try to run it on my Windows machine, everything stops working. And it's like, Oh, right. I either have to change my code page on Windows or explicitly set the encoding or just stop using Data.Text.IO, which is often the easiest answer. And in fact, the rio library that Snoyman works on has a function that does this for you called readFileUtf8. And it sets the encoding for you to UTF-8 and then reads the file.
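The idea of fixing the encoding before reading can be sketched with base and text alone. The functions `readFileUtf8Sketch` and `writeFileUtf8Sketch` are hypothetical stand-ins written for this illustration; rio's real `readFileUtf8` may well be implemented differently. The file name `utf8-sketch.txt` is also just an assumption for the demo, and the file is left behind in the current directory.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO (IOMode (ReadMode, WriteMode), hSetEncoding, utf8, withFile)

-- Read a whole file as UTF-8, ignoring the process locale.
-- TIO.hGetContents is strict, so the handle can be closed safely afterwards.
readFileUtf8Sketch :: FilePath -> IO T.Text
readFileUtf8Sketch path =
  withFile path ReadMode $ \handle -> do
    hSetEncoding handle utf8
    TIO.hGetContents handle

-- Write a whole file as UTF-8, ignoring the process locale.
writeFileUtf8Sketch :: FilePath -> T.Text -> IO ()
writeFileUtf8Sketch path contents =
  withFile path WriteMode $ \handle -> do
    hSetEncoding handle utf8
    TIO.hPutStr handle contents

main :: IO ()
main = do
  let path = "utf8-sketch.txt"
      sample = T.pack "caf\233 na\239ve"
  writeFileUtf8Sketch path sample
  contents <- readFileUtf8Sketch path
  -- Only a Bool is printed, so the demo itself does not depend on the
  -- encoding of stdout.
  print (contents == sample) -- True
```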
>> Right, so you tend to not need to worry at that point, right? Because you know what the encoding is gonna be, because this function's well named as well, right? It's saying, Hey, you're gonna read a UTF-8 file. But it also does what it says it's supposed to do, whereas we get this false sense of security with Data.Text.IO.
>> Snoyman really has a knack for naming. He's got foldlButLazyIReallyMeanIt and readFileUtf8. Another thing here is that when you're reading a file, often the file type has an encoding already specified. So, like with JSON or XML, you know up front that those are going to be UTF-8. So if your system locale is set to UTF-16 and you want to try to read a JSON file, you probably don't want to read that file using the UTF-16 system locale. And that's where the problem comes in.
>> Gotcha. Well, thanks for diving into that one. What's the next one that he talks about?
>> So the next one and the last one, at least for part one of the bad parts, is the bracket function from Control.Exception. And this is a tricky one, for sure.
>> Yeah, I think the fact is that async exceptions are tricky to deal with. I know in our code base we've had some odd async exception handling that has kind of baffled us. With time and effort, we get to where we need to be, but in the moment it's a lot.
>> This is tricky to even talk about, because with async exceptions, when you hear that you may think, like, Oh, that's an exception that was thrown on an async thread, uh, something that was either spawned with forkIO or with the async package. But that's not quite what it means. And there's a good chance that I will say something wrong in this part because I'm not well versed in exception handling in Haskell myself. Snoyman has spent a ton of time on this. And whenever I run into an exception problem, if there's a blog post by Snoyman about that particular problem I'm encountering, I feel really grateful and happy because I know I can just use that as a solution. But the way I understand it is that async exceptions are exceptions that are thrown from another thread into your thread, and they can happen during pure code. But I guess that's kind of beside the point here. I just wanted to mention that async exceptions as a term can be a little confusing.
>> Right, I think I was probably using it incorrectly, so I appreciate you sharing that.
>> But the part about bracket that he thinks is a bad part or a wart in Haskell is that, I guess if people are unfamiliar with bracket. It's a function where you can acquire some resource and then perform some action on that resource and clean up the resource after you're done with it. So the most common example of this is opening a file, writing something to it and then closing the file. The purpose of bracket is to make sure that opening and closing always happen, regardless of if an exception is thrown while you're dealing with the resource while you're writing to or reading from that file or whatever it is you're doing. And this is really nice, because the naive code you would write to do this, where if you're in a do block and you say, open the file, grab that handle write to the file and then close it, that reads really nicely top to bottom. But if I was going through that in code review, I would say, Well, what happens if an exception is thrown while we're writing to the file, then this file handle is still open and it never gets closed. That's the problem that bracket is meant to solve.
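The file-handle scenario described above can be sketched as follows. The helpers `withFileNaive`, `withFileBracketed`, `failingWrite`, and `stillOpenAfter`, and the file name `bracket-sketch.txt`, are hypothetical names for this illustration. The failing action stashes the handle in an `IORef` so the demo can check afterwards whether it was closed.

```haskell
import Control.Exception (ErrorCall (ErrorCall), bracket, throwIO, try)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO (Handle, IOMode (WriteMode), hClose, hIsOpen, hPutStrLn, openFile)

-- Reads nicely top to bottom, but hClose is skipped if the action throws.
withFileNaive :: FilePath -> (Handle -> IO a) -> IO a
withFileNaive path action = do
  handle <- openFile path WriteMode
  result <- action handle
  hClose handle
  pure result

-- bracket runs hClose whether or not the action throws.
withFileBracketed :: FilePath -> (Handle -> IO a) -> IO a
withFileBracketed path = bracket (openFile path WriteMode) hClose

-- Remembers the handle, writes a line, then simulates a failure.
failingWrite :: IORef (Maybe Handle) -> Handle -> IO ()
failingWrite ref handle = do
  writeIORef ref (Just handle)
  hPutStrLn handle "partial"
  throwIO (ErrorCall "simulated failure")

-- Runs the failing write and reports whether the handle leaked.
stillOpenAfter :: (FilePath -> (Handle -> IO ()) -> IO ()) -> IO Bool
stillOpenAfter withFile' = do
  ref <- newIORef Nothing
  _ <- try (withFile' "bracket-sketch.txt" (failingWrite ref)) :: IO (Either ErrorCall ())
  leaked <- readIORef ref
  case leaked of
    Nothing -> pure False
    Just handle -> do
      open <- hIsOpen handle
      hClose handle -- tidy up so the next run can reopen the file
      pure open

main :: IO ()
main = do
  stillOpenAfter withFileNaive >>= print -- True, the handle leaked
  stillOpenAfter withFileBracketed >>= print -- False, bracket closed it
```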
>> So what's wrong with bracket? Why not use bracket?
>> The problem with bracket is that the cleanup action, so like closing a file, uses a thing called an interruptible mask, and this is where we kind of get out of my area of comfort or my wheelhouse. But if I understand the situation correctly, during your cleanup, if an async exception is thrown, your cleanup will be interrupted or canceled, and then that exception will be thrown somewhere else. Probably, maybe, the main thread. But what this means is that you can't rely on your cleanup running because it might be interrupted. So if I've got that interpretation correct, the fix is to use a different thing called an uninterruptible mask. And an uninterruptible mask means that other async exceptions can't prevent that chunk of code from running. Or rather, they can't, like, cut into that piece of code running, so you can count on your cleanup always running.
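To illustrate the difference between the two masks, here is a minimal sketch of a bracket variant whose cleanup runs under `uninterruptibleMask_`. The function `bracketUninterruptible` and its helper `swallow` are hypothetical and written only for this example. It is modeled loosely on the approach described above and should not be read as unliftio's actual source. The demo uses `getMaskingState` inside the cleanup action to show which masking state each version runs it in.

```haskell
import Control.Exception
  ( MaskingState (..)
  , SomeException
  , bracket
  , getMaskingState
  , mask
  , throwIO
  , try
  , uninterruptibleMask_
  )
import Data.IORef (newIORef, readIORef, writeIORef)

-- Run an action and discard both its result and any exception it throws.
swallow :: IO a -> IO ()
swallow action = do
  result <- try action
  case result of
    Left err -> pure (const () (err :: SomeException))
    Right _ -> pure ()

-- Like bracket, but the release action cannot be interrupted by
-- asynchronous exceptions, even when it blocks.
bracketUninterruptible :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
bracketUninterruptible acquire release use =
  mask $ \restore -> do
    resource <- acquire
    outcome <- try (restore (use resource))
    case outcome of
      Left err -> do
        -- Clean up, then rethrow the original exception.
        swallow (uninterruptibleMask_ (release resource))
        throwIO (err :: SomeException)
      Right value -> do
        _ <- uninterruptibleMask_ (release resource)
        pure value

main :: IO ()
main = do
  ref <- newIORef Unmasked
  let recordState _ = getMaskingState >>= writeIORef ref
  bracket (pure ()) recordState (\_ -> pure ())
  readIORef ref >>= print -- MaskedInterruptible
  bracketUninterruptible (pure ()) recordState (\_ -> pure ())
  readIORef ref >>= print -- MaskedUninterruptible
```

The trade-off is that an uninterruptible cleanup that blocks forever cannot be killed, so the cleanup action should be short and reliable.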
>> Yeah, and I think Snoyman does a great job here in the blog post, kind of explaining mask, explaining the mystery around it because, honestly, I don't think a lot of people have dealt with it.
>> I'm one of those people. I haven't dealt with it.
>> Right and you're in my mind, one of the most experienced Haskellers I know. So to say that you don't know it is probably, you know, a lot of people who deal with Haskell probably never experienced what that is or how it would create a problem. So, grateful that he has been in those places and he knows the solution and he's willing to share that. So, uh, he does a great job of being sarcastic and like, Yeah, you expect this to do this right? And no, you don't.
>> The good news is that even if you don't understand masks or haven't even heard of them before, it doesn't really matter. You can reach for a library that solves this problem for you. This isn't a problem with the language necessarily. It's a problem with what the bracket function in the standard library does. And once again, Snoyman has this library called unliftio, which is ultimately used in rio. So they all kind of fit together, but the unliftio library and its bracket function use an uninterruptible mask, so it doesn't have this particular problem.
>> So what you're telling me is, all these warts that Snoyman has talked about, you know, as quote unquote bad parts of Haskell, he pretty much has a solution for.
>> Yeah. Yeah, I don't know if he's trying to sell us something. You know, he wants us to get on board with the rio bandwagon. Um, but, yeah, sum and product, you know, there's a solution in the works. Data.Text.IO, you can use the rio library, it saves you from that. Um, bracket, unliftio will save you from. It's really just foldl. And you know, there are solutions to that, too. But the problem with foldl is mostly that it's there at all, and it's in the Prelude.
>> Right? Well, we appreciate him talking about this, and if anybody has any more questions or warts that they've run into, I know Snoyman's looking for that so tweet him or, um, I don't know, maybe make a PR against his repo or something along those lines to let him know some warts. And I'm sure he would love to do part two of the Haskell the bad parts.
>> Yeah. I wonder how many parts there are gonna be.
>> Well, hopefully not too many. Because I know in my mind I don't have a long list of warts with Haskell. I love what it gives us, what it allows us to do, and how fast we can iterate. I'm really thankful for that.
>> Yeah, me too.
>> All right, well, I think that's it.
>> Yeah, that'll do it for us today. Thank you for listening to the Haskell Weekly podcast. I've been your host, Taylor Fausak. And with me today was Cameron Gera. If you'd like to learn more about Haskell Weekly, you can check out our website, haskellweekly.news. Or you can find us on social media. Our Twitter handle is Haskell Weekly, our Reddit user name is Haskell Weekly, our GitHub user name is Haskell Weekly, and we're probably Haskell Weekly everywhere else.
>> Yeah, and I know if you have a blog post that you maybe want featured in Haskell Weekly, um, don't forget to submit a PR on the Haskell Weekly, um, GitHub page. So, uh, the Haskell Weekly podcast is brought to you by ITProTV, the e-learning platform for IT professionals. And we would love to extend a coupon code to you of Haskell Weekly 30 to get 30% off the lifetime of your subscription. Feel free to go take a look and sign up for a free membership if you're curious about what we do. And, you know, if you love it so much, sign up with that Haskell Weekly 30 podcast promo code, and you will be saving 30% off for the lifetime of your subscription. So I think that about does it. Thanks for joining us today. We'll see you next time.
>> Bye.