Podcast
Refactoring Yahtzee
Cameron Gera and Taylor Fausak discuss using types to guide refactoring toward better design.
Episode 22 was published on 2019-10-14.
Transcript
>> Hello and welcome to the Haskell Weekly podcast. I'm your host Taylor Fausak. I'm the lead engineer at ITProTV. And with me today is Cameron Gera, one of the engineers on my team. Thanks for joining me, Cam!
>> Thanks for having me, Taylor! It's been a little bit.
>> Yeah it's been a little bit. We're happy to have you back on the podcast.
>> Appreciate that.
>> So today we are going to be talking about Haskell of course, but more specifically refactoring and types. What's the article we're gonna be covering, Cam?
>> Yeah, so it's an article by Tom Ellis. It's called "Good Design and Type Safety in Yahtzee", which is actually an article looking at another article, improving some code, and doing some refactoring, which is really, really informational. So I'm really excited about today.
>> Me too! Because this, like you mentioned, is kind of a response article to another one where somebody had some code that they'd written and they slapped some types on it and were complaining that it kind of made things harder to read and didn't give them too much type safety.
>> Right, yeah. The words they used were "unreadable" and "unmaintainable", which I get. Looking at the code, it was pretty hard to read. And I was like, there has to be a better way. I think we're going to talk about the various steps that Tom took to say: hey, type safety isn't something you just throw on. It's something that you design with. When you have good design, type safety is just there, and it allows you to really feel confident about the code you're writing.
>> Exactly. So instead of taking some piece of code and throwing types on it, you develop the types and the design in lockstep. One influences the other.
>> Which I think was cool. There's a quote, and I don't think I'm going to quote it exactly, but he talks about building type-safe structures and combinators relevant to the domain, and then that's what we use for the implementation.
>> Right, yeah. So this is going to be a little interesting to talk about, because obviously we're talking about a piece of code, and it's really hard to communicate code over voice.
>> Right. But think about the game of Yahtzee, that's what we're talking about. In the game of Yahtzee you have five dice. You roll them, and I believe it tends to be three rolls. Each one is a die, so it's one to six. What we've got to do is understand what all the rolls are. I guess that's the name of it: allRolls is the function we're going to be evaluating.
>> Yeah. So the original function by Mark Dominus, from the blog post he wrote about it, takes some roll, presumably from a game of Yahtzee, and it is given some choices, like reroll that die, keep that one, and then it figures out what the next results are going to be. So that's the game plan here. And as Cam mentioned, this starts off as something where the quote-unquote not type safe version is kind of understandable but a little dense, and then the quote-unquote type safe version is just completely impenetrable because there's so much wrapping and unwrapping going on.
>> Right, yeah. I mean, type safety doesn't just get slapped on, like we said earlier. So I'm really excited to dive in.
>> So let's do that. Let's dive right in and get started here. We start with the original implementation, just repeated in full. And from a high level, what we're doing is applying a bunch of very small refactorings to this piece of code.
>> On an iterative basis.
>> Yeah, iteratively. So at each step we don't need to know too much, if anything, about what the original code is doing. We're just saying: I recognize this pattern as something that could use a little more type safety, so let's massage it in that direction.
>> Right. And as somebody who's been in Haskell for a year now, some of these suggestions that we're going to talk about are now relevant to me, but starting out I would have been very like, wait, what? You can do that? So if you're a beginner, this is something you learn with time, and this article could be a good help for you to understand and see, not necessarily what the code is doing, but the patterns it's implementing, and to be able to pull out the common best practices of Haskell.
>> Right. Yeah, it's kind of like a quick tour of some design patterns that you might use in Haskell for refactoring. I've also heard people describe this as an intermediate level Haskell post. There's a lot of the beginner stuff, like here's a monad, here's how to do addition, real rudimentary things. And then there's all the super advanced type level shenanigans stuff. And this one's right in the middle: actual workaday, how do you make software better.
>> Right, yeah. Which I'm a fan of, because as a day-to-day engineer this is stuff you use. We're not designing this giant data parser that needs to be perfect, but we're also not just making a simple API that says, hey, I'm here. We're in the middle, and this is what plays into it. So I think we should jump right in. The first thing he notices, and this is something we do more of in the intermediate steps, is the use of undefined. That means there's a case that we aren't accounting for, a case we don't expect to get to, because if you hit that undefined at runtime, you're blowing up.
>> Right. And you're blowing up with a really useless error message that just says undefined happened somewhere, good luck figuring it out, peace. So how do they improve the situation?
>> Right. So he says, all right, we have this undefined, that's a red flag. Let's throw a useful error. That's what he does first. He says, all right, we're going to call the error function with a string that says you hit this location, and the choices have to be the same length as the vals, or values, that you're working with.
>> All right. So after this change, if you do run into that case, if something went wrong, you now have a breadcrumb to go look for and say, okay, I can track things down a little bit.
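As a minimal sketch of the step just described (not the code from the article), assume a hypothetical helper `pairChoices` that pairs each keep/reroll choice with a die value. The only change is swapping `undefined` for `error` with a message that names the function and the broken invariant:

```haskell
-- Hypothetical helper, not the article's code: pair each keep/reroll
-- choice with the die value it applies to.
pairChoices :: [Bool] -> [Int] -> [(Bool, Int)]
pairChoices [] [] = []
pairChoices (c : cs) (v : vs) = (c, v) : pairChoices cs vs
-- Before the refactoring this clause would have been:
--   pairChoices _ _ = undefined
-- Now a failure at least says where it happened and why.
pairChoices _ _ = error "pairChoices: choices must be the same length as vals"
```

Calling `pairChoices [True] []` still fails at runtime, but the message now points at the function and the expectation that was violated.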
>> Yeah. And the other thing he does, to set himself up well for success, is he avoids catch-all patterns. Because with a catch-all, there's a chance you're missing something. There's some invariant that you don't know about, because you're saying: anything else happens here, we're going to fail.
>> Yes. So if you're pattern matching on the empty list, and then you're pattern matching on the list having exactly one element, and then you have a catch-all, you could think, oh, well, the catch-all is for when there are three or more elements. But really you're missing the case where there are exactly two. And GHC is not going to warn you about that, because you're explicitly handling quote-unquote every case with that catch-all, but it may not be working the way you expect.
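The list example from this turn can be sketched as follows. The function names are made up for illustration. With a catch-all, the two-element case is silently swallowed; with every shape written out, compiling with `-Wall` lets GHC check that the patterns are exhaustive:

```haskell
-- With a catch-all: the last clause was meant for "three or more",
-- but it also matches two-element lists and GHC has nothing to warn about.
describe :: [a] -> String
describe [] = "empty"
describe [_] = "one element"
describe _ = "three or more"

-- With every shape spelled out: the forgotten case becomes visible,
-- and deleting any clause triggers an incomplete-patterns warning.
describeExplicit :: [a] -> String
describeExplicit [] = "empty"
describeExplicit [_] = "one element"
describeExplicit [_, _] = "two elements"
describeExplicit (_ : _ : _ : _) = "three or more"
```

Here `describe "ab"` reports "three or more" for a two-element list, which is exactly the kind of mistake the catch-all hides.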
>> Right. So that's the next step after throwing the error: hey, let's not use a catch-all. He was using a catch-all initially, but he destructures what it was matching. And with -Wall, or -Weverything, or whichever of the Haskell compiler warnings he turns on, he's able to find that there's another invariant in there.
>> Yeah. So he keeps strengthening the invariant, or finding more, where he's like: okay, previously we just had one case here, but really there were two cases hiding behind this thing.
>> Yeah. And down the road, for me as an intermediate, mid-level Haskell developer, that's something to take away: understanding that catch-alls aren't always the best solution, and maybe trying to write out every case and see if there's something that you can bubble up to the type system, so that you don't even have to care about these invariant cases. Which is what he's doing here, and which for me is really informational. I would recommend anybody read this article, because I felt like, oh, okay, I don't have to just use a catch-all because it's what I know to do. I can write out each case and see where the logic error or the invariant could live, and maybe bubble that up to the type system, like I said.
>> Yeah. And even though we're talking about removing catch-all cases here as a refactoring step, sometimes it can be useful to still use them. Especially if you have a bunch of cases and you write all of them out explicitly, you may see: oh, I have twenty cases here and nineteen of them are the same, I'll go replace those with a catch-all. But you've done that legwork to say, well, I looked at all the cases.
>> Right. You're understanding the code and what you're writing, and that's part of refactoring. Yes, there are things you can do in Haskell where you don't have to fully understand what the code is doing to refactor. But if you want to understand the code and make it a useful refactor, it tends to be better to take the extra step. Which I think is really cool, and I think we can all glean some information from that process.
>> So moving on from the catch-all patterns, this next refactoring I feel like is a really big one. The key thing that I see here is that they recognize a function is doing two things, and they pull them out into two functions.
>> Yeah. I don't think a function doing multiple things is always the clearest choice.
>> Especially for pure functions. It's very common for some effectful function to do a lot of stuff, right? It's stitching a bunch of other things together.
>> Right, right. But for pure functions, yeah, there's no reason to do that.
>> And as we'll see later on, it can often really clarify things to have two very tiny functions that do separate things, and then you combine them into one top level function that does everything.
>> Right, yeah. It definitely makes refactoring easy. So this function he pulls out is great because it checks for the invariant. It says: I want to pull that logic out, and if I find this invariant is violated, I'll throw my error. If not, then I'll just return the values that the function I extracted from expects. And it also clears up the main function that we were using. In this case it's a Maybe, so we just have a case that says, is it Just or is it Nothing, and the error is handled within the separate function.
>> Yeah. It quarantines those invariant checks into one function, so that outside of there you can generally just say, okay, I'm assuming those have been handled, which is really nice. And as you mentioned, it also cleans up reading that top level function that calls the other one, because when you're reading it you don't have to worry about the logic there. You can say, okay, that other function handles it, I can read that if I'm interested, but when I'm here I just have to pattern match on that Maybe.
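A rough sketch of this split, under the assumption of two hypothetical functions (`nextStep` and `keptValues` are illustrative names, not taken from the article): one small function owns the invariant check and returns a Maybe, and the caller only pattern matches on Just or Nothing.

```haskell
-- Quarantines the invariant check. Returns Nothing when both lists are
-- finished, Just the next pair plus the remainders otherwise, and fails
-- loudly when the lengths disagree.
nextStep :: [Bool] -> [Int] -> Maybe ((Bool, Int), ([Bool], [Int]))
nextStep [] [] = Nothing
nextStep (c : cs) (v : vs) = Just ((c, v), (cs, vs))
nextStep [] (_ : _) = error "nextStep: more vals than choices"
nextStep (_ : _) [] = error "nextStep: more choices than vals"

-- The caller no longer mentions the invariant at all; it only asks
-- whether there is a next step.
keptValues :: [Bool] -> [Int] -> [Int]
keptValues cs vs = case nextStep cs vs of
  Nothing -> []
  Just ((keep, v), (cs', vs')) ->
    (if keep then (v :) else id) (keptValues cs' vs')
```

Reading `keptValues` on its own, there is nothing to learn about length mismatches; that detail lives entirely in `nextStep`.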
>> Right, yeah. I think that's something we as an engineering team have gotten better at: understanding that we don't have to do everything in this one pure function. We should break it out, make it smaller, make it easier to understand, and keep bite-size pieces rather than a whole sub or something.
>> Yeah, it's almost never a bad idea. Sometimes you can go crazy and have way too many small functions, but I think it's usually pretty obvious when that happens, especially when you're trying to read through one function and you have to keep bouncing around to other definitions.
>> Cool. Well, we're going to move on to the next one. The next one is a little bit trickier because you have to think through the design process and have a better understanding of what your code is doing. So this is that next step, and it's finding that a value is unused. When you're looking at a function, especially a recursive function, and you see a value that doesn't end up being in the final result, that should kind of stir your stomach and make you feel like something's wrong. I'm sure there are lots of spots in our code where we're like, yeah, we don't need this. Having senior engineers like you and Cody allows us to think through those things and identify them faster, so with time you get better at that. But Tom here is like: hey, there's this value we're kind of using, and at the end of the day we don't ever display it or return it. It's just there to keep track of something in a recursive function. So he says, hey, let's take some time and think about why we're returning this value. And he puts in a kind of test: if we ever hit this value where we expect it to be unused, then we should throw an error, because we actually don't expect to be here. It's not a final value.
>> Right. And in a way this is another type of invariant, and the invariant is that we're not using this value. But I think it's also worth pointing out here that we're talking about refactoring and using type systems, but really in this case you're relying on the test suite. He's saying: I looked at this value and I don't think it's used, so I'm going to plug in an error there, rerun my tests, and see that that error doesn't get thrown. So yeah, that value's not used. A lot of people think that you have to either use types or use tests, but really you use both. Both help.
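The "plug in an error and rerun the tests" trick relies on Haskell's laziness: a value that is never inspected is never evaluated. Here is a sketch with a hypothetical function `expand` (an assumption for illustration, not the article's code) that threads a counter through its recursion but discards the counter of every recursive result:

```haskell
-- Hypothetical recursive function. The counter passed to the recursive
-- call is replaced by an error; if anything ever inspected it, the
-- tests would blow up with this message.
expand :: [Bool] -> [Int] -> Int -> [([Int], Int)]
expand [] [] n = [([], n - 1)]
expand (keep : cs) (v : vs) n =
  [ (d : rest, n - 1)
  | (rest, _) <- expand cs vs (error "expand: the inner counter was used after all")
  , d <- if keep then [v] else [1 .. 6]
  ]
expand _ _ _ = error "expand: choices must be the same length as vals"
```

Because the inner counter only ever appears behind the wildcard `_`, fully evaluating `expand [True, False] [2, 5] 3` never triggers the error, which is the evidence that the argument can be removed from the recursion.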
>> Right. And that's kind of his whole point in this article: it's not one or the other, it's both together. It's a symbiotic relationship, which I think is cool. So he indicates, okay, this is unused, and then he takes it a step further and says: oh wait, we have these different values that don't need to be bound together. They're not something that relies on one another. So there's a tuple here where he's like, I don't feel like this type is actually used. He gets rid of the type alias and says, okay, let's just name them, take some steps, and see if we can get these to be independent arguments. And then if those arguments don't get used, we can factor them out a little more easily.
>> Right, because it's a lot easier to get rid of one argument than it is to get rid of half of a tuple. Half a tuple, a one-ple? Okay, just parens.
>> So yeah, he takes a couple of steps to rearrange some stuff.
>> Yeah. And I think it's interesting the way that he does this, because he recognizes that these two values that are in a tuple together probably shouldn't be together, but he doesn't immediately delete one of them. He pries them apart, and that makes it easier for himself to delete that single argument later on. I really like that, because he identifies the refactoring that he wants to do, getting rid of this member of the tuple, and then he does an intermediate refactoring that makes that thing he wants to do later a lot easier. So instead of getting rid of one thing out of a tuple, he can get rid of an argument to a function, which as we said is a little easier to manage.
>> Mm-hmm, yep. And I feel like the error messages associated with that are easier to understand, rather than, oh wait, we have a tuple. I just think it's all around a better choice. And if there are two arguments that are very closely tied, then yeah, a tuple is understandable, and he gets back to that at some point. But he says, hey, let's figure out where the real connection is here.
>> Right. He's not saying that the tuple is the wrong data structure. He's saying it doesn't look right, right now, right here, so let's pry it apart.
>> Right. So I think that's good. And for Mark, the original author, as he wrote that, he's still learning, he's figuring it out, he's saying, hey, this tuple seems like the right move. And as software engineers, that's what happens. That's our job. We say, oh, this works. We get it functional, we make it do what we expect it to, and then we can come back and refactor. So I think this is very informational. It worked, it's an okay solution, but you can look at it and say, oh, let's bring it up to something better.
>> Yeah. And case in point, the refactoring here isn't: let's slap some types on the arguments after the fact. It's: okay, let's take that function and really break it down step by step and find better ways to do those things.
>> Yeah, so I think that's cool. And as he rearranges arguments, he actually puts something back into a tuple that he's passing to that helper function we created earlier, that pure invariant function. He made it a tuple, and then he realized: oh, we pass that tuple to this function and we deconstruct it, but we don't really need to deconstruct it, because we're just passing it to a function, and that function returns a value that we can deconstruct individually, rather than having this confusing destructuring in, not the type declaration, but the function declaration.
>> So instead of matching on this tuple and grabbing the left and the right value out of it, and then later on building that exact same tuple again, you can not destructure it and pass that original tuple straight through.
>> Mm-hmm, yeah. Which is a nice add. There's nothing wrong with destructuring the tuple.
>> But it makes it harder to read. Like you said, it can be hard, because when you read that you think, okay, I'm going to be using each piece of this individually. But really you're not. You're just building that thing again later.
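A small sketch of the difference, using hypothetical names (`lengthsMatch`, `checkBefore`, and `checkAfter` are assumptions for illustration): the first version takes the tuple apart only to rebuild the identical tuple, while the second passes it straight through.

```haskell
-- Hypothetical helper that consumes the pair as a whole.
lengthsMatch :: ([a], [b]) -> Bool
lengthsMatch (xs, ys) = length xs == length ys

-- Before: the pattern suggests choices and vals are used separately,
-- but they are only glued back together.
checkBefore :: ([Bool], [Int]) -> String
checkBefore (choices, vals) =
  if lengthsMatch (choices, vals) then "ok" else "mismatch"

-- After: the tuple is never taken apart, so the reader does not have
-- to track two extra names.
checkAfter :: ([Bool], [Int]) -> String
checkAfter input =
  if lengthsMatch input then "ok" else "mismatch"
```

Both versions behave the same; the second simply tells the reader that this function does not care about the individual components.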
>> Right. Now, if later you're trying to use one of those values, for me that makes sense. But if it's only being passed to another function, there's no reason to do that. The next one...
>> I think is the big one.
>> Is the big kahuna, if you know what I mean. And this one is something where you probably need to know and understand what the code is doing to do it effectively.
>> Yeah. And to back up a little bit, for all the things we've talked about so far, you don't really need to understand what the code is doing. You can just look at it and say, oh, I recognize that's kind of a weird pattern, let's do it this other way instead.
>> Mm-hmm, yeah. And ultimately the big idea here, and he says it in his summary, is that refactoring in Haskell is going to hold your hand through it. You can do this kind of refactor in any language. There's always a way to refactor. The nice thing about Haskell is it's going to hold your hand through it.
>> Yeah, it helps you along the way.
>> So this next big one is about that unused value that we talked about earlier, the one that's not really needed in the core of the function. We're getting rid of it. We're making a structural change and a design decision that gives us confidence that the function and its types are doing what we expect.
>> We added that invariant that said we're not using this argument, and now we're finally getting rid of that argument. So we're taking this information that we had only in our heads and pushing it into the type signature of this function, so that it doesn't need that argument anymore.
>> Right. And making it pure, pushing the core logic out into a separate function, and allowing the original function just to say: hey, I keep track of this integer because I need to know where I am recursively. But the core doesn't require that. If I have an empty list of choices, okay, I do nothing, and if we have something, then we don't really care what that value is, we just need to do an action. So I think that's really cool.
>> Yeah. So that's one of the big refactorings, and I feel like we're maybe not doing it justice over audio. But they then follow up with another really huge one, which I think is the main rallying cry of type systems, which is making illegal states unrepresentable. So they say: okay, we've done all these little refactorings to clean it up and make it easier to understand, but now let's do one that actually eliminates some of these invariants that we have and pushes them into the type system.
>> Mm-hmm. Oh yeah, the big kahuna.
>> Yep, the other big kahuna. There are two of them.
>> Yeah. I mean, that's kind of what he's saying: hey, there are little things you can do, but there are also bigger things you can do. So I think it's very informational. And this is the moment where you say, okay, these two things are so closely tied together that they shouldn't be separated, and there should be a way to make the type system make the illegal states not representable. Words are hard. Anyhow, we're trying.
>> And this step actually reminds me of a blog post from a while ago by Matt Parsons called "Type Safety Back and Forth", where he mentions that you can have functions that return a Maybe or return an Either or something like that, or you can have a function where the input to that function represents the invariant that you were enforcing with the output. I feel like that's what's happening here. We had a function that returned a Maybe because it was connecting these two lists together. What we did instead was push that connection out of the function and say: okay, I take a tuple where the things have already been connected, and then I don't have to return a Maybe anymore. I can always return something. So in this particular instance they pushed it out to a zip at the top level that says, okay, combine these two things.
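The contrast between enforcing an invariant in the output and encoding it in the input can be sketched like this. All three function names are hypothetical, and the example is a simplified stand-in for the article's code:

```haskell
-- Output side: two separate lists can disagree in length, so the
-- function has to admit failure in its return type.
keptMaybe :: [Bool] -> [Int] -> Maybe [Int]
keptMaybe [] [] = Just []
keptMaybe (c : cs) (v : vs) = (if c then (v :) else id) <$> keptMaybe cs vs
keptMaybe [] (_ : _) = Nothing
keptMaybe (_ : _) [] = Nothing

-- Input side: a list of pairs cannot be mismatched, so no Maybe is
-- needed. The pattern (True, v) keeps only the dice marked as kept.
kept :: [(Bool, Int)] -> [Int]
kept pairs = [v | (True, v) <- pairs]

-- The connection between the two lists is made once, at the top level.
keptFromLists :: [Bool] -> [Int] -> [Int]
keptFromLists choices vals = kept (zip choices vals)
```

One design consequence worth noting: `zip` silently truncates to the shorter list, so a length mismatch is no longer reported anywhere. Whether that is acceptable depends on the caller.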
>> Right. And it didn't fully get rid of our Maybe, but it got us closer. It allowed us to see what we can move on to next: okay, we can use uncons. And the very next step after that is: don't use uncons, just pattern match. But that's the thing, you take baby steps. Okay, I see this as uncons. Okay, well, I don't quite need uncons here, I can just pattern match. It's just steps.
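The two baby steps mentioned here can be illustrated with a toy function (the names and the default of zero are assumptions for the example). `uncons` comes from `Data.List` in base and returns `Nothing` for the empty list or `Just (head, tail)` otherwise:

```haskell
import Data.List (uncons)

-- Step one: recognize that the hand-written Maybe logic is just uncons.
firstOrZeroUncons :: [Int] -> Int
firstOrZeroUncons xs = case uncons xs of
  Nothing -> 0
  Just (x, _) -> x

-- Step two: notice that matching on uncons is the same as matching on
-- the list itself, so the Maybe disappears entirely.
firstOrZero :: [Int] -> Int
firstOrZero [] = 0
firstOrZero (x : _) = x
```

Neither step requires knowing what the surrounding program does; each one only replaces a shape with an equivalent, simpler shape.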
>> You don't need to go zero to 100 in one step. This is another upshot of doing really small functions like we talked about earlier: sometimes you'll have one of those small functions and you'll see, oh, this is just some other function that I already knew, but I didn't realize it when I was writing it in the first place. Often, especially in Haskell with really polymorphic types, you can have something that's really concrete, and then when you actually write it out you say, oh, this is just traverse or mapM or something like that.
>> Right, which you'll see later. He's foreshadowing. All right, so the next one for me is something that's beyond helpful, and it's using do notation. We have the bind operator in Haskell, and that's great, and if you can make the bind operator look understandable, that's awesome, I'm a fan. But the fact is that 95% of the time, in my opinion, do notation is much clearer. And yeah, I know it desugars to the bind operator under the covers, but visually, for understandability, the operator doesn't help. So maybe that's part of the reason why Mark said, oh, this is unmaintainable and unreadable. This bind operator right here is not helpful.
>> Right, yeah. It's got a bind right in the middle of the core business logic that this thing is doing, so you have to understand not only all of the Yahtzee rules but also this weird Haskell operator and how it works.
>> Right. Which direction is it going? Obviously the type system is going to help you with that, but...
>> Yeah, and I'm with you 100%. I feel like if you can use do notation you probably should, even if it feels a little silly, like if you're only pulling one value out or something. It's just so much more familiar to a lot of programmers.
>> Right, because it's a step-by-step process. Okay, get this value, now I can use that value here.
>> And even in cases like this, where we're using do notation with the list data type. We can think of it kind of sequentially, like grab this value, then grab this value, but really the list data type is representing choice for us. So it's saying, well, choose a value here, and then choose another value, and then combine them in some way. That's the value add of monads: you can choose a different representation. In this one we're using lists to represent trying different possibilities, but we still use do notation.
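To make the comparison concrete, here is the same list-monad computation written twice, with made-up names: once with explicit binds and once with do notation. Each `<-` reads as "choose a value from this list", and the result contains every combination:

```haskell
-- Explicit bind operators: correct, but the nesting of lambdas has to
-- be untangled by the reader.
pairsBind :: [(Int, Int)]
pairsBind = [1 .. 6] >>= \a -> [1 .. 6] >>= \b -> [(a, b)]

-- Do notation: the same computation, read top to bottom.
pairsDo :: [(Int, Int)]
pairsDo = do
  a <- [1 .. 6]   -- choose the first die
  b <- [1 .. 6]   -- choose the second die
  pure (a, b)     -- combine the two choices
```

The two definitions produce the same list of pairs in the same order, since do notation desugars to the binds in the first version.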
>> Good old list monad. Cool. Well, the next step is something that became clear as this refactor happened: we are taking the head of a list, modifying that value, and then sticking it back onto the list. And in anybody's mind who's been around Haskell or functional programming, really no matter what the language is, that's: oh, I'm just mapping over a list and doing some operation on each value.
>> Mm-hmm. That's a map.
>> And so he's like, hey, we're in the list monad, so let's use mapM.
>> Right, because the action we're taking for each element is monadic. But yeah, this is a great example of these really small steps paying off, because you're looking at this and saying, you know what, this is just a map, let's call it a map. That way whoever reads it next doesn't have to understand the manual recursion that's going on or whatever it is. They're like, oh, map, I know how that works.
>> Mm-hmm. And I think that was a big boost. I mean, it literally took it from five or seven lines to two.
>> Right, it really crunches it down.
>> Right. And it's not any less clear, right?
>> I'd argue it's more clear, even though it's shorter.
>> Because you don't have to worry about, oh, this case, what's going on? It's all taken care of, which I think is really cool.
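The collapse from manual recursion to `mapM` might look like the following sketch. The helper `options` and both function names are hypothetical; the tuple of a Bool and an Int stands in for a keep/reroll choice paired with the current die value:

```haskell
-- The possible outcomes for a single die: a kept die has one outcome,
-- a rerolled die has six.
options :: (Bool, Int) -> [Int]
options (keep, v) = if keep then [v] else [1 .. 6]

-- Manual recursion: take the head, run the list-monad action on it,
-- recurse on the tail, and stick the result back on the front.
allOutcomesManual :: [(Bool, Int)] -> [[Int]]
allOutcomesManual [] = [[]]
allOutcomesManual (p : ps) = do
  d <- options p
  rest <- allOutcomesManual ps
  pure (d : rest)

-- The same thing, once the pattern is recognized as a monadic map.
allOutcomes :: [(Bool, Int)] -> [[Int]]
allOutcomes = mapM options
```

For lists, `mapM` and `traverse` are interchangeable here; the reader only needs to know how `options` treats one die.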
>> And then finally we get to the last refactor.
>> Which is one we've done an entire podcast on, so go check it out: avoiding boolean blindness. I think last week one of our teammates had this question, like, oh, can you just use a boolean here? And we were all like, wait, let's think about this. Is this a value we rely on, or is it just a value we're returning? If you're just returning the boolean, okay, that's fine. But if you're trying to use that in an operation, it's not clear.
>> Right. So in this particular example we have a tuple with a boolean in it and some integer, and if I just describe that to you, you don't actually know what that means. What we refactor it into is a data type that says: reroll this die, or keep this roll that we already had, and its value was this integer. Now obviously the data type constructors are much shorter than what I've just said, but they communicate that intent much better than a tuple with these two things in it.
>> Right, yeah. It's like, oh, what am I doing with this boolean? Oh, I'm either rerolling or I'm keeping it, because it's a data type. And you can create a simple function that turns that data type into the boolean result you want.
>> Right, or you can pattern match on it. So then it reads a lot better when you're reading through the code. You don't have to say, if this boolean part of this tuple is true then do this, blah blah blah. You can say, oh, if we're rerolling, do this, or if we're keeping it, do that.
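A sketch of the data type described here, with constructor names chosen for illustration (the article's exact names and field types may differ): instead of a tuple of a Bool and an Int, each choice says what it means, and a kept die carries its value only where a value makes sense.

```haskell
-- Hypothetical replacement for a (Bool, Int) tuple.
data DieChoice
  = Reroll      -- throw this die again
  | Keep Int    -- keep this die, which currently shows the given value
  deriving (Eq, Show)

-- Pattern matching reads as the rule of the game, not as a test on an
-- anonymous boolean.
outcomes :: DieChoice -> [Int]
outcomes Reroll = [1 .. 6]
outcomes (Keep v) = [v]

-- If a plain Bool is needed somewhere, it is one small function away.
isKept :: DieChoice -> Bool
isKept Reroll = False
isKept (Keep _) = True
```

A side benefit of this shape is that a rerolled die no longer carries a value that nobody should look at.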
>> Right. Boolean blindness.
>> So much nicer than being blind.
>> Yeah. And I know we've talked about this internally as a team: using LambdaCase. He uses LambdaCase, which is cool, and I think it's nice. It's just that the normal case statement, in my mind, is...
>> It's not that much worse.
>> Right, it's not. And it's pretty uncontroversial.
>> Yeah, I mean, it's a pretty common language extension. But I do feel like it kind of undercuts his argument here, where Haskell has a reputation of needing all of these language extensions to do anything. I'm happy that he shows one here, because it does make it look nicer, but I feel like he could have just as easily written out the lambda explicitly and said, well, this is maybe not super nice, and if you want to avoid it, use LambdaCase. Really, every other programming language is pretty comfortable using lambdas. JavaScript, Ruby, Python, whatever: you would just write the lambda and nobody would think twice about it. So yeah, and now we've arrived at the end. We have this perfect refactored allRolls function, and we can now play Yahtzee in Haskell.
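For the LambdaCase trade-off discussed in this turn, here is the same function written both ways. The `DieChoice` type and the function names are hypothetical stand-ins rather than the article's final code:

```haskell
{-# LANGUAGE LambdaCase #-}

-- Hypothetical choice type: reroll a die or keep its current value.
data DieChoice = Reroll | Keep Int

-- With the LambdaCase extension: no name is needed for the argument.
withLambdaCase :: [DieChoice] -> [[Int]]
withLambdaCase = mapM $ \case
  Reroll -> [1 .. 6]
  Keep v -> [v]

-- Without the extension: an explicit lambda and an ordinary case
-- expression, at the cost of naming the argument once.
withPlainLambda :: [DieChoice] -> [[Int]]
withPlainLambda = mapM $ \choice -> case choice of
  Reroll -> [1 .. 6]
  Keep v -> [v]
```

The two definitions are equivalent; the extension saves one name and a few characters per use, which is the whole of the trade-off.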
>> Oh yeah, and we can understand what we wrote, which is helpful. If we go and come back later, we know what this is doing, rather than, there's something going on here, I'm not sure. So I would definitely agree with Mark's feelings about his original work being unreadable and unmaintainable. But I think Tom says, hey, I get that, and Haskell can be unreadable and unmaintainable, but if you focus on not just type safety but good design and type safety, you can create a beautiful work of art.
>> Yeah, and we now have those two things complementing each other, where you use the types to influence your design, and your design influences your types as well. That's the sweet spot to be in.
>> Oh, so sweet. Yeah, I think this is a great article by Tom. I would definitely encourage all of our listeners to go check it out. It's a little hefty, but it's a good intermediate post.
>> Yeah, and it really teaches you, or shows you rather, these refactoring strategies you can use in Haskell, or really, as he mentions, any language. But the reason that he focuses on Haskell here is, A, because he knows it, I imagine, but B, it really pushes you in the right direction with a lot of these decisions. It makes them easy.
>> Right, yeah. It definitely nudges you the right way. So I think that's pretty awesome. But yeah, I appreciate being on the show today.
>> Thanks for being on the show with me, Cam. It's always great having you. And thank you for listening to the Haskell Weekly podcast. This has been episode 22. We hope you enjoyed it, and if you did, please go rate and review us on iTunes. And tune in next week, where we'll be talking about Haskell once again. Who knows what, though.
>> Yeah, and obviously this is sponsored by ITProTV. So if any of your sysadmins or networking geeks need any training, please have them check out itpro.tv. We'd love to get you on board and help you out with all of our engaging content across various IT platforms. So we're quite excited, and we're definitely going to miss you guys, but we'll be back next week.
>> See you then.