This post is about how we started using Haskell in a few practical projects.
“After getting over the bump, the language makes a lot of sense and it is not difficult to learn.”
I’m reminded of a spontaneous outburst of mine. I was talking to some business partners about usability problems with a website of theirs, a website we had to use in order to do business at all.
They said, “It’s not that hard after you get used to it.”
I responded: “That’s true of everything except getting tortured!”
I have a theory that difficulty of learning is a worse-than-linear function of conceptual difficulty. That is, the whole (of difficulty) is greater than the sum of its parts. This is because of frustration and despair. Haskell might be only twice as conceptually difficult as, say, Erlang, but maybe four times harder to learn, even at best. But worse – far worse – the introductions to the concepts offered by many in the Haskell community are not exactly the best imaginable.
Take GADTs. Here is a Wiki introduction that I promise you I’m not making up.
“Generalised Algebraic Datatypes (GADTs) are datatypes for which a constructor has a non standard type. Indeed, in type systems incorporating GADTs, there are very few restrictions on the type that the data constructors can take. To show you how this could be useful, we will implement an evaluator for the typed SK calculus. Note that the K combinator is operationally similar to …” (followed by some lambda calculus stuff…)
OK, let me roughly summarize my reaction at this point, as briefly as possible:
What the fuck?! What the fucking fuck?!!
That’s not a “motivating example.” That’s a DEMOTIVATING example. I know. Because when I read that, I was sorely tempted to never look at GADTs again.
But, really, I actually want to help. So I searched further and found this:
Now, as of yesterday, I think that’s actually a more approachable article. Why? Because of my edits. I scanned through it, got to the end, found a paper mentioned as “fun”, scanned through that paper to look for the actual concrete applications promised, and found them: binary compression, and a way to implement something like C’s printf. (These were quite a few pages into that paper.) So I moved the relevant paragraph of the GADTs for Dummies article to the beginning of the article, and added a mention of those two applications.
HOWEVER, I still don’t understand exactly why using GADTs and not typeclasses is superior in any way. OK, so you used GADTs, and didn’t need to use typeclasses. What’s good about that exactly? For that matter, what do GADTs buy you? “They help you reason about your code” is best followed by tiny examples (but still with some vague real-world relevance) that show when and how they are better than the alternative.
I wish I understood. I just have to hope I left the subject more understandable (or at least potentially more useful) to the puzzled Haskell newbie. But perhaps my edits can use some editing. Feel free to correct them:
One of the main use-cases of GADTs in real-world code is to describe the abstract syntax tree of a programming language more accurately. SKI combinators are perhaps one of the simplest languages imaginable. I think that is why the motivating example chooses SKI combinators. I agree that this is not approachable for everybody, but GADTs are an advanced feature. Maybe a very simple imperative language would be easier to understand for some people, something like:
```haskell
data Stmt
  = Skip
  | AssignBool String (Expr Bool)
  | AssignInt String (Expr Int)
  | Sequence Stmt Stmt
  | If (Expr Bool) Stmt Stmt
  | While (Expr Bool) Stmt

data Expr a where
  Constant :: a -> Expr a
  Var      :: String -> Expr a
  Unary    :: UOp a b -> Expr a -> Expr b
  Binary   :: BOp a b c -> Expr a -> Expr b -> Expr c

data UOp a b where
  Negate :: UOp Int Int
  Not    :: UOp Bool Bool

data BOp a b c where
  Or  :: BOp Bool Bool Bool
  And :: BOp Bool Bool Bool
  Lt  :: BOp Int Int Bool
  Leq :: BOp Int Int Bool
  Eq  :: BOp Int Int Bool
  Geq :: BOp Int Int Bool
  Gt  :: BOp Int Int Bool
  Add :: BOp Int Int Int
  Sub :: BOp Int Int Int
  Mul :: BOp Int Int Int
```
That is a lot larger than the SKI combinators example, but perhaps more concrete. With GADTs like this you can ensure that `If (Binary Lt (Constant 1) (Constant 2)) Skip Skip` is allowed, but `If (Binary Add (Constant 1) (Constant 2)) Skip Skip` is rejected at compile time by the type checker, because `If` expects an `Expr Bool` but actually gets an `Expr Int` as its first argument. So it kind of allows you to reuse GHC’s type checker for your own embedded languages.
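To show the payoff of that precision, here is a minimal self-contained sketch (a trimmed-down variant of the `Expr` above, with the operators inlined as constructors and no variables, so it fits in a few lines): because each constructor fixes the index type, `eval` can return a genuine Haskell value of the right type, with no runtime tag checks or error cases.

```haskell
{-# LANGUAGE GADTs #-}

-- Trimmed-down expression language: the index `a` is the type
-- of value the expression evaluates to.
data Expr a where
  Constant :: a -> Expr a
  Add      :: Expr Int  -> Expr Int -> Expr Int
  Lt       :: Expr Int  -> Expr Int -> Expr Bool
  If       :: Expr Bool -> Expr a   -> Expr a -> Expr a

-- Total, well-typed evaluator: pattern matching on a constructor
-- refines `a`, so each clause returns exactly the right type.
eval :: Expr a -> a
eval (Constant x) = x
eval (Add a b)    = eval a + eval b
eval (Lt a b)     = eval a < eval b
eval (If c t e)   = if eval c then eval t else eval e

main :: IO ()
main = print (eval (If (Lt (Constant 1) (Constant 2))
                       (Constant (10 :: Int))
                       (Constant 20)))
```

An ill-typed term such as `If (Add (Constant 1) (Constant 2)) (Constant 1) (Constant 2)` simply does not compile, which is the whole point.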
They said, “It’s not that hard after you get used to it.”
I responded: “That’s true of everything except getting tortured!”
I really can’t agree here. I’m an early(ish) adopter of Python and my experience of it started off easy but got much more difficult as time went on (because I wanted to express increasingly complex things with it). My experience of Haskell has been the opposite. It started off difficult and frustrating but it’s now easy.
I’d take a difficult-up-front-but-eventually-easy language over the opposite any day. That’s not to say we shouldn’t make Haskell easy up front, however.
What I meant is that it is not difficult because it is principled. For example, once you understand that operators are just regular functions used in infix notation, it is not difficult to look up their meaning.
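A tiny illustration of that point (not from the thread, just the standard mechanics): wrapping an operator in parentheses lets you use it like any other function, and backticks let you use a named function infix.

```haskell
main :: IO ()
main = do
  print (1 + 2)      -- infix use of (+)
  print ((+) 1 2)    -- the very same function, used prefix
  print (7 `div` 2)  -- a named function used infix
```

Once you know this, `:type (+)` in GHCi (or Hoogle) answers “what does this operator mean?” the same way it would for any function.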
Or, in other words: Haskell is hard, but fair.
Not everyone has the same days you do, apparently. If your preference was representative of most people writing software, the beginners mailing list wouldn’t be a thin trickle of messages. To me, that looks like “Haskell is dying.” What do you personally think Haskell deserves? I think it deserves better wiki articles, so I try to improve them to make Haskell more appealing to people trying to learn it. And I’m trying to do it from a newbie perspective (perhaps making some minor errors of terminology or whatever), before I wake up one morning thinking it’s all obvious and people who don’t get it don’t deserve to get it.
I’m reminded of that story about Norbert Wiener. He was lecturing somewhere (maybe MIT.) At one step in a proof he says, “And it obviously follows that . . .” Then his brow furrows. He stares at the gap between one step of the proof and the next he’d just written. He gets so lost in thought that he wanders out of the classroom. 20 minutes later he rushes back in. There are only two students left. One is taking a nap. The other is doing homework for another class. He shouts, “I was right! It is obvious!” Then he fills one of the blackboards with his proof of obviousness.
Lots of things that are rigorously principled are nevertheless difficult. (<-Just in case you miss my point.) “Hard, but fair?” It doesn’t seem fair to me that error messages can feel worse than useless to newbies.
I’m not quite sure how to respond! Despite not sharing many of your perspectives I fully agree with your conclusion that Haskell would greatly benefit from more accessible documentation and I welcome your contributions. (I have made some very modest ones of my own to that end.)
OK, but how advanced? Are GADTs considered “advanced” because they are really hard? Or could it be that they came along later, and therefore tend to be talked about in terms that are more advanced (or simply more elaborate) than people really need? Or is it some combination of difficulty and mere history?
This sort of question isn’t new. It’s a question that Richard Feynman asked about quantum mechanics. Feynman was pretty ferocious when it came to how people wrote about physics, and about how physicists explained things. Actually, about how anybody explained anything. Danny Hillis, in relating how Feynman was useful in the Connection Machine effort, mentions this about Feynman.
“We tried to take advantage of Richard’s talent for clarity by getting him to critique the technical presentations that we made in our product introductions. Before the commercial announcement of the Connection Machine CM-1 and all of our future products, Richard would give a sentence-by-sentence critique of the planned presentation. ‘Don’t say “reflected acoustic wave.” Say “echo”.’ Or, ‘Forget all that “local minima” stuff. Just say there’s a bubble caught in the crystal and you have to shake it out.’ Nothing made him angrier than making something simple sound complicated.”
I thought that was a pretty cool attitude when I read the story at the time Hillis wrote it. What I didn’t realize until twenty years later was that this attitude, combined with Feynman’s sheer force of personality, revolutionized physics teaching. I only found out about this revolution when I read the introduction to a book that collected some of Feynman’s physics lectures that didn’t make it into the final editions. He had changed how physics was taught.
I learned quantum mechanics in my third term of physics. I just thought it was normal. But in Feynman’s time, it wasn’t. Quantum mechanics was just assumed to be unreachable until graduate school. Feynman didn’t think so, and he led a major effort to get physics professors to rewrite undergraduate physics textbooks – even the lower-division undergraduate textbooks.
If I’d been born even ten years earlier, I would not have been taught quantum mechanics. As it happened, I was learning QM toward the end of my second year in college. That’s not to say I liked it. It was just a prerequisite for an electronics course that was a prerequisite for a logic design course, which was a prerequisite for a computer architecture course. Something about quantum mechanics still creeps me out. And that final term got me my first course grade of B in physics. I’d gotten As before. While slaving late at night on homework assignments I didn’t understand, I wanted to go back in time and strangle Louis de Broglie. I wanted to help design computers and Louis was in the way! But maybe this steepening of the curve toward the end was intentional – Feynman could see the wisdom in stacking something on top of the sequence that would filter out students who weren’t really as smart as they were starting to think they were (as was the case with me.) After all, not everybody is cut out to become a physicist. So why encourage students to go on when they might just wash out later? Physics curricula shouldn’t deprive the world of promising electronics designers. The world needs those people too. So, set up a gate at the end that will help divert some students onto a more useful path.
It did turn out to be a little bit useful for me when I took a VLSI design course and we had to understand nMOS transistors. Electron holes as charge carriers. Gate voltages as a way to create them in semiconductors, so you could make simple switches and build more complicated ones from them. There’s a bubble in the crystal and you have to shake it out.
How does that attitude apply to making Haskell more approachable? I think that, even for people who never get to GADTs, to look ahead and see that people have tried to make the harder concepts more approachable – that’s a good thing. It will encourage them to keep moving along from where they are. They will feel like nobody is trying to make Haskell a set of progressively more difficult IQ test questions. Quite the contrary: people are trying to help learners get as far as they can get, at every point along the way.
I relate to this as the “wizards principle”: Haskell wizards use advanced feature sets of the language to create spaces where amazing things can happen. The best wizards understand that they are creating these structures for mere mortals as well, and the structures they make are still approachable for the uninitiated, yet use the best Haskell has to offer without holding back that power. Unfortunately, some of the smartest and most creative wizards aren’t as self-aware or cognizant of what the world is like for the rest of us trying to keep pace and build sturdy software with the best of what Haskell has to offer. So yes, unfortunately, there are a lot of poorly written explanations out there (very well written by another set of criteria, but seen through the Feynman lens, not so spectacular), and that’s just the beginning.
While it is important to understand how things work, writing an HTTP service, json API client, or what-have-you, should not require graduate-level abstract theory. Given the principled foundation Haskell is built on, the challenge is bridging the gap and keeping it all approachable.
Technically, GADTs are a strict superset of “normal” algebraic data types (ADTs), and they are even a strict superset of ADTs extended with existential quantification. Essentially, GADTs = ADTs + ExistentialQuantification + type equality and coercions. (Yes, I know that is a whole bunch of jargon.)
Existential quantification on its own is already complicated, especially when combined with type classes. Normally I expect type classes to be global and static in the sense that if there is a type class with an instance for a specific type then that instance must be accessible everywhere and unchanging. But with existential quantification you can basically introduce an instance locally by pattern matching on a constructor. See this hopefully simple example:
```haskell
data Showable a = Show a => Showable a
-- or GADT style:
-- data Showable a where
--   Showable :: Show a => a -> Showable a

f :: Showable a -> String   -- no (Show a) constraint here
f (Showable a) = show a
```
But then you can write a function that forgets that the instance exists, and using that you will run into errors trying to show the contained value:
```haskell
forget :: Showable a -> a
forget (Showable a) = a

f' :: Showable a -> String
f' x = show (forget x)
```
```
Test.hs:14:8: error:
    • No instance for (Show a) arising from a use of ‘show’
      Possible fix:
        add (Show a) to the context of
          the type signature for:
            f' :: forall a. Showable a -> String
    • In the expression: show (forget x)
      In an equation for ‘f'’: f' x = show (forget x)
   |
14 | f' x = show (forget x)
   |        ^^^^^^^^^^^^^^^
```
I would definitely call this advanced Haskell.
Kowainik puts Type Level programming (the light yellow blocks) in the “most difficult” and “learn last”, top right corner of their Haskell Knowledge Map.
For what it is worth, we are not using GADTs in the practical use cases shown in the blog post. Actually, most of our code is written in plain Haskell2010 with OverloadedStrings. The only exceptions were for optparse-generic record creation and the Servant API definition.
This was a really good example. I’m not sure if we’re supposed to keep maintaining the wiki – I recall reading something about archiving it for historical purposes – but if that’s not the case, may I suggest adding this as the main motivator / example there?
OK, but why, in code that’s for some practical use, would you use GADTs? I don’t count modeling various systems of logic, etc., as necessarily practical. I am looking into logic and MTS because, for my purposes, some of it might be practical. If NSM is complete enough for all natural-language semantics, some significant real-world problems in natural language understanding become easier to solve. But I’m far from sure at this point. In the meantime, encouraging newbies in the belief that Haskell is generally practical doesn’t hurt, even when it comes to advanced topics that they may never reach.
So, thank you for the simple example, but I have to ask: what do GADTs really mean for the working programmer, who ultimately works for someone who only cares about impact on the bottom line? Do they make anything easier to write, easier to maintain, faster, in ways that have real-world application value?
I would add “describing abstract syntax trees more exactly”, except for one thing: I’d like to be able to phrase the value of this exactitude in ways that make a Haskell-newbie programmer feel it’s a productivity enhancement – and, for that matter, a manager of programmers as well. It matters less to me whether it actually gets used than whether it can be part of selling Haskell into the enterprise. If a manager is leery about trying Haskell out, having heard vicious rumors that his shop will just fill up with airy math geeks, and that when they leave, he’ll be left with almost-unmaintainable code, wiki articles that describe practical use cases everywhere possible are only to the good.
I still vividly remember the moment when I had to look at optparse dependencies and saw abstract algebra. Wow. How deep does the rabbit hole go? If there’s a good treatment for why such a component is used, that’s an excellent contribution to the wiki article on GADTs. Certainly more nuts-and-bolts than SKI.
Language oriented programming aficionados (which I might be one of) will say that almost every programming problem is actually a language problem. If we encounter a problem we can first design a language that exactly fits that problem domain. Then writing a program that solves the problem becomes very easy. And hopefully the language is general enough to solve more than one problem, so that designing the language wasn’t a waste of time.
Practically, many languages have been written in Haskell, some using GADTs. I really like accelerate, which uses GADTs extensively internally, but I think their use is mostly encapsulated in that package.
Another example that is close to my heart are effect systems. Explaining why effect systems are useful is a whole other discussion. But I think GADTs and/or Type Families are absolutely necessary to embed effect systems safely into Haskell. Examples I know of:
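Those libraries are too large to quote here, but a minimal sketch (all names hypothetical, not taken from any of them) may show why the GADT matters for effects: each operation’s constructor records its result type in the index, so every interpreter is forced by the type checker to produce the right kind of value for each operation.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical mini effect: the index of `Console` is the result
-- type of each operation.
data Console a where
  PrintLine :: String -> Console ()
  ReadLine  :: Console String

-- One interpreter runs the effect in IO...
runIO :: Console a -> IO a
runIO (PrintLine s) = putStrLn s
runIO ReadLine      = getLine

-- ...another runs it purely, feeding canned input (handy in tests).
runPure :: [String] -> Console a -> a
runPure _       (PrintLine _) = ()
runPure (l : _) ReadLine      = l
runPure []      ReadLine      = ""

main :: IO ()
main = putStrLn (runPure ["hello"] ReadLine)
```

A clause like `runPure _ ReadLine = ()` would be rejected at compile time, because the index says a `ReadLine` must produce a `String` – the same “reuse GHC’s type checker” trick as in the AST example, applied to effects.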
And there are more common and smaller examples of GADTs, almost all packages which advertise being “type-safe” will contain GADTs and/or Type Families. Some examples I can find searching “type-safe” on Hackage:
Great posts. At the very top, a comment was made about GADTs vs. type classes. Generally, GADTs provide clearer type error messages than type classes – a big win when using the type system to help design robust code.
Well, if the plan is to archive it, I think that would be a shame. Before I decided on error-message clarification for newbies as a priority, one wiki article I wanted to improve was about the use of Haskell in companies. The page had a lot of dead links for dead companies. So, getting some history of failure (which is practically a meetup institution now in Silicon Valley) would be instructive, especially for Haskellers with entrepreneurial ambitions.
One successful company I found in the list intrigued me: most of the code is done in Quito, for clients mostly to the north (U.S./Canada). At first it made little sense to me. But then I realized: (1) there are lots of Spanish-English bilingual business brains in America. (2) From my time in Quito, I knew that there are also lots of very bright and underutilized people there (and elsewhere in Ecuador – I also visited Guayaquil.) Finally, going for coders in a “middle-income country” made a lot of sense if you’re training people up on a hard language – it saves money on the learning curve.
Since my own ambitions intersect STEM education in poorer and middle-income countries, I’d love to know how they pulled this off. So those are the kinds of stories that I’d like to document (at least in capsule form) on that wiki page, for the benefit of other Haskellers who either have entrepreneurial eyes or who want to sell the business benefits to their management.