Support pun-free code #270

There’s currently a proposal to enable a set of warnings for code that uses punning. This is carefully not taking a position on whether punning is a Good/Bad Thing, but providing tools to help those who are making a stylistic choice to avoid puns.

I learnt Haskell from the ‘Gentle Introduction’. It introduces punning on [1,2,3] :: [Integer]; ('b',4) :: (Char,Integer) without comment.

A little later (Section 2.2) it explains in an aside for user-defined data types/constructors

[Type constructors such as `Point` and data constructors such as `Pt` are in separate namespaces. This allows the same name to be used for both a type constructor and data constructor, as in the following:

data Point a = Point a a

While this may seem a little confusing at first, it serves to make the link between a type and its data constructor more obvious.]

And that’s all it says. (The word “pun/ning” doesn’t appear.)

Now there was plenty in that Intro that puzzled me; but punning didn’t; and “the link … more obvious” appealed to me. Yet (from that proposal and comments):

“Beginner confusion” [with link to a SO q; in which I don’t think the confusion is with the punning]

“I see students struggling with puns”

“… are extremely confusing to newbies. (…, I’ve seen 4 different classes of 2nd/3rd year university students getting confused by this)”

  • A long straggly reddit thread that comes to no definite conclusion/has only anecdotal experiences, including teaching experiences.

So did I not suffer this (alleged) confusion because I taught myself Haskell/was not infected from getting taught by an academic who came with a prejudice that punning was confusing?

Did I not suffer because DataKinds was not a thing at the time I was learning?

Perhaps DataKinds should have a restriction [**] that it won’t promote data constructors that use punning(?) So the types typically used for kind-level programming would be unproblematic:

data Bool = False | True

data Either a b = Left a | Right b

data Nat = Zero | Succ Nat

data IntNat = 0 | 1 | 2 | ... | maxInt         -- pseudodecl

data List a = Nil | Cons a (List a)

...

[**] Note DataKinds already has a restriction that not all constructors are promotable.

1 Like

One difficulty (as the proposal points out) is that the Prelude uses punning. But does anyone want to promote these?

newtype Identity a = Identity a
newtype ReaderT r m a = ReaderT {runReaderT :: r -> m a}
newtype ExceptT e m a = ExceptT (m (Either e a))

Also Proxy; but I think (hope) that’s on the way out anyway. Does anyone want to promote it?

I didn’t initially find the punning of [] of (,) confusing, but this was because I didn’t appreciate that there were cases where constructors and types were different, like Just vs Maybe. These did confuse me, and pretty seriously. I think the fact that some types pun and some don’t makes it harder to cleanly explain the difference between types and values to a beginner, and when I write code that I want to show non-Haskellers, I often find myself wanting to say List a just so it’s super clear I’m talking about a type.

While I like punning, and use it, it wouldn’t really change my experience of the language if I didn’t, other than (in a fairly significant) cosmetic way.

3 Likes

I think what personally made Haskell more confusing for me was the distinction between terms existing on the type level vs the value level. Once I understood that, punning never really caused me any issues or confusion

1 Like

I’ve never bought the arguments against punning. Pattern matching on, e.g., MkPoint feels exceptionally odd.

In my experience, the confusing part is not that the names are the same, but that they’re both PascalCase.

I think it’s quite natural to write

makePoint = Point

and prefer to use (or even solely export) the obviously named function over the raw constructor. It also has the benefit of comprising the words “make point” which describe what the function does, while MkPoint reeks of notational abuse.

So how do you usually say type [ a ] vs term [ 'a' ]? That is, how do you pronounce square brackets? (I say “List” for both types and terms. To me, “List a” would still be unclear which that was.)

With Datakinds we still have to distinguish type [a] vs type-of-type [a].

Anyway, that proposal I linked to isn’t tackling Lists or Tuples; only the PascalCase lexemes.

I suscribe this. Also types parameters doesn’t help. The fact the Maybe is a type constructor and Maybe Int is an actual type is something which becomes quite an stopper when it comes to learning

That being said, I think punning becomes a no-problem once you distinguish between term/type level. But I’d say punning for [] and (,) is actually a bad thing, because newcomers are faced against these two at the very begining of their learning. It isn’t uncommon to see things like

head :: (x:xs) -> x -- safe version of head, written by a novice not a dependant haskeller
head (x:xs) = x

The difficulty with punning that I’ve run into in classrooms is that students aren’t always yet capable of parsing code to figure out which namespace to use. If you haven’t yet worked out which syntactic positions are types and which are values, then using the same names in both removes a source of useful redundant signaling. It’s never been the end of the world in my experience, but it has caused some confusion.

2 Likes

Is that any different to Just being a data constructor (that is, a form of function) vs Just 'X' being an “actual” value? (Of course in functional languages, functions are ‘merely’ values too, but you know what I mean.)

Thanks. That’s an interesting thought. The term ‘constructor’ in OOP means that calling it runs arbitrary code – possibly such that you can’t get out the values you put in. (GHC has those too, they’re called PatternSynonyms – “synonym” I think is a terribly misleading name.)

With Haskell’s laziness, ‘constructing’ with a data constructor won’t even evaluate the arguments; and pattern-matching with the bare constructor + vars won’t evaluate either.

Another language I dabble in, the designer was so adamant they weren’t OOP-style constructors, they’re called ‘Selectors’. And in a pattern match, selecting is what they’re doing. Now that you point it out,

foo (MkPoint x y) = ...

is notational abuse: we’ve already made the data values into a point; foo is analysing/unwrapping it.

Even worse abuse is 'Type Family’s in PascalCase: they’re clearly functions at the type level/you can’t pattern match on them. They should start lower-case.

Thanks for that experience from the chalk face. I can see that teaching Haskell needs introducing too many strange ideas all at once. Wouldn’t you begin with familiar types like Int, Char and scalar functions to demarcate types vs terms, before getting into data structures?

(You might tell a small fib re String, to postpone explaining lists.)

I think what annoys me about Datakinds is they promote every datatype in your program, including those from imports – whether or not the author of the library had any intent to promote them. (Which is why Prelude types are causing grief: in 1990 nobody could have known their naming would be problematic.)

Yes typeful programming wants type-level Bools, Nats, Maybe, Either. But Proxy? Type-level Monad Transformers?

And why is it so vital to use square brackets for typeful programming? Just promote

data List a = Nil | Cons a (List a)

I concede having Tuple1, Tuple2, ... rather than tuple comma-parens would be tiresome.

So how do you usually say type [ a ] vs term [ 'a' ] ? That is, how do you pronounce square brackets? (I say “List” for both types and terms. To me, “List a” would still be unclear which that was.)

Yes, when spoken it sounds the same, but that seems fine.

Anyway, that proposal I linked to isn’t tackling Lists or Tuples; only the PascalCase lexemes.

Ah, true. Re those, when I want to share an example of Haskell code with non-Haskell coworkers, I often find myself not wanting to write e.g.

data Foo a = Foo (a -> Foo a)

(to make up a type), because the difference between the first and second Foo seems like a minefield. Now, data Foo a = Bar (a -> Foo a) still requires explaining a confusing thing (i.e. what’s Bar?) but that seems more manageable. So long story short, I avoid punning in code I want to be easy to explain. I think if I were teaching a beginner, I would also first give a non-punning example, make sure they understood, and then mention punning.

I actually think the GADT version,

data Foo a where
    Bar :: a -> Foo a

might be the easiest for a beginner (but I’m not an educator so this is pure speculation), because it makes explicit exactly what kind of thing Bar is.

With Datakinds we still have to distinguish type [a] vs type-of-type [a].

Yes, but I’m happy for DataKinds to be confusing for a beginner. Unless we were in a future where type-level programming was very fundamental to Haskell, it seems like it would be an advanced feature.

I absolutely agree that MkPoint is a very awkward convention. It’s extremely imperative “go make me a point!” when the right way to understand pattern-matching is declarative. I have taught in my own a dialect of Haskell, and among other things strove to avoid punning at least in user-defined data types. I established the convention of naming the non-punned constructors PointOf, so you’d say PointOf 3 2. (Or, actually, you’d say PointOf(3, 2) because I also avoided currying in this dialect, but that’s a different matter.

3 Likes

To be honest, when I started learning Haskell, I didn’t face any problems with punning so I won’t make any comments on that part. On the other hand, When you mentioned Maybe and Maybe Int, something came to my mind. When I started to learn Lambda-calculus, most of the things that I’ve heard people say when it comes to teaching Haskell to beginners, seemed to click much faster. The fact that at the time of learning parametrized types (Proper type constructors), people may get confused, could be avoided a little bit simpler by just bringing up the difference between a Type Constructor and a Proper Type Constructor. In my opinion learning the logic and language agnostic reasons behind the different terminologies would both make the knowledge deeper and more applicable in the long run.